Lecture 10: Cheminformatics of TCM
Y.Z. Chen, Department of Pharmacy, National University of Singapore
Tel: 65-6616-6877; Email: phacyz@nus.edu.sg; Web: http://bidd.nus.edu.sg
Content
• TCM ingredients and databases
• Digital representation of TCM ingredients
• Molecular descriptors
• TCM ingredient classification by molecular descriptors
TCM Ingredients
Source: Pharmacology & Therapeutics 2000, 86:191-198
Medicinal Herb Databases at BIDD
• TCM-ID: Traditional Chinese Medicine Information Database
• Only database providing integrated and comprehensive information about:
  • TCM formulas, constituent herbs, herbal ingredients, and their effects on proteins
  • Molecular structure
  • Function at the formula, herb, and compound levels
• Comparison with existing TCM databases:
  • Formula: TCM-ID 1000; TCHFL 270
  • Herb: TCM-ID 1200; TCSHL 520; TCMD 1500
  • Compound: TCM-ID 9000; CNPD 3000; TCMD 6800
[Diagram: formula, herb, compound, structure, function, and protein information covered by TCM-ID, the TCHF Library, the TCSH Library, TCMD, and CNPD]
TCM-ID Database at BIDD http://bidd.nus.edu.sg/group/TCMsite/Default.aspx
PUBCHEM Database http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=90781
Representation of Herbal Ingredients by SMILES
• Simplified Molecular Input Line Entry System (SMILES)
• Widely used and computationally efficient
• Uses atomic symbols and a set of intuitive rules
• Uses hydrogen-suppressed molecular graphs (HSMG)
SMILES Bonds
• Single: - (can be omitted)
• Double: =
• Triple: #
• Aromatic: : (can be omitted)
Butanols
[Structures: 2-Butanol, iso-Butanol, tert-Butanol]
SMILES Branches
• Represented by enclosure in parentheses
• Can be nested or stacked
• Examples:
  • CC(O)CC is 2-Butanol
  • OCC(C)C is iso-Butanol
  • OC(C)(C)C is tert-Butanol
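The branch rule can be illustrated with a minimal sketch that turns an acyclic SMILES string into a hydrogen-suppressed graph. It is an assumption-laden toy, not a full SMILES parser: it handles only the organic-subset symbols, parentheses, and the bond symbols -, =, and #, and the function names (parse_acyclic_smiles, carbon_neighbours_of_carbinol) are hypothetical names chosen for this example.

```python
# Organic-subset symbols; two-letter symbols come first so "Cl" is not read as "C".
ORGANIC_SUBSET = ("Cl", "Br", "B", "C", "N", "O", "P", "S", "F", "I")
BOND_ORDERS = {"-": 1, "=": 2, "#": 3}


def parse_acyclic_smiles(smiles):
    """Return (atoms, bonds) for an acyclic organic-subset SMILES string.

    atoms is a list of element symbols; bonds is a list of (i, j, order).
    Rings, aromatic atoms, and bracket atoms are not supported in this sketch.
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when ')' closes a branch
    previous = None     # index of the atom the next atom will bond to
    pending_order = 1   # single bonds may be omitted
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "(":
            branch_stack.append(previous)
            i += 1
        elif ch == ")":
            previous = branch_stack.pop()
            i += 1
        elif ch in BOND_ORDERS:
            pending_order = BOND_ORDERS[ch]
            i += 1
        else:
            symbol = next((s for s in ORGANIC_SUBSET if smiles.startswith(s, i)), None)
            if symbol is None:
                raise ValueError(f"unsupported character {ch!r} at position {i}")
            atoms.append(symbol)
            if previous is not None:
                bonds.append((previous, len(atoms) - 1, pending_order))
            previous = len(atoms) - 1
            pending_order = 1
            i += len(symbol)
    return atoms, bonds


def carbon_neighbours_of_carbinol(smiles):
    """Count the carbons attached to the carbon that carries the (first) oxygen."""
    atoms, bonds = parse_acyclic_smiles(smiles)
    neighbours = {k: [] for k in range(len(atoms))}
    for a, b, _ in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    carbinol = neighbours[atoms.index("O")][0]
    return sum(1 for n in neighbours[carbinol] if atoms[n] == "C")


for name, smiles in [("2-Butanol", "CC(O)CC"),
                     ("iso-Butanol", "OCC(C)C"),
                     ("tert-Butanol", "OC(C)(C)C")]:
    print(name, carbon_neighbours_of_carbinol(smiles))
```

The three butanols contain the same atoms, but the carbon bearing the oxygen has 2, 1, and 3 carbon neighbours respectively, which is the difference the parentheses encode.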
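SMILES Bonds
• Ethene: C=C
• Chloroethene: ClC=C
• 1,1-Dichloroethene: ClC(Cl)=C
• cis-1,2-Dichloroethene: ClC=CCl (this string does not encode the double-bond geometry; the cis isomer is written explicitly as Cl/C=C\Cl)
• Trichloroethene: ClC(Cl)=CCl
• Perchloroethene: ClC(Cl)=C(Cl)Cl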
SMILES Atoms
• Use normal chemical symbols
• Add punctuation symbols if necessary
• No super- or subscripts
SMILES Symbols
• String of alphanumeric characters and certain punctuation symbols
• Terminates at the first space encountered when read left to right
• The organic subset: B, C, N, O, P, S, F, Cl, Br, I
Other SMILES Atoms
• Aliphatic (nonaromatic) carbon: C
• Atom in an aromatic ring: lowercase letter
• Designate ring closure with pairs of matching digits, e.g. c1ccccc1 (or C1=CC=CC=C1) is benzene, whereas C1CCCCC1 is cyclohexane
SMILES Charges
• Specify attached hydrogens and charges in square brackets
• The number of attached hydrogens is the symbol H followed by an optional digit
SMILES Charges
• [H+]: proton
• [OH-]: hydroxyl anion
• [OH3+]: hydronium cation
• [Fe++]: iron(II) cation
• [NH4+]: ammonium cation
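The bracket notation above can be read mechanically. The following is a minimal sketch, assuming only the pieces shown on these slides (element symbol, optional H count, optional charge); isotopes, chirality, and atom classes are deliberately ignored, and the function name parse_bracket_atom is a hypothetical name for this example.

```python
import re

# [symbol][H count]?[charge]? inside square brackets, e.g. [NH4+] or [Fe++]
BRACKET_ATOM = re.compile(r"\[([A-Z][a-z]?)(?:H(\d*))?(\++|-+)?(\d*)\]")


def parse_bracket_atom(token):
    """Return (symbol, attached_hydrogens, charge) for one bracket atom."""
    match = BRACKET_ATOM.fullmatch(token)
    if match is None:
        raise ValueError(f"not a supported bracket atom: {token!r}")
    symbol, h_digits, signs, charge_digits = match.groups()

    # "H" with no digit means one hydrogen; no "H" at all means none.
    if h_digits is None:
        hydrogens = 0
    else:
        hydrogens = int(h_digits) if h_digits else 1

    if signs is None:
        if charge_digits:
            raise ValueError(f"digits without a charge sign in {token!r}")
        charge = 0
    else:
        sign = 1 if signs[0] == "+" else -1
        # "++" means +2; "+2" also means +2.
        magnitude = int(charge_digits) if charge_digits else len(signs)
        charge = sign * magnitude
    return symbol, hydrogens, charge


for token in ["[H+]", "[OH-]", "[OH3+]", "[Fe++]", "[NH4+]"]:
    print(token, parse_bracket_atom(token))
```

Each of the five examples from the slide yields the element, the number of attached hydrogens, and the signed charge, for instance ('N', 4, 1) for [NH4+].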
SMILES Cyclic Structures
• Break one single or one aromatic bond in each ring
• Number in any order
• Designate ring-breaking atoms by the same digit following the atomic symbol
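The ring-closure rule can be sketched as a small scanner that pairs matching digits and reports which atoms each broken bond reconnects. This is a simplified illustration under stated assumptions: single-digit ring labels only, organic-subset and lowercase aromatic atoms only, no bracket atoms. The name ring_closures is hypothetical.

```python
import re

# Organic-subset atoms plus the lowercase aromatic forms used in these slides.
ATOM_TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def ring_closures(smiles):
    """Return (atom_count, closures); closures lists (i, j) atom index pairs."""
    open_rings = {}   # ring digit -> index of the atom that opened the ring
    closures = []
    atom_count = 0
    i = 0
    while i < len(smiles):
        match = ATOM_TOKEN.match(smiles, i)
        if match:
            atom_count += 1
            i = match.end()
        elif smiles[i].isdigit():
            digit = smiles[i]
            current = atom_count - 1   # a digit belongs to the atom just read
            if digit in open_rings:
                closures.append((open_rings.pop(digit), current))
            else:
                open_rings[digit] = current
            i += 1
        elif smiles[i] in "()-=#:":
            i += 1                     # branches and bond symbols do not matter here
        else:
            raise ValueError(f"unsupported character {smiles[i]!r} at position {i}")
    if open_rings:
        raise ValueError(f"unclosed ring digit(s): {sorted(open_rings)}")
    return atom_count, closures


for smiles in ["c1ccccc1", "C1=CC=CC=C1", "C1CCCCC1", "c1ccc2ccccc2c1"]:
    print(smiles, ring_closures(smiles))
```

Benzene in either notation and cyclohexane each give six atoms and one closure between atoms 0 and 5; the last string (naphthalene) gives ten atoms and two closures, one per ring.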
Representation of Herbal Ingredients by Molecular Descriptors
• Molecular descriptors are numerical values that characterize properties of molecules
• Examples:
  • Physicochemical properties (empirical)
  • Values from algorithms, such as 2D fingerprints
• Descriptors vary in the complexity of the encoded information and in compute time
Molecular Descriptors
• Constitutional: MW, number of atoms
• Topological: connectivity, Wiener index
• Electrostatic: polarity, polarizability, partial charges
• Geometrical: length, width, molecular volume
• Quantum chemical: HOMO and LUMO energies, vibrational frequencies, bond orders, total energy
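As one concrete topological descriptor, the Wiener index is the sum of shortest-path distances (counted in bonds) over all pairs of heavy atoms in the hydrogen-suppressed graph. The sketch below computes it by breadth-first search; the adjacency-dictionary input format and the function name wiener_index are assumptions made for this example.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths (in bonds) over all unordered atom pairs.

    adjacency maps each atom index to the list of its bonded neighbours.
    """
    total = 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:                       # breadth-first search from source
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in distance:
                    distance[neighbour] = distance[atom] + 1
                    queue.append(neighbour)
        if len(distance) != len(adjacency):
            raise ValueError("graph must be connected")
        total += sum(distance.values())
    return total // 2                      # every pair was counted twice


# Hydrogen-suppressed carbon skeletons of two C4H10 isomers.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The two isomers have the same constitutional descriptors (same atoms, same MW) but different Wiener indexes, which is why topological descriptors add discriminating power.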
Molecular Descriptors
• van der Waals volume: the sum of the non-overlapping volumes of the van der Waals spheres of the atoms in the molecule
• Molecular surface: the area of the surface contour generated by rolling a probe sphere against the surface atoms of the molecule
Molecular Descriptors
• Molecular size vectors: define ranges for distances and angles
Molecular Descriptors for Large Data Sets
• Descriptors representing properties of complete molecules. Examples: LogP, molar refractivity
• Descriptors calculated from 2D graphs. Examples: topological indexes, 2D fingerprints
• Descriptors requiring 3D representations. Example: pharmacophore descriptors
Molecular Descriptors Calculated From 2D Structures
• Simple counts of features
  • Lipinski Rule of Five (H bonds, MW, etc.)
  • Number of ring systems
  • Number of rotatable bonds
• Not likely to discriminate sufficiently when used alone
• Combined with other descriptors for best effect
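A count-based filter such as the Lipinski Rule of Five can be sketched in a few lines once the descriptors are available. The thresholds below are the commonly quoted ones (MW at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 acceptors); they are not spelled out on the slide. The function name and the descriptor values in the usage lines are hypothetical and do not describe any real compound.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Return the list of Rule-of-Five criteria that a compound violates."""
    violations = []
    if mol_weight > 500:
        violations.append("molecular weight > 500")
    if log_p > 5:
        violations.append("logP > 5")
    if h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Hypothetical descriptor values, for illustration only.
small_compound = lipinski_violations(350.4, 2.1, 2, 5)
large_compound = lipinski_violations(720.0, 6.3, 3, 12)

print(small_compound)
print(large_compound)
```

Returning the list of violated criteria, rather than a single pass/fail flag, keeps the individual counts available for combination with other descriptors.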
Physicochemical Properties
• Hydrophobicity
  • LogP: the logarithm of the partition coefficient between n-octanol and water
  • ClogP (Leo and Hansch): based on a small set of values from a small set of simple molecules
  • BioByte: http://www.biobyte.com/
  • Daylight's MedChem Help page: http://www.daylight.com/dayhtml/databases/medchem/medchem-help.html
  • Isolating carbon: one not doubly or triply bonded to a heteroatom
Molecular Descriptor LogP
• Octanol-water partition coefficient: P = C(octanol) / C(water)
• log P behaves like a free energy, by analogy with ΔrG = -RT ln Keq
• Measures hydrophobic versus hydrophilic character: the larger P, the more hydrophobic the molecule
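The definition of P and its free-energy analogy can be made concrete with a short sketch. It assumes equilibrium concentrations in the same units for both phases and treats P as the equilibrium constant for water-to-octanol transfer, so that ΔG = -RT ln P; the function names, the gas-constant value, and the default temperature of 298.15 K are choices made for this example.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def log_p(conc_octanol, conc_water):
    """log10 of the octanol-water partition coefficient P = C(octanol) / C(water)."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


def transfer_free_energy(log_p_value, temperature=298.15):
    """Free energy (J/mol) of water-to-octanol transfer, -RT ln P."""
    # ln P = ln(10) * log10 P
    return -GAS_CONSTANT * temperature * math.log(10) * log_p_value


# A solute 100 times more concentrated in octanol than in water.
value = log_p(100.0, 1.0)
print(value)                        # 2.0
print(transfer_free_energy(value))  # negative: transfer to octanol is favoured
```

A positive log P corresponds to a negative transfer free energy, which is the quantitative sense in which a larger P means a more hydrophobic molecule.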
TCM Ingredient Classification by Molecular Descriptors J. Chem. Inf. Model., Vol. 47, No. 6, 2007
TCM Ingredient Classification by Molecular Descriptors
Classification of TCM ingredients of specific chemical classes by the decision tree method
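To give a feel for how a decision tree uses descriptors, the sketch below finds a single best split by Gini impurity; a full tree would apply the same search recursively to each side. This is a generic illustration and is not the procedure or the data of the cited study: the descriptor rows, the class labels, and the function names are all hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def best_split(rows, labels):
    """Return (weighted impurity, descriptor index, threshold) of the best split."""
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


# Toy data: each row is (nitrogen count, ring count); labels are made-up classes.
rows = [(0, 1), (0, 3), (0, 2), (1, 2), (2, 3), (1, 1)]
labels = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

print(best_split(rows, labels))
```

On this toy set the first descriptor separates the two classes perfectly at a threshold of 0.5, so the search returns an impurity of 0.0 for descriptor index 0.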
TCM Ingredient Classification by Molecular Descriptors
Distribution of TCM ingredients of specific chemical classes without using the decision tree method
Acknowledgement
• Current Group Members:
  • Computer-Aided Drug Design: CY Ung, XH Ma, XH Liu, Pankaj Kumar, F Zhu, X Liu, J Jia
  • Protein Function, Interaction, Network: HL Zhang, CY Ung, XH Ma, F Zhu, WK Teo, Z Shi
  • Databases and Servers: J Jia
  • Medicinal Herb: CY Ung, Pankaj Kumar, Cao Jinyi (undergraduate students)
  • Microarray and biomarkers: J Jia, ZQ Tang
• Former Members:
  • PhD: ZW Cao (Prof SCBIT, Tongji U), ZL Ji (Assoc Prof Xiamen U), X Chen (Assoc Prof Zhejiang U), CW Yap (Assist Prof NUS), LY Han (Postdoc NIH), CJ Zheng (Postdoc NIH), HH Lin (Postdoc Harvard), J Cui (Postdoc U Georgia), H Li (Postdoc Einstein College Med)
  • Research Fellow/Assistant: ZR Li (Assoc Prof SiChuan U), Y Xue (Prof SiChuan U), W Liu (Assoc Prof DUT), D Mi (Assoc Prof DUT), CZ Cai (Prof ChongQing U), DG Zhi (Postdoc, Berkeley)
  • MSc: Y.J. Guo (Postdoc NIH), L.Z. Sun (RA, U Tenn.), J. F. Wang (MSU), L.X. Yao (Columbia), S Ong (Washington U), H Zhou (local company), B Xie (local company)
  • BSc: W.K. Yeo (IMCB, Novartis)