
Molecular Descriptors


Albert_Lan

Presentation Transcript


  1. Molecular Descriptors C371 Fall 2004

  2. INTRODUCTION • Molecular descriptors are numerical values that characterize properties of molecules • Examples: • Physicochemical properties (empirical) • Values from algorithms, such as 2D fingerprints • Vary in complexity of encoded information and in compute time

  3. Descriptors for Large Data Sets • Descriptors representing properties of complete molecules • Examples: LogP, Molar Refractivity • Descriptors calculated from 2D graphs • Examples: Topological Indexes, 2D fingerprints • Descriptors requiring 3D representations • Example: Pharmacophore descriptors

  4. DESCRIPTORS CALCULATED FROM 2D STRUCTURES • Simple counts of features • Lipinski Rule of Five (H bonds, MW, etc.) • Number of ring systems • Number of rotatable bonds • Not likely to discriminate sufficiently when used alone • Combined with other descriptors for best effect
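A minimal sketch of how simple feature counts can be combined into a Rule of Five screen. The thresholds used here (molecular weight over 500, LogP over 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors) are the commonly quoted ones and are not given on the slide; the `FeatureCounts` class and the aspirin values are illustrative assumptions, and the counts are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class FeatureCounts:
    """Precomputed whole-molecule values (hypothetical container)."""
    mol_weight: float
    log_p: float
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(counts: FeatureCounts) -> int:
    """Count how many of the four commonly quoted limits are exceeded."""
    checks = [
        counts.mol_weight > 500,
        counts.log_p > 5,
        counts.h_bond_donors > 5,
        counts.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Approximate, illustrative values for aspirin.
aspirin = FeatureCounts(mol_weight=180.16, log_p=1.2,
                        h_bond_donors=1, h_bond_acceptors=4)
print(rule_of_five_violations(aspirin))
```

Because such counts are coarse, many different molecules share the same values, which is why the slide recommends combining them with other descriptors.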

  5. Physicochemical Properties • Hydrophobicity • LogP – the logarithm of the partition coefficient between n-octanol and water • ClogP (Leo and Hansch) – a fragment-based estimate built on a small set of fragment values derived from a small set of simple molecules • BioByte: http://www.biobyte.com/ • Daylight’s MedChem Help page • http://www.daylight.com/dayhtml/databases/medchem/medchem-help.html • Isolating carbon: a carbon not doubly or triply bonded to a heteroatom

  6. ACD Labs Calculated Properties • http://www.acdlabs.com • ACD Labs values now incorporated into the CAS Registry File for millions of compounds • I-Lab: http://ilab.acdlabs.com/ • Name generation • NMR prediction • Physical property prediction

  7. Molar Refractivity • $MR = \frac{n^2 - 1}{n^2 + 2} \cdot \frac{MW}{d}$, where n is the refractive index, d is density, and MW is molecular weight. • Measures the steric bulk of a molecule.
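As a worked example, molar refractivity can be computed directly from the refractive index n, the molecular weight MW, and the density d as (n² − 1)/(n² + 2) multiplied by MW/d. The sketch below assumes MW in g/mol and d in g/cm³, so the result is in cm³/mol; the function name and the rounded input values for water are illustrative assumptions.

```python
def molar_refractivity(refractive_index: float,
                       mol_weight: float,
                       density: float) -> float:
    """Return MR = (n^2 - 1) / (n^2 + 2) * MW / d."""
    n_squared = refractive_index ** 2
    return (n_squared - 1) / (n_squared + 2) * mol_weight / density


# Rounded literature-style values for water near room temperature (assumed).
mr_water = molar_refractivity(refractive_index=1.333,
                              mol_weight=18.015,
                              density=0.997)
print(round(mr_water, 2))
```

With these inputs the value comes out at roughly 3.7 cm³/mol, small as expected for a molecule with little steric bulk.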

  8. Topological Indexes • Single-valued descriptors calculated from the 2D graph of the molecule • Characterize structures according to size, degree of branching, and overall shape • Example: Wiener Index – counts the number of bonds between pairs of atoms and sums the distances between all pairs
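The Wiener index can be sketched with a breadth-first search over the hydrogen-suppressed graph. The adjacency-dictionary representation, the function name, and the two butane isomers below are illustrative assumptions, not part of the slides.

```python
from collections import deque


def wiener_index(adjacency: dict) -> int:
    """Sum of shortest bond-path lengths over all unordered atom pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in bonds.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in distance:
                    distance[neighbour] = distance[atom] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    # Every pair was counted twice, once from each end.
    return total // 2


# Hydrogen-suppressed carbon skeletons.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane), wiener_index(isobutane))
```

The branched isomer gets the smaller value, which illustrates how the index responds to degree of branching.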

  9. Topological Indexes: Others • Molecular Connectivity Indexes • Randić (et al.) branching index • Defines the “degree” of an atom as the number of adjacent non-hydrogen atoms • The bond connectivity value is the reciprocal of the square root of the product of the degrees of the two atoms in the bond. • The branching index is the sum of the bond connectivities over all bonds in the molecule. • Chi indexes – introduce valence values to encode sigma, pi, and lone pair electrons
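The branching index described above can be sketched as follows: each bond contributes the reciprocal of the square root of the product of the degrees of its two atoms, and the contributions are summed over all bonds. The adjacency-dictionary input (hydrogen-suppressed, integer atom labels) and the function name are assumptions made for illustration.

```python
from math import sqrt


def randic_branching_index(adjacency: dict) -> float:
    """Sum of 1 / sqrt(degree_a * degree_b) over all bonds a-b."""
    degree = {atom: len(neighbours) for atom, neighbours in adjacency.items()}
    total = 0.0
    for atom, neighbours in adjacency.items():
        for neighbour in neighbours:
            if atom < neighbour:  # visit each bond once
                total += 1.0 / sqrt(degree[atom] * degree[neighbour])
    return total


# Hydrogen-suppressed carbon skeletons.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(round(randic_branching_index(n_butane), 3))
print(round(randic_branching_index(isobutane), 3))
```

For n-butane the three bonds contribute 1/√2, 1/2, and 1/√2; for isobutane each of the three bonds contributes 1/√3, so the branched isomer has the lower index.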

  10. Kappa Shape Indexes • Characterize aspects of molecular shape • Compare the molecule with the “extreme shapes” possible for that number of atoms • Range from linear molecules to the completely connected graph
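The comparison with extreme shapes can be illustrated with the first-order kappa index. The formula used here is not given on the slide; it is assumed to be the commonly cited form κ1 = 2·Pmax·Pmin / P², where P is the number of bonds in the hydrogen-suppressed graph, Pmin = A − 1 is the bond count of the linear (acyclic) extreme, and Pmax = A(A − 1)/2 is the bond count of the completely connected graph on A atoms. This simplifies to A(A − 1)²/P².

```python
def kappa_1(n_atoms: int, n_bonds: int) -> float:
    """First-order kappa index, assuming kappa1 = A * (A - 1)**2 / P**2."""
    return n_atoms * (n_atoms - 1) ** 2 / n_bonds ** 2


# Four heavy atoms arranged in three different ways.
print(kappa_1(4, 3))  # chain: the acyclic extreme (3 bonds)
print(kappa_1(4, 4))  # four-membered ring (4 bonds)
print(kappa_1(4, 6))  # completely connected graph (6 bonds)
```

Under this assumed form, an acyclic molecule gives κ1 equal to its atom count, and the value falls as more cycles are added.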

  11. 2D Fingerprints • Two types: • One based on a fragment dictionary • Each bit position corresponds to a specific substructure fragment • Fragments that occur infrequently may be more useful • Another based on hashed methods • Not dependent on a pre-defined dictionary • Any fragment can be encoded • Originally designed for substructure searching, not for molecular descriptors
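A minimal sketch of the hashed approach: enumerate every linear atom path up to a fixed length and hash each one to a bit position, so no fragment dictionary is needed. This is a simplified illustration, not the scheme of any particular vendor; it ignores bond orders, and the function names, the 64-bit length, and the use of SHA-256 as the hash are all assumptions.

```python
import hashlib


def linear_paths(atoms: dict, adjacency: dict, max_atoms: int = 3) -> set:
    """Return all linear fragments of up to max_atoms atoms as label strings."""
    fragments = set()

    def extend(path):
        labels = [atoms[i] for i in path]
        # A path and its reverse are the same fragment; keep one spelling.
        fragments.add(min("-".join(labels), "-".join(reversed(labels))))
        if len(path) < max_atoms:
            for neighbour in adjacency[path[-1]]:
                if neighbour not in path:
                    extend(path + [neighbour])

    for start in adjacency:
        extend([start])
    return fragments


def hashed_fingerprint(atoms: dict, adjacency: dict,
                       n_bits: int = 64, max_atoms: int = 3) -> set:
    """Return the set of bit positions switched on for this molecule."""
    bits = set()
    for fragment in linear_paths(atoms, adjacency, max_atoms):
        digest = hashlib.sha256(fragment.encode("utf-8")).digest()
        bits.add(int.from_bytes(digest[:4], "big") % n_bits)
    return bits


# Hydrogen-suppressed ethanol (C-C-O) and ethane (C-C).
ethanol_atoms = {0: "C", 1: "C", 2: "O"}
ethanol_bonds = {0: [1], 1: [0, 2], 2: [1]}
ethane_atoms = {0: "C", 1: "C"}
ethane_bonds = {0: [1], 1: [0]}

print(sorted(linear_paths(ethanol_atoms, ethanol_bonds)))
print(sorted(hashed_fingerprint(ethanol_atoms, ethanol_bonds)))
```

Every fragment of ethane also occurs in ethanol, so every bit set for ethane is also set for ethanol. That subset property is what makes such fingerprints useful as a substructure screen, and because different fragments can collide on the same bit, a bit match can only suggest, not prove, that a fragment is present.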

  12. Atom-Pair Descriptors • Encode all pairs of atoms in a molecule • Include the length of the shortest bond-by-bond path between them • Each atom is described by its elemental type plus the number of non-hydrogen atoms attached to it and its number of π-bonding electrons
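Atom-pair generation can be sketched as follows. Each atom is typed by element, number of attached non-hydrogen atoms, and number of π-bonding electrons, and each pair is stored with its shortest path length. The path length is measured here in bonds, which is an assumption (some formulations count atoms instead); the function names and input layout are also illustrative.

```python
from collections import Counter, deque
from itertools import combinations


def bond_distances(adjacency: dict, source) -> dict:
    """Shortest path length in bonds from source to each reachable atom."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in distance:
                distance[neighbour] = distance[atom] + 1
                queue.append(neighbour)
    return distance


def atom_pairs(elements: dict, pi_electrons: dict, adjacency: dict) -> Counter:
    """Count (atom type, distance, atom type) triples over all atom pairs."""
    atom_type = {a: (elements[a], len(adjacency[a]), pi_electrons[a])
                 for a in adjacency}
    distance = {a: bond_distances(adjacency, a) for a in adjacency}
    pairs = Counter()
    for a, b in combinations(adjacency, 2):
        if b not in distance[a]:
            continue  # atoms in disconnected fragments have no path
        first, second = sorted([atom_type[a], atom_type[b]])
        pairs[(first, distance[a][b], second)] += 1
    return pairs


# Hydrogen-suppressed ethanol: C0-C1-O2, no pi electrons.
elements = {0: "C", 1: "C", 2: "O"}
pi_electrons = {0: 0, 1: 0, 2: 0}
adjacency = {0: [1], 1: [0, 2], 2: [1]}

for pair, count in sorted(atom_pairs(elements, pi_electrons, adjacency).items()):
    print(pair, count)
```

Sorting the two atom types makes each pair independent of the order in which the atoms are visited.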

  13. BCUT Descriptors • Designed to encode atomic properties that govern intermolecular interactions • Used in diversity analysis • Encode atomic charge, atomic polarizability, and atomic hydrogen bonding ability

  14. DESCRIPTORS BASED ON 3D REPRESENTATIONS • Require the generation of 3D conformations • Can be computationally time consuming with large data sets • Usually must take into account conformational flexibility • 3D fragment screens encode spatial relationships between atoms, ring centroids, and planes

  15. Pharmacophore Keys & Other 3D Descriptors • Based on atoms or substructures thought to be relevant for receptor binding • Typically include hydrogen bond donors and acceptors, charged centers, aromatic ring centers and hydrophobic centers • Others: 3D topographical indexes, geometric atom pairs, quantum mechanical calculations for HOMO and LUMO

  16. DATA VERIFICATION AND MANIPULATION • Data spread and distribution • Coefficient of variation (standard deviation divided by the mean) • Scaling (standardization): making sure that each descriptor has an equal chance of contributing to the overall analysis • Correlations • Reducing the dimensionality of a data set: Principal Components Analysis
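A short sketch of two of the steps above, using only the standard library: the coefficient of variation (standard deviation divided by the mean) and standardization to zero mean and unit standard deviation. The sample standard deviation is used here, which is an assumption, and the descriptor values are made up for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(values: list) -> float:
    """Standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def standardize(values: list) -> list:
    """Rescale values to zero mean and unit (sample) standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    return [(value - centre) / spread for value in values]


# Two made-up descriptors on very different numeric scales.
mol_weights = [100.0, 200.0, 300.0]
log_p_values = [1.0, 2.0, 3.0]

print(coefficient_of_variation(mol_weights))
print(standardize(mol_weights))
print(standardize(log_p_values))
```

After standardization the two descriptors take identical values even though the raw molecular weights are a hundred times larger, so neither dominates a distance or similarity calculation purely because of its units.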
