Handling ligands with PRODRG Daan van Aalten Division of Biological Chemistry and Drug Discovery College of Life Sciences
PRODRG - why? • Early 1990s - no software to generate topologies for non-macromolecular entities • Manual topology generation is time consuming and error prone (but instructive) • Small molecule coordinate generators essentially only commercially available
PRODRG - why? • For small molecules, we need to go from imagination/some chemical info to a correct topology and optimised coordinates in seconds • + Topologies for SHELX, REFMAC5, CNS, O, TNT, …
PRODRG - why? • Citrate (1AJ8; 1997; 1.9 Å) • NADP+ (1DDI; 1999; 2.5 Å) • Cyclohexylamine (1PPA; 1991; 2.0 Å) • Diphosphate (1N5L; 2002; 2.3 Å) • Sulphate (1DW9; 1999; 1.7 Å) • Ethylene glycol (1JKV; 2001; 1.4 Å)
PRODRG History • Version 1 (1995) • Started as a DRuG PROgram in GROMOS87 • Takes PDB file and generates ‘MOLDES’ (SMILES-like 1D string) and MD topologies • Version 2 (2004) • Many additional input formats • Many additional output formats, including topologies for crystallographic software • Version 2.5 (2005) • Internal all-atom representation
PRODRG History • Details covered in two publications • Webserver (~300 runs/day) with short FAQ
PRODRG Guts • Essentially FORTRAN (30000 lines) with some supporting C (5000 lines) • Compiles well on all major platforms • Few dependencies (GROMACS for coordinate generation)
What is PRODRG? • Generates information about small molecules • Molecular description: atomic coordinates, chemical types, connectivity, bond orders / aromaticity, hybridisation, formal charges, atomic charges, force field parameters, hydrogen atoms, free torsions, hydrogen bonding • [Diagram labels: PDB file, Molfile, Human; PRODRG; Model building & refinement, Molecular dynamics, DB lookups & property pred., Docking & analysis, Visualisation]
How does PRODRG work? • Steps: Analysis of input → Initial data gathering → Addition of hydrogens → Atom reordering → Topology generation → Formal and partial charges → Additional molecule data → Output • Fixed order of steps is bad • Input analysis is rather rude: deletes hydrogens, ignores bond order information
How does PRODRG work? • Most steps use ‘chemical pattern matching’ • Example: hydrogen generation, where the pattern matched by atom x selects one of the actions: Add 1+sp(x)-ncon(x) hydrogens / Do nothing / Add 1 hydrogen
How does PRODRG work? • Currently all Hs generated by 17 ‘rules’ • Chemical knowledge in data, not code • More flexible • Potentially user-configurable
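The idea of keeping chemical knowledge in data rather than code can be illustrated with a small sketch. This is not PRODRG's rule set or rule format: the three rules, the dictionary layout, and the names `HYDROGEN_RULES` and `hydrogens_to_add` are hypothetical, chosen only to show the three actions from the hydrogen generation example (fill to 1+sp(x)-ncon(x), add one, do nothing) being driven by a table.

```python
from typing import Dict, List, Union

Atom = Dict[str, Union[str, int]]

# Hypothetical rule table (NOT the 17 PRODRG rules). Each rule is plain data:
# every key except "action" is a property the atom must match exactly.
HYDROGEN_RULES: List[Dict[str, Union[str, int]]] = [
    {"element": "C", "action": "fill"},                       # add 1 + sp - ncon
    {"element": "O", "sp": 3, "ncon": 1, "action": "add_one"},  # hydroxyl-like O
    {"action": "nothing"},                                    # fallback: matches anything
]


def rule_matches(rule: Dict[str, Union[str, int]], atom: Atom) -> bool:
    """A rule matches when all of its non-action keys equal the atom's values."""
    return all(atom.get(key) == value
               for key, value in rule.items() if key != "action")


def hydrogens_to_add(atom: Atom, rules=HYDROGEN_RULES) -> int:
    """Return the hydrogen count chosen by the first matching rule."""
    for rule in rules:
        if not rule_matches(rule, atom):
            continue
        if rule["action"] == "fill":
            # sp is the hybridisation (1, 2 or 3), ncon the number of neighbours
            return max(0, 1 + int(atom["sp"]) - int(atom["ncon"]))
        if rule["action"] == "add_one":
            return 1
        return 0
    return 0


methyl_carbon = {"element": "C", "sp": 3, "ncon": 1}
aromatic_carbon = {"element": "C", "sp": 2, "ncon": 2}
carbonyl_oxygen = {"element": "O", "sp": 2, "ncon": 1}
hydroxyl_oxygen = {"element": "O", "sp": 3, "ncon": 1}
```

Because the rules are data, changing protonation behaviour in this sketch means editing the table, not the matching function, which is the flexibility the slide points at.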
Limitations • Supported atom types limited • C,H,N,O,P,S,F,Cl,Br,I only • Other chemical limitations • No more than 4 connections/atom • Standard version limited to <=300 atoms • Ignoring hydrogens and bond types may lead to unexpected results • (Apolar hydrogens as second-class atoms) • SMILES not yet implemented (but trivial)
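The limits listed above can be checked before submitting a molecule. The sketch below is a stand-alone pre-flight check, not part of PRODRG; the function name `check_limits` and the input layout (a list of element symbols plus a list of bonded index pairs) are assumptions, and whether the 300-atom limit counts hydrogens is not stated on the slide, so here every atom passed in is counted.

```python
from collections import Counter
from typing import List, Tuple

# Values taken from the limitations slide
ALLOWED_ELEMENTS = {"C", "H", "N", "O", "P", "S", "F", "Cl", "Br", "I"}
MAX_CONNECTIONS = 4
MAX_ATOMS = 300  # assumption: counts every atom passed in


def check_limits(elements: List[str], bonds: List[Tuple[int, int]]) -> List[str]:
    """Return human-readable problems; an empty list means no limit is hit."""
    problems = []
    if len(elements) > MAX_ATOMS:
        problems.append(f"{len(elements)} atoms exceeds the limit of {MAX_ATOMS}")
    for index, element in enumerate(elements):
        if element not in ALLOWED_ELEMENTS:
            problems.append(f"atom {index}: unsupported element {element}")
    # Count how many bonds touch each atom
    degree = Counter()
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    for index, count in sorted(degree.items()):
        if count > MAX_CONNECTIONS:
            problems.append(f"atom {index}: {count} connections (max {MAX_CONNECTIONS})")
    return problems


# Sulphate: S bonded to four O atoms, within all limits
sulphate = (["S", "O", "O", "O", "O"], [(0, 1), (0, 2), (0, 3), (0, 4)])
# Six-coordinate sulphur: too many connections on atom 0
hexafluoride = (["S"] + ["F"] * 6, [(0, i) for i in range(1, 7)])
```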
Basic usage: web server • Four easy steps: • 1. Go to http://davapc1.bioch.dundee.ac.uk/programs/prodrg • 2. Paste input • 3. Edit settings (Chirality restraints? Reduced charges? Coordinates?) • 4. Run it • Success!
PRODRG inputs • PDB coordinates • MDL molfile • MOLDES (SMILES-like 1D string) • JME editor (web server) • “TEXT” input
Text drawings • Atoms represented by their element symbols • Connected by bonds • Single: - or | • Double: = or " • Triple: # • Change case of symbol to invert chirality
L-Tyr:
    N   C-C
    |   " "
    c-C-C C-O
    |   | |
    C=O C=C
    |
    O
D-Tyr:
    N   C-C
    |   " "
    C-C-C C-O
    |   | |
    C=O C=C
    |
    O
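To make the text-drawing conventions concrete, here is a minimal sketch of how such a drawing could be read into atoms and bonds. It is not PRODRG's parser: the function name `parse_text_drawing` is hypothetical, only single-letter element symbols are handled, and the grid layout of the L-Tyr drawing (atoms on even columns, bond symbols between them) is an assumption about how the drawing is aligned.

```python
from typing import Dict, List, Tuple

HORIZONTAL_BONDS = {"-": 1, "=": 2, "#": 3}  # join the atoms left and right
VERTICAL_BONDS = {"|": 1, '"': 2}            # join the atoms above and below

# Assumed layout of the L-Tyr drawing; the lowercase 'c' marks the
# stereocentre whose chirality is inverted relative to the uppercase form.
L_TYR = "\n".join([
    'N   C-C',
    '|   " "',
    'c-C-C C-O',
    '|   | |',
    'C=O C=C',
    '|',
    'O',
])


def parse_text_drawing(drawing: str):
    """Return (elements, inverted flags, bonds as (i, j, order))."""
    rows = drawing.split("\n")
    width = max(len(row) for row in rows)
    grid = [row.ljust(width) for row in rows]

    index_at: Dict[Tuple[int, int], int] = {}
    elements: List[str] = []
    inverted: List[bool] = []
    for r, line in enumerate(grid):
        for c, ch in enumerate(line):
            if ch.isalpha():
                index_at[(r, c)] = len(elements)
                elements.append(ch.upper())
                inverted.append(ch.islower())

    bonds: List[Tuple[int, int, int]] = []
    for r, line in enumerate(grid):
        for c, ch in enumerate(line):
            if ch in HORIZONTAL_BONDS:
                ends = (index_at.get((r, c - 1)), index_at.get((r, c + 1)))
                order = HORIZONTAL_BONDS[ch]
            elif ch in VERTICAL_BONDS:
                ends = (index_at.get((r - 1, c)), index_at.get((r + 1, c)))
                order = VERTICAL_BONDS[ch]
            else:
                continue
            if None in ends:
                raise ValueError(f"dangling bond at row {r}, column {c}")
            bonds.append((ends[0], ends[1], order))
    return elements, inverted, bonds


elements, inverted, bonds = parse_text_drawing(L_TYR)
```

For the L-Tyr drawing this yields 13 heavy atoms and 13 bonds (one ring), with a single atom flagged as case-inverted.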
PRODRG outputs • PDB (generated/minimized) coordinates (with/without hydrogens, with proper atom names for protein/sugars/DNA), but GIGO principle applies • Quality control on input coordinates vs topology • WHAT IF topology - accurate protein-ligand H-bonds • CNS/REFMAC/TNT/SHELX topology (including PTM amino acid building blocks) • GROMOS/GROMACS/OPLS topologies • Consistent topology from crystal -> publication
Helping (or kicking) PRODRG • Additional commands/hints in input file: • PATCH (hybridisation) • INSHYD and DELHYD • PATCH (chirality) • PATCH (torsions) • CPNAME
Hybridisation hints • PATCH <atom> <number> • Useful if PDB analysis did not quite work • Lets you nudge PRODRG in the right direction • Example: PATCH NAG 21
        O
        "
    C=C-C
    |   |
    C-C=N
    "
    O
PRODRG> WARNING: multiplicity of generated molecule is not 1.
PRODRG> WARNING: bond type assignment failed at CAF .
Adding/removing hydrogens • INSHYD <atom> • DELHYD <atom> • Lets you override default protonation • Often not actually what you want • Example (drawing rows, top to bottom: C-C=O / | / O) with INSHYD OAD:
PRODRG> Cannot assign type to atom ' OAD'.
ERRDRG> Error in GROMOS atom names/types.
PRODRG> Drug topology not made, sorry!
• Same drawing with a hybridisation hint: PATCH OAD 3
Modifying chirality • PATCH <atom> -1 • Inverts stereocenter <atom>, useful for PDB input • PATCH <atom> <pattern> • ‘Absolute’ chirality for certain classes of molecules • Example: the drawing below with PATCH CA L gives L-Tyr, with PATCH CA D gives D-Tyr
    N   C-C
    |   " "
    C-C-C C-O
    |   | |
    C=O C=C
    |
    O
Adding dihedral restraints • PATCH <atom> ><pattern> • After energy minimisation (EM), pyranose rings are often found in undesirable conformations • PATCH statement introduces additional dihedral restraints to fix the conformation • Example: α-D-Glucose
    C-C-O-C-O
    | |   |
    O C-C-C
      | | |
      O O O
    PATCH C1 ALPHA
    PATCH C2 D
    PATCH C3 L
    PATCH C4 D
    PATCH C5 D
    PATCH C1 >4C1
Building • PRODRG can add molecular fragments to existing molecules: BUILD <atom> <fragment> • Example: L-Ala → (BUILD CB PHI) → L-Phe → (BUILD CZ OH) → L-Tyr
Building • Allows quick alterations to existing molecules • Preserves coordinates of root structure • Fragment libraries contain text drawings, which are easy to define:
    FRAG OH
    X-O
    FRAG PHI
    X-C-C=C
      "   |
      C-C=C
    FRAG ...
Building • Can also be used to generate oligopeptides and oligosaccharides, using BUILD and START <fragment> • Example (figure labels: β-D-Glc, α-D-Man, β-D-NAG):
    START bdGLC
    BUILD O4 adMAN1
    BUILD O0F bdNAG1
    PATCH C1 >4C1
    PATCH C0B >4C1
    PATCH C1B >4C1
PRODRG IP issues • Currently PRODRG freely accessible for academics through webserver and binaries • Commercial licenses (~10) have provided useful income that contributes to (but does not cover) PRODRG development / maintenance • Currently no PRODRG grant funding (previously WT senior fellowship)
PRODRG - what next • Make PRODRG as accessible as possible • Release of source? • Keen to incorporate/integrate with CCP4 but this will require some development • Need to incorporate SMILES • Make PDB input foolproof by quality control • Move away from the united-atom-with-hydrogen-addition model • Link up with GUI - not only drawing but also “building” • Link up with coot (build-place-fit ligand at pointer)
Acknowledgements • Alexander Schüttelkopf • PRODRG users