Using X-ray structures for bioinformatics Robbie P. Joosten Netherlands Cancer Institute Autumnschool 2013
Introduction Structures in bioinformatics • Understand biology • Direct interpretation • Data mining • Homology modeling • Drug design • Molecular dynamics Basic rule: Better structures → Better results
Introduction Right structure(s) for the job • Selection: find (a number of) PDB entries • Validation: check the quality of your selection • Optimisation: maximise the quality of your selection Focus on X-ray structures
Selection X-ray structures have a history • Protein expression • Crystallisation • X-ray diffraction experiment • Model building and refinement • Deposition at the PDB All these steps affect the final PDB file
History Protein expression A ‘construct’ is made • Partial proteins • E.g. only extracellular domain of membrane protein • Frankenstein proteins • Fusion proteins or chimeras • Mutants are introduced • Some by accident! • Poly-histidine tags added for purification • Altered glycosylation state • Large sugars hamper crystallisation
History Crystallisation The protein stacks regularly to form a crystal • Protein still functional in the crystal • Much solvent in the crystal (~40%) • Some residues can move • Disorder: missing loops/side chains • Alternate conformation
History Crystallisation Beware of crystal packing • One copy of the protein can influence the next
History Crystallisation Chemicals are used for crystallisation • Buffers to stabilise the pH • Precipitants • Change solubility of the protein • Neutralise local charges • Bind water • High concentrations are used • Compounds compete with natural ligands • Examples: • Polyethylene glycol (PEG) • Ammonium sulphate
History Crystallisation Beware of the crystallisation conditions
History X-ray diffraction Typical experiment [figure labels: X-ray source, Detector]
History X-ray diffraction • X-rays interact with electrons • Atoms with few electrons (H, Li) do not diffract well • X-rays cause damage to the protein • Acidic groups (Asp and Glu) can be destroyed • Disulphide bridges are broken • Hydrogens are stripped • Cooling crystals in liquid nitrogen helps • Glycerol added to the crystal!
History X-ray diffraction • We are not using a microscope • We don’t measure everything we need • Measured: diffraction intensities • Missing: phases X-ray diffraction gives an indirect and incomplete measurement
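Why the missing phases matter can be illustrated with a toy calculation. The sketch below is not real crystallographic code: it assumes a hypothetical one-dimensional "unit cell" of 16 grid points containing two point atoms, and uses a plain discrete Fourier transform in place of a structure factor calculation. It rebuilds the density once with amplitudes and phases, and once with the amplitudes alone.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform of a sequence (toy stand-in for structure factors)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]

def inverse_dft(coeffs):
    """Inverse transform; returns the real part, i.e. the reconstructed density."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

# Hypothetical 1D density: two point "atoms" of different weight.
density = [0.0] * 16
density[3] = 8.0
density[10] = 6.0

structure_factors = dft(density)
amplitudes = [abs(f) for f in structure_factors]      # recoverable from measured intensities
phases = [cmath.phase(f) for f in structure_factors]  # not measured in the experiment

# Reconstruction with both amplitudes and phases recovers the density.
with_phases = inverse_dft([a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)])
# Reconstruction with amplitudes only (all phases set to zero) does not.
without_phases = inverse_dft(amplitudes)

max_error_with_phases = max(abs(a - b) for a, b in zip(density, with_phases))
max_error_without_phases = max(abs(a - b) for a, b in zip(density, without_phases))

print(f"max error with phases:    {max_error_with_phases:.2e}")
print(f"max error without phases: {max_error_without_phases:.2f}")
```

With the phases the two atoms come back at grid points 3 and 10; without them the map is dominated by a peak at the origin, which is why phases have to be obtained by other means before a model can be built.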
History Model building and refinement Iterative process [diagram labels: Measured X-ray diffraction data, Initial phases, FT, Model building, FT]
History Model building and refinement Two types of maps • Regular electron density map (2mFo-DFc) • Difference map (mFo-DFc)
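The map names are shorthand for the Fourier coefficients used to calculate each map. As a minimal sketch for a single reflection, assuming Fo and Fc are the observed and calculated structure factor amplitudes, m is the figure of merit and D is a model-error scale factor (the function name and the numbers are illustrative only, and both maps are combined with the calculated phases in practice):

```python
def map_coefficients(f_obs, f_calc, m, d):
    """Return the amplitude coefficients of the two map types for one reflection.

    f_obs  : observed structure factor amplitude (Fo)
    f_calc : amplitude calculated from the current model (Fc)
    m      : figure of merit (assumed to lie between 0 and 1)
    d      : scale factor accounting for errors in the model (D)
    """
    regular = 2 * m * f_obs - d * f_calc   # 2mFo-DFc: the "regular" electron density map
    difference = m * f_obs - d * f_calc    # mFo-DFc: the difference map
    return regular, difference

# Illustrative numbers, not taken from any real data set.
regular, difference = map_coefficients(f_obs=100.0, f_calc=80.0, m=0.75, d=0.5)
print(regular, difference)
```

Positive difference density marks places where the data call for more electrons than the model provides; negative difference density marks atoms that the data do not support.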
History Model building and refinement Fitting atoms to the ED map and trying to remove difference density peaks
History Model building and refinement • Requires skill and experience • Requires time and patience • Requires good software Lack of any of these can be seen in the final PDB file
History Deposition at the PDB • Both coordinates and experimental X-ray data are deposited • PDB standardises files and adds annotation • Sometimes things go wrong
History Deposition at the PDB LINKs between alternate conformations
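LINK records that join atoms from different alternate conformations can be spotted automatically. The sketch below assumes the fixed-column layout of LINK records in the legacy PDB file format (atom name, alternate location indicator, residue name, chain and residue number for each partner). The helper `make_link_record` and the two cysteine records are made up for illustration; only the ACE/PTH record comes from the 1a1a example.

```python
def make_link_record(atom1, alt1, res1, chain1, seq1, atom2, alt2, res2, chain2, seq2):
    """Build a LINK line with fields in fixed PDB-format columns (illustration only)."""
    def partner(atom, alt, res, chain, seq):
        # atom name (4), altLoc (1), residue name (3), blank, chain (1), number (4), iCode (1)
        return f"{atom:<4}{alt:1}{res:>3} {chain:1}{seq:>4} "
    return ("LINK" + " " * 8 + partner(atom1, alt1, res1, chain1, seq1)
            + " " * 15 + partner(atom2, alt2, res2, chain2, seq2))

def parse_link(line):
    """Extract both partners of a LINK record from their fixed columns."""
    def partner(start):
        return {
            "atom": line[start:start + 4].strip(),
            "altloc": line[start + 4].strip(),
            "residue": line[start + 5:start + 8].strip(),
            "chain": line[start + 9],
            "number": int(line[start + 10:start + 14]),
        }
    return partner(12), partner(42)

def links_between_altlocs(lines):
    """Return LINK records whose partners carry different alternate location labels."""
    suspicious = []
    for line in lines:
        if not line.startswith("LINK  "):
            continue
        first, second = parse_link(line)
        if first["altloc"] and second["altloc"] and first["altloc"] != second["altloc"]:
            suspicious.append(line)
    return suspicious

records = [
    make_link_record("SG", "A", "CYS", "A", 25, "SG", "B", "CYS", "A", 80),   # made up: A linked to B
    make_link_record("SG", "A", "CYS", "A", 30, "SG", "A", "CYS", "A", 95),   # made up: A linked to A
    make_link_record("C", "", "ACE", "C", 100, "N", "", "PTH", "C", 101),     # no alternates
]

flagged = links_between_altlocs(records)
for line in flagged:
    print(line)
```

A flagged record is a reason to inspect the model, not proof of an error; whether a bond makes chemical sense still has to be judged against the density.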
History Deposition at the PDB Un-biological LINKs (in 1a1a)
LINK C ACE C 100 N PTH C 101
LINK C PTH C 101 N GLU C 102
LINK CF PTH C 101 OG SER A 188
LINK N DIP C 103 C GLU C 102
LINK C ACE D 100 N PTH D 101
LINK C PTH D 101 N GLU D 102
LINK N DIP D 103 C GLU D 102
Think of what happened to the structure before you downloaded it
Validation X-ray specific validation Use the experimental data • Resolution says very little about the structure • (free) R-factor gives the overall fit of the structure to the experimental data • For biological interpretation more detail is needed Use the maps
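The R-factor summarises the disagreement between observed and calculated structure factor amplitudes, and the free R-factor is the same quantity over reflections that were kept out of refinement. A minimal sketch, assuming the calculated amplitudes are already on the same scale as the observed ones (the amplitudes and free flags below are invented for illustration):

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|); lower means a better fit to the data."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Invented amplitudes; the last reflection is flagged as part of the "free" test set.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 36.0]
is_free = [False, False, False, True]

work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]

r_work = r_factor([o for o, _ in work], [c for _, c in work])
r_free = r_factor([o for o, _ in free], [c for _, c in free])
print(f"R-work = {r_work:.4f}, R-free = {r_free:.4f}")
```

Because both values are sums over all reflections, they say nothing about which residues fit well and which do not, which is the reason for moving on to the maps.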
Validation X-ray specific validation Which is the better structure of berenil bound to DNA?
Validation X-ray specific validation The real-space R-factor (RSR) • A per-residue score of how well the atoms fit the map • Works like the R-factor (lower is better)
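A common definition of the real-space R-factor compares the observed and calculated density on the map grid points around a residue: RSR = Σ|ρobs − ρcalc| / Σ|ρobs + ρcalc|. The sketch below assumes the density values near each residue have already been extracted and scaled; the numbers are invented and the function name is illustrative.

```python
def real_space_r(rho_obs, rho_calc):
    """RSR over the grid points around one residue; lower is better."""
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return numerator / denominator

# Invented density values at three grid points around each residue.
well_fitted = real_space_r(rho_obs=[1.0, 0.8, 0.6], rho_calc=[0.9, 0.8, 0.7])
poorly_fitted = real_space_r(rho_obs=[0.1, 0.0, 0.2], rho_calc=[0.9, 0.8, 0.7])

print(f"well fitted residue:   RSR = {well_fitted:.2f}")
print(f"poorly fitted residue: RSR = {poorly_fitted:.2f}")
```

The second residue has model density where the map shows almost none, so its score is much higher than that of the first.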
Validation X-ray specific validation Maps can help distinguish the good and bad bits of a structure
Validation Things you can find in maps • Poorly fitted side-chains • Evil peptides
Validation Things you can find in maps The wrong drug
Validation Things you can find in maps • Sequence error K -> R • Accidental mutant • Also a missing sulfate
Validation Things you can find in maps Missing water Missing alternate conformation
Validation Checking maps • Visualisation in Coot • http://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/ • Get maps and real-space R values from the Electron Density Server • http://eds.bmc.uu.se/eds/index.html • Direct interface with Coot • Get maps and updated models from PDB_REDO Practical session
Optimisation Structures in the PDB • Solved by a diverse group of scientists • People make errors & gain experience • Since 1976 • Structures are not updated • Solved with the methods of their era • Methods improve over time Structures in the PDB do not represent the best we can do NOW
Optimisation Improve structures in PDB • Take structure + experimental data • Use latest X-ray crystallography methods • Decision making: use case-specific methods • Create new methods when needed • Improve model quality • Fit with experimental data • Geometric quality • Fix errors PDB_REDO
Optimisation PDB_REDO method Step 1: prepare data • Clean-up structure and X-ray data • Data mining Step 2: establish baseline • Fit with experimental data (R-factors) • Geometric quality • Validation with WHAT_CHECK
Optimisation PDB_REDO method Step 3: re-refine structure (with Refmac) • Improve fit with experimental data • Use restraints to improve geometric quality • Improve description of protein dynamics • Concerted movement of groups of atoms (TLS) • Anisotropic movement of individual atoms
Optimisation PDB_REDO method Step 4: rebuild structure • Delete nonsense waters • Flip peptide planes • Rebuild side-chains • Add missing ones • Optimise H-bonding Step 5: validate structure • Geometry • Density map fit • Ligand interactions
Availability PDB_REDO databank • www.cmbi.ru.nl/pdb_redo • > 72,000 structures (98%) • Detailed methods & reprints • Directly in molecular graphics software • YASARA • CCP4mg • Coot (needs plugin) • PyMOL (needs plugin) • Linked via PDBe & RCSB
Optimisation Does it work? (12,000 structures) • Improved fit with the data • Better geometry
Optimisation MolProbity validation (1eoi): PDB vs. PDB_REDO
Optimisation Electrostatics calculations • ‘Missing’ positive lysine atoms distort electrostatics calculations • Adding missing atoms correctly describes C-terminus interaction with side chains
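Truncated lysines can be found before running an electrostatics calculation by checking which heavy atoms are present in the ATOM records. The sketch below assumes the fixed-column layout of ATOM records in the legacy PDB file format and only looks at lysine; the helper `make_atom_record` and the two residues are invented for illustration.

```python
LYS_HEAVY_ATOMS = {"N", "CA", "C", "O", "CB", "CG", "CD", "CE", "NZ"}

def make_atom_record(serial, name, res, chain, seq):
    """Build the start of an ATOM line with fields in fixed columns (illustration only)."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{seq:>4}"

def incomplete_lysines(lines):
    """Map (chain, residue number) to the sorted list of missing lysine heavy atoms."""
    seen = {}
    for line in lines:
        if line.startswith("ATOM") and line[17:20] == "LYS":
            key = (line[21], int(line[22:26]))
            seen.setdefault(key, set()).add(line[12:16].strip())
    return {key: sorted(LYS_HEAVY_ATOMS - atoms)
            for key, atoms in seen.items()
            if LYS_HEAVY_ATOMS - atoms}

# Invented example: Lys A 10 is complete, Lys A 42 was truncated after CB.
complete = ["N", "CA", "C", "O", "CB", "CG", "CD", "CE", "NZ"]
truncated = ["N", "CA", "C", "O", "CB"]
lines = [make_atom_record(i + 1, name, "LYS", "A", 10) for i, name in enumerate(complete)]
lines += [make_atom_record(i + 10, name, "LYS", "A", 42) for i, name in enumerate(truncated)]

missing = incomplete_lysines(lines)
for (chain, number), atoms in missing.items():
    print(f"LYS {chain} {number} is missing: {', '.join(atoms)}")
```

A lysine without its NZ atom carries no positive charge in the calculation, so every residue reported here is a place where the computed electrostatics may be misleading.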
Optimisation Protein-ligand interaction • Wrong peptide plane in peptide ligand • Fixed by PDB_REDO • Better understanding of H-bonds in the interaction
Optimisation Protein-protein interaction • Packing interface with poor ionic interactions • Rebuilt interface properly describes ionic dimerisation interactions
Optimised structures give a better view of the biology of the protein
PDB_REDOers Amsterdam: • R Joosten • K Joosten • A Perrakis Nijmegen: • T te Beek • M Hekkelman • G Vriend Cambridge: • G Murshudov • F Long Key contributors: Eleanor Dodson, Ian Tickle, Paul Emsley, Ethan Merritt, Elmar Krieger, Thomas Lütteke, Rachel Kramer Green, Sanchayita Sen