
Coordinate handling and exploitation

Coordinate handling and exploitation. An overview of coordinate functionality in the CCP4 suite; coordinate functionality in the REFMAC group of programs (A. Vaguine); new CCP4 project “Protein Interfaces” (E. Krissinel). Coordinate support in CCP4.


Presentation Transcript


  1. Coordinate handling and exploitation
     • An overview of coordinate functionality in the CCP4 suite
     • Coordinate functionality in the REFMAC group of programs (A. Vaguine)
     • New CCP4 project “Protein Interfaces” (E. Krissinel)

  2. Coordinate support in CCP4
     • Old FORTRAN coordinate-related applications not using RWBrook (42%): own coordinate functions
     • Old FORTRAN coordinate-related applications using RWBrook (33%): RWBrook emulator
     • New C & C++ coordinate-related applications (a few): Clipper, MMDB (C++ Coordinate Library)
     • Molecular Graphics, Coot, SSM
     • Refmac group of programs: own coordinate functions
     • DNA group: own coordinate functions
     • Other: own coordinate functions

  3. CCP4 Coordinate Library (MMDB)
     Input/output: PDB file, mmCIF file, binary file. Access is through the interface API.
     • C++ class hierarchy
     • PDB/mmCIF support
     • Database features
     • ~600 interface functions
     • Emulates RWBrook
     • Wealth of retrieval, selection, transformation and edit tools
     • User-defined data
     • Built-in high-level functionality (contacts, alignment, superposition etc.)
     • Monomer database
     • SWIG interface
     • Stable and documented
     [Diagram: class hierarchy with Manager at the top; Header, Model, Cryst and Sequence below it; then Chain, Residue and Atom. Each box stands for one or more C++ classes.]
     Reference: E. Krissinel et al. (2004) Acta Cryst. D60, 2250-55
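The hierarchy on slide 3 can be illustrated with a small, self-contained sketch. This is not the MMDB API: all class and method names below (`Manager.select_atoms`, `demo_manager`, and so on) are hypothetical, the Header, Cryst and Sequence parts are left out, and the coordinates are made up. It only shows how a Manager → Model → Chain → Residue → Atom tree supports a simple selection.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Atom:
    name: str
    xyz: Tuple[float, float, float]


@dataclass
class Residue:
    name: str
    seq_num: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Model:
    serial: int
    chains: List[Chain] = field(default_factory=list)


@dataclass
class Manager:
    """Hypothetical top-level container; only the Model branch is modelled."""
    models: List[Model] = field(default_factory=list)

    def select_atoms(self, chain_id: str, atom_name: str) -> Iterator[Atom]:
        """Walk Model -> Chain -> Residue -> Atom and yield matching atoms."""
        for model in self.models:
            for chain in model.chains:
                if chain.chain_id != chain_id:
                    continue
                for residue in chain.residues:
                    for atom in residue.atoms:
                        if atom.name == atom_name:
                            yield atom


def demo_manager() -> Manager:
    """Build a tiny two-chain structure with made-up coordinates."""
    gly = Residue("GLY", 1, [Atom("N", (0.0, 0.0, 0.0)), Atom("CA", (1.5, 0.0, 0.0))])
    ala = Residue("ALA", 2, [Atom("CA", (3.8, 1.0, 0.0)), Atom("CB", (4.5, 2.2, 0.0))])
    ser = Residue("SER", 1, [Atom("CA", (9.0, 9.0, 9.0))])
    return Manager([Model(1, [Chain("A", [gly, ala]), Chain("B", [ser])])])
```

For example, `list(demo_manager().select_atoms("A", "CA"))` returns the two C-alpha atoms of chain A. The real library exposes far richer selection tools; the sketch is only meant to make the nesting concrete.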

  4. General remarks
     • Approximately 40% of the CCP4 suite now uses a common set of coordinate functions provided by MMDB. This should greatly help maintenance and adaptation to possible format changes.
     • Converting older FORTRAN applications that do not use RWBrook to MMDB means, in most cases, a complete rewrite. This does not seem to be necessary at the moment.
     • All ongoing developments in FORTRAN seem to use their own coordinate functions and libraries.
     • MMDB delivers its full power only through the C++ interface; most MMDB functionality cannot be expressed in traditional FORTRAN terms. Should we encourage new coordinate developments in C/C++ using MMDB, and so shift away from FORTRAN thinking?
     • The new coordinate-related CCP4 projects (MG, Coot, SSM and Protein Interfaces) are all based on MMDB, and that seems to be an advantage for the projects.

  5. PIAS: Protein Interactions, Assemblies and Searches
     E. Krissinel
     CCP4 - EBI/MSD project

  6. PIAS: Project goals
     Develop a tool and a publicly available interactive service to aid the solution of different tasks that involve structural and chemical analysis of protein interactions, such as:
     • prediction of oligomeric states
     • analysis of structure-function relationship
     • analysis and prediction of protein interactions
     • search for interface homologues
     • active site recognition and analysis
     • protein surface analysis
     • structure specificity analysis
     • other
     Project started in 2004.

  7. PIAS: Project overview
     • PIAS database
     • Interface calculations, analysis, scoring & biological significance
     • Crystal interfaces
     • Interface fingerprinting
     • Prediction of oligomeric states (PQS-3)
     • Interfaces & structure similarity searches
     • Interfaces & surface similarity searches
     • Prediction of interfaces
     • Active site recognition
     • Docking
     • Applied studies (e.g. discovery of multispecific proteins)
     • Procedures for CCP4 MG
     • Interactive Web server
     [Diagram legend: some parts are marked as provisional, subject to progress and feasibility.]

  8. PIAS: Project schedule
     • PIAS database
     • Interface calculations, analysis, scoring & biological significance
     • Crystal interfaces
     • Interface fingerprinting
     • Prediction of oligomeric states (PQS-3)
     • Interfaces & structure similarity searches
     • Interfaces & surface similarity searches
     • Prediction of interfaces
     • Active site recognition
     • Docking
     • Applied studies (e.g. discovery of multispecific proteins)
     • Procedures for CCP4 MG
     [Diagram: time spans shown are 2006-2008, 2004-2008, 2004-2005 and 2005-2007; their assignment to the parts above is not recoverable from the transcript.]

  9. PIAS database
     Contains interfaces between polypeptides found in all PDB entries: all crystal contacts for X-ray entries and chain contacts for NMR entries. Also contains predicted protein assemblies.
     • An interface is defined as the area that becomes inaccessible to solvent upon complex formation
     • Properties stored for interfacing structures:
         • Size, weight
         • Solvent-accessible area per residue (+ selection of surface atoms and residues)
         • Solvation energy per residue
         • SSM data for structure search
         • Structure and sequence alignment
     • Properties stored for interfaces:
         • Interface area per residue (+ selection of interfacing atoms and residues)
         • Number of atoms and residues involved
         • Solvation energy gain (per residue) and P-value of hydrophobic patches
         • List of potential hydrogen bonds and salt bridges
         • Complexation significance score
     • Properties stored for assemblies:
         • Composition, chemical formula
         • List of engaged interfaces
         • Transformation matrices
         • Solvation energy gain
         • Solvent-accessible and buried surface area
         • Dissociation pattern and barrier
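The interface definition on slide 9 (area that becomes inaccessible to solvent upon complex formation) can be sketched in a few lines. This is an illustration under stated assumptions, not the PIAS implementation: the per-residue solvent-accessible areas are assumed to be already computed by some surface program, the function names are hypothetical, and the residue labels and areas in the usage note are made up.

```python
from typing import Dict


def buried_area_per_residue(asa_free: Dict[str, float],
                            asa_complex: Dict[str, float],
                            tol: float = 1e-6) -> Dict[str, float]:
    """Return the accessible area each residue loses on complex formation.

    asa_free:    per-residue accessible area of the isolated structure
    asa_complex: per-residue accessible area of the same residues in the complex
    Residues whose area does not decrease by more than `tol` are left out.
    """
    buried = {}
    for res, free in asa_free.items():
        # A residue missing from asa_complex is treated as unchanged.
        loss = free - asa_complex.get(res, free)
        if loss > tol:
            buried[res] = loss
    return buried


def interface_summary(asa_free: Dict[str, float],
                      asa_complex: Dict[str, float]) -> Dict[str, object]:
    """Collect interfacing residues, their count and the total buried area."""
    buried = buried_area_per_residue(asa_free, asa_complex)
    return {
        "residues": sorted(buried),
        "n_residues": len(buried),
        "area": sum(buried.values()),
    }
```

With made-up inputs `{"A:10": 80.0, "A:11": 45.5, "A:12": 20.0}` for the free structure and `{"A:10": 30.0, "A:11": 45.5, "A:12": 5.0}` for the complex, residues `A:10` and `A:12` are selected as interfacing, and `A:11` is not because its accessible area is unchanged.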

  10. Prediction of oligomeric states (PQS-3)
      Existing tools for the calculation of quaternary structures:
      • PQS server @ MSD (Kim Henrick) (PQS-1). Method: progressive build-up by addition of monomeric chains that suit the selection criteria. The results are partly curated.
      • PITA software @ Thornton group, EBI (Hannes Ponstingl) (PQS-2). Method: recursive splitting of the largest complexes allowed by crystal symmetry. The termination criterion is derived from the individual statistical scores of crystal contacts. The results are not curated.

  11. Prediction of oligomeric states (PQS-3)
      Graph-chemical approach:
      • The crystal is represented as a periodic graph of monomers (a “supermolecule”)
      • All possible assemblies that obey the symmetry criteria are recursively enumerated as subgraphs covering the whole crystal
      • Only sets of chemically stable assemblies are left as an answer.
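The enumeration idea on slide 11 can be sketched on a small finite graph. This is a heavy simplification and not the PIAS algorithm: a real crystal graph is periodic, whereas here a handful of monomers stand in for it. Two things are assumptions of the sketch: symmetry is approximated by engaging or breaking all interfaces of the same type together, and the stability test (`is_stable`, with made-up scores and threshold) is a hypothetical placeholder for the chemical criterion.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str, str]  # (monomer_a, monomer_b, interface_type)


def assemblies(monomers: List[str], edges: List[Edge],
               engaged: Set[str]) -> List[FrozenSet[str]]:
    """Connected components of the graph that keeps only engaged interface types."""
    parent = {m: m for m in monomers}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, itype in edges:
        if itype in engaged:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for m in monomers:
        groups.setdefault(find(m), set()).add(m)
    return sorted((frozenset(g) for g in groups.values()), key=lambda g: sorted(g))


def enumerate_assembly_sets(monomers: List[str], edges: List[Edge],
                            is_stable: Callable[[Set[str]], bool]):
    """Try every subset of interface types; keep those passing the stability test."""
    types = sorted({t for _, _, t in edges})
    results = []
    for k in range(len(types) + 1):
        for combo in combinations(types, k):
            engaged = set(combo)
            if is_stable(engaged):
                results.append((engaged, assemblies(monomers, edges, engaged)))
    return results


# Made-up example: two strong interfaces and one weak crystal contact.
MONOMERS = ["A", "B", "C", "D"]
EDGES = [("A", "B", "strong"), ("C", "D", "strong"), ("B", "C", "weak")]
SCORES = {"strong": -12.0, "weak": -1.0}  # hypothetical scores, lower is more stable


def demo_is_stable(engaged: Set[str]) -> bool:
    """Hypothetical rule: every engaged interface must score below -5."""
    return all(SCORES[t] < -5.0 for t in engaged)
```

Calling `enumerate_assembly_sets(MONOMERS, EDGES, demo_is_stable)` on this toy input keeps two answers: four free monomers (no interface engaged) and two dimers, A-B and C-D (only the strong interface engaged). Choosing between such surviving sets is where the chemical scoring would come in.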

  12. Prediction of oligomeric states (PQS-3)
      Success rate obtained on a benchmark set of 212 structures (H. Ponstingl):
      • PQS server @ MSD: 78% (not optimised on the benchmark set)
      • PITA software: 84% (optimised with 18 parameters)
      • PIAS: 89% (optimised with 8 parameters, underfit)
      Early results outside the benchmark set suggest that PIAS performs somewhat better; however, the actual differences may be less significant.

  13. Prediction of oligomeric states (PQS-3)
      PQS may be predicted only up to a certain level of confidence. It seems that 85-90% of correct predictions may be reached. Main reasons why a 100% success rate can never be achieved:
      • theoretical models for protein affinity and entropy change upon complexation are primitive
      • coordinate (experimental) data are of limited accuracy
      • there is no feasible way to take conformational changes into account
      • experimental data on multimeric states are very limited and not always reliable, so calibration of parameters is difficult
      • assemblies may exist in some environments and dissociate in others, so a definitive answer is simply not there

  14. Interfaces & structure similarity searches
      Searching the PIAS database for structurally similar interfaces and for interfaces between similar structures.
      Questions to answer:
      • What interfaces are formed by structures similar to the given one(s) in PDB?
      • What are the interface partners of a given structure in PDB?
      • What is the relation between sequence and biological (complexation) significance of the interface (function)?
      • What PQS may be formed by structures similar to the given one(s), and how may the PQS depend on the sequence?
      • Is a given structure interaction-specific and/or multispecific?

  15. PIAS web server
      • Interface area
      • Solvation energy gain
      • Hydrogen bonds and salt bridges
      • Hydrophobic P-value
      • Biological relevance score
      • Selection of interfacing residues and atoms
      A preliminary version of the MSD protein interaction service is set up at http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver
      The version includes:
      • Calculations, for uploaded files or database retrievals by PDB ID code, of:
          • Solvent-accessible surface area
          • Crystal contacts / interfaces
          • Protein interface parameters and scoring
          • Protein quaternary structures
      • Interface and structure searches in the protein interface database derived from PDB
      • Visualisation of the structures, interfaces and PQS

  16. PIAS web server http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver

  17. PIAS web server http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver

  18. PIAS web server http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver

  19. PIAS web server

  20. PIAS web server http://www.ebi.ac.uk/msd-srv/prot_int/cgi-bin/piserver
      [Figures: 3gcb hexamer; dissociation of 3gcb hexamer]

  21. Concluding remarks
      The PIAS software is almost ready for its first release. It may be released in two months' time, after catching up with:
      • on-line help and documentation
      • minor cleaning and re-design of output pages
      • enhancement of structural search options
      • further entropy calibration to increase the accuracy of PQS prediction
      Further work will concentrate on:
      • surface calculation and analysis
      • surface / active site searches
      • possibly docking
      • additions to and improvements of existing functions (based on users’ feedback and own needs)
