ChemModLab: A Web-based Cheminformatics Modeling Laboratory


Presentation Transcript


  1. ChemModLab: A Web-based Cheminformatics Modeling Laboratory
     S. Stanley Young + ECCR and ChemSpider Teams

  2. ChemSpider: A Web-based Chemical Informatics Resource
     S. Stanley Young + ECCR and ChemSpider Teams

  3. What is ChemSpider?
     • ChemSpider is a molecular structure-centric web service for chemists:
       • Chemical structure drawing, manipulation, visualization, modeling & databasing
       • Web location to deposit, curate and enhance data associated with chemical structures
       • Web structure-based access to federated chemistry databases representing chemical vendors, literature, online data, patents and other forms of chemistry data

  4. How do people generally use ChemSpider?
     • Searching for chemical structures, in rank order, via:
       • Registry numbers, trade names and synonyms
       • Structure identifiers such as SMILES or InChI
       • Intrinsic properties: commonly mass-based searches executed by mass spectrometrists
       • Systematic names: IUPAC or CAS Index name
     • Generation of physicochemical properties
     • Text-based searching of Open Access articles

  5. ChemSpider Status, August 2007
     • Online database of over 16.5 million structures
     • Systems in place for:
       • Single structure and data collection depositions
       • Association of analytical data with structures
       • Ability to curate data for each individual record
     • Indexing of and integration with:
       • Over 70 individual databases
       • Patents from the US, European and Asian patent offices
       • Text-based searching of over 50,000 Open Access articles
     • Over a thousand unique users access ChemSpider per day

  6. Flexible Boolean Searching

  7. Predicted Properties Details “Prozac”

  8. Search result: 49 hits in 2.8 seconds

  9. Integrated Visualization Tools

  10. External Integrations - Wikipedia
      The links between Wikipedia and ChemSpider are formed automatically.

  11. What is ChemModLab?
      • ChemModLab is a web service for building and evaluating QSAR models.
      • Send your data: assay results and an SD file.
      • Use any or all of five 2D descriptor types, or supply your own descriptors.
      • Use any or all of 16 statistical modeling methods.
      • Predict the potency of untested compounds.
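
The slide asks for assay results and an SD file but does not show what the upload looks like. The following is a minimal sketch, not ChemModLab's own reader: it assumes the standard SD layout (each record ends with a `$$$$` line, and data items appear as a `> <FieldName>` header followed by value lines and a blank line). The function names, the `Activity` field and the sample compounds are hypothetical.

```python
def parse_record(lines):
    """Parse one SD record: compound name (first line) plus its data fields."""
    fields = {}
    i = 0
    while i < len(lines):
        line = lines[i]
        # A data header looks like "> <FieldName>"; values follow until a blank line.
        if line.startswith(">") and "<" in line:
            name = line[line.index("<") + 1 : line.rindex(">")]
            i += 1
            values = []
            while i < len(lines) and lines[i].strip():
                values.append(lines[i].strip())
                i += 1
            fields[name] = "\n".join(values)
        i += 1
    return {"name": lines[0].strip() if lines else "", "fields": fields}


def read_sd_records(text):
    """Split SD text on '$$$$' delimiter lines and parse each record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return records


# Hypothetical two-compound file; the mol blocks are empty placeholders.
sample = "\n".join([
    "cmpd-1", "  sketch", "", "  0  0  0  0  0  0  0  0  0  0999 V2000", "M  END",
    "> <Activity>", "1", "", "$$$$",
    "cmpd-2", "  sketch", "", "  0  0  0  0  0  0  0  0  0  0999 V2000", "M  END",
    "> <Activity>", "0", "", "$$$$",
])
records = read_sd_records(sample)
```

The sketch only pairs each compound name with its data fields; it does not interpret the atom and bond block, which a cheminformatics toolkit would normally handle.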

  12. Virtual Screening: ChemModLab and ChemSpider

  13. ChemModLab Dialog (1) Data Input

  14. ChemModLab Dialog (2) Five 2D Descriptor Sets

  15. ChemModLab Dialog (3) 16 Modeling Methods

  16. ChemModLab Modeling Methods
      • 16 Statistical Modeling Methods
        • Trees: randomForest, rpart, tree
        • Neural networks
        • k-nearest neighbors
        • Support vector machines
        • Partial least squares
        • Partial least squares with linear discriminant analysis
        • Least angle regression
        • Ridge regression
        • Elastic net
        • Principal components regression
        • Family ensemble of k-nearest neighbors, using 70% selection
        • Family ensemble of tree, using 70% selection
        • Family ensemble of rpart, using 70% selection
        • randomForest, using 70% selection
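
The slide does not define "70% selection". One plausible reading, sketched below purely as an assumption, is that each ensemble member is fitted on a random 70% of the descriptor columns and the members' predictions are averaged. The function names and defaults (`n_members`, `k`, `seed`) are hypothetical and are not taken from ChemModLab.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_predict(train_x, train_y, query, k=3):
    """Average the responses of the k training rows closest to the query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: squared_distance(train_x[i], query)
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def ensemble_knn_predict(train_x, train_y, query,
                         n_members=25, fraction=0.7, k=3, seed=0):
    """Average k-NN predictions over random descriptor subsets.

    Assumption: '70% selection' means each member sees a random 70%
    of the descriptor columns.
    """
    rng = random.Random(seed)
    n_desc = len(train_x[0])
    n_keep = max(1, round(fraction * n_desc))
    predictions = []
    for _ in range(n_members):
        cols = rng.sample(range(n_desc), n_keep)
        sub_x = [[row[c] for c in cols] for row in train_x]
        sub_query = [query[c] for c in cols]
        predictions.append(knn_predict(sub_x, train_y, sub_query, k))
    return sum(predictions) / len(predictions)
```

With a 0/1 activity response, the returned value can be read as the fraction of votes for "active" across members and neighbors.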

  17. ECCR@NCSU + ChemSpider Plan
      • User submits data to ChemModLab to get QSAR model(s).
      • Model is sent to ChemSpider.
      • ChemSpider computes a “virtual screen”.
      • The hit list is clustered and sent to the user.
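
The screening step in this plan can be illustrated with a small sketch: score every compound in a library with a fitted model and keep the top-ranked hits. Everything here is hypothetical, including the `virtual_screen` function, the linear `toy_model`, its weights and the four-compound library. The clustering of the hit list mentioned on the slide is not shown.

```python
def virtual_screen(model, library, top_n=3):
    """Score each library compound with the model and return the top_n hits."""
    scored = [(name, model(descriptors)) for name, descriptors in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_model(descriptors):
    """Hypothetical linear QSAR model; the weights are made up for illustration."""
    weights = [0.5, -0.2, 1.0]
    return sum(w * x for w, x in zip(weights, descriptors))


# Hypothetical library: compound name -> descriptor vector.
library = {
    "cmpd-A": [1.0, 0.0, 0.0],
    "cmpd-B": [0.0, 1.0, 0.0],
    "cmpd-C": [0.0, 0.0, 1.0],
    "cmpd-D": [1.0, 1.0, 1.0],
}
hits = virtual_screen(toy_model, library, top_n=2)
```

In this toy setup the model is any callable that maps a descriptor vector to a predicted potency, so a model built by any of the listed methods could be dropped in.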

  18. Accumulation Curves
      Compare descriptor sets, given a method

  19. Accumulation Curves
      Compare modeling methods, given a descriptor set
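
The slides show accumulation curves as figures without defining them. A common construction, assumed here, ranks compounds from highest to lowest predicted activity and counts how many true actives have been found after each additional compound is tested. The function name and the example numbers are hypothetical.

```python
def accumulation_curve(predicted, is_active):
    """Cumulative count of actives found when testing in predicted-rank order.

    predicted: model scores, higher means more likely active.
    is_active: 1 for a confirmed active, 0 otherwise (same order as predicted).
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    curve, found = [], 0
    for i in order:
        found += is_active[i]
        curve.append(found)
    return curve


# Hypothetical scores and assay outcomes for five compounds.
curve = accumulation_curve([0.9, 0.1, 0.8, 0.4, 0.3], [1, 0, 0, 1, 0])
```

A curve that rises faster finds the actives sooner, which is how two descriptor sets or two modeling methods can be compared on the same data.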

  20. Diversity Map Cluster Active Compounds Modeling Methods

  21. Continuous Response

  22. Continuous Response

  23. Continuous Response

  24. Model Evaluation
      Take detailed looks at which models?
      AID348 (NCGC):
      • KNN – Ph
      • ENet – CAP
      • RF – B#
      • RF – CAP
      • RF – FF
      • Tree – CAP
      • Tree – Ph
      • Tree – FF
      • PLS – CAP

  25. Summary
      • ChemSpider is a web chemical informatics center.
      • ChemModLab is a free web service for QSAR.
      • Together they support sophisticated virtual screening.
      * ChemModLab is supported by the NCI RoadMap project.

  26. ECCR@NCSU Group and ChemSpider Group
      ChemModLab Team: Jacqueline M. Hughes-Oliver, Atina D. Brooks, Gary W. Howell, Kirtesh Patil, Stan Young, Qianyi Zhang
      ChemSpider Team: Antony Williams (project lead); a rotating team of advisors and developers, including many contributions from the Open Source community
      eccr.stat.ncsu.edu
      www.chemspider.com
