
GO based data analysis

This workshop provides an overview of GO-based data analysis tools and materials available online at the AgBase database. Topics covered include GOanna, GOSlimViewer, Gene Ontology enrichment analysis, annotation clustering, and comparison of enrichment analysis tools. Supported by USDA CSREES grant.

marionhunt


Presentation Transcript


  1. GO based data analysis Iowa State Workshop 11 June 2009

  2. All tools and materials from this workshop are available online at the AgBase database Educational Resources link. • For continuing support and assistance, please contact agbase@cse.msstate.edu. • This workshop is supported by USDA CSREES grant number MISV-329140.

  3. AgBase protein annotation process (flow diagram; labels: protein identifiers or FASTA format, GORetriever, annotated proteins, proteins with no annotations, GOanna, GOSlimViewer)

  4. Hypothesis generating • Gene Ontology enrichment analysis: finds GO terms that are statistically (Fisher’s exact test) over- or underrepresented in a set of genes • Annotation clustering: groups similar annotations based on the hypothesis that they should have similar gene members
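The enrichment test on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any of the tools in this workshop: the function name `enrichment_p_value` and the example counts are assumptions made up for the sketch. It computes the one-sided (overrepresentation) Fisher's exact test p-value from the hypergeometric distribution, using only the standard library.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, pop_hits, pop_size):
    """One-sided Fisher's exact test for overrepresentation of a GO term.

    Returns the probability of finding at least `study_hits` annotated
    genes in a study set of `study_size` genes drawn without replacement
    from a population of `pop_size` genes, of which `pop_hits` carry the
    GO term (upper tail of the hypergeometric distribution).
    """
    total = comb(pop_size, study_size)
    upper = min(study_size, pop_hits)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k)
        for k in range(study_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes in the background, 5 annotated to the term;
# 4 genes in the study set, 3 of them annotated to the term.
p_over = enrichment_p_value(study_hits=3, study_size=4, pop_hits=5, pop_size=20)
```

A small p-value suggests the term is overrepresented in the study set. Underrepresentation would use the lower tail instead, and testing many GO terms at once normally calls for a multiple-testing correction, which this sketch leaves out.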

  5. Some resources • DAVID: http://david.abcc.ncifcrf.gov/ • GOStat: http://gostat.wehi.edu.au/ • EasyGO: http://bioinformatics.cau.edu.cn/easygo/ • AmiGO: http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment (does not use IEA) • Onto-Express & OE2GO: http://vortex.cs.wayne.edu/projects.htm • GOEAST: http://omicslab.genetics.ac.cn/GOEAST • http://www.geneontology.org/GO.tools.shtml • Comparison of enrichment analysis tools: Nucleic Acids Research, 2009, Vol. 37, No. 1, 1–13 (Tool_Comparison_09.pdf) • DAVID and EasyGO analysis is included in DAVID&EasyGo.ppt

  6. Database for Annotation, Visualization and Integrated Discovery

  7. http://vortex.cs.wayne.edu/ontoexpress Onto-Express analysis instructions are available in onto-express.ppt

  8. Species represented in Onto-Express

  9. For uploading your own annotations use OE2GO

  10. Comparison • Onto-Express, EasyGO, GOStat and DAVID • Test set: 60 randomly selected chicken genes • Used AgBase GO annotations as baseline annotations • Vandenberg et al. (BMC Bioinformatics, in review)

  11. Networks & Pathways Iowa State Workshop 11 June 2009

  12. Multiple data analysis platforms Proteomics LIST Transcriptomics ESTs

  13. Our original aim… understand biological phenomena… • Bits and pieces of information • Do not have the full picture • How do we get back to BIOLOGY in this digital information landscape?

  14. Francis Crick, 1958 What do we know about biological systems …. • biological systems are dynamic, not static • how molecules interact is key to understanding complex systems

  15. Types of interactions • protein (enzyme) – metabolite (ligand): metabolic pathways • protein – protein: cell signaling pathways, protein complexes • protein – gene: genetic networks

  16. STRING Database Sod1 Mus musculus http://string.embl.de/

  17. Database: URL/FTP
      • DIP: http://dip.doe-mbi.ucla.edu
      • BIND: http://bind.ca
      • MPact/MIPS: http://mips.gsf.de/services/ppi
      • STRING: http://string.embl.de
      • MINT: http://mint.bio.uniroma2.it/mint
      • IntAct: http://www.ebi.ac.uk/intact
      • BioGRID: http://www.thebiogrid.org
      • HPRD: http://www.hprd.org
      • ProtCom: http://www.ces.clemson.edu/compbio/ProtCom
      • 3did, Interprets: http://gatealoy.pcb.ub.es/3did/
      • Pibase, Modbase: http://alto.compbio.ucsf.edu/pibase
      • CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm
      • SCOPPI: http://www.scoppi.org/
      • iPfam: http://www.sanger.ac.uk/Software/Pfam/iPfam
      • InterDom: http://interdom.lit.org.sg
      • DIMA: http://mips.gsf.de/genre/proj/dima/index.html
      • Prolinks: http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/
      • Predictome: http://predictome.bu.edu/
      Source: PLoS Computational Biology March 2007, Volume 3 e42

  18. Pathways & Networks • A network is a collection of interactions • Pathways are a subset of networks: networks of interacting proteins that carry out biological functions such as metabolism and signal transduction • All pathways are networks of interactions • NOT ALL NETWORKS ARE PATHWAYS

  19. Biological Networks • Networks are often represented as graphs • Nodes represent proteins or genes that code for proteins • Edges represent the functional links between nodes (e.g., regulation) • Small changes in a graph’s topology/architecture can result in the emergence of novel properties
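The graph view described on this slide can be illustrated with a small Python sketch. The protein names and edges below are hypothetical, and the helper names `build_network` and `degrees` are invented for this example. It stores an undirected network as an adjacency map and counts each node's degree (number of interaction partners), one of the simplest topological properties.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected adjacency map: node -> set of neighbouring nodes."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return network


def degrees(network):
    """Return the number of interaction partners for every node."""
    return {node: len(neighbours) for node, neighbours in network.items()}


# Hypothetical proteins and functional links, for illustration only.
example_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P3", "P4")]
example_network = build_network(example_edges)
example_degrees = degrees(example_network)

# The most connected node ("hub") in this toy network.
hub = max(example_degrees, key=example_degrees.get)
```

In this toy network P1 has three partners and is the hub. Adding or removing a single edge changes the degree distribution, which is one way small topological changes show up in an analysis.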

  20. Yeast Protein-Protein Interaction Map (H. Jeong et al., Nature 411, 2001)

  21. Some resources • KEGG: http://www.genome.jp/kegg/pathway.html/ • BioCyc: http://www.biocyc.org/ • Reactome: http://www.reactome.org/ • GenMAPP: http://www.genmapp.org/ • BioCarta: http://www.biocarta.com/ • Pathguide, the pathway resource list: http://www.pathguide.org/

  22. Pathguide statistics: Gallus gallus is missing

  23. Reactome

  24. What is feasible with my specific dataset?

  25. Systems Biology Workflow (Nanduri & McCarthy, CAB Reviews, 2008)

  26. Systems Biology Workflow: for a given species of interest, what type of data is available?

  27. Retrieval of interaction datasets • Evaluate PPI resources such as Predictome and Prolinks for the existence of the species of interest • If unavailable, find orthologous proteins in related species that have interactions
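The ortholog-based fallback on this slide can be sketched as follows. This is an illustrative assumption about how such a transfer might be coded, not a procedure taken from the workshop: the function name `transfer_interactions`, the identifiers, and the input formats are all hypothetical. Known interactions from a related species are mapped onto the species of interest wherever both partners have an ortholog.

```python
def transfer_interactions(source_interactions, orthologs):
    """Predict interactions in the species of interest from a related species.

    source_interactions: iterable of (protein_a, protein_b) pairs known in
        the related species.
    orthologs: dict mapping a related-species protein to its ortholog in
        the species of interest.
    Only pairs where both partners have an ortholog are transferred.
    Pairs are stored sorted so that (A, B) and (B, A) count once.
    """
    predicted = set()
    for a, b in source_interactions:
        if a in orthologs and b in orthologs:
            predicted.add(tuple(sorted((orthologs[a], orthologs[b]))))
    return predicted


# Hypothetical identifiers: H* in the related species, C* in the species
# of interest. H3 has no known ortholog, so its interaction is dropped.
known = [("H1", "H2"), ("H2", "H1"), ("H2", "H3")]
ortholog_map = {"H1": "C1", "H2": "C2"}
predicted_pairs = transfer_interactions(known, ortholog_map)
```

Interactions predicted this way are hypotheses only, so they need the same quality evaluation as any other interaction dataset.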

  28. I have interactions; what next? • Evaluate the quality of interactions, i.e., the type of method used for identification… what exactly are these methods?

  29. I have interactions; what next? • Evaluate the quality of interactions, i.e., the type of method used for identification… what exactly are these methods? (STRING Database)

  30. PPI Identification
      • Computational: phylogenetic profile, gene cluster, sequence coevolution, Rosetta stone method, text mining
      • Experimental: yeast two hybrid (Y2H), TAP assays, gene coexpression, protein arrays
      Source: PLoS Computational Biology March 2007, Volume 3 e42
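One way to act on the method types listed on this slide is to tag each interaction record with its evidence type and filter on it. The sketch below is hypothetical: the record layout (partner, partner, method), the helper names, and the assignment of each method to the computational or experimental group are assumptions made for illustration, and real PPI databases use their own method vocabularies and scores.

```python
# Assumed grouping of identification methods; adjust to the vocabulary
# of the database actually being used.
EXPERIMENTAL = {"yeast two hybrid", "tap assay", "gene coexpression", "protein array"}
COMPUTATIONAL = {
    "phylogenetic profile",
    "gene cluster",
    "sequence coevolution",
    "rosetta stone",
    "text mining",
}


def evidence_type(method):
    """Classify a method name as experimental, computational or unknown."""
    name = method.strip().lower()
    if name in EXPERIMENTAL:
        return "experimental"
    if name in COMPUTATIONAL:
        return "computational"
    return "unknown"


def keep_experimental(records):
    """Keep only (protein_a, protein_b, method) records with experimental support."""
    return [r for r in records if evidence_type(r[2]) == "experimental"]


# Hypothetical interaction records, for illustration only.
records = [
    ("P1", "P2", "Yeast two hybrid"),
    ("P1", "P3", "Text mining"),
    ("P2", "P4", "TAP assay"),
    ("P3", "P4", "manual guess"),
]
experimental_only = keep_experimental(records)
```

Filtering on experimental evidence alone is a deliberately strict choice. A softer alternative is to keep computational predictions that are supported by more than one independent method.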

  31. PPI database comparisons Proteins: Structure, Function and Bioinformatics 63:490-500 2006

  32. I have interactions; what next? • Evaluate the quality of interactions, i.e., the type of method used for identification… what exactly are these methods? • Visualize these interactions as a network and analyze… what are the available tools?
