Part II GO-Vocabulary of Genome

mikel
Presentation Transcript


  1. Part II GO-Vocabulary of Genome

  2. S. cerevisiae

  3. D. melanogaster

  4. C. elegans • Cells that normally survive: CED-3, CED-4 OFF; CED-9 ON • Cells that normally die: CED-3, CED-4 ON; CED-9 OFF

  5. M. musculus

  6. Comparison of sequences from 4 organisms: MCM3, MCM2, CDC46/MCM5, CDC47/MCM7, CDC54/MCM4, MCM6 • These proteins form a hexamer in the species that have been examined

  7. The Gene Ontologies: A Common Language for Annotation of Genes from Yeast, Flies and Mice … and Plants and Worms … and Humans … and anything else!

  8. Gene Ontology, 1998 • FlyBase: Drosophila (Cambridge, EBI, Harvard, Berkeley & Bloomington) • SGD: Saccharomyces (Stanford) • MGI: Mus (Jackson Labs, Bar Harbor)

  9. Gene Ontology, now • Fruit fly: FlyBase • Budding yeast: Saccharomyces Genome Database (SGD) • Mouse: Mouse Genome Database (MGD & GXD) • Rat: Rat Genome Database (RGD) • Weed: The Arabidopsis Information Resource (TAIR) • Worm: WormBase • Dictyostelium discoideum: Dictybase • InterPro/UniProt at EBI: InterPro • Fission yeast: Pombase • Human: UniProt, Ensembl, NCBI, Incyte, Celera, Compugen • Parasites (Plasmodium, Trypanosoma, Leishmania): GeneDB, Sanger • Microbes (Vibrio, Shewanella, B. anthracis, …): TIGR • Grasses (rice & maize): Gramene database • Zebrafish: ZFIN • …

  10. To provide structured controlled vocabularies for the representation of biological knowledge in biological databases.

  11. Be open source • Use open standards • Make data & code available without constraint • Involve your community

  12. Outline • Introduction to the Gene Ontologies (GO) • Annotations to GO terms • GO Tools • Applications of GO

  13. Gene Ontology Objectives • GO represents concepts used to classify specific parts of our biological knowledge: • Biological Process • Molecular Function • Cellular Component • GO develops a common language applicable to any organism • GO terms can be used to annotate gene products from any species, allowing comparison of information across species

  14. GO: Three ontologies (questions asked about a gene product) • What does it do? Molecular Function • What processes is it involved in? Biological Process • Where does it act? Cellular Component

  15. Example: Gene Product = hammer • Function (what) → Process (why) • Drive nail (into wood) → Carpentry • Drive stake (into soil) → Gardening • Smash roach → Pest Control • Clown’s juggling object → Entertainment

  16. Biological Examples [figure panels: Biological Process, Molecular Function, Cellular Component]

  17. The 3 Gene Ontologies • Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity • Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions • Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
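The three ontologies and the example terms on this slide can be written down as a small lookup. This is only an illustrative sketch: the names `Aspect`, `EXAMPLE_TERMS`, and `group_by_aspect` are hypothetical and not part of any GO software.

```python
from enum import Enum


class Aspect(Enum):
    """The three ontologies, keyed by the question each one answers."""
    MOLECULAR_FUNCTION = "What does it do?"
    BIOLOGICAL_PROCESS = "What processes is it involved in?"
    CELLULAR_COMPONENT = "Where does it act?"


# Example terms taken from the slide text (names only, no GO identifiers).
EXAMPLE_TERMS = {
    "carbohydrate binding": Aspect.MOLECULAR_FUNCTION,
    "ATPase activity": Aspect.MOLECULAR_FUNCTION,
    "mitosis": Aspect.BIOLOGICAL_PROCESS,
    "purine metabolism": Aspect.BIOLOGICAL_PROCESS,
    "nucleus": Aspect.CELLULAR_COMPONENT,
    "telomere": Aspect.CELLULAR_COMPONENT,
    "RNA polymerase II holoenzyme": Aspect.CELLULAR_COMPONENT,
}


def group_by_aspect(term_names):
    """Group the terms annotated to one gene product by ontology."""
    grouped = {aspect: [] for aspect in Aspect}
    for name in term_names:
        grouped[EXAMPLE_TERMS[name]].append(name)
    return grouped


# A made-up gene product annotated with one term from each ontology.
annotations = group_by_aspect(["ATPase activity", "mitosis", "nucleus"])
```

A single gene product can carry several terms per ontology, which is why each aspect maps to a list.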

  18. Molecular Function • A single reaction or activity, not a gene product • A gene product may have several functions • Sets of functions make up a biological process

  19. Molecular Function

  20. Carbonate dehydratase activity

  21. Biological Process

  22. Gluconeogenesis

  23. Cellular Component • where a gene product acts

  24. Mitochondrial membrane

  25. What’s in a GO term? • term: gluconeogenesis • id: GO:0006094 • definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
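As a minimal sketch, the three fields of a GO term can be held in a small record. The class name `GOTerm` and the validation step are assumptions made for illustration; the check relies only on the identifier format, `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical record holding the fields listed on the slide."""
    go_id: str
    name: str
    definition: str

    def __post_init__(self):
        # Reject anything that does not look like a GO identifier.
        if not GO_ID_PATTERN.fullmatch(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")


gluconeogenesis = GOTerm(
    go_id="GO:0006094",
    name="gluconeogenesis",
    definition=(
        "The formation of glucose from noncarbohydrate precursors, "
        "such as pyruvate, amino acids and glycerol."
    ),
)
```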

  26. What’s in a name?

  27. Content of GO (as of October 2005) • Molecular Function: 7,309 terms • Biological Process: 10,041 terms • Cellular Component: 1,629 terms • Total: 18,975 terms • Definitions: 94.9% • Obsolete terms: 992

  28. What’s in a name? • Glucose synthesis • Glucose biosynthesis • Glucose formation • Glucose anabolism • Gluconeogenesis • All refer to the process of making glucose from simpler components
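One way to handle several names for the same process is a synonym index that maps every name to a single identifier. The sketch below assumes, for illustration only, that all five names on this slide point to the gluconeogenesis identifier GO:0006094; `build_index` and `lookup` are hypothetical helpers.

```python
# Assumption for illustration: every name on the slide resolves to GO:0006094.
SYNONYMS = {
    "GO:0006094": [
        "glucose synthesis",
        "glucose biosynthesis",
        "glucose formation",
        "glucose anabolism",
        "gluconeogenesis",
    ],
}


def build_index(synonyms):
    """Invert {go_id: [names]} into {lowercased name: go_id}."""
    index = {}
    for go_id, names in synonyms.items():
        for name in names:
            index[name.lower()] = go_id
    return index


def lookup(index, text):
    """Return the GO id for a name, ignoring case and outer whitespace."""
    return index.get(text.strip().lower())


index = build_index(SYNONYMS)
found = lookup(index, "  Glucose Anabolism ")
missing = lookup(index, "glycolysis")
```

Here `found` is the shared identifier and `missing` is `None`, because "glycolysis" is not in the index.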

  29. [figure labels: tree, directed acyclic graph]

  30. Parent-Child Relationships • Nucleus: Nuclear envelope, Nucleoplasm, Nucleolus, Chromosome, Perinuclear space • A child is a subset of a parent’s elements • The cell component term Nucleus has 5 children

  31. Ontology Relationships: Directed Acyclic Graph
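The difference between a tree and a directed acyclic graph is that a term in a DAG may have more than one parent. A minimal sketch, using the Nucleus example from the previous slide and storing the graph as a child-to-parents mapping: the second parent of "chromosome" and the parent of "nucleus" are hypothetical additions, included only to show multiple parentage and multi-level traversal.

```python
# child -> list of parents; a DAG allows more than one parent per term.
PARENTS = {
    "nuclear envelope": ["nucleus"],
    "nucleoplasm": ["nucleus"],
    "nucleolus": ["nucleus"],
    "perinuclear space": ["nucleus"],
    # Second parent is hypothetical, added to illustrate a DAG.
    "chromosome": ["nucleus", "extra parent (hypothetical)"],
    # Hypothetical parent of nucleus, to give the graph a second level.
    "nucleus": ["cell (hypothetical)"],
}


def children(parents, term):
    """Direct children of a term, sorted by name."""
    return sorted(child for child, ps in parents.items() if term in ps)


def ancestors(parents, term):
    """All terms reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents.get(parent, []))
    return seen


nucleus_children = children(PARENTS, "nucleus")
chromosome_ancestors = ancestors(PARENTS, "chromosome")
```

Because a gene product annotated to a child term is implicitly annotated to every ancestor, the upward traversal is the operation most tools need.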

  32. Evidence Codes for GO Annotations http://www.geneontology.org/doc/GO.evidence.html

  33. • IEA: Inferred from Electronic Annotation • ISS: Inferred from Sequence Similarity • IEP: Inferred from Expression Pattern • IMP: Inferred from Mutant Phenotype • IGI: Inferred from Genetic Interaction • IPI: Inferred from Physical Interaction • IDA: Inferred from Direct Assay • RCA: Inferred from Reviewed Computational Analysis • TAS: Traceable Author Statement • NAS: Non-traceable Author Statement • IC: Inferred by Curator • ND: No biological Data available
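The twelve codes listed on this slide fit naturally into a lookup table. The following is a sketch only; `EVIDENCE_CODES` and `expand_evidence_code` are hypothetical names, and the expansions are copied from the slide.

```python
# Evidence codes and expansions as listed on the slide.
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "ISS": "Inferred from Sequence Similarity",
    "IEP": "Inferred from Expression Pattern",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IDA": "Inferred from Direct Assay",
    "RCA": "Inferred from Reviewed Computational Analysis",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No biological Data available",
}


def expand_evidence_code(code):
    """Return 'CODE: expansion', raising KeyError for unknown codes."""
    key = code.strip().upper()
    if key not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code!r}")
    return f"{key}: {EVIDENCE_CODES[key]}"


example = expand_evidence_code(" ida ")
```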

  34. IEA: Inferred from Electronic Annotation • Sequence Similarity (BLAST) • Automatic transfer from mappings (InterPro2GO, EC2GO etc.) • -> Not manually reviewed

  35. ISS: Inferred from Sequence or Structural Similarity • Sequence similarity • Recognized domains • Structural similarity • -> Use of ‘with’ column recommended

  36. IEP: Inferred from Expression Pattern • Transcript levels (Northerns, microarrays) • Protein levels (Western blots) • -> Timing or localization of expression • -> Biological process annotations

  37. IMP: Inferred from Mutant Phenotype • Gene mutation/knockout • Overexpression/ectopic expression • Anti-sense experiments • RNAi experiments • Specific protein inhibitors

  38. IGI: Inferred from Genetic Interaction • Suppressors, synthetic lethals… • Functional complementation • Rescue experiments • -> Use of ‘with’ column recommended

  39. IPI: Inferred from Physical Interaction • 2-hybrid interactions • Co-purification • Co-immunoprecipitation • Ion/complex/protein binding experiments • -> Use of ‘with’ column recommended

  40. IDA: Inferred from Direct Assay • Enzyme assays • In vitro reconstitution (e.g. transcription) • Immunofluorescence (for cellular component) • Cell fractionation (for cellular component) • Physical interaction/binding assay

  41. RCA: Inferred from Reviewed Computational Analysis • Non-sequence-based computational methods • Genome-wide analyses (e.g. 2-hybrid) • Combinations of large-scale experiments

  42. TAS: Traceable Author Statement • Support from review article • Textbook ‘common knowledge’ • -> Data that can be ‘traced’ back

  43. NAS: Non-traceable Author Statement • Database entries that don't cite a paper • -> Data that cannot be ‘traced’ back

  44. IC: Inferred by Curator • Not supported by any direct evidence • Inferred from other GO annotations • -> GO term in ‘with/from’ column required

  45. ND: No biological Data available • Curator found no information supporting any annotation • molecular function unknown GO:0005554 • biological process unknown GO:0000004 • cellular component unknown GO:0008372
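The per-code slides above say when the ‘with’ column matters: it is recommended for ISS, IGI and IPI, and a GO term is required there for IC. A minimal sketch of such a check follows; the function name, the three return values, and the sample accession are assumptions made for illustration, not the behaviour of any GO tool.

```python
import re

GO_ID_PATTERN = re.compile(r"GO:\d{7}")

# Rules taken from the evidence-code slides.
WITH_RECOMMENDED = {"ISS", "IGI", "IPI"}
WITH_REQUIRED_GO_TERM = {"IC"}


def check_with_column(evidence_code, with_value=""):
    """Return 'ok', 'warning', or 'error' for one annotation row."""
    value = (with_value or "").strip()
    if evidence_code in WITH_REQUIRED_GO_TERM:
        # IC annotations must name the GO term they were inferred from.
        return "ok" if GO_ID_PATTERN.fullmatch(value) else "error"
    if evidence_code in WITH_RECOMMENDED and not value:
        # Missing 'with' is allowed here but discouraged.
        return "warning"
    return "ok"


# "UniProt:P12345" is a made-up accession used only as sample input.
results = [
    check_with_column("IC", "GO:0006094"),
    check_with_column("IC", ""),
    check_with_column("ISS", ""),
    check_with_column("IPI", "UniProt:P12345"),
    check_with_column("IDA"),
]
```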

  46. Term Hierarchy (evidence codes, top to bottom): TAS/IDA • IMP/IGI/IPI • ISS/IEP • NAS • IEA
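If the tiers on this slide are read as an ordering from strongest to weakest evidence (an assumption, since the slide gives only the list), the best-supported code among several annotations can be picked as in the sketch below. `TIERS` and `best_evidence` are hypothetical names; RCA, IC and ND do not appear on the slide and are treated as unranked.

```python
# Tiers in slide order; assumed here to run from strongest to weakest.
TIERS = [
    ("TAS", "IDA"),
    ("IMP", "IGI", "IPI"),
    ("ISS", "IEP"),
    ("NAS",),
    ("IEA",),
]

# Map each code to its tier index (0 = top tier).
RANK = {code: level for level, tier in enumerate(TIERS) for code in tier}


def best_evidence(codes):
    """Return the highest-tier code, or None if no code is ranked."""
    ranked = [code for code in codes if code in RANK]
    if not ranked:
        return None
    # min() keeps the first code encountered when two share a tier.
    return min(ranked, key=RANK.__getitem__)


strongest = best_evidence(["IEA", "ISS", "IMP"])
unranked = best_evidence(["ND"])
```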

  47. Annotation summaries Meloidogyne incognita: McCarter et al. 2003
