
Strategies & Examples for Functional Modeling

Strategies & Examples for Functional Modeling. COST Functional Modeling Workshop 22-24 April, Helsinki. Types of data sets and modeling. Commercial array data – more likely to have tools that support the use of array IDs.


Presentation Transcript


  1. Strategies & Examples for Functional Modeling COST Functional Modeling Workshop 22-24 April, Helsinki

  2. Types of data sets and modeling • Commercial array data – more likely to have tools that support the use of array IDs. • Custom/USDA array data – problems with updating IDs, linking to function and using array IDs directly in functional modeling tools. • Proteomics data – larger data sets; need to make background references to determine enrichment. • RNA-Seq data – larger and more complex data sets; novel transcripts currently can’t be included in modeling (contact AgBase to assign GO). • Real-time data or quantitative proteomics data – hypothesis testing.

  3. Functional Modeling Strategies • GO summary (using Slim sets) • GO enrichment (statistical!) • Pathways analysis • Interaction or networks analysis • Hypothesis testing Note: • Functional modeling should be integrated. • Approaches are complementary, not exclusive. • Modeling is driven by the biology (not the other way round).

  4. Modeling Strategy • Think about using multiple functional approaches. • GO, pathways, networks • complementary • What is available for your species? • What GO is available? • What species does the pathways/network analysis use? • What resources do you have? • at your institute (e.g. commercial pathways analysis) • open source (e.g. GO Enrichment analysis) • using online vs installed • Iterative – further functional modeling based on initial results • GO hypothesis testing?

  5. 1. GO Functional Summary • high throughput data sets give us 1000s-10,000s of gene products • can’t know everything about all gene products • tendency to ‘cherry pick’ ones you recognize • instead, can group gene products by function • this gives us a manageable number of categories to process • enables us to see trends, patterns, etc. • Use GO Slim sets to ‘summarize’ data • Lose details (but can gain perspective). • Some GO Slim sets are ageing – not being updated as changes to the GO are made. • Different Slim sets have different terms – which is best for your data? AgBase GOSlimViewer tool.
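The grouping idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how GOSlimViewer works internally: the gene names and the gene-to-Slim-term mapping are hypothetical, and in practice the mapping would come from a tool such as GOSlimViewer.

```python
from collections import Counter

def slim_summary(gene_to_slim_terms):
    """Count how many gene products fall into each GO Slim category.

    gene_to_slim_terms maps a gene product ID to a list of Slim term names.
    Gene products with no terms are counted as "function unknown".
    """
    counts = Counter()
    for terms in gene_to_slim_terms.values():
        if not terms:
            counts["function unknown"] += 1
            continue
        # set() so a gene product is counted once per category
        for term in set(terms):
            counts[term] += 1
    return counts

# Hypothetical input: IDs and assignments are made up for illustration.
example = {
    "geneA": ["immune response", "signal transduction"],
    "geneB": ["immune response"],
    "geneC": ["apoptosis"],
    "geneD": [],
}
summary = slim_summary(example)
for term, n in summary.most_common():
    print(f"{term}\t{n}")
```

Because one gene product can belong to several categories, the counts can add up to more than the number of gene products.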

  6. http://www.agbase.msstate.edu/help/slimviewerhelp.htm The Slim set you use matters - need to determine which one to use & report it in Methods.

  7. Functional Summary • Not all GO terms are annotated equally, e.g., metabolism! • can slim the complete GO for a species as a background set and then determine which terms in your data are disproportionately represented. • Can use Slims to compare two data sets (e.g., control vs treatment). • Use Slims for your own sanity – are you seeing what you expect to see?
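A rough way to compare a data set against a species-wide background is to compare each Slim category's share of annotations. The sketch below assumes per-category counts are already available; all numbers are hypothetical, and the ratio is descriptive only (it is not a statistical test).

```python
def slim_proportions(counts):
    """Convert per-category counts into each category's share of the total."""
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def fold_vs_background(data_counts, background_counts):
    """Ratio of a category's share in the data to its share in the background.

    Values above 1 suggest the category is more common in the data than in
    the background; values below 1 suggest it is less common.
    """
    data = slim_proportions(data_counts)
    background = slim_proportions(background_counts)
    return {
        term: data[term] / background[term]
        for term in data
        if background.get(term)  # skip categories absent from the background
    }

# Hypothetical counts for illustration only.
data_counts = {"metabolism": 30, "immune response": 20}
background_counts = {"metabolism": 800, "immune response": 200}
ratios = fold_vs_background(data_counts, background_counts)
```

The same two functions can be applied to a control and a treatment data set by passing one of them in place of the background.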

  8. [Figure: B-cells and stroma] Membrane proteins grouped by GO BP: cell cycle/cell proliferation, cell adhesion, cell growth, apoptosis, immune response, ion/proton transport, cell migration, cell-cell signaling, function unknown, development, endocytosis, proteolysis and peptidolysis, signal transduction, protein modification.

  9. [Figure: B-cells and stroma] Membrane proteins grouped by GO BP: cell cycle/cell proliferation, apoptosis, immune response, cell migration, cell-cell signaling, function unknown.

  10. BVDV Infection – cytopathic (CP) vs non-cytopathic (NCP) infection (comparing function between 2 different conditions)

  11. 2. Determining over-represented or under-represented function. • most typically used functional analysis method • many, many tools that do this – see: http://www.geneontology.org/GO.tools.microarray.shtml • very different visualization • will use some of these tools in practical session
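Many enrichment tools base their over-/under-representation call on a hypergeometric (Fisher-type) test, although each tool differs in details such as multiple-testing correction. A minimal sketch using only the Python standard library follows; the function and parameter names are chosen here for illustration and do not come from any particular tool.

```python
from math import comb

def enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    drawn from a background of N genes of which K carry the GO term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def depletion_p(k, n, K, N):
    """P(X <= k): chance of seeing at most k annotated genes (under-representation)."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(0, k + 1))
    return hits / comb(N, n)

# Hypothetical numbers: 12 of 100 study genes carry a term that
# 200 of 10,000 background genes carry.
p_over = enrichment_p(12, 100, 200, 10000)
print(f"over-representation p-value: {p_over:.3g}")
```

When many GO terms are tested at once, the resulting p-values need a multiple-testing correction, which this sketch leaves out.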

  12. http://david.abcc.ncifcrf.gov/home.jsp

  13. Some useful expression analysis tools: • Database for Annotation, Visualization and Integrated Discovery (DAVID) • http://david.abcc.ncifcrf.gov/ • AgriGO -- GO Analysis Toolkit and Database for Agricultural Community • http://bioinfo.cau.edu.cn/agriGO/ • used to be EasyGO • chicken, cow, pig, mouse, cereals, dicots • adding new species by request • Onto-Express • http://vortex.cs.wayne.edu/projects.htm#Onto-Express • can provide your own gene association file • Ontologizer • WebStart widget (requires Java); now on Galaxy • http://compbio.charite.de/contao/index.php/ontologizer2.html • requires OBO file & GAF (enables users to select their own annotations)

  14. GO Enrichment tools that support agricultural species.

  15. structurally and functionally re-annotated a microarray • quantified the impact of this re-annotation based on GO annotations & pathways represented on the array • tested using a previously published experiment that used this microarray • re-annotation allows more comprehensive GO based modeling and improves pathway coverage • re-annotation resulted in a different model from previously published research findings

  16. Evaluating GO tools Some criteria for evaluating GO Tools: • Does it include my species of interest (or do I have to “humanize” my list)? • What does it require to set up (computer usage/online)? • What was the source for the GO (primary or secondary) and when was it last updated? • Does it report the GO evidence codes (and is IEA included)? • Does it report which of my gene products have no GO? • Does it report both over/under represented GO groups and how does it evaluate this? • Does it allow me to add my own GO annotations? • Does it represent my results in a way that facilitates discovery?
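Two of these criteria (evidence codes, and which gene products have no GO) can be checked independently of any tool if a gene association file (GAF) is at hand. The sketch below assumes the standard tab-separated GAF layout, in which column 3 is the gene symbol, column 5 the GO ID and column 7 the evidence code; the accessions, symbols and references in the example lines are hypothetical.

```python
from collections import defaultdict

def read_gaf(lines, exclude_evidence=()):
    """Map gene symbol -> set of GO IDs, optionally skipping some evidence codes."""
    annotations = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # header/comment or blank line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed line
        symbol, go_id, evidence = cols[2], cols[4], cols[6]
        if evidence in exclude_evidence:
            continue
        annotations[symbol].add(go_id)
    return annotations

def without_go(gene_list, annotations):
    """Return the genes from gene_list that have no GO annotation."""
    return [g for g in gene_list if g not in annotations]

# Hypothetical GAF content for illustration only.
gaf_lines = [
    "!gaf-version: 2.2",
    "UniProtKB\tP00001\tGENE1\tinvolved_in\tGO:0006955\tREF:placeholder\tIDA\t\tP",
    "UniProtKB\tP00002\tGENE2\tinvolved_in\tGO:0006915\tREF:placeholder\tIEA\t\tP",
]
all_annotations = read_gaf(gaf_lines)
no_iea = read_gaf(gaf_lines, exclude_evidence={"IEA"})
missing = without_go(["GENE1", "GENE2", "GENE3"], no_iea)
```

Comparing `all_annotations` with `no_iea` shows how much of the available GO for a list depends on electronically inferred (IEA) annotations.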

  17. RNASeq GO Enrichment • RNASeq experiments: longer transcripts and more highly expressed transcripts are more likely to be differentially expressed (DE). • Current GO enrichment tools do not account for RNASeq platform bias (most are based upon arrays). • they assume that all genes are independent and equally likely to be selected as DE
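One simple diagnostic for this bias, offered here as an assumption-laden sketch rather than a correction method, is to sort genes by transcript length, split them into equal-sized bins and compare the fraction called DE in each bin. The input pairs below are hypothetical.

```python
def de_fraction_by_length_bin(genes, n_bins=4):
    """genes: list of (transcript_length, is_de) pairs.

    Returns the fraction of DE genes in each length bin, shortest bin first.
    A rising trend suggests that length influences which genes are called DE.
    """
    ranked = sorted(genes, key=lambda g: g[0])
    size = len(ranked)
    fractions = []
    for b in range(n_bins):
        chunk = ranked[b * size // n_bins:(b + 1) * size // n_bins]
        if chunk:  # bins can be empty when there are fewer genes than bins
            fractions.append(sum(1 for _, de in chunk if de) / len(chunk))
    return fractions

# Hypothetical data: (length in bp, called DE?)
genes = [
    (500, False), (800, False), (1500, False), (2000, True),
    (4000, True), (6000, True), (9000, True), (700, False),
]
fractions = de_fraction_by_length_bin(genes, n_bins=2)
```

If the fractions are roughly flat across bins, the equal-likelihood assumption of array-based enrichment tools is less of a concern for that data set.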

  18. 3. Pathway Analysis • Freely available tools: • from public databases, e.g. KEGG & Reactome • Freely available tools, e.g. Cytoscape • Commercial pathways analysis tools: e.g., Ingenuity Pathways Analysis (IPA), Pathway Studio, etc. • some tools only cover a limited set of species – need to “humanize” animal data (or, for plants, map to Arabidopsis) • everything gives you cancer • Many pathways analysis tools combine pathways analysis and network analysis.

  19. Reactome Skypainter http://www.reactome.org/cgi-bin/skypainter2

  20. KEGG Pathways http://www.kegg.jp/kegg/download/kegtools.html

  21. Analysis tools (commercial) Networks Ingenuity Pathway Analysis Pathways functions and diseases http://www.ingenuity.com Gene Ontology (GO) groups Pathway Studio GSEA Pathways http://www.ariadnegenomics.com/ IPA analysis included as IPA.txt

  22. Data Curation • Ingenuity: Database manually curated by Ph.D.-level scientists (mining 32 different peer-reviewed journals). • Pathway Studio: Automated curation by MedScan Reader using natural language processing (NLP) technology, mining PubMed abstracts and peer-reviewed journals • users can do their own text mining

  23. Comparison Criteria • Features • Proportion of proteins involved in modeling • Data generation • Display • Test Dataset: 3,600 bovine spermatozoa proteins (Comparison by Divya Peddinti)

  24. Proteins involved in modeling

  25. Data generation 37 7 26

  26. Pathway display EGF signaling pathway

  27. 4. Network Analysis • IPA & Pathway Studio are equally efficient at drawing networks of relationships. • IPA: simplifies the pathway display and creates a more manageable, user-friendly network for users to analyze. • Pathway Studio: shows the relations in a table format. • STRING Database – known and predicted protein interactions.

  28. http://string-db.org/

  29. http://www.cytoscape.org/

  30. 5. Hypothesis Testing • high throughput data sets – ‘fishing expedition’ or hypothesis generation • but GO also serves as a repository of biological function – can be used for hypothesis testing based on these data sets

  31. The critical time point in MD lymphomagenesis. Genotype Hypothesis: At the critical time point of 21 dpi, MD-resistant genotypes have a T-helper (Th)-1 microenvironment (consistent with CTL activity), but MD-susceptible genotypes have a T-reg or Th-2 microenvironment (antagonistic to CTL). [Figure: mean total lesion score vs. days post infection for Susceptible (L72) and Resistant (L61) genotypes; non-MHC associated resistance and susceptibility.]

  32. Cytokines and T helper cell differentiation (Shyamesh Kumar). [Diagram: APC, naive CD4+ T cell, Th-1, Th-2, T-reg.]

  33. Th-1, Th-2, T-reg? Inflammatory? [Diagram: naive CD4+ T cell, macrophage, APC, NK cell, CTL, Th-1, Th-2, T-reg; labels: IL 4, IL 10, IL 12, IL 18, IFN γ, TGFβ, Smad 7; samples: L6 Whole, L7 Whole, L7 Micro.]

  34. Step I. GO-based Phenotype Scoring. Step II. Multiply by quantitative data for each gene product. Step III. Inclusion of quantitative data to the phenotype scoring table and calculation of net effect. Scoring table (gene product: Th1, Th2, Treg, Inflammation; ND = No data): IL-2: 1, ND, 1, -1; IL-4: -1, 1, 1, ND; IL-6: 1, -1, 1; IL-8: ND, ND, 1, 1; IL-10: -1, 1, 1, 0; IL-12: 1, -1, ND, ND; IL-13: -1, 1, ND, ND; IL-18: 1, 1, 1, 1; IFN-g: 1, -1, 1, 1; TGF-b: -1, 0, 1, -1; CTLA-4: -1, -1, 1, -1; GPR-83: -1, -1, 1, -1; SMAD-7: 1, 1, -1, 1.
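The three steps on this slide amount to a weighted sum per phenotype: each gene product's score (1, -1 or 0) is multiplied by its measured quantity, and the products are added up. The sketch below is one reading of that procedure, not the authors' code. The scores are copied from the slide for a subset of gene products, while the quantitative values are hypothetical and ND entries are simply skipped (an assumption).

```python
PHENOTYPES = ("Th1", "Th2", "Treg", "Inflammation")

# Step I: GO-based phenotype scores from the slide; None stands for ND.
# IL-6 is left out because its row has only three values in the transcript.
SCORES = {
    "IL-2": (1, None, 1, -1),
    "IL-4": (-1, 1, 1, None),
    "IL-10": (-1, 1, 1, 0),
    "IL-12": (1, -1, None, None),
    "IFN-g": (1, -1, 1, 1),
    "TGF-b": (-1, 0, 1, -1),
}

def net_effect(scores, quantities):
    """Steps II and III: multiply each score by the gene product's
    quantitative value and sum the products per phenotype."""
    totals = dict.fromkeys(PHENOTYPES, 0.0)
    for gene, row in scores.items():
        quantity = quantities.get(gene)
        if quantity is None:
            continue  # no measurement for this gene product
        for phenotype, score in zip(PHENOTYPES, row):
            if score is not None:  # skip ND entries
                totals[phenotype] += score * quantity
    return totals

# Hypothetical quantitative data (e.g. relative expression values).
example_quantities = {
    "IL-2": 2.0, "IL-4": 0.5, "IL-10": 1.0,
    "IL-12": 3.0, "IFN-g": 4.0, "TGF-b": 1.5,
}
result = net_effect(SCORES, example_quantities)
for phenotype in PHENOTYPES:
    print(phenotype, result[phenotype])
```

Running the same function on measurements from each genotype gives one net-effect value per phenotype and genotype, which is the kind of quantity a net-effect bar chart would display.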

  35. Microscopic lesions. [Figure: net effect by phenotype (Th-1, Th-2, T-reg, Inflammation) for L6 (R) and L7 (S).]

  36. L6 Resistant L7 Susceptible Pro T-reg Pro T-reg Pro Th-1 Pro Th-2 Anti Th-2 Anti Th-1 Anti CTL Pro CTL Anti CTL Pro CTL

  37. Concluding thoughts on functional modeling. “By doing just a little every day, I can gradually let the task overwhelm me.” Ashleigh Brilliant

  38. Bringing it all together… • There is no one “correct” way; there is no “right” answer. • Using multiple functional modeling strategies (e.g., GO, pathways, networks) can help with insights. • Need to use biological knowledge to bring these different approaches together. • Functional modeling is often iterative. • Need to focus not only on what is known but what is new!

  39. Overview of Functional Modeling Strategy Genes/Proteins with no GO annotations Microarrays ArrayIDer GOanna GORetriever Blast2GO Protein/Gene identifiers Proteomics GO annotations Genome2seq RNASeq GO Enrichment analysis Ingenuity Pathways Analysis (IPA) Pathway Studio Cytoscape DAVID AgriGO Onto-tools GOSlimViewer AutoSlim Pathways and network analysis Ingenuity Pathways Analysis (IPA) Pathway Studio Cytoscape DAVID Yellow boxes represent AgBase tools Green boxes are non-AgBase resources

  40. Functional Modeling Considerations • Should I add my own GO? • use GOProfiler to see how much GO is available for your species • use GORetriever to find existing GO for your dataset • Does analysis tool allow me to add my own GO? • Should I do GO analysis and pathway analysis and network analysis? • different functional modeling methods show different aspects about your data (complementary) • is this type of data available for your species (or a close ortholog)? • What tools should I use? • which tools have data for your species of interest? • what type of accessions are accepted? • availability (commercial and freely available)

  41. Some Limitations • Annotation is not complete. • not all the data is annotated • some gene products have no functional information • Gene Ontology is only one aspect of functional modeling. • anatomy, tissue expression, phenotype, disease, etc • Gene nomenclature – need to know what we are annotating! • Functional modeling tools need to handle larger data sets (& multiple ontologies?).
