
Finding Orthologous Groups




  1. Finding Orthologous Groups René van der Heijden

  2. What is this lecture about? • What is ‘orthology’? • Why do we study gene-ancestry/gene-trees (phylogenies)? • Several approaches to find orthologous genes • High-resolution orthology • Steps involved • Things to think about (homework)

  3. Homology Genes are homologous if and only if they derive from the same ancestral gene • Sufficient sequence similarity proves homology • Very dissimilar sequences: PSI-BLAST, HMM searches

  4. The usual range Homologous genes tend to have similar functions

  5. Homologous genes tend to have similar functions. Accurate function prediction requires something better than homology: orthology.

  6. Duplications, Speciations, and Orthology Evolution results in: • Growing number of genes • Gene duplications • Horizontal gene transfer • De novo generation • Growing number of species Tendency for functional expansion • The fate of gene duplicates: • Perish • Find a new functional niche

  7. Duplications, Speciations, and Orthology Two genes in two species are orthologous if they derive from one gene in their last common ancestor [figure label: gene duplication by cell division] • Orthologous genes are likely to have the same function • Much stronger than “tend to have similar function”

  8. Duplications, Speciations, and Orthology [Figure labels: present genes, primal ancestor, evolutionary distance]

  9. Homologs, Orthologs, and Paralogs • Homologous: one common ancestral gene • Orthologous: separated by a speciation event • Paralogous: separated by a duplication event • Orthologs and Paralogs must be Homologs. The view on orthology and paralogy is relative to a certain speciation. Are there homologous genes which are neither orthologous nor paralogous?
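
The definitions on this slide can be tried out on a toy gene tree. The sketch below is only an illustration: the `Node` class, the `relation` function, and the gene names are made up for this example, and it assumes every internal node has already been labelled as a speciation or a duplication.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    event: str = "leaf"   # "speciation", "duplication", or "leaf" (hypothetical labels)
    gene: str = ""        # gene name, used on leaves only
    children: list = field(default_factory=list)


def leaf_names(node):
    """Collect the gene names found below a node."""
    if not node.children:
        return {node.gene}
    names = set()
    for child in node.children:
        names |= leaf_names(child)
    return names


def relation(root, gene_a, gene_b):
    """Classify two distinct genes by the event at their last common ancestor node."""
    node = root
    while True:
        for child in node.children:
            names = leaf_names(child)
            if gene_a in names and gene_b in names:
                node = child   # both genes sit below this child: descend
                break
        else:
            break              # no child holds both genes: `node` is the last common ancestor
    return "orthologs" if node.event == "speciation" else "paralogs"


# Toy tree: an old duplication creates copies A and B, then human and mouse split.
toy_tree = Node("duplication", children=[
    Node("speciation", children=[Node(gene="humanA"), Node(gene="mouseA")]),
    Node("speciation", children=[Node(gene="humanB"), Node(gene="mouseB")]),
])

print(relation(toy_tree, "humanA", "mouseA"))  # orthologs
print(relation(toy_tree, "humanA", "mouseB"))  # paralogs
```

In this toy tree every homologous pair comes out as either orthologous or paralogous, because each pair meets at exactly one labelled node.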

  10. Inparalogs and Outparalogs • Both in- and outparalogous genes are separated by a gene duplication event • For Inparalogs, the duplication event is not followed by speciation(s) • Outparalogs are separated by a duplication event, followed by speciation(s) Are Inparalogs Orthologs? Depends on your definition: Yes: two genes are orthologous if they derive from one gene in their last common ancestor No: two genes are orthologous if they are only separated by cell division events • Inparalogs are recent paralogs • Outparalogs are more ancient paralogs
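
The rule "duplication followed by speciation(s) or not" can be written down as a small check. This is a sketch under assumptions: the nested-tuple tree format `(event, [children])`, the function names, and the gene names are all hypothetical and not taken from the lecture.

```python
def path_to(node, gene):
    """Return the list of nodes from `node` down to leaf `gene`, or None if absent."""
    if isinstance(node, str):                 # a leaf is just a gene name
        return [node] if node == gene else None
    for child in node[1]:                     # internal node: (event, [children])
        sub = path_to(child, gene)
        if sub is not None:
            return [node] + sub
    return None


def paralog_type(root, gene_a, gene_b):
    """Classify two distinct genes of the tree as inparalogs, outparalogs, or not paralogs."""
    pa, pb = path_to(root, gene_a), path_to(root, gene_b)
    k = 0
    while k + 1 < min(len(pa), len(pb)) and pa[k + 1] is pb[k + 1]:
        k += 1                                # walk down the shared part of both paths
    lca = pa[k]                               # last common ancestor node
    if lca[0] != "duplication":
        return "not paralogs"
    # Internal nodes strictly between the duplication and the two genes.
    below = pa[k + 1:-1] + pb[k + 1:-1]
    if any(node[0] == "speciation" for node in below):
        return "outparalogs"                  # duplication followed by speciation(s)
    return "inparalogs"                       # duplication not followed by speciation


# Toy tree: old duplication (A vs B), then speciation; B duplicates again in human.
toy_tree = ("duplication", [
    ("speciation", ["humanA", "mouseA"]),
    ("speciation", [("duplication", ["humanB1", "humanB2"]), "mouseB"]),
])

print(paralog_type(toy_tree, "humanB1", "humanB2"))  # inparalogs
print(paralog_type(toy_tree, "humanA", "humanB1"))   # outparalogs
```

Here humanB1 and humanB2 are both co-orthologous to mouseB under the "Yes" definition on the slide.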

  11. Reading Gene-Trees Although genes spec1,1 and spec2,1 are closer relatives, their distance is larger than that between spec1,1 and spec3,1 The tree suggests at least 2 gene losses

  12. In- and Outparalogs, Orthologs, and Co-orthologs

  13. More examples

  14. www = What, Why, and hoW? • What: Orthologous genes are separated by cell division only • Why: Orthologous genes are likely to have the same function • How: Yes, how can orthologous relations be established?

  15. Several approaches • The COG approach • InParanoid • Tree-based methods

  16. COG approach • Based on blast hits • Establishment and extension of triangles:

  17. COG approach II Extension of orthologous groups
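
The triangle idea from the two COG slides can be sketched as follows. This is a simplified illustration, not the COG pipeline itself: it assumes the best-hit pairs have already been extracted from the BLAST results and are symmetric, and `cog_groups`, `species_of`, and the gene names are hypothetical.

```python
from itertools import combinations


def cog_groups(best_hit_pairs, species_of):
    """Find best-hit triangles across three species and merge triangles sharing a side."""
    edges = {frozenset(pair) for pair in best_hit_pairs}
    genes = sorted({gene for edge in edges for gene in edge})

    # A triangle: three genes from three different species, all pairwise best hits.
    triangles = [
        frozenset(trio) for trio in combinations(genes, 3)
        if len({species_of[g] for g in trio}) == 3
        and all(frozenset(side) in edges for side in combinations(trio, 2))
    ]

    # Union-find over triangles: two triangles are joined when they share a side.
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(triangles)), 2):
        if len(triangles[i] & triangles[j]) == 2:
            parent[find(i)] = find(j)

    groups = {}
    for i, triangle in enumerate(triangles):
        groups.setdefault(find(i), set()).update(triangle)
    return sorted(sorted(group) for group in groups.values())


species_of = {"a1": "A", "b1": "B", "c1": "C", "d1": "D", "a2": "A", "b2": "B"}
pairs = [("a1", "b1"), ("a1", "c1"), ("b1", "c1"),   # triangle a1-b1-c1
         ("b1", "d1"), ("c1", "d1"),                 # triangle b1-c1-d1 shares side b1-c1
         ("a2", "b2")]                               # lone pair: never forms a triangle
print(cog_groups(pairs, species_of))
```

The lone pair a2/b2 stays out of every group, because in this sketch a gene needs a consistent hit in a third species before it is grouped.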

  18. InParanoid I • Method denotes • IN- and OUTparalogs • For TWO species • Find all hits from species A on B • Find all hits from species B on A • Find all bi-directional best hits (BBH) • These form putative orthologs
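
The bi-directional best hit step can be sketched like this. The input format (a dict from `(query, target)` to a similarity score), the function names, and the scores are assumptions made for the example; real hit tables would come from a BLAST run.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target. scores: {(query, target): score}."""
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Made-up scores for species A searched against B, and B against A.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90, ("a3", "b1"): 120}
b_vs_a = {("b1", "a1"): 198, ("b1", "a3"): 118, ("b2", "a1"): 149, ("b2", "a2"): 88}

print(bidirectional_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1')]
```

Gene a2 has b2 as its best hit, but b2 prefers a1, so a2/b2 is not reported as a putative ortholog pair.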

  19. InParanoid II • Find all hits from A on A • Find all hits from B on B • Find all InParalogs • These are all hits better than the orthologs • Better => more recently split
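
The inparalog step ("all hits better than the orthologs") can be sketched as a simple filter. This follows the slide's wording only; the function name, the input format, and the scores are hypothetical, and the actual InParanoid program applies further rules that are not shown here.

```python
def inparalogs(seed_gene, ortholog_score, within_species_scores):
    """Genes of the same species that hit the seed ortholog better than its ortholog does.

    within_species_scores: {(query, target): score} for a species searched against itself.
    """
    return sorted(
        target
        for (query, target), score in within_species_scores.items()
        if query == seed_gene and target != seed_gene and score > ortholog_score
    )


# Made-up numbers: a1 and b1 are putative orthologs with score 200.
a_vs_a = {("a1", "a1"): 400, ("a1", "a4"): 250, ("a1", "a5"): 120}

print(inparalogs("a1", 200, a_vs_a))  # ['a4']
```

Gene a4 is more similar to a1 than b1 is, which suggests it split from a1 after the speciation; a5 scores lower and is left out.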

  20. InParanoid III • Putative orthologous pairs are curated by an outgroup species C • InParalogs are given a confidence value • Bootstrapping is used to give confidence values for orthologous pairs

  21. Genes with promiscuous domains • Gene A may hit on gene B because of a shared domain X • Gene B may hit on gene C because of a shared domain Y • Promiscuous domains require (manual) curation

  22. Tree-based methods • Get all homologous genes • Make multiple alignments • Generate phylogenetic gene trees • Analyze trees • Uncertainty in multiple alignment? • Different methods for distance calculations • Superpose a trusted species tree? • How to assess a level of accuracy?
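
For the "Analyze trees" step, one common heuristic (brought in here for illustration, not named on the slide) is species overlap: a node is called a duplication if the same species occurs on both sides, otherwise a speciation. The sketch assumes a binary tree of nested 2-tuples and leaf names of the form `species_gene`; both conventions are made up for this example.

```python
def label_events(tree):
    """Label every internal node of a binary gene tree by the species-overlap rule.

    Returns {sorted tuple of leaf names below the node: "duplication" or "speciation"}.
    """
    events = {}

    def visit(node):
        if isinstance(node, str):                     # leaf such as "human_A"
            return {node.split("_")[0]}, (node,)
        left_species, left_leaves = visit(node[0])
        right_species, right_leaves = visit(node[1])
        leaves = tuple(sorted(left_leaves + right_leaves))
        # Same species on both sides: the split must be a gene duplication.
        overlap = left_species & right_species
        events[leaves] = "duplication" if overlap else "speciation"
        return left_species | right_species, leaves

    visit(tree)
    return events


toy_tree = (("human_A", "mouse_A"), (("human_B1", "human_B2"), "mouse_B"))
for leaves, event in label_events(toy_tree).items():
    print(event, leaves)
```

The rule does not use a trusted species tree and does not count gene losses, so a wrongly placed branch will simply be read as an extra duplication.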

  23. The Phylogenetic Gene-Tree • Multiple alignment for all genes • Distance matrix calculation • Kimura correction • PAM model • Categories model • Large trees: distance-based methods • Neighbor Joining
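
The distance-matrix step can be sketched for one pair of aligned sequences. The slide does not say which Kimura formula is meant; this example assumes Kimura's empirical correction for protein distances, d = -ln(1 - p - 0.2·p²), where p is the fraction of differing aligned positions. The function names and the two sequences are made up.

```python
import math


def p_distance(seq1, seq2):
    """Fraction of differing positions between two aligned sequences, skipping gap columns."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(seq1, seq2) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in columns) / len(columns)


def kimura_distance(p):
    """Kimura's empirical protein correction (assumed here): d = -ln(1 - p - 0.2 * p**2)."""
    inside = 1.0 - p - 0.2 * p * p
    if inside <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inside)


# Two made-up aligned protein fragments that differ at 2 of 10 positions.
p = p_distance("MKVLAAGIVG", "MKILAAGLVG")
d = kimura_distance(p)
print(p, d)
```

The corrected distance is larger than p because it tries to account for multiple substitutions at the same site. A matrix of such distances for all gene pairs is what a method like Neighbor Joining takes as input.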

  24. Uncertainty in trees • Evolutionary noise • Differing rates of evolution • Convergent evolution (low complexity, coiled coils) • Promiscuous domains (recombination, fusion, fission) • Use of heuristic methods • Multiple alignment • Tree making

  25. Analyze trees … but don’t trust them fully • Rigid analysis suggests many duplications and losses • Presume scp branch is wrongly placed! If this is correct …. this can’t be

  26. Analyze trees … but don’t trust them fully • And if we accept wrong placement of branches … Considering one wrongly placed gene leaves only 2 gene losses Three orthologous groups suggesting 15 gene losses

  27. High-res versus Low-res • Many, • Complete, and • Closely related genomes Challenge: Automatic Orthology assignment

  28. Things to think about (homework) • Select a partner • Collect a gene tree (and some copies) • Carefully deduce which nodes are duplications and which are speciations • Denote which genes are orthologous to each other (orthologous groups) • Select interesting parts to predict what • The COG procedure would say • InParanoid would say • What would have happened if some genes (or species) were not involved in the analysis

  29. Homework: also think about … Start discussions here and now. Hand in written results before January 5th at CMBI secretary or room A3009
