Specialized databases

Specialized databases collect sets of data that are homogeneous from a taxonomic and/or functional point of view, drawn from the primary databases and/or the literature, then reviewed and annotated with value-added information.

nishi

Presentation Transcript


  1. Specialized databases

  2. Specialized databases • Specialized databases collect sets of data that are homogeneous from a taxonomic and/or functional point of view, drawn from the primary databases and/or the literature, then reviewed and annotated with value-added information

  3. Specialized databases of protein patterns • Given an uncharacterized sequence: • Which family does it belong to? • What is its function?

  4. “The protein signature approach” • Compare sequences belonging to the same family, looking for common patterns • Build a database of conserved profiles (sequence elements conserved at specific positions) • Use these profiles (patterns) to classify an unknown sequence

  5. What are protein signatures? • Diagram labels: protein family/domain; multiple sequence alignment; build model; search UniProt; significant match; mature model; protein analysis • Example aligned sequences: ITWKGPVCGLDGKTYRNECALL, AVPRSPVCGSDDVTYANECELK
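The signature approach outlined in the slides above can be illustrated with a deliberately simplified sketch. It assumes the two sequences shown on the slide are already aligned column by column, treats only fully identical columns as the "signature", and scores a query by the fraction of those positions it reproduces. The function names and the query sequence are hypothetical; real resources use far richer models (regular expressions, profiles, HMMs).

```python
def conserved_positions(alignment):
    """Return {column index: residue} for columns where all sequences agree."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = {}
    for i in range(length):
        column = {seq[i] for seq in alignment}
        # Keep the column only if it holds a single residue and no gap.
        if len(column) == 1 and "-" not in column:
            conserved[i] = alignment[0][i]
    return conserved


def best_match(conserved, length, query):
    """Slide the signature along the query.

    Returns (fraction of conserved positions matched, offset) for the best
    window, or (0.0, -1) if the query is shorter than the signature.
    """
    if not conserved:
        raise ValueError("the signature has no conserved positions")
    best = (0.0, -1)
    for offset in range(len(query) - length + 1):
        hits = sum(query[offset + i] == res for i, res in conserved.items())
        score = hits / len(conserved)
        if score > best[0]:
            best = (score, offset)
    return best


# The two sequences shown on the slide, assumed to be aligned.
family = ["ITWKGPVCGLDGKTYRNECALL", "AVPRSPVCGSDDVTYANECELK"]
signature = conserved_positions(family)

# Hypothetical query: the first family member with two extra leading residues.
query = "MM" + family[0]
print(signature)
print(best_match(signature, len(family[0]), query))
```

A query that reproduces every conserved position in some window scores 1.0. A threshold below that (an assumption left to the user) would decide what counts as a significant match.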

  6. Diagnostic approaches (sequence-based) • Single motif methods: regex patterns (PROSITE) • Full domain alignment methods: profiles (Profile Library), HMMs (Pfam) • Multiple motif methods: identity matrices (PRINTS)

  7. Motif: patterns • Sequence alignment • Define pattern (xxxxxx xxxxxx xxxxxx xxxxxx) • Extract pattern sequences • Build regular expression: C-C-{P}-x(2)-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C • Pattern signature: PS00000
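To make the pattern on the slide above concrete, here is a minimal sketch that translates PROSITE pattern syntax into a Python regular expression and scans a sequence with it. It handles only the common elements (single residues, `x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors at the ends of the pattern); the function name `prosite_to_regex` and the test sequences are hypothetical and not part of any PROSITE tool.

```python
import re

# One PROSITE element: a residue, x, [allowed], or {forbidden},
# optionally followed by a repeat count such as (2) or (2,4).
_ELEMENT = re.compile(
    r"(?P<core>[A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern string into a Python regex string."""
    body = pattern.strip().rstrip(".")
    prefix = "^" if body.startswith("<") else ""   # N-terminal anchor
    suffix = "$" if body.endswith(">") else ""     # C-terminal anchor
    body = body.lstrip("<").rstrip(">")
    parts = []
    for element in body.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core = m.group("core")
        if core == "x":                 # any residue
            core = "."
        elif core.startswith("{"):      # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        if m.group("lo"):               # repeat count
            hi = m.group("hi")
            core += "{" + m.group("lo") + ("," + hi if hi else "") + "}"
        parts.append(core)
    return prefix + "".join(parts) + suffix


slide_pattern = "C-C-{P}-x(2)-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C"
regex = prosite_to_regex(slide_pattern)
print(regex)  # CC[^P].{2}C[STDNEKPI].{3}[LIVMFS].{3}C

# Hypothetical sequences: the first satisfies the pattern, the second
# has the forbidden P in the third position.
print(bool(re.search(regex, "CCAAACSAAALAAAC")))  # True
print(bool(re.search(regex, "CCPAACSAAALAAAC")))  # False
```

A single-motif method of this kind gives a yes/no answer: one mismatch at a constrained position is enough to miss a true family member, which is one reason profile and HMM methods exist alongside it.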

  8. Specialized databases of protein patterns

  9. Protein families • Pfam (short for Protein Families) is a database of protein domains described with hidden Markov models. It is divided into two sections: Pfam-A contains expert-curated alignments; Pfam-B contains sequences that are clustered automatically.

  10. Pfam

  11. InterPro Entry • Groups similar signatures together • Adds extensive annotation • Links to other databases • Structural information and viewers • Hierarchical classification

  12. InterPro hierarchies: Families • Families can have parent/child relationships with other families • Parent/child relationships are based on: • Comparison of protein hits: a child should be a subset of its parent; siblings should not have matches in common • Existing hierarchies in member databases • Biological knowledge of curators
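The two hit-comparison rules on the slide above (a child is a subset of its parent; siblings share no matches) map directly onto set operations. The sketch below is illustrative only: the function name, the entry names, and the protein identifiers are hypothetical, and it does not reflect how InterPro curators actually implement the check.

```python
from itertools import combinations


def check_family_hierarchy(parent_hits, children_hits):
    """Check the two hit-based rules for a parent entry and its children.

    parent_hits: set of protein identifiers matched by the parent entry.
    children_hits: dict mapping child entry name -> set of matched proteins.
    Returns a list of human-readable problems (empty if both rules hold).
    """
    problems = []
    # Rule 1: every child must be a subset of the parent.
    for name, hits in children_hits.items():
        extra = hits - parent_hits
        if extra:
            problems.append(
                f"{name} matches proteins outside the parent: {sorted(extra)}"
            )
    # Rule 2: siblings must not have matches in common.
    for (a, hits_a), (b, hits_b) in combinations(children_hits.items(), 2):
        shared = hits_a & hits_b
        if shared:
            problems.append(f"{a} and {b} share matches: {sorted(shared)}")
    return problems


# Hypothetical protein identifiers and entry names.
parent = {"P1", "P2", "P3", "P4"}
consistent = {"child_A": {"P1", "P2"}, "child_B": {"P3"}}
inconsistent = {"child_A": {"P1", "P2"}, "child_B": {"P2", "P5"}}

print(check_family_hierarchy(parent, consistent))
print(check_family_hierarchy(parent, inconsistent))
```

In practice these checks are only one input: as the slide notes, existing hierarchies in member databases and curator knowledge also shape the final relationships.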

  13. InterPro hierarchies: Domains • Domains can have parent/child relationships with other domains

  14. Domains and Families may be linked through Domain Organisation Hierarchy

  15. InterPro Entry • Groups similar signatures together • Adds extensive annotation • Links to other databases • Structural information and viewers

  16. InterPro Entry • Groups similar signatures together • Adds extensive annotation • Links to other databases • Structural information and viewers • The Gene Ontology project provides a controlled vocabulary of terms for describing gene product characteristics

  17. InterPro Entry • Groups similar signatures together • Adds extensive annotation • Links to other databases • Structural information and viewers • Linked resources: UniProt, KEGG ..., Reactome ..., IntAct ..., UniProt taxonomy, PANDIT ..., MEROPS ..., Pfam clans ..., PubMed

  18. InterPro Entry • Groups similar signatures together • Adds extensive annotation • Links to other databases • Structural information and viewers • PDB: 3-D structures • SCOP: structural domains • CATH: structural domain classification

  19. Searching InterPro

  20. Searching InterPro • Protein family membership • Domain organisation • Domains, repeats & sites • GO terms

  21. Searching InterPro

  22. Searching InterPro

  23. Specialized databases associated with nucleotide patterns • Eukaryotic Promoter Database (http://www.epd.isb-sib.ch/) • Transcription factors: TRANSFAC • Translation terminations: TransTERM • Vector database: VectorDB • Repeats database: Repbase

  24. Structural profiles • CATH (http://www.cathdb.info/) • SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/)

  25. CATH

  26. SCOP

  27. Specialized databases of • Genes • Genomes • Transcripts and expression profiles • Metabolic pathways • Mutations

  28. Specialized databases of genes • COGs • Entrez Gene • RefSeq

  29. Entrez Gene

  30. Genome sites • NCBI Genomes • EBI Genomes • TIGR (Craig Venter)

  31. The human genome • The human genome at NCBI • The human genome at Celera • Ensembl • UCSC Genome Bioinformatics

  32. Transcriptome databases • dbEST • UniGene • UTRdb/UTRsite

  33. Expression databases • GEO • ArrayExpress • EPDex

  34. Metabolic pathway databases • Kyoto Encyclopedia of Genes and Genomes (KEGG): http://www.genome.jp/kegg/

  35. Metabolic pathway databases • REACT_945.4
