An Introduction to Bioinformatics






  1. An Introduction to Bioinformatics: Molecular Biology Databases

  2. AIMS
     • To introduce the major databases
       - nucleotide
       - protein
     • To explain how to search the appropriate databases
     • To explain how to retrieve information from databases
     OBJECTIVES
     • Choose appropriate databases for information retrieval
     • Use Boolean operators to search databases
     • Retrieve nucleotide and protein sequence files

  3. Introduction
     • Hundreds!
     • Databases of databases!
     • Acronym rich!
     • Subcomponents
       - organisms
       - structure
       - metabolism…
     • Searched
       - text, sequences

  4. Historically
     • 1960s
       - Margaret Dayhoff: protein sequences
       - (Eck, R. V., and M. O. Dayhoff. 1966. Atlas of Protein Sequence and Structure 1966. National Biomedical Research Foundation, Silver Spring, Maryland.)
     • 1980s: explosion in DNA sequences
       - EMBL (European Molecular Biology Laboratory)
       - NIH (National Institutes of Health) GenBank
       - DDBJ (DNA Data Bank of Japan)
     • 1988
       - agreed on international collaboration

  5. Primary Databases
     • Experimentally determined nucleotide sequence
     • Inferred protein sequence
     • Nucleotides: EMBL, GenBank, DDBJ
     • Proteins: GenPept, PIR (Protein Identification Resource), SWISS-PROT
     • Which to choose?

  6. Composite Databases SWISS-PROT, Swissnew, Trembl, Tremblnew, Genbank, PIR, Wormpep and PDB SWISS-PROT + PIR + GenPept +

  7. Secondary Databases
     • Analytical results of primary databases
     • Searching for related patterns
       - Prosite
       - Pfam
     More on these later.

  8. Sub-Databases
     • EST - Expressed Sequence Tags
     • STS - Sequence Tagged Sites
     • SNP - Single Nucleotide Polymorphisms
     • OMIM - Online Mendelian Inheritance in Man

  9. Searching and Retrieval
     • Entrez - National Center for Biotechnology Information
     • SRS - European Bioinformatics Institute
     • DBGET - Japan’s GenomeNet
     These systems can retrieve a specific nucleotide or protein sequence and provide links to additional related information.

  10. Entrez

  11. Entrez Tutorial
     Q: Are there any genes that code for penicillin binding in the Mycobacterium genome?
     • Search for penicillin-binding genes
     • Search for Mycobacterium tuberculosis
     • Combine the searches
     • Scan the output
     An example of a text-based search to identify genes that have already been annotated.

  12. #1 AND #2
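
The "#1 AND #2" step can also be sketched outside the web interface. The Python below is a minimal illustration, not the tutorial's own method: it only builds the combined query string and a request URL for NCBI's E-utilities ESearch service, and sends nothing. The choice of the `gene` database, the `[Organism]` field tag, and the helper names `combine_and` and `esearch_url` are assumptions made for this example.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (text searches against Entrez databases).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def combine_and(*terms: str) -> str:
    """Join search terms with AND, wrapping each term in parentheses."""
    return " AND ".join(f"({t})" for t in terms)


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch request URL."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{ESEARCH_URL}?{query}"


# Hypothetical wording for the two searches on the tutorial slide.
search_1 = "penicillin-binding"                    # search #1
search_2 = "Mycobacterium tuberculosis[Organism]"  # search #2 (field tag assumed)

combined = combine_and(search_1, search_2)         # equivalent of "#1 AND #2"
url = esearch_url("gene", combined)                # "gene" database is an assumption

print(combined)
print(url)
```

Opening the printed URL in a browser, or requesting it from a script, should return the matching record identifiers, which can then be scanned as in the last tutorial step.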

  13. SRS guide

  14. Searching the Databases
     • Subject
     • Accession numbers, e.g. AF208262
     • Author
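
Retrieval by accession number can be illustrated with a short sketch. The Python below builds a request URL for NCBI's E-utilities EFetch service using the accession from the slide; the `nuccore` database, the FASTA return type, and the helper names `efetch_url` and `fetch_fasta` are assumptions chosen for the example. The download helper is defined but not called here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities EFetch endpoint (record retrieval by identifier).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str, db: str = "nuccore", rettype: str = "fasta") -> str:
    """Build an EFetch URL that requests one record as plain text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_fasta(accession: str) -> str:
    """Download the record as FASTA text (needs network access; not called here)."""
    with urlopen(efetch_url(accession)) as response:
        return response.read().decode("utf-8")


fasta_url = efetch_url("AF208262")  # accession number taken from the slide
print(fasta_url)
```

Calling `fetch_fasta("AF208262")` on a machine with network access should return the sequence file as text, which can be saved to disk for later analysis.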

  15. Boolean Operators
     • AND will locate all records containing both words, e.g. human AND protease
     • OR will locate all records containing either word, not necessarily both, e.g. human OR protease
     • NOT will locate records containing one word but not the other, e.g. human NOT protease
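
The three operators correspond directly to set operations, which a small sketch can make concrete. The four records below are made up for illustration and do not come from any database; `ids_containing` is a hypothetical helper that returns the IDs of records containing a given word.

```python
# Toy records (invented for this example): record ID -> description text.
records = {
    1: "human serine protease",
    2: "mouse cysteine protease",
    3: "human insulin receptor",
    4: "yeast alcohol dehydrogenase",
}


def ids_containing(word: str) -> set[int]:
    """Return the IDs of records whose text contains the word (case-insensitive)."""
    target = word.lower()
    return {rid for rid, text in records.items() if target in text.lower().split()}


human = ids_containing("human")
protease = ids_containing("protease")

human_and_protease = human & protease  # AND: records containing both words
human_or_protease = human | protease   # OR: records containing either word
human_not_protease = human - protease  # NOT: "human" records without "protease"

print(sorted(human_and_protease))  # [1]
print(sorted(human_or_protease))   # [1, 2, 3]
print(sorted(human_not_protease))  # [3]
```

AND narrows a search, OR broadens it, and NOT excludes records, which is why combining two earlier searches with AND returns fewer hits than either search alone.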
