Mus musculus - a model organism in SWISS-PROT
SWISS-PROT • Curated protein sequence data bank established in 1986 by Amos Bairoch in Geneva and maintained collaboratively with EMBL since 1987 • Currently contains >74 000 protein sequence entries (Release 36)
Distinguishing features • High level of annotation • Minimal redundancy • High level of integration with other databases
TrEMBL • Computer-annotated supplement to SWISS-PROT • Consists of entries in SWISS-PROT format • Contains translations of all coding sequences in EMBL Nucleotide Sequence Database which are not yet in SWISS-PROT • Consists of SP-TrEMBL and REM-TrEMBL
Model organisms in SWISS-PROT • Target of genome sequencing and/or mapping • Priority annotation • incorporate new sequences/updates as quickly as possible • high level of annotation • cross-references to specialised databases • additional information in form of specific documents
Current status of mouse sequences • Total of 7006 entries in SWISS-PROT and TrEMBL • 3264 entries in SWISS-PROT • 3742 in SP-TrEMBL
Annotation • Citation information • Taxonomic data • Sequence data • Function(s) of the protein • Post-translational modification(s) • Domains and sites • Secondary and quaternary structure • Similarities to other proteins • Disease(s) associated with deficiencies in the protein • Sequence variants, conflicts
Annotation sources • Publications reporting sequence data • Review articles • External experts
Integration with other databases • Sequence • EMBL Nucleotide Sequence Database • Structure • Protein Data Bank (PDB)
Integration • Specialised data collections • e.g. ENZYME, PROSITE • Mouse Genome Database (MGD) • Index of MGD entries referenced in SWISS-PROT: http://www.expasy.ch/cgi-bin/lists?mgdtosp.txt
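As a rough illustration of how such cross-references can be read programmatically, the sketch below pulls database links out of DR lines in SWISS-PROT flat-file format. The sample lines are copied from the GCDH_MOUSE entry shown in this presentation. The function name parse_dr_lines is hypothetical, and the field layout (database; primary identifier; further fields) is assumed from those sample lines rather than taken from a format specification.

```python
from collections import defaultdict

# Sample DR (database cross-reference) lines, copied from the GCDH_MOUSE entry.
SAMPLE = """\
DR   EMBL; U18992; G1439521; -.
DR   MGD; MGI:104541; GCDH.
DR   PROSITE; PS00072; ACYL_COA_DH_1; FALSE_NEG.
DR   PROSITE; PS00073; ACYL_COA_DH_2; 1.
"""


def parse_dr_lines(text):
    """Group cross-references by database name (hypothetical helper).

    Assumes each DR line looks like 'DR   DATABASE; ID; FIELD; ... .'
    """
    xrefs = defaultdict(list)
    for line in text.splitlines():
        if not line.startswith("DR"):
            continue
        # Drop the line code and the trailing full stop, then split on ';'.
        fields = [f.strip() for f in line[2:].strip().rstrip(".").split(";")]
        database, identifiers = fields[0], fields[1:]
        xrefs[database].append(identifiers)
    return dict(xrefs)


xrefs = parse_dr_lines(SAMPLE)
print(xrefs["MGD"])
```

Grouping by database name makes it straightforward to list, for example, only the MGD links of an entry, which is the kind of index the mgdtosp.txt document provides across all entries.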
ID   GCDH_MOUSE     STANDARD;      PRT;   438 AA.
AC   Q60759;
DT   01-NOV-1997 (REL. 35, CREATED)
DT   01-NOV-1997 (REL. 35, LAST SEQUENCE UPDATE)
DT   01-NOV-1997 (REL. 35, LAST ANNOTATION UPDATE)
DE   GLUTARYL-COA DEHYDROGENASE PRECURSOR (EC 1.3.99.7).
GN   GCDH.
OS   MUS MUSCULUS (MOUSE).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA; EUTHERIA; RODENTIA.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=129/SV; TISSUE=LIVER;
RX   MEDLINE; 96039264.
RA   KOELLER D.M., DIGIULIO K.A., ANGELONI S.V., DOWLER L.L., FRERMAN F.E., WHITE R.A., GOODMAN S.I.;
RL   GENOMICS 28:508-512(1995).
CC   -!- CATALYTIC ACTIVITY: GLUTARYL-COA + ACCEPTOR = CROTONOYL-COA + CO(2) + REDUCED ACCEPTOR.
CC   -!- COFACTOR: FAD FLAVOPROTEIN.
CC   -!- PATHWAY: DEGRADATIVE PATHWAY OF L-LYSINE, L-HYDROXYLYSINE, AND L-TRYPTOPHAN METABOLISM.
CC   -!- SUBUNIT: HOMOTETRAMER.
CC   -!- SUBCELLULAR LOCATION: MITOCHONDRIAL MATRIX.
CC   -!- SIMILARITY: BELONGS TO THE ACYL-COA DEHYDROGENASES FAMILY.
DR   EMBL; U18992; G1439521; -.
DR   MGD; MGI:104541; GCDH.
DR   PROSITE; PS00072; ACYL_COA_DH_1; FALSE_NEG.
DR   PROSITE; PS00073; ACYL_COA_DH_2; 1.
KW   OXIDOREDUCTASE; FLAVOPROTEIN; FAD; MITOCHONDRION; TRANSIT PEPTIDE.
FT   TRANSIT       1     44       MITOCHONDRION (POTENTIAL).
FT   CHAIN        45    438       GLUTARYL-COA DEHYDROGENASE.
SQ   SEQUENCE   438 AA;  48646 MW;  8C9149C3 CRC32;
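The two-letter code at the start of each line identifies its type (ID, AC, DE, FT and so on), so an entry can be split into fields with very little code. The following is a minimal sketch under that assumption, not an official parser. The names parse_entry and parse_features are hypothetical, and only a few lines of the entry above are included as sample data.

```python
# A few lines of the GCDH_MOUSE entry, used here as sample input.
SAMPLE_ENTRY = """\
ID   GCDH_MOUSE     STANDARD;      PRT;   438 AA.
AC   Q60759;
DE   GLUTARYL-COA DEHYDROGENASE PRECURSOR (EC 1.3.99.7).
GN   GCDH.
OS   MUS MUSCULUS (MOUSE).
KW   OXIDOREDUCTASE; FLAVOPROTEIN; FAD; MITOCHONDRION; TRANSIT PEPTIDE.
FT   TRANSIT       1     44       MITOCHONDRION (POTENTIAL).
FT   CHAIN        45    438       GLUTARYL-COA DEHYDROGENASE.
"""


def parse_entry(text):
    """Collect the data part of each line under its two-letter line code."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # line code, then the rest of the line
        if len(parts) != 2:
            continue  # skip blank lines or lines without data
        code, data = parts
        fields.setdefault(code, []).append(data.strip())
    return fields


def parse_features(ft_lines):
    """Turn FT data ('KEY FROM TO DESCRIPTION') into tuples.

    Assumes every feature has numeric start and end positions.
    """
    features = []
    for data in ft_lines:
        key, start, end, description = data.split(None, 3)
        features.append((key, int(start), int(end), description.rstrip(".")))
    return features


entry = parse_entry(SAMPLE_ENTRY)
entry_name = entry["ID"][0].split()[0]
accession = entry["AC"][0].rstrip(";")
features = parse_features(entry["FT"])

for key, start, end, description in features:
    length = end - start + 1
    print(f"{entry_name} {key}: residues {start}-{end} ({length} aa), {description}")
```

Splitting on whitespace rather than on fixed columns keeps the sketch tolerant of the spacing used on a slide, at the cost of not validating the column layout.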
Summary • Complete with minimal redundancy • As much up-to-date information as possible on each sequence • Priority annotation of mouse sequences • MGD cross-references
SWISS-PROT at EBI • Rolf Apweiler • Sergio Contrino • Wolfgang Fleischmann • Henning Hermjakob • Vivien Junker • Stephanie Kappus • Fiona Lang • Michele Magrane • Maria Jesus Martin • Nicoletta Mitaritonna • Steffen Moeller • Claire O’Donovan
Some ways to access SWISS-PROT + TrEMBL • ftp.ebi.ac.uk/pub/databases • http://srs.ebi.ac.uk:5000/ • http://www2.ebi.ac.uk/fasta3/ • http://www2.ebi.ac.uk/blast2/ • http://www2.ebi.ac.uk/bic_sw/