For Bioinformatics, start with:
Genomics:
-READING genome sequences: carry out dideoxy sequencing
-ASSEMBLY of the sequence: connect seqs. to make whole chromosomes
-ANNOTATION of the sequence: find the genes!
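The ASSEMBLY step ("connect seqs.") can be illustrated with a toy greedy overlap merge. This is only a sketch of the idea: the reads, the function names `overlap` and `greedy_assemble`, and the `min_len` parameter are hypothetical, and real assemblers must also handle sequencing errors, repeats, and both strands.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest exact overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads, chosen so that each overlaps the next by 4 bases.
example_reads = ["ACGTCGCA", "CGCAAAGT", "AAGTCCTT"]
contigs = greedy_assemble(example_reads)
```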
2 ways to annotate eukaryotic genomes:
-ab initio gene finders: work on basic biological principles:
  Open reading frames
  Consensus splice sites
  Met start codons
  …..
-Genes based on previous knowledge: EVIDENCE of message
  cDNA sequence of the gene’s message
  cDNA of a closely related gene’s message sequence
  Protein sequence of the known gene:
    Same gene’s
    Same gene’s from another species
    Related gene’s protein…….
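As a rough illustration of the first ab initio signal (open reading frames beginning at a Met start codon), the sketch below scans the three forward frames for ATG...stop stretches. The function name `find_orfs`, the `min_codons` cutoff, and the example sequence are hypothetical. It ignores introns and the reverse strand, so it is not a eukaryotic gene finder, which would also have to model the consensus splice sites.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is the index of the A in ATG; end is exclusive and includes the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first Met codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence with one short ORF in frame 2.
example = "CCATGGACGTGAAGTAAGG"
orfs = find_orfs(example)
```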
Information for ab initio gene finding:
-Start and stop site predictions
-Unique identifiers
-Splice site predictions
-Homology-based exon predictions
-Computational exon predictions
-Tracking information
-Consensus gene structure (both strands)
Automatically generated annotation
A zebrafish hit shows a gene model protein encoded by a 6-exon gene. This gene structure (intron/exon) is seen in other species, as is the protein size. If the proteins correspond to MSP in S. gal., they must be heavily glycosylated (likely). At least some have a signal peptide.
The zebrafish hit can be viewed down to nucleotide resolution. GO LIVE!
Is there linkage between a mutant gene/phenotype and a SNP?
USE standard genetic mapping technique, with SNP alternative sequences as “phenotype”
B = bad hair, dominant
SNP alleles:
  SNP1 ..ACGTC..   SNP1’ ..ACGCC..
  SNP2 ..GCTAA..   SNP2’ ..GCAAA..
  SNP3 ..GTAAC..   SNP3’ ..GTCAC..
START with inbred lines: SNPs are homozygosed
  One line: SNP1’/SNP1’, SNP2’/SNP2’, SNP3’/SNP3’
  Other line: SNP1/SNP1, SNP2/SNP2, SNP3/SNP3
Cross scheme: inbred line X inbred line -> F1 (B) X
Is there linkage between a mutant gene/phenotype and a SNP?
USE standard genetic mapping technique, with SNP alternative sequences as “phenotype”
B = bad hair, dominant
SNP alleles:
  SNP1 ..ACGTC..   SNP1’ ..ACGCC..
  SNP2 ..GCTAA..   SNP2’ ..GCAAA..
  SNP3 ..GTAAC..   SNP3’ ..GTCAC..
Phase in the F1: B 2’ / b 2
Cross: B/b 1’/1 2’/2 3’/3 X b/b 1/1 2/2 3/3
Progeny (percent of offspring, scored for each SNP):
  SNP2: B/b 2’/2 47%, B/b 2/2 3%, b/b 2’/2 3%, b/b 2/2 47%
  SNP3: B/b 3’/3 25%, B/b 3/3 25%, b/b 3’/3 25%, b/b 3/3 25%
  SNP1: B/b 1’/1 25%, B/b 1/1 25%, b/b 1’/1 25%, b/b 1/1 25%
SO…B is 6 cM from SNP2, and is unlinked to SNP 1 or 3
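The arithmetic behind "6 cM" can be sketched as follows, using the progeny percentages from the slide. The names `progeny_pct`, `recombination_pct`, and `summarize` are hypothetical. The sketch assumes the phase given on the slide (B travels with the primed allele), and it equates percent recombinants with centimorgans, which is only a reasonable approximation for short distances.

```python
# Percent of test-cross progeny in each class, read from the slide.
# Class order: (B/b, x'/x), (B/b, x/x), (b/b, x'/x), (b/b, x/x)
# With B in phase with the primed allele, the first and last classes are
# parental and the two middle classes are recombinant.
progeny_pct = {
    "SNP1": (25, 25, 25, 25),
    "SNP2": (47, 3, 3, 47),
    "SNP3": (25, 25, 25, 25),
}


def recombination_pct(classes):
    """Percent of progeny that are recombinant between B and the SNP."""
    _parental_1, recomb_1, recomb_2, _parental_2 = classes
    return 100.0 * (recomb_1 + recomb_2) / sum(classes)


def summarize(table, unlinked_at=50.0):
    """Label each SNP with a map distance, or 'unlinked' at 50% recombinants."""
    result = {}
    for snp, classes in table.items():
        rf = recombination_pct(classes)
        result[snp] = "unlinked" if rf >= unlinked_at else f"{rf:g} cM"
    return result


linkage = summarize(progeny_pct)
```

For SNP2 the recombinant classes sum to 3% + 3% = 6%, hence 6 cM; SNP1 and SNP3 show 50% recombinants, which is what independent assortment gives.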
Is there linkage between a mutant gene/phenotype and a SNP?
USE standard genetic mapping technique, with SNP alternative sequences as “phenotype”
B = bad hair, dominant
SNP alleles:
  SNP1 ..ACGTC..   SNP1’ ..ACGCC..
  SNP2 ..GCTAA..   SNP2’ ..GCAAA..
  SNP3 ..GTAAC..   SNP3’ ..GTCAC..
Cross: B/b 1/1’ 2/2’ 3/3’ X b/b 1/1 2/2 3/3
SO…B is 6 cM from SNP2, and is unlinked to SNP 1 or 3
We have the ENTIRE genome sequence of mouse, so we know where the SNPs are.
Now do this while checking the sequence of THOUSANDS of SNPs.
Genomics:
-READING genome sequences: carry out dideoxy sequencing
-ASSEMBLY of the sequence: connect seqs. to make whole chromosomes
-ANNOTATION of the sequence: find the genes!
But Bioinformatics is more…
TRANSCRIPTOMICS: cDNAs & ESTs (Expressed Sequence Tags)
RNA target sample -> cDNA library -> SEQUENCE (primer) -> end reads (mates)
Each cDNA provides sequence from the two ends: two ESTs
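A minimal sketch of how one cDNA clone yields two ESTs, assuming each end is read inward for a fixed length and the read from the far end comes off the opposite strand (so it is the reverse complement). The function names, the read length, and the cDNA string are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def end_reads(cdna: str, read_len: int):
    """Return the two end reads (mates) of a cDNA insert: (5' EST, 3' EST)."""
    five_prime_est = cdna[:read_len]
    # The second read starts at the other end and runs along the opposite strand.
    three_prime_est = reverse_complement(cdna)[:read_len]
    return five_prime_est, three_prime_est


# Hypothetical 18-base cDNA, read 6 bases in from each end.
mates = end_reads("ATGGACGTGAAGGAGCGC", 6)
```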
Protein sequence: from peptide sequencing, or from translation of sequenced nucleic acids
!!AA_SEQUENCE 1.0
ab025413 peptide
tenm4.pep Length: 2771 May 12, 1999 09:34 Type: P Check: 2254 ..
1 MDVKERKPYR SLTRRRDAER RYTSSSADSE EGKGPQKSYS SSETLKAYDQ
51 DARLAYGSRV KDMVPQEAEE FCRTGTNFTL RELGLGEMTP PHGTLYRTDI
101 GLPHCGYSMG ASSDADLEAD TVLSPEHPVR LWGRSTRSGR SSCLSSRANS
151 NLTLTDTEHE NTETDHPSSL QNHPRLRTPP PPLPHAHTPN QHHAASINSL
201 NRGNFTPRSN PSPAPTDHSL SGEPPAGSAQ EPTHAQDNWL LNSNIPLETR
251 NLGKQPFLGT LQDNLIEMDI LSASRHDGAY SDGHFLFKPG GTSPLFCTTS
301 PGYPLTSSTV YSPPPRPLPR STFSRPAFNL KKPSKYCNWK CAALSAILIS
351 ATLVILLAYF VAMHLFGLNW HLQPMEGQMQ MYEITEDTAS SWPVPTDVSL
401 YPSGGTGLET PDRKGKGAAE GKPSSLFPED SFIDSGEIDV GRRASQKIPP
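"Translation of sequenced nucleic acids" can be sketched with the standard genetic code. The codon table is built from the usual TCAG ordering of the standard code. The function name `translate` and the DNA string are hypothetical: the DNA is one possible coding sequence for the first four residues above (MDVK), not the actual tenm4 sequence.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases in TCAG order).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, stop_at_stop: bool = True) -> str:
    """Translate a coding sequence in frame 0; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical coding sequence for the first four residues, followed by a stop codon.
peptide = translate("ATGGACGTGAAGTAA")
```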
Structural proteomics: coordinates, rather than 1D sequence, are saved
Where? When? Who? are the RNAs RNA for ALL C. elegans genes
MICROARRAY ANALYSIS
Figure 4.15 Microarray Technique
Array analysis: see animation from Griffiths
Figure 4.16(1) Microarray Analysis of Those Genes Whose Expression in the Early Xenopus Embryo Is Caused by the Activin-Like Protein Nodal-Related 1 (Xnr1)
Figure 4.16(2) Microarray Analysis of Those Genes Whose Expression in the Early Xenopus Embryo Is Caused by the Activin-Like Protein Nodal-Related 1 (Xnr1)
RNAi for every C. elegans gene too! Results are on the web.
Projects to systematically knock out (or pseudo-knock out) every gene, in order to establish the phenotype of each gene -> function of each gene
Figure 4.23(1) Use of Antisense RNA to Examine the Roles of Genes in Development (here fly)
Figure 4.23(2) Use of Antisense RNA to Examine the Roles of Genes in Development (here fly)
Figure 4.24 Injection of dsRNA for E-Cadherin into the Mouse Zygote Blocks E-Cadherin Expression