DNA Sequencing


  1. DNA Sequencing: Basic Techniques, Project Design, Process Improvements (Chuck Staben)

  2. Project Size/Type • 500 bases: 1 locus, EST, STS • 2500 bases: whole cDNA/EST • 10 kbp: gene, virus • 150 kbp: BAC, big virus • 3 Mbp: bacterial genome, YAC-size • BIG: HUMAN, etc. • simple repeats

  3. DNA Sequencing Methods • Chain termination/Dideoxy/Sanger • fluorescence paradigm, ABI, HOOD • Sequencing by hybridization • chips Affymetrix (Lander, et al) • other formats • Hyseq (Church, et al) • Lark

  4. Dideoxy/Chain Terminator/Sanger • Template • Primer • Extension Chemistry • polymerase • termination • labeling • Separation • Detection

  5. Chain Terminator Basics • Target: TGCA • Template-Primer • Terminators: ddA, ddC, ddG, ddT • Extend with dN : ddN = 100 : 1 • Ladder n, n+1...: ddA A, ddC AC, ddG ACG, ddT
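
The ladder on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and `termination_fraction` assumes that ddN and dN are incorporated with equal efficiency, so the chance of stopping at any candidate position is 1 in 101 for a 100 : 1 mix.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sanger_ladder(template):
    """Group extension products by the terminator that ends them.

    `template` is written in the order the polymerase copies it, so the
    new strand is simply the base-by-base complement.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    ladder = {"ddA": [], "ddC": [], "ddG": [], "ddT": []}
    for n in range(1, len(new_strand) + 1):
        fragment = new_strand[:n]
        ladder["dd" + fragment[-1]].append(fragment)
    return ladder


def read_ladder(ladder):
    """Read the sequence back by sorting all fragments by length (n, n+1, ...)."""
    fragments = [frag for lane in ladder.values() for frag in lane]
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


def termination_fraction(n, ratio=100):
    """Fraction of chains stopping at the n-th candidate position.

    Assumption (not from the slide): ddN and dN compete equally, so the
    stop probability per candidate position is 1 / (ratio + 1).
    """
    p_stop = 1 / (ratio + 1)
    return (1 - p_stop) ** (n - 1) * p_stop


ladder = sanger_ladder("TGCA")
print(ladder)
print(read_ladder(ladder))
```

For the target TGCA the four lanes hold A, AC, ACG and ACGT, and reading the fragments from shortest to longest recovers the new strand.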

  6. Electrophoresis

  7. Template Preparation • ssDNA vectors • M13 • pUC • PCR • dsDNA (+/- PCR)

  8. Primers • Universal primers • cheap, reliable, easy, fast, parallel • BULK sequencing • Custom primers • expensive, slow, one-at-a-time • ADAPTABLE Primer Label Dye Terminator

  9. Extension Chemistry (goals: 100% termination, accurate, even signal) • Polymerase • Sequenase • Thermostable (Cycle Sequencing) • Terminators • Dye labels (“Big Dye”) • spectrally different, high fluorescence • (mass labels??) • ddA,C,G,T with primer labels

  10. Separation • Gel Electrophoresis • Capillary Electrophoresis • suited to automation • rapid (2 hrs vs 12 hrs) • re-usable • simple temperature control • 96 well format • migration ~1/log N

  11. Paradigm Instrument • Applied Biosystems • ABI3700 (early 1999) • 1500 samples/day! • http://www2.perkin-elmer.com/ga/3700/features.html • ABI377 (gel) and ABI310 (capillary)

  12. Alternate Instruments (Not Complete List) • Molecular Dynamics, Beckman Coulter… • ALF, LiCor • infrared detection

  13. 1 lane Sample Output

  14. Trace Editing • EditView • Mac • Chromas • WinNT • Consed • UNIX

  15. Project Goals • de novo sequence • Chain terminators • repetitive sequencing • Sequencing by hybridization • Chip technology, eg

  16. Sequencing Strategies • Random Sequence • Brute Force • Ordered • Divide and Conquer • Sequencing, Assembly, Finishing, Annotation • Mix to Suit

  17. Random Method • Shear DNA (nebulize) • finish ends, ligate into vector • Produce template • Sequence to target coverage • read length (500 typical) • accuracy (99% good) • Assemble Contigs
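
To get a feel for what "sequence to target coverage" means for random reads, here is a small simulation sketch. Everything in it is an assumption made for illustration: reads start uniformly at random on a linear genome, all reads have the same length, and the function name `simulate_shotgun` is invented for this example. Under those assumptions the uncovered fraction should land near e^(-coverage), the Poisson estimate used later in the talk.

```python
import math
import random


def simulate_shotgun(genome_size, read_length, n_reads, seed=0):
    """Return the fraction of bases left uncovered by random reads."""
    rng = random.Random(seed)
    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (genome_size + 1)
    for _ in range(n_reads):
        start = rng.randrange(genome_size - read_length + 1)
        delta[start] += 1
        delta[start + read_length] -= 1
    uncovered = 0
    depth = 0
    for position in range(genome_size):
        depth += delta[position]
        if depth == 0:
            uncovered += 1
    return uncovered / genome_size


# 1 Mbp toy genome, 500-base reads, 3x coverage (6000 reads).
observed = simulate_shotgun(1_000_000, 500, 6_000)
expected = math.exp(-3)
print(f"observed uncovered fraction: {observed:.4f}")
print(f"Poisson estimate e^-3:       {expected:.4f}")
```

The simulation ignores cloning bias and variable read length, so real projects should be expected to do somewhat worse than the idealized figure.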

  18. Random (figure: assembled reads showing regions with No coverage, DISAGREEMENT, and Only 1 strand)

  19. Poisson Statistics • L = read length • N = # reads • G = genome size • $P_0 = e^{-LN/G}$ (probability that a given base is not covered by any read)

  20. Poisson-2 • Gap Length = $P_0 G$

  21. Poisson-3 • Gap Number = $P_0 N$ (assume L = 500 bases)

  22. 4 Mbp Genome • 10x Coverage • 80,000 reads at 500 bases/read • 4 gaps • 400 bases in gaps • 55 instrument days on ABI3700
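
The Poisson formulas from slides 19 to 21 can be applied to this 4 Mbp example directly. The sketch below simply evaluates P0 = e^(-LN/G), gap length = P0 x G and gap number = P0 x N; the function name `shotgun_stats` is made up for the example. Evaluated as written, the formulas give about 3.6 gaps, which matches the slide's 4 gaps, but only about 180 bases in gaps, so the slide's 400 bases presumably rests on rounding or on an assumption not shown here.

```python
import math


def shotgun_stats(genome_size, read_length, n_reads):
    """Evaluate the Poisson estimates for a random sequencing project."""
    coverage = read_length * n_reads / genome_size
    p0 = math.exp(-coverage)  # chance that a given base is never sequenced
    return {
        "coverage": coverage,
        "p0": p0,
        "gap_bases": p0 * genome_size,  # total bases expected in gaps
        "gap_count": p0 * n_reads,      # expected number of gaps
    }


stats = shotgun_stats(genome_size=4_000_000, read_length=500, n_reads=80_000)
for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```

Changing `n_reads` shows how quickly gaps close: each extra 1x of coverage reduces P0 by a factor of e.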

  23. 3000 Mbp Genome (HUMAN) • 50000 instrument days on ABI3700 • 300 machines, 300 days • 3 years: Plenty
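
The instrument-day figures can be roughly reproduced from the 1500 samples/day quoted for the ABI3700 on slide 11. The sketch below assumes 500-base reads, 10x coverage for both genomes (the slide does not state the coverage used for the human estimate) and no failed runs; `instrument_days` is a name invented for this example. It gives about 53 days for the 4 Mbp genome and 40,000 days for 3000 Mbp, a little under the slide's 55 and 50000, which may include an allowance the sketch does not model.

```python
def instrument_days(genome_size, coverage, read_length=500, reads_per_day=1500):
    """Instrument days needed to reach the target coverage on one machine."""
    reads_needed = genome_size * coverage / read_length
    return reads_needed / reads_per_day


bacterial = instrument_days(4_000_000, 10)
human = instrument_days(3_000_000_000, 10)
capacity = 300 * 300  # 300 machines running for 300 days

print(f"4 Mbp genome:    {bacterial:.0f} instrument days")
print(f"3000 Mbp genome: {human:.0f} instrument days")
print(f"capacity of 300 machines x 300 days: {capacity} instrument days")
```

Either estimate fits comfortably inside the 90,000 instrument days that 300 machines provide in 300 days.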

  24. Automation QT

  25. Costs • Raw cost ~$0.01/base • “Semi-finished” $0.10 per base • “finished” $0.30 per base • High-quality Genome Project • $0.50/base

  26. Ordered Methods • Primer Walking • Nested Deletion

  27. Limitations • Slow, Expensive • Expertise Needed • especially nested deletion • Repeat Problems • especially primer walking

  28. Finishing • GOALS • >95% coverage on BOTH strands • every base covered 3X • resolve ambiguities • Finish when random no longer productive (3-10 X range)
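
The two numeric finishing goals can be checked mechanically once reads are placed on a contig. The following is a minimal sketch under stated assumptions: reads are given as hypothetical `(start, end, strand)` tuples with 0-based, half-open coordinates, and the function names are invented for this example rather than taken from any assembly package.

```python
def coverage_report(contig_length, reads):
    """Summarize strand coverage and depth for reads placed on a contig.

    `reads` is a list of (start, end, strand) tuples; strand is "+" or "-".
    """
    forward = [0] * contig_length
    reverse = [0] * contig_length
    for start, end, strand in reads:
        depth = forward if strand == "+" else reverse
        for position in range(max(start, 0), min(end, contig_length)):
            depth[position] += 1
    both = sum(1 for f, r in zip(forward, reverse) if f > 0 and r > 0)
    below_3x = [i for i in range(contig_length) if forward[i] + reverse[i] < 3]
    return {"both_strand_fraction": both / contig_length, "below_3x": below_3x}


def meets_finishing_goals(report):
    """Apply the slide's goals: >95% on both strands, every base covered 3X."""
    return report["both_strand_fraction"] > 0.95 and not report["below_3x"]


reads = [(0, 10, "+"), (0, 10, "-"), (0, 8, "+")]
report = coverage_report(10, reads)
print(report)
print(meets_finishing_goals(report))
```

The positions listed under `below_3x` are natural targets for directed finishing reads.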

  29. Finish-How • Identify gaps, ambiguities • Extend from end of contigs • specific primers • subclones, etc. • Resolve ambiguities • consensus or resequence • specific primers, different chemistry

  30. Assembly Methods • Strip out vector • Mask known repeats • Trim off unreliable data • Find Matches (500 x 500 x many!!) • how long (and what ktuple) • how perfect (reliability index) • where to look? (ends only vs entire)
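
The "Find Matches" step is usually made tractable by indexing short words (k-tuples) instead of comparing every read against every other read base by base. The sketch below shows that idea only; it is not how phrap or any other named assembler works internally. It looks at the forward strand only, ignores reverse complements and quality values, and `candidate_overlaps` is a name chosen for this example.

```python
from collections import defaultdict


def candidate_overlaps(reads, k=12, min_shared=1):
    """Return {(i, j): shared_k_tuples} for read pairs sharing k-tuples."""
    index = defaultdict(set)  # k-tuple -> ids of reads that contain it
    for read_id, seq in enumerate(reads):
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(read_id)

    shared = defaultdict(int)
    for read_ids in index.values():
        ordered = sorted(read_ids)
        for a_pos, a in enumerate(ordered):
            for b in ordered[a_pos + 1:]:
                shared[(a, b)] += 1  # one more distinct k-tuple in common
    return {pair: n for pair, n in shared.items() if n >= min_shared}


reads = ["ACGTACGGTCA", "CGGTCATTGAC", "TTTTTTTTTTT"]
print(candidate_overlaps(reads, k=5))
```

A larger k gives fewer chance matches but tolerates fewer sequencing errors, which is the trade-off behind "how long (and what ktuple)". Pairs that share enough k-tuples would then go on to a proper alignment.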

  31. Assembly Programs • PHRAP FAMILY • phrap, kangaroo, phrapo, • GAP4, TIGRAssembler,... • GCG • gelstart, gelenter, gelmerge, gelassemble, geldisassemble • thinly veiled vi editor • SeqWeb….

  32. Assembly Improvements: Repeat Problems • Multiple fragment sizes in 1 project • Use length/distance info

  33. Project Management • Editing and Assembly • RepeatMasker • Phred/Phrap • Consed • Databases • ACeDB • A C. elegans database • Oracle

  34. Annotation • ORFs • GRAIL, PowerBLAST • Repeats • Other Regions • Submit to Genbank: HTGS (level 1, 2, 3), nr
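
As a rough picture of what "ORFs" means at the annotation stage, here is a naive open-reading-frame scan. It is not a stand-in for GRAIL or PowerBLAST: it only looks for ATG followed by an in-frame stop codon on the forward strand, and the function name and the `min_codons` threshold are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end) pairs, 0-based half-open, stop codon included."""
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


seq = "CCATGAAACCCGGGTAACC"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

A real annotation pass would also scan the reverse complement and use a much larger minimum length.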

  35. Sequencing by Hybridization • Hybridize labeled query DNA • CHIP OLIGOS (20-mers), one column each for A, C, G, T • site 1: ...gaactAatact..., ...gaactCatact..., ...gaactGatact..., ...gaactTatact... • site 2: ...gaactaAtact..., ...gaactaCtact..., ...gaactaGtact..., ...gaactaTtact... • query: GAACTATGTACT
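
The chip layout on this slide, four oligos per site that differ only at the interrogated base, can be mimicked with a toy base caller. This sketch makes several simplifying assumptions: probes are written in the same sense as the query instead of as complementary oligos, the hybridization signal is replaced by a count of matching positions, and the probes are 11-mers (a flank of 5 on each side) where the slide uses 20-mers. The function names are invented for the example.

```python
BASES = "ACGT"


def probes_for_site(reference, site, flank=5):
    """Build the four probes that differ only at position `site`."""
    if site < flank or site + flank >= len(reference):
        raise ValueError("site is too close to the end of the reference")
    left = reference[site - flank:site]
    right = reference[site + 1:site + 1 + flank]
    return {base: left + base + right for base in BASES}


def call_base(query, reference, site, flank=5):
    """Pick the probe that best matches the query around `site`."""
    window = query[site - flank:site + 1 + flank]
    scores = {
        base: sum(p == q for p, q in zip(probe, window))
        for base, probe in probes_for_site(reference, site, flank).items()
    }
    return max(scores, key=scores.get)


reference = "GAACTATGTACT"
variant = "GAACTCTGTACT"  # hypothetical query with A -> C at position 5
print(probes_for_site(reference, 5))
print(call_base(reference, reference, 5))
print(call_base(variant, reference, 5))
```

Because each site needs its own set of four probes built from a known reference, this format suits resequencing better than de novo work.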

  36. Modern Sequencing Challenges • Heterozygous DNAs • germline differences • somatic variation • Massive sequencing • population studies • genome scans • Minimal sample preparation • “Doctor’s Office” Chips, Quantitative Seq Automation Miniaturization

  37. Physical Mapping: Genome Characterization • Genome fragmentation and cloning • vectors, etc. • Physical map assembly • hybridization • fingerprinting
