The Genomics Education Partnership TA Annotation Workshop 2006, August 21-23. Funded by the Howard Hughes Medical Institute. WU Program Participants: Sarah Elgin, Prof Biology & Genetics; Jeremy Buhler, Asst Prof Computer Science; Chris Shaffer, Biology, Senior Teaching Fellow
WU Program Participants • Sarah Elgin, Prof Biology & Genetics • Jeremy Buhler, Asst Prof Computer Science • Chris Shaffer, Biology, Senior Teaching Fellow • Wilson Leung, Biology, Res. Asst, TA & Web Master • Taylor Cordonnier (Teaching Assistant & Lab Participant) • John Russell (Professor, Director of DBBS) • Tricia Wallace (Tour Guide, WU Genome Sequencing Center) • Undergraduate alumni of Bio 4342: • Kasia Falkowska, David Desruisseau • Washington University Graduate Students • Michael Brooks (genetics/computational biology) • Deanna Mendez (biophysics/chromosomal proteins) • Sanjida Rangwala (genetics/plant genomes)
Participating Schools • Catherine Coyle-Thompson, California State University - Northridge • Chunguang Du, Montclair State University • Todd Eckdahl, Missouri Western • Anya Goodman, Cal Poly State University - San Luis Obispo • Charles Hauser, St. Edward’s University • Karmella Haynes, WU, Davidson College • Chris Jones, Moravian College • Olga Ruiz Kopp, Utah Valley State College • Gary Kuleck, Loyola Marymount University • Jennifer Myka, Thomas More College • Paul Overvoorde, Macalester College • Debbie Parrilla-Hernandez, Universidad de Puerto Rico en Humacao • Dennis Revie, California Lutheran University • Stephanie Schroeder, Webster University • Mary Shaw, New Mexico Highlands University • Gary Skuse, Rochester Institute of Technology • Colette Witkowski, Southwest Missouri State
Goals • Better integration of genomics into the undergraduate biology curriculum • Better integration of research thinking into the academic year curriculum • Creation of a dynamic student-scientist partnership to engage students in genomics research
GOAL: To provide students the opportunity to work as a research team through a large-scale sequencing project. • PROCESS: Students begin with sample preparation, data generation, finishing and quality control at the WU Genome Sequencing Center, and complete annotation and analysis with WU Computer Science faculty.
Challenge: making it work at a distance, with your curriculum • Virtual Tour of the Genome Sequencing Center: available online, on CD, or on DVD • Web site: lecture notes, PowerPoint presentations, references, homework with answer keys, example student presentations • Key analytical work is computer based • Major annotation resources and databases are open access (NIH, UCSC, Ensembl)
Choice of research problems? • Comparative analysis of Drosophila dot chromosomes: D. erecta annotation; D. mojavensis sequencing • Annotation of corn genome? • Gut bacteria genomes? • Requires lead scientist(s) committed to publication
Our ‘04-’06 research goal: To compare finished sequence from the dot chromosomes of D. melanogaster and D. virilis
The sequencing “pipeline” • Genomes enter the GSC as a BAC or fosmid library • Clones to be sequenced are selected • The GSC prepares ~2 kb libraries from each clone • The 2 kb fragments are sequenced from each end (~700 bases each) • Phred/Phrap assembles the sequenced fragments • Finishers use Consed and request additional data to generate a single, high-quality contig • Annotation identifies sequence features of interest • Future: start from posted unfinished sequence: annotate D. erecta, finish & annotate D. mojavensis
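The paired-end step in the pipeline can be illustrated with a minimal Python sketch. It is not the GSC's software: the function names and the toy fragment are hypothetical, and the 700-base read length is simply the approximate figure quoted on the slide. It shows that one read comes from each end of a fragment, that the second read is reported as the reverse complement, and that a ~2 kb fragment leaves an unsequenced middle that the read pair spans.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def paired_end_reads(fragment, read_length=700):
    """Return (forward_read, reverse_read) for one cloned fragment.

    The forward read starts at the left end of the fragment. The reverse
    read starts at the right end and is read off the opposite strand,
    so it is the reverse complement of the last `read_length` bases.
    """
    forward = fragment[:read_length]
    reverse = reverse_complement(fragment[-read_length:])
    return forward, reverse


def unsequenced_middle(fragment_length, read_length=700):
    """Bases in the middle of the fragment covered by neither read."""
    return max(0, fragment_length - 2 * read_length)


# Toy example with a 10-base "fragment" and 4-base reads.
fwd, rev = paired_end_reads("ACGTTTGGCA", read_length=4)
print(fwd, rev)

# For a 2 kb fragment and ~700-base reads, the pair brackets
# a stretch that neither read covers.
print(unsequenced_middle(2000, 700))
```

Because both reads come from the same fragment, an assembler knows their approximate distance and orientation, which is what allows paired ends to span gaps between contigs.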
Current status, spring 2006 • Finished sequence: D. virilis dot chromosome, reference strain [Figure: chosen fosmids and remaining gaps; size labels ~12 kb, 12 kb, 15 kb, 8 kb, 13 kb, 3 kb, 9 kb] • 13 fosmids (~40 kb each) were selected to be made into libraries for sequencing • Each student sequences and annotates one fosmid • 8 smaller gaps will be sequenced using a PCR-based method (summer work, Michelle & Taylor)
Shotgun sequencing & assembly [Figure: genome; shotgun (paired ends); assemble sequence reads; scaffold; additional sequence reads needed]
From 2X reads to 6X coverage…. • Three significant contigs • All gaps spanned • Fair coverage, but weak spots
GSC libraries for sequencing [Figure: plasmid with insert (2-4 kb), primer, and read]
Final Assembly • 40,809 base pairs • 438 reads • Good coverage, no low quality regions
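As a rough check on the numbers in this final assembly, nominal coverage can be estimated as (number of reads × read length) / assembly length. The sketch below assumes every read contributes the ~700 bases quoted in the pipeline slide. That is an assumption, since the slide does not give actual usable read lengths, so the result is only a ballpark figure.

```python
def nominal_coverage(num_reads, read_length, assembly_length):
    """Average number of times each base is covered, assuming every
    read contributes its full length to the assembly."""
    return num_reads * read_length / assembly_length


# 438 reads and 40,809 bp come from the slide; 700 bases per read is
# the approximate figure from the pipeline slide (an assumption here).
coverage = nominal_coverage(438, 700, 40_809)
print(f"{coverage:.1f}X")
```

With these inputs the estimate comes out at roughly 7.5X. Reads are usually trimmed for quality before assembly, so the effective coverage would be somewhat lower than this nominal value.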
Annotation: analyzing sequence data • Practice problem: genes and pseudogenes in man and chimpanzee • Annotating Drosophila fosmid: • Finding genes • Finding repeats • Searching for conserved elements • Clustal analysis • Evaluating synteny • Final challenge: putting it all together
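To give a feel for what "finding genes" involves computationally, here is a minimal Python sketch of a six-frame open reading frame (ORF) scan. It is a toy illustration, not the method used by the gene finders in the workshop: real eukaryotic gene prediction must handle introns, splice sites, and sequence similarity evidence, none of which this handles. The function names and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=30):
    """Scan all six reading frames for ATG...stop stretches.

    Returns a list of (strand, start, end) tuples. Coordinates are
    0-based, end-exclusive, and include the stop codon. For the "-"
    strand they refer to positions on the reverse-complemented sequence.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG in the current ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    # Count codons from ATG up to (not including) the stop.
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


# Toy sequence: 2 bases of padding, ATG, five GCT codons, a TAA stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```

A scan like this over-predicts in real genomic sequence, which is one reason annotation also relies on repeat finding, conservation, and synteny, as listed on the slide.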
Annotation: what do students gain by analyzing sequence data? • What tools are available for finding genes & other features of interest? How do they work? Managing data… • How do you define a gene? A pseudogene? • How are genomes organized? Repeats? • Power of comparative genomics • Questions of evolution
Initial analysis of D. virilis dot chromosome fosmids 27/28 genes remain on the dot, but rearrangements within the chromosome are common!
Examples of genome organization in Drosophila [Figure panels: D.v. Arm; D.m. Arm; D.v. Dot; D.m. Dot]
Dot chromosome genes have larger introns due to repetitious DNA [Figure: Other Chromosomes vs. Dot Chromosomes. Legend: Perc. D. virilis Dot; Perc. D. melanogaster Dot; Perc. D. virilis Other; Perc. D. melanogaster Other]
The dot chromosomes of D. melanogaster and D. virilis both have a high density of repeat sequences, but differ in the types of repeats [Figure legend: 1360 Elements; DINEs; Other DNA Transposons; Unknown; Simple Repeats; Retroelements]
Resulting publication: Slawson, E.E., Shaffer, C.D., Leung, W., Malone, C.D., Kellmann, E., Shevchek, R.B., Craig, C.A., Bloom, S., Bogenpohl, J. II, Dee, J., Morimoto, E.T.A., Myoung, J., Nett, A.S., Ozsolak, F., Tittiger, M.E., Zeug, A., Pardue, M.L., Buhler, J., Mardis, E., and Elgin, S.C.R. (2006) “Comparison of dot chromosome sequences from D. melanogaster and D. virilis reveals an enrichment of DNA transposon sequences in heterochromatic domains,” Genome Biology 7: R15. • But it required ca. 10 months of additional full-time work!
Assessment: Likert Scale (5 = Agree, 1 = Disagree) • Before the course, I understood how the human genome had been sequenced: 2.5 • After the course, I understood… how the human genome had been sequenced: 4.9; … how eukaryotic genomes are organized: 4.5; … nature of genes: 4.4 • The course helped me improve my wet lab skills: 2.5 • The course helped me improve my computer skills: 4.5 • Genomics is awesome! I love the power of databases! 4.8
Learning Gains from WU Lab Courses Compared to Summer Program Research Experiences
Mean values, scale 1-5. Data from Course Work (25) and SURE 2003 (1135).
1. Understanding of the research process: 4.24
2. Understanding how knowledge is constructed: 4.16
3. Ability to analyze data: 4.08
4. Skill in interpretation of results: 3.92
5. Understanding how scientists work on real problems: 3.88
6. Assertions require supporting evidence: 3.88
7. Skill in scientific writing: 3.80
8. Readiness for more research: 3.64
9. Tolerance for obstacles: 3.63
10. Ability to integrate theory and practice: 3.60
11. Learning lab skills: 3.56
12. Clarification of a career path: 3.13
13. Learning to work independently: 2.83
14. Understanding primary literature: 2.79
15. Learning ethical conduct: 2.22
Comparison of Learning Gains from WU Lab Courses with Summer Research Experiences [Figure: mean learning gains for Course Work, SURE 2003, and SURE 2004; categories shown: learning to work independently, understanding knowledge construction, skill in scientific writing]
What Students Say They Learned: • Oral presentation skills, defending ideas • Scientific writing • Why you do things, and how to choose a strategy • That research doesn’t always work, and goes slowly • That research is collaborative • That science is more ambiguous than it appears in lectures
Things Students Said Helped Them Understand the Material Better: • Writing formal lab reports • Defending their work against challenges from others (in oral presentations) • Having lots of opportunities to ask questions • Doing trouble-shooting
Lessons Learned • Students need ownership; it can come from the computer-based effort and does not require wet lab work. • Generating letter grades: use staged problem sets to teach techniques and record progress; require periodic reports with written and oral defense of conclusions. • Challenging: the work is always changing and requires a time commitment; computer support is important. • Quality of the experimental work is very good! Finished sequence, publishable data, conclusions. Good student-scientist partnership.
Goals for workshop…. • Provide background experience in gene annotation; introduce computer-based training materials, problem sets; annotate a Drosophila gene • Provide a review of genome sequencing, visit the WU Genome Sequencing Center • Discuss your role as a TA • Discuss plan to facilitate data in / data out from WU • Discuss communications plan - Wiki? Help contacts? • Discuss present and future projects of the GEP