Driving Biological Projects #1: Vosshall - Aedes aegypti neurotranscriptome
Dan Lawson
Driving Biological Projects @ Project Wiki
VectorBase 2012
Project aims
• The aim is to generate a complete neurotranscriptome, a catalogue of all genes expressed in the central nervous system and head sensory appendages of Ae. aegypti, using RNA-seq technology.
• Understanding the molecular basis of host-seeking behaviors is a priority in the search for effective vector intervention strategies.
• This project will focus on decoding gene expression in the three principal head sensory organs thought to be involved in host-seeking behavior, together with the brain:
  • antenna (TGMA:0000007)
  • maxillary palp (TGMA:0000068)
  • proboscis (TGMA:0000075)
  • brain (TGMA: absent from ontology)
1. Improve Aedes gene models
• Generate whole-animal transcriptome datasets to validate and improve the existing gene models.
• Specifically, this process should improve the prediction and characterization of untranslated regions (UTRs), which are poorly annotated in the current AaegL1.2 gene set.
• This is a preliminary task to improve RPKM expression values for the subsequent experiments.
• RPKM: Reads Per Kilobase of exon model per Million mapped reads
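The RPKM definition above can be written out as a short calculation. This is a minimal sketch of the standard formula, not the project's own pipeline; the function name and the example numbers are hypothetical. Because the exon model length sits in the denominator, incomplete UTR annotation is one plausible way for the values to be distorted.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of exon model per Million mapped reads.

    read_count         -- reads mapped to the exons of one gene model
    exon_length_bp     -- summed exon length of that gene model, in bases
    total_mapped_reads -- all mapped reads in the library
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (library size in millions)
    # = reads * 1e3 * 1e6 / (length in bp * library size)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 kb exon model in a 10 M read library.
example_rpkm = rpkm(500, 2000, 10_000_000)
print(example_rpkm)
```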
2. Examine how tissue-specific gene expression changes with blood-feeding state
• Generate four datasets from the Liverpool strain to compare the neurotranscriptome of sugar-fed mosquitoes with that of mosquitoes 48 hr after blood-feeding.
3. Examine expression changes between two strains of Aedes that differ in host preference
• Generate four datasets from each of the Rabai brown (anthropophilic) and Rabai black (zoophilic) strains.
Summary of datasets to be generated
Using RNA-Seq datasets to assess gene set accuracy
• Align using tophat
• perl scripts
• Align using gmap
Example output records:
  supercont1.123:123-456 + 68
  supercont1.123:678-890 - 58
Oddities in tophat BED files and junction files
BED format:
  supercont1.1485 27743 47722 JUNC00000001 1 - 27743 47722 255,0,0 2 33,43 0,19936
  supercont1.1483 19523 19721 JUNC00000002 13 - 19523 19721 255,0,0 2 70,67 0,131
tophat bed_to_junc output:
  supercont1.1485 27775 47679 -
  supercont1.1483 19592 19654 -
Unnamed format:
• Allows merging of BED files (and other alignments)
• First field notation for quick cut-n-paste to e! browser
• Frequency counts available for sorting
  supercont1.1485:27775-47679 - 1
  supercont1.1483:19592-19654 - 13
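The relationship between the three representations above can be checked with a few lines of Python. This is a sketch inferred from the two example records on the slide, not the project's perl scripts: the left coordinate is taken as chromStart plus the first block size minus one, the right coordinate as chromEnd minus the last block size, and the BED score column as the frequency count. The function names are hypothetical.

```python
from collections import Counter


def bed_to_location(bed_line: str) -> str:
    """Convert one tophat junction BED record to 'name:left-right strand count'."""
    fields = bed_line.split()
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    count, strand = int(fields[4]), fields[5]
    # Column 11 holds the sizes of the two flanking alignment blocks.
    block_sizes = [int(s) for s in fields[10].rstrip(",").split(",")]
    left = start + block_sizes[0] - 1   # matches the bed_to_junc output above
    right = end - block_sizes[-1]
    return f"{chrom}:{left}-{right} {strand} {count}"


def merge_junctions(bed_lines):
    """Sum the counts of identical junctions across any number of BED records."""
    totals = Counter()
    for line in bed_lines:
        if not line.strip() or line.startswith("track"):
            continue  # skip blank lines and BED track headers
        location, strand, count = bed_to_location(line).rsplit(" ", 2)
        totals[(location, strand)] += int(count)
    return totals


records = [
    "supercont1.1485 27743 47722 JUNC00000001 1 - 27743 47722 255,0,0 2 33,43 0,19936",
    "supercont1.1483 19523 19721 JUNC00000002 13 - 19523 19721 255,0,0 2 70,67 0,131",
]
for record in records:
    print(bed_to_location(record))
```

Keying the merged counts on the location string and strand is what makes it straightforward to combine junctions from several alignment runs and then sort them by frequency.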
Genebuild intron QC by RNA-Seq alignment
• sort
• comm -12
• perl scripts
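On sorted input, comm -12 prints only the lines common to both files, so the QC step amounts to intersecting the introns of the gene build with the introns seen in the RNA-Seq alignments. A minimal Python sketch of that comparison follows; the function name and the intron keys are hypothetical, and the key layout ("name:left-right strand") is assumed to match on both sides.

```python
def intron_qc(gene_model_introns, rnaseq_introns):
    """Split gene-model introns into those confirmed by RNA-Seq and those not.

    Both arguments are iterables of intron keys such as
    'supercont1.1485:27775-47679 -'. The confirmed list is the set
    intersection, which is what `comm -12` reports for two sorted files.
    """
    model = set(gene_model_introns)
    confirmed = model & set(rnaseq_introns)
    unconfirmed = model - confirmed
    return sorted(confirmed), sorted(unconfirmed)


# Hypothetical intron keys for illustration only.
gene_build = [
    "supercont1.1:100-200 +",
    "supercont1.1:300-400 +",
    "supercont1.2:50-90 -",
]
rnaseq = [
    "supercont1.1:100-200 +",
    "supercont1.2:50-90 -",
    "supercont1.3:10-20 +",
]
confirmed, unconfirmed = intron_qc(gene_build, rnaseq)
print(len(confirmed), "confirmed;", len(unconfirmed), "unconfirmed")
```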
Statistics (intron counts)
Example gene structure confirmation/improvement