Alternative Splicing

Alternative splicing as an introduction to microarrays. Human genome: with about 90,000 human proteins, the number of genes was initially assumed to be close to that figure (initial estimates ran to 153,000). The 1,000-cell roundworm Caenorhabditis elegans has 19,500 genes; corn has 40,000 genes.

Presentation Transcript


  1. Alternative Splicing As an introduction to microarrays

  2. Human Genome • About 90,000 human proteins; the number of genes was initially assumed to be close to that figure (initial estimates ran to 153,000) • The 1,000-cell roundworm Caenorhabditis elegans has 19,500 genes; corn has 40,000 genes • Current estimates are 25,000 or fewer human genes • Alternative splicing allows different tissue types to perform different functions with the same set of genes

  3. Implications • 75% of human genes are subject to alternative splicing • Faulty gene splicing leads to cancer and congenital diseases • Gene therapy can make use of splicing

  4. Application • We talked before about apoptosis, which occurs when the cell determines it can't be repaired • Bcl-x, a regulator of apoptosis, is alternatively spliced to produce either Bcl-x(L), which suppresses apoptosis, or Bcl-x(S), which promotes it

  5. Spliceosome • Five snRNA molecules (U1, U2, U4, U5, U6) combine with as many as 150 proteins to form the spliceosome • It recognizes the sites where introns begin and end • It cuts the introns out of the pre-mRNA • It joins the exons

  6. Spliceosome • The 5’ splice site is at the beginning of the intron; the 3’ splice site is at the end • The average human protein-coding gene is 28,000 nucleotides long, with 8.8 exons separated by 7.8 introns • Exons are about 120 nucleotides long, while introns range from 100 to 100,000 nucleotides
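The figures on this slide can be combined into a rough picture of how little of an average gene is exon. The sketch below is only back-of-the-envelope arithmetic: it assumes that 120 nucleotides is an average exon length and that the gene consists of nothing but exons and introns; the function name `gene_structure_summary` is made up for illustration.

```python
# Figures quoted on the slide. Treating 120 nt as the *average* exon length
# is an assumption made for this sketch.
GENE_LENGTH_NT = 28_000
EXONS_PER_GENE = 8.8
INTRONS_PER_GENE = 7.8
EXON_LENGTH_NT = 120


def gene_structure_summary(gene_len, n_exons, n_introns, exon_len):
    """Return exonic length, exonic fraction, and implied mean intron length."""
    exonic = n_exons * exon_len      # total exon sequence per gene
    intronic = gene_len - exonic     # everything else is assumed to be intron
    return {
        "exonic_nt": exonic,
        "exonic_fraction": exonic / gene_len,
        "mean_intron_nt": intronic / n_introns,
    }


summary = gene_structure_summary(
    GENE_LENGTH_NT, EXONS_PER_GENE, INTRONS_PER_GENE, EXON_LENGTH_NT
)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions roughly 4% of an average gene is exonic, and the implied mean intron length is about 3,500 nucleotides, which falls inside the 100 to 100,000 range quoted above.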

  7. Splicing errors • familial dysautonomia results from a single-nucleotide mutation that causes a gene to be alternatively spliced in nervous system tissue • The decrease in the IKBKAP protein leads to abnormal nervous system development (half die before 30) • > 15% of gene mutations that cause genetic diseases and cancers are caused by splicing errors.

  8. Why splicing • Each gene generates 3 alternatively spliced mRNAs • Why so much intron (only 1-2% of the genome is exons)? • Mouse and human differences are almost all in splicing • Half of the human genome is made up of transposable elements, Alus being the most abundant (1.4 million copies) • They continue to multiply and insert themselves into the genome at the rate of one insertion per 100 human births • Mutations in an Alu can create a 5’ or 3’ splice site in an intron, causing part of the intron to be treated as an exon • Such a mutation doesn’t impact existing exons • It only has an effect when the new exon is alternatively spliced in

  9. Microarrays For Alt. Splicing • Use short oligonucleotides • Get an estimate of the expression level of the sequence each oligo targets • [Figure: gene model with Exon 1, Exon 2, Exon 3, Exon 4, Exon 5]

  10. Affymetrix Microarrays For Alt. Splicing • Probe types: Constitutive, Exon Junction, Unique (“Cassette”) • [Figure: gene model with Exon 1, Exon 2, Exon 3, Exon 4, Exon 5] • Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5 • Isoform 2: Exon 1, Exon 3, Exon 5

  11. Ideal Microarray Readings • Probe types: Constitutive, Exon Junction, Unique (“Cassette”) • [Figure: expression level plotted for probes a, b, c, d, e] • Isoform 1 (Exon 1, Exon 2, Exon 4, Exon 5): probes a, c, b • Isoform 2 (Exon 1, Exon 3, Exon 5): probes a, d, e
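The idea behind the ideal readings can be sketched as follows: a probe's signal is the summed abundance of every isoform it can hybridize to. The probe-to-isoform assignment below is read off the slide (probe a on both isoforms; b and c on Isoform 1 only; d and e on Isoform 2 only). The abundance values, the linear noise-free response, and the name `ideal_readings` are assumptions made purely for illustration.

```python
# Which isoforms each probe can hybridize to, as listed on the slide.
PROBE_TARGETS = {
    "a": ("isoform1", "isoform2"),  # shared (constitutive) probe
    "b": ("isoform1",),
    "c": ("isoform1",),
    "d": ("isoform2",),
    "e": ("isoform2",),
}


def ideal_readings(abundance):
    """Noise-free signal per probe: sum of the abundances of its target isoforms.

    `abundance` maps isoform name -> relative abundance (hypothetical units).
    """
    return {
        probe: sum(abundance[isoform] for isoform in targets)
        for probe, targets in PROBE_TARGETS.items()
    }


# Hypothetical tissue in which Isoform 1 is three times as abundant as Isoform 2.
readings = ideal_readings({"isoform1": 3.0, "isoform2": 1.0})
for probe in sorted(readings):
    print(probe, readings[probe])
```

In this idealized picture the shared probe reports total gene expression, while the isoform-specific probes split that total between the two isoforms; real arrays add probe-specific affinity and noise on top.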

  12. Motivation • Why alternatively splice? • How does it affect the resulting proteins? • Look at domains: • High level summary of protein • ~80% of eukaryotic proteins are multi-domain • Domains are big relative to an exon

  13. Some Previous Work • Signatures of domain shuffling in the human genome. Kaessmann, 2002. Intron phase symmetry around domain boundaries • The Effects of Alternative Splicing On Transmembrane Proteins in the Mouse Genome. Cline, 2004. Half of TM proteins studied affected by alt-splicing.

  14. Method • Predict Alternative Splicing • Predict Protein Domains • Look for effects of Alt-Splicing on predicted domains • “Swapping” • “Knockout” • “Clipping”

  15. Microarray Design • Genes based on mRNA and EST data in mouse • Mapped to Feb. 2002 mouse genome freeze • ~500,000 probes (~66,000 sets) • ~100,000 transcripts • ~13,000 gene models

  16. Technical work • [Diagram: Genome Space, showing overlaps among gene models, transcripts, and probes; labeled "Generated Data" and "Provided data"] • Probe to transcript mapping, example entries: E@NM_021320 cc-chr10-000017.82.0; G6836022@J911445 cc-chr10-000017.91.1; G6807921@J911524_RC cc-chr10-000018.4.0

  17. Predicting Alternative Splicing • Using mouse alt-splicing microarrays • Data from Manny Ares • 8 tissues • 3 replicates of each tissue

  18. Predicting Alternative Splicing • General Approach: Clustering, then Anti-Clustering • [Figure: 107 Clusters; Detail View]

  19. Gene Expression Measurement • mRNA expression represents dynamic aspects of cell • mRNA expression can be measured with latest technology • mRNA is isolated and labeled with fluorescent protein • mRNA is hybridized to the target; level of hybridization corresponds to light emission which is measured with a laser

  20. Gene Expression Microarrays • The main types of gene expression microarrays: • Short oligonucleotide arrays (Affymetrix) • cDNA or spotted arrays (Brown/Botstein) • Long oligonucleotide arrays (Agilent Inkjet) • Fiber-optic arrays • ...

  21. Affymetrix Microarrays • [Figure: raw image; labeled dimensions 50 µm and 1.28 cm] • ~10^7 oligonucleotides, half Perfectly Match mRNA (PM), half have one Mismatch (MM) • Raw gene expression is the intensity difference: PM - MM
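The PM - MM rule can be written out in a few lines. The slide only states that raw expression is the intensity difference; averaging the differences over a probe set, as done here, is an added assumption, and the intensities and the name `raw_expression` are invented for illustration.

```python
def raw_expression(pm_intensities, mm_intensities):
    """Average PM - MM difference over the probe pairs of one probe set.

    The averaging step is an assumption of this sketch; the slide only
    defines the per-pair difference PM - MM.
    """
    if len(pm_intensities) != len(mm_intensities):
        raise ValueError("each PM probe needs a matching MM probe")
    if not pm_intensities:
        raise ValueError("at least one probe pair is required")
    differences = [pm - mm for pm, mm in zip(pm_intensities, mm_intensities)]
    return sum(differences) / len(differences)


# Hypothetical scanner intensities for three probe pairs of one gene.
pm = [1200.0, 950.0, 1430.0]
mm = [300.0, 410.0, 280.0]
signal = raw_expression(pm, mm)
print(f"raw expression: {signal:.2f}")
```

Subtracting the mismatch intensity is meant to remove non-specific hybridization; note that the difference can come out negative when a mismatch probe is brighter than its perfect-match partner.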

  22. Microarray Potential Applications • Biological discovery • new and better molecular diagnostics • new molecular targets for therapy • finding and refining biological pathways • Recent examples • molecular diagnosis of leukemia, breast cancer, ... • appropriate treatment for genetic signature • potential new drug targets

  23. Microarray Data Analysis Types • Gene Selection • find genes for therapeutic targets • avoid false positives (FDA approval ?) • Classification (Supervised) • identify disease • predict outcome / select best treatment • Clustering (Unsupervised) • find new biological classes / refine existing ones • exploration • …

  24. Microarray Data Mining Challenges • Too few records (samples), usually < 100 • Too many columns (genes), usually > 1,000 • Too many columns are likely to lead to false positives • For exploration, a large set of all relevant genes is desired • For diagnostics or identification of therapeutic targets, the smallest set of genes is needed • The model needs to be explainable to biologists
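Why many columns invite false positives can be made concrete with a little arithmetic. The sketch below assumes one independent significance test per gene and that no gene is truly differentially expressed; the 0.05 threshold, the Bonferroni correction, and the function names are choices made for illustration and do not appear on the slide.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes passing the threshold by chance alone."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha):
    """Per-gene threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_genes


n_genes = 1000  # lower bound on the column count quoted on the slide
alpha = 0.05    # conventional per-test significance level (assumed)

print("expected false positives:", expected_false_positives(n_genes, alpha))
print("Bonferroni per-gene threshold:", bonferroni_threshold(n_genes, alpha))
```

With 1,000 genes tested at 0.05, about 50 would look significant purely by chance, which is a large number next to fewer than 100 samples; a correction such as Bonferroni trades those false positives for reduced sensitivity.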
