Bioinformatics

Bioinformatics. Expression profiling and functional genomics Part I: Preprocessing Ad 29/10/2006. http://www.esat.kuleuven.ac.be/~kmarchal/ Course material: course notes + powerpoint files Exercises. Overview. MICROARRAY PREPROCESSING Gene expression Omics era Transcript profiling

Presentation Transcript


  1. Bioinformatics Expression profiling and functional genomics Part I: Preprocessing Ad 29/10/2006

  2. http://www.esat.kuleuven.ac.be/~kmarchal/ • Course material: course notes + powerpoint files • Exercises

  3. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Experiment design • Preprocessing • Exercises

  4. Gene expression DNA → transcription → mRNA → translation → protein [Figure label: +1]

  5. Gene expression Adaptation of a cell to its environment • Adaptation of a cell: • response to environmental signals • response to e.g. hormones (cell differentiation) • The cellular response is determined by the genes that are switched on upon a signal [Figure: bacterial cell (out/in) receiving Signal 1 and Signal 2; labels: FNR box, cytN, cytO, cytQ, cytP]

  6. Gene expression The action of genetic networks underlies the observed phenotypic behavior

  7. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Experiment design • Preprocessing • Exercises

  8. Structural genomics • Comparative genomics • Functional genomics

  9. Omics era Traditional molecular biology • Directed toward understanding the role of a particular gene or protein in a molecular biological process • Northern analysis • Mutational analysis • Expression by reporter fusions Omics era • Measurement of the expression of thousands of genes or proteins simultaneously • The function or expression of a gene is studied in the global context of the cell • Holistic approaches allow a better understanding of fundamental molecular biological processes, because a gene does not act on its own: it is always embedded in a larger network (systems biology)

  10. Omics era: transcriptomics [Figure: reference sample and test sample → RNA → cDNA (reference, test) → detection]

  11. Omics era proteomics

  12. Omics era metabolomics

  13. Omics era Consider the cell as a system SYSTEMS BIOLOGY

  14. Omics era SYSTEMS BIOLOGY • Mechanistic insight into the biological system at the molecular biological level • High-throughput data

  15. Omics era • Analysis of such large-scale data is no longer trivial => computational challenges • Low signal-to-noise ratio • High dimensionality • Simple spreadsheet analysis (e.g. in Excel) is no longer sufficient • More advanced data mining procedures become necessary • Another urgent problem is how to store and organize all the information => Bioinformatics

  16. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Principle of microarray • Applications • Experiment design • Preprocessing • Exercises

  17. Transcript profiling: transcriptomics [Figure: reference sample and test sample → RNA → cDNA (reference, test) → detection]

  18. Transcript profiling • Previously: measure the expression level of one gene: Northern blot analysis • Novel techniques: measure the expression level of all genes simultaneously => EXPRESSION PROFILING Principle: hybridisation • mRNA: 5’ -UGACCUGACG- 3’ • cDNA: 3’ -ACTGGACTGC- 5’ • Hybridize: complementary strands stick together
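The base pairing behind the hybridisation principle can be sketched in a few lines. This is only an illustration: the function names `complementary_dna` and `hybridizes` are hypothetical, and the sketch assumes perfect Watson-Crick pairing with no mismatches, which real hybridisation does not require.

```python
# Watson-Crick pairing from an RNA template to a DNA strand.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complementary_dna(mrna_5_to_3: str) -> str:
    """Return the complementary DNA strand written 3'->5', so that it
    lines up base by base under the mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in mrna_5_to_3.upper())


def hybridizes(mrna_5_to_3: str, cdna_3_to_5: str) -> bool:
    """True when every aligned position forms a Watson-Crick pair
    (assumption: perfect complementarity is required)."""
    return complementary_dna(mrna_5_to_3) == cdna_3_to_5.upper()


# The sequence pair shown on the slide.
print(complementary_dna("UGACCUGACG"))  # ACTGGACTGC
```

Applied to the slide's mRNA fragment, the function reproduces the cDNA strand given there.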

  19. Transcript profiling • Monitoring molecular activities on a global level gives general insight into global cell behavior (holistic): • protein levels (proteomics) • enzyme activities • metabolites • gene expression (mRNA): transcriptomics = transcript profiling Molecular biological methods • RT-PCR • SAGE • Protein arrays • Microarray analysis

  20. Transcript profiling

  21. +1 Transcript profiling cDNA array Gene (DNA) Transcript (mRNA) cDNA Spotted cDNA Glass slide Upscaled Northern hybridisation

  22. Transcript profiling • Preparation of probes • Collect cDNA clones • Amplify target cDNA insert by PCR • Check yield & specificity by electrophoresis • Spot PCR products on glass slides

  23. Transcript profiling [Figure: reference sample and test sample → RNA → cDNA (reference, test) → detection]

  24. Transcript profiling Signal 1 Signal 2 • 1. Cell culture • 2. mRNA isolation • 3. Labeling • 4. Hybridization + washing • 5. Scanning • 6. Image analysis → numerical value

  25. Transcript profiling http://www.bio.davidson.edu/courses/genomics/chip/chip.html

  26. Transcript profiling Superimposed color image • Transform into color images • Superimpose color images from R and G channel [Figure panels: good alignment, bad alignment]

  27. Transcript profiling Superimposed color image • black spots: gene was expressed in neither the test nor the control sample • green: gene was expressed only in the control sample • red: gene was expressed only in the test sample • yellow: gene was expressed in both the test and the control sample
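The color rules can be written down as a small decision function. This is a rough sketch under stated assumptions: the test sample is read from the red channel and the control sample from the green channel (as the slide implies), and `threshold` is a hypothetical cutoff for calling a gene "expressed" that does not come from the slides. Real images show a continuum of colors rather than four classes.

```python
def spot_color(test_signal: float, control_signal: float,
               threshold: float = 100.0) -> str:
    """Classify a spot in the superimposed image.

    test_signal    -- intensity in the red channel (test sample)
    control_signal -- intensity in the green channel (control sample)
    threshold      -- hypothetical cutoff above which a gene counts as expressed
    """
    test_on = test_signal >= threshold
    control_on = control_signal >= threshold
    if test_on and control_on:
        return "yellow"   # expressed in both samples
    if test_on:
        return "red"      # expressed only in the test sample
    if control_on:
        return "green"    # expressed only in the control sample
    return "black"        # expressed in neither sample


for red, green in [(500, 450), (500, 10), (10, 450), (10, 12)]:
    print(red, green, spot_color(red, green))
```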

  28. Transcript profiling Signal intensity is proportional to the amount of cDNA present in the sample • signal Cy3 -> numerical value • signal Cy5 -> numerical value Image analysis → Data analysis

  29. Transcript profiling Data representation Gene profile Experiment profile

  30. Transcript profiling Spotted DNA microarray High density oligonucleotide array

  31. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Experiment design • Preprocessing • Exercises

  32. Experiment Design • The appropriate mathematical approach depends on the experimental design • Comparison of 2 samples (black/white) • Comparison of multiple arrays • Global dynamic profiling • Static experiment: comparison of samples (mutants, patients)

  33. Experiment Design Type 1: Comparison of 2 samples (2-sample design) • Control sample vs. induced sample • Statistical testing • Retrieve statistically over- or underexpressed genes
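As an illustration of the statistical testing step, the sketch below computes a one-sample t statistic on replicate log2(induced/control) ratios of a single gene. The slide does not say which test is used; the one-sample form, the function name `one_sample_t`, and the example values are assumptions made for this sketch. A large absolute t suggests over- or underexpression, and it still has to be compared against a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def one_sample_t(log_ratios):
    """t statistic for H0: mean log ratio == 0 (no differential expression).

    log_ratios -- replicate log2(induced / control) values for one gene
    """
    n = len(log_ratios)
    if n < 2:
        raise ValueError("need at least two replicate measurements")
    mean = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return mean / (sd / math.sqrt(n))


# Hypothetical gene measured four times, about two-fold induced each time.
replicates = [1.0, 1.2, 0.8, 1.0]
print(round(one_sample_t(replicates), 2))
```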

  34. Experiment Design • Black/white experiment description (array V mice genes) • Condition 1: pygmy mouse, 10 days old (test) • Condition 2: normal mouse, 10 days old (ref) • Goal: detect differentially expressed genes Experiment design (Latin Square) • Array 1 • Array 2 • Per gene, per condition, 4 measurements are available

  35. Experiment Design Multiple array design • Measure expression of all genes • over time (dynamic profile) • in different conditions • Clustering: identify coexpressed genes • Motif finding: identify the mechanism of coregulation

  36. Experiment Design Multiple array design • Study of the mitotic cell cycle of Saccharomyces cerevisiae with oligonucleotide arrays (Cho et al. 1999) • 15 time points (E=18) • time points 90 & 100 min deleted (Zhang et al. 1999, Tavazoie et al. 1999) • Original dataset: 6178 genes • Preprocessing: • select the 4634 most variable genes (75% of the original 6178, i.e. the 25% least variable are dropped) • variance normalized • adaptive quality-based clustering (32 clusters) (95%)
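The two preprocessing steps named here, selecting the most variable genes and variance normalization, can be sketched as follows. The slide does not define "variance normalized"; this sketch assumes the common reading of rescaling each gene profile to mean 0 and variance 1. The function names and the toy profiles are hypothetical.

```python
import statistics


def select_most_variable(profiles, n_keep):
    """Keep the n_keep genes whose profiles have the highest variance.

    profiles -- dict mapping gene name -> list of expression values
    """
    ranked = sorted(profiles,
                    key=lambda gene: statistics.pvariance(profiles[gene]),
                    reverse=True)
    return {gene: profiles[gene] for gene in ranked[:n_keep]}


def variance_normalize(profile):
    """Rescale one gene profile to mean 0 and variance 1 (assumed meaning)."""
    mean = statistics.mean(profile)
    sd = statistics.pstdev(profile)
    if sd == 0:
        return [0.0] * len(profile)  # flat profile: nothing to rescale
    return [(x - mean) / sd for x in profile]


# Toy data: three genes measured at three time points.
toy = {"flat": [5, 5, 5], "cyclic": [1, 5, 1], "mild": [4, 5, 4]}
kept = select_most_variable(toy, n_keep=2)
normalized = {gene: variance_normalize(p) for gene, p in kept.items()}
print(list(normalized))
```

After normalization, profiles with the same shape but different amplitude become comparable, which is what a subsequent clustering step relies on.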

  37. Experiment Design Reference design: e.g. Spellman dataset • Reference: unsynchronized cells • Condition: synchronized cells during cell cycle at distinct time intervals Array 1

  38. Experiment Design Loop design

  39. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Experiment design • Preprocessing • Sources of Variation • General normalization steps • Slide by slide normalization • ANOVA normalization

  40. Preprocessing Sources of variation • Overshine effects • Dye effect • Spot effects • Array effect Consistent errors • complicate direct comparison of measurements of the same gene/condition • need to be removed by preprocessing/normalization Preprocessing • is tedious • influences downstream measurements

  41. Preprocessing Dye effect Signal 1 Signal 2 • 1. Cell culture • 2. mRNA isolation • 3. Labeling • 4. Hybridization + washing • 5. Scanning • 6. Image analysis → numerical value

  42. Preprocessing Dye, condition effect: within-slide variation Measurement error: • Preparation of mRNA • Labeling & reverse transcription Result: the overall signal in one channel is more pronounced than in the other channel Normalization: global normalization assumption
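One simple way to correct an overall channel imbalance is to center the log ratios of a slide. The slide names the global normalization assumption without spelling it out; the sketch below assumes the usual reading, namely that most genes are not differentially expressed, so the median log2(red/green) ratio on a slide should be zero. The function name and the intensity values are hypothetical.

```python
import math
import statistics


def global_normalize(red, green):
    """Median-center the log2(red/green) ratios of one slide.

    Assumption: most genes do not change, so the median log ratio
    reflects dye bias rather than biology and can be subtracted.
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    shift = statistics.median(log_ratios)
    return [m - shift for m in log_ratios]


# Toy slide: the red channel is twice as bright overall, and only
# the last gene is really induced.
red = [200.0, 400.0, 1600.0]
green = [100.0, 200.0, 200.0]
print(global_normalize(red, green))  # [0.0, 0.0, 2.0]
```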

  43. Preprocessing Array effect Signal 1 Signal 2 • 1. Cell culture • 2. mRNA isolation • 3. Labeling • 4. Hybridization + washing • 5. Scanning • 6. Image analysis → numerical value

  44. Preprocessing Array effects: between-slide variation • Differences in global intensity between slides • Hybridization differences • Direct comparison between slides impossible Remedies: • normalization within slide • ratio

  45. Preprocessing Array effects: Between slide variation

  46. Preprocessing Pin main effects: spot effects Measurement error: • Different quantity of DNA in spot • Difference in duplicate spots • Absolute levels between genes are not comparable Spot effect → Ratio: compare differential expression between genes • Gene 1: test: 4, ref: 2, R/G: 2 • Gene 2: test: 8, ref: 4, R/G: 2
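The two-gene example can be reproduced directly. The sketch below uses the slide's numbers; the function name `expression_ratio` is hypothetical, and the log2 transform is added only because log ratios are what the later preprocessing steps work with.

```python
import math

# (test, reference) intensities from the slide's example.
spots = {"Gene 1": (4.0, 2.0), "Gene 2": (8.0, 4.0)}


def expression_ratio(test: float, ref: float) -> float:
    """Ratio of test (R) over reference (G) intensity for one spot."""
    return test / ref


for gene, (test, ref) in spots.items():
    ratio = expression_ratio(test, ref)
    print(gene, "R/G =", ratio, "log2(R/G) =", math.log2(ratio))
```

Gene 2 has twice the absolute signal of Gene 1 (possibly only because more DNA was spotted), yet both genes show the same two-fold change, which is why the ratio is compared rather than the absolute level.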

  47. Preprocessing Overshine effects: within-slide variation • Non-specific Cy5 or Cy3 signal resulting from overshining (= emission from neighboring spots) • Background intensity increases with the intensity of the neighboring spots
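A common response to non-specific background signal is local background subtraction per spot and per channel. The sketch below is a minimal version under stated assumptions: the image analysis software is assumed to report a foreground and a local background intensity for each spot, and `floor` is a hypothetical lower bound that keeps the corrected value positive so that a later log transformation stays defined.

```python
def background_correct(foreground: float, background: float,
                       floor: float = 1.0) -> float:
    """Subtract the local background estimate from the spot intensity.

    floor -- hypothetical minimum returned value, so log() remains defined
    """
    return max(foreground - background, floor)


# Hypothetical spot next to a very bright neighbor: its local
# background is raised by overshining.
print(background_correct(foreground=850.0, background=300.0))  # 550.0
print(background_correct(foreground=120.0, background=150.0))  # 1.0
```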

  48. Preprocessing • Removing sources of variation is an obligatory step • To make comparisons within a slide possible • E.g. find differentially expressed genes • To allow comparisons between slides • E.g. combining the replicates of the original experiment and the color flip
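Combining an original array with its color flip can be sketched as below. The assumptions are mine, not the slides': the dye bias is taken to be additive on the log scale and equal on both arrays, the original array labels the test sample with Cy5, and the color-flip array labels it with Cy3, so the true log ratio changes sign between the arrays while the dye bias does not.

```python
def combine_color_flip(m_original: float, m_flipped: float) -> float:
    """Average a log2(Cy5/Cy3) pair from an original and a color-flip array.

    m_original -- log2(Cy5/Cy3) with the test sample in Cy5
    m_flipped  -- log2(Cy5/Cy3) with the test sample in Cy3
    Under the additive-bias assumption the dye effect cancels out.
    """
    return (m_original - m_flipped) / 2


# Hypothetical gene: true log2(test/ref) = 1.0, dye bias = 0.3.
true_log_ratio, dye_bias = 1.0, 0.3
m_original = true_log_ratio + dye_bias    # 1.3
m_flipped = -true_log_ratio + dye_bias    # -0.7
print(round(combine_color_flip(m_original, m_flipped), 6))  # 1.0
```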

  49. Overview MICROARRAY PREPROCESSING • Gene expression • Omics era • Transcript profiling • Experiment design • Preprocessing • Sources of Variation • General normalization steps • Slide by slide normalization • ANOVA normalization ANOVA

  50. Preprocessing Array-by-array approach: • Background correction • Log transformation • Filtering • Normalization • Ratio • Test statistic (t-test) ANOVA-based approach: • Background correction • Log transformation • Filtering • Linearisation • Bootstrapping
