Microarray Data Analysis Using R: Studies in Tissue Databases
Mark Reimers, NCI
Outline
• The GNF tissue database
• Exploratory analysis - clustering
• Positional co-regulation
• Insight via co-regulation
• Apoptotic configuration of tissues
• Probe level analysis
The GNF Expression Atlas
• Su et al. (PNAS 2004) hybridized 150 samples from 61 tissues to Affymetrix U133A and custom arrays
• Variation in gene expression (as proportion of transcriptome):
• 95% show at least one 2-fold change among the 61 tissues
• 37% show more than 2-fold differences between the lowest 10% and highest 10%
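The two proportions above can be computed in a few lines. The following is a minimal sketch, not the analysis behind the slide: it assumes unlogged, positive expression values, reads "at least one 2-fold change" as max/min >= 2 across tissues, and reads "lowest 10% and highest 10%" as the mean of the bottom decile of tissues versus the mean of the top decile. The function name and the toy data are hypothetical.

```python
def fold_change_summary(expr, fold=2.0):
    """Return (fraction with max/min >= fold, fraction with decile ratio > fold).

    expr maps a gene name to a list of positive, unlogged expression values,
    one value per tissue. Both definitions are assumptions made for this sketch.
    """
    n_any = 0
    n_decile = 0
    for values in expr.values():
        v = sorted(values)
        k = max(1, len(v) // 10)      # number of tissues in one decile
        low = sum(v[:k]) / k          # mean of the lowest decile
        high = sum(v[-k:]) / k        # mean of the highest decile
        if v[-1] / v[0] >= fold:      # any pair of tissues differs by >= fold
            n_any += 1
        if high / low > fold:         # deciles differ by more than fold
            n_decile += 1
    return n_any / len(expr), n_decile / len(expr)


# Hypothetical toy data: 4 genes measured in 20 tissues.
toy = {
    "flat_gene": [10.0] * 20,
    "one_spike": [10.0] * 19 + [25.0],
    "two_level": [10.0] * 10 + [40.0] * 10,
    "low_flat": [5.0] * 20,
}
any_frac, decile_frac = fold_change_summary(toy)
print(any_frac, decile_frac)
```

A single outlying tissue is enough to satisfy the first criterion but not the second, which is one reason the second proportion is smaller.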
Clustering samples
• All biological replicates are nearest neighbors
• Dendrogram reflects the distinction between healthy and cancerous tissues
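The claim that replicates are nearest neighbors can be checked directly, without drawing the dendrogram. The sketch below assumes a distance of 1 minus the Pearson correlation between sample profiles; the slide does not state which distance was used. The sample names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def nearest_neighbors(samples):
    """Map each sample name to the other sample with the smallest 1 - r distance."""
    nearest = {}
    for name, profile in samples.items():
        others = [other for other in samples if other != name]
        nearest[name] = min(
            others, key=lambda other: 1.0 - pearson(profile, samples[other])
        )
    return nearest


# Hypothetical toy data: two replicates each of two tissues, four genes.
samples = {
    "liver_1": [1.0, 5.0, 2.0, 8.0],
    "liver_2": [1.2, 4.8, 2.1, 7.5],
    "brain_1": [9.0, 1.0, 7.0, 2.0],
    "brain_2": [8.5, 1.3, 7.2, 2.4],
}
nn = nearest_neighbors(samples)
for name, neighbor in nn.items():
    print(name, "->", neighbor)
```

Any replicate whose nearest neighbor is a different tissue would be a candidate for a quality check.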
Co-regulation of Nearby Genes
• Some groups of genes next to one another on a chromosome show high correlation across tissues
Significance of Co-regulation
• How often would such correlations happen ‘by chance’, e.g. by selecting genes at random?
• Three random measures would have correlation greater than 0.6 with p < 10^-20!
• However, 3 genes selected at random from the atlas have probability ~ 10^-3 of having all correlations > 0.6
• In 30,000 positions, we should see about 30 such triples by chance (30,000 × 10^-3)
• 156 regions of high correlation determined
• Many are paralogs
• Perhaps 50% false discovery rate among the rest
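The reasoning on this slide can be sketched as a scan plus an empirical null. The code below is an illustration under stated assumptions, not the procedure used for the talk: it slides a window of 3 adjacent genes along a chromosome-ordered list, flags windows where every pairwise Pearson correlation exceeds 0.6, and estimates the chance rate by drawing random triples from the same data. All function names and the toy profiles are hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def min_pairwise_corr(profiles):
    """Smallest correlation among all pairs in a group of profiles."""
    return min(pearson(a, b) for a, b in itertools.combinations(profiles, 2))


def coregulated_windows(ordered_profiles, width=3, threshold=0.6):
    """Start indices of windows of adjacent genes with all correlations > threshold."""
    hits = []
    for start in range(len(ordered_profiles) - width + 1):
        window = ordered_profiles[start:start + width]
        if min_pairwise_corr(window) > threshold:
            hits.append(start)
    return hits


def random_triple_rate(profiles, trials=2000, width=3, threshold=0.6, seed=0):
    """Fraction of randomly drawn gene groups that pass the same threshold."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        if min_pairwise_corr(rng.sample(profiles, width)) > threshold:
            passed += 1
    return passed / trials


# Hypothetical toy data: 5 genes in chromosomal order, 5 tissues each.
ordered = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 5.0, 7.0],
    [1.0, 2.0, 4.0, 4.0, 6.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [3.0, 5.0, 1.0, 4.0, 2.0],
]
hits = coregulated_windows(ordered)
rate = random_triple_rate(ordered)

# Expected chance count from the slide's figures: positions times per-triple rate.
expected_by_chance = 30000 * 1e-3
print(hits, rate, expected_by_chance)
```

Drawing the null from the atlas itself, rather than from independent random numbers, is what moves the chance probability from below 10^-20 to roughly 10^-3: real genes share broad tissue patterns, so unrelated genes are correlated far more often than independent noise would be.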
Prediction of Function
• Zhang et al. (J. Biol. 2004, 3:21) hybridized 55 mouse tissues to spotted oligo arrays
• Hypothesis: genes with similar tissue expression patterns share similar function
• Able to recover GO biological process annotations for known genes with better than 50% accuracy for many categories
• Extended prediction to 1,092 uncharacterized transcripts
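The hypothesis can be illustrated with a simple nearest-neighbor vote: assign an uncharacterized gene the category most common among the annotated genes whose tissue profiles correlate best with it. This is a stand-in chosen for brevity and is not claimed to be the classifier used in the cited paper. The function names, category labels, and profiles are hypothetical.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def predict_category(profile, annotated, k=3):
    """Majority category among the k annotated genes most correlated with profile.

    annotated is a list of (profile, category) pairs.
    """
    ranked = sorted(
        annotated, key=lambda item: pearson(profile, item[0]), reverse=True
    )
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical annotated genes: category_X rises across tissues, category_Y falls.
annotated = [
    ([1.0, 2.0, 3.0, 4.0], "category_X"),
    ([1.0, 3.0, 3.0, 5.0], "category_X"),
    ([2.0, 2.0, 4.0, 5.0], "category_X"),
    ([4.0, 3.0, 2.0, 1.0], "category_Y"),
    ([5.0, 3.0, 2.0, 1.0], "category_Y"),
    ([4.0, 4.0, 2.0, 1.0], "category_Y"),
]
print(predict_category([1.0, 2.0, 4.0, 4.0], annotated))
print(predict_category([5.0, 4.0, 2.0, 2.0], annotated))
```

Accuracy for known genes would be estimated by holding each annotated gene out in turn and comparing the predicted category with the recorded one.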
Investigation of Poorly Characterized Gene - Top1MT
• 10-fold variation in expression (odd for a ‘housekeeping gene’)
• More than 50 genes with expression highly correlated (> 0.75) with Top1MT across the tissue database
• A large proportion are splicing factors
• Top1MT has an odd splice junction in intron 1, and may depend critically on abundant splicing factors
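A query of this kind amounts to ranking every gene by its correlation with one reference profile and keeping those above a cutoff. The sketch below assumes Pearson correlation across tissues and a 0.75 threshold; the partner gene names and all values are hypothetical placeholders, not atlas data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_genes(expr, target, threshold=0.75):
    """Genes whose correlation with the target exceeds threshold, best first.

    expr maps a gene name to its expression profile across tissues.
    Returns a list of (gene, correlation) pairs.
    """
    reference = expr[target]
    scored = [
        (gene, pearson(reference, profile))
        for gene, profile in expr.items()
        if gene != target
    ]
    kept = [(gene, r) for gene, r in scored if r > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: placeholder gene names, 5 tissues.
expr = {
    "TOP1MT": [1.0, 2.0, 3.0, 4.0, 10.0],
    "geneA": [2.0, 4.0, 6.0, 8.0, 20.0],
    "geneB": [1.0, 2.0, 2.0, 5.0, 9.0],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
}
partners = correlated_genes(expr, "TOP1MT")
for gene, r in partners:
    print(gene, round(r, 3))
```

The resulting list is what would then be inspected for shared annotation, such as an excess of splicing factors.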
Apoptosis Patterns
• Majority of epithelial tissues show a common pattern (indisposed to apoptosis)
• Blood cells show a variety of patterns
Exploration of Probe Sets
• Examine correlation of probe sets across 150 samples
• All but one probe verified to match latest Unigene build for gene
• Probes organized by position in 3’ end
[Figure: probe correlation heat map. Color scale: red = 1; white = < 0]
Quality of Arrays
• Regional bias images