
HapMap: application in the design and interpretation of association studies Mark J. Daly, PhD on behalf of The Internat

HapMap: application in the design and interpretation of association studies Mark J. Daly, PhD on behalf of The International HapMap Consortium. Goals of this segment. Briefly summarize HapMap design and current status


Presentation Transcript


  1. HapMap: application in the design and interpretation of association studies Mark J. Daly, PhD on behalf of The International HapMap Consortium

  2. Goals of this segment • Briefly summarize HapMap design and current status • Discuss the application of HapMap to all aspects of association study design, analysis and interpretation

  3. HapMap Project • A freely available public resource to increase the power and efficiency of genetic association studies to medical traits • High-density SNP genotyping across the genome provides information about: SNP validation, frequency, assay conditions; correlation structure of alleles in the genome • All data is freely available on the web for application in study design and analyses as researchers see fit

  4. HapMap Samples • 90 Yoruba individuals (30 parent-parent-offspring trios) from Ibadan, Nigeria (YRI) • 90 individuals (30 trios) of European descent from Utah (CEU) • 45 Han Chinese individuals from Beijing (CHB) • 45 Japanese individuals from Tokyo (JPT)

  5. HapMap progress • PHASE I – completed, described in Nature paper: 1,000,000 SNPs successfully typed in all 270 HapMap samples; ENCODE variation reference resource available • PHASE II – data generation complete, data released this past Monday: >3,500,000 SNPs typed in total!

  6. ENCODE-HAPMAP variation project • Ten “typical” 500kb regions • 48 samples sequenced • All discovered SNPs (and any others in dbSNP) typed in all 270 HapMap samples • Current data set – 1 SNP every 279 bp • A much more complete variation resource by which the genome-wide map can be evaluated

  7. Completeness of dbSNP Vast majority of common SNPs are contained in or highly correlated with a SNP in dbSNP

  8. Recombination hotspots are widespread and account for LD structure [Figure label: 7q21]

  9. Utility of LD in association study • “If I’m a causal variant, what is relevant to my detection in association studies is how well correlated I am with one of the SNPs or haplotypes examined in the study.”
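The correlation referred to on this slide is usually measured as r2 between two SNPs. As a minimal sketch, assuming phased haplotypes with alleles coded 0/1 (the function name and the coding are illustrative choices, not part of the presentation), it could be computed like this:

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs.

    hap_a, hap_b: equal-length lists with one entry per phased haplotype,
    alleles coded 0/1 (an assumed encoding for this sketch).
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("need two non-empty allele lists of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                            # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom
```

Two SNPs whose alleles always travel together give r2 = 1, whichever allele is labeled 1; two SNPs that are statistically independent in the sample give r2 = 0.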

  10. Coverage of Phase II HapMap (estimated from ENCODE data)

      Panel      % r2 > 0.8   max r2
      YRI        81           0.90
      CEU        94           0.97
      CHB+JPT    94           0.97

      From Table 6 – “A Haplotype Map of the Human Genome”, Nature

  11. Coverage of Phase II HapMap (estimated from ENCODE data)

      Panel      % r2 > 0.8   max r2
      YRI        81           0.90
      CEU        94           0.97
      CHB+JPT    94           0.97

      % r2 > 0.8: percentage of deeply ascertained common variants highly correlated with a HapMap SNP
      From Table 6 – “A Haplotype Map of the Human Genome”, Nature

  12. Coverage of Phase II HapMap (estimated from ENCODE data)

      Panel      % r2 > 0.8   max r2
      YRI        81           0.90
      CEU        94           0.97
      CHB+JPT    94           0.97

      max r2: average maximum correlation between a deeply ascertained variant and a neighboring HapMap SNP
      From Table 6 – “A Haplotype Map of the Human Genome”, Nature

  13. Coverage of Phase II HapMap (estimated from ENCODE data)

      Panel      % r2 > 0.8   max r2
      YRI        81%          0.90
      CEU        94%          0.97
      CHB+JPT    94%          0.97

      Vast majority of common variation (MAF > .05) captured by Phase II HapMap
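The two columns of the coverage table are simple summaries of one number per variant: the best r2 between that variant and any HapMap SNP. A minimal sketch of the summary step, assuming those per-variant maxima have already been computed (the function name and the toy values are made up for illustration):

```python
def coverage_summary(max_r2, threshold=0.8):
    """Summarize coverage from per-variant maximum r^2 values.

    Returns (percentage of variants with max r^2 > threshold,
             mean of the maximum r^2 values).
    """
    if not max_r2:
        raise ValueError("need at least one variant")
    n = len(max_r2)
    pct_captured = 100.0 * sum(1 for r in max_r2 if r > threshold) / n
    mean_max_r2 = sum(max_r2) / n
    return pct_captured, mean_max_r2


# Toy values, not HapMap data: five variants and their best r^2 to any map SNP.
toy_max_r2 = [1.0, 0.95, 0.85, 0.6, 0.9]
pct, mean_r2 = coverage_summary(toy_max_r2)
```

In this toy input four of the five variants exceed the threshold, so the first statistic is 80% and the second is the plain average of the five values.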

  14. Applying the HapMap • Study design - tagging • Study coverage evaluation • Study analysis - improving association testing • Study interpretation • Comparison of multiple studies • Connection to genes/genomic features • Integration with expression and other functional data • Other uses of HapMap data • Admixture, LOH, selection

  15. Tagging from HapMap • Since HapMap describes the majority of common variation in the genome, choosing non-redundant sets of SNPs from HapMap offers considerable efficiency without power loss in association studies

  16. Pairwise tagging [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6, with three groups of SNPs in high r2] • Tags: SNP 1, SNP 3, SNP 6 (3 in total) • Test for association: SNP 1, SNP 3, SNP 6 • After Carlson et al. (2004) AJHG 74:106
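Pairwise tagging can be illustrated with a greedy selection over a table of pairwise r2 values. The sketch below is written in the spirit of the approach cited on the slide, but it is an assumption-laden toy and not the implementation used by Haploview or Tagger; the function name, the r2 >= 0.8 cutoff, and the toy r2 values are all chosen here for illustration.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Greedy pairwise tag selection.

    snps: list of SNP identifiers.
    r2:   dict mapping frozenset({a, b}) -> pairwise r^2 (missing pairs count as 0).
    Returns a dict {tag: sorted list of SNPs it captures}, in selection order.
    """
    def captures(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    untagged = set(snps)
    tags = {}
    while untagged:
        # Pick the SNP capturing the most still-untagged SNPs (ties: first in list).
        best = max(snps, key=lambda s: sum(captures(s, u) for u in untagged))
        covered = sorted(u for u in untagged if captures(best, u))
        tags[best] = covered
        untagged -= set(covered)
    return tags


# Toy r^2 values (hypothetical): three pairs of highly correlated SNPs.
snps = ["snp1", "snp2", "snp3", "snp4", "snp5", "snp6"]
toy_r2 = {
    frozenset(("snp1", "snp2")): 1.0,
    frozenset(("snp3", "snp5")): 0.95,
    frozenset(("snp4", "snp6")): 0.9,
    frozenset(("snp2", "snp3")): 0.2,
}
tags = greedy_pairwise_tags(snps, toy_r2)
```

With these toy values three tags are enough for six SNPs. Within a highly correlated pair either member can serve as the tag, so the tie-breaking rule decides which one is reported.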

  17. Pairwise Tagging Efficiency Tag SNPs were picked to capture common SNPs in release 16c.1 for every 7,000 SNP bin using Haploview. Tagging Phase I HapMap offers 2-5x gains in efficiency

  18. Use of haplotypes can improve genotyping efficiency [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6] • Pairwise tagging: Tags: SNP 1, SNP 3, SNP 6 (3 in total); Test for association: SNP 1, SNP 3, SNP 6 • Multi-marker tagging: Tags: SNP 1, SNP 3 (2 in total); Test for association: SNP 1 captures 1+2, SNP 3 captures 3+5, “AG” haplotype captures SNP 4+6 • Tags in multi-marker tests should be conditional on significance of LD in order to avoid overfitting
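To make the multi-marker idea concrete, the sketch below compares how well single tags and a two-tag haplotype predict an untyped SNP. The haplotypes are invented toy data, the helper names are hypothetical, and the check on the significance of LD mentioned on the slide is deliberately left out.

```python
def r_squared(x, y):
    """r^2 between two 0/1 indicator lists defined over the same haplotypes."""
    n = len(x)
    p_x, p_y = sum(x) / n, sum(y) / n
    p_xy = sum(1 for a, b in zip(x, y) if a and b) / n
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    return 0.0 if denom == 0 else (p_xy - p_x * p_y) ** 2 / denom


def indicator(haplotypes, positions, alleles):
    """1 for haplotypes carrying the given alleles at the given positions, else 0."""
    return [
        1 if all(h[p] == a for p, a in zip(positions, alleles)) else 0
        for h in haplotypes
    ]


# Toy phased haplotypes (hypothetical): columns are tag 1, tag 2, untyped SNP.
haps = ["AGT", "AGT", "ACC", "TGC", "TCC", "TCC"]

untyped = indicator(haps, [2], "T")
r2_tag1 = r_squared(indicator(haps, [0], "A"), untyped)        # single-marker test
r2_tag2 = r_squared(indicator(haps, [1], "G"), untyped)        # single-marker test
r2_hap = r_squared(indicator(haps, [0, 1], "AG"), untyped)     # "AG" haplotype test
```

In this toy set the untyped allele occurs only on haplotypes carrying A at the first tag and G at the second, so the haplotype test tracks it fully while each tag alone reaches only r2 = 0.5.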

  19. Efficiency and power • ~300,000 tag SNPs needed to cover common variation in whole genome in CEU [Figure: relative power (%) against average marker density (per kb), tag SNPs compared with random SNPs] • P.I.W. de Bakker et al. (2005) Nat Genet Advance Online Publication 23 Oct 2005

  20. How to pick tag SNPs? • What is the genetic hypothesis? Which variants do you want to test for a role in disease? • functional annotation (coding SNPs) • allele frequency (HapMap ascertainment) • previously implicated associations • Go to http://www.hapmap.org – DCC supported interactive tagging • Export HapMap data into tools such as Tagger, Haploview (www.broad.mit.edu/mpg)

  21. Will tag SNPs picked from HapMap apply to other population samples? [Figure: three panels, each labeled CEU: Utah residents with European ancestry (CEPH); Whites from Los Angeles, CA; Botnia, Finland] • Population differences add very little inefficiency • Platform presentation: Paul de Bakker (#223: Sat 9.30)

  22. Applying the HapMap • Study design - tagging • Study coverage evaluation • Study analysis - improving association testing • Study interpretation • Comparison of multiple studies • Connection to genes/genomic features • Integration with expression and other functional data • Other uses of HapMap data • Admixture, LOH, selection

  23. Genome-wide association coverage • If genome-wide products are typed on the HapMap sample panel, the SNPs on HapMap not included in the panel provide an evaluation for the coverage of the product • ENCODE (deep ascertainment) • Phase II (dense, genome-wide)

  24. Association tests with fixed markers [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6; legend: marked SNP = SNP on whole-genome product (~1 - 5% common variation directly assayed)] • Tests of association: SNP 1, SNP 3

  25. Association tests with fixed markers [Figure: same example haplotypes, with two groups of SNPs in high r2] • Tests of association: SNP 1, SNP 3

  26. Association tests with fixed markers [Figure: same example haplotypes, with two groups of SNPs in high r2] • Tests of association: SNP 1, SNP 3 • SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5
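The coverage of a fixed marker panel can be sketched the same way: for every reference SNP that is not on the panel, take its best r2 to any panel SNP and count how many clear a cutoff. The function name, the r2 >= 0.8 cutoff, and the toy r2 table below are assumptions for illustration only.

```python
def panel_coverage(all_snps, panel, r2, threshold=0.8):
    """Evaluate a fixed panel against SNPs that are not on it.

    r2: dict mapping frozenset({a, b}) -> pairwise r^2 (missing pairs count as 0).
    Returns ({off-panel SNP: best r^2 to any panel SNP}, fraction captured).
    """
    panel = set(panel)
    targets = [s for s in all_snps if s not in panel]
    best = {
        t: max((r2.get(frozenset((t, p)), 0.0) for p in panel), default=0.0)
        for t in targets
    }
    captured = sum(1 for t in targets if best[t] >= threshold)
    fraction = captured / len(targets) if targets else 0.0
    return best, fraction


# Toy example (hypothetical r^2 values): SNPs 1 and 3 are on the product.
toy_r2 = {
    frozenset((1, 2)): 1.0,
    frozenset((3, 5)): 1.0,
    frozenset((4, 6)): 1.0,
    frozenset((1, 4)): 0.3,
    frozenset((3, 6)): 0.4,
}
best, fraction = panel_coverage([1, 2, 3, 4, 5, 6], [1, 3], toy_r2)
```

Here SNPs 2 and 5 are tested indirectly through SNPs 1 and 3, while SNPs 4 and 6 remain uncovered, so half of the off-panel SNPs are captured.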

  27. Genome-wide products can capture most common variation Example: 500K data generated by Affymetrix and recently submitted to HapMap DCC

  28. More on this topic • Platform presentations tomorrow morning 8 AM sharp: • Peer • Jorgenson • Lazarus • As well as several detailed posters!

  29. Applying the HapMap • Study design - tagging • Study coverage evaluation • Study analysis - improving association testing • Study interpretation • Comparison of multiple studies • Connection to genes/genomic features • Integration with expression and other functional data • Other uses of HapMap data • Admixture, LOH, selection

  30. Can incorporating tests of haplotypes of SNPs on the genome-wide product improve this coverage?

  31. Improving association power using data from HapMap [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6] • Tests of association: SNP 1, SNP 3 • SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5

  32. Improving association power using data from HapMap [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6] • Tests of association: SNP 1, SNP 3 • SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5

  33. Improving association power using data from HapMap [Figure: example haplotypes across six biallelic SNPs, numbered 1 to 6] • Tests of association: SNP 1, SNP 3, “AG haplotype” • SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5, SNP 4, SNP 6

  34. Haplotypes increase coverage

  35. Applying the HapMap • Study design - tagging • Study coverage evaluation • Study analysis - improving association testing • Study interpretation • Connection to genes/genomic features • Comparison of multiple association studies • Integration with expression and other functional data • Other uses of HapMap data • Admixture, LOH, selection

  36. Integration with genomic features • Positive association to a SNP on HapMap enables detailed interpretation: • How many other SNPs are in LD with this SNP? • What genes are in LD with this SNP? • What coding variants and putative functional variants are in LD with this SNP? Potential to improve power by modifying Bayesian priors of each association test based on this information

  37. Example: Complement Factor H - AMD • Original SNP hit in Affy 100K experiment – rs380390 • Extent and structure of LD from HapMap aids in the fine mapping phase of project Klein et al Science 2005

  38. Example: Complement Factor H - AMD rs380390

  39. Example: Complement Factor H - AMD rs380390

  40. Meta-analysis of association studies • When different marker sets are used to study association (candidate gene or genome-wide), results can be readily integrated when all markers are typed on HapMap samples

  41. Example: DTNBP1 and schizophrenia • Multiple studies have described modest association to schizophrenia • Most studies have examined small numbers of non-overlapping sets of SNPs • HapMap data can be used to determine whether these association finding Derek Morris, Mousumi Mutsuddi (WCPG meeting)

  42. Extensive LD across DTNBP1 [Figure: Phase II HapMap, 186 SNPs, 180 kb]

  43. Phylogeny of DTNBP1 tag SNPs [Figure: haplotype tree over tag SNPs 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Branch labels: 4 (GA), 5 (CT), 10 (AT), 7 (CT), 2 (AG), 3 (GA) • Ancestral haplotype marked • Haplotype frequencies: 6%, 33%, 42%, 8%, 11%]

  44. Tag SNPs: 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Associated alleles reported: Straub 2002; Van den Oord 2003

  45. Tag SNPs: 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Associated alleles reported: Straub 2002; Van den Oord 2003; Schwab 2003

  46. Tag SNPs: 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Associated alleles reported: Straub 2002; Van den Oord 2003; Van den Bogaert 2003; Funke 2004; Schwab 2003

  47. Tag SNPs: 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Associated alleles reported: Straub 2002; Van den Oord 2003; Williams 2004; Bray 2005; Van den Bogaert 2003; Funke 2004; Schwab 2003

  48. Tag SNPs: 2, 3, 4, 5, 7, 10 • Haplotypes: AGGCCA, AAGCCT, AGGCCT, AGGCCA, AGATTA, GGATCA • Associated alleles reported: Kirov 2004; Straub 2002; Van den Oord 2003; Williams 2004; Bray 2005; Van den Bogaert 2003; Funke 2004; Schwab 2003