1. Association Studies To Locate Human Disease Genes
Wentian Li, Ph.D.
The Robert S Boas Center for Genomics and Human Genetics
North Shore LIJ Institute for Medical Research
2. GENE PHENOTYPE/DISEASE ENVIRONMENT
3. GENETIC MARKER GENE PHENOTYPE/DISEASE ENVIRONMENT (controlled, fixed)
4. Early history of association analysis (1921)
blood type (ABO) and disease association
JA Buchanan, ET Higley (1921) "The relationship of blood groups to disease", British Journal of Experimental Pathology 2:247-255.
5. Early history of association analysis (1945)
The suggestion to use ABO blood type/secretor polymorphism to detect association with diseases
EB Ford (1945), "Polymorphism", Biological Reviews, 20:73-88.
7. Early history of association analysis (1953-54)
Ian Aird, HH Bentall, JA Fraser-Roberts (1953), "A relationship between cancer of stomach and the ABO blood groups", British Medical Journal, 1:799-801.
I Aird, HH Bentall, JA Mehigan, JAF Roberts (1954), "The blood groups in relation to peptic ulceration and carcinoma of the colon, rectum, breast and bronchus: an association between the ABO groups and peptic ulceration", British Medical Journal, 2:315-321.
8. Early history of association analysis (1960s)
Polymorphism in the Human Leukocyte Antigen (HLA) system (also known as the Major Histocompatibility Complex (MHC)) and disease association
International Histocompatibility Workshop (first one in 1964)
9. Divergence between linkage and association analysis for human disease gene detection (1970s-1980s?)
Both are based on the same principle: the genetic polymorphism (which itself may have no function) and the disease gene (which has a function) lie close to each other on the chromosome.
Only the techniques are different
Association (and linkage disequilibrium) became mainly a topic in population genetics (with the exception of HLA-disease association analysis)
10. Differences between linkage analysis and association analysis
Linkage analysis is based on pedigree data
Association analysis is based on population data
Linkage analyses rely on recombination events in action
Association analyses rely on ancestral recombinations
The statistic in linkage analysis is the count of recombinants and non-recombinants
The statistical method for association analysis is statistical correlation
11. The domination of linkage analysis (1980s?)
The easy determination of restriction fragment length polymorphisms (RFLPs) made linkage analysis popular again
Linkage analysis helped to locate chromosomal regions for dozens of rare Mendelian diseases (in 1983, the first disease gene, for Huntington disease, was mapped)
Microsatellite markers: genetic markers that are even easier to type and more densely spaced
12. Association analysis was brought back to disease mapping (1990s). I. Family-based association
The most often criticized aspect of association analysis, its inability to deal with population stratification, was thought to be solved by the family-based design
Genotype-based haplotype relative risk (Falk and Rubinstein, 1987)
Haplotype-based haplotype relative risk (Terwilliger and Ott, 1992)
McNemar test (Terwilliger and Ott, 1992), Transmission disequilibrium test (TDT) (Spielman, McGinnis, Ewens, 1993)
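The TDT listed above has the form of a McNemar test on transmissions from heterozygous parents to affected children. A minimal sketch follows, assuming hypothetical counts (60 transmissions and 40 non-transmissions of the marker allele); the function name `tdt` and the numbers are illustrative and do not come from the slides.

```python
import math

def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """McNemar-type TDT statistic for one marker allele.

    transmitted:   times heterozygous parents passed the allele to an affected child
    untransmitted: times heterozygous parents did not pass it on
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, for illustration only
chi2, p_value = tdt(60, 40)
print(f"TDT chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Because only transmissions within families are compared, the untransmitted parental alleles act as ancestry-matched controls, which is why the design was thought to handle population stratification.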
13. Association analysis was brought back to disease mapping (1990s). II. Weaker signal in complex diseases
TDT is shown to be more powerful than the affected-sib identical-by-descent sharing method (a nonparametric linkage analysis) for complex diseases (diseases with lower genotypic relative risk)
N Risch, K Merikangas (1996), "The future of genetic studies of complex human diseases", Science, 273:1516-1517
24. Statistical significance of a correlation versus correlation strength
Statistical significance is usually measured by the p-value: the probability of observing a correlation at least as strong as the one seen if the true correlation is zero.
Correlation strength can be measured by many quantities: D, D', r²
Correlation strength between a marker and the disease status is usually measured by the odds ratio (OR)
The 95% confidence interval (CI) of the OR carries information on both strength and significance
When the sample size is increased, the p-value typically becomes more significant, whereas the OR usually stays about the same (but its 95% CI becomes narrower).
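The contrast between strength and significance can be checked numerically. The sketch below assumes a hypothetical 2x2 allele-count table and the same table with every count multiplied by ten. The Wald interval on log(OR) and the normal-approximation p-value are choices made for this illustration, not methods stated in the slides.

```python
import math

def odds_ratio_summary(a: int, b: int, c: int, d: int):
    """OR, 95% Wald CI, and two-sided p-value for a 2x2 table.

    a = cases with the allele,    b = cases without it
    c = controls with the allele, d = controls without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci_low = math.exp(log_or - 1.96 * se_log_or)
    ci_high = math.exp(log_or + 1.96 * se_log_or)
    z = log_or / se_log_or
    # Two-sided p-value under a standard normal approximation
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, ci_low, ci_high, p_value

# Hypothetical counts: same proportions, ten times the sample size
small = odds_ratio_summary(30, 70, 20, 80)
large = odds_ratio_summary(300, 700, 200, 800)

for label, (or_, lo, hi, p) in (("small", small), ("large", large)):
    print(f"{label}: OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f}), p = {p:.2g}")
```

In this example the OR is identical in both tables, the interval for the small table includes 1, and the interval for the large table does not.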
26. Main Issues in Association Analysis
The association is typically detected between a non-functional marker and the disease, rather than between the disease gene itself and the disease status (the disease gene plays only an indirect role in association analysis)
When the disease (case) group and the normal (control) group are both mixtures of subpopulations, but with different mixing proportions, even markers not associated with the disease will exhibit spurious association (heterogeneity)
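The second issue can be illustrated with a small calculation. The sketch assumes two hypothetical subpopulations, A and B, in which a marker allele has frequency 10% and 50% respectively and is unrelated to disease within each. Cases are drawn 80% from A and controls 80% from B. All counts are invented for illustration.

```python
def odds_ratio(case_with, case_without, ctrl_with, ctrl_without):
    """Allelic odds ratio from counts of chromosomes with/without the allele."""
    return (case_with * ctrl_without) / (case_without * ctrl_with)

# (chromosomes carrying the allele, chromosomes not carrying it)
# Subpopulation A: allele frequency 10% in both cases and controls
cases_A, controls_A = (80, 720), (20, 180)
# Subpopulation B: allele frequency 50% in both cases and controls
cases_B, controls_B = (100, 100), (400, 400)

# Within each subpopulation there is no association (OR = 1)
or_A = odds_ratio(*cases_A, *controls_A)
or_B = odds_ratio(*cases_B, *controls_B)

# Pooling ignores ancestry: cases are mostly A, controls are mostly B
pooled_cases = (cases_A[0] + cases_B[0], cases_A[1] + cases_B[1])
pooled_controls = (controls_A[0] + controls_B[0], controls_A[1] + controls_B[1])
or_pooled = odds_ratio(*pooled_cases, *pooled_controls)

print(f"OR in A = {or_A:.2f}, OR in B = {or_B:.2f}, pooled OR = {or_pooled:.2f}")
```

The pooled OR is far from 1 even though the marker has no effect in either subpopulation, so the apparent association reflects only the different mixing proportions.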
28. Solution to the first issue
Choose the marker (or haplotype, etc.) so that its allele (or haplotype, etc.) frequency matches that of the disease gene.
Whenever possible, type a marker that is itself functional (e.g. coding SNP, functional SNP, regulatory SNP)
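Why frequency matching matters can be seen from the standard two-locus linkage disequilibrium measures D, D' and r². The sketch below assumes hypothetical haplotype and allele frequencies for a disease allele A and a marker allele B. In both scenarios every A-bearing chromosome also carries B, so D' is 1, but r² is high only when the two allele frequencies match.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a:  frequency of allele A (e.g. the disease allele)
    p_b:  frequency of allele B (e.g. the marker allele)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies, for illustration only
matched = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.2)     # marker frequency equals disease allele frequency
mismatched = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.6)  # marker allele much more common

for label, (d, d_prime, r2) in (("matched", matched), ("mismatched", mismatched)):
    print(f"{label}: D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r2:.3f}")
```

With matched frequencies r² is 1. With the mismatched marker it falls to about 0.17, so a marker-based test would capture much less of the disease allele's signal.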
30. Well-known problem when case/control groups consist of two different subpopulations with different mixing proportions
Example: comparing people's height between two places: 1. a prison, and 2. a nursing school
In prison, maybe 80% are men
In nursing school, maybe 80% are women
Men are on average taller than women
So people in prison are, on average, taller than people in nursing school
But this difference is caused by the different mixing proportions, not by prison making people taller
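The arithmetic behind this example is a weighted average. The sketch below uses the 80%/20% proportions from the slide and assumes hypothetical mean heights of 175 cm for men and 162 cm for women. The heights are placeholders chosen only to make the calculation concrete.

```python
# Assumed mean heights in cm (hypothetical values, not from the slides)
MEAN_HEIGHT_MEN = 175.0
MEAN_HEIGHT_WOMEN = 162.0

def mean_height(fraction_men: float) -> float:
    """Average height of a group that is a mixture of men and women."""
    return fraction_men * MEAN_HEIGHT_MEN + (1 - fraction_men) * MEAN_HEIGHT_WOMEN

prison = mean_height(0.8)          # about 80% men
nursing_school = mean_height(0.2)  # about 80% women

print(f"prison: {prison:.1f} cm, nursing school: {nursing_school:.1f} cm, "
      f"difference: {prison - nursing_school:.1f} cm")
```

The place itself never enters the calculation: the whole difference comes from the `fraction_men` argument, which is the mixing proportion.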
31. Solution to the second issue
Try to use people from the same population in both the case and the control group.
Use neutral markers to test whether subpopulations exist
If possible, use an isolated population (the extra benefit is reduced heterogeneity in the case group)
Use a family-based association design (the disadvantages are that it is more costly and that parents of late-onset patients are hard to find)
32. Lee et al. Genes and Immunity (2005)
34. Criswell et al. Am J Hum Genetics (2005)