ANALYSIS OF SELECTIVE DNA POOLING DATA IN FOX
Joanna Szyda, Magdalena Zatoń-Dobrowolska, Heliodor Wierzbicki, Anna Rząsa
MAIN OBJECTIVES:
• ASSESS POLYMORPHISM OF MICROSATELLITES
• IDENTIFY MARKER-TRAIT ASSOCIATIONS
METHODOLOGICAL OBJECTIVES:
• TOOLS FOR THE ANALYSIS OF SPARSE DATA
SELECTIVE (INDIVIDUAL) GENOTYPING
[Figure: groups qq and QQ]
• MORE POWER
• STANDARD (LINEAR) MODELS NOT VALID
MATERIAL METHODS RESULTS CONCLUSIONS
SELECTIVE DNA POOLING
[Diagram: markers M1, M2, M3, M4 and a QTL; marker alleles in the two pools]
• Pool qq: m1 M1 m1 m1 | m2 m2 m2 m2 | m3 m3 m3 m3 | M4 M4 m4 m4
• Pool QQ: M1 m1 M1 M1 | M2 M2 M2 M2 | M3 M3 M3 M3 | m4 m4 M4 M4
SELECTIVE DNA POOLING
• CHEAP: ~18%-60% more efficient (Barratt et al. 02)
• MORE POWERFUL: ~10%-70% fewer individuals
• HIGH TECHNICAL ERROR
  • DNA pool formation (DNA quantification)
  • DNA amplification (differential amplification, shadow bands)
• POOLING POPULATIONS: no relationship information
  • testing for association
• POOLING HALFSIBS: partial relationship information
  • testing for linkage
POLAR FOX (Alopex lagopus)
• NORWEGIAN TYPE “LARGE”
• FINNISH TYPE “SMALL”
• 77, 63 ANIMALS
MARKERS
MARKERS
MARKER SELECTION CRITERIA:
• POLYMORPHISM
  • number of alleles
  • allele lengths
• AMPLIFICATION PROPERTIES
  • temperature
  • ?
MARKER ALLELE FREQUENCY IN POOLS
MARKER ALLELE FREQUENCY IN POOLS
• LOW POLYMORPHISM WITHIN EACH POOL
• “POOL-SPECIFIC” ALLELES
• POOR CORRESPONDENCE BETWEEN REPLICATES
BINOMIAL DISTRIBUTION
• Odds Ratio, Logistic Regression
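ODDS RATIO
• $\ln(OR) = \ln \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}$, with $n_{ij}$ the cell counts of the 2x2 table
• distribution: $\ln(OR) / \sqrt{\mathrm{var}\,\ln(OR)} \sim N(0,1)$ under $OR = 1$
• variance: $\mathrm{var}\,\ln(OR) = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$
• confidence intervals: $\ln(OR) \pm z_{\alpha/2} \sqrt{\mathrm{var}\,\ln(OR)}$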
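The odds-ratio quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration under the standard Wald formulas, not the authors' own code: the function name `log_odds_ratio_ci`, the example counts and the default level of 0.01 are assumptions made for the example.

```python
import math
from statistics import NormalDist


def log_odds_ratio_ci(n11, n12, n21, n22, alpha=0.01):
    """Return ln(OR), its standard error and a (1 - alpha) Wald interval.

    n11..n22 are the four cell counts of a 2x2 table (hypothetical layout:
    rows = pools, columns = allele of interest vs. all other alleles).
    """
    if min(n11, n12, n21, n22) <= 0:
        # With a zero cell ln(OR) or its variance is undefined.
        raise ValueError("all four cell counts must be positive")
    log_or = math.log((n11 * n22) / (n12 * n21))
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal quantile
    return log_or, se, (log_or - z * se, log_or + z * se)


# Made-up counts, for illustration only.
log_or, se, ci = log_odds_ratio_ci(30, 10, 12, 28)
print(f"ln(OR) = {log_or:.3f}, SE = {se:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

An interval that excludes 0 corresponds to an odds ratio that differs from 1 at the chosen level. The explicit check for zero cells is where this uncorrected version stops being usable.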
ODDS RATIO IN SPARSE DATA
• $\ln(OR) = \ln \frac{(n_{11}+c)(n_{22}+c)}{(n_{12}+c)(n_{21}+c)}$
• $\ln(OR) = \ln \frac{(n_{11}+c_{11})(n_{22}+c_{22})}{(n_{12}+c_{12})(n_{21}+c_{21})}$
SPARSE DATA PROBLEM
• c = 0 standard
• c = 0.5 Haldane (55)
• $c_{ij} = 2\, n_{i.}\, n_{.j} / n^2$ Bishop (75)
• Agresti (99):
  • c = 0.5 not valid for ln(OR) > 4
  • $c_{ij}$ not valid for ln(OR) > 8
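The two corrections listed above can be compared on a table with an empty cell. The sketch below assumes the constants are added to each cell before taking the log odds ratio. The function names and the example table are hypothetical.

```python
import math


def log_or_constant(table, c=0.5):
    """ln(OR) with the same constant c added to every cell (c = 0.5: Haldane)."""
    (n11, n12), (n21, n22) = table
    return math.log(((n11 + c) * (n22 + c)) / ((n12 + c) * (n21 + c)))


def bishop_constants(table):
    """Cell-specific constants c_ij = 2 * n_i. * n_.j / n^2."""
    (n11, n12), (n21, n22) = table
    n = n11 + n12 + n21 + n22
    rows = (n11 + n12, n21 + n22)  # row margins n_i.
    cols = (n11 + n21, n12 + n22)  # column margins n_.j
    return [[2 * rows[i] * cols[j] / n ** 2 for j in range(2)] for i in range(2)]


def log_or_bishop(table):
    """ln(OR) with the cell-specific constants c_ij added to the counts."""
    (n11, n12), (n21, n22) = table
    c = bishop_constants(table)
    return math.log(((n11 + c[0][0]) * (n22 + c[1][1]))
                    / ((n12 + c[0][1]) * (n21 + c[1][0])))


# Made-up sparse table with one empty cell.
table = ((12, 0), (3, 9))
print(log_or_constant(table), log_or_bishop(table))
```

The constants $c_{ij}$ always sum to 2, the same total that is added by c = 0.5 in each of the four cells, but they are distributed according to the table margins.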
ODDS RATIO: P-values
ODDS RATIO - CI
• 0.01 CI FOR “DISCORDANT” POOLS
• 0.01 CI FOR “CONCORDANT” POOLS
ODDS RATIO - REMARKS
• many 2x2 comparisons (theoretically) possible: from 18 (m4) to 60 (m1, m6)
• significance pattern often inconsistent between alleles: sparse data
• difficult to summarize ORs with a single value
FURTHER WORK
• use all table cells
• account for sparseness in testing
• multivariate logistic models
MULTINOMIAL DISTRIBUTION
• Multivariate Logistic Regression
MODEL
• GENERAL LOGISTIC MODEL
• CONSIDERED MODELS FOR ALLELE FREQUENCIES
[Figure labels: 1, 4, 8, 16]
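TEST STATISTIC
• MODEL SELECTION: DATA (observed frequencies) vs. MODEL (estimated frequencies)
• POWER DIVERGENCE FAMILY, Cressie, Read (1984): $D(\lambda) = \frac{2}{\lambda(\lambda+1)} \sum_i n_i \left[ \left( \frac{n_i}{\hat{m}_i} \right)^{\lambda} - 1 \right]$, with $n_i$ the observed and $\hat{m}_i$ the estimated frequencies
  • $\lambda = 1$: Pearson’s $X^2$
  • $\lambda \to 0$: Likelihood Ratio Test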
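The power divergence family can be illustrated with a short, self-contained function. This is a sketch under two stated assumptions: observed and estimated frequencies have the same total, and $\lambda > -1$ so that empty cells contribute nothing. The function name and the example frequencies are made up for the illustration.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence D(lambda) for lambda > -1.

    lam = 1 gives Pearson's X^2 (when both vectors have the same total);
    lam = 0 is taken as the limit, i.e. the likelihood ratio statistic G^2.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if lam <= -1:
        raise ValueError("this sketch only covers lambda > -1")
    # Empty observed cells contribute 0 for every lambda > -1, so skip them.
    pairs = [(o, e) for o, e in zip(observed, expected) if o > 0]
    if lam == 0:
        return 2 * sum(o * math.log(o / e) for o, e in pairs)
    return 2 / (lam * (lam + 1)) * sum(o * ((o / e) ** lam - 1) for o, e in pairs)


# Made-up frequencies with one empty cell; both vectors sum to 40.
observed = [10, 0, 5, 25]
expected = [8.0, 2.0, 10.0, 20.0]
pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(power_divergence(observed, expected, 1), pearson)
print(power_divergence(observed, expected, 0))
```

In this example D(1) coincides with Pearson's statistic computed directly, which is a quick sanity check on the implementation.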
TEST STATISTIC
• SPARSE DATA: INCREASING-CELLS ASYMPTOTICS
• NORMALISATION: $(D - \mu_D) / \sigma_D \sim N(0,1)$
TEST STATISTIC
$\mu_D$ ? $\sigma_D$ ?
• ANALYTICAL
  • Osius, Rojek (1989): $D(\lambda=1)$
  • Farrington (1996): $D(\lambda=1)$+D
  • Copas (1989): a*$D(\lambda=1)$
• EMPIRICAL: Bootstrap, Jackknife
• EVALUATION OF REAL DATA
• NORMAL PROPERTIES: simulation
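The empirical route can be sketched as a parametric bootstrap: simulate tables from the fitted cell probabilities, compute D(λ=1) for each, and use the simulated mean and standard deviation to normalise the observed statistic. This is only one possible reading of "Bootstrap" on this slide. The function names, the fixed (not refitted) probabilities and the example data are assumptions.

```python
import random
from collections import Counter
from statistics import mean, stdev


def pearson_d(observed, expected):
    """D(lambda = 1), i.e. Pearson's X^2."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def bootstrap_normalised_d(observed, probs, n_boot=2000, seed=1):
    """Normalise D by a simulated mean and standard deviation.

    Assumption: `probs` are the fitted cell probabilities and are kept fixed
    in every replicate (the model is not refitted to the simulated tables).
    """
    rng = random.Random(seed)
    n = sum(observed)
    expected = [n * p for p in probs]
    cells = list(range(len(probs)))
    simulated = []
    for _ in range(n_boot):
        draw = Counter(rng.choices(cells, weights=probs, k=n))
        simulated.append(pearson_d([draw[i] for i in cells], expected))
    mu_d, sigma_d = mean(simulated), stdev(simulated)
    z = (pearson_d(observed, expected) - mu_d) / sigma_d
    return z, mu_d, sigma_d


# Made-up sparse example: 40 alleles over 4 cells, two of them rare.
z, mu_d, sigma_d = bootstrap_normalised_d([30, 4, 6, 0], [0.7, 0.2, 0.05, 0.05])
print(f"mu_D = {mu_d:.2f}, sigma_D = {sigma_d:.2f}, normalised D = {z:.2f}")
```

With fixed probabilities the simulated mean should be close to the number of cells minus one. Whether the normalised statistic is close enough to N(0,1) for a given degree of sparseness is what the simulation study on the slide is meant to assess.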
LITERATURE
• Agresti, A. (1990) Categorical data analysis. New York, Chichester, Brisbane, Toronto, Singapore: John Wiley & Sons.
• Agresti, A. (1999) On logit confidence intervals for the odds ratio with small samples. Biometrics 55:597-602.
• Barratt, B.J., Payne, F., Rance, H.E., Nutland, S., Todd, J.A., Clayton, D.G. (2002) Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design. Annals of Human Genetics 66:393-405.
• Bishop, Y.M.M., Fienberg, S.E., Holland, P. (1975) Discrete multivariate analysis. Cambridge, Massachusetts: MIT Press.
• Copas, J.B. (1989) Unweighted sum of squares test for proportions. Applied Statistics 38:71-80.
• Cressie, N.A.C., Read, T.R.C. (1984) Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society Ser. B 46:440-464.
• Farrington, C.P. (1996) On assessing goodness of fit of generalized linear models to sparse data. Journal of the Royal Statistical Society Ser. B 58:349-360.
• Haldane, J.B.S. (1956) The estimation and significance of the logarithm of a ratio of frequencies. Annals of Human Genetics 20:309-311.
• Osius, G., Rojek, D. (1989) Normal goodness-of-fit tests for parametric multinomial models with large degrees of freedom. Fachbereich Mathematik/Informatik, Universität Bremen. Mathematik Arbeitspapiere 36: