

  1. BIOL 582 Lecture Set 7 Tests for multiple groups: Part 3 One-Way ANOVA Assumptions, One-Way ANOVA Evaluation, Multiple Comparisons Tests, Data Transformations, Contingency Tables

  2. BIOL 582 Assumptions in ANOVA • A One-Way (Single-Factor) ANOVA is appropriate for • Samples collected from multiple populations (observational study) • A completely randomized design (experimental design) – experimental subjects are randomly assigned to treatments • Assumptions include • Normally distributed residuals (not data) • Homoscedasticity • Independent observations (i.e., samples do not contain multiple measurements on the same subjects; different samples or treatments do not contain the same subjects) • These are the assumptions of linear models! • For visual assistance, the testing of assumptions with the pupfish.parasite.csv data is presented in the next few slides. (It is assumed that the data were loaded and a variable, "GROUP", was created for all population-sex classifications.)

  3. BIOL 582 Assumptions in ANOVA: Testing normality • Histograms, Normal Probability plots (fat pencil test), Normality Tests > lm.group<-lm(GRUBS~GROUP) > residuals<-resid(lm.group) > hist(residuals,col='light blue',xlab="Residual") > qqnorm(residuals) Conclusion: residuals positively skewed

  4. BIOL 582 Assumptions in ANOVA: Testing normality • Histograms, Normal Probability plots (fat pencil test), Normality Tests > a<-qqnorm(residuals) > normal<-a$x # standard normal quantiles > observed<-(a$y-mean(a$y))/sd(a$y) # standard deviates of observed > plot(normal,observed,xlab="Theoretical Quantiles",ylab="Observed Quantiles") This is the same plot as before, but the observed values have been converted to standard normal deviates. If the distribution is symmetric, the scale on the axis should fall between -3 and 3. A "lopsided" scale is indicative of skewing. Conclusion: residuals positively skewed

  5. BIOL 582 Assumptions in ANOVA: Testing normality • Histograms, Normal Probability plots (fat pencil test), Normality Tests > ks.test(normal,observed) Two-sample Kolmogorov-Smirnov test data: normal and observed D = 0.165, p-value = 0.1209 alternative hypothesis: two-sided Warning message: In ks.test(normal, observed) : cannot compute correct p-values with ties > shapiro.test(observed) Shapiro-Wilk normality test data: observed W = 0.8292, p-value = 1.472e-09 One has to be cautious with normality tests. Sample size affects results: small samples tend to yield high P-values (low power), and large samples small P-values (oversensitivity). A single outlier can have a huge impact. There are situations when one type of test will be better than another. It is probably easiest to use the Shapiro-Wilk Test (specifically for normality), not worry about the formula, and evaluate its results cautiously. If a distribution is symmetric but not normal, it is usually OK. Note! One does not have to find standard deviates! > shapiro.test(residuals) Shapiro-Wilk normality test data: residuals W = 0.8292, p-value = 1.472e-09 Conclusion: residuals not normal?

  6. BIOL 582 Assumptions in ANOVA: Testing homoscedasticity • Plot standardized residuals, ANOVA on residuals > par(mfcol=c(1,2)) > plot(GROUP, observed, xlab="Population-SEX", ylab="Standardized Residuals") > plot(predict(lm.group), observed, xlab="Predicted Values", ylab="Standardized Residuals") Groups should have a mean of 0, plus symmetric and equally spread distributions. Conclusion: Unequal variances (heteroscedasticity)

  7. BIOL 582 Assumptions in ANOVA: Testing homoscedasticity • Plot standardized residuals, ANOVA on residuals > lm.residuals<-lm(abs(residuals)~GROUP) # use absolute value of residuals > anova(lm.residuals) Analysis of Variance Table Response: abs(residuals) Df Sum Sq Mean Sq F value Pr(>F) GROUP 3 230367 76789 6.6675 0.0003799 *** Residuals 99 1140179 11517 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 # If the means of absolute values of residuals are different among groups, then it means that at least one group has values more extreme than another P-value is less than alpha = 0.05; reject null hypothesis of equal variances Conclusion: Unequal variances (heteroscedasticity)

  8. BIOL 582 Data Transformation • Linear models and ANOVA are ignorant about the data, but the biologist might know better how they behave. • The intent with data transformations is to find a new variable which is a function of the desired variable and which is suitable for linear models (i.e., meets assumptions). • Let y be a variable and z = f(y) be a function of y • If z is a variable that can be used with a linear model, then it is a suitable transformation of y, if and only if, • y = f⁻¹(z) is solvable

  9. BIOL 582 Data Transformation • For example, if y is a variable that is not suitable for ANOVA, but • z = log(y) • and z is a variable that is suitable, then z is a log-transformation of y. • More importantly, one can perform ANOVA on z because • y = e^z is solvable
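The idea above can be sketched in R; this is a minimal illustration with made-up skewed data, not part of the pupfish example:

```r
# Minimal sketch: a log-transformation and its inverse (made-up data)
set.seed(1)
y <- rlnorm(20, meanlog = 2, sdlog = 1)  # positively skewed variable
z <- log(y)                              # transformed variable suitable for a linear model
y.back <- exp(z)                         # the inverse is solvable: y = e^z recovers y
all.equal(y, y.back)                     # TRUE
```

Because the inverse exists, conclusions about z can be translated back to statements about y (on the original scale).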

  10. BIOL 582 Data Transformation • Here is a list of common data transformations used on continuous data, plus an explanation for their use.
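The slide's table of transformations is not reproduced in this transcript. As a hedged sketch, the transformations commonly listed in such tables (not necessarily the slide's exact list) include the log for right-skewed data, the square root for Poisson-like counts, and the arcsine-square-root for proportions:

```r
# Common transformations for continuous/count data (standard choices,
# not the slide's exact table), applied to made-up values
y <- c(0, 1, 4, 9, 25, 100)
log.y  <- log(y + 1)        # log: right-skewed data; +1 handles zeros
sqrt.y <- sqrt(y + 0.5)     # square root: count (Poisson-like) data
p <- c(0.05, 0.25, 0.50, 0.95)
asin.p <- asin(sqrt(p))     # arcsine-square-root: proportions in [0, 1]
```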

  11. BIOL 582 Example log-transformation • Using the pupfish data > log.grubs<-log(GRUBS+1) # the +1 eliminates problems, in most cases, caused by 0s in the data, since log(0) is undefined > lm.new<-lm(log.grubs~GROUP) > residuals<-resid(lm.new) > qqnorm(residuals) > shapiro.test(residuals) Shapiro-Wilk normality test data: residuals W = 0.9869, p-value = 0.4061

  12. BIOL 582 Example log-transformation • Using the pupfish data > st.residuals<-(residuals-mean(residuals))/sd(residuals) > par(mfcol=c(1,2)) > plot(GROUP, st.residuals,xlab="Population-SEX",ylab="Standardized Residuals") > plot(predict(lm.new), st.residuals,xlab="Predicted Values",ylab="Standardized Residuals")

  13. BIOL 582 Example log-transformation • Using the pupfish data > lm.residuals<-lm(abs(residuals)~GROUP) # use absolute value of residuals > anova(lm.residuals) Analysis of Variance Table Response: abs(residuals) Df Sum Sq Mean Sq F value Pr(>F) GROUP 3 16.500 5.5001 12.913 3.457e-07 *** Residuals 99 42.167 0.4259 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 Conclusion: The log-transformation "fixed" the normality issue, but did not fix the equal-variance issue (nor will it ever). However, standardized residuals were within -3 and +3 for all four groups. Although the assumption of homoscedasticity is violated, the violation is not egregious. It is probably OK to use F distributions (but one could use a randomization procedure instead).

  14. BIOL 582 Evaluations in ANOVA • One of the most important questions when using linear models and ANOVA is, "How much of the variation is explained by group differences?" • This is easy. Since SST = SSB + SSE, the ratio of SSB to SST, called the coefficient of determination (R²), expresses how important group differences are. Or, one can calculate 1 – SSE/SST, which is a better definition for more complex models (next lecture)
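The identity can be verified in base R; the built-in iris data are used here purely for illustration (not the pupfish data):

```r
# Sketch: R^2 as SSB/SST and as 1 - SSE/SST, using the built-in iris data
fit <- lm(Sepal.Length ~ Species, data = iris)
SSE <- sum(resid(fit)^2)                                     # within-group (error) SS
SST <- sum((iris$Sepal.Length - mean(iris$Sepal.Length))^2)  # total SS
SSB <- SST - SSE                                             # between-group SS
R2.a <- SSB / SST
R2.b <- 1 - SSE / SST
all.equal(R2.a, R2.b)                           # TRUE: the two definitions agree
all.equal(R2.a, summary(fit)$r.squared)         # TRUE: matches lm's reported R-squared
```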

  15. BIOL 582 Evaluations in ANOVA • One of the most important questions when using linear models and ANOVA is, "How much of the variation is explained by group differences?" • Example True: Group differences explained 12.9 percent of the overall variation in the log of grubs. False: Group differences explained 12.9 percent of the overall variation in grubs. > summary(lm.group) Call: lm(formula = log.grubs ~ GROUP) Residuals: Min 1Q Median 3Q Max -2.8590 -0.8546 0.1149 0.7944 2.8303 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 4.1649 0.2575 16.176 < 2e-16 *** GROUP1.M 0.3154 0.3641 0.866 0.38852 GROUP2.F -0.2073 0.3289 -0.630 0.52995 GROUP2.M 1.1630 0.3999 2.909 0.00448 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 Residual standard error: 1.261 on 99 degrees of freedom Multiple R-squared: 0.129, Adjusted R-squared: 0.1026 F-statistic: 4.889 on 3 and 99 DF, p-value: 0.003273 > # Brute Force (SSE() is not a base R function; assume it returns a model's residual sum of squares) > lm.null<-lm(log.grubs~1) > lm.group<-lm(log.grubs~GROUP) > > SSE.null<-SSE(lm.null) > SSE.group<-SSE(lm.group) > SSB<-SSE.null-SSE.group > SST<-SSB + SSE.group > R2<-1-SSE.group/SST > > R2 [,1] [1,] 0.1290355 > Do not worry about this yet

  16. BIOL 582 Multiple Comparisons Test • What if we want to know why SSB is significantly greater than 0? Can we do t-tests among groups? • Yes, but they must be constrained to maintain an experiment-wise (family-wise) type I error rate. • Tukey's range test (aka Tukey's Honest Significant Difference Test) • Based on the studentized range (a modified t-distribution) • Finds confidence intervals for pairwise differences • If a CI contains 0, do not reject the null hypothesis that the two means compared are equal

  17. BIOL 582 Multiple Comparisons Test • Example: Tukey’s range test > aov.group<-aov(log.grubs~GROUP) > TukeyHSD(aov.group) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = log.grubs ~ GROUP) $GROUP diff lwr upr p adj 1.M-1.F 0.3153772 -0.6361571 1.2669114 0.8222764 2.F-1.F -0.2072937 -1.0667296 0.6521422 0.9220590 2.M-1.F 1.1630006 0.1180954 2.2079059 0.0228731 2.F-1.M -0.5226708 -1.3821067 0.3367651 0.3894327 2.M-1.M 0.8476235 -0.1972817 1.8925287 0.1539578 2.M-2.F 1.3702943 0.4085045 2.3320841 0.0018319

  18. BIOL 582 Multiple Comparisons Test • Example: Tukey's range test > aov.group<-aov(log.grubs~GROUP) > TukeyHSD(aov.group) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = log.grubs ~ GROUP) $GROUP diff lwr upr p adj 1.M-1.F 0.3153772 -0.6361571 1.2669114 0.8222764 2.F-1.F -0.2072937 -1.0667296 0.6521422 0.9220590 2.M-1.F 1.1630006 0.1180954 2.2079059 0.0228731 2.F-1.M -0.5226708 -1.3821067 0.3367651 0.3894327 2.M-1.M 0.8476235 -0.1972817 1.8925287 0.1539578 2.M-2.F 1.3702943 0.4085045 2.3320841 0.0018319 Something important to realize: parametric multiple comparisons tests, like Tukey's, are used after a parametric ANOVA is performed and significant SSB has been found. One cannot use them with a randomization test. To perform multiple comparisons with a randomization test, one simply needs to compare individual group means in every random permutation (see next slide).

  19. BIOL 582 Multiple Comparisons Test • Example: Randomization approach > lm.group<-lm(log.grubs~GROUP) > ls<-data.frame(GROUP=levels(GROUP)) # a list of group means to obtain > group.means<-predict(lm.group,ls) > names(group.means)<-levels(GROUP) > > group.means 1.F 1.M 2.F 2.M 4.164853 4.480231 3.957560 5.327854 > > # create a pair-wise mean-difference matrix (actually absolute values) > diff<-dist(group.means) > diff 1.F 1.M 2.F 1.M 0.3153772 2.F 0.2072937 0.5226708 2.M 1.1630006 0.8476235 1.3702943

  20. BIOL 582 Multiple Comparisons Test • Example: Randomization approach > diff 1.F 1.M 2.F 1.M 0.3153772 2.F 0.2072937 0.5226708 2.M 1.1630006 0.8476235 1.3702943 > diff<-as.matrix(diff) > P<-matrix(1,nrow(diff),ncol(diff)) > > permute<-999 > > for(i in 1:permute){ + y<-sample(log.grubs) + lm.r<-lm(y~GROUP) + m.r<-predict(lm.r,ls) + diff.r<-as.matrix(dist(m.r)) + P<-ifelse(diff.r>=diff,P+1,P+0) + } > > P.values<-P/(permute+1) > P.values<-as.dist(P.values) > diff<-as.dist(diff) > > P.values > # P.values are in the same cells as mean differences 1 2 3 2 0.406 3 0.549 0.121 4 0.013 0.047 0.001 > > # Notice that P.values are different > # they are not constrained > # one might want to do a Bonferroni correction of P-values A (sequential) Bonferroni correction is to rank P-values from smallest to largest and compare each to α/rank. E.g., 0.001 compared to 0.05/1: reject; 0.013 compared to 0.05/2: reject; 0.047 compared to 0.05/3: do not reject.
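Base R also provides p.adjust() for standard corrections. A sketch using the six randomization P-values above; note that p.adjust's "holm" method (usually what "sequential Bonferroni" refers to in the literature) compares the rank-i smallest P-value to α/(m − i + 1), a more conservative divisor sequence than the α/rank scheme on the slide:

```r
# Sketch: adjusting the six pairwise randomization P-values with base R's p.adjust()
p <- c(0.406, 0.549, 0.121, 0.013, 0.047, 0.001)
p.adjust(p, method = "bonferroni")  # simple Bonferroni: min(1, 6 * p)
p.adjust(p, method = "holm")        # Holm's sequential (step-down) Bonferroni
```

Reporting adjusted P-values (compared directly to α) is equivalent to adjusting α itself, and is often easier to present.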

  21. BIOL 582 Going Forward • Henceforth, models will become more complex, but the procedure will be the same: • Linear model design and set-up • Experimental Design and Observational Studies / Use • Assumptions • Evaluation • Multiple Comparisons • Example

  22. BIOL 582 Categorical response data • Until now, we have considered only continuous data for testing hypotheses that involve comparison of groups • I.e., continuous response ~ factor • What about categorical response ~ factor ? • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) H0: Human hair color is independent of sex HA: Human hair color is not independent of sex I.e., are the distributions of hair colors consistent between males and females?

  23. BIOL 582 Categorical response data • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) • Create a table of expected frequencies if the null hypothesis were true • The totals should still be the same. What values go into the cells?

  24. BIOL 582 Categorical response data • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) • Create a table of expected frequencies if the null hypothesis were true • The totals should still be the same. What values go into the cells? • Ri*Ci/T, where R, C, and T represent row, column, and total sums, respectively

  25. BIOL 582 Categorical response data • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) • Create a table of expected frequencies if the null hypothesis were true • The totals should still be the same. What values go into the cells? • Ri*Ci/T, where R, C, and T represent row, column, and total sums, respectively

  26. BIOL 582 Categorical response data • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) • Create a table of expected frequencies if the null hypothesis were true • The totals should still be the same. What values go into the cells? • Ri*Ci/T, where R, C, and T represent row, column, and total sums, respectively
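The Ri*Ci/T calculation can be sketched in R. The counts below are hypothetical (the original Zar 1984 hair-color table is not reproduced in this transcript); the check against chisq.test()'s internally computed expected counts confirms the formula:

```r
# Sketch: expected frequencies under independence, E[i,j] = R_i * C_j / T
# (hypothetical 2 x 4 counts, NOT the actual Zar 1984 data)
obs <- matrix(c(32, 43, 16,  9,
                55, 65, 64, 16), nrow = 2, byrow = TRUE,
              dimnames = list(Sex  = c("Male", "Female"),
                              Hair = c("Black", "Brown", "Blond", "Red")))
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)   # R_i * C_j / T for every cell
all.equal(expected, chisq.test(obs)$expected, check.attributes = FALSE)  # TRUE
```

Note that the row and column totals of the expected table match those of the observed table, as the slide states.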

  27. BIOL 582 Categorical response data • Contingency Table (aka Test of Independence) via example (Data from Zar 1984) • Observed • Expected • Test statistic: χ² = Σ (O − E)² / E, summed over all cells, where O and E are the observed and expected frequencies • For this example, find the percentile of the test statistic in a χ² distribution with (r-1)(c-1) df to get a P-value. r = number of rows = 2 c = number of columns = 4 Thus df = 3. The χ² test stat of 8.987 yields a P-value of 0.029 in this case, which would reject the null hypothesis at α = 0.05.
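The slide's P-value can be checked directly in base R:

```r
# Sketch: P-value for the slide's chi-square statistic (8.987, df = 3)
chi.sq <- 8.987
df <- (2 - 1) * (4 - 1)                            # (r - 1)(c - 1) = 3
p.value <- pchisq(chi.sq, df = df, lower.tail = FALSE)
round(p.value, 3)                                  # 0.029: reject H0 at alpha = 0.05
```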

  28. BIOL 582 Categorical response data • Contingency Table Assumptions and Rules • The data are a random sample from a population or from subjects randomly assigned to experimental treatments • Sufficient sample size (usually 5 or more expected in each cell). The chi-square statistic is "shaky" when cell counts are low. However, 0 might be an appropriate response. If low cell counts are a problem, exact or randomization tests are probably advisable (e.g., Fisher's Exact Test). • Independent observations (i.e., values within cells are not repeatedly measured; values between cells are not collected from the same subjects) • Contingency Table analysis in R • See http://www.statmethods.net/stats/frequencies.html • Involves setting up the table and then analyzing accordingly. • But easy to do "by hand"
