Lecture 13: Tues., Feb. 24. Comparisons Among Several Groups – Introduction (Case Study 5.1.1) Comparing Any Two of the Several Means (Chapter 5.2) The One-Way Analysis of Variance F-test (Chapter 5.3) Robustness to Assumptions (5.5.1) Thursday: Multiple Comparisons (6.3-6.4).
Comparing Several Groups • Chapter 5 and 6: Compare the means of I groups (I>=2). Examples: • Compare the effect of three different teaching methods on test scores. • Compare the effect of four different therapies on how long a cancer patient lives. • Compare the effect of using different amounts of fertilizer on the yield of a crop. • Compare the amount of time that ten different tire brands last. • As in Ch. 1-4, studies can either seek to compare treatments (causal inferences) or population means
Case Study 5.1.1 • Female mice randomly assigned to one of six treatment groups • NP: Mice in this group ate as much as they pleased of nonpurified, standard diet • N/N85: Fed normally both before and after weaning. After weaning, ration controlled at 85 kcal/wk • N/R50: Fed normal diet before weaning and reduced calorie diet of 50 kcal/wk after weaning • R/R50: Fed reduced calorie diet of 50 kcal/wk both before and after weaning • N/R50 lopro: Fed normal diet before weaning, a restricted diet of 50 kcal/wk after weaning and dietary protein content decreased with advancing age • N/R40: Fed normally before weaning and given severely reduced diet of 40 kcal/wk after weaning.
Questions of Interest • Specific comparisons of treatments, see Display 5.3 (section 5.2) • Are all of the treatments the same? (F-test, Section 5.3). • Multiple comparisons (Chapter 6) • Terminology for several group problem: one-way classification problem, one-way layout • Setup in JMP: One column for response (e.g., lifetime), a second column for group label.
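The two-column layout described above (one response column, one group-label column) can be sketched outside JMP as well. This is a minimal Python illustration; the rows are made-up numbers, not the case-study data, and `split_by_group` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical long-format rows: (group label, response).
# Values are invented for illustration and are NOT the case-study data.
rows = [
    ("NP", 27.0), ("NP", 29.5),
    ("N/N85", 33.0), ("N/N85", 31.5), ("N/N85", 34.0),
    ("N/R50", 42.0), ("N/R50", 44.5),
]

def split_by_group(rows):
    """Collect the responses into one list per group label."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return dict(groups)

groups = split_by_group(rows)
```

Keeping the data in this long format means the same two columns serve for any number of groups I.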
Ideal Model for Several Samples Ideal model: • The populations 1, 2, …, I have normal distributions with means $\mu_1, \mu_2, \ldots, \mu_I$ • Each population has the same standard deviation $\sigma$ • Observations within each sample are independent • Observations in any one sample are independent of observations in other samples • Sample sizes $n_1, n_2, \ldots, n_I$. Total sample size $n = n_1 + n_2 + \cdots + n_I$
Randomized Experiments • The terminology of samples from multiple populations is used, but the methods also apply to data from randomized experiments in which a response of $Y_1$ on treatment 1 would produce a response of $Y_1 + \delta_2$ on treatment 2 and $Y_1 + \delta_3$ on treatment 3, etc. • Can think of $\delta_2$ as equivalent to $\mu_2 - \mu_1$ and $\delta_3 - \delta_2$ as equivalent to $\mu_3 - \mu_2$ (additive treatment effect of treatment 3 compared to treatment 2) • Phrase concluding statements in terms of treatment effects or population means depending on type of study.
Comparing any two of several means • Compare the mean lifetime of mice on the N/R50 diet to that of mice on the N/N85 diet (i.e., what is the additive treatment effect of the N/R50 diet relative to the N/N85 diet?) • What’s different from the two-group problem? We have additional information about the variability in the populations from the additional groups. • We use this information in constructing a more accurate estimate of the population variance.
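Comparing any two means • Comparison of $\mu_i$ and $\mu_j$ • Use the usual t-test but estimate $\sigma$ with the pooled standard deviation $s_p$, where $s_p^2 = \sum_{k=1}^{I} (n_k - 1) s_k^2 / (n - I)$ is a weighted average of the sample variances in all groups; use df = n − I. • 95% CI for $\mu_i - \mu_j$: $(\bar{Y}_i - \bar{Y}_j) \pm t_{n-I}(0.975)\, s_p \sqrt{1/n_i + 1/n_j}$ (Note: the multiplier for the degree of confidence equals $t_{n-I}(0.975)$, not $t_{n_i+n_j-2}(0.975)$, the multiplier if there were only two groups) • See handout for implementation in JMP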
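The pooled-SD comparison can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the JMP procedure from the handout: the data are made up, the function names `pooled_sd` and `diff_ci` are hypothetical, and the t multiplier is passed in by hand from a t table rather than computed.

```python
import math
from statistics import mean, variance

def pooled_sd(groups):
    """Pooled SD: square root of the (n_k - 1)-weighted average of sample variances."""
    n = sum(len(g) for g in groups)
    df = n - len(groups)  # n - I degrees of freedom
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return math.sqrt(ss_within / df), df

def diff_ci(g_i, g_j, s_p, t_mult):
    """Interval for mu_i - mu_j using the pooled SD and a supplied t multiplier."""
    est = mean(g_i) - mean(g_j)
    se = s_p * math.sqrt(1 / len(g_i) + 1 / len(g_j))
    return est - t_mult * se, est + t_mult * se

# Made-up data for three groups (NOT the case-study data).
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
c = [5.0, 6.0, 7.0]

s_p, df = pooled_sd([a, b, c])  # all three groups contribute, so df = 9 - 3 = 6
# 2.447 is the tabled value of t_6(0.975); it is supplied by hand here.
low, high = diff_ci(b, a, s_p, t_mult=2.447)
```

Note that group `c` enters only through `s_p` and the degrees of freedom, which is exactly the extra information the several-group setting provides.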
Note about CIs and hyp. tests • Suppose we form a 95% confidence interval for a parameter, e.g., $\mu_i - \mu_j$ • The 95% confidence interval will contain 0 if and only if the two-sided test that the parameter equals 0 (e.g., $H_0: \mu_i - \mu_j = 0$ vs. $H_A: \mu_i - \mu_j \neq 0$) has p-value >= 0.05. • In other words, the test will give a “statistically significant” result at the 0.05 level only if the confidence interval does not contain 0.
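The equivalence can be written out in one line. In this sketch, $\hat{d}$ denotes the estimate of the parameter, $\mathrm{SE}$ its standard error, and $t^{*}$ the 0.975 quantile of the relevant t distribution; these symbols are introduced here for illustration and are not from the slides.

```latex
0 \in \left[\hat{d} - t^{*}\,\mathrm{SE},\; \hat{d} + t^{*}\,\mathrm{SE}\right]
\iff \left|\frac{\hat{d}}{\mathrm{SE}}\right| \le t^{*}
\iff \text{two-sided } p\text{-value} \ge 0.05
```

The middle expression is the absolute value of the t statistic, which is why the interval and the test always agree.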
One-Way ANOVA F-test • Basic Question: Is there any difference between any of the means? • $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ • $H_A$: At least two of the means $\mu_i$ and $\mu_j$ are not equal • Could do t-tests of all pairs of means but this has difficulties (Chapter 6 – multiple comparisons) and is not the best test. • Test statistic: Analysis of Variance F-test.
ANOVA F-test in JMP • Convincing evidence that the means (treatment effects) are not all the same
Rationale behind the test statistic • If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean). • If the alternative hypothesis is true, at least some of the sample means would differ. • Thus, we measure variability between sample means. • Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means. • Therefore, even though sample means may markedly differ from one another, variability between sample means must be judged relative to the “within samples variability”.
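F Test Statistic • Notation: $Y_{ij}$ = jth observation in ith group, $\bar{Y}_i$ = sample mean of ith group, $\bar{Y}$ = grand mean (sample mean of all observations) • F-test statistic: $F = \dfrac{\sum_{i=1}^{I} n_i (\bar{Y}_i - \bar{Y})^2 / (I-1)}{\sum_{i=1}^{I} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 / (n-I)}$ • Test statistic is essentially (Variability of the sample means)/(Variability within samples). • Large values of F are implausible under $H_0$. • F statistic follows the F(I-1, n-I) distribution under $H_0$. Reject $H_0$ at level $\alpha$ if F exceeds the upper-$\alpha$ critical value of the F(I-1, n-I) distribution [See Table A.4]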
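The F statistic can be computed directly from its definition. The following is a minimal Python sketch with made-up data (not the case-study data); `anova_f` is a hypothetical function name, and the critical value or p-value would still come from an F table or statistical software.

```python
from statistics import mean

def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way layout."""
    n = sum(len(g) for g in groups)      # total sample size
    num_groups = len(groups)             # I in the slides
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: n_i * (group mean - grand mean)^2
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: deviations from each group's own mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ms_between = ss_between / (num_groups - 1)
    ms_within = ss_within / (n - num_groups)
    return ms_between / ms_within, num_groups - 1, n - num_groups

# Made-up data for three groups (NOT the case-study data).
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
c = [5.0, 6.0, 7.0]

f_stat, df1, df2 = anova_f([a, b, c])
```

For these invented numbers the between-group mean square is 12 and the within-group mean square is 2, so F = 6 on (2, 6) degrees of freedom.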
ANOVA F-test in JMP • F=57.1043, p-value <0.0001 • Convincing evidence that the means (treatment effects) are not all the same
Robustness to Assumptions • Robustness of t-tests and F-tests for comparing several groups is similar to robustness for the two-group problem. • Normality is not critical. Extremely long-tailed or skewed distributions only cause problems if the sample size in a group is <30 • The assumptions of independence within and across groups are critical. • The assumption of equal standard deviations in the populations is crucial. Rule of thumb: Check if the largest sample standard deviation divided by the smallest sample standard deviation is <2 • Tools are not resistant to severely outlying observations. Use outlier examination strategy in Display 3.6
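The rule of thumb for equal standard deviations is easy to check mechanically. This is a small Python sketch with made-up data (not the case-study data); `sd_ratio` is a hypothetical helper, and it assumes every group has at least two observations and a nonzero sample SD.

```python
from statistics import stdev

def sd_ratio(groups):
    """Largest sample SD divided by smallest sample SD across groups."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

# Made-up data for three groups (NOT the case-study data).
a = [1.0, 2.0, 3.0]   # sample SD 1.0
b = [2.0, 3.5, 5.0]   # sample SD 1.5
c = [5.0, 6.0, 7.0]   # sample SD 1.0

ratio = sd_ratio([a, b, c])
passes_rule_of_thumb = ratio < 2
```

A ratio below 2 only suggests the equal-SD assumption is not badly violated; it does not replace looking at the data for outliers.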