
ANALYSIS OF VARIANCE (ANOVA)

ANALYSIS OF VARIANCE (ANOVA). Heibatollah Baghi and Mastee Badii. Purpose of ANOVA: use one-way analysis of variance to test whether the mean of a variable (the dependent variable) differs among three or more groups.


Presentation Transcript


  1. ANALYSIS OF VARIANCE (ANOVA) Heibatollah Baghi and Mastee Badii

  2. Purpose of ANOVA • Use one-way analysis of variance to test whether the mean of a variable (the dependent variable) differs among three or more groups • For example, test whether systolic blood pressure differs among a control group and two treatment groups

  3. Continued Purpose of ANOVA • One-way ANOVA compares three or more groups defined by a single factor. • For example, you might compare a control group with a drug-treatment group and a drug-treatment-plus-antagonist group, or compare a control group with five different treatments. • Some experiments involve more than one factor. These data need to be analyzed by two-way ANOVA or factorial ANOVA. • For example, you might compare the effects of three different drugs administered at two times. There are two factors in that experiment: drug treatment and time.

  4. Why not do repeated t-tests? • Rather than using one-way ANOVA, you might be tempted to use a series of t-tests, comparing two groups each time. Don’t do it. • Repeated t-tests increase the chance of a type I error (the multiple comparison problem) • Comparing 5 groups requires 10 pairwise comparisons of means • When the null hypothesis is true, the probability that at least 1 of the 10 observed significance levels is less than 0.05 is about 0.29

  5. Continued Why not do repeated t-tests? • With 10 means (45 comparisons), the probability of finding at least one significant difference is about 0.63 • In other words, when the level of significance is .05, there is a 1 in 20 chance that a single t-test will yield a significant result even when the null hypothesis is true. • The more t-tests you run, the higher the probability of at least one false positive
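
The growth in comparisons and in error probability can be sketched with a few lines of Python. This is only a rough illustration, not the calculation behind the slide's figures: the function names are hypothetical, and the error formula 1 − (1 − α)^m assumes the m tests are independent. Under that assumption the results are about 0.40 for 10 comparisons and 0.90 for 45, higher than the 0.29 and 0.63 quoted above, presumably because pairwise t-tests that share groups are not independent.

```python
from math import comb


def pairwise_comparisons(k: int) -> int:
    """Number of distinct pairs among k group means: k(k-1)/2."""
    return comb(k, 2)


def familywise_error_independent(m: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) over m tests.

    ASSUMPTION: the m tests are independent, which pairwise t-tests
    on shared groups are not, so treat this as a rough approximation.
    """
    return 1.0 - (1.0 - alpha) ** m


for k in (5, 10):
    m = pairwise_comparisons(k)
    print(k, "groups:", m, "comparisons, approx. error",
          round(familywise_error_independent(m), 2))
```

Either way, the pattern is the same: the chance of at least one spurious "significant" result climbs quickly as groups are added.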

  6. What Does ANOVA Do? • ANOVA partitions the variance of the dependent variable into different components: • A. Between group variability • B. Within group variability • More specifically, the analysis of variance is a method for partitioning the total sum of squares into two additive and independent parts.

  7. Definition of Total Sum of Squares or Variance • $SS_{T} = \sum_{j=1}^{p} \sum_{i=1}^{n} (X_{ij} - \bar{X}_{..})^2$ • $\bar{X}_{..}$ = grand average • Summed across all n × p observations

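  8. Definition of Between Sum of Squares • $SS_{B} = n \sum_{j=1}^{p} (\bar{X}_{.j} - \bar{X}_{..})^2$ • $\bar{X}_{.j}$ = average of group j • $\bar{X}_{..}$ = grand average • SSB is the sum of squared differences of group means from the grand mean, weighted by the number of cases n in each group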

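  9. Definition of Within Sum of Squares • $SS_{W} = \sum_{j=1}^{p} \sum_{i=1}^{n} (X_{ij} - \bar{X}_{.j})^2$ • $X_{ij}$ = observations • $\bar{X}_{.j}$ = group mean • Sum of squared differences of observations from group means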

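  10. Partitioning of Variance into Different Components • $\sum_{j}\sum_{i} (X_{ij} - \bar{X}_{..})^2 = \sum_{j}\sum_{i} (X_{ij} - \bar{X}_{.j})^2 + n \sum_{j} (\bar{X}_{.j} - \bar{X}_{..})^2$ • Total sum of squares = within groups sum of squares + between groups sum of squares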
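
Why the total sum of squares splits cleanly into a within part and a between part can be shown in a short derivation. The sketch below assumes p groups with n observations each, writes $X_{ij}$ for observation i in group j, $\bar{X}_{.j}$ for the mean of group j, and $\bar{X}_{..}$ for the grand mean; the symbols are a reconstruction from the slide labels rather than the authors' exact notation.

```latex
\begin{align*}
X_{ij} - \bar{X}_{..}
  &= (X_{ij} - \bar{X}_{.j}) + (\bar{X}_{.j} - \bar{X}_{..}) \\
\sum_{j=1}^{p}\sum_{i=1}^{n} (X_{ij} - \bar{X}_{..})^2
  &= \sum_{j}\sum_{i} (X_{ij} - \bar{X}_{.j})^2
   + 2 \sum_{j} (\bar{X}_{.j} - \bar{X}_{..}) \sum_{i} (X_{ij} - \bar{X}_{.j})
   + n \sum_{j} (\bar{X}_{.j} - \bar{X}_{..})^2 \\
  &= SS_{W} + 0 + SS_{B}
\end{align*}
```

The middle (cross-product) term vanishes because deviations from a group's own mean always sum to zero, $\sum_{i} (X_{ij} - \bar{X}_{.j}) = 0$, which is what makes the two parts additive.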

  11. Test Statistic in ANOVA Test statistic for ANOVA is based on between & within groups SS

  12. Continued Test Statistic in ANOVA • F = Between group variability / Within group variability • The source of within group variability is individual differences, that is, sampling error across the cases. • The source of between group variability is the effect of the independent (grouping) variable.

  13. Steps in Test of Hypothesis • Determine the appropriate test • Establish the level of significance: α • Determine whether to use a one-tailed or two-tailed test • Calculate the test statistic • Determine the degrees of freedom • Compare the computed test statistic against a tabled (critical) value • (Same steps as before)

  14. Continued 1. Determine the Appropriate Test • Independent random samples have been taken from each population • The dependent variable is normally distributed in each population (ANOVA is robust with regard to this assumption) • Population variances are equal (ANOVA is robust with regard to this assumption) • Subjects in each group have been independently sampled

  15. 2. Establish Level of Significance • α is a predetermined value • The convention • α = .05 • α = .01 • α = .001

  16. 3. Use a Two Tailed Test • Ho: µ1 = µ2 = µ3 = µ4, where • µ1 = population mean for group 1 • µ2 = population mean for group 2 • µ3 = population mean for group 3 • µ4 = population mean for group 4 • Ha: not Ho

  17. Continued 3. Use a Two Tailed Test • Ha: not Ho • The alternative hypothesis does not specify whether • µ1 ≠ µ2 or • µ2 ≠ µ3 or • µ1 ≠ µ3

  18. 4. Calculating Test Statistics • F = (SSb / dfB) / (SSw / dfw) • SSb = sum of squares between • SSw = sum of squares within • dfB = degrees of freedom between • dfw = degrees of freedom within

  19. Continued 4. Calculating Test Statistics • By dividing the sum of the squared deviations by the degrees of freedom, we are essentially computing an “average” (or mean) amount of variation • The specific name for the numerator of the F statistic is the mean square between (the average amount of between-group variation) • The specific name for the denominator of the F statistic is the mean square within (the average amount of within-group variation)

  20. 5. Determine Degrees of Freedom • Degrees of freedom between • dfB = k – 1 • k = number of groups • Degrees of freedom within • dfw = N – k • N = total number of subjects in the study

  21. 6. Compare the Computed Test Statistic Against a Tabled Value • α = .05 • If Fc > Fα, reject H0 • If Fc ≤ Fα, cannot reject H0
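
Steps 4 through 6 can be sketched in a few lines of Python. The function names are hypothetical, and the inputs (SS between = 70, SS within = 46, k = 3, N = 15, critical value 3.88) are the figures from the worked example later in this presentation; the critical value is taken as given from the slides rather than computed.

```python
def f_statistic(ss_between, ss_within, k, n_total):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    df_between = k - 1            # dfB = k - 1
    df_within = n_total - k       # dfw = N - k
    ms_between = ss_between / df_between   # mean square between
    ms_within = ss_within / df_within      # mean square within
    return ms_between / ms_within, df_between, df_within


def reject_null(f_computed, f_critical):
    """Decision rule: reject H0 only when Fc exceeds the tabled F."""
    return f_computed > f_critical


f_c, df_b, df_w = f_statistic(70, 46, k=3, n_total=15)
print(round(f_c, 2), df_b, df_w, reject_null(f_c, 3.88))
```

Keeping the decision rule in its own function makes the asymmetry explicit: anything at or below the critical value means H0 cannot be rejected.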

  22. Example • Suppose we had patients with myocardial infarction in the following groups: • Group 1: A music therapy group • Group 2: A relaxation therapy group • Group 3: A control group • 15 patients are randomly assigned to the 3 groups and then their stress levels are measured to determine if the interventions were effective in minimizing stress.

  23. Continued Example • Dependent Variable • The stress scores, ranging from zero (no stress) to 10 (extreme stress) • Independent Variable or Factor • Treatment conditions (3 levels)

  24. Observations

  25. Sum of Squares for Each Group

  26. SS Within

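  27. SS Between • $SS_{B} = n \sum_{j} (\bar{X}_{.j} - \bar{X}_{..})^2 = 5[(3 - 4)^2 + (2 - 4)^2 + (7 - 4)^2] = 70$ • Group 1 average = 3 • Group 2 average = 2 • Group 3 average = 7 • Grand average = 4 • Number of cases per group = 5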

  28. Sum of Squares Total

  29. Components of Variance • SSTotal = SSBetween + SSWithin • 116 = 70 + 46
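
The partition 116 = 70 + 46 can be checked directly from raw scores. The transcript does not preserve the observation table, so the scores below are hypothetical: integer values chosen to match the group sizes, means, standard deviations, minima, and maxima reported in the SPSS descriptives. The function name is also an assumption made for illustration.

```python
from statistics import fmean

# HYPOTHETICAL scores: the original observation table is not in the
# transcript; these values are merely consistent with the reported
# descriptives (n = 5 per group; means 3, 2, 7).
groups = {
    "music": [0, 2, 3, 4, 6],
    "relaxation": [0, 1, 2, 3, 4],
    "control": [5, 6, 6, 8, 10],
}


def sums_of_squares(groups):
    """Return (SS total, SS between, SS within) for a dict of groups."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand = fmean(all_scores)
    ss_total = sum((x - grand) ** 2 for x in all_scores)
    ss_between = sum(len(s) * (fmean(s) - grand) ** 2
                     for s in groups.values())
    ss_within = sum((x - fmean(s)) ** 2
                    for s in groups.values() for x in s)
    return ss_total, ss_between, ss_within


ss_t, ss_b, ss_w = sums_of_squares(groups)
print(ss_t, ss_b, ss_w)
```

With these scores the three quantities come out to 116, 70, and 46, and the total equals the sum of the other two, as the slide states.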

  30. Degrees of Freedom • dfB = k – 1 = 3 – 1 = 2 • dfw = N – k = 15 – 3 = 12

  31. Test Statistic • MSBetween = 70 / 2 = 35 • MSWithin = 46 / 12 = 3.83 • Fc = MSBetween / MSWithin • Fc = 35 / 3.83 = 9.13

  32. Lookup Critical Value • Fα = 3.88 (α = .05, dfB = 2, dfw = 12)

  33. Conclusions • Fc = 9.13 > Fα = 3.88 • Fc > Fα, therefore reject H0

  34. One-way ANOVA Summary

  35. Multiple Comparison Groups • The F test does not tell which pairs of means are not equal • Additional analysis is necessary to determine which pairs are not equal

  36. Fisher’s LSD Test • These are the null and alternative hypotheses being tested • Ho1: µ1 = µ2, Ha1: µ1 ≠ µ2 • Ho2: µ1 = µ3, Ha2: µ1 ≠ µ3 • Ho3: µ2 = µ3, Ha3: µ2 ≠ µ3

  37. Fisher’s LSD Test • Known as the protected t-test • LSD is the least difference between means needed for significance • df = N – k • Use the following formula: $LSD = t_{\alpha/2,\,N-k} \sqrt{MS_{W} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$

  38. Calculation of LSD • $LSD = 2.179 \sqrt{3.83 \left( \frac{1}{5} + \frac{1}{5} \right)} = 2.70$ • All pairs of means differing by at least 2.70 points on the stress scale are significantly different from one another.

  39. Application to Three Samples • Mean 1 – Mean 2 = 1 • Mean 3 – Mean 1 = 4 • Mean 3 – Mean 2 = 5 • Decisions on the null hypotheses: • Ho1: µ1 = µ2 not rejected • Ho2: µ1 = µ3 rejected • Ho3: µ2 = µ3 rejected
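
The three LSD decisions can be reproduced with a short Python sketch. It assumes the usual form of Fisher's LSD, t × sqrt(MSW × (1/n_i + 1/n_j)); the critical t of 2.179 (two-tailed, α = .05, df = 12) is taken from a standard t-table rather than computed, and the variable and function names are hypothetical. The group means, MSW = 46 / 12, and n = 5 come from the example.

```python
from itertools import combinations
from math import sqrt

T_CRITICAL = 2.179      # two-tailed t, alpha = .05, df = N - k = 12 (t-table)
MS_WITHIN = 46 / 12     # mean square within from the ANOVA example
N_PER_GROUP = 5
means = {"music": 3.0, "relaxation": 2.0, "control": 7.0}


def lsd(t_crit, ms_within, n_i, n_j):
    """Least significant difference between two group means."""
    return t_crit * sqrt(ms_within * (1 / n_i + 1 / n_j))


threshold = lsd(T_CRITICAL, MS_WITHIN, N_PER_GROUP, N_PER_GROUP)

# A pair is declared different when |mean difference| >= LSD.
significant = {
    (a, b): abs(means[a] - means[b]) >= threshold
    for a, b in combinations(means, 2)
}

print(round(threshold, 2))
for pair, flag in significant.items():
    print(pair, "reject Ho" if flag else "do not reject Ho")
```

The threshold rounds to 2.70, so only the two comparisons involving the control group (differences of 4 and 5) exceed it.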

  40. Use of SPSS in ANOVA

  41. Data in SPSS Input Format

  42. SPSS Output for ANOVA • Descriptives: Stress Levels

      Group               N   Mean  Std. Deviation  Std. Error  95% CI Lower Bound  95% CI Upper Bound  Minimum  Maximum
      Music Therapy       5   3.00  2.236           1.000       .22                 5.78                0        6
      Relaxation Therapy  5   2.00  1.581           .707        .04                 3.96                0        4
      Control Group       5   7.00  2.000           .894        4.52                9.48                5        10
      Total               15  4.00  2.878           .743        2.41                5.59                0        10
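
The standard error and confidence interval columns in the descriptives follow from the mean, standard deviation, and group size. A minimal sketch, assuming the usual t-based interval mean ± t × SD / sqrt(n): the critical t of 2.776 (two-tailed, α = .05, df = n − 1 = 4) is taken from a standard t-table, and the function name is hypothetical.

```python
from math import sqrt

T_DF4 = 2.776   # two-tailed t, alpha = .05, df = 4 (from a t-table)


def mean_ci(mean, sd, n, t_crit):
    """Return (standard error, lower bound, upper bound) for a mean."""
    se = sd / sqrt(n)
    return se, mean - t_crit * se, mean + t_crit * se


# (mean, SD) per group as reported in the descriptives, n = 5 each
descriptives = {
    "Music Therapy": (3.00, 2.236),
    "Relaxation Therapy": (2.00, 1.581),
    "Control Group": (7.00, 2.000),
}

for name, (m, sd) in descriptives.items():
    se, lower, upper = mean_ci(m, sd, 5, T_DF4)
    print(name, round(se, 3), round(lower, 2), round(upper, 2))
```

The "Total" row uses all 15 cases, so its interval would need the critical t for df = 14 instead.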

  43. SPSS Output for ANOVA • Test of Homogeneity of Variances: Stress Levels

      Levene Statistic  df1  df2  Sig. level or p-value
      .242              2    12   .788

      P > .05, therefore the assumption of homogeneity of variance is met.

      ANOVA: Stress Levels

      Source          Sum of Squares  df  Mean Square  F      Sig. level or p-value
      Between Groups  70.000          2   35.000       9.130  .004
      Within Groups   46.000          12  3.833
      Total           116.000         14

      P < .05, therefore we reject the null hypothesis and continue with the multiple comparison table.

  44. SPSS Output for ANOVA • Multiple Comparisons • Dependent Variable: Stress Levels • LSD

      (I) Groups          (J) Groups          Mean Difference (I-J)  Std. Error  Sig. Level  95% CI Lower  95% CI Upper
      Music Therapy       Relaxation Therapy  1.000                  1.238       .435        -1.70         3.70
      Music Therapy       Control Group       -4.000(*)              1.238       .007        -6.70         -1.30
      Relaxation Therapy  Music Therapy       -1.000                 1.238       .435        -3.70         1.70
      Relaxation Therapy  Control Group       -5.000(*)              1.238       .002        -7.70         -2.30
      Control Group       Music Therapy       4.000(*)               1.238       .007        1.30          6.70
      Control Group       Relaxation Therapy  5.000(*)               1.238       .002        2.30          7.70

      * The mean difference is significant at the .05 level.

  45. Take-home lesson • How to compare the means of three or more samples
