When we think only of sincerely helping all others, not ourselves,

Presentation Transcript


  1. When we think only of sincerely helping all others, not ourselves, We will find that we receive all that we wish for. multiple comparisons

  2. Chapter 9: Multiple Comparisons
     • Control of error rate
     • Pairwise comparisons
     • Comparisons to a control
     • Linear contrasts

  3. Multiple Comparison Procedures. Once we reject $H_0: \mu_1 = \mu_2 = \dots = \mu_t$ in favor of $H_1$: NOT all $\mu$’s are equal, we don’t yet know the way in which they’re not all equal, but simply that they’re not all the same. If there are 4 columns (levels), are all 4 $\mu$’s different? Are 3 the same and one different? If so, which one? etc.

  4. These “more detailed” inquiries into the process are called MULTIPLE COMPARISON PROCEDURES. Errors (Type I): We set up “$\alpha$” as the significance level for a hypothesis test. Suppose we test 3 independent hypotheses, each at $\alpha = .05$; each test has a Type I error probability (rejecting $H_0$ when it’s true) of .05. However, P(at least one Type I error in the 3 tests) = 1 - P(accept all 3, given all true) = $1 - (.95)^3 \approx .14$.

  5. In other words, the probability is .14 that at least one Type I error is made. For 5 tests, prob = .23. Question: Should we choose $\alpha = .05$, and suffer (for 5 tests) a .23 Experimentwise Error rate (“a” or $a_E$)? OR Should we choose/control the overall error rate, “a”, to be .05, and find the individual test $\alpha$ from $1 - (1-\alpha)^5 = .05$ (which gives us $\alpha \approx .010$)?
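
The arithmetic on the last two slides can be checked with a minimal Python sketch. It assumes the tests are independent, and the function names are illustrative, not from the slides.

```python
def familywise_error_rate(alpha: float, k: int) -> float:
    """P(at least one Type I error) over k independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def per_test_alpha(family_rate: float, k: int) -> float:
    """Individual alpha that gives the requested experimentwise rate for k independent tests."""
    return 1.0 - (1.0 - family_rate) ** (1.0 / k)


rate_3 = familywise_error_rate(0.05, 3)   # three tests at alpha = .05
rate_5 = familywise_error_rate(0.05, 5)   # five tests at alpha = .05
alpha_5 = per_test_alpha(0.05, 5)         # individual alpha for a .05 experimentwise rate

print(f"3 tests: {rate_3:.3f}, 5 tests: {rate_5:.3f}, individual alpha: {alpha_5:.4f}")
```

Plugging `alpha_5` back into `familywise_error_rate` returns .05, which is a quick way to confirm the two formulas are inverses of each other.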

  6. The formula $1 - (1-\alpha)^5 = .05$ would be valid only if the tests are independent; often they’re not. [ e.g., $\mu_1 = \mu_2$, $\mu_2 = \mu_3$, $\mu_1 = \mu_3$: IF $\mu_1 = \mu_2$ is accepted & $\mu_2 = \mu_3$ is rejected, isn’t it more likely that $\mu_1 = \mu_3$ is rejected? ]

  7. Error Rates. When the tests are not independent, it’s usually very difficult to arrive at the correct $\alpha$ for an individual test so that a specified value results for the experimentwise error rate (also called the family error rate).

  8. There are many multiple comparison procedures. We’ll cover only a few. Pairwise Comparisons Method 1: (Fisher Test) Do a series of pairwise t-tests, each with a specified $\alpha$ value (for the individual test). This is called “Fisher’s LEAST SIGNIFICANT DIFFERENCE” (LSD).

  9. Example: Broker Study. A financial firm would like to determine if the brokers it uses to execute trades differ with respect to their ability to provide a stock purchase for the firm at a low buying price per share. To measure cost, an index, Y, is used: $Y = 1000(A - P)/A$, where P = per-share price paid for the stock and A = average of the high price and low price per share for the day. “The higher Y is, the better the trade is.”

  10. Five brokers were in the study and six trades were randomly assigned to each broker (n = 6).
      Col: broker    Y (six trades)               Mean
      1              12    3    5   -1   12    5     6
      2               7   17   13   11    7   17    12
      3               8    1    7    4    3    7     5
      4              21   10   15   12   20    6    14
      5              24   13   14   18   14   19    17

  11. ANOVA for the broker data: “MSW” = 21.2. With $\alpha = .05$, $F_{TV} = 2.76$ (F table value), so we reject equal column MEANS.
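
As a check on the ANOVA quantities, here is a short Python sketch that recomputes the column means, MSW and the F ratio from the broker data on the previous slide. The variable names are illustrative; the critical value 2.76 is taken from the slide rather than computed.

```python
# Broker data from the data slide: six trades per broker.
data = {
    1: [12, 3, 5, -1, 12, 5],
    2: [7, 17, 13, 11, 7, 17],
    3: [8, 1, 7, 4, 3, 7],
    4: [21, 10, 15, 12, 20, 6],
    5: [24, 13, 14, 18, 14, 19],
}

means = {b: sum(y) / len(y) for b, y in data.items()}
n_total = sum(len(y) for y in data.values())
grand_mean = sum(sum(y) for y in data.values()) / n_total

# Within-column and between-column sums of squares.
ssw = sum((v - means[b]) ** 2 for b, y in data.items() for v in y)
ssb = sum(len(y) * (means[b] - grand_mean) ** 2 for b, y in data.items())

df_between = len(data) - 1        # 4
df_within = n_total - len(data)   # 25
msw = ssw / df_within
msb = ssb / df_between
f_calc = msb / msw

print(f"MSW = {msw:.1f}, F = {f_calc:.2f}, reject H0: {f_calc > 2.76}")
```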

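  12. For any comparison of 2 columns, the acceptance region (AR) for $\bar{Y}_i - \bar{Y}_j$ is centered at 0 and runs from $C_L$ to $C_U$, with $\alpha/2$ in each tail: AR: $0 \pm t_{\alpha/2} \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$, where $t_{\alpha/2}$ has $df_W$ degrees of freedom ($n_i = n_j = 6$, here). MSW: the pooled variance, the estimate for the common variance.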

  13. In our example, with $\alpha = .05$: $0 \pm 2.060 \sqrt{21.2 \left( \frac{1}{6} + \frac{1}{6} \right)} = 0 \pm 5.48$. This value, 5.48, is called the Least Significant Difference (LSD). With the same number of data points, n, in each column, $LSD = t_{\alpha/2} \sqrt{2 \, MSW / n}$.
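
A small Python sketch of the LSD calculation, using the slide's values ($t_{\alpha/2} = 2.060$, MSW = 21.2, n = 6). The function name is an illustration only; the t value is copied from the slide, not looked up in code.

```python
import math


def least_significant_difference(t_crit: float, msw: float, n_i: int, n_j: int) -> float:
    """LSD = t_{alpha/2} * sqrt(MSW * (1/n_i + 1/n_j))."""
    return t_crit * math.sqrt(msw * (1.0 / n_i + 1.0 / n_j))


lsd = least_significant_difference(2.060, 21.2, 6, 6)

# Equal-n shortcut from the slide: t_{alpha/2} * sqrt(2 * MSW / n).
lsd_equal_n = 2.060 * math.sqrt(2 * 21.2 / 6)

print(f"LSD = {lsd:.2f}")
```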

  14. Underline Diagram: summarize the comparison results (p. 443).
      • Now, rank order and compare:
      Col:     3    1    2    4    5
      Mean:    5    6   12   14   17

  15. Col:     3    1    2    4    5
      Mean:    5    6   12   14   17
      Step 2: identify differences > 5.48, and mark accordingly (here, col 1 vs. col 2: 12 - 6 = 6 > 5.48).
      Step 3: compare the pairs of means within each subset:
      Comparison   |difference|   vs. LSD
      3 vs. 1           *            <
      2 vs. 4           *            <
      2 vs. 5           5            <
      4 vs. 5           *            <
      * Contiguous; no need to detail.

  16. Conclusion (underlined groups shown in braces): {3, 1} {2, 4, 5}
      Can get “inconsistency”: Suppose col 5 were 18:
      Col:     3    1    2    4    5
      Mean:    5    6   12   14   18
      Now:
      Comparison   |difference|   vs. LSD
      3 vs. 1           *            <
      2 vs. 4           *            <
      2 vs. 5           6            >
      4 vs. 5           *            <
      Conclusion: {3, 1} {2, 4} {4, 5} ???
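
The “inconsistency” can be reproduced with a short Python sketch that lists every pair of columns whose means differ by more than the LSD. The helper name `significant_pairs` is hypothetical; the means and the threshold 5.48 come from the slides.

```python
from itertools import combinations


def significant_pairs(means: dict, threshold: float) -> list:
    """Pairs of columns (in rank order of their means) whose difference exceeds the threshold."""
    ranked = sorted(means, key=means.get)
    return [
        (a, b)
        for a, b in combinations(ranked, 2)
        if abs(means[a] - means[b]) > threshold
    ]


LSD = 5.48
original = {1: 6, 2: 12, 3: 5, 4: 14, 5: 17}
modified = {**original, 5: 18}   # "Suppose col 5 were 18"

sig_original = significant_pairs(original, LSD)
sig_modified = significant_pairs(modified, LSD)

print("original:", sig_original)
print("modified:", sig_modified)
```

With col 5 at 18, the pair (2, 5) becomes significant while (2, 4) and (4, 5) do not, so col 4 sits in two overlapping groups.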

  17. Conclusion (underlined groups shown in braces): {3, 1} {2, 4} {4, 5}
      • Brokers 1 and 3 are not significantly different from each other, but they are significantly different from the other 3 brokers.
      • Brokers 2 and 4 are not significantly different, and brokers 4 and 5 are not significantly different, but broker 2 is significantly different from (smaller than) broker 5.

  19. Minitab: Stat>>ANOVA>>One-Way Anova, then click “comparisons”.
      Fisher's pairwise comparisons (Minitab)
      Family error rate = 0.268
      Individual error rate = 0.0500
      Critical value = 2.060 (= $t_{\alpha/2}$)
      Intervals for (column level mean) - (row level mean)
                   1          2          3          4
      2      -11.476
              -0.524
      3       -4.476      1.524
               6.476     12.476
      4      -13.476     -7.476    -14.476
              -2.524      3.476     -3.524
      5      -16.476    -10.476    -17.476     -8.476
              -5.524      0.476     -6.524      2.476
      Col 1 < Col 2 (the interval excludes 0). Cannot reject Col 2 = Col 4 (the interval covers 0).
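
The intervals in this output can be rebuilt by hand as (difference of means) ± LSD. The following Python sketch does that under the slide's values (t = 2.060, MSW = 21.2, n = 6); it is an illustration of the arithmetic, not Minitab's own procedure.

```python
import math

means = {1: 6.0, 2: 12.0, 3: 5.0, 4: 14.0, 5: 17.0}
half_width = 2.060 * math.sqrt(2 * 21.2 / 6)   # the LSD, about 5.476


def interval(col: int, row: int) -> tuple:
    """Interval for (column level mean) - (row level mean)."""
    diff = means[col] - means[row]
    return (diff - half_width, diff + half_width)


for row in range(2, 6):
    for col in range(1, row):
        lo, hi = interval(col, row)
        note = "covers 0" if lo < 0 < hi else "excludes 0"
        print(f"col {col} - row {row}: ({lo:8.3f}, {hi:8.3f})  {note}")
```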

  20. Pairwise comparisons Method 2: (Tukey Test) A procedure which controls the experimentwise error rate is “TUKEY’S HONESTLY SIGNIFICANT DIFFERENCE TEST”.

  21. Tukey’s method works in a similar way to Fisher’s LSD, except that the “LSD” counterpart (“HSD”) is not $t_{\alpha/2} \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$ (or, for an equal number of data points per column, $t_{\alpha/2} \sqrt{2 \, MSW / n}$), but $tuk_{\alpha/2} \sqrt{2 \, MSW / n}$, where tuk has been computed to take into account all the inter-dependencies of the different comparisons.

  22. $HSD = tuk_{\alpha/2} \sqrt{2 \, MSW / n}$. A more general approach is to write $HSD = q_{\alpha} \sqrt{MSW / n}$, where $q_{\alpha} = tuk_{\alpha/2} \times \sqrt{2}$.
      --- $q = (\bar{Y}_{largest} - \bar{Y}_{smallest}) / \sqrt{MSW / n}$
      --- the probability distribution of q is called the “Studentized Range Distribution”.
      --- q = q(t, df), where t = number of columns, and df = df of MSW

  23. With t = 5 and df = v = 25, from Table 10: q = 4.15 for $\alpha$ = 5%; tuk = 4.15/1.414 = 2.93. Then, $HSD = 4.15 \sqrt{21.2/6} = 7.80$; also, $2.93 \sqrt{2 \times 21.2 / 6} = 7.80$.
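
A minimal Python sketch of the HSD arithmetic, showing that the q form and the tuk form give the same number. The critical value 4.15 is taken from the slide's table lookup, not computed here.

```python
import math

q_alpha = 4.15          # studentized range value for t = 5, df = 25, alpha = 5% (from the slide)
msw, n = 21.2, 6

hsd = q_alpha * math.sqrt(msw / n)             # HSD = q * sqrt(MSW / n)
tuk = q_alpha / math.sqrt(2)                   # tuk = q / sqrt(2)
hsd_via_tuk = tuk * math.sqrt(2 * msw / n)     # HSD = tuk * sqrt(2 * MSW / n)

print(f"tuk = {tuk:.2f}, HSD = {hsd:.2f}")
```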

  24. In our earlier example, rank order:
      Col:     3    1    2    4    5
      Mean:    5    6   12   14   17
      (No differences [contiguous] > 7.80)

  25. Comparison   |difference|   > or < 7.80
      3 vs. 1           *              <
      3 vs. 2           7              <
      3 vs. 4           9              >
      3 vs. 5          12              >
      1 vs. 2           *              <
      1 vs. 4           8              >
      1 vs. 5          11              >
      2 vs. 4           *              <
      2 vs. 5           5              <
      4 vs. 5           *              <
      * (contiguous)
      Conclusion (underlined groups shown in braces): {3, 1, 2} {2, 4, 5}
      2 is “same as 1 and 3, but also same as 4 and 5.”

  26. Tukey's pairwise comparisons (Minitab)
      Family error rate = 0.0500
      Individual error rate = 0.00706
      Critical value = 4.15 (= $q_{\alpha}$)
      Intervals for (column level mean) - (row level mean)
                   1          2          3          4
      2      -13.801
               1.801
      3       -6.801     -0.801
               8.801     14.801
      4      -15.801     -9.801    -16.801
              -0.199      5.801     -1.199
      5      -18.801    -12.801    -19.801    -10.801
              -3.199      2.801     -4.199      4.801
      Minitab: Stat>>ANOVA>>One-Way Anova, then click “comparisons”.

  27. Special Multiple Comp. Method 3: Dunnett’s test. Designed specifically for (and incorporating the interdependencies of) comparing several “treatments” to a “control.”
      Example (n = 6 per column; col 1 is the CONTROL):
      Col:     1    2    3    4    5
      Mean:    6   12    5   14   17
      Analog of LSD ($= t_{\alpha/2} \sqrt{2 \, MSW / n}$): $D = Dut_{\alpha/2} \sqrt{2 \, MSW / n}$, with $Dut_{\alpha/2}$ from a table or Minitab.

  28. D= Dut/2 x 2 MSW/n = 2.61 (2(21.2) ) = 6.94 CONTROL 6 1 2 3 4 5 In our example: 6 12 5 14 17 Comparison |difference|>or< 6.94 < < > > 1 vs. 2 1 vs. 3 1 vs. 4 1 vs. 5 6 1 8 11 - Cols 4 and 5 differ from the control [ 1 ]. - Cols 2 and 3 are not significantly different from control. multiple comparisons

  29. Minitab: Stat>>ANOVA>>General Linear Model, then click “comparisons”.
      Dunnett's comparisons with a control (Minitab)
      Family error rate = 0.0500 (controlled!!)
      Individual error rate = 0.0152
      Critical value = 2.61 (= $Dut_{\alpha/2}$)
      Control = level (1) of broker
      Intervals for treatment mean minus control mean
      Level      Lower     Center      Upper
      2         -0.930      6.000     12.930
      3         -7.930     -1.000      5.930
      4          1.070      8.000     14.930
      5          4.070     11.000     17.930

  30. What Method Should We Use?
      • The Fisher procedure can be used only after the F-test in the ANOVA is significant at 5%.
      • Otherwise, use the Tukey procedure. Note that to avoid being too conservative, the significance level of the Tukey test can be set larger (10%), especially when the number of levels is large. Or use the S-N-K procedure.

  31. Contrast. Consider the following data, which, let’s say, are the column means of a one-factor ANOVA, with the one factor being “DRUG”. Consider 4 column means:
      Col:     1              2              3              4
               $\bar{Y}_{.1}$   $\bar{Y}_{.2}$   $\bar{Y}_{.3}$   $\bar{Y}_{.4}$
      Mean:    6              4              1             -3
      Grand Mean = $\bar{Y}_{..}$ = 2; # of rows (replicates) = R = 8

  32. Contrast Example 1
      Col 1: Placebo
      Col 2: Sulfa Type S1
      Col 3: Sulfa Type S2
      Col 4: Antibiotic Type A
      Suppose the questions of interest are (1) Placebo vs. Non-placebo, (2) S1 vs. S2, (3) (Average) S vs. A.

  33. • For (1), we would like to test if the mean of Placebo is equal to the mean of the other levels, i.e. whether the mean value of $\bar{Y}_{.1} - (\bar{Y}_{.2} + \bar{Y}_{.3} + \bar{Y}_{.4})/3$ is equal to 0.
      • For (2), we would like to test if the mean of S1 is equal to the mean of S2, i.e. whether the mean value of $\bar{Y}_{.2} - \bar{Y}_{.3}$ is equal to 0.
      • For (3), we would like to test if the mean of Types S1 and S2 is equal to the mean of Type A, i.e. whether the mean value of $(\bar{Y}_{.2} + \bar{Y}_{.3})/2 - \bar{Y}_{.4}$ is equal to 0.

  34. In general, a question of interest can be expressed by a linear combination of column means such as $Z = \sum_j a_j \bar{Y}_{.j}$, with the restriction that $\sum_j a_j = 0$. Such linear combinations are called (linear) contrasts.

  35. Test if a contrast has mean 0. The sum of squares for contrast Z is $SSZ = n Z^2 / \sum_j a_j^2$, where n is the number of rows (replications). The test statistic $F_{calc} = SSZ / MSW$ is distributed as F with 1 and (df of error) degrees of freedom. Reject E[Z] = 0 if the observed $F_{calc}$ is too large (say, > $F_{0.05}$(1, df of error) at the 5% significance level).

  36. Example 1 (cont.): $a_j$’s for the 3 contrasts
                             P    S1    S2     A
      Col:                   1     2     3     4
      P vs. non-P:   Z1     -3     1     1     1
      S1 vs. S2:     Z2      0    -1     1     0
      S vs. A:       Z3      0    -1    -1     2

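  37. Calculating $Z^2 / \sum_j a_j^2$ for each contrast, with column means 5, 6, 7, 10:
      top row: $[-3(5) + 1(6) + 1(7) + 1(10)]^2 / [(-3)^2 + 1^2 + 1^2 + 1^2] = 64/12 = 5.33$
      middle row: $[-1(6) + 1(7)]^2 / [(-1)^2 + 1^2] = 1/2 = 0.50$
      bottom row: $[-1(6) - 1(7) + 2(10)]^2 / [(-1)^2 + (-1)^2 + 2^2] = 49/6 = 8.17$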

  38.                      P     S1    S2     A
                           $\bar{Y}_{.1}$  $\bar{Y}_{.2}$  $\bar{Y}_{.3}$  $\bar{Y}_{.4}$
      Mean:                5      6     7    10     $Z^2 / \sum_j a_j^2$
      Placebo vs. drugs   -3      1     1     1      5.33
      S1 vs. S2            0     -1     1     0      0.50
      Average S vs. A      0     -1    -1     2      8.17
      Total                                         14.00

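  39. Contrast            $Z^2 / \sum_j a_j^2$    × R
      Placebo vs. drugs        5.33            42.64
      S1 vs. S2                 .50             4.00
      Average S vs. A          8.17            65.36
      Total                   14.00           112.00  (= $SSB_c$ !)
      • $\sum_j (\bar{Y}_{.j} - \bar{Y}_{..})^2 = 14$
      • $SSB_c = 14 \cdot R$
      • R = # rows = 8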
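
The contrast sums of squares can be recomputed with a short Python sketch, using the column means 5, 6, 7, 10 and R = 8 from the slides. The function name `contrast_ss` is hypothetical. Working without intermediate rounding gives 42.67 and 65.33; the slide's 42.64 and 65.36 appear to come from multiplying the rounded values 5.33 and 8.17 by 8.

```python
means = [5, 6, 7, 10]   # column means for P, S1, S2, A
R = 8                   # rows (replicates) per column

contrasts = {
    "Z1": [-3, 1, 1, 1],    # Placebo vs. drugs
    "Z2": [0, -1, 1, 0],    # S1 vs. S2
    "Z3": [0, -1, -1, 2],   # Average S vs. A
}


def contrast_ss(col_means, coeffs, n_rows):
    """SSZ = n * Z^2 / sum(a_j^2), with Z = sum(a_j * mean_j)."""
    z = sum(a * m for a, m in zip(coeffs, col_means))
    return n_rows * z ** 2 / sum(a * a for a in coeffs)


ss = {name: contrast_ss(means, a, R) for name, a in contrasts.items()}

# Between-column sum of squares, for comparison with the total of the three SSZ values.
grand_mean = sum(means) / len(means)
ssb_c = R * sum((m - grand_mean) ** 2 for m in means)

print(ss, "total:", sum(ss.values()), "SSBc:", ssb_c)
```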

  40. Orthogonal Contrasts. A set of k contrasts { $Z_i = \sum_j a_{ij} \bar{Y}_{.j}$, i = 1, 2, …, k } is called orthogonal if $\sum_j a_{i_1 j} \, a_{i_2 j} = 0$ for all $i_1 \neq i_2$. If k = c - 1 (the df of the “column” term; c: # of columns), then $SSB_c = \sum_{i=1}^{c-1} SSZ_i$.
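
Orthogonality is easy to check mechanically. The sketch below, with hypothetical helper names, tests the three contrasts of Example 1 and, for comparison, a pair of contrasts chosen here purely as a non-orthogonal illustration.

```python
from itertools import combinations


def is_contrast(coeffs) -> bool:
    """A contrast's coefficients must sum to zero."""
    return sum(coeffs) == 0


def is_orthogonal(contrast_set) -> bool:
    """True when every pair of coefficient vectors has a zero dot product."""
    return all(
        sum(a * b for a, b in zip(u, v)) == 0
        for u, v in combinations(contrast_set, 2)
    )


example_1 = [[-3, 1, 1, 1], [0, -1, 1, 0], [0, -1, -1, 2]]

# Illustrative only: col 1 vs. col 2 and col 1 vs. col 3 share col 1, so they overlap.
overlapping = [[1, -1, 0, 0], [1, 0, -1, 0]]

print(is_orthogonal(example_1), is_orthogonal(overlapping))
```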

  41. Orthogonal Contrasts. If a set of contrasts is orthogonal, the corresponding questions are called independent because the probabilities of Type I and Type II errors in the ensuing hypothesis tests are independent, and “stand alone”. That is, the variability in Y due to one contrast (question) will not be affected by that due to the other contrasts (questions).

  42. Orthogonal Breakdown. Since $SSB_{col}$ has (C-1) df (which corresponds with having C levels, or C columns), the $SSB_{col}$ can be broken up into (C-1) individual SSQ values, each with a single degree of freedom, each addressing a different inquiry into the data’s message (one question). A set of C-1 orthogonal contrasts (questions) provides an orthogonal breakdown.

  43. Recall Data in Example 1 (R = 8 rows per column):
                Placebo    S1    S2     A
      Mean:        5        6     7    10
      $\bar{Y}_{..} = 7$

  44. ANOVA: $F_{1-.05}(3, 28) = 2.95$

  45. An Orthogonal Breakdown
      Source        SSQ     df     MSQ        F
      Z1           42.64     1    42.64     8.53
      Z2            4.00     1     4.00      .80
      Z3           65.36     1    65.36    13.07
      Drugs       112        3
      Error       140       28     5
      $F_{1-.05}(1, 28) = 4.20$
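
The F column of the breakdown follows directly from the table: each single-df SSQ is divided by the error mean square and compared with the critical value 4.20. A minimal Python sketch using the slide's figures (names are illustrative):

```python
ssq = {"Z1": 42.64, "Z2": 4.00, "Z3": 65.36}   # single-df sums of squares from the table
ss_error, df_error = 140, 28
f_crit = 4.20                                  # F critical value for (1, 28) df, from the slide

mse = ss_error / df_error                      # error mean square

# Each contrast has 1 df, so its MSQ equals its SSQ.
f_stats = {name: value / mse for name, value in ssq.items()}
significant = {name: f > f_crit for name, f in f_stats.items()}

for name in ssq:
    print(f"{name}: F = {f_stats[name]:.2f}, significant: {significant[name]}")
```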

  46. Example 1 (cont.): Conclusions
      • The mean response for Placebo is significantly different from that for Non-placebo.
      • There is no significant difference between using Types S1 and S2.
      • Using Type A is significantly different from using Type S on average.

  47. What if contrasts of interest are not orthogonal?
      • Let k be the number of contrasts of interest;
      • c be the number of levels.
      • If k <= c-1: Bonferroni method
      • If k > c-1: Bonferroni or Scheffe method
      *Bonferroni Method: The same F test, but use $\alpha = a/k$ for each individual test, where a is the desired family error rate (usually 5%).
      *Scheffe Method: To test all linear combinations at once. Very conservative. (Section 9.8)
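
A small Python sketch of the Bonferroni adjustment and the rule of thumb on this slide. The function names are hypothetical, and the final check uses the independent-tests formula only to illustrate that the adjusted level keeps the family rate at or below the target.

```python
def bonferroni_alpha(family_rate: float, k: int) -> float:
    """Individual significance level for each of k contrasts."""
    return family_rate / k


def suggested_method(k: int, c: int) -> str:
    """Rule of thumb from the slide: k contrasts of interest, c levels."""
    return "Bonferroni" if k <= c - 1 else "Bonferroni or Scheffe"


alpha_each = bonferroni_alpha(0.05, 3)

# For independent tests, the resulting family rate stays just under the target.
family_rate_if_independent = 1 - (1 - alpha_each) ** 3

print(f"alpha per test = {alpha_each:.4f}, method: {suggested_method(3, 4)}")
```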

  48. Special Pairwise Comp. Method 4: MCB Procedure (Compare to the best). This procedure provides a subset of treatments that cannot be distinguished from the best. The probability that the “best” treatment is included in this subset is controlled at $1 - \alpha$. *Assume that the larger the better. If not, change the response to -y.

  49. Identify the subset of the best brokers. Minitab: Stat>>ANOVA>>One-Way Anova, then click “comparisons”, Hsu’s MCB.
      Hsu's MCB (Multiple Comparisons with the Best)
      Family error rate = 0.0500
      Critical value = 2.27
      Intervals for level mean minus largest of other level means
      Level      Lower     Center      Upper
      1        -17.046    -11.000      0.000
      2        -11.046     -5.000      1.046
      3        -18.046    -12.000      0.000
      4         -9.046     -3.000      3.046
      5         -3.046      3.000      9.046
      Brokers 2, 4, 5 are selected. A level is selected only if its interval (excluding the ends) covers 0; levels 1 and 3, whose intervals end exactly at 0.000, are not included.
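
The selection rule can be written as a one-line filter. This Python sketch applies it to the interval endpoints printed in the Minitab output above; it only illustrates the rule and does not compute the MCB intervals themselves.

```python
# (lower, upper) endpoints from the Hsu's MCB output, keyed by broker level.
intervals = {
    1: (-17.046, 0.000),
    2: (-11.046, 1.046),
    3: (-18.046, 0.000),
    4: (-9.046, 3.046),
    5: (-3.046, 9.046),
}

# A level is kept only if 0 lies strictly inside its interval (ends excluded).
selected = [level for level, (lo, hi) in intervals.items() if lo < 0 < hi]

print("Brokers not distinguishable from the best:", selected)
```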
