Orthogonality

One way to delve further into the impact a factor has on the yield is to break the Sum of Squares (SSQ) into “orthogonal” components. If SSBcol has (C-1) df (which corresponds to having C levels, or C columns), then SSBcol can be broken up into (C-1) individual SSQ values, each with a single degree of freedom, each addressing a different question about the data.
If each “question” asked of the data is orthogonal to all the other “questions”, two generally desirable properties result:

1. Each “question” is independent of each other one; the probabilities of Type I and Type II errors in the ensuing hypothesis tests are independent, and “stand alone”.
2. The (C-1) SSQ values are guaranteed to add up exactly to the total SSBcol you started with.

What? How? Watch!

Consider 4 column means:

  Column    1    2    3    4
  Mean      6    4    1   -3

Grand Mean = 2
Call these values Y1, Y2, Y3, Y4, and define

$Z_1 = \sum_{j=1}^{4} a_{1j} Y_j$, $Z_2 = \sum_{j=1}^{4} a_{2j} Y_j$, and $Z_3 = \sum_{j=1}^{4} a_{3j} Y_j$.

Under what conditions will

$\sum_{i=1}^{3} Z_i^2 = \sum_{j=1}^{4} (Y_j - \bar{Y})^2$ ?

One answer:

(1) $\sum_{j=1}^{4} a_{ij}^2 = 1$ for all i (i = 1, 2, 3)
(2) $\sum_{j=1}^{4} a_{ij} = 0$ for all i
(3) $\sum_{j=1}^{4} a_{i_1 j} a_{i_2 j} = 0$ for all $i_1 \neq i_2$

A linear combination of treatment means satisfying (2) is called a contrast. Two contrasts satisfying (3) are orthogonal.
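A short derivation may help show why conditions (1) to (3) are enough. It is only a sketch: the extra row $a_0$ and the quantity $Z_0$ are notation introduced here for illustration and do not appear on the slides.

```latex
% Assumed helper: a constant row a_{0j} = 1/2 for j = 1,...,4 (not in the slides).
% (1) gives unit-length rows, (2) makes each contrast row orthogonal to a_0,
% and (3) makes the contrast rows orthogonal to each other, so the 4 x 4
% matrix with rows a_0, a_1, a_2, a_3 is orthogonal and preserves sums of squares.
\begin{align*}
Z_0 &= \sum_{j=1}^{4} \tfrac{1}{2}\, Y_j = 2\bar{Y}, \\
\sum_{i=0}^{3} Z_i^2 &= \sum_{j=1}^{4} Y_j^2, \\
\sum_{i=1}^{3} Z_i^2 &= \sum_{j=1}^{4} Y_j^2 - 4\bar{Y}^2
                      = \sum_{j=1}^{4} (Y_j - \bar{Y})^2 .
\end{align*}
```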
Writing the aij’s as a “matrix”, one possibility among many:

          Y1      Y2      Y3      Y4
  Z1     1/2     1/2    -1/2    -1/2
  Z2     1/2    -1/2     1/2    -1/2
  Z3     1/2    -1/2    -1/2     1/2

With Y = (6, 4, 1, -3) and $\bar{Y} = 2$:

          Z_i    Z_i^2
  Z1       6      36
  Z2       3       9
  Z3      -1       1
  Sum             46

$\sum_{j=1}^{4} (Y_j - \bar{Y})^2 = (6-2)^2 + (4-2)^2 + (1-2)^2 + (-3-2)^2 = 16 + 4 + 1 + 25 = 46$

OK! How does this help us?
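The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch using the column means and the ±1/2 matrix from the slide; the variable and function names are illustrative only.

```python
import math

# Column means and the orthonormal contrast matrix from the slide.
Y = [6, 4, 1, -3]
A = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


# Z_i = sum_j a_ij * Y_j for each row of the matrix.
Z = [dot(row, Y) for row in A]

# Left-hand side: sum of the squared contrasts.
ssq_contrasts = sum(z * z for z in Z)

# Right-hand side: sum of squared deviations of the means from the grand mean.
grand_mean = sum(Y) / len(Y)
ssq_means = sum((y - grand_mean) ** 2 for y in Y)

print(Z, ssq_contrasts, ssq_means)
```

Both sums of squares come out equal, which is property 2 in action.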
Consider the following data, which, let’s say, are the column means of a one-factor ANOVA, with the one factor being “DRUG”:

  Y.1   Y.2   Y.3   Y.4
   5     6     7    10

Y.. = 7 and $\sum_{j=1}^{4} (Y_{.j} - Y_{..})^2 = 14$. (SSBc = 14·R, where R = # rows.)
Consider the following two examples:

Example 1

  Column 1: Placebo
  Column 2: Sulfa Type S1
  Column 3: Sulfa Type S2
  Column 4: Antibiotic Type A

Suppose the questions of interest are
(1) Placebo vs. Non-placebo
(2) S1 vs. S2
(3) (Average) S vs. A
How would you combine columns to analyze the question?

                     P    S1    S2     A
  Column             1     2     3     4
  P vs. non-P:      -3     1     1     1
  S1 vs. S2:         0    -1     1     0
  S vs. A:           0    -1    -1     2

Note: Conditions 2 & 3 are satisfied.
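To satisfy condition 1, divide the top row by $\sqrt{12}$, the middle row by $\sqrt{2}$, and the bottom row by $\sqrt{6}$ (the square root of each row's sum of squared coefficients).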
                          Y.1       Y.2       Y.3       Y.4
                           5         6         7        10
                           P        S1        S2         A       Zi^2
  Placebo vs. drugs     -3/√12     1/√12     1/√12     1/√12     5.33
  S1 vs. S2                0      -1/√2      1/√2        0       0.50
  Average S vs. A          0      -1/√6     -1/√6      2/√6      8.17
  Total                                                         14.00
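The normalization step and the Zi^2 column can be reproduced as follows. This is a sketch in plain Python, assuming the integer contrast rows and the column means given for Example 1; the helper name `normalize` is illustrative.

```python
import math

means = [5, 6, 7, 10]  # column means for P, S1, S2, A

# Integer contrasts as first written down (conditions 2 and 3 hold).
raw = {
    "Placebo vs. drugs": [-3, 1, 1, 1],
    "S1 vs. S2": [0, -1, 1, 0],
    "Average S vs. A": [0, -1, -1, 2],
}


def normalize(c):
    """Scale a contrast so its squared coefficients sum to 1 (condition 1)."""
    norm = math.sqrt(sum(x * x for x in c))
    return [x / norm for x in c]


z_squared = {}
for name, c in raw.items():
    unit = normalize(c)
    z = sum(a * y for a, y in zip(unit, means))
    z_squared[name] = z * z
    print(f"{name}: Z^2 = {z * z:.2f}")

total = sum(z_squared.values())
print(f"total = {total:.2f}")
```

The three Z^2 values add up to 14, the sum of squared deviations of the column means.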
Example 2:

  Y.1: sulfa type S1
  Y.2: sulfa type S2
  Y.3: antibiotic type A1
  Y.4: antibiotic type A2
Exercise:
• Suppose the questions of interest are:
  • The difference between sulfa types
  • The difference between antibiotic types
  • The difference between sulfa and antibiotic types, on average.
• Write down the three corresponding contrasts. Are they orthogonal? If not, can we make them orthogonal?
                           Y.1       Y.2       Y.3       Y.4
                           (5)       (6)       (7)      (10)
                           S1        S2        A1        A2      Zi^2
  S1 vs. S2              -1/√2      1/√2        0         0       0.5
  A1 vs. A2                 0         0       -1/√2      1/√2     4.5
  Ave. S vs. Ave. A      -1/√4     -1/√4      1/√4      1/√4      9.0
  Total                                                          14.00

OK! Now to the analysis:
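One way to check the exercise is to test conditions 2 and 3 directly and then compute each Zi^2. The sketch below assumes the integer form of the three contrasts (signs chosen for illustration) and uses the identity Z^2 = (c·Y)^2 / (c·c), which gives the same value as normalizing the row first.

```python
from itertools import combinations

means = [5, 6, 7, 10]  # column means for S1, S2, A1, A2

contrasts = {
    "S1 vs. S2": [-1, 1, 0, 0],
    "A1 vs. A2": [0, 0, -1, 1],
    "Ave. S vs. Ave. A": [-1, -1, 1, 1],
}


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


# Condition 2: each row sums to zero, so each row is a contrast.
all_contrasts = all(sum(c) == 0 for c in contrasts.values())

# Condition 3: every pair of rows has zero inner product.
all_orthogonal = all(
    dot(u, v) == 0 for u, v in combinations(contrasts.values(), 2)
)

# Z^2 for an unnormalized contrast c is (c . Y)^2 / (c . c).
z_squared = {name: dot(c, means) ** 2 / dot(c, c) for name, c in contrasts.items()}

print(all_contrasts, all_orthogonal, z_squared, sum(z_squared.values()))
```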
Example (R = 8, Y.. = 7):

  Placebo . . . . . 5
  ASP1 . . . . . .  6
  ASP2 . . . . . .  7
  Buff . . . . . . 10
ANOVA

  Source    SSQ    df    MSQ      F
  Drugs     112     3    37.33    7.47
  Error     140    28     5

F.05(3,28) = 2.95
Now, an orthogonal breakdown:

                             Placebo    ASP1     ASP2     Buff
                                5         6        7       10        Z       Z^2    Z^2 x 8
  Z1: Placebo vs. others     -3/√12    1/√12    1/√12    1/√12    8/√12     5.33     42.64
  Z2: ASP1 vs. ASP2             0     -1/√2     1/√2       0      1/√2       .50      4.00
  Z3: ASP vs. Buff              0     -1/√6    -1/√6     2/√6     7/√6      8.17     65.36
  Total (Z1 to Z3)                                                         14.00    112.00

  Z4                         -1/√2       0        0      1/√2     5/√2     25/2     100
ANOVA

  Source       SSQ      df    MSQ       F
  Drugs       112        3
    Z1         42.64     1    42.64    8.53
    Z2          4.00     1     4.00     .80
    Z3         65.36     1    65.36   13.07
  Z4          100        1   100      20
  Error       140       28     5

F1-.05(1,28) = 4.20
F1-.05/3(1,28) < 7.64
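The single-degree-of-freedom F tests can be sketched in plain Python. The block assumes R = 8, an error mean square of 5 (140/28), and the critical value 4.20 quoted on the slide. It works from unrounded Z^2 values, so the SSQ figures differ slightly from the slide's 42.64 and 65.36 (which come from rounded Z^2), while the F ratios agree to two decimals.

```python
R = 8           # observations per column
MSE = 5.0       # error mean square, 140 / 28
F_CRIT = 4.20   # F_{1-.05}(1, 28), as quoted on the slide

means = [5, 6, 7, 10]  # Placebo, ASP1, ASP2, Buff

contrasts = {
    "Z1 Placebo vs. others": [-3, 1, 1, 1],
    "Z2 ASP1 vs. ASP2": [0, -1, 1, 0],
    "Z3 ASP vs. Buff": [0, -1, -1, 2],
}


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


results = {}
for name, c in contrasts.items():
    z_sq = dot(c, means) ** 2 / dot(c, c)  # Z^2 of the normalized contrast
    ssq = R * z_sq                         # SSQ with 1 df
    f_ratio = ssq / MSE                    # MSQ equals SSQ because df = 1
    results[name] = (ssq, f_ratio, f_ratio > F_CRIT)
    print(f"{name}: SSQ = {ssq:.2f}, F = {f_ratio:.2f}, significant = {f_ratio > F_CRIT}")
```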
A significant difference between Placebo and the rest, and between the ASPs and BUFF, but not between the two different ASPs.

Another Example: The variable (coded) is mileage per gallon.

  Gasoline   Yield   Description
  I           -4     Standard gasoline
  II          19     Standard, plus additive A made by P
  III         21     Standard, plus additive B made by P
  IV          10     Standard, plus additive A made by Q
  V           18     Standard, plus additive B made by Q
Questions actually chosen:

  (Z1) Standard gasoline vs. gasoline with an additive
  (Z2) P vs. Q
  (Z3) Between the two additives of P
  (Z4) Between the two additives of Q
With appropriate orthogonal matrix and Z^2 values:

           I        II       III       IV        V       Zi^2
  Z1    -4/√20    1/√20    1/√20    1/√20    1/√20     352.8
  Z2       0      1/√4     1/√4    -1/√4    -1/√4       36.0
  Z3       0     -1/√2     1/√2       0        0          2.0
  Z4       0        0        0     -1/√2     1/√2        32.0
  Total                                                 422.8

By far, the largest part of the total variability in yields is associated with standard gasoline vs. gasoline with an additive.
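The gasoline breakdown can be verified, and the "largest part" claim quantified, with a short sketch. It assumes the integer form of the four contrasts (signs chosen for illustration, which does not affect Z^2) and the coded yields listed for gasolines I to V.

```python
yields = [-4, 19, 21, 10, 18]  # coded mileage for gasolines I..V

contrasts = {
    "Z1 standard vs. additive": [-4, 1, 1, 1, 1],
    "Z2 P vs. Q": [0, 1, 1, -1, -1],
    "Z3 within P": [0, -1, 1, 0, 0],
    "Z4 within Q": [0, 0, 0, -1, 1],
}


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


# Z^2 of each normalized contrast: (c . Y)^2 / (c . c).
z_squared = {name: dot(c, yields) ** 2 / dot(c, c) for name, c in contrasts.items()}
total = sum(z_squared.values())

# Direct sum of squared deviations, for comparison with the contrast total.
mean = sum(yields) / len(yields)
direct = sum((y - mean) ** 2 for y in yields)

# Share of the variability attributable to standard vs. additive.
share_z1 = z_squared["Z1 standard vs. additive"] / total

print(z_squared, total, direct, round(share_z1, 3))
```

Under these assumptions the first contrast accounts for a bit over 83% of the total.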
Orthogonal Breakdowns in $2^k$ and $2^{k-p}$ Designs

Let n = 4. Let the four observed yields be the four yields of a $2^2$ factorial experiment:

Y1 = 1, Y2 = a, Y3 = b, Y4 = ab
Example: Miles per Gallon by Gas Type and Auto Make

  Group      1     2     3     4
            16    28    16    28
            22    27    25    30
            16    17    16    19
            10    20    16    18
            18    23    19    24
             8    23    16    25
  Mean      15    23    18    24
Suppose:

  Group   Treatment   Mean
  1       1           15
  2       a           23
  3       b           18
  4       ab          24

A = Gas Type = 0, 1
B = Auto Make = 0, 1
($2^2$ design)
Earlier we formed:

    1     a     b    ab
   -1     1    -1     1     estimate of 2A
   -1    -1     1     1     estimate of 2B
    1    -1    -1     1     estimate of 2AB
Which for present purposes we replace by:

           1       a       b      ab
  A      -1/2     1/2    -1/2     1/2
  B      -1/2    -1/2     1/2     1/2
  AB      1/2    -1/2    -1/2     1/2

Now we can see that these coefficients of the yields are elements of an orthogonal matrix. So A, B and AB constitute orthogonal estimates.
Standard one-way ANOVA:

  Source   SSQ   df   MSQ    F
  Col      324    3   108    5.4
  Error    400   20    20

F.95(3,20) = 3.1
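The one-way ANOVA figures can be recomputed from the raw miles-per-gallon data. This is a minimal sketch in plain Python, assuming the 6 x 4 layout of the data (six observations in each of the four groups); the names are illustrative.

```python
# Raw miles-per-gallon data, six observations per group.
groups = {
    1: [16, 22, 16, 10, 18, 8],
    2: [28, 27, 17, 20, 23, 23],
    3: [16, 25, 16, 16, 19, 16],
    4: [28, 30, 19, 18, 24, 25],
}

all_values = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_values) / len(all_values)
group_means = {g: sum(ys) / len(ys) for g, ys in groups.items()}

# Between-column sum of squares: n_j * (column mean - grand mean)^2.
ssb = sum(len(ys) * (group_means[g] - grand_mean) ** 2 for g, ys in groups.items())

# Within-column (error) sum of squares.
sse = sum((y - group_means[g]) ** 2 for g, ys in groups.items() for y in ys)

df_between = len(groups) - 1
df_error = len(all_values) - len(groups)
f_ratio = (ssb / df_between) / (sse / df_error)

print(group_means, ssb, sse, round(f_ratio, 2))
```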
Then,

  A  = (1/√4)(-15 + 23 - 18 + 24) = 14/√4
  B  = (1/√4)(-15 - 23 + 18 + 24) =  4/√4
  AB = (1/√4)( 15 - 23 - 18 + 24) = -2/√4

  A^2 = 49, B^2 = 4, AB^2 = 1
Multiply each of these by the number of data points in each column (6):

  A^2:   6(49) = 294
  B^2:   6(4)  =  24
  AB^2:  6(1)  =   6
  TOTAL:         324
And, with the orthogonal breakdown:

ANOVA:

  Source   SSQ   df   MS    Fcalc
  Col      324    3
    A      294    1   294   14.7
    B       24    1    24    1.2
    AB       6    1     6     .3
  Error    400   20    20

F.95(1,20) = 4.3
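The A, B, AB breakdown follows mechanically from the sign table. The sketch below assumes the standard sign rows for a 2^2 design, the four cell means, six observations per cell, and the error mean square of 20 from the one-way ANOVA.

```python
order = ["1", "a", "b", "ab"]
means = {"1": 15, "a": 23, "b": 18, "ab": 24}
n_per_cell = 6   # observations per column
mse = 20.0       # error mean square, 400 / 20

# Sign rows in the order 1, a, b, ab; dividing by 2 makes each row unit length.
signs = {
    "A": [-1, 1, -1, 1],
    "B": [-1, -1, 1, 1],
    "AB": [1, -1, -1, 1],
}

breakdown = {}
for effect, row in signs.items():
    z = sum(s * means[t] for s, t in zip(row, order)) / 2
    ssq = n_per_cell * z * z
    breakdown[effect] = (z, ssq, ssq / mse)
    print(f"{effect}: Z = {z}, SSQ = {ssq}, F = {ssq / mse}")

total_ssq = sum(ssq for _, ssq, _ in breakdown.values())
print("total SSQ =", total_ssq)
```

The three components add up to the column sum of squares, 324.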
If:

  Group   Treatment   Mean
  1       c           15
  2       a           23
  3       b           18
  4       abc         24

A = Gas Type
B = Auto Make
C = Highway
($2^{3-1}$ design)
We’d get the same breakdown of the SSQ, but, this being the + block of I = ABC, each component now estimates an aliased pair of effects:

ANOVA:

  Source     SSQ   df
  Col        324    3
    A+BC     294    1
    B+AC      24    1
    AB+C       6    1
  Error      400   20

etc.
What if contrasts of interest are not orthogonal?
• Let k be the number of contrasts of interest.
• If k <= c-1: Bonferroni method
• If k > c-1: Bonferroni or Scheffe method

*Bonferroni Method: The same F test (SSQ = R x Zi^2), but with each contrast tested at level α/k, where α is the overall error rate.
*Scheffe Method: p.108, skipped.

Reference: Statistical Principles of Research Design and Analysis by Robert O. Kuehl.
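The Bonferroni rule can be sketched without any statistics library. The block below is an illustration, not the course's procedure: it assumes contrasts with 1 numerator df, uses the fact that F(1, v) is the square of a t variable with v df, and gets the upper-tail p-value by Simpson's rule on the t density. The function names are hypothetical, and the F values are those of the Placebo/ASP/Buff example with 28 error df.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_pvalue_1df(f_stat, df_error, steps=2000):
    """Upper-tail p-value of F(1, df_error), using F = t^2 and Simpson's rule."""
    upper = math.sqrt(f_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    central = total * h / 3  # P(0 < T < sqrt(F))
    return 1.0 - 2.0 * central


alpha = 0.05  # overall error rate
k = 3         # number of contrasts tested
f_values = {"Z1": 8.53, "Z2": 0.80, "Z3": 13.07}

decisions = {}
for name, f_stat in f_values.items():
    p = f_pvalue_1df(f_stat, 28)
    decisions[name] = p < alpha / k  # Bonferroni: compare with alpha / k
    print(f"{name}: p = {p:.4f}, reject at alpha/k = {decisions[name]}")
```

Z1 and Z3 are still rejected at level α/k, in line with the slide's bound F1-.05/3(1,28) < 7.64, which both F values exceed.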
If k contrasts are orthogonal,
• Can k be larger than c-1? No.
• Can k be smaller than c-1? Yes.

For case 2, do the same F test (but the sum of SSQ will not be equal to SSB). See Slide 16.