Design of Statistical Investigations



  1. Design of Statistical Investigations 9 Unbalanced Designs Stephen Senn SJS SDI_9

  2. Lack of Orthogonality
     • So far we have been considering “balanced designs”
       • for example, every treatment appears equally frequently in every block
     • Sometimes we do not have such balance
       • by accident
         • missing observations
       • by design

  3. Consequences
     • Some loss of efficiency
       • compared to some theoretical optimum
       • CAUTION: this optimum may not be attainable in practice, which may be why an unbalanced design has been chosen
     • Complications in analysis
       • Sums of squares may depend on which other terms have been fitted
       • so far only the residual sum of squares has had this property

  4. Exp_11: Senn 2002, Example 5.1
     • Cross-over trial in asthma
     • Comparison of salbutamol, formoterol and placebo
     • Trial run in six sequences
     • Unequal numbers of patients per sequence

  5. Exp_11: Sequences and Periods
     Number of observations

         Sequence   Period I   Period II   Period III
         FSP            5          5           5
         SPF            3          3           3
         PFS            6          6           6
         FPS            6          6           6
         SFP            5          5           5
         PSF            5          5           5

     Patients by sequence

         FSP   SPF   PFS   FPS   SFP   PSF
          5     3     6     6     5     5

     Note that although there are no missing data due to patients not having completed a sequence, the numbers of patients are unbalanced by sequence.

  6. Exp_11: Data
     Columns: patient, sequence, and the response in periods I, II and III.

         Patient  Sequence     I     II    III
            1     FSP       3500   3200   2900
           10     FSP       3400   2800   2200
           17     FSP       2300   2200   1700
           21     FSP       2300   1300   1400
           23     FSP       3000   2400   1800
            4     SPF       2200   1100   2600
            8     SPF       2800   2000   2800
           16     SPF       2400   1700   3400
            6     PFS       2200   2500   2400
            9     PFS       2200   3200   3300
           13     PFS        800   1400   1000
           20     PFS        950   1320   1480
           26     PFS       1700   2600   2400
           31     PFS       1400   2500   2200
            2     FPS       3100   1800   2400
           11     FPS       2800   1600   2200
           14     FPS       3100   1600   1400
           19     FPS       2300   1500   2200
           25     FPS       3000   1700   2600
           28     FPS       3100   2100   2800
            3     SFP       2100   3200   1000
           12     SFP       1600   2300   1600
           18     SFP       1600   1400    800
           24     SFP       3100   3200   1000
           27     SFP       2800   3100   2000
            5     PSF        900   1900   2900
            7     PSF       1500   2600   2000
           15     PSF       1200   2200   2700
           22     PSF       2400   2600   3800
           30     PSF       1900   2700   2800

  7. Exp_11: Not Fitting Period

         > fit1 <- lm(fev1 ~ patient + treat)
         > summary(fit1, corr = F)
         Coefficients:
                      Value Std. Error   t value Pr(>|t|)
         …...
         treatS   -424.6667    87.3127   -4.8637   0.0000
         treatP  -1099.0000    87.3127  -12.5869   0.0000
         Residual standard error: 338.2 on 58 degrees of freedom
         Multiple R-Squared: 0.8569

  8. Exp_11: Fitting Period

         > fit2 <- update(fit1, . ~ . + period)
         > summary(fit2, corr = F)
         Call: lm(formula = fev1 ~ patient + treat + period)
         Coefficients:
                         Value Std. Error   t value Pr(>|t|)
         …...
         treatS      -422.6220    88.2647   -4.7881   0.0000
         treatP     -1103.4638    87.8208  -12.5649   0.0000
         periodII    -109.7228    87.8208   -1.2494   0.2167
         periodIII    -42.7659    88.2647   -0.4845   0.6299
         Residual standard error: 339.4 on 56 degrees of freedom
         Multiple R-Squared: 0.8608

  9. Exp_11: ANOVA

         > aov.1 <- aov(fev1 ~ patient + treat)
         > summary(aov.1)
                    Df  Sum of Sq   Mean Sq   F Value          Pr(F)
         patient    29   21279472    733775   6.41677  1.065573e-009
         treat       2   18428682   9214341  80.57832  0.000000e+000
         Residuals  58    6632451    114353

         > aov.2 <- aov(fev1 ~ patient + period)
         > summary(aov.2)
                    Df  Sum of Sq   Mean Sq   F Value      Pr(F)
         patient    29   21279472  733774.9  1.703663  0.0424500
         period      2      80282   40141.1  0.093199  0.9111486
         Residuals  58   24980851  430704.3

  10. Exp_11: ANOVA

          > aov.3 <- aov(fev1 ~ patient + period + treat)
          > summary(aov.3)
                     Df  Sum of Sq  Mean Sq   F Value      Pr(F)
          patient    29   21279472   733775   6.37115  0.0000000
          period      2      80282    40141   0.34853  0.7072422
          treat       2   18531248  9265624  80.45067  0.0000000
          Residuals  56    6449603   115171

          > aov.4 <- aov(fev1 ~ patient + treat + period)
          > summary(aov.4)
                     Df  Sum of Sq  Mean Sq   F Value      Pr(F)
          patient    29   21279472   733775   6.37115  0.0000000
          treat       2   18428682  9214341  80.00540  0.0000000
          period      2     182848    91424   0.79381  0.4571415
          Residuals  56    6449603   115171

  11. Exp_11: ANOVA

          > ssType3(aov.3)
          Type III Sum of Squares
                     Df  Sum of Sq  Mean Sq   F Value      Pr(F)
          patient    29   21279472   733775   6.37115  0.0000000
          period      2     182848    91424   0.79381  0.4571415
          treat       2   18531248  9265624  80.45067  0.0000000
          Residuals  56    6449603   115171

          > ssType3(aov.4)
          Type III Sum of Squares
                     Df  Sum of Sq  Mean Sq   F Value      Pr(F)
          patient    29   21279472   733775   6.37115  0.0000000
          treat       2   18531248  9265624  80.45067  0.0000000
          period      2     182848    91424   0.79381  0.4571415
          Residuals  56    6449603   115171
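
The Exp_11 analyses can be re-run in present-day R along the following lines. This is a sketch, not the original S-PLUS session: the data are re-typed from the data slide, the names `raw`, `exp11`, `seq_3`, `seq_4` and `marginal` are made up for illustration, and base R's `drop1()` is used as a stand-in for `ssType3()` on the assumption that, in a main-effects model like this one, "each term adjusted for all the others" is what the Type III table reports.

```r
# Exp_11 data re-typed from the data slide: patient, sequence, response in periods I-III.
raw <- read.table(header = TRUE, text = "
patient sequence I II III
1  FSP 3500 3200 2900
10 FSP 3400 2800 2200
17 FSP 2300 2200 1700
21 FSP 2300 1300 1400
23 FSP 3000 2400 1800
4  SPF 2200 1100 2600
8  SPF 2800 2000 2800
16 SPF 2400 1700 3400
6  PFS 2200 2500 2400
9  PFS 2200 3200 3300
13 PFS  800 1400 1000
20 PFS  950 1320 1480
26 PFS 1700 2600 2400
31 PFS 1400 2500 2200
2  FPS 3100 1800 2400
11 FPS 2800 1600 2200
14 FPS 3100 1600 1400
19 FPS 2300 1500 2200
25 FPS 3000 1700 2600
28 FPS 3100 2100 2800
3  SFP 2100 3200 1000
12 SFP 1600 2300 1600
18 SFP 1600 1400  800
24 SFP 3100 3200 1000
27 SFP 2800 3100 2000
5  PSF  900 1900 2900
7  PSF 1500 2600 2000
15 PSF 1200 2200 2700
22 PSF 2400 2600 3800
30 PSF 1900 2700 2800
")

# Reshape to one row per patient and period; the j-th letter of the
# sequence is the treatment given in period j.
periods <- c("I", "II", "III")
exp11 <- do.call(rbind, lapply(seq_along(periods), function(j) {
  data.frame(
    patient = factor(raw$patient),
    period  = factor(periods[j], levels = periods),
    treat   = factor(substr(as.character(raw$sequence), j, j),
                     levels = c("F", "S", "P")),
    fev1    = raw[[periods[j]]]
  )
}))

# Treatment estimates without a period term.
fit1 <- lm(fev1 ~ patient + treat, data = exp11)
print(coef(fit1)[c("treatS", "treatP")])

# Sequential sums of squares under the two orders of fitting.
seq_3 <- anova(lm(fev1 ~ patient + period + treat, data = exp11))
seq_4 <- anova(lm(fev1 ~ patient + treat + period, data = exp11))
print(seq_3)
print(seq_4)

# Each term adjusted for all the others (assumed analogue of ssType3).
marginal <- drop1(lm(fev1 ~ patient + period + treat, data = exp11), test = "F")
print(marginal)
```

In the sequential tables only the last term fitted is adjusted for everything else, which is why the period and treatment sums of squares change with the order while the `drop1()` table does not.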

  12. Exp_11: Standard Errors
      • Period effect not fitted
      • Period effect fitted

  13. Incomplete Blocks
      • These designs arise when the number of treatments exceeds the number of units in a typical block
      • It is not possible to have every treatment in every block
      • Each block receives a subset of the treatments
      • These are to be chosen in a sensible manner

  14. Exp_12: Senn 2002, Example 7.2
      • Placebo (P) controlled cross-over design to compare two doses of formoterol
        • F12: 12 μg in a single puff
        • F24: 24 μg in a single puff
      • Patients could only be treated in two periods
      • Incomplete blocks design
      • 24 patients to be allocated in equal numbers to each of six sequences

  15. Exp_12: Sequences used

          Period 1   Period 2
          P          F12
          F12        P
          P          F24
          F24        P
          F12        F24
          F24        F12

      The basic design is said to be that of balanced incomplete blocks. In this context balance has a special meaning: each pair of treatments appears together in a block equally often. Because this is a cross-over design and we are worried about period effects, the design is also balanced by period (order), but that is another matter.

  16. Exp_12: The sad reality
      • Two incorrect packs were picked up.
        • One was for the correct sequence
        • One was not

      Numbers of observations

          Sequence   Period 1   Period 2
          F12F24         3          3
          F12P           5          5
          F24F12         4          4
          F24P           4          4
          PF12           4          4
          PF24           4          4

      • F12F24 has one fewer patient
      • F12P has one more

  17. Exp_12: The Data
      Columns: patient, sequence, and the response in periods 1 and 2.

          Patient  Sequence  Period 1  Period 2
             6     F12F24     2.500     2.450
            10     F12F24     1.750     1.725
            15     F12F24     1.370     1.120
             4     F12P       3.400     2.500
            11     F12P       2.250     1.925
            14     F12P       1.460     1.260
            21     F12P       1.480     0.880
            23     F12P       2.050     2.100
             2     F24F12     2.700     2.250
            12     F24F12     0.900     0.925
            13     F24F12     1.270     1.010
            24     F24F12     2.150     2.100
             3     F24P       1.750     1.350
             7     F24P       2.525     2.150
            18     F24P       1.080     0.840
            22     F24P       3.120     2.310
             5     PF12       2.500     3.500
             9     PF12       1.600     2.650
            16     PF12       1.750     2.190
            19     PF12       0.640     0.840
             1     PF24       2.100     3.100
             8     PF24       2.300     2.700
            17     PF24       1.030     1.870
            20     PF24       0.810     0.940

  19. Exp_12: Analysis 1

          > fit1 <- lm(FEV1 ~ patient + period + treat)
          > summary(fit1, corr = F)
          Call: lm(formula = FEV1 ~ patient + period + treat)
          ...
          Coefficients:
                          Value Std. Error  t value Pr(>|t|)
          (Intercept)    2.8164     0.1854  15.1874   0.0000
          patient2      -0.3770     0.2350  -1.6042   0.1236
          ...
          patient24     -0.7270     0.2350  -3.0933   0.0055
          period         0.0310     0.0667   0.4652   0.6466
          treatF24       0.0402     0.0973   0.4134   0.6835
          treatP        -0.5041     0.0914  -5.5148   0.0000

  20. Exp_12: Analysis 2

          > aov1 <- aov(FEV1 ~ patient + period + treat)
          > summary(aov1)
                     Df  Sum of Sq   Mean Sq   F Value      Pr(F)
          patient    23   22.46280  0.976643  18.37451  0.0000000
          period      1    0.00083  0.000833   0.01568  0.9015459
          treat       2    2.32792  1.163962  21.89871  0.0000073
          Residuals  21    1.11619  0.053152

          > ssType3(aov1)
          Type III Sum of Squares
                     Df  Sum of Sq   Mean Sq   F Value      Pr(F)
          patient    23   23.64324  1.027967  19.34011  0.0000000
          period      1    0.01150  0.011501   0.21638  0.6466003
          treat       2    2.32792  1.163962  21.89871  0.0000073
          Residuals  21    1.11619  0.053152

  21. Exp_12: Analysis 3

          > aov2 <- aov(FEV1 ~ patient + treat + period)
          > summary(aov2)
                     Df  Sum of Sq   Mean Sq   F Value      Pr(F)
          patient    23   22.46280  0.976643  18.37451  0.0000000
          treat       2    2.31726  1.158628  21.79836  0.0000075
          period      1    0.01150  0.011501   0.21638  0.6466003
          Residuals  21    1.11619  0.053152

          > ssType3(aov2)
          Type III Sum of Squares
                     Df  Sum of Sq   Mean Sq   F Value      Pr(F)
          patient    23   23.64324  1.027967  19.34011  0.0000000
          treat       2    2.32792  1.163962  21.89871  0.0000073
          period      1    0.01150  0.011501   0.21638  0.6466003
          Residuals  21    1.11619  0.053152
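
A corresponding sketch for Exp_12 in present-day R is given below. The data are re-typed from the data slide; the column names `t1`, `t2`, `y1`, `y2` and the objects `raw`, `exp12`, `seq_ss`, `marg_ss` are illustrative choices, period is entered as a number (1 or 2) because the slides show a single `period` coefficient, and `drop1()` is again assumed to be an acceptable base-R substitute for `ssType3()` in a main-effects model.

```r
# Exp_12 data re-typed from the data slide.
# t1, t2: treatments in periods 1 and 2; y1, y2: responses in periods 1 and 2.
raw <- read.table(header = TRUE, text = "
patient t1 t2 y1 y2
6  F12 F24 2.500 2.450
10 F12 F24 1.750 1.725
15 F12 F24 1.370 1.120
4  F12 P   3.400 2.500
11 F12 P   2.250 1.925
14 F12 P   1.460 1.260
21 F12 P   1.480 0.880
23 F12 P   2.050 2.100
2  F24 F12 2.700 2.250
12 F24 F12 0.900 0.925
13 F24 F12 1.270 1.010
24 F24 F12 2.150 2.100
3  F24 P   1.750 1.350
7  F24 P   2.525 2.150
18 F24 P   1.080 0.840
22 F24 P   3.120 2.310
5  P F12   2.500 3.500
9  P F12   1.600 2.650
16 P F12   1.750 2.190
19 P F12   0.640 0.840
1  P F24   2.100 3.100
8  P F24   2.300 2.700
17 P F24   1.030 1.870
20 P F24   0.810 0.940
")

# One row per patient and period.
exp12 <- data.frame(
  patient = factor(rep(raw$patient, times = 2)),
  period  = rep(1:2, each = nrow(raw)),
  treat   = factor(c(as.character(raw$t1), as.character(raw$t2)),
                   levels = c("F12", "F24", "P")),
  FEV1    = c(raw$y1, raw$y2)
)

fit <- lm(FEV1 ~ patient + period + treat, data = exp12)

# Treatment contrasts against F12, with standard errors, to set beside the slides.
print(summary(fit)$coefficients[c("treatF24", "treatP"), ])

# Sequential sums of squares (order: patient, period, treat) ...
seq_ss <- anova(fit)
print(seq_ss)

# ... and each term adjusted for all the others.
marg_ss <- drop1(fit, test = "F")
print(marg_ss)
```

Because each block (patient) now holds only two of the three treatments, patients are no longer orthogonal to treatments, so even the patient sum of squares differs between the sequential and the adjusted tables.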

  22. Standard Errors
      • Consider the standard error of the contrast F24 versus F12
      • This is given as 0.0973
      • How could this be calculated?
      • There are two sequences in which these drugs could be compared
        • F12F24 with 3 patients
        • F24F12 with 4 patients

  23. However
      Thus the standard error we have from fitting the regression model is actually lower than that produced by a naïve argument.
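
The naïve calculation itself has not survived in this transcript, so the following is only one plausible reading of it, under stated assumptions: the pooled residual mean square 0.053152 from the Exp_12 ANOVA is used as the within-patient variance, a within-patient difference therefore has variance twice that, and only the 3 + 4 patients in the F12F24 and F24F12 sequences are used. Two variants are shown, one ignoring period and one adjusting for it; the variable names are illustrative.

```r
mse      <- 0.053152   # residual mean square from the Exp_12 ANOVA slides
n_F12F24 <- 3          # patients in sequence F12F24
n_F24F12 <- 4          # patients in sequence F24F12

# (a) Ignoring period: average the 7 within-patient differences (F24 - F12).
#     Each difference has variance 2 * mse.
se_unadjusted <- sqrt(2 * mse / (n_F12F24 + n_F24F12))

# (b) Adjusting for period: half the difference between the two sequences'
#     mean period differences, so variance = (1/4) * 2 * mse * (1/3 + 1/4).
se_adjusted <- sqrt((2 * mse / 4) * (1 / n_F12F24 + 1 / n_F24F12))

se_regression <- 0.0973  # standard error of treatF24 reported on the slides

print(round(c(unadjusted = se_unadjusted,
              adjusted   = se_adjusted,
              regression = se_regression), 4))
```

Under either variant the direct within-sequence comparison gives a standard error above the 0.0973 obtained from the regression model.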

  24. Questions: Exp_12
      • Why is the SE produced by the regression analysis lower than that produced by using the pooled MSE and the direct comparison of the means?
      • What would the treatment estimate be if this naïve approach were used?
      • How does it compare to that produced?
      • What further information is the regression approach taking into account?

  25. Block Size and Comparisons
      Suppose that the block size is k (there are k units per block) and that there are b blocks, and hence bk units, in total.
      Suppose that we have v treatments and r replicates. There must also be rv units in total.
      Hence rv = bk = N.
      Each block permits k(k-1)/2 comparisons. There are bk(k-1)/2 in total.
      However, there are v(v-1)/2 possible pair-wise comparisons.

  26. Block Size and Comparisons
      Let λ be the average number of repetitions of the pair-wise comparisons in the design. Hence

          λ = [bk(k-1)/2] / [v(v-1)/2] = bk(k-1) / [v(v-1)] = r(k-1)/(v-1),

      using bk = rv. Obviously, unless this is an integer, it will not be possible to “balance” the blocks. If v-1 is a multiple of k-1 then it becomes particularly easy to balance the blocks.
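
The counting argument can be turned into a small helper. This is a sketch with a made-up function name, `pair_repetitions()`: it divides the bk(k-1)/2 within-block comparisons by the v(v-1)/2 possible pairs and reports whether the result is a whole number. A whole number is only a necessary condition for balance; the helper does not show that a balanced design actually exists.

```r
# Average number of times each pair of treatments is compared within a block,
# for v treatments in b blocks of size k.
pair_repetitions <- function(v, k, b) {
  r      <- b * k / v                           # replicates per treatment (rv = bk)
  lambda <- b * k * (k - 1) / (v * (v - 1))     # total comparisons / possible pairs
  list(r = r,
       lambda = lambda,
       lambda_is_integer = abs(lambda - round(lambda)) < 1e-9)
}

# Three treatments in six blocks of two (the planned Exp_12 layout of six sequences).
str(pair_repetitions(v = 3, k = 2, b = 6))

# Seven treatments in seven blocks of four.
str(pair_repetitions(v = 7, k = 4, b = 7))

# Five treatments in five blocks of three: the average is not a whole number.
str(pair_repetitions(v = 5, k = 3, b = 5))
```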

  27. Exp_13
      • It was desired to compare three doses each of two formulations of formoterol to placebo
        • ISF6, ISF12, ISF24
        • MTA6, MTA12, MTA24
        • Placebo
      • There are thus seven treatments
      • Maximum number of acceptable periods was deemed to be five

  28. Exp_13: Possible solution
      • Since 7-1 = 6 is twice 4-1 = 3, use a design in 4 periods
      • If seven sequences are used it will also be possible to make the treatments “uniform” on the periods
      • There are (7 × 6)/2 = 21 possible pair-wise comparisons of treatments
      • Each patient provides (4 × 3)/2 = 6 possible comparisons
      • There are 7 × 6 = 42 = 2 × 21 such comparisons per set of seven sequences

  29. A Balanced Design Uniform on the Periods for 7 treatments in 4 periods
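
The design table from this slide is not preserved in the transcript. As an illustration only, and not necessarily the layout shown in the lecture, one standard way to obtain such a design is to develop the starting block {0, 3, 5, 6} cyclically modulo 7. The sketch below builds it and checks both properties; the mapping of the codes 0 to 6 onto the Exp_13 treatment names is an arbitrary assumption.

```r
v <- 7
base_block <- c(0, 3, 5, 6)   # starting sequence; the others are cyclic shifts mod 7

# Rows are sequences, columns are periods; treatments are coded 0..6.
design <- t(sapply(0:(v - 1), function(i) (base_block + i) %% v))
colnames(design) <- paste("Period", 1:4)

# Uniform on the periods: every treatment occurs exactly once in each period.
uniform <- all(apply(design, 2, function(p) setequal(p, 0:(v - 1))))

# Concurrence matrix: the diagonal counts replicates per treatment,
# off-diagonal entries count how often two treatments share a sequence.
concurrence <- matrix(0L, v, v)
for (s in seq_len(nrow(design))) {
  trts <- design[s, ] + 1
  concurrence[trts, trts] <- concurrence[trts, trts] + 1L
}

# Arbitrary, illustrative assignment of codes to the seven Exp_13 treatments.
labels <- c("P", "ISF6", "ISF12", "ISF24", "MTA6", "MTA12", "MTA24")
labelled <- matrix(labels[design + 1], nrow = v, dimnames = dimnames(design))

print(labelled)
print(uniform)
print(concurrence)
```

Every treatment appears in four sequences and every pair of treatments shares exactly two sequences, matching the count of 42 = 2 × 21 comparisons per set of seven sequences.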

  30. Questions: Exp_13
      Exp_13 was in fact run using five periods and 21 sequences.
      • Check that such a design can be “balanced”
      An alternative considered was to use five periods and seven sequences.
      • Show that such a design cannot be balanced
      • Why might it be preferable to the design in four periods and seven sequences?
