Business Statistics, 6th ed. by Ken Black. Chapter 11: Analysis of Variance
Learning Objectives • Understand the differences between various experimental designs and when to use them. • Compute and interpret the results of a one-way ANOVA. • Compute and interpret the results of a randomized block design. • Compute and interpret the results of a two-way ANOVA. • Understand and interpret interaction. • Know when and how to use multiple comparison techniques.
Introduction to Design of Experiments Experimental Design • a plan and a structure to test hypotheses in which the researcher controls or manipulates one or more variables.
Introduction to Design of Experiments Independent Variable • Treatment variable - one that the experimenter controls or modifies in the experiment. • Classification variable - a characteristic of the experimental subjects that was present prior to the experiment, and is not a result of the experimenter’s manipulations or control. • Levels or Classifications - the subcategories of the independent variable used by the researcher in the experimental design. • Independent variables are also referred to as factors.
Independent Variable • Manipulation of the independent variable depends on the concept being studied • The researcher studies the phenomenon under each condition (level) of the variable
Introduction to Design of Experiments • Dependent Variable • the response to the different levels of the independent variables. • Analysis of Variance (ANOVA) – a group of statistical techniques used to analyze experimental designs. • ANOVA begins with the notion that the individual items being studied are all the same
Three Types of Experimental Designs • Completely Randomized Design – subjects are assigned randomly to treatments; single independent variable. • Randomized Block Design – includes a blocking variable; single independent variable. • Factorial Experiments – two or more independent variables are explored at the same time; every level of each factor is studied under every level of all other factors.
Completely Randomized Design • The completely randomized design contains only one independent variable with two or more treatment levels • If only two treatment levels of the independent variable are present, the design is the same as the one used to test the difference in the means of two independent populations presented in Chapter 10, which used the t test to analyze the data
Completely Randomized Design • A technique has been developed that analyzes all the sample means at one time and thus precludes the buildup of Type I error that results from running many pairwise tests: ANOVA • A completely randomized design is analyzed by one-way analysis of variance
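The slides do not include code, but a one-way ANOVA is easy to run in Python. Below is a minimal sketch assuming SciPy is installed; the three treatment-level samples are hypothetical.

```python
# Minimal one-way ANOVA sketch; the sample data are hypothetical.
from scipy import stats

level_1 = [4.1, 3.9, 4.3, 4.0, 4.2]
level_2 = [5.0, 4.8, 5.2, 4.9, 5.1]
level_3 = [4.4, 4.6, 4.5, 4.3, 4.7]

f_stat, p_value = stats.f_oneway(level_1, level_2, level_3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# Reject H0 (all treatment means equal) when p < alpha.
```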
Analysis of Variance • The null hypothesis states that the population means for all treatment levels are equal • If even one of the population means differs from the others, the null hypothesis is rejected • Testing the hypothesis is done by partitioning the total variance of the data into the following two variances • Variance resulting from the treatment (columns) • Error variance, or the portion of the total variance unexplained by the treatment
Analysis of Variance • The total sum of squares of variation is partitioned into the sum of squares of the treatment columns and the sum of squares of error • ANOVA compares the relative sizes of the treatment variation and the error variation • The error variation is unaccounted-for variation and can be viewed at this point as variation due to individual differences within the groups • If a significant treatment effect is present, the treatment variation should be large relative to the error variation
One-Way ANOVA: Computational Formulas • ANOVA is used to determine statistically whether the variance between the treatment-level means is greater than the variance within levels (error variance) • Assumptions underlying ANOVA • Normally distributed populations • Observations represent random samples from the populations • Variances of the populations are equal
One-Way ANOVA: Computational Formulas • ANOVA is computed with the three sums of squares • Total – Total Sum of Squares (SST); a measure of all variation in the dependent variable • Treatment – Sum of Squares Columns (SSC); measures the variation between treatments or columns, since independent variable levels are present in columns • Error – Sum of Squares of Error (SSE); yields the variation within treatments (or columns)
One-Way ANOVA: Computational Formulas • Other items • MSC – Mean Square of Columns • MSE – Mean Square of Error • MST – Mean Square Total • F value – determined by dividing the treatment variance (MSC) by the error variance (MSE) • The F value is a ratio of the treatment variance to the error variance
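As a sketch of how these quantities fit together, the following computes SST, SSC, SSE, the mean squares, and F by hand with NumPy on hypothetical data; the result should agree with scipy.stats.f_oneway on the same groups.

```python
# Hand computation of the one-way ANOVA sums of squares (hypothetical data).
import numpy as np

groups = [np.array([4.1, 3.9, 4.3, 4.0, 4.2]),
          np.array([5.0, 4.8, 5.2, 4.9, 5.1]),
          np.array([4.4, 4.6, 4.5, 4.3, 4.7])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
N, C = all_obs.size, len(groups)

sst = ((all_obs - grand_mean) ** 2).sum()                         # total variation
ssc = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)  # between columns
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within columns

msc = ssc / (C - 1)   # mean square, columns (treatments)
mse = sse / (N - C)   # mean square, error
f_value = msc / mse   # treatment variance relative to error variance
print(f"SST={sst:.4f} = SSC={ssc:.4f} + SSE={sse:.4f}; F={f_value:.2f}")
```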
Analysis of Variance for Valve Openings

Source of Variance    df     SS        MS         F
Between                3    0.23658   0.078860   10.18
Error                 20    0.15492   0.007746
Total                 23    0.39150
F Table • The F distribution table is in Table A7 • Associated with every F table are two unique df variables: degrees of freedom in the numerator and degrees of freedom in the denominator • Statistical computer software packages for computing ANOVA usually give a probability for the F value, which allows hypothesis-testing decisions for any alpha based on the p-value method
A Portion of the F Table for α = 0.05 (columns give the numerator degrees of freedom, df1; rows give the denominator degrees of freedom, df2)
One-Way ANOVA: Procedural Summary • Since F = 10.18 > Fc = 3.10, reject H0 • The observed F value falls in the rejection region, beyond the critical value that separates it from the nonrejection region
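The same decision can be reproduced in Python; a sketch assuming SciPy, with df1 = 3 and df2 = 20 taken from the valve-openings ANOVA table above.

```python
# Critical-value decision rule for the valve-openings ANOVA.
from scipy import stats

alpha, df1, df2 = 0.05, 3, 20
f_critical = stats.f.ppf(1 - alpha, df1, df2)   # about 3.10
f_observed = 10.18                              # from the ANOVA table

decision = "reject H0" if f_observed > f_critical else "fail to reject H0"
print(f"F = {f_observed} vs Fc = {f_critical:.2f}: {decision}")
```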
F and t Values • Analysis of variance can be used to test hypotheses about the difference in two means • Analysis of data from two samples by both a t test and an ANOVA shows that the observed F value equals the observed t value squared • F = t² • The t test of independent samples is actually a special case of one-way ANOVA when there are only two treatment levels
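A quick numerical check of the F = t² relationship, using two small hypothetical samples:

```python
# Two-sample t test vs. one-way ANOVA on the same data: F equals t squared.
from scipy import stats

x = [10.2, 9.8, 10.5, 10.1, 9.9]
y = [11.0, 10.7, 11.3, 10.9, 11.1]

t_stat, _ = stats.ttest_ind(x, y)    # pooled-variance (equal variances) t test
f_stat, _ = stats.f_oneway(x, y)     # one-way ANOVA with two treatment levels
print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")   # the two values match
```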
Multiple Comparison Tests • ANOVA techniques are useful in testing hypotheses about differences of means in multiple groups • Advantage: the probability of committing a Type I error is controlled • Multiple comparison techniques are used to identify which pairs of means are significantly different, given that the ANOVA test reveals overall significance
Multiple Comparison Tests • Multiple comparisons are used when an overall significant difference between groups has been determined using the F value of the analysis of variance • Tukey’s honestly significant difference (HSD) test requires equal sample sizes • Takes into consideration the number of treatment levels, value of mean square error, and sample size
Multiple Comparison Tests • Once HSD is computed, one can examine the absolute value of all differences between pairs of means from treatment levels to determine if it is a significant difference • Tukey-Kramer Procedure is used when sample sizes are unequal
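Both procedures are available in Python via statsmodels' pairwise_tukeyhsd, which accepts unequal group sizes; a sketch on hypothetical data:

```python
# Tukey HSD / Tukey-Kramer multiple comparisons (hypothetical data).
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([4.1, 3.9, 4.3, 4.0,        # group a (n = 4)
                   5.0, 4.8, 5.2, 4.9, 5.1,   # group b (n = 5)
                   4.4, 4.6, 4.5])            # group c (n = 3)
labels = np.array(["a"] * 4 + ["b"] * 5 + ["c"] * 3)

result = pairwise_tukeyhsd(values, labels, alpha=0.05)
print(result)   # one row per pair: mean difference, interval, reject yes/no
```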
Demonstration Problem 11.1 A company has three manufacturing plants, and company officials want to determine whether there is a difference in the average age of workers at the three locations. The following data are the ages of five randomly selected workers at each plant. Perform a one-way ANOVA to determine whether there is a significant difference in the mean ages of the workers at the three plants. Use α = .01 and note that the sample sizes are equal.
Data from Demonstration Problem 11.1

PLANT (Employee Age)
               1      2      3
              29     32     25
              27     33     24
              30     31     24
              27     34     25
              28     30     26
Group Means  28.2   32.0   24.8
nj             5      5      5

C = 3, dfE = N - C = 12, MSE = 1.63
Tukey’s HSD test • Since sample sizes are equal, Tukey’s HSD tests can be used to compute multiple comparison tests between groups. • To compute the HSD, the values of MSE, n, and q must be determined
q Values for α = .01

Degrees of          Number of Populations
Freedom         2       3       4       5    ...
   1           90     135     164     186
   2           14      19     22.3    24.7
   3          8.26    10.6    12.2    13.3
   4          6.51    8.12    9.17    9.96
   ...
  11          4.39    5.14    5.62    5.97
  12          4.32    5.04    5.50    5.84
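Putting the pieces together for Demonstration Problem 11.1 (MSE = 1.63, n = 5 per group, C = 3, dfE = 12, and q = 5.04 from the table above): HSD = q·√(MSE/n) ≈ 5.04·√(1.63/5) ≈ 2.88. A sketch of the full comparison, assuming SciPy 1.7+ for the studentized-range distribution:

```python
# Tukey HSD for Demonstration Problem 11.1 (equal sample sizes).
import math
from scipy import stats

mse, n, c, df_e, alpha = 1.63, 5, 3, 12, 0.01
q = stats.studentized_range.ppf(1 - alpha, c, df_e)  # ~5.04, matching the table
hsd = q * math.sqrt(mse / n)                         # ~2.88

means = {"plant 1": 28.2, "plant 2": 32.0, "plant 3": 24.8}
names = list(means)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        diff = abs(means[names[i]] - means[names[j]])
        verdict = "significant" if diff > hsd else "not significant"
        print(f"{names[i]} vs {names[j]}: |diff| = {diff:.1f} -> {verdict}")
```

All three pairwise differences (3.8, 3.4, and 7.2) exceed the HSD of about 2.88, so all three pairs of plant means differ significantly at α = .01.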
Tensile-Strength Example A metal-manufacturing firm wants to test the tensile strength of a given metal under varying conditions of temperature. Suppose that in the design phase, the metal is processed under five different temperature conditions and that random samples of size five are taken under each temperature condition. The data follow.
Tukey-Kramer Example: Means and Sample Sizes for the Four Operators

Operator   Sample Size    Mean
    1            5       6.3180
    2            8       6.2775
    3            7       6.4886
    4            4       6.2300
Tukey-Kramer Results for the Four Operators

Pair       Critical Difference   |Actual Difference|
1 and 2          .1405                 .0405
1 and 3          .1443                 .1706*
1 and 4          .1653                 .0880
2 and 3          .1275                 .2111*
2 and 4          .1509                 .0475
3 and 4          .1545                 .2586*
*denotes significant at .05
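These critical differences can be reproduced with the Tukey-Kramer formula q·√((MSE/2)(1/nᵢ + 1/nⱼ)), using MSE = .007746 and dfE = 20 from the valve-openings ANOVA table earlier (whose degrees of freedom match these four operators). A sketch, again assuming SciPy 1.7+:

```python
# Reproducing the Tukey-Kramer critical differences for the four operators.
import math
from scipy import stats

mse, df_e, alpha = 0.007746, 20, 0.05          # from the valve-openings ANOVA
sizes = {1: 5, 2: 8, 3: 7, 4: 4}
means = {1: 6.3180, 2: 6.2775, 3: 6.4886, 4: 6.2300}

q = stats.studentized_range.ppf(1 - alpha, len(sizes), df_e)  # ~3.96
for a in sizes:
    for b in sizes:
        if a < b:
            crit = q * math.sqrt((mse / 2) * (1 / sizes[a] + 1 / sizes[b]))
            diff = abs(means[a] - means[b])
            flag = "*" if diff > crit else ""
            print(f"{a} and {b}: critical {crit:.4f}, actual {diff:.4f}{flag}")
```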
Randomized Block Design • Randomized block design - focuses on one independent variable (treatment variable) of interest. • Includes a second variable (blocking variable) used to control for confounding or concomitant variables. • Variables that are not being controlled by the researcher in the experiment • Can have an effect on the outcome of the treatment being studied.
Randomized Block Design • Repeated measures design - is a design in which each block level is an individual item or person, and that person or item is measured across all treatments.
Randomized Block Design • In a completely randomized design, the sum of squares is • SST = SSC + SSE • In a randomized block design, the sum of squares is • SST = SSC + SSR + SSE • SSR (the blocking effect) comes out of the SSE • Some of the error variation in the completely randomized design is due to blocking effects, which the randomized block design separates out, as shown in the next slide
Randomized Block Design Treatment Effects: Procedural Overview • The observed F value for treatments computed using the randomized block design formula is tested by comparing it to a table F value • If the observed F value is greater than the table value, the null hypothesis is rejected for that alpha value • If the F value for blocks is greater than the critical F value, the null hypothesis that all block population means are equal is rejected
Randomized Block Design Treatment Effects: Procedural Overview
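For reference, the standard mean squares and F ratios for a randomized block design with C treatment levels and n blocks are sketched below; these are the textbook quantities implied by the procedural overview, not reproduced verbatim from the slide.

```latex
% Randomized block design mean squares and F ratios
MSC = \frac{SSC}{C-1}, \qquad
MSR = \frac{SSR}{n-1}, \qquad
MSE = \frac{SSE}{(C-1)(n-1)}
\qquad
F_{\text{treatments}} = \frac{MSC}{MSE}, \qquad
F_{\text{blocks}} = \frac{MSR}{MSE}
```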
Randomized Block Design: Tread-Wear Example As an example of the application of the randomized block design, consider a tire company that developed a new tire. The company conducted tread-wear tests on the tire to determine whether there is a significant difference in tread wear if the average speed with which the automobile is driven varies. The company set up an experiment in which the independent variable was the speed of the automobile. There were three treatment levels: slow, medium, and fast.
Randomized Block Design: Tread-Wear Example

                        Speed
Supplier     Slow   Medium   Fast   Block Means
    1         3.7     4.5     3.1      3.77
    2         3.4     3.9     2.8      3.37
    3         3.5     4.1     3.0      3.53
    4         3.2     3.5     2.6      3.10
    5         3.9     4.8     3.4      4.03
Treatment
Means        3.54    4.16    2.98   Grand Mean = 3.56

n = 5, C = 3, N = 15
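A sketch computing this randomized block ANOVA directly from the tread-wear data with NumPy (rows are supplier blocks, columns are speed treatments):

```python
# Randomized block ANOVA for the tread-wear data above.
import numpy as np

data = np.array([[3.7, 4.5, 3.1],
                 [3.4, 3.9, 2.8],
                 [3.5, 4.1, 3.0],
                 [3.2, 3.5, 2.6],
                 [3.9, 4.8, 3.4]])
n, C = data.shape                    # n = 5 blocks, C = 3 treatment levels
grand = data.mean()                  # 3.56

ssc = n * ((data.mean(axis=0) - grand) ** 2).sum()   # treatments (speed)
ssr = C * ((data.mean(axis=1) - grand) ** 2).sum()   # blocks (supplier)
sst = ((data - grand) ** 2).sum()
sse = sst - ssc - ssr                                # SST = SSC + SSR + SSE

msc, msr = ssc / (C - 1), ssr / (n - 1)
mse = sse / ((C - 1) * (n - 1))
print(f"F treatments = {msc / mse:.2f}, F blocks = {msr / mse:.2f}")
```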