
Presentation Transcript


  1. Chapter 12 Prediction/Regression Part 2: Nov. 19, 2013

  2. Simple Regression Review • Regression allows us to predict y using x (predictor) • We create a regression equation or line based on one sample, then use it to predict y scores for a 2nd sample. • Ŷ = .688 + .92(x) • Interpreting ‘a’: y is predicted to equal .688 when x = 0 • Interpreting ‘b’: if we increase x by 1, y should increase by .92
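
As a quick sketch (the function name predict_y is mine, not from the slides), the regression equation and the meaning of a and b can be written in a few lines of Python:

    # Regression equation from the slide: Y-hat = .688 + .92(x)
    def predict_y(x, a=0.688, b=0.92):
        """Predicted y (criterion) for a given x (predictor)."""
        return a + b * x

    # Interpreting 'a': the prediction when x = 0
    print(round(predict_y(0), 3))   # 0.688
    # Interpreting 'b': raising x by 1 raises the prediction by 0.92
    print(round(predict_y(1), 3))   # 1.608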

  3. Drawing the Regression Line • Draw and label the axes for a scatter diagram • Figure the predicted value on the criterion for a low value of the predictor variable • You can choose any value to plug in… • Maybe x = 1, so Ŷ = .688 + .92(1) = 1.61 • Repeat step 2 with a high value of the predictor • Maybe x = 6, so Ŷ = .688 + .92(6) = 6.21 • Draw a line passing through the two marks (1, 1.61) and (6, 6.21) • Hint: you can also use (Mx, My) as one of your 2 points to save time; the reg line always passes through the means of x and y.
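
A minimal plotting sketch of these steps, assuming matplotlib is installed (the variable names are mine); the two marks and the (Mx, My) hint come from the slide:

    import matplotlib.pyplot as plt

    a, b = 0.688, 0.92                               # intercept and slope from the slides

    # Steps 2-3: predicted y for a low and a high value of x
    x_low, x_high = 1, 6
    y_low, y_high = a + b * x_low, a + b * x_high    # about 1.61 and 6.21

    # Step 4: draw the line through the two marks (1, 1.61) and (6, 6.21)
    plt.plot([x_low, x_high], [y_low, y_high], label="regression line")

    # Hint: the line also passes through the means (Mx, My) = (3.6, 4.0)
    plt.scatter([3.6], [4.0], marker="x", label="(Mx, My)")

    plt.xlabel("x (predictor)")
    plt.ylabel("y (criterion)")
    plt.legend()
    plt.show()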

  4. Regression Error • Now that you have a regression line or equation built from 1 sample, you can find predicted y scores using a new sample of x scores (Sample 2)… • Then, assume that you later collect data on Sample 2’s actual y scores • Compare the accuracy of predicted ŷ to the actual y scores for Sample 2 • Sometimes you’ll overestimate, sometimes underestimate…this is ERROR. • Can we get a measure of error? How much is OK?
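
As an illustration only (the Sample 2 numbers below are made up, not from the slides), comparing predicted ŷ to actual y scores is a simple subtraction:

    a, b = 0.688, 0.92

    # Hypothetical Sample 2: x scores known now, actual y scores collected later
    sample2_x = [2, 4, 5]          # assumed example values
    sample2_y = [3.0, 4.0, 6.0]    # assumed example values

    for x, y in zip(sample2_x, sample2_y):
        y_hat = a + b * x
        error = y - y_hat          # positive = underestimate, negative = overestimate
        print(f"x={x}: predicted {y_hat:.2f}, actual {y}, error {error:+.2f}")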

  5. Error in Regression • Error = actual score minus the predicted score, (y − ŷ) • Proportionate Reduction in Error (PRE) • Squared error using the prediction (reg) model = SSError = Σ(y − ŷ)² • Compare this to the amount of error without this prediction (reg) model. • If there is no other model, the best guess would be the mean. • Total squared error when predicting from the mean = SSTotal = Σ(y − My)²
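
In code, the two sums of squares named on this slide might look like the following sketch (plain Python; the helper names ss_error and ss_total are mine):

    def ss_error(y, y_hat):
        """Squared error using the regression model: sum of (y - y-hat)^2."""
        return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))

    def ss_total(y):
        """Squared error when predicting the mean for everyone: sum of (y - My)^2."""
        my = sum(y) / len(y)
        return sum((yi - my) ** 2 for yi in y)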

  6. Error and Proportionate Reduction in Error • Formula for proportionate reduction in error: PRE = (SSTotal − SSError) / SSTotal • It compares the reg model to the mean baseline (predicting that everyone’s y score will be at the mean) • We want the reg model to be much better than the mean (baseline) → that would indicate fewer prediction errors • So you want PRE to be large…
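
A one-function sketch of this ratio (the function name is mine; the 16 and 3.09 in the usage line are the sums of squares computed on the later slides):

    def proportionate_reduction_in_error(ss_total, ss_error):
        """PRE = (SS_total - SS_error) / SS_total: 0 = no better than the mean, 1 = perfect."""
        return (ss_total - ss_error) / ss_total

    print(proportionate_reduction_in_error(16, 3.09))   # about 0.807, as on the last slide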

  7. Reg model was ŷ = .688 + .92(x) • Use the mean model to find error (y − My)² for each person & sum up that column → SSTotal • Find the prediction using the reg model: plug the x values into the reg model to get ŷ • Find (y − ŷ)² for each person, sum up that column → SSError • Find PRE
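
Putting these steps together with the data used on the following slides (x = 6, 1, 5, 3, 3; y = 6, 2, 6, 4, 2), a worked sketch reproduces the numbers the tables below arrive at (small differences come from the slides rounding each ŷ before squaring):

    x = [6, 1, 5, 3, 3]
    y = [6, 2, 6, 4, 2]
    a, b = 0.688, 0.92

    # Predictions from the reg model for each person
    y_hat = [a + b * xi for xi in x]

    # Mean-model error: (y - My)^2 summed -> SSTotal
    my = sum(y) / len(y)                                           # 4.0
    ss_total = sum((yi - my) ** 2 for yi in y)                     # 16.0

    # Reg-model error: (y - y-hat)^2 summed -> SSError
    ss_error = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))   # about 3.1 (slides: 3.09)

    # Proportionate reduction in error
    pre = (ss_total - ss_error) / ss_total                         # about 0.806 (slides: .807)
    print(round(ss_total, 2), round(ss_error, 2), round(pre, 3))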

  8. If our reg model is no better than the mean, SSError = SSTotal, so PRE = (SSTotal − SSError) / SSTotal = 0 / SSTotal = 0. • Using this regression model, we reduce error over the mean model by 0%… not good prediction. • If the reg model has 0 error (perfect prediction), PRE = (SSTotal − 0) / SSTotal = 1, or a 100% reduction in error. • Proportionate reduction in error = r² • aka the “proportion of variance in y accounted for by x”; it ranges from 0 to 100%.
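
The r² claim can be checked numerically; a short sketch assuming NumPy is available, using the same five scores as the next slides:

    import numpy as np

    x = np.array([6, 1, 5, 3, 3])
    y = np.array([6, 2, 6, 4, 2])

    r = np.corrcoef(x, y)[0, 1]   # Pearson correlation between x and y
    print(round(r ** 2, 3))       # about 0.806 -- the PRE from the reg model, within rounding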

  9. Computing Error • Sum of the squared residuals = SSError • Predictions from ŷ = .688 + .92(x):

      X      Y      ŷ = .688 + .92(x)
      6      6      .688 + .92(6) = 6.2
      1      2      .688 + .92(1) = 1.6
      5      6      .688 + .92(5) = 5.3
      3      4      .688 + .92(3) = 3.45
      3      2      .688 + .92(3) = 3.45
      3.6    4.0    (means)

  10. Computing SSError • Sum of the squared residuals = SSError

      X      Y      ŷ        y − ŷ      (y − ŷ)²
      6      6      6.2      −0.20      0.04
      1      2      1.6       0.40      0.16
      5      6      5.3       0.70      0.49
      3      4      3.45      0.55      0.30
      3      2      3.45     −1.45      2.10
      3.6    4.0    (means)  Σ = 0.00   Σ = 3.09 (SSError)

  11. Computing SSTotal = Σ(y − My)²

      X      Y      My     y − My     (y − My)²
      6      6      4       2          4
      1      2      4      −2          4
      5      6      4       2          4
      3      4      4       0          0
      3      2      4      −2          4
      3.6    4.0    (means) Σ = 0      Σ = 16 (SSTotal)

  12. PRE • PRE = (16 − 3.09) / 16 = .807 • We have an 80.7% proportionate reduction in error from using our regression model as opposed to the mean baseline model • So we’re doing much better using our regression as opposed to just predicting the mean for everyone…
