
Presentation Transcript


  1. Chapter 12 Prediction/Regression Part 2: Nov. 19, 2013

  2. Simple Regression Review • Regression allows us to predict y using x (predictor) • We create a regression equation or line based on one sample, then use it to predict y scores for a 2nd sample. • Ŷ = .688 + .92(x) • Interpreting ‘a’: y is predicted to equal .688 when x = 0 • Interpreting ‘b’: if we increase x by 1, y should increase by .92
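
As a quick sketch (the function name predict_y is mine, not from the slides), the regression equation and the meaning of a and b can be written in a few lines of Python:

    # Regression equation from the slide: Y-hat = .688 + .92(x)
    def predict_y(x, a=0.688, b=0.92):
        """Predicted y (criterion) for a given x (predictor)."""
        return a + b * x

    # Interpreting 'a': the prediction when x = 0
    print(round(predict_y(0), 3))   # 0.688
    # Interpreting 'b': raising x by 1 raises the prediction by 0.92
    print(round(predict_y(1), 3))   # 1.608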

  3. Drawing the Regression Line • Draw and label the axes for a scatter diagram • Figure the predicted value on the criterion for a low value of the predictor variable • You can choose any value to plug in… • Maybe x = 1, so Ŷ = .688 + .92(1) = 1.61 • Repeat step 2 with a high value of the predictor • Maybe x = 6, so Ŷ = .688 + .92(6) = 6.21 • Draw a line passing through the two marks (1, 1.61) and (6, 6.21) • Hint: you can also use (Mx, My) as one of your 2 points to save time; the reg line always passes through the means of x and y.
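
A minimal plotting sketch of these steps, assuming matplotlib is installed (the variable names are mine); the two marks and the (Mx, My) hint come from the slide:

    import matplotlib.pyplot as plt

    a, b = 0.688, 0.92                               # intercept and slope from the slides

    # Steps 2-3: predicted y for a low and a high value of x
    x_low, x_high = 1, 6
    y_low, y_high = a + b * x_low, a + b * x_high    # about 1.61 and 6.21

    # Step 4: draw the line through the two marks (1, 1.61) and (6, 6.21)
    plt.plot([x_low, x_high], [y_low, y_high], label="regression line")

    # Hint: the line also passes through the means (Mx, My) = (3.6, 4.0)
    plt.scatter([3.6], [4.0], marker="x", label="(Mx, My)")

    plt.xlabel("x (predictor)")
    plt.ylabel("y (criterion)")
    plt.legend()
    plt.show()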

  4. Regression Error • Now that you have a regression line or equation built from 1 sample, you can find predicted y scores using a new sample of x scores (Sample 2)… • Then, assume that you later collect data on Sample 2’s actual y scores • Compare the accuracy of predicted ŷ to the actual y scores for Sample 2 • Sometimes you’ll overestimate, sometimes underestimate…this is ERROR. • Can we get a measure of error? How much is OK?
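
As an illustration only (the Sample 2 numbers below are made up, not from the slides), comparing predicted ŷ to actual y scores is a simple subtraction:

    a, b = 0.688, 0.92

    # Hypothetical Sample 2: x scores known now, actual y scores collected later
    sample2_x = [2, 4, 5]          # assumed example values
    sample2_y = [3.0, 4.0, 6.0]    # assumed example values

    for x, y in zip(sample2_x, sample2_y):
        y_hat = a + b * x
        error = y - y_hat          # positive = underestimate, negative = overestimate
        print(f"x={x}: predicted {y_hat:.2f}, actual {y}, error {error:+.2f}")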

  5. Error in Regression • Error = actual score minus the predicted score, (y − ŷ) • Proportionate Reduction in Error (PRE) • Squared error using the prediction (reg) model = SSError = Σ(y − ŷ)² • Compare this to the amount of error without this prediction (reg) model. • If there is no other model, the best guess would be the mean. • Total squared error when predicting from the mean = SSTotal = Σ(y − My)²
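
In code, the two sums of squares named on this slide might look like the following sketch (plain Python; the helper names ss_error and ss_total are mine):

    def ss_error(y, y_hat):
        """Squared error using the regression model: sum of (y - y-hat)^2."""
        return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))

    def ss_total(y):
        """Squared error when predicting the mean for everyone: sum of (y - My)^2."""
        my = sum(y) / len(y)
        return sum((yi - my) ** 2 for yi in y)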

  6. Error and Proportionate Reduction in Error • Formula for proportionate reduction in error: PRE = (SSTotal − SSError) / SSTotal • It compares the reg model to the mean baseline (predicting that everyone’s y score will be at the mean) • We want the reg model to be much better than the mean (baseline) → that would indicate fewer prediction errors • So you want PRE to be large…
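
A one-function sketch of this ratio (the function name is mine; the 16 and 3.09 in the usage line are the sums of squares computed on the later slides):

    def proportionate_reduction_in_error(ss_total, ss_error):
        """PRE = (SS_total - SS_error) / SS_total: 0 = no better than the mean, 1 = perfect."""
        return (ss_total - ss_error) / ss_total

    print(proportionate_reduction_in_error(16, 3.09))   # about 0.807, as on the last slide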

  7. Reg model was ŷ = .688 + .92(x) • Use the mean model to find error (y − My)² for each person & sum up that column → SSTotal • Find the prediction using the reg model: plug the x values into the reg model to get ŷ • Find (y − ŷ)² for each person, sum up that column → SSError • Find PRE
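
Putting these steps together with the data used on the following slides (x = 6, 1, 5, 3, 3; y = 6, 2, 6, 4, 2), a worked sketch reproduces the numbers the tables below arrive at (small differences come from the slides rounding each ŷ before squaring):

    x = [6, 1, 5, 3, 3]
    y = [6, 2, 6, 4, 2]
    a, b = 0.688, 0.92

    # Predictions from the reg model for each person
    y_hat = [a + b * xi for xi in x]

    # Mean-model error: (y - My)^2 summed -> SSTotal
    my = sum(y) / len(y)                                           # 4.0
    ss_total = sum((yi - my) ** 2 for yi in y)                     # 16.0

    # Reg-model error: (y - y-hat)^2 summed -> SSError
    ss_error = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))   # about 3.1 (slides: 3.09)

    # Proportionate reduction in error
    pre = (ss_total - ss_error) / ss_total                         # about 0.806 (slides: .807)
    print(round(ss_total, 2), round(ss_error, 2), round(pre, 3))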

  8. If our reg model is no better than the mean, SSError = SSTotal, so PRE = (SSTotal − SSError) / SSTotal = 0 / SSTotal = 0. • Using this regression model, we reduce error over the mean model by 0%… not good prediction. • If the reg model has 0 error (perfect prediction), PRE = (SSTotal − 0) / SSTotal = 1, or a 100% reduction in error. • Proportionate reduction in error = r² • aka the “proportion of variance in y accounted for by x”; it ranges from 0 to 100%.
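
The r² claim can be checked numerically; a short sketch assuming NumPy is available, using the same five scores as the next slides:

    import numpy as np

    x = np.array([6, 1, 5, 3, 3])
    y = np.array([6, 2, 6, 4, 2])

    r = np.corrcoef(x, y)[0, 1]   # Pearson correlation between x and y
    print(round(r ** 2, 3))       # about 0.806 -- the PRE from the reg model, within rounding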

  9. Computing Error • Sum of the squared residuals = SSError • Predictions from ŷ = .688 + .92(x):

      X      Y      ŷ = .688 + .92(x)
      6      6      .688 + .92(6) = 6.2
      1      2      .688 + .92(1) = 1.6
      5      6      .688 + .92(5) = 5.3
      3      4      .688 + .92(3) = 3.45
      3      2      .688 + .92(3) = 3.45
      3.6    4.0    (means)

  10. Computing SSError • Sum of the squared residuals = SSError

      X      Y      ŷ        y − ŷ      (y − ŷ)²
      6      6      6.2      −0.20      0.04
      1      2      1.6       0.40      0.16
      5      6      5.3       0.70      0.49
      3      4      3.45      0.55      0.30
      3      2      3.45     −1.45      2.10
      3.6    4.0    (means)  Σ = 0.00   Σ = 3.09 (SSError)

  11. Computing SSTotal = Σ(y − My)²

      X      Y      My     y − My     (y − My)²
      6      6      4       2          4
      1      2      4      −2          4
      5      6      4       2          4
      3      4      4       0          0
      3      2      4      −2          4
      3.6    4.0    (means) Σ = 0      Σ = 16 (SSTotal)

  12. PRE • PRE = (16 − 3.09) / 16 = .807 • We have an 80.7% proportionate reduction in error from using our regression model as opposed to the mean baseline model • So we’re doing much better using our regression as opposed to just predicting the mean for everyone…
