Statistics for Business and Economics

Statistics for Business and Economics, Chapter 10: Simple Linear Regression. Learning objectives: describe the linear regression model, state the regression modeling steps, explain least squares, compute regression coefficients, explain correlation, and predict the response variable.

rinah-hicks

Presentation Transcript


  1. Statistics for Business and Economics Chapter 10 Simple Linear Regression

  2. Learning Objectives • Describe the Linear Regression Model • State the Regression Modeling Steps • Explain Least Squares • Compute Regression Coefficients • Explain Correlation • Predict Response Variable

  3. What is Regression? • 1. Method of modeling relationships between a response variable Y and one or more predictors X • 2. A way of “fitting a line through data”

  4. Example 1: Store Site Selection • Model sales Y at existing sites as a function of demographic variables: X1 = Population in store vicinity, X2 = Income in area, X3 = Age of houses in area, X4 = X5 = • From equation, predict sales at new sites

  5. Example 2: Marketing Research • Model consumer response to a product on the basis of product characteristics: Y = Taste score on soft drink, X1 = Sugar level, X2 = Carbonation level, X3 = X4 =

  6. Example 3: Operations/Quality • Model product quality in plant as a function of manufacturing process characteristics: Y = Quality score of sheet metal, X1 = Raw material purity score, X2 = Molten aluminum temp, X3 = Line speed, X4 =

  7. Example 4: Real Estate Pricing • Y = Selling price of houses, X1 = Square feet, X2 = Taxes, X3 = Lot acreage, X4 = X5 = X6 =

  8. One-Predictor Regression 25 Homes Sold in Essex County, New Jersey, 1996

  9. One-Predictor Regression • Regression Equation: Y = -39001 + 85.0 X

  10. Regression Models • Answers ‘What is the relationship between the variables?’ • Equation used • One numerical dependent (response) variable • What is to be predicted • One or more numerical or categorical independent (explanatory) variables • Used mainly for prediction and estimation

  11. Prediction Using Regression • Y = -39001 + 85.0(4000) = $300,999
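
The prediction on this slide is a direct substitution into the fitted line. A minimal Python sketch, using the coefficients from slide 9; the function name `predict_price` is an illustrative assumption, and the slides do not state the units of X:

```python
# Coefficients of the fitted line from slide 9: Y = -39001 + 85.0 X
INTERCEPT = -39001.0   # dollars
SLOPE = 85.0           # dollars per unit of X

def predict_price(x):
    """Return the predicted selling price for predictor value x."""
    return INTERCEPT + SLOPE * x

price = predict_price(4000)
print(f"${price:,.0f}")  # $300,999
```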

  12. Regression Modeling Steps • Hypothesize deterministic component • Estimate unknown model parameters • Specify probability distribution of random error term • Estimate standard deviation of error • Evaluate model • Use model for prediction and estimation

  13. Specifying the Model • Define variables • Conceptual (e.g., Advertising, price) • Empirical (e.g., List price, regular price) • Measurement (e.g., $, Units) • Hypothesize nature of relationship • Expected effects (i.e., Coefficients’ signs) • Functional form (linear or non-linear) • Interactions

  14. Model Specification Is Based on Theory • Theory of field (e.g., Sociology) • Mathematical theory • Previous research • ‘Common sense’

  15. Thinking Challenge: Which Is More Logical? [Figure: four candidate plots of Sales versus Advertising]

  16. Types of Regression Models • Simple (1 explanatory variable): linear or non-linear • Multiple (2+ explanatory variables): linear or non-linear

  17. Linear Regression Model • Relationship between variables is a linear function: y = β0 + β1x + ε • y = dependent (response) variable • x = independent (explanatory) variable • β0 = population y-intercept • β1 = population slope • ε = random error

  18. Line of Means • E(y) = β0 + β1x (line of means) • β1 = slope = (change in y) / (change in x) • β0 = y-intercept

  19. Linear Regression Model • 1. Relationship between variables is a linear function: Yi = β0 + β1Xi + εi • Yi = dependent (response) variable • Xi = independent (explanatory) variable • β0 = population Y-intercept • β1 = population slope • εi = random error

  20. Population Linear Regression Model [Figure: observed values scattered about the population line; εi = random error]

  21. Sample Linear Regression Model [Figure: observed values and the true, unknown line; axes X and Y]

  22. Sample Linear Regression Model [Figure: observed values about the fitted line; ε̂i = random error; axes X and Y]

  23. Regression Modeling Steps • Hypothesize deterministic component • Estimate unknown model parameters • Specify probability distribution of random error term • Estimate standard deviation of error • Evaluate model • Use model for prediction and estimation

  24. Scattergram • Plot of all (xi, yi) pairs • Suggests how well model will fit [Figure: scatterplot of y versus x, both axes running from 0 to 60]

  25. Thinking Challenge • How would you draw a line through the points? • How do you determine which line ‘fits best’? [Figure: scatterplot of y versus x, both axes running from 0 to 60]

  26. Least Squares • ‘Best fit’ means the differences between actual y values and predicted y values are a minimum • But positive differences offset negative ones, so the raw differences cannot simply be summed • Least Squares minimizes the Sum of the Squared Differences (SSE)
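
A small Python sketch of why the differences are squared. The three data points are hypothetical and chosen only for illustration; the helper names `errors` and `sse` are assumptions, not part of the presentation:

```python
def errors(points, b0, b1):
    """Differences between actual y and predicted y = b0 + b1*x."""
    return [y - (b0 + b1 * x) for x, y in points]

def sse(points, b0, b1):
    """Sum of squared differences for the line y = b0 + b1*x."""
    return sum(e ** 2 for e in errors(points, b0, b1))

# Hypothetical illustrative data lying exactly on y = x.
points = [(1, 1), (2, 2), (3, 3)]

# A flat line y = 2 misses two points, yet its raw errors cancel to zero.
print(sum(errors(points, 2, 0)))  # 0
print(sse(points, 2, 0))          # 2

# The line y = x fits perfectly; only SSE tells the two lines apart.
print(sum(errors(points, 0, 1)))  # 0
print(sse(points, 0, 1))          # 0
```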

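  27. Least Squares Graphically [Figure: fitted line through four points, with the residuals ε̂1, ε̂2, ε̂3, ε̂4 drawn as vertical distances from each point to the line; axes x and y]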

  28. Coefficient Equations Prediction Equation Slope y-intercept
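
The transcript keeps only the labels of this slide. For reference, the standard least squares formulas that go with those labels can be written as follows; the shorthand SS_xy and SS_xx is an assumed notation, and the sums are the column totals of a computation table for the data:

```latex
\begin{align*}
\text{Prediction equation:}\quad & \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \\
\text{Slope:}\quad & \hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}}
  = \frac{\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}}
         {\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}} \\
\text{y-intercept:}\quad & \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\end{align*}
```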

  29. Computation Table
      xi     yi     xi²     yi²     xiyi
      x1     y1     x1²     y1²     x1y1
      x2     y2     x2²     y2²     x2y2
      :      :      :       :       :
      xn     yn     xn²     yn²     xnyn
      Σxi    Σyi    Σxi²    Σyi²    Σxiyi

  30. Interpretation of Coefficients • Slope (β̂1) • Estimated y changes by β̂1 for each 1-unit increase in x • If β̂1 = 2, then Sales (y) is expected to increase by 2 for each 1-unit increase in Advertising (x) • Y-Intercept (β̂0) • Average value of y when x = 0 • If β̂0 = 4, then Average Sales (y) is expected to be 4 when Advertising (x) is 0

  31. Least Squares Example • You’re a marketing analyst for Hasbro Toys. You gather the following data (Ad $, Sales in units): (1, 1), (2, 1), (3, 2), (4, 2), (5, 4) • Find the least squares line relating sales and advertising.

  32. Scattergram: Sales vs. Advertising [Figure: scatterplot with Advertising on the horizontal axis (0 to 5) and Sales on the vertical axis (0 to 4)]

  33. Parameter Estimation Solution Table
      xi     yi     xi²     yi²     xiyi
      1      1      1       1       1
      2      1      4       1       2
      3      2      9       4       6
      4      2      16      4       8
      5      4      25      16      20
      15     10     55      26      37     (column totals)

  34. Parameter Estimation Solution
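
The arithmetic for this solution can be reproduced from the Hasbro data on slide 31. The sketch below applies the standard least squares formulas; the variable names are illustrative assumptions:

```python
# Advertising ($) and sales (units) from the Hasbro example on slide 31.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

# Column totals, as in the solution table.
sum_x = sum(x)                                  # 15
sum_y = sum(y)                                  # 10
sum_x2 = sum(xi ** 2 for xi in x)               # 55
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 37

# Standard least squares formulas.
ss_xy = sum_xy - sum_x * sum_y / n   # 7.0
ss_xx = sum_x2 - sum_x ** 2 / n      # 10.0
slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
# slope = 0.7000, intercept = -0.1000
```

These values agree with the estimates reported in the computer output on slide 35.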

  35. Parameter Estimation Computer Output
      Parameter Estimates
      Variable    DF    Parameter Estimate    Standard Error    T for H0: Param=0    Prob>|T|
      INTERCEP    1     -0.1000               0.6350            -0.157               0.8849      (β̂0)
      ADVERT      1      0.7000               0.1914             3.656               0.0354      (β̂1)

  36. JMP “Fit Model” Results

  37. Coefficient Interpretation Solution • Slope (β̂1) • Sales Volume (y) is expected to increase by .7 units for each $1 increase in Advertising (x) • Y-Intercept (β̂0) • Average value of Sales Volume (y) is -.10 units when Advertising (x) is 0 • Difficult to explain to marketing manager • Expect some sales without advertising

  38. Regression Modeling Steps • Hypothesize deterministic component • Estimate unknown model parameters • Specify probability distribution of random error term • Estimate standard deviation of error • Evaluate model • Use model for prediction and estimation

  39. Linear Regression Assumptions • Mean of probability distribution of error, ε, is 0 • Probability distribution of error has constant variance • Probability distribution of error, ε, is normal • Errors are independent

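  40. Error Probability Distribution [Figure: normal error distributions f(ε) centered on the regression line at X1 and X2; axes X and Y]

  41. Error Probability Distribution [Figure: normal error distributions f(ε) centered on the regression line at X1 and X2; axes X and Y]

  42. Error Probability Distribution [Figure: normal error distributions f(ε) centered on the regression line at X1 and X2; axes X and Y]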

  43. Estimating σ² • Recall that s² = Σ(Xi − X̄)² / (n − 1), where X̄ is our estimator of the mean. • Now substitute (1) Yi for Xi, (2) Ŷi for X̄, and (3) n − 2 df for n − 1: s² = Σ(Yi − Ŷi)² / (n − 2)
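
As a numerical check, the error variance estimate can be computed for the Hasbro example by dividing the sum of squared residuals by n − 2. This is a sketch using the data and estimates given elsewhere in the slides; the variable names are illustrative:

```python
import math

# Hasbro example data (slide 31) and the reported least squares estimates.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)   # sum of squared errors
s2 = sse / (len(x) - 2)                # divide by n - 2 degrees of freedom
s = math.sqrt(s2)

print(f"SSE = {sse:.4f}, s^2 = {s2:.4f}, s = {s:.4f}")
# SSE = 1.1000, s^2 = 0.3667, s = 0.6055
```

The result matches the value s = .6055 used in the slope test example on slide 49.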

  44. Why are df = n − 2? • 1. Two statistics are estimated to compute the regression line, β̂0 and β̂1 • 2. As a result, if I know any n − 2 residuals, the other two can be computed from Σε̂i = 0 and Σε̂iXi = 0.
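
The two constraints can be checked numerically on the Hasbro example (slide 31). The sketch below also recovers the last two residuals from the first three, which is the sense in which only n − 2 residuals are free; the variable names are illustrative assumptions:

```python
# Hasbro example data and the least squares estimates from the slides.
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# The two constraints imposed by estimating the intercept and the slope.
sum_e = sum(residuals)
sum_ex = sum(e * xi for e, xi in zip(residuals, x))
print(abs(sum_e) < 1e-9, abs(sum_ex) < 1e-9)  # True True

# Pretend only the first n - 2 = 3 residuals are known and solve the
# two constraints for the remaining two.
known = residuals[:3]
a = -sum(known)                                # equals e4 + e5
b = -sum(e * xi for e, xi in zip(known, x))    # equals x4*e4 + x5*e5
e5 = (b - x[3] * a) / (x[4] - x[3])
e4 = a - e5
print(round(e4, 4), round(e5, 4))  # -0.7 0.6
```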

  45. Regression Modeling Steps • Hypothesize deterministic component • Estimate unknown model parameters • Specify probability distribution of random error term • Estimate standard deviation of error • Evaluate model • Use model for prediction and estimation

  46. Test of Slope Coefficient • Shows if there is a linear relationship between x and y • Involves population slope β1 • Hypotheses • H0: β1 = 0 (No Linear Relationship) • Ha: β1 ≠ 0 (Linear Relationship) • Theoretical basis is sampling distribution of slope

  47. Sampling Distribution of Sample Slopes • All possible sample slopes: Sample 1: 2.5, Sample 2: 1.6, Sample 3: 1.8, Sample 4: 2.1, ... (very large number of sample slopes) [Figure: Sample 1 and Sample 2 lines plotted about the population line, axes x and y; sampling distribution of β̂1 centered at β1 with standard error S_β̂1]

  48. Slope Coefficient Test Statistic
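
Only the title of this slide is preserved. For reference, the standard t statistic for testing H0: β1 = 0 in simple linear regression is shown below; SS_xx is an assumed shorthand for the sum of squared deviations of x:

```latex
\begin{align*}
t &= \frac{\hat{\beta}_1}{S_{\hat{\beta}_1}}, \qquad df = n - 2, \\
S_{\hat{\beta}_1} &= \frac{s}{\sqrt{SS_{xx}}}, \qquad
SS_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}
\end{align*}
```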

  49. Test of Slope Coefficient Example • You’re a marketing analyst for Hasbro Toys. You find β̂0 = –.1, β̂1 = .7, and s = .6055. • Data (Ad $, Sales in units): (1, 1), (2, 1), (3, 2), (4, 2), (5, 4) • Is the relationship significant at the .05 level of significance?

  50. Test Statistic Solution
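
A sketch of the test statistic calculation for the Hasbro example, using the values stated on slide 49. The critical value 3.182 is the t-table entry for α/2 = .025 with 3 degrees of freedom, hard-coded here as an assumption rather than computed; the variable names are illustrative:

```python
import math

x = [1, 2, 3, 4, 5]   # advertising values from the example
n = len(x)
b1 = 0.7              # estimated slope from the example
s = 0.6055            # estimated standard deviation of error from the example

ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n   # 10.0
se_b1 = s / math.sqrt(ss_xx)   # standard error of the slope
t = b1 / se_b1                 # test statistic for H0: beta1 = 0, df = n - 2 = 3

T_CRIT = 3.182   # two-tailed critical value at the .05 level, 3 df (t-table)
print(f"t = {t:.3f}, reject H0: {abs(t) > T_CRIT}")
# t = 3.656, reject H0: True
```

The statistic agrees with the value 3.656 in the computer output on slide 35, whose reported p-value of 0.0354 is below .05, so the data support a linear relationship between advertising and sales at that level.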
