Statistics 101 Chapter 3 Section 3
Least-Squares Regression • A method for finding a line that summarizes the relationship between two variables
Regression Line • A straight line that describes how a response variable y changes as an explanatory variable x changes. • Mathematical model
Calculating error • Error = observed – predicted • = 5.1 – 4.9 • = 0.2
Least-squares regression line (LSRL) • The line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
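The definition above can be checked numerically. The following is a minimal sketch, not part of the original slides: the data are invented for illustration, `sum_sq` is a hypothetical helper name, and the slope is computed with the standard closed-form least-squares solution. It compares the sum of squares for the fitted line against lines whose intercept or slope has been nudged slightly.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def sum_sq(a, b):
    """Sum of squared vertical distances from the points to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Least-squares intercept and slope from the standard closed-form solution.
x_bar, y_bar = mean(xs), mean(ys)
b_ls = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs))
a_ls = y_bar - b_ls * x_bar

best = sum_sq(a_ls, b_ls)

# Sums of squares for nearby lines: intercept and/or slope nudged by 0.1.
nudged = [sum_sq(a_ls + da, b_ls + db)
          for da in (-0.1, 0.0, 0.1)
          for db in (-0.1, 0.0, 0.1)
          if (da, db) != (0.0, 0.0)]

print(f"LSRL sum of squares: {best:.4f}")
print(f"smallest sum among nudged lines: {min(nudged):.4f}")
```

Every nudged line should report a larger sum than the fitted line, which is what "as small as possible" means here.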
http://hadm.sph.sc.edu/courses/J716/demos/leastsquares/leastsquaresdemo.html
What we need • Equation of the line: ŷ = a + bx • Slope: b = r (s_y / s_x) • Intercept: a = ȳ − b x̄
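As a rough illustration of these formulas, the sketch below computes the slope and intercept from the correlation r, the two standard deviations, and the two means. The data are invented, and the names `predict`, `x_bar`, `y_bar`, `s_x`, and `s_y` are chosen here for readability rather than taken from the slides.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)

# Correlation r: sum of products of standardized values, divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

b = r * s_y / s_x      # slope
a = y_bar - b * x_bar  # intercept


def predict(x):
    """Predicted response y-hat = a + b*x."""
    return a + b * x


print(f"r = {r:.4f}, slope b = {b:.4f}, intercept a = {a:.4f}")
print(f"predicted y at x = 3.5: {predict(3.5):.4f}")
```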
Statistics 101 Chapter 3 Section 3 Part 2
Facts about least-squares regression • Fact 1: The distinction between explanatory and response variables is essential • Fact 2: There is a close connection between correlation and the slope • Along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in the predicted y
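Both facts can be seen on a small example. The sketch below uses invented data and a hypothetical helper `correlation`. It first checks that moving one standard deviation in x along the line changes the prediction by r standard deviations of y, then swaps the roles of the two variables to show that regressing x on y gives a different line.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]


def correlation(u, v):
    """Correlation: sum of products of standardized values, divided by n - 1."""
    u_bar, v_bar = mean(u), mean(v)
    s_u, s_v = stdev(u), stdev(v)
    return sum(((p - u_bar) / s_u) * ((q - v_bar) / s_v)
               for p, q in zip(u, v)) / (len(u) - 1)


r = correlation(xs, ys)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)

# Fact 2: regression of y on x.
b_yx = r * s_y / s_x
a_yx = y_bar - b_yx * x_bar
# Change in predicted y when x increases by one standard deviation.
step = (a_yx + b_yx * (x_bar + s_x)) - (a_yx + b_yx * x_bar)
print(f"change in predicted y: {step:.4f}   r * s_y: {r * s_y:.4f}")

# Fact 1: regression of x on y (roles swapped) is a different line.
b_xy = r * s_x / s_y
# Drawn on the same axes (x horizontal), the x-on-y line has slope 1 / b_xy.
print(f"slope of y-on-x line: {b_yx:.4f}")
print(f"slope of x-on-y line on the same axes: {1 / b_xy:.4f}")
```

The two slopes agree only when r is exactly 1 or −1, so which variable is treated as explanatory matters.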
More facts • Fact 3: The least-squares regression line always passes through the point (x̄, ȳ) • Fact 4: The square of the correlation, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
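Facts 3 and 4 can be verified on a small example. This is a sketch on invented data. "Fraction of variation explained" is computed here as 1 minus the ratio of the residual sum of squares to the total sum of squares about the mean, which is the usual reading of that phrase; the variable names are assumptions made for this example.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

b = r * s_y / s_x
a = y_bar - b * x_bar

# Fact 3: the prediction at the mean of x is the mean of y.
at_mean = a + b * x_bar
print(f"prediction at mean x: {at_mean:.4f}   mean y: {y_bar:.4f}")

# Fact 4: r squared equals the fraction of variation in y explained by the line.
total_ss = sum((y - y_bar) ** 2 for y in ys)
residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained = 1 - residual_ss / total_ss
print(f"r squared: {r ** 2:.4f}   fraction explained: {explained:.4f}")
```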
Residuals • A residual is the difference between an observed value of the response variable and the value predicted by the regression line. • Residual = observed y – predicted y = y – ŷ
Residuals • If the residual is positive, the data point lies above the line • If the residual is negative, the data point lies below the line • The mean of the least-squares residuals is always zero • If a calculated mean is not exactly zero, the difference is roundoff error • Technology Toolbox on page 174 shows how to do a residual plot.
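The residual properties listed above can be illustrated with a short sketch. The data are invented and the slope uses the standard closed-form least-squares solution. The mean of the residuals is compared with zero using a small tolerance, since floating-point arithmetic introduces exactly the kind of roundoff the slide mentions.

```python
from statistics import mean

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

# Least-squares slope and intercept (standard closed-form solution).
x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Residual = observed y - predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, y, res in zip(xs, ys, residuals):
    side = "above" if res > 0 else "below" if res < 0 else "on"
    print(f"x = {x:.0f}: observed {y:.1f}, residual {res:+.3f} ({side} the line)")

# Zero up to floating-point roundoff.
print(f"mean residual: {mean(residuals):.2e}")
```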
Residual plots • A residual plot is a scatterplot of the regression residuals against the explanatory variable. • It helps us assess the fit of a regression line. • If the regression line captures the overall relationship between x and y, the residuals should have no systematic pattern.
Curved pattern • A curved pattern shows that the relationship is not linear.
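To see what a curved pattern looks like in the residuals, the sketch below fits a straight line to data that follow y = x² exactly (an invented example, not from the slides) and prints the residuals in order of x in place of a plot.

```python
from statistics import mean

# Hypothetical, deliberately nonlinear data: y = x squared.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 2 for x in xs]

# Least-squares line (standard closed-form solution).
x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Text stand-in for a residual plot: residuals listed against x.
for x, res in zip(xs, residuals):
    print(f"x = {x:.0f}   residual = {res:+.2f}")
```

The residuals are positive at both ends and negative in the middle, a U shape rather than a patternless scatter, which signals that a straight line is the wrong model for these data.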
Increasing or decreasing spread • Indicates that prediction of y will be less accurate where the spread is larger: for larger x if the spread increases, for smaller x if it decreases.
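A sketch of the increasing-spread case, using invented data whose scatter around the trend grows in proportion to x. Comparing the average size of the residuals in the lower and upper halves of the x range is a crude stand-in for looking at a residual plot; the names `small_x_spread` and `large_x_spread` are chosen for this example.

```python
from statistics import mean

# Hypothetical data: trend 2x, with deviations of +/- 0.2x that grow with x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2 * x + (0.2 * x if i % 2 == 0 else -0.2 * x)
      for i, x in enumerate(xs)]

# Least-squares line (standard closed-form solution).
x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Average residual size for the four smallest and four largest x values.
half = len(xs) // 2
small_x_spread = mean(abs(res) for res in residuals[:half])
large_x_spread = mean(abs(res) for res in residuals[half:])

print(f"mean |residual| for small x: {small_x_spread:.3f}")
print(f"mean |residual| for large x: {large_x_spread:.3f}")
```

Here the residuals are larger on average for large x, so predictions from this line are less reliable at that end of the range.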
Influential Observations • An observation is an influential observation for a statistical calculation if removing it would markedly change the result of the calculation.
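The definition can be tried out by fitting the line with and without a suspect point. In this sketch the data are invented, with one point placed far out in the x direction, and `fit_line` is a hypothetical helper that returns the least-squares intercept and slope.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


# Hypothetical data: five points on y = 2x plus one point far out in x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 15.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 5.0]

a_all, b_all = fit_line(xs, ys)
a_without, b_without = fit_line(xs[:-1], ys[:-1])

print(f"slope with the point (15, 5):    {b_all:.3f}")
print(f"slope without the point (15, 5): {b_without:.3f}")
```

Removing the single point changes the slope markedly, so under the definition above it is an influential observation for the regression calculation.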