Chapter 9 Notes
In scatterplots we can have points that are outliers or influential points or both.
• An outlier is an observation that lies outside the overall pattern of the other observations in a scatterplot. An observation can be an outlier in the x direction, the y direction, or in both directions.
• An observation is influential if removing it would markedly change the position of the regression line. Points that are outliers in the x direction are often influential. Your book calls these leverage points.
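One way to check whether a point is influential is to fit the regression line with and without it and compare the slopes. The sketch below uses made-up data (five points on y = 2x plus one point far out in the x direction) and a small helper, `fit_line`, written only for this illustration.

```python
def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: the last point (x = 20) is an outlier in the x direction.
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 6, 8, 10, 10]

slope_all, intercept_all = fit_line(xs, ys)
slope_without, intercept_without = fit_line(xs[:-1], ys[:-1])

print("slope with the point:   ", round(slope_all, 3))
print("slope without the point:", round(slope_without, 3))
```

Removing the single point at x = 20 changes the slope markedly, which is what makes that point influential.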
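Extrapolation is the use of a regression line (or curve) to predict outside the range of values of the explanatory variable x that was used to fit it. • Such predictions cannot be trusted, because nothing in the data tells us whether the pattern continues beyond that range.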
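A small sketch of why extrapolation is risky, using made-up data: the points follow y = x² for x from 1 to 5, and we fit a straight line to them. The helper names (`fit_line`, `predict`) are just for this illustration.

```python
def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data: x runs from 1 to 5 only, and the true pattern is curved.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)

def predict(x):
    return intercept + slope * x

# Error at x = 3 (inside the observed range) and at x = 20 (far outside it).
error_inside = abs(predict(3) - 3 ** 2)
error_outside = abs(predict(20) - 20 ** 2)

print("error at x = 3: ", error_inside)
print("error at x = 20:", error_outside)
```

Inside the range the line is off by a little; far outside it the prediction misses badly, even though the line looked reasonable on the data we had.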
A lurking variable is a variable that has an important effect on the relationship among the variables in a study but is not included among the variables being studied. • Lurking variables can suggest a relationship when there isn't one or can hide a relationship that exists.
With observational data, as opposed to data from a well-designed experiment, there is no way to be sure that a lurking variable is not the cause of any apparent association.
Association vs. Causation
• A strong association between two variables is NOT enough to draw conclusions about cause and effect.
• A strong association between two variables x and y can reflect:
   A) Causation: a change in x causes a change in y.
   B) Common response: both x and y are responding to some other unobserved factor.
   C) Confounding: the effect on y of the explanatory variable x is hopelessly mixed up with the effects on y of other variables.
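Common response can be illustrated with a short simulation. In this sketch (all numbers are made up, and the variable names are hypothetical), an unobserved factor z drives both x and y. Neither x nor y affects the other, yet they end up strongly correlated.

```python
import math
import random

def correlation(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable

# z is the lurking variable; x and y each respond to z plus their own noise.
z = [random.gauss(0, 1) for _ in range(500)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

r_xy = correlation(x, y)
print("r between x and y:", round(r_xy, 2))
```

The strong r here comes entirely from z, so seeing it in a scatterplot of x against y would not justify saying that x causes y.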
Data with no apparent linear relationship can also be examined in two ways to see if a relationship still exists: • 1) Check to see if breaking the data down into subsets or groups makes a difference. • 2) Check whether the data follow a curve. If they do, a relationship still exists; it just isn't linear. We will explore that in the next chapter.
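Both checks can be sketched with made-up data. In the first data set, two groups have perfect but opposite linear trends that cancel when combined. In the second, y depends exactly on x through a curve, yet r is 0. The `correlation` helper is written only for this illustration.

```python
import math

def correlation(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# 1) Subsets: group A rises with x, group B falls with x.
group_a_x, group_a_y = [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]
group_b_x, group_b_y = [1, 2, 3, 4, 5], [5, 4, 3, 2, 1]

r_combined = correlation(group_a_x + group_b_x, group_a_y + group_b_y)
r_a = correlation(group_a_x, group_a_y)
r_b = correlation(group_b_x, group_b_y)

# 2) Curved data: y = (x - 3)^2 is an exact relationship, but not a linear one.
curve_x = [1, 2, 3, 4, 5]
curve_y = [(x - 3) ** 2 for x in curve_x]
r_curve = correlation(curve_x, curve_y)

print("combined r:", r_combined, "| group A r:", r_a, "| group B r:", r_b)
print("curved data r:", r_curve)
```

In both cases r near 0 only says there is no linear pattern in the data as plotted; it does not say the variables are unrelated.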