DO NOW. EXPLAIN YOUR ANSWER!
Stats: Modeling the World Chapter 7 Scatterplots, Association, and Correlation
Making a Picture of Bivariate Data: Relationships between variables are often what we truly want to know about our data. Visually, you can show the association between two quantitative variables using a scatterplot.
Looking at Scatterplots: After plotting two variables on a scatterplot, we describe the relationship by examining the form, direction, and strength of the association. We look for an overall pattern. Form: linear, curved, clusters, no pattern. Association/Direction: positive, negative, no direction. Figure: scatterplot of the size of a diamond ring in carats and the price in dollars.
Looking at Scatterplots: Strength: how closely the points fit the “form”. Outliers: points that deviate from the overall pattern.
POD #30 10/18/2011 Which of the scatterplots show: • Little or no association? • A negative association? • A linear association? • A moderately strong association? • A very strong association?
Roles for Variables: The explanatory (or predictor) variable goes on the x-axis. The response (or predicted) variable goes on the y-axis. Note: if the relationship between the variables is unclear, it does not matter which one we identify as the explanatory/response variable. Always THINK about the dataset and what you are measuring!
Creating a Scatterplot By hand: - Graph on a normal x-y plane - Make sure to label and scale the axes (including units if known!) - You do not have to show the origin! By TI: - Enter the data into two lists (Stat: Edit) - 2nd:Stat Plot, then choose the 1st type of graph (the scatterplot)
Quantifying Strength: When describing the association shown in a scatterplot, we would like a numerical value that indicates how strong the relationship is. This numerical value is called the correlation coefficient.
Correlation Coefficient (aka “r”) The correlation coefficient (r) gives us a numerical measurement of the strength of the linear relationship between the explanatory and response variables.
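The slides do not spell out how r is computed. A common textbook definition is r = (sum of z_x · z_y) / (n − 1), where z_x and z_y are the z-scores of each x and y value. The sketch below follows that definition using only the Python standard library. The function name `correlation` and the carat/price numbers are made up for illustration and do not come from the slides.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r as the sum of z-score products over n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    # Note: if every x (or every y) is identical, the standard deviation
    # is 0 and r is undefined; this sketch does not handle that case.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Standardize each value, multiply the paired z-scores, then sum.
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Hypothetical data: diamond size in carats and price in dollars.
carats = [0.3, 0.5, 0.7, 1.0, 1.2]
prices = [600, 1500, 2400, 5000, 6500]

r = correlation(carats, prices)
print(round(r, 3))  # close to +1: a strong, positive, roughly linear association
```

Because each value is converted to a z-score first, the result carries no units and always lands between −1 and +1.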
Strength and Direction • Direction: • Positive “r” indicates a positive association • Negative “r” indicates a negative association • Strength: • Values close to 0 indicate a weak linear relationship • As r gets closer to −1 or +1, the relationship is stronger • Values of exactly −1 or +1 indicate that the points fall on a perfect line
“r” ranges from −1 to +1. “r” quantifies the strength and direction of a linear relationship between two quantitative variables. Strength: how closely the points follow a straight line. Direction: positive when individuals with higher x values tend to have higher values of y, negative when individuals with higher x values tend to have lower values of y.
When to use Correlation • Quantitative Variables – r cannot be applied to categorical data! Make sure you understand your variables • Linear data – r can always be calculated, but correlation only measures the strength of linear relationships, so watch for curvature! • Outliers – Since r is calculated using z-scores (and hence the mean and standard deviation), it is not resistant to outliers!
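The two warnings about curvature and outliers can be checked numerically. The sketch below uses made-up data and a hypothetical helper `correlation`, written in the sums-of-squares form, which is algebraically equivalent to averaging z-score products.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r (sums-of-squares form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Curvature: y is completely determined by x (y = x^2), but the
# relationship is not linear, so r reports no linear association.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_curved = correlation(xs, ys)

# Outlier: six points on a perfect line, then one added point far off it.
x_line = [1, 2, 3, 4, 5, 6]
y_line = [2, 4, 6, 8, 10, 12]
r_clean = correlation(x_line, y_line)
r_outlier = correlation(x_line + [7], y_line + [0])

print(r_curved)   # 0 for this symmetric parabola
print(r_clean)    # 1 (up to rounding) for the perfect line
print(r_outlier)  # 0.25 (up to rounding) after adding one outlier
```

A single unusual point pulls r from 1 down to 0.25 here, which is why the scatterplot should be checked before r is trusted.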
Properties of Correlation • The sign of r gives the direction of the association • Correlation is always between −1 and +1 • Swapping x and y does NOT affect r • r has NO units!! It has been standardized • Changing the units on x or y does not affect r • r measures a LINEAR relationship only! • r is not resistant to outliers
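Several of these properties are easy to confirm with a quick calculation. The sketch below uses made-up height and weight data and a hypothetical helper `correlation`. It checks that swapping x and y, or converting units, leaves r unchanged.

```python
from math import sqrt, isclose

def correlation(xs, ys):
    """Correlation coefficient r (sums-of-squares form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: heights in inches and weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 155, 165, 190]

r = correlation(heights_in, weights_lb)

# Swapping the roles of x and y does not change r.
r_swapped = correlation(weights_lb, heights_in)

# Changing units (inches to centimeters, pounds to kilograms) does not change r.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

print(isclose(r, r_swapped))  # True
print(isclose(r, r_metric))   # True
print(-1 <= r <= 1)           # True
```

Unit changes cancel out because r is built from standardized values, so it does not matter whether the heights are recorded in inches or centimeters.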
Finding Correlation Using the TI: Stat: Calc: 4:LinReg(ax+b). If r does not show in the output, you will need to turn diagnostics on. Go to 2nd:0 (Catalog), scroll down to DiagnosticOn, and hit Enter twice.