Chapter 3 Describing Bivariate Data
General Objectives: Sometimes the data that are collected consist of observations for two variables on the same experimental unit. Special techniques that can be used in describing these variables will help you identify possible relationships between them.
©1998 Brooks/Cole Publishing/ITP
Specific Topics
1. Bivariate data
2. Side-by-side pie charts, comparative line charts
3. Side-by-side bar charts, stacked bar charts
4. Scatterplots for two quantitative variables
5. Covariance and the correlation coefficient
6. The best-fitting line
3.1 and 3.2 Bivariate Data and Graphs for Qualitative Variables
• When two variables are measured on a single experimental unit, the resulting data are called bivariate data.
• You may want to explore the relationship between the two variables.
• If you have one qualitative and one quantitative variable, you can use two side-by-side pie charts or a bar chart in which the bars for the two populations are placed side by side.
• Another option is to use a stacked bar chart, where the bars for each category are stacked on top of each other.
• See Figures 3.1, 3.2, and 3.3 for examples of these techniques.
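The difference between the side-by-side and stacked layouts can be sketched in plain text. This is only an illustration: the categories, amounts, and the helper names `side_by_side` and `stacked` are all hypothetical and do not come from Figures 3.1 to 3.3.

```python
# Hypothetical data: an amount (quantitative) per category (qualitative)
# for two populations. All names and numbers are made up for illustration.
group_a = {"Food": 30, "Rent": 45, "Other": 25}
group_b = {"Food": 20, "Rent": 50, "Other": 30}


def side_by_side(a, b, label_a="A", label_b="B"):
    """Return text rows with the two groups' bars adjacent for each category."""
    rows = []
    for category in a:
        # One '#' per 5 units; the two populations get separate bars.
        rows.append(f"{category:<6} {label_a} {'#' * (a[category] // 5)} {a[category]}")
        rows.append(f"{'':<6} {label_b} {'#' * (b[category] // 5)} {b[category]}")
    return rows


def stacked(a, b):
    """Return text rows with group B's segment ('+') stacked after group A's ('#')."""
    return [
        f"{c:<6} {'#' * (a[c] // 5)}{'+' * (b[c] // 5)} {a[c] + b[c]}"
        for c in a
    ]


print("\n".join(side_by_side(group_a, group_b)))
print()
print("\n".join(stacked(group_a, group_b)))
```

The side-by-side form makes it easy to compare the two populations within a category, while the stacked form emphasizes the category totals.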
Figure 3.1 Comparative bar charts
Figure 3.2 Comparative pie charts
Figure 3.3 Stacked bar chart
3.3 Scatterplots for Two Quantitative Variables
• Describing patterns shown in a scatterplot:
  - What type of pattern do you see? Upward or downward? Curved? None or random?
  - How strong is the pattern? Do all points follow it? Is it only weakly visible?
  - Are there any unusual observations? Outliers? Clusters? Is there an explanation for the groupings?
• See Figures 3.4 and 3.5 for examples of the use of a scatterplot.
Figure 3.4
Figure 3.5
3.4 Numerical Measures for Quantitative Bivariate Data
• Sometimes two variables exhibit a linear relationship, e.g., the relationship between height and weight in humans.
• A simple measure of the relationship between two variables x and y is the correlation coefficient, r:
  $r = \dfrac{s_{xy}}{s_x s_y}$
  where $s_x$ and $s_y$ are the standard deviations for the variables x and y.
• The new quantity $s_{xy}$ is called the covariance between x and y and is defined as:
  $s_{xy} = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
• A computing formula for the covariance:
  $s_{xy} = \dfrac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1}$
  where $\sum x_i y_i$ is the sum of the products $x_i y_i$ for each of the n pairs of measurements.
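As a minimal sketch of these definitions, the following computes the covariance two ways (from the deviations about the means and from the computing formula) and then the correlation coefficient as the covariance divided by the product of the sample standard deviations. The function names and the height/weight values are hypothetical, chosen only to echo the height and weight example.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance from the definition: sum of deviation products / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def covariance_computing(xs, ys):
    """Sample covariance from the computing formula using sums of x, y, and xy."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


def std_dev(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(xs, ys):
    """Correlation coefficient r = s_xy / (s_x * s_y)."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


# Hypothetical heights (inches) and weights (pounds), for illustration only.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 135, 150, 160, 175]

print("s_xy (definition):       ", covariance(heights, weights))
print("s_xy (computing formula):", covariance_computing(heights, weights))
print("r:                       ", correlation(heights, weights))
```

Both covariance functions should agree up to floating-point rounding; the computing formula simply avoids subtracting the means from every observation.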
• If the points in the plot of y against x tend to run from lower left to upper right, then $s_{xy}$ and r will be positive.
• If the points tend to run from upper left to lower right, then $s_{xy}$ and r will be negative.
• If the points are scattered high and low and left and right, then $s_{xy}$ and r will be close to zero.
• Often the value of y depends on the value of x; in that case y is called the dependent variable and x is called the independent variable.
• Sometimes it is possible to describe the relationship between x and y using a straight line given by the equation y = a + bx, where b is the slope and a is the intercept.
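The sign behavior described above can be checked on three tiny data sets. This is a sketch only: the data are hypothetical, and the helper `cov_and_r` is an illustrative name that returns the sample covariance together with r.

```python
from math import sqrt


def cov_and_r(xs, ys):
    """Return (s_xy, r) for paired data, using n - 1 divisors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # The (n - 1) divisors cancel in r, so r = Sxy / sqrt(Sxx * Syy).
    return sxy / (n - 1), sxy / sqrt(sxx * syy)


# Hypothetical points running from lower left to upper right.
upward = cov_and_r([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
# Hypothetical points running from upper left to lower right.
downward = cov_and_r([1, 2, 3, 4, 5], [5, 3, 4, 1, 2])
# Hypothetical points scattered symmetrically around the origin.
scattered = cov_and_r([-1, 0, 1, 0], [0, 1, 0, -1])

for name, (s_xy, r) in [("upward", upward), ("downward", downward), ("scattered", scattered)]:
    print(f"{name:<10} s_xy = {s_xy:+.3f}  r = {r:+.3f}")
```

In each case the covariance and r share the same sign, because r is the covariance divided by a product of positive standard deviations.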
• The best-fitting line relating y to x, called the regression or least-squares line, is found by minimizing the sum of the squared vertical differences between the data points and the line itself.
• The formulas for computing b and a are
  $b = r \left( \dfrac{s_y}{s_x} \right)$ and $a = \bar{y} - b\bar{x}$
• When r is positive, b is positive; when r is negative, b is negative; when r is zero, b is zero.
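A minimal sketch of fitting the least-squares line, assuming the standard formulas b = r(s_y / s_x) for the slope and a = ȳ − b·x̄ for the intercept. The function name `least_squares_line` and the sample data are hypothetical.

```python
from statistics import mean, stdev


def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x using b = r * s_y / s_x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    r = s_xy / (s_x * s_y)           # correlation coefficient
    b = r * s_y / s_x                # slope
    a = y_bar - b * x_bar            # intercept
    return a, b


# Hypothetical data lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
print(f"fitted line: y = {a:.3f} + {b:.3f} x")
```

Because s_x and s_y are always positive, the slope b takes its sign from r, which matches the last bullet above.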
Example
An electrical engineer is interested in calibrating a voltmeter to be used on a production line. For the test, 10 different voltages are fixed and the corresponding meter readings observed.

Measured Voltage    Actual Voltage
14                  10
23                  20
31                  30
42                  40
51                  50
58                  60
68                  70
76                  80
87                  90
95                  100
Compute the correlation coefficient between the observed and actual voltages.
Solution
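The slide's worked solution is not reproduced in this transcript. As a hedged sketch, the calculation can be carried out on the ten pairs from the example with the computing formula for the covariance and the sample standard deviations; the variable names are illustrative.

```python
from math import sqrt

# The ten (measured, actual) voltage pairs from the example.
measured = [14, 23, 31, 42, 51, 58, 68, 76, 87, 95]
actual = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

n = len(measured)
sum_x = sum(measured)
sum_y = sum(actual)
sum_xy = sum(x * y for x, y in zip(measured, actual))
sum_x2 = sum(x * x for x in measured)
sum_y2 = sum(y * y for y in actual)

# Computing formula for the covariance.
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

# Sample standard deviations from the analogous computing formulas.
s_x = sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
s_y = sqrt((sum_y2 - sum_y ** 2 / n) / (n - 1))

r = s_xy / (s_x * s_y)
print(f"s_xy = {s_xy:.2f}, s_x = {s_x:.2f}, s_y = {s_y:.2f}, r = {r:.4f}")
```

For these data r comes out at roughly 0.9996, indicating a very strong positive linear relationship between the measured and actual voltages.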