Statistics and Data Analysis Final Exam Review – Part 2
Linear Regression | Linear Regression II | Inference | Probability I | Probability II
200 | 200 | 200 | 200 | 200
400 | 400 | 400 | 400 | 400
600 | 600 | 600 | 600 | 600
800 | 800 | 800 | 800 | 800
1000 | 1000 | 1000 | 1000 | 1000
Linear Regression – 200 Here is a graph of the percent of adults in each state who were obese in 1991 and the percent who were obese in 1998. Which answer choice best describes the graph? A. Positive, linear, moderate B. Positive, linear, strong C. Positive, linear, weak D. Random, linear, moderate Answer: A
Linear Regression - 400 Given the scatterplot shown below, the correlation between X and Y is probably closest to: A. –1.3 B. 0.55 C. 0.99 D. 1.3 Answer: B
Linear Regression - 600 A newscaster read a study comparing the amount of time spent cleaning and the weight of the homeowner. The study reported a correlation of –0.87. She then reported on the evening news that the more time you spend cleaning your house, the skinnier you will get. Which of the following is true? A. She is incorrect, since the correlation is negative, which shows no relationship between the two variables. B. She is correct, since –0.87 shows a strong relationship between the variables. C. The study must be incorrect because the correlation should be positive. D. She is incorrect because she is confusing association with causation. Answer: D
Linear Regression - 800 Daily Double! The least-squares regression line for predicting the percent of a country's females who are illiterate from the percent of males who are illiterate is female% = 3.34 + 1.39(male%). In China, 10.1% of men are illiterate. Predict the percent of illiterate women in China. A. 4.7% B. 14% C. 17.4% D. 47.8% Answer: C
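The prediction comes from plugging the male rate into the fitted line. A minimal sketch of that arithmetic, where the function and constant names are illustrative and not from the slides:

```python
# Fitted line quoted on the slide: female% = 3.34 + 1.39 * male%
INTERCEPT = 3.34
SLOPE = 1.39


def predict_female_illiteracy(male_pct: float) -> float:
    """Predicted female illiteracy (percent) for a given male illiteracy (percent)."""
    return INTERCEPT + SLOPE * male_pct


china = predict_female_illiteracy(10.1)
print(f"{china:.1f}%")  # 17.4%
```

The same function also illustrates the slope: raising the male rate by one percentage point raises the prediction by 1.39 points.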
Linear Regression - 1000 • The least-squares regression line for predicting the percent of a country's females who are illiterate from the percent of males who are illiterate is female%= 3.34 + 1.39(male%). • The equation of the regression line tells us that (on average) when the male illiteracy rate goes up by 1%, the female rate goes up by • A. 4.73% B. 3.34% C. 1.95% D. 1.39% Answer: D
Linear Regression II - 200 • You are planning an experiment to study the effect of gasoline octane value on the gas mileage (miles per gallon) of sport utility vehicles. In this study • A. gas mileage is a response variable. • B. gas mileage is an explanatory variable. • C. gas mileage is a lurking variable. • D. gas mileage is a categorical variable. Answer: A
Linear Regression II - 400 Suppose that the least squares regression line for predicting y from x is y = 100 + 1.3x. Which of the following is a possible value for the correlation between y and x? A. 1.3 B. –1.3 C. 0 D. –0.5 E. 0.5 Answer: E
Linear Regression II - 600 Which of these is not true of the correlation r between the length (in inches) and weight (in pounds) of a sample of salamanders? A. r must take a value between –1 and 1. B. r is measured in inches. C. If longer salamanders tend to also be heavier, then r > 0. D. r would not change if we measured these salamanders in centimeters instead of inches. Answer: B
Linear Regression II - 800 A study found that SAT verbal scores were positively associated with first-year grade point averages for liberal arts majors. We can conclude from this that A. students who scored high on the SAT verbal test tended to get lower GPAs than those who scored lower on the SAT verbal test. B. students who scored high on the SAT verbal test tended to get higher GPAs than those who scored lower on the SAT verbal test. C. we can use the SAT verbal score to accurately predict GPAs for liberal arts majors. D. grade point averages are higher for older students. E. the correlation between the SAT verbal score and GPA is higher than 0.5. Answer: B
Linear Regression II - 1000 A scatterplot has a correlation of –0.0002. Which of the following must be true? A. There is no association between the explanatory and response variables. B. The scatterplot has no form. C. The scatterplot shows a negative direction. D. There is no linear association between the explanatory and response variables. Answer: D
Probability I – 200 A student spun a given spinner 80 times and recorded the results of each spin in the table. Based on his findings, what is the probability of landing on Red? A. 0.1500 B. 0.1875 C. 0.8000 D. 0.2000 Answer: B
Probability I - 400 A student spun a given spinner 80 times and recorded the results of each spin in the table. What is the probability of not landing on Yellow? A. 0.0625 B. 0.2000 C. 0.8000 D. 0.9375 Answer: D
Probability I - 600 Of American adults, 24% have a genetic marker for breast cancer. If a random sample of 3 adults is taken, what is the probability that all 3 have the genetic marker? A. 0.2400 B. 0.7200 C. 0.0138 D. 0.0800 Answer: C
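Because the three adults are sampled at random, their outcomes can be treated as independent, so the probabilities multiply. A short sketch of that calculation (variable names are illustrative):

```python
# Probability that one randomly chosen adult has the marker (from the slide).
p_marker = 0.24
n_adults = 3

# Independence assumed: P(all three) = p * p * p.
p_all = p_marker ** n_adults
print(round(p_all, 4))  # 0.0138
```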
Probability I - 800 A fair die is rolled 4 times and 6 appears each time. What is the probability that on the next roll the 6 will appear again? A. It would be close to zero, since it is very unlikely that 6 would appear again after coming up so many times. B. It would be close to one, since the die appears to be on a streak of 6s. C. The probability remains 1/6. D. The probability would be (1/6)^5 = 1/7776. Answer: C
Probability I - 1000 I II III Which of the following is a legitimate probability distribution? A. I only B. II only C. III only D. I and II Answer: D
Probability II - 200 Let the table above give the probabilities for having a certain number of children for a given town. What is the expected number of children per family? A. 1.0 B. 1.3 C. 2.0 D. 2.5 Answer: B
Probability II - 400 The table relates gender to smoking. What is the probability that a randomly selected person smokes or is male? A. 0.20 B. 0.40 C. 0.54 D. 0.60 Answer: C
Probability II - 600 The table relates gender to smoking. What is the probability that a randomly selected person smokes and is male? A. 0.06 B. 0.20 C. 0.40 D. 1.00 Answer: A
Probability II - 800 You read in a book about bridge that the probability that each of the four players is dealt exactly one ace is about 0.11. To simulate an outcome with probability 0.11 you could A. look at 2 digits in the random number table; the outcome occurs if the digits are 11. B. look at 2 digits in the random number table; the outcome occurs if the digits are any of 00, 01, …, 11. C. look at 2 digits in the random number table; the outcome occurs if the digits are any of 00, 01, …, 10. D. None of these would work. Answer: C
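One way to check each rule is to count how many of the 100 equally likely two-digit pairs (00 through 99) it accepts. The sketch below does that by enumeration; the helper name `rule_probability` is illustrative and not from the slides:

```python
# All 100 equally likely two-digit pairs from a random number table.
pairs = [f"{d:02d}" for d in range(100)]


def rule_probability(accept) -> float:
    """Fraction of two-digit pairs for which the rule says the outcome occurs."""
    return sum(1 for p in pairs if accept(p)) / len(pairs)


p_only_11 = rule_probability(lambda p: p == "11")       # digits are exactly 11
p_00_to_11 = rule_probability(lambda p: int(p) <= 11)   # 00, 01, ..., 11
p_00_to_10 = rule_probability(lambda p: int(p) <= 10)   # 00, 01, ..., 10

print(p_only_11, p_00_to_11, p_00_to_10)  # 0.01 0.12 0.11
```

Only the range 00 to 10 contains exactly 11 pairs, which is why it matches a probability of 0.11.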
Probability II - 1000 Daily Double! • In a small community 76% of the population is over 35 years old, 72% of the population consider themselves conservatives, and 52% are over 35 and conservative. What is the probability that a randomly selected person is over 35 or conservative? HINT: Make a two-way table. • A. 52% B. 96% C. 8% D. 148% Answer: B
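The hinted two-way table can be filled in from the three given percentages, and the "or" probability then follows from the general addition rule P(A or B) = P(A) + P(B) − P(A and B). A minimal sketch with illustrative variable names:

```python
# Given on the slide.
p_over_35 = 0.76
p_conservative = 0.72
p_both = 0.52

# Remaining cells of the two-way table.
p_over_35_only = p_over_35 - p_both            # over 35, not conservative
p_conservative_only = p_conservative - p_both  # conservative, 35 or under
p_either = p_over_35 + p_conservative - p_both # general addition rule
p_neither = 1 - p_either

print(f"{p_either:.0%}")  # 96%
```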
Significance Tests - 200 The teacher wants to perform a test of significance to see if her students underestimate her actual age of 50. She samples the 30 students and calculates an average of 46.5. The null and alternative hypotheses are: A. H0: μ = 46.5; Ha: μ ≠ 46.5 B. H0: μ = 46.5; Ha: μ < 46.5 C. H0: μ = 50; Ha: μ < 50 D. H0: μ = 50; Ha: μ ≠ 50 Answer: C
Significance Tests - 400 A CBS News/New York Times opinion poll asked 1,190 adults whether they would prefer balancing the Federal budget over cutting taxes; 702 of those asked said "Yes." Which of these is a correct 95% confidence interval for the proportion of all adults who prefer balancing the budget over cutting taxes? A. 0.59 ± 0.0004 B. 0.59 ± 0.014 C. 0.59 ± 0.018 D. 0.59 ± 0.028 Answer: D
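The margin of error can be checked with the usual large-sample interval for a proportion, p̂ ± z*·sqrt(p̂(1 − p̂)/n). The sketch below assumes z* = 1.96 for 95% confidence; the variable names are illustrative:

```python
import math

# Poll results from the slide.
yes, n = 702, 1190

p_hat = yes / n                          # sample proportion
z_star = 1.96                            # assumed 95% critical value (standard normal)
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
margin = z_star * se

print(f"{p_hat:.2f} ± {margin:.3f}")  # 0.59 ± 0.028
```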
Significance Tests - 600 A Census Bureau report on the income of Americans says that with 95% confidence the median income of all U.S. households in 1997 was $37,005 with a margin of error of $342. This means that A. 95% of all households had incomes in the range $37,005 ± $342. B. we can be 95% sure that the median income for all households in the country lies in the range $37,005 ± $342. C. 95% of the households in the sample interviewed by the Census Bureau had incomes in the range $37,005 ± $342. D. the Census Bureau got the result $37,005 ± $342 using a method that will cover the true median income 95% of the time when used repeatedly. Answer: B
Significance Tests - 800 • If a significance test gives P-value 0.005, • A. the margin of error is 0.005. • B. the null hypothesis is very likely to be true. • C. we do not have good evidence against the null hypothesis. • D. we do have good evidence against the null hypothesis. Answer: D
Significance Tests - 1000 A high school football coach claims the average weight of his players is 230 lbs. You suspect he may be wrong, so you take a sample of 18 players and get a mean weight of 228 lbs. with a standard deviation of 4 lbs. What would your conclusion be if you ran a hypothesis test with α = 0.05? A. Since our p-value is less than 0.05, we would reject the coach’s claim. B. Since our p-value is less than 0.05, we would not reject the coach’s claim. C. Since our p-value is greater than 0.05, we would reject the coach’s claim. D. Since our p-value is greater than 0.05, we would not reject the coach’s claim. Answer: A
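A sketch of the underlying one-sample t test, assuming a two-sided alternative (the slide only says the coach "may be wrong") and using the tabled critical value t* = 2.110 for 17 degrees of freedom instead of computing an exact p-value. Variable names are illustrative:

```python
import math

# Values from the slide.
claimed_mean = 230.0  # H0: mu = 230
sample_mean = 228.0
sample_sd = 4.0
n = 18

# One-sample t statistic with n - 1 = 17 degrees of freedom.
t_stat = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))

# Two-sided critical value at alpha = 0.05, df = 17, taken from a t table.
T_CRIT_DF17 = 2.110

# |t| beyond the critical value is equivalent to a two-sided p-value below 0.05.
reject = abs(t_stat) > T_CRIT_DF17
print(f"t = {t_stat:.3f}, reject H0: {reject}")  # t = -2.121, reject H0: True
```

The statistic falls only slightly past the critical value, so the p-value is just under 0.05 and the decision is a close call.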
Make your wager based on this category: Linear Regression
You must write your wager and your group's answer on a piece of paper. And now, the Final Jeopardy! answer…
Linear Regression Suppose you run a linear regression for a data set and get r = 0.92. You then graph the residuals vs. the x-variable. The plot is shown on the right. What would be an appropriate interpretation of the residual plot? A. Since the plot has a distinct curve to it, and since r = 0.92, the linear model is a good fit. B. Since the plot has a distinct curve to it, and since r = 0.92, the linear model is not a good fit. C. The shape of the residual plot doesn’t matter; since r = 0.92, the linear model is the best one. D. Since the plot has a distinct curve, the linear model is not the best model for the data, even though r = 0.92.