NEKSDC CCSSM HS Statistics and Probability. Elaine Watson, Ed.D. March 13, 2013. Arthur Benjamin TED Talk: Teach Statistics Before Calculus. Essential Questions. How do we organize data so we can describe it? What is an adequate description of data? How do we interpret and analyze data?
NEKSDC CCSSM HS Statistics and Probability Elaine Watson, Ed.D. March 13, 2013
Essential Questions • How do we organize data so we can describe it? • What is an adequate description of data? • How do we interpret and analyze data? • How is statistics different from mathematics? • Why is probability so closely linked to statistics?
Structure of Presentation • General Overview of Statistics and Probability • GAISE Report • Levels A, B, C • Components of Statistical Problem Solving • Link Between Statistics and Probability • Statistics Framework: • Population versus Sample • Descriptive Statistics and Inferential Statistics • Common Core Statistics and Probability • K – 5 Data (Categorical and Measurement) • 6 – 8 Statistics • HS Statistics • Time to Work Problems • Illustrative Mathematics • Other Text Resources
Activity: There are six tasks. • Randomly draw a task • Work through your task • Share your answer with the group • How are the tasks alike? • How are the tasks different? What CCSSM Math Practice Standards did you use? How does this activity relate to Probability and Statistics?
Permutations Tasks • What are all three-digit numbers that you can make using each of the digits 1, 2, and 3, and using each digit only once? • Angel, Barbara, and Clara run a race. Assuming there is no tie, what are all the possible outcomes of the race (first, second, third)? • You are watching Angel, Barbara, and Clara playing on a merry-go-round. As the merry-go-round spins, what are all the different ways that you see them from left to right? • You want to form two partners from among Angel, Barbara, and Clara, by the following procedure: You choose one of them, and then let that child choose her partner. What are all the possible outcomes of this process? • In a 3 x 3 grid square, color three of the (unit) squares blue, in such a way that there is at most one blue square in each row and in each column. What are all the ways of doing this? • Find all of the symmetries of an equilateral triangle. https://www.math.purdue.edu/~goldberg/Math453/eqi-slides-web.pdf
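The first two tasks can be checked by listing arrangements mechanically. The sketch below is only an illustration, not part of the original activity; it assumes Python's standard `itertools.permutations`, and the variable names are made up for this example.

```python
from itertools import permutations

# Task 1: every three-digit number using each of the digits 1, 2, 3 exactly once.
digits = [1, 2, 3]
numbers = [100 * a + 10 * b + c for a, b, c in permutations(digits)]

# Task 2: every finishing order (first, second, third) for the three runners.
runners = ["Angel", "Barbara", "Clara"]
race_outcomes = list(permutations(runners))

print(numbers)
for outcome in race_outcomes:
    print(outcome)
```

Both lists have six entries, which gives the group a concrete starting point for the "How are the tasks alike?" discussion.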
The Difference between Statistics and Mathematics “Statistics is a methodological discipline. It exists not for itself, but rather to offer to other fields of study a coherent set of ideas and tools for dealing with data. The need for such a discipline arises from the omnipresence of variability.” Moore and Cobb, 1997 Statistical problem solving and decision making depend on understanding, explaining, and quantifying the variability in the data. It is this focus on variability in data that sets apart statistics from mathematics. GAISE Report
The Nature of Variability • Measurement Variability • Measuring devices can produce unreliable results • Changes in the system being measured (blood pressure from one moment to the next) • Natural Variability • Individuals differ in size, aptitudes, abilities, opinions, emotional responses • Induced Variability • Planting seeds in two different locations with different conditions • A carefully designed experiment can help determine the effects of different factors • Sampling Variability • We use a sample of a population to make an estimate of a whole population. However, it is rare for two samples to have identical results. Proper sampling techniques and adequate sampling size help to lower sampling variation. GAISE Report
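Sampling variability can be made visible with a short simulation. This is a minimal sketch under stated assumptions: the population of heights is invented for illustration (it is not data from the GAISE Report), and the function name `sample_means` is hypothetical.

```python
import random
import statistics

rng = random.Random(2013)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 heights in cm (illustrative values only).
population = [rng.gauss(170, 10) for _ in range(10_000)]

def sample_means(sample_size, repetitions=1000):
    """Draw repeated random samples and return the mean of each one."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repetitions)
    ]

means_small = sample_means(10)
means_large = sample_means(100)

# How much do the sample means themselves vary from sample to sample?
spread_small = statistics.pstdev(means_small)
spread_large = statistics.pstdev(means_large)
print(f"n=10:  sample means vary with SD {spread_small:.2f}")
print(f"n=100: sample means vary with SD {spread_large:.2f}")
```

No two samples give exactly the same mean, but the means from the larger samples cluster more tightly, which is the point the slide makes about adequate sample size.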
The Role of Context “The focus on variability naturally gives statistics a particular content that sets it apart from mathematics, itself, and from other mathematical sciences but there is more than just content that distinguishes statistical thinking from mathematics. Statistics requires a different kind of thinking, because data are not just numbers, they are numbers with a context. In mathematics, context obscures structure. In data analysis, context provides meaning.” Moore and Cobb, 1997 GAISE Report
What is the role of context? [Figure: two population graphs, Mexico 2000 and United States 2000, each split into Male and Female, with axis Population (in millions)] Statistics in Action (Watkins, Scheaffer, Cobb) Key Curriculum Press, 2004
Components of Statistical Problem Solving • Formulate Questions • Collect Data • Analyze Data • Interpret Results Compare to CCSSM Modeling Cycle
Statistics is Modeling ★ In the HS CCSSM Standards, Modeling is considered not only a Practice Standard, but also one of the six Conceptual Categories (Number & Quantity, Algebra, Functions, Modeling, Geometry, Statistics & Probability). The Modeling standards are interspersed throughout the Number & Quantity, Algebra, Functions, and Geometry Conceptual Categories and are indicated with a ★. However, ALL standards in the Statistics Conceptual Category are considered modeling standards. As a result, the Category is marked with a ★, but the individual standards are not.
Statistical Education: A Developmental Process GAISE Report: 3 Levels of Statistical Understanding • Level A Students develop “data sense” – an understanding that data are more than just numbers. Statistics changes numbers into information. • Level B Students see statistical reasoning as a process for solving problems through data and quantitative reasoning • Level C Students extend concepts learned in Levels A and B to cover a wider scope of investigatory issues, and develop a deeper understanding of inferential reasoning and its connection to probability. Students also should have an increased ability to explain statistical reasoning to others.
Statistical Education: A Developmental Process Although these three levels may parallel grade levels, they are based on development in statistical literacy, not age. Thus, a student who has no prior experience with statistics will need to begin with Level A concepts and activities before moving to Level B and Level C. For this reason, we will spend time looking at the 6 – 8 Standards, since many students currently in HS may be at Level A or B. The learning is more teacher driven at Level A, but becomes more student-driven at Levels B and C.
GAISE Framework Levels A, B, C in the Statistical Modeling Process As we go through the next few slides on the Statistical Modeling Process, refer to the one-page handout from the GAISE Document that has a Table with Column Headings • Level A, Level B, Level C and Row Headings • I. Formulate Questions, II. Collect Data, III. Analyze Data, and IV. Interpret Results
Components of Statistical Problem Solving I. Formulate Questions • Clarify the problem at hand • Formulate one (or more) questions that can be answered with data • The question should anticipate variability Which of these questions anticipates variability? • How tall am I? • How tall are adult men in the USA? Discuss the horizontal progression in the Formulate Question row across Levels A, B, C
Components of Statistical Problem Solving II. Collect Data • Design a plan to collect appropriate data • Employ the plan to collect the data • Data collection designs must acknowledge variability in data, and frequently are intended to reduce variability (random sampling) • The sample size influences the effect of sampling variation (error) GAISE Report Discuss the horizontal progression in the Collect Data row across Levels A, B, C
Components of Statistical Problem Solving III. Analyze Data • Select appropriate graphical and numerical methods • Use these methods to analyze the data • The main purpose of statistical analysis is to give an accounting of the variability in the data • 42% of those polled support the candidate with a margin of error +/- 3% at the 95% confidence level • Test scores are described as “normally distributed with mean 450 and standard deviation 100” GAISE Report Discuss the horizontal progression in the Analyze Data row across Levels A, B, C
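The polling example can be unpacked with a little arithmetic. The sketch below assumes a simple random sample and the usual normal-approximation formula, margin = z * sqrt(p(1 - p)/n) with z ≈ 1.96 for 95% confidence; the slide does not say how the poll was actually conducted, so treat the resulting sample size as illustrative only.

```python
import math

p_hat = 0.42   # reported sample proportion
margin = 0.03  # reported margin of error
z = 1.96       # approximate critical value for 95% confidence (normal model)

# margin = z * sqrt(p_hat * (1 - p_hat) / n), solved for n
n = z**2 * p_hat * (1 - p_hat) / margin**2
sample_size = math.ceil(n)

# The interval of plausible values for the true level of support.
low, high = p_hat - margin, p_hat + margin
print(sample_size)
print(f"interval: {low:.2f} to {high:.2f}")
```

Under these assumptions, a margin of about 3 percentage points corresponds to roughly a thousand respondents, and the statement "42% +/- 3%" describes the interval from 39% to 45%.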
Components of Statistical Problem Solving IV. Interpret Results • Interpret the analysis • Relate the interpretation to the original question • Statistical interpretations are made in the presence of variability and must allow for it. • Looking beyond the data to make generalizations must allow for variability in the data. GAISE Report Discuss the horizontal progression in the Interpret Results row across Levels A, B, C
At the point of question formulation, the statistician anticipates the data collection, the nature of the analysis, and the possible interpretations – all of which involve possible sources of variability. [Diagram: cycle of 1. Formulate Question, 2. Collect Data, 3. Analyze Data, 4. Interpret Results, with Variability labeled three times around the cycle]
In the end, the mature practitioner reflects upon all aspects of data collection and analysis as well as the question itself when interpreting results. [Diagram: cycle of 1. Formulate Question, 2. Collect Data, 3. Analyze Data, 4. Interpret Results]
Likewise, he or she links data collection and analysis to each other and the other two components. [Diagram: cycle of 1. Formulate Question, 2. Collect Data, 3. Analyze Data, 4. Interpret Results]
Statistical Education: A Developmental Process The mature practitioner understands the role of variability in the statistical problem-solving process. Beginning students cannot be expected to make all of these linkages. They require years of experience and training. The GAISE Report, therefore, provides a framework for statistical education over three levels for K – 12. A mature practicing statistician would go beyond these three levels.
Resources for Deeper Study of Statistics and Probability Standards See handout for links to the following resources: • The GAISE Report • Progressions Documents • 6 – 8 Statistics and Probability • High School Statistics and Probability
What is Meant by Statistics and Probability? • Statistics is the study of what has already happened to find the structure. • Probability is using the structure to predict the future. Julie Conrad
Probability: An essential tool in mathematical modeling and in statistics • The use of probability as a mathematical model and the use of probability as a tool in statistics employ not only different approaches, but also different kinds of reasoning. • Two problems and the nature of the solutions will illustrate the difference… GAISE Report
Probability: An essential tool in mathematical modeling and in statistics • Problem 1: Assume a coin is “fair.” Question: If we toss the coin five times, how many heads will we get? • Problem 2: You pick up a coin. Question: Is this a fair coin? Problem 1 is a mathematical probability problem. Problem 2 is a statistics problem that can use the mathematical probability model determined in Problem 1 as a tool to seek a solution. GAISE Report
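The probability model behind Problem 1 can be written out explicitly. This sketch is an illustration rather than part of the GAISE Report; it assumes independent tosses and uses the binomial formula P(k heads) = C(n, k) p^k (1 - p)^(n - k), with variable names chosen for this example.

```python
from math import comb

n_tosses = 5
p_heads = 0.5  # the "fair coin" assumption from Problem 1

# P(k heads) = C(n, k) * p^k * (1 - p)^(n - k)
distribution = {
    k: comb(n_tosses, k) * p_heads**k * (1 - p_heads) ** (n_tosses - k)
    for k in range(n_tosses + 1)
}

for k, prob in distribution.items():
    print(f"{k} heads: {prob:.5f}")
```

The model does not give a single answer to "how many heads will we get?"; it gives a probability for each count from 0 to 5. For Problem 2, the same table becomes a tool: five heads in five tosses has probability 1/32 under the fair-coin model, so observing that result is some evidence against fairness.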
Link between Probability and Statistics Probability shows you the likelihood, or chances, for each of the various future outcomes, based on a set of assumptions about how the world works. • Allows you to handle randomness (uncertainty) in a consistent, rational manner. • Forms the foundation for statistical inference (drawing conclusions from data), sampling, linear regression, forecasting, risk management.
Link between Probability and Statistics With Statistics, you go from observed data to generalizing how the world works. Observed data: The 7 hottest years on record occurred in the most recent decade. Generalization: There is global warming (perhaps without justification). http://pages.stern.nyu.edu/~churvich/Undergrad/Handouts2/07-Prob.pdf
Link between Probability and Statistics With Probability, you start with an assumption about how the world works, and then figure out what kind of data you are likely to see. Probability provides the justification for statistics. Probability is the only scientific basis for decision-making in the face of uncertainty. Assume no global warming. How likely would we be to get such high temperatures as we have been having? http://pages.stern.nyu.edu/~churvich/Undergrad/Handouts2/07-Prob.pdf
Looking at the World through a Statistical Lens From a statistics lens, if you are given a jar of different colored jelly beans (the world), you won’t be able to see what’s in the jar. You will use a sampling method to collect information to infer the percentage of each color of jelly beans. In the world of jelly beans in this jar Julie Conrad
Looking at the World from a Probability Lens From a probability lens, you know the percentage of each color of jelly beans. (how this world works) You predict what’s going to happen when you choose one at random. In the world of jelly beans in this jar, I know that 20% are red Julie Conrad
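The two lenses can be contrasted in a few lines of code. This is a minimal sketch under stated assumptions: the jar size and the non-red colors are invented for illustration, and only the 20% red figure comes from the slide.

```python
import random

rng = random.Random(34)  # fixed seed so the run is repeatable

# Hypothetical jar: 1,000 jelly beans, 20% red (the other colors are made up).
jar = ["red"] * 200 + ["green"] * 500 + ["yellow"] * 300

# Probability lens: the makeup of the jar is known, so predict the data.
p_red = jar.count("red") / len(jar)
expected_red_in_ten = 10 * p_red
print(f"Expected red beans in 10 draws: {expected_red_in_ten}")

# Statistics lens: the jar is hidden, so estimate its makeup from a sample.
sample = rng.sample(jar, 100)
estimated_p_red = sample.count("red") / len(sample)
print(f"Estimated share of red from a sample of 100: {estimated_p_red:.2f}")
```

The probability calculation runs from the known jar to the expected data; the statistical estimate runs from the observed sample back toward the unknown jar, and it will usually be close to, but not exactly, 20%.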
A Framework for Studying Statistics The Practice of Statistics • Descriptive Statistics • Inferential Statistics Statistics for K-8 Educators by Robert Rosenfeld
A Framework for Studying Statistics Descriptive Statistics • Graphs • Measures of Center • Measures of Variability • Measures of Relationship • Placement of Individuals • Normal Curve Statistics for K-8 Educators by Robert Rosenfeld
A Framework for Studying Statistics Inferential Statistics • Connections between statistics and probability • Correlation • Confidence Intervals • Statistical Significance • Inference and Margin of Error Statistics for K-8 Educators by Robert Rosenfeld
Population versus Sample • A population is the total set of individuals, groups, objects, or events that the researcher is studying. For example, if we were studying employment patterns of recent U.S. college graduates, our population would likely be defined as every college student who graduated within the past year from any college across the United States. • A sample is a relatively small subset of people, objects, groups, or events that is selected from the population. Instead of surveying every recent college graduate in the United States, which would cost a great deal of time and money, we could select a sample of recent graduates and use its findings to generalize to the larger population. • http://sociology.about.com/od/Statistics/a/Descriptive-inferential-statistics.htm
Descriptive Statistics • Descriptive statistics includes statistical procedures that we use to describe the group we are studying. The data could be collected from either a sample or a population, but the results help us organize and describe data. Descriptive statistics can only be used to describe the group that is being studied. That is, the results cannot be generalized to any larger group. • Descriptive statistics are useful and serviceable if you do not need to extend your results to any larger group. However, much of the social sciences tends to include studies that give us “universal” truths about segments of the population, such as all parents, all women, all victims, etc. http://sociology.about.com/od/Statistics/a/Descriptive-inferential-statistics.htm
Inferential Statistics • Inferential statistics is concerned with making predictions or inferences about a population from observations and analyses of a sample. That is, we can take the results of an analysis using a sample and can generalize it to the larger population that the sample represents. In order to do this, however, it is imperative that the sample is representative of the group to which it is being generalized.
Compare and Contrast Different Representations of the Same Data • Activity: • Look at the four graphical representations of the same data set on page 42 • Compare and contrast the graphs • What does each communicate? • Which do you think is the best representation of the data? Justify why you chose this representation.
K – 5 Foundation for Statistics Two paths for K – 5 Data Standards • Categorical Data: Sorting; Representing on Bar Graphs; Supports later work on bivariate data and two-way tables in Grade 8 • Measurement Data: Measuring; Representing on Line Plots; Supports later work on histograms and box plots in MS
Grade 6 Common Core: Statistics Begins In Grades K – 5, students have learned to represent and interpret data using line plots and bar graphs. 6.SP.2 Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape. Students learn to represent data using histograms and box plots.
Common Core Grade 6 Develop understanding of statistical variability. • Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape. • Recognize that a measure of center for a numerical data set summarizes all of its values with a single number (mean, median, mode), while a measure of variation describes how its values vary with a single number (mean absolute deviation or interquartile range).
Grade 6: Summarize and Describe Distributions Describe Data by Measures of Center: Mean, Median, Mode Big Idea: You should not decide which measure or measures of center to use until you know the reason you are doing it. Pick one that helps you tell the story of your data. If you have a very small set of data, you may prefer not to do any of them, but just to show all the data. Statistics for K-8 Educators by Robert Rosenfeld
Grade 6: Summarize and Describe Distributions Describe Data by Measures of Variation: What is the spread of the data? Range: 20 – 13 = 7 Interquartile Range: 18 – 16 = 2
Grade 6: Summarize and Describe Distributions Describe Data by Measures of Variation: How do numbers tend to spread out from the center? X: 4, 5, 7, 12 Mean = 7 Deviations from the mean: -3, -2, 0, 5 Absolute values of deviations: 3, 2, 0, 5 Mean absolute deviation: (3 + 2 + 0 + 5)/4 = 10/4 = 2.5 Mean absolute deviation is introduced in Grade 6
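The mean absolute deviation calculation for X: 4, 5, 7, 12 can be reproduced step by step. This is an illustrative sketch using Python's standard `statistics.mean`; the variable names are chosen for this example.

```python
from statistics import mean

x = [4, 5, 7, 12]

center = mean(x)                                    # mean of the data
deviations = [value - center for value in x]        # signed distances from the mean
absolute_deviations = [abs(d) for d in deviations]  # drop the signs
mad = mean(absolute_deviations)                     # mean absolute deviation

print(center, deviations, absolute_deviations, mad)
```

The signed deviations always sum to zero, which is why the absolute values are taken before averaging.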
Standard Deviation is not introduced until HS Another Measure of Variation: How do numbers tend to spread out from the center? Standard Deviation – also summarizes how the individual numbers in a set differ from the mean, but it is based on the squares of the deviations rather than their absolute values. X: 4, 5, 7, 12 Mean = 7 Deviations from the mean: -3, -2, 0, 5 Squares of the deviations: 9, 4, 0, 25 Mean of the squares: (9 + 4 + 0 + 25)/4 = 38/4 = 9.5 SD = square root (9.5) ≈ 3.08 9.5 (the square of the SD) is called the variance and is used as a measure of variation in more advanced work. Standard Deviation is introduced in HS in Common Core
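The standard deviation steps for the same data can be sketched as follows. The slide divides by the number of values (4), which is the population form; Python's `statistics.pstdev` uses that same convention, while `statistics.stdev` divides by n - 1 and is shown only for contrast. The variable names are chosen for this example.

```python
from statistics import mean, pstdev, stdev

x = [4, 5, 7, 12]

center = mean(x)
squared_deviations = [(value - center) ** 2 for value in x]
variance = mean(squared_deviations)  # mean of the squares, dividing by n
sd = variance ** 0.5                 # square root of the variance

print(squared_deviations, variance, round(sd, 2))
print(round(pstdev(x), 2))  # population SD, same convention as the slide
print(round(stdev(x), 2))   # sample SD (divides by n - 1), a different convention
```

Calculators and software often default to the n - 1 version, so results may differ slightly from the hand calculation unless the population form is selected.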
MAD or SD? Measures of Variation: Which do I use? The mean absolute deviation is gaining popularity as the best way to introduce measuring variability in grades K – 12, saving standard deviation for more advanced work. In research, the mean absolute deviation is often chosen as the measure of variability when the median is used as the measure of center, while the standard deviation is used when the mean is the measure of center. Statistics for K-8 Educators by Robert Rosenfeld
Grade 6: Summarize and Describe Distributions Describe Data by Overall shape of data distribution: • Normal Distribution • Bimodal Distribution • Uniform Distribution • Skewed Left • Skewed Right
Grade 6: Summarize and Describe Distributions • Display numerical data in plots on a number line, including dot plots, histograms, and box plots.
Grade 6: Summarize and Describe Distributions Summarize numerical data sets in relation to their context, such as by: • Reporting the number of observations. • Describing the nature of the attribute under investigation, including how it was measured and its units of measurement.