Introduction to Statistics Unit 1
Statistics in a Word • Economics is about ... Money • Biology ... Life • History ... What, where, and when? • Philosophy ... Why? • Accounting ... How much? • Statistics ... Variation Data vary. People are different. We can’t see everything, let alone measure it all. And even what we do measure, we measure imperfectly. So the data we wind up looking at and basing our decisions on provide, at best, an imperfect picture of the world.
You are already familiar with many practices of statistics, such as • Surveys • Business surveys • School surveys • Collecting data • Census • Online: cookies track where you go and what you buy • Describing populations • Voting districts look at ethnicities • Schools for funding
Our first goal is to understand the basic concepts and goals of statistics. • Almost every day you are exposed to statistics. For example, consider the following excerpts from recent newspapers and journals. • “A survey of traffic deaths during this past Memorial Day weekend shows a 36% decrease in fatalities compared with last year.” • “Men who eat just two servings of tomatoes a week in raw, sauce or pizza form have a 34% less risk of developing prostate cancer.” • “More than three fourths of all college seniors in the United States complete at least one internship by graduation and 55% participate in two or more.” • The three statements we just read are based on the collection of data.
DATA • Data consist of information coming from observations, counts, measurements, or responses. The singular of data is datum. • Sometimes data are presented graphically. • The use of statistics dates back to census taking in ancient Babylonia, Egypt and later in the Roman Empire, when data were collected about matters concerning the state, such as births and deaths. • The word statistics is derived from the Latin word status, meaning “state”.
Statistics is all about Data • Data are numbers, but they are not “just numbers”. Data are numbers with a context. The number 10.5, for example, carries no information by itself. But if we hear that a friend’s new baby weighed 10.5 pounds at birth, we congratulate her on the healthy size of her child. • The context engages our background knowledge and allows us to make judgments. We know that a baby weighing 10.5 pounds is quite large, and that a human baby is unlikely to weigh 10.5 ounces or kilograms. The context makes the number informative.
Statistics • Statistics is the science of collecting, organizing, analyzing and interpreting data in order to make decisions. • Data beat anecdotes (stories) • An anecdote is a striking story that sticks in our minds exactly because it is striking. Anecdotes humanize an issue, but they can be misleading. • Let’s look at an example.
Does living near power lines cause leukemia in children? • The National Cancer Institute spent 5 years and $5 million gathering data on this question. The researchers compared 638 children who had leukemia with 620 who did not. They went into the homes and measured the magnetic fields in the children’s bedrooms, in other rooms and at the front door. They recorded facts about the power lines near the family home and also near the mother’s residence when she was pregnant. Result: no connection between leukemia and exposure to magnetic fields of the kind produced by power lines. The editorial that accompanied the study report in the New England Journal of Medicine thundered, “It is time to stop wasting our research resources” on the question.
Now Compare… • How effective would a television news report of a 5-year, $5 million investigation be against a televised interview with an articulate mother whose child has leukemia and who happens to live near a power line? • In the public mind, the anecdote wins every time. Why? • A statistically literate person knows better. Data are more reliable than anecdotes because they systematically describe an overall picture rather than focus on a few incidents.
You will use two types of data sets when studying statistics • Populations – the collection of all outcomes, responses, measurements or counts that are of interest. • Samples – a subset of a population • Why are samples used more often than populations? • Unless a population is small, it is usually impractical if not impossible to obtain all the population data. Therefore, in most studies, information must be obtained from a sample.
Example • In a recent survey, 3002 adults in the United States were asked if they read news on the Internet at least once a week. Six hundred of the adults said yes. • Identify the population • Identify the sample • What does the data set consist of?
Whether a data set is a population or a sample usually depends on the context of the real-life situation. For instance, in our previous example, the population was the set of responses of all adults in the United States. Depending on the purpose of the survey, the population could have been the set of responses of all adults who live in California or who have telephones or who read a particular newspaper.
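The Internet news survey can be written out as a small data set. This is only an illustrative sketch in Python: the variable names are made up for the example, and the counts are the ones stated in the survey (3002 adults asked, 600 said yes).

```python
# Counts taken from the survey example; variable names are illustrative.
sample_size = 3002   # adults in the United States who were asked
said_yes = 600       # adults who read news on the Internet at least once a week
said_no = sample_size - said_yes

# The data set consists of one response per surveyed adult.
data_set = ["yes"] * said_yes + ["no"] * said_no

# Proportion of the sample that answered yes.
sample_proportion = data_set.count("yes") / len(data_set)

print(f"Responses in the data set: {len(data_set)}")
print(f"Proportion answering yes:  {sample_proportion:.3f}")
```

The sample is the 3002 responses collected. The population, the responses of all adults in the United States, never appears in the code because it was never measured.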
You try! • The U.S. Department of Energy conducts weekly surveys of approximately 900 gasoline stations to determine the average price per gallon of regular gasoline. On December 29, 2011, the average price was $3.48 per gallon. • Identify the population • Identify the sample. • What does the data set consist of?
Again, whether a data set is a population or a sample usually depends on the context of the real-life situation. For instance, in Example 1, the population was the set of all responses of all adults in the United States. Depending on the purpose of the survey, the population could have been the set of responses of all adults who live in Texas or who have telephones or who read a particular newspaper. • Parameter – a numerical description of a population characteristic • Statistic – a numerical description of a sample characteristic
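The difference between a parameter and a statistic can be seen in a short simulation. This is a sketch under stated assumptions: the population below is invented (10,000 made-up starting salaries drawn at random), and the names `population`, `sample`, `population_mean` and `sample_mean` are chosen only for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: starting salaries of 10,000 graduates (made-up values).
population = [random.gauss(60000, 8000) for _ in range(10000)]

# Parameter: a numerical description computed from the whole population.
population_mean = statistics.mean(population)

# Sample: a randomly chosen subset of 100 members of the population.
sample = random.sample(population, 100)

# Statistic: a numerical description computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:,.0f}")
print(f"Statistic (sample mean):     {sample_mean:,.0f}")
```

The two numbers are usually close but not equal. In real studies only the statistic is available, because the whole population is rarely measured.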
Distinguish between Parameter and Statistic • “A recent survey of a sample reported that the average starting salary for an MBA is less than $65,000.” • “Starting salaries for the 667 MBA graduates from the University of Chicago Graduate School of Business increased 8.5% from the previous year.” • “In a random check of a sample of retail stores, the Food and Drug Administration found that 34% of the stores were not storing fish at the proper temperature.”
Branches of Statistics • The study of statistics has two major branches: descriptive statistics and inferential statistics. • Descriptive statistics is the branch of statistics that involves the organization, summarization, and display of data. • Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability.
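The two branches can be contrasted using the counts from the Internet news survey (600 of 3002 adults said yes). The sketch below is illustrative only. The descriptive step summarizes the sample. The inferential step uses one common approach, a normal-approximation 95% margin of error, which this unit does not cover and which assumes the adults were chosen at random.

```python
import math

# Counts from the Internet news survey example.
sample_size = 3002
said_yes = 600

# Descriptive statistics: summarize the sample itself.
sample_proportion = said_yes / sample_size
print(f"Descriptive: {sample_proportion:.1%} of the sample said yes")

# Inferential statistics: use the sample to say something about the population.
# Assumption: a random sample and the normal-approximation formula with z = 1.96.
standard_error = math.sqrt(sample_proportion * (1 - sample_proportion) / sample_size)
margin = 1.96 * standard_error
lower = sample_proportion - margin
upper = sample_proportion + margin
print(f"Inferential: the population proportion is likely between "
      f"{lower:.1%} and {upper:.1%}")
```

The word "likely" is where probability enters: the inferential statement is a conclusion about all adults, not a certainty.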
Example 1 • Decide which part of the study represents the descriptive branch of statistics. What conclusion might be drawn from the study using inferential statistics? • A large sample of men, aged 48, was studied for 18 years. For unmarried men, approximately 70% were alive at age 65. For married men, 90% were alive at age 65.
Example 2 • Decide which part of the study represents the descriptive branch of statistics. What conclusion might be drawn from the study using inferential statistics? • In a sample of Wall Street analysts, the percentage who incorrectly forecasted high-tech earnings in a recent year was 44%.
Introduction to Statistics • Learning Objectives: • Define statistics • Distinguish between a population and a sample • Distinguish between a parameter and a statistic • Distinguish between descriptive statistics and inferential statistics • Assessment: • Complete the Unit 1, Lesson 1 class work. Due at the end of class. • Homework • Find a newspaper or magazine article that describes a survey and bring it to class on Friday.