
Section 5.1


aimon

Presentation Transcript


  1. Section 5.1 Incomes and Other Quantities

  2. Examples of Categorical Variables • What is your gender? • Did you see Toni Morrison last night? • How confident are you that you’ll be able to find a job in your major upon graduation? (not confident at all / somewhat confident / confident / very confident)

  3. Numerical Summaries of Categorical Data • What is your gender? • There were 10 females and 20 males in the sample. • How can we numerically summarize this data besides reporting raw counts?

  4. Summarizing Categorical Data • So the proportion of females is 10/30 = 33.33% and the proportion of males is 20/30 = 66.67%. • Are there other appropriate numeric measurements? Does it make sense to describe gender by an average? No.
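
The proportions on this slide can be checked with a few lines of Python. This is only a minimal sketch: the counts come from slide 3, while the names `counts`, `total`, and `proportions` are chosen here for illustration.

```python
# Counts taken from slide 3: 10 females and 20 males in the sample.
counts = {"female": 10, "male": 20}

# The sample size is the sum of the category counts.
total = sum(counts.values())

# A proportion is a category's count divided by the sample size.
proportions = {category: count / total for category, count in counts.items()}

for category, p in proportions.items():
    # Format each proportion as a percentage with two decimals.
    print(f"{category}: {p:.2%}")
```

The two proportions add up to 100%, which is a quick sanity check on any set of category proportions.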

  5. Job Confidence Results • 10 say not confident • 15 say somewhat confident • 30 say confident • 20 say very confident • How can we numerically summarize this data? Using proportions.
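
As a rough illustration of "using proportions" for a variable with more than two categories, the sketch below converts the job confidence counts into shares of the sample. The helper name `summarize_proportions` is hypothetical, not something from the text.

```python
def summarize_proportions(counts):
    """Return each category's share of the total as a fraction between 0 and 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute proportions for an empty sample")
    return {category: count / total for category, count in counts.items()}


# Counts as reported on the slide (75 responses in all).
job_confidence = {
    "not confident": 10,
    "somewhat confident": 15,
    "confident": 30,
    "very confident": 20,
}

shares = summarize_proportions(job_confidence)
for category, share in shares.items():
    print(f"{category:>20}: {share:.1%}")
```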

  6. Graphical Displays of Categorical Data • Bar Graphs • Pie Charts

  7. Numeric Variables • Numeric data consist of numbers representing measurements. • The text refers to numeric data as “number line data”. • Examples: • Weights of football players • Prices of college textbooks • Ages of US Presidents at inauguration

  8. Looking Ahead • Chapters 5-7 examine many of the same ideas that we studied in Chapters 1-4, except from the point of view of numeric variables. • Similar to before, we’ll look at numerical and graphical summaries of data, sampling distributions of statistics, confidence intervals, and hypothesis tests.

  9. Overview of Chapter 5 (in part) • Numerical Summaries of Numeric Variables • Measures of center: What is the center value? • Measures of spread: Are the data clustered near the center or spread out?

  10. Numerical Summaries for Numeric Data • Prices of college textbooks • $82.50, $75.50, $27.50, $88.25, $79.00, $120.50, $90.25, $68.50, $85.50, $90.25 • How should we summarize this numeric data? • Does computing proportions make sense?

  11. Measurements of Center for Numeric Data Three common measures of “center” are: • Mean – arithmetic average • Median – “middle” value • Mode – most frequent
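
To make the three measures concrete, here is a small sketch that applies them to the ten textbook prices listed on slide 10. It assumes Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# Textbook prices in dollars, copied from slide 10.
prices = [82.50, 75.50, 27.50, 88.25, 79.00, 120.50, 90.25, 68.50, 85.50, 90.25]

mean_price = statistics.mean(prices)      # arithmetic average
median_price = statistics.median(prices)  # middle value of the sorted prices
mode_price = statistics.mode(prices)      # most frequent price

print(f"mean   = ${mean_price:.2f}")
print(f"median = ${median_price:.2f}")
print(f"mode   = ${mode_price:.2f}")
```

With ten prices there is no single middle value, so the median is the average of the fifth and sixth prices once they are sorted. The mode here is the only price that appears twice.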

  12. CO2 Pollution of the 8 Largest Nations • The Pew Center on Global Climate Change reports that possible global warming is due in large part to human activity that produces carbon dioxide emissions and other greenhouse gases. The CO2 emissions from fossil fuel combustion are the result of the generation of electricity, heating, and gas consumption in cars.

  13. Which countries are most populated? • http://www.aneki.com/populated.html

  14. Per capita CO2 emissions for the 8 largest countries in population size (metric tons/person) • China 2.3 • India 1.1 • USA 19.7 • Indonesia 1.2 • Russia 9.8 • Brazil 1.8 • Pakistan 0.7 • Bangladesh 0.2

  15. Dotplot of the CO2 emissions data

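  16. Mean • Defn: The sum of the data values divided by the number of data values. • Ex: Find the mean of the per capita CO2 emissions for the 8 countries. Ans: (2.3 + 1.1 + 19.7 + 1.2 + 9.8 + 1.8 + 0.7 + 0.2) / 8 = 36.8 / 8 = 4.6 metric tons per person.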
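
The definition of the mean can be applied directly to the emissions figures from slide 14. The sketch below is one way to do the arithmetic; the dictionary name `emissions` is an assumption made for this example.

```python
# Per capita CO2 emissions (metric tons/person), copied from slide 14.
emissions = {
    "China": 2.3,
    "India": 1.1,
    "USA": 19.7,
    "Indonesia": 1.2,
    "Russia": 9.8,
    "Brazil": 1.8,
    "Pakistan": 0.7,
    "Bangladesh": 0.2,
}

values = list(emissions.values())

# Mean = sum of the data values divided by the number of data values.
mean_emissions = sum(values) / len(values)

print(f"mean = {mean_emissions:.1f} metric tons per person")
```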

  17. Median • Defn: Center value of ordered data. • Ex: Find the median of the data set. Begin by ordering the data. 0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7 Since there are an even number of data points, the median is the mean of the middle two values, 1.2 and 1.8. So the median is 1.5.
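
The ordering step and the even/odd rule described on this slide can be sketched as a short function. The name `median_of` is hypothetical; in practice `statistics.median` from the Python standard library does the same job.

```python
def median_of(data):
    """Return the center value of the data after sorting it."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[middle]
    # Even count: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Per capita CO2 emissions from slide 14, in the original (unsorted) order.
co2 = [2.3, 1.1, 19.7, 1.2, 9.8, 1.8, 0.7, 0.2]

print(sorted(co2))
print(f"median = {median_of(co2):.1f}")
```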

  18. Why two measures of center? • The mean and median are usually different, so a journalist can mislead you through the choice of which one to report. • Ex: In 2004 the median household income was $44,389 and the mean household income was $60,528.

  19. Mean vs. Median • The median lies below about half of the observations. • The mean, by contrast, can lie below most of the observations. • Ex: http://bcs.whfreeman.com/ips5e/default.asp?s=&n=&i=&v=&o=&ns=0&uid=0&rau=0

  20. Describing the Shape of a Histogram • Mean is the balance point. • If a histogram is symmetrical, its balance point is the middle observation. In this case, mean = median. • Distributions that are not symmetrical are skewed, either to the right (tail extends out further to the right than to the left) or to the left (tail extends out further to the left than to the right).

  21. Skewed Right • How much cash do you have on you? • Median = $15 • Mean = $35.82

  22. Skewed left

  23. Number of States Visited • Median = 15 • Mean = 16.43

  24. Mean follows skewness • If a distribution of data is skewed, the mean will be farther towards the tail than the median.
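
One informal way to see this rule at work is to compare the mean and median of a data set and note which one is larger. The sketch below does that for the CO2 emissions from slide 14. The function `skew_hint` is a hypothetical helper and only a rough heuristic, not a formal test of skewness; a histogram or dotplot should still be inspected.

```python
import statistics


def skew_hint(data):
    """Guess the direction of skew by comparing the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right"  # mean pulled toward a long right tail
    if mean < median:
        return "left"   # mean pulled toward a long left tail
    return "none apparent"


# Per capita CO2 emissions (metric tons/person) from slide 14.
co2 = [2.3, 1.1, 19.7, 1.2, 9.8, 1.8, 0.7, 0.2]

print("mean   =", round(statistics.mean(co2), 2))
print("median =", round(statistics.median(co2), 2))
print("skew hint:", skew_hint(co2))
```

For the emissions data the two large values (USA and Russia) pull the mean well above the median, which matches the long right tail visible in the dotplot.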

  25. Exercises • The workers and management of a company are having a labor dispute. Explain why workers might use the median income of all employees to justify a raise but management might use the mean to argue that a raise is not needed. • The mean age of four people in a room is 30 years. A new person whose age is 55 years enters the room. What is the mean age of the five people in the room?
