
Descriptive Statistics-III (Measures of Central Tendency)

Descriptive Statistics-III (Measures of Central Tendency). QSCI 381 – Lecture 5 (Larson and Farber, Sects 2.3 and 2.5). Introduction: a measure of central tendency is a value that represents the typical, or central, entry in a data set.



  1. Descriptive Statistics-III (Measures of Central Tendency) • QSCI 381 – Lecture 5 (Larson and Farber, Sects 2.3 and 2.5)

  2. Introduction • A measure of central tendency is a value that represents the typical, or central, entry in a data set. • There are three commonly used measures of central tendency: • The Mean • The Median • The Mode.

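  3. The Mean-I • The sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n is the number of entries in the sample. • The population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where N is the number of entries in the population.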

  4. The Mean-II • Consider the data set consisting of a sample of the diameters of 6 trees in a stand: 29 cm, 31 cm, 43 cm, 31 cm, 12 cm, 33 cm • Calculate the mean: $\bar{x} = (29 + 31 + 43 + 31 + 12 + 33)/6 = 179/6 \approx 29.8$ cm
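The calculation on this slide can be checked with a few lines of Python. This is a minimal sketch; the variable names are chosen for illustration and are not part of the lecture.

```python
from statistics import mean

# Tree diameters (cm) from the slide above.
diameters_cm = [29, 31, 43, 31, 12, 33]

# Sample mean: the sum of the entries divided by the number of entries.
x_bar = sum(diameters_cm) / len(diameters_cm)

# The standard-library helper should agree with the hand-rolled formula.
assert abs(x_bar - mean(diameters_cm)) < 1e-12

print(f"mean diameter = {x_bar:.2f} cm")
```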

  5. The Mean-III • Why we like the mean: • Unique. • Based on every data point in the data set. • Well suited to statistical treatment. • Why we dislike the mean: • Can be sensitive to “outlying” observations.
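The sensitivity to outlying observations can be illustrated with the tree-diameter data from the previous slide. In this sketch the second list is hypothetical: it assumes the 43 cm entry was mis-recorded as 430 cm, which is not something stated in the lecture.

```python
from statistics import mean, median

diameters_cm = [29, 31, 43, 31, 12, 33]    # data from the tree-diameter slide
with_outlier = [29, 31, 430, 31, 12, 33]   # hypothetical: 43 mis-recorded as 430

# A single outlying entry pulls the mean strongly, while the median is unchanged.
print("mean:  ", mean(diameters_cm), "->", mean(with_outlier))
print("median:", median(diameters_cm), "->", median(with_outlier))
```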

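  6. The Median • Sort the data and take the middle value; when the number of entries is even, average the two central values. • Six values: average the third and fourth sorted values. • Five values: take the third sorted value. • [Slide examples give medians of 32 and 31.]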
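A minimal sketch of the sort-and-average rule follows. The function name `median_of` is hypothetical, and the data are the tree diameters from the earlier slide (plus a five-value subset), not necessarily the values used on this slide.

```python
def median_of(values):
    """Return the median: the middle value, or the average of the two central values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the central pair

six_values = [29, 31, 43, 31, 12, 33]   # sorted: 12, 29, 31, 31, 33, 43
five_values = six_values[:-1]           # illustrative subset; sorted: 12, 29, 31, 31, 43

print(median_of(six_values), median_of(five_values))
```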

  7. The Mode • Find the frequency of each data entry and identify the data entry with the greatest frequency. • Unlike the median and mean, the mode is not always uniquely defined. If a data set has two modes, it is referred to as being bimodal.
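The frequency-counting procedure can be sketched as below. The helper name `modes` is hypothetical, and the second data set is made up purely to show a bimodal case.

```python
from collections import Counter

def modes(values):
    """Return every entry that attains the greatest frequency, in sorted order."""
    if not values:
        raise ValueError("mode of an empty data set is undefined")
    counts = Counter(values)            # frequency of each data entry
    top = max(counts.values())          # the greatest frequency
    return sorted(v for v, c in counts.items() if c == top)

tree_diameters_cm = [29, 31, 43, 31, 12, 33]   # data from the tree-diameter slide
bimodal_example = [1, 1, 2, 3, 3]              # hypothetical data with two modes

print(modes(tree_diameters_cm))
print(modes(bimodal_example))
```

Returning a list rather than a single value reflects the point on the slide that the mode is not always uniquely defined.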

  8. Which Measure is Best? • There is no clear answer to this question. • The mean can be influenced by outliers while the mode may not be particularly “typical”. • Statistical inference based on the median and the mode is somewhat difficult. • [Figure: a distribution with the median, mode, and mean marked, and a possible outlier.]

  9. Computing the Mean of a Group of Data Points • Suppose the data are in the form of frequencies, i.e., for each i we have $x_i$ and $f_i$, where $f_i$ is the number of data entries for which x equals $x_i$. Then: $\bar{x} = \frac{\sum_i x_i f_i}{\sum_i f_i}$ • In Excel use: “sumproduct(a1:a10,b1:b10)/sum(b1:b10)”, where the $x_i$'s are stored in column A and the $f_i$'s are stored in column B.
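The same sum-of-products calculation as the Excel formula can be written in Python. This is a sketch only: the function name `grouped_mean` and the small frequency table are hypothetical and do not come from the lecture.

```python
def grouped_mean(values, freqs):
    """Mean of data given as (value, frequency) pairs: sum(x_i * f_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)
    if total == 0:
        raise ValueError("total frequency must be positive")
    return sum(x * f for x, f in zip(values, freqs)) / total

# Hypothetical frequency table: value x_i occurs f_i times.
x = [1, 2, 3, 4]
f = [2, 3, 4, 1]

# Expanding the table into raw data gives the same mean.
raw = [xi for xi, fi in zip(x, f) for _ in range(fi)]

print(grouped_mean(x, f), sum(raw) / len(raw))
```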

  10. Shapes of Distributions-I • A frequency distribution is symmetric when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are mirror images. • [Figure: symmetric distribution with the mean, median, and mode marked.]

  11. Shapes of Distributions-II • A frequency distribution is uniform (or rectangular) when the number of entries in each class is equal (a uniform distribution is symmetric). • [Figure: uniform distribution with the mean, median, and mode marked.]

  12. Shapes of Distributions-III • A frequency distribution is skewed right (or positively skewed) if its tail extends to the right (mode < median < mean). • [Figure: right-skewed distribution with the mode, median, mean, and tail labeled.]

  13. Shapes of Distributions-IV • A frequency distribution is skewed left (or negatively skewed) if its tail extends to the left (mode > median > mean).
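The orderings quoted for skewed distributions can be checked on a small example. Both data sets below are hypothetical (the second is simply the mirror image of the first), and the ordering is a typical pattern rather than a guarantee for every skewed data set.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most entries are small, with a long right tail.
right_skewed = [1, 2, 2, 2, 3, 4, 5, 9, 18]

# Mirror image of the data above, so the tail extends to the left instead.
left_skewed = [20 - x for x in right_skewed]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(f"skewed {name}: mode={mode(data)}, median={median(data)}, mean={mean(data):.2f}")
```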

  14. Fractiles • The range is the difference between the maximum and minimum data entries. • The quartiles, Q1, Q2, and Q3, divide an ordered data set into four equal parts. • The percentiles, P1, P2, …, P99, divide an ordered data set into 100 equal parts. • Collectively, quartiles, percentiles (and deciles) are referred to as fractiles.

  15. More on Quartiles • The quartiles divide a data set at the 25th percentile, the 50th percentile, and the 75th percentile. • The 50th percentile is the median. • The difference between the 75th and 25th percentiles is referred to as the interquartile range.
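As a rough illustration, the range, quartiles, and interquartile range of the tree-diameter data can be computed with Python's standard library. Several conventions exist for locating quartiles in small data sets; the `method="inclusive"` choice below is an assumption made for this sketch, and other conventions (including a textbook's) may give slightly different Q1 and Q3.

```python
from statistics import quantiles

diameters_cm = [29, 31, 43, 31, 12, 33]   # data from the tree-diameter slide

# Range: difference between the maximum and minimum entries.
data_range = max(diameters_cm) - min(diameters_cm)

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = quantiles(diameters_cm, n=4, method="inclusive")

# Interquartile range: difference between the 75th and 25th percentiles.
iqr = q3 - q1

print(f"range={data_range}, Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}")
```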

  16. More on Percentiles • [Figure: the 80% level corresponds to a length of 15.2 m.] • Interpretation: 80% of the bowheads caught are smaller than 15.2 m.
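The interpretation above amounts to counting the share of entries that fall below a given value. The sketch below uses a made-up list of ten lengths, chosen so that eight of them are below 15.2 m; these are not the bowhead data from the lecture, and the function name `percent_below` is hypothetical.

```python
def percent_below(values, threshold):
    """Percentage of entries strictly smaller than the threshold."""
    if not values:
        raise ValueError("data set is empty")
    below = sum(1 for v in values if v < threshold)
    return 100 * below / len(values)

# Hypothetical lengths (m), for illustration only.
lengths_m = [9.1, 10.4, 11.0, 12.3, 12.8, 13.5, 14.1, 14.9, 15.6, 16.2]

print(f"{percent_below(lengths_m, 15.2):.0f}% of entries are smaller than 15.2 m")
```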

  17. Box and Whisker Plots-I • The information on the range and the quartiles can be represented using a box and whisker plot.

  18. Box and Whisker Plots-II • Find the five-number summary of the data (minimum, Q1, Q2, Q3, maximum). • Construct a horizontal scale that spans the data. • Plot the five numbers above the horizontal scale. • Draw a box above the horizontal scale from Q1 to Q3 and draw a vertical line in the box at Q2. • Draw whiskers from the box to the minimum and maximum entries. • [Figure: box-and-whisker plot of length (m), axis from 5 to 15, with the minimum, Q1, Q2 (median), Q3, maximum, and a whisker labeled.]
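The first step, finding the five-number summary, can be sketched as follows. This version assumes one common convention for quartiles (Q1 and Q3 are the medians of the lower and upper halves, with the middle entry excluded when the count is odd); the function names and the lengths are hypothetical and are not the data behind the slide's figure.

```python
def _median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data entries")
    half = n // 2
    lower = ordered[:half]               # entries below the median position
    upper = ordered[half + (n % 2):]     # skip the middle entry when n is odd
    return (ordered[0], _median(lower), _median(ordered), _median(upper), ordered[-1])

# Hypothetical lengths (m), for illustration only.
lengths_m = [6.5, 8.0, 9.5, 10.0, 11.0, 12.5, 13.0, 14.0, 15.5]

minimum, q1, q2, q3, maximum = five_number_summary(lengths_m)
print(f"min={minimum}, Q1={q1}, Q2={q2}, Q3={q3}, max={maximum}")
```

The box would then span Q1 to Q3 with a line at Q2, and the whiskers would run from the box out to the minimum and maximum.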

  19. Review of Symbols in this Lecture
