
5 things you need to know about statistics and data

Laine G.M. Ruus, University of Toronto, Data Library Service, laine.ruus@utoronto.ca, 2010-03-05.


Presentation Transcript


  1. Laine G.M. Ruus University of Toronto. Data Library Service laine.ruus@utoronto.ca 2010-03-05 5 things you need to know about statistics and data

  2. …and the 5 things are: • Statistics are produced from data • You need to know where the data came from and who collected them to determine their quality • Information, as well as statistics and data, are commodities, and therefore have a price • Most of the statistics and data available at UT are not available free on the web • The two most commonly used sources of Canadian statistics are CANSIM, and the Census of population.

  3. 1. Statistics are derived from data • Data + analysis=statistics • To have data about a characteristic, it must be definable in a consistent manner, and measurable • Without data, there can be no statistics, just guesses (aka ‘estimates’)

  4. Source: Physicians for a Smoke-Free Canada http://www.smoke-free.ca/health/pscissues_health.htm

  5. Source: Statistics Canada. Health indicators. 2009, no. 1 (STC 82-221-X)

  6. Bottom line: always read the documentation (aka metadata)

  7. Data versus statistics • Data are the characteristics of entities (units of observation), collected: • In the course of normal processes (deaths database, hourly weather observations, etc) • By surveys and censuses • Statistics summarize data: • Eg the largest, and the smallest • The average, and/or median • How like, or different from, the average are most of the cases (standard deviation) • Quantiles, percentiles, rates
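
As a minimal sketch of how statistics summarize data, the snippet below computes the measures listed on this slide with Python's standard `statistics` module. The `ages` list and the `summarize` helper are hypothetical, invented only for illustration; they do not come from the presentation.

```python
import statistics

# Hypothetical data: one observation per entity
# (for example, the ages of ten survey respondents).
ages = [23, 35, 41, 29, 52, 38, 47, 31, 60, 44]


def summarize(values):
    """Return common descriptive statistics for a list of numbers."""
    return {
        "smallest": min(values),
        "largest": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation: how far cases typically sit from the mean.
        "std_dev": statistics.stdev(values),
        # Three cut points that split the data into four equal-sized groups.
        "quartiles": statistics.quantiles(values, n=4),
    }


summary = summarize(ages)
for name, value in summary.items():
    print(name, value)
```

The ten raw ages are the data; each value in `summary` is a statistic derived from them.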

  8. 2. You need to know the provenance of the data/statistics to determine their quality Statistics + interpretation = information

  9. Source: Labour force historical review cd-rom, 2009 ed.

  10. Source: Labour force historical review cd-rom, 2009 ed.

  11. Source: http://www.islamfortoday.com/america01.htm

  12. Source: CIA world factbook https://www.cia.gov/library/publications/the-world-factbook/ US Census Bureau’s US population estimate for 2007: 301,714,354. 0.6% of 301,714,354 is approximately 1,810,290
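
The arithmetic on this slide can be checked with a short sketch. The population figure and the 0.6% share are taken from the slide; the function and variable names are hypothetical.

```python
# Figures quoted on the slide: the 2007 US population estimate and a 0.6% share.
population_2007 = 301_714_354
share_percent = 0.6


def count_from_percent(total, percent):
    """Convert a percentage share of a total into a head count."""
    return total * percent / 100


estimate = count_from_percent(population_2007, share_percent)
print(round(estimate))
```

The result is about 1,810,286, which is consistent with the slide's figure of 1,810,290 if that figure was rounded to the nearest ten.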

  13. Source: NationMaster http://www.nationmaster.com

  14. Source: NationMaster http://www.nationmaster.com

  15. You need to know: • Who has collected the data, and how? • Who created the descriptive statistics, and how? If it’s a rate, or percentage, what is in the numerator? What is in the denominator? • For what purpose were the statistics created? What was the objective of the interpretation? Did the collector/creator have a social or political agenda? • Is causality part of the interpretation? Does it make sense?
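
To see why the numerator and denominator matter, here is a minimal sketch in which the same count produces two different rates depending on the denominator chosen. All counts are hypothetical and invented for illustration; they are not drawn from any source cited in the presentation.

```python
def rate_percent(numerator, denominator):
    """Express numerator / denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100 * numerator / denominator


# Hypothetical counts, invented for illustration only.
unemployed = 80
labour_force = 1_000        # employed plus unemployed
population_15_plus = 1_600  # also includes people outside the labour force

rate_of_labour_force = rate_percent(unemployed, labour_force)
rate_of_population = rate_percent(unemployed, population_15_plus)

print(rate_of_labour_force)  # share of the labour force
print(rate_of_population)    # share of everyone aged 15 and over
```

Both figures could be reported as "the unemployment rate", so a published rate is only interpretable once the documentation states which denominator was used.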

  16. 3. Information, statistics, and data are commodities, and therefore have a price • In an information-based economy, information (and the data and statistics from which it derives) has a price, just as resources do in a resource-based economy • Most of the priced statistics and data available to University of Toronto faculty, students and staff are on restricted servers, and have licences that limit what you may legally do with the statistics/data

  17. Source: http://www.statcan.gc.ca/bsolc/olc-cel/olc-cel?lang=eng&catno=92-153-X

  18. Source: http://dc1.chass.utoronto.ca/census/pccf.html

  19. Most of the data and statistics available to you are not available free on the www. • The Library purchases large amounts of data and statistics for academic use at University of Toronto. • Do not pay for data/statistics (eg from Statistics Canada) without checking with the Data Library Service whether they are already available at UofT

  20. 4. Most of the statistics and data available at UT are not listed in the Library catalogue • Collections can be searched via the Data Library Service homepage: http://www.chass.utoronto.ca/datalib/ • Collections can also be found using Google

  21. Two major links for finding statistics: • Finding Canadian statistics: http://www.chass.utoronto.ca/datalib/other/findcans.htm • Finding U.S. and international statistics http://www.chass.utoronto.ca/datalib/other/findints.htm

  22. Two major links for finding data: • Socio-economic and financial time-series databases http://www.chass.utoronto.ca/datalib/major/ecofin.htm • Microdata analysis and extraction http://www.chass.utoronto.ca/datalib/major/sda.htm

  23. If all else fails: • Search the DLS web site: http://www.chass.utoronto.ca/datalib/ • Or e-mail: http://www.chass.utoronto.ca/datalib/sendquestion.html

  24. When to refer:

  25. 5. Major data resources on Canada: • CANSIM <http://www.chass.utoronto.ca/datalib/codebooks/cstdli/cansim.htm> • Census of population, eg 2006 <http://www.chass.utoronto.ca/datalib/cc06/cc06.htm>

  26. …and one final note • Please, do not ever tell a user that a statistic or data does not exist! • It may exist, but you cannot find it, eg because of terminology or definitional changes or problems • It may indeed not exist, but we may be able to create it from raw data, to the user’s specifications
