Cross Tabulation


  1. Cross Tabulation Statistical Analysis of Categorical Variables

  2. To date…. • We have examined statistical tests for differences of means, proportions, regression coefficients and correlation coefficients. • These statistics are all measured at the interval level.

  3. New Test… • Now we wish to examine statistical tests for questions involving nominal and ordinal variables. To do so we introduce the Chi Square Test.

  4. Cross Tabulation • We are interested in counting the number of cases in each category of one variable, broken down by the categories of a second variable, and…. • Implicitly, we are asking whether the patterns of the counts differ….

  5. Cross Tabulation and Chi Square Test A cross tabulation cross classifies one variable by another variable. Below is a cross classification of occupational groups and wards for the Simon data for 1905.

  6. Cross Tabulation and Chi Square Test We count the number of cases in each occupational category for each ward. At the edges of the table we total the rows and columns.

  7. Graphic Illustration of the Counts of Occupational Groups by Ward

  8. The Simplest Example….a 2 by 2 Table • Do the opinions of men and women differ on the War in Iraq? • Do the opinions of men and women differ on the importance of capturing Osama Bin Laden? • Data: September 2006 ABC News Poll on the War on Terror. A Sample of about 1000 respondents.

  9. Basic Frequencies

  10. Basic Frequencies Broken Down by Gender

  11. Another Graphical Illustration: 4 Bins of Counts

  12. Tabular Data

  13. Some Terms and Assumptions • Cell frequency: the count in a cell in the body of the table • Marginal total: the total of a row or a column • Row percent: the cell frequency as a percentage of its row total • Column percent: the cell frequency as a percentage of its column total • Expected frequency: the number of cases expected in a cell based upon the marginal proportions • Deviation (residual): the difference between the actual (observed) frequency and the expected frequency

  14. Tabular Data: Q12 War worth fighting NET * Q921 GENDER Crosstabulation (Count)

                                  Q921 GENDER
      Q12 War worth fighting NET   Male   Female   Total
      Worth fighting NET            213      217     430
      Not worth fighting NET        245      307     552
      Total                         458      524     982

      The four counts in the body of the table are the cell frequencies; the row and column totals are the marginals.

  15. Table Counts and Graph (figure showing the four cell counts: 213, 245, 307, 217)

  16. Row Percents

  17. Column Percents

  18. Frequencies, Row and Column Percents
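The percentage tables on slides 16 to 18 did not survive in this transcript. As a rough sketch of how row and column percents follow from the counts on slide 14, the snippet below recomputes them in plain Python. The function names `row_percents` and `column_percents` are illustrative only and are not taken from the presentation.

```python
# Counts from the Iraq War by gender table (slide 14).
counts = {
    "Worth fighting": {"Male": 213, "Female": 217},
    "Not worth fighting": {"Male": 245, "Female": 307},
}


def row_percents(table):
    """Express each cell as a percentage of its row total."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100.0 * n / row_total for col, n in cells.items()}
    return result


def column_percents(table):
    """Express each cell as a percentage of its column total."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(cells[c] for cells in table.values()) for c in columns}
    return {
        row: {c: 100.0 * n / col_totals[c] for c, n in cells.items()}
        for row, cells in table.items()
    }


for label, percents in (("Row %", row_percents(counts)),
                        ("Column %", column_percents(counts))):
    print(label)
    for row, cells in percents.items():
        print("  ", row, {c: round(p, 1) for c, p in cells.items()})
```

Row percents answer "of those who hold this opinion, what share are men or women?", while column percents answer "of men (or women), what share hold this opinion?".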

  19. New Concept: Expected Frequencies • What would the counts in the cells be if there was no impact of gender on attitudes towards the Iraq War? • The marginal proportions would define the cell counts.

  20. Expected Frequencies • Row Total * Column Total / Grand Total • Or: Row Proportion * Column Total • Or: Column Proportion * Row Total
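The expected-frequency formula can be checked against the Iraq War table from slide 14. The following is a minimal sketch in plain Python; the helper name `expected_frequencies` is an assumption made for illustration, not something from the slides.

```python
# Observed counts from slide 14 (rows: opinion, columns: Male, Female).
observed = [
    [213, 217],  # Worth fighting
    [245, 307],  # Not worth fighting
]

row_totals = [sum(row) for row in observed]        # [430, 552]
col_totals = [sum(col) for col in zip(*observed)]  # [458, 524]
grand_total = sum(row_totals)                      # 982


def expected_frequencies(row_totals, col_totals, grand_total):
    """Row Total * Column Total / Grand Total for every cell."""
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(row_totals, col_totals, grand_total)

# The "Row Proportion * Column Total" form gives the same numbers.
via_row_proportion = [
    [(r / grand_total) * c for c in col_totals] for r in row_totals
]

for exp_row in expected:
    print([round(e, 2) for e in exp_row])
```

Rounded to one decimal place, these values match the expected frequencies used in the slide 33 calculation (200.5, 229.5, 257.5, 294.5).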

  21. Another Example: The Importance of Capturing Osama Bin Laden

  22. Frequencies by Gender

  23. Frequencies By Gender

  24. Row Percents

  25. Column Percents

  26. Expected Frequencies

  27. Actual Frequencies, Expected Frequencies, and Deviations (Residual)

  28. Chi Square • Chi Square = Sum of [ (Expected – Observed)² / Expected Frequency ] • Chi Square Table: http://www.uwm.edu/~renlex/chisquare.html

  29. Examples of Chi Square Distribution

  30. Degrees of Freedom for Chi Square • Degrees of Freedom = (r-1)*(c-1), where r is the number of rows and c the number of columns • So a 2 by 2 table has 1 degree of freedom • A 3 by 2 table has (3-1)(2-1) = 2 degrees of freedom

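  31. Calculations: Catching Osama bin Laden by Gender • Each term is the squared deviation (25.3² = 640.09) divided by the expected frequency of the cell • 640.09/198.3 = 3.23 • 640.09/220.7 = 2.90 • 640.09/251.7 = 2.54 • 640.09/280.3 = 2.29 • Chi Square (SUM) = 10.96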

  32. Attitudes toward Iraq War by Gender

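  33. Calculations: Attitudes toward Iraq War by Gender • Each term is the squared deviation (12.5² = 156.25) divided by the expected frequency of the cell • 156.25/200.5 = .78 • 156.25/229.5 = .68 • 156.25/257.5 = .61 • 156.25/294.5 = .53 • Chi Square (SUM) = 2.60 • (not statistically significant at the .05 level: with 1 degree of freedom the critical value is 3.84)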
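The calculation above can be reproduced from the raw counts on slide 14. This is a sketch in plain Python, not the presenter's own procedure; `chi_square` is a hypothetical helper, and 3.841 is the standard tabulated critical value for 1 degree of freedom at the .05 level. Because the code keeps unrounded expected values, it should come out a little below the slide's 2.60 (close to 2.58), which does not change the conclusion.

```python
# Critical value of chi-square for 1 degree of freedom at the .05 level.
CRITICAL_VALUE_DF1_05 = 3.841


def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Iraq War opinion by gender (rows: worth / not worth; columns: Male, Female).
iraq = [[213, 217], [245, 307]]
statistic, df = chi_square(iraq)

print(f"chi-square = {statistic:.2f}, df = {df}")
if statistic > CRITICAL_VALUE_DF1_05:
    print("Reject the null hypothesis at the .05 level")
else:
    print("Fail to reject the null hypothesis at the .05 level")
```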

  34. Chi Square Test • For a larger table, calculation is the same, but the number of terms increases. The number of terms is equal to the number of cells.

  35. Concentration of Occupational Groups by Ward

  36. Cross Tabulation • Are the occupational patterns different in the four wards? • Or….are the patterns a result of chance? (null hypothesis) • How would we decide?

  37. Illustration: Frequencies and Marginals

  38. Row and Column Percents

  39. Expected and Actual Frequencies

      Frequencies: OCC$ (rows) by WARD (columns)

                      14      18      20      22    Total
      profcler         9      55      30      57      151
      prop            27      33      54      45      159
      skilled         90      16     149     114      369
      skillpart       13      12      40      26       91
      unskilled      175      12      71      46      304
      Total          314     128     344     288     1074

      Expected values: OCC$ (rows) by WARD (columns)

                      14      18      20      22
      profcler     44.15   18.00   48.36   40.49
      prop         46.49   18.95   50.93   42.64
      skilled     107.88   43.98  118.19   98.95
      skillpart    26.61   10.85   29.15   24.40
      unskilled    88.88   36.23   97.37   81.52

  40. Deviates

      Deviates (Observed - Expected): OCC$ (rows) by WARD (columns)

                       14        18        20        22
      profcler    -35.147    37.004   -18.365    16.508
      prop        -19.486    14.050     3.073     2.363
      skilled     -17.883   -27.978    30.810    15.050
      skillpart   -13.605     1.155    10.853     1.598
      unskilled    86.121   -24.231   -26.371   -35.520

  41. Calculations

      Case number  OCC$         WARD  FREQUENCY  EXPECTED  RESIDUAL  CHITERM
       1           profcler   14.000      9.000    44.147   -35.147   27.982
       2           profcler   18.000     55.000    17.996    37.004   76.087
       3           profcler   20.000     30.000    48.365   -18.365    6.973
       4           profcler   22.000     57.000    40.492    16.508    6.730
       5           prop       14.000     27.000    46.486   -19.486    8.168
       6           prop       18.000     33.000    18.950    14.050   10.418
       7           prop       20.000     54.000    50.927     3.073    0.185
       8           prop       22.000     45.000    42.637     2.363    0.131
       9           skilled    14.000     90.000   107.883   -17.883    2.964
      10           skilled    18.000     16.000    43.978   -27.978   17.799
      11           skilled    20.000    149.000   118.190    30.810    8.032
      12           skilled    22.000    114.000    98.950    15.050    2.289
      13           skillpart  14.000     13.000    26.605   -13.605    6.957
      14           skillpart  18.000     12.000    10.845     1.155    0.123
      15           skillpart  20.000     40.000    29.147    10.853    4.041
      16           skillpart  22.000     26.000    24.402     1.598    0.105
      17           unskilled  14.000    175.000    88.879    86.121   83.449
      18           unskilled  18.000     12.000    36.231   -24.231   16.205
      19           unskilled  20.000     71.000    97.371   -26.371    7.142
      20           unskilled  22.000     46.000    81.520   -35.520   15.477
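The table lists the 20 chi terms but stops short of the total. As a hedged sketch, the plain-Python snippet below rebuilds the expected values, residuals and chi terms from the observed counts on slide 39 and then sums them. The variable names are illustrative, and 21.026 is the standard tabulated critical value for 12 degrees of freedom at the .05 level, which the slides do not state.

```python
# Observed counts from slide 39: occupation (rows) by ward (columns).
occupations = ["profcler", "prop", "skilled", "skillpart", "unskilled"]
wards = [14, 18, 20, 22]
observed = [
    [9, 55, 30, 57],
    [27, 33, 54, 45],
    [90, 16, 149, 114],
    [13, 12, 40, 26],
    [175, 12, 71, 46],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# One chi term per cell: (observed - expected)^2 / expected.
chi_terms = {}
for i, occ in enumerate(occupations):
    for j, ward in enumerate(wards):
        expected = row_totals[i] * col_totals[j] / grand_total
        residual = observed[i][j] - expected
        chi_terms[(occ, ward)] = residual ** 2 / expected

chi_square_total = sum(chi_terms.values())
df = (len(occupations) - 1) * (len(wards) - 1)  # (5-1)*(4-1) = 12

# Critical value of chi-square for 12 degrees of freedom at the .05 level.
CRITICAL_VALUE_DF12_05 = 21.026

print(f"chi-square = {chi_square_total:.2f}, df = {df}")
print("significant at .05:", chi_square_total > CRITICAL_VALUE_DF12_05)
```

The largest contributions come from the profcler/ward 18 and unskilled/ward 14 cells, which is where the table shows the biggest gaps between observed and expected counts.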

  42. Review: Terms and Assumptions • Cell frequency: the count in a cell in the body of the table • Marginal total: the total of a row or a column • Row percent: the cell frequency as a percentage of its row total • Column percent: the cell frequency as a percentage of its column total • Expected frequency: the number of cases expected in a cell based upon the marginal proportions • Deviation (residual): the difference between the actual (observed) frequency and the expected frequency

  43. Strength of Relationships (n = total number of cases, r = number of rows, c = number of columns) • Phi: Square root of (Chi Square / n) • Cramer’s V: Square root of (Chi Square / (n * min(r-1, c-1))) • Contingency Coefficient: Square root of (Chi Square / (Chi Square + n))
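The three measures are simple functions of the chi-square statistic and the sample size. The sketch below writes them out in Python using only the standard library. The function names are illustrative, and the example plugs in the slide 33 result (Chi Square = 2.60) with n = 982 from slide 14.

```python
import math


def phi(chi_square, n):
    """Phi: square root of chi-square divided by the number of cases."""
    return math.sqrt(chi_square / n)


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V: chi-square divided by n * min(r-1, c-1), then square-rooted."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


def contingency_coefficient(chi_square, n):
    """Contingency coefficient: sqrt(chi-square / (chi-square + n))."""
    return math.sqrt(chi_square / (chi_square + n))


# Iraq War opinion by gender: a 2 by 2 table with 982 respondents.
chi_sq, n = 2.60, 982
print(f"phi = {phi(chi_sq, n):.3f}")
print(f"Cramer's V = {cramers_v(chi_sq, n, 2, 2):.3f}")
print(f"contingency coefficient = {contingency_coefficient(chi_sq, n):.3f}")
```

For a 2 by 2 table min(r-1, c-1) is 1, so Cramer's V reduces to Phi; the two only differ for larger tables such as the occupation by ward example.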