
Chi-Square Distributions: Explanation and Testing Independence

Understand Chi-Square distribution in statistics with Brase and Brase's guide. Learn how to test independence of variables using contingency tables and Chi-Square statistic. Interpret results using critical values and degrees of freedom.

efournier

Presentation Transcript


  1. Understandable Statistics, Seventh Edition, by Brase and Brase. Prepared by: Lynn Smith, Gloucester County College. Chapter Eleven, Part 1 (Section 11.1): Chi-Square and F Distributions

  2. The Chi-Square Distribution: The χ² distribution is not symmetrical and depends on the number of degrees of freedom.

  3. χ is the Greek letter chi.

  4. The χ² distribution for d.f. = 3. [Graph: density curve; horizontal axis marked 1 to 10]

  5. The χ² distribution for d.f. = 5, shown with the curve for d.f. = 3. [Graph: same axis]

  6. The χ² distribution for d.f. = 10, shown with the curves for d.f. = 3 and d.f. = 5. [Graph: same axis]

  7. The mode or high point occurs over n – 2 for n ≥ 3, where n is the number of degrees of freedom. [Graph: curves for d.f. = 3, 5, and 10]

  8. As the degrees of freedom increase, the graph looks more bell-like and symmetric. [Graph: curves for d.f. = 3, 5, and 10]
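
The claim about the mode can be checked numerically. The sketch below is not part of the slides; it assumes the standard chi-square density formula, and the helper names `chi2_pdf` and `approx_mode` are made up for illustration. It searches a grid for the peak of each curve.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom."""
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

def approx_mode(df, step=0.001, upper=30.0):
    """Locate the peak of the density by a simple grid search."""
    grid = [i * step for i in range(1, int(upper / step) + 1)]
    return max(grid, key=lambda x: chi2_pdf(x, df))

# The peaks should sit near df - 2, i.e. near 1, 3, and 8.
for df in (3, 5, 10):
    print(df, round(approx_mode(df), 2))
```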

  9. Use Table 7 in Appendix II to find critical values of χ² distributions.

  10. Area in the right tail of the distribution = α. [Graph: area α shaded to the right of χ²]

  11. Use Table 7 (with d.f. = 8) to find α, the area to the right of χ² = 2.73.
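
The table lookup can be cross-checked by computing the right-tail area directly. This is a sketch, not the slides' method: it relies on the closed form that exists when the degrees of freedom are even, and the function name `chi2_right_tail_even_df` is hypothetical.

```python
import math

def chi2_right_tail_even_df(x, df):
    """Area to the right of x under a chi-square curve (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half_x = x / 2
    # For df = 2k the tail area is exp(-x/2) * sum_{i<k} (x/2)^i / i!
    partial_sum = sum(half_x ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half_x) * partial_sum

alpha = chi2_right_tail_even_df(2.73, 8)
print(round(alpha, 3))
```

The computed area is about 0.95, so the matching column of a right-tail table would be the one for an area of 0.950.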

  12. Chi Square: Tests of Independence. To test the independence of two factors, use a contingency table.

  13. Contingency Table

  14. Shaded boxes (called “cells”) will contain frequencies.

  15. Horizontal lines of cells are called rows.

  16. Vertical lines of cells are called columns.

  17. The size of a table is given as rows × columns.

  18. This is a 3 × 3 contingency table.

  19. This is a 3 × 2 contingency table.

  20. This is a 2 × 3 contingency table.

  21. When giving the size of a contingency table, always give the number of rows first.

  22. Suppose we wish to determine (at the 5% level of significance) whether the time it takes to complete a given task is independent of gender.

  23. Number and gender of individuals who completed a task in the times indicated.

  24. To test the null hypothesis that gender and the time it takes to complete the task are independent: H0: Variables are independent. H1: Variables are not independent. Use the null hypothesis to determine the expected frequency of each cell.

  25. Expected Frequency

  26–32. Finding the Expected Frequencies: E = (Row total)(Column total) / (Sample size). [Slides 27–32 repeat this formula.]
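
The expected-frequency formula can be applied to every cell at once. The slides' data table did not survive in this transcript, so the counts below are hypothetical, chosen only to make the arithmetic easy to follow; `expected_frequencies` is an illustrative name.

```python
# Hypothetical 2 x 3 table of observed counts (NOT the data from the slides).
observed = [
    [20, 30, 10],
    [10, 20, 10],
]

def expected_frequencies(table):
    """E = (row total)(column total) / (sample size) for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    sample_size = sum(row_totals)
    return [[r * c / sample_size for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

For these counts the row totals are 60 and 40, the column totals are 30, 50, and 20, and the sample size is 100, so the first cell has E = (60)(30)/100 = 18.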

  33. The actual frequency which occurred is called the observed frequency, O.

  34. The Sample Statistic χ²: Chi-square measures the overall discrepancy between observed frequency O and expected frequency E across all cells.

  35. Difference Between Observed and Expected Frequencies

  36. The Sum of the (O – E) Column Will Equal Zero.

  37. To calculate chi-square, we use the values (O – E)²/E: • To reflect the magnitude of the differences between the observed and expected frequencies. • To reflect the fact that a small difference between the observed and expected frequencies is more important when the expected frequency is small.

  38. Computing χ²
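
A short sketch of the computation, again on a hypothetical 2 × 3 table because the slides' own counts are not in the transcript. It shows that the raw differences O – E cancel out while the scaled squares (O – E)²/E do not. The function names are illustrative.

```python
# Hypothetical observed counts (NOT the data from the slides).
observed = [[20, 30, 10], [10, 20, 10]]

def expected_frequencies(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    sample_size = sum(row_totals)
    return [[r * c / sample_size for c in col_totals] for r in row_totals]

def chi_square_terms(table):
    """Return (O - E, (O - E)^2 / E) for every cell, row by row."""
    expected = expected_frequencies(table)
    terms = []
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            terms.append((o - e, (o - e) ** 2 / e))
    return terms

terms = chi_square_terms(observed)
sum_of_differences = sum(diff for diff, _ in terms)
chi_square = sum(contribution for _, contribution in terms)
print(sum_of_differences)    # the (O - E) column sums to zero
print(round(chi_square, 4))  # the chi-square statistic for this table
```

Here the differences are 2, 0, –2, –2, 0, and 2, and the statistic is 4/18 + 4/12 + 4/12 + 4/8 = 25/18, about 1.39.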

  39. Degrees of Freedom: d.f. = (R – 1)(C – 1), where R = number of cell rows and C = number of cell columns.

  40. For our example: R = 2 and C = 3, so d.f. = (2 – 1)(3 – 1) = 2.
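
The degrees-of-freedom rule as a small helper (the name `degrees_of_freedom` is illustrative). Only the cell rows and columns are counted, not the totals.

```python
def degrees_of_freedom(rows, cols):
    """d.f. = (R - 1)(C - 1) for a contingency table with R rows and C columns."""
    if rows < 2 or cols < 2:
        raise ValueError("a test of independence needs at least a 2 x 2 table")
    return (rows - 1) * (cols - 1)

print(degrees_of_freedom(2, 3))  # the 2 x 3 example from the slides
print(degrees_of_freedom(3, 3))  # a 3 x 3 table
```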

  41. Using d.f. = 2 and α = 0.05, find the critical value of χ² from Table 7.

  42. If the sample statistic is larger than the critical value, reject the null hypothesis of independence. In our example, the sample statistic χ² = 10.36. The critical value = 5.99.

  43. Conclusion Reject the null hypothesis of independence. We conclude that the time it takes to complete the task is not independent of gender.
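
The critical value can be reproduced without the table in this particular case. The sketch assumes d.f. = 2, where the right-tail area of the chi-square distribution is exp(–x/2), so the critical value is –2 ln α. This shortcut does not hold for other degrees of freedom, and the function name is hypothetical.

```python
import math

def chi2_critical_value_df2(alpha):
    """Critical value for d.f. = 2 only: solve exp(-x / 2) = alpha for x."""
    return -2 * math.log(alpha)

critical_value = chi2_critical_value_df2(0.05)
sample_statistic = 10.36  # value reported on the slides
reject_independence = sample_statistic > critical_value

print(round(critical_value, 2))
print(reject_independence)
```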

  44. P Value Approach • In our example, the sample statistic χ² = 10.36. • For d.f. = 2, the sample statistic χ² = 10.36 falls between 9.21 and 10.60 (the critical values for α = 0.010 and 0.005, respectively). • We conclude that 0.005 < P < 0.010. • We would reject H0 for any α ≥ P. • We therefore reject H0 for α = 0.05.
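
The bracketing from the table can be compared with a direct calculation. As a sketch under the same d.f. = 2 assumption (right-tail area equal to exp(–x/2)), with a hypothetical function name:

```python
import math

def chi2_right_tail_df2(x):
    """P value for a chi-square statistic x when d.f. = 2."""
    return math.exp(-x / 2)

p_value = chi2_right_tail_df2(10.36)
print(round(p_value, 4))
print(0.005 < p_value < 0.010)  # agrees with the table bracketing
print(p_value <= 0.05)          # so H0 is rejected at the 5% level
```

The same function applied to 9.21 and 10.60 returns areas close to 0.010 and 0.005, which is why those two table entries bracket the P value.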

  45. In order to safely use the critical values of χ² from Table 7, we must ensure that all expected frequencies are greater than or equal to five. If this condition is not met, the sample size should be increased.
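
This condition is easy to check before running the test. A minimal sketch, with a hypothetical helper name and made-up expected frequencies:

```python
def all_expected_at_least_five(expected):
    """True when every expected frequency in the table is >= 5."""
    return all(e >= 5 for row in expected for e in row)

# Hypothetical expected frequencies, for illustration only.
large_sample = [[18.0, 30.0, 12.0], [12.0, 20.0, 8.0]]
small_sample = [[3.2, 6.8], [4.8, 10.2]]

print(all_expected_at_least_five(large_sample))  # True
print(all_expected_at_least_five(small_sample))  # False: 3.2 and 4.8 are below 5
```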

  46. Using Chi-Square Distribution to Test the Independence of Two Variables • Set up the hypotheses H0: The variables are independent. H1: The variables are not independent. • Compute the expected frequency for each cell in the contingency table.

  47. • Compute the statistic χ² for the sample.

  48. • Find the critical value of χ² in Table 7. Use the level of significance α and degrees of freedom: d.f. = (R – 1)(C – 1), where R and C are the numbers of rows and columns of cells. • The critical region = all values of χ² to the right of the critical value.

  49. • Compare the sample statistic χ² with the critical value of χ². • If the sample statistic is larger than the critical value, reject the null hypothesis of independence. • Otherwise, do not reject the null hypothesis.
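
The whole procedure can be sketched end to end. This is an illustration rather than the slides' method: it decides by P value instead of a table lookup, it supports only even degrees of freedom (where the tail area has a closed form), and both the function name `independence_test` and the counts are hypothetical.

```python
import math

def independence_test(observed, alpha=0.05):
    """Chi-square test of independence (sketch; even d.f. only)."""
    # Expected frequencies under H0: (row total)(column total) / sample size.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    sample_size = sum(row_totals)
    expected = [[r * c / sample_size for c in col_totals] for r in row_totals]
    if any(e < 5 for row in expected for e in row):
        raise ValueError("an expected frequency is below 5")

    # Sample statistic: sum of (O - E)^2 / E over all cells.
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (R - 1)(C - 1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    if df % 2 != 0:
        raise ValueError("the closed-form tail area used here needs even d.f.")

    # Right-tail area for even d.f.; reject H0 when it is at most alpha.
    half = chi_square / 2
    p_value = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )
    return chi_square, df, p_value, p_value <= alpha

# Hypothetical 2 x 3 table (NOT the data from the slides).
chi_square, df, p_value, reject = independence_test([[20, 30, 10], [10, 20, 10]])
print(round(chi_square, 2), df, round(p_value, 3), reject)
```

For this hypothetical table the statistic is small (about 1.39 with d.f. = 2), so the null hypothesis of independence is not rejected at the 5% level, in contrast to the slides' example with χ² = 10.36.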
