
Psy 425 Tests & Measurements

Psy 425 Tests & Measurements. Furr & Bacharach, Chapter 5: Conceptual Basis for Reliability. True scores: Do scores on a test accurately reflect real psychological differences? Assessment of reliability: detecting the ability of a test to accurately reflect real differences.


Presentation Transcript


  1. Psy 425 Tests & Measurements • Furr & Bacharach, Chapter 5 • Conceptual Basis for Reliability

  2. True Scores? • Do scores on a test accurately reflect real psychological differences? • Assessment of reliability • Detecting the ability of a test to accurately reflect real differences

  3. Classical Test Theory (CTT) • Conceptual basis of reliability • Outlines procedures for estimating the reliability of psychological measures

  4. CTT • True differences vs. measurement error • A test’s reliability reflects the extent to which the differences in respondents’ test scores are a function of their true psychological differences, as opposed to measurement error…

  5. Reliability • Not all or none • Is on a continuum • A test may be more or less reliable

  6. Theoretical • Reliability is a theoretical notion • Not directly observable • Can only estimate the reliability

  7. Derivation of Reliability Estimate • Estimate is derived based on three factors: • Observed scores • True scores • Measurement error

  8. Observed Scores • Values obtained from measurement of some characteristic of an individual

  9. True Scores • Real, true amounts of that characteristic

  10. Reliability • Extent to which observed scores are consistent with true scores, as opposed to other, often unknown, characteristics of the test and its administration

  11. Measurement Error • “Other” characteristics that contribute to differences in observed scores • These characteristics create inconsistencies between observed scores and true scores

  12. Sources of Measurement Error? • Can all sources be accounted for?

  13. Accurate Measurement? • Factors can obscure observed scores… • Measurement of physical properties… (height and weight?) • Measurement of psychological attributes… (post-partum depression?)

  14. What sources of error might contribute to scores on a test of depression (i.e., inflate or deflate observed scores relative to true scores)? • Interpretation of written items • Incorrect recording of answers • Secondary gain? • Defensive or avoidant? • Psychological mindedness? • Cultural factors?

  15. Test reliability depends on… • Extent to which differences in test scores can be attributed to real inter- or intra-individual differences • AND • Extent to which such differences are a function of measurement error

  16. CTT • Person’s observed score on a test is a function of that person’s true score, plus error: Xo = Xt + Xe (observed score = true score + error score)

  17. Fundamental Theoretical Assumption of CTT • Observed scores on a psychological measure are determined by respondents’ true scores and by measurement error

  18. CTT assumption about measurement error… RANDOM

  19. Random Error • Inflation and deflation caused by error is independent of the individuals’ true levels of the psychological attribute being measured… • Interpretation of written items • Incorrect recording of answers • Secondary gain? • Defensive or avoidant? • Psychological mindedness? • Cultural factors?

  20. Important consequences of assumption of random error: • Error cancels itself out across respondents • Error scores are uncorrelated with true scores

  21. Error cancels itself out…

  22. Correlation between true scores and error scores = 0.0
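
The two consequences of random error on slides 20 to 22 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the normal distributions, the sample size, the means and standard deviations, and the helper names `pearson` and `simulate_ctt` are all illustrative choices, not part of the slides.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate_ctt(n=20000, true_mean=50.0, true_sd=10.0, error_sd=5.0, seed=425):
    """Simulate CTT scores: observed = true + random error.

    Assumption: true scores and errors are drawn independently from
    normal distributions, and the errors have a mean of zero.
    """
    rng = random.Random(seed)
    true = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    error = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true, error)]
    return true, error, observed


true, error, observed = simulate_ctt()

# Consequence 1: error cancels itself out across respondents.
mean_error = statistics.fmean(error)

# Consequence 2: error scores are uncorrelated with true scores.
r_true_error = pearson(true, error)

print(f"mean error across respondents: {mean_error:.3f}")
print(f"correlation(true, error):      {r_true_error:.3f}")
```

With a large simulated sample, both printed values should land close to zero, though never exactly zero in any finite sample.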

  23. Four ways to think of reliability

  24. Four ways to think of reliability
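
The content of slides 23 and 24 did not survive in this transcript. As a hedged sketch, the four equivalent formulations commonly given in CTT treatments can be written as follows; the notation is an assumption (s² for variance, r for correlation, subscripts o, t, and e for observed, true, and error scores).

```latex
\begin{align*}
R_{xx} &= \frac{s_t^2}{s_o^2}
  && \text{ratio of true-score variance to observed-score variance} \\
R_{xx} &= 1 - \frac{s_e^2}{s_o^2}
  && \text{lack of error variance} \\
R_{xx} &= r_{ot}^2
  && \text{squared correlation between observed and true scores} \\
R_{xx} &= 1 - r_{oe}^2
  && \text{lack of squared correlation between observed and error scores}
\end{align*}
```

All four follow from the observed score being the sum of true score and error, with error uncorrelated with true scores, so that the observed-score variance is the true-score variance plus the error variance.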

  25. Values:

  26. Worksheet

  27. Size of reliability coefficient • Test’s reliability • Varies between 0 and 1 • Larger values = greater psychometric quality • As value increases, a greater proportion of the differences among observed scores can be attributed to differences among true scores
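
To make the last bullet concrete, reliability can be computed on simulated data as the share of observed-score variance that comes from true-score variance. This is a sketch only: the function name `simulated_reliability`, the normal distributions, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def simulated_reliability(error_sd, true_sd=10.0, n=20000, seed=27):
    """Estimate reliability as var(true) / var(observed) on simulated scores.

    Assumption: observed = true + error, with normally distributed
    random error that is independent of the true scores.
    """
    rng = random.Random(seed)
    true = [rng.gauss(50.0, true_sd) for _ in range(n)]
    observed = [t + rng.gauss(0.0, error_sd) for t in true]
    return statistics.pvariance(true) / statistics.pvariance(observed)


# Less error: theoretical value is 10**2 / (10**2 + 5**2) = .80
low_error = simulated_reliability(error_sd=5.0)

# More error: theoretical value is 10**2 / (10**2 + 10**2) = .50
high_error = simulated_reliability(error_sd=10.0)

print(f"reliability with small error: {low_error:.2f}")
print(f"reliability with large error: {high_error:.2f}")
```

The simulated values should fall near the theoretical ones: as error variance grows, a smaller proportion of the differences among observed scores reflects differences among true scores.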

  28. Good vs. poor test reliability • No clear cutoff • In social science research, .70 to .80 is satisfactory • Less than that, marginal to poor • What about test reliability = 0; is the test at all useful? What about .43?

  29. Improving reliability… • Test: Rxx = .48 • Improved test: Rxx = .74

  30. Error variance • Small degree = respondents’ scores are only slightly affected by measurement error

  31. Index of reliability • “Index of reliability” = unsquared correlation between observed and true scores • Usually, “reliability” refers to the coefficient of reliability, the squared correlation (R²)

  32. Reliability and Standard Error of Measurement (sem) • Standard deviation of error scores • Represents average size of error scores • The greater the average difference between observed scores and true scores, the less reliable the test • Closely linked to reliability: large sem → poor Rxx • If Rxx = 1, then sem = 0
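
The link between the standard error of measurement and reliability can be sketched in code. The slide does not state the formula; the version below uses the standard CTT relation sem = s_o × √(1 − Rxx), where s_o is the standard deviation of observed scores. The function name and the example values (an observed SD of 15) are assumptions for illustration.

```python
import math


def standard_error_of_measurement(observed_sd, reliability):
    """Return sem = observed_sd * sqrt(1 - reliability).

    Assumption: the standard CTT relation, with reliability (Rxx)
    between 0 and 1 and a non-negative observed-score SD.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if observed_sd < 0:
        raise ValueError("observed_sd must be non-negative")
    return observed_sd * math.sqrt(1.0 - reliability)


# Perfect reliability leaves no measurement error.
sem_perfect = standard_error_of_measurement(15.0, 1.0)

# Rxx = .75 leaves a sem of half the observed SD.
sem_good = standard_error_of_measurement(15.0, 0.75)

# Zero reliability: the sem equals the observed SD.
sem_poor = standard_error_of_measurement(15.0, 0.0)

print(sem_perfect, sem_good, sem_poor)
```

For a fixed observed SD, the sem shrinks as Rxx rises, which matches the slide's point that a large sem goes with poor reliability.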
