Assessment Accommodations: What Have We Learned From Research?

Stephen G. Sireci, Center for Educational Assessment, University of Massachusetts Amherst. Mary J. Pitoniak, Educational Testing Service.

Presentation Transcript


  1. Assessment Accommodations: What Have We Learned From Research? Stephen G. Sireci, Center for Educational Assessment, University of Massachusetts Amherst; Mary J. Pitoniak, Educational Testing Service

  2. In this presentation we will • Discuss validity issues in test accommodations • List the most common test accommodations used to promote valid score interpretation • Discuss research conducted on test accommodations • Suggest areas for future research on test accommodations

  3. Defining “Accommodation” • The Standards for Educational and Psychological Testing • use the terms “modification” and “accommodation” almost interchangeably, • use accommodation “as the general term for any action taken in response to a determination that an individual’s disability requires a departure from standard testing protocol” (p. 101).

  4. Current State Testing Programs • “Accommodation” is used to refer to test or test administration changes that are not considered to alter the construct measured. • “Modification” is used to refer to changes that are thought to alter the construct.

  5. Validity Issues in Accommodations • To support valid test score interpretations for students with disabilities, it is important to remove construct-irrelevant barriers to these students’ test performance, but it is also important to maintain “construct representation.” • In situations where individuals who take accommodated versions of tests are compared with those who take the standard version, an additional validity issue is the comparability of scores across the different test formats.

  6. The Psychometric Oxymoron • Accommodated Standardized Test • Promotes fairness in testing? Or • Provides an unfair advantage to some examinees? What do the Standards for Educational and Psychological Testing say on this issue?

  7. Standards for Educational and Psychological Testing • Standard 10.1: “In testing individuals with disabilities, test developers, test administrators, and test users should take steps to ensure that the test score inferences accurately reflect the intended construct rather than any disabilities and their associated characteristics extraneous to the intent of the measurement” (AERA et al., p. 106).

  8. Standards for Educational and Psychological Testing • Standard 10.4: “If modifications are made or recommended by test developers. . . (unless) evidence of validity for a given inference has been established for individuals with the specific disabilities, test developers should issue cautionary statements in manuals or supplementary materials regarding confidence in interpretations based on such test scores” (AERA et al., p. 106).

  9. “Cautionary statements” • Flagging of test scores: Controversial—most research in this area focused on postsecondary and postgraduate admissions tests (Sireci, 2005). • How do states handle score reporting issues for accommodated and alternate assessments?

  10. Accommodated Tests and Accommodated Test Administrations have the Potential to Undermine Validity in at Least 2 Ways: • Construct underrepresentation • Construct-irrelevant variance • As stated by Messick (1989): • “Tests are imperfect measures of constructs because they either leave out something that should be included…or else include something that should be left out, or both” (p. 34)

  11. When standardized tests are NOT accommodated for SWD • Construct-irrelevant variance can interfere with test performance • e.g., the ability to see, hear, or focus interferes with measurement of math or reading proficiency • When standardized tests ARE accommodated • Construct underrepresentation may occur • e.g., read-aloud for a reading assessment

  12. What methods do states use to minimize construct-irrelevant variance, while maintaining construct representation?

  13. Categories of Accommodations • Presentation • Timing • Response • Setting (Thompson, Blount, and Thurlow, 2002)

  14. Presentation Accommodations • Oral (read-aloud, audiocassette) • Paraphrasing • Technological • Braille/large print • Sign language interpreter • Encouragement (redirecting) • Cueing • Spelling assistance • Use of manipulatives

  15. Timing Accommodations • Extended time • Multiple days/sessions • Separate sessions
      Timing accommodations are not so much an issue on state standards-based assessments because most have generous time limits.

  16. Response Accommodations • Scribe • Booklet versus answer sheet • Marking booklet to maintain place • Transcription
      Setting Accommodations • Individual administration • Administration in a separate room

  17. Other Accommodations • Alternate assessment • Others?

  18. Psychometric Research on Test Accommodations Has Focused On • Has the accommodation changed the construct measured? • Speed • Different skill • Do accommodations help only those who need them? • Interaction hypothesis • Do test scores from accommodated and non-accommodated administrations have the same meaning?

  19. Research on test accommodations for individuals with disabilities: • Little empirical study • Some literature reviews • Willingham et al. (1988) • Chiu & Pearson (1999) • Tindal & Fuchs (2000) • Pitoniak & Royer (2001) • Thompson et al. (2002) • Bolt & Thurlow (2004) • Sireci, Scarpati, & Li (2005) • Psychometric issues (Geisinger, 1994) • Legal issues (Phillips, 1994) • Also: Keeping Score for All (Koenig & Bachman, 2004)

  20. Sireci, Scarpati, & Li (2005) Research Questions • Do test accommodations improve the scores of students with disabilities (SWD)? • If so, do such score gains reflect increased validity or unfair advantage? • Interaction hypothesis • What specific types of accommodations are best for specific types of students?

  21. Interaction Hypothesis

  22. MacArthur & Cavalier (2004) “Differential impact on students with and without disabilities provides evidence that the accommodation removes a barrier based on disability” (p. 55).

  23. Fletcher et al. (2006) “Because the source of variance is fundamentally irrelevant to the measurement of the construct, a valid accommodation will improve performance only for students with a disability” (p. 138).
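
The interaction hypothesis on slides 21 to 23 can be made concrete as a comparison of score gains. The following is a minimal Python sketch, assuming a 2 x 2 design (SWD vs. non-SWD, accommodated vs. standard administration). The function name and every score are hypothetical, invented for illustration, and not taken from any study cited in this presentation. It compares mean gains only and is not a significance test.

```python
from statistics import mean


def differential_boost(scores):
    """Return (gain_swd, gain_non_swd, differential_boost).

    `scores` maps (group, condition) to a list of test scores, where
    group is "SWD" or "non-SWD" and condition is "standard" or
    "accommodated". Hypothetical helper, not from the presentation.
    """
    gain_swd = mean(scores[("SWD", "accommodated")]) - mean(scores[("SWD", "standard")])
    gain_non = mean(scores[("non-SWD", "accommodated")]) - mean(scores[("non-SWD", "standard")])
    # Differential boost: how much more SWD gain than non-SWD.
    return gain_swd, gain_non, gain_swd - gain_non


# Hypothetical scores, invented for illustration only.
scores = {
    ("SWD", "standard"): [10, 12, 11, 13],
    ("SWD", "accommodated"): [15, 17, 16, 18],
    ("non-SWD", "standard"): [20, 22, 21, 23],
    ("non-SWD", "accommodated"): [20, 23, 21, 24],
}

gain_swd, gain_non, boost = differential_boost(scores)
print(f"SWD gain: {gain_swd}, non-SWD gain: {gain_non}, differential boost: {boost}")
```

Under the strict reading quoted from Fletcher et al. (2006), a valid accommodation would show a non-SWD gain near zero; under the MacArthur & Cavalier (2004) reading, a clearly positive differential boost is the evidence of interest. In practice either pattern would need an appropriate statistical test of the group-by-condition interaction.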

  24. Are there any general conclusions regarding effects? • Extended time seems to help and it helps SWD more than non-SWD. • Oral accommodations show promise (math), but less uniformity across studies. Effects are considered unclear.

  25. Review Process • ERIC and PsycINFO searches • E-mails to researchers in this area

  26. Structure of review • Dimension 1: SWD or ELL • Dimension 2: Type of accommodation • Dimension 3: Experimental or non-experimental study
      Note that the review was primarily conducted in 2003 and so the results are somewhat dated. We have, however, reviewed additional research since then.

  27. Characteristics of Studies
      Research Design | Study Focused On: SWD | Study Focused On: ELL | Total
      Experimental | 13 | 8 | 21
      Quasi-experimental | 2 | 4 | 6
      Non-experimental | 10 | 1 | 11
      Total | 25 | 13 | 38
      Studies pertaining exclusively to ELL will not be discussed in this presentation.

  28. Types of Accommodations
      Type(s) of Accommodation | # of Studies
      Presentation:
      Oral* | 23
      Paraphrase | 2
      Technological | 2
      Braille/Large Print | 1
      Sign Language | 1
      Encouragement | 1
      Cueing | 1
      Spelling assistance | 1
      Manipulatives | 1
      *Includes read aloud, audiotape, or videotape, and screen-reading software.
      Note: Literature reviews and issues papers are not included in this table.

  29. Types of Accommodations
      Type(s) of Accommodation | # of Studies
      Timing:
      Extended time | 14
      Multi day/sessions | 1
      Separate sessions | 1
      Response:
      Scribes | 2
      In booklet vs. answer sheet | 1
      Mark task book to maintain place | 1
      Transcription | 1
      Setting (separate room) | 1
      Note: Literature reviews and issues papers are not included in this table.

  30. Characteristics of Studies • Most of the studies focused on elementary school (2/3 between grades 3 and 8). • Only 41% were published in peer-reviewed journals.

  31. Results: Extended Time • Most common findings were gains for both SWD and non-SWD. • Contrast Camara et al. (1998) with Bridgeman et al. (in press) • Most studies of extended time (6 of 8) looked at students with learning disabilities (SWLD)

  32. Summary of Studies on Extended Time (1)

  33. Summary of Studies on Extended Time (2)

  34. Results: Oral • Results depend on subject • Gains for SWD only in Math • No differential gain in other subject areas • Tends to support oral accommodation for math tests

  35. Study | Subject | Design | Results | H1?
      Weston (2002) | Math | Experimental (b/w and w/in groups) | Greater gains for SWD | Yes
      Tindal, Heath, et al. (1998) | Math | Experimental (b/w and w/in groups) | Sig. gain for SWD only | Yes
      Calhoon, Fuchs, & Hamlett (2000) | Math | Experimental (w/in group) | Sig. gains for oral accom., no differences b/w teacher & computer | Yes
      Johnson (2000) | Math | Experimental (b/w group) | Greater gains for SWD | Yes
      Huynh, Meyer, & Gallant (2004) | Math | Ex post facto | Accommodated SWD > matched non-accom. SWD | Yes
      Helwig & Tindal (2003) | Math | Quasi-experimental | Teachers not accurate in predicting benefit; no gains for either group | No
      Meloy, Deville, & Frisbie (2000) | Science, Math, Reading | Experimental (b/w and w/in groups) | Similar gains for SWD and non-SWD | No

  36. Oral (continued)
      Study | Subject | Design | Results | H1?
      Brown & Augustine (2001) | Science, Social Studies | Experimental (b/w and w/in groups) | No gain | No
      Kosciolek & Ysseldyke (2000) | Reading | Quasi-experimental | SWD had greater gains, but not statistically significant | No
      McKevitt & Elliott (2003) | Reading | Experimental (b/w and w/in groups) | No sig. effect size differences b/w accom. & standard conditions for either group | No

  37. More Recent Research • Extended time • Cohen, Gregg, & Deng (2005) • Wainer, Bridgeman, Najarian, & Trapani (2004) • Oral • Fletcher, Francis, Boudousquie, Copeland, Young, Kalinowski, & Vaughn (2006) • Dictation software • MacArthur & Cavalier (2004)

  38. Cohen, Gregg, & Deng (2005) • Looked at groups of students with and without accommodations and their performance on specific types of math items using differential item functioning methods • Accommodation status “only marginally related to the pattern of accommodation-related DIF” • Different types of students benefited from the extra time • DIF not due to accommodations, but to differences in students’ performance across different types of math items

  39. Cohen, Gregg, & Deng (2005) “Accommodations are more appropriately viewed as leveling the playing field; they do not supply the knowledge necessary to pass tests” (p. 231).
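
Slide 38 refers to differential item functioning (DIF) methods without naming a specific one. As an illustration only, the sketch below uses the Mantel-Haenszel common odds ratio, a widely used DIF statistic; it is not necessarily the method Cohen, Gregg, & Deng (2005) applied. The function names, the choice of non-accommodated examinees as the reference group, and all counts are assumptions made for this example.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for one item.

    `strata` is a list of tuples, one per matched total-score level:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    Here "reference" is assumed to mean non-accommodated examinees and
    "focal" accommodated examinees (an assumption for illustration).
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return num / den


def mh_d_dif(alpha):
    """ETS delta-scale transform; negative values mean the item is
    relatively harder for the focal group."""
    return -2.35 * math.log(alpha)


# Hypothetical counts, invented for illustration only.
no_dif_item = [(20, 20, 10, 10), (30, 10, 15, 5), (36, 4, 18, 2)]
dif_item = [(30, 10, 10, 10)]

alpha_no_dif = mantel_haenszel_odds_ratio(no_dif_item)
alpha_dif = mantel_haenszel_odds_ratio(dif_item)
print(alpha_no_dif, mh_d_dif(alpha_no_dif))
print(alpha_dif, mh_d_dif(alpha_dif))
```

An odds ratio near 1 means that examinees matched on total score have similar odds of answering the item correctly regardless of accommodation status. Matching on total score is what lets a DIF analysis separate item-level differences from overall group differences in proficiency.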

  40. Wainer et al. (2004) • Reanalysis of Bridgeman, Trapani, & Curley (2004) data • Evaluated extended time by shortening experimental sections of SAT • Little difference for verbal (about 5-point gain) • Big difference for quantitative • about 10-30 points, with larger gain associated with larger time extension • Largest gains for highest-scoring students

  41. Wainer et al. (2004) • Looked at correlations b/w scores from standard and extended time with students’ HS math grades • Claimed no relationship, but results (correlations and sample sizes) were not reported! • Important idea to look at external validity criterion

  42. Wainer et al. (2004) • Claim that results support not flagging verbal, but should flag quantitative • Don’t acknowledge presence of undesired speededness • SWD not included in study • Hard to agree with conclusions • Supports increasing time limit on SAT-Q
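
The external-criterion check described on slide 41 amounts to comparing validity coefficients across timing conditions. A minimal Python sketch, assuming the criterion is a list of high school math grades paired with test scores; the function name and all data are hypothetical and are not results from Wainer et al. (2004). It returns the sample size alongside the correlation, since the slide notes that both went unreported.

```python
import math


def validity_coefficient(test_scores, criterion):
    """Pearson correlation between test scores and an external
    criterion, returned together with the sample size n."""
    n = len(test_scores)
    if n != len(criterion) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, criterion))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant data")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical data, invented for illustration only.
hs_math_grades = [2.0, 2.5, 3.0, 3.5, 4.0]
standard_time_scores = [420, 470, 500, 560, 590]
extended_time_scores = [450, 480, 540, 570, 640]

r_standard, n_standard = validity_coefficient(standard_time_scores, hs_math_grades)
r_extended, n_extended = validity_coefficient(extended_time_scores, hs_math_grades)
print(f"standard: r={r_standard:.3f} (n={n_standard})")
print(f"extended: r={r_extended:.3f} (n={n_extended})")
```

Similar coefficients under both conditions would suggest that extra time does not change how scores relate to the criterion; a formal comparison would also need a test for the difference between correlations.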

  43. Fletcher et al. (2006) • Experimental study involving Grade 3 students with (n=91) and without (n=91) decoding difficulties associated with dyslexia • Oral vs. standard accommodation reading test (Texas)

  44. Fletcher et al. (2006) • Accommodation targeted for specific disability • Oral reading of proper nouns, comprehension stems, & answer choices • Designed to reduce the impact of word recognition difficulties

  45. Fletcher et al. (2006) • Results • Significant group/accommodation interaction • Only SWD benefited from the accommodation • Seven times greater likelihood of passing the test with the accommodation
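
The slide does not say whether "seven times greater likelihood" is an odds ratio or a ratio of pass rates, and the two can differ substantially. The Python sketch below computes both from pass/fail counts. The function name and the counts are hypothetical, chosen only to show the distinction; they are not the Fletcher et al. (2006) data.

```python
def pass_rate_comparison(passed_acc, n_acc, passed_std, n_std):
    """Return (risk_ratio, odds_ratio) for passing with vs. without
    the accommodation. Hypothetical helper for illustration."""
    failed_acc = n_acc - passed_acc
    failed_std = n_std - passed_std
    if passed_std == 0 or failed_acc == 0:
        raise ValueError("ratio undefined with a zero cell")
    # Ratio of pass rates (risk ratio).
    risk_ratio = (passed_acc / n_acc) / (passed_std / n_std)
    # Ratio of odds of passing (odds ratio), from the 2 x 2 counts.
    odds_ratio = (passed_acc * failed_std) / (failed_acc * passed_std)
    return risk_ratio, odds_ratio


# Hypothetical counts: 28 of 40 pass with the accommodation,
# 10 of 40 pass under standard administration.
risk_ratio, odds_ratio = pass_rate_comparison(28, 40, 10, 40)
print(f"risk ratio: {risk_ratio:.2f}, odds ratio: {odds_ratio:.2f}")
```

With these invented counts the odds ratio is 7 while the pass rate is only 2.8 times higher, which is why it matters which statistic a "seven times" claim refers to.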

  46. MacArthur & Cavalier (2004) • Looked at accommodations for writing assessments • Experimental study: SWD (n=21), students w/o documented disability (n=10) • Three accommodation conditions: • hand-written • dictation to scribe • dictation to speech recognition software • 48 states allow dictation accommodation (17 exclude scores)

  47. MacArthur & Cavalier (2004) • Results: • Dictation improved writing scores for SWD, with scribe > speech recognition software > hand-written • Dictation did not improve scores for students w/o disability • No difference between student groups with respect to preference (hand vs. dictation)

  48. MacArthur & Cavalier (2004) • Caveat • Small n (21, 10) • Construct issue • Dictation okay if construct = “composing” • Not okay if construct = “writing”

  49. Research on Equivalence of Test Structure • One aspect of “construct equivalence” • Rock, Bennett, Kaplan, & Jirele (1988) • Tippets & Michaels (1997) • Huynh, Meyer, & Gallant (2004) • Huynh & Barton (2006) • Cook, Eignor, Sawaki, Steinberg, & Cline (2006)

  50. Research on Equivalence of Test Structure Results tend to support similarity of test structure across accommodated and standard test administrations (oral, extended time, various).
