
Assessment Training


Presentation Transcript


  1. Assessment Training Nebo School District

  2. Assessment Literacy

  3. Test Acronyms • CRT - Criterion Referenced Test (grades 1-11) • IOWA - Iowa Test of Basic Skills and Iowa Test of Educational Development (grades 3, 5, 8, & 11) • UBSCT - Utah Basic Skills Competency Test (grades 10-12) • DWA - Direct Writing Assessment (grades 6 & 9) • UAA - Utah Alternate Assessment (grades 1-12, students with severe cognitive disabilities) • UALPA - Utah Academic Language Proficiency Assessment (grades 1-12, English Language Learners)

  4. Norm-Referenced Tests • Standardized tests • Scores are interpreted in comparison to a specific norm group • Percentile scores are the most common measure of achievement • Percentile scores range from the 1st to the 99th, with the 50th percentile representing the national average • The ITBS and ITED (IOWA) tests are the state-adopted norm-referenced assessments

  5. Criterion-Referenced Tests • Standardized Tests • Every question/item is aligned to an explicitly stated educational objective • Used to identify which standards and objectives have been mastered by the examinee • CRT or End-of-Level tests in Language Arts, Math, and Science

  6. Summative Assessment • Used to determine the students’ final understanding of material • State CRT tests are an example

  7. Formative Assessment • Used to identify the students’ understanding of material, to provide feedback for teachers and learning experiences for students • Benchmarks, UTIPS, Running Records, and Student Interviews are all included in this category

  8. Raw Score • The number of correct responses on a test • Example: a student who answered 48 questions correctly has a raw score of 48

  9. Percent Correct Score • The number of correct responses divided by the total number of items • 49 out of 70 = 70%
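
To make the arithmetic of slides 8-9 concrete, here is a minimal Python sketch using the numbers from slide 9; the variable names are ours, not part of the deck.

```python
# Raw score and percent-correct score (slides 8-9).
raw_score = 49      # number of items answered correctly
total_items = 70    # total number of items on the test

percent_correct = raw_score / total_items * 100
print(f"Raw score: {raw_score}")                    # 49
print(f"Percent correct: {percent_correct:.0f}%")   # 49 / 70 = 70%
```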

  10. Percentile Score • The percent of examinees who scored lower on the test • 75th percentile: 75% of examinees scored lower on the test than this examinee
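
A percentile score follows directly from the definition above. The sketch below is illustrative only; the norm-group scores are invented, and only the "percent who scored lower" rule comes from the slide.

```python
# Percentile score: the percent of examinees who scored lower (slide 10).
def percentile_rank(norm_scores, student_score):
    """Return the percent of norm_scores strictly below student_score."""
    below = sum(1 for s in norm_scores if s < student_score)
    return 100 * below / len(norm_scores)

norm_scores = [42, 48, 51, 55, 55, 58, 61, 64, 67, 70]  # hypothetical norm group
print(f"{percentile_rank(norm_scores, 64):.0f}th percentile")  # 7 of 10 scored lower -> 70th
```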

  11. Scaled Score • The student's performance is reported on an arbitrary numerical scale (the scale can even be alphabetical) • A scaled score provides comparable information about student performance across different years and different tests

  12. ACT • What is a 36? • What is a 28? • What is a 12? • These numbers only have meaning because of the value we place on points along the scale • Often others, such as colleges, help set that value • Utah State University and the University of Utah, for example, require a score of at least 18

  13. Scaled Scores • ACT scores range from 10-36; 18-28 is considered proficient, depending on the school • Advanced Placement tests range from 1-5; 3 is proficient • UBSCT and CRT scores range from 100-200; 160 is proficient

  14. Scaled Scores • Scaled scores simplify the reporting of results • Score reporting can be common across levels and across tests • No more subject-specific percentage cut scores • Far greater comparability between tests and years

  15. Scaled Scores • CRTs and UBSCT use a cut score of 160 • Each proficiency level has its own cut score • Proficiency levels range from 1-4 in NCLB and 1a-4 in UPASS (We will discuss this in the next session)
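
The reporting scheme in slides 13-15 can be sketched as a simple score pipeline. Only the 100-200 range, the 160 proficient cut, and the 1-4 levels come from the slides; the linear slope/intercept and the other level cuts below are invented placeholders (the real CRT/UBSCT scaling is derived statistically for each test form).

```python
# A simplified sketch of scaled-score reporting (slides 13-15).
def to_scaled(raw, slope=1.4, intercept=75.0):   # slope/intercept are made up
    scaled = round(slope * raw + intercept)
    return max(100, min(200, scaled))            # clamp to the 100-200 reporting range

def proficiency_level(scaled, cuts=(130, 160, 180)):  # 160 from slide 15; others invented
    """Levels 1-4; reaching the 160 cut means proficient."""
    return 1 + sum(scaled >= c for c in cuts)

for raw in (40, 55, 61, 70):
    s = to_scaled(raw)
    print(f"raw {raw} -> scaled {s} -> level {proficiency_level(s)}")
```

Note how the same scaled cut (160) marks proficiency no matter how many raw points it corresponds to on a given form, which is exactly the reporting simplification slide 14 describes.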

  16. Example • If John has a raw score of 65 in 2004 and a raw score of 58 in 2005, does this show a decrease in performance? • If John has a scaled score of 165 in 2004 and a scaled score of 155 in 2005, does this show a decrease in performance?

  17. Why Not Raw Scores • Most states do not release raw scores • Looking at raw scores can lead to incorrect conclusions • It is incorrect to compare raw scores from one year to those of the next • It is incorrect to compare the raw scores of one test to those of another

  18. Equating: Career Home Runs

  19. Who Is The Greatest? • Individual factors: ability, strength, skill, technique, knowledge • Difficulty of the game: tightly wound baseballs, improved bats, a higher pitcher's mound, changes in season length, steroids

  20. Comparisons • Impossible to compare Barry Bonds with Babe Ruth • Impossible to compare a game in 1914 to a game in 2006

  21. Comparisons • Possible to compare John's ability on the 2005 Language Arts CRT with John's ability on the 2006 Language Arts CRT (scaling) • Possible to compare the difficulty of the 2005 Language Arts CRT to the 2006 CRT (equating)

  22. Equating • A statistical process that adjusts scores from different tests so they can be treated as equal in difficulty • It disentangles differences in test difficulty from differences in student ability

  23. Equating • Common (anchor) items are shared between test forms • The anchor items are compared statistically to establish the forms' relative difficulty • This statistical process ensures that results are accurately comparable from test to test and are not subject to fluctuations caused by unintentional changes in item difficulty

  24. Equating • [Diagram: Form X and Form Y, each containing a shared set of anchor items]

  25. Anchor Items • It is the performance on the two sets of anchor items across years that allows us to make interpretations about the relative difficulty of the non-anchor items • If student performance on the anchor items is the same, we conclude that student achievement is the same • If student performance on the anchor items increases, we interpret that student achievement increased • If student performance on the anchor items decreases, we interpret that student achievement decreased • We use this information to judge the difficulty of the non-anchor items
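
Slides 22-25 outline common-item equating without giving the math. Below is a minimal sketch of one classical approach, linear equating from the anchor-item statistics (a mean-sigma style adjustment); operational state tests typically use IRT-based methods, and every number here is invented for illustration.

```python
# Linear equating through common anchor items (slides 22-25).
from statistics import mean, stdev

# Proportion correct on the SAME anchor items, as observed on each form.
anchor_on_x = [0.62, 0.55, 0.71, 0.48, 0.66]   # Form X administration (e.g., 2005)
anchor_on_y = [0.58, 0.50, 0.67, 0.44, 0.61]   # Form Y administration (e.g., 2006)

def equate_y_to_x(score_y, ax, ay):
    """Map a Form Y score onto the Form X scale using the anchor statistics."""
    a = stdev(ax) / stdev(ay)      # match the spread of the anchor statistics
    b = mean(ax) - a * mean(ay)    # match the mean of the anchor statistics
    return a * score_y + b

# The same items look slightly harder on Form Y, so a Form Y score is
# adjusted upward before being compared with Form X scores.
print(f"Form Y score 0.55 ~ Form X score {equate_y_to_x(0.55, anchor_on_x, anchor_on_y):.2f}")
```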

  26. Why Equate • Scores may differ because one test is more difficult than another • Or because one group of examinees is more able than another • Or both; equating separates the two
