
Workforce Readiness: Assessing Written English Skills for Business Communication


Presentation Transcript


  1. ECOLT, Washington DC, 29 October 2010. Workforce Readiness: Assessing Written English Skills for Business Communication. Alistair Van Moere, Masanori Suzuki, Mallory Klungtvedt (Pearson Knowledge Technologies)

  2. Development of a Workplace Writing Test • Background • Test & task design • Validity questions • Results • Conclusions

  3. Widely-used Assessments of Written Skills

  4. Widely-used Assessments of Written Skills: Needs Gap • Few authentic measures of writing efficiency • Lack of task variety • 2-3 weeks to receive scores • Only an overall score reported • Inflexible structure: only BULATS offers a writing-only test

  5. Needs Analysis • Interviewed 10 companies from 5 countries • Online questionnaire, 157 respondents • Multi-national companies • Business Process Outsourcing (BPO) companies • HR managers • Recruitment managers • Training managers

  6. Needs Analysis Results

  7. Needs Analysis Results [charts: Spoken Module and Written Module]

  8. Testing goals • Flexible testing: target desired skills • Speed and convenience: quick score turnaround (5 mins); computer-delivered, automatically scored • Workplace-relevant tasks: efficiency and appropriateness of written skills

  9. “Versant Pro” - Written Module • Example item: “Our return ( ) is detailed in the attached document for future reference.” • 45 mins, 5 tasks, 39 items

  10. Overall Score (20-80) • Grammar • Vocabulary • Organization • Voice & Tone • Reading Comprehension • Additional Information • Typing Speed • Typing Accuracy
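
Typing speed and accuracy are reported as additional information only. A minimal sketch of how such metrics can be computed, assuming the conventional words-per-minute definition (characters typed / 5, per minute) and character-level similarity against a reference passage; the actual Versant Pro metrics are not specified in these slides:

```python
import difflib

def typing_speed_wpm(typed: str, seconds: float) -> float:
    """Conventional WPM: one 'word' = 5 characters typed."""
    return (len(typed) / 5) / (seconds / 60)

def typing_accuracy(typed: str, reference: str) -> float:
    """Character-level similarity (0-1) between the typed text
    and a reference passage, via difflib's ratio."""
    return difflib.SequenceMatcher(None, typed, reference).ratio()

typed = "Our return is detailed in the attached document."
ref   = "Our return is detailed in the attached document."
print(typing_speed_wpm(typed, seconds=12.0))  # 48.0 WPM
print(typing_accuracy(typed, ref))            # 1.0
```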

  11. Development of a Workplace Writing Test • Background • Test & Task Design • Item specifications • Item development • Field testing • Rating scale design • Validity Questions • Results • Discussion

  12. Item Specifications Email Writing task with 3 themes: • Cognitively relevant • No specific business/domain knowledge required • Free of cultural/geographic bias • Elicits opportunities to demonstrate tone, voice, organization • Control for creativity • Constrain topic of responses for prompt-specific automated scoring models

  13. Item Development • Source material: texts modeled on actual workplace emails; situations inspired by workplace communication • Word lists: General English from the Switchboard Corpus (~8,000 most frequent words); Business English from 4 corpus-based business word lists (~3,500 most frequent words) • Expert review: internal reviews by test developers; external reviews by subject matter experts
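
The word-list constraint lends itself to an automated check during item development. A minimal sketch, using small illustrative stand-ins for the real Switchboard-based and business word lists (which are not reproduced here):

```python
import re

# Illustrative stand-ins for the real lists (~8,000 general +
# ~3,500 business words); load the actual lists in practice.
GENERAL_ENGLISH = {"our", "return", "is", "in", "the", "for", "future"}
BUSINESS_ENGLISH = {"attached", "document", "reference", "detailed"}
ALLOWED = GENERAL_ENGLISH | BUSINESS_ENGLISH

def flag_off_list_words(prompt: str) -> list[str]:
    """Return draft-prompt words not covered by the word lists."""
    tokens = re.findall(r"[a-z']+", prompt.lower())
    return sorted({t for t in tokens if t not in ALLOWED})

draft = "Our return is detailed in the attached document for future reference."
print(flag_off_list_words(draft))  # [] -> prompt stays within the lists
```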

  14. Rating Scales: Passage Reconstruction and Email Writing

  15. Field Testing • 51 countries, 58 L1s • Period: August 2009 – November 2009 • Other countries include: France, Spain, Italy, Costa Rica, Russia, Iraq, Taiwan, the Czech Republic, Colombia, Yemen, Iran, Malaysia, Vietnam, Thailand, Venezuela, Nepal, etc.

  16. Validity Questions • Do the tasks elicit performances which can be scored reliably? • Rater reliability • Generalizability? • Does the rating scale operate effectively? • Do the traits tap distinct abilities? • Are the bands separable? • What is the performance of machine scoring? • Reliability • Correlation with human judgments

  17. Validity Questions • Do the tasks elicit performances which can be scored reliably? • Rater reliability • Generalizability? • Does the rating scale operate effectively? • Do the traits tap distinct abilities? • Are the bands separable? • What is the performance of machine scoring? • Reliability • Correlation with human judgments

  18. Rater Reliability: Email Writing and Passage Reconstruction (21,200 ratings, 9 raters)
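
The reliability figures themselves were shown as a table. As a minimal sketch of one common rater-reliability statistic, here is a Pearson correlation over paired ratings; the slide does not state which coefficient was used, and the ratings below are hypothetical:

```python
import math

def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two raters' scores on the
    same set of responses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical ratings on a 0-6 scale for six email responses
rater_a = [2, 3, 4, 4, 5, 6]
rater_b = [2, 4, 4, 5, 5, 6]
print(round(pearson_r(rater_a, rater_b), 2))  # ~0.93
```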

  19. Generalizability Coefficients (n = 2,118 test takers × 4 prompts × 2 ratings)
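
For a persons × prompts × ratings design like this one, the relative generalizability coefficient divides person variance by person variance plus the averaged interaction and residual components. A minimal sketch with placeholder variance components (the study's estimated components are not shown here); in practice the components come from an ANOVA of the full rating design:

```python
def g_coefficient(var_p, var_pt, var_pr, var_ptr_e, n_t, n_r):
    """Relative G coefficient for a crossed p x t x r design:
    person variance over person variance plus person-by-facet
    interaction (and residual) variance, averaged over facets."""
    rel_error = var_pt / n_t + var_pr / n_r + var_ptr_e / (n_t * n_r)
    return var_p / (var_p + rel_error)

# Placeholder variance components; n matches the slide's design
# of 4 prompts and 2 ratings per response.
print(round(g_coefficient(var_p=1.00, var_pt=0.20, var_pr=0.05,
                          var_ptr_e=0.40, n_t=4, n_r=2), 2))  # ~0.89
```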

  20. Validity Questions • Do the tasks elicit performances which can be scored reliably? • Rater reliability • Generalizability? • Does the rating scale operate effectively? • Do the traits tap distinct abilities? • Are the bands separable? • What is the performance of machine scoring? • Reliability • Correlation with human judgments

  21. Email Writing [figure: ratings by rater, raters 1-5]

  22. Inter-correlation matrix
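
The matrix itself was shown as a table, but a trait inter-correlation matrix is computed directly from the subscore columns. A minimal sketch with hypothetical subscores for five test takers:

```python
import numpy as np

# Rows = test takers, columns = trait subscores
# (e.g., grammar, vocabulary, organization, voice & tone).
subscores = np.array([
    [55, 52, 50, 48],
    [62, 60, 58, 61],
    [40, 45, 42, 39],
    [70, 68, 71, 66],
    [48, 50, 47, 52],
])

# rowvar=False: correlate columns (traits) across test takers
print(np.round(np.corrcoef(subscores, rowvar=False), 2))
```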

  23. Validity Questions • Do the tasks elicit performances which can be scored reliably? • Rater reliability • Generalizability? • Does the rating scale operate effectively? • Do the traits tap distinct abilities? • Are the bands separable? • What is the performance of machine scoring? • Reliability • Correlation with human judgments

  24. Subscore reliability
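
Subscore reliability is commonly estimated with an internal-consistency statistic such as Cronbach's alpha; the slide does not state which estimator was used. A minimal sketch with hypothetical item scores feeding one subscore:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha: rows = test takers, columns = items
    contributing to one subscore."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical item scores for four test takers, three items
items = np.array([[3, 4, 3], [5, 5, 4], [2, 2, 3], [4, 5, 5]])
print(round(cronbach_alpha(items), 2))  # ~0.91
```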

  25. Email items - Machine score vs. human score [scatter plot: Email Machine Score against Email Human Rating]

  26. Versant Pro - Machine score vs. human score [scatter plot: Overall Machine Score against Overall Human Score]

  27. Machine score vs. CEFR judgments [scatter plot: Versant Pro Machine Score against Human CEFR Estimate; 6 panelists, IRR = 0.96]

  28. Limitations/Further work • Predictive validity • Concurrent validity • Score use in specific contexts • Dimensionality (factor analysis, SEM) • Constructs not assessed or under-represented • Explanation/limitations of machine scoring

  29. Conclusion • Automatically-scored test of workplace written skills: modular and flexible; short (45 mins); 5-min score turnaround; job-relevant; with task variety • A common shortfall in task design, written and spoken, is the amount of planning time and execution time allowed • Shorter, more numerous, real-time tasks are construct-relevant, efficient, and reliable.

  30. Thank you alistair.vanmoere@pearson.com
