
Moving beyond the psychometric discourse: A model for programmatic assessment


Presentation Transcript


  1. Moving beyond the psychometric discourse: A model for programmatic assessment
  Researching Medical Education
  Association for the Study of Medical Education
  London, 23 November 2010
  Cees van der Vleuten, Maastricht University, The Netherlands
  PowerPoint: www.fdg.unimaas.nl/educ/cees/asme

  2. The first step is to measure whatever can be easily measured. This is ok as far as it goes. The second step is to disregard that which can't be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can't be measured easily really isn't important. This is blindness. The fourth step is to say that what can't be easily measured really doesn't exist. This is suicide. —Charles Handy, The Empty Raincoat, page 219.

  3. McNamara's Fallacy
  • The McNamara fallacy is named after Robert McNamara, US Secretary of Defense from 1961 to 1968, and his belief that the body count was a good way of measuring how the Vietnam War was going.
  • As long as more Vietcong were being killed than US forces, the war was being won.

  4. Programmatic assessment
  • A planned arrangement of individual assessments in a learning program
  • Quality compromises are made for individual methods/data points, but not for the program as a whole
  • Programmatic assessment is fit for purpose:
    • Assessment for learning
    • Robust decision making about a learner's performance
    • Improving the learned curriculum

  5. Programmatic assessment so far
  • Proposed in 2005 (Van der Vleuten & Schuwirth, 2005)
  • A multitude of quality criteria and a self-assessment instrument for program quality (Baartman et al., 2006, 2007)
  • A design framework (Dijkstra et al., 2009) and design guidelines (Dijkstra et al., in preparation)
  • A theoretical model of programmatic assessment for a program in action is still needed

  6. Pillars of the proposed model (1)¹
  • Any point measurement or single data point of assessment is flawed
  • Standardized methods of assessment can have 'built-in' validity:
    • Quality control around test construction and administration
    • Assessment 'technology' is available
  • Validity in unstandardized methods lies in the users of the instruments more than in the instruments
  ¹ Theoretical/empirical account: Van der Vleuten, C. P., Schuwirth, L. W., Scheele, F., Driessen, E. W., & Hodges, B. (2010). The assessment of professional competence: building blocks for theory development. Best Pract Res Clin Obstet Gynaecol, 24, 703-719.

  7. Pillars of the proposed model (2)¹
  • Assessment drives learning:
    • Requires richness or meaningfulness of data
    • Qualitative, narrative information carries a lot of weight
    • Theoretical understanding is emerging (Cilliers et al., 2010, under editorial review)
  • Stakes (from formative to summative assessment) form a continuum
  • The number of data points needs to be proportional to the stakes (illustrated in the sketch after this slide)
  • Expert judgment is imperative for assessing complex competencies and when diverse information must be combined:
    • Sampling strategies can reduce random error
    • Procedural strategies can reduce bias
  ¹ Van der Vleuten et al., 2010
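The claim that sampling reduces random error, and hence that the number of data points should be proportional to the stakes, can be made concrete with a small simulation. The sketch below is not part of the talk; the ability value, the noise level, and the 0-10 scale are invented for illustration. It only shows that the random error left in an aggregate score shrinks roughly with the square root of the number of flawed individual data points.

```python
import random
import statistics

TRUE_ABILITY = 6.5   # hypothetical stable ability of one learner (0-10 scale)
NOISE_SD = 1.5       # spread of a single flawed data point around that ability
TRIALS = 5000        # simulated assessment programs per sample size

def observed_aggregate(n_points: int) -> float:
    """Aggregate n flawed individual data points into one mean score."""
    points = [random.gauss(TRUE_ABILITY, NOISE_SD) for _ in range(n_points)]
    return statistics.mean(points)

for n in (1, 4, 16, 64):
    aggregates = [observed_aggregate(n) for _ in range(TRIALS)]
    spread = statistics.stdev(aggregates)   # random error left after aggregation
    print(f"{n:3d} data points -> typical error of the aggregate ~ {spread:.2f}")
```

A single data point can miss the learner's level by well over a point on this scale, while an aggregate of a few dozen points rarely does, which is why the model reserves high-stakes decisions for moments when many data points are available.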

  8. Artifacts of learning
  • Outcome artifacts: products of learning tasks
  • Process artifacts: learning or working activities
  • Example training activities: learning task, PBL case, patient encounter, operation, project, lecture, self-study
  [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]

  9. Individual data points of assessment
  • Certification of mastery-oriented learning tasks (e.g. resuscitation, normal delivery of an infant)
  • Fit for purpose
  • Multiple/all levels of Miller's pyramid
  • Learning oriented: information-rich, meaningful documentation (quantitative and qualitative)
  • Low stakes
  [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]

  10. (P)Reflective activity by the learner
  • Interpretation of feedback
  • Planning new learning objectives and tasks
  • Supportive social interaction: coaching/mentoring/supervision, peer interaction (intervision)
  [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]

  11. Firewall dilemma
  • The dilemma between giving access to rich information and compromising the relationship between the supporting person(s) and the learner
  • Intermediate evaluation:
    • Aggregate information held against a performance standard
    • Committee of examiners
    • Decision making: diagnostic, therapeutic, prognostic
    • Remediation oriented, not repetition oriented
    • Informative
    • Longitudinal
    • Intermediate stakes
  [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]

  12. [Diagram: timeline with rows for training activities, assessment activities, and supporting activities, with multiple assessment moments marked over time]

  13. Final evaluation
  • Aggregate information held against a performance standard
  • Committee of examiners
  • Pass/fail(/distinction): a high-stakes decision
  • Based on many data points and rich information
  • Decision trustworthiness optimized through procedural measures, inspired by strategies from qualitative methodology (a toy sketch of such an aggregation follows this slide)
  • High stakes
  [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]
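To make the structure of such a final evaluation concrete, here is a minimal sketch, not taken from the talk: the standards, the 0-10 scale, and the minimum-evidence rule are invented assumptions. It only shows the skeleton of holding an aggregate of many low-stakes data points against a performance standard, and of deferring a high-stakes call when too little evidence is available; the committee's reading of the narrative information is expert judgment that code cannot stand in for.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class DataPoint:
    score: float     # quantitative component on an assumed 0-10 scale
    narrative: str   # qualitative, information-rich feedback read by the committee

# Hypothetical standards and evidence rule -- not specified in the talk.
PASS_STANDARD = 5.5
DISTINCTION_STANDARD = 8.0
MIN_POINTS_HIGH_STAKES = 20   # number of data points proportional to the stakes

def committee_decision(points: list[DataPoint]) -> str:
    """Hold the aggregate of many low-stakes data points against a standard."""
    if len(points) < MIN_POINTS_HIGH_STAKES:
        # Too little evidence for a high-stakes call: defer rather than decide.
        return "defer: collect more data points"
    aggregate = mean(p.score for p in points)
    if aggregate >= DISTINCTION_STANDARD:
        return "distinction"
    return "pass" if aggregate >= PASS_STANDARD else "fail (remediation-oriented)"

# Toy usage: 24 data points with the same hypothetical score and narrative.
points = [DataPoint(score=6.8, narrative="Good history taking; rushed the exam")
          for _ in range(24)]
print(committee_decision(points))   # -> pass
```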

  14. Paradigms of research

  Criterion       Quantitative approach   Qualitative approach
  Truth value     Internal validity       Credibility
  Applicability   External validity       Transferability
  Consistency     Reliability             Dependability
  Neutrality      Objectivity             Confirmability

  15. [Diagram: timeline with rows for training activities, assessment activities, and supporting activities]

  16. [Diagram: the same timeline of training, assessment, and supporting activities, shown twice]

  17. Risks
  • Resources/cost (do fewer things well rather than doing more, but poorly)
  • Bureaucracy, trivialization, reductionism
  • Legal restrictions
  • Novelty/the unknown

  18. Opportunities
  • Assessment for learning combined with rigorous decision making
  • A post-psychometric era of individual instruments
  • Theory-driven assessment design
  • Infinite research opportunities:
    • Multiple formalized models of assessment (e.g. psychometrics, Bayesian approaches to information gathering, new conceptions of validity, …)
    • Judgment (bias, expertise, learning, …)
    • How and why learning is facilitated (a theory of assessment driving learning)
    • …
