AUTHENTICITY AND THE ASSESSMENT OF MODERN LANGUAGE LEARNING
Research into an Exemplary Programme: The International Baccalaureate Diploma Programme, Group 2 Languages
Dr. John ISRAEL

Starting point for research
Based on an IB evaluation philosophy of:
• commonality of programme levels, tasks and criteria for assessment and evaluation, for any and all languages
• differentiation by programme, level, task and criterion descriptors (A2, B, ab initio, higher, standard), with no entry criteria
• authenticity as the goal in language reception and production
• minimal pedagogical intervention in editing authentically-sourced materials
• assessment and evaluation by descriptive criterion referencing, as “can do” performance, not “what is unknown or incorrect?”

Performance assessment and evaluation: some key concepts
• Assessment understood as qualitative: matching samples of performance to descriptions, by level and discrete criterion
• Evaluation understood as quantitative: assigning numerical value to assessments, as indicators of overall quality

Assessment criteria for Production, Languages B
These are categorised by:
• Language
• Cultural Interaction
• Message
With each criterion valued equally, at a maximum of 33.33% (oral and written)

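A minimal sketch of how the equal weighting above might be turned into an overall percentage. The per-criterion mark range (`max_mark`, set to 10 here) and the function name are assumptions made for illustration only; the slides state just that each of the three criteria carries at most one third of the total.

```python
# Hypothetical illustration of equal weighting across the three
# Language B production criteria. The 0-10 mark range is an assumption,
# not something stated in the presentation.
CRITERIA = ("language", "cultural_interaction", "message")


def overall_percentage(marks, max_mark=10):
    """Combine one mark per criterion into a percentage, each weighted 1/3."""
    weight = 100 / len(CRITERIA)  # one third of the total per criterion
    total = 0.0
    for criterion in CRITERIA:
        mark = marks[criterion]
        if not 0 <= mark <= max_mark:
            raise ValueError(f"{criterion}: mark {mark} outside 0..{max_mark}")
        total += (mark / max_mark) * weight
    return total


if __name__ == "__main__":
    # Placeholder marks, not taken from any real candidate.
    sample = {"language": 8, "cultural_interaction": 6, "message": 7}
    print(round(overall_percentage(sample), 2))
```

Because the weights are equal, a candidate who scores full marks on one criterion and nothing on the others can reach at most one third of the total, which is the balance across criteria that the scheme implies.
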
LANGUAGE: Written Production
General Criteria
To what extent does the candidate write the language fluently and accurately?
• How varied and accurate are the grammar and vocabulary used by the candidate?
• How clear are the sentence structures?
• To what extent is the candidate able to use complex structures?
• How accurate is the spelling or calligraphy?
• Has the candidate written the prescribed minimum number of words?
B Standard Level: Command of the language is good and effective.
• A range of grammar and vocabulary is used accurately despite some errors in more complex constructions.
• Some complex sentence structures are used clearly and effectively.
• Spelling/calligraphy is generally correct and clear.
• At least the prescribed minimum number of words has been written.
B Higher Level: Command of the language is very good and may show evidence of sophistication.
• A wide range of grammar and vocabulary is used accurately with few errors.
• Complex sentence structures are used effectively and skillfully.
• Spelling/calligraphy is almost always correct and clear.
• At least the prescribed minimum number of words has been written.

LANGUAGE: Oral Production
General Criteria
To what extent does the candidate speak the language fluently and accurately?
• How fluent is the language spoken by the candidate? (Fluency refers to ease of speaking.)
• How correct and idiomatic are the grammar and vocabulary used by the candidate?
• To what extent does intonation contribute to communication? (Intonation refers to the sounds and rhythms of the language that are essential for effective communication. It does not refer to accent. The candidate is not expected to sound like a native speaker of the language.)
B Standard Level: Command of the spoken language is good.
• The production of language is fluent.
• The use of grammar and vocabulary is generally correct, varied and idiomatic.
• The intonation contributes effectively to communication.
B Higher Level: Command of the spoken language is very good.
• The production of language is fluent and with a touch of authenticity.
• The use of grammar and vocabulary is varied and idiomatic, almost error free.
• The intonation contributes effectively and expressively to communication.

MESSAGE: Written Production
General Criteria
To what extent does the candidate communicate the message in a developed and organized manner?
• How relevant are the ideas presented by the candidate?
• How developed are the ideas?
• How appropriate are the supporting details?
• To what extent are the ideas organized into an overall plan?
B Standard Level: The message has been communicated well.
• The ideas are relevant.
• The development of ideas is methodical and thorough.
• Supporting details are appropriate.
• The organization of ideas is clear.
B Higher Level: The message has been communicated very well.
• The ideas are relevant and stimulating.
• The development of ideas is thorough and imaginative.
• Supporting details are appropriate and convincing.
• The organization of ideas is clear and flows well.

MESSAGE: Oral Production
General Criteria
To what extent is the candidate able to communicate ideas (or message) and maintain a coherent conversation?
• To what extent is the candidate able to convey complex ideas and opinions?
• How clearly, coherently and effectively are the ideas and opinions presented?
• How relevant and complete are the candidate’s responses?
• How coherent is the conversation?
B Standard Level: The candidate handles complex ideas well.
• Both simple and complex ideas and opinions are generally presented clearly, coherently and effectively.
• Responses are generally relevant and show some imagination.
• The conversation flows coherently.
B Higher Level: The candidate handles complex ideas very well.
• Both simple and complex ideas and opinions are presented clearly, coherently and vividly.
• Responses are relevant and show insight and imagination.
• A coherent conversation is maintained throughout.

CULTURAL INTERACTION: Written Production
General Criteria
To what extent does the candidate select language appropriate to the audience and type of text?
• How convincing and expressive is the text?
• How effective and appropriate is the choice of register and style to the task?
• Where appropriate, how varied and effective are the rhetorical devices? (Rhetorical devices include all techniques used to present the message more vividly, such as metaphor, exaggeration and repetition.)
• To what extent do structural elements contribute to the clarity of the text? (Structural elements include cohesive devices.)
B Standard Level: The text is clear and convincing.
• The choice of register and style is generally effective and appropriate to the task.
• Rhetorical devices appropriate to the type of text are generally effective and varied.
• Structural elements contribute to the clarity of the text.
B Higher Level: The text is convincing and expressive, with some imagination.
• The choice of register and style is consistently effective and appropriate to the task.
• Rhetorical devices appropriate to the type of text are effective, varied and imaginative.
• Structural elements contribute fully to the clarity of the text.

CULTURAL INTERACTION: Oral Production
General Criteria
To what extent does the candidate interact appropriately and successfully in the conversation?
• How sensitive and subtle is the candidate’s response to nuances and prompts? (Teachers should ensure that their participation in the exchange is sufficient, subtle and complex enough to enable the assessment of the candidate’s listening skills.)
• How actively and sensitively does the candidate contribute to the conversation?
• To what extent does the candidate speak spontaneously, or has the candidate rehearsed the conversation?
• How appropriate is the language to the subject and context? (Language refers to tone and register.)
B Standard Level: Interaction in conversation is successful.
• Responses in fairly complex exchanges show some sensitivity to subtlety, nuances and prompts.
• Contributions to the conversation are active and spontaneous.
• The language is generally appropriate to the subject and context.
B Higher Level: Interaction in conversation is very successful.
• Responses in complex exchanges show sensitivity and subtlety to nuances and prompts.
• Contributions to the conversation are active, spontaneous and sensitive to others.
• The language is consistently appropriate to the subject and context.

Assessment method for samples of production
By multiple, repeated moderation for reliability, as:
• ‘Subjective’ interpretation, through matching to criteria by experienced assessors, themselves moderated for consistency.
• Consensual, through multiple moderations by different assessors.
With statistically-determined moderation factors applied to the results of each assessor.
Open to appeal and repeats of the assessment process.

Research Hypotheses
An investigation of IB programmes and their assessment and evaluation practices, with particular reference to authenticity, understood in philosophical as well as purely linguistic terms and as categorised, for example, by Van Lier (1996), focussing on Language B programmes for:
• Task Design and Assessment Criterion Validity
• Assessment and Evaluation Reliability
• Credibility of Final Evaluations

Key research questions:
• In what ways and to what extent do the IB’s language programmes promote “authentic” language production in target languages?
• Can authenticity in communicative and interactive language productions be a yardstick for measurements, with acceptable reliability?
• Can validity, reliability and credibility in IB assessment and evaluation practices be enhanced, without compromise to a programme philosophy of authenticity?
• If so, with what implications for the teaching, learning and measurement processes?

How is “Authenticity” to be understood?
As essentially subjective consciousness in communicative interaction with other as object (in ‘bad faith’), or alternate subjectivity (in ‘good faith’), and/or self (through reflection), in (linguistically), culturally and socially definable contexts for specific purposes, including self-definition.
The processes involved are dialectical.
One example, very briefly summarised from J-P. Sartre (1946).

Van Lier’s categorisations
As a ‘triad’ of:
• Awareness (The Cartesian consciousness of Sartre)
• Autonomy (The Sartrean conceptualisation of ‘self’ and ‘other’)
• Authenticity (The unity of the above)

A triadic representation of interrelations between awareness, autonomy and authenticity, from the perspective of language use
A. Exposure to language (including quality of language and the receptivity of the individual).
B. Perception of social and linguistic interaction (i.e. the relation between the individual and exposure).
C. Processing of language (i.e. the social and cognitive transformations that lead to conscious activities of interpretation and purposeful linguistic interaction).
1. AWARENESS
2. AUTONOMY
3. AUTHENTICITY
(See Van Lier (1996), op. cit.)

Van Lier’s further categorisations
This ‘triad’ of:
• Awareness
• Autonomy
• Authenticity
may be linked to a further ‘triad’ of:
• Curricular authenticity
• Pragmatic authenticity
• Personal authenticity

Research criteria developed from Van Lier
For measuring communicative value in integrated performances, interactively combining comprehension and production, as:
Evidence for ‘Curricular Authenticity’
This is evidence for the quality of an individual’s creation and use of language, after exposure to the target environment. This major criterion is further categorised as:
• Creator authenticity, concerning ‘self’
• Creator authenticity, concerning the recognition of ‘other’
• Finder authenticity
• User authenticity

“Curricular authentication”
Evidence for ‘Curricular Authenticity’, measuring levels of (linguistic) awareness and autonomy
• Creator authenticity for assessing linguistic realisations of ‘self’: focussing attention on evidence for the personal and unique ‘voice’ of each producer of language.
• Creator authenticity, assessing perceptions of ‘other’ as interlocutor, audience or reader: focussing attention on evidence for attempts to motivate participation in communicative interchange through personal strategies or discrete tactics for retaining listeners’ or readers’ attention.
• Finder authenticity, or the resourcefulness of communicators in finding material for communication: focussing attention on evidence for recognisable agency in the selection and manipulation of specific objects of awareness, sourced from outside ‘self’, or meta-cognitively within ‘self’.
• User authenticity, or recognitions of ‘other’ as listener or reader: focussing attention on evidence for linguistic interaction through respect for commonly-acquired social traditions and communicative convention, facilitating initiations and continuations of communication. There is purposive response to set stimuli and to prompts sourced in the initiatives of ‘other’.

Research criteria developed from Van Lier
For measuring communicative value in integrated performances, interactively combining comprehension and production, as:
Evidence for ‘Pragmatic Authenticity’
This is evidence for appropriate, individual purposes in public language production, creating links with the physical, temporal and socio-cultural contexts within which linguistic interactions take place. This major criterion is further categorised as:
• Authenticity of purpose
• Authenticity of context
• Authenticity of interaction

“Pragmatic authentication”
Evidence for ‘Pragmatic Authenticity’, measuring levels of (socio-cultural) awareness and autonomy
• Authenticity of context: or the willingness by partners in communication to share culturally-situated perspectives, respecting the conventions and traditions of a collaboratively-modifiable culture. There should be evidence of agreements, explicit or implicit, interactively to share communication and so construct extensive and extendable social relations through language, with no suggestion of ‘self-determined’, one-sided closure of communication.
• Authenticity of purpose: or self-awareness and transparency in choices of expressive form and the content to be communicated. There should be evidence of intentional facilitations of changes in perspective and/or knowledge amongst interlocutors, audiences, or readers of the text created, and reflexively in ‘self’, where relevant.
• Authenticity of interaction between partners in communication: or recognitions of power in questions of validity, balance and ‘convincingness’, determining communicative quality in social relationships between speakers and listeners, writers and readers. There should be evidence for accommodations of ‘self’ to ‘other’ in processes of continuous change, and recognitions within ‘self’ and in ‘self as other’, of ability to guide this development.

Research criteria developed from Van Lier
For measuring communicative value in integrated performances, interactively combining comprehension and production, as:
Evidence for ‘Personal Authenticity’
This subsequently emerges from processing language, creating ontological, or existential status for individuals who participate in linguistic interchanges, through intrinsically-motivated, purposeful choices extending over time. This major criterion is further categorised as:
• Existential authenticity
• Intrinsic authenticity
• Autotelic authenticity (after Csikszentmihalyi, 1990)

“Personal authentication”
Evidence for ‘Personal Authenticity’, measuring levels of (cognitive) awareness and autonomy
• Existential authenticity: or social constructions and expressions of ‘self’ through (communicative) actions. Focussing attention on evidence of an awareness of the uniqueness of personal ‘voice’, or negatively, on avoiding overt plagiarism, through its absence as evidence in any given production.
• Intrinsic authenticity: or recognitions of self-determination in continuous operations of choice. Focussing attention on evidence for metacognitively-aware, active and responsible selections of style and content in communicative performances.
• Autotelic authenticity: or experiencing and expressing ‘flow’ as ‘optimal experience’, relating linguistic coherence and psychological balance to the inner mental worlds of subjects. Focussing attention on evidence for concentrated awareness of, and commitment to communication as present activity, with intentions to satisfy personally-chosen goals, and without intrusive distractions or irrelevance. (after Csikszentmihalyi, 1990)

Application to samples
Method
• Van Lier’s concepts grouped into sets of descriptors, as with the IB referent model.
• No leeway within each level for further, subjective ‘adjustments’, as with the IB scheme.
• Three simple, discrete descriptions and quantifications for the provision of evidence: “little” (1 point), “adequate” (2 points), “significant” (3 points)
• With Van Lier’s ten categories, an unweighted, maximum aggregation gives thirty points.
• Two complementary levels later added for extremes:
  * one negative, for the complete absence of evidence (0 points)
  * one positive, for interpretatively-incontestable displays of competence (4 points).
• This provides clear indicators of ‘inauthenticity’. An excess at extremes should reveal inappropriate programme and level selection by candidates, with tasks either too ‘easy’ or too ‘difficult’.
• Compensations permit ‘adjustments’, more precisely discriminating individual performances, but increasing the maximum aggregated total to forty points.

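The scoring scheme described above can be sketched in a few lines. This is an illustrative reading of the slide, not the researcher's own tooling: the category identifiers, function names and the dictionary layout are all assumptions introduced here, while the point values (1 to 3, later extended with 0 and 4) and the ten categories come from the presentation.

```python
# Ten categories derived from Van Lier, named here with hypothetical
# identifiers; the grouping follows the three major criteria in the slides.
CATEGORIES = (
    # Curricular authenticity
    "creator_self", "creator_other", "finder", "user",
    # Pragmatic authenticity
    "purpose", "context", "interaction",
    # Personal authenticity
    "existential", "intrinsic", "autotelic",
)

BASE_LEVELS = (1, 2, 3)            # "little", "adequate", "significant"
EXTENDED_LEVELS = (0, 1, 2, 3, 4)  # adds absence (0) and incontestable competence (4)


def max_total(extended=False):
    """Maximum unweighted aggregate: 30 on the base scale, 40 when extended."""
    levels = EXTENDED_LEVELS if extended else BASE_LEVELS
    return len(CATEGORIES) * max(levels)


def aggregate(ratings, extended=False):
    """Sum one rating per category, rejecting values outside the chosen scale."""
    levels = EXTENDED_LEVELS if extended else BASE_LEVELS
    missing = [c for c in CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for category in CATEGORIES:
        if ratings[category] not in levels:
            raise ValueError(f"{category}: {ratings[category]} not in {levels}")
    return sum(ratings[category] for category in CATEGORIES)


def extreme_counts(ratings):
    """Count how many categories sit at each extreme (0 points, 4 points)."""
    zeros = sum(1 for c in CATEGORIES if ratings[c] == 0)
    fours = sum(1 for c in CATEGORIES if ratings[c] == 4)
    return zeros, fours
```

The slides do not say how many extreme ratings count as an "excess", so `extreme_counts` only reports the tallies and leaves any threshold to the assessor.
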
Some general analysis of results
NOTE: This graph was derived from samples of responses to a single, common task and rubric of individual productions for external IB moderation. It allows for comparisons across raters and rating systems, across the individual assessment of each production.
• Assessment by criteria derived from categorisations of authentic language production in communicative interactions can produce meaningful evaluations.
• Internal and external moderator scores are closely correlated.
• Assessment by the researcher using the Van Lier-derived model also correlates closely, with finer distinctions between individual productions.
• Assessment using the Van Lier model with ‘plussages’ produces a more finely-distinguished gradient, more closely approaching the ideal for ‘perfect’ discrimination across an infinite number of productions from zero to the maximum.

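One way to check a claim such as "internal and external moderator scores are closely correlated" is a Pearson correlation coefficient over paired scores. The presentation does not name the statistic used, so Pearson's r is an assumption here, as are the function name and the placeholder values in the usage example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder scores for illustration only; not data from the study.
    internal = [12, 18, 25, 9, 21]
    external = [11, 19, 24, 10, 20]
    print(round(pearson(internal, external), 3))
```

A coefficient near 1 would indicate that two raters rank and space the productions similarly; it says nothing about whether either rater is systematically more severe, which is what the moderation factors address.
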
Some more specific analysis
NOTE: This graph was derived from samples of written responses to a single, common task and rubric in one examination for external IB assessment. It allows for comparisons across raters and rating systems, and across the individual assessment of each production.
• Similar general trends are apparent, as previously, despite the far smaller sample and application to reading and writing, rather than listening and speaking.
• Extreme cases are clearly identified, with the ‘plussages’ in most cases exaggerating the effect of aberrance.

Some refinements to interpretations, after experiment
• Creator authenticity as expressions of ‘self’: assessment requires evidence of particularisations of identity. An individual should be revealed, for example in the recounting of autobiographical incident, personal attitudes, emotions, dilemmas, expectations for the future, amongst others, thus allowing for originality and avoiding recitation of pre-learned models, or plagiarism. This is the ‘self’ identified by listeners and readers as a ‘personal voice’.
• Creator authenticity in perceptions of ‘other’, as interlocutor, audience or reader: assessment rates appropriate participation in communicative interchanges. There should be, for example, evidence for retaining and developing listeners’ or readers’ attention through the use of appropriate content, appropriately adapted by form, maintaining appropriate levels of participation, the text being relevant to its audience. Negative evidence pointing to a lack of proficiency in this respect would be the demonstrable need for interlocutors and readers to put in special effort to maintain appropriate levels of interaction.
• Finder authenticity: where evidence in the content of responses illustrates purposeful, critical selection and manipulation of sources of knowledge appropriate to the chosen task and its genre, in intellectual, cultural, or emotional terms, and so forth, as sourced outside ‘self’. It demonstrates the ‘resourcefulness’ of the producer of language.
• User authenticity: where evidence is assessed for the purposeful adaptation of content to fit task and context. That is, the content ‘found’ from resources is appropriately applied to the task in hand.

Some refinements to interpretations, after experiment
• Authenticity of context: or evident construction of extensive and extendable communicative interactions. Text-production indicates appropriate negotiation and agreement with initiatives from ‘other’, as a representative of the target-language culture, given tasks and situations chosen. There should be demonstrable awareness of effects on response-receivers, as markers of interaction. There will be no evidence of sustained irrelevance, evasiveness (unless appropriate), cultural ignorance or insensitivity, or indeed attempts to close down channels of communication. A clear respect for ‘other’ is indicated in a situation of “give and take”.
• Authenticity of purpose: or organisations of content with evidence for promotions and facilitations of changes in perspective and knowledge amongst audiences of the text created, and on occasion reflexively in ‘self’, thus granting the production a quality of ‘convincingness’. The receiver easily knows why communication takes place, both in the choice of genre as its framework and in the message that it contains.
• Authenticity of interaction: or the accommodations of ‘self’ to ‘other’ in processes of continuous change, marked through the unfolding of the text. There is assessable evidence of responsive, perhaps spontaneous recognitions within ‘self’ and in ‘other’, of ability and will to guide this development.

Some refinements to interpretations, after experiment
NOTE: The remaining criteria under Evidence for Personal Authenticity are more psychological in focus, and therefore more purely dependent upon the experientially-based, subjective and personal interpretations of overall ‘effect’ in communication, developed by listeners or readers.
Nevertheless:
• Existential Authenticity precludes overt plagiarism, whose presence would constitute a negative factor, indicating a weak or absent level of authentic language use in this respect.
• Intrinsic Authenticity denotes the operations of selection and choice, with evidence to maintain linguistic interactions between speaker and listener, writer and reader, in a manner that involves their deepening and/or widening.
• Autotelic Authenticity reveals ‘flow’ and psychological ‘balance’ throughout communicative productions, experienced as levels of engagement with, and concentration on the task in hand.

Conclusions in relation to the IB programmes, assessments and evaluations
• Need better to theorise categorisations and relationships for converting descriptive assessments to quantitative evaluations.
• Need for an explicit rationale in ascribing 33.33% as a maximal value for each discrete, major criterion, as conceptualised in the current scheme.
• Need for the assessment of the desirability and feasibility of measuring interactive, communicative value as authentic, across a range of languages and proficiency levels.

Some considerations put forward
• The reduction of assessment categories by descriptor could improve assessor reliability, with assessment choices more constrained and therefore less variability in interpretation possible.
• An increase in the overall number of discrete assessment categories could isolate more significant features of effective, authentic communication, better focussing the attention of teachers and students alike on what is of value.
• The conceptualisation of authenticity in linguistic interaction suggests that balance across all criteria is significant in effective communication and should be taken into consideration in assessment.
• The inclusion of extreme categories could aid in identifying the (authentic) appropriateness of candidate placement by school administrations.

Some key questions remaining
• How can we ascribe justifiable quantitative values to qualitative assessment categories and level descriptors?
• In the interests of transparency and improvements in understanding, how can the values of weighting and aggregation of discrete quantitative assessments be made explicitly justifiable?
• How can final quantitative evaluations be justifiably described (most particularly to interested third parties)?
• Can the differences between discrete assessment categories be made more distinct without loss of validity, in order to facilitate use by assessors, trained, untrained, experienced or inexperienced?

More research needed on:
The reliability of the experimental model over a range of:
• triangulations of data across different languages at different levels, from beginner to native speaker, and all in between.
• assessors, from expertly trained and highly experienced to trained novice, untrained teacher, students and perhaps non-specialist native speaker, as targeted receiver of language input.

More research needed on:
The relationship of criterion categories to quantitative value, as perceived by a range of interested parties, such as:
• Professional assessors
• Teachers
• Students
• Parents
• Institutes of Higher Education and their admissions sections

More research needed on:
Relationship of values determined by any new model of assessment and evaluation to other systems, such as:
• Previous IB programmes
• Council of Europe values
• Other systems, such as:
  * National for non-national learners (e.g. DELF/DALF, ZDaF, ZMP, etc.)
  * National for national learners of foreign language

References
• CSIKSZENTMIHALYI, M., (1990), Flow: the psychology of optimal experience, New York, Harper & Row.
• INTERNATIONAL BACCALAUREATE ORGANISATION, (1997), Guide to the Diploma Programme (in English and French versions), Geneva, International Baccalaureate Organisation.
• INTERNATIONAL BACCALAUREATE ORGANISATION, (2001), Paper Specific Instructions, Language B, Higher and Standard Levels, Cardiff, International Baccalaureate Curriculum and Assessment Centre.
• INTERNATIONAL BACCALAUREATE ORGANISATION, (2002), The Diploma Programme: Group 2 Languages (in English and French versions), Geneva, International Baccalaureate Organisation.
• INTERNATIONAL BACCALAUREATE ORGANISATION, (unpublished draft), Standardiser’s Guidelines and Checklist, Cardiff, International Baccalaureate Curriculum and Assessment Centre.
• ISRAEL, J., (unpublished doctoral thesis, 2003), Authenticity and the Assessment of Modern Language Learning, Open University, Milton Keynes.
• ISRAEL, J., (2007), Authenticity and the Assessment of Modern Language Learning for International Baccalaureate Diploma, Group 2 Languages, in The Journal of Research in International Education, London, Sage Publishers.
• SARTRE, J-P., (1946a), L’Être et le Néant, Paris, Gallimard.
• SARTRE, J-P., (1946b), L’Existentialisme est un Humanisme, Paris, Nagel.
• VAN LIER, L., (1996), Interaction in the Language Curriculum: Awareness, Autonomy and Authenticity, London and New York, Longman.