Papers by Jayanti Banerjee
Language assessment accommodations: Issues and challenges for the future
Language Testing
Accommodations in language testing and assessment: Safeguarding equity, access, and inclusion
Language Testing

Handbook of Second Language Assessment
Handbook of Second Language Assessment, 2016
Second language assessment is ubiquitous. It has found its way from education into questions about access to professions and migration. This volume focuses on the main debates and research advances in second language assessment in the last fifty years or so, showing the influence of linguistics, politics, philosophy, psychology, sociology, and psychometrics. There are four parts which, when taken together, address the principles and practices of second language assessment while considering its impact on society. Read separately, each part addresses a different aspect of the field. Part I deals with the conceptual foundations of second language assessment, with chapters on the purposes of assessment, and standards and frameworks, as well as matters of scoring, quality assurance, and test validation. Part II addresses the theory and practice of assessing different second language skills, including aspects like intercultural competence and fluency. Part III examines the challenges and opportunities of second language assessment in a range of contexts. In addition to chapters on second language assessment on a national scale, there are chapters on learning-oriented assessment, as well as the uses of second language assessment in the workplace and for migration. Part IV examines a selection of important issues in the field that deserve attention. These include the alignment of language examinations to external frameworks, the increasing use of technology to both deliver and score second language tests, the responsibilities associated with assessing test takers with special needs, the concept of 'voice' in second language assessment, and assessment literacy for teachers and other test and score users.
Translation and interpretation skills
De Gruyter Mouton, 2016

Welcome to issue 48 of Research Notes, our quarterly publication reporting on matters relating to research, test development and validation within University of Cambridge ESOL Examinations. This issue presents research undertaken within the 2011 English Australia/Cambridge ESOL Action Research in ELICOS Programme, which supports teachers working in the English language intensive courses for overseas students (ELICOS) sector in Australia. In the first article Katherine Brandon provides the background to the 2011 Action Research (AR) Programme, which sought projects to explore knowledge, skills, attitudes or practices in teaching English for specific or general purposes; monitoring student progress; and student motivation. This is followed by a summary of a recent study into the impact of the Programme for the ELICOS sector by Anne Burns, who focuses on the impact on participating teachers, their institutions and more widely. Next, six funded projects are presented by the teacher-researchers who participated in the 2011 Programme within five different institutions and several regions within Australia. The first pair of articles explore specific skills in the classroom. Sara Kablaoui and Amal Khabbaz explore the development of reading skills of Arabic English as a Second Language (ESL) learners through four specific reading strategies which helped to improve the participants' reading skills. Next, John Gardiner reports on his study in which he investigated the grammar teaching beliefs of English for Academic Purposes (EAP) learners in order to improve classroom instruction and student motivation. The second pair of articles focus on aspects of learner autonomy and include the winner of the 2011 Action Research in ELICOS Award, Brendan Brown. Brendan explores ways of improving the pronunciation of higher level students, based on the students' identification of key aspects of their own pronunciation and independent practice. Adi Rotem's project sought to enable greater learner autonomy amongst EAP students, using teaching and learning strategies to observe and document learner progress along an existing independent learning continuum, with students encouraged to form learner-directed study groups outside of class. The final two articles explore assessment. Brigette Fyfe and Christine Vella report on their study into using assessment rubrics as a teaching tool in order to improve students' academic writing skills through an increase in understanding of academic conventions and building upon intrinsic features of academic texts. Finally, Megan Baker describes how she created a blog for a mixed-level class of students in order to see whether this increased their fluency and creativity in writing and whether this could be used for self-assessment. The third round of research funded by this programme is underway and we look forward to reporting on these studies in a future issue. We finish this issue with a picture of the presentation of the 2011 Action Research in ELICOS Award.
Achievement Tests. Aptitude tests. Diagnostic tests. DIALANG. Dictation. Direct/Indirect testing. Discrete point tests. Integrated tests. Integrative tests. Placement tests. Proficiency tests. Progress tests. Reliability. Validity

Assessing Writing, 2015
In performance-based writing assessment, regular monitoring and modification of the rating scale is essential to ensure reliable test scores and valid score inferences. However, the development and modification of rating scales (particularly writing scales) is rarely discussed in the language assessment literature. The few studies documenting the scale development process have derived the rating scale from analyzing one or two data sources: expert intuition, rater discussion, and/or real performance. This study reports on the review and revision of a rating scale for the writing section of a large-scale, advanced-level English language proficiency examination. Specifically, this study first identified, from the literature, the features of written text that tend to reliably distinguish between essays across levels of proficiency. Next, using corpus-based tools, 796 essays were analyzed for text features that predict writing proficiency levels. Lastly, rater discussions were analyzed to identify components of the existing scale that raters found helpful for assigning scores. Based on these findings, a new rating scale has been prepared. The results of this work demonstrate the benefits of triangulating information from writing research, rater discussions, and real performances in rating scale design.
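The screening step described in this abstract, checking which text features actually track awarded writing scores, can be illustrated with a minimal sketch. The features, sample data, and correlation choice below are assumptions for demonstration only; they are not the corpus tools or feature set used in the study.

```python
# Minimal sketch: screening simple text features against writing scores.
# Features and data are illustrative, not the study's instruments.
import re
from statistics import mean
from scipy.stats import spearmanr

def extract_features(essay: str) -> dict:
    """Compute a few surface features often linked to writing proficiency."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", essay.lower())
    return {
        "mean_sentence_length": mean(len(re.findall(r"[A-Za-z']+", s)) for s in sentences),
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "mean_word_length": mean(len(t) for t in tokens),
    }

# Hypothetical corpus: (essay text, awarded score) pairs.
corpus = [
    ("The study was small. It was fun.", 2),
    ("Although the sample was limited, the findings suggest a robust trend across cohorts.", 5),
    ("Researchers frequently triangulate evidence because single indicators can mislead interpretation.", 6),
]

features = [extract_features(text) for text, _ in corpus]
scores = [score for _, score in corpus]

# Retain only features that correlate with the awarded scores,
# mirroring the screening step described in the abstract.
for name in features[0]:
    values = [f[name] for f in features]
    rho, p = spearmanr(values, scores)
    print(f"{name}: rho={rho:.2f}, p={p:.3f}")
```

In practice such a screen would run over hundreds of essays and a much richer feature set before any feature informs a rating scale descriptor.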
State-of-the-Art Review: Language testing and assessment (Part I)
This is the third in a series of State-of-the-Art review articles in language testing in this journal, the first having been written by Alan Davies in 1978 and the second by Peter Skehan in 1988/1989.
Choosing Test Formats and Task Types
ILTA language testing bibliography 1990-1999
Keeping up with the times: Triangulating multiple data sources to inform revisions to a writing rubric

This study has taken its lead from discussions about the benefit of collaboration between researchers in language testing and second language acquisition (e.g. Bachman and Cohen, 1998; Ellis, 2001; and Laufer, 2001). It addresses the question of how competence levels, as operationalised in a rating scale, might be related to what is known about L2 developmental stages. Looking specifically at the writing performances generated by Tasks 1 and 2 of the IELTS Academic Writing module, the study explores the defining characteristics of written language performance at IELTS bands 3-8 with regard to: cohesive devices used; vocabulary richness; syntactic complexity; and grammatical accuracy. It also considers the effects of L1 and writing task type on the measures of proficiency explored. The writing performances of 275 test-takers from two L1 groups (Chinese and Spanish) were transcribed and then subjected to manual annotation for each of the measures selected. Where automatic or semi-autom...
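Two of the measures named in this abstract, vocabulary richness and cohesive devices, lend themselves to a brief illustration. The operationalisations below (Guiraud's index and a small list of connectives) are assumed for demonstration and do not reproduce the study's annotation scheme.

```python
# Minimal sketch of two text measures of the kind listed in the abstract.
# The connective list and index choice are illustrative assumptions.
import re
from math import sqrt

CONNECTIVES = {"however", "therefore", "moreover", "although", "because",
               "furthermore", "consequently", "nevertheless"}  # illustrative subset

def vocabulary_richness(text: str) -> float:
    """Guiraud's index: word types divided by the square root of tokens."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(tokens)) / sqrt(len(tokens))

def cohesive_device_count(text: str) -> int:
    """Count occurrences of listed connectives as a rough cohesion proxy."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return sum(1 for t in tokens if t in CONNECTIVES)

sample = ("Although the city has grown, traffic remains manageable because "
          "the council invested in public transport. However, housing costs rose.")
print(vocabulary_richness(sample), cohesive_device_count(sample))
```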

State-of-the-art review: Language testing and assessment (Part 2)
In Part 1 of this two-part review article (Alderson & Banerjee, 2001), we first addressed issues of washback, ethics, politics and standards. After a discussion of trends in testing on a national level and in testing for specific purposes, we surveyed developments in computer-based testing and then finally examined self-assessment, alternative assessment and the assessment of young learners. In this second part, we begin by discussing recent theories of construct validity and the theories of language use that help define the constructs that we wish to measure through language tests. The main sections of the second part concentrate on summarising recent research into the constructs themselves, in turn addressing reading, listening, grammatical and lexical abilities, speaking and writing. Finally we discuss a number of outstanding issues in the field.
Language Assessment in the Educational Context
European survey of language testing and assessment needs: Part 2 (regional and national results)

Response Time Data as Validity Evidence: Has It Lived Up To Its Promise and, If Not, What Would It Take to Do So?
Understanding and Investigating Response Processes in Validation Research
As a convenient data source from computerized tests, response time can also be informative evidence for the validity of test scores, offering insights into the parts of a test that test takers linger over and the parts they glide through. Given these hopes for, and expectations of, response time data, we should critically evaluate how such data are studied and, more importantly, to what extent this type of data lives up to its promise. We begin this chapter by defining response time and briefly discussing its use as validity evidence. We then describe the typical uses of response time data and review the different approaches to studying response time data across studies. The chapter closes with an evaluation of response time data as validity evidence and suggests the conditions that would facilitate better use of response time data for validation purposes.
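One common way response time data are screened, though not necessarily the procedure the chapter discusses, is a normative threshold that flags responses far below an item's typical time (possible rapid guessing) or far above it (lingering). The thresholds and data below are assumptions for illustration.

```python
# Minimal sketch: flagging unusually fast or slow item responses
# relative to each item's median time. Thresholds are assumed values.
from statistics import median

# Hypothetical response times (seconds) per item; one value per test taker.
response_times = {
    "item_1": [42, 55, 3, 61, 48],
    "item_2": [15, 22, 19, 120, 18],
}

for item, times in response_times.items():
    typical = median(times)
    rapid = [t for t in times if t < 0.10 * typical]    # assumed 10% threshold
    lingering = [t for t in times if t > 3 * typical]   # assumed 3x threshold
    print(f"{item}: median={typical}s, rapid={rapid}, lingering={lingering}")
```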

Using Corpus Analyses to Help Address the DIF Interpretation: Gender Differences in Standardized Writing Assessment
Frontiers in Psychology
Addressing differential item functioning (DIF) provides validity evidence to support the interpretation of test scores across groups. Conventional DIF methods flag DIF items statistically, but often fail to consolidate a substantive interpretation. The lack of interpretability of DIF results is particularly pronounced in writing assessment, where the matching of test takers’ proficiency levels often relies on external variables and the reported DIF effect is frequently small in magnitude. Using responses to a prompt that showed small gender DIF favoring female test takers, we demonstrate a corpus-based approach that helps address DIF interpretation. To provide linguistic insights into the possible sources of the small DIF effect, this study compared a gender-balanced corpus of 826 writing samples matched by test takers’ performance on the reading and listening components of the test. Four groups of linguistic features that correspond to the rating dimensions, and thus partially represent the writing construct, were analyzed. They include (1) sentiment and social cognition, (2) cohesion, (3) syntactic features, and (4) lexical features. After initial screening, 123 linguistic features, all of which were correlated with the writing scores, were retained for gender comparison. Among these selected features, female test takers’ writing samples scored higher on six of them, with small effect sizes, in the categories of cohesion and syntactic features. Three of the six features were positively correlated with higher writing scores, while the other three were negatively correlated. These results are largely consistent with previous findings of gender differences in language use. Additionally, the small differences in the language features of the writing samples (in terms of the small number of features that differ between genders and the small effect size of the observed differences) are consistent with the previous DIF results, both suggesting that the effect of gender differences on the writing scores is likely to be very small. In sum, the corpus-based findings provide linguistic insights into the gender-related language differences and their potential consequences in a testing context. These findings are meaningful for furthering our understanding of the small gender DIF effect identified through statistical analysis, which lends support to the validity of writing scores.
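The group-comparison step described in this abstract, comparing feature values between matched groups and judging the size of any difference, can be sketched briefly. The feature values are invented, and Cohen's d is used here only as one familiar effect-size measure; the study's matching design and feature set are not reproduced.

```python
# Minimal sketch: effect size of a group difference in one linguistic feature.
# Data are hypothetical; the effect-size choice is an illustrative assumption.
from statistics import mean, stdev
from math import sqrt

def cohens_d(a, b):
    """Standardised mean difference between two groups, using pooled SD."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd

# Hypothetical per-essay values of one cohesion feature, by group.
female = [0.42, 0.39, 0.47, 0.44, 0.41]
male = [0.40, 0.38, 0.43, 0.41, 0.39]

d = cohens_d(female, male)
print(f"Cohen's d = {d:.2f}")  # values around 0.2 or below are conventionally 'small'
```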

Language Testing
The papers in this special issue provide support for continued scrutiny of interactional competence (IC) as an important component of the speaking construct. The contributions underscore the complex nature of IC and remind us of the multiple factors that affect any construct definition. At the same time, each study offers insights into those factors through their explorations of IC. In this final paper, we first briefly review key findings from the papers that confirm what is already known about IC and that provide new information to our understanding of the construct of IC. After summarizing points of convergence and of divergence, we turn to a discussion of areas that require additional targeted attention and offer four generalizations as starting points for research. In the final section, we take a critical look at the challenges associated with including IC in the speaking construct and the implications of the studies in this special issue for the relationship between IC and pro...