Rater Reliability in Evaluation of Essay and Oral Examinations
1970, Pedagogisk Forskning
https://doi.org/10.1080/0031383700140111…
27 pages
Abstract
This study analyzes the reliability of examination scoring at the Institute of Psychology in Oslo, covering both written and oral assessments across different examination periods. By evaluating grading consistency among multiple examiners, the research highlights discrepancies and potential biases in grading practices, particularly noting differences in how male and female candidates are assessed during oral examinations.
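The core of such a reliability analysis is how closely independent examiners agree when grading the same candidates. A minimal sketch of that computation is given below; the grade vectors are invented, and the Spearman-Brown projection to a two-examiner committee is only one common way of expressing committee reliability, not necessarily the procedure used in the original study.

```python
# Minimal sketch of an inter-rater reliability check for exam grades.
# All grade vectors below are hypothetical and only illustrate the computation.
import numpy as np

def interrater_correlation(grades_a, grades_b):
    """Pearson correlation between two examiners' grades for the same candidates."""
    return np.corrcoef(np.asarray(grades_a, float), np.asarray(grades_b, float))[0, 1]

def spearman_brown(r_single, n_raters):
    """Estimated reliability of the average grade from n_raters, given single-rater agreement r_single."""
    return n_raters * r_single / (1 + (n_raters - 1) * r_single)

# Hypothetical essay grades assigned independently by two examiners to ten candidates
examiner_a = [2.5, 3.0, 2.0, 3.5, 2.5, 1.5, 3.0, 2.0, 2.5, 3.5]
examiner_b = [2.0, 3.0, 2.5, 3.0, 2.0, 2.0, 3.5, 2.5, 2.5, 3.0]

r = interrater_correlation(examiner_a, examiner_b)
print(f"single-rater agreement: {r:.2f}")
print(f"reliability of a two-examiner committee (Spearman-Brown): {spearman_brown(r, 2):.2f}")
```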
Related papers
University News, 2018
Examination is a measure of students' progress. It is used as a means to organize and integrate knowledge, incorporating both scholastic and non-scholastic aspects of education. Maintaining fairness, confidentiality, security and timely execution during the examination process has become a serious challenge, especially in traditional affiliating universities, and keeping abreast of changing times has become a bigger challenge still. There is therefore a need to reform traditional examination and evaluation systems so as to achieve uniformity, reliability and validity. The contemporary concerns of higher education are mainly the processes and procedures associated with delivering the key services and activities of assessment reform, which include the coexisting practices of examination and evaluation.
Revista Canaria de Estudios Ingleses, 2011
The importance of the University Entrance Examination for students' academic future has fostered research on the characteristics of the exams that compose it. Among them, the English exam has been analysed with respect to crucial issues such as its validity and reliability, the students' written production in the foreign language, and the improvements that might be implemented in the exam. This paper reviews the results of the studies conducted so far on the English exam, so that they may be taken into account in forthcoming studies or when implementing changes to the exam.
University entry tests are a hybrid of level tests, survey tests, diagnostic tests and selective tests. Some of them (those relating to planned-access degree courses) determine a ranking which, within the limits of the available places, governs registration for the course. The others, even if compulsory, produce scores that do not affect enrolment but serve to highlight deficiencies to be remedied or, in some cases, to give a negative opinion about enrolment. The problem of the number of pre-registrations, which might be irrelevant for a planned-access degree, poses serious problems for an open-access degree. While the tests for the degree courses in Primary Education and in Childhood and Preadolescence Training can ensure objectivity and selectivity, for the degree course in Science of Education there is a vital need to construct a test which, in addition to the required reliability and validity, can detect deficiencies that, if not remedied, could undermine future studies (Marlow, 2000; Cheung, Bucat, 2002). In other words, high enrolment in open-access degree courses affects the quality of students' learning paths (Notti, 2010). If entry tests do not identify such deficiencies and provide no instrument for remedying them, the productivity of the degree course and of the faculty will suffer heavily. It is clearly necessary and useful to check the instruments that have been built, without any presumption of infallibility, especially since a bad test provides unreliable results (Steven et al., 1990; 1991). For this reason every effort should be made to build valid and reliable tests. The objectives of the research were, in summary, the following: to check the validity and reliability of the entry tests for the degree course in Science of Education; to report any problems emerging from the statistical analysis; and to suggest, where possible, solutions that would make the tests congruent with the purposes for which they were built. After obtaining the necessary authorizations, the tests administered to the students and the corresponding result records were acquired in electronic form. Statistical analyses were then carried out, in line with the objectives of the research, to check whether the entry tests are able to select students according to their level of preparation, and to check the ability of the test to measure the skills for which it was constructed and, consequently, its internal coherence. The test proved quite selective, not particularly difficult, and contained many unreliable items. Examination of the results reveals that the 1133 participating students had the greatest difficulty in the two test areas called "Linguistics and literature" and "Geography", in which there is a good level of selectivity. A strong criticism emerged concerning the quality of the distractors. Overall, the results show a sufficient quality of the test and a capacity to place students' results in a reliable ranking.
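The quantities this analysis refers to, item difficulty, selectivity (discrimination) and internal coherence, are standard classical test theory statistics. The sketch below shows how they might be computed from a dichotomously scored response matrix; the data and the model used to generate them are assumptions made for illustration, not the actual entry-test material.

```python
# Sketch of classical item analysis for a dichotomously scored entry test.
# The response matrix is simulated: rows = examinees, columns = items.
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items = 200, 30
ability = rng.normal(size=(n_examinees, 1))
item_location = rng.normal(size=(1, n_items))
prob_correct = 1 / (1 + np.exp(-(ability - item_location)))
responses = (rng.random((n_examinees, n_items)) < prob_correct).astype(int)

# Item difficulty: proportion of correct answers per item (higher = easier).
difficulty = responses.mean(axis=0)

# Item discrimination: correlation of each item with the rest-score.
total = responses.sum(axis=1)
discrimination = np.array([
    np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
    for j in range(n_items)
])

# Cronbach's alpha as an index of internal consistency.
item_var = responses.var(axis=0, ddof=1).sum()
total_var = total.var(ddof=1)
alpha = n_items / (n_items - 1) * (1 - item_var / total_var)

print(f"mean difficulty {difficulty.mean():.2f}, "
      f"mean discrimination {discrimination.mean():.2f}, alpha {alpha:.2f}")
```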
Social validation procedures were used to compare views held by professors and students concerning college examinations. In the first phase of this project, four groups of participants (21 professors and 126 students representing psychology and biology) listed characteristics of good and poor tests. The characteristics given by the groups were somewhat similar, although very few ideas were given about the results of examinations (e.g., range of scores). A socially valid questionnaire was constructed from the most commonly cited characteristics, and four additional groups of professors (n = 25) and students (n = 102) were asked to rate the importance of each. MANOVAs indicated that the professors and students gave significantly different ratings, but there were only slight differences between the two disciplines, and no differences between those who reported having taken measurement or testing classes and those who had not. The differences between professors and students were especially clear in ratings of the Instructions and Question characteristics of tests. In contrast, items from the questionnaire concerning the Coverage and Content of an examination were given similar ratings by the four groups. The results suggest a number of ways that professors can construct examinations which students will respect.
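The group comparison reported here rests on a MANOVA over several rating scales. The sketch below illustrates that kind of analysis with statsmodels on simulated data; the scale names, group sizes, and rating values are invented for illustration and are not the study's questionnaire or results.

```python
# Hypothetical sketch of a MANOVA comparing professors' and students' questionnaire
# ratings on several test-characteristic scales. All names and values are invented.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(1)
n_prof, n_stud = 25, 102

df = pd.DataFrame({
    "role": ["professor"] * n_prof + ["student"] * n_stud,
    # Simulated importance ratings on a 1-5 scale
    "instructions": np.r_[rng.normal(4.2, 0.5, n_prof), rng.normal(3.6, 0.6, n_stud)],
    "questions":    np.r_[rng.normal(4.0, 0.5, n_prof), rng.normal(3.4, 0.6, n_stud)],
    "coverage":     np.r_[rng.normal(4.1, 0.5, n_prof), rng.normal(4.0, 0.6, n_stud)],
    "content":      np.r_[rng.normal(4.3, 0.5, n_prof), rng.normal(4.2, 0.6, n_stud)],
})

# Multivariate test of the role effect (Wilks' lambda, Pillai's trace, etc.)
fit = MANOVA.from_formula("instructions + questions + coverage + content ~ role", data=df)
print(fit.mv_test())
```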
2013
This paper aims to analyse the assessment procedure for the speaking component of a high-stakes test of standard repute, the International English Language Testing System (IELTS). An attempt is made to describe the test tasks and test construct, the test procedure, its rater(s) and rating criteria, to weigh its strengths and, finally, to identify its weaknesses, if any.
Journal for educational research online, 2016
The article by Kupiainen, Marjanen and Hautamäki focuses on the centralized school-leaving examination of upper secondary education in Finland as both a school-leaving and a university-entrance examination. The study presented examines whether the increased freedom to choose among subject-specific examinations may affect the comparability of examination results and students' choices, not only in the examination itself but already during their school years. Reference is made to Finland's more than 160-year tradition of centralized school-leaving examinations at the transition from upper secondary education to higher education. The authors describe the Finnish system with regard to the introduction of a course-based (as opposed to class- or grade-based) curriculum for the three-year upper secondary level and with regard to the subsequent reforms of the centralized school-leaving examination, through which students' options for the subject-specific examina...
2014
Language testing professionals and teacher educators have articulated the need for a broad variety of stakeholders, including classroom teachers, to develop assessment literacy. In this paper, we argue that when teachers are involved in local assessment development projects, they can expand their assessment knowledge and skills beyond what is necessary for conducting principled classroom assessments. We further claim that a particular analytic approach, Rasch analysis, should be considered as one possible element of this expanded assessment literacy. To this end, we use placement exam data from one Colombian university to illustrate how analyses from item response theory perspectives (Rasch analysis) differ from, and can usefully complement, classical test theory.
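To make concrete how a Rasch analysis differs from classical test theory statistics, the sketch below fits a dichotomous Rasch model by joint maximum likelihood to simulated responses and sets the estimated item difficulties (in logits) beside the classical proportion-correct values. The data, estimation routine, and anchoring choice are assumptions made for illustration, not the placement-exam analysis reported in the paper.

```python
# Illustrative joint maximum likelihood (JML) estimation of a dichotomous Rasch model,
# compared with the classical proportion-correct item statistic. Simulated data only.
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 300, 15
theta_true = rng.normal(0, 1, n_persons)          # person abilities
delta_true = np.linspace(-1.5, 1.5, n_items)      # item difficulties
p_true = 1 / (1 + np.exp(-(theta_true[:, None] - delta_true[None, :])))
X = (rng.random((n_persons, n_items)) < p_true).astype(int)

# Drop persons with perfect or zero scores (their JML estimates are infinite).
keep = (X.sum(1) > 0) & (X.sum(1) < n_items)
X = X[keep]

theta = np.zeros(X.shape[0])
delta = np.zeros(n_items)
for _ in range(50):                               # alternating Newton-Raphson updates
    p = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))
    info = p * (1 - p)
    theta += (X - p).sum(1) / info.sum(1)         # update person abilities
    p = 1 / (1 + np.exp(-(theta[:, None] - delta[None, :])))
    info = p * (1 - p)
    delta -= (X - p).sum(0) / info.sum(0)         # update item difficulties
    delta -= delta.mean()                         # anchor the scale (mean difficulty 0)

ctt_difficulty = X.mean(0)                        # classical proportion correct
for j in range(n_items):
    print(f"item {j + 1:2d}: p = {ctt_difficulty[j]:.2f}  Rasch delta = {delta[j]:+.2f}")
```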
relabs.org
This study investigates the effect of reporting the unadjusted raw scores in a high-stakes language exam when raters differ significantly in severity and self-selected questions differ significantly in difficulty. More sophisticated models, introducing meaningful facets and parameters, are successively used to investigate the characteristics of the dataset. The application of the Rasch models to the data showed that examinees could benefit significantly from being marked by lenient raters and by responding to less demanding essay questions. It was also shown that the third rater failed to adjust the raw scores in a way similar to the statistical adjustment by the Rasch models. The study discusses the consequences of reporting unadjusted raw scores with particular emphasis on issues of fairness.
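The fairness problem described above arises because an examinee's expected raw score depends on which rater and which self-selected question they happen to draw. As a rough illustration under a simple dichotomous facet formulation (ability minus question difficulty minus rater severity, in logits) with invented parameter values, the sketch below shows how a lenient rater or an easy question inflates the expected raw score; it is not the polytomous many-facet models fitted in the study.

```python
# Illustration of how rater severity and question difficulty shift expected raw scores
# under a simple dichotomous facet formulation. All parameter values are invented.
import numpy as np

def expected_score(ability, question_difficulty, rater_severity, max_score=9):
    """Expected raw score out of max_score binary scoring opportunities."""
    logit = ability - question_difficulty - rater_severity
    p = 1 / (1 + np.exp(-logit))
    return max_score * p

ability = 0.5                                   # one examinee's ability in logits
questions = {"easy essay": -0.8, "hard essay": +0.8}
raters = {"lenient rater": -0.6, "severe rater": +0.6}

for q_name, q_diff in questions.items():
    for r_name, r_sev in raters.items():
        score = expected_score(ability, q_diff, r_sev)
        print(f"{q_name:10s} / {r_name:13s}: expected raw score {score:.1f} / 9")
```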
2014
This paper is aimed at foreign teaching staff in Denmark who are interested in gaining a better understanding of assessment methods and practices in Danish higher education. It addresses assessment practices and grading at Danish universities, with special attention to the use, preparation, conduct, and assessment of oral exams. It also examines the formal role of examiners and co-examiners, exploring possible differences between how international and Danish staff might approach the tasks of examining, co-examining, and grading. Finally, it considers some important issues raised by the increasing use of English as the examination language.

Assessment methods. Summative assessment is a core activity of any education system. Exams are important in different ways for students, for teachers, and for future employers. To ensure that students fulfil their degree requirements, universities must test them not only at the end of their degree programme but also during the course of it, to guarantee progression and partial competences. Exams inevitably structure and shape the work of students, who naturally want to pass their exams and succeed in their studies, and thus to reach high levels of competence and eventually find interesting and well-paying jobs. Exams are also high-stakes testing activities with important consequences for the test takers: passing has important advantages, and failing has important disadvantages. In addition, the actual exam results matter to the students, since all final semester grades, which typically reflect only the grades received on exams, appear on the diploma. Teachers may stimulate students to do their best on exams by informing them of the final exam requirements in a positive and constructive way, and sometimes by reminding them of the negative consequences if they do not take their exams seriously. Teachers also sometimes measure their own performance by the successful results of their students. Finally, future employers need exams to know the levels of knowledge, skills and competence of job candidates. BA, MA or PhD diplomas inspire confidence, but the actual grades can also be important in understanding a candidate's profile. Exams are the only summative form of assessment in Danish university education, since no official grading occurs during the semester. With grading and testing not usually being part of the daily culture and power relations in Danish …

Appeals. Danish students have extensive opportunities to appeal against examination results if they do not think that these are fair. The university receives and evaluates appeals, usually at the departmental level, and has a system of appeal bodies that handle such matters. If the student disagrees with the university's decision, the student can appeal it by contacting the Danish Ministry of Science, Innovation and Higher Education. Appealing can result not only in the same or a better grade but also in a lower grade.
