A study of inter-rater reliability among novice raters of the BEL120 writing course in a university in Malaysia / Nurrul Che Eme Embong
Main Author: Nurrul Che Eme Embong
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: http://ir.uitm.edu.my/id/eprint/14622/ http://ir.uitm.edu.my/id/eprint/14622/1/TM_NURRUL%20CHE%20EME%20EMBONG%20ED%2013_5.pdf
Summary: In the field of writing assessment, various factors have been identified as influencing the validity of examinees' scores, and one of the most prominent factors believed to threaten scoring validity is the raters. Raters play an important role in any type of assessment, particularly those that involve writing: they need to be reliable in their rating ability and in awarding the marks used to determine examinees' ability. Previous research has shown that the group of greatest concern is novice raters. This study investigates the inter-rater reliability of novice raters using holistic and analytic scoring rubrics for writing assessment, specifically in rating expository essays for the BEL 120 course at UiTM Dungun, Terengganu. The three novice raters chosen for this study were selected for sharing the same characteristics. They were asked to rate 30 expository essays using the Test of Written English (TWE) holistic scoring rubric first and, after an interval of two days, to rate the same essays using the BEL 120 analytic scoring rubric used by the faculty to mark the BEL 120 final exam. The marks were computed and analyzed using SPSS Version 18.0 for Windows. The Intraclass Correlation Coefficient was used to determine inter-rater reliability, since more than two raters were involved. The results showed that the novice raters had a low level of inter-rater reliability for both scoring rubrics. Score discrepancies among the raters also varied greatly, so rater agreement in awarding marks was low under both rubrics. Although the raters were familiar with the analytic scoring rubric, inter-rater reliability for analytic scoring was lower than for holistic scoring.
The findings provide insight into the actual level of novice raters' inter-rater reliability, and it is hoped that the Academy of Language Studies at the institution will take appropriate action to improve raters' reliability in rating writing assessment, so as to increase the scoring validity of examinees' scores.
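The abstract states that an Intraclass Correlation Coefficient was computed in SPSS over the three raters' scores for the 30 essays, but does not specify which ICC model was used. As a hedged illustration only, the sketch below implements one common choice for this design, the two-way random-effects, absolute-agreement, single-rater ICC(2,1), from the standard ANOVA mean squares; the function name and the choice of model are assumptions, not details from the thesis.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random-effects, absolute-agreement, single rater.

    ratings: array of shape (n_subjects, k_raters), e.g. 30 essays x 3 raters.
    Returns a value near 1.0 for high agreement, lower for poor agreement.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)   # mean score per essay
    col_means = ratings.mean(axis=0)   # mean score per rater

    # Two-way ANOVA sum-of-squares decomposition
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)   # between-essay
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)   # between-rater
    ss_total = np.sum((ratings - grand_mean) ** 2)
    ss_err = ss_total - ss_rows - ss_cols                 # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Shrout-Fleiss ICC(2,1) formula
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

For example, three essays rated identically by two raters (`[[1, 1], [2, 2], [3, 3]]`) yield an ICC of 1.0, while introducing disagreement lowers the coefficient, which is the pattern the study reports for its novice raters.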