Detecting Rater Biases in Sparse Rater-Mediated Assessment Networks

Bibliographic Details
Published in: Educational and Psychological Measurement, Vol. 81, No. 5, pp. 996-1022
Main Authors: Wind, Stefanie A.; Ge, Yuan
Format: Journal Article
Language: English
Published: Los Angeles, CA: SAGE Publications, 01.10.2021
Summary: Practical constraints in rater-mediated assessments limit the availability of complete data. Instead, most scoring procedures include one or two ratings for each performance, with overlapping performances across raters or linking sets of multiple-choice items to facilitate model estimation. These incomplete scoring designs present challenges for detecting rater biases, or differential rater functioning (DRF). The purpose of this study is to illustrate and explore the sensitivity of DRF indices in realistic sparse rating designs documented in the literature, which include different types and levels of connectivity among raters and students. The results indicated that it is possible to detect DRF in sparse rating designs, but the sensitivity of DRF indices varies across designs. We consider the implications of our findings for practice related to monitoring raters in performance assessments.
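
The summary's reference to connectivity among raters and students can be made concrete with a small, hypothetical sketch (not drawn from the article): an incomplete scoring design can be viewed as a bipartite graph of raters and students, and its connected components show which raters and students are linked through shared performances. The rating pairs and the use of the networkx library below are illustrative assumptions.

    # Minimal sketch (illustrative only, not the authors' procedure): check whether
    # a sparse rater-by-student design is connected by treating it as a bipartite graph.
    import networkx as nx

    # Hypothetical incomplete design: (rater, student) pairs with one or two ratings
    # per student; S2 links R1 and R2, but R3 rates a separate group of students.
    ratings = [
        ("R1", "S1"), ("R1", "S2"),
        ("R2", "S2"), ("R2", "S3"),
        ("R3", "S4"), ("R3", "S5"),
    ]

    G = nx.Graph()
    for rater, student in ratings:
        G.add_edge("rater:" + rater, "student:" + student)

    print("Fully linked design:", nx.is_connected(G))   # False for this example
    for i, part in enumerate(nx.connected_components(G), 1):
        print("Linked subset", i, ":", sorted(part))

When the graph has more than one component, the subsets of raters share no common students or linking items, so their ratings cannot be placed on a common frame of reference, which is one reason sparse designs complicate DRF detection.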
ISSN: 0013-1644, 1552-3888
DOI: 10.1177/0013164420988108