Good reasons for high variability (low inter-rater reliability) in performance assessment: Toward a fuzzy logic model

Bibliographic Details
Published in International journal of industrial ergonomics Vol. 44; no. 5; pp. 685-696
Main Authors Roth, Wolff-Michael; Mavin, Timothy J.; Munro, Ian
Format Journal Article
Language English
Published Amsterdam Elsevier B.V. 01.09.2014
Summary: Regular performance assessment is an integral part of (high-) risk industries. Past research shows, however, that in many fields, inter-rater reliabilities tend to be moderate to low. This study was designed to investigate the variability of performance assessment in a naturalistic setting in aviation. A modified think-aloud protocol was used as research design to investigate the reasoning pairs of pilots use to assess the performance of an airline captain in a high-risk situation. Standard protocol analysis and interaction analysis methods were employed in the analysis of transcribed verbal protocols. The analyses confirm high variability in performance assessment and reveal the good, albeit fuzzy, justifications that assessor pairs use to ground their assessments. A fuzzy logic model exhibits a good approximation between predicted and actual ratings. Implications for the practice of performance assessment are provided. Many industries aim at achieving consistency in identifying true performance levels. However, if the variability in performance assessment is a real phenomenon, as reported here, then practitioners and researchers might have to test whether it can be used positively, e.g., as opportunity for improving the resilience of crews.
• High-risk industries are subject to variance in staff performance assessment.
• Nine pairs of pilots assessed a captain making a risk-related decision.
• Considerable variance but good reasons characterize assessment.
• Results are discussed in terms of interrater reliability.
• The studies imply variability as an affordance in pilot training.
ISSN: 0169-8141, 1872-8219
DOI: 10.1016/j.ergon.2014.07.004