The impact of repeated item development training on the prediction of medical faculty members' item difficulty index


Bibliographic Details
Published in BMC medical education Vol. 24; no. 1; pp. 599 - 9
Main Authors Lee, Hye Yoon; Yune, So Jung; Lee, Sang Yeoup; Im, Sunju; Kam, Bee Sung
Format Journal Article
Language English
Published England BioMed Central Ltd 30.05.2024
BioMed Central
BMC
Summary: Item difficulty plays a crucial role in assessing students' understanding of the concept being tested. The difficulty of each item needs to be carefully adjusted to ensure the achievement of the evaluation's objectives. Therefore, this study aimed to investigate whether repeated item development training for medical school faculty improves the accuracy of predicting item difficulty in multiple-choice questions. A faculty development program was implemented to enhance the prediction of each item's difficulty index, ensure the absence of item defects, and maintain the general principles of item development. The interrater reliability between the predicted, actual, and corrected item difficulty was assessed before and after the training, using either the kappa index or the correlation coefficient, depending on the characteristics of the data. A total of 62 faculty members participated in the training. Their predictions of item difficulty were compared with the analysis results of 260 items taken by 119 fourth-year medical students in 2016 and 316 items taken by 125 fourth-year medical students in 2018. Before the training, significant agreement between the predicted and actual item difficulty indices was observed for only one medical subject, Cardiology (K = 0.106, P = 0.021). However, after the training, significant agreement was noted for four subjects: Internal Medicine (K = 0.092, P = 0.015), Cardiology (K = 0.318, P = 0.021), Neurology (K = 0.400, P = 0.043), and Preventive Medicine (r = 0.577, P = 0.039). Furthermore, a significant agreement was observed between the predicted and actual difficulty indices across all subjects when analyzing the average difficulty of all items (r = 0.144, P = 0.043). Regarding the actual difficulty index by subject, Neurology exceeded the desired difficulty range of 0.45-0.75 in 2016. By 2018, however, all subjects fell within this range.
Repeated item development training, which includes predicting each item's difficulty index, can enhance faculty members' ability to predict and adjust item difficulty accurately. To ensure that the difficulty of the examination aligns with its intended purpose, item development training can be beneficial. Further studies on faculty development are necessary to explore these benefits more comprehensively.
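The agreement measure named in the summary can be illustrated with a minimal sketch. It assumes that the item difficulty index is the proportion of examinees answering an item correctly, and that predicted and actual indices are grouped into three bands around the 0.45-0.75 range before Cohen's kappa is computed. The banding, the function names, and the sample values are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def difficulty_index(responses):
    """Proportion of correct responses (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def band(p, low=0.45, high=0.75):
    """Assumed three-band grouping around the desired 0.45-0.75 range."""
    if p < low:
        return "hard"
    if p > high:
        return "easy"
    return "desired"


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of category labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the marginal proportions per category.
    expected = sum(
        count_a[c] * count_b[c] for c in set(rater_a) | set(rater_b)
    ) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category occurs
    return (observed - expected) / (1.0 - expected)


# Hypothetical values for four items: faculty predictions vs. exam results.
predicted = [band(p) for p in (0.40, 0.60, 0.80, 0.70)]
actual = [band(p) for p in (0.30, 0.55, 0.72, 0.65)]
print(cohen_kappa(predicted, actual))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a value such as 0.318 can be statistically significant and still indicate only modest concordance between predicted and actual difficulty.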
ISSN:1472-6920
DOI:10.1186/s12909-024-05577-x