The Scaling of Mixed-Item-Format Tests With the One-Parameter and Two-Parameter Partial Credit Models

Bibliographic Details
Published in: Journal of Educational Measurement, Vol. 37, No. 3, pp. 221-244
Main Authors: Sykes, Robert C.; Yen, Wendy M.
Format: Journal Article
Language: English
Published: Oxford, UK: Blackwell Publishing Ltd, 01.09.2000
National Council on Measurement in Education
Summary: Item response theory scalings were conducted for six tests with mixed item formats. These tests differed in their proportions of constructed response (c.r.) and multiple choice (m.c.) items and in overall difficulty. The scalings included those based on scores for the c.r. items that maintained the same number of levels as the item rubrics, produced either from single ratings or from multiple ratings that were averaged and rounded to the nearest integer, as well as scalings for a single form of c.r. items obtained by summing multiple ratings. A one-parameter (1PPC) or two-parameter (2PPC) partial credit model was used for the c.r. items, and the one-parameter logistic (1PL) or three-parameter logistic (3PL) model for the m.c. items. Item fit was substantially worse with the combined 1PL/1PPC model than with the 3PL/2PPC model, owing to the former's restrictive assumptions of no guessing on the m.c. items and equal item discrimination across items and item types. Because item discriminations actually varied, the 1PL/1PPC model produced estimates of item information that could be spuriously inflated for c.r. items with three or more score levels. Information for some items with summed ratings was overestimated by 300% or more under the 1PL/1PPC model. These inflated information values in turn yielded underestimated standard errors of ability estimates. The constraints posed by the restricted model suggest limitations on the testing contexts in which the 1PL/1PPC model can be accurately applied.
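The contrast the summary draws can be made concrete with the standard IRT item-information formulas (these are textbook expressions for dichotomous items, not code from the article): under the 3PL model, information depends on the discrimination `a` and guessing `c` parameters, whereas the 1PL model fixes `a = 1` and `c = 0`. A minimal sketch:

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response at ability theta:
    P = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def info_3pl(theta, a, b, c):
    """Fisher item information for the 3PL model:
    I = a^2 * (Q / P) * ((P - c) / (1 - c))^2."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return (a ** 2) * (q / p) * ((p - c) / (1.0 - c)) ** 2

def info_1pl(theta, b):
    """1PL information: discrimination fixed at 1, no guessing,
    so I reduces to P * Q."""
    p = p_3pl(theta, 1.0, b, 0.0)
    return p * (1.0 - p)

# Illustration: at theta = 0 for an item with b = 0, the 1PL model
# reports I = 0.25 regardless of the item's true a and c, while the
# 3PL value for a guessable item (c = 0.2) is smaller.
print(info_1pl(0.0, 0.0))            # 1PL information
print(info_3pl(0.0, 1.0, 0.0, 0.2))  # 3PL information with guessing
```

Because a test's total information is the sum of its item informations, forcing all discriminations equal (as the 1PL/1PPC model does) can misallocate information across items in exactly the way the study reports, inflating it for some item types and hence understating the standard errors of ability estimates.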
ISSN: 0022-0655, 1745-3984
DOI: 10.1111/j.1745-3984.2000.tb01084.x