Does Instruction Affect the Underlying Dimensionality of a Kinesiology Test?

Bibliographic Details
Published in: Journal of Applied Measurement, Vol. 17, No. 4, p. 393
Main Authors: Bezruczko, Nikolaus; Frank, Eva; Perkins, Kyle
Format: Journal Article
Language: English
Published: United States, 2016

Summary: Does effective instruction, which changes students' knowledge and possibly alters their cognitive functions, also affect the dimensionality of an achievement test? This question was examined by parameterizing kinesiology test items (n = 42) with a Rasch dichotomous model, followed by an investigation of dimensionality in a pre- and post-test quasi-experimental design. College students (n = 108) responded to the kinesiology achievement test items. The stability of item difficulties, gender differences, and the interaction of item content categories with dimensionality were then examined. In addition, a PCA/t-test protocol was implemented to examine dimensionality threats from the item residuals. Internal construct validity was investigated by regressing item content components on calibrated item difficulties. Measurement model item residuals were also investigated with statistical decomposition methods. In general, the results showed significant student achievement gains between pre- and post-testing, and dimensionality disturbances were relatively minor. The amount of unexpected item "shift" in an un-equated measurement dimension between pre- and post-testing was less than ten percent of the total items and was largely concentrated among several unrelated items. An unexpected finding was a residual cluster consisting of several items testing related technical content. Complicating interpretation, these items tended to appear near the end of the test, which implicates test position as a threat to measurement equivalence. Across the several methods used, the results did not tend to identify common threats and instead pointed to multiple sources of threat of varying prominence. These results suggest that conventional approaches to measurement equivalence that emphasize expedient overall procedures such as DIF, IRT, and factor analysis are probably capturing isolated sources of variability. Their implementation probably improves measurement equivalence but leaves substantial residual sources undetected.
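For context, the "Rasch dichotomous model" and the "item residuals" named in the abstract refer to standard Rasch-measurement quantities; the expressions below are a general sketch of that model and of the standardized residuals typically submitted to PCA, not formulas taken from the article itself.

$$P(X_{ni}=1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}, \qquad z_{ni} = \frac{x_{ni} - P_{ni}}{\sqrt{P_{ni}\,(1 - P_{ni})}}$$

Here $\theta_n$ is person $n$'s ability, $\delta_i$ is item $i$'s difficulty, $x_{ni}$ is the observed response, and $z_{ni}$ is the standardized residual whose inter-item structure is examined for dimensionality threats.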
ISSN: 1529-7713