Assessing Scientific Practices Using Machine-Learning Methods: How Closely Do They Match Clinical Interview Performance?

Bibliographic Details
Published in	Journal of science education and technology Vol. 23; no. 1; pp. 160 - 182
Main Authors	Beggrow, Elizabeth P., Ha, Minsu, Nehm, Ross H., Pearl, Dennis, Boone, William J.
Format	Journal Article
Language	English
Published	Dordrecht: Springer Science+Business Media, 01.02.2014; Springer Netherlands; Springer; Springer Nature B.V
Subjects	Artificial Intelligence; Biological evolution; Biology; Cactus; Computers; Education; Educational Assessment; Educational research; Educational Technology; Elementary Secondary Education; Evolution; Freeware; Goodness of Fit; Interviews; Item Response Theory; Learning algorithms; Machine learning; Methods; Modeling; Multiple choice; Multiple Choice Tests; Natural selection; Open Source Technology; Rasch model; Reasoning; Science Education; Science Instruction; Science Tests; Scoring; Snails; Source code; Student Evaluation; Students; Teaching Methods; Undergraduate Students; Verbal Tests; Evaluation methodologies; Teaching/learning strategies; Applications in subject areas; Pedagogical issues; Improving classroom teaching
ISSN	1059-0145 1573-1839
DOI	10.1007/s10956-013-9461-9

Summary:	The landscape of science education is being transformed by the new Framework for Science Education (National Research Council, A framework for K-12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices—such as explanation, argumentation, and communication—in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63). 
Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures; (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations, and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new Framework for Science Education.
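
The fit statistics quoted in the summary can be illustrated with a minimal sketch. It assumes the dichotomous Rasch model (responses scored 0 or 1) and treats person abilities and item difficulty as already known, whereas Rasch software estimates them from the data; the record does not say which Rasch variant or program the authors used. The function names and the sample values are hypothetical, chosen only for illustration.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct (1) response under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def outfit_mean_square(responses, thetas, b):
    """Outfit mean square for one item.

    This is the average squared standardized residual across persons:
    (observed - expected)^2 / variance, where variance = p * (1 - p).
    Values near 1 indicate responses about as noisy as the model expects.
    """
    total = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)


# Hypothetical data: five persons answering one item of difficulty 0.5 logits.
abilities = [-1.0, -0.2, 0.4, 1.1, 2.0]
scores = [0, 0, 1, 1, 1]
print(round(outfit_mean_square(scores, abilities, 0.5), 2))
```

An unexpected response, such as a high-ability person missing an easy item, produces a large standardized residual and pushes the outfit value well above 1, which is the pattern behind the poor fit (mean square outfit >1.3) reported for the multiple-choice test.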