The Role of Robust Software in Automated Scoring

Bibliographic Details
Published in	Advancing Natural Language Processing in Educational Assessment, pp. 3-14
Main Authors	Madnani, Nitin; Cahill, Aoife; Loukina, Anastassia
Format	Book Chapter
Language	English
Published	Routledge 2023
Edition	1
Subjects	Automated Scoring High Stakes Applications Scoring Models Open Source Software NLP Code Review Process Scoring Engines ASR Engine Updates Machine Learning Models Automated Scoring System Pull Request Automated Scoring Engines NBME Score Users NCME Version Control Code Review Automatic Item Generation NLP Algorithm Violate Prediction Pipelines Scoring Applications Merge Request Version Control Software
Summary:	Automated scoring systems are software applications that rely heavily on machine learning (ML) and natural language processing (NLP). Existing literature on automated scoring focuses on functionality and evaluation metrics. This chapter discusses the role of software robustness as another important dimension of automated scoring. Our intended audience is measurement scientists and psychometricians who are generally not exposed to the technical aspects of software development. This chapter describes the motivation for writing robust software for automated scoring and provides a brief introduction to the processes by which such software may be developed and deployed. The advantage of continuously updating the engine once proposed changes have been reviewed is that it is easy to respond to user feedback. A more conservative approach is to update the engine rarely and on a fixed schedule agreed on well in advance by all stakeholders. The chapter presents four important elements of best practice: comprehensive testing, version control, reproducibility, and code review. Making the code and the models available for inspection to all stakeholders, including test-takers, is the ultimate way to ensure transparency and fairness. Comprehensive documentation should include not only detailed comments in the codebase but also stand-alone documentation describing the architecture and the detailed workings of the application.
ISBN:	9781032244525 9781032203904 1032244526 1032203900
DOI:	10.4324/9781003278658-2