Towards the generation of machine learning defect reports

Effective locating and fixing defects requires detailed defect reports. Unlike traditional software systems, machine learning applications are subject defects caused from changes in the input data streams (concept drift) and assumptions encoded into models. Without appropriate training, developers f...

Full description

Saved in:
Bibliographic Details
Published inIEEE/ACM International Conference on Automated Software Engineering : [proceedings] pp. 1038 - 1042
Main Author Lai, Tuan Dung
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.11.2021
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Effective locating and fixing defects requires detailed defect reports. Unlike traditional software systems, machine learning applications are subject defects caused from changes in the input data streams (concept drift) and assumptions encoded into models. Without appropriate training, developers face difficulties understanding and interpreting faults in machine learning (ML). However, little research is done on how to prepare developers to detect and investigate machine learning system defects. Software engineers often do not have sufficient knowledge to fix the issues themselves without the help of data scientists or domain experts. To investigate this issue, we analyse issue templates and check how developers report machine learning related issues in open-source applied AI projects. The overall goal is to develop a tool for automatically repairing ML defects or generating defect reports if a fix cannot be made. Previous research has identified classes of faults specific to machine learning systems, such as performance degradation arising from concept drift where the machine learning model is no longer aligned with the real-world environment. However, the current issue templates that developers use do not seem to capture the information needed. This research seeks to systematically develop a two-way human-machine information exchange protocol to support domain experts, software engineers, and data scientists to collaboratively detect, report, and respond to these new classes of faults.
ISSN:2643-1572
DOI:10.1109/ASE51524.2021.9678592