Ensuring Dataset Quality for Machine Learning Certification
Format | Journal Article |
---|---|
Language | English |
Published | 03.11.2020 |
Summary: | The 10th IEEE International Workshop on Software Certification
(WoSoCer 2020). In this paper, we address the problem of dataset quality in the context of
Machine Learning (ML)-based critical systems. We briefly analyse the
applicability of some existing standards dealing with data and show that the
specificities of the ML context are neither properly captured nor taken into
account. As a first answer to this concerning situation, we propose a dataset
specification and verification process, and apply it to a signal recognition
system from the railway domain. In addition, we give a list of
recommendations for the collection and management of datasets. This work is one
step towards the dataset engineering process that will be required for ML to be
used in safety-critical systems. |
---|---|
DOI: | 10.48550/arxiv.2011.01799 |