Label noise and self-learning label correction in cardiac abnormalities classification


Bibliographic Details
Published in Physiological Measurement Vol. 43; no. 9; pp. 94001 - 94012
Main Authors Gallego Vázquez, Cristina, Breuss, Alexander, Gnarra, Oriella, Portmann, Julian, Madaffari, Antonio, Da Poian, Giulia
Format Journal Article
Language English
Published IOP Publishing 30.09.2022
Summary: Objective. Learning to classify cardiac abnormalities requires large, high-quality labeled datasets, which are difficult to obtain in medical applications. Small datasets from various sources are often aggregated to meet this requirement, resulting in a final dataset prone to label noise due to inter- and intra-observer variability and differing levels of expertise. It is well known that label noise can affect the performance and generalizability of trained models. In this work, we explore the impact of label noise and self-learning label correction on the classification of cardiac abnormalities in large heterogeneous datasets of electrocardiogram (ECG) signals. Approach. A state-of-the-art self-learning multi-class label correction method for image classification is adapted to learn a multi-label classifier for ECG signals. We evaluated performance using 5-fold cross-validation on the publicly available PhysioNet/Computing in Cardiology (CinC) 2021 Challenge data, with full and reduced sets of leads. Because the label noise in the testing set is unknown, we also tested our approach on the MNIST dataset. We investigated the performance under different levels of structured label noise for both datasets. Main results. Under high levels of noise, the cross-validation results of self-learning label correction show an improvement of approximately 3% in the challenge score for the PhysioNet/CinC 2021 Challenge dataset, and an improvement in accuracy of 5% and a reduction in the expected calibration error of 0.03 for the MNIST dataset. We demonstrate that self-learning label correction can be used to deal effectively with unknown label noise, including when using a reduced number of ECG leads.
Bibliography: PMEA-104558.R5
ISSN: 0967-3334
1361-6579
DOI: 10.1088/1361-6579/ac89cb
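
Note on terms used in the summary: the record does not say how structured label noise was generated or how the expected calibration error was binned. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes that "structured" noise means flipping a label to one fixed, confusable class with a given probability, and that the calibration error uses equal-width confidence bins. The names `add_structured_noise`, `expected_calibration_error`, and `confusion_map` are hypothetical.

```python
import random


def add_structured_noise(labels, confusion_map, noise_rate, seed=0):
    """Flip labels to a fixed 'confusable' class with probability noise_rate.

    Assumption: structured noise is class-dependent flipping. Labels that
    do not appear in confusion_map are never changed.
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if y in confusion_map and rng.random() < noise_rate:
            noisy.append(confusion_map[y])
        else:
            noisy.append(y)
    return noisy


def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    """Weighted mean gap between accuracy and confidence over confidence bins.

    Assumption: equal-width bins [k/n_bins, (k+1)/n_bins), with a confidence
    of exactly 1.0 placed in the last bin.
    """
    n = len(labels)
    bins = [[] for _ in range(n_bins)]
    for conf, pred, y in zip(confidences, predictions, labels):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, pred == y))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    clean = [1, 7, 3, 7, 1, 0]
    # Hypothetical confusable pairs for handwritten digits.
    noisy = add_structured_noise(clean, {1: 7, 7: 1}, noise_rate=0.4, seed=42)
    print(noisy)
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [1, 1, 1, 1],
                                     [1, 1, 1, 0]))
```

In this sketch a perfectly calibrated model would score 0, so a reduction in this quantity means predicted confidences track observed accuracy more closely.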