Mitigating the Impact of Labeling Errors on Training via Rockafellian Relaxation

Bibliographic Details
Published in: arXiv.org
Main Authors: Chen, Louis L; Chern, Bobbie; Eckstrand, Eric; Mahapatra, Amogh; Royset, Johannes O
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 30.05.2024

Summary: Labeling errors in datasets are common, if not systematic, in practice. They arise naturally in a variety of contexts, for example human labeling, noisy labeling, and weak labeling (e.g., in image classification). This places a persistent and pervasive stress on machine learning practice. In particular, neural network (NN) architectures can withstand minor amounts of dataset imperfection with traditional countermeasures such as regularization, data augmentation, and batch normalization. However, major dataset imperfections often prove insurmountable. We propose and study the implementation of Rockafellian Relaxation (RR), a new loss-reweighting, architecture-independent methodology for neural network training. Experiments indicate that RR can enhance standard neural network methods to achieve robust performance across classification tasks in computer vision and natural language processing (sentiment analysis). We find that RR can mitigate the effects of dataset corruption due to (heavy) labeling error, adversarial perturbation, or both, demonstrating effectiveness across a variety of data domains and machine learning tasks.
ISSN: 2331-8422