Categorising the world into local climate zones: towards quantifying labelling uncertainty for machine learning models

Bibliographic Details
Published in	Journal of the Royal Statistical Society Series C: Applied Statistics, Vol. 73, no. 1, pp. 143-161
Main Authors	Hechinger, Katharina; Zhu, Xiao Xiang; Kauermann, Göran
Format	Journal Article
Language	English
Published	US: Oxford University Press, 11.01.2024
Subjects	stochastic expectation maximisation; mixture models; multiple labellers; expert evaluations; labelling uncertainty
Summary:	Image classification is often prone to labelling uncertainty. To generate suitable training data, images are labelled according to evaluations of human experts. This can result in ambiguities, which will affect subsequent models. In this work, we aim to model the labelling uncertainty in the context of remote sensing and the classification of satellite images. We construct a multinomial mixture model given the evaluations of multiple experts. This is based on the assumption that there is no ambiguity of the image class, but apparently in the experts’ opinion about it. The model parameters can be estimated by a stochastic expectation maximisation algorithm. Analysing the estimates gives insights into sources of label uncertainty. Here, we focus on the general class ambiguity, the heterogeneity of experts, and the origin city of the images. The results are relevant for all machine learning applications where image classification is pursued and labelling is subject to humans.
ISSN:	0035-9254, 1467-9876
DOI:	10.1093/jrsssc/qlad089
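
Illustration: the summary describes a multinomial mixture model over expert votes, fitted by stochastic expectation maximisation. The sketch below is not the authors' implementation; it is a minimal version under stated assumptions. It assumes the votes for each image are aggregated into a count vector over the classes, that each image has one latent true class, and that all experts share one voting distribution per latent class (so expert heterogeneity and city effects are ignored). The function name `stochastic_em` and the parameters `n_iter`, `burn_in`, and `smooth` are hypothetical.

```python
import numpy as np


def stochastic_em(votes, n_iter=200, burn_in=100, smooth=1e-2, seed=0):
    """Fit a multinomial mixture to vote counts with stochastic EM (sketch).

    votes: (n_images, n_classes) array, votes[i, c] = number of experts
           who assigned class c to image i. Assumes n_classes >= 2.
    Returns averaged mixture weights, averaged voting distributions
    (row k = distribution of votes given latent class k), and the
    posterior class probabilities from the final iteration.
    """
    rng = np.random.default_rng(seed)
    votes = np.asarray(votes, dtype=float)
    n, k = votes.shape

    # Assumption: start with components that favour "their own" class,
    # which also pins down the labelling of the mixture components.
    pi = np.full(k, 1.0 / k)
    theta = np.eye(k) + 0.5
    theta /= theta.sum(axis=1, keepdims=True)

    pi_sum = np.zeros(k)
    theta_sum = np.zeros((k, k))
    kept = 0

    for it in range(n_iter):
        # E-step: posterior of the latent class given the votes.
        # The multinomial coefficient is constant in k and cancels.
        log_post = np.log(pi) + votes @ np.log(theta).T
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

        # S-step: draw one latent class per image from its posterior.
        u = rng.random((n, 1))
        z = np.minimum((post.cumsum(axis=1) < u).sum(axis=1), k - 1)

        # M-step: complete-data estimates, lightly smoothed so that
        # no probability becomes exactly zero.
        onehot = np.eye(k)[z]
        pi = (onehot.sum(axis=0) + smooth) / (n + k * smooth)
        counts = onehot.T @ votes + smooth
        theta = counts / counts.sum(axis=1, keepdims=True)

        # Average the draws after burn-in to get point estimates.
        if it >= burn_in:
            pi_sum += pi
            theta_sum += theta
            kept += 1

    return pi_sum / kept, theta_sum / kept, post
```

Calling `stochastic_em(votes)` on an images-by-classes count matrix returns the estimated class shares, a class-by-class matrix whose off-diagonal entries indicate how often experts vote for a class other than the latent one, and per-image posterior probabilities. Sampling the latent class instead of using the soft posterior in the M-step is what distinguishes the stochastic variant from standard EM.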