Are fast labeling methods reliable? A case study of computer-aided expert annotations on microscopy slides
Main Authors | , , , , , , , , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 13.04.2020 |
Subjects | |
Summary: | Deep-learning-based pipelines have shown the potential to revolutionize
microscopy image diagnostics by providing visual augmentations to a trained
pathology expert. However, to match human performance, the methods rely on the
availability of vast amounts of high-quality labeled data, which poses a
significant challenge. To circumvent this, augmented labeling methods, also
known as expert-algorithm-collaboration, have recently become popular. However,
potential biases introduced by this operation mode and their effects on
training neural networks are not entirely understood. This work aims to shed
light on some of the effects by providing a case study for three pathologically
relevant diagnostic settings. Ten trained pathology experts performed a
labeling task, first without and later with computer-generated augmentation. To
investigate different biasing effects, we intentionally introduced errors to
the augmentation. Furthermore, we developed a novel loss function which
incorporates the experts' annotation consensus in the training of a deep
learning classifier. In total, the pathology experts annotated 26,015 cells on
1,200 images in this novel annotation study. Backed by this extensive data set,
we found that both the consensus of multiple experts and the deep learning
classifier accuracy were significantly increased in the computer-aided setting
compared with the unaided annotation. However, a significant percentage of the
deliberately introduced false labels was not identified by the experts.
Additionally, we showed that our loss function profited from multiple experts
and outperformed conventional loss functions. At the same time, systematic
errors did not lead to a deterioration of the trained classifier accuracy.
Furthermore, a classifier trained with annotations from a single expert with
computer-aided support can outperform the combined annotations from up to nine
experts. |
---|---|
DOI: | 10.48550/arxiv.2004.05838 |
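
The summary mentions a loss function that incorporates the experts' annotation consensus, but this record does not give its definition. As a rough illustration of the general idea only, and not the authors' formulation, the sketch below turns per-expert class votes into vote-fraction soft labels and scores classifier outputs with a soft-label cross-entropy. The function names `consensus_targets` and `soft_cross_entropy`, the vote encoding, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def consensus_targets(votes, num_classes):
    """Convert expert votes of shape (n_cells, n_experts) into soft labels.

    Each row of the result holds the fraction of experts that chose each class.
    This vote-fraction encoding is an assumption, not the paper's definition.
    """
    votes = np.asarray(votes)
    one_hot = np.eye(num_classes)[votes]  # (n_cells, n_experts, num_classes)
    return one_hot.mean(axis=1)           # (n_cells, num_classes)


def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and soft target distributions."""
    logits = np.asarray(logits, dtype=float)
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=1).mean())


# Hypothetical example: two cells, three experts, two classes.
votes = [[0, 0, 1],   # two experts chose class 0, one chose class 1
         [1, 1, 1]]   # unanimous vote for class 1
targets = consensus_targets(votes, num_classes=2)
loss = soft_cross_entropy(np.zeros((2, 2)), targets)
```

With such an encoding, cells on which the experts disagree contribute a softer target than unanimously labeled cells, which is one plausible way for a classifier to benefit from multiple annotators.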