Active Learning With Realistic Data - A Case Study

Bibliographic Details
Published in: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors: Calma, Adrian; Stolz, Moritz; Kottke, Daniel; Tomforde, Sven; Sick, Bernhard
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2018

More Information
Summary: Machine learning systems learn from data. It is not rare that those data are grouped into categories (also called classes). Thus, one of the goals of a learning algorithm is to find out how to assign new, previously unknown data to the correct class. In many cases, however, data are available or can be gathered at low cost, whereas acquiring the corresponding category (or class) involves high costs. This is precisely where active learning (AL) comes into its own: in order to reduce annotation costs, it allows the learning system to deliberately select the samples that should be annotated. The selected samples are then presented to an entity (e.g., humans, simulation systems, etc.), generally referred to as an oracle, that provides the corresponding classes. Afterwards, the knowledge base of the learner is updated and, depending on a stopping criterion, new labels are queried or not. Such a system is self-aware of its own imperfection; thus, it uses a selection strategy to determine the next most informative sample. It has been shown that AL is a powerful paradigm that yields the desired results, provided that the oracles are omniscient. Human oracles, however, are prone to error, so we may ask ourselves: Does AL still work with error-prone and uncertain human annotators? In this article, we present the results of an active learning case study conducted on 30,000 images labeled by two humans and propose a new type of labeling to better solve AL problems with error-prone annotators.
ISSN: 2161-4407
DOI: 10.1109/IJCNN.2018.8489394
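
Note: The summary above describes the generic pool-based active learning loop (select an informative sample, query an oracle for its label, update the learner, check a stopping criterion). The sketch below illustrates such a loop using least-confidence uncertainty sampling with a scikit-learn classifier on synthetic data. It is an assumption for illustration only; the classifier, dataset, query budget, and selection strategy are placeholders and not the labeling scheme or setup studied in the paper.

# Minimal pool-based active learning sketch (uncertainty sampling).
# NOTE: illustrative assumptions only -- not the paper's method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for an unlabeled pool and an (error-free) oracle.
X_pool, y_oracle = make_classification(n_samples=2000, n_features=20,
                                        n_informative=10, n_classes=2,
                                        random_state=0)

# Start with a small labeled seed set.
labeled = list(rng.choice(len(X_pool), size=10, replace=False))
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

clf = LogisticRegression(max_iter=1000)
budget = 50  # stopping criterion: a fixed number of oracle queries

for _ in range(budget):
    clf.fit(X_pool[labeled], y_oracle[labeled])

    # Least-confidence selection: query the sample whose most probable
    # class has the lowest predicted probability.
    proba = clf.predict_proba(X_pool[unlabeled])
    confidence = proba.max(axis=1)
    query = unlabeled[int(np.argmin(confidence))]

    # "Ask the oracle" -- here the ground-truth label stands in for a
    # (possibly error-prone) human annotator.
    labeled.append(query)
    unlabeled.remove(query)

# Retrain on all queried labels and evaluate on the remaining pool.
clf.fit(X_pool[labeled], y_oracle[labeled])
print("accuracy on remaining pool:",
      clf.score(X_pool[unlabeled], y_oracle[unlabeled]))

In a realistic setting, as studied in the paper, the oracle call would go to a human annotator whose answers may be wrong or uncertain, which is exactly the situation the proposed labeling scheme is meant to address.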