Active Learning With Realistic Data - A Case Study
Published in | 2018 International Joint Conference on Neural Networks (IJCNN) pp. 1 - 8 |
---|---|
Main Authors | , , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.07.2018 |
Summary: | Machine learning systems learn from data, and those data are often grouped into categories (also called classes). One goal of a learning algorithm is therefore to learn how to assign new, previously unseen data to the correct class. In many cases, however, data are available or can be gathered at low cost, whereas acquiring the corresponding class is expensive. This is where active learning (AL) comes into its own: to reduce annotation costs, it lets the learning system deliberately select the samples that should be annotated. The selected samples are then presented to an entity (e.g., humans or simulation systems), generally called an oracle, that provides the corresponding classes. Afterwards, the learner's knowledge base is updated and, depending on a stopping criterion, further labels are queried or not. Such a system is aware of its own imperfection and therefore uses a selection strategy to determine the next most informative sample. So far, AL has been shown to be a powerful paradigm that delivers the desired results, provided that the oracles are omniscient. Human oracles, however, are prone to error, which raises the question: does AL still work with error-prone and uncertain human annotators? This article presents the results of an active learning case study conducted on 30,000 images labeled by two humans and proposes a new type of labeling for better solving AL problems with error-prone annotators. |
---|---|
ISSN: | 2161-4407 |
DOI: | 10.1109/IJCNN.2018.8489394 |
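
The query loop described in the summary (select a sample, ask the oracle, update the learner, check a stopping criterion) can be sketched roughly as follows. This is a minimal illustration, not the method of the paper: the record does not state which selection strategy, classifier, or labeling scheme the authors used. The sketch assumes a 1-D pool whose true class is `x > 0.5`, a nearest-centroid learner, margin-based uncertainty sampling, a fixed query budget as the stopping criterion, and a hypothetical `noisy_oracle` that flips labels with a given probability to mimic an error-prone annotator.

```python
import random


def noisy_oracle(x, error_rate, rng):
    """Hypothetical annotator: true class is x > 0.5 (assumption),
    flipped with probability error_rate."""
    true_label = int(x > 0.5)
    return 1 - true_label if rng.random() < error_rate else true_label


def centroids(labeled):
    """Mean of the labeled samples of each class (nearest-centroid learner)."""
    result = {}
    for cls in (0, 1):
        xs = [x for x, y in labeled if y == cls]
        result[cls] = sum(xs) / len(xs)
    return result


def predict(cents, x):
    """Assign x to the class with the closest centroid."""
    return min(cents, key=lambda cls: abs(x - cents[cls]))


def margin(cents, x):
    """Small margin = sample is almost equally close to both centroids."""
    return abs(abs(x - cents[0]) - abs(x - cents[1]))


def active_learning(pool, budget, error_rate=0.0, seed=0):
    """Pool-based AL loop with a fixed query budget as stopping criterion."""
    rng = random.Random(seed)
    pool = list(pool)
    # Assumption: the smallest and largest pool values are correctly
    # labeled seeds for class 0 and class 1, so both centroids exist.
    low, high = min(pool), max(pool)
    pool.remove(low)
    pool.remove(high)
    labeled = [(low, 0), (high, 1)]
    for _ in range(budget):
        if not pool:
            break
        cents = centroids(labeled)
        # Selection strategy: query the sample the learner is least sure about.
        query = min(pool, key=lambda x: margin(cents, x))
        pool.remove(query)
        labeled.append((query, noisy_oracle(query, error_rate, rng)))
    return centroids(labeled), labeled


if __name__ == "__main__":
    pool = [i / 100 for i in range(101)]
    for rate in (0.0, 0.3):
        cents, labeled = active_learning(pool, budget=20, error_rate=rate)
        print(f"error_rate={rate}: centroids={cents}")
```

Running the loop with `error_rate=0.0` corresponds to the omniscient-oracle setting, while a positive `error_rate` gives a crude stand-in for the error-prone human annotators the summary asks about; comparing the resulting centroids shows how label noise can distort what the learner picks up from its queries.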