DLTA: A Framework for Dynamic Crowdsourcing Classification Tasks

Bibliographic Details
Published in	IEEE Transactions on Knowledge and Data Engineering, Vol. 31, no. 5, pp. 867-879
Main Authors	Zheng, Libin; Chen, Lei
Format	Journal Article
Language	English
Published	New York: IEEE, 01.05.2019; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Adaptation models; Budgeting; Budgets; Classification; Classification crowdsourcing; Crowdsourcing; Inference; label acquisition; label inference; Labeling; Plugs; Quality control; Reliability; Resource management; Task analysis
Summary:	The increasing popularity of crowdsourcing markets enables the application of crowdsourcing classification tasks. Conducting quality control in such applications to obtain accurate classification results from noisy workers is an important and challenging task that has drawn broad research interest. However, most existing works do not exploit the label acquisition phase, which leaves them unable to allocate the budget properly. Moreover, some works impractically assume that workers can be managed, which is not supported by common crowdsourcing platforms such as AMT or CrowdFlower. To overcome these drawbacks, in this paper we devise a Dynamic Label Acquisition and Answer Aggregation (DLTA) framework for crowdsourcing classification tasks. The framework proceeds in a sequence of rounds, adaptively conducting label inference and label acquisition. In each round, it analyzes the answers collected in previous rounds to perform proper budget allocation, and then issues the resulting query to the crowd. To support DLTA, we propose a generative model for the collection of labels, along with corresponding strategies for label inference and budget allocation. Experimental results show that, compared with existing methods, DLTA obtains competitive accuracy in the binary case. In addition, its extended version, which plugs in the state-of-the-art inference technique, achieves the highest accuracy.
ISSN:	1041-4347, 1558-2191
DOI:	10.1109/TKDE.2018.2849385
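
Illustrative note: the summary describes a loop that alternates label inference and budget allocation over a sequence of rounds. The sketch below shows only that general control flow and should not be read as the DLTA method itself. It assumes majority voting in place of the paper's generative model, and it assumes a simple vote-share confidence score to decide which tasks receive the next answers. All names (`infer_labels`, `confidence`, `allocate`, `run_rounds`, `ask_crowd`) are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Answers = Dict[Hashable, List[int]]


def infer_labels(answers: Answers) -> Dict[Hashable, Optional[int]]:
    """Majority vote per task (an assumed stand-in for model-based inference).

    Ties go to the smallest label; tasks without answers get None.
    """
    labels: Dict[Hashable, Optional[int]] = {}
    for task, votes in answers.items():
        if not votes:
            labels[task] = None
            continue
        counts = Counter(votes)
        top = max(counts.values())
        labels[task] = min(label for label, c in counts.items() if c == top)
    return labels


def confidence(votes: List[int]) -> float:
    """Share of votes held by the leading label; 0.0 when there are no votes."""
    if not votes:
        return 0.0
    return max(Counter(votes).values()) / len(votes)


def allocate(answers: Answers, k: int) -> List[Hashable]:
    """Pick the k tasks whose current answers are least conclusive.

    Tasks are ranked by confidence, then by number of answers collected.
    """
    ranked = sorted(answers, key=lambda t: (confidence(answers[t]), len(answers[t])))
    return ranked[:k]


def run_rounds(
    tasks: List[Hashable],
    ask_crowd: Callable[[Hashable], int],
    budget: int,
    per_round: int,
) -> Tuple[Dict[Hashable, Optional[int]], Answers]:
    """Alternate allocation and acquisition until the budget is spent."""
    answers: Answers = {t: [] for t in tasks}
    while budget > 0:
        query = allocate(answers, min(per_round, budget))
        if not query:
            break
        for task in query:
            # One crowd answer costs one unit of budget (an assumption).
            answers[task].append(ask_crowd(task))
        budget -= len(query)
    return infer_labels(answers), answers


if __name__ == "__main__":
    truth = {"a": 1, "b": 0, "c": 1}
    labels, collected = run_rounds(list(truth), lambda t: truth[t], budget=7, per_round=3)
    print(labels, collected)
```

Here `ask_crowd` stands for whatever call posts a single task to the platform and returns one worker's answer. In this sketch, each round re-ranks the tasks using every answer collected so far, so later rounds spend the remaining budget on tasks whose votes are still split.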