Learning From Incomplete and Inaccurate Supervision

Bibliographic Details
Published in	IEEE Transactions on Knowledge and Data Engineering, Vol. 34, no. 12, pp. 5854-5868
Main Authors	Zhang, Zhen-Yu; Zhao, Peng; Jiang, Yuan; Zhou, Zhi-Hua
Format	Journal Article
Language	English
Published	New York: IEEE, 01.12.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Cognitive tasks; Computer bugs; Defects; Flaw detection; Noise measurement; noisy label learning; semi-supervised learning; Semisupervised learning; Software; Supervised learning; Supervision; Task analysis; Training data; Weakly supervised learning
Summary:	In many real-world tasks, strongly supervised information is hard to obtain, and weakly supervised learning has therefore drawn considerable attention recently. This paper investigates the problem of learning from incomplete and inaccurate supervision, where only a limited subset of the training data is labeled, and those labels are potentially noisy. This setting is challenging and of great importance but rarely studied in the literature. We notice that in many applications the limited labeled data have certain structures, which paves the way for designing effective methods. Specifically, we observe that labeled data often have one-sided noise, as in the bug detection task, where code identified as buggy does indeed contain defects, while code that has been checked many times or newly fixed may still have other flaws. Furthermore, when two-sided noise occurs in the labeled data, we exploit the class-prior information of the unlabeled data, which is typically available in practical tasks. We propose novel approaches for learning from incomplete and inaccurate supervision that effectively alleviate the negative influence of label noise with the help of a large amount of unlabeled data. Both theoretical analysis and extensive experiments justify and validate the effectiveness of the proposed approaches.
ISSN:	1041-4347, 1558-2191
DOI:	10.1109/TKDE.2021.3061215