Addressing Imbalance in Weakly Supervised Multi-Label Learning

Bibliographic Details
Published in: IEEE Access, Vol. 7, pp. 37463-37472
Main Authors: Luo, Fang-Fang; Guo, Wen-zhong; Chen, Guo-Long
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2019
Summary: Multi-label learning has been widely used in many fields to solve the problem of assigning multiple related categories to an instance. Nevertheless, most current multi-label learning methods assume that the label set of each training example is complete. In practice, it is often hard to obtain training samples with complete labels, so weakly supervised multi-label learning is needed and has become a hot topic in recent years. Moreover, missing labels further aggravate the class imbalance inherent in multi-label learning. In this paper, an asymmetric stage-wise loss function is introduced to push positive-class samples farther from the classification boundary than negative-class samples by adjusting the ramp and margin parameters. In addition, the usual aggregate loss function is replaced with the average top-k aggregate loss, which protects non-typically distributed samples from being sacrificed during loss aggregation and thereby improves the identification accuracy of minority labels. Experiments on both standard and large-scale multi-label data sets demonstrate that the proposed algorithm can address class imbalance by changing the sample distribution of the training set, thus obtaining competitive performance.
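The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, default margin and ramp values, and the plain hinge-style per-sample loss are assumptions chosen for illustration. It shows (a) an asymmetric ramp loss that gives positive samples a larger margin than negatives, and (b) average top-k aggregation, which averages only the k largest per-sample losses so that hard, often minority-class, samples are not averaged away.

```python
import numpy as np

def asymmetric_ramp_loss(scores, labels, pos_margin=1.0, neg_margin=0.5, ramp=2.0):
    """Hinge-like per-sample loss with a larger margin for positives
    (pushing them farther from the boundary) and values capped at `ramp`.
    `scores` are classifier outputs f(x); `labels` are in {0, 1}.
    All parameter defaults here are illustrative, not the paper's."""
    margins = np.where(labels == 1, pos_margin, neg_margin)
    signed = np.where(labels == 1, scores, -scores)  # signed margin y * f(x)
    return np.clip(margins - signed, 0.0, ramp)

def average_top_k_loss(per_sample_losses, k):
    """Average top-k aggregate loss: the mean of the k largest
    per-sample losses, instead of the mean over all samples."""
    losses = np.sort(np.asarray(per_sample_losses, dtype=float))
    return losses[-k:].mean()

# Toy example: a few hard samples among many easy ones.
per_sample = [0.05, 0.02, 0.9, 0.01, 0.8, 0.03]
print(average_top_k_loss(per_sample, k=2))  # 0.85: dominated by the hard samples
print(np.mean(per_sample))                  # ~0.30: hard samples averaged away
```

With k equal to the number of samples this reduces to the ordinary average loss, so k interpolates between the max loss (k = 1) and the mean, controlling how much weight atypical samples receive.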
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2906409