Unsupervised soft-label feature selection

Bibliographic Details
Published in: Knowledge-Based Systems, Vol. 219; p. 106847
Main Authors: Wang, Fei; Zhu, Lei; Li, Jingjing; Chen, Haibao; Zhang, Huaxiang
Format: Journal Article
Language: English
Published: Amsterdam, Elsevier B.V., 11.05.2021
Elsevier Science Ltd
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2021.106847

Summary: Unsupervised feature selection is an important task in many research fields. Selecting discriminative features in the unsupervised setting is difficult because no labels are available to guide the selection. Recent works use pseudo labels as guidance. However, they generate the pseudo labels from the original feature space, where noise, redundancy, and outliers may degrade pseudo-label quality. They also ignore data fuzziness and use hard labels as the semantic supervision for feature selection, so the selected features suffer from significant information loss and a lack of semantic information. To tackle these problems, we propose an effective Unsupervised Soft-label Feature Selection (USFS) model, which learns soft labels and simultaneously uses them to guide the unsupervised feature selection process. Specifically, we project the data into a low-dimensional subspace, where an affinity matrix with a sparsity constraint is learned from local distances. This affinity matrix serves as the soft-label matrix and guides the final feature selection. A simple yet efficient optimization method is derived to solve the formulated problem iteratively. Experimental results on widely used benchmarks show that the proposed method outperforms state-of-the-art approaches. For reproducibility, the code and testing datasets are available at https://github.com/wang-feifei/USFS-code.
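
The pipeline outlined in the summary (subspace projection, sparse affinity from local distances, affinity used as soft labels to score features) can be illustrated with a rough sketch. This is not the USFS objective or the authors' optimization procedure, which the summary does not spell out. It swaps in three simple stand-ins, all of which are assumptions: PCA for the subspace, a closed-form k-nearest-neighbor affinity in the style of adaptive-neighbor graph learning for the sparse soft-label matrix, and ridge regression with row-norm ranking for the selection step. The function names and the parameters `dim`, `k`, and `lam` are hypothetical.

```python
import numpy as np

def pca_project(X, dim):
    """Project centered data onto its top `dim` principal directions (stand-in subspace)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:dim].T

def sparse_affinity(Z, k):
    """Affinity with at most k nonzeros per row, built from squared local distances.

    Each row is non-negative and sums to 1, so it can be read as a soft label.
    Requires at least k + 2 samples.
    """
    n = Z.shape[0]
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(D, np.inf)                 # a sample is not its own neighbor
    idx = np.argsort(D, axis=1)[:, :k + 1]      # k + 1 nearest neighbors per sample
    S = np.zeros((n, n))
    for i in range(n):
        d = D[i, idx[i]]                        # ascending distances
        denom = k * d[k] - d[:k].sum()
        if denom > 1e-12:
            S[i, idx[i, :k]] = (d[k] - d[:k]) / denom
        else:                                   # all k + 1 distances equal
            S[i, idx[i, :k]] = 1.0 / k
    return S

def rank_features(X, S, lam=1.0):
    """Regress the soft labels on the features (ridge) and rank features by row norm."""
    Xc = X - X.mean(axis=0)
    d = X.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ S)
    scores = np.linalg.norm(W, axis=1)          # one score per original feature
    return np.argsort(-scores), scores

# Illustrative usage on random data (assumed shapes: 40 samples, 8 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 8))
S_demo = sparse_affinity(pca_project(X_demo, dim=3), k=5)
order_demo, scores_demo = rank_features(X_demo, S_demo)
```

Unlike this one-pass sketch, the summary describes learning the soft labels and the feature selection jointly through an iterative solver; the repository linked above is the reference for the actual method.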