Unsupervised soft-label feature selection

Bibliographic Details
Published in	Knowledge-based systems Vol. 219; p. 106847
Main Authors	Wang, Fei; Zhu, Lei; Li, Jingjing; Chen, Haibao; Zhang, Huaxiang
Format	Journal Article
Language	English
Published	Amsterdam: Elsevier B.V., 11.05.2021; Elsevier Science Ltd
Subjects	Affinity; Dimension reduction; Feature selection; Fuzziness; Labels; Optimization; Outliers (statistics); Semantics; Soft-label; Unsupervised feature selection
ISSN	0950-7051 1872-7409
DOI	10.1016/j.knosys.2021.106847

Summary:	Unsupervised feature selection is an important task in various research fields. Selecting discriminative features is difficult in the unsupervised scenario because no labels are available for guidance. Recent works employ pseudo labels to guide feature selection. However, they generate the pseudo labels from the original feature space, where noise, redundancy and outliers may degrade pseudo-label quality. Moreover, they ignore data fuzziness and use hard labels as the semantic supervision for feature selection, so the selected features suffer from significant information loss and a shortage of semantics. To tackle these problems, we propose an effective Unsupervised Soft-label Feature Selection (USFS) model, which performs soft-label learning and simultaneously guides the unsupervised feature selection process with the learned soft labels. Specifically, we transform the data into a low-dimensional subspace, where an affinity matrix with a sparse constraint is learned from local distances. The affinity matrix serves as the soft-label matrix and is further employed to guide the final feature selection process. A simple yet efficient optimization method is derived to solve the formulated problem iteratively. Experimental results on widely used benchmarks demonstrate the superiority of the proposed method over state-of-the-art approaches. For reproducibility, we provide the code and testing datasets at https://github.com/wang-feifei/USFS-code.
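
The pipeline outlined in the summary (project the data, learn a sparse affinity matrix from local distances, use it as soft labels to rank features) can be illustrated with a rough sketch. This is not the authors' USFS implementation, which is available at the repository linked above. Every concrete choice here is an assumption made for illustration: a fixed PCA projection stands in for the learned subspace, a k-nearest-neighbour closed form stands in for the sparse affinity learning, and a ridge regression with row-norm scoring stands in for the joint optimization. The names `sparse_affinity`, `select_features`, `k`, `ridge` and `n_select` are hypothetical.

```python
import numpy as np


def sparse_affinity(Z, k=5):
    """Row-normalised affinity with at most k nonzeros per row.

    Assumed stand-in for the paper's sparse affinity learning: each sample
    gets weights on its k nearest neighbours that shrink linearly with the
    squared distance. Requires at least k + 2 samples.
    """
    n = Z.shape[0]
    # Squared Euclidean distances between all pairs of projected samples.
    sq = np.sum(Z ** 2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    np.fill_diagonal(D, np.inf)  # a sample is never its own neighbour
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[: k + 1]  # k nearest neighbours plus the (k+1)-th
        d = D[i, idx]
        denom = k * d[k] - d[:k].sum()
        if denom <= 1e-12:
            S[i, idx[:k]] = 1.0 / k  # all neighbours tie: use uniform weights
        else:
            S[i, idx[:k]] = (d[k] - d[:k]) / denom
    return S


def select_features(X, n_components=2, k=5, ridge=1.0, n_select=3):
    """Rank features by how strongly they help predict the soft labels."""
    Xc = X - X.mean(axis=0)
    # Assumed projection: top principal directions instead of a learned subspace.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    # The affinity matrix plays the role of the soft-label matrix.
    S = sparse_affinity(Z, k)
    # Ridge regression Xc @ W ~ S; one row of W per original feature.
    n_features = X.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(n_features), Xc.T @ S)
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:n_select], S


# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
selected, S_demo = select_features(X_demo, n_components=2, k=5, n_select=3)
print("selected feature indices:", selected)
```

Because each row of the affinity matrix sums to one and has only a few nonzero entries, it behaves like a soft assignment rather than a hard cluster label, which is the property the summary emphasises. The row-norm scores depend on feature scale, so standardising the columns of the input first is a sensible precaution in this simplified version.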