Feature Selection Based on Structured Sparsity: A Comprehensive Study

Bibliographic Details
Published in	IEEE Transactions on Neural Networks and Learning Systems Vol. 28; no. 7; pp. 1490 - 1507
Main Authors	Gui, Jie, Sun, Zhenan, Ji, Shuiwang, Tao, Dacheng, Tan, Tieniu
Format	Journal Article
Language	English
Published	United States IEEE 01.07.2017 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithm design and analysis; Algorithms; Categories; Classification; Clustering; Clustering algorithms; Computational modeling; Dimensionality reduction; Feature recognition; feature selection; Formulations; Machine learning; Machine learning algorithms; Mathematical analysis; Matrices (mathematics); Matrix methods; Pattern recognition; Product acceptance; Production planning; Regression analysis; Regularization; Representations; Robustness; sparse; Sparsity; structured sparsity; Sun; Taxonomy
Summary:	Feature selection (FS) is an important component of many pattern recognition tasks. In these tasks, one is often confronted with very high-dimensional data. FS algorithms are designed to identify the relevant feature subset from the original features, which can facilitate subsequent analysis, such as clustering and classification. Structured sparsity-inducing feature selection (SSFS) methods have been widely studied in the last few years, and a number of algorithms have been proposed. However, there is no comprehensive study concerning the connections between different SSFS methods and how they have evolved. In this paper, we attempt to provide a survey of various SSFS methods, including their motivations and mathematical representations. We then explore the relationships among different formulations and propose a taxonomy to elucidate their evolution. We group the existing SSFS methods into two categories, i.e., vector-based feature selection (feature selection based on lasso) and matrix-based feature selection (feature selection based on the ℓ_{r,p}-norm). Furthermore, FS has been combined with other machine learning algorithms for specific applications, such as multitask learning, multilabel learning, multiview learning, classification, and clustering. This paper not only compares the differences and commonalities of these methods based on regression and regularization strategies, but also provides useful guidelines to practitioners working in related fields on how to perform feature selection.
ISSN:	2162-237X, 2162-2388
DOI:	10.1109/TNNLS.2016.2551724
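
Illustration (not part of the catalog record): the summary groups matrix-based methods under an ℓ_{r,p}-norm regularizer. The sketch below shows one common reading of that quantity, assuming a hypothetical weight matrix `W` with one row per feature and one column per task or class: take the ℓ_r norm of each row, then the ℓ_p norm of the resulting vector. The function names `lrp_norm` and `select_features` are invented for this example, and the paper's own notation and orientation may differ.

```python
import numpy as np


def lrp_norm(W, r=2, p=1):
    """Assumed definition: l_r norm of each row, then l_p norm of the row norms."""
    row_norms = np.linalg.norm(W, ord=r, axis=1)
    return np.linalg.norm(row_norms, ord=p)


def select_features(W, k):
    """Rank features by the l_2 norm of their row in W and keep the top k."""
    scores = np.linalg.norm(W, ord=2, axis=1)
    return np.argsort(scores)[::-1][:k]


# Hypothetical 3-feature, 2-task weight matrix; the second row is all zeros,
# which is the row-sparse pattern an l_{2,1} penalty tends to encourage.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

print(lrp_norm(W))            # l_{2,1} value: sum of the row l_2 norms
print(select_features(W, 2))  # indices of the two highest-scoring features
```

With r = 2 and p = 1 this is the ℓ_{2,1} case, where a feature whose entire row is zero contributes nothing and is dropped for all tasks at once. The vector-based (lasso) category corresponds to the single-column case, where the same idea reduces to keeping features with nonzero coefficients.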