Dealing with high-dimensional class-imbalanced datasets: Embedded feature selection for SVM classification
Published in | Applied soft computing Vol. 67; pp. 94 - 105 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 01.06.2018 |
Subjects | |
Summary: |
• Novel embedded feature selection approach for SVM for imbalanced data sets.
• Optimization is performed via Quasi-Newton and Armijo Search.
• Best classification performance is achieved in experiments on benchmark datasets.
In this work, we propose a novel feature selection approach designed to deal with two major issues in machine learning, namely class imbalance and high dimensionality. The proposed embedded strategy penalizes the cardinality of the feature set via the scaling factors technique, and is used with two support vector machine (SVM) formulations designed to deal with the class-imbalance problem, namely Cost-Sensitive SVM and Support Vector Data Description. The proposed concave formulations are solved via a Quasi-Newton update and Armijo line search. We performed experiments on 12 highly imbalanced microarray datasets using linear and Gaussian kernels; compared with the most well-known feature selection strategies, our approach achieved the highest average predictive performance. |
---|---|
ISSN: | 1568-4946 1872-9681 |
DOI: | 10.1016/j.asoc.2018.02.051 |
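
The summary names a Quasi-Newton update combined with an Armijo line search as the optimization scheme. As a rough illustration of that general pairing only, the following Python sketch applies a BFGS inverse-Hessian update with Armijo backtracking to a generic smooth objective. It is not the authors' formulation: the function names (`armijo_step`, `bfgs_armijo`), the BFGS variant, the parameter defaults, and the toy quadratic are all assumptions made for this example, and the record does not specify which Quasi-Newton update the paper uses.

```python
import numpy as np


def armijo_step(f, x, fx, g, d, alpha0=1.0, beta=0.5, c=1e-4, max_halvings=50):
    """Backtrack from alpha0 until the Armijo sufficient-decrease condition holds:
    f(x + alpha*d) <= f(x) + c * alpha * (g . d)."""
    slope = float(g @ d)  # directional derivative; negative for a descent direction
    alpha = alpha0
    for _ in range(max_halvings):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            return alpha
        alpha *= beta  # shrink the step and try again
    return alpha


def bfgs_armijo(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize a smooth function with BFGS updates and Armijo backtracking.
    Hypothetical helper for illustration; not taken from the paper."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)  # running approximation of the inverse Hessian
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g  # quasi-Newton search direction
        if g @ d >= 0:  # safeguard: fall back to steepest descent
            H = np.eye(n)
            d = -g
        alpha = armijo_step(f, x, f(x), g, d)
        s = alpha * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = float(s @ y)
        if sy > 1e-12:  # update only when the curvature condition holds
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x


# Toy usage on a convex quadratic 0.5*x^T A x - b^T x (assumed example data).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min = bfgs_armijo(lambda x: 0.5 * x @ A @ x - b @ x,
                    lambda x: A @ x - b,
                    np.zeros(2))
```

On this toy problem the result should agree with the solution of the linear system `A x = b`. The objectives described in the summary are concave penalized SVM formulations, so a scheme of this kind would at best reach a local solution there.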