A Mixed Integer Linear Programming Support Vector Machine for Cost-Effective Group Feature Selection: Branch-Cut-and-Price Approach

Bibliographic Details
Published in European journal of operational research Vol. 299; no. 3; pp. 1055-1068
Main Authors Lee, In Gyu; Yoon, Sang Won; Won, Daehan
Format Journal Article
Language English
Published Elsevier B.V. 16.06.2022
Summary:
• A cost-effective 1-norm SVM model with group feature selection and its robust model are proposed.
• A BCP algorithm is developed to efficiently solve the proposed feature selection models.
• The proposed feature selection model can improve economic and predictive performance.
• The robust model provides a feasible solution by identifying a solution immune to cost uncertainty.
• The BCP algorithm can rapidly find optimal solutions for large-scale problems.

Cost-based feature selection has recently received significant attention because it can achieve good prediction accuracy at minimal feature acquisition cost. To further improve predictive and economic performance, this research proposes a cost-effective 1-norm support vector machine with group feature selection, called GFS-CESVM1. Its robust counterpart, GFS-RCESVM1, is also introduced to address uncertainty in the costs of features and feature groups, since cost variation is common in real-world problems. Both models are formulated as Mixed Integer Linear Programming (MILP) problems. To solve them efficiently, the authors develop a Branch-Cut-and-Price (BCP) algorithm that considers only a limited number of variables and/or constraints at a time, which leads to rapid convergence to an optimal solution. Experimental results on benchmark and synthetic datasets demonstrate that GFS-CESVM1 achieves competitive outcomes by considering not only the evaluation of individual features but also the group structure among features. GFS-RCESVM1 identifies a subset of features that is immune to cost uncertainty and therefore provides solutions that remain feasible and optimal. Furthermore, the BCP algorithm substantially outperforms the ordinary branch-and-bound (BB) algorithm, reaching better objective values and integrality gaps within a short period of time.
ISSN: 0377-2217, 1872-6860
DOI: 10.1016/j.ejor.2021.12.030
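
Illustrative note: the summary describes a 1-norm SVM with cost-aware feature and group selection formulated as an MILP, but this record does not reproduce the formulation. The block below is only a generic sketch of how such a model is commonly written, not the authors' GFS-CESVM1. Every symbol in it is an assumption introduced for illustration: binary indicators $z_j$ (feature $j$ selected) and $g_k$ (group $k$ acquired), costs $c_j$ and $d_k$, a budget $B$, a big-M constant $M$, a trade-off parameter $C$, and a map $k(j)$ from each feature to its group.

```latex
% Generic cost-budgeted 1-norm SVM with group indicators (illustrative only;
% all symbols are assumptions, not taken from the paper).
\begin{aligned}
\min_{w,\,b,\,\xi,\,z,\,g}\quad
  & \sum_{j=1}^{n} \lvert w_j \rvert \;+\; C \sum_{i=1}^{m} \xi_i \\
\text{s.t.}\quad
  & y_i \Big( \sum_{j=1}^{n} w_j x_{ij} + b \Big) \;\ge\; 1 - \xi_i,
      && i = 1,\dots,m, \\
  & -M z_j \;\le\; w_j \;\le\; M z_j,
      && j = 1,\dots,n, \\
  & z_j \;\le\; g_{k(j)},
      && j = 1,\dots,n, \\
  & \sum_{j=1}^{n} c_j z_j \;+\; \sum_{k=1}^{K} d_k g_k \;\le\; B, \\
  & \xi_i \ge 0, \quad z_j \in \{0,1\}, \quad g_k \in \{0,1\}.
\end{aligned}
```

In a sketch like this, the objective becomes linear once each weight is split as $w_j = w_j^{+} - w_j^{-}$ with $w_j^{+}, w_j^{-} \ge 0$ and $\lvert w_j \rvert$ is replaced by $w_j^{+} + w_j^{-}$. The big-M constraints force $w_j = 0$ whenever feature $j$ is not selected, and $z_j \le g_{k(j)}$ allows a feature to be used only if its group has been acquired.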