Classification model selection via bilevel programming

Bibliographic Details
Published in: Optimization Methods & Software, Vol. 23, No. 4, pp. 475-489
Main Authors: Kunapuli, G.; Bennett, K.P.; Hu, Jing; Pang, Jong-Shi
Format: Journal Article
Language: English
Published: Taylor & Francis, 01.08.2008

Summary: Support vector machines and related classification models require the solution of convex optimization problems that have one or more regularization hyper-parameters. Typically, the hyper-parameters are selected to minimize the cross-validated estimates of the out-of-sample classification error of the model. This cross-validation optimization problem can be formulated as a bilevel program in which the outer-level objective minimizes the average number of misclassified points across the cross-validation folds, subject to inner-level constraints ensuring that the classification functions for each fold are (exactly or nearly) optimal for the selected hyper-parameters. Feature selection is incorporated into the bilevel program in the form of bound constraints on the weights. The resulting bilevel problem is converted to a mathematical program with linear equilibrium constraints, which is solved using state-of-the-art optimization methods. This approach is significantly more versatile than commonly used grid search procedures, enabling, in particular, the use of models with many hyper-parameters. Numerical results demonstrate the practicality of this approach for model selection in machine learning.
ISSN: 1055-6788, 1029-4937
DOI: 10.1080/10556780802102586
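
The summary contrasts the bilevel formulation with grid search: in both cases the outer-level objective is the average misclassification rate across cross-validation folds, while each inner-level problem trains a regularized classifier on one fold's training split. The sketch below is only an illustration of that two-level structure, not the authors' MPEC reformulation: it evaluates the outer objective pointwise over a small hyper-parameter grid (the baseline the paper improves on), with the inner problem solved by plain subgradient descent on a hinge-loss SVM. All function names and the synthetic data are invented for this example.

```python
import numpy as np

def train_svm(X, y, lam, epochs=200, lr=0.1):
    """Inner-level problem (illustrative): fit a linear SVM by minimizing
    (1/n) * sum_i max(0, 1 - y_i w.x_i) + (lam/2) * ||w||^2
    with subgradient descent, for a fixed regularization hyper-parameter lam."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)
        viol = margins < 1  # points inside the margin contribute to the subgradient
        grad = lam * w - (X[viol] * y[viol, None]).sum(axis=0) / n
        w -= lr * grad
    return w

def cv_error(X, y, lam, k=3, seed=0):
    """Outer-level objective: average misclassification rate over k folds,
    as a function of the hyper-parameter lam."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = train_svm(X[train], y[train], lam)
        errs.append(np.mean(np.sign(X[test] @ w) != y[test]))
    return float(np.mean(errs))

# Synthetic two-class data (two Gaussian blobs), invented for illustration.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (60, 2)), rng.normal(1.0, 1.0, (60, 2))])
y = np.concatenate([-np.ones(60), np.ones(60)])

# Grid search baseline: probe the outer objective at a few hyper-parameter values.
grid = [0.01, 0.1, 1.0]
best_lam = min(grid, key=lambda lam: cv_error(X, y, lam))
```

Grid search must re-solve every inner problem at every grid point, which is why its cost explodes with the number of hyper-parameters; the bilevel approach of the paper instead treats the inner optimality conditions as equilibrium constraints and optimizes over the hyper-parameters directly.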