Classification model selection via bilevel programming
| Published in | Optimization Methods & Software, Vol. 23, No. 4, pp. 475–489 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | Taylor & Francis, 01.08.2008 |
Summary: Support vector machines and related classification models require the solution of convex optimization problems that have one or more regularization hyper-parameters. Typically, the hyper-parameters are selected to minimize the cross-validated estimates of the out-of-sample classification error of the model. This cross-validation optimization problem can be formulated as a bilevel program in which the outer-level objective minimizes the average number of misclassified points across the cross-validation folds, subject to inner-level constraints such that the classification functions for each fold are (exactly or nearly) optimal for the selected hyper-parameters. Feature selection is included in the bilevel program in the form of bound constraints on the weights. The resulting bilevel problem is converted to a mathematical program with linear equilibrium constraints, which is solved using state-of-the-art optimization methods. This approach is significantly more versatile than commonly used grid search procedures, enabling, in particular, the use of models with many hyper-parameters. Numerical results demonstrate the practicality of this approach for model selection in machine learning.
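The grid-search baseline that the abstract contrasts with can be sketched in a few lines: the inner-level problem (fitting a regularized classifier for a fixed hyper-parameter) is solved once per grid point, and the outer-level objective (average misclassification across cross-validation folds) is evaluated at each. The sketch below is illustrative only — it uses a regularized least-squares classifier as a stand-in for the SVM, synthetic data, and hypothetical names (`fit_ridge_classifier`, `cv_error`); none of these come from the paper, which instead treats the hyper-parameter search itself as a bilevel optimization problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data with labels in {-1, +1}
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.5 * rng.normal(size=n))

def fit_ridge_classifier(X, y, lam):
    # Inner-level problem: regularized least-squares surrogate for the SVM,
    # solved in closed form for a fixed hyper-parameter lam
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_error(X, y, lam, folds=3):
    # Outer-level objective: average misclassification rate across CV folds
    idx = np.arange(len(y))
    errs = []
    for f in range(folds):
        test = idx % folds == f
        w = fit_ridge_classifier(X[~test], y[~test], lam)
        errs.append(np.mean(np.sign(X[test] @ w) != y[test]))
    return float(np.mean(errs))

# Grid search: one inner solve per grid point, keep the best CV error.
# The paper's bilevel formulation replaces this discrete enumeration with
# a single continuous optimization over the hyper-parameters.
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=lambda lam: cv_error(X, y, lam))
print(best_lam, cv_error(X, y, best_lam))
```

The cost of this enumeration grows exponentially with the number of hyper-parameters, which is the motivation the abstract gives for the bilevel/MPEC reformulation.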
| ISSN | 1055-6788; 1029-4937 |
|---|---|
| DOI | 10.1080/10556780802102586 |