Classification model selection via bilevel programming
Support vector machines and related classification models require the solution of convex optimization problems that have one or more regularization hyper-parameters. Typically, the hyper-parameters are selected to minimize the cross-validated estimates of the out-of-sample classification error of th...
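The conventional procedure mentioned above, choosing a regularization hyper-parameter by minimizing cross-validated classification error over a grid, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the function names (`train_linear_svm`, `cv_error`, `grid_search`), the toy data, and the plain subgradient-descent solver (swapped in for a proper convex solver) are all hypothetical choices made here.

```python
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, C, epochs=200, lr=0.01):
    """Full-batch subgradient descent on 0.5*||w||^2 + C * sum of hinge losses.

    The classifier is sign(w.x - b). This solver is an illustrative stand-in,
    not the optimization method used in the paper.
    """
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the regularizer is w
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) - b) < 1.0:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= C * yi * xi[j]
                grad_b += C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def cv_error(X, y, C, folds=3, seed=0):
    """Average fraction of misclassified validation points across the folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    errors = []
    for t in range(folds):
        val = idx[t::folds]
        val_set = set(val)
        train = [i for i in idx if i not in val_set]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train], C)
        wrong = sum(1 for i in val if y[i] * (dot(w, X[i]) - b) <= 0)
        errors.append(wrong / len(val))
    return sum(errors) / folds


def grid_search(X, y, grid):
    """Return the grid value of C with the lowest cross-validated error."""
    return min(grid, key=lambda C: cv_error(X, y, C))


# Hypothetical toy data: two Gaussian clusters with labels +1 and -1.
rng = random.Random(1)
X = [[rng.gauss(2 * s, 0.5), rng.gauss(2 * s, 0.5)] for s in (1, -1) for _ in range(20)]
y = [s for s in (1, -1) for _ in range(20)]
```

Calling `grid_search(X, y, [0.01, 0.1, 1.0, 10.0])` trains one model per fold for every grid value. With several hyper-parameters the number of training runs multiplies across the grid dimensions, which is the limitation the bilevel approach is meant to address.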
Published in | Optimization methods & software Vol. 23; no. 4; pp. 475 - 489 |
---|---|
Main Authors | Kunapuli, G.; Bennett, K.P.; Hu, Jing; Pang, Jong-Shi |
Format | Journal Article |
Language | English |
Published | Taylor & Francis, 01.08.2008 |
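Subjects | bilevel programming; cross-validation; feature selection; model selection; support vector classification |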
ISSN | 1055-6788 1029-4937 |
DOI | 10.1080/10556780802102586 |
Abstract | Support vector machines and related classification models require the solution of convex optimization problems that have one or more regularization hyper-parameters. Typically, the hyper-parameters are selected to minimize the cross-validated estimates of the out-of-sample classification error of the model. This cross-validation optimization problem can be formulated as a bilevel program in which the outer-level objective minimizes the average number of misclassified points across the cross-validation folds, subject to inner-level constraints such that the classification functions for each fold are (exactly or nearly) optimal for the selected hyper-parameters. Feature selection is included in the bilevel program in the form of bound constraints in the weights. The resulting bilevel problem is converted to a mathematical program with linear equilibrium constraints, which is solved using state-of-the-art optimization methods. This approach is significantly more versatile than commonly used grid search procedures, enabling, in particular, the use of models with many hyper-parameters. Numerical results demonstrate the practicality of this approach for model selection in machine learning. |
---|---|
Author | Bennett, K.P.; Hu, Jing; Kunapuli, G.; Pang, Jong-Shi |
Author_xml | 1. Kunapuli, G. (kunapg@rpi.edu), Department of Mathematical Sciences, Rensselaer Polytechnic Institute; 2. Bennett, K.P., Department of Mathematical Sciences, Rensselaer Polytechnic Institute; 3. Hu, Jing, Department of Mathematical Sciences, Rensselaer Polytechnic Institute; 4. Pang, Jong-Shi, Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign |
CitedBy_id | crossref_primary_10_1016_j_cor_2018_05_005 crossref_primary_10_1287_ijoc_2023_0108 crossref_primary_10_1007_s00186_022_00798_6 crossref_primary_10_1080_10556788_2015_1025133 crossref_primary_10_1007_s11063_013_9301_1 crossref_primary_10_1007_s10107_010_0395_1 crossref_primary_10_1109_ACCESS_2022_3212387 crossref_primary_10_1016_j_ejor_2020_12_009 crossref_primary_10_1007_s10589_023_00487_y crossref_primary_10_12677_AAM_2023_124183 crossref_primary_10_1080_10556788_2012_749876 crossref_primary_10_1109_TSMCB_2009_2015672 crossref_primary_10_1007_s11590_024_02101_4 crossref_primary_10_1007_s13675_015_0041_z crossref_primary_10_1137_20M1352375 crossref_primary_10_1007_s10898_014_0228_5 crossref_primary_10_3390_math11163518 crossref_primary_10_1002_minf_201100092 crossref_primary_10_1080_02331934_2024_2394612 crossref_primary_10_1016_j_eswa_2021_115017 crossref_primary_10_1287_moor_2021_1164 crossref_primary_10_1007_s10107_021_01766_4 crossref_primary_10_1007_s10898_010_9644_3 crossref_primary_10_1137_20M1371403 crossref_primary_10_1007_s10489_020_02151_y crossref_primary_10_1137_11082868X crossref_primary_10_1007_s11750_020_00538_1 crossref_primary_10_1109_TPAMI_2021_3132674 crossref_primary_10_1007_s10107_022_01888_3 crossref_primary_10_1080_02331934_2025_2468412 |
ContentType | Journal Article |
Copyright | Copyright Taylor & Francis Group, LLC 2008 |
Discipline | Engineering |
EISSN | 1029-4937 |
EndPage | 489 |
ISSN | 1055-6788 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
PageCount | 15 |
PublicationCentury | 2000 |
PublicationDate | 2008-08-00 |
PublicationDateYYYYMMDD | 2008-08-01 |
PublicationDate_xml | – month: 08 year: 2008 text: 2008-08-00 |
PublicationDecade | 2000 |
PublicationTitle | Optimization methods & software |
PublicationYear | 2008 |
Publisher | Taylor & Francis |
StartPage | 475 |
SubjectTerms | bilevel programming; cross-validation; feature selection; model selection; support vector classification |
Title | Classification model selection via bilevel programming |
URI | https://www.tandfonline.com/doi/abs/10.1080/10556780802102586 |
Volume | 23 |
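The bilevel cross-validation program summarized in the abstract can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from this record: $T$ folds, validation index sets $\mathcal{N}_t$, training index sets $\bar{\mathcal{N}}_t$, data $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, a regularization hyper-parameter $C$, and a bound vector $\bar{w}$ for feature selection. The inner problem is written as a box-constrained linear support vector machine, which is one plausible reading of the bound constraints on the weights.

```latex
\begin{aligned}
\min_{C,\,\bar{w},\,\{(w^t,\,b_t)\}_{t=1}^{T}} \quad
  & \frac{1}{T}\sum_{t=1}^{T}\frac{1}{|\mathcal{N}_t|}
    \sum_{i\in\mathcal{N}_t}
    \mathbf{1}\!\left[\,y_i\,(x_i^{\top}w^t - b_t) < 0\,\right] \\
\text{s.t.} \quad
  & C \ge 0,\qquad \bar{w} \ge 0, \\
  & (w^t, b_t)\in\operatorname*{arg\,min}_{-\bar{w}\,\le\, w\,\le\,\bar{w},\; b}
    \left\{\tfrac{1}{2}\|w\|_2^2
    + C\sum_{j\in\bar{\mathcal{N}}_t}
      \max\!\bigl(0,\;1 - y_j\,(x_j^{\top}w - b)\bigr)\right\},
    \qquad t = 1,\dots,T.
\end{aligned}
```

The outer objective is the average misclassification rate over the validation folds, and each inner constraint requires the classifier for fold $t$ to be optimal on that fold's training data for the chosen $C$ and $\bar{w}$. The sketch shows only the exactly optimal case, whereas the abstract also allows nearly optimal inner solutions. A standard route to the reformulation named in the abstract is to replace each convex inner problem by its optimality conditions, which yields equilibrium (complementarity) constraints; the record itself does not spell out these details.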