The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a novel high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals. Motivated by the significant advances in integer o...
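For reference, the estimator described above can be written as an optimization problem. The notation is an assumption, since this record does not define any symbols: $X \in \mathbb{R}^{n \times p}$ is the feature matrix, $y \in \mathbb{R}^n$ the response, $\beta$ the coefficient vector, and $\delta \ge 0$ the budget on the maximal absolute correlation between features and residuals. The second display is one standard big-M way to express such a problem as a MILO, assuming a known bound $M$ on the magnitude of the coefficients at an optimal solution; it is a sketch under these assumptions and not necessarily the exact formulation used in the paper.

```latex
% Assumed notation: X (n x p features), y (response), delta (correlation budget),
% M (hypothetical big-M bound on |beta_i|), z_i (binary support indicators).
\begin{align*}
\text{Estimator:}\quad
  & \min_{\beta \in \mathbb{R}^p} \ \|\beta\|_0
  \quad \text{s.t.} \quad \|X^\top (y - X\beta)\|_\infty \le \delta, \\[6pt]
\text{Big-M MILO:}\quad
  & \min_{\beta,\, z} \ \sum_{i=1}^{p} z_i \\
  & \text{s.t.} \quad -\delta \le x_j^\top (y - X\beta) \le \delta, \qquad j = 1, \dots, p, \\
  & \phantom{\text{s.t.} \quad} -M z_i \le \beta_i \le M z_i, \qquad\quad\ \ i = 1, \dots, p, \\
  & \phantom{\text{s.t.} \quad} z_i \in \{0, 1\}, \qquad\qquad\qquad\quad\ i = 1, \dots, p.
\end{align*}
```

Here $x_j$ denotes the $j$-th column of $X$. Because both the objective and every constraint are linear in $(\beta, z)$, the problem is a mixed integer linear program rather than a mixed integer quadratic one, which is the distinction the abstract draws against least squares subset selection.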
Published in | IEEE transactions on information theory Vol. 63; no. 5; pp. 3053 - 3075 |
---|---|
Main Authors | Mazumder, Rahul; Radchenko, Peter |
Format | Journal Article |
Language | English |
Published | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2017 |
Abstract | We propose a novel high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals. Motivated by the significant advances in integer optimization over the past 10-15 years, we present a mixed integer linear optimization (MILO) approach to obtain certifiably optimal global solutions to this nonconvex optimization problem. The current state of algorithmics in integer optimization makes our proposal substantially more computationally attractive than the least squares subset selection framework based on integer quadratic optimization, recently proposed by Bertsimas et al., and the continuous nonconvex quadratic optimization framework of Liu et al. We propose new discrete first-order methods, which, when paired with state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget. We illustrate that our integrated approach provides globally optimal solutions in significantly shorter computation times when compared to off-the-shelf MILO solvers. We demonstrate both theoretically and empirically that in a wide range of regimes the statistical properties of the Discrete Dantzig Selector are superior to those of popular ℓ1-based approaches. We illustrate that our approach can handle problem instances with p = 10,000 features with certifiable optimality, making it a highly scalable combinatorial variable selection approach in sparse linear modeling. |
Author | Radchenko, Peter Mazumder, Rahul |
Author_xml | – sequence: 1 givenname: Rahul surname: Mazumder fullname: Mazumder, Rahul email: rahulmaz@mit.edu organization: Sloan Sch. of Manage., Oper. Res. Center & Center of Stat., MIT, Cambridge, MA, USA – sequence: 2 givenname: Peter surname: Radchenko fullname: Radchenko, Peter email: radchenk@marshall.usc.edu organization: Bus. Sch., Univ. of Sydney, Sydney, NSW, Australia |
CODEN | IETTAW |
CitedBy_id | crossref_primary_10_1214_24_AOAS1929 crossref_primary_10_1214_25_BA1517 crossref_primary_10_1109_TCE_2024_3371440 crossref_primary_10_1214_21_AOS2155 crossref_primary_10_1109_TSP_2022_3215651 |
Cites_doi | 10.1007/3-540-06583-0_43 10.1137/060675320 10.1137/100808071 10.1093/biomet/asp013 10.1201/b18401 10.1214/15-AOS1388 10.1007/978-1-4614-1927-3_13 10.1007/978-3-642-20192-9 10.1214/009053606000001523 10.1007/978-1-4419-8853-9 10.2307/2337118 10.1214/07-AOS520 10.1137/130915303 10.1007/s11228-010-0147-7 10.1214/08-AOS620 10.1017/CBO9780511804441 10.1214/009053604000000067 10.1214/14-AOS1223 10.1007/s10107-012-0629-5 10.1198/jasa.2011.tm09738 10.4171/dms/6/16 10.1198/016214501753382273 10.1111/j.2517-6161.1996.tb02080.x 10.1214/15-AOS1380 10.1137/080716542 10.1111/j.1467-9868.2008.00668.x 10.1214/07-AOAS131 10.1287/opre.2015.1436 10.1007/s12532-011-0029-5 10.1007/s00041-008-9045-x 10.1007/978-3-540-68279-0_15 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017 |
DOI | 10.1109/TIT.2017.2658023 |
Discipline | Engineering Computer Science |
EISSN | 1557-9654 |
EndPage | 3075 |
ExternalDocumentID | 10_1109_TIT_2017_2658023 7831481 |
Genre | orig-research |
GrantInformation_xml | – fundername: NSF grantid: DMS-1209057 funderid: 10.13039/100000001 – fundername: Moore Sloan Foundation – fundername: ONR grantid: N000141512342 |
ISSN | 0018-9448 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 5 |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0003-1384-9743 |
OpenAccessLink | http://hdl.handle.net/1721.1/120796 |
PageCount | 23 |
PublicationCentury | 2000 |
PublicationDate | 2017-05-01 |
PublicationPlace | New York |
PublicationTitle | IEEE transactions on information theory |
PublicationTitleAbbrev | TIT |
PublicationYear | 2017 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
References_xml | – ident: ref38 doi: 10.1007/3-540-06583-0_43 – ident: ref41 doi: 10.1137/060675320 – ident: ref39 doi: 10.1137/100808071 – ident: ref13 doi: 10.1093/biomet/asp013 – year: 2015 ident: ref3 publication-title: Statistical Learning with Sparsity The Lasso and Generalizations doi: 10.1201/b18401 – year: 1996 ident: ref43 publication-title: Convex Analysis – ident: ref1 doi: 10.1214/15-AOS1388 – year: 2009 ident: ref24 publication-title: 50 Years of Integer Programming 1958-2008 From the Early Years to the State-of-the-Art – start-page: 3239 year: 2010 ident: ref25 article-title: MILP software publication-title: Wiley Encyclopedia of Operations Research and Management Science – ident: ref21 doi: 10.1007/978-1-4614-1927-3_13 – year: 2011 ident: ref11 publication-title: Statistics for High-Dimensional Data doi: 10.1007/978-3-642-20192-9 – ident: ref5 doi: 10.1214/009053606000001523 – year: 2004 ident: ref30 publication-title: Introductory Lectures on Convex Optimization A Basic Course doi: 10.1007/978-1-4419-8853-9 – ident: ref35 doi: 10.2307/2337118 – year: 2013 ident: ref28 publication-title: Model Building in Mathematical Programming – ident: ref16 doi: 10.1214/07-AOS520 – ident: ref17 doi: 10.1137/130915303 – ident: ref42 doi: 10.1007/s11228-010-0147-7 – year: 1999 ident: ref33 publication-title: Nonlinear Programming – ident: ref10 doi: 10.1214/08-AOS620 – year: 2013 ident: ref27 publication-title: Top500 Supercomputer Sites Directory page for Top500 Lists Result for Each List Since June 1993 – year: 2015 ident: ref37 publication-title: Gurobi Optimizer Reference Manual – ident: ref6 doi: 10.1017/CBO9780511804441 – volume: 32 start-page: 407 year: 2004 ident: ref20 article-title: Least angle regression publication-title: Ann Statist doi: 10.1214/009053604000000067 – volume: 3 year: 2011 ident: ref34 publication-title: Foundations and Trends in Machine Learning – ident: ref19 doi: 10.1214/14-AOS1223 – year: 2017 ident: ref12 article-title: A new 
perspective on boosting in linear regression via subgradient optimization and relatives publication-title: Ann Statist – ident: ref31 doi: 10.1007/s10107-012-0629-5 – ident: ref14 doi: 10.1198/jasa.2011.tm09738 – start-page: 107 year: 2012 ident: ref26 article-title: A brief history of linear and mixed-integer programming computation publication-title: Documenta Math Extra Optim Stories doi: 10.4171/dms/6/16 – ident: ref29 doi: 10.1198/016214501753382273 – volume: 58 start-page: 267 year: 1996 ident: ref4 article-title: Regression shrinkage and selection via the lasso publication-title: J Roy Statist Soc Series B (Methodol ) doi: 10.1111/j.2517-6161.1996.tb02080.x – ident: ref2 doi: 10.1214/15-AOS1380 – year: 2009 ident: ref15 publication-title: The Elements of Statistical Learning Data Mining Inference and Prediction – year: 2005 ident: ref23 publication-title: Optimization over Integers – ident: ref40 doi: 10.1137/080716542 – ident: ref8 doi: 10.1111/j.1467-9868.2008.00668.x – ident: ref9 doi: 10.1214/07-AOAS131 – ident: ref22 doi: 10.1287/opre.2015.1436 – ident: ref7 doi: 10.1007/s12532-011-0029-5 – ident: ref36 doi: 10.1007/s00041-008-9045-x – volume: 1 start-page: 123 year: 2013 ident: ref32 article-title: Proximal algorithms publication-title: Found Trends Optim – ident: ref18 doi: 10.1007/978-3-540-68279-0_15 |
StartPage | 3053 |
SubjectTerms | Combinatorial analysis; Correlation; Dantzig selector; Estimation; Information theory; Input variables; Mathematical model; Mathematical models; mathematical optimization; Mixed integer; mixed integer linear optimization; nonconvex optimization; Optimization; Proposals; Regression coefficients; Solvers; Sparse linear regression; Tuning; ℓ₀-minimization |
Title | The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization |
URI | https://ieeexplore.ieee.org/document/7831481 https://www.proquest.com/docview/1891107934 |
Volume | 63 |