The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization

Bibliographic Details
Published in IEEE Transactions on Information Theory, Vol. 63, no. 5, pp. 3053-3075
Main Authors Mazumder, Rahul; Radchenko, Peter
Format Journal Article
Language English
Published New York IEEE 01.05.2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract We propose a novel high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals. Motivated by the significant advances in integer optimization over the past 10-15 years, we present a mixed integer linear optimization (MILO) approach to obtain certifiably optimal global solutions to this nonconvex optimization problem. The current state of algorithmics in integer optimization makes our proposal substantially more computationally attractive than the least squares subset selection framework based on integer quadratic optimization, recently proposed by Bertsimas et al., and the continuous nonconvex quadratic optimization framework of Liu et al. We propose new discrete first-order methods which, when paired with state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget. We illustrate that our integrated approach provides globally optimal solutions in significantly shorter computation times than off-the-shelf MILO solvers. We demonstrate both theoretically and empirically that, in a wide range of regimes, the statistical properties of the Discrete Dantzig Selector are superior to those of popular ℓ1-based approaches. We illustrate that our approach can handle problem instances with p = 10,000 features with certifiable optimality, making it a highly scalable combinatorial variable selection approach in sparse linear modeling.
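The problem described in the abstract can be sketched in formulas. The symbols below (response y, feature matrix X with columns x_j, coefficients β, correlation budget δ, binary indicators z_j, and bound M) are notation assumed here for illustration, since this record gives no formulas; the second display is a standard big-M reformulation and may differ in detail from the authors' own.

```latex
% Sketch only: notation (y, X, x_j, \beta, \delta, z_j, M) is assumed, not taken from this record.
\begin{align*}
% Estimator: fewest nonzero coefficients subject to a budget \delta on the
% maximal absolute correlation between features and residuals.
&\min_{\beta \in \mathbb{R}^p} \; \|\beta\|_0
  \quad \text{s.t.} \quad \|X^\top (y - X\beta)\|_\infty \le \delta \\[4pt]
% Big-M mixed integer linear reformulation: z_j = 0 forces \beta_j = 0.
&\min_{\beta,\, z} \; \sum_{j=1}^{p} z_j
  \quad \text{s.t.} \quad
  \begin{cases}
    -\delta \le x_j^\top (y - X\beta) \le \delta, & j = 1, \dots, p, \\
    -M z_j \le \beta_j \le M z_j, & j = 1, \dots, p, \\
    z_j \in \{0, 1\}, & j = 1, \dots, p.
  \end{cases}
\end{align*}
```

Here M is assumed to be a valid upper bound on |β_j| at an optimal solution. The objective and every constraint are linear in (β, z), which is consistent with the abstract's contrast between this MILO approach and subset selection based on integer quadratic optimization.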
Author affiliations
Mazumder, Rahul (rahulmaz@mit.edu), Sloan Sch. of Manage., Oper. Res. Center & Center of Stat., MIT, Cambridge, MA, USA
Radchenko, Peter (radchenk@marshall.usc.edu), Bus. Sch., Univ. of Sydney, Sydney, NSW, Australia
CODEN IETTAW
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2017
DOI 10.1109/TIT.2017.2658023
Discipline Engineering
Computer Science
EISSN 1557-9654
Funding NSF (grant DMS-1209057; funder ID 10.13039/100000001)
Moore Sloan Foundation
ONR (grant N000141512342)
ISSN 0018-9448
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0003-1384-9743
OpenAccessLink http://hdl.handle.net/1721.1/120796
SubjectTerms Combinatorial analysis
Correlation
Dantzig selector
Estimation
Information theory
Input variables
Mathematical model
Mathematical models
mathematical optimization
Mixed integer
mixed integer linear optimization
nonconvex optimization
Optimization
Proposals
Regression coefficients
Solvers
Sparse linear regression
Tuning
ℓ₀-minimization
URI https://ieeexplore.ieee.org/document/7831481
https://www.proquest.com/docview/1891107934