Robust Wasserstein profile inference and applications to machine learning

| Published in | Journal of Applied Probability, Vol. 56, No. 3, pp. 830–857 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | Cambridge, UK: Cambridge University Press, 01.09.2019 (Applied Probability Trust) |

Summary: We show that several machine learning estimators, including the square-root least absolute shrinkage and selection operator (square-root LASSO) and regularized logistic regression, can be represented as solutions to distributionally robust optimization problems. The associated uncertainty regions are based on suitably defined Wasserstein distances. Hence, our representations allow us to view regularization as a result of introducing an artificial adversary that perturbs the empirical distribution to account for out-of-sample effects in loss estimation. In addition, we introduce RWPI (robust Wasserstein profile inference), a novel inference methodology which extends the use of methods inspired by empirical likelihood to the setting of optimal transport costs (of which Wasserstein distances are a particular case). We use RWPI to show how to optimally select the size of uncertainty regions, and as a consequence we are able to choose regularization parameters for these machine learning estimators without the use of cross-validation. Numerical experiments are also given to validate our theoretical findings.
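
The abstract's central claim, that regularized estimators arise as solutions to distributionally robust optimization problems over Wasserstein uncertainty regions, can be sketched in a single display. The rendering below is an illustrative reconstruction rather than a quotation from the paper: $P_n$ denotes the empirical distribution of the sample $(X_i, Y_i)$, $D_c$ the optimal transport discrepancy induced by a cost $c$, $\delta$ the radius of the uncertainty region, and $q$ the Hölder conjugate of $p$; the exact cost function and exponents should be checked against the published version.

```latex
% Illustrative sketch (assumed notation, not copied from the paper):
%   P_n   -- empirical distribution of (X_i, Y_i), i = 1, ..., n
%   D_c   -- optimal transport discrepancy with cost
%            c((x, y), (u, v)) = ||x - u||_p^2 if y = v, and +infinity otherwise,
%            so the adversary may perturb predictors but not responses
%   delta -- radius of the Wasserstein uncertainty region; q is conjugate to p
\min_{\beta}\;
  \sup_{P \,:\, D_c(P, P_n) \le \delta}
  \mathbb{E}_{P}\!\left[(Y - \beta^{\top} X)^2\right]
\;=\;
\min_{\beta}
  \left(
    \sqrt{\mathbb{E}_{P_n}\!\left[(Y - \beta^{\top} X)^2\right]}
    \;+\; \sqrt{\delta}\,\lVert \beta \rVert_{q}
  \right)^{2}
```

Read this way, the square-root LASSO penalty level plays the role of $\sqrt{\delta}$, so calibrating the radius $\delta$ of the uncertainty region, which is what the RWPI methodology is intended to do, amounts to choosing the regularization parameter without cross-validation.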

ISSN: 0021-9002; 1475-6072
DOI: 10.1017/jpr.2019.49