The Adaptive $\tau$-Lasso: Robustness and Oracle Properties

Bibliographic Details
Published in: IEEE Transactions on Signal Processing, Vol. 73, pp. 2464-2479
Main Authors: Mozafari-Majd, Emadaldin; Koivunen, Visa
Format: Journal Article
Language: English
Published: IEEE, 2025

Summary: This paper introduces a new regularized version of the robust $\tau$-regression estimator for analyzing high-dimensional datasets subject to gross contamination in the response variables and covariates (explanatory variables). The resulting estimator, termed adaptive $\tau$-Lasso, is robust to outliers and high-leverage points. It also incorporates an adaptive $\ell_{1}$-norm penalty term, which enables the selection of relevant variables and reduces the bias associated with large true regression coefficients. More specifically, this adaptive $\ell_{1}$-norm penalty term assigns a weight to each regression coefficient. For a fixed number of predictors $p$, we show that the adaptive $\tau$-Lasso has the oracle property, ensuring both variable-selection consistency and asymptotic normality under fairly mild conditions. Asymptotic normality applies only to the entries of the regression vector corresponding to the true support, assuming knowledge of the true regression vector support. We characterize its robustness by establishing the finite-sample breakdown point and the influence function. We carry out extensive simulations and observe that the class of $\tau$-Lasso estimators exhibits robustness and reliable performance in both contaminated and uncontaminated data settings. We also validate our theoretical findings on robustness properties through simulations.
In the face of outliers and high-leverage points, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators achieve the best performance or match the best performances of competing regularized estimators, with minimal or no loss in terms of prediction and variable selection accuracy for almost all scenarios considered in this study. Therefore, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators provide attractive tools for a variety of sparse linear regression problems, particularly in high-dimensional settings and when the data is contaminated by outliers and high-leverage points. However, it is worth noting that no particular estimator uniformly dominates others in all considered scenarios.
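Note: The summary says the adaptive $\ell_{1}$-norm penalty assigns a weight to each regression coefficient. As a rough orientation, a penalty of this kind is commonly written as below. This is a sketch of the standard adaptive-Lasso form and is not taken from this record; the symbols $\lambda$, $w_j$, $\tilde{\beta}_j$, and $\gamma$ are assumptions introduced here, and the paper's exact definition of the weights and loss may differ.

```latex
% Generic adaptive l1 penalty (standard adaptive-Lasso form).
% Assumed symbols: lambda >= 0 is the regularization parameter,
% \tilde{\beta} is a pilot (initial) estimate, gamma > 0 is a tuning exponent.
\[
  P_{\lambda}(\boldsymbol{\beta})
    \;=\; \lambda \sum_{j=1}^{p} w_j \,\lvert \beta_j \rvert,
  \qquad
  w_j \;=\; \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}},
  \quad j = 1, \dots, p .
\]
```

Under this form, a coefficient with a large pilot estimate receives a small weight and is shrunk less, which is consistent with the reduced bias for large true coefficients described in the summary, while a coefficient whose pilot estimate is near zero is penalized heavily and tends to be excluded.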
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/TSP.2025.3563225