Robust nonparametric regression based on deep ReLU neural networks
| Published in | Journal of Statistical Planning and Inference, Vol. 233, p. 106182 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V., 01.12.2024 |
| ISSN | 0378-3758; 1873-1171 |
| DOI | 10.1016/j.jspi.2024.106182 |
Summary: In this paper, we consider robust nonparametric regression using deep neural networks with ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identical heavy-tailed noise distributions, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To address it robustly, we introduce a novel estimation procedure based on ℓ-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showcasing their robustness against contamination, outliers, and model misspecification. We then delve into the application of our approach using deep ReLU neural networks. When the model is well-specified and the regression function belongs to an α-Hölder class, employing ℓ-type estimation on suitable networks enables the resulting estimators to achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep ℓ-type estimators can circumvent the curse of dimensionality by assuming the regression function closely resembles the composition of several Hölder functions. To attain this, new deep fully-connected ReLU neural networks have been designed to approximate this composition class. This approximation result can be of independent interest.
Highlights:

- We introduce a novel estimation procedure to robustly address nonparametric regression.
- When the model is well-specified, applying this method to deep networks yields optimal estimators.
- We establish new approximation results using fully connected neural networks for compositional Hölder functions.
- Under suitable conditions, the resulting estimator can overcome the curse of dimensionality.
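The record itself contains no code. As a hedged illustration of the general setting only, the numpy sketch below fits a small fully-connected ReLU network to noisy one-dimensional regression data using the Huber loss as a robust surrogate criterion. The Huber loss, the synthetic Student-t noise, the network width, and the learning rate are all illustrative assumptions; this is not the ℓ-type estimation procedure developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a smooth regression function plus heavy-tailed
# Student-t noise, mimicking a contaminated/outlier-prone setting.
n = 200
x = rng.uniform(-1, 1, size=(n, 1))
f_true = np.sin(np.pi * x)
y = f_true + 0.2 * rng.standard_t(df=2, size=(n, 1))

# One-hidden-layer fully-connected ReLU network (illustrative width).
width = 32
W1 = rng.normal(0.0, 1.0, size=(1, width))
b1 = np.zeros(width)
W2 = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, 1))
b2 = np.zeros(1)

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU activation
    return h, h @ W2 + b2

def huber_grad(r, delta=0.5):
    # Gradient of the Huber loss: quadratic near 0, linear in the
    # tails, so large residuals (outliers) have bounded influence.
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

# Full-batch gradient descent with manual backpropagation.
lr = 0.1
for step in range(5000):
    h, pred = forward(x)
    g = huber_grad(pred - y) / n        # dLoss/dpred, averaged
    gW2 = h.T @ g
    gb2 = g.sum(axis=0)
    gh = g @ W2.T
    gh[h <= 0] = 0.0                    # ReLU backward pass
    gW1 = x.T @ gh
    gb1 = gh.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, fit = forward(x)
mae = np.mean(np.abs(fit - f_true))
print(f"mean absolute error vs. true regression function: {mae:.3f}")
```

The bounded gradient of the Huber criterion is what limits the pull of heavy-tailed noise on the fit; a squared-error loss in the same loop would let a single large Student-t outlier dominate the gradient.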