Training Robust Deep Neural Networks via Adversarial Noise Propagation
Main Authors | , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 19.09.2019 |
Summary: | In practice, deep neural networks have been found to be vulnerable to
various types of noise, such as adversarial examples and corruption. Various
adversarial defense methods have accordingly been developed to improve the
adversarial robustness of deep models. However, most of these methods simply
train on data mixed with adversarial examples and still fail to defend against
generalized types of noise. Motivated by the fact that hidden layers play a
highly important role in maintaining a robust model, this paper proposes a
simple yet powerful training algorithm, named *Adversarial Noise Propagation*
(ANP), which injects noise into the hidden layers in a layer-wise manner. ANP
can be implemented efficiently by exploiting the nature of the
backward-forward training style. Through thorough investigation, we determine
that different hidden layers make different contributions to model robustness
and clean accuracy, and that shallow layers are comparatively more critical
than deep layers. Moreover, our framework can easily be combined with other
adversarial training methods to further improve model robustness by exploiting
the potential of hidden layers. Extensive experiments on MNIST, CIFAR-10,
CIFAR-10-C, CIFAR-10-P, and ImageNet demonstrate that ANP gives deep models
strong robustness against both adversarial and corrupted noise, and
significantly outperforms various adversarial defense methods. |
---|---|
DOI: | 10.48550/arxiv.1909.09034 |
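
To make the layer-wise injection described in the abstract concrete, below is a minimal PyTorch sketch of the idea: perturb hidden activations in the gradient-ascent direction using gradients that the backward pass already computes, then train on the perturbed forward pass. The names `ANPBlock`, `anp_step`, `eta`, and `k` are hypothetical, and details such as the noise schedule and which layers are wrapped are simplifications assumed here, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ANPBlock(nn.Module):
    # Hypothetical wrapper: adds the current adversarial perturbation to a
    # hidden layer's output and retains that activation's gradient.
    def __init__(self, layer):
        super().__init__()
        self.layer = layer
        self.noise = None   # layer-wise adversarial noise, built from gradients
        self.h = None       # last activation, kept so we can read its .grad

    def forward(self, x):
        h = self.layer(x)
        if self.training and self.noise is not None:
            h = h + self.noise
        if h.requires_grad:
            h.retain_grad()  # expose d(loss)/d(activation) after backward()
        self.h = h
        return h

def anp_step(model, blocks, optimizer, x, y, eta=0.01, k=1):
    # One backward-forward training step: pass 0 runs clean and collects
    # activation gradients; passes 1..k rerun the batch with accumulated
    # layer-wise noise, so the final weight update sees perturbed activations.
    for b in blocks:
        b.noise = None
    for _ in range(k + 1):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        for b in blocks:
            if b.h is not None and b.h.grad is not None:
                delta = eta * b.h.grad.sign()  # ascend the loss in activation space
                b.noise = delta if b.noise is None else (b.noise + delta)
                b.noise = b.noise.detach()
    optimizer.step()  # uses gradients from the final, noised forward pass
    return loss.item()

# Usage sketch: wrap the shallow layers, which the paper reports as most critical.
# model = nn.Sequential(ANPBlock(nn.Linear(784, 256)), nn.ReLU(),
#                       ANPBlock(nn.Linear(256, 256)), nn.ReLU(),
#                       nn.Linear(256, 10))
# blocks = [m for m in model.modules() if isinstance(m, ANPBlock)]
```

Because the perturbations reuse gradients from the standard backward pass, each extra inner pass costs roughly one additional forward-backward, which is the efficiency argument the abstract attributes to the backward-forward training style.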