Utilizing Adversarial Targeted Attacks to Boost Adversarial Robustness
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 04.09.2021 |
Subjects | |
Summary: | Adversarial attacks have been shown to be highly effective at degrading the
performance of deep neural networks (DNNs). The most prominent defense is
adversarial training, a method for learning a robust model. Nevertheless,
adversarial training does not make DNNs immune to adversarial perturbations. We
propose a novel solution by adopting the recently suggested Predictive
Normalized Maximum Likelihood. Specifically, our defense performs adversarial
targeted attacks according to different hypotheses, where each hypothesis
assumes a specific label for the test sample. Then, by comparing the hypothesis
probabilities, we predict the label. Our refinement process corresponds to
recent findings on adversarial subspace properties. We extensively evaluate
our approach on 16 adversarial attack benchmarks using ResNet-50,
WideResNet-28, and a 2-layer ConvNet trained with ImageNet, CIFAR10, and MNIST,
showing a significant improvement of up to 5.7%, 3.7%, and 0.6%, respectively. |
DOI: | 10.48550/arxiv.2109.01945 |
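As an illustration of the defense described in the summary, the following is a minimal sketch of the hypothesis-testing scheme: for each candidate label, a targeted attack refines the test sample toward that label, and the resulting per-label probabilities are normalized and compared, in the spirit of the Predictive Normalized Maximum Likelihood (pNML). All names and hyperparameters here (`targeted_attack`, `epsilon`, step counts) are assumptions for illustration; the paper's exact refinement procedure and attack settings are not given in this record.

```python
# Illustrative sketch only; not the authors' implementation.
import torch
import torch.nn.functional as F

def targeted_attack(model, x, target, epsilon=8 / 255, steps=10, step_size=2 / 255):
    """Simple targeted PGD: perturb x within an L-inf ball to increase the
    model's probability of the hypothesized `target` label (assumed settings)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Step *down* the loss toward the target label (targeted attack).
        x_adv = x_adv.detach() - step_size * grad.sign()
        x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)   # project to L-inf ball
        x_adv = x_adv.clamp(0.0, 1.0)                       # keep a valid image
    return x_adv.detach()

def pnml_predict(model, x, num_classes=10):
    """For each hypothesized label, refine x toward that label with a targeted
    attack, record the model's probability of that label, then normalize the
    per-hypothesis probabilities and predict the highest one (pNML-style)."""
    probs = []
    for label in range(num_classes):
        target = torch.full((x.size(0),), label, dtype=torch.long, device=x.device)
        x_refined = targeted_attack(model, x, target)
        with torch.no_grad():
            p = F.softmax(model(x_refined), dim=1)
        probs.append(p[torch.arange(x.size(0), device=x.device), label])
    probs = torch.stack(probs, dim=1)                 # (batch, num_classes)
    normalized = probs / probs.sum(dim=1, keepdim=True)
    return normalized.argmax(dim=1)
```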