Data-Free Adversarial Perturbations for Practical Black-Box Attack
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 02.03.2020 |
Subjects | |
Online Access | Get full text |
Summary: | Neural networks are vulnerable to adversarial examples, which are malicious inputs crafted to fool pre-trained models. Adversarial examples often exhibit black-box attacking transferability, which allows adversarial examples crafted for one model to fool another model. However, existing black-box attack methods require samples from the training data distribution to improve the transferability of adversarial examples across different models. Because of this data dependence, the fooling ability of adversarial perturbations is applicable only when training data are accessible. In this paper, we present a data-free method for crafting adversarial perturbations that can fool a target model without any knowledge of the training data distribution. In the practical black-box attack scenario, where attackers have access to neither the target model nor its training data, our method achieves high fooling rates on target models and outperforms other universal adversarial perturbation methods. Our method empirically shows that current deep learning models are still at risk even when attackers do not have access to training data. |
---|---|
DOI: | 10.48550/arxiv.2003.01295 |
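
As a rough, hypothetical illustration of the "fooling rate" metric mentioned in the abstract (not the authors' data-free attack), the sketch below applies a fixed image-agnostic perturbation to inputs of a surrogate model and counts how often the prediction changes; the surrogate model, epsilon budget, and placeholder data are all assumptions made for the example.

```python
# Hypothetical illustration only: measuring the fooling rate of a fixed,
# image-agnostic (universal) perturbation on a surrogate model. This is NOT
# the paper's data-free method; model, data, and epsilon budget are assumptions.
import torch
import torchvision.models as models
import torchvision.transforms as T
from torchvision.datasets import FakeData
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"

# Surrogate model standing in for an inaccessible black-box target (assumption).
model = models.resnet18(weights=None).to(device).eval()

# A random perturbation bounded by an L-inf budget of 10/255 (assumption);
# a real universal perturbation would be optimized rather than sampled.
epsilon = 10 / 255
delta = (torch.rand(1, 3, 224, 224, device=device) * 2 - 1) * epsilon

# Placeholder inputs; in the data-free setting attackers have no training samples.
loader = DataLoader(
    FakeData(size=64, image_size=(3, 224, 224), transform=T.ToTensor()),
    batch_size=16,
)

fooled, total = 0, 0
with torch.no_grad():
    for x, _ in loader:
        x = x.to(device)
        clean_pred = model(x).argmax(dim=1)
        adv_pred = model((x + delta).clamp(0, 1)).argmax(dim=1)
        fooled += (clean_pred != adv_pred).sum().item()
        total += x.size(0)

# Fooling rate: fraction of inputs whose prediction flips under the perturbation.
print(f"fooling rate: {fooled / total:.2%}")
```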