Revisiting the transferability of adversarial examples via source-agnostic adversarial feature inducing method

Bibliographic Details
Published in: Pattern Recognition, Vol. 144, p. 109828
Main Authors: Xiao, Yatie; Zhou, Jizhe; Chen, Kongyang; Liu, Zhenbang
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.12.2023
Summary: Though deep neural networks (DNNs) have shown extraordinary performance in the field of computer vision, they are evidently vulnerable to adversarial attacks crafted with human-imperceptible perturbations. Most existing adversarial attacks focus on attacking target deep task models by enhancing input-diagnostic features via image rotation, warping, or other transformations to improve adversarial transferability. Such approaches concentrate on operations over the original inputs, regardless of the properties of different source information. This inspired us to utilize source-agnostic information and integrate generated features with raw inputs to enrich adversarial properties. To this end, we propose a simple and flexible adversarial attack with a source-agnostic Feature Inducing Method (FIM) for improving the transferability of adversarial examples (AEs). FIM first generates perturbed features by imitating diverse patterns from multi-domain sources: instead of exploiting the diversity of the original inputs, it gains varied properties through random feature imitation referring to different source distributions. FIM then integrates the original inputs with the imitative features, optimizing the generated features under norm bounds. This diversifies raw positive class-general features, which reduces the influence of class-specific patterns on cross-model transferability. Based on these crafted properties, FIM applies an adaptive gradient-based strategy to generate perturbations, which lowers the probability of dropping into local optima when searching for the decision boundaries of source and target models. We conduct detailed experiments evaluating the proposed approach against existing baselines on three public datasets. The results show that the proposed method fools both source and target task models, leading by a considerable margin in most adversarial scenarios. We further investigate attacks on adversarial defense models (trained with adversarial training and TRADES): the proposed strategy achieves better attack quality by a margin of over 3.00% on CIFAR10 and reduces the robust accuracy of adversarially trained models by a large margin of nearly 9.00% on MNIST. Furthermore, we apply the proposed strategy to feature-level adversarial domains and evaluate its feasibility when integrated with various attack mechanisms, gaining over 20.00% better adversarial effectiveness than the base attacks on the studied deep task models.
•We propose FIM to generate transferable adversarial examples using multi-domain features.
•FIM improves the adversarial capability of AEs across deep task models.
•On multi-class classification tasks, FIM-based attacks achieve superior attack success rates.
•On defense tasks, FIM attacks drive robust accuracy lower than existing baselines.
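The summary describes FIM only at a high level: imitate features from multi-domain sources, bound them with a norm constraint, mix them into the raw inputs, and run an adaptive gradient-based attack on the mixture. As a rough illustration of that pipeline (not the authors' implementation), the following PyTorch sketch mixes norm-bounded features drawn from a separate source batch into the input at each step of a momentum-based iterative attack; all names and hyperparameters (imitate_features, fim_attack, beta, mu) are assumptions for illustration only.

    # Hypothetical sketch of the FIM pipeline described in the summary.
    # Names and hyperparameters are illustrative assumptions, not the
    # authors' published code.
    import torch

    def imitate_features(source_batch: torch.Tensor, beta: float) -> torch.Tensor:
        """Draw randomly permuted images from a different source domain
        (NCHW tensor assumed) and bound their contribution by beta."""
        idx = torch.randperm(source_batch.size(0))
        feats = source_batch[idx] - source_batch[idx].mean(dim=(2, 3), keepdim=True)
        # Clamp the imitative features to an L-infinity ball of radius beta.
        return feats.clamp(-beta, beta)

    def fim_attack(model, x, y, source_batch, eps=8/255, alpha=2/255,
                   steps=10, mu=1.0, beta=0.1):
        """Momentum-based iterative attack on feature-mixed inputs (one
        common adaptive gradient strategy; the paper's optimizer may differ)."""
        x_adv = x.clone().detach()
        g = torch.zeros_like(x)
        loss_fn = torch.nn.CrossEntropyLoss()
        for _ in range(steps):
            # Integrate the raw input with randomly imitated source features.
            x_mix = (x_adv + imitate_features(source_batch, beta)).clamp(0, 1)
            x_mix.requires_grad_(True)
            loss = loss_fn(model(x_mix), y)
            grad = torch.autograd.grad(loss, x_mix)[0]
            # Accumulate normalized gradients (momentum) to reduce the chance
            # of dropping into local optima near the decision boundary.
            g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
            x_adv = x_adv + alpha * g.sign()
            # Project back into the eps-ball around the clean input.
            x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
        return x_adv.detach()

In this sketch, the random draw from source_batch plays the role of the summary's "random feature imitation referring to different source distributions", the beta clamp stands in for the norm bound on generated features, and the momentum accumulator is one familiar form of adaptive gradient strategy; the paper's actual mixing rule and optimizer may differ.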
ISSN: 0031-3203; 1873-5142
DOI: 10.1016/j.patcog.2023.109828