Toward Intrinsic Adversarial Robustness Through Probabilistic Training

Bibliographic Details
Published in: IEEE Transactions on Image Processing, Vol. 32, pp. 3862-3872
Main Authors: Dong, Junhao; Yang, Lingxiao; Wang, Yuan; Xie, Xiaohua; Lai, Jianhuang
Format: Journal Article
Language: English
Published: United States: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023

Summary: Modern deep neural networks have achieved numerous breakthroughs in real-world applications, yet they remain vulnerable to imperceptible adversarial perturbations. These tailored perturbations can severely disrupt the inference of current deep learning-based methods and may pose security hazards to artificial intelligence applications. To date, adversarial training methods have achieved excellent robustness against various adversarial attacks by incorporating adversarial examples during the training stage. However, existing methods primarily optimize over injective adversarial examples, each generated from a single natural example, and thus ignore other potential adversaries in the adversarial domain. This optimization bias can cause overfitting to a suboptimal decision boundary, which heavily jeopardizes adversarial robustness. To address this issue, we propose Adversarial Probabilistic Training (APT), which bridges the distribution gap between natural and adversarial examples by modeling the latent adversarial distribution. Instead of tedious and costly adversary sampling to form the probabilistic domain, we estimate the adversarial distribution parameters at the feature level for efficiency. Moreover, we decouple the distribution alignment based on the adversarial probability model and the original adversarial example. We then devise a novel reweighting mechanism for the distribution alignment that accounts for adversarial strength and domain uncertainty. Extensive experiments demonstrate the superiority of our adversarial probabilistic training method against various types of adversarial attacks across different datasets and scenarios.
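The abstract describes APT only at a conceptual level. To make the idea concrete, the following is a minimal PyTorch sketch of one plausible instantiation: it assumes a diagonal Gaussian over adversarial features, sampled via the reparameterization trick, plus a squared-error alignment term toward the natural feature. The names (AdversarialDistributionHead, sample_adversarial_feature, apt_style_loss) and the specific strength/uncertainty weighting formula are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdversarialDistributionHead(nn.Module):
    """Predicts a diagonal-Gaussian adversarial distribution in feature space.

    Rather than sampling many adversaries per input, the head maps the
    feature of one adversarial example to a mean and log-variance, so further
    adversarial features can be drawn cheaply by reparameterization.
    """

    def __init__(self, feat_dim: int):
        super().__init__()
        self.mu_head = nn.Linear(feat_dim, feat_dim)
        self.logvar_head = nn.Linear(feat_dim, feat_dim)

    def forward(self, adv_feat: torch.Tensor):
        mu = self.mu_head(adv_feat)
        # Clamp log-variance for numerical stability.
        logvar = self.logvar_head(adv_feat).clamp(-6.0, 2.0)
        return mu, logvar


def sample_adversarial_feature(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    std = (0.5 * logvar).exp()
    return mu + std * torch.randn_like(std)


def apt_style_loss(classifier, nat_feat, adv_feat, labels, dist_head,
                   align_weight: float = 1.0) -> torch.Tensor:
    """Toy APT-flavored objective (an assumed reading, not the paper's loss).

    Combines (i) classification on a feature sampled from the estimated
    adversarial distribution with (ii) an alignment term pulling that
    distribution toward the natural feature, reweighted per example by
    adversarial strength (feature displacement) and domain uncertainty
    (predicted variance).
    """
    mu, logvar = dist_head(adv_feat)
    z = sample_adversarial_feature(mu, logvar)
    cls_loss = F.cross_entropy(classifier(z), labels)

    # Heuristic reweighting: stronger adversaries and lower-uncertainty
    # distribution estimates receive more alignment weight.
    strength = (adv_feat - nat_feat).norm(dim=1)
    uncertainty = logvar.exp().mean(dim=1)
    weights = torch.softmax(strength / (1.0 + uncertainty), dim=0)

    align_loss = ((mu - nat_feat).pow(2).sum(dim=1) * weights).sum()
    return cls_loss + align_weight * align_loss


if __name__ == "__main__":
    feat_dim, n_classes, batch = 128, 10, 8
    classifier = nn.Linear(feat_dim, n_classes)
    dist_head = AdversarialDistributionHead(feat_dim)
    nat = torch.randn(batch, feat_dim)
    adv = nat + 0.1 * torch.randn(batch, feat_dim)  # stand-in for attacked features
    labels = torch.randint(0, n_classes, (batch,))
    loss = apt_style_loss(classifier, nat, adv, labels, dist_head)
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```

The demo block exercises the loss on random features; in an actual pipeline, nat_feat and adv_feat would come from a backbone applied to clean inputs and their attacked counterparts (e.g., from a PGD attack).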
ISSN: 1057-7149
EISSN: 1941-0042
DOI: 10.1109/TIP.2023.3290532