Restricted Black-box Adversarial Attack Against DeepFake Face Swapping

DeepFake face swapping presents a significant threat to online security and social media, which can replace the source face in an arbitrary photo/video with the target face of an entirely different person. In order to prevent this fraud, some researchers have begun to study the adversarial methods a...

Full description

Saved in:

Bibliographic Details
Published in	IEEE transactions on information forensics and security Vol. 18; p. 1
Main Authors	Dong, Junhao, Wang, Yuan, Lai, Jianhuang, Xie, Xiaohua
Format	Journal Article
Language	English
Published	New York IEEE 01.01.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	adversarial attack Algorithms Black boxes black-box Closed box Deception DeepFake Deepfakes Faces Fraud Generative adversarial networks Glass box Image quality Image reconstruction Perturbation Perturbation methods Queries Regularization substitute model Substitutes Training
Online Access	Get full text

Cover

Loading…

More Information
Summary:	DeepFake face swapping presents a significant threat to online security and social media, which can replace the source face in an arbitrary photo/video with the target face of an entirely different person. In order to prevent this fraud, some researchers have begun to study the adversarial methods against DeepFake or face manipulation. However, existing works mainly focus on the white-box setting or the black-box setting driven by abundant queries, which severely limits the practical application of these methods. To tackle this problem, we introduce a practical adversarial attack that does not require any queries to the facial image forgery model. Our method is built on a substitute model based on face reconstruction and then transfers adversarial examples from the substitute model directly to inaccessible black-box DeepFake models. Specially, we propose the Transferable Cycle Adversary Generative Adversarial Network (TCA-GAN) to construct the adversarial perturbation for disrupting unknown DeepFake systems. We also present a novel post-regularization module for enhancing the transferability of generated adversarial examples. To comprehensively measure the effectiveness of our approaches, we construct a challenging baseline of DeepFake adversarial attacks for future development. Extensive experiments impressively show that the proposed adversarial attack method makes the visual quality of DeepFake face images plummet so that they are easier to be detected by humans and algorithms. Moreover, we demonstrate that the proposed algorithm can be generalized to offer face image protection against various face translation methods.
ISSN:	1556-6013 1556-6021
DOI:	10.1109/TIFS.2023.3266702