Spatial-frequency feature fusion based deepfake detection through knowledge distillation

Bibliographic Details
Published in	Engineering applications of artificial intelligence Vol. 133; p. 108341
Main Authors	Wang, Bo; Wu, Xiaohan; Wang, Fei; Zhang, Yushu; Wei, Fei; Song, Zengren
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.07.2024
Subjects	Deepfake detection; Feature fusion; Frequency domain; Knowledge distillation
Summary:	While the misuse of Deepfake technology is drawing growing concern in the information security literature, forgery detection has become a significant challenge in practical applications. Most state-of-the-art detection methods achieve satisfactory results on raw images, but their performance drops significantly on processed images (e.g., compressed images). In this work, we propose a novel Deepfake detection method that integrates spatial-domain and frequency-domain information within a knowledge distillation framework for efficient forgery detection. Our method consists of two steps: (1) spatial-frequency fusion and (2) multi-knowledge distillation. We first extract frequency-domain and spatial-domain features, then fuse them and use the fused features in attention-based guidance to improve the classification results. The spatial-frequency fusion serves as the basis for both the teacher and student models, and spatial-frequency features and logits are transferred from teacher to student as knowledge. Comprehensive experiments on several benchmark datasets demonstrate that our method generalizes well to compressed images and outperforms state-of-the-art techniques.
• Utilize two-stream fusion for comprehensive spatial-frequency domain exploitation.
• Attention-guided fusion improves feature discriminability for robust classification.
• Teacher model features and logits guide student model for key forgery detection cues.
• Experiments show strong generalization of our method on compressed images.
ISSN:	0952-1976 1873-6769
DOI:	10.1016/j.engappai.2024.108341
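
The summary states that both fused features and logits are transferred from the teacher to the student. The sketch below illustrates one common way such a two-part distillation objective can be written. It is not taken from the article: the temperature-softened KL divergence for logits, the mean squared error for features, and the names `alpha`, `beta`, and `temperature` are all assumptions chosen for illustration, and the record does not say which losses or weights the authors actually use.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: a standard soft-label distillation term, not the paper's loss.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between two equal-length feature vectors.

    Assumption: the fused spatial-frequency features are flattened to vectors.
    """
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(teacher_feat)


def cross_entropy(logits, label):
    """Cross-entropy of the unsoftened prediction against the true class index."""
    return -math.log(softmax(logits)[label])


def student_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                 label, alpha=0.5, beta=0.5, temperature=4.0):
    """Hypothetical total objective: classification + logit KD + feature KD."""
    return (cross_entropy(student_logits, label)
            + alpha * logit_distillation_loss(student_logits, teacher_logits,
                                              temperature)
            + beta * feature_distillation_loss(student_feat, teacher_feat))


# Toy usage: two classes (real, fake) and a 3-dimensional fused feature.
loss = student_loss(
    student_logits=[0.2, 1.1], teacher_logits=[-0.5, 2.0],
    student_feat=[0.1, 0.4, 0.3], teacher_feat=[0.0, 0.5, 0.2],
    label=1,
)
```

Both distillation terms are zero when the student reproduces the teacher exactly, so under these assumptions the weights `alpha` and `beta` only control how strongly the student is pulled toward the teacher relative to the ground-truth label.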