Spatial-frequency feature fusion based deepfake detection through knowledge distillation

Bibliographic Details
Published in: Engineering applications of artificial intelligence, Vol. 133, p. 108341
Main Authors: Wang, Bo; Wu, Xiaohan; Wang, Fei; Zhang, Yushu; Wei, Fei; Song, Zengren
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.07.2024
Summary: As the misuse of Deepfake technology draws growing concern in the information security literature, forgery detection has become a significant challenge in practical applications. Most state-of-the-art detection methods achieve satisfactory results on raw images, but their performance drops significantly on processed images (e.g., compressed images). In this work, we propose a novel Deepfake detection method that integrates spatial-domain and frequency-domain information within a knowledge distillation framework for efficient forgery detection. Our method consists of two steps: (1) spatial-frequency fusion and (2) multi-knowledge distillation. We first extract frequency-domain and spatial-domain features, then fuse them and use the fused features in attention-based guidance to improve the classification results. The spatial-frequency fusion serves as the basis for both the teacher and the student model, and the spatial-frequency features and logits are transferred from teacher to student as knowledge. Comprehensive experiments on several benchmark datasets demonstrate that our method generalizes well to compressed images and outperforms state-of-the-art techniques.
• Utilize two-stream fusion for comprehensive spatial-frequency domain exploitation.
• Attention-guided fusion improves feature discriminability for robust classification.
• Teacher model features and logits guide the student model toward key forgery detection cues.
• Experiments show strong generalization of our method on compressed images.
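The summary names two kinds of knowledge passed from teacher to student: fused spatial-frequency features and logits. The record does not give the loss functions, so the following is only a minimal sketch of how such a two-part distillation objective is commonly written, not the authors' formulation. It assumes a temperature-softened KL term on the logits (the usual Hinton-style choice) and a mean squared error term on the features; the function names, the weights `alpha` and `beta`, and the `temperature` value are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard classification loss against the ground-truth label index."""
    return -math.log(softmax(logits)[label])


def logit_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: the source only says logits are transferred; this KL form
    is a common choice, not one confirmed by the record.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def feature_distillation_loss(student_feat, teacher_feat):
    """Mean squared error between fused feature vectors (assumed form)."""
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


def student_loss(student_logits, teacher_logits, student_feat, teacher_feat,
                 label, alpha=0.5, beta=0.5, temperature=4.0):
    """Hypothetical total objective: label loss plus the two knowledge terms."""
    return (cross_entropy(student_logits, label)
            + alpha * logit_distillation_loss(student_logits, teacher_logits,
                                              temperature)
            + beta * feature_distillation_loss(student_feat, teacher_feat))


if __name__ == "__main__":
    # Toy two-class example (real vs. fake) with made-up numbers.
    loss = student_loss(
        student_logits=[0.2, 1.1],
        teacher_logits=[-0.5, 2.0],
        student_feat=[0.1, 0.4, 0.3],
        teacher_feat=[0.0, 0.5, 0.2],
        label=1,
    )
    print(f"total student loss: {loss:.4f}")
```

When the student reproduces the teacher's logits and features exactly, both knowledge terms vanish and only the label loss remains, which is a quick sanity check on any variant of this objective.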
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2024.108341