Transformer-based cascade networks with spatial and channel reconstruction convolution for deepfake detection

Bibliographic Details
Published in	Mathematical biosciences and engineering : MBE Vol. 21; no. 3; pp. 4142 - 4164
Main Authors	Li, Xue; Zhou, Huibo; Zhao, Ming
Format	Journal Article
Language	English
Published	United States AIMS Press 01.01.2024
Subjects	deepfake detection; redundant; SCConv; transformer; visualization
Summary:	The threat posed by forged video technology has gradually grown to affect individuals, society, and the nation. The technology behind fake videos is becoming increasingly sophisticated, and fake videos are appearing everywhere on the internet. Consequently, addressing the challenge posed by frequent updates in various deepfake detection models is imperative, and the substantial volume of data needed to train them adds to this urgency. For the deepfake detection problem, we propose a cascade network based on spatial and channel reconstruction convolution (SCConv) and the vision transformer. The front portion of the network combines SCConv with regular convolution and works in conjunction with the vision transformer to detect fake videos. We also enhance the feed-forward layer of the vision transformer, which can increase detection accuracy while lowering the model's computational burden. We processed the datasets by splitting videos into frames and extracting faces, obtaining many images of real and fake faces. Experiments on the DFDC, FaceForensics++, and Celeb-DF datasets yielded accuracies of 87.92%, 99.23%, and 99.98%, respectively. Finally, we tested videos for authenticity and obtained good results, including excellent visualization results. Numerous experiments also confirm the efficacy of the model presented in this study.
ISSN:	1551-0018
DOI:	10.3934/mbe.2024183