DECEPTIcON: Bridging Gaps in In-the-Wild Deception Research

Bibliographic Details
Published in	IEEE Transactions on Affective Computing, pp. 1-13
Main Authors	Bicer, Berat; Durmaz, Bahadır; Aras, Serhat; Dibeklioglu, Hamdi
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Accuracy; Affective computing; Annotations; classification metrics; cross-validation; fact-checking; Fake news; Feature extraction; multimodal analysis; Organizations; political deception; Standards organizations; transformer architectures; Transformers; Visualization
Summary:	We present DECEPTIcON, a new large-scale dataset for automatic deception detection. It contains video clips from 100 public figures, mostly politicians, along with manually aligned text transcripts and extracted audio-visual features. Each video is labeled with one of six truth levels from the PolitiFact fact-checking platform, allowing both fine-grained and binary classification tasks. Unlike earlier datasets, DECEPTIcON is designed to study deception in the wild, meaning it includes real-life, unscripted speech from a wide range of people and topics. We test and compare several baseline models for text, audio, and visual input separately, using state-of-the-art pretrained architectures such as MPNet (text), Wav2Vec2 (audio), and VideoMAE (vision). Each model is trained for deception classification using 5-fold subject-independent cross-validation. We report CCR, F1-score, and MAE to evaluate performance. Our results show that text performs best overall, while fusion of multiple inputs leads to small but meaningful improvements. We also analyze the effect of different truth-level grouping strategies and show how attention-based interpretability tools help explain which parts of the input influence model predictions. DECEPTIcON aims to support fair, generalizable, and reproducible research in multimodal deception detection, and the dataset will be made available for research purposes.
ISSN:	1949-3045
DOI:	10.1109/TAFFC.2025.3591205
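
Illustration:	The summary mentions 5-fold subject-independent cross-validation, CCR and MAE scores, and grouping the six truth levels into a binary task. The sketch below shows one way these pieces could fit together; it is not the authors' code. It assumes that CCR means correct classification rate, that MAE means mean absolute error over integer-coded truth levels, and that the binary grouping splits the six PolitiFact ratings down the middle. The function names, the greedy fold balancing, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict

# PolitiFact's six ratings, ordered from most to least truthful.
# Integer codes 0..5 follow this order (an assumption of this sketch).
LEVELS = ["true", "mostly-true", "half-true",
          "mostly-false", "false", "pants-on-fire"]


def subject_independent_folds(subjects, k=5):
    """Yield (train_idx, test_idx) pairs in which no subject is on both sides.

    `subjects` holds one speaker identifier per clip.
    """
    by_subject = defaultdict(list)
    for i, s in enumerate(subjects):
        by_subject[s].append(i)

    # Greedy balancing (hypothetical choice): place subjects with the most
    # clips first, each into the fold that currently holds the fewest clips.
    folds = [[] for _ in range(k)]
    for s in sorted(by_subject, key=lambda s: (-len(by_subject[s]), s)):
        min(folds, key=len).extend(by_subject[s])

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def ccr(y_true, y_pred):
    """Correct classification rate: the fraction of exact matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between integer-coded truth levels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_binary(level, threshold=3):
    """Hypothetical grouping: codes below `threshold` map to 0, the rest to 1."""
    return int(level >= threshold)
```

Splitting by speaker rather than by clip keeps a model from being scored on people it saw during training, which is the point of a subject-independent protocol. MAE is paired with CCR here because the six levels are ordered, so predicting "mostly-true" for a "true" clip is a smaller error than predicting "pants-on-fire".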