DECEPTIcON: Bridging Gaps in In-the-Wild Deception Research


Bibliographic Details
Published in IEEE Transactions on Affective Computing, pp. 1-13
Main Authors Bicer, Berat; Durmaz, Bahadır; Aras, Serhat; Dibeklioglu, Hamdi
Format Journal Article
Language English
Published IEEE 2025
Summary: We present DECEPTIcON, a new large-scale dataset for automatic deception detection. It contains video clips from 100 public figures, mostly politicians, along with manually aligned text transcripts and extracted audio-visual features. Each video is labeled with one of six truth levels from the PolitiFact fact-checking platform, which supports both fine-grained and binary classification tasks. Unlike earlier datasets, DECEPTIcON is designed to study deception in the wild: it includes real-life, unscripted speech from a wide range of people and topics. We test and compare several baseline models for text, audio, and visual input separately, using state-of-the-art pretrained architectures such as MPNet (text), Wav2Vec2 (audio), and VideoMAE (vision). Each model is trained for deception classification using 5-fold subject-independent cross-validation. We report correct classification rate (CCR), F1-score, and mean absolute error (MAE) to evaluate performance. Our results show that text performs best overall, while fusing multiple modalities yields small but meaningful improvements. We also analyze the effect of different truth-level grouping strategies and show how attention-based interpretability tools help explain which parts of the input influence model predictions. DECEPTIcON aims to support fair, generalizable, and reproducible research in multimodal deception detection, and the dataset will be made available for research purposes.
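The evaluation protocol named in the summary (subject-independent folds, CCR, MAE, and grouping six truth levels into two classes) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the greedy fold-balancing rule, and the binary threshold are all assumptions made for illustration, and the record does not state which grouping the paper uses. Only the six PolitiFact rating names are taken as given.

```python
from collections import defaultdict

# PolitiFact's six ratings, ordered from most to least truthful.
# The integer index of each rating serves as an ordinal label (0..5).
TRUTH_LEVELS = ["true", "mostly-true", "half-true",
                "mostly-false", "false", "pants-on-fire"]


def subject_independent_folds(subjects, n_folds=5):
    """Return a fold index per sample so that no subject spans two folds.

    Assumed balancing rule (not from the paper): subjects with the most
    clips are placed first, each into the currently smallest fold.
    """
    counts = defaultdict(int)
    for s in subjects:
        counts[s] += 1
    fold_sizes = [0] * n_folds
    fold_of = {}
    for s, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        k = fold_sizes.index(min(fold_sizes))
        fold_of[s] = k
        fold_sizes[k] += c
    return [fold_of[s] for s in subjects]


def ccr(y_true, y_pred):
    """Correct classification rate: fraction of exact label matches."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between ordinal truth-level indices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_binary(level, threshold=3):
    """Hypothetical grouping: levels 0-2 -> 0 (truthful), 3-5 -> 1 (deceptive)."""
    return int(level >= threshold)


if __name__ == "__main__":
    # Toy data: one speaker id per clip (hypothetical).
    subjects = ["a", "a", "a", "b", "b", "c", "d", "e", "f"]
    folds = subject_independent_folds(subjects, n_folds=3)
    print(folds)
    print(ccr([0, 1, 2, 5], [0, 1, 3, 5]), mae([0, 1, 2, 5], [0, 1, 3, 3]))
```

Keeping each speaker inside a single fold means a model is always tested on people it has not seen during training, which is the point of a subject-independent split. MAE is only meaningful here because the six ratings are ordered; CCR treats them as unordered classes.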
ISSN:1949-3045
DOI:10.1109/TAFFC.2025.3591205