Constructing Robust Emotional State-based Feature with a Novel Voting Scheme for Multi-modal Deception Detection in Videos
Format | Journal Article |
Language | English |
Published | 16.04.2021 |
---|---|
Summary: | Deception detection is an important task that has attracted
considerable research attention due to its wide range of potential
applications, from national security (e.g., airport security, jurisprudence,
and law enforcement) to everyday domains (e.g., business and computer vision).
However, several critical problems remain open and merit further
investigation. One significant challenge in deception detection is data
scarcity: to date, only one multi-modal benchmark dataset for human deception
detection has been publicly released, containing 121 video clips (61 deceptive
and 60 truthful). Such a small amount of data is insufficient to train deep
neural network-based methods, so existing models often suffer from overfitting
and poor generalization. Moreover, the ground-truth data contains frames that
are unusable for several reasons, a problem that most prior work has
overlooked. Therefore, in this paper, we first design a series of data
preprocessing methods to address these problems. We then propose a multi-modal
deception detection framework that constructs our novel emotional state-based
feature and uses the open-source toolkit openSMILE to extract features from the
audio modality. We also design a voting scheme to combine the emotional state
information obtained from the visual and audio modalities, and we derive a
novel emotional state transformation feature with our self-designed
algorithms. In the experiments, we conduct a critical analysis and comparison
of the proposed methods against state-of-the-art multi-modal deception
detection methods. The experimental results show a significant improvement in
overall performance, with accuracy rising from 87.77% to 92.78% and ROC-AUC
from 0.9221 to 0.9265. |
---|---|
DOI: | 10.48550/arxiv.2104.08373 |
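The abstract does not detail the voting scheme that fuses the emotional state
information from the visual and audio modalities. As a rough illustration only
(not the authors' algorithm), one plausible form is a confidence-weighted vote
over per-segment emotion predictions from the two modalities; all names below
are hypothetical:

```python
# Illustrative sketch of a confidence-weighted voting scheme over two
# modalities. This is an assumption for exposition, not the paper's method.
from collections import defaultdict

def combine_emotions(visual_preds, audio_preds):
    """Each input: a list of (emotion_label, confidence) pairs, one per
    video/audio segment. Returns one fused emotion label per segment by
    summing the confidences each label receives across modalities."""
    fused = []
    for (v_label, v_conf), (a_label, a_conf) in zip(visual_preds, audio_preds):
        scores = defaultdict(float)
        scores[v_label] += v_conf   # vote from the visual modality
        scores[a_label] += a_conf   # vote from the audio modality
        fused.append(max(scores, key=scores.get))  # highest total wins
    return fused

visual = [("anger", 0.8), ("neutral", 0.6), ("fear", 0.7)]
audio  = [("anger", 0.5), ("sadness", 0.9), ("fear", 0.4)]
print(combine_emotions(visual, audio))  # ['anger', 'sadness', 'fear']
```

When the two modalities agree, the shared label wins outright; when they
disagree, the more confident modality's label is chosen for that segment.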