Exploring Fusion Strategies in Deep Multimodal Affect Prediction


Bibliographic Details
Published in: Image Analysis and Processing - ICIAP 2022, Vol. 13232, pp. 730-741
Main Authors: Patania, Sabrina; D’Amelio, Alessandro; Lanzarotti, Raffaella
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2022
Series: Lecture Notes in Computer Science

Summary: In this work, we explore the effectiveness of multimodal models for estimating the emotional state expressed continuously in the Valence/Arousal space. We consider four modalities typically adopted for emotion recognition, namely audio (voice), video (facial expression), electrocardiogram (ECG), and electrodermal activity (EDA), and investigate different mixtures of them. To this aim, a CNN-based feature extraction module is adopted for each of the considered modalities, and an RNN-based module models the dynamics of the affective behaviour. The fusion is performed in three different ways: at feature level (after the CNN feature extraction), at model level (combining the RNN layers' outputs), and at prediction level (late fusion). Results obtained on the publicly available RECOLA dataset demonstrate that the use of multiple modalities improves the prediction performance. The best results are achieved by exploiting the contribution of all the considered modalities and employing late fusion, but even mixtures of two modalities (especially audio and video) bring significant benefits.
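
The three fusion strategies described in the summary can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes PyTorch, GRU recurrent layers, placeholder 1D-convolutional encoders, and arbitrary feature and hidden sizes, purely to show where feature-level, model-level, and prediction-level (late) fusion differ in the pipeline.

```python
# Illustrative sketch (not the chapter's code): three fusion strategies for
# multimodal valence/arousal regression. Layer types and sizes are assumptions.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """CNN-style feature extractor for one modality (1D conv as a placeholder)."""
    def __init__(self, in_channels, feat_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )

    def forward(self, x):                      # x: (batch, channels, samples)
        return self.conv(x).squeeze(-1)        # -> (batch, feat_dim)

class FusionModel(nn.Module):
    def __init__(self, in_channels_per_modality, fusion="feature",
                 feat_dim=64, hidden=32):
        super().__init__()
        self.fusion = fusion
        self.encoders = nn.ModuleList(
            ModalityEncoder(c, feat_dim) for c in in_channels_per_modality)
        n = len(in_channels_per_modality)
        if fusion == "feature":                # concatenate features, one shared RNN
            self.rnn = nn.GRU(feat_dim * n, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 2)   # valence, arousal
        elif fusion == "model":                # one RNN per modality, fuse RNN outputs
            self.rnns = nn.ModuleList(
                nn.GRU(feat_dim, hidden, batch_first=True) for _ in range(n))
            self.head = nn.Linear(hidden * n, 2)
        else:                                  # "late": fuse per-modality predictions
            self.rnns = nn.ModuleList(
                nn.GRU(feat_dim, hidden, batch_first=True) for _ in range(n))
            self.heads = nn.ModuleList(nn.Linear(hidden, 2) for _ in range(n))

    def forward(self, xs):
        # xs: list of tensors, one per modality, each (batch, time, channels, samples)
        feats = []
        for enc, x in zip(self.encoders, xs):
            b, t, c, s = x.shape
            f = enc(x.reshape(b * t, c, s)).reshape(b, t, -1)
            feats.append(f)                    # (batch, time, feat_dim)
        if self.fusion == "feature":
            h, _ = self.rnn(torch.cat(feats, dim=-1))
            return self.head(h)                # (batch, time, 2)
        if self.fusion == "model":
            hs = [rnn(f)[0] for rnn, f in zip(self.rnns, feats)]
            return self.head(torch.cat(hs, dim=-1))
        # late fusion: average the per-modality predictions
        preds = [head(rnn(f)[0]) for rnn, f, head in zip(self.rnns, feats, self.heads)]
        return torch.stack(preds).mean(dim=0)
```

For instance, FusionModel([1, 3, 1, 1], fusion="late") would build the late-fusion variant for four input streams (the channel counts are placeholders, not the preprocessing used in the chapter).
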
ISBN:9783031064296
3031064291
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-031-06430-2_61