Multi-Corpus Affect Recognition with Emotion Embeddings and Self-Supervised Representations of Speech

Bibliographic Details
Published in: International Conference on Affective Computing and Intelligent Interaction and Workshops, pp. 1-8
Main Authors: Alisamir, Sina; Ringeval, Fabien; Portet, Francois
Format: Conference Proceeding
Language: English
Published: IEEE, 18.10.2022

Summary: Speech emotion recognition systems use data-driven machine learning techniques that rely on annotated corpora. To achieve usable performance in real-life conditions, we need to exploit multiple datasets, since each one can shed light on specific expressions of affect. However, different corpora use subjectively defined annotation schemes, which makes it challenging to train a model that can sense similar emotions across corpora. Here, we propose a method that relates similar emotions across corpora without being explicitly trained to do so. Our method relies on self-supervised representations, which provide highly contextualised speech representations, and on multi-task learning paradigms, which allow training on different corpora without changing their labelling schemes. The results show that by fine-tuning self-supervised representations on each corpus separately, we can significantly improve the state-of-the-art within-corpus performance. We further demonstrate that training the same model on multiple corpora improves cross-corpus performance, and show that our emotion embeddings can effectively recognise the same emotions across different corpora.
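
The abstract does not spell out the architecture, but the general idea of multi-task training over several corpora can be illustrated with a minimal sketch: a shared emotion-embedding layer sits on top of self-supervised speech features, and each corpus gets its own classification head so its labelling scheme is preserved. The feature dimension (768, as in wav2vec 2.0), embedding size, pooling, corpus names, and label counts below are all illustrative assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

class MultiCorpusEmotionModel(nn.Module):
    """Sketch of a multi-task affect model: shared emotion embedding,
    one output head per corpus. Sizes and corpus names are assumptions."""

    def __init__(self, feat_dim=768, emb_dim=128, corpus_labels=None):
        super().__init__()
        # Hypothetical corpora and class counts, for illustration only.
        corpus_labels = corpus_labels or {"corpus_a": 4, "corpus_b": 8}
        # Shared projection into the emotion-embedding space.
        self.embed = nn.Sequential(nn.Linear(feat_dim, emb_dim), nn.Tanh())
        # One classification head per corpus, keyed by corpus name,
        # so each corpus keeps its own annotation scheme.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(emb_dim, n) for name, n in corpus_labels.items()}
        )

    def forward(self, ssl_feats, corpus):
        # ssl_feats: (batch, frames, feat_dim) frame-level self-supervised
        # representations, e.g. from a fine-tuned wav2vec 2.0 encoder.
        pooled = ssl_feats.mean(dim=1)       # simple mean pooling over time
        emb = self.embed(pooled)             # shared emotion embedding
        return self.heads[corpus](emb), emb  # corpus-specific logits + embedding

# Usage with dummy tensors standing in for SSL encoder outputs.
model = MultiCorpusEmotionModel()
feats = torch.randn(4, 200, 768)             # 4 utterances, 200 frames each
logits, emb = model(feats, corpus="corpus_a")
print(logits.shape, emb.shape)               # (4, 4) and (4, 128)
```

Because every corpus passes through the same embedding layer, utterances carrying similar affect from different corpora can land near each other in the shared space, which is one plausible reading of how the paper's emotion embeddings relate emotions across corpora without explicit cross-corpus supervision.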
ISSN: 2156-8111
DOI: 10.1109/ACII55700.2022.9953840