Multimodal Emotion Recognition Using Transfer Learning on Audio and Text Data
Published in | Computational Science and Its Applications – ICCSA 2021, pp. 552–563
---|---
Main Authors | , ,
Format | Book Chapter
Language | English
Published | Cham: Springer International Publishing
Series | Lecture Notes in Computer Science
Summary | Emotion recognition has been extensively studied in a single modality in the last decade. However, humans usually express their emotions through multiple modalities, such as voice, facial expressions, or text. In this paper, we propose a new method to find a unified emotion representation for multimodal emotion recognition from speech audio and text. An emotion-based feature representation of speech audio is learned with an unsupervised triplet-loss objective, and a text-to-text transformer network is constructed to extract latent emotional meaning. Because training deep neural networks on huge datasets consumes prohibitive resources, transfer learning offers a powerful and reusable alternative: emotion recognition models pre-trained on large audio and text datasets, respectively, are fine-tuned for the task. Automatic multimodal fusion of the emotion-based features from speech audio and text is then performed by a new transformer. Both the accuracy and robustness of the proposed method are evaluated, and we show that our multimodal fusion approach using transfer learning achieves good results.
ISBN | 9783030869694; 3030869695
ISSN | 0302-9743; 1611-3349
DOI | 10.1007/978-3-030-86970-0_39
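The abstract says the audio branch learns emotion-based features with an unsupervised triplet-loss objective. The following is a minimal PyTorch sketch of that general idea only; the `AudioEncoder` architecture, input shapes, and the sampling strategy (segments of the same utterance as anchor/positive pairs) are illustrative assumptions, not the authors' actual design.

```python
# Hedged sketch of triplet-loss embedding learning for audio; all
# architectural choices below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioEncoder(nn.Module):
    """Maps a log-mel spectrogram to a fixed-size emotion embedding."""
    def __init__(self, n_mels: int = 64, embed_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.proj = nn.Linear(256, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_mels, time) -> mean-pool over time -> (batch, embed_dim)
        h = self.conv(x).mean(dim=-1)
        return F.normalize(self.proj(h), dim=-1)

encoder = AudioEncoder()
triplet_loss = nn.TripletMarginLoss(margin=0.2)

# Assumed unsupervised sampling: anchor/positive are two segments of the
# same utterance; negative comes from a different utterance.
anchor, positive, negative = (torch.randn(8, 64, 200) for _ in range(3))
loss = triplet_loss(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
```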
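The fusion step is described only as "a new transformer" over the audio and text features. A hedged sketch of one plausible reading: concatenate the per-modality token sequences, add learned modality-type embeddings, and classify from a pooled representation. The dimensions, pooling, and class count are placeholders, not details from the paper.

```python
# Hedged sketch of transformer-based multimodal fusion; the paper's
# actual fusion architecture may differ.
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    def __init__(self, d_model: int = 128, n_classes: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Learned type embeddings mark which modality a token came from.
        self.modality = nn.Embedding(2, d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, audio_feats: torch.Tensor,
                text_feats: torch.Tensor) -> torch.Tensor:
        # audio_feats: (B, T_audio, d); text_feats: (B, T_text, d)
        a = audio_feats + self.modality.weight[0]
        t = text_feats + self.modality.weight[1]
        fused = self.encoder(torch.cat([a, t], dim=1))
        return self.head(fused.mean(dim=1))  # pooled emotion logits

model = FusionTransformer()
logits = model(torch.randn(2, 50, 128), torch.randn(2, 20, 128))
```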