DialogueINAB: an interaction neural network based on attitudes and behaviors of interlocutors for dialogue emotion recognition


Bibliographic Details
Published in: The Journal of Supercomputing, Vol. 79, No. 18, pp. 20481-20514
Main Authors: Ding, Junyuan; Chen, Xiaoliang; Lu, Peng; Yang, Zaiyan; Li, Xianyong; Du, Yajun
Format: Journal Article
Language: English
Published: New York: Springer US (Springer Nature B.V.), 01.12.2023
Summary: Machines can be equipped with the capability to identify human emotions in conversation, enabling them to empathize with the people they speak to. The emergence of chatbots and intelligent assistants has heightened interest in emotion recognition tasks. Most existing methodologies analyze a speaker's attitudes and behaviors in isolation, disregarding the essential interplay between the latent attitudes of the interlocutors in a conversation and their immediate dialogue behavior. As a result, understanding the underlying causes of a speaker's emotional fluctuations over the course of a dialogue remains a notable challenge. This paper draws on attitude-behavior theory from social psychology to develop a neural network model, the interaction neural network based on attitudes and behaviors of interlocutors for dialogue emotion recognition (DialogueINAB), that emulates the interaction between an interlocutor's attitude and speech behavior in conversation. The model offers new insights for recognizing emotions in conversation from a social-psychology standpoint. DialogueINAB comprises three modules: perception, information interaction, and an emotion classifier. First, DialogueINAB extracts features of the interlocutors' latent attitudes and speech behaviors from the dialogue text. Second, using a crossmodal transformer architecture, the model simulates the interaction between the interlocutors' latent attitudes and speech behaviors and produces emotional features. Finally, the generated emotion features are fed to the emotion classifier for emotion recognition in conversation. We demonstrate the superiority of the proposed method through extensive experiments on three standard datasets (IEMOCAP, MELD, and AVEC).
Compared with six public baseline methods, our model improves the Weighted-F1 metric by 3.56% and 1.04% on the IEMOCAP and MELD datasets, respectively, and reduces the MAE metric by an average of 3.3% on the AVEC dataset.
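The crossmodal interaction step the summary describes can be pictured as attention in which one stream (e.g., attitude features) queries the other (speech-behavior features). The following NumPy sketch is illustrative only: single-head scaled dot-product cross-attention with hypothetical feature dimensions, not the authors' actual DialogueINAB implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def crossmodal_attention(queries, keys_values, d_k):
    # Scaled dot-product attention: the query stream attends to the
    # key/value stream, fusing information across the two modalities.
    scores = queries @ keys_values.T / np.sqrt(d_k)   # (T_q, T_kv)
    weights = softmax(scores, axis=-1)                # rows sum to 1
    return weights @ keys_values                      # (T_q, d)

rng = np.random.default_rng(0)
attitude = rng.standard_normal((5, 64))   # hypothetical attitude features, 5 utterances
behavior = rng.standard_normal((7, 64))   # hypothetical behavior features, 7 utterances

# Attitude stream attends to behavior stream (the symmetric direction
# would swap the arguments); the result keeps the query-side length.
fused = crossmodal_attention(attitude, behavior, d_k=64)
```

A full crossmodal transformer block would add projections, multiple heads, residual connections, and layer normalization; this sketch only shows the core interaction between the two feature streams.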
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-023-05439-1