DialogueINAB: an interaction neural network based on attitudes and behaviors of interlocutors for dialogue emotion recognition
Published in | The Journal of Supercomputing Vol. 79; no. 18; pp. 20481 - 20514 |
---|---|
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.12.2023 (Springer Nature B.V.) |
Summary: | Machines can be equipped with the capability to identify human emotions through conversation, enabling them to empathize with the people they speak with. The emergence of chatbots and intelligent assistants has heightened the focus on emotion recognition tasks. Most existing methodologies analyze the speaker's attitudes and behaviors in isolation, disregarding the essential interplay between the potential attitudes of the interlocutors in a conversation and their subsequent immediate dialogue behavior. As a result, understanding the underlying causes of the speaker's emotional fluctuations over the course of the dialogue remains a notable challenge. This paper draws on attitude-behavior theory from social psychology to develop a neural network model, the interaction neural network based on attitudes and behaviors of interlocutors for dialogue emotion recognition (DialogueINAB), which emulates the interactive process between an interlocutor's attitude and speech behavior in conversation. Our model offers new insights for recognizing emotions in conversations from a social psychology standpoint. DialogueINAB comprises three modules: perception, information interaction, and an emotion classifier. First, DialogueINAB extracts features of the interlocutors' potential attitudes and speech behaviors from the dialogue text. Second, using a crossmodal transformer architecture, the model simulates the interaction between the interlocutors' potential attitudes and speech behaviors and produces emotional features. Finally, the generated emotion features are fed to the emotion classifier for emotion recognition in conversation. We demonstrate the superiority of the proposed method through extensive experiments on three standard datasets (IEMOCAP, MELD, and AVEC). Compared with six public baseline methods, our model improves the Weighted-F1 metric by 3.56% and 1.04% on the IEMOCAP and MELD datasets, respectively, and reduces the MAE metric by an average of 3.3% on the AVEC dataset. |
---|---|
ISSN: | 0920-8542 1573-0484 |
DOI: | 10.1007/s11227-023-05439-1 |
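The information-interaction module described in the abstract pairs attitude features with speech-behavior features through a crossmodal transformer. The core of such a module is cross-attention, where one feature stream provides queries and the other provides keys and values. The sketch below is a minimal, dependency-free illustration of that mechanism, not the paper's implementation; the feature values and the names `attitude` and `behavior` are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query row (here, an
    attitude feature) attends over the key/value rows (here, behavior
    features), yielding one fused feature per query."""
    d = len(keys[0])
    fused_rows = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Convex combination of the value rows.
        fused = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        fused_rows.append(fused)
    return fused_rows

# Toy 2-D features for two utterances (hypothetical values).
attitude = [[1.0, 0.0], [0.0, 1.0]]   # potential-attitude features
behavior = [[0.5, 0.5], [1.0, -1.0]]  # speech-behavior features

fused = cross_attention(attitude, behavior, behavior)
```

In a full crossmodal transformer this exchange would run in both directions (attitude attending to behavior and vice versa), with learned projections, multiple heads, and feed-forward layers stacked on top; the fused features would then feed the emotion classifier.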