Integration of Annotator-wise Estimations for Emotion Recognition by Using Group Softmax
Published in | 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 694-699 |
Format | Conference Proceeding |
Language | English |
Published | APSIPA, 14.12.2021 |
Summary: | In emotion recognition, a major modeling difficulty arises from the differing perceptions of emotion from annotator to annotator. It is common to use a one-hot (dominant) emotion label obtained by majority voting over the annotator-wise (minor) emotion labels. Previous studies show that introducing soft-target labels, which reflect the frequency of annotator-wise labels, improves emotion recognition performance. However, these studies did not use the minor emotion labels directly. Another study used multi-task learning to handle dominant and minor emotions independently, but this independent modeling is inappropriate because the two are closely related. We propose a sequential model composed of multiple annotator-wise classifiers whose outputs are combined by majority voting to estimate the dominant emotion. When multiple classifiers are used, classifier imbalance, where the difficulty of classification differs from classifier to classifier, causes performance degradation. To address this classifier imbalance problem, we assign a group softmax to the multiple annotator-wise classifiers. Experiments show that majority voting over estimated annotator-wise emotions improves the estimation performance for dominant emotions compared with conventional methods that estimate the dominant emotion directly. In addition, the proposed method is effective not only for speech emotion recognition but also for combined speech and text emotion recognition. |
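The core mechanism described in the abstract — a group softmax that normalizes each annotator-wise classifier's logits independently, followed by majority voting over the per-annotator predictions to obtain the dominant emotion — can be sketched as follows. This is a minimal illustrative implementation in numpy, not the authors' code; the shapes, the number of annotators, and the example logits are all hypothetical.

```python
import numpy as np

def group_softmax(logits, n_annotators, n_classes):
    # Split the flat logit vector into one group per annotator and
    # apply softmax within each group independently, so each
    # annotator-wise classifier is normalized on its own.
    groups = logits.reshape(n_annotators, n_classes)
    shifted = groups - groups.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def majority_vote(probs):
    # Each annotator-wise classifier votes for its argmax class;
    # the dominant emotion is the most frequent vote.
    votes = probs.argmax(axis=1)
    return int(np.bincount(votes).argmax())

# Hypothetical example: 3 annotators, 4 emotion classes.
logits = np.array([2.0, 0.1, 0.1, 0.1,   # annotator 1 favors class 0
                   0.1, 3.0, 0.2, 0.1,   # annotator 2 favors class 1
                   1.5, 0.2, 0.1, 0.1])  # annotator 3 favors class 0
probs = group_softmax(logits, n_annotators=3, n_classes=4)
dominant = majority_vote(probs)  # class 0 wins, 2 votes to 1
```

In practice the per-group softmax would sit on top of a shared encoder, and each group would be trained against the corresponding annotator's label; the sketch above only shows the inference-time combination step.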
ISSN: | 2640-0103 |