An optimal model with a lower bound of recall for imbalanced speech emotion recognition
Published in | Multimedia tools and applications Vol. 79; no. 33-34; pp. 24281 - 24301 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.09.2020; Springer Nature B.V |
Subjects | |
Summary: | In an early complaint warning system, a common problem is the lack of angry-emotion samples for training classification models. Moreover, recognizing anger is more important than recognizing non-anger. The main purpose of this paper is therefore to train an optimal model that achieves a recall above a lower bound and a maximum F1 score. The work has three parts: 1) a variant of the F1 score (the TF1 score) takes both recall above a lower bound and the F1 score into consideration; 2) a Single Emotion Deep Neural Network (SEDNN) and its training process are designed to find an optimal model with a maximum TF1 score; 3) a performance comparison of different methods is conducted on the IEMOCAP and Emo-DB databases. Extensive experiments show that when a BCE loss function or a focal loss function is used, the training process can find a model with a recall above a high threshold and a maximum F1 score. In particular, SEDNN with the focal loss function performs better than SEDNN with the BCE loss function. |
---|---|
ISSN: | 1380-7501 1573-7721 |
DOI: | 10.1007/s11042-020-09155-3 |
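
The summary names a recall-constrained variant of the F1 score and two loss functions but does not give their formulas. The sketch below is only an illustration under stated assumptions, not the paper's definitions: `thresholded_f1` is a hypothetical stand-in that returns F1 when recall meets the lower bound and 0 otherwise, and `binary_focal_loss` follows the commonly used focal loss formulation with default `gamma` and `alpha` values that are not taken from the paper.

```python
import math


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1, e.g. anger)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def thresholded_f1(y_true, y_pred, recall_lower_bound):
    """ASSUMED definition (not from the paper): F1 if recall >= bound, else 0.

    Selecting the model with the largest value of this score prefers, among
    models that satisfy the recall constraint, the one with the highest F1.
    """
    _, recall, f1 = precision_recall_f1(y_true, y_pred)
    return f1 if recall >= recall_lower_bound else 0.0


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss; gamma and alpha defaults are assumptions.

    With gamma=0 and alpha=0.5 this reduces to half the binary
    cross-entropy (BCE) loss.
    """
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)        # clip to avoid log(0)
        p_t = p if t == 1 else 1.0 - p         # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)
```

Under this assumed definition, a model whose recall falls below the bound scores 0 regardless of its precision, so it can never be chosen over a model that meets the constraint.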