A hierarchical depression detection model based on vocal and emotional cues

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 441, pp. 279–290
Main Authors: Dong, Yizhuo; Yang, Xinyu
Format: Journal Article
Language: English
Published: Elsevier B.V., 21.06.2021
Summary: Effective and efficient automatic depression diagnosis is a challenging problem in affective computing. Since speech signals carry useful information for diagnosing depression, in this paper we propose to extract deep speaker recognition (SR) and speech emotion recognition (SER) features using pretrained models, and to fuse these two deep speech features to exploit the complementary information between speakers' vocal and emotional differences. In addition, because the amount of data for depression recognition is small and the diagnosis results are cost-sensitive, we propose a hierarchical depression detection model in which multiple classifiers are placed before a regressor to guide the prediction of depression severity. We evaluate our method on the AVEC 2013 and AVEC 2014 benchmark databases. The results demonstrate that fusing deep SR and SER features improves the model's prediction performance. Using only audio features, the proposed method avoids overfitting, outperforms previous audio-based methods on both databases, and yields results comparable to those of video-based and multimodal methods for depression detection.
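
The abstract describes two ideas: feature-level fusion of pretrained SR and SER embeddings, and a hierarchy in which coarse classifiers precede a severity regressor. Below is a minimal Python sketch of that pipeline under stated assumptions: the embedding sizes, the random placeholder features, the BDI-II-style score range, the band threshold of 20, and the choice of SVM/SVR models are all illustrative, not the authors' exact configuration.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)

# Placeholder "deep features": in the paper these come from pretrained SR and
# SER networks; here they are random vectors just to make the pipeline run.
n_samples = 200
sr_feats = rng.normal(size=(n_samples, 512))   # assumed SR embedding size
ser_feats = rng.normal(size=(n_samples, 256))  # assumed SER embedding size
scores = rng.uniform(0, 63, size=n_samples)    # BDI-II-like severity scores

# (1) Feature-level fusion: concatenate the two embeddings per sample.
fused = np.hstack([sr_feats, ser_feats])
fused = StandardScaler().fit_transform(fused)

# (2) Hierarchical detection: a coarse classifier first assigns a severity
# band, then a per-band regressor refines the numeric score. Splitting into
# "low" vs "high" at a score of 20 is a hypothetical threshold.
bands = (scores >= 20).astype(int)
clf = SVC().fit(fused, bands)

regressors = {}
for band in (0, 1):
    mask = bands == band
    regressors[band] = SVR().fit(fused[mask], scores[mask])

def predict(x):
    # Route the sample through the classifier, then the matching regressor.
    band = int(clf.predict(x.reshape(1, -1))[0])
    return regressors[band].predict(x.reshape(1, -1))[0]

print(predict(fused[0]))

Gating the regressor behind classifiers narrows the score range each regressor must cover, which is one plausible reading of how the hierarchy "guides" severity prediction on small, cost-sensitive data.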
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2021.02.019