Emotional stimulated speech-based assisted early diagnosis of depressive disorders using personality-enhanced deep learning

Bibliographic Details
Published in	Journal of affective disorders Vol. 376; pp. 177 - 188
Main Authors	Ding, Zhong; Chen, Jing; Zhong, Bao-Liang; Liu, Chen-Ling; Liu, Zhen-Tao
Format	Journal Article
Language	English
Published	Netherlands: Elsevier B.V., 01.05.2025
Subjects	Adolescent; Adult; Algorithms; Deep Learning; Depressive Disorder, Major - diagnosis; Depressive Disorder, Major - psychology; Early Diagnosis; Emotional stimulus; Emotions; Female; Humans; Major depressive disorder; Male; Middle Aged; Multi-task learning; Personality; Personality recognition; Psychiatric/Mental Health; Speech; Speech depression recognition; Young Adult
Summary:	Early diagnosis of depression is crucial, and speech-based early diagnosis is promising, but insufficient data and a lack of theoretical support make it difficult to apply in practice. It is therefore valuable to combine psychiatric theory, collect speech data for depression recognition, and develop a practicable recognition method. In this study, 24 patients with major depressive disorder (MDD) and 36 healthy controls (HCs) were recruited to participate in a multi-task speech experiment. Descriptive statistics and tests of variance were used to analyze subjects' personality and speech changes. Subsequently, the speech task carrying the most depressive cues was identified using the Bidirectional Long Short-Term Memory (Bi-LSTM) algorithm, and on that task a personality-assisted multi-task deep model, the multi-task attentional temporal convolutional network (TCN-MTA), was proposed. Statistical analyses showed that speech duration differed significantly in the fable reading, neutral stimulus, and negative stimulus tasks, and that the negative stimulus task differed significantly between the depressed and control groups (p < 0.001, 0.03, 0.04). Notably, Big Five emotional stability scores differed significantly between the depressed and control groups (p = 0.03). With Bi-LSTM, depression was best identified from negative stimulus speech (Youden index = 0.44) and positive stimulus speech (Youden index = 0.42). Further, the proposed TCN-MTA reached a specificity of 0.72 and a sensitivity of 0.87 for recognizing depression in negative stimulus speech, outperforming existing methods. The sample size exceeds the minimum calculated with G*Power 3.1 but is still small.
The proposed deep learning-based, personality-assisted multi-task method recognized major depression accurately, demonstrating the potential of methods that fuse specialized theory with artificial intelligence.
•This study built a personality depression dataset with speech on structured, semi-structured and spontaneous tasks.
•HCs were faster than patients with depressive disorders (DDs) on the read-aloud task but took longer than DDs on the other tasks.
•Depression recognition was best based on spontaneous speech and worst based on structured speech.
•The personality-assisted early recognition model of depression (TCN-MTA) was effective (specificity = 0.72, sensitivity = 0.87).
ISSN:	0165-0327, 1573-2517
DOI:	10.1016/j.jad.2025.01.136
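
Note on the Youden index used in the summary: it is conventionally defined as sensitivity + specificity - 1, so it can be derived from any reported sensitivity/specificity pair. The sketch below is only an illustration of that arithmetic and is not taken from the article; the function name `youden_index` is hypothetical, and the value it yields for TCN-MTA is computed here from the reported specificity (0.72) and sensitivity (0.87), not a figure stated in the summary.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Return Youden's J = sensitivity + specificity - 1.

    Hypothetical helper for illustration; both inputs must be
    proportions in the range [0, 1].
    """
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return sensitivity + specificity - 1.0


# Sensitivity and specificity reported in the summary for TCN-MTA
# on negative stimulus speech.
j_tcn_mta = youden_index(sensitivity=0.87, specificity=0.72)
print(round(j_tcn_mta, 2))  # prints 0.59
```

Under this definition, a value of 0 corresponds to a classifier no better than chance and 1 to perfect separation, which is why the summary can use a single number to compare speech tasks.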