Decoupled Multi-perspective Fusion for Speech Depression Detection
Published in | IEEE transactions on affective computing pp. 1 - 15 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | IEEE, 04.02.2025 |
Subjects | |
Summary: | Speech Depression Detection (SDD) has garnered attention from researchers due to its low cost and convenience. However, current algorithms lack methods for extracting interpretable acoustic features based on clinical manifestations. In addition, effectively fusing these features to overcome individual heterogeneity remains a challenge. This study proposes a decoupled multi-perspective fusion (DMPF) model. The model extracts five key features (voiceprint, emotion, pause, energy, and tremor) based on multi-perspective clinical manifestations. These features are then decoupled into common and private features, which are fused through a graph attention network to obtain a comprehensive depression representation. Notably, this study collected a depression speech dataset that includes standardized and comprehensive tasks along with diagnostic labels provided by psychologists. Extensive subject-independent experiments were conducted on the DAIC-WOZ, MODMA, and MPSC datasets. The voiceprint features can automatically cluster the depressed and non-depressed populations. Furthermore, DMPF can effectively fuse common and private features from different perspectives, achieving AUCs of 84.20%, 85.34%, and 86.13% on the three datasets. The results illustrate the interpretability of multi-perspective features and demonstrate that combining speech manifestations can enhance detection ability, providing a multi-perspective observational tool for physicians and clinical practice. Code is available at https://github.com/zmh56/SDD-for-DMPF-MPSC. |
---|---|
ISSN: | 1949-3045 |
DOI: | 10.1109/TAFFC.2025.3538519 |