DCNN and DNN based multi-modal depression recognition
| Published in | International Conference on Affective Computing and Intelligent Interaction and workshops, pp. 484-489 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.10.2017 |
| Subjects | |
| Online Access | Get full text |
| ISSN | 2156-8111 |
| DOI | 10.1109/ACII.2017.8273643 |
Summary: In this paper, we propose an audio-visual multimodal depression recognition framework composed of deep convolutional neural network (DCNN) and deep neural network (DNN) models. For each modality, the corresponding feature descriptors are input into a DCNN to learn high-level global features with compact dynamic information, which are then fed into a DNN to predict the PHQ-8 score. For multimodal depression recognition, the predicted PHQ-8 scores from each modality are integrated in a DNN for the final prediction. In addition, we propose the Histogram of Displacement Range, a novel global visual descriptor that quantifies the range and speed of the facial landmarks' displacements. Experiments carried out on the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset, used for the Depression Sub-challenge of the Audio-Visual Emotion Challenge (AVEC 2016), show that the proposed multimodal depression recognition framework obtains very promising results on both the development set and the test set, outperforming the state-of-the-art results.
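The abstract describes the Histogram of Displacement Range only at a high level, so the sketch below is an assumed interpretation, not the paper's exact definition: given a clip's facial landmark trajectories, it histograms per-frame landmark speeds and per-landmark displacement ranges into fixed bins, yielding one fixed-length global descriptor per video regardless of clip length. The function name, bin count, and `max_disp` cap are all illustrative choices.

```python
import numpy as np

def histogram_of_displacement_range(landmarks, n_bins=10, max_disp=20.0):
    """Sketch of a Histogram-of-Displacement-Range-style descriptor.

    landmarks: array of shape (T, N, 2) -- T frames, N facial
    landmarks, each an (x, y) position in pixels.
    Returns a fixed-length vector summarising how far (range) and how
    fast (speed) the landmarks move over the whole clip.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    # Per-frame displacement magnitude of each landmark (speed proxy),
    # shape (T-1, N).
    step = np.linalg.norm(np.diff(landmarks, axis=0), axis=2)
    # Per-landmark displacement range over the clip: extent of motion
    # along each axis, summarised by the Euclidean norm, shape (N,).
    rng = np.linalg.norm(landmarks.max(axis=0) - landmarks.min(axis=0), axis=1)
    # Histogram speeds and ranges into fixed bins and concatenate, so
    # every clip maps to the same-sized global descriptor.
    h_speed, _ = np.histogram(step, bins=n_bins, range=(0.0, max_disp), density=True)
    h_range, _ = np.histogram(rng, bins=n_bins, range=(0.0, max_disp), density=True)
    return np.concatenate([h_speed, h_range])
```

In the framework described above, a global descriptor like this would be the per-modality input fed to the visual DCNN, whose output features a DNN then maps to a PHQ-8 score.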