Multi-modal depression detection based on emotional audio and evaluation text

Bibliographic Details
Published in Journal of affective disorders, Vol. 295, pp. 904–913
Main Authors Ye, Jiayu, Yu, Yanhong, Wang, Qingxiang, Li, Wentao, Liang, Hu, Zheng, Yunshao, Fu, Gang
Format Journal Article
Language English
Published Elsevier B.V 01.12.2021
Subjects
Abstract •We propose and validate a text-reading experiment that induces rapid emotional change in subjects. •Feature analysis (low-level audio features, DeepSpectrum features and word-vector features). •We propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. Early detection of depression is very important for the treatment of patients. Given the inefficiency of current screening methods, automatic depression identification is a challenging problem with clear application value. We propose a new experimental method for depression detection based on audio and text, and investigate 160 Chinese subjects. Notably, we propose a text-reading experiment designed to induce rapid emotional change in subjects, referred to below as the Segmental Emotional Speech Experiment (SESE). We extract 384-dimensional low-level audio features to characterize differences across emotional changes in SESE. In addition, we propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. Our experiments show that SESE improves depression recognition accuracy and reveals differences in low-level audio features; results are verified across case and control groups and across gender and age. The multi-modal fusion model achieves an accuracy of 0.912 and an F1 score of 0.906. Our contribution is twofold. First, we propose and verify SESE, which offers a new experimental design for follow-up researchers. Second, we propose a new, efficient multi-modal depression recognition model.
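The abstract describes fusing DeepSpectrum audio embeddings with word-vector text features in a deep model. As a rough illustration only, and not the authors' published code, the following PyTorch sketch shows one common way such a late-fusion classifier can be wired up; the dimensions (a 4096-dimensional audio embedding, 300-dimensional word vectors) and the GRU text encoder are assumptions made for the example.

```python
# Hypothetical sketch of a two-branch audio/text fusion classifier (not the
# paper's released implementation). Assumes DeepSpectrum-style audio embeddings
# and padded word-vector sequences have already been extracted.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, audio_dim=4096, embed_dim=300, hidden=128):
        super().__init__()
        # Audio branch: project a fixed-length DeepSpectrum embedding.
        self.audio_fc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        # Text branch: encode a sequence of word vectors with a GRU.
        self.text_gru = nn.GRU(embed_dim, hidden, batch_first=True)
        # Fusion head: concatenate both modalities and classify (2 classes).
        self.head = nn.Linear(hidden * 2, 2)

    def forward(self, audio_feat, word_vecs):
        a = self.audio_fc(audio_feat)                # (B, hidden)
        _, h = self.text_gru(word_vecs)              # h: (1, B, hidden)
        t = h.squeeze(0)                             # (B, hidden)
        return self.head(torch.cat([a, t], dim=-1))  # (B, 2) logits

# Example with random tensors standing in for real features.
model = FusionClassifier()
logits = model(torch.randn(8, 4096), torch.randn(8, 50, 300))
print(logits.shape)  # torch.Size([8, 2])
```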
AbstractList BACKGROUND: Early detection of depression is very important for the treatment of patients. Given the inefficiency of current screening methods, automatic depression identification is a challenging problem with clear application value. METHODS: We propose a new experimental method for depression detection based on audio and text, and investigate 160 Chinese subjects. Notably, we propose a text-reading experiment designed to induce rapid emotional change in subjects, referred to below as the Segmental Emotional Speech Experiment (SESE). We extract 384-dimensional low-level audio features to characterize differences across emotional changes in SESE. In addition, we propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. RESULTS: Our experiments show that SESE improves depression recognition accuracy and reveals differences in low-level audio features; results are verified across case and control groups and across gender and age. The multi-modal fusion model achieves an accuracy of 0.912 and an F1 score of 0.906. CONCLUSIONS: Our contribution is twofold. First, we propose and verify SESE, which offers a new experimental design for follow-up researchers. Second, we propose a new, efficient multi-modal depression recognition model.
Highlights •We propose and validate a text-reading experiment that induces rapid emotional change in subjects. •Feature analysis (low-level audio features, DeepSpectrum features and word-vector features). •We propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning.
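The highlights also mention 384-dimensional low-level audio features; the INTERSPEECH 2009 emotion challenge set is one widely used 384-dimensional configuration, usually extracted with a dedicated toolkit such as openSMILE. The sketch below is a simplified stand-in using librosa: it computes a few frame-level descriptors and summarizes them with mean and standard-deviation functionals, which illustrates the low-level-descriptor idea but does not reproduce the paper's exact 384-dimensional configuration.

```python
# Minimal stand-in for frame-level low-level descriptor (LLD) extraction with
# statistical functionals, assuming librosa and a mono speech recording.
import numpy as np
import librosa

def lld_summary(path, sr=16000):
    y, sr = librosa.load(path, sr=sr)
    # Frame-level descriptors: MFCCs, RMS energy and zero-crossing rate.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12)   # (12, T)
    rms = librosa.feature.rms(y=y)                       # (1, T)
    zcr = librosa.feature.zero_crossing_rate(y)          # (1, T)
    llds = np.vstack([mfcc, rms, zcr])                   # (14, T)
    # Utterance-level functionals: mean and std over time per descriptor.
    return np.concatenate([llds.mean(axis=1), llds.std(axis=1)])  # (28,)

# Hypothetical usage (file name is illustrative only):
# features = lld_summary("subject_reading.wav")
```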
Author Li, Wentao
Liang, Hu
Fu, Gang
Zheng, Yunshao
Ye, Jiayu
Wang, Qingxiang
Yu, Yanhong
Author_xml – sequence: 1
  givenname: Jiayu
  orcidid: 0000-0003-0368-9651
  surname: Ye
  fullname: Ye, Jiayu
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 2
  givenname: Yanhong
  surname: Yu
  fullname: Yu, Yanhong
  organization: College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan 250355, China
– sequence: 3
  givenname: Qingxiang
  surname: Wang
  fullname: Wang, Qingxiang
  email: wangqx@qlu.edu.cn
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 4
  givenname: Wentao
  surname: Li
  fullname: Li, Wentao
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 5
  givenname: Hu
  surname: Liang
  fullname: Liang, Hu
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 6
  givenname: Yunshao
  surname: Zheng
  fullname: Zheng, Yunshao
  organization: Shandong Mental Health Center, Jinan 250014, China
– sequence: 7
  givenname: Gang
  surname: Fu
  fullname: Fu, Gang
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
BookMark eNqFkUFr3DAQhUVJoJukP6C3PfZid0aOLJlCoYSmCST00PYsZGkMcrXWVrJD8-8r7-YUSHrSQ7zvafTmjJ1McSLG3iPUCNh-HOvRuJoDxxpUDR28YRsUsqm4QHnCNsUjKmi4fMvOch4BoO0kbNj9_RJmX-2iM2HraJ8oZx-nImey86p6k8lti6BdXC-KzyzOx62Z3JYeTFjMwTfT3_mCnQ4mZHr3dJ6zX9dff17dVHffv91efbmrbKNaWbkWL3uFHRmuBDSWjMN-aAayjbCEwhHyMh83A4leiR5J2AalNNySENg15-zDMXef4p-F8qx3PlsKwUwUl6y5UFK2l-3BKo9Wm2LOiQZt_XyYeE7GB42g1wL1qEuBei1Qg9KlwELiM3Kf_M6kx1eZT0eGyu8fPCWdrafJkvOp9Kld9K_Sn5_RNvjJWxN-0yPlMS6p1J816sw16B_rUtedcgRQnVAloHs54D-P_wNSMrM7
CitedBy_id crossref_primary_10_1080_01969722_2022_2122009
crossref_primary_10_1109_ACCESS_2024_3454292
crossref_primary_10_1142_S0129065722500459
crossref_primary_10_1145_3673906
crossref_primary_10_3389_fneur_2022_905917
crossref_primary_10_1016_j_cmpb_2023_107923
crossref_primary_10_1007_s00521_024_10868_x
crossref_primary_10_1142_S021800142450023X
crossref_primary_10_11834_jig_240017
crossref_primary_10_2139_ssrn_4102839
crossref_primary_10_1111_jocn_17694
crossref_primary_10_1016_j_compbiomed_2023_107805
crossref_primary_10_1109_TAFFC_2024_3370103
crossref_primary_10_3390_s22041545
crossref_primary_10_1080_00051144_2023_2296793
crossref_primary_10_1007_s42979_024_02730_7
crossref_primary_10_3390_s24020348
crossref_primary_10_3390_diagnostics15020210
crossref_primary_10_1109_TCSVT_2024_3491098
crossref_primary_10_1145_3665247
crossref_primary_10_1109_OJCS_2024_3462812
crossref_primary_10_1109_TCSS_2024_3405949
crossref_primary_10_21015_vtse_v11i2_1501
crossref_primary_10_32604_cmc_2024_056666
crossref_primary_10_35377_saucis___1381522
crossref_primary_10_1016_j_bspc_2022_104561
crossref_primary_10_1155_2023_4604885
crossref_primary_10_1016_j_compbiomed_2023_106741
crossref_primary_10_1016_j_compbiomed_2023_107555
crossref_primary_10_1016_j_inffus_2023_102017
crossref_primary_10_2139_ssrn_4180783
crossref_primary_10_3389_fneur_2024_1394210
crossref_primary_10_1016_j_specom_2022_09_001
crossref_primary_10_2139_ssrn_4172609
Cites_doi 10.3115/v1/W14-4012
10.1109/TBME.2007.900562
10.1016/j.im.2020.103349
10.1080/02699930903407948
10.1109/JBHI.2020.2983035
10.1136/jnnp.2004.036079
10.1001/jama.2017.3826
10.1016/j.csl.2018.08.001
10.1207/s15516709cog1402_1
10.1155/2018/6508319
10.1016/j.jad.2010.06.039
10.1136/jnnp.23.1.56
10.1016/j.jad.2020.11.040
ContentType Journal Article
Copyright 2021
Copyright © 2021. Published by Elsevier B.V.
Copyright_xml – notice: 2021
– notice: Copyright © 2021. Published by Elsevier B.V.
DBID AAYXX
CITATION
7X8
DOI 10.1016/j.jad.2021.08.090
DatabaseName CrossRef
MEDLINE - Academic
DatabaseTitle CrossRef
MEDLINE - Academic
DatabaseTitleList
MEDLINE - Academic


DeliveryMethod fulltext_linktorsrc
EISSN 1573-2517
EndPage 913
ExternalDocumentID 10_1016_j_jad_2021_08_090
S0165032721008958
1_s2_0_S0165032721008958
GroupedDBID ---
--K
--M
.1-
.FO
.~1
0R~
1B1
1P~
1RT
1~.
1~5
4.4
457
4G.
5GY
5VS
7-5
71M
8P~
9JM
AABNK
AAEDW
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AATTM
AAWTL
AAXKI
AAXUO
ABBQC
ABFNM
ABIVO
ABJNI
ABLJU
ABMAC
ABMZM
ACDAQ
ACGFS
ACHQT
ACIEU
ACIUM
ACRLP
ACVFH
ADBBV
ADCNI
ADEZE
AEBSH
AEIPS
AEKER
AENEX
AEUPX
AEVXI
AFPUW
AFRHN
AFTJW
AFXIZ
AGCQF
AGUBO
AGYEJ
AHHHB
AIEXJ
AIGII
AIIUN
AIKHN
AITUG
AJRQY
AJUYK
AKBMS
AKRWK
AKYEP
ALMA_UNASSIGNED_HOLDINGS
AMRAJ
ANKPU
ANZVX
APXCP
AXJTR
BKOJK
BLXMC
BNPGV
CS3
DU5
EBS
EFJIC
EFKBS
EO8
EO9
EP2
EP3
F5P
FDB
FIRID
FNPLU
FYGXN
G-Q
GBLVA
HMQ
HMW
IHE
J1W
KOM
M29
M2V
M39
M3V
M41
MO0
N9A
O-L
O9-
OAUVE
OH0
OU-
OZT
P-8
P-9
P2P
PC.
Q38
ROL
RPZ
SAE
SCC
SDF
SDG
SDP
SEL
SES
SPCBC
SSH
SSZ
T5K
UV1
Z5R
~G-
0SF
29J
53G
AACTN
AAEDT
AAGKA
AAQXK
ABWVN
ABXDB
ACRPL
ADMUD
ADNMO
ADVLN
AFCTW
AFJKZ
AFKWA
AGHFR
AJOXV
AMFUW
ASPBG
AVWKF
AZFZN
EJD
FEDTE
FGOYB
G-2
HEG
HMK
HMO
HVGLF
HZ~
NCXOZ
R2-
RIG
SEW
SNS
SPS
WUQ
ZGI
AAIAV
ABLVK
ABYKQ
EFLBG
LCYCR
ZA5
AAYWO
AAYXX
AGQPQ
AGRNS
CITATION
7X8
ID FETCH-LOGICAL-c3867-d614b819ea28503cead1bf3fec35ce15de120692afe5b85b1e5c3177a2ce55193
IEDL.DBID .~1
ISSN 0165-0327
1573-2517
IngestDate Fri Jul 11 08:05:59 EDT 2025
Tue Jul 01 03:46:19 EDT 2025
Thu Apr 24 23:11:45 EDT 2025
Fri Feb 23 02:41:11 EST 2024
Tue Feb 25 19:59:50 EST 2025
Tue Aug 26 20:09:24 EDT 2025
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
Depression
Artificial intelligence
Multi-modality
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c3867-d614b819ea28503cead1bf3fec35ce15de120692afe5b85b1e5c3177a2ce55193
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ORCID 0000-0003-0368-9651
PQID 2587764619
PQPubID 23479
PageCount 10
ParticipantIDs proquest_miscellaneous_2587764619
crossref_citationtrail_10_1016_j_jad_2021_08_090
crossref_primary_10_1016_j_jad_2021_08_090
elsevier_sciencedirect_doi_10_1016_j_jad_2021_08_090
elsevier_clinicalkeyesjournals_1_s2_0_S0165032721008958
elsevier_clinicalkey_doi_10_1016_j_jad_2021_08_090
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2021-12-01
PublicationDateYYYYMMDD 2021-12-01
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-12-01
  day: 01
PublicationDecade 2020
PublicationTitle Journal of affective disorders
PublicationYear 2021
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Elman (bib0008) 1990; 14
Kim (bib0016) 2014
Lam, Dongyan, Lin (bib0017) 2019
Mikolov, Chen, Corrado, Dean (bib0019) 2013
Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2015. On the properties of neural machine translation: encoder–decoder approaches, 103–111. 10.3115/v1/w14-4012.
Ma, Yang, Chen, Huang, Wang (bib0018) 2016
Yu, Jiang, Ren, Xu, Zhang, Hu (bib0031) 2021; 280
Joormann, Gotlib (bib0014) 2010; 24
Yang, Dai, Yang, Carbonell, Salakhutdinov, Le (bib0029) 2019
Jiang, Hu, Liu, Wang, Zhang, Li, Kang (bib0013) 2018
Aytar, Vondrick, Torralba (bib0003) 2016
Shen, Jia, Nie, Feng, Zhang, Hu, Chua, Zhu (bib0025) 2017
Tlachac, Rundensteiner (bib0027) 2020; 24
Yin, Liang, Ding, Wang (bib0030) 2019
Stasak, Epps, Goecke (bib0026) 2019; 53
Friedrich (bib0010) 2017; 317
Devlin, Chang, Lee, Toutanova (bib0007) 2019
Hamilton (bib0012) 1960; 23
Zhou, Shi, Tian, Qi, Li, Hao, Xu (bib0032) 2016
Moore, Clements, Peifer, Weisser (bib0021) 2008; 55
Alhanai, Ghassemi, Glass (bib0001) 2018
Naranjo, Kornreich, Campanella, Noël, Vandriette, Gillain, de Longueville, Delatte, Verbanck, Constant (bib0022) 2011; 128
Guohou, Lina, Dongsong (bib0011) 2020
Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J., 2016. Pruning convolutional neural networks for resource efficient transfer learning. arXiv1611.06440[cs, stat].
Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin (bib0028) 2017
Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.
Kan, Mimura, Kamijimo, Kawamura (bib0015) 2004; 75
Amiriparian, Gerczuk, Ottl, Cummins, Freitag, Pugachevskiy, Baird, Schuller (bib0002) 2017
Pascanu, Mikolov, Bengio (bib0023) 2013
Schuller, Steidl, Batliner (bib0024) 2009
Chorowski, Bahdanau, Serdyuk, Cho, Bengio (bib0006) 2015
Fan, He, Xing, Cai, Lu (bib0009) 2019
Yin (10.1016/j.jad.2021.08.090_bib0030) 2019
Devlin (10.1016/j.jad.2021.08.090_bib0007) 2019
Fan (10.1016/j.jad.2021.08.090_bib0009) 2019
Lam (10.1016/j.jad.2021.08.090_bib0017) 2019
Alhanai (10.1016/j.jad.2021.08.090_bib0001) 2018
Yu (10.1016/j.jad.2021.08.090_bib0031) 2021; 280
Friedrich (10.1016/j.jad.2021.08.090_bib0010) 2017; 317
Moore (10.1016/j.jad.2021.08.090_bib0021) 2008; 55
Joormann (10.1016/j.jad.2021.08.090_bib0014) 2010; 24
Tlachac (10.1016/j.jad.2021.08.090_bib0027) 2020; 24
Stasak (10.1016/j.jad.2021.08.090_bib0026) 2019; 53
Shen (10.1016/j.jad.2021.08.090_bib0025) 2017
Naranjo (10.1016/j.jad.2021.08.090_bib0022) 2011; 128
Jiang (10.1016/j.jad.2021.08.090_bib0013) 2018
Kim (10.1016/j.jad.2021.08.090_bib0016) 2014
Hamilton (10.1016/j.jad.2021.08.090_bib0012) 1960; 23
Kan (10.1016/j.jad.2021.08.090_bib0015) 2004; 75
Ma (10.1016/j.jad.2021.08.090_bib0018) 2016
10.1016/j.jad.2021.08.090_bib0020
Yang (10.1016/j.jad.2021.08.090_bib0029) 2019
Aytar (10.1016/j.jad.2021.08.090_bib0003) 2016
Pascanu (10.1016/j.jad.2021.08.090_bib0023) 2013
Chorowski (10.1016/j.jad.2021.08.090_bib0006) 2015
Guohou (10.1016/j.jad.2021.08.090_bib0011) 2020
10.1016/j.jad.2021.08.090_bib0005
10.1016/j.jad.2021.08.090_bib0004
Zhou (10.1016/j.jad.2021.08.090_bib0032) 2016
Amiriparian (10.1016/j.jad.2021.08.090_bib0002) 2017
Vaswani (10.1016/j.jad.2021.08.090_bib0028) 2017
Elman (10.1016/j.jad.2021.08.090_bib0008) 1990; 14
Schuller (10.1016/j.jad.2021.08.090_bib0024) 2009
Mikolov (10.1016/j.jad.2021.08.090_bib0019) 2013
References_xml – reference: Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J., 2016. Pruning convolutional neural networks for resource efficient transfer learning. arXiv1611.06440[cs, stat].
– volume: 24
  year: 2020
  ident: bib0027
  article-title: Screening for depression with retrospectively harvested private versus public text
  publication-title: IEEE J. Biomed. Health Inform.
– year: 2017
  ident: bib0028
  article-title: Attention is all you need
  publication-title: Advances in Neural Information Processing Systems
– volume: 128
  start-page: 243
  year: 2011
  end-page: 251
  ident: bib0022
  article-title: Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli
  publication-title: J Affect Disord
– volume: 317
  year: 2017
  ident: bib0010
  article-title: Depression Is the leading cause of disability around the world
  publication-title: JAMA
– start-page: 73
  year: 2019
  end-page: 80
  ident: bib0009
  article-title: Multi-modality depression detection via multi-scale temporal dilated CNNs
  publication-title: AVEC 2019 - Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop
– start-page: 1746
  year: 2014
  end-page: 1751
  ident: bib0016
  article-title: Convolutional neural networks for sentence classification
  publication-title: EMNLP 2014–2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
– start-page: 577
  year: 2015
  end-page: 585
  ident: bib0006
  article-title: Attention-based models for speech recognition
  publication-title: Advances in Neural Information Processing Systems
– volume: 55
  start-page: 96
  year: 2008
  end-page: 97
  ident: bib0021
  article-title: Critical analysis of the impact of glottal features in the classification of clinical depression in speech
  publication-title: IEEE Trans. Biomed. Eng.
– start-page: 207
  year: 2016
  end-page: 212
  ident: bib0032
  article-title: Attention-based bidirectional long short-term memory networks for relation classification
  publication-title: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
– year: 2013
  ident: bib0019
  article-title: Efficient estimation of word representations in vector space
  publication-title: 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings
– start-page: 3946
  year: 2019
  end-page: 3950
  ident: bib0017
  article-title: Context-aware deep learning for multi-modal depression detection
  publication-title: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
– volume: 280
  start-page: 354
  year: 2021
  end-page: 363
  ident: bib0031
  article-title: Detecting changes in attitudes toward depression on Chinese social media: a text analysis
  publication-title: J. Affect. Disord.
– start-page: 1
  year: 2018
  end-page: 9
  ident: bib0013
  article-title: Detecting depression using an ensemble logistic regression model based on multiple speech features
  publication-title: Comput. Math Methods Med.
– year: 2018
  ident: bib0001
  article-title: Detecting depression with audio/text sequence modeling of interviews
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– volume: 24
  year: 2010
  ident: bib0014
  article-title: Emotion regulation in depression: relation to cognitive inhibition
  publication-title: Cognit. Emot.
– start-page: 65
  year: 2019
  end-page: 71
  ident: bib0030
  article-title: A multi-modal hierarchical recurrent neural network for depression detection
  publication-title: AVEC 2019 - Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop, Co-Located with MM 2019
– year: 2017
  ident: bib0002
  article-title: Snore sound classification using image-based deep spectrum features
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– start-page: 3838
  year: 2017
  end-page: 3844
  ident: bib0025
  article-title: Depression detection via harvesting social media: a multimodal dictionary learning solution
  publication-title: IJCAI International Joint Conference on Artificial Intelligence
– start-page: 35
  year: 2016
  end-page: 42
  ident: bib0018
  article-title: DepAudioNet: an efficient deep model for audio based depression classification
  publication-title: AVEC 2016 - Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, Co-Located with ACM Multimedia 2016
– volume: 14
  start-page: 179
  year: 1990
  end-page: 211
  ident: bib0008
  article-title: Finding structure in time
  publication-title: Cogn. Sci.
– year: 2009
  ident: bib0024
  article-title: The INTERSPEECH 2009 emotion challenge
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– reference: Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2015. On the properties of neural machine translation: encoder–decoder approaches, 103–111. 10.3115/v1/w14-4012.
– volume: 23
  start-page: 56
  year: 1960
  end-page: 62
  ident: bib0012
  article-title: A rating scale for depression
  publication-title: J. Neurol. Neurosurg. Psychiatr.
– start-page: 1310
  year: 2013
  end-page: 1318
  ident: bib0023
  article-title: On the difficulty of training recurrent neural networks
  publication-title: 30th International Conference on Machine Learning, ICML 2013
– year: 2016
  ident: bib0003
  article-title: SoundNet: learning sound representations from unlabeled video
  publication-title: Proceedings of the 30th International Conference on Neural Information Processing Systems
– start-page: 4171
  year: 2019
  end-page: 4186
  ident: bib0007
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
  publication-title: NAACL HLT 2019–2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
– volume: 75
  start-page: 1667
  year: 2004
  end-page: 1671
  ident: bib0015
  article-title: Recognition of emotion from moving facial and prosodic stimuli in depressed patients
  publication-title: J. Neurol. Neurosurg. Psychiatry
– reference: Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.
– year: 2019
  ident: bib0029
  article-title: XLNet: generalized autoregressive pretraining for language understanding
  publication-title: Advances in Neural Information Processing Systems
– year: 2020
  ident: bib0011
  article-title: What reveals about depression level? The role of multimodal features at the level of interview questions
  publication-title: Inf. Manag.
– volume: 53
  start-page: 140
  year: 2019
  end-page: 155
  ident: bib0026
  article-title: An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification
  publication-title: Computer Speech Lang.
– ident: 10.1016/j.jad.2021.08.090_bib0020
– start-page: 35
  year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0018
  article-title: DepAudioNet: an efficient deep model for audio based depression classification
– year: 2018
  ident: 10.1016/j.jad.2021.08.090_bib0001
  article-title: Detecting depression with audio/text sequence modeling of interviews
– start-page: 3946
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0017
  article-title: Context-aware deep learning for multi-modal depression detection
– year: 2013
  ident: 10.1016/j.jad.2021.08.090_bib0019
  article-title: Efficient estimation of word representations in vector space
– ident: 10.1016/j.jad.2021.08.090_bib0005
  doi: 10.3115/v1/W14-4012
– start-page: 73
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0009
  article-title: Multi-modality depression detection via multi-scale temporal dilated CNNs
– volume: 55
  start-page: 96
  year: 2008
  ident: 10.1016/j.jad.2021.08.090_bib0021
  article-title: Critical analysis of the impact of glottal features in the classification of clinical depression in speech
  publication-title: IEEE Trans. Biomed. Eng.
  doi: 10.1109/TBME.2007.900562
– year: 2009
  ident: 10.1016/j.jad.2021.08.090_bib0024
  article-title: The INTERSPEECH 2009 emotion challenge
– year: 2020
  ident: 10.1016/j.jad.2021.08.090_bib0011
  article-title: What reveals about depression level? The role of multimodal features at the level of interview questions
  publication-title: Inf. Manag.
  doi: 10.1016/j.im.2020.103349
– start-page: 1310
  year: 2013
  ident: 10.1016/j.jad.2021.08.090_bib0023
  article-title: On the difficulty of training recurrent neural networks
– volume: 24
  year: 2010
  ident: 10.1016/j.jad.2021.08.090_bib0014
  article-title: Emotion regulation in depression: relation to cognitive inhibition
  publication-title: Cognit. Emot.
  doi: 10.1080/02699930903407948
– volume: 24
  year: 2020
  ident: 10.1016/j.jad.2021.08.090_bib0027
  article-title: Screening for depression with retrospectively harvested private versus public text
  publication-title: IEEE J. Biomed. Health Inform.
  doi: 10.1109/JBHI.2020.2983035
– volume: 75
  start-page: 1667
  year: 2004
  ident: 10.1016/j.jad.2021.08.090_bib0015
  article-title: Recognition of emotion from moving facial and prosodic stimuli in depressed patients
  publication-title: J. Neurol. Neurosurg. Psychiatry
  doi: 10.1136/jnnp.2004.036079
– year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0003
  article-title: SoundNet: learning sound representations from unlabeled video
– year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0029
  article-title: XLNet: generalized autoregressive pretraining for language understanding
– volume: 317
  year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0010
  article-title: Depression Is the leading cause of disability around the world
  publication-title: JAMA
  doi: 10.1001/jama.2017.3826
– volume: 53
  start-page: 140
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0026
  article-title: An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification
  publication-title: Computer Speech Lang.
  doi: 10.1016/j.csl.2018.08.001
– volume: 14
  start-page: 179
  year: 1990
  ident: 10.1016/j.jad.2021.08.090_bib0008
  article-title: Finding structure in time
  publication-title: Cogn. Sci.
  doi: 10.1207/s15516709cog1402_1
– start-page: 1
  year: 2018
  ident: 10.1016/j.jad.2021.08.090_bib0013
  article-title: Detecting depression using an ensemble logistic regression model based on multiple speech features
  publication-title: Comput. Math Methods Med.
  doi: 10.1155/2018/6508319
– start-page: 1746
  year: 2014
  ident: 10.1016/j.jad.2021.08.090_bib0016
  article-title: Convolutional neural networks for sentence classification
– start-page: 577
  year: 2015
  ident: 10.1016/j.jad.2021.08.090_bib0006
  article-title: Attention-based models for speech recognition
– start-page: 65
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0030
  article-title: A multi-modal hierarchical recurrent neural network for depression detection
– start-page: 4171
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0007
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
– start-page: 207
  year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0032
  article-title: Attention-based bidirectional long short-term memory networks for relation classification
– year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0002
  article-title: Snore sound classification using image-based deep spectrum features
– volume: 128
  start-page: 243
  year: 2011
  ident: 10.1016/j.jad.2021.08.090_bib0022
  article-title: Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli
  publication-title: J Affect Disord
  doi: 10.1016/j.jad.2010.06.039
– ident: 10.1016/j.jad.2021.08.090_bib0004
– volume: 23
  start-page: 56
  year: 1960
  ident: 10.1016/j.jad.2021.08.090_bib0012
  article-title: A rating scale for depression
  publication-title: J. Neurol. Neurosurg. Psychiatr.
  doi: 10.1136/jnnp.23.1.56
– start-page: 3838
  year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0025
  article-title: Depression detection via harvesting social media: a multimodal dictionary learning solution
– volume: 280
  start-page: 354
  year: 2021
  ident: 10.1016/j.jad.2021.08.090_bib0031
  article-title: Detecting changes in attitudes toward depression on Chinese social media: a text analysis
  publication-title: J. Affect. Disord.
  doi: 10.1016/j.jad.2020.11.040
– year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0028
  article-title: Attention is all you need
SSID ssj0006970
Score 2.5890849
Snippet •We propose and prove a text reading experiment to make subjects emotions change rapidly.•Features analysis (Low-level audio features, DeepSpectrum features...
Highlights•We propose and prove a text reading experiment to make subjects emotions change rapidly. •Features analysis (Low-level audio features, DeepSpectrum...
Early detection of depression is very important for the treatment of patients. In view of the current inefficient screening methods for depression, the...
SourceID proquest
crossref
elsevier
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 904
SubjectTerms Artificial intelligence
Deep learning
Depression
Multi-modality
Psychiatric/Mental Health
Title Multi-modal depression detection based on emotional audio and evaluation text
URI https://www.clinicalkey.com/#!/content/1-s2.0-S0165032721008958
https://www.clinicalkey.es/playcontent/1-s2.0-S0165032721008958
https://dx.doi.org/10.1016/j.jad.2021.08.090
https://www.proquest.com/docview/2587764619
Volume 295
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LS8NAEF5KvXgRRcX6KBE8CWuTzW4ex1Is1dIe1GJvSzbZhRZNimmv_nZn8mhRSwVP2SSzeQyz82C_mSHkRgeBsr3Ip8Zwm3JfhzSMtaJMGcQoILAQA8XR2BtM-ONUTBukV-fCIKyy0v2lTi-0dXWlU3Gzs5jNOs-YiGO7DEIYsGOhwIRfzn2U8rvPDczDC4uGcUhMkbre2SwwXvMIi4WysoonquXttumHli5MT_-QHFQ-o9UtP-uINHR6TEZF6ix9zxK4tcazpjBcFuiq1EIDlVgw0GWrHqCLVskss6I0sTZVvi3EfpyQSf_-pTegVW8EGrsB6LYEzKoCa64jFgAXYhAIRxkX2OuKWDsi0Q6Df2eR0UIFQjlaxOAq-BGLtUCv7ZQ00yzVZ8TyjAH75bkQ-2geY8jHuKdCE_guGKskbBG75oqMq8Lh2L_iTdYIsbkERkpkpMSelqHdIrfrKYuyasYuYlazWtbpoKDAJOj0XZP8bZN0Xi3BXDoyZ9KWv8SkRfh65jdJ--uF17UUSFiBuK0SpTpb5ZKJwPc9DpHo-f8efUH28awEyVyS5vJjpa_A1VmqdiHLbbLXfRgOxngcPr0OvwCx7_4a
linkProvider Elsevier
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LS8NAEB5qe9CLKCrWZwRPQmiyyeZxLMXS2sfFFnpbsskGWjQppv3_zuQlaqngbUl28hhmv9lhv5kBeFSeJw0ncPU4tg3ddpWv-6GSOpMxcRSIWEiB4mTqDOb2y4IvGtCrcmGIVllif4HpOVqXVzqlNjvr5bLzSok4hsUwhEE_5nPvAFpUnYo3odUdjgbTGpAdP-8ZR_N1EqgON3Oa1yqgeqGsKORJyLzbPf0A6tz79E_guNw2at3iy06hoZIzmOTZs_p7GuGtmtKa4HCTE6wSjXxUpOFAFd16cF6wjZapFiSR9lXoWyP6xznM-8-z3kAv2yPooeUhvEXoWSU6dBUwDxURok2YMrZQwxYPlckjZTL8dxbEikuPS1PxEHcLbsBCxWnjdgHNJE3UJWhOHKMLcywMf5QdUtTHbEf6seda6K8ivw1GpRURlrXDqYXFm6hIYiuBihSkSEFtLX2jDU-1yLoonLFvMqtULaqMUMQwgbC-T8jdJaSychVmwhQZE4b4ZSltsGvJb8b21wsfKisQuAjpZCVIVLrNBOOe6zo2BqNX_3v0PRwOZpOxGA-no2s4ojsFZ-YGmpuPrbrFnc9G3pWW_Qlwvv8o
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multi-modal+depression+detection+based+on+emotional+audio+and+evaluation+text&rft.jtitle=Journal+of+affective+disorders&rft.au=Ye%2C+Jiayu&rft.au=Yu%2C+Yanhong&rft.au=Wang%2C+Qingxiang&rft.au=Li%2C+Wentao&rft.date=2021-12-01&rft.issn=0165-0327&rft.volume=295&rft.spage=904&rft.epage=913&rft_id=info:doi/10.1016%2Fj.jad.2021.08.090&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_jad_2021_08_090
thumbnail_m http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Fcdn.clinicalkey.com%2Fck-thumbnails%2F01650327%2FS0165032721X00146%2Fcov150h.gif