Multi-modal depression detection based on emotional audio and evaluation text

Bibliographic Details
Published in Journal of affective disorders, Vol. 295, pp. 904–913
Main Authors Ye, Jiayu, Yu, Yanhong, Wang, Qingxiang, Li, Wentao, Liang, Hu, Zheng, Yunshao, Fu, Gang
Format Journal Article
Language English
Published Elsevier B.V 01.12.2021
Subjects
Abstract •We propose and validate a text-reading experiment that induces rapid emotional change in subjects. •Feature analysis (low-level audio features, DeepSpectrum features and word-vector features). •We propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. Early detection of depression is very important for the treatment of patients. Given the inefficiency of current screening methods, automatic depression identification is a challenging problem with clear application value. We propose a new experimental method for depression detection based on audio and text, and investigate 160 Chinese subjects. Notably, we propose a text-reading experiment designed to induce rapid emotional change in subjects, referred to below as the Segmental Emotional Speech Experiment (SESE). We extract 384-dimensional low-level audio features to characterize differences across emotional changes in SESE. In addition, we propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. Our experiments show that SESE improves depression recognition accuracy and reveals differences in low-level audio features; results are verified across case and control groups and across gender and age. The multi-modal fusion model achieves an accuracy of 0.912 and an F1 score of 0.906. Our contribution is twofold. First, we propose and verify SESE, which offers a new experimental design for follow-up researchers. Second, we propose a new, efficient multi-modal depression recognition model.
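The abstract describes fusing DeepSpectrum audio embeddings with word-vector text features in a deep model. As a rough illustration only, and not the authors' published code, the following PyTorch sketch shows one common way such a late-fusion classifier can be wired up; the dimensions (a 4096-dimensional audio embedding, 300-dimensional word vectors) and the GRU text encoder are assumptions made for the example.

```python
# Hypothetical sketch of a two-branch audio/text fusion classifier (not the
# paper's released implementation). Assumes DeepSpectrum-style audio embeddings
# and padded word-vector sequences have already been extracted.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, audio_dim=4096, embed_dim=300, hidden=128):
        super().__init__()
        # Audio branch: project a fixed-length DeepSpectrum embedding.
        self.audio_fc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        # Text branch: encode a sequence of word vectors with a GRU.
        self.text_gru = nn.GRU(embed_dim, hidden, batch_first=True)
        # Fusion head: concatenate both modalities and classify (2 classes).
        self.head = nn.Linear(hidden * 2, 2)

    def forward(self, audio_feat, word_vecs):
        a = self.audio_fc(audio_feat)                # (B, hidden)
        _, h = self.text_gru(word_vecs)              # h: (1, B, hidden)
        t = h.squeeze(0)                             # (B, hidden)
        return self.head(torch.cat([a, t], dim=-1))  # (B, 2) logits

# Example with random tensors standing in for real features.
model = FusionClassifier()
logits = model(torch.randn(8, 4096), torch.randn(8, 50, 300))
print(logits.shape)  # torch.Size([8, 2])
```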
AbstractList BACKGROUND: Early detection of depression is very important for the treatment of patients. Given the inefficiency of current screening methods, automatic depression identification is a challenging problem with clear application value. METHODS: We propose a new experimental method for depression detection based on audio and text, and investigate 160 Chinese subjects. Notably, we propose a text-reading experiment designed to induce rapid emotional change in subjects, referred to below as the Segmental Emotional Speech Experiment (SESE). We extract 384-dimensional low-level audio features to characterize differences across emotional changes in SESE. In addition, we propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning. RESULTS: Our experiments show that SESE improves depression recognition accuracy and reveals differences in low-level audio features; results are verified across case and control groups and across gender and age. The multi-modal fusion model achieves an accuracy of 0.912 and an F1 score of 0.906. CONCLUSIONS: Our contribution is twofold. First, we propose and verify SESE, which offers a new experimental design for follow-up researchers. Second, we propose a new, efficient multi-modal depression recognition model.
Highlights •We propose and validate a text-reading experiment that induces rapid emotional change in subjects. •Feature analysis (low-level audio features, DeepSpectrum features and word-vector features). •We propose a multi-modal fusion method based on DeepSpectrum features and word-vector features that detects depression using deep learning.
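The highlights also mention 384-dimensional low-level audio features; the INTERSPEECH 2009 emotion challenge set is one widely used 384-dimensional configuration, usually extracted with a dedicated toolkit such as openSMILE. The sketch below is a simplified stand-in using librosa: it computes a few frame-level descriptors and summarizes them with mean and standard-deviation functionals, which illustrates the low-level-descriptor idea but does not reproduce the paper's exact 384-dimensional configuration.

```python
# Minimal stand-in for frame-level low-level descriptor (LLD) extraction with
# statistical functionals, assuming librosa and a mono speech recording.
import numpy as np
import librosa

def lld_summary(path, sr=16000):
    y, sr = librosa.load(path, sr=sr)
    # Frame-level descriptors: MFCCs, RMS energy and zero-crossing rate.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12)   # (12, T)
    rms = librosa.feature.rms(y=y)                       # (1, T)
    zcr = librosa.feature.zero_crossing_rate(y)          # (1, T)
    llds = np.vstack([mfcc, rms, zcr])                   # (14, T)
    # Utterance-level functionals: mean and std over time per descriptor.
    return np.concatenate([llds.mean(axis=1), llds.std(axis=1)])  # (28,)

# Hypothetical usage (file name is illustrative only):
# features = lld_summary("subject_reading.wav")
```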
Author Li, Wentao
Liang, Hu
Fu, Gang
Zheng, Yunshao
Ye, Jiayu
Wang, Qingxiang
Yu, Yanhong
Author_xml – sequence: 1
  givenname: Jiayu
  orcidid: 0000-0003-0368-9651
  surname: Ye
  fullname: Ye, Jiayu
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 2
  givenname: Yanhong
  surname: Yu
  fullname: Yu, Yanhong
  organization: College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine, Jinan 250355, China
– sequence: 3
  givenname: Qingxiang
  surname: Wang
  fullname: Wang, Qingxiang
  email: wangqx@qlu.edu.cn
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 4
  givenname: Wentao
  surname: Li
  fullname: Li, Wentao
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 5
  givenname: Hu
  surname: Liang
  fullname: Liang, Hu
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
– sequence: 6
  givenname: Yunshao
  surname: Zheng
  fullname: Zheng, Yunshao
  organization: Shandong Mental Health Center, Jinan 250014, China
– sequence: 7
  givenname: Gang
  surname: Fu
  fullname: Fu, Gang
  organization: School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
BookMark eNqFkUFr3DAQhUVJoJukP6C3PfZid0aOLJlCoYSmCST00PYsZGkMcrXWVrJD8-8r7-YUSHrSQ7zvafTmjJ1McSLG3iPUCNh-HOvRuJoDxxpUDR28YRsUsqm4QHnCNsUjKmi4fMvOch4BoO0kbNj9_RJmX-2iM2HraJ8oZx-nImey86p6k8lti6BdXC-KzyzOx62Z3JYeTFjMwTfT3_mCnQ4mZHr3dJ6zX9dff17dVHffv91efbmrbKNaWbkWL3uFHRmuBDSWjMN-aAayjbCEwhHyMh83A4leiR5J2AalNNySENg15-zDMXef4p-F8qx3PlsKwUwUl6y5UFK2l-3BKo9Wm2LOiQZt_XyYeE7GB42g1wL1qEuBei1Qg9KlwELiM3Kf_M6kx1eZT0eGyu8fPCWdrafJkvOp9Kld9K_Sn5_RNvjJWxN-0yPlMS6p1J816sw16B_rUtedcgRQnVAloHs54D-P_wNSMrM7
CitedBy_id crossref_primary_10_1080_01969722_2022_2122009
crossref_primary_10_1109_ACCESS_2024_3454292
crossref_primary_10_1142_S0129065722500459
crossref_primary_10_1145_3673906
crossref_primary_10_3389_fneur_2022_905917
crossref_primary_10_1016_j_cmpb_2023_107923
crossref_primary_10_1007_s00521_024_10868_x
crossref_primary_10_1142_S021800142450023X
crossref_primary_10_11834_jig_240017
crossref_primary_10_2139_ssrn_4102839
crossref_primary_10_1111_jocn_17694
crossref_primary_10_1016_j_compbiomed_2023_107805
crossref_primary_10_1109_TAFFC_2024_3370103
crossref_primary_10_3390_s22041545
crossref_primary_10_1080_00051144_2023_2296793
crossref_primary_10_1007_s42979_024_02730_7
crossref_primary_10_3390_s24020348
crossref_primary_10_3390_diagnostics15020210
crossref_primary_10_1109_TCSVT_2024_3491098
crossref_primary_10_1145_3665247
crossref_primary_10_1109_OJCS_2024_3462812
crossref_primary_10_1109_TCSS_2024_3405949
crossref_primary_10_21015_vtse_v11i2_1501
crossref_primary_10_32604_cmc_2024_056666
crossref_primary_10_35377_saucis___1381522
crossref_primary_10_1016_j_bspc_2022_104561
crossref_primary_10_1155_2023_4604885
crossref_primary_10_1016_j_compbiomed_2023_106741
crossref_primary_10_1016_j_compbiomed_2023_107555
crossref_primary_10_1016_j_inffus_2023_102017
crossref_primary_10_2139_ssrn_4180783
crossref_primary_10_3389_fneur_2024_1394210
crossref_primary_10_1016_j_specom_2022_09_001
crossref_primary_10_2139_ssrn_4172609
Cites_doi 10.3115/v1/W14-4012
10.1109/TBME.2007.900562
10.1016/j.im.2020.103349
10.1080/02699930903407948
10.1109/JBHI.2020.2983035
10.1136/jnnp.2004.036079
10.1001/jama.2017.3826
10.1016/j.csl.2018.08.001
10.1207/s15516709cog1402_1
10.1155/2018/6508319
10.1016/j.jad.2010.06.039
10.1136/jnnp.23.1.56
10.1016/j.jad.2020.11.040
ContentType Journal Article
Copyright 2021
Copyright © 2021. Published by Elsevier B.V.
Copyright_xml – notice: 2021
– notice: Copyright © 2021. Published by Elsevier B.V.
DBID AAYXX
CITATION
7X8
DOI 10.1016/j.jad.2021.08.090
DatabaseName CrossRef
MEDLINE - Academic
DatabaseTitle CrossRef
MEDLINE - Academic
DatabaseTitleList
MEDLINE - Academic


DeliveryMethod fulltext_linktorsrc
EISSN 1573-2517
EndPage 913
ExternalDocumentID 10_1016_j_jad_2021_08_090
S0165032721008958
1_s2_0_S0165032721008958
GroupedDBID ---
--K
--M
.1-
.FO
.~1
0R~
1B1
1P~
1RT
1~.
1~5
4.4
457
4G.
5GY
5VS
7-5
71M
8P~
9JM
AABNK
AAEDW
AAIKJ
AAKOC
AALRI
AAOAW
AAQFI
AATTM
AAWTL
AAXKI
AAXUO
ABBQC
ABFNM
ABIVO
ABJNI
ABLJU
ABMAC
ABMZM
ACDAQ
ACGFS
ACHQT
ACIEU
ACIUM
ACRLP
ACVFH
ADBBV
ADCNI
ADEZE
AEBSH
AEIPS
AEKER
AENEX
AEUPX
AEVXI
AFPUW
AFRHN
AFTJW
AFXIZ
AGCQF
AGUBO
AGYEJ
AHHHB
AIEXJ
AIGII
AIIUN
AIKHN
AITUG
AJRQY
AJUYK
AKBMS
AKRWK
AKYEP
ALMA_UNASSIGNED_HOLDINGS
AMRAJ
ANKPU
ANZVX
APXCP
AXJTR
BKOJK
BLXMC
BNPGV
CS3
DU5
EBS
EFJIC
EFKBS
EO8
EO9
EP2
EP3
F5P
FDB
FIRID
FNPLU
FYGXN
G-Q
GBLVA
HMQ
HMW
IHE
J1W
KOM
M29
M2V
M39
M3V
M41
MO0
N9A
O-L
O9-
OAUVE
OH0
OU-
OZT
P-8
P-9
P2P
PC.
Q38
ROL
RPZ
SAE
SCC
SDF
SDG
SDP
SEL
SES
SPCBC
SSH
SSZ
T5K
UV1
Z5R
~G-
0SF
29J
53G
AACTN
AAEDT
AAGKA
AAQXK
ABWVN
ABXDB
ACRPL
ADMUD
ADNMO
ADVLN
AFCTW
AFJKZ
AFKWA
AGHFR
AJOXV
AMFUW
ASPBG
AVWKF
AZFZN
EJD
FEDTE
FGOYB
G-2
HEG
HMK
HMO
HVGLF
HZ~
NCXOZ
R2-
RIG
SEW
SNS
SPS
WUQ
ZGI
AAIAV
ABLVK
ABYKQ
EFLBG
LCYCR
ZA5
AAYWO
AAYXX
AGQPQ
AGRNS
CITATION
7X8
ID FETCH-LOGICAL-c3867-d614b819ea28503cead1bf3fec35ce15de120692afe5b85b1e5c3177a2ce55193
IEDL.DBID .~1
ISSN 0165-0327
1573-2517
IngestDate Fri Jul 11 08:05:59 EDT 2025
Tue Jul 01 03:46:19 EDT 2025
Thu Apr 24 23:11:45 EDT 2025
Fri Feb 23 02:41:11 EST 2024
Tue Feb 25 19:59:50 EST 2025
Tue Aug 26 20:09:24 EDT 2025
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
Depression
Artificial intelligence
Multi-modality
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c3867-d614b819ea28503cead1bf3fec35ce15de120692afe5b85b1e5c3177a2ce55193
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ORCID 0000-0003-0368-9651
PQID 2587764619
PQPubID 23479
PageCount 10
ParticipantIDs proquest_miscellaneous_2587764619
crossref_citationtrail_10_1016_j_jad_2021_08_090
crossref_primary_10_1016_j_jad_2021_08_090
elsevier_sciencedirect_doi_10_1016_j_jad_2021_08_090
elsevier_clinicalkeyesjournals_1_s2_0_S0165032721008958
elsevier_clinicalkey_doi_10_1016_j_jad_2021_08_090
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2021-12-01
PublicationDateYYYYMMDD 2021-12-01
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-12-01
  day: 01
PublicationDecade 2020
PublicationTitle Journal of affective disorders
PublicationYear 2021
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Elman (bib0008) 1990; 14
Kim (bib0016) 2014
Lam, Dongyan, Lin (bib0017) 2019
Mikolov, Chen, Corrado, Dean (bib0019) 2013
Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2015. On the properties of neural machine translation: encoder–decoder approaches, 103–111. 10.3115/v1/w14-4012.
Ma, Yang, Chen, Huang, Wang (bib0018) 2016
Yu, Jiang, Ren, Xu, Zhang, Hu (bib0031) 2021; 280
Joormann, Gotlib (bib0014) 2010; 24
Yang, Dai, Yang, Carbonell, Salakhutdinov, Le (bib0029) 2019
Jiang, Hu, Liu, Wang, Zhang, Li, Kang (bib0013) 2018
Aytar, Vondrick, Torralba (bib0003) 2016
Shen, Jia, Nie, Feng, Zhang, Hu, Chua, Zhu (bib0025) 2017
Tlachac, Rundensteiner (bib0027) 2020; 24
Yin, Liang, Ding, Wang (bib0030) 2019
Stasak, Epps, Goecke (bib0026) 2019; 53
Friedrich (bib0010) 2017; 317
Devlin, Chang, Lee, Toutanova (bib0007) 2019
Hamilton (bib0012) 1960; 23
Zhou, Shi, Tian, Qi, Li, Hao, Xu (bib0032) 2016
Moore, Clements, Peifer, Weisser (bib0021) 2008; 55
Alhanai, Ghassemi, Glass (bib0001) 2018
Naranjo, Kornreich, Campanella, Noël, Vandriette, Gillain, de Longueville, Delatte, Verbanck, Constant (bib0022) 2011; 128
Guohou, Lina, Dongsong (bib0011) 2020
Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J., 2016. Pruning convolutional neural networks for resource efficient transfer learning. arXiv1611.06440[cs, stat].
Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin (bib0028) 2017
Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.
Kan, Mimura, Kamijimo, Kawamura (bib0015) 2004; 75
Amiriparian, Gerczuk, Ottl, Cummins, Freitag, Pugachevskiy, Baird, Schuller (bib0002) 2017
Pascanu, Mikolov, Bengio (bib0023) 2013
Schuller, Steidl, Batliner (bib0024) 2009
Chorowski, Bahdanau, Serdyuk, Cho, Bengio (bib0006) 2015
Fan, He, Xing, Cai, Lu (bib0009) 2019
Yin (10.1016/j.jad.2021.08.090_bib0030) 2019
Devlin (10.1016/j.jad.2021.08.090_bib0007) 2019
Fan (10.1016/j.jad.2021.08.090_bib0009) 2019
Lam (10.1016/j.jad.2021.08.090_bib0017) 2019
Alhanai (10.1016/j.jad.2021.08.090_bib0001) 2018
Yu (10.1016/j.jad.2021.08.090_bib0031) 2021; 280
Friedrich (10.1016/j.jad.2021.08.090_bib0010) 2017; 317
Moore (10.1016/j.jad.2021.08.090_bib0021) 2008; 55
Joormann (10.1016/j.jad.2021.08.090_bib0014) 2010; 24
Tlachac (10.1016/j.jad.2021.08.090_bib0027) 2020; 24
Stasak (10.1016/j.jad.2021.08.090_bib0026) 2019; 53
Shen (10.1016/j.jad.2021.08.090_bib0025) 2017
Naranjo (10.1016/j.jad.2021.08.090_bib0022) 2011; 128
Jiang (10.1016/j.jad.2021.08.090_bib0013) 2018
Kim (10.1016/j.jad.2021.08.090_bib0016) 2014
Hamilton (10.1016/j.jad.2021.08.090_bib0012) 1960; 23
Kan (10.1016/j.jad.2021.08.090_bib0015) 2004; 75
Ma (10.1016/j.jad.2021.08.090_bib0018) 2016
10.1016/j.jad.2021.08.090_bib0020
Yang (10.1016/j.jad.2021.08.090_bib0029) 2019
Aytar (10.1016/j.jad.2021.08.090_bib0003) 2016
Pascanu (10.1016/j.jad.2021.08.090_bib0023) 2013
Chorowski (10.1016/j.jad.2021.08.090_bib0006) 2015
Guohou (10.1016/j.jad.2021.08.090_bib0011) 2020
10.1016/j.jad.2021.08.090_bib0005
10.1016/j.jad.2021.08.090_bib0004
Zhou (10.1016/j.jad.2021.08.090_bib0032) 2016
Amiriparian (10.1016/j.jad.2021.08.090_bib0002) 2017
Vaswani (10.1016/j.jad.2021.08.090_bib0028) 2017
Elman (10.1016/j.jad.2021.08.090_bib0008) 1990; 14
Schuller (10.1016/j.jad.2021.08.090_bib0024) 2009
Mikolov (10.1016/j.jad.2021.08.090_bib0019) 2013
References_xml – reference: Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J., 2016. Pruning convolutional neural networks for resource efficient transfer learning. arXiv1611.06440[cs, stat].
– volume: 24
  year: 2020
  ident: bib0027
  article-title: Screening for depression with retrospectively harvested private versus public text
  publication-title: IEEE J. Biomed. Health Inform.
– year: 2017
  ident: bib0028
  article-title: Attention is all you need
  publication-title: Advances in Neural Information Processing Systems
– volume: 128
  start-page: 243
  year: 2011
  end-page: 251
  ident: bib0022
  article-title: Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli
  publication-title: J Affect Disord
– volume: 317
  year: 2017
  ident: bib0010
  article-title: Depression Is the leading cause of disability around the world
  publication-title: JAMA
– start-page: 73
  year: 2019
  end-page: 80
  ident: bib0009
  article-title: Multi-modality depression detection via multi-scale temporal dilated CNNs
  publication-title: AVEC 2019 - Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop
– start-page: 1746
  year: 2014
  end-page: 1751
  ident: bib0016
  article-title: Convolutional neural networks for sentence classification
  publication-title: EMNLP 2014–2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
– start-page: 577
  year: 2015
  end-page: 585
  ident: bib0006
  article-title: Attention-based models for speech recognition
  publication-title: Advances in Neural Information Processing Systems
– volume: 55
  start-page: 96
  year: 2008
  end-page: 97
  ident: bib0021
  article-title: Critical analysis of the impact of glottal features in the classification of clinical depression in speech
  publication-title: IEEE Trans. Biomed. Eng.
– start-page: 207
  year: 2016
  end-page: 212
  ident: bib0032
  article-title: Attention-based bidirectional long short-term memory networks for relation classification
  publication-title: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
– year: 2013
  ident: bib0019
  article-title: Efficient estimation of word representations in vector space
  publication-title: 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings
– start-page: 3946
  year: 2019
  end-page: 3950
  ident: bib0017
  article-title: Context-aware deep learning for multi-modal depression detection
  publication-title: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
– volume: 280
  start-page: 354
  year: 2021
  end-page: 363
  ident: bib0031
  article-title: Detecting changes in attitudes toward depression on Chinese social media: a text analysis
  publication-title: J. Affect. Disord.
– start-page: 1
  year: 2018
  end-page: 9
  ident: bib0013
  article-title: Detecting depression using an ensemble logistic regression model based on multiple speech features
  publication-title: Comput. Math Methods Med.
– year: 2018
  ident: bib0001
  article-title: Detecting depression with audio/text sequence modeling of interviews
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– volume: 24
  year: 2010
  ident: bib0014
  article-title: Emotion regulation in depression: relation to cognitive inhibition
  publication-title: Cognit. Emot.
– start-page: 65
  year: 2019
  end-page: 71
  ident: bib0030
  article-title: A multi-modal hierarchical recurrent neural network for depression detection
  publication-title: AVEC 2019 - Proceedings of the 9th International Audio/Visual Emotion Challenge and Workshop, Co-Located with MM 2019
– year: 2017
  ident: bib0002
  article-title: Snore sound classification using image-based deep spectrum features
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– start-page: 3838
  year: 2017
  end-page: 3844
  ident: bib0025
  article-title: Depression detection via harvesting social media: a multimodal dictionary learning solution
  publication-title: IJCAI International Joint Conference on Artificial Intelligence
– start-page: 35
  year: 2016
  end-page: 42
  ident: bib0018
  article-title: DepAudioNet: an efficient deep model for audio based depression classification
  publication-title: AVEC 2016 - Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, Co-Located with ACM Multimedia 2016
– volume: 14
  start-page: 179
  year: 1990
  end-page: 211
  ident: bib0008
  article-title: Finding structure in time
  publication-title: Cogn. Sci.
– year: 2009
  ident: bib0024
  article-title: The INTERSPEECH 2009 emotion challenge
  publication-title: Proceedings of the Annual Conference of the International Speech Communication Association
– reference: Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y., 2015. On the properties of neural machine translation: encoder–decoder approaches, 103–111. 10.3115/v1/w14-4012.
– volume: 23
  start-page: 56
  year: 1960
  end-page: 62
  ident: bib0012
  article-title: A rating scale for depression
  publication-title: J. Neurol. Neurosurg. Psychiatr.
– start-page: 1310
  year: 2013
  end-page: 1318
  ident: bib0023
  article-title: On the difficulty of training recurrent neural networks
  publication-title: 30th International Conference on Machine Learning, ICML 2013
– year: 2016
  ident: bib0003
  article-title: SoundNet: learning sound representations from unlabeled video
  publication-title: Proceedings of the 30th International Conference on Neural Information Processing Systems
– start-page: 4171
  year: 2019
  end-page: 4186
  ident: bib0007
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
  publication-title: NAACL HLT 2019–2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
– volume: 75
  start-page: 1667
  year: 2004
  end-page: 1671
  ident: bib0015
  article-title: Recognition of emotion from moving facial and prosodic stimuli in depressed patients
  publication-title: J. Neurol. Neurosurg. Psychiatry
– reference: Bai, S., Kolter, J.Z., Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling.
– year: 2019
  ident: bib0029
  article-title: XLNet: generalized autoregressive pretraining for language understanding
  publication-title: Advances in Neural Information Processing Systems
– year: 2020
  ident: bib0011
  article-title: What reveals about depression level? The role of multimodal features at the level of interview questions
  publication-title: Inf. Manag.
– volume: 53
  start-page: 140
  year: 2019
  end-page: 155
  ident: bib0026
  article-title: An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification
  publication-title: Computer Speech Lang.
– ident: 10.1016/j.jad.2021.08.090_bib0020
– start-page: 35
  year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0018
  article-title: DepAudioNet: an efficient deep model for audio based depression classification
– year: 2018
  ident: 10.1016/j.jad.2021.08.090_bib0001
  article-title: Detecting depression with audio/text sequence modeling of interviews
– start-page: 3946
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0017
  article-title: Context-aware deep learning for multi-modal depression detection
– year: 2013
  ident: 10.1016/j.jad.2021.08.090_bib0019
  article-title: Efficient estimation of word representations in vector space
– ident: 10.1016/j.jad.2021.08.090_bib0005
  doi: 10.3115/v1/W14-4012
– start-page: 73
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0009
  article-title: Multi-modality depression detection via multi-scale temporal dilated CNNs
– volume: 55
  start-page: 96
  year: 2008
  ident: 10.1016/j.jad.2021.08.090_bib0021
  article-title: Critical analysis of the impact of glottal features in the classification of clinical depression in speech
  publication-title: IEEE Trans. Biomed. Eng.
  doi: 10.1109/TBME.2007.900562
– year: 2009
  ident: 10.1016/j.jad.2021.08.090_bib0024
  article-title: The INTERSPEECH 2009 emotion challenge
– year: 2020
  ident: 10.1016/j.jad.2021.08.090_bib0011
  article-title: What reveals about depression level? The role of multimodal features at the level of interview questions
  publication-title: Inf. Manag.
  doi: 10.1016/j.im.2020.103349
– start-page: 1310
  year: 2013
  ident: 10.1016/j.jad.2021.08.090_bib0023
  article-title: On the difficulty of training recurrent neural networks
– volume: 24
  year: 2010
  ident: 10.1016/j.jad.2021.08.090_bib0014
  article-title: Emotion regulation in depression: relation to cognitive inhibition
  publication-title: Cognit. Emot.
  doi: 10.1080/02699930903407948
– volume: 24
  year: 2020
  ident: 10.1016/j.jad.2021.08.090_bib0027
  article-title: Screening for depression with retrospectively harvested private versus public text
  publication-title: IEEE J. Biomed. Health Inform.
  doi: 10.1109/JBHI.2020.2983035
– volume: 75
  start-page: 1667
  year: 2004
  ident: 10.1016/j.jad.2021.08.090_bib0015
  article-title: Recognition of emotion from moving facial and prosodic stimuli in depressed patients
  publication-title: J. Neurol. Neurosurg. Psychiatry
  doi: 10.1136/jnnp.2004.036079
– year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0003
  article-title: SoundNet: learning sound representations from unlabeled video
– year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0029
  article-title: XLNet: generalized autoregressive pretraining for language understanding
– volume: 317
  year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0010
  article-title: Depression Is the leading cause of disability around the world
  publication-title: JAMA
  doi: 10.1001/jama.2017.3826
– volume: 53
  start-page: 140
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0026
  article-title: An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification
  publication-title: Computer Speech Lang.
  doi: 10.1016/j.csl.2018.08.001
– volume: 14
  start-page: 179
  year: 1990
  ident: 10.1016/j.jad.2021.08.090_bib0008
  article-title: Finding structure in time
  publication-title: Cogn. Sci.
  doi: 10.1207/s15516709cog1402_1
– start-page: 1
  year: 2018
  ident: 10.1016/j.jad.2021.08.090_bib0013
  article-title: Detecting depression using an ensemble logistic regression model based on multiple speech features
  publication-title: Comput. Math Methods Med.
  doi: 10.1155/2018/6508319
– start-page: 1746
  year: 2014
  ident: 10.1016/j.jad.2021.08.090_bib0016
  article-title: Convolutional neural networks for sentence classification
– start-page: 577
  year: 2015
  ident: 10.1016/j.jad.2021.08.090_bib0006
  article-title: Attention-based models for speech recognition
– start-page: 65
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0030
  article-title: A multi-modal hierarchical recurrent neural network for depression detection
– start-page: 4171
  year: 2019
  ident: 10.1016/j.jad.2021.08.090_bib0007
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
– start-page: 207
  year: 2016
  ident: 10.1016/j.jad.2021.08.090_bib0032
  article-title: Attention-based bidirectional long short-term memory networks for relation classification
– year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0002
  article-title: Snore sound classification using image-based deep spectrum features
– volume: 128
  start-page: 243
  year: 2011
  ident: 10.1016/j.jad.2021.08.090_bib0022
  article-title: Major depression is associated with impaired processing of emotion in music as well as in facial and vocal stimuli
  publication-title: J Affect Disord
  doi: 10.1016/j.jad.2010.06.039
– ident: 10.1016/j.jad.2021.08.090_bib0004
– volume: 23
  start-page: 56
  year: 1960
  ident: 10.1016/j.jad.2021.08.090_bib0012
  article-title: A rating scale for depression
  publication-title: J. Neurol. Neurosurg. Psychiatr.
  doi: 10.1136/jnnp.23.1.56
– start-page: 3838
  year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0025
  article-title: Depression detection via harvesting social media: a multimodal dictionary learning solution
– volume: 280
  start-page: 354
  year: 2021
  ident: 10.1016/j.jad.2021.08.090_bib0031
  article-title: Detecting changes in attitudes toward depression on Chinese social media: a text analysis
  publication-title: J. Affect. Disord.
  doi: 10.1016/j.jad.2020.11.040
– year: 2017
  ident: 10.1016/j.jad.2021.08.090_bib0028
  article-title: Attention is all you need
SSID ssj0006970
Score 2.5890849
Snippet •We propose and prove a text reading experiment to make subjects emotions change rapidly.•Features analysis (Low-level audio features, DeepSpectrum features...
Highlights•We propose and prove a text reading experiment to make subjects emotions change rapidly. •Features analysis (Low-level audio features, DeepSpectrum...
Early detection of depression is very important for the treatment of patients. In view of the current inefficient screening methods for depression, the...
SourceID proquest
crossref
elsevier
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 904
SubjectTerms Artificial intelligence
Deep learning
Depression
Multi-modality
Psychiatric/Mental Health
Title Multi-modal depression detection based on emotional audio and evaluation text
URI https://www.clinicalkey.com/#!/content/1-s2.0-S0165032721008958
https://www.clinicalkey.es/playcontent/1-s2.0-S0165032721008958
https://dx.doi.org/10.1016/j.jad.2021.08.090
https://www.proquest.com/docview/2587764619
Volume 295
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LS8NAEF5KvXgRRcX6KBE8CWuTzW4ex1Is1dIe1GJvSzbZhRZNimmv_nZn8mhRSwVP2SSzeQyz82C_mSHkRgeBsr3Ip8Zwm3JfhzSMtaJMGcQoILAQA8XR2BtM-ONUTBukV-fCIKyy0v2lTi-0dXWlU3Gzs5jNOs-YiGO7DEIYsGOhwIRfzn2U8rvPDczDC4uGcUhMkbre2SwwXvMIi4WysoonquXttumHli5MT_-QHFQ-o9UtP-uINHR6TEZF6ix9zxK4tcazpjBcFuiq1EIDlVgw0GWrHqCLVskss6I0sTZVvi3EfpyQSf_-pTegVW8EGrsB6LYEzKoCa64jFgAXYhAIRxkX2OuKWDsi0Q6Df2eR0UIFQjlaxOAq-BGLtUCv7ZQ00yzVZ8TyjAH75bkQ-2geY8jHuKdCE_guGKskbBG75oqMq8Lh2L_iTdYIsbkERkpkpMSelqHdIrfrKYuyasYuYlazWtbpoKDAJOj0XZP8bZN0Xi3BXDoyZ9KWv8SkRfh65jdJ--uF17UUSFiBuK0SpTpb5ZKJwPc9DpHo-f8efUH28awEyVyS5vJjpa_A1VmqdiHLbbLXfRgOxngcPr0OvwCx7_4a
linkProvider Elsevier
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV1LS8NAEB5qe9CLKCrWZwRPQmiyyeZxLMXS2sfFFnpbsskGWjQppv3_zuQlaqngbUl28hhmv9lhv5kBeFSeJw0ncPU4tg3ddpWv-6GSOpMxcRSIWEiB4mTqDOb2y4IvGtCrcmGIVllif4HpOVqXVzqlNjvr5bLzSok4hsUwhEE_5nPvAFpUnYo3odUdjgbTGpAdP-8ZR_N1EqgON3Oa1yqgeqGsKORJyLzbPf0A6tz79E_guNw2at3iy06hoZIzmOTZs_p7GuGtmtKa4HCTE6wSjXxUpOFAFd16cF6wjZapFiSR9lXoWyP6xznM-8-z3kAv2yPooeUhvEXoWSU6dBUwDxURok2YMrZQwxYPlckjZTL8dxbEikuPS1PxEHcLbsBCxWnjdgHNJE3UJWhOHKMLcywMf5QdUtTHbEf6seda6K8ivw1GpRURlrXDqYXFm6hIYiuBihSkSEFtLX2jDU-1yLoonLFvMqtULaqMUMQwgbC-T8jdJaSychVmwhQZE4b4ZSltsGvJb8b21wsfKisQuAjpZCVIVLrNBOOe6zo2BqNX_3v0PRwOZpOxGA-no2s4ojsFZ-YGmpuPrbrFnc9G3pWW_Qlwvv8o
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multi-modal+depression+detection+based+on+emotional+audio+and+evaluation+text&rft.jtitle=Journal+of+affective+disorders&rft.au=Ye%2C+Jiayu&rft.au=Yu%2C+Yanhong&rft.au=Wang%2C+Qingxiang&rft.au=Li%2C+Wentao&rft.date=2021-12-01&rft.issn=0165-0327&rft.volume=295&rft.spage=904&rft.epage=913&rft_id=info:doi/10.1016%2Fj.jad.2021.08.090&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_jad_2021_08_090
thumbnail_m http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Fcdn.clinicalkey.com%2Fck-thumbnails%2F01650327%2FS0165032721X00146%2Fcov150h.gif