BLSTM and CNN Stacking Architecture for Speech Emotion Recognition

Speech Emotion Recognition (SER) is a major challenge: it requires distinguishing and interpreting the sentiments carried in speech. Fortunately, deep learning has proved highly capable of handling acoustic features. For instance, Bidirectional Long Short Term Memory (BLSTM) has an advantage of solving...

Bibliographic Details
Published in	Neural processing letters Vol. 53; no. 6; pp. 4097 - 4115
Main Authors	Li, Dongdong; Sun, Linyu; Xu, Xinlei; Wang, Zhe; Zhang, Jing; Du, Wenli
Format	Journal Article
Language	English
Published	New York Springer US 01.12.2021 Springer Nature B.V
Subjects	Algorithms; Artificial Intelligence; Artificial neural networks; Classification; Complex Systems; Computational Intelligence; Computer Science; Deep learning; Emotion recognition; Emotions; Machine learning; Neural networks; Speech; Speech recognition; Statistical analysis; Support vector machines; Time series; Stacking; Convolutional neural network; Bidirectional long short term memory; Speech emotion recognition
ISSN	1370-4621 1573-773X
DOI	10.1007/s11063-021-10581-z

Abstract	Speech Emotion Recognition (SER) is a major challenge: it requires distinguishing and interpreting the sentiments carried in speech. Fortunately, deep learning has proved highly capable of handling acoustic features. For instance, Bidirectional Long Short Term Memory (BLSTM) is well suited to time-series acoustic features, and a Convolutional Neural Network (CNN) can discover the local structure among different features. This paper proposes the BLSTM and CNN Stacking Architecture (BCSA) to improve emotion recognition. Feature matrices are sliced to match the input formats of the BLSTM and the CNN. To exploit the different roles of the two networks, Stacking is employed to integrate them. Specifically, to limit overfitting, the probability estimates from the BLSTM and the CNN are combined into new data using K-fold cross validation. Finally, a logistic regression model is fitted to the new data to recognize emotions. The experimental results demonstrate that the proposed architecture performs better than either single model. Furthermore, compared with the state-of-the-art SER models known to the authors, the proposed BCSA may be more suitable for SER because it integrates time-series acoustic features with the local structure among different features.
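The stacking step described in the abstract (base-model probability estimates gathered through K-fold cross validation, then fitted by logistic regression) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `CentroidModel` class is a hypothetical stand-in for the BLSTM and the CNN, the toy data, fold layout, learning rate, and every function name are assumptions made for the example, and nothing here reproduces the paper's features or results.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class CentroidModel:
    """Hypothetical stand-in for a base network (BLSTM or CNN).

    It sees only some feature columns and scores each class by the
    negative squared distance to that class's mean vector.
    """
    def __init__(self, cols, n_classes):
        self.cols, self.n_classes = cols, n_classes

    def _view(self, x):
        return [x[j] for j in self.cols]

    def fit(self, X, y):
        self.centroids = []
        for c in range(self.n_classes):
            rows = [self._view(x) for x, t in zip(X, y) if t == c]
            self.centroids.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def predict_proba(self, X):
        return [softmax([-sum((a - b) ** 2 for a, b in zip(self._view(x), cen))
                         for cen in self.centroids]) for x in X]

def out_of_fold_probs(factories, X, y, k, n_classes):
    """Build meta-features: each row holds probabilities predicted by
    base models that never saw that row during training."""
    n = len(X)
    meta = [[0.0] * (len(factories) * n_classes) for _ in range(n)]
    for fold in range(k):
        held_out = list(range(fold, n, k))          # simple interleaved split
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for m, make in enumerate(factories):
            probs = make().fit(X_tr, y_tr).predict_proba([X[i] for i in held_out])
            for i, p in zip(held_out, probs):
                meta[i][m * n_classes:(m + 1) * n_classes] = p
    return meta

def fit_logistic(Z, y, n_classes, lr=0.5, epochs=200):
    """Multinomial logistic regression trained by batch gradient descent."""
    d = len(Z[0]) + 1                                # +1 for the bias term
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * d for _ in range(n_classes)]
        for z, t in zip(Z, y):
            zb = z + [1.0]
            p = softmax([sum(w * v for w, v in zip(row, zb)) for row in W])
            for c in range(n_classes):
                err = p[c] - (1.0 if c == t else 0.0)
                for j in range(d):
                    grad[c][j] += err * zb[j]
        for c in range(n_classes):
            for j in range(d):
                W[c][j] -= lr * grad[c][j] / len(Z)
    return W

def predict(W, Z):
    """Pick the highest-scoring class for each meta-feature row."""
    scores = [[sum(w * v for w, v in zip(row, z + [1.0])) for row in W] for z in Z]
    return [s.index(max(s)) for s in scores]

# Toy data (assumed): 3 "emotion" classes, 4 features, class-dependent means.
rng = random.Random(0)
N_CLASSES, K = 3, 5
y = [i % N_CLASSES for i in range(60)]
X = [[rng.gauss(3.0 * t, 0.5) for _ in range(4)] for t in y]

factories = [lambda: CentroidModel([0, 1], N_CLASSES),   # plays the BLSTM role
             lambda: CentroidModel([2, 3], N_CLASSES)]   # plays the CNN role
meta_features = out_of_fold_probs(factories, X, y, K, N_CLASSES)
W = fit_logistic(meta_features, y, N_CLASSES)
accuracy = sum(p == t for p, t in zip(predict(W, meta_features), y)) / len(y)
print(f"meta-learner training accuracy: {accuracy:.2f}")
```

The point of the out-of-fold loop is that every row of `meta_features` comes from base models trained without that row, which is how stacking keeps the logistic regression from simply learning the base models' overfitted training outputs. The accuracy printed here is measured on the same toy rows used to fit the meta-learner, so it says nothing about the performance reported in the article.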
Author	Wang, Zhe; Sun, Linyu; Zhang, Jing; Li, Dongdong; Xu, Xinlei; Du, Wenli
Author_xml	– sequence: 1 givenname: Dongdong surname: Li fullname: Li, Dongdong organization: Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Department of Computer Science and Engineering, East China University of Science and Technology, Provincial Key Laboratory for Computer Information Processing Technology, Soochow University – sequence: 2 givenname: Linyu surname: Sun fullname: Sun, Linyu organization: Department of Computer Science and Engineering, East China University of Science and Technology – sequence: 3 givenname: Xinlei surname: Xu fullname: Xu, Xinlei organization: Department of Computer Science and Engineering, East China University of Science and Technology – sequence: 4 givenname: Zhe surname: Wang fullname: Wang, Zhe email: wangzhe@ecust.edu.cn organization: Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Department of Computer Science and Engineering, East China University of Science and Technology – sequence: 5 givenname: Jing surname: Zhang fullname: Zhang, Jing organization: Department of Computer Science and Engineering, East China University of Science and Technology – sequence: 6 givenname: Wenli surname: Du fullname: Du, Wenli email: wldu@ecust.edu.cn organization: Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology
CitedBy_id	crossref_primary_10_3390_foods11142019 crossref_primary_10_1109_ACCESS_2024_3517733 crossref_primary_10_3390_electronics13061151 crossref_primary_10_1007_s00217_023_04375_x crossref_primary_10_1177_14727978251321951 crossref_primary_10_1007_s11042_023_16465_9 crossref_primary_10_1007_s00170_024_13385_2 crossref_primary_10_1016_j_cjche_2024_06_026 crossref_primary_10_1016_j_specom_2023_01_008 crossref_primary_10_1109_JIOT_2024_3360094 crossref_primary_10_3390_s23136212 crossref_primary_10_1007_s11063_023_11259_4 crossref_primary_10_1016_j_neucom_2023_126623 crossref_primary_10_1007_s11063_022_11036_9 crossref_primary_10_3233_JIFS_219390 crossref_primary_10_1007_s11042_023_17829_x crossref_primary_10_2478_ijssis_2024_0027 crossref_primary_10_1016_j_apacoust_2023_109658 crossref_primary_10_1016_j_eswa_2023_123110
Cites_doi	10.1155/2009/153017 10.1016/j.ins.2020.09.047 10.1109/TASLP.2016.2540805 10.1109/MSP.2012.2205597 10.1038/nature14539 10.1007/s10489-018-1206-2 10.1109/JSTSP.2017.2764438 10.1016/B978-0-08-051584-7.50010-3 10.1016/j.dsp.2012.10.008 10.1109/T-AFFC.2010.1 10.1007/s10579-008-9076-6 10.1016/j.patrec.2014.10.015 10.1109/TMM.2017.2766843 10.1016/S0893-6080(05)80023-1 10.3390/electronics9050713 10.21437/Interspeech.2017-917 10.1109/ICCIC.2015.7435630 10.21437/Interspeech.2009-103 10.1016/j.neunet.2017.02.013 10.1109/ICSPCS.2017.8270472 10.1016/j.specom.2010.08.013 10.1109/ICASSP.2017.7952552 10.1109/ICASSP.2016.7472166 10.1109/ISCAS.2010.5537907 10.4028/www.scientific.net/AMM.610.283 10.1007/s10489-018-1242-y 10.21437/Interspeech.2014-433 10.1109/APSIPA.2017.8282123 10.1016/j.eswa.2020.114177 10.1109/ICASSP.2017.7952120 10.1007/978-3-030-41299-9_34 10.1145/2502081.2502224 10.1109/ICASSP.2016.7471734 10.1007/978-3-030-27535-8_43
ContentType	Journal Article
Copyright	The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021 Copyright Springer Nature B.V. Dec 2021
Copyright_xml	– notice: The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021 – notice: Copyright Springer Nature B.V. Dec 2021
DBID	AAYXX CITATION 8FE 8FG AFKRA ARAPS AZQEC BENPR BGLVJ CCPQU DWQXO GNUQQ HCIFZ JQ2 K7- P5Z P62 PHGZM PHGZT PKEHL PQEST PQGLB PQQKQ PQUKI PRINS PSYQQ
DOI	10.1007/s11063-021-10581-z
DatabaseName	CrossRef ProQuest SciTech Collection ProQuest Technology Collection ProQuest Central UK/Ireland ProQuest Advanced Technologies & Aerospace Collection ProQuest Central Essentials ProQuest Central Technology Collection ProQuest One Community College ProQuest Central ProQuest Central Student ProQuest SciTech Premium Collection ProQuest Computer Science Collection Computer Science Database ProQuest Advanced Technologies & Aerospace Database (NC LIVE) ProQuest Advanced Technologies & Aerospace Collection ProQuest Central Premium ProQuest One Academic (New) ProQuest One Academic Middle East (New) ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Applied & Life Sciences ProQuest One Academic ProQuest One Academic UKI Edition ProQuest Central China ProQuest One Psychology
DatabaseTitle	CrossRef Advanced Technologies & Aerospace Collection ProQuest One Psychology Computer Science Database ProQuest Central Student Technology Collection ProQuest One Academic Middle East (New) ProQuest Advanced Technologies & Aerospace Collection ProQuest Central Essentials ProQuest Computer Science Collection ProQuest One Academic Eastern Edition SciTech Premium Collection ProQuest One Community College ProQuest Technology Collection ProQuest SciTech Collection ProQuest Central China ProQuest Central Advanced Technologies & Aerospace Database ProQuest One Applied & Life Sciences ProQuest One Academic UKI Edition ProQuest Central Korea ProQuest Central (New) ProQuest One Academic ProQuest One Academic (New)
DatabaseTitleList	Advanced Technologies & Aerospace Collection
Database_xml	– sequence: 1 dbid: 8FG name: ProQuest Technology Collection url: https://search.proquest.com/technologycollection1 sourceTypes: Aggregation Database
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISSN	1573-773X
EndPage	4115
ExternalDocumentID	10_1007_s11063_021_10581_z
GrantInformation_xml	– fundername: Natural Science Foundation of China grantid: 62076094 – fundername: Natural Science Foundations of China grantid: 61806078 – fundername: Shanghai Science and Technology Program “Distributed and generative few-shot algorithm and theory research” grantid: 20511100600 – fundername: National Major Scientific and Technological Special Project for “Significant New Drugs Development” grantid: 2019ZX09201004
IEDL.DBID	BENPR
ISSN	1370-4621
IngestDate	Wed Aug 13 10:40:16 EDT 2025 Tue Jul 01 01:09:35 EDT 2025 Thu Apr 24 22:58:35 EDT 2025 Fri Feb 21 02:47:43 EST 2025
IsPeerReviewed	true
IsScholarly	true
Issue	6
Keywords	Stacking; Convolutional neural network; Bidirectional long short term memory; Speech emotion recognition
Language	English
LinkModel	DirectLink
PQID	2918343010
PQPubID	2043838
PageCount	19
ParticipantIDs	proquest_journals_2918343010 crossref_primary_10_1007_s11063_021_10581_z crossref_citationtrail_10_1007_s11063_021_10581_z springer_journals_10_1007_s11063_021_10581_z
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	20211200 2021-12-00 20211201
PublicationDateYYYYMMDD	2021-12-01
PublicationDate_xml	– month: 12 year: 2021 text: 20211200
PublicationDecade	2020
PublicationPlace	New York
PublicationPlace_xml	– name: New York – name: Dordrecht
PublicationTitle	Neural processing letters
PublicationTitleAbbrev	Neural Process Lett
PublicationYear	2021
Publisher	Springer US Springer Nature B.V
Publisher_xml	– name: Springer US – name: Springer Nature B.V
SSID	ssj0010020
Score	2.390752
SourceID	proquest crossref springer
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	4097
SubjectTerms	Algorithms; Artificial Intelligence; Artificial neural networks; Classification; Complex Systems; Computational Intelligence; Computer Science; Deep learning; Emotion recognition; Emotions; Machine learning; Neural networks; Speech; Speech recognition; Statistical analysis; Support vector machines; Time series
Title	BLSTM and CNN Stacking Architecture for Speech Emotion Recognition
URI	https://link.springer.com/article/10.1007/s11063-021-10581-z https://www.proquest.com/docview/2918343010
Volume	53
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
linkProvider	ProQuest