Mood detection from daily conversational speech using denoising autoencoder and LSTM

Bibliographic Details
Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 5125-5129
Main Authors Kun-Yi Huang, Chung-Hsien Wu, Ming-Hsiang Su, Hsiang-Chi Fu
Format Conference Proceeding
Language English
Published IEEE 01.03.2017
Abstract In current studies, an extended subjective self-report method is generally used to measure emotions. Although it is commonly accepted that the speech emotion perceived by the listener is close to the emotion the speaker intends to convey, research has indicated that a mismatch between the two remains. In addition, individuals with different personalities generally express emotions differently. Based on these observations, this study first develops a support vector machine (SVM)-based emotion model to detect perceived emotion from daily conversational speech. Then, a denoising autoencoder (DAE) is used to construct an emotion conversion model that characterizes the relationship between the perceived emotion and the expressed emotion of the subject for a specific personality. Finally, a long short-term memory (LSTM)-based mood model is constructed to model the temporal fluctuation of speech emotions for mood detection. Experimental results show that the proposed method achieved a detection accuracy of 64.5%, improving by 5.0% compared to the HMM-based method.
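The abstract describes a three-stage flow: per-utterance emotion detection, a denoising-autoencoder conversion step, and an LSTM that tracks emotions over time. The sketch below is only an illustration of two of the underlying building blocks, not the authors' implementation: the masking-noise corruption typically applied to a denoising autoencoder's input during training, and a plain LSTM forward pass over a sequence of emotion profiles. Everything in it is an assumption made for illustration: the function names, the four-class emotion profiles, the hidden size, and the random untrained weights. The record does not state the paper's features, emotion categories, network sizes, or training procedure.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add_masking_noise(profile, drop_prob, rng):
    """DAE-style input corruption: zero each entry with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else p for p in profile]


def init_params(input_size, hidden_size, rng):
    """Random, untrained weights for the four LSTM gates (illustrative only)."""
    n = input_size + hidden_size
    return {
        name: (
            [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(hidden_size)],
            [0.0] * hidden_size,
        )
        for name in ("i", "f", "o", "g")
    }


def lstm_step(x, h, c, params):
    """One standard LSTM step; each gate's weights act on the concatenation [x, h]."""
    z = x + h  # list concatenation
    gate = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [a + bias for a, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gate[name] = [act(p) for p in pre]
    # Cell state: forget part of the old state, add gated candidate values.
    c_new = [f * c_old + i * g
             for f, c_old, i, g in zip(gate["f"], c, gate["i"], gate["g"])]
    # Hidden state: output gate applied to the squashed cell state.
    h_new = [o * math.tanh(cn) for o, cn in zip(gate["o"], c_new)]
    return h_new, c_new


def run_lstm(sequence, params, hidden_size):
    """Run the LSTM over a sequence of vectors and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


# Hypothetical input: one emotion profile (four made-up classes) per utterance.
profiles = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.25, 0.25, 0.25, 0.25],
]
rng = random.Random(0)
noisy = [add_masking_noise(p, 0.2, rng) for p in profiles]
params = init_params(input_size=4, hidden_size=3, rng=rng)
print(run_lstm(noisy, params, hidden_size=3))
```

In a full system the final hidden state would feed a trained classification layer over mood classes, and the corruption step would be used only while training the autoencoder; both are omitted here because the record gives no details about them.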
Author_xml – sequence: 1
  surname: Huang
  fullname: Kun-Yi Huang
  organization: Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
– sequence: 2
  surname: Wu
  fullname: Chung-Hsien Wu
  organization: Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
– sequence: 3
  surname: Su
  fullname: Ming-Hsiang Su
  organization: Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
– sequence: 4
  surname: Fu
  fullname: Hsiang-Chi Fu
  organization: Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
ContentType Conference Proceeding
DOI 10.1109/ICASSP.2017.7953133
Discipline Engineering
EISBN 1509041176
9781509041176
EISSN 2379-190X
EndPage 5129
ExternalDocumentID 7953133
Genre orig-research
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 5
ParticipantIDs ieee_primary_7953133
PublicationCentury 2000
PublicationDate 2017-March
PublicationDateYYYYMMDD 2017-03-01
PublicationDate_xml – month: 03
  year: 2017
  text: 2017-March
PublicationDecade 2010
PublicationTitle Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998)
PublicationTitleAbbrev ICASSP
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 5125
SubjectTerms denoising autoencoder
Emotion recognition
Hidden Markov models
long short-term memory
Long-term emotion tracking
Mood
mood detection
Predictive models
Speech
Support vector machines
Title Mood detection from daily conversational speech using denoising autoencoder and LSTM
URI https://ieeexplore.ieee.org/document/7953133