Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

Bibliographic Details
Published in International Conference on Affective Computing and Intelligent Interaction and workshops, pp. 732-737
Main Authors Latif, Siddique; Qadir, Junaid; Bilal, Muhammad
Format Conference Proceeding
Language English
Published IEEE 01.09.2019

Abstract Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by differences between the distributions of training and test data. These differences become more pronounced when the training and test data belong to different languages, which causes a significant gap between validation and test scores. It is therefore imperative to build more robust models that can be deployed in practical SER applications. In this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of GANs is motivated by their great success in learning the underlying data distribution. The proposed model is designed so that language-invariant representations can be learned without requiring labels for the target-language data. We evaluate the proposed model on emotional datasets in four different languages, including an Urdu-language dataset, in order to also cover languages for which labelled data is difficult to find and which have received little attention from the mainstream community. Our results show that the proposed model significantly improves baseline cross-lingual SER performance on all the considered datasets, including the non-mainstream Urdu data, without requiring any target labels.
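The abstract describes learning language-invariant speech-emotion representations adversarially, using labelled source-language data and unlabelled target-language data. The sketch below illustrates that general idea in PyTorch with a gradient-reversal language discriminator, a common stand-in for the adversarial objective; it is not the paper's GAN architecture. All layer sizes, the feature dimension, the number of emotion classes, and the `train_step` helper are illustrative assumptions.

```python
# Minimal sketch of unsupervised adversarial domain adaptation for cross-lingual SER,
# assuming pre-extracted fixed-length acoustic feature vectors per utterance.
# Sizes and modules are assumptions, not the architecture from the paper.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

feat_dim, hidden, n_emotions = 88, 128, 4  # assumed sizes

encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                        nn.Linear(hidden, hidden), nn.ReLU())
emotion_clf = nn.Linear(hidden, n_emotions)       # trained on source-language labels only
lang_disc = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, 1))   # source vs. target language

params = list(encoder.parameters()) + list(emotion_clf.parameters()) + list(lang_disc.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce, bce = nn.CrossEntropyLoss(), nn.BCEWithLogitsLoss()

def train_step(src_x, src_y, tgt_x, lambd=1.0):
    """One update: supervised emotion loss on labelled source data plus an
    adversarial language loss on source and unlabelled target data."""
    opt.zero_grad()
    z_src, z_tgt = encoder(src_x), encoder(tgt_x)

    # Emotion labels are only available (and only used) for the source language.
    emo_loss = ce(emotion_clf(z_src), src_y)

    # The language discriminator sees gradient-reversed features, so minimising
    # its loss pushes the encoder toward language-invariant representations.
    z_all = grad_reverse(torch.cat([z_src, z_tgt]), lambd)
    lang_y = torch.cat([torch.zeros(len(src_x)), torch.ones(len(tgt_x))]).unsqueeze(1)
    dom_loss = bce(lang_disc(z_all), lang_y)

    loss = emo_loss + dom_loss
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random tensors standing in for real speech features.
src_x, src_y = torch.randn(32, feat_dim), torch.randint(0, n_emotions, (32,))
tgt_x = torch.randn(32, feat_dim)
print(train_step(src_x, src_y, tgt_x))
```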
Author Latif, Siddique (University of Southern Queensland, Australia); Qadir, Junaid (Information Technology University (ITU), Lahore, Pakistan); Bilal, Muhammad (University of the West of England (UWE), Bristol, United Kingdom)
ContentType Conference Proceeding
DOI 10.1109/ACII.2019.8925513
Discipline Computer Science
EISBN 9781728138886
1728138884
EISSN 2156-8111
EndPage 737
ExternalDocumentID 8925513
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 6
PublicationDate 2019-09-01
PublicationTitle International Conference on Affective Computing and Intelligent Interaction and workshops
PublicationTitleAbbrev ACII
PublicationYear 2019
Publisher IEEE
StartPage 732
SubjectTerms Adaptation models
Data models
domain adaptation
Emotion recognition
Gallium nitride
generative adversarial networks (GANs)
Multi-lingual
Robustness
Speech emotion recognition
Task analysis
Training
Urdu language
Title Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
URI https://ieeexplore.ieee.org/document/8925513