Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition

Bibliographic Details
Published in	International Conference on Affective Computing and Intelligent Interaction and workshops, pp. 732-737
Main Authors	Latif, Siddique; Qadir, Junaid; Bilal, Muhammad
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2019
Subjects	Adaptation models; Data models; domain adaptation; Emotion recognition; Gallium nitride; generative adversarial networks (GANs); Multi-lingual; Robustness; Speech emotion recognition; Task analysis; Training; Urdu language

Abstract	Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by differences between the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which causes a significant performance gap between validation and test scores. It is imperative to build more robust models that can fit practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of GANs is motivated by their great success in learning the underlying data distribution. The proposed model is designed so that language-invariant representations can be learned without requiring target-language data labels. We evaluate the proposed model on emotional speech datasets in four different languages, including an Urdu-language dataset, in order to also cover languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that the proposed model can significantly improve on the baseline cross-lingual SER performance for all the considered datasets, including the non-mainstream Urdu-language data, without requiring any labels.
Author	Bilal, Muhammad; Latif, Siddique; Qadir, Junaid
Author_xml	– sequence: 1 givenname: Siddique surname: Latif fullname: Latif, Siddique organization: University of Southern Queensland, Australia – sequence: 2 givenname: Junaid surname: Qadir fullname: Qadir, Junaid organization: Information Technology University (ITU), Lahore, Pakistan – sequence: 3 givenname: Muhammad surname: Bilal fullname: Bilal, Muhammad organization: University of the West of England (UWE), Bristol, United Kingdom
ContentType	Conference Proceeding
DBID	6IE 6IL CBEJK RIE RIL
DOI	10.1109/ACII.2019.8925513
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISBN	9781728138886 1728138884
EISSN	2156-8111
EndPage	737
ExternalDocumentID	8925513
Genre	orig-research
GroupedDBID	6IE 6IF 6IK 6IL 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL
ID	FETCH-LOGICAL-i269t-d4351900dd7e5fac5843ffb4185c1e179229b76e245bc1d1aabd8e5b047718873
IEDL.DBID	RIE
IngestDate	Wed Aug 27 02:49:48 EDT 2025
IsPeerReviewed	false
IsScholarly	false
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-i269t-d4351900dd7e5fac5843ffb4185c1e179229b76e245bc1d1aabd8e5b047718873
PageCount	6
ParticipantIDs	ieee_primary_8925513
PublicationCentury	2000
PublicationDate	2019-09-01
PublicationDateYYYYMMDD	2019-09-01
PublicationDate_xml	– month: 09 year: 2019 text: 2019-09-01 day: 01
PublicationDecade	2010
PublicationTitle	International Conference on Affective Computing and Intelligent Interaction and workshops
PublicationTitleAbbrev	ACII
PublicationYear	2019
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0001950885
Score	1.9627033
SourceID	ieee
SourceType	Publisher
StartPage	732
SubjectTerms	Adaptation models; Data models; domain adaptation; Emotion recognition; Gallium nitride; generative adversarial networks (GANs); Multi-lingual; Robustness; Speech emotion recognition; Task analysis; Training; Urdu language
Title	Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
URI	https://ieeexplore.ieee.org/document/8925513
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
linkProvider	IEEE
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=International+Conference+on+Affective+Computing+and+Intelligent+Interaction+and+workshops&rft.atitle=Unsupervised+Adversarial+Domain+Adaptation+for+Cross-Lingual+Speech+Emotion+Recognition&rft.au=Latif%2C+Siddique&rft.au=Qadir%2C+Junaid&rft.au=Bilal%2C+Muhammad&rft.date=2019-09-01&rft.pub=IEEE&rft.eissn=2156-8111&rft.spage=732&rft.epage=737&rft_id=info:doi/10.1109%2FACII.2019.8925513&rft.externalDocID=8925513