Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
Published in | International Conference on Affective Computing and Intelligent Interaction and workshops pp. 732 - 737 |
---|---|
Main Authors | Latif, Siddique; Qadir, Junaid; Bilal, Muhammad |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2019 |
Subjects | |
Online Access | Get full text |
Abstract | Cross-lingual speech emotion recognition (SER) is a crucial task for many real-world applications. The performance of SER systems is often degraded by differences between the distributions of training and test data. These differences become more apparent when training and test data belong to different languages, which causes a significant gap between validation and test scores. It is imperative to build more robust models that suit practical applications of SER systems. Therefore, in this paper, we propose a Generative Adversarial Network (GAN)-based model for multilingual SER. Our choice of GANs is motivated by their great success in learning the underlying data distribution. The proposed model is designed so that language-invariant representations can be learned without requiring target-language data labels. We evaluate the proposed model on emotional datasets in four different languages, including an Urdu-language dataset, in order to cover languages for which labelled data is difficult to find and which have not been studied much by the mainstream community. Our results show that the proposed model can significantly improve the baseline cross-lingual SER performance for all the considered datasets, including the non-mainstream Urdu-language data, without requiring any target-language labels. |
---|---|
Author | Bilal, Muhammad; Latif, Siddique; Qadir, Junaid |
Author_xml | – sequence: 1 givenname: Siddique surname: Latif fullname: Latif, Siddique organization: University of Southern Queensland, Australia – sequence: 2 givenname: Junaid surname: Qadir fullname: Qadir, Junaid organization: Information Technology University (ITU), Lahore, Pakistan – sequence: 3 givenname: Muhammad surname: Bilal fullname: Bilal, Muhammad organization: University of the West of England (UWE), Bristol, United Kingdom |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/ACII.2019.8925513 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISBN | 9781728138886 1728138884 |
EISSN | 2156-8111 |
EndPage | 737 |
ExternalDocumentID | 8925513 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IK 6IL 6IN AAJGR AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL |
ID | FETCH-LOGICAL-i269t-d4351900dd7e5fac5843ffb4185c1e179229b76e245bc1d1aabd8e5b047718873 |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 02:49:48 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i269t-d4351900dd7e5fac5843ffb4185c1e179229b76e245bc1d1aabd8e5b047718873 |
PageCount | 6 |
ParticipantIDs | ieee_primary_8925513 |
PublicationCentury | 2000 |
PublicationDate | 2019-09-01 |
PublicationDateYYYYMMDD | 2019-09-01 |
PublicationDate_xml | – month: 09 year: 2019 text: 2019-09-01 day: 01 |
PublicationDecade | 2010 |
PublicationTitle | International Conference on Affective Computing and Intelligent Interaction and workshops |
PublicationTitleAbbrev | ACII |
PublicationYear | 2019 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001950885 |
Score | 1.9627033 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 732 |
SubjectTerms | Adaptation models; Data models; domain adaptation; Emotion recognition; Gallium nitride; generative adversarial networks (GANs); Multi-lingual; Robustness; Speech emotion recognition; Task analysis; Training; Urdu language |
Title | Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition |
URI | https://ieeexplore.ieee.org/document/8925513 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | IEEE |
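
The abstract describes learning language-invariant representations adversarially, with emotion labels used only for the source language. The record does not give the paper's architecture or loss functions, so the following is only a minimal sketch of a generic adversarial domain adaptation objective, assuming a standard non-saturating GAN loss. All function names are hypothetical, and the discriminator outputs are taken as plain probabilities that a feature vector comes from the source language.

```python
import math

EPS = 1e-12  # guards against log(0)


def mean(values):
    return sum(values) / len(values)


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    return -(label * math.log(prob + EPS) + (1 - label) * math.log(1 - prob + EPS))


def discriminator_loss(d_src, d_tgt):
    """Language discriminator: source features are labelled 1, target features 0."""
    return mean([bce(p, 1.0) for p in d_src]) + mean([bce(p, 0.0) for p in d_tgt])


def encoder_adversarial_loss(d_tgt):
    """Encoder term (assumed non-saturating form): push target-language
    features to be scored as source. No emotion labels are used here."""
    return mean([bce(p, 1.0) for p in d_tgt])


def emotion_loss(class_probs, labels):
    """Cross-entropy over emotion classes, computed on labelled source data only."""
    return mean([-math.log(probs[y] + EPS) for probs, y in zip(class_probs, labels)])


if __name__ == "__main__":
    # Hypothetical discriminator outputs for a small batch.
    d_src, d_tgt = [0.9, 0.8], [0.2, 0.1]
    print("discriminator loss:", discriminator_loss(d_src, d_tgt))
    print("encoder adversarial loss:", encoder_adversarial_loss(d_tgt))
    print("source emotion loss:", emotion_loss([[0.7, 0.3], [0.2, 0.8]], [0, 1]))
```

In this sketch the target-language data enters only through the discriminator and encoder terms, which need a language tag but no emotion label; that is the sense in which such adaptation is unsupervised on the target side.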