Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition

Bibliographic Details
Published in	International Conference on Affective Computing and Intelligent Interaction and workshops pp. 1 - 7
Main Authors	Feng, Tiantian; Narayanan, Shrikanth
Format	Conference Proceeding
Language	English
Published	IEEE 28.09.2021
Subjects	autoencoder; Data privacy; DNN; Education; Emotion recognition; machine learning; Privacy; Psychology; Servers; Speech emotion recognition; Speech recognition; trust
ISSN	2156-8111
DOI	10.1109/ACII52823.2021.9597433

Abstract	Speech carries rich information not only about an individual's intent but also about demographic traits and physical and psychological state, among other things. Notably, continuously worn wearable sensors enable researchers to collect egocentric speech data to study and assess emotions expressed in real life, offering unprecedented opportunities for applications in assistive agents, medical diagnosis, and personalized education. Many existing systems collect these speech data, either processed or unprocessed, and transmit them from users' devices to a central server for later analysis. However, egocentric audio sensing for speech emotion recognition raises privacy concerns and risks: unintended or improper inferences of sensitive and demographic information may occur without user consent. To address these concerns, we propose a privacy-preserving data transformation technique that mitigates potential threats associated with inferences of sensitive and demographic information. The proposed mechanism combines an autoencoder architecture, called a replacement autoencoder, with a gradient reversal layer to remove sensitive information from the data, such as sensitive labels and demographics. We empirically validate our approach for predicting emotions on three commonly used speech emotion recognition datasets. We show that our method can effectively prevent inferences of sensitive emotions and demographic information, and that the improved privacy comes at the cost of a minor utility loss for the target application.
Author	Feng, Tiantian; Narayanan, Shrikanth
Author_xml	1. Feng, Tiantian (tiantiaf@usc.edu), University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA; 2. Narayanan, Shrikanth (shri@ee.usc.edu), University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA
Discipline	Education Psychology Computer Science
EISBN	1665400196 9781665400190
EISSN	2156-8111
EndPage	7
ExternalDocumentID	9597433
Genre	orig-research
IsPeerReviewed	false
IsScholarly	false
Language	English
PageCount	7
ParticipantIDs	ieee_primary_9597433
PublicationCentury	2000
PublicationDate	2021-Sept.-28
PublicationDateYYYYMMDD	2021-09-28
PublicationDate_xml	– month: 09 year: 2021 text: 2021-Sept.-28 day: 28
PublicationDecade	2020
PublicationTitle	International Conference on Affective Computing and Intelligent Interaction and workshops
PublicationTitleAbbrev	ACII
PublicationYear	2021
Publisher	IEEE
Publisher_xml	– name: IEEE
SourceID	ieee
SourceType	Publisher
StartPage	1
URI	https://ieeexplore.ieee.org/document/9597433