Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition
Published in | International Conference on Affective Computing and Intelligent Interaction and workshops pp. 1 - 7 |
---|---|
Main Authors | Feng, Tiantian; Narayanan, Shrikanth |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 28.09.2021 |
ISSN | 2156-8111 |
DOI | 10.1109/ACII52823.2021.9597433 |
Abstract | Speech carries rich information not only about an individual's intent but also about demographic traits and physical and psychological states, among other things. Notably, continuously worn wearable sensors enable researchers to collect egocentric speech data to study and assess emotions expressed in real life, offering unprecedented opportunities for applications in assistive agents, medical diagnosis, and personalized education. Many existing systems collect these speech data, processed or unprocessed, and transmit them from users' devices to a central server for later analysis. However, egocentric audio sensing for speech emotion recognition raises privacy concerns and risks: sensitive information and demographic information may be inferred, unintentionally or improperly, without user consent. To address these concerns, we propose a privacy-preserving data transformation technique that mitigates the threats posed by inference of sensitive and demographic information. The proposed mechanism combines an autoencoder architecture, called a replacement autoencoder, with a gradient reversal layer to remove sensitive information, such as sensitive labels and demographics, from the data. We empirically validate our approach for emotion prediction on three datasets commonly used for speech emotion recognition. We show that our method can effectively prevent inference of sensitive emotions and demographic information, and that the improved privacy comes at the cost of a minor utility loss for the target application. |
---|---|
Author | Feng, Tiantian; Narayanan, Shrikanth |
Author_xml | 1. Feng, Tiantian (tiantiaf@usc.edu), University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA; 2. Narayanan, Shrikanth (shri@ee.usc.edu), University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA |
ContentType | Conference Proceeding |
Discipline | Education; Psychology; Computer Science |
EISBN | 1665400196; 9781665400190 |
EISSN | 2156-8111 |
EndPage | 7 |
ExternalDocumentID | 9597433 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PageCount | 7 |
PublicationCentury | 2000 |
PublicationDate | 2021-Sept.-28 |
PublicationDateYYYYMMDD | 2021-09-28 |
PublicationDate_xml | – month: 09 year: 2021 text: 2021-Sept.-28 day: 28 |
PublicationDecade | 2020 |
PublicationTitle | International Conference on Affective Computing and Intelligent Interaction and workshops |
PublicationTitleAbbrev | ACII |
PublicationYear | 2021 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 1 |
SubjectTerms | autoencoder; Data privacy; DNN; Education; Emotion recognition; machine learning; Privacy; Psychology; Servers; Speech emotion recognition; Speech recognition; trust |
Title | Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition |
URI | https://ieeexplore.ieee.org/document/9597433 |
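The abstract mentions a gradient reversal layer but this record gives no implementation details. As a rough illustration of the general idea only, the sketch below shows a gradient reversal step on a toy scalar model with hand-written gradients. The one-parameter encoder `w`, the adversary `v`, the squared-error loss, and the names `grl_forward`, `grl_backward`, and `adversarial_step` are all assumptions made for this example; they are not taken from the paper and do not reproduce its replacement autoencoder.

```python
def grl_forward(z):
    """Gradient reversal layer, forward pass: acts as the identity."""
    return z


def grl_backward(grad_output, lam=1.0):
    """Gradient reversal layer, backward pass: flip the sign and scale by lam."""
    return -lam * grad_output


def adversarial_step(w, v, x, s, lam=1.0, lr=0.1):
    """One manual training step on a toy scalar model (hypothetical setup).

    w: encoder weight, producing z = w * x
    v: adversary weight, guessing the sensitive value as s_hat = v * z
    s: the sensitive value the adversary tries to recover
    Returns the updated (w, v) and the adversary loss before the update.
    """
    z = w * x
    s_hat = v * grl_forward(z)
    err = s_hat - s
    loss = err ** 2

    # Backward pass, written out by hand.
    grad_s_hat = 2.0 * err
    grad_v = grad_s_hat * z          # adversary follows the true gradient
    grad_z = grad_s_hat * v          # gradient arriving at the reversal layer
    grad_w = grl_backward(grad_z, lam) * x  # encoder sees the reversed gradient

    # Both parameters take a plain gradient-descent step; because grad_w is
    # reversed, the encoder moves in the direction that raises the adversary loss.
    return w - lr * grad_w, v - lr * grad_v, loss


if __name__ == "__main__":
    w, v = 1.0, 1.0
    for step in range(3):
        w, v, loss = adversarial_step(w, v, x=1.0, s=0.0)
        print(step, round(w, 4), round(v, 4), round(loss, 4))
```

In this toy setting the adversary weight is pulled toward a better guess of `s`, while the encoder weight is pushed the opposite way. In a full system the encoder would also be trained on a utility objective (here, emotion recognition), which this sketch omits.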