Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition

Bibliographic Details
Published in International Conference on Affective Computing and Intelligent Interaction and Workshops, pp. 1-7
Main Authors Feng, Tiantian; Narayanan, Shrikanth
Format Conference Proceeding
Language English
Published IEEE 28.09.2021
ISSN 2156-8111
DOI 10.1109/ACII52823.2021.9597433

Abstract Speech carries rich information not only about an individual's intent but also about demographic traits and physical and psychological state, among other things. Notably, continuously worn wearable sensors enable researchers to collect egocentric speech data to study and assess emotions expressed in real life, offering unprecedented opportunities for applications in assistive agents, medical diagnosis, and personalized education. Many existing systems collect these speech data, either processed or unprocessed, and transmit them from users' devices to a central server for later analysis. However, egocentric audio sensing for speech emotion recognition raises privacy concerns and risks, because unintended or improper inferences of sensitive and demographic information may occur without user consent. To address these concerns, we propose a privacy-preserving data transformation technique that mitigates potential threats associated with inferences of sensitive information and demographics. The proposed mechanism combines an autoencoder architecture, called a replacement autoencoder, with a gradient reversal layer to remove sensitive information from the data, such as sensitive labels and demographics. We empirically validate our approach on emotion prediction using three commonly used speech emotion recognition datasets. We show that our method can effectively prevent inferences of sensitive emotions and demographic information. We further show that the improved privacy comes at the cost of a minor utility loss for the target application.
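The gradient reversal layer named in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a toy, standard-library-only example under stated assumptions: a one-weight "encoder" `w`, a one-weight adversary `a` that tries to predict a sensitive value `s`, a squared-error loss, and a reversal strength `lam`. All of these names are hypothetical. The layer acts as the identity in the forward pass and multiplies the gradient by `-lam` in the backward pass, so the adversary is trained normally while the encoder is pushed to make the sensitive value harder to predict.

```python
from dataclasses import dataclass


@dataclass
class GradientReversal:
    """Identity going forward; scales the gradient by -lam going backward."""
    lam: float = 1.0  # reversal strength (hypothetical hyperparameter name)

    def forward(self, z: float) -> float:
        return z

    def backward(self, grad_z: float) -> float:
        return -self.lam * grad_z


def adversary_step(w: float, a: float, x: float, s: float, lam: float = 1.0):
    """Return (loss, gradient for adversary weight a, gradient for encoder weight w)."""
    grl = GradientReversal(lam)
    z = w * x                          # toy encoder output (assumption)
    s_hat = a * grl.forward(z)         # toy adversary prediction of sensitive s
    loss = (s_hat - s) ** 2            # adversary's squared error
    d_s_hat = 2.0 * (s_hat - s)        # d loss / d s_hat
    grad_a = d_s_hat * z               # adversary receives the ordinary gradient
    grad_z = grl.backward(d_s_hat * a) # encoder receives the sign-flipped gradient
    grad_w = grad_z * x
    return loss, grad_a, grad_w


loss, grad_a, grad_w = adversary_step(w=0.5, a=1.0, x=2.0, s=3.0)

# One gradient-descent step on the encoder weight using the reversed gradient.
w_new = 0.5 - 0.1 * grad_w
loss_after, _, _ = adversary_step(w=w_new, a=1.0, x=2.0, s=3.0)
```

In this toy setting, a descent step on the encoder weight raises the adversary's loss, which is the intended effect of the reversal. In a full system this term would be combined with reconstruction and emotion-recognition objectives, which the sketch omits.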
Author Feng, Tiantian
Narayanan, Shrikanth
Author_xml – sequence: 1
  givenname: Tiantian
  surname: Feng
  fullname: Feng, Tiantian
  email: tiantiaf@usc.edu
  organization: University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA
– sequence: 2
  givenname: Shrikanth
  surname: Narayanan
  fullname: Narayanan, Shrikanth
  email: shri@ee.usc.edu
  organization: University of Southern California, Signal Analysis and Interpretation Lab, Los Angeles, USA
ContentType Conference Proceeding
Discipline Education
Psychology
Computer Science
EISBN 1665400196
9781665400190
EISSN 2156-8111
EndPage 7
ExternalDocumentID 9597433
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 7
PublicationCentury 2000
PublicationDate 2021-Sept.-28
PublicationDateYYYYMMDD 2021-09-28
PublicationDate_xml – month: 09
  year: 2021
  text: 2021-Sept.-28
  day: 28
PublicationDecade 2020
PublicationTitle International Conference on Affective Computing and Intelligent Interaction and workshops
PublicationTitleAbbrev ACII
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1
SubjectTerms autoencoder
Data privacy
DNN
Education
Emotion recognition
machine learning
Privacy
Psychology
Servers
Speech emotion recognition
Speech recognition
trust
Title Privacy and Utility Preserving Data Transformation for Speech Emotion Recognition
URI https://ieeexplore.ieee.org/document/9597433