PrivacyBot: Detecting Privacy Sensitive Information in Unstructured Texts

Bibliographic Details
Published in: 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS), pp. 53-60
Main Authors: Tesfay, Welderufael B.; Serna, Jetzabel; Rannenberg, Kai
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2019
DOI: 10.1109/SNAMS.2019.8931855

Abstract: With the swift proliferation of Internet services and always-connected smart devices, users continue to share, intentionally or not, copious amounts of data on a daily basis. While the availability of so much data is useful for extracting insights in areas such as behavioral or medical research, it also brings unprecedented privacy consequences for users, e.g., identity theft and reputation damage. This is aggravated when users share Privacy Sensitive Information (PSI) online, oftentimes with unintended audiences. In this regard, detecting PSI disclosure becomes an essential step towards tackling the long- and short-term privacy consequences of divulging such information. In this paper, we present PrivacyBot, a machine-learning-based proof of concept that detects PSI in user-generated unstructured texts. A rigorous set of experiments shows that our approach can detect PSI with an accuracy of up to 95%. Furthermore, PrivacyBot classifies PSI into fine-grained categories (with an accuracy of up to 88%), defined based on existing work and Art. 9 of the European Union (EU) General Data Protection Regulation (GDPR). The results are promising and shed light on the possibility of integrating such tools to support users in making informed privacy-related decisions when disclosing PSI online.
Author Affiliations:
1. Tesfay, Welderufael B. (Goethe University Frankfurt, Frankfurt am Main, Germany)
2. Serna, Jetzabel (Goethe University Frankfurt, Frankfurt am Main, Germany)
3. Rannenberg, Kai (Goethe University Frankfurt, Frankfurt am Main, Germany)
EISBN: 172812946X, 9781728129464
Subject Terms: EU GDPR; General Data Protection Regulation; Information privacy; Machine learning; Privacy; Task analysis; Twitter
URI: https://ieeexplore.ieee.org/document/8931855