Anonymizing Personal Text Messages Posted in Online Social Networks and Detecting Disclosures of Personal Information

Bibliographic Details
Published in IEICE Transactions on Information and Systems Vol. E98.D; no. 1; pp. 78 - 88
Main Authors NGUYEN-SON, Hoang-Quoc; TRAN, Minh-Triet; YOSHIURA, Hiroshi; SONEHARA, Noboru; ECHIZEN, Isao
Format Journal Article
Language English
Published The Institute of Electronics, Information and Communication Engineers 2015
Abstract While online social networking is a popular way for people to share information, it carries the risk of unintentionally disclosing personal information. One way to reduce this risk is to anonymize personal information in messages before they are posted. Furthermore, if personal information is somehow disclosed, the person who disclosed it should be identifiable. Several methods developed for anonymizing personal information in natural language text simply remove sensitive phrases, making the anonymized text message unnatural. Other methods change the message by using synonymization or structural alteration to create fingerprints for detecting disclosure, but they do not support the creation of a sufficient number of fingerprints for friends of an online social network user. We have developed a system for anonymizing personal information in text messages that generalizes sensitive phrases. It also creates a sufficient number of fingerprints of a message by using synonyms so that, if personal information is revealed online, the person who revealed it can be identified. A distribution metric is used to ensure that the degree of anonymization is appropriate for each group of friends. A threshold is used to improve the naturalness of the fingerprinted messages so that they do not catch the attention of attackers. Evaluation using about 55,000 personal tweets in English demonstrated that our system creates sufficiently natural fingerprinted messages for friends and groups of friends. The practicality of the system was demonstrated by creating a web application for controlling messages posted on Facebook.
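The synonym-based fingerprinting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synonym table, the function names `make_fingerprints` and `identify_discloser`, and the exact-match detection rule are all assumptions made for illustration, and the sketch omits the generalization step, the distribution metric, and the naturalness threshold that the abstract mentions.

```python
from itertools import product

# Hypothetical synonym table (an assumption for this sketch; the record
# does not say which lexical resource the system actually uses).
SYNONYMS = {
    "big": ["big", "large"],
    "party": ["party", "celebration"],
    "tonight": ["tonight", "this evening"],
}


def make_fingerprints(message, friends):
    """Give each friend a distinct variant of the message.

    Every word with known synonyms is a substitution point; each
    combination of choices yields one candidate fingerprint.
    """
    words = message.split()
    options = [SYNONYMS.get(word, [word]) for word in words]
    variants = (" ".join(choice) for choice in product(*options))

    # Pair friends with variants in order; zip stops at the shorter input.
    assignment = dict(zip(friends, variants))
    if len(assignment) < len(friends):
        raise ValueError("not enough synonym combinations for all friends")
    return assignment


def identify_discloser(leaked_text, assignment):
    """Return the friend whose variant matches the leaked text exactly, if any."""
    for friend, variant in assignment.items():
        if variant == leaked_text:
            return friend
    return None
```

With three substitution points of two options each, this toy table supports at most eight distinct fingerprints, which shows why the number of available synonyms limits how many friends can be covered. A practical detector would also need to tolerate edits to the leaked text, which exact matching does not.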
Author NGUYEN-SON, Hoang-Quoc
TRAN, Minh-Triet
SONEHARA, Noboru
YOSHIURA, Hiroshi
ECHIZEN, Isao
Author_xml – sequence: 1
  fullname: NGUYEN-SON, Hoang-Quoc
  organization: Graduate University for Advanced Studies
– sequence: 2
  fullname: TRAN, Minh-Triet
  organization: University of Science
– sequence: 3
  fullname: YOSHIURA, Hiroshi
  organization: The University of Electro-Communications
– sequence: 4
  fullname: SONEHARA, Noboru
  organization: National Institute of Informatics
– sequence: 5
  fullname: ECHIZEN, Isao
  organization: Graduate University for Advanced Studies
Cites_doi 10.7551/mitpress/1130.003.0016
10.1007/978-3-642-03688-0_16
10.1145/1242572.1242667
10.1145/1121995.1121997
10.29012/jpc.v4i2.620
10.1145/219717.219748
10.1007/978-3-642-40099-5_34
10.1007/3-540-36415-3_13
10.1109/AINA.2010.118
10.1109/SITIS.2013.108
10.1145/1178766.1178777
10.1007/978-3-540-73599-1_31
10.1145/275487.275508
10.1007/978-3-642-17773-6_23
ContentType Journal Article
Copyright 2015 The Institute of Electronics, Information and Communication Engineers
Copyright_xml – notice: 2015 The Institute of Electronics, Information and Communication Engineers
DOI 10.1587/transinf.2014MUP0016
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
Computer Science
EISSN 1745-1361
EndPage 88
ExternalDocumentID 10_1587_transinf_2014MUP0016
article_transinf_E98_D_1_E98_D_2014MUP0016_article_char_en
ISSN 0916-8532
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
LinkModel OpenURL
OpenAccessLink https://www.jstage.jst.go.jp/article/transinf/E98.D/1/E98.D_2014MUP0016/_article/-char/en
PQID 1660059173
PQPubID 23500
PageCount 11
ParticipantIDs proquest_miscellaneous_1660059173
crossref_primary_10_1587_transinf_2014MUP0016
jstage_primary_article_transinf_E98_D_1_E98_D_2014MUP0016_article_char_en
PublicationCentury 2000
PublicationDate 2015
2015-00-00
20150101
PublicationDateYYYYMMDD 2015-01-01
PublicationDate_xml – year: 2015
  text: 2015
PublicationDecade 2010
PublicationTitle IEICE Transactions on Information and Systems
PublicationTitleAlternate IEICE Trans. Inf. & Syst.
PublicationYear 2015
Publisher The Institute of Electronics, Information and Communication Engineers
Publisher_xml – name: The Institute of Electronics, Information and Communication Engineers
References [1] F. Stutzman, R. Gross, and A. Acquisti, “Silent listeners: The evolution of privacy and disclosure on facebook,” J. Priv. and Conf., vol.4, no.2, pp.7-41, 2012.
[8] I. Ounis, C. Macdonald, J. Lin, and I. Soboroff, “Overview of the trec-2011 microblog track,” Proc. 20th Tex. Retriev. Conf., 2011.
[14] H. Kataoka, A. Utsumi, Y. Hirose, and H. Yoshiura, “Disclosure control of natural language information to enable secure and enjoyable communication over the internet,” Proc. 15th Inter. Worksh. Sec. Prot., pp.178-188, 2010.
[2] D. Kokkinakis and A. Thurin, “Anonymisation of swedish clinical data,” Proc. 11th AI in Medic., pp.237-241, 2007.
[3] B. Medlock, “An introduction to nlp-based textual anonymisation,” Proc. 5th Conf. Lang. Res. and Eval., 2006.
[6] C.Y. Chang and S. Clark, “Linguistic steganography using automatically generated paraphrases,” Proc. 11th NAACL HLT, pp.591-599, 2010.
[13] M.J. Atallah, V. Raskin, C.F. Hempelmann, M. Karahan, R. Sion, U. Topkara, and K.E. Triezenberg, “Natural language watermarking and tamperproofing,” Proc. 5th Infor. Hid., pp.196-212, 2002.
[9] P. Samarati and L. Sweeney, “Generalizing data to provide anonymity when disclosing information,” Proc. 17th ACM SIGACT-SIGMOD-SIGART, p.188, 1998.
[19] N. Shuyo, “Language detection library for java,” 2010.
[5] Y. Zhang, G. Blackwood, and S. Clark, “Syntax-based word ordering incorporating a large-scale language model,” Proc. 13th Conf. Euro. Assoc. for Comput. Ling., pp.736-746, 2012.
[12] M. Topkara, U. Topkara, and M.J. Atallah, “Words are not enough: Sentence level natural language watermarking,” Proc. 4th ACM Worksh. Cont. Prot. and Sec., pp.37-46, 2006.
[18] S. Machida, S. Shimada, and I. Echizen, “Settings of access control by detecting privacy leaks in sns,” Proc. 9th Signal Image Technology and Internet Based Systems, pp.660-666, 2013.
[17] C.Y. Chang and S. Clark, “Practical linguistic steganography using contextual synonym substitution and vertex colour coding,” Proc. Empirical Methods in NLP, pp.1194-1203, 2010.
[21] X. Liu, S. Zhang, F. Wei, and M. Zhou, “Recognizing named entities in tweets,” Proc. 49th ACL HLT, pp.359-367, 2011.
[10] F. Suchanek, G. Kasneci, and G. Weikum, “YAGO: A core of semantic knowledge,” Proc. 16th Inter. Conf. WWW, pp.697-706, 2007.
[15] J.C. Platt, “Fast training of support vector machines using sequential minimal optimization,” Adv. in Ker. Meth., pp.185-208, 1999.
[4] H.Q. Nguyen-Son, M.T. Tran, D. Tien, H. Yoshiura, N. Sonehara, and I. Echizen, “Automatic anonymous fingerprinting of text posted on social networking services,” Proc. 11th Inter. Worksh. Digit. Forens. and Waterm., pp.410-424, 2013.
[11] T.H. Ngoc, I. Echizen, K. Komei, and H. Yoshiura, “New approach to quantification of privacy on social network sites,” Proc. 24th Adv. Infor. Netw. and Appl., pp.556-564, 2010.
[16] G.A. Miller, “Wordnet: A lexical database for english,” Commun. ACM, vol.38, no.11, pp.39-41, 1995.
[22] J.W. Byun and E. Bertino, “Micro-views, or on how to protect privacy while enhancing data usability: Concepts and challenges,” ACM SIGMOD Record, vol.35, no.1, pp.9-13, 2006.
[7] X. Zheng, L. Huang, Z. Chen, Z. Yu, and W. Yang, “Hiding information by context-based synonym substitution,” Proc. 8th Inter. Worksh. Digit. Forens. and Waterm., pp.162-169, 2009.
[20] B. Han, P. Cook, and T. Baldwin, “Automatically constructing a normalisation dictionary for microblogs,” Proc. Joint Conf. Emp. Meth. in Nat. Lang. Proc. and Comput. Nat. Lang. Learn., pp.421-432, 2012.
References_xml – ident: 17
– ident: 3
– ident: 15
  doi: 10.7551/mitpress/1130.003.0016
– ident: 7
  doi: 10.1007/978-3-642-03688-0_16
– ident: 10
  doi: 10.1145/1242572.1242667
– ident: 22
  doi: 10.1145/1121995.1121997
– ident: 5
– ident: 1
  doi: 10.29012/jpc.v4i2.620
– ident: 16
  doi: 10.1145/219717.219748
– ident: 4
  doi: 10.1007/978-3-642-40099-5_34
– ident: 19
– ident: 13
  doi: 10.1007/3-540-36415-3_13
– ident: 11
  doi: 10.1109/AINA.2010.118
– ident: 18
  doi: 10.1109/SITIS.2013.108
– ident: 12
  doi: 10.1145/1178766.1178777
– ident: 6
– ident: 2
  doi: 10.1007/978-3-540-73599-1_31
– ident: 8
– ident: 9
  doi: 10.1145/275487.275508
– ident: 21
– ident: 14
  doi: 10.1007/978-3-642-17773-6_23
– ident: 20
SourceID proquest
crossref
jstage
SourceType Aggregation Database
Publisher
StartPage 78
SubjectTerms anonymized text message
Applications programs
disclosure detection
fingerprint
Fingerprints
Messages
Online
online social network
Risk
Short message service
Social networks
Texts
Title Anonymizing Personal Text Messages Posted in Online Social Networks and Detecting Disclosures of Personal Information
URI https://www.jstage.jst.go.jp/article/transinf/E98.D/1/E98.D_2014MUP0016/_article/-char/en
https://search.proquest.com/docview/1660059173
Volume E98.D