Anonymizing Personal Text Messages Posted in Online Social Networks and Detecting Disclosures of Personal Information
Published in | IEICE Transactions on Information and Systems Vol. E98.D; no. 1; pp. 78 - 88 |
---|---|
Main Authors | NGUYEN-SON, Hoang-Quoc; TRAN, Minh-Triet; YOSHIURA, Hiroshi; SONEHARA, Noboru; ECHIZEN, Isao |
Format | Journal Article |
Language | English |
Published | The Institute of Electronics, Information and Communication Engineers, 2015 |
Abstract | While online social networking is a popular way for people to share information, it carries the risk of unintentionally disclosing personal information. One way to reduce this risk is to anonymize personal information in messages before they are posted. Furthermore, if personal information is somehow disclosed, the person who disclosed it should be identifiable. Several methods developed for anonymizing personal information in natural language text simply remove sensitive phrases, making the anonymized text message unnatural. Other methods change the message by using synonymization or structural alteration to create fingerprints for detecting disclosure, but they do not support the creation of a sufficient number of fingerprints for friends of an online social network user. We have developed a system for anonymizing personal information in text messages that generalizes sensitive phrases. It also creates a sufficient number of fingerprints of a message by using synonyms so that, if personal information is revealed online, the person who revealed it can be identified. A distribution metric is used to ensure that the degree of anonymization is appropriate for each group of friends. A threshold is used to improve the naturalness of the fingerprinted messages so that they do not catch the attention of attackers. Evaluation using about 55,000 personal tweets in English demonstrated that our system creates sufficiently natural fingerprinted messages for friends and groups of friends. The practicality of the system was demonstrated by creating a web application for controlling messages posted on Facebook. |
---|---|
Author | NGUYEN-SON, Hoang-Quoc; TRAN, Minh-Triet; SONEHARA, Noboru; YOSHIURA, Hiroshi; ECHIZEN, Isao |
Author affiliations | 1. NGUYEN-SON, Hoang-Quoc (Graduate University for Advanced Studies); 2. TRAN, Minh-Triet (University of Science); 3. YOSHIURA, Hiroshi (The University of Electro-Communications); 4. SONEHARA, Noboru (National Institute of Informatics); 5. ECHIZEN, Isao (Graduate University for Advanced Studies) |
ContentType | Journal Article |
Copyright | 2015 The Institute of Electronics, Information and Communication Engineers |
DOI | 10.1587/transinf.2014MUP0016 |
Discipline | Engineering Computer Science |
EISSN | 1745-1361 |
EndPage | 88 |
ISSN | 0916-8532 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
OpenAccessLink | https://www.jstage.jst.go.jp/article/transinf/E98.D/1/E98.D_2014MUP0016/_article/-char/en |
PublicationDateYYYYMMDD | 2015-01-01 |
PublicationTitle | IEICE Transactions on Information and Systems |
PublicationTitleAlternate | IEICE Trans. Inf. & Syst. |
PublicationYear | 2015 |
Publisher | The Institute of Electronics, Information and Communication Engineers |
StartPage | 78 |
SubjectTerms | anonymized text message; Applications programs; disclosure detection; fingerprint; Fingerprints; Messages; Online; online social network; Risk; Short message service; Social networks; Texts |
Title | Anonymizing Personal Text Messages Posted in Online Social Networks and Detecting Disclosures of Personal Information |
URI | https://www.jstage.jst.go.jp/article/transinf/E98.D/1/E98.D_2014MUP0016/_article/-char/en https://search.proquest.com/docview/1660059173 |
Volume | E98.D |
ispartofPNX | IEICE Transactions on Information and Systems, 2015, Vol.E98.D(1), pp.78-88 |
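
The abstract describes two steps: generalizing sensitive phrases, and creating a distinct synonym-based fingerprint of the message for each friend so that a leaked copy can be traced. The following is a minimal sketch of that idea only, not the authors' implementation. The synonym table, the generalization map, and the function names are all hypothetical, and the paper's distribution metric and naturalness threshold are not modeled here.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table; the lexical resources actually used by the
# system are not given in this record.
SYNONYMS: Dict[str, List[str]] = {
    "big": ["big", "large", "huge"],
    "happy": ["happy", "glad"],
    "trip": ["trip", "journey"],
}

# Hypothetical generalization map: sensitive phrase -> more general phrase.
GENERALIZE: Dict[str, str] = {"Tokyo": "a city in Japan"}


def generalize(tokens: List[str]) -> List[str]:
    """Replace each sensitive token with its more general phrase."""
    return [GENERALIZE.get(t, t) for t in tokens]


def capacity(tokens: List[str]) -> int:
    """Number of distinct fingerprinted copies the message can carry."""
    total = 1
    for t in tokens:
        total *= len(SYNONYMS.get(t, [t]))
    return total


def fingerprint(tokens: List[str], friend_id: int) -> List[str]:
    """Encode friend_id as mixed-radix digits that select synonyms."""
    if not 0 <= friend_id < capacity(tokens):
        raise ValueError("friend_id exceeds the fingerprint capacity")
    out = []
    for t in tokens:
        options = SYNONYMS.get(t)
        if options is None:
            out.append(t)  # no synonyms: token carries no information
        else:
            friend_id, digit = divmod(friend_id, len(options))
            out.append(options[digit])
    return out


def identify(base: List[str], leaked: List[str]) -> Optional[int]:
    """Recover the friend_id from a leaked copy, or None if it does not match."""
    if len(base) != len(leaked):
        return None
    friend_id, weight = 0, 1
    for orig, seen in zip(base, leaked):
        options = SYNONYMS.get(orig)
        if options is None:
            if seen != orig:
                return None
            continue
        if seen not in options:
            return None
        friend_id += options.index(seen) * weight
        weight *= len(options)
    return friend_id


base = generalize("happy about my big trip to Tokyo".split())
copies = [fingerprint(base, i) for i in range(capacity(base))]
```

In this sketch, `base` is the generalized message and `identify(base, copies[i])` returns `i` for every copy. The number of friends that can be distinguished is the product of the synonym counts, which is why a message with few substitutable words supports few fingerprints.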