Preserving Privacy with Probabilistic Indistinguishability in Weighted Social Networks

Bibliographic Details
Published in IEEE Transactions on Parallel and Distributed Systems, Vol. 28, no. 5, pp. 1417-1429
Main Authors Liu, Qin; Wang, Guojun; Li, Feng; Yang, Shuhui; Wu, Jie
Format Journal Article
Language English
Published New York IEEE 01.05.2017
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: The increasing popularity of social networks has inspired recent research to explore social graphs for marketing and data mining. As social networks often contain sensitive information about individuals, preserving privacy when publishing social graphs becomes an important issue. In this paper, we consider the identity disclosure problem in releasing weighted social graphs. We identify weighted 1*-neighborhood attacks, which assume that an attacker has knowledge about not only a target's one-hop neighbors and connections between them (1-neighborhood graph), but also related node degrees and edge weights. With this information, an attacker may re-identify a target with high confidence, even if any node's 1-neighborhood graph is isomorphic with $k-1$ other nodes' graphs. To counter this attack while preserving high utility of the published graph, we define a key privacy property, probabilistic indistinguishability, and propose a heuristic indistinguishable group anonymization (HIGA) scheme to anonymize a weighted social graph with such a property. Extensive experiments on both real and synthetic data sets illustrate the effectiveness and efficiency of the proposed scheme.
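
Illustrative note (not part of the published record): the attacker knowledge described in the summary can be sketched roughly as follows. This is a minimal toy sketch, not the authors' HIGA scheme or their formal attack model. The adjacency-map representation, the function names `build_graph` and `attacker_view`, the reading of "related node degrees" as the degrees of the target's one-hop neighbors, and the example graph are all assumptions made for illustration. The structural summary is a crude node and edge count, not a real isomorphism test.

```python
def build_graph(edges):
    """Build a symmetric adjacency map from (u, v, weight) triples.

    Assumes an undirected graph with no self-loops.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def attacker_view(adj, target):
    """Return a label-free summary of the target's weighted 1-neighborhood.

    Hypothetical helper: returns (structure, neighbor_degrees, weights).
    """
    members = set(adj[target]) | {target}
    # Edges among the target and its one-hop neighbors (1-neighborhood graph).
    edges = {frozenset((u, v)) for u in members for v in adj[u] if v in members}
    # Crude structural summary: (node count, edge count).
    structure = (len(members), len(edges))
    # Degrees of the one-hop neighbors in the full graph (assumed reading).
    neighbor_degrees = sorted(len(adj[n]) for n in adj[target])
    # Weights of the edges inside the 1-neighborhood graph.
    weights = sorted(adj[u][v] for u, v in map(tuple, edges))
    return structure, neighbor_degrees, weights


# Toy graph: "a" and "b" each have two mutually unconnected neighbors,
# so their unweighted 1-neighborhood graphs have the same shape.
adj = build_graph([
    ("a", "x", 1), ("a", "y", 2),
    ("b", "p", 1), ("b", "q", 5),
    ("q", "r", 1),
])
view_a = attacker_view(adj, "a")
view_b = attacker_view(adj, "b")
print(view_a)
print(view_b)
```

In this toy graph the two structural summaries match, yet the neighbor degrees and edge weights differ, which is the kind of extra knowledge the summary says lets an attacker tell apart nodes whose 1-neighborhood graphs look alike.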
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2016.2615020