Efficient subgraph search on large anonymized graphs

Bibliographic Details
Published in Concurrency and computation Vol. 31; no. 23
Main Authors Ding, Xiaofeng; Ou, Yangling; Jia, Jianhong; Jin, Hai; Liu, Jixue
Format Journal Article
Language English
Published Hoboken Wiley Subscription Services, Inc 10.12.2019
Summary: Graphs are among the most important data structures for modeling social networks and have become popular for finding interesting relationships between individuals. Since graphs may contain sensitive information, data curators usually need to anonymize a graph before publication to prevent individual re-identification, which leads to a large number of anonymized graphs for data sharing and exploration. However, the new structures and properties of anonymized graphs make traditional graph indexing methods inefficient or even invalid for query processing. To address the subgraph query problem over anonymized graph databases, in this paper, we first introduce basic concepts about anonymized graphs and subgraph queries, then propose an index structure named Closure+-tree to process subgraph queries efficiently. In particular, graphs are organized hierarchically so that each node is a union of its child nodes under some specified mapping functions. During the processing of subgraph queries, all descendants of a node are pruned if their union does not contain the query graph. To evaluate the performance of our proposed Closure+-tree, extensive experiments are performed on both real and synthetic graph data sets. The experimental results reveal that our index structure can prune up to 80% of unqualified graphs for queries of varying size. Furthermore, the size of our index structure is only around a quarter of the entire anonymized graph data set, which indicates good scalability over large data sets.
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.4511
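
The pruning idea described in the summary (each index node is a union of its children, and a subtree is skipped when that union cannot contain the query) can be illustrated with a minimal Python sketch. This is not the Closure+-tree from the paper. It rests on a strong simplifying assumption: every vertex carries a label that is unique and shared across graphs, so the "mapping function" is the identity and subgraph containment reduces to an edge-subset check. The names `Node`, `leaf`, `internal`, `search`, and the toy graphs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


def edge(u, v):
    """An undirected edge, stored as a frozenset of two vertex labels."""
    return frozenset((u, v))


@dataclass
class Node:
    closure: Set[frozenset]                 # union of all edges stored below this node
    children: List["Node"] = field(default_factory=list)
    graph_id: Optional[str] = None          # set only on leaves (one data graph each)


def leaf(graph_id, edges):
    """A leaf holds a single data graph; its closure is the graph itself."""
    return Node(closure=set(edges), graph_id=graph_id)


def internal(children):
    """An internal node's closure is the union of its children's closures."""
    closure = set()
    for child in children:
        closure |= child.closure
    return Node(closure=closure, children=list(children))


def search(node, query, visited=None):
    """Return ids of data graphs that contain every edge of the query.

    Assumption: vertex labels are unique, so containment is a subset test.
    If `visited` is a list, each examined node is appended to it.
    """
    if visited is not None:
        visited.append(node)
    if not query <= node.closure:
        return []                           # prune: no descendant can match
    if node.graph_id is not None:
        return [node.graph_id]              # leaf passed the exact test
    matches = []
    for child in node.children:
        matches.extend(search(child, query, visited))
    return matches


# Toy data set: four small graphs grouped into two subtrees.
g1 = leaf("g1", {edge("a", "b"), edge("b", "c")})
g2 = leaf("g2", {edge("a", "b"), edge("a", "c")})
g3 = leaf("g3", {edge("x", "y"), edge("y", "z")})
g4 = leaf("g4", {edge("x", "z")})
root = internal([internal([g1, g2]), internal([g3, g4])])

query = {edge("a", "b"), edge("b", "c")}
visited = []
result = search(root, query, visited)       # g3 and g4 are never examined
```

In this toy run the right-hand subtree is rejected at its parent, so its two leaves are skipped without being inspected. With general labeled graphs the subset test would have to be replaced by an approximate containment test on the closure followed by an exact subgraph isomorphism check on the surviving candidates.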