Fast and Parallel Ranking-based Clustering for Heterogeneous Graphs


Bibliographic Details
Published in Journal of data intelligence Vol. 1; no. 2; pp. 137-158
Main Authors Yamazaki, Kotaro; Sato, Tomoki; Shiokawa, Hiroaki; Kitagawa, Hiroyuki
Format Journal Article
Language English
Published 01.06.2020
ISSN 2577-610X
DOI 10.26421/JDI1.2-3

Summary: Demand for graph data analysis methods is increasing. RankClus is a framework that extracts clusters from heterogeneous graphs by integrating clustering and ranking; it improves the clustering results by alternately updating the clustering and the ranking, which also makes the resulting clusters easier to understand. However, RankClus is computationally expensive on large graphs because it must iterate both clustering and ranking over all nodes. To address this problem, we propose a novel fast RankClus algorithm for heterogeneous graphs. To speed up the entire RankClus procedure, the proposed algorithm reduces the computational cost of the ranking process in each iteration: it measures how much each node affects the clustering result and prunes the nodes whose effect is not significant. We also present a parallel extension of the proposed algorithm that fully exploits a modern manycore CPU. Our extensive evaluations show that the fast and parallel algorithms drastically reduce the computation time of the original RankClus algorithm.