Fast and Parallel Ranking-based Clustering for Heterogeneous Graphs
Published in | Journal of data intelligence Vol. 1; no. 2; pp. 137 - 158 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | 01.06.2020 |
ISSN | 2577-610X |
DOI | 10.26421/JDI1.2-3 |
Summary: | Demand for graph data analysis methods is increasing. RankClus is a framework that extracts clusters from heterogeneous graphs by integrating clustering and ranking; it improves the clustering results by alternately updating the clustering and the ranking, which also gives a better understanding of the clusters. However, RankClus is computationally expensive on large graphs because it must iterate both clustering and ranking over all nodes. In this paper, to address this problem, we propose a novel fast RankClus algorithm for heterogeneous graphs. To speed up the entire RankClus procedure, our algorithm reduces the computational cost of the ranking process in each iteration: it measures how much each node affects the clustering result and prunes the node if its effect is not significant. Furthermore, we present a parallel algorithm that extends the proposed algorithm to fully exploit a modern manycore CPU. Our extensive evaluations show that the fast and parallel algorithms drastically cut the computation time of the original RankClus algorithm. |