A generalized tree augmented naive Bayes link prediction model

Bibliographic Details
Published in: Journal of Computational Science, Vol. 27, pp. 206–217
Main Author: Wu, Jiehua
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.07.2018

Summary:
•We present a novel tree augmented naive Bayes link prediction method called TAN.
•TAN overcomes the limitations of recent naive Bayes-based approaches in the literature.
•A novel definition of mutual information between common neighbors is introduced to calculate the similarity between potential nodes.
•A weighted clustering coefficient is designed to adapt the TAN model to weighted network link prediction.
•Experiments with synthetic, real-world and weighted data are presented.

This paper studies link prediction, a recently emerged hot topic with many important applications, notably in complex network analysis. We propose a novel similarity-based approach that improves the well-known naive Bayes method by introducing a new tree augmented naive Bayes (TAN) probabilistic model. It makes better link predictions because it relaxes the strong independence assumption among shared common neighbors, better matching real-world situations. To capture the latent correlation among common neighbors, we exploit mutual information to quantify the influence of each neighbor's neighborhood. This yields better performance than methods that rely only on local link/triangle structure information. In addition, the TAN model is easily adapted to other common-neighbor-based methods such as AA and RA. Experimental results on synthetic and real-world networks show that our algorithms outperform the baseline methods in terms of both effectiveness and efficiency.
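For context, the abstract's baseline similarity indices can be illustrated with a minimal sketch. The following is not the paper's TAN implementation; it only shows the classic common-neighbor (CN), Adamic–Adar (AA), and resource-allocation (RA) scores that the TAN model extends, on a small toy graph of our own invention:

```python
# Sketch of common-neighbor-based link prediction indices (CN, AA, RA).
# These are the well-known baselines mentioned in the abstract; the toy
# graph and function names are illustrative assumptions, not from the paper.
import math
from collections import defaultdict

def build_adj(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def cn(adj, x, y):
    """Common Neighbors: |Gamma(x) ∩ Gamma(y)|."""
    return len(adj[x] & adj[y])

def aa(adj, x, y):
    """Adamic-Adar: sum of 1/log(deg(z)) over shared neighbors z."""
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[x] & adj[y] if len(adj[z]) > 1)

def ra(adj, x, y):
    """Resource Allocation: sum of 1/deg(z) over shared neighbors z."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adj(edges)
print(cn(adj, "a", "d"))  # nodes b and c are shared neighbors -> 2
```

AA and RA both penalize high-degree shared neighbors, since a hub that connects to everything carries little evidence for any particular link; the paper's contribution is to further model the dependencies *among* these shared neighbors rather than treating them as independent, as naive Bayes does.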
ISSN: 1877-7503
1877-7511
DOI: 10.1016/j.jocs.2018.04.006