Classification by multi-semantic meta path and active weight learning in heterogeneous information networks
Published in | Expert systems with applications Vol. 123; pp. 227 - 236 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York, Elsevier Ltd, 01.06.2019; Elsevier BV |
Subjects | |
Summary: | • Meta-paths represent the complex correlations in a heterogeneous information network (HIN). • Multi-semantic meta-paths and jump-paths strengthen the associations between nodes. • An active weight learning method is proposed for multiple kinds of meta-paths. • Classification in HINs reaches higher accuracy even with a small amount of labeled data. • The approach achieves a significant improvement over other methods.
A heterogeneous information network (HIN) is a large-scale network that contains different types of objects and complex links. It differs from a homogeneous network in the heterogeneity of its objects, which are represented as nodes, and in the complexity of its links, both of which make object classification more difficult. A meta-path denotes a relationship between nodes in an HIN, and the path information can be enriched by extracting jump-paths, which effectively alleviates the problem of data sparseness. Because multiple meta-paths represent different semantics, we propose an active weight learning method for each meta-path type, which aims to maximize the weight of meta-paths with strong correlation and to lower the weight of those whose correlation is weak. A feature matrix based on the meta-paths is constructed, and a Random Forest classifier is trained to perform node classification in HINs. The experimental results show that our method achieves better performance in complex networks while using less labeled data. The active learning strategy is effective for identifying which objects to label for training. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2019.01.044 |
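
The summary describes building a feature matrix from weighted meta-paths. The following is a minimal sketch of that idea only, not the article's implementation. The toy network (three authors, two papers, one venue), the helper names `matmul`, `transpose`, `meta_path_counts` and `weighted_features`, and the hand-picked weights are all assumptions for illustration. The article learns the weights with its active weight learning method and extracts jump-paths, neither of which is reproduced here, and the Random Forest training step is omitted.

```python
# Hypothetical toy HIN (not from the article): 3 authors, 2 papers, 1 venue.
# author_paper[i][j] = 1 if author i wrote paper j.
author_paper = [
    [1, 0],
    [1, 1],
    [0, 1],
]
# paper_venue[j][k] = 1 if paper j appeared in venue k.
paper_venue = [
    [1],
    [1],
]


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def meta_path_counts(*adjacencies):
    """Count path instances along a meta-path by chaining adjacency matrices."""
    result = adjacencies[0]
    for adj in adjacencies[1:]:
        result = matmul(result, adj)
    return result


# Author-Paper-Author: how many papers two authors share.
apa = meta_path_counts(author_paper, transpose(author_paper))

# Author-Paper-Venue-Paper-Author: paths linking two authors through a venue.
apvpa = meta_path_counts(
    author_paper, paper_venue, transpose(paper_venue), transpose(author_paper)
)


def weighted_features(matrices, weights):
    """Concatenate each node's row from every meta-path matrix, scaled by weight."""
    n = len(matrices[0])
    return [
        [w * v for m, w in zip(matrices, weights) for v in m[i]]
        for i in range(n)
    ]


# The weights are fixed by hand here (an assumption); the article learns them.
features = weighted_features([apa, apvpa], [0.75, 0.25])
```

Each row of `features` is one author's feature vector, which could then be passed with the available labels to any off-the-shelf classifier.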