An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity
Published in | 2016 International Conference on Cyberworlds (CW) pp. 147 - 150 |
---|---|
Main Author | |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2016 |
Subjects | |
Summary: | HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm used in Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this causes "topic drift", a deviation between the search results and the topic. To address this, the paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm improves the correlation between search results and the topic and limits the occurrence of topic drift to some degree. |
DOI: | 10.1109/CW.2016.30 |
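
The record does not give the paper's formulas, so the following is only an illustrative sketch of the general idea in the summary: classical HITS, and a variant in which each link is scaled by a content-similarity score. The weighting scheme (cosine similarity between a query and the target page's text), the function names `hits` and `cosine`, and the toy pages are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two texts using raw term counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(c * c for c in ca.values()))
    nb = math.sqrt(sum(c * c for c in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def _normalize(scores):
    """Scale a score dict to unit L2 norm (left unchanged if all zeros)."""
    norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
    return {n: s / norm for n, s in scores.items()}


def hits(edges, weight=None, iterations=50):
    """Iterative HITS on a directed edge list.

    weight(u, v) optionally scales the link u -> v; with weight=None every
    link counts as 1.0, which is classical HITS.
    """
    nodes = sorted({n for e in edges for n in e})
    w = {(u, v): (1.0 if weight is None else weight(u, v)) for u, v in edges}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum of (weighted) hub scores of pages linking in.
        auth = {n: 0.0 for n in nodes}
        for (u, v), wt in w.items():
            auth[v] += wt * hub[u]
        auth = _normalize(auth)
        # Hub update: sum of (weighted) authority scores of pages linked to.
        hub = {n: 0.0 for n in nodes}
        for (u, v), wt in w.items():
            hub[u] += wt * auth[v]
        hub = _normalize(hub)
    return hub, auth


# Hypothetical toy graph: page "d" has more in-links but is off-topic.
query = "jaguar cat"
text = {"c": "jaguar cat habitat", "d": "jaguar car dealer prices"}
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("e", "d")]

_, auth_plain = hits(edges)
_, auth_sim = hits(edges, weight=lambda u, v: cosine(query, text.get(v, "")))
```

In this toy graph, link structure alone ranks the off-topic page `d` above `c` because it has more in-links, while the similarity-weighted run ranks the on-topic page `c` higher. This mirrors, in a very simplified form, the topic-drift effect described in the summary.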