An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity

Bibliographic Details
Published in: 2016 International Conference on Cyberworlds (CW), pp. 147-150
Main Author: Weiming Yang
Format: Conference Proceeding
Language: English
Published: IEEE, 01.09.2016
Summary: HITS (HyperLink-Induced Topic Search) is a classical link analysis algorithm used in Web Structure Mining (WSM). The algorithm takes the structural information of links into consideration but ignores the correlation between pages and topics. In some cases this leads to "topic drift", a deviation between the search results and the query topic. To address this problem, the paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm increases the relevance of search results and limits the occurrence of topic drift to some degree.
DOI:10.1109/CW.2016.30
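
The summary describes combining link analysis with content similarity but does not give the paper's formulas. The sketch below is therefore not the author's method; it is one possible illustration of the general idea, under these assumptions: content similarity is measured as bag-of-words cosine similarity between a page's text and the query, and each link is weighted by the product of the relevance of its source and target pages. The names `cosine_similarity`, `weighted_hits`, `links`, `texts`, and `query` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words term counts of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def weighted_hits(links, texts, query, iterations=50):
    """HITS iteration with links weighted by query relevance (an assumption).

    links: dict mapping each source page to a list of pages it links to.
    texts: dict mapping every page to its text content.
    Returns (authority, hub) score dicts.
    """
    pages = list(texts)
    relevance = {p: cosine_similarity(texts[p], query) for p in pages}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores over incoming links.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src] * relevance[src] * relevance[dst]
        auth = normalize(new_auth)
        # Hub update: weighted sum of authority scores over outgoing links.
        new_hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_hub[src] += auth[dst] * relevance[src] * relevance[dst]
        hub = normalize(new_hub)
    return auth, hub


# Toy data: page "c" is linked to but is unrelated to the query.
texts = {
    "a": "web link analysis tutorial",
    "b": "link analysis for web search",
    "c": "cooking recipes pasta",
    "d": "web mining and link analysis hub",
}
links = {"d": ["a", "b", "c"], "a": ["b"]}
auth, hub = weighted_hits(links, texts, query="web link analysis")
for page in sorted(auth, key=auth.get, reverse=True):
    print(page, round(auth[page], 3))
```

In this toy graph the off-topic page "c" shares no terms with the query, so every link into it has weight zero and it receives no authority, whereas unweighted HITS would credit it for the incoming link from "d". Other weighting schemes (for example, page-to-page similarity rather than page-to-query similarity) would fit the summary equally well.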