An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity
Published in | 2016 International Conference on Cyberworlds (CW), pp. 147-150 |
---|---|
Main Author | Weiming Yang |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2016 |
Abstract | HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm used in Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this causes "topic drift", a deviation between the search results and the topic. To address this, the paper presents an improved algorithm that takes both web content similarity and link analysis into account. The experiment shows that the improved algorithm enhances the correlation of search results and limits the occurrence of topic drift to some degree. |
---|---|
Author | Weiming Yang |
Author_xml | – sequence: 1 surname: Weiming Yang fullname: Weiming Yang email: ywm519@163.com organization: Coll. of Comput. & Inf. Sci., Chongqing Normal Univ., Chongqing, China |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/CW.2016.30 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 9781509023035 1509023038 |
EndPage | 150 |
ExternalDocumentID | 7756141 |
Genre | orig-research |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-i175t-a15ebb24a5a84cb92f131332a4c6905fa222a6b20cf26350f494bfcffe2cb0843 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:38:03 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i175t-a15ebb24a5a84cb92f131332a4c6905fa222a6b20cf26350f494bfcffe2cb0843 |
PageCount | 4 |
ParticipantIDs | ieee_primary_7756141 |
PublicationCentury | 2000 |
PublicationDate | 2016-Sept. |
PublicationDateYYYYMMDD | 2016-09-01 |
PublicationDate_xml | – month: 09 year: 2016 text: 2016-Sept. |
PublicationDecade | 2010 |
PublicationTitle | 2016 International Conference on Cyberworlds (CW) |
PublicationTitleAbbrev | CYBER |
PublicationYear | 2016 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SourceID | ieee |
SourceType | Publisher |
StartPage | 147 |
SubjectTerms | Algorithm design and analysis; Authority page; Computational efficiency; Computers; Correlation; Crawlers; HITS algorithm; Hub page; Symmetric matrices; Web content similarity; Web pages |
Title | An Improved HITS Algorithm Based on Analysis of Web Page Links and Web Content Similarity |
URI | https://ieeexplore.ieee.org/document/7756141 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
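
Illustrative sketch: the abstract describes combining web content similarity with HITS link analysis to limit topic drift, but this record does not give the paper's formulas. The Python below is therefore only a generic sketch of that idea, not the author's method. It assumes (hypothetically) that each page's relevance to the query is a cosine similarity over raw term counts, and that every link is weighted by the relevance of its target page. The names `cosine_similarity`, `similarity_weighted_hits`, `links`, and `relevance` are made up for this example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts using raw term counts (an assumed measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def _normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {p: (v / norm if norm else 0.0) for p, v in scores.items()}


def similarity_weighted_hits(links, relevance, iterations=50):
    """HITS where each link p -> q is weighted by relevance[q].

    links:     dict mapping a page to the list of pages it links to
    relevance: dict mapping a page to its query similarity in [0, 1]
               (pages missing from the dict are treated as irrelevant)
    Returns (hub, authority) score dictionaries.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum of hub scores of in-linking pages,
        # scaled by how relevant the target page is to the query.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += relevance.get(q, 0.0) * hub[p]
        auth = _normalize(new_auth)
        # Hub update: sum of authority scores of linked pages,
        # using the same link weights.
        new_hub = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_hub[p] += relevance.get(q, 0.0) * auth[q]
        hub = _normalize(new_hub)
    return hub, auth


if __name__ == "__main__":
    query = "jaguar car engine"
    texts = {
        "a": "jaguar car review engine and price",
        "b": "jaguar habitat in the rainforest",
    }
    relevance = {p: cosine_similarity(query, t) for p, t in texts.items()}
    links = {"h1": ["a", "b"], "h2": ["a", "b"]}
    hub, auth = similarity_weighted_hits(links, relevance)
    for page in sorted(auth, key=auth.get, reverse=True):
        print(page, round(auth[page], 3))
```

In plain HITS, pages "a" and "b" would receive equal authority because the same hubs link to both. With the assumed relevance weighting, the page whose text is closer to the query ranks higher, which is one way a content-similarity term can damp topic drift.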