Multiclass Single Label Model for Web Page Classification

Bibliographic Details
Published in 2019 International Conference on Recent Advances in Energy-efficient Computing and Communication (ICRAECC), pp. 1-6
Main Authors Kag, Aakash; Jenila, Livingston L. M.; Merlin, Livingston L. M.; Agnel, Livingston L. G.X
Format Conference Proceeding
Language English
Published IEEE 01.03.2019
Summary: The Web is a huge repository of information, and web pages need to be categorized to facilitate better search and retrieval. Web page classification has become a challenging task due to the exponential growth of the World Wide Web. This study augments a classification model with a general facility for automatically assigning a class label (e.g., sport, news) to web pages based on the output of a Naive Bayes classifier. To build the classification model, the Yahoo Open Directory Project (ODP) data set was used to create the training and testing sets. In this research work, web page classification was performed using Uniform Resource Locator (URL) features, metadata, meta keywords, internal links, and text, which gives better results than a method based on URLs alone.
DOI:10.1109/ICRAECC43874.2019.8995087
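
Illustrative sketch (not taken from the paper): the summary describes assigning a single class label to a page from the output of a Naive Bayes classifier over URL, meta keyword, and text features. The following is a minimal multinomial Naive Bayes in plain Python under stated assumptions. The page dictionaries, the field names `url`, `meta_keywords`, and `text`, the toy training pages, and the Laplace smoothing value are all hypothetical; the authors' actual feature extraction, ODP preprocessing, and implementation are not given in this record.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(page):
    """Lowercase word tokens drawn from a page's URL, meta keywords, and text.

    The field names are assumptions made for this sketch.
    """
    combined = " ".join(
        [page.get("url", ""), page.get("meta_keywords", ""), page.get("text", "")]
    )
    return re.findall(r"[a-z]+", combined.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing, one label per page."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (assumed)
        self.class_counts = Counter()            # pages seen per class
        self.token_counts = defaultdict(Counter) # token frequencies per class
        self.vocab = set()

    def fit(self, pages, labels):
        for page, label in zip(pages, labels):
            tokens = tokenize(page)
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, page):
        # Tokens never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(page) if t in self.vocab]
        total_pages = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_pages in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_pages / total_pages)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training pages, invented for illustration only (not ODP data).
TRAIN_PAGES = [
    {"url": "http://example.com/sport/football-scores",
     "meta_keywords": "football, league, match",
     "text": "The team won the match"},
    {"url": "http://example.com/sport/tennis",
     "meta_keywords": "tennis, tournament",
     "text": "The player won the final match"},
    {"url": "http://example.com/news/election",
     "meta_keywords": "politics, election",
     "text": "The government announced the election results"},
    {"url": "http://example.com/news/economy",
     "meta_keywords": "economy, budget",
     "text": "The minister presented the budget"},
]
TRAIN_LABELS = ["sport", "sport", "news", "news"]

model = NaiveBayes().fit(TRAIN_PAGES, TRAIN_LABELS)
```

Calling `model.predict(...)` on a new page dictionary returns the single highest-scoring label. Dropping the `meta_keywords` and `text` fields from the input pages would reduce this sketch to a URL-only baseline of the kind the summary compares against.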