Multiclass Single Label Model for Web Page Classification
Published in | 2019 International Conference on Recent Advances in Energy-efficient Computing and Communication (ICRAECC) pp. 1 - 6 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.03.2019 |
Summary: | The Web is a huge repository of information, and web pages need to be categorized to facilitate better search and retrieval. Web page classification has become a challenging task due to the exponential growth of the World Wide Web. This study augments a classification model with a general facility for automatically assigning a class label (e.g., sport, news) to web pages based on the output of a Naive Bayes classifier. To build the classification model, the Yahoo Open Directory Project (ODP) data set was used to create the training and testing sets. In this work, web pages were classified using Uniform Resource Locator (URL) features, metadata, meta keywords, internal links, and text, which gives better results than URL-only methods. |
---|---|
DOI: | 10.1109/ICRAECC43874.2019.8995087 |
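
The summary describes a single-label Naive Bayes classifier over tokens drawn from the URL, metadata, meta keywords, internal links, and page text. The following is a minimal sketch of that idea, not the authors' implementation: the page dictionary layout, the field names (`url`, `meta_description`, `meta_keywords`, `link_text`, `body_text`), the tokenizer, and the use of add-one smoothing are all assumptions made for illustration, and the two training pages are invented toy data rather than ODP records.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on any non-alphanumeric run (works for URLs and prose)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def page_tokens(page):
    """Merge tokens from the feature sources named in the summary.

    `page` is a hypothetical dict; a missing field counts as empty.
    """
    fields = ("url", "meta_description", "meta_keywords", "link_text", "body_text")
    tokens = []
    for field in fields:
        tokens.extend(tokenize(page.get(field, "")))
    return tokens


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (an assumed variant)."""

    def fit(self, pages, labels):
        self.class_counts = Counter(labels)       # pages seen per class
        self.token_counts = defaultdict(Counter)  # token frequencies per class
        self.vocab = set()
        for page, label in zip(pages, labels):
            toks = page_tokens(page)
            self.token_counts[label].update(toks)
            self.vocab.update(toks)
        return self

    def predict(self, page):
        """Return the single class label with the highest log-posterior score."""
        tokens = page_tokens(page)
        total_pages = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_pages in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n_pages / total_pages)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for illustration only.
train = [
    {"url": "http://example.com/sports/football", "meta_keywords": "football, league, match"},
    {"url": "http://example.com/news/politics", "meta_keywords": "election, government, policy"},
]
model = NaiveBayes().fit(train, ["sport", "news"])
predicted = model.predict(
    {"url": "http://example.org/football-scores", "meta_keywords": "match"}
)
print(predicted)  # -> sport
```

Pooling all feature sources into one token stream is the simplest design choice; weighting URL tokens differently from body text, or training on URL tokens alone for comparison, would be natural variations.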