Bengali News Headline Categorization Using Optimized Machine Learning Pipeline
Published in | International Journal of Information Engineering and Electronic Business, Vol. 13, No. 1, pp. 15-24 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Hong Kong: Modern Education and Computer Science Press, 08.02.2021 |
Summary: | Bengali-language news portals are now common, and their number is growing steadily. With easy access to the internet, reading news online has become a routine activity. News portals carry many different types of news. The system presented in this paper categorizes the headlines published on news portals and sites. Predictions are made by a machine learning algorithm that is trained and tested on a large collected dataset. Pre-processing consists of tokenization, digit removal, removal of punctuation marks and symbols, and deletion of stop words. A set of stop words was also created manually. A strong stop-word list leads to better performance, and stop-word deletion plays a leading role in feature selection. For optimization, a genetic algorithm is used, which reduces the feature size. Results obtained without the optimization step are also compared. The dataset was built by collecting news headlines from various Bengali news portals and sites. The results show good categorization performance. |
---|---|
ISSN: | 2074-9023 2074-9031 |
DOI: | 10.5815/ijieeb.2021.01.02 |
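
The pre-processing steps and the genetic-algorithm feature reduction named in the summary could look roughly like the Python sketch below. This is an illustration under stated assumptions, not the authors' implementation: the record does not give the paper's stop-word list, chromosome encoding, fitness function, or genetic operators, so `STOP_WORDS`, `preprocess`, `select_features`, and every parameter value here are hypothetical.

```python
import random
import unicodedata

# Hypothetical, tiny stop-word list for illustration only; the paper's
# manually built list is not reproduced in this record.
STOP_WORDS = {"ও", "এবং", "না", "যে", "এই", "থেকে", "করে"}


def preprocess(headline, stop_words=STOP_WORDS):
    """Tokenize a headline and drop digits, punctuation, symbols, stop words."""
    kept = []
    for ch in headline:
        # Unicode general categories: N* = numbers, P* = punctuation,
        # S* = symbols. Bengali digits and the danda fall into these classes.
        if unicodedata.category(ch)[0] in "NPS":
            kept.append(" ")
        else:
            kept.append(ch)
    tokens = "".join(kept).split()  # whitespace tokenization
    return [t for t in tokens if t not in stop_words]


def select_features(n_features, fitness, pop_size=20, generations=30,
                    mutation_rate=0.05, seed=0):
    """Toy genetic algorithm over boolean feature masks (assumed encoding).

    `fitness` maps a mask (list of bool) to a score; higher is better.
    Requires n_features >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)
```

In a real pipeline the fitness function would typically score a classifier trained on only the features whose mask entry is true, possibly with a penalty on the number of selected features; that choice is an assumption here and is left to the caller.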