Categorization of Bangla Medical Text Documents Based on Hybrid Internal Feature

Bibliographic Details
Published in Computational Intelligence, Communications, and Business Analytics Vol. 1031; pp. 181-192
Main Authors Dhar, Ankita; Dash, Niladri Sekhar; Roy, Kaushik
Format Book Chapter
Language English
Published Singapore Springer Singapore Pte. Limited 2019
Springer Singapore
Series Communications in Computer and Information Science
Summary: This paper develops an automatic text categorization system that separates Bangla medical text documents from non-medical ones using two primary features: word length and the presence of English equivalent words in the documents. The authors first show that, based on word length and the number of English equivalent words in a text, Bangla medical documents can be identified among text documents of any other domain. A Stochastic Gradient Descent (SGD) classifier is then applied and achieves an accuracy of 97.75%. Comparisons with other commonly used classifiers show that SGD performs better than them on this task.
ISBN: 9789811385803
9811385807
ISSN: 1865-0929
1865-0937
DOI: 10.1007/978-981-13-8581-0_15
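
The approach outlined in the summary (two per-document features fed to an SGD-trained classifier) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the lexicon `ENGLISH_EQUIVALENTS`, the helper names, the whitespace tokenizer, the use of mean word length, and the logistic loss are all hypothetical choices, since this record does not specify them. Word length is counted in Unicode code points, which may differ from how the chapter measures Bangla words.

```python
import math
import random

# Hypothetical lexicon of Bangla words with English equivalents
# (transliterated terms). The chapter's real word list is not given here.
ENGLISH_EQUIVALENTS = {"ডাক্তার", "হাসপাতাল", "ভাইরাস"}


def extract_features(text, lexicon=ENGLISH_EQUIVALENTS):
    """Return [mean token length, share of tokens found in the lexicon]."""
    tokens = text.split()  # naive whitespace tokenizer (assumption)
    if not tokens:
        return [0.0, 0.0]
    mean_len = sum(len(t) for t in tokens) / len(tokens)
    share = sum(t in lexicon for t in tokens) / len(tokens)
    return [mean_len, share]


def fit_scaler(X):
    """Compute per-feature mean and standard deviation."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return means, stds


def scale(x, scaler):
    """Standardize one feature vector with the fitted scaler."""
    means, stds = scaler
    return [(v - m) / s for v, m, s in zip(x, means, stds)]


def train_sgd(X, y, lr=0.1, epochs=200, seed=0):
    """Train a logistic-loss linear classifier with per-sample SGD updates."""
    scaler = fit_scaler(X)
    Xs = [scale(x, scaler) for x in X]
    w, b = [0.0] * len(Xs[0]), 0.0
    order = list(range(len(Xs)))
    rng = random.Random(seed)
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = sum(wj * xj for wj, xj in zip(w, Xs[i])) + b
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            grad = 1.0 / (1.0 + math.exp(-z)) - y[i]
            w = [wj - lr * grad * xj for wj, xj in zip(w, Xs[i])]
            b -= lr * grad
    return scaler, w, b


def predict(model, x):
    """Return 1 (medical) or 0 (non-medical) for a raw feature vector."""
    scaler, w, b = model
    z = sum(wj * xj for wj, xj in zip(w, scale(x, scaler))) + b
    return 1 if z > 0 else 0
```

In use, each document would be passed through `extract_features`, the resulting vectors and their labels (1 for medical, 0 for non-medical) given to `train_sgd`, and new documents classified with `predict`. Features are standardized before training because the two of them sit on very different scales, which otherwise slows SGD convergence.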