Tulu Manuscript OCR: Preserving Ancient Wisdom through Character Recognition

Bibliographic Details
Published in	2024 Second International Conference on Data Science and Information System (ICDSIS), pp. 1-7
Main Authors	Jayashree, K R; Sinchana; Manisha, K; Deekshitha; Sudarshan, K; Kannadaguli, Prashanth
Format	Conference Proceeding
Language	English
Published	IEEE 17.05.2024
Subjects	Accuracy; Classification algorithms; classifier-level fusion strategy; Machine learning algorithms; Nearest neighbor methods; Optical character recognition; optical character recognition (OCR); Sea measurements; Training; tulu character recognition system
Summary:	Tulu, largely spoken in coastal Karnataka, has a distinct alphabet that used to be written on palm leaves. This study addresses the scarcity of efficient OCR solutions for the script. The proposed system employs machine learning algorithms, including decision tree, k-nearest neighbors (KNN), and random forest classifiers, and achieves its highest accuracy of 92.35% with the random forest algorithm. The system's versatility in handling diverse font styles and sizes is crucial for Tulu character recognition. The inclusion of a classifier-level fusion strategy enhances recognition accuracy, which is vital given the intricate nature of Tulu characters. This research advances OCR technology for Indian languages, specifically meeting the unique needs of the Tulu script. The high accuracy of the random forest algorithm underscores its potential for broader applications. The proposed Tulu Character Recognition System represents a pivotal step in addressing the OCR gap for Indian languages, holding promise for future linguistic technology advancements.
DOI:	10.1109/ICDSIS61070.2024.10594489
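
Note on the classifier-level fusion strategy mentioned in the summary: the record does not say which fusion rule the authors use, so the following is only a minimal sketch of one common option, weighted soft voting, in which each classifier's class probabilities are averaged and the highest fused score wins. The function name fuse_soft_vote, the labels "ka" and "ga", and all probability values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fuse_soft_vote(prob_outputs, weights=None):
    """Combine per-classifier class probabilities by weighted averaging.

    prob_outputs: list of dicts, one per classifier, mapping label -> probability.
    weights: optional list of per-classifier weights (defaults to equal weights).
    Returns (best_label, fused_scores).
    """
    if not prob_outputs:
        raise ValueError("need at least one classifier output")
    if weights is None:
        weights = [1.0] * len(prob_outputs)
    if len(weights) != len(prob_outputs):
        raise ValueError("one weight per classifier is required")

    total = float(sum(weights))
    fused = defaultdict(float)
    for probs, w in zip(prob_outputs, weights):
        for label, p in probs.items():
            fused[label] += w * p / total

    # Iterate labels in sorted order so that exact ties break deterministically.
    best = max(sorted(fused), key=lambda label: fused[label])
    return best, dict(fused)


# Hypothetical outputs for one character image from three classifiers
# (assumed values for illustration only, not results from the paper).
decision_tree = {"ka": 1.0, "ga": 0.0}
knn = {"ka": 0.4, "ga": 0.6}
random_forest = {"ka": 0.3, "ga": 0.7}

label, scores = fuse_soft_vote([decision_tree, knn, random_forest])
print(label, scores)
```

With equal weights, a confident decision tree can outvote two less certain classifiers. Passing a larger weight for one classifier (for example, the one with the best validation accuracy) shifts the fused decision toward it.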