Tulu Manuscript OCR: Preserving Ancient Wisdom through Character Recognition


Bibliographic Details
Published in: 2024 Second International Conference on Data Science and Information System (ICDSIS), pp. 1-7
Main Authors: Jayashree, K R, Sinchana, Manisha, K, Deekshitha, Sudarshan, K, Kannadaguli, Prashanth
Format: Conference Proceeding
Language: English
Published: IEEE, 17.05.2024

Summary: Tulu, spoken largely in coastal Karnataka, has a distinct script that was traditionally written on palm leaves. This study addresses the scarcity of efficient OCR solutions for that script by employing machine learning algorithms that include decision tree, k-nearest neighbors (KNN), and random forest. The system achieves its highest accuracy, 92.35%, with the random forest algorithm. Its versatility in handling diverse font styles and sizes is crucial for Tulu character recognition. A classifier-level fusion strategy further enhances recognition accuracy, which is vital given the intricate nature of Tulu characters. This research advances OCR technology for Indian languages, specifically meeting the unique needs of the Tulu script. The high accuracy of the random forest algorithm underscores its potential for broader applications. The proposed Tulu Character Recognition System represents a pivotal step in addressing the OCR gap for Indian languages and holds promise for future advances in linguistic technology.
DOI: 10.1109/ICDSIS61070.2024.10594489
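
The summary mentions a classifier-level fusion strategy over decision tree, KNN, and random forest but does not say which fusion rule is used. The sketch below illustrates one common option, majority voting over the labels each classifier predicts, and should not be read as the authors' method. The function name `fuse_by_majority`, the tie-breaking rule, and the transliterated labels are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def fuse_by_majority(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-classifier label predictions by majority vote.

    predictions[i][j] is the label that classifier i assigns to sample j.
    Ties are broken in favour of the classifier listed first (an assumption;
    the summary does not describe the fusion rule).
    """
    if not predictions:
        return []
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must label the same number of samples")

    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # The first vote, in classifier order, that reaches the top count wins.
        fused.append(next(v for v in votes if counts[v] == top))
    return fused


# Hypothetical outputs of the three classifiers on four character images.
decision_tree = ["ka", "ga", "ma", "ta"]
knn = ["ka", "ga", "na", "da"]
random_forest = ["ka", "kha", "na", "pa"]

# Random forest is listed first so that it decides three-way ties.
fused = fuse_by_majority([random_forest, decision_tree, knn])
print(fused)
```

Listing the random forest first reflects the summary's report that it was the most accurate single classifier, so it is a reasonable tie-breaker when all three disagree.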