MLIBT: A multi-level improvised binarization technique for Tamizhi inscriptions
Published in | Expert systems with applications Vol. 236; p. 121320 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.02.2024 |
Subjects | |
Summary: |
• To create a standard Tamizhi dataset: capturing onsite inscriptions, collecting from various ASI, and having the collected datasets annotated by domain experts and validated by subject experts.
• To enlarge the custom dataset by data augmentation in order to test the suitability of a deep learning approach for Tamizhi inscription binarization.
• To develop a multi-level improvised binarization algorithm for Tamizhi inscriptions that separates the foreground text from the stone background by applying improved median filtering, iterative thresholding and modified adaptive thresholding.
• To perform post-processing optimization of the binarized results using swell and shrink filters.
Tamizhi inscriptions, among the earliest ever discovered, are predominantly found on memorial stones and in caves and date from the 5th century BCE to the 3rd century CE. Because the Tamizhi script evolved into the modern Tamil script over time, today's generations need ways to interpret it in order to learn about historical figures and events. Currently, only a few epigraphists are available to manually decode the inscriptions into modern Tamil. Hence, there is a need for an alternative way to preserve this cultural heritage. Image processing is one such digital technology: it enables binarization of inscription images, and the retrieved text can then be converted to the required target language. Nevertheless, binarization of Tamizhi inscription images is highly complex due to aging, environmental factors, their handwritten nature, similar foreground and background, and the uneven sizes and shapes of the stones. Also, because the dataset is small, deep learning techniques are inapplicable. Furthermore, existing approaches produce poor results for Tamizhi inscriptions since they can only be used on flat stone backgrounds and require sufficient illumination for effective binarization. This research proposes a multi-level improvised binarization technique (MLIBT) for Tamizhi inscription images to address these challenges. It combines an improved median filter and modified adaptive thresholding with post-processing by shrink and swell filters. MLIBT achieved an accuracy of around 92.19%, outperforming current binarization techniques, which reached a maximum accuracy of about 74%. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2023.121320 |
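
The stages named in the summary (median filtering, iterative thresholding, adaptive thresholding, then shrink and swell filters) can be illustrated with a minimal pure-Python sketch. This is not the MLIBT algorithm: the record does not describe the "improved" and "modified" variants, so the code below uses plain textbook versions. All function names, the 3x3 window, the `offset` parameter, and the assumption that engraved strokes are darker than the stone are illustrative choices, not details from the article.

```python
from statistics import median


def window(img, r, c, k=1):
    """Values in the (2k+1)x(2k+1) window around (r, c), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - k), min(h, r + k + 1))
            for j in range(max(0, c - k), min(w, c + k + 1))]


def median_filter(img):
    """Plain 3x3 median filter (a textbook version, not the paper's improved filter)."""
    return [[median(window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def iterative_threshold(img, eps=0.5):
    """Global threshold by iteration: move t to the midpoint of the two class means."""
    pixels = [p for row in img for p in row]
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:  # uniform image: nothing to separate
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


def binarize(img, t):
    """1 = foreground. Assumption: engraved strokes are darker than the stone."""
    return [[1 if p < t else 0 for p in row] for row in img]


def adaptive_threshold(img, k=2, offset=10):
    """Local-mean threshold: foreground if darker than the window mean minus offset."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            win = window(img, r, c, k)
            row.append(1 if img[r][c] < sum(win) / len(win) - offset else 0)
        out.append(row)
    return out


def shrink(binary):
    """Erosion: a pixel stays 1 only if every pixel in its clipped 3x3 window is 1."""
    return [[min(window(binary, r, c)) for c in range(len(binary[0]))]
            for r in range(len(binary))]


def swell(binary):
    """Dilation: a pixel becomes 1 if any pixel in its clipped 3x3 window is 1."""
    return [[max(window(binary, r, c)) for c in range(len(binary[0]))]
            for r in range(len(binary))]


if __name__ == "__main__":
    # Synthetic "stone": bright background with a dark engraved patch.
    stone = [[200] * 6 for _ in range(6)]
    for r in range(1, 5):
        for c in range(1, 5):
            stone[r][c] = 60
    smooth = median_filter(stone)
    t = iterative_threshold(smooth)
    mask = swell(shrink(binarize(smooth, t)))  # shrink then swell drops small specks
    for row in mask:
        print("".join("#" if v else "." for v in row))
```

Applying shrink before swell (a morphological opening) removes isolated foreground specks; the reverse order would instead fill small holes in strokes. Which order, window size, and thresholds suit real inscription images is something the article itself would have to settle.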