Character segmentation for Nastaleeq URDU OCR: A review
Published in | 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) pp. 1489 - 1493 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.03.2016 |
Summary: | Urdu Nastaleeq is a highly cursive, context-sensitive language, written diagonally from top right to bottom left, which makes it difficult to segment a partial or complete word into characters. Further, because characters are stacked, segmentation at the character level is hard to perform. Some researchers have performed ligature-level segmentation and have succeeded to a great extent, but the segmentation accuracy is still low and needs to be improved. In this paper, the methodology for segmentation of Urdu Nastaleeq at the character level is presented. The various challenges encountered during segmentation are discussed in detail. |
---|---|
DOI: | 10.1109/ICEEOT.2016.7754931 |