Character segmentation for Nastaleeq URDU OCR: A review

Bibliographic Details
Published in	2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) pp. 1489 - 1493
Main Authors	Ganai, Aejaz Farooq; Lone, Faisal Rasheed
Format	Conference Proceeding
Language	English
Published	IEEE 01.03.2016
Subjects	Character recognition; Hidden Markov models; Image segmentation; Junctions; Ligature; Naskh; Nastaleeq; Optical character recognition software; Optimization; Shape; Urdu OCR; Urdu Segmentation

Summary:	Urdu Nastaleeq is a highly cursive, context-sensitive language, written diagonally from top right to bottom left, which makes it difficult to segment a partial word or a complete word into characters. Further, because characters are stacked, segmentation at the character level is hard to perform. Some researchers have performed ligature-level segmentation and have succeeded to a great extent, but the segmentation accuracy is still low and needs to be improved. In this paper, the methodology for segmentation of Urdu Nastaleeq at the character level is presented. The various challenges encountered during segmentation are discussed in detail.
DOI:	10.1109/ICEEOT.2016.7754931