Online continuous multi-stroke Persian/Arabic character recognition by novel spatio-temporal features for digitizer pen devices
Published in | Neural computing & applications Vol. 32; no. 8; pp. 3853 - 3872 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | London; Springer London; 01.04.2020; Springer Nature B.V |
Subjects | |
Summary: | Digitizer pens have become the front end of many digital devices, and the growing use of this technology has created a need for pen-based virtual keyboard systems. Despite attempts to create such systems for English, their absence for Persian/Arabic is an obvious shortcoming. This paper presents an online continuous Persian/Arabic character recognition method. A Persian/Arabic character is made of two types of strokes: a main body and zero or more delayed strokes. The paper defines a set of novel, discriminative spatial features for these strokes, which are then used in a novel algorithm to build a genetic programming-based decision tree (GPDT). A non-deterministic finite automaton (NDFA) uses the GPDT and spatio-temporal features to recognize groups of related strokes and the corresponding characters. Spatio-temporal features are needed because some Persian/Arabic letters share the same main body (e.g., “ح، خ، ج، چ”). Two further issues in recognizing Persian/Arabic letters, the unknown number of delayed stroke segments and the identical placement of delayed strokes, are resolved by the NDFA. After the GPDT identifies the main-body group, each recognized stroke triggers a transition in the NDFA until it stops in a character state (the final state at the end of a path in the NDFA). The proposed algorithm recognizes continuous Persian/Arabic letters and digits with 92.43% accuracy and isolated letters and digits with 97.52% accuracy. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-019-04225-6 |
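
The summary's NDFA step can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the main-body group (the body shared by ح, خ, ج and چ) has already been identified, which the summary attributes to the GPDT. The stroke labels (`dot_above`, `dot_below`, `triple_dot_below`), the state names, and the transition table are all hypothetical and chosen only for illustration. The sketch shows how a set-of-states simulation copes with an unknown number of delayed strokes: a single dot below the body may be a complete ج or the first of the three dots of چ.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical transition table for ONE main-body group (shared body of ح خ ج چ).
# Keys are (state, delayed-stroke label); values are the possible next states.
TRANSITIONS: Dict[Tuple[str, str], Set[str]] = {
    ("body", "dot_above"): {"khe"},
    # Non-deterministic move: the dot is either all of ج or the first dot of چ.
    ("body", "dot_below"): {"jim", "che_1"},
    ("che_1", "dot_below"): {"che_2"},
    ("che_2", "dot_below"): {"che"},
    # Assumption: three dots may also be drawn as a single delayed stroke.
    ("body", "triple_dot_below"): {"che"},
}

# Final (character) states and the letter each one stands for.
FINAL: Dict[str, str] = {"body": "ح", "khe": "خ", "jim": "ج", "che": "چ"}


def recognize(delayed_strokes: Iterable[str]) -> List[str]:
    """Return the letters whose character state is reachable after all strokes."""
    states: Set[str] = {"body"}  # start once the main body has been recognized
    for label in delayed_strokes:
        # Follow every possible transition from every currently active state.
        states = {
            nxt
            for state in states
            for nxt in TRANSITIONS.get((state, label), set())
        }
    return sorted(FINAL[s] for s in states if s in FINAL)
```

For example, `recognize(["dot_below"])` keeps both `jim` and `che_1` active, but only `jim` is a character state, so the result is ج; two more `dot_below` strokes would instead end in چ.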