An effective and conceptually simple feature representation for off-line text-independent writer identification


Bibliographic Details
Published in Expert Systems with Applications, Vol. 123, pp. 357-376
Main Authors Chahi, Abderrazak; El merabet, Youssef; Ruichek, Yassine; Touahni, Raja
Format Journal Article
Language English
Published New York Elsevier Ltd 01.06.2019
Elsevier BV
Summary:
• A novel system for off-line text-independent writer identification is proposed.
• Effective feature extraction and dimensionality reduction methods are proposed.
• Experiments are conducted on four challenging datasets (two English and two Arabic).
• The proposed system provides high identification rates.
• The proposed system demonstrates good performance stability across all the datasets.

Feature engineering is an important component of machine learning and pattern recognition. It is fundamental to off-line writer identification from handwritten documents, which remains an active research topic in forensic and authentication applications. In this work, we propose an efficient yet computationally and conceptually simple framework for off-line text-independent writer identification that uses local textural features to characterize the writing style of each writer: Local Binary Patterns (LBP), Local Ternary Patterns (LTP), and Local Phase Quantization (LPQ). Our approach analyzes the writing at small observation regions: a set of connected-component sub-images is cropped from each handwriting sample (a document or a set of word/text-line images). Each connected component is treated as a texture image and subjected to feature extraction using LBP, LPQ, or LTP. After dimensionality reduction, the resulting feature image is subdivided into a number of non-overlapping regions, and the histograms of these regions are concatenated into a single feature vector. For classification, a 1-NN (nearest neighbor) classifier identifies the writer of each questioned sample based on the dissimilarity of the feature vectors computed from all components in the writing.

Experiments on the IFN/ENIT (411 writers, Arabic), AHTID/MW (53 writers, Arabic), CVL (309 writers, English), and IAM (657 writers, English) databases demonstrate that the proposed system outperforms earlier and recent state-of-the-art writer identification systems on Arabic script and performs competitively on English script.
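
The pipeline summarized above (texture coding of each component, region-wise histogram concatenation, 1-NN matching) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. It uses only the plain 3x3 LBP operator (not LTP or LPQ), and it assumes the connected-component sub-images have already been cropped. It omits the dimensionality reduction step, which the summary does not specify. The chi-square distance and the rule for combining components (the mean of each query component's nearest distance per writer) are stand-ins for the unspecified dissimilarity measure. All function names (`lbp_image`, `block_histogram`, `describe_component`, `identify_writer`) and the 2x2 grid are hypothetical.

```python
import numpy as np


def lbp_image(img):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from top-left; each one contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def block_histogram(codes, grid=(2, 2), bins=256):
    """Concatenate normalised histograms of non-overlapping regions."""
    feats = []
    for rows in np.array_split(codes, grid[0], axis=0):
        for block in np.array_split(rows, grid[1], axis=1):
            hist = np.bincount(block.ravel(), minlength=bins).astype(np.float64)
            feats.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(feats)


def describe_component(img, grid=(2, 2)):
    """Feature vector of one connected-component sub-image."""
    return block_histogram(lbp_image(img), grid=grid)


def chi_square(a, b, eps=1e-10):
    """Chi-square dissimilarity (assumed here; the summary names no measure)."""
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))


def identify_writer(query_feats, gallery):
    """Nearest-neighbour decision over all components of a questioned sample.

    gallery maps a writer id to a list of reference feature vectors.
    The aggregation rule (mean of per-component minimum distances)
    is an assumption of this sketch.
    """
    scores = {}
    for writer, refs in gallery.items():
        nearest = [min(chi_square(q, r) for r in refs) for q in query_feats]
        scores[writer] = float(np.mean(nearest))
    return min(scores, key=scores.get)
```

In this sketch, a questioned sample is described by calling `describe_component` on each of its component sub-images and passing the resulting list to `identify_writer` together with a gallery built the same way from reference samples.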
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2019.01.045