Character-level Chinese Writer Identification using Path Signature Feature, DropStroke and Deep CNN

Bibliographic Details
Published in: arXiv.org
Main Authors: Yang, Weixin; Jin, Lianwen; Liu, Manfei
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 19.05.2015

Summary: Most existing online writer-identification systems require that the text content is supplied in advance and rely on separately designed features and classifiers. The identifications are based on lines of text, entire paragraphs, or entire documents; however, these materials are not always available. In this paper, we introduce a path-signature feature into an end-to-end text-independent writer-identification system with a deep convolutional neural network (DCNN). Because deep models require a considerable amount of data to achieve good performance, we propose a data-augmentation method named DropStroke to enrich personal handwriting. Experiments were conducted on online handwritten Chinese characters from the CASIA-OLHWDB1.0 dataset, which consists of 3,866 classes from 420 writers. For each writer, we only used 200 samples for training and the remaining 3,666 for testing. The results reveal that the path-signature feature is useful for writer identification, and that the proposed DropStroke technique enhances generalization and significantly improves performance.
ISSN: 2331-8422
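
The summary names two ingredients, a path-signature feature and the DropStroke augmentation, without giving their details. The following is a minimal sketch of what each could look like, not the authors' implementation. It assumes a character is a list of strokes and a stroke is a list of (x, y) pen points. The function names, the `drop_prob` parameter, and the rule of dropping strokes independently while keeping at least one are illustrative assumptions. The signature is truncated at level 2 and computed over a whole piecewise-linear path; the record does not state the truncation level or windowing used in the paper.

```python
import random


def drop_stroke(strokes, drop_prob=0.3, rng=None):
    """Return a copy of a character with some strokes randomly removed.

    Assumption: each stroke is dropped independently with probability
    drop_prob, and at least one stroke is always kept.
    """
    rng = rng or random.Random()
    kept = [s for s in strokes if rng.random() >= drop_prob]
    if not kept:
        # Never return an empty character.
        kept = [rng.choice(strokes)]
    return kept


def signature_level2(points):
    """Path signature, truncated at level 2, of a piecewise-linear 2-D path.

    Returns [1, S1, S2, S11, S12, S21, S22], where S1 and S2 are the total
    displacements and Sij are the second-order iterated integrals.
    """
    s1 = [0.0, 0.0]
    s2 = [[0.0, 0.0], [0.0, 0.0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = (x1 - x0, y1 - y0)
        # Chen's identity: append one straight segment to the path so far.
        for i in range(2):
            for j in range(2):
                s2[i][j] += s1[i] * d[j] + 0.5 * d[i] * d[j]
        for i in range(2):
            s1[i] += d[i]
    return [1.0, s1[0], s1[1], s2[0][0], s2[0][1], s2[1][0], s2[1][1]]
```

For an L-shaped stroke such as `[(0, 0), (1, 0), (1, 1)]`, the two cross terms S12 and S21 differ, which is how the signature encodes the order in which the pen moved in x and y. Calling `drop_stroke` several times on the same character yields different stroke subsets, each of which can serve as an extra training sample.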