Attribute CNNs for word spotting in handwritten documents

Bibliographic Details
Published in International journal on document analysis and recognition Vol. 21; no. 3; pp. 199-218
Main Authors Sudholt, Sebastian; Fink, Gernot A.
Format Journal Article
Language English
Published Berlin/Heidelberg: Springer Berlin Heidelberg, 01.09.2018
Springer Nature B.V.
Summary: Word spotting has become a field of strong research interest in document image analysis in recent years. Recently, AttributeSVMs were proposed, which predict a binary attribute representation (Almazán et al. in IEEE Trans Pattern Anal Mach Intell 36(12):2552–2566, 2014). At the time, this influential method defined the state of the art in segmentation-based word spotting. In this work, we present an approach for learning attribute representations with convolutional neural networks (CNNs). By taking a probabilistic perspective on training CNNs, we derive two different loss functions for binary and real-valued word string embeddings. In addition, we propose two different CNN architectures specifically designed for word spotting. These architectures can be trained end to end. In a number of experiments, we investigate the influence of different word string embeddings and optimization strategies. We show that our attribute CNNs achieve state-of-the-art results for segmentation-based word spotting on a large variety of data sets.
ISSN: 1433-2833, 1433-2825
DOI: 10.1007/s10032-018-0295-0