Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient...

Full description

Saved in:
Bibliographic Details
Published in2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 2231 - 2239
Main Authors Chen-Yu Lee, Osindero, Simon
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.06.2016
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction, (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams, and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
ISSN:1063-6919
DOI:10.1109/CVPR.2016.245