Convolutional Character Networks
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 17.10.2019 |
Subjects | |
Summary: | Recent progress has been made on developing a unified framework for joint
text detection and recognition in natural images, but existing joint models
were mostly built on a two-stage framework involving RoI pooling, which can
degrade performance on the recognition task. In this work, we propose
convolutional character networks, referred to as CharNet, a one-stage
model that processes both tasks simultaneously in one pass. CharNet directly
outputs bounding boxes of words and characters, with corresponding character
labels. We use characters as the basic element, allowing us to overcome the main
difficulty of existing approaches that attempted to optimize text detection
jointly with an RNN-based recognition branch. In addition, we develop an
iterative character detection approach that transfers the ability of
character detection learned from synthetic data to real-world images. These
technical improvements result in a simple, compact, yet powerful one-stage
model that works reliably on multi-orientation and curved text. We evaluate
CharNet on three standard benchmarks, where it consistently outperforms the
state-of-the-art approaches [25, 24] by a large margin, e.g., with improvements
from 65.33% to 71.08% (with generic lexicon) on ICDAR 2015 and from 54.0% to
69.23% on Total-Text, in end-to-end text recognition. Code is available at:
https://github.com/MalongTech/research-charnet. |
---|---|
DOI: | 10.48550/arxiv.1910.07954 |
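
The summary states that CharNet outputs word boxes and character boxes with character labels. As a rough illustration of how such outputs could be combined into word transcriptions, here is a minimal sketch. It is not taken from the CharNet code: the names `Box`, `CharDet`, and `group_chars_into_words` are hypothetical, and it assumes axis-aligned boxes and horizontal left-to-right text, which is a simplification of the multi-orientation and curved text the paper targets.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box (hypothetical helper, not part of CharNet)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


@dataclass
class CharDet:
    """One detected character: its box and predicted label."""
    box: Box
    label: str


def group_chars_into_words(word_boxes, char_dets):
    """Assign each character to the first word box containing its center,
    then read the characters of each word from left to right."""
    words = [[] for _ in word_boxes]
    for det in char_dets:
        cx = (det.box.x1 + det.box.x2) / 2
        cy = (det.box.y1 + det.box.y2) / 2
        for i, wb in enumerate(word_boxes):
            if wb.contains(cx, cy):
                words[i].append((cx, det.label))
                break  # characters outside every word box are dropped
    # Sort by horizontal center (assumes horizontal, left-to-right text).
    return ["".join(label for _, label in sorted(chars)) for chars in words]
```

Calling `group_chars_into_words(word_boxes, char_dets)` returns one string per word box, in the order the word boxes were given. Rotated or curved text would need polygon containment and ordering along the text line instead of the x-coordinate.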