Advances in the BBN BYBLOS OCR system

Bibliographic Details
Published in Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318) pp. 337-340
Main Authors Lu, Zhidong; Schwartz, R.; Natarajan, P.; Bazzi, I.; Makhoul, J.
Format Conference Proceeding
Language English
Published IEEE 1999
Summary: We present some recent advances in the BBN BYBLOS OCR system. This OCR system can be used to recognize Arabic, Chinese, and English with high accuracy. A major change in the system is the use of continuous-density HMMs, which allow us to take advantage of a large amount of training data and to use unsupervised adaptation methods to improve accuracy in many cases, e.g., on degraded data. Another advance is the substantial increase in recognition speed. With this increased speed, the system is fast enough for practical use on Arabic and English data. The extension of the system to Chinese further demonstrated the language independence of this system and showed that this system can be used on languages with large character sets and complicated character structures. The Chinese OCR system yielded high accuracy on newspaper data.
ISBN: 9780769503189, 0769503187
DOI: 10.1109/ICDAR.1999.791793