SEGMENTATION OF A WORD BITMAP INTO INDIVIDUAL CHARACTERS OR GLYPHS DURING AN OCR PROCESS


Bibliographic Details
Main Author NIJEMCEVIC DJORDJE
Format Patent
Language English
Published 10.11.2011
Summary: An image processing apparatus is provided that includes a character chopper component, which segments words into individual characters in a bitmap of a textual image undergoing an OCR process. The character chopper component is configured to produce a set of (possibly curved) chop-lines that divide a bitmap of any given word into its individual character or glyph candidates. Cases where an input bitmap contains two separate words are handled by marking the place where those words should be split. The character segmentation algorithm computes the set of vertically oriented, curved chop-lines by considering glyph and background colors in a given word bitmap. The set is then filtered using various heuristics, in order to preserve the lines that do separate a word's glyphs and to minimize the number of those that do not.
Bibliography: Application Number: US20100776576
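
Illustrative sketch (not taken from the patent): the summary describes vertically oriented, possibly curved chop-lines that are computed from glyph and background colors and then filtered by heuristics. The record does not give the actual algorithm, so the code below swaps in a generic minimum-cost path search over a binarized bitmap as one plausible way to obtain such a line. The names `min_cost_chop_line`, `separates_ink`, and `WORD`, the 0/1 pixel encoding, and the single filter rule are all assumptions made for this example.

```python
# Hypothetical sketch: trace one curved, top-to-bottom chop-line through a
# binarized word bitmap (0 = background, 1 = glyph ink) using dynamic
# programming. This is a generic technique, not the patented method.

def min_cost_chop_line(bitmap, start_col):
    """Return (cost, path) for the cheapest line entering row 0 at start_col.

    cost is the number of ink pixels the line crosses; path[r] is the
    column the line occupies in row r. The line may shift by at most one
    column between consecutive rows, which is what lets it curve.
    """
    rows, cols = len(bitmap), len(bitmap[0])
    inf = float("inf")
    cost = [[inf] * cols for _ in range(rows)]
    prev = [[-1] * cols for _ in range(rows)]
    cost[0][start_col] = bitmap[0][start_col]
    for r in range(1, rows):
        for c in range(cols):
            # Try the straight move first so ties favor a straight line.
            for pc in (c, c - 1, c + 1):
                if 0 <= pc < cols and cost[r - 1][pc] + bitmap[r][c] < cost[r][c]:
                    cost[r][c] = cost[r - 1][pc] + bitmap[r][c]
                    prev[r][c] = pc
    # Cheapest exit column; ties go to the column nearest start_col.
    end = min(range(cols), key=lambda c: (cost[rows - 1][c], abs(c - start_col)))
    path = [end]
    for r in range(rows - 1, 0, -1):
        path.append(prev[r][path[-1]])
    path.reverse()
    return cost[rows - 1][end], path


def separates_ink(bitmap, path):
    """Assumed filter heuristic: keep a line only if ink lies on both sides."""
    left = any(any(row[:c]) for row, c in zip(bitmap, path))
    right = any(any(row[c + 1:]) for row, c in zip(bitmap, path))
    return left and right


# Two touching-in-projection glyphs: every column contains some ink, so no
# straight vertical cut is clean, but a curved line can pass between them.
WORD = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 1],
]

line_cost, line_path = min_cost_chop_line(WORD, start_col=3)
keep = line_cost == 0 and separates_ink(WORD, line_path)
```

In this toy bitmap the traced line starts in column 3 and bends left to column 2 to avoid ink. A fuller sketch would trace a candidate from every starting column and apply several filters, which is closer in spirit to the "various heuristics" the summary mentions but is not specified in this record.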