Segmentation of a word bitmap into individual characters or glyphs during an OCR process

Bibliographic Details
Main Author: NIJEMCEVIC DJORDJE
Format: Patent
Language: English
Published: 29.10.2013
Summary: An image processing apparatus is provided that includes a character chopper component, which segments words into individual characters in a bitmap of a textual image undergoing an OCR process. The character chopper component is configured to produce a set of (possibly curved) chop-lines that divide the bitmap of any given word into its individual character or glyph candidates. When an input bitmap contains two separate words, the component marks the place where those words should be split. The character segmentation algorithm computes the set of vertically oriented, curved chop-lines by considering the glyph and background colors in a given word bitmap. The set is then filtered using various heuristics, so as to preserve the lines that actually separate a word's glyphs and to minimize the number of those that do not.
Bibliography: Application Number: US20100776576
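
The summary describes two stages: computing curved, vertically oriented chop-lines from glyph and background pixels, then filtering them heuristically. The record does not disclose the actual algorithm, so the following is only a minimal sketch under stated assumptions: the word bitmap is binarized (1 = glyph pixel, 0 = background), a chop-line may shift by at most one column per row, candidates are found with a seam-style dynamic program (a swapped-in technique, not necessarily the patent's), and the filter is a single illustrative heuristic. The names `candidate_chop_lines` and `filter_chop_lines` are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]   # assumed encoding: 1 = glyph (ink), 0 = background
ChopLine = List[int]       # chop_line[r] = column the line occupies in row r


def candidate_chop_lines(bitmap: Bitmap) -> List[ChopLine]:
    """Find top-to-bottom lines that cross no ink, moving at most 1 column per row."""
    if not bitmap or not bitmap[0]:
        return []
    rows, cols = len(bitmap), len(bitmap[0])
    # cost[r][c]: fewest ink pixels crossed by any allowed line from row 0 to (r, c)
    cost = [list(bitmap[0])]
    back = [[0] * cols]  # back[r][c]: predecessor column in row r-1 (row 0 unused)
    for r in range(1, rows):
        row_cost, row_back = [], []
        for c in range(cols):
            preds = [p for p in (c - 1, c, c + 1) if 0 <= p < cols]
            # Prefer the cheapest predecessor; break ties toward the straightest move.
            best = min(preds, key=lambda p: (cost[r - 1][p], abs(p - c)))
            row_cost.append(cost[r - 1][best] + bitmap[r][c])
            row_back.append(best)
        cost.append(row_cost)
        back.append(row_back)

    lines = []
    for c in range(cols):
        if cost[rows - 1][c] == 0:  # this line touches background only
            path = [c]
            for r in range(rows - 1, 0, -1):
                path.append(back[r][path[-1]])
            path.reverse()
            lines.append(path)
    return lines


def filter_chop_lines(bitmap: Bitmap, lines: List[ChopLine]) -> List[ChopLine]:
    """Illustrative heuristic: keep one line per distinct split that has ink on both sides."""
    total_ink = sum(sum(row) for row in bitmap)
    kept, seen = [], set()
    for line in lines:
        ink_left = sum(sum(bitmap[r][:c]) for r, c in enumerate(line))
        if 0 < ink_left < total_ink and ink_left not in seen:
            seen.add(ink_left)
            kept.append(line)
    return kept


# Two slanted glyphs: no straight column is free of ink, but a curved line is.
sample = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
print(filter_chop_lines(sample, candidate_chop_lines(sample)))
```

In the sample, every column contains at least one glyph pixel, so a straight vertical cut would fail, while a line that drifts one column per row passes between the two shapes. A fuller filter would presumably weigh further cues (gap width, glyph size, word-split positions), which this sketch does not attempt.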