Representations and metrics for off-line handwriting segmentation


Bibliographic Details
Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition, pp. 428-433
Main Author: Breuel, T.M.
Format: Conference Proceeding
Language: English
Published: IEEE, 2002
Summary: Segmentation is a key step in many off-line handwriting recognition systems but, to date, there are almost no ground truth segmentation databases and no widely accepted and formally defined metrics for segmentation performance. This paper proposes a representation of segmentations and presegmentations in terms of color images. Such representations allow convenient interchange of ground truth and hypothesized segmentations in the form of standard image formats. The paper formally defines the notions of oversegmentation and undersegmentation in terms of the maximal bipartite match between corresponding pixels. It also defines a number of metrics that quantify the frequency and extent of events in handwriting like kerning, splitting, and merging of characters. It is hoped that these metrics and representations will find wider use in the community and serve as a basis for creating standard training and test databases of segmentation data.
ISBN: 9780769516929, 0769516920
DOI: 10.1109/IWFHR.2002.1030948
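
The summary describes comparing a ground truth segmentation with a hypothesized one through a bipartite match. The record does not give the paper's exact definitions, so the following is only a minimal sketch under stated assumptions: each segmentation is a 2D grid of integer labels standing in for pixel colors, label 0 is background, two segments are linked when they share at least `min_overlap` pixels, and segments left unmatched by a maximum bipartite matching are reported. The function names, the `min_overlap` threshold, and the reading of unmatched segments as signs of over- or undersegmentation are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def overlap_counts(truth, hypo, background=0):
    """Count pixels shared by each (truth label, hypothesis label) pair."""
    counts = Counter()
    for row_t, row_h in zip(truth, hypo):
        for t, h in zip(row_t, row_h):
            if t != background and h != background:
                counts[(t, h)] += 1
    return counts


def max_bipartite_matching(edges):
    """Maximum matching via augmenting paths (Kuhn's algorithm).

    `edges` maps each left node to a list of right nodes.
    Returns a dict mapping matched left nodes to right nodes.
    """
    match_right = {}  # right node -> left node currently matched to it

    def try_augment(left, seen):
        for right in edges.get(left, ()):
            if right in seen:
                continue
            seen.add(right)
            # Take a free right node, or re-route its current partner.
            if right not in match_right or try_augment(match_right[right], seen):
                match_right[right] = left
                return True
        return False

    for left in edges:
        try_augment(left, set())
    return {left: right for right, left in match_right.items()}


def compare_segmentations(truth, hypo, min_overlap=1, background=0):
    """Match truth segments to hypothesis segments and list the leftovers."""
    counts = overlap_counts(truth, hypo, background)
    edges = {}
    for (t, h), n in counts.items():
        if n >= min_overlap:  # assumed linking rule, not from the paper
            edges.setdefault(t, []).append(h)
    matching = max_bipartite_matching(edges)
    truth_labels = {t for row in truth for t in row if t != background}
    hypo_labels = {h for row in hypo for h in row if h != background}
    return {
        "matching": matching,
        "unmatched_truth": truth_labels - set(matching),
        "unmatched_hypothesis": hypo_labels - set(matching.values()),
    }


# One truth character split into two hypothesis segments.
split_case = compare_segmentations([[1, 1, 2, 2]], [[1, 2, 3, 3]])

# Two truth characters merged into one hypothesis segment.
merge_case = compare_segmentations([[1, 1, 2, 2]], [[5, 5, 5, 5]])
```

In this sketch, `split_case` leaves one hypothesis segment unmatched (an extra cut) and `merge_case` leaves one truth segment unmatched (a missing cut). In practice the label grids would be decoded from the color images the summary mentions.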