A segmentation-free approach for keyword search in historical typewritten documents

Bibliographic Details
Published in Eighth International Conference on Document Analysis and Recognition (ICDAR'05), Vol. 1, pp. 54-58
Main Authors Gatos, B., Konidaris, T., Ntzios, K., Pratikakis, I., Perantonis, S.J.
Format Conference Proceeding
Language English
Published IEEE 2005
Summary: In this paper, we propose a novel segmentation-free approach for keyword search in historical typewritten documents combining image preprocessing, synthetic data creation, word spotting and user feedback technologies. Our aim is to search for keywords typed by the user in a large collection of digitized typewritten historical documents. The proposed method is based on: (i) image preprocessing for image binarization and enhancement, noisy border and frame removal, orientation and skew correction; (ii) creation of synthetic image words from keywords typed by the user; (iii) word segmentation using dynamic parameters; (iv) efficient feature extraction for each image word; and (v) a retrieval procedure that is optimized by user feedback. Experimental results prove the efficiency of the proposed approach.
ISBN: 9780769524207, 0769524206
ISSN: 1520-5363, 2379-2140
DOI: 10.1109/ICDAR.2005.30