Automatic Transcription of Handwritten Medieval Documents

Bibliographic Details
Published in	2009 15th International Conference on Virtual Systems and Multimedia, pp. 137-142
Main Authors	Fischer, A., Wuthrich, M., Liwicki, M., Frinken, V., Bunke, H., Viehhauser, G., Stolz, M.
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2009
Subjects	Artificial intelligence; Computer science; Computer Vision for Cultural Heritage; Handwriting recognition; Hidden Markov models; Mathematics; Multimedia systems; Neural networks; Software libraries; Vocabulary; Writing

Summary:	The automatic transcription of historical documents is vital for the creation of digital libraries. In order to make images of valuable old documents amenable to browsing, a transcription of high accuracy is needed. In this paper, two state-of-the-art recognizers originally developed for modern scripts are applied to medieval documents. The first is based on Hidden Markov Models and the second uses a Neural Network with a bidirectional Long Short-Term Memory. On a dataset of word images extracted from a medieval manuscript of the 13th century, written in Middle High German by several writers, it is demonstrated that a word accuracy of 93.32% is achievable. This is far above the word accuracy of 77.12% achieved with the same recognizers for unconstrained modern scripts written in English. These results encourage the development of real-world systems for automatic transcription of historical documents with a view to image and text browsing in digital libraries.
ISBN:	0769537901, 9780769537900
DOI:	10.1109/VSMM.2009.26