Large scalability in document image matching using text retrieval

Bibliographic Details
Published in Pattern Recognition Letters Vol. 33; no. 7; pp. 863-871
Main Author Moraleda, Jorge
Format Journal Article
Language English
Published Elsevier B.V. 01.05.2012
Summary: ► We cast image matching as text retrieval. ► We have indexed more than 500,000 images. ► We provide recognition in approximately 500 ms.
We present a method that addresses image matching from partial, blurry images by casting it as a text retrieval problem. This allows us to leverage existing text document retrieval techniques and achieve efficiency and scalability similar to those of text search applications. As an initial application, we present a document image matching system in which the user supplies a query image of a small patch of a paper document, taken with a cell phone camera, and the system returns a label identifying the original electronic document if it is found in a previously indexed collection. We have implemented our method in a client-server architecture. Feature computation on a mobile client takes under 100 ms, while end-to-end document recognition on a collection of more than 4300 pages requires approximately 500 ms per image. Approximately 170 ms of this is connection time and is thus subject to network speed variations. We conclude by presenting scalability results on a collection of nearly 500,000 documents.
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2011.10.013
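
The summary's central idea, treating image features as words and looking them up the way a text search engine would, can be illustrated with a minimal sketch. This record does not describe the article's actual features or index, so everything here is an assumption made for illustration: the `InvertedIndex` class, the string tokens such as `"w17"` standing in for quantized image features, the page labels, and the `min_votes` threshold are all hypothetical.

```python
from collections import Counter, defaultdict


class InvertedIndex:
    """Toy inverted index mapping a feature token to the labels containing it.

    Hypothetical illustration only; not the indexing scheme of the article.
    """

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, label, tokens):
        # Record that each distinct token occurs in the document `label`.
        for token in set(tokens):
            self.postings[token].add(label)

    def query(self, tokens, min_votes=2):
        # Each distinct query token votes for every label that contains it.
        votes = Counter()
        for token in set(tokens):
            for label in self.postings.get(token, ()):
                votes[label] += 1
        if not votes:
            return None
        label, count = votes.most_common(1)[0]
        # Return no match when the evidence is too weak (assumed threshold).
        return label if count >= min_votes else None


index = InvertedIndex()
index.add("report.pdf#p1", ["w17", "w42", "w99", "w03"])
index.add("memo.pdf#p2", ["w42", "w55", "w61"])

# A partial patch shares only some tokens with the indexed page.
match = index.query(["w17", "w99", "w70"])
```

Because a query only touches the posting lists of its own tokens, lookup cost depends on the number of query tokens and the length of their posting lists rather than on the total collection size, which is the property that makes text retrieval techniques attractive for large collections.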