Large scalability in document image matching using text retrieval

Bibliographic Details
Published in	Pattern Recognition Letters Vol. 33; no. 7; pp. 863-871
Main Author	Moraleda, Jorge
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.05.2012
Subjects	Document image matching; Image based document retrieval; Image matching
Summary:	► We cast image matching as text retrieval. ► We have indexed more than 500,000 images. ► We provide recognition in approximately 500 ms. We present a method that addresses image matching from partial, blurry images by casting it as a text retrieval problem. This allows us to leverage existing text document retrieval techniques and achieve efficiency and scalability similar to those of text search applications. As an initial application, we present a document image matching system in which the user supplies a query image of a small patch of a paper document, taken with a cell phone camera, and the system returns a label identifying the original electronic document if it is found in a previously indexed collection. We have implemented our method in a client-server architecture. Feature computation on a mobile client takes under 100 ms, while end-to-end document recognition on a collection of more than 4300 pages requires approximately 500 ms per image. Approximately 170 ms of that is connection time and is thus subject to network speed variations. We conclude by presenting scalability results on a collection of nearly 500,000 documents.
ISSN:	0167-8655 1872-7344
DOI:	10.1016/j.patrec.2011.10.013
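
Illustration (not part of the catalog record): the summary's central idea, treating image features as words so that matching becomes text retrieval, can be sketched with a toy inverted index. Everything here is an assumption made for illustration: the `InvertedIndex` class, the token names such as `w17`, and the simple vote-counting score are hypothetical and are not taken from the article, which does not describe its features or ranking in this summary.

```python
from collections import Counter, defaultdict


class InvertedIndex:
    """Toy inverted index: maps each token to the document labels containing it.

    Hypothetical illustration only; the article's actual features and
    scoring are not described in the summary.
    """

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of document labels

    def add(self, label, tokens):
        """Index one document under the given label."""
        for token in tokens:
            self.postings[token].add(label)

    def query(self, tokens):
        """Return (label, votes) pairs, most votes first; [] if nothing matches."""
        votes = Counter()
        for token in set(tokens):
            # .get avoids creating empty posting lists for unseen tokens
            for label in self.postings.get(token, ()):
                votes[label] += 1
        return votes.most_common()


index = InvertedIndex()
# Assumed: each page has already been reduced to discrete "word" tokens.
index.add("doc-A", ["w17", "w42", "w99", "w03"])
index.add("doc-B", ["w42", "w55", "w61", "w08"])

# A partial, noisy query patch yields only some of a page's tokens.
ranked = index.query(["w17", "w99", "w70"])
best_label = ranked[0][0] if ranked else None
```

Because a lookup touches only the posting lists of the query tokens, the cost depends on how many documents share those tokens rather than on the collection size, which is the property that text search engines rely on for scalability.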