DCT Inspired Feature Transform for Image Retrieval and Reconstruction

Bibliographic Details
Published in	IEEE transactions on image processing Vol. 25; no. 9; pp. 4406-4420
Main Authors	Wang, Yunhe; Shi, Miaojing; You, Shan; Xu, Chao
Format	Journal Article
Language	English
Published	United States IEEE 01.09.2016 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	DCT intrinsic orientation; Deformation resistance; Descriptions; Detectors; DIFT; Discrete cosine transforms; Feature extraction; image matching; Image reconstruction; Image representation; Image retrieval; Invariants; Orientation; Redundancy; Retrieval; Transforms
Summary:	Scale invariant feature transform (SIFT) is effective for representing images in computer vision tasks, as one of the most resistant feature descriptions to common image deformations. However, two issues should be addressed: first, feature description based on gradient accumulation is not compact and contains redundancies; second, multiple orientations are often extracted from one local region and therefore produce multiple descriptions, which is not good for memory efficiency. To resolve these two issues, this paper introduces a novel method to determine the dominant orientation for multiple-orientation cases, named discrete cosine transform (DCT) intrinsic orientation, and a new DCT inspired feature transform (DIFT). In each local region, it first computes a unique DCT intrinsic orientation via DCT matrix and rotates the region accordingly, and then describes the rotated region with partial DCT matrix coefficients to produce an optimized low-dimensional descriptor. We test the accuracy and robustness of DIFT on real image matching. Afterward, extensive applications performed on public benchmarks for visual retrieval show that using DCT intrinsic orientation achieves performance on a par with SIFT, but with only 60% of its features; replacing the SIFT description with DIFT reduces dimensions from 128 to 32 and improves precision. Image reconstruction resulting from DIFT is presented to show another of its advantages over SIFT.
ISSN:	1057-7149, 1941-0042
DOI:	10.1109/TIP.2016.2590323
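
Illustrative sketch (not taken from the paper): the summary says DIFT describes a rotated local region with partial DCT matrix coefficients to obtain a 32-dimensional descriptor. The Python below shows one way that general idea could look, under assumptions that are not confirmed by this record: an 8x8 patch, an orthonormal DCT-II, keeping the 32 lowest-frequency coefficients ordered by u + v, dropping the DC term, and L2-normalising. The function names are hypothetical. The DCT intrinsic orientation and the rotation step are not sketched, because the record does not say how they are computed.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix C of size n x n (C times its transpose is the identity)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def dct2(patch):
    """2D DCT of a square patch P, computed as C * P * C^T."""
    c = dct_matrix(len(patch))
    return matmul(matmul(c, patch), transpose(c))


def low_freq_descriptor(patch, dims=32):
    """Hypothetical descriptor: the `dims` lowest-frequency DCT coefficients.

    Assumptions (not from the source): coefficients are ordered by u + v,
    the DC term at (0, 0) is skipped, and the result is L2-normalised.
    """
    coeffs = dct2(patch)
    n = len(coeffs)
    if dims > n * n - 1:
        raise ValueError("patch too small for the requested descriptor length")
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda uv: (uv[0] + uv[1], uv[0]))
    picked = [coeffs[u][v] for u, v in order[1:dims + 1]]  # order[0] is the DC term
    norm = math.sqrt(sum(x * x for x in picked)) or 1.0
    return [x / norm for x in picked]
```

For example, `low_freq_descriptor(patch)` on an 8x8 list of floats returns a list of 32 values. Because the DC term is dropped in this sketch, adding a constant brightness offset to the patch leaves the result unchanged; whether DIFT itself treats the DC term this way is not stated in the record.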