Spatial selection for attentional visual tracking

Bibliographic Details
Published in	2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8
Main Authors	Ming Yang, Junsong Yuan, Ying Wu
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2007
Subjects	Computer vision; Data mining; Humans; Motion estimation; Psychology; Robustness; Target tracking; Uncertainty; Visual perception; Visual system
ISBN	9781424411795, 1424411793
ISSN	1063-6919
DOI	10.1109/CVPR.2007.383178

Summary:	Long-duration tracking of general targets is quite challenging for computer vision, because in practice the target's visual appearance may vary widely and unconstrained environments may be cluttered and distracting, although tracking has never been a challenge for the human visual system. Psychological and cognitive findings indicate that human perception is attentional and selective, and that both early attentional selection, which may be innate, and late attentional selection, which may be learned, are necessary for human visual tracking. This paper proposes a new visual tracking approach that reflects some aspects of spatial selective attention, and presents a novel attentional visual tracking (AVT) algorithm. In AVT, the early selection process extracts a pool of attentional regions (ARs), defined as salient image regions with good localization properties, and the late selection process dynamically identifies a subset of discriminative attentional regions (D-ARs) through discriminative learning on historical data on the fly. The computationally demanding matching of the AR pool is done efficiently by using the idea behind the locality-sensitive hashing (LSH) technique. The proposed AVT algorithm is general, robust and computationally efficient, as shown in extensive experiments on a large variety of real-world video.
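
The summary mentions borrowing the idea of locality-sensitive hashing to match the AR pool but gives no detail. As a rough illustration only, the sketch below shows one generic LSH scheme (random hyperplanes, which approximate angular similarity) for looking up a region descriptor among stored ones. It is not the paper's matching procedure: the class name HyperplaneLSH, the helper sq_dist, the parameters n_bits and n_tables, and the use of plain lists as region descriptors are all assumptions made for this example.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class HyperplaneLSH:
    """Random-hyperplane LSH index (a generic scheme, assumed for illustration)."""

    def __init__(self, dim, n_bits=8, n_tables=4, seed=0):
        rng = random.Random(seed)
        # Each hash table owns n_bits random hyperplanes through the origin.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.items = {}

    def _key(self, planes, vec):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, item_id, vec):
        """Store a descriptor under item_id in every hash table."""
        self.items[item_id] = vec
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(item_id)

    def query(self, vec):
        """Return the id of the closest stored item sharing a bucket, or None."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), []))
        if not candidates:
            return None
        # Only bucket-mates are compared exactly, not the whole pool.
        return min(candidates, key=lambda i: sq_dist(self.items[i], vec))


# Hypothetical descriptors standing in for stored attentional regions.
index = HyperplaneLSH(dim=4)
index.add("region_a", [1.0, 0.0, 0.0, 0.0])
index.add("region_b", [-1.0, 0.0, 0.0, 0.0])
print(index.query([1.0, 0.0, 0.0, 0.0]))
```

The point of the design is that a query is compared exactly only against items that land in the same bucket in at least one table, so the cost of matching grows with bucket size rather than with the size of the whole pool. More bits per table make buckets smaller and more tables reduce missed neighbours.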