Spatial selection for attentional visual tracking

Bibliographic Details
Published in: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8
Main Authors: Ming Yang, Junsong Yuan, Ying Wu
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2007
ISBN: 9781424411795, 1424411793
ISSN: 1063-6919
DOI: 10.1109/CVPR.2007.383178

Summary: Long-duration tracking of general targets is quite challenging for computer vision, because in practice the target may undergo large uncertainties in its visual appearance and unconstrained environments may be cluttered and distracting, although tracking has never been a challenge to the human visual system. Psychological and cognitive findings indicate that human perception is attentional and selective, and that both early attentional selection, which may be innate, and late attentional selection, which may be learned, are necessary for human visual tracking. This paper proposes a new visual tracking approach that reflects some aspects of spatial selective attention, and presents a novel attentional visual tracking (AVT) algorithm. In AVT, the early selection process extracts a pool of attentional regions (ARs), defined as salient image regions with good localization properties, and the late selection process dynamically identifies a subset of discriminative attentional regions (D-ARs) through discriminative learning on historical data on the fly. The computationally demanding matching of the AR pool is done in an efficient and innovative way using the idea behind locality-sensitive hashing (LSH). The proposed AVT algorithm is general, robust and computationally efficient, as shown in extensive experiments on a large variety of real-world video.
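The summary says that matching against the AR pool borrows the idea of locality-sensitive hashing but does not give the details. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below indexes feature vectors with random-hyperplane sign hashes and re-ranks the colliding candidates by exact squared Euclidean distance. The hash family, the class name `LSHIndex`, and the parameters `num_tables` and `num_bits` are all assumptions chosen for brevity.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random Gaussian hyperplanes; each one contributes one hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_key(vec, planes):
    """Sign pattern of the projections of vec onto each hyperplane."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0.0 else 0
        for plane in planes
    )


def sq_dist(a, b):
    """Exact squared Euclidean distance, used only to re-rank candidates."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class LSHIndex:
    """Hypothetical multi-table LSH index over fixed-length feature vectors."""

    def __init__(self, dim, num_tables=8, num_bits=6, seed=0):
        # Each table has its own hyperplanes and its own bucket map.
        self.tables = []
        for t in range(num_tables):
            planes = make_hyperplanes(num_bits, dim, seed + t)
            self.tables.append((planes, defaultdict(list)))
        self.items = []

    def add(self, vec):
        """Store vec in every table and return its integer id."""
        idx = len(self.items)
        self.items.append(vec)
        for planes, buckets in self.tables:
            buckets[hash_key(vec, planes)].append(idx)
        return idx

    def query(self, vec):
        """Return the id of the closest colliding item, or None if no bucket matches."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(hash_key(vec, planes), ()))
        if not candidates:
            return None
        return min(candidates, key=lambda i: sq_dist(self.items[i], vec))
```

In use, one would call `add` once per stored descriptor and `query` once per new descriptor; only the items that share a bucket with the query are compared exactly, which is where the savings over a full linear scan come from. More tables raise the chance that a true neighbor collides, while more bits per table make buckets smaller.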