BM³E: Discriminative Density Propagation for Visual Tracking

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 11, pp. 2030-2044
Main Authors: Sminchisescu, C., Kanaujia, A., Metaxas, D.N.
Format: Journal Article
Language: English
Published: New York, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2007

Summary: We introduce BM³E, a conditional Bayesian mixture of experts Markov model, that achieves consistent probabilistic estimates for discriminative visual tracking. The model applies to problems of temporal and uncertain inference and represents the unexplored bottom-up counterpart of pervasive generative models estimated with Kalman filtering or particle filtering. Instead of inverting a nonlinear generative observation model at runtime, we learn to cooperatively predict complex state distributions directly from descriptors that encode image observations (typically, bag-of-feature global image histograms or descriptors computed over regular spatial grids). These are integrated in a conditional graphical model in order to enforce temporal smoothness constraints and allow a principled management of uncertainty. The algorithms combine sparsity, mixture modeling, and nonlinear dimensionality reduction for efficient computation in high-dimensional continuous state spaces. The combined system automatically self-initializes and recovers from failure. The research has three contributions: (1) we establish the density propagation rules for discriminative inference in continuous, temporal chain models; (2) we propose flexible supervised and unsupervised algorithms to learn feed-forward, multivalued contextual mappings (multimodal state distributions) based on compact, conditional Bayesian mixture of experts models; and (3) we validate the framework empirically for the reconstruction of 3D human motion in monocular video sequences. Our tests on both real and motion-capture-based sequences show significant performance gains with respect to competing nearest neighbor, regression, and structured prediction methods.
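
A minimal sketch of the discriminative propagation rule referred to in contribution (1), written in generic notation that is assumed here rather than taken from this record (x_t for the 3D state, r_t for the image descriptor at time t, and R_t = (r_1, ..., r_t)):

    p(x_t \mid R_t) = \int p(x_t \mid x_{t-1}, r_t)\, p(x_{t-1} \mid R_{t-1})\, dx_{t-1}

where the per-frame conditional p(x_t | x_{t-1}, r_t) is the learned, feed-forward quantity. In a conditional Bayesian mixture of experts, one illustrative form over the context z_t = (x_{t-1}, r_t) is

    p(x_t \mid z_t) = \sum_{m=1}^{M} g_m(z_t)\, \mathcal{N}\!\left(x_t \mid W_m \phi(z_t), \Sigma_m\right)

with input-dependent gates g_m(z_t) that sum to one and linear-Gaussian experts over basis functions \phi; the exact gate and expert parameterizations shown here are assumptions for illustration, not details stated in this record. Unlike generative filtering, the observation r_t enters through this learned conditional rather than through a runtime inversion of an observation likelihood.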
ISSN: 0162-8828
EISSN: 1939-3539
DOI: 10.1109/TPAMI.2007.1111