Tracking People's Hands and Feet Using Mixed Network AND/OR Search

Bibliographic Details
Published in	IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, no. 5, pp. 1248-1262
Main Authors	Morariu, V. I., Harwood, D., Davis, L. S.
Format	Journal Article
Language	English
Published	Los Alamitos, CA: IEEE Computer Society, 01.05.2013
Subjects	Applied sciences | Artificial intelligence | Basketball | Computer science; control theory; systems | Computer systems and distributed systems. User interface | Databases, Factual | Discriminant Analysis | Exact sciences and technology | Extremities | Foot - physiology | Graphical models | Hand - physiology | Humans | Image Processing, Computer-Assisted - methods | Information retrieval. Graph | Least-Squares Analysis | motion | Motor Activity - physiology | Pattern analysis | Pattern Recognition, Automated - methods | Pattern recognition. Digital image processing. Computational geometry | pictorial structures | Probabilistic logic | Search problems | Software | Theoretical computing | Tracking | Training | Occlusion | Lazy programming | High resolution | Image processing | Video signal | Pedestrian | Activity | Optimization | Multiple image | Robustness | Camera | Deterministic approach | Head | Target tracking | Image resolution | Information retrieval | Foot | Multiple view | Implicit enumeration method | Optimal solution | Probabilistic net | Complex network | Occultation | Remote supervision
Summary:	We describe a framework that leverages mixed probabilistic and deterministic networks and their AND/OR search space to efficiently find and track the hands and feet of multiple interacting humans in 2D from a single camera view. Our framework detects and tracks multiple people's heads, hands, and feet through partial or full occlusion; requires few constraints (does not require multiple views, high image resolution, knowledge of performed activities, or large training sets); and makes use of constraints and AND/OR Branch-and-Bound with lazy evaluation and carefully computed bounds to efficiently solve the complex network that results from the consideration of interperson occlusion. Our main contributions are: 1) a multiperson part-based formulation that emphasizes extremities and allows for the globally optimal solution to be obtained in each frame, and 2) an efficient and exact optimization scheme that relies on AND/OR Branch-and-Bound, lazy factor evaluation, and factor cost sensitive bound computation. We demonstrate our approach on three datasets: the public single person HumanEva dataset, outdoor sequences where multiple people interact in a group meeting scenario, and outdoor one-on-one basketball videos. The first dataset demonstrates that our framework achieves state-of-the-art performance in the single person setting, while the last two demonstrate robustness in the presence of partial and full occlusion and fast nontrivial motion.
ISSN:	0162-8828, 1939-3539, 2160-9292
DOI:	10.1109/TPAMI.2012.187
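
Illustrative sketch (not the authors' implementation): the summary names three ingredients, namely hard constraints, AND/OR Branch-and-Bound, and lazy factor evaluation. The toy Python example below shows how they can fit together on a tiny tree-structured body model. Everything in it is an assumption made for illustration: the part names, the candidate positions, the Manhattan-distance cost, the reach limit, and the helper names `LazyFactor` and `solve`. It assumes non-negative integer costs, so the cost accumulated so far is a valid lower bound. It does not model interperson occlusion or the paper's bound computation.

```python
import math


class LazyFactor:
    """Wraps a cost function; evaluates entries on demand and caches them."""

    def __init__(self, fn):
        self.fn = fn
        self.cache = {}
        self.evaluations = 0  # number of distinct entries actually computed

    def __call__(self, *args):
        if args not in self.cache:
            self.evaluations += 1
            self.cache[args] = self.fn(*args)
        return self.cache[args]


def solve(var, parent_val, tree, domains, factor, allowed, bound=math.inf):
    """OR node for `var`: best (cost, assignment) of the subtree rooted at
    `var`, or (inf, None) if no solution costs strictly less than `bound`."""
    best_cost, best_assign = math.inf, None
    for val in domains[var]:
        # Deterministic constraint: reject without evaluating any cost.
        if not allowed(var, val, parent_val):
            continue
        cost = factor(var, val, parent_val)  # evaluated lazily
        assign = {var: val}
        # AND node: the children are independent once `var` is fixed,
        # so each child subtree is solved separately and the costs add up.
        for child in tree.get(var, []):
            limit = min(bound, best_cost)
            if cost >= limit:  # lower bound already too high: prune
                break
            c_cost, c_assign = solve(child, val, tree, domains,
                                     factor, allowed, limit - cost)
            cost += c_cost
            if c_assign is not None:
                assign.update(c_assign)
        if cost < min(bound, best_cost):
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical model: a head with one hand and one foot attached to it.
tree = {"head": ["hand", "foot"]}
domains = {
    "head": [(0, 0), (10, 0)],
    "hand": [(1, 2), (9, 1), (5, 5)],
    "foot": [(0, 6), (10, 7)],
}
expected_length = {"hand": 3, "foot": 6}  # assumed limb lengths
MAX_REACH = 8                             # assumed hard constraint


def allowed(var, val, parent_val):
    return parent_val is None or manhattan(val, parent_val) <= MAX_REACH


def raw_cost(var, val, parent_val):
    if parent_val is None:
        return 0
    return abs(manhattan(val, parent_val) - expected_length[var])


factor = LazyFactor(raw_cost)
best_cost, best_assign = solve("head", None, tree, domains, factor, allowed)
print(best_cost, best_assign, factor.evaluations)
```

In this toy run the constraint check and the bound together mean that only a few of the 12 possible factor entries are ever computed, which is the effect lazy evaluation is meant to achieve.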