Action detection in complex scenes with spatial and temporal ambiguities

Bibliographic Details
Published in	2009 IEEE 12th International Conference on Computer Vision, pp. 128-135
Main Authors	Yuxiao Hu, Liangliang Cao, Fengjun Lv, Shuicheng Yan, Yihong Gong, Thomas S. Huang
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2009
Subjects	Computer vision; Detectors; Humans; Layout; Machine learning; Merchandise; Performance evaluation; Simulated annealing; Spatial databases; Support vector machines
Summary:	In this paper, we investigate the detection of semantic human actions in complex scenes. Unlike conventional action recognition in well-controlled environments, action detection in complex scenes suffers from cluttered backgrounds, heavy crowds, occluded bodies, and spatial-temporal boundary ambiguities caused by imperfect human detection and tracking. Conventional algorithms are likely to fail under such spatial-temporal ambiguities. In this work, the candidate regions of an action are treated as a bag of instances. A novel multiple-instance learning framework, named SMILE-SVM (Simulated annealing Multiple Instance LEarning Support Vector Machines), is then presented for learning a human action detector from imprecise action locations. SMILE-SVM is extensively evaluated, with satisfactory performance, on two tasks: (1) human action detection on a public video action database with cluttered backgrounds, and (2) a real-world problem of detecting whether customers in a shopping mall show an intention to purchase the merchandise on the shelf (even if they do not ultimately buy it). In addition, the complementary nature of motion and appearance features in action detection is validated, yielding improved performance in our experiments.
ISBN:	9781424444205 1424444209
ISSN:	1550-5499 2380-7504
DOI:	10.1109/ICCV.2009.5459153
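
The bag-of-instances formulation mentioned in the summary can be illustrated with a minimal sketch. This is not the SMILE-SVM method itself: the summary does not give the scoring rule or the simulated annealing procedure, so the sketch only shows the standard multiple-instance assumption that a bag is positive when at least one of its instances is positive. The names `Bag`, `linear_score`, and `predict_bag`, the weights, and the feature values are all hypothetical, and the linear scorer merely stands in for a trained SVM.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical feature vector describing one candidate region.
Instance = Sequence[float]


@dataclass
class Bag:
    """One action candidate: several imprecise spatio-temporal regions."""
    instances: List[Instance]
    label: int  # +1 if the action occurs somewhere in the bag, -1 otherwise


def linear_score(weights: Sequence[float], bias: float) -> Callable[[Instance], float]:
    """Return a linear instance scorer w.x + b (assumed stand-in for a trained SVM)."""
    def score(x: Instance) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def predict_bag(bag: Bag, score: Callable[[Instance], float]) -> int:
    """Standard MIL rule: a bag is positive if its best-scoring instance is positive."""
    return 1 if max(score(x) for x in bag.instances) > 0 else -1


# Toy data: weights and features are made up for illustration only.
scorer = linear_score(weights=[1.0, -1.0], bias=-0.5)
positive_bag = Bag(instances=[[0.1, 0.9], [2.0, 0.2], [0.3, 0.4]], label=1)
negative_bag = Bag(instances=[[0.1, 0.9], [0.2, 0.3]], label=-1)

predictions = [predict_bag(b, scorer) for b in (positive_bag, negative_bag)]
print(predictions)
```

Scoring a bag by its maximum instance score means that only one candidate region needs to cover the action well, which is why this formulation tolerates imprecise localization from detection and tracking.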