Temporally Consistent 3D Human Pose Estimation Using Dual 360° Cameras

This paper presents a 3D human pose estimation system that uses a stereo pair of 360° sensors to capture the complete scene from a single location. The approach combines the advantages of omnidirectional capture, the accuracy of multiple view 3D pose estimation and the portability of monocular acqui...

Full description

Saved in:
Bibliographic Details
Published inProceedings / IEEE Workshop on Applications of Computer Vision pp. 81 - 90
Main Authors Shere, Matthew, Kim, Hansung, Hilton, Adrian
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.01.2021
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:This paper presents a 3D human pose estimation system that uses a stereo pair of 360° sensors to capture the complete scene from a single location. The approach combines the advantages of omnidirectional capture, the accuracy of multiple view 3D pose estimation and the portability of monocular acquisition. Joint monocular belief maps for joint locations are estimated from 360° images and are used to fit a 3D skeleton to each frame. Temporal data association and smoothing is performed to produce accurate 3D pose estimates throughout the sequence. We evaluate our system on the Panoptic Studio dataset, as well as real 360° video for tracking multiple people, demonstrating an average Mean Per Joint Position Error of 12.47cm with 30cm baseline cameras. We also demonstrate improved capabilities over perspective and 360° multi-view systems when presented with limited camera views of the subject.
ISSN:2642-9381
DOI:10.1109/WACV48630.2021.00013