Inside-Out Multiperson 3-D Pose Estimation Using the Panoramic Camera Capture System

Bibliographic Details
Published in IEEE Transactions on Instrumentation and Measurement, Vol. 73, pp. 1-17
Main Authors Qin, Haidong; Dai, Yanran; Jiang, Yuqi; Li, Dongdong; Liu, Hongwei; Zhang, Yong; Li, Jing; Yang, Tao
Format Journal Article
Language English
Published New York IEEE 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Estimating the 3-D poses of multiple people using multiple cameras is a significant research topic in vision-based measurement. In contrast to the classical outside-in camera capture system, an inside-out panoramic camera capture system can cover larger scenes with fewer cameras. This extends 3-D human pose measurement beyond small spaces such as motion capture studios, for example to intelligent security surveillance in large outdoor squares or to capturing athletes' movements in large-scale sports scenes. However, existing inside-out 3-D human pose estimation methods that use panoramic cameras encounter challenges, particularly in multiperson occlusion scenarios. To address these problems, this article presents a novel inside-out multiperson 3-D human pose estimation method that uses only a few calibrated panoramic cameras. Specifically, we first propose a cross-view multiperson matching algorithm based on the epipolar geometry constraint of panoramic cameras to make human body matching across viewpoints more robust. Then, we take advantage of multiple panoramic cameras and introduce a multiview human pose clustering and fusion algorithm to improve the average recall (AR) of 3-D human pose estimation. In addition, we propose a multiview human pose nonlinear optimization algorithm that jointly optimizes the weighted reprojection errors of the estimated 3-D human poses, which further improves the average precision (AP). Extensive experiments on the public Panoptic Studio dataset and on self-built real and simulated datasets demonstrate that our method can estimate 3-D human poses inside-out using multiple panoramic cameras. Compared with state-of-the-art methods, the complementarity of multiple cameras greatly reduces omitted 3-D human poses, and the use of multicamera observation information substantially improves the precision of 3-D human pose estimation.
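
The epipolar constraint that the summary mentions for cross-view matching can be illustrated with a minimal sketch. This is not the article's algorithm: it only shows the standard spherical-camera form of the constraint, in which a pixel is replaced by a unit bearing vector b and a correct correspondence satisfies b2^T E b1 = 0 with E = [t]x R. The equirectangular pixel convention, the relative pose convention X2 = R X1 + t, and all function names are assumptions made for illustration.

```python
import numpy as np


def equirect_to_bearing(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit bearing vector.

    Assumed convention (not from the article): longitude runs from -pi to pi
    left to right, latitude from +pi/2 to -pi/2 top to bottom, y up, z forward.
    """
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    return np.array([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)])


def bearing_to_equirect(d, width, height):
    """Inverse of equirect_to_bearing under the same assumed convention."""
    d = np.asarray(d, dtype=float)
    d = d / np.linalg.norm(d)
    lon = np.arctan2(d[0], d[2])
    lat = np.arcsin(d[1])
    return (lon / (2.0 * np.pi) + 0.5) * width, (0.5 - lat / np.pi) * height


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ x equals np.cross(t, x)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def essential_matrix(R, t):
    """Essential matrix for the assumed relative pose X2 = R @ X1 + t."""
    return skew(np.asarray(t, dtype=float)) @ np.asarray(R, dtype=float)


def epipolar_residual(b1, b2, E):
    """Sine of the angle between bearing b2 and the epipolar plane of b1.

    The value is 0 for a geometrically consistent pair and grows toward 1
    as b2 moves away from the epipolar plane.
    """
    n = E @ b1  # normal of the epipolar plane, expressed in camera 2
    return abs(float(b2 @ n)) / (np.linalg.norm(n) * np.linalg.norm(b2) + 1e-12)
```

In a matching step, a residual of this kind could be evaluated per joint between detections in two views and averaged into a pairwise cost, with small costs indicating the same person. How the article weights and combines such terms is not stated in the summary.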
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2023.3346490