MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes

Bibliographic Details
Main Authors: Li, Zhi; Shimada, Soshi; Schiele, Bernt; Theobalt, Christian; Golyanik, Vladislav
Format: Journal Article
Language: English
Published: 17.08.2022
DOI: 10.48550/arxiv.2208.08439

Summary: International Conference on 3D Vision 2022 (Oral). 3D human motion capture from monocular RGB images that respects interactions of a subject with complex and possibly deformable environments is a very challenging, ill-posed and under-explored problem. Existing methods address it only weakly and do not model the surface deformations that often occur when humans interact with scene surfaces. In contrast, this paper proposes MoCapDeform, a new framework for monocular 3D human motion capture that is the first to explicitly model non-rigid deformations of a 3D scene for improved 3D human pose estimation and deformable environment reconstruction. MoCapDeform accepts a monocular RGB video and a 3D scene mesh aligned in the camera space. It first localises a subject in the input monocular video along with dense contact labels using a new raycasting-based strategy. Next, our human-environment interaction constraints are leveraged to jointly optimise global 3D human poses and non-rigid surface deformations. MoCapDeform achieves higher accuracy than competing methods on several datasets, including our newly recorded one with deforming background scenes.
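The summary describes a first stage that localises the subject and derives dense contact labels by raycasting against the scene mesh aligned in camera space. Below is a minimal, hypothetical sketch (not the authors' released code) of how such a raycasting-based contact test could look, assuming a trimesh scene mesh and per-frame body vertices with outward normals; the library choice and the distance threshold are illustrative assumptions only.

# Sketch: raycasting-based contact labelling (assumed implementation, Python + trimesh).
import numpy as np
import trimesh

def contact_labels(scene_mesh, body_vertices, body_normals, max_dist=0.02):
    """Mark body vertices whose outward ray hits the scene within max_dist metres."""
    intersector = trimesh.ray.ray_triangle.RayMeshIntersector(scene_mesh)
    # Cast one ray per body vertex along its outward normal into the scene mesh.
    locations, ray_ids, _ = intersector.intersects_location(
        ray_origins=body_vertices,
        ray_directions=body_normals,
        multiple_hits=False,
    )
    labels = np.zeros(len(body_vertices), dtype=bool)
    if len(ray_ids):
        # A vertex is labelled "in contact" if its nearest hit is closer than max_dist.
        dist = np.linalg.norm(locations - body_vertices[ray_ids], axis=1)
        labels[ray_ids[dist < max_dist]] = True
    return labels

# Example usage with hypothetical file names:
# scene = trimesh.load("scene.obj")              # scene mesh aligned in camera space
# body = trimesh.load("body_frame_000.obj")      # posed body mesh for one frame
# labels = contact_labels(scene, body.vertices, body.vertex_normals)

Such per-vertex contact labels could then serve as human-environment interaction constraints in the joint optimisation of global 3D human poses and non-rigid scene deformations described in the summary; the actual energy terms used by MoCapDeform are given in the paper, not here.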