Robust 3D Human Avatar Reconstruction From Monocular Videos Using Depth Optimization and Camera Pose Estimation


Bibliographic Details
Published in: IEEE Access, Vol. 13, pp. 57886-57897
Main Authors: Kim, Kyung Min; Song, Byung Cheol
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2025.3556445

Summary: This paper presents a novel approach for 3D human avatar reconstruction from monocular RGB videos, overcoming the limitations of existing template-based methods such as BANMo. We introduce a two-fold optimization framework: first, RelPose++ is used for accurate camera pose estimation; second, depth maps are incorporated to enhance 3D shape reconstruction. Our method minimizes so-called intra-frame and inter-frame distances, improving frame-level detail while maintaining temporal coherence across multiple video frames. Extensive experiments on the MEAD, Multiface, and FEED datasets demonstrate the superiority of our approach in generating realistic, deformable 3D avatars, achieving significant improvements in Chamfer distance and F-score over existing methods. The framework is particularly effective in complex scenarios, such as bust-shot videos with only partial views of subjects, delivering robust, high-quality 3D reconstructions.
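The abstract reports improvements in Chamfer distance and F-score, the two standard metrics for comparing a reconstructed 3D surface against ground truth as point clouds. As an illustrative sketch only (not the authors' implementation), a brute-force NumPy version of both metrics might look like this; the threshold `tau` is an assumed parameter, not a value taken from the paper:

```python
import numpy as np

def chamfer_distance(pc_a, pc_b):
    """Symmetric Chamfer distance between point clouds of shape (N, 3) and (M, 3).
    Brute-force pairwise distances; adequate for small clouds."""
    d = np.linalg.norm(pc_a[:, None, :] - pc_b[None, :, :], axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def f_score(pc_pred, pc_gt, tau=0.01):
    """F-score at threshold tau: harmonic mean of precision and recall,
    where a point counts as correct if its nearest neighbour in the
    other cloud lies within tau."""
    d = np.linalg.norm(pc_pred[:, None, :] - pc_gt[None, :, :], axis=-1)
    precision = (d.min(axis=1) < tau).mean()  # fraction of predicted points near GT
    recall = (d.min(axis=0) < tau).mean()     # fraction of GT points near prediction
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Sanity check: identical clouds give Chamfer distance 0 and F-score 1.
pts = np.random.default_rng(0).random((100, 3))
print(chamfer_distance(pts, pts))  # 0.0
print(f_score(pts, pts))           # 1.0
```

Lower Chamfer distance and higher F-score indicate a closer match, which is the direction of improvement the abstract claims.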