Progressive Hypothesis Transformer for 3D Human Mesh Recovery

Bibliographic Details
Published in: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 6311-6320
Main Authors: Liao, Huang-Ru; Lin, Jen-Chun; Lee, Chun-Yi
Format: Conference Proceeding
Language: English
Published: IEEE, 03.01.2024
Summary: Recent advancements in Transformer-based human mesh reconstruction (HMR) are commendable. However, these models often lift 2D images directly to 3D vertices without explicit intermediate guidance. In addition, the global attention mechanism tends to spread attention across larger body areas and even unrelated background regions during human mesh estimation, rather than focusing on critical local regions such as human body joints. This tendency leads to inaccurate and unrealistic results for complex activities. To address these challenges, we introduce the Progressive Hypothesis Transformer, which employs 2D and 3D pose predictions to progressively guide our model. Moreover, we propose a mechanism that generates multiple plausible hypotheses for both 2D and 3D poses to mitigate potential inaccuracies arising from intermediate pose estimations. Our model also incorporates inter-intra attention to capture correlations between joints and hypotheses. Experimental results demonstrate that our method surpasses existing image-based approaches on Human3.6M [13] and 3DPW [36] with fewer parameters and relatively lower computational costs.
ISSN: 2642-9381
DOI: 10.1109/WACV57701.2024.00620