3D Human Pose Estimation Based on Transformer

Bibliographic Details
Published in Journal of physics. Conference series Vol. 2562; no. 1; pp. 12067 - 12072
Main Authors Yin, He; Lv, Chang; Shao, Yeqin
Format Journal Article
Language English
Published Bristol IOP Publishing 01.08.2023
Summary: 3D human pose estimation has become an increasingly popular research topic. Although various deep neural network models perform well, they overlook the existence of multiple feasible pose solutions and require a relatively fixed input length. To address these issues, a coordinate transformer encoder operating on a 2D pose is constructed to generate multiple feasible pose solutions, and a multi-to-one pose mapping is then employed to produce a single reliable pose. A temporal transformer encoder exploits the temporal dependencies in consecutive pose sequences, which avoids the relatively fixed input length imposed by temporal dilated convolution. Experiments indicate that the model achieves promising performance.
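The input-length point in the summary can be illustrated with a small sketch. This is not the authors' implementation. It only contrasts the fixed receptive field of a stack of temporal dilated convolutions with self-attention, which accepts a sequence of any length. The dilation schedule, the identity query/key/value projections, and all function names below are assumptions made for illustration.

```python
import math


def dilated_receptive_field(kernel_size, dilations):
    """Frames covered by a stack of stride-1 dilated 1D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections, no positional encoding). `frames` is a list
    of T feature vectors; T may be any positive length.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    output = []
    for query in frames:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in frames]
        weights = softmax(scores)
        output.append(
            [sum(w * value[i] for w, value in zip(weights, frames)) for i in range(dim)]
        )
    return output


# Hypothetical schedule: kernel size 3 with dilations 1, 3, 9, 27, 81.
# The stack is tied to exactly this many input frames per output pose.
rf = dilated_receptive_field(3, [1, 3, 9, 27, 81])

# The same attention function handles clips of different lengths unchanged.
short_clip = [[0.1 * t, 1.0] for t in range(5)]
long_clip = [[0.1 * t, 1.0] for t in range(40)]
out_short = self_attention(short_clip)
out_long = self_attention(long_clip)
```

In this sketch the convolution stack's receptive field is determined entirely by its kernel size and dilations, while the attention function returns one output vector per input frame for whatever clip length it is given.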
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/2562/1/012067