Temporal Representation Learning on Monocular Videos for 3D Human Pose Estimation

In this article, we propose an unsupervised feature extraction method to capture temporal information in monocular videos, where we detect and encode the subject of interest in each frame and leverage contrastive self-supervised (CSS) learning to extract rich latent vectors. Instead of simply treating th...

Bibliographic Details
Published in IEEE transactions on pattern analysis and machine intelligence Vol. 45; no. 5; pp. 6415 - 6427
Main Authors Honari, Sina; Constantin, Victor; Rhodin, Helge; Salzmann, Mathieu; Fua, Pascal
Format Journal Article
Language English
Published United States IEEE 01.05.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)