End-to-End Recovery of Human Shape and Pose

We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape...

Bibliographic Details
Published in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7122-7131
Main Authors Kanazawa, Angjoo; Black, Michael J.; Jacobs, David W.; Malik, Jitendra
Format Conference Proceeding
Language English
Published IEEE 01.06.2018
Abstract We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods, which compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that have only ground-truth 2D annotations. However, the reprojection loss alone is highly underconstrained. In this work, we address this problem by introducing an adversary trained to tell whether human body shape and pose parameters are real or not, using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real time given a bounding box containing the person. We demonstrate our approach on various in-the-wild images and outperform previous optimization-based methods that output 3D meshes. We also show competitive results on tasks such as 3D joint location estimation and part segmentation.
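The keypoint reprojection objective named in the abstract can be illustrated with a minimal sketch. The abstract does not state the camera model or the norm, so the weak-perspective projection (a scale plus a 2D translation), the L1 distance, the visibility mask, and the function names `weak_perspective_project` and `reprojection_loss` are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a keypoint reprojection loss.
# Assumptions (not from the abstract): weak-perspective camera,
# L1 distance, per-joint visibility mask.

def weak_perspective_project(joints_3d, scale, trans):
    """Drop depth, then scale and shift each (x, y, z) joint into the image plane."""
    return [(scale * x + trans[0], scale * y + trans[1]) for x, y, _z in joints_3d]

def reprojection_loss(joints_3d, keypoints_2d, visibility, scale, trans):
    """Average L1 distance between projected joints and visible 2D annotations."""
    projected = weak_perspective_project(joints_3d, scale, trans)
    total = 0.0
    for (px, py), (kx, ky), v in zip(projected, keypoints_2d, visibility):
        total += v * (abs(px - kx) + abs(py - ky))
    # Guard against division by zero when no joint is annotated.
    return total / max(sum(visibility), 1)

# Toy data: three joints, the third one has no 2D annotation.
joints_3d = [(0.0, 0.0, 1.0), (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)]
scale, trans = 2.0, (0.5, -0.5)
visibility = [1.0, 1.0, 0.0]

# Annotations that match the projection exactly give zero loss.
keypoints_2d = weak_perspective_project(joints_3d, scale, trans)
loss_exact = reprojection_loss(joints_3d, keypoints_2d, visibility, scale, trans)

# Shifting every annotation by one pixel in x and y costs 2 per visible joint.
shifted = [(kx + 1.0, ky + 1.0) for kx, ky in keypoints_2d]
loss_shifted = reprojection_loss(joints_3d, shifted, visibility, scale, trans)

# Moving every joint 5 units deeper leaves the loss unchanged.
deeper = [(x, y, z + 5.0) for x, y, z in joints_3d]
loss_deeper = reprojection_loss(deeper, keypoints_2d, visibility, scale, trans)
```

In this toy setup, `loss_deeper` equals `loss_exact` even though the 3D joints differ. Under the assumed camera, the 2D term cannot distinguish bodies that differ only in depth, which is one way to read the abstract's remark that the reprojection loss alone is highly underconstrained and needs an additional prior such as the adversary.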
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CVPR.2018.00744
Discipline Applied Sciences
EISBN 9781538664209
1538664208
EISSN 1063-6919
EndPage 7131
ExternalDocumentID 8578842
Genre orig-research
IsPeerReviewed false
IsScholarly true
Language English
PageCount 10
ParticipantIDs ieee_primary_8578842
PublicationCentury 2000
PublicationDate 2018-06
PublicationDateYYYYMMDD 2018-06-01
PublicationDate_xml – month: 06
  year: 2018
  text: 2018-06
PublicationDecade 2010
PublicationTitle 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
PublicationTitleAbbrev CVPR
PublicationYear 2018
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 7122
SubjectTerms Biological system modeling
Estimation
Joints
Shape
Solid modeling
Three-dimensional displays
Two dimensional displays
Title End-to-End Recovery of Human Shape and Pose
URI https://ieeexplore.ieee.org/document/8578842