Learning camera viewpoint using CNN to improve 3D body pose estimation

The objective of this work is to estimate 3D human pose from a single RGB image. Extracting image representations which incorporate both spatial relation of body parts and their relative depth plays an essential role in accurate 3D pose reconstruction. In this paper, for the first time, we show that...

Bibliographic Details
Published in	arXiv.org
Main Authors	Mona Fathollahi Ghezelghieh, Rangachar Kasturi, Sudeep Sarkar
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 18.09.2016
Subjects	Body parts; Cameras; Error reduction; Image reconstruction; Image segmentation; Mathematical models; Robustness (mathematics); Three dimensional bodies
ISSN	2331-8422

Summary:	The objective of this work is to estimate 3D human pose from a single RGB image. Extracting image representations which incorporate both the spatial relation of body parts and their relative depth plays an essential role in accurate 3D pose reconstruction. In this paper, for the first time, we show that camera viewpoint in combination with 2D joint locations significantly improves 3D pose accuracy without the explicit use of mathematical models of perspective geometry. To this end, we train a deep Convolutional Neural Network (CNN) to learn categorical camera viewpoint. To make the network robust against the clothing and body shape of the subject in the image, we use 3D computer rendering to synthesize additional training images. We test our framework on the largest 3D pose estimation benchmark, Human3.6m, and achieve up to 20% error reduction compared to state-of-the-art approaches that do not use body part segmentation.
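
The summary treats camera viewpoint as a categorical variable that is combined with 2D joint locations. The following is a minimal sketch of what such a representation could look like, not the authors' implementation. It assumes the viewpoint is a camera azimuth angle discretized into equal-width bins; the bin count (8), the 17-joint layout, and the function names `viewpoint_class` and `build_feature` are all hypothetical choices made for illustration. In the paper the class would be predicted by a CNN from the image, whereas here it is computed directly from a known angle.

```python
import numpy as np

# Assumption: the summary does not state how many viewpoint classes are used.
NUM_VIEW_BINS = 8


def viewpoint_class(azimuth_deg, num_bins=NUM_VIEW_BINS):
    """Map a camera azimuth in degrees to a categorical bin index."""
    width = 360.0 / num_bins
    index = int((azimuth_deg % 360.0) // width)
    # Guard against floating-point rounding pushing the index out of range.
    return min(index, num_bins - 1)


def build_feature(joints_2d, view_class, num_bins=NUM_VIEW_BINS):
    """Concatenate flattened 2D joint locations with a one-hot viewpoint code."""
    one_hot = np.zeros(num_bins, dtype=np.float32)
    one_hot[view_class] = 1.0
    joints = np.asarray(joints_2d, dtype=np.float32).ravel()
    return np.concatenate([joints, one_hot])


# Usage: 17 joints (an assumed skeleton size) seen from an azimuth of 100 degrees.
joints = np.zeros((17, 2))
feature = build_feature(joints, viewpoint_class(100.0))
print(feature.shape)
```

A vector of this kind could serve as the input to a regressor that outputs 3D joint positions; the one-hot code gives the regressor coarse depth-ordering information without any explicit perspective model.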