Training-based head pose estimation under monocular vision

Bibliographic Details
Published in: IET Computer Vision, Vol. 10, No. 8, pp. 798-805
Main Authors: Guo, Zhizhi; Zhou, Qianxiang; Liu, Zhongqi; Liu, Chunhui
Format: Journal Article
Language: English
Published: The Institution of Engineering and Technology / Wiley, 01.12.2016

Summary: Although many 3D head pose estimation methods based on monocular vision can achieve an accuracy of 5°, reducing the number of required training samples and avoiding the use of any hardware parameters as input features remain among the biggest challenges in the field of head pose estimation. To address these challenges, the authors propose an accurate head pose estimation method which can act as an extension to facial key point detection systems. The basic idea is to use the normalised distances between key points as input features, and to use ℓ1-minimisation to select a set of sparse training samples which reflect the mapping relationship between the feature vector space and the head pose space. The linear combination of the head poses corresponding to these samples represents the head pose of the test sample. The experimental results show that the authors' method achieves an accuracy of 2.6° without any extra hardware parameters or information about the subject. In addition, the method is still able to estimate the head pose under large head movement and varying illumination.
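To make the pipeline in the summary concrete, the sketch below shows the general shape of a sparse-representation pose estimator: normalised pairwise distances between facial key points as features, an ℓ1-style sparse coding step over the training set, and a weighted combination of the corresponding training poses. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the max-distance normalisation, the weight normalisation, and the use of scikit-learn's Lasso (an ℓ1-regularised least-squares surrogate for exact ℓ1-minimisation) are all choices made here for brevity.

```python
# Minimal sketch of a sparse-representation head pose estimator.
# All names and normalisation choices are illustrative assumptions;
# Lasso stands in for the paper's l1-minimisation step.
import numpy as np
from sklearn.linear_model import Lasso


def landmark_distance_features(landmarks):
    """Normalised pairwise distances between facial key points.

    landmarks: (K, 2) array of 2D key-point coordinates for one face.
    Returns a 1D vector of all pairwise distances, scaled so the largest
    distance is 1 (a simple normalisation assumption).
    """
    i, j = np.triu_indices(landmarks.shape[0], k=1)
    d = np.linalg.norm(landmarks[i] - landmarks[j], axis=1)
    return d / d.max()


def estimate_pose(train_features, train_poses, test_feature, alpha=1e-3):
    """Estimate head pose as a sparse linear combination of training poses.

    train_features: (N, D) feature vectors of the training samples.
    train_poses:    (N, 3) yaw/pitch/roll angles of the training samples.
    test_feature:   (D,)   feature vector of the test sample.
    """
    # Find a sparse weight vector w such that train_features.T @ w
    # approximates the test feature (l1-regularised least squares).
    solver = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000)
    solver.fit(train_features.T, test_feature)
    w = solver.coef_
    # Normalise the weights (an assumption; the paper's exact weighting
    # scheme may differ).
    if np.abs(w).sum() > 0:
        w = w / np.abs(w).sum()
    # The estimated pose is the corresponding combination of training poses.
    return train_poses.T @ w


# Toy usage with random data, purely to show the call pattern.
rng = np.random.default_rng(0)
train_lm = rng.normal(size=(50, 68, 2))        # 50 training faces, 68 key points each
train_X = np.array([landmark_distance_features(lm) for lm in train_lm])
train_Y = rng.uniform(-45, 45, size=(50, 3))   # yaw, pitch, roll in degrees
test_X = landmark_distance_features(rng.normal(size=(68, 2)))
print(estimate_pose(train_X, train_Y, test_X))
```

Because the features are ratios of key-point distances rather than raw coordinates or camera parameters, no hardware-specific information enters the estimator, which matches the hardware-independence claim in the summary.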
ISSN: 1751-9632, 1751-9640
DOI: 10.1049/iet-cvi.2015.0457