Real-Time Head Orientation from a Monocular Camera Using Deep Neural Network

Bibliographic Details
Published in	Computer Vision -- ACCV 2014, Vol. 9005, pp. 82-96
Main Authors	Ahn, Byungtae; Park, Jaesik; Kweon, In So
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 01.01.2015
Series	Lecture Notes in Computer Science
Subjects	Artificial intelligence; Convolutional Neural Network; Graphic Processing Unit; Image processing; Input Image; Particle Filter; Pattern recognition; Random Forest
Summary:	We propose an efficient and accurate head orientation estimation algorithm using a monocular camera. Our approach builds on a deep neural network, which we use as a regressor to learn the mapping between visual appearance and three-dimensional head orientation angles. In contrast to classification-based approaches, our system therefore outputs continuous head orientation. The algorithm uses convolutional filters trained on a large number of augmented head appearances, so it is user-independent and covers large pose variations. Our key observation is that an input image of $32 \times 32$ resolution is enough to achieve about 3 degrees of mean square error, which is sufficient for efficient head orientation applications. As a result, our architecture takes only 1 ms on roughly localized head positions with the aid of a GPU. We also propose particle-filter-based post-processing to further enhance the stability of the estimation in video sequences. We compare the performance with the state-of-the-art algorithm that uses a depth sensor, and we validate our head orientation estimator on Internet photos and video.
ISBN:	9783319168104, 331916810X
ISSN:	0302-9743, 1611-3349
DOI:	10.1007/978-3-319-16811-1_6
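
The summary mentions particle-filter-based post-processing that stabilises per-frame orientation estimates in video. The sketch below illustrates that general idea only; it is not the authors' implementation. The function name `particle_filter_smooth`, the random-walk motion model, the Gaussian measurement likelihood, and all parameter values are assumptions made for illustration, and the angles are assumed to stay well away from the ±180 degree wrap-around.

```python
import math
import random


def particle_filter_smooth(measurements, n_particles=500, motion_std=2.0,
                           meas_std=5.0, seed=0):
    """Smooth per-frame (yaw, pitch, roll) estimates given in degrees.

    Hypothetical helper: the motion and measurement models are assumptions,
    not details taken from the chapter.
    """
    rng = random.Random(seed)

    # Initialise the particles around the first measurement.
    first = measurements[0]
    particles = [
        tuple(a + rng.gauss(0.0, meas_std) for a in first)
        for _ in range(n_particles)
    ]

    smoothed = []
    for z in measurements:
        # Predict: random-walk motion model (assumed).
        particles = [
            tuple(a + rng.gauss(0.0, motion_std) for a in p)
            for p in particles
        ]

        # Update: weight each particle by a Gaussian likelihood (assumed).
        weights = []
        for p in particles:
            sq_err = sum((pa - za) ** 2 for pa, za in zip(p, z))
            weights.append(math.exp(-0.5 * sq_err / meas_std ** 2))
        total = sum(weights)
        if total == 0.0:
            # Every weight underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]

        # Estimate: weighted mean of the particles, per angle.
        smoothed.append(tuple(
            sum(w * p[i] for w, p in zip(weights, particles))
            for i in range(3)
        ))

        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)

    return smoothed
```

In this sketch, `measurements` would be the raw per-frame outputs of an orientation regressor. A larger `motion_std` lets the estimate follow fast head motion more closely, while a larger `meas_std` smooths more aggressively.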