Audio-visual affect recognition through multi-stream fused HMM for HCI

Bibliographic Details
Published in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Vol. 2, pp. 967-972
Main Authors: Zeng, Z., Tu, J., Pianfetti, B., Liu, M., Zhang, T., Zhang, Z., Huang, T.S., Levinson, S.
Format: Conference Proceeding
Language: English
Published: IEEE, 2005

Summary: Advances in computer processing power and emerging algorithms are allowing new ways of envisioning human-computer interaction. This paper focuses on the development of a computing algorithm that uses audio and visual sensors to detect and track a user's affective state to aid computer decision making. Using our multi-stream fused hidden Markov model (MFHMM), we analyzed coupled audio and visual streams to detect 11 cognitive/emotive states. The MFHMM allows the building of an optimal connection among multiple streams according to the maximum entropy principle and the maximum mutual information criterion. Person-independent experimental results from 20 subjects in 660 sequences show that the MFHMM approach performs with an accuracy of 80.61%, which outperforms face-only HMM, pitch-only HMM, energy-only HMM, and independent HMM fusion.
ISBN: 0769523722, 9780769523729
ISSN: 1063-6919
DOI: 10.1109/CVPR.2005.77