Modelling Spatio-Temporal Saliency to Predict Gaze Direction for Short Videos

Bibliographic Details
Published in	International Journal of Computer Vision, Vol. 82, no. 3, pp. 231-243
Main Authors	Marat, Sophie; Ho Phuoc, Tien; Granjon, Lionel; Guyader, Nathalie; Pellerin, Denis; Guérin-Dugué, Anne
Format	Journal Article
Language	English
Published	Boston: Springer US, 01.05.2009 (publisher also listed as Springer; Springer Nature B.V; Springer Verlag)
Subjects	Applied sciences | Artificial Intelligence | Biology | Computer Imaging | Computer Science | Computer science; control theory; systems | Engineering Sciences | Exact sciences and technology | Eye movement | Eye movements | Gaze | Gaze prediction | Image processing | Image Processing and Computer Vision | Modeling | Pattern Recognition | Pattern Recognition and Graphics | Pattern recognition. Digital image processing. Computational geometry | Retina | Saliency | Short Paper | Signal and Image processing | Spatio-temporal model | Streaming | Very large databases | Video signal | Video viewing | Vision | Visual system
Summary:	This paper presents a spatio-temporal saliency model that predicts eye movement during video free viewing. This model is inspired by the biology of the first steps of the human visual system. The model extracts two signals from the video stream corresponding to the two main outputs of the retina: parvocellular and magnocellular. Then, both signals are split into elementary feature maps by cortical-like filters. These feature maps are used to form two saliency maps: a static and a dynamic one. These maps are then fused into a spatio-temporal saliency map. The model is evaluated by comparing the salient areas of each frame predicted by the spatio-temporal saliency map to the eye positions of different subjects during a free video viewing experiment with a large database (17,000 frames). In parallel, the static and the dynamic pathways are analyzed to understand what is more or less salient and for what type of videos the model is a good or a poor predictor of eye movement.
ISSN:	0920-5691, 1573-1405
DOI:	10.1007/s11263-009-0215-3
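
Illustrative sketch: the summary describes fusing a static and a dynamic saliency map into one spatio-temporal map and comparing its salient areas with recorded eye positions. The short Python sketch below shows one way those last two steps could look. It is not the authors' method: the fusion rule (a weighted average of min-max normalized maps), the evaluation score (mean z-scored saliency at the fixated pixels), and every name in it (`normalize`, `fuse`, `fixation_score`, `w_static`) are assumptions made for illustration, and the toy maps are made-up values, not data from the paper.

```python
from statistics import mean, pstdev


def normalize(saliency_map):
    """Rescale a 2-D map (list of rows) to [0, 1]; a flat map becomes all zeros."""
    flat = [v for row in saliency_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in saliency_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency_map]


def fuse(static_map, dynamic_map, w_static=0.5):
    """Assumed fusion rule: weighted average of the two normalized maps."""
    s, d = normalize(static_map), normalize(dynamic_map)
    return [
        [w_static * sv + (1.0 - w_static) * dv for sv, dv in zip(s_row, d_row)]
        for s_row, d_row in zip(s, d)
    ]


def fixation_score(saliency_map, eye_positions):
    """Assumed metric: mean z-scored saliency at (row, col) eye positions.

    Positive values mean the fixated pixels are more salient than the
    frame average; negative values mean they are less salient.
    """
    flat = [v for row in saliency_map for v in row]
    mu, sigma = mean(flat), pstdev(flat)
    if sigma == 0:
        return 0.0
    return mean((saliency_map[r][c] - mu) / sigma for r, c in eye_positions)


# Toy 3x3 frame (hypothetical values): the static pathway responds at the
# centre, the dynamic pathway at the centre and the bottom-right corner.
static_map = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]
dynamic_map = [[0, 0, 0], [0, 2, 0], [0, 0, 2]]
fused = fuse(static_map, dynamic_map)

score_on_target = fixation_score(fused, [(1, 1)])   # gaze on the salient centre
score_off_target = fixation_score(fused, [(0, 0)])  # gaze on a non-salient corner
```

In this toy case the score is positive when the gaze lands on the fused peak and negative when it lands on an empty corner, which is the kind of per-frame comparison the summary describes at a much larger scale.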