A Learning-Based Visual Saliency Prediction Model for Stereoscopic 3D Video (LBVS-3D)
Multimedia Tools and Applications, 2016
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 13.03.2018 |
Subjects | |
Summary: | Multimedia Tools and Applications, 2016. Over the past decade, many computational saliency prediction models have been
proposed for 2D images and videos. Considering that the human visual system has
evolved in a natural 3D environment, it makes sense to design
visual attention models for 3D content. Existing monocular saliency models are
not able to accurately predict the attentive regions when applied to 3D
image/video content, as they do not incorporate depth information. This paper
explores stereoscopic video saliency prediction by exploiting both low-level
attributes (brightness, color, texture, orientation, motion, and depth) and
high-level cues (face, person, vehicle, animal, text, and
horizon). Our model starts with a rough segmentation and quantifies several
intuitive observations such as the effects of visual discomfort level, depth
abruptness, motion acceleration, elements of surprise, size and compactness of
the salient regions, and the emphasis on only a few salient objects in a scene. A
new fovea-based model of spatial distance between the image regions is adopted
for local and global feature calculations. To efficiently fuse the
conspicuity maps generated by our method into a single saliency map that is
highly correlated with the eye-fixation data, a random-forest-based algorithm
is used. The performance of the proposed saliency model is evaluated
against the results of an eye-tracking experiment, which involved 24 subjects
and an in-house database of 61 captured stereoscopic videos. Our stereo video
database as well as the eye-tracking data are publicly available along with
this paper. Experimental results show that the proposed saliency prediction
method performs competitively with state-of-the-art
approaches. |
---|---|
DOI: | 10.48550/arxiv.1803.04842 |
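
The fusion step mentioned in the summary (combining several conspicuity maps into one saliency value with a random forest) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small pure-Python random-forest regressor (bootstrap sampling plus random feature selection at each split), and the three per-region features (depth, motion, and face conspicuity), the synthetic fixation targets, and all function names and parameter values are assumptions made only for illustration.

```python
import random
from statistics import mean


def grow_tree(X, y, depth, n_try, rng):
    """Grow one regression tree; each split tests n_try randomly chosen features."""
    if depth == 0 or len(set(y)) == 1:
        return mean(y)  # leaf: average target of the regions that land here
    best = None
    for f in rng.sample(range(len(X[0])), n_try):
        # Dropping the largest value keeps both sides of every split non-empty.
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            ml, mr = mean(left), mean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:  # the sampled features are constant here, so no split exists
        return mean(y)
    _, f, thr = best
    lo = [(row, t) for row, t in zip(X, y) if row[f] <= thr]
    hi = [(row, t) for row, t in zip(X, y) if row[f] > thr]
    return (f, thr,
            grow_tree([r for r, _ in lo], [t for _, t in lo], depth - 1, n_try, rng),
            grow_tree([r for r, _ in hi], [t for _, t in hi], depth - 1, n_try, rng))


def tree_predict(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node


def fit_forest(X, y, n_trees=25, depth=4, n_try=2, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], depth, n_try, rng))
    return forest


def fuse(forest, row):
    """Fused saliency for one region: the average of all tree predictions."""
    return mean(tree_predict(tree, row) for tree in forest)


# Synthetic stand-in data (assumption): 60 regions, each with three conspicuity
# values in [0, 1]; the target weights below are invented, not from the paper.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(60)]
y = [0.5 * d + 0.3 * m + 0.2 * f for d, m, f in X]

forest = fit_forest(X, y)
fused = [fuse(forest, row) for row in X]
print(len(forest), "trees; first fused value:", round(fused[0], 3))
```

In practice the training targets would come from eye-fixation density maps rather than a synthetic formula, and an established library implementation of random forests would normally replace the hand-written trees.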