Viewing Bias Matters in 360° Videos Visual Saliency Prediction

Bibliographic Details
Published in	IEEE Access, Vol. 11, pp. 46084-46094
Main Authors	Chen, Peng-Wen; Yang, Tsung-Shan; Huang, Gi-Luen; Huang, Chia-Wen; Chao, Yu-Chieh; Lu, Chien-Hung; Wu, Pei-Yuan
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023
Subjects	360° videos; Bias; Coders; Convolutional neural networks; Datasets; Decoding; Deep learning; Feature extraction; Field of view; Prediction models; Predictive models; Recurrent neural networks; Salience; Surveillance systems; Three-dimensional displays; Video; Videos; Viewing; Viewing bias; Visual saliency prediction; Visualization
Summary:	360° video has been applied in many areas, such as immersive content, virtual tours, and surveillance systems. Compared with field-of-view prediction on planar videos, the vast amount of information contained in the omnidirectional view over the entire sphere poses an additional challenge in predicting highly salient regions in 360° videos. In this work, we propose a visual saliency prediction model that directly takes 360° video in the equirectangular format as input. Unlike previous works, which often adopted a recurrent neural network (RNN) architecture for the saliency detection task, we use 3D convolutions in a spatial-temporal encoder and generalize SphereNet kernels to construct a spatial-temporal decoder. We further study the statistical properties of the viewing biases present in 360° datasets across various video types, which gives us insight into the design of a fusion mechanism that adaptively combines the predicted saliency map with the viewing bias. The proposed model yields state-of-the-art performance, as evidenced by empirical results on well-known 360° visual saliency datasets such as Salient360!, PVS, and Sport360.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2023.3269564
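
Illustrative note: the Summary mentions fusing a predicted saliency map with a viewing bias on equirectangular frames. The sketch below is not the authors' method; it only shows one simple way such a fusion could look. It assumes, hypothetically, that the viewing bias is a Gaussian prior over latitude centered on the equator and that the fusion is a convex combination controlled by a weight `alpha`. The function names `equator_bias` and `fuse`, the parameter `sigma_deg`, and the fixed scalar `alpha` are all assumptions made for illustration; the paper's adaptive mechanism is not described in this record.

```python
import numpy as np


def equator_bias(height, width, sigma_deg=20.0):
    """Hypothetical viewing-bias prior: Gaussian over latitude, peaked at the equator."""
    # Latitude of each pixel-row centre in degrees, from near +90 (top) to near -90 (bottom).
    lat = 90.0 - (np.arange(height) + 0.5) * 180.0 / height
    column = np.exp(-0.5 * (lat / sigma_deg) ** 2)
    # The prior is constant along longitude, so repeat the column across the width.
    prior = np.tile(column[:, None], (1, width))
    return prior / prior.sum()


def fuse(saliency, bias, alpha):
    """Convex combination of a saliency map and a bias prior (assumed fusion rule).

    alpha = 0 keeps only the predicted saliency; alpha = 1 keeps only the bias.
    Both inputs are treated as non-negative maps and the result sums to 1.
    """
    saliency = saliency / saliency.sum()
    fused = (1.0 - alpha) * saliency + alpha * bias
    return fused / fused.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    predicted = rng.random((90, 180))  # stand-in for a model's saliency output
    prior = equator_bias(90, 180)
    fused = fuse(predicted, prior, alpha=0.3)
    print(fused.shape, float(fused.sum()))
```

In an adaptive scheme, `alpha` would presumably vary with the video type or content rather than being a fixed scalar; how the paper chooses it cannot be inferred from this record.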