User Navigation Modeling, Rate-Distortion Analysis, and End-to-End Optimization for Viewport-Driven 360° Video Streaming

Bibliographic Details
Published in IEEE Transactions on Multimedia Vol. 25; pp. 5941-5956
Main Authors Chakareski, Jacob; Corbillon, Xavier; Simon, Gwendal; Swaminathan, Viswanathan
Format Journal Article
Language English
Published IEEE 2023
Summary: The emerging technologies of Virtual Reality (VR) and 360° video introduce new challenges for state-of-the-art video communication systems. Enormous data volume and spatial user navigation are unique characteristics of 360° videos that necessitate an effective space-time allocation of the available network streaming bandwidth over the 360° video content to maximize the Quality of Experience (QoE) delivered to the user. Towards this objective, we investigate a framework for viewport-driven rate-distortion optimized 360° video streaming that integrates the user view navigation patterns and the spatiotemporal rate-distortion characteristics of the 360° video content to maximize the delivered user viewport video quality, for the given network/system resources. The framework comprises three components: a methodology for assigning dynamic navigation likelihoods over the 360° video spatiotemporal panorama, induced by the user navigation patterns; an analysis and characterization of the 360° video panorama's spatiotemporal rate-distortion characteristics that leverage preprocessed spatial tiling of the content; and an optimization problem formulation and solution that capture and aim to maximize the delivered expected viewport video quality, given a user's navigation patterns, the 360° video encoding/streaming decisions, and the available system/network resources.
We formulate a Markov model to capture the navigation patterns of a user over the 360° video panorama and simultaneously extend our actual navigation datasets by synthesizing additional realistic navigation data. Moreover, we investigate the impact of using two different tile sizes for equirectangular tiling of the 360° video panorama. Our experimental results demonstrate the advantages of our framework over the conventional approach of streaming a monolithic uniformly-encoded 360° video and over a state-of-the-art navigation-speed based reference method. Considerable average and instantaneous viewport video quality gains of up to 5 dB are demonstrated in the case of five popular 4K 360° videos. In addition, we explore the impact of two different popular 360° video quality metrics applied to evaluate the streaming performance of our system framework and the two reference methods. Finally, we demonstrate that by exploiting the unequal rate-distortion characteristics of the different spatial sectors of the 360° video panorama, we can enable spatially more uniform and temporally higher 360° video viewport quality delivered to the user, relative to monolithic streaming.
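Illustrative note: the summary mentions a Markov model of user navigation that is also used to synthesize additional navigation data. The sketch below is not the authors' model; it is a minimal first-order Markov chain under assumptions of our own: states are integer tile indices, a trace is a list of the tile the viewport center falls in at each time step, and the helper names `fit_transition_probs` and `synthesize_trace` are hypothetical.

```python
import random


def fit_transition_probs(traces, num_tiles):
    """Estimate tile-to-tile transition probabilities from observed traces."""
    counts = [[0] * num_tiles for _ in range(num_tiles)]
    for trace in traces:
        # Count every consecutive (current tile, next tile) pair.
        for cur, nxt in zip(trace, trace[1:]):
            counts[cur][nxt] += 1
    probs = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: a tile never left in the data is treated as absorbing.
            probs.append([1.0 if j == i else 0.0 for j in range(num_tiles)])
        else:
            probs.append([c / total for c in row])
    return probs


def synthesize_trace(probs, start, length, rng):
    """Sample a synthetic navigation trace of the given length from the chain."""
    trace = [start]
    states = range(len(probs))
    for _ in range(length - 1):
        trace.append(rng.choices(states, weights=probs[trace[-1]])[0])
    return trace


# Toy data (made up for illustration): two short traces over three tiles.
observed = [[0, 0, 1, 1, 2], [0, 1, 1, 2, 2]]
P = fit_transition_probs(observed, num_tiles=3)
synthetic = synthesize_trace(P, start=0, length=10, rng=random.Random(0))
```

Each row of `P` can be read as a navigation likelihood over tiles for the next time step, which is the kind of quantity a viewport-driven rate allocation would weight tile quality by; how the paper actually defines its states and likelihoods is not specified in this record.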
ISSN: 1520-9210
1941-0077
DOI: 10.1109/TMM.2022.3201397