Spatial-Random-Access-Enabled Video Coding for Interactive Virtual Pan/Tilt/Zoom Functionality

Bibliographic Details
Published in	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 21, no. 5, pp. 577-588
Main Authors	Mavlankar, A; Girod, B
Format	Journal Article
Language	English
Published	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2011
Subjects	Applied sciences; Automatic voltage control; Coding; Coding, codes; Encoding; Exact sciences and technology; Extraction; Image processing; Information, signal and communications theory; Interactive video streaming; Miscellaneous; Optimization; pan/tilt/zoom; Pixel; Playbacks; Random variables; Reduction; region-of-interest; Signal and communications theory; Signal processing; Spatial resolution; Stores; Streaming media; Systems, networks and services of telecommunications; Telecommunications; Telecommunications and information theory; Teletraffic; Video coding; Viewing; Performance evaluation; High resolution; Video signal; Multiresolution analysis; Video signal processing; Traffic control; Zoom; Motion compensation; Streaming; Background; Flow rate regulation; Decoding; Information transmission; Interactive system; Interest region; Tilt angle; Interactive video; Traffic management; Random access; Personal communication networks
Summary:	High-spatial-resolution videos offer the possibility of viewing an arbitrary region-of-interest (RoI) interactively. Zoom functionality enables watching high-resolution content even on displays of lower spatial resolution. If arbitrary regions corresponding to arbitrary zoom factors can be served to the user, the transmission and/or decoding of the entire high-spatial-resolution video can be avoided. Moreover, if the video content can be encoded such that arbitrary RoIs corresponding to different zoom factors can be simply extracted from the compressed bitstream, we can avoid dedicated video encoding for each user. We propose such a video coding scheme that is vital in allowing the system to scale to large numbers of remote users as well as to encode and store the content for subsequent repeated playback. Apart from generating a multi-resolution representation, our coding scheme uses P slices from H.264/AVC. We study the tradeoff in the choice of slice size. A larger slice size enables higher coding efficiency for representing the entire scene but increases the number of pixels that have to be transmitted. The optimal slice size achieves the best tradeoff and minimizes the expected transmission bitrate. Experimental results confirm the optimality of our predicted slice size for various test cases. Furthermore, we propose an improvement based on background extraction and long-term memory motion-compensated prediction. Experiments indicate up to 85% bitrate reduction while retaining efficient random access capability.
ISSN:	1051-8215 1558-2205
DOI:	10.1109/TCSVT.2011.2129170
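
Illustration:	The summary notes that a larger slice size increases the number of pixels that have to be transmitted for a given RoI. The sketch below illustrates only that pixel-count side of the tradeoff, under assumptions that are not taken from the article: slices are taken to form a regular rectangular grid anchored at the frame origin, frame boundaries are ignored, and the function names and example dimensions are hypothetical. It does not model coding efficiency or bitrate, and it is not the authors' optimization method.

```python
def slices_covering_roi(roi_x, roi_y, roi_w, roi_h, slice_w, slice_h):
    """Return the column and row index ranges of grid slices that overlap the RoI.

    Assumption (not from the article): slices tile the frame on a regular
    grid starting at pixel (0, 0), each slice being slice_w x slice_h pixels.
    """
    first_col = roi_x // slice_w
    last_col = (roi_x + roi_w - 1) // slice_w  # column holding the last RoI pixel
    first_row = roi_y // slice_h
    last_row = (roi_y + roi_h - 1) // slice_h  # row holding the last RoI pixel
    return range(first_col, last_col + 1), range(first_row, last_row + 1)


def transmitted_pixels(roi_x, roi_y, roi_w, roi_h, slice_w, slice_h):
    """Count the pixels in all slices that must be sent to cover the RoI."""
    cols, rows = slices_covering_roi(roi_x, roi_y, roi_w, roi_h, slice_w, slice_h)
    return len(cols) * len(rows) * slice_w * slice_h


if __name__ == "__main__":
    # Hypothetical 480x240 RoI whose top-left corner is at pixel (100, 60).
    roi = (100, 60, 480, 240)
    roi_pixels = roi[2] * roi[3]
    for size in (64, 128, 256):
        sent = transmitted_pixels(*roi, size, size)
        print(f"slice {size}x{size}: {sent} pixels sent, "
              f"overhead factor {sent / roi_pixels:.2f}")
```

In this toy setting, moving from 64x64 to 256x256 slices roughly doubles the number of pixels sent for the same RoI. Whether that cost is outweighed by the higher coding efficiency of larger slices is the tradeoff the article studies.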