Attention-Based Inter-Prediction for Versatile Video Coding

Bibliographic Details
Published in: IEEE Access, Vol. 11, p. 1
Main Authors: Tran, Quang Nhat; Yang, Shih-Hsuan
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023

Summary: Versatile Video Coding (VVC) is the latest video coding standard, providing significant coding-efficiency gains over its predecessors through new coding tools and greater flexibility. In this paper, we propose a generative adversarial network-based inter-picture prediction approach for VVC. The proposed method involves two major parts: deep attention map estimation and deep frame interpolation. Adjacent VVC-coded frames (every other frame) are taken as the reference data for the proposed inter-picture prediction. The deep attention map classifies pixels as high-interest or low-interest. The low-interest pixels are replaced by data generated through frame interpolation without extra coded bits, while the remaining pixels are encoded using the conventional VVC coding tools. The generation of the attention map and the interpolated frame can be incorporated into the VVC encoding algorithm under a unified framework. Experimental results show that the proposed method improves the coding efficiency of VVC with a moderate (26.7%) increase in runtime: an average 1.91% BD-rate saving over the VVC reference software was achieved under the Random-Access configuration, with significant bitrate reductions observed for the chroma components (U and V).
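As a rough illustration of the attention-gated reconstruction the summary describes, here is a minimal Python/NumPy sketch. The function name, the binary 0.5 threshold, and the random stand-in data are assumptions made for illustration, not details taken from the paper.

import numpy as np

def merge_coded_and_interpolated(coded, interpolated, attention, threshold=0.5):
    # Hypothetical helper (not from the paper): keep the conventionally
    # VVC-coded values at high-interest pixels and substitute the
    # frame-interpolation output at low-interest pixels, which per the
    # summary costs no extra coded bits.
    high_interest = attention >= threshold  # boolean mask, shape HxW
    return np.where(high_interest, coded, interpolated)

# Toy usage with random arrays standing in for real frame data.
h, w = 64, 64
rng = np.random.default_rng(0)
coded = rng.integers(0, 256, (h, w), dtype=np.uint8)
interp = rng.integers(0, 256, (h, w), dtype=np.uint8)
attn = rng.random((h, w))  # assumed attention scores in [0, 1]
reconstructed = merge_coded_and_interpolated(coded, interp, attn)

In the actual codec, the mask would come from the deep attention network and the substitute pixels from the deep frame-interpolation network; this sketch only shows the gating step.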
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3303510