Configurable Fast Block Partitioning for VVC Intra Coding Using Light Gradient Boosting Machine

Bibliographic Details
Published in	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, no. 6, pp. 3947-3960
Main Authors	Saldanha, Mario; Sanchez, Gustavo; Marcon, Cesar; Agostini, Luciano
Format	Journal Article
Language	English
Published	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.06.2022
Subjects	Boosting; Classifiers; Complexity theory; Distortion; Efficiency; Encoding; Evaluation; intra coding; light gradient boosting machine; machine learning; Optimization; Partitioning; Shape; Streaming media; timesaving; Transforms; VVC
Summary:	This article presents a configurable fast block partitioning decision for Versatile Video Coding (VVC) intra-frame prediction using Light Gradient Boosting Machine (LGBM). VVC further improves coding efficiency by introducing a Quadtree with nested Multi-Type Tree (QTMT) structure, which enables five split types and allows both square and rectangular Coding Unit (CU) sizes. However, this improvement comes at the cost of a high computational burden, since several combinations of block sizes and prediction modes are evaluated through the costly Rate-Distortion Optimization (RDO) process. In this article, we propose a partitioning decision that uses LGBM classifiers to avoid the exhaustive RDO process by skipping the evaluation of split types that are unlikely to be chosen as the best one. For this purpose, five classifiers (one for each split type) were trained offline with an efficient training process, using effective features drawn from texture, coding, and context information. The proposed solution is highly configurable and can provide several operation points with different tradeoffs between timesaving and coding efficiency, according to the application requirements. Across five operation points, the configurable solution reduces the encoding time by 35.22% to 61.34%, with coding efficiency losses of 0.46% to 2.43%. Compared with state-of-the-art related works, our solution achieves a better combined rate-distortion and timesaving result.
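The summary describes one classifier per split type, used to skip split types that are unlikely to win the RDO search. The Python sketch below illustrates that idea only in outline; it is not the authors' implementation. It assumes each classifier returns a probability that its split type is the best one, and it assumes an operation point can be modelled as a single probability threshold, which the summary does not state. The names `SPLIT_TYPES`, `splits_to_evaluate`, and `constant` are hypothetical, and the stand-in classifiers return fixed values rather than predictions from trained LGBM models.

```python
from typing import Callable, Dict, List, Sequence

# The five QTMT split types in VVC: quadtree, binary (horizontal/vertical),
# and ternary (horizontal/vertical). The labels are illustrative.
SPLIT_TYPES = ("QT", "BT_H", "BT_V", "TT_H", "TT_V")

# Hypothetical classifier interface: CU features -> probability that this
# split type would be chosen as the best one.
Classifier = Callable[[Sequence[float]], float]


def splits_to_evaluate(
    features: Sequence[float],
    classifiers: Dict[str, Classifier],
    threshold: float,
) -> List[str]:
    """Return the split types whose predicted probability reaches the threshold.

    Split types below the threshold are skipped, so the encoder would not
    run RDO for them. A higher threshold skips more (faster, larger
    coding-efficiency loss); this threshold is an assumed mechanism.
    """
    return [s for s in SPLIT_TYPES if classifiers[s](features) >= threshold]


def constant(probability: float) -> Classifier:
    """Stand-in for a trained model: ignores features, returns a fixed value."""
    return lambda _features: probability


# Made-up probabilities for one CU, purely for demonstration.
demo_classifiers = {
    "QT": constant(0.80),
    "BT_H": constant(0.35),
    "BT_V": constant(0.10),
    "TT_H": constant(0.20),
    "TT_V": constant(0.05),
}
cu_features = [0.0]  # placeholder for texture, coding, and context features

conservative = splits_to_evaluate(cu_features, demo_classifiers, threshold=0.15)
aggressive = splits_to_evaluate(cu_features, demo_classifiers, threshold=0.50)
print(conservative)
print(aggressive)
```

With trained models, each stand-in could be replaced by a wrapper around a binary classifier's probability output, for example the second column returned by `predict_proba` on a `lightgbm.LGBMClassifier`.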
ISSN:	1051-8215; 1558-2205
DOI:	10.1109/TCSVT.2021.3108671