Adaptive Deep Reinforcement Learning-Based In-Loop Filter for VVC
Published in: IEEE Transactions on Image Processing, Vol. 30, pp. 5439-5451
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
Summary: Deep learning-based in-loop filters have recently demonstrated great improvement in both coding efficiency and subjective quality for video coding. However, most existing deep learning-based in-loop filters develop a sophisticated model in exchange for good performance, and they apply a single network structure to all reconstructed samples, which lacks sufficient adaptiveness to varied video content and limits their performance to some extent. In contrast, this paper proposes an adaptive deep reinforcement learning-based in-loop filter (ARLF) for Versatile Video Coding (VVC). Specifically, we treat filtering as a decision-making process and employ an agent to select an appropriate network by leveraging recent advances in deep reinforcement learning. To this end, we develop a lightweight backbone and use it to design a network set $\mathcal{S}$ containing networks of different complexities. A simple but efficient agent network is then designed to predict the optimal network from $\mathcal{S}$, which makes the model adaptive to various video contents. To improve the robustness of the model, a two-stage training scheme is further proposed to train the agent and tune the network set. The coding tree unit (CTU) is treated as the basic unit for in-loop filtering, and a CTU-level control flag is applied in the sense of rate-distortion optimization (RDO). Extensive experimental results show that the ARLF approach obtains on average 2.17%, 2.65%, 2.58%, and 2.51% BD-rate savings under the all-intra, low-delay P, low-delay B, and random access configurations, respectively. Compared with other deep learning-based methods, the proposed approach achieves better performance with low computational complexity.
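For illustration only, the sketch below shows one way the CTU-level mechanism described in the summary could look in PyTorch: a lightweight agent network scores each candidate filter in a network set S, the highest-scoring filter is applied to the reconstructed CTU, and an encoder-side check can disable filtering per CTU. All module names (FilterNet, AgentNet, filter_ctu), layer sizes, and the distortion-only stand-in for RDO are assumptions, not the authors' implementation; in the paper the agent is trained with deep reinforcement learning and the CTU flag is decided by a full rate-distortion comparison.

    # Minimal sketch (assumed design, not the authors' code) of agent-driven
    # per-CTU filter selection from a network set S of varying complexities.
    import torch
    import torch.nn as nn

    class FilterNet(nn.Module):
        """Residual CNN filter; `depth` controls the model complexity."""
        def __init__(self, depth: int, channels: int = 32):
            super().__init__()
            layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
            for _ in range(depth):
                layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
            layers += [nn.Conv2d(channels, 1, 3, padding=1)]
            self.body = nn.Sequential(*layers)

        def forward(self, x):
            return x + self.body(x)  # predict a residual correction

    class AgentNet(nn.Module):
        """Small classifier that scores each candidate filter for a CTU."""
        def __init__(self, num_actions: int):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, num_actions)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))  # one score per action

    def filter_ctu(ctu, agent, network_set, original=None):
        """Filter one CTU: the agent picks a network; a distortion-only
        stand-in for RDO (rate term omitted) sets the CTU control flag."""
        with torch.no_grad():
            action = agent(ctu).argmax(dim=1).item()   # agent's decision
            filtered = network_set[action](ctu)
            if original is not None:                   # encoder side only
                if torch.mean((filtered - original) ** 2) >= torch.mean((ctu - original) ** 2):
                    return ctu, action, False          # flag off: keep unfiltered CTU
            return filtered, action, True

    # Usage: a network set S with three complexities and a 128x128 luma CTU.
    S = nn.ModuleList([FilterNet(depth=d) for d in (2, 4, 8)])
    agent = AgentNet(num_actions=len(S))
    ctu = torch.rand(1, 1, 128, 128)
    out, chosen, flag = filter_ctu(ctu, agent, S, original=torch.rand(1, 1, 128, 128))
    print(f"chosen network: {chosen}, filter flag: {flag}")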
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2021.3084345