Adaptive Deep Reinforcement Learning-Based In-Loop Filter for VVC
Published in: IEEE Transactions on Image Processing, Vol. 30, pp. 5439-5451
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
Summary: Deep learning-based in-loop filters have recently demonstrated great improvement in both coding efficiency and subjective quality for video coding. However, most existing deep learning-based in-loop filters develop a sophisticated model in exchange for good performance, and they apply a single network structure to all reconstructed samples, which lacks sufficient adaptiveness to varied video content and limits their performance to some extent. In contrast, this paper proposes an adaptive deep reinforcement learning-based in-loop filter (ARLF) for Versatile Video Coding (VVC). Specifically, we treat filtering as a decision-making process and employ an agent to select an appropriate network by leveraging recent advances in deep reinforcement learning. To this end, we develop a lightweight backbone and use it to design a network set $\mathcal{S}$ containing networks of different complexities. A simple but efficient agent network is then designed to predict the optimal network from $\mathcal{S}$, which makes the model adaptive to various video contents. To improve the robustness of the model, a two-stage training scheme is further proposed to train the agent and tune the network set. The coding tree unit (CTU) is treated as the basic unit for in-loop filtering, and a CTU-level control flag is applied in the sense of rate-distortion optimization (RDO). Extensive experimental results show that the ARLF approach obtains on average 2.17%, 2.65%, 2.58%, and 2.51% BD-rate savings under the all-intra, low-delay P, low-delay B, and random access configurations, respectively. Compared with other deep learning-based methods, the proposed approach achieves better performance with low computational complexity.
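For illustration only, the sketch below shows one way the CTU-level mechanism described in the summary could look in PyTorch: a lightweight agent network scores each candidate filter in a network set S, the highest-scoring filter is applied to the reconstructed CTU, and an encoder-side check can disable filtering per CTU. All module names (FilterNet, AgentNet, filter_ctu), layer sizes, and the distortion-only stand-in for RDO are assumptions, not the authors' implementation; in the paper the agent is trained with deep reinforcement learning and the CTU flag is decided by a full rate-distortion comparison.

    # Minimal sketch (assumed design, not the authors' code) of agent-driven
    # per-CTU filter selection from a network set S of varying complexities.
    import torch
    import torch.nn as nn

    class FilterNet(nn.Module):
        """Residual CNN filter; `depth` controls the model complexity."""
        def __init__(self, depth: int, channels: int = 32):
            super().__init__()
            layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
            for _ in range(depth):
                layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
            layers += [nn.Conv2d(channels, 1, 3, padding=1)]
            self.body = nn.Sequential(*layers)

        def forward(self, x):
            return x + self.body(x)  # predict a residual correction

    class AgentNet(nn.Module):
        """Small classifier that scores each candidate filter for a CTU."""
        def __init__(self, num_actions: int):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, num_actions)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))  # one score per action

    def filter_ctu(ctu, agent, network_set, original=None):
        """Filter one CTU: the agent picks a network; a distortion-only
        stand-in for RDO (rate term omitted) sets the CTU control flag."""
        with torch.no_grad():
            action = agent(ctu).argmax(dim=1).item()   # agent's decision
            filtered = network_set[action](ctu)
            if original is not None:                   # encoder side only
                if torch.mean((filtered - original) ** 2) >= torch.mean((ctu - original) ** 2):
                    return ctu, action, False          # flag off: keep unfiltered CTU
            return filtered, action, True

    # Usage: a network set S with three complexities and a 128x128 luma CTU.
    S = nn.ModuleList([FilterNet(depth=d) for d in (2, 4, 8)])
    agent = AgentNet(num_actions=len(S))
    ctu = torch.rand(1, 1, 128, 128)
    out, chosen, flag = filter_ctu(ctu, agent, S, original=torch.rand(1, 1, 128, 128))
    print(f"chosen network: {chosen}, filter flag: {flag}")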
ISSN: 1057-7149, 1941-0042
DOI: 10.1109/TIP.2021.3084345