A slimmable framework for practical neural video compression

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 610; p. 128525
Main Authors Liu, Zhaocheng; Yang, Fei; Wang, Defa; Górriz Blanch, Marc; Murn, Luka; Wan, Shuai; Zhang, Saiping; Mrak, Marta; Herranz, Luis
Format Journal Article
Language English
Published Elsevier B.V 28.12.2024
Summary: Deep learning is being increasingly applied to image and video compression in a new paradigm known as neural video compression. While achieving impressive rate–distortion (RD) performance, neural video codecs (NVC) require heavy neural networks, which in turn have large memory and computational costs and often lack important functionalities such as variable rate. These are significant limitations to their practical application. Addressing these problems, recent slimmable image codecs can dynamically adjust their model capacity to elegantly reduce the memory and computation requirements, without harming RD performance. However, the extension to video is not straightforward due to the non-trivial interplay with complex motion estimation and compensation modules in most NVC architectures. In this paper we propose the slimmable video codec framework (SlimVC) that integrates a slimmable autoencoder and a motion-free conditional entropy model. We show that the slimming mechanism is also applicable to the more complex case of video architectures, providing SlimVC with simultaneous control of the computational cost, memory and rate, which are all important requirements in practice. We further provide detailed experimental analysis, and describe application scenarios that can benefit from slimmable video codecs.
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.128525