Video Compressed Sensing Using a Convolutional Neural Network


Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 31, No. 2, pp. 425-438
Main Authors: Shi, Wuzhen; Liu, Shaohui; Jiang, Feng; Zhao, Debin
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2021

Summary: Recently, a few image compressed sensing (CS) methods based on deep learning have been developed, achieving remarkable reconstruction quality with low computational complexity. However, these existing deep learning-based image CS methods focus on exploring intraframe correlation while ignoring interframe cues, which makes them inefficient when applied directly to video CS. In this paper, we propose a novel video CS framework based on a convolutional neural network (dubbed VCSNet) to explore both intraframe and interframe correlations. Specifically, VCSNet divides the video sequence into multiple groups of pictures (GOPs), in which the first frame is a keyframe sampled at a higher sampling ratio than the other, nonkey frames. Within a GOP, block-based framewise sampling by a convolution layer is proposed, which allows the sampling matrix to be optimized automatically during training. In the reconstruction process, a framewise initial reconstruction using a linear convolutional neural network is first presented, which effectively utilizes the intraframe correlation. Then, deep reconstruction with multilevel feature compensation is proposed, which compensates the nonkeyframes with multilevel features drawn from the keyframe, allowing the network to better explore both intraframe and interframe correlations. Extensive experiments on six benchmark videos show that VCSNet outperforms state-of-the-art video CS methods and deep learning-based image CS methods in both objective and subjective reconstruction quality.
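The block-based framewise sampling and linear initial reconstruction summarized above can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the learned convolutional sampling layer is modeled as a fixed random sampling matrix `phi` applied to each nonoverlapping block (a stride-B convolution whose kernels are the rows of `phi` computes the same measurements), the learned linear initial-reconstruction layer is stood in for by the pseudo-inverse of `phi`, and the block size `B` and the keyframe/nonkeyframe sampling ratios are illustrative choices.

```python
import numpy as np

def block_cs_sample(frame, phi, B=16):
    """Block-based framewise sampling: split the frame into B x B blocks
    and measure each block with the sampling matrix phi (m x B^2).
    Equivalent to a stride-B convolution whose kernels are rows of phi."""
    H, W = frame.shape
    blocks = (frame.reshape(H // B, B, W // B, B)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, B * B))        # (num_blocks, B*B)
    return blocks @ phi.T                      # (num_blocks, m)

def initial_reconstruction(measurements, phi, H, W, B=16):
    """Framewise initial reconstruction with a linear map per block;
    the pseudo-inverse stands in for the learned linear conv layer."""
    blocks = measurements @ np.linalg.pinv(phi).T   # (num_blocks, B*B)
    return (blocks.reshape(H // B, W // B, B, B)
                  .transpose(0, 2, 1, 3)
                  .reshape(H, W))

B = 16
rng = np.random.default_rng(0)
# Keyframe sampled at a higher ratio (0.5) than nonkeyframes (0.1);
# both ratios are illustrative assumptions, not the paper's settings.
m_key, m_nonkey = int(0.5 * B * B), int(0.1 * B * B)
phi_key = rng.standard_normal((m_key, B * B)) / B
phi_nonkey = rng.standard_normal((m_nonkey, B * B)) / B

frame_key = rng.random((64, 64))
frame_non = rng.random((64, 64))
y_key = block_cs_sample(frame_key, phi_key, B)       # (16, 128)
y_non = block_cs_sample(frame_non, phi_nonkey, B)    # (16, 25)
x0_key = initial_reconstruction(y_key, phi_key, 64, 64, B)
```

In VCSNet both `phi` and the initial-reconstruction map are convolution layers learned jointly with the deep reconstruction network; the keyframe's richer measurements then supply the multilevel features used to compensate the nonkeyframes.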
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2020.2978703