Video Compressed Sensing Using a Convolutional Neural Network


Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 31, No. 2, pp. 425-438
Main Authors: Shi, Wuzhen; Liu, Shaohui; Jiang, Feng; Zhao, Debin
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2021

Summary: Recently, a few image compressed sensing (CS) methods based on deep learning have been developed, achieving remarkable reconstruction quality with low computational complexity. However, these existing deep learning-based image CS methods focus on exploring intraframe correlation while ignoring interframe cues, which makes them inefficient when applied directly to video CS. In this paper, we propose a novel video CS framework based on a convolutional neural network (dubbed VCSNet) to explore both intraframe and interframe correlations. Specifically, VCSNet divides the video sequence into multiple groups of pictures (GOPs), in which the first frame is a keyframe sampled at a higher sampling ratio than the other, nonkey frames. Within a GOP, block-based framewise sampling by a convolution layer is proposed, which allows the sampling matrix to be optimized automatically during training. In the reconstruction process, a framewise initial reconstruction using a linear convolutional neural network is first presented, which effectively utilizes the intraframe correlation. Then, deep reconstruction with multilevel feature compensation is proposed, which compensates the nonkeyframes with multilevel features drawn from the keyframe, allowing the network to better explore both intraframe and interframe correlations. Extensive experiments on six benchmark videos show that VCSNet outperforms state-of-the-art video CS methods and deep learning-based image CS methods in both objective and subjective reconstruction quality.
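The block-based framewise sampling and linear initial reconstruction summarized above can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the learned convolutional sampling layer is modeled as a fixed random sampling matrix `phi` applied to each nonoverlapping block (a stride-B convolution whose kernels are the rows of `phi` computes the same measurements), the learned linear initial-reconstruction layer is stood in for by the pseudo-inverse of `phi`, and the block size `B` and the keyframe/nonkeyframe sampling ratios are illustrative choices.

```python
import numpy as np

def block_cs_sample(frame, phi, B=16):
    """Block-based framewise sampling: split the frame into B x B blocks
    and measure each block with the sampling matrix phi (m x B^2).
    Equivalent to a stride-B convolution whose kernels are rows of phi."""
    H, W = frame.shape
    blocks = (frame.reshape(H // B, B, W // B, B)
                   .transpose(0, 2, 1, 3)
                   .reshape(-1, B * B))        # (num_blocks, B*B)
    return blocks @ phi.T                      # (num_blocks, m)

def initial_reconstruction(measurements, phi, H, W, B=16):
    """Framewise initial reconstruction with a linear map per block;
    the pseudo-inverse stands in for the learned linear conv layer."""
    blocks = measurements @ np.linalg.pinv(phi).T   # (num_blocks, B*B)
    return (blocks.reshape(H // B, W // B, B, B)
                  .transpose(0, 2, 1, 3)
                  .reshape(H, W))

B = 16
rng = np.random.default_rng(0)
# Keyframe sampled at a higher ratio (0.5) than nonkeyframes (0.1);
# both ratios are illustrative assumptions, not the paper's settings.
m_key, m_nonkey = int(0.5 * B * B), int(0.1 * B * B)
phi_key = rng.standard_normal((m_key, B * B)) / B
phi_nonkey = rng.standard_normal((m_nonkey, B * B)) / B

frame_key = rng.random((64, 64))
frame_non = rng.random((64, 64))
y_key = block_cs_sample(frame_key, phi_key, B)       # (16, 128)
y_non = block_cs_sample(frame_non, phi_nonkey, B)    # (16, 25)
x0_key = initial_reconstruction(y_key, phi_key, 64, 64, B)
```

In VCSNet both `phi` and the initial-reconstruction map are convolution layers learned jointly with the deep reconstruction network; the keyframe's richer measurements then supply the multilevel features used to compensate the nonkeyframes.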
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2020.2978703