PFTA-Net: Progressive Feature Alignment and Temporal Attention Fusion Networks for Video Inpainting

Bibliographic Details
Published in	2023 IEEE International Conference on Image Processing (ICIP), pp. 191-195
Main Authors	Zhang, Yanni; Wu, Zhiliang; Yan, Yan
Format	Conference Proceeding
Language	English
Published	IEEE 08.10.2023
Subjects	Cameras; Image reconstruction; iterative alignment; Motion compensation; temporal attention fusion; Termination of employment; Video inpainting; Video sequences
Summary:	The goal of video inpainting is to fill in missing regions with reasonable and coherent content in a video sequence. Due to the motion of cameras and objects, the reference frame and the target frame are not aligned, and the useful information of the reference frame cannot be well utilized. Therefore, temporal alignment plays an important role in video inpainting. Some studies have attempted to divide the remote alignment into multiple sub-alignments and process them step by step, but error accumulation is inevitable. In this paper, we present a novel progressive feature alignment and temporal attention fusion network, namely PFTA-Net. Specifically, we design a progressive feature alignment module, which employs sub-alignments with a progressive refinement scheme, resulting in more accurate motion compensation. After alignment, we propose a temporal attention fusion module, which computes temporal attention weights for each aligned reference frame feature, resulting in modulated features for reconstructing the target frame. Our extensive evaluations, including both quantitative and qualitative assessments, demonstrate the better performance and efficacy of our video inpainting network.
DOI:	10.1109/ICIP49359.2023.10222097
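
The temporal attention fusion step described in the summary (one attention weight per aligned reference frame feature, used to modulate that feature before reconstruction) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify how the weights are computed, so the sketch assumes a per-position dot-product similarity with the target feature passed through a sigmoid, and it assumes a plain average as the fusion step. The function name `temporal_attention_fuse` and the channels-by-positions list layout are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def temporal_attention_fuse(target, aligned_refs):
    """Fuse aligned reference features using temporal attention weights.

    target:       list of C channels, each a list of N position values.
    aligned_refs: list of reference features with the same C x N shape,
                  assumed to be already aligned to the target frame.
    Returns (fused, weights): fused is C x N, weights is one list of
    N values per reference frame.

    Assumption: weight = sigmoid(channel-wise dot product with the target).
    Assumption: fusion = average of the modulated reference features.
    """
    if not aligned_refs:
        raise ValueError("at least one aligned reference feature is required")

    num_channels = len(target)
    num_positions = len(target[0])
    fused = [[0.0] * num_positions for _ in range(num_channels)]
    weights = []

    for ref in aligned_refs:
        # One attention weight per spatial position for this reference frame.
        frame_weights = [
            sigmoid(sum(ref[c][n] * target[c][n] for c in range(num_channels)))
            for n in range(num_positions)
        ]
        weights.append(frame_weights)

        # Modulate the reference feature by its weights and accumulate.
        for c in range(num_channels):
            for n in range(num_positions):
                fused[c][n] += ref[c][n] * frame_weights[n]

    # Average the modulated features over all reference frames.
    count = len(aligned_refs)
    fused = [[value / count for value in channel] for channel in fused]
    return fused, weights


if __name__ == "__main__":
    target = [[1.0, 0.0], [0.0, 1.0]]
    similar_ref = [[1.0, 0.0], [0.0, 1.0]]
    dissimilar_ref = [[-1.0, 0.0], [0.0, -1.0]]
    fused, weights = temporal_attention_fuse(target, [similar_ref, dissimilar_ref])
    print(weights)
    print(fused)
```

Under these assumptions, a reference feature that agrees with the target at a given position receives a weight closer to 1, while a conflicting one is suppressed toward 0, so poorly aligned references contribute less to the fused feature.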