Optimizing Transformer for Large-Hole Image Inpainting
| Published in | 2023 IEEE International Conference on Image Processing (ICIP), pp. 1180–1184 |
|---|---|
| Main Authors | , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 08.10.2023 |
Summary: In recent years, leveraging Convolutional Neural Networks (CNNs) to optimize Transformers (so-called hybrid models) has brought great progress to image inpainting. However, the slow growth of the CNN's effective receptive field when processing large-hole regions significantly limits overall performance. To alleviate this problem, this paper proposes a new Transformer-CNN hybrid framework (termed PUT+) that introduces the fast Fourier convolution (FFC) into the CNN-based refinement network. The framework builds on an improved Patch-based Vector Quantized Variational Auto-Encoder (P-VQVAE+): the encoder transforms the masked region into non-overlapping patch-based unquantized feature vectors, which serve as input to an Un-Quantized Transformer (UQ-Transformer); the decoder restores the masked region from the quantized features predicted by the UQ-Transformer while keeping the unmasked region unchanged. Extensive experimental results show that the proposed method outperforms the state of the art by a large margin, especially for image inpainting with large masked areas. The code is available at https://github.com/GZHU-DVL/PUTplus.
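The summary attributes the gain over a plain CNN refinement network to the fast Fourier convolution, whose spectral branch mixes channels in the frequency domain and therefore covers the whole image in a single layer, sidestepping the slow growth of a stacked-convolution receptive field. A minimal NumPy sketch of that spectral branch follows; the function name and weight layout are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def spectral_transform(x, w):
    """Spectral branch of a fast Fourier convolution (FFC), sketched in NumPy.

    x : (C, H, W) real-valued feature map
    w : (C_out, C) point-wise channel-mixing weights applied per frequency

    A 1x1 mixing in the Fourier domain touches every spatial location at
    once, so the effective receptive field is the entire feature map.
    """
    f = np.fft.rfft2(x, axes=(-2, -1))        # (C, H, W//2 + 1), complex
    f = np.einsum('oc,chw->ohw', w, f)        # mix channels at each frequency
    return np.fft.irfft2(f, s=x.shape[-2:], axes=(-2, -1))
```

With identity mixing weights the branch reproduces its input exactly. The actual FFC block additionally stacks real and imaginary parts, runs a small conv–BN–ReLU unit in the frequency domain, and sums the result with a local convolutional branch; those details are omitted here for brevity.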
DOI: 10.1109/ICIP49359.2023.10222218