UC-former: A multi-scale image deraining network using enhanced transformer

Bibliographic Details
Published in: Computer Vision and Image Understanding, Vol. 248, p. 104097
Main Authors: Zhou, Weina; Ye, Linhui
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.11.2024
ISSN: 1077-3142
DOI: 10.1016/j.cviu.2024.104097

Summary: While convolutional neural networks (CNNs) have achieved remarkable performance in single-image deraining, the task remains challenging due to CNNs' limited receptive field and the lack of realism in their output images. This paper presents UC-former, an effective and efficient U-shaped, transformer-based architecture for image deraining. UC-former has two core designs that avoid heavy self-attention computation and inefficient communication between the encoder and decoder. First, a novel channel-across Transformer block computes self-attention between channels, significantly reducing computational complexity on high-resolution rain maps while still capturing global context. Second, a multi-scale feature fusion module between the encoder and decoder combines low-level local features with high-level non-local features. In addition, depth-wise convolution and the H-Swish non-linear activation function are employed in the Transformer blocks to enhance the realism of rain removal. Extensive experiments indicate that the method outperforms state-of-the-art deraining approaches on synthetic and real-world rainy datasets.

Highlights:
• Proposes an effective U-shaped transformer framework for single-image deraining.
• A multi-scale feature fusion module strengthens the connection between transformer stages.
• The enhanced transformer captures cross-channel connections and controls feature transmission.
• Experiments demonstrate the method's effectiveness on synthetic and real-world datasets.
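The key complexity claim in the summary (self-attention between channels rather than between spatial positions) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the L2 normalization, temperature, and omission of learned projections are assumptions made here for brevity. With C channels over N = H×W positions, the attention map is C×C, so cost scales as O(C²·N) instead of the O(N²·C) of spatial self-attention.

```python
import numpy as np

def channel_self_attention(x, temperature=1.0):
    """Sketch of channel-wise self-attention on a feature map.

    x: array of shape (C, N), where N = H * W flattened spatial positions.
    Attention is computed between channels, yielding a (C, C) map.
    """
    # L2-normalize each channel so dot products act like cosine similarity
    q = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    k = q
    attn = (q @ k.T) / temperature                 # (C, C) channel-affinity map
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # row-wise softmax
    return attn @ x                                # (C, N) re-weighted features

def h_swish(x):
    """H-Swish activation: x * ReLU6(x + 3) / 6."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

# Toy example: 8 channels over a 4x4 "rain map"
feat = np.random.randn(8, 16)
out = h_swish(channel_self_attention(feat))
print(out.shape)  # (8, 16)
```

The sketch omits the depth-wise convolutions and learned query/key/value projections the summary mentions; it only shows why the attention map stays small when computed across channels.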