DesnowFormer: an effective transformer-based image desnowing network

Bibliographic Details
Published in	2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), pp. 1-5
Main Authors	Zhang, Ting; Jiang, Nanfeng; Lin, Junhong; Lin, Jielian; Zhao, Tiesong
Format	Conference Proceeding
Language	English
Published	IEEE 13.12.2022
Subjects	Computer vision; Image Desnowing; Learning systems; Residual Spatial Attention; Semantics; Training; Transformer block; Transformers; Visual communication; Visualization
Summary:	Single image desnowing is an important and challenging task for many computer vision applications, such as visual tracking and video surveillance. Although existing deep learning-based methods have achieved promising results, most of them rely on local deep features and neglect the global relationships between local regions, which inevitably leads to over-smoothed results or loss of detail. To solve this issue, we design a UNet-based end-to-end architecture for image desnowing. Specifically, to better characterize global information and preserve image detail, we combine a Window-based Self-Attention (WSA) transformer block with Residue Spatial Attention (RSA) to build the basic unit of our network. In addition, to protect the structure of the image effectively, we introduce a Residue Channel (RC) loss to guide high-quality image restoration. Extensive experimental results on both synthetic and real-world datasets demonstrate that the proposed model achieves new state-of-the-art results.
ISSN:	2642-9357
DOI:	10.1109/VCIP56404.2022.10008815
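
Illustrative note: the Summary mentions Window-based Self-Attention (WSA) but this record gives no implementation details. The sketch below shows only the general idea of window-based attention, in which a feature map is split into non-overlapping windows and attention is computed inside each window separately. It is not the authors' code. The function names, the window size, and the use of identity query/key/value projections are assumptions made for illustration; a real transformer block would use learned projections and multiple heads.

```python
import math


def window_partition(grid, win):
    """Split an H x W grid (a list of rows) into non-overlapping win x win windows.

    Each window is returned as a flat list of its entries in row-major order.
    """
    h, w = len(grid), len(grid[0])
    if h % win or w % win:
        raise ValueError("height and width must be divisible by the window size")
    windows = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            windows.append([grid[r][c]
                            for r in range(top, top + win)
                            for c in range(left, left + win)])
    return windows


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption for brevity: queries, keys, and values are the tokens themselves
    (identity projections), so no learned weights are involved.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(wt * v[i] for wt, v in zip(weights, tokens))
                    for i in range(d)])
    return out


def window_self_attention(grid, win):
    """Apply self-attention independently inside each win x win window."""
    return [self_attention(window) for window in window_partition(grid, win)]
```

Because attention is restricted to each window, the cost grows with the number of windows times the square of the window area, rather than with the square of the full image area. For example, window_self_attention(features, 2) on a 4 x 4 grid of feature vectors returns four windows of four attended tokens each.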