Learning A Sparse Transformer Network for Effective Image Deraining

Bibliographic Details
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 5896-5905
Main Authors Chen, Xiang; Li, Hao; Li, Mingqiang; Pan, Jinshan
Format Conference Proceeding
Language English
Published IEEE 01.06.2023
Subjects
Online Access Get full text
ISSN 1063-6919
DOI 10.1109/CVPR52729.2023.00571


Abstract Transformer-based methods have achieved significant performance in image deraining, as they can model the non-local information that is vital for high-quality image reconstruction. In this paper, we find that most existing Transformers usually use all similarities of the tokens from the query-key pairs for feature aggregation. However, if the tokens from the query differ from those of the key, the self-attention values estimated from these tokens are also involved in feature aggregation, which accordingly interferes with clear image restoration. To overcome this problem, we propose an effective DeRaining network, the Sparse Transformer (DRSformer), that can adaptively keep the most useful self-attention values for feature aggregation so that the aggregated features better facilitate high-quality image reconstruction. Specifically, we develop a learnable top-k selection operator to adaptively retain the most crucial attention scores from the keys for each query for better feature aggregation. Simultaneously, as the naive feed-forward network in Transformers does not model the multi-scale information that is important for latent clear image restoration, we develop an effective mixed-scale feed-forward network to generate better features for image deraining. To learn an enriched set of hybrid features that combines local context from CNN operators, we equip our model with a mixture-of-experts feature compensator to present a cooperative refinement deraining scheme. Extensive experimental results on commonly used benchmarks demonstrate that the proposed method achieves favorable performance against state-of-the-art approaches. The source code and trained models are available at https://github.com/cschenxiang/DRSformer.
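The core idea in the abstract — keeping only the most useful self-attention values per query — can be sketched as a top-k masking step before the softmax. This is a simplified NumPy illustration, not the paper's implementation: DRSformer learns the selection adaptively inside a Transformer block, whereas here `topk` is a fixed parameter and the function names are my own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; exp(-inf) -> 0, so masked scores get zero weight.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, topk):
    """Attention that keeps only the top-k similarity scores per query.

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v). Scores outside the top-k
    are set to -inf so they contribute nothing after the softmax.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (n_q, n_k) query-key similarities
    # Column indices of the k largest scores in each row.
    idx = np.argpartition(scores, -topk, axis=-1)[:, -topk:]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, idx, 0.0, axis=-1)         # 0 where kept, -inf where dropped
    attn = softmax(scores + mask, axis=-1)             # sparse attention weights
    return attn @ v
```

With `topk` equal to the number of keys this reduces to ordinary dense attention, which makes the sparsification easy to sanity-check.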
Author Pan, Jinshan
Li, Mingqiang
Li, Hao
Chen, Xiang
Author_xml – sequence: 1
  givenname: Xiang
  surname: Chen
  fullname: Chen, Xiang
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
– sequence: 2
  givenname: Hao
  surname: Li
  fullname: Li, Hao
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
– sequence: 3
  givenname: Mingqiang
  surname: Li
  fullname: Li, Mingqiang
  organization: Information Science Academy, China Electronics Technology Group Corporation
– sequence: 4
  givenname: Jinshan
  surname: Pan
  fullname: Pan, Jinshan
  organization: School of Computer Science and Engineering, Nanjing University of Science and Technology
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR52729.2023.00571
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
EISBN 9798350301298
EISSN 1063-6919
EndPage 5905
ExternalDocumentID 10204775
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: U22B2049,U19B2040,61922043,61872421,62272230
  funderid: 10.13039/501100001809
– fundername: National Key R&D Program of China
  grantid: 2018AAA0102001
  funderid: 10.13039/501100012166
– fundername: Fundamental Research Funds for the Central Universities
  grantid: 30920041109
  funderid: 10.13039/501100012226
GroupedDBID 6IE
6IH
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
OCL
RIE
RIL
RIO
IEDL.DBID RIE
IngestDate Wed Aug 27 02:56:29 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 10
ParticipantIDs ieee_primary_10204775
PublicationCentury 2000
PublicationDate 2023-June
PublicationDateYYYYMMDD 2023-06-01
PublicationDate_xml – month: 06
  year: 2023
  text: 2023-June
PublicationDecade 2020
PublicationTitle Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online)
PublicationTitleAbbrev CVPR
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003211698
Score 2.6593854
SourceID ieee
SourceType Publisher
StartPage 5896
SubjectTerms Adaptation models
Benchmark testing
Computational modeling
Computer vision
Image restoration
Low-level vision
Source coding
Transformers
Title Learning A Sparse Transformer Network for Effective Image Deraining
URI https://ieeexplore.ieee.org/document/10204775
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE