Infrared and Visible Image Fusion with Overlapped Window Transformer


Bibliographic Details
Published in: Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 29, No. 4, pp. 838-846
Main Authors: Liu, Xingwang; Mersha, Bemnet Wondimagegnehu; Hirota, Kaoru; Dai, Yaping
Format: Journal Article
Language: English
Published: Tokyo, Fuji Technology Press Co. Ltd, 20.07.2025

Summary: An overlapped window-based transformer is proposed for infrared and visible image fusion. A multi-head self-attention mechanism based on overlapping windows is designed: by introducing overlapping regions between windows, local features can interact across different windows, avoiding the discontinuity and information-isolation issues caused by non-overlapping partitions. The proposed model is trained with an unsupervised loss function composed of three terms: pixel loss, gradient loss, and structural loss. With the end-to-end model and the unsupervised loss function, the method eliminates the need to manually design complex activity-level measurements and fusion strategies. Extensive experiments on the public TNO (grayscale) and RoadScene (RGB) datasets demonstrate that the proposed method achieves the intended long-distance dependency modeling when fusing infrared and visible images, and yields positive results in both qualitative and quantitative evaluations.
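The core idea of the summary, self-attention computed inside windows that overlap so neighboring windows share tokens, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it uses a single head, omits projections and positional terms, and the function name, window size, and stride are all assumptions chosen for clarity. Setting the stride smaller than the window size is what creates the overlap; the overlapping outputs are averaged back onto the feature map.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def overlapped_window_attention(feat, win=4, stride=2):
    """Single-head self-attention within overlapping windows of an
    (H, W, C) feature map. With stride < win, adjacent windows share
    pixels, so local features interact across window borders instead of
    being isolated by a hard non-overlapping partition."""
    H, W, C = feat.shape
    out = np.zeros_like(feat)
    count = np.zeros((H, W, 1))          # how many windows cover each pixel
    scale = 1.0 / np.sqrt(C)
    for i in range(0, H - win + 1, stride):
        for j in range(0, W - win + 1, stride):
            tokens = feat[i:i+win, j:j+win].reshape(-1, C)  # win*win tokens
            attn = softmax(tokens @ tokens.T * scale)       # token-token weights
            mixed = (attn @ tokens).reshape(win, win, C)
            out[i:i+win, j:j+win] += mixed                  # accumulate overlaps
            count[i:i+win, j:j+win] += 1
    return out / np.maximum(count, 1)                       # average overlaps

x = np.random.rand(8, 8, 16)
y = overlapped_window_attention(x)      # same shape as the input feature map
```

Because each output token is a convex combination of input tokens and overlapping windows are averaged, the output stays within the input's value range; a learned version would add query/key/value projections and multiple heads around the same windowing loop.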
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
ISSN: 1343-0130, 1883-8014
DOI: 10.20965/jaciii.2025.p0838