MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection

Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision and robustness of object detection under diverse environmental conditions. Although this approach has improved detecti...

Bibliographic Details
Published in	Digital signal processing Vol. 159; p. 104996
Main Authors	Zheng, Shangpo; Junfeng, Liu; Zeng, Jun
Format	Journal Article
Language	English
Published	Elsevier Inc 01.04.2025
Subjects	Attention mechanism; Cross-modality; Multimodal adaptive feature fusion; Multispectral object detection; Transformer
Abstract	Multispectral object detection techniques integrate data from various spectral modalities, such as combining thermal images with RGB visible light images, to enhance the precision and robustness of object detection under diverse environmental conditions. Although this approach has improved detection capabilities, significant challenges remain in fully leveraging the detail information specific to each modality and in accurately capturing the feature information shared across modalities. To address these challenges, we propose a Multiscale Cross-modality Adaptive Fusion Network (MCAFNet). This network incorporates a Cross-modality Interactive Transformer (CMIT) module, a Multimodal Adaptive Weighted Fusion (MAWF) module, and a 3D-Integrated Attention Feature Enhancement (3D-IAFE) module. These components work together to comprehensively extract complementary features between modalities and specific detailed features within each modality, thereby enhancing the accuracy and robustness of multimodal object detection. Extensive experimental validation and in-depth ablation studies confirm the effectiveness of the proposed method, which achieves state-of-the-art detection performance on multiple public datasets.
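The abstract names an adaptive weighted fusion step but this record gives no details of it. The following is only a minimal sketch of the general idea of data-dependent fusion weights, not the MAWF module from the paper. It assumes two same-sized 2D feature maps and derives the weights from a softmax over each map's mean activation; a real network would learn these weights. All function names here are hypothetical.

```python
import math


def global_mean(feature_map):
    """Average all values of a 2D feature map given as a list of rows."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_weighted_fusion(rgb, thermal):
    """Fuse two same-sized 2D feature maps with data-dependent weights.

    Assumption for illustration: the weights are a softmax over each
    map's global mean activation, standing in for learned layers.
    """
    w_rgb, w_thermal = softmax([global_mean(rgb), global_mean(thermal)])
    fused = [
        [w_rgb * r + w_thermal * t for r, t in zip(rgb_row, thermal_row)]
        for rgb_row, thermal_row in zip(rgb, thermal)
    ]
    return fused, (w_rgb, w_thermal)


# Toy inputs: a weak visible-light response and a strong thermal response.
rgb = [[0.1, 0.2], [0.1, 0.2]]
thermal = [[0.9, 1.1], [1.0, 1.0]]
fused, weights = adaptive_weighted_fusion(rgb, thermal)
print(weights)
```

In this toy case the thermal map receives the larger weight because its mean activation is higher, which mirrors the motivation of relying more on the modality that carries more signal under the current conditions.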
ArticleNumber	104996
Author	Junfeng, Liu; Zeng, Jun; Zheng, Shangpo
Author_xml	1. Zheng, Shangpo (School of Automation Science and Engineering, South China University of Technology Science and Engineering, Guangzhou 510641, PR China); 2. Junfeng, Liu (School of Automation Science and Engineering, South China University of Technology Science and Engineering, Guangzhou 510641, PR China); 3. Zeng, Jun (School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, PR China; email: junzeng@scut.edu.cn)
ContentType	Journal Article
Copyright	2025
DOI	10.1016/j.dsp.2025.104996
Discipline	Engineering
ISSN	1051-2004
IsPeerReviewed	true
IsScholarly	true
Keywords	multimodal adaptive feature fusion; transformer; attention mechanism; multispectral object detection; cross-modality
PublicationDate	April 2025 (2025-04-01)
doi: 10.1109/TCSVT.2021.3054584 – start-page: 11534 year: 2020 ident: 10.1016/j.dsp.2025.104996_bib0047 article-title: ECANet: Efficient channel attention for deep convolutional neural networks – start-page: 209 year: 2014 ident: 10.1016/j.dsp.2025.104996_bib0004 article-title: Low resolution person detection with a moving thermal infrared camera by hot spot classification – volume: 39 start-page: 1137 issue: 6 year: 2017 ident: 10.1016/j.dsp.2025.104996_bib0019 article-title: Faster R-CNN: Towards real-time object detection with region proposal networks publication-title: IEEE Trans. Pattern Anal. Mach. Intell. doi: 10.1109/TPAMI.2016.2577031 – start-page: 11863 year: 2021 ident: 10.1016/j.dsp.2025.104996_bib0048 article-title: SimAM: a simple, parameter-free attention module for convolutional neural networks – volume: 21 start-page: 4184 issue: 12 year: 2021 ident: 10.1016/j.dsp.2025.104996_bib0012 article-title: Attention fusion for one-stage multispectral pedestrian detection publication-title: Sensors doi: 10.3390/s21124184 – year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0067 article-title: Multimodal object detection via probabilistic ensembling – volume: 27 start-page: 1132 issue: 5 year: 2017 ident: 10.1016/j.dsp.2025.104996_bib0002 article-title: Camera self-calibration based on nonlinear optimization and applications in surveillance systems publication-title: IEEE Trans. Circuits Syst. Video Technol. 
doi: 10.1109/TCSVT.2015.2511812 – start-page: 787 year: 2020 ident: 10.1016/j.dsp.2025.104996_bib0021 article-title: Improving multispectral pedestrian detection by addressing modality imbalance problems – start-page: 618 year: 2017 ident: 10.1016/j.dsp.2025.104996_bib0063 article-title: Grad-CAM: visual explanations from deep networks via gradient-based localization – start-page: 770 year: 2016 ident: 10.1016/j.dsp.2025.104996_bib0017 article-title: Deep residual learning for image recognition – volume: 13534 year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0028 article-title: Attention-guided multi-modal and multi-scale fusion for multispectral pedestrian detection – ident: 10.1016/j.dsp.2025.104996_bib0029 – start-page: 1 year: 2020 ident: 10.1016/j.dsp.2025.104996_bib0057 article-title: Multispectral fusion for object detection with cyclic fuse-and-refine blocks – volume: 32 start-page: 3360 issue: 6 year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0007 article-title: UNFusion: A unified multi-scale densely connected network for infrared and visible image fusion publication-title: IEEE Trans. Circuits Syst. Video Technol. 
doi: 10.1109/TCSVT.2021.3109895 – start-page: 9992 year: 2021 ident: 10.1016/j.dsp.2025.104996_bib0042 article-title: Swin transformer: hierarchical vision transformer usingshifted windows – start-page: 21 year: 2019 ident: 10.1016/j.dsp.2025.104996_bib0018 article-title: SSD: Single shot multibox detector – year: 2016 ident: 10.1016/j.dsp.2025.104996_bib0035 article-title: Multispectral deep neural networks for pedestrian detection – start-page: 449 year: 2021 ident: 10.1016/j.dsp.2025.104996_bib0056 article-title: Deep active learningfrom multispectral data through cross-modality prediction inconsistency – start-page: 3 year: 2018 ident: 10.1016/j.dsp.2025.104996_bib0046 article-title: CBAM: Convolutional block attention module – year: 2020 ident: 10.1016/j.dsp.2025.104996_bib0061 article-title: Vehicle detection from multi-modal aerial imagery using YOLOv3 with mid-level fusion – volume: 32 start-page: 6700 issue: 10 year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0060 article-title: Drone-based RGB-infrared cross-modality vehicle detection via uncertainty-aware learning publication-title: IEEE Trans. Circuits Syst. Video Technol. doi: 10.1109/TCSVT.2022.3168279 – ident: 10.1016/j.dsp.2025.104996_bib0039 – year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0006 article-title: Cross-modality attention and multimodal fusion transformer for pedestrian detection – volume: 50 start-page: 20 year: 2019 ident: 10.1016/j.dsp.2025.104996_bib0036 article-title: Cross-modality interactive attention network for multispectral pedestrian detection publication-title: Inf. Fusion. 
doi: 10.1016/j.inffus.2018.09.015 – volume: 130 year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0062 article-title: Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery publication-title: Pattern Recognit doi: 10.1016/j.patcog.2022.108786 – start-page: 72 year: 2021 ident: 10.1016/j.dsp.2025.104996_bib0058 article-title: Guided attentive feature fusion for multispectral pedestrian detection – volume: 53 start-page: 1679 issue: 6 year: 2004 ident: 10.1016/j.dsp.2025.104996_bib0032 article-title: A shape-independent method for pedestrian detection with far-infrared images publication-title: IEEE Trans. Veh. Technol. doi: 10.1109/TVT.2004.834875 – volume: 32 start-page: 105 issue: 1 year: 2022 ident: 10.1016/j.dsp.2025.104996_bib0008 article-title: Learning a deep multiscale feature ensemble and an edge-attention guidance for image fusion publication-title: IEEE Trans. Circuits Syst. Video Technol. doi: 10.1109/TCSVT.2021.3056725
StartPage	104996
SubjectTerms	Attention mechanism; cross-modality; multimodal adaptive feature fusion; multispectral object detection; transformer
Title	MCAFNet: Multiscale cross-modality adaptive fusion network for multispectral object detection
URI	https://dx.doi.org/10.1016/j.dsp.2025.104996
Volume	159