Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery

• We propose a simple yet effective CMAFF module that can fuse the complementary information of multispectral remote sensing images with joint common-modality and differential-modality attentions. • We confirm the effectiveness of our cross-modality fusion attention module through extensive ablation st...

Bibliographic Details
Published in	Pattern recognition Vol. 130; p. 108786
Main Authors	Qingyun, Fang; Zhaokui, Wang
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.10.2022
Subjects	Attention; Cross-modality; Feature fusion; Multispectral remote sensing imagery; Object detection
ISSN	0031-3203 1873-5142
DOI	10.1016/j.patcog.2022.108786

Abstract	• We propose a simple yet effective CMAFF module that fuses the complementary information of multispectral remote sensing images with joint common-modality and differential-modality attentions. • We confirm the effectiveness of our cross-modality fusion attention module through extensive ablation studies. • We design a new two-stream object detection network, YOLOFusion, for multispectral remote sensing images and verify its performance. Cross-modality fusion of the complementary information in multispectral remote sensing image pairs can improve the perception ability of detection algorithms, making them more robust and reliable for a wider range of applications, such as nighttime detection. In contrast to prior methods, we argue that different features should be processed specifically: the modality-specific features should be retained and enhanced, while the modality-shared features should be cherry-picked from the RGB and thermal IR modalities. Following this idea, we propose a novel and lightweight multispectral feature fusion approach with joint common-modality and differential-modality attentions, named Cross-Modality Attentive Feature Fusion (CMAFF). Given the intermediate feature maps of RGB and thermal images, our module infers attention maps in parallel from two separate modalities, the common modality and the differential modality; the attention maps are then multiplied with the input feature maps, respectively, for adaptive feature enhancement or selection. Extensive experiments demonstrate that the proposed approach achieves state-of-the-art performance at a low computational cost.
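The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' CMAFF implementation: the record does not specify how the attention maps are computed or combined, so the decomposition into `common` and `diff`, the pooling-plus-sigmoid `channel_attention`, and the final combination in `fuse` are all assumptions chosen only to show the general pattern of inferring two attention maps in parallel and multiplying them with the input features.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def split_modalities(rgb, ir):
    """Decompose two feature maps into shared and modality-specific parts.

    Assumption for illustration: common = (rgb + ir) / 2, diff = (rgb - ir) / 2,
    so that rgb = common + diff and ir = common - diff.
    """
    common = (rgb + ir) / 2.0
    diff = (rgb - ir) / 2.0
    return common, diff


def channel_attention(feat):
    """Hypothetical channel attention for a (C, H, W) feature map.

    Global average pooling per channel followed by a sigmoid.
    Returns weights of shape (C, 1, 1) that broadcast over H and W.
    """
    pooled = feat.mean(axis=(1, 2))
    return sigmoid(pooled)[:, None, None]


def fuse(rgb, ir):
    """Illustrative fusion of RGB and thermal feature maps of shape (C, H, W)."""
    common, diff = split_modalities(rgb, ir)
    # The two attention maps are inferred independently (in parallel).
    w_diff = channel_attention(diff)      # drives feature enhancement
    w_common = channel_attention(common)  # drives feature selection
    # Enhancement: residual re-weighting keeps the original features.
    rgb_enh = rgb * (1.0 + w_diff)
    ir_enh = ir * (1.0 + w_diff)
    # Selection: weight the combined features by the common-modality attention.
    return w_common * (rgb_enh + ir_enh)


rng = np.random.default_rng(0)
rgb_feat = rng.normal(size=(4, 8, 8))
ir_feat = rng.normal(size=(4, 8, 8))
fused = fuse(rgb_feat, ir_feat)  # same shape as the inputs: (4, 8, 8)
```

The residual form `1.0 + w_diff` is one common way to enhance features without suppressing the original signal; a real detector would learn the attention weights with convolutional or fully connected layers rather than use a parameter-free sigmoid.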
ArticleNumber	108786
Author	Zhaokui, Wang Qingyun, Fang
Author_xml	– sequence: 1 givenname: Fang surname: Qingyun fullname: Qingyun, Fang email: fqy17@mails.tsinghua.edu.cn – sequence: 2 givenname: Wang surname: Zhaokui fullname: Zhaokui, Wang email: wangzk@tsinghua.edu.cn
ContentType	Journal Article
Copyright	2022 Elsevier Ltd
Discipline	Computer Science
IsPeerReviewed	true
IsScholarly	true
doi: 10.1007/s11263-009-0275-4 – ident: 10.1016/j.patcog.2022.108786_bib0048 – volume: 111 start-page: 107635 year: 2021 ident: 10.1016/j.patcog.2022.108786_bib0024 article-title: Hyperspectral remote sensing image classification based on tighter random projection with minimal intra-class variance algorithm publication-title: Pattern Recognit. doi: 10.1016/j.patcog.2020.107635 – ident: 10.1016/j.patcog.2022.108786_bib0044 – start-page: 787 year: 2020 ident: 10.1016/j.patcog.2022.108786_bib0052 article-title: Improving multispectral pedestrian detection by addressing modality imbalance problems – volume: 46 start-page: 1078 issue: 3 year: 2013 ident: 10.1016/j.patcog.2022.108786_bib0001 article-title: T-HOG: an effective gradient-based descriptor for single line text regions publication-title: Pattern Recognit. doi: 10.1016/j.patcog.2012.10.009 – volume: 50 start-page: 20 year: 2019 ident: 10.1016/j.patcog.2022.108786_bib0047 article-title: Cross-modality interactive attention network for multispectral pedestrian detection publication-title: Inf. Fusion doi: 10.1016/j.inffus.2018.09.015 – ident: 10.1016/j.patcog.2022.108786_bib0038 – volume: vol. 32 start-page: 8026 year: 2019 ident: 10.1016/j.patcog.2022.108786_bib0054 article-title: Pytorch: an imperative style, high-performance deep learning library – start-page: 886 year: 2005 ident: 10.1016/j.patcog.2022.108786_bib0002 article-title: Histograms of oriented gradients for human detection – start-page: 21 year: 2016 ident: 10.1016/j.patcog.2022.108786_bib0018 article-title: SSD: single shot multibox detector – ident: 10.1016/j.patcog.2022.108786_bib0055 – start-page: 10778 year: 2020 ident: 10.1016/j.patcog.2022.108786_bib0019 article-title: Efficientdet: Scalable and efficient object detection
StartPage	108786
SubjectTerms	Attention Cross-modality Feature fusion Multispectral remote sensing imagery Object detection
Title	Cross-modality attentive feature fusion for object detection in multispectral remote sensing imagery
URI	https://dx.doi.org/10.1016/j.patcog.2022.108786
Volume	130