SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images
Most recent unmanned aerial vehicle (UAV) detectors focus primarily on general challenges such as uneven distribution and occlusion. However, the neglect of scale challenges, which encompass scale variation and small objects, continues to hinder object detection in UAV images. Although existing work...
Published in | IEEE transactions on geoscience and remote sensing Vol. 62; pp. 1 - 19 |
---|---|
Main Authors | Li, Xuexue; Diao, Wenhui; Mao, Yongqiang; Li, Xinming; Sun, Xian |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2024 |
Abstract | Most recent unmanned aerial vehicle (UAV) detectors focus primarily on general challenges such as uneven distribution and occlusion. However, the neglect of scale challenges, which encompass scale variation and small objects, continues to hinder object detection in UAV images. Although existing works propose solutions, they model the problem implicitly and include redundant steps, so detection performance remains limited. A dedicated treatment of these scale challenges can help improve the performance of UAV image detectors. Compared with natural scenes, scale challenges in UAV images manifest as limited perception across the full range of scales and poor robustness to small objects. We found that complementary learning helps the detection model address the scale challenges. Therefore, the article combines it with the object detection model to form our scale-robust complementary learning network (SCLNet). SCLNet consists of two implementations and a cooperation method. In detail, one implementation, named comprehensive-scale complementary learning (CSCL), is based on our proposed scale-complementary decoder and scale-complementary loss function and explicitly extracts complementary information. The other implementation, named interscale contrastive complementary learning (ICCL), is based on our proposed contrastive complement network and contrastive complement loss function and explicitly guides the learning of small objects with the rich texture detail of large objects. In addition, an end-to-end cooperation (ECoop) between the two implementations and the detection model is proposed to exploit the potential of each. In short, SCLNet forms a more comprehensive representation through feature complementarity and improves the representation of small objects through interscale contrast, which in turn improves scale robustness and detection performance.
Thorough experiments on the VisDrone and UAVDT datasets demonstrate the effectiveness of SCLNet: its novel components are effective, and the model is competitive with many CNN-based and transformer-based methods. Overall, SCLNet effectively addresses scale challenges and is a competitive model for UAV image object detection. |
---|---|
Author | Mao, Yongqiang Diao, Wenhui Li, Xuexue Sun, Xian Li, Xinming |
Author_xml | – sequence: 1 givenname: Xuexue orcidid: 0000-0002-0177-7001 surname: Li fullname: Li, Xuexue organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China – sequence: 2 givenname: Wenhui orcidid: 0000-0002-3931-3974 surname: Diao fullname: Diao, Wenhui email: diaowh@aircas.ac.cn organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China – sequence: 3 givenname: Yongqiang orcidid: 0000-0001-9256-3668 surname: Mao fullname: Mao, Yongqiang email: lixuexue20@mails.ucas.ac.cn organization: Department of Electronic Engineering, Tsinghua University, Beijing, China – sequence: 4 givenname: Xinming surname: Li fullname: Li, Xinming organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China – sequence: 5 givenname: Xian orcidid: 0000-0002-0038-9816 surname: Sun fullname: Sun, Xian organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China |
CODEN | IGRSD2 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
DOI | 10.1109/TGRS.2024.3505425 |
DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Water Resources Abstracts Technology Research Database Environmental Sciences and Pollution Management ASFA: Aquatic Sciences and Fisheries Abstracts Engineering Research Database Aerospace Database Aquatic Science & Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy & Non-Living Resources Civil Engineering Abstracts Aquatic Science & Fisheries Abstracts (ASFA) Professional Advanced Technologies Database with Aerospace |
Discipline | Engineering Physics |
EISSN | 1558-0644 |
EndPage | 19 |
ExternalDocumentID | 10_1109_TGRS_2024_3505425 10766668 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Key Research and Development Program of China grantid: 2022ZD0118402 funderid: 10.13039/501100012166 |
ISSN | 0196-2892 |
IsPeerReviewed | true |
IsScholarly | true |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0001-9256-3668 0000-0002-0177-7001 0000-0002-3931-3974 0000-0002-0038-9816 |
PageCount | 19 |
PublicationCentury | 2000 |
PublicationDate | 20240000 2024-00-00 20240101 |
PublicationDateYYYYMMDD | 2024-01-01 |
PublicationDate_xml | – year: 2024 text: 20240000 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York |
PublicationTitle | IEEE transactions on geoscience and remote sensing |
PublicationTitleAbbrev | TGRS |
PublicationYear | 2024 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 1 |
SubjectTerms | Adaptation models; Autonomous aerial vehicles; Complement; Complementary learning; Computational modeling; Cooperation; Decoding; Detectors; Effectiveness; Feature extraction; Geoscience and remote sensing; Information processing; Learning; Object detection; Object recognition; Occlusion; Performance enhancement; Representations; Robustness; Robustness (mathematics); scale challenges; scale variation; Semantics; small objects; Training; Unmanned aerial vehicles |
Title | SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images |
URI | https://ieeexplore.ieee.org/document/10766668 https://www.proquest.com/docview/3143028731 |
Volume | 62 |