SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images

Bibliographic Details
Published in IEEE Transactions on Geoscience and Remote Sensing, Vol. 62, pp. 1-19
Main Authors Li, Xuexue; Diao, Wenhui; Mao, Yongqiang; Li, Xinming; Sun, Xian
Format Journal Article
Language English
Published New York IEEE 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Most recent unmanned aerial vehicle (UAV) detectors focus primarily on general challenges such as uneven distribution and occlusion. However, the neglect of scale challenges, which encompass scale variation and small objects, continues to hinder object detection in UAV images. Although existing works propose solutions, they rely on implicit modeling and include redundant steps, so detection performance remains limited. A method dedicated to these scale challenges can help improve the performance of UAV image detectors. Compared with natural scenes, scale challenges in UAV images manifest as limited perception across the full range of scales and poor robustness to small objects. We found that complementary learning helps a detection model address these scale challenges. Therefore, this article combines complementary learning with an object detection model to form our scale-robust complementary learning network (SCLNet). SCLNet consists of two implementations and a cooperation method. The first implementation, named comprehensive-scale complementary learning (CSCL), is based on our proposed scale-complementary decoder and scale-complementary loss function and explicitly extracts complementary information. The second implementation, named interscale contrastive complementary learning (ICCL), is based on our proposed contrastive complement network and contrastive complement loss function and explicitly guides the learning of small objects with the rich texture detail of large objects. In addition, an end-to-end cooperation (ECoop) between the two implementations and the detection model is proposed to exploit the potential of each. In short, SCLNet forms a more comprehensive representation through feature complementation and improves the representation of small objects through interscale contrast, which in turn improves scale robustness and detection performance. Thorough experiments on the VisDrone and UAVDT datasets demonstrate the effectiveness of SCLNet: its novel components are effective, and it is competitive with many CNN-based and transformer-based methods. In general, SCLNet effectively addresses scale challenges and is a competitive model for UAV image object detection.
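The abstract does not give the form of the contrastive complement loss, so the following is only a minimal sketch of the general idea behind interscale contrast, not the authors' formulation. It assumes a generic InfoNCE-style objective in which each small-object feature is pulled toward large-object features of the same class and pushed away from those of other classes. The function names, the cosine similarity measure, and the `temperature` parameter are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def interscale_contrastive_loss(small_feats, small_labels,
                                large_feats, large_labels,
                                temperature=0.1):
    """Hypothetical InfoNCE-style loss (an assumption, not the paper's loss).

    Each small-object feature is an anchor; large-object features with the
    same class label are positives and all other large-object features are
    negatives. Anchors with no same-class large object are skipped.
    """
    losses = []
    for feat, label in zip(small_feats, small_labels):
        sims = [math.exp(cosine(feat, g) / temperature) for g in large_feats]
        positives = [s for s, lab in zip(sims, large_labels) if lab == label]
        if not positives:
            continue  # no large object of this class to learn from
        losses.append(-math.log(sum(positives) / sum(sims)))
    # Mean over anchors that had at least one positive; 0.0 if none did.
    return sum(losses) / len(losses) if losses else 0.0
```

Under these assumptions the loss is small when a small object's feature aligns with large objects of its own class and large when it aligns with another class, which is the behavior the abstract attributes to interscale contrast.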
Author_xml – sequence: 1
  givenname: Xuexue
  orcidid: 0000-0002-0177-7001
  surname: Li
  fullname: Li, Xuexue
  organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
– sequence: 2
  givenname: Wenhui
  orcidid: 0000-0002-3931-3974
  surname: Diao
  fullname: Diao, Wenhui
  email: diaowh@aircas.ac.cn
  organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
– sequence: 3
  givenname: Yongqiang
  orcidid: 0000-0001-9256-3668
  surname: Mao
  fullname: Mao, Yongqiang
  email: lixuexue20@mails.ucas.ac.cn
  organization: Department of Electronic Engineering, Tsinghua University, Beijing, China
– sequence: 4
  givenname: Xinming
  surname: Li
  fullname: Li, Xinming
  organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
– sequence: 5
  givenname: Xian
  orcidid: 0000-0002-0038-9816
  surname: Sun
  fullname: Sun, Xian
  organization: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
CODEN IGRSD2
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DOI 10.1109/TGRS.2024.3505425
Discipline Engineering
Physics
EISSN 1558-0644
Genre orig-research
GrantInformation_xml – fundername: National Key Research and Development Program of China
  grantid: 2022ZD0118402
  funderid: 10.13039/501100012166
ISSN 0196-2892
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
SubjectTerms Adaptation models
Autonomous aerial vehicles
Complement
Complementary learning
Computational modeling
Cooperation
Decoding
Detectors
Effectiveness
Feature extraction
Geoscience and remote sensing
Information processing
Learning
Object detection
Object recognition
Occlusion
Performance enhancement
Representations
Robustness
Robustness (mathematics)
scale challenges
scale variation
Semantics
small objects
Training
Unmanned aerial vehicles
URI https://ieeexplore.ieee.org/document/10766668
https://www.proquest.com/docview/3143028731