SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images


Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, Vol. 62, pp. 1-19
Main Authors: Li, Xuexue; Diao, Wenhui; Mao, Yongqiang; Li, Xinming; Sun, Xian
Format: Journal Article
Language: English
Published: New York: IEEE, 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)

Summary: Most recent unmanned aerial vehicle (UAV) detectors focus primarily on general challenges such as uneven distribution and occlusion. However, neglect of scale challenges, namely scale variation and small objects, continues to hinder object detection in UAV images. Existing works propose solutions, but these model scale only implicitly and involve redundant steps, so detection performance remains limited. A method dedicated to these scale challenges can therefore improve UAV image detectors. Compared with natural scenes, scale challenges in UAV images manifest as limited perception across the full range of scales and poor robustness to small objects. We found that complementary learning helps a detection model address these challenges, and this article combines it with an object detection model to form the scale-robust complementary learning network (SCLNet). SCLNet consists of two implementations of complementary learning and a cooperation method. The first, comprehensive-scale complementary learning (CSCL), uses the proposed scale-complementary decoder and scale-complementary loss function to explicitly extract complementary information. The second, interscale contrastive complementary learning (ICCL), uses the proposed contrastive complement network and contrastive complement loss function to explicitly guide the learning of small objects with the rich texture detail of large objects. In addition, an end-to-end cooperation (ECoop) scheme between the two implementations and the detection model is proposed to exploit the potential of each. In short, SCLNet forms a more comprehensive representation through feature complementation and improves the representation of small objects through interscale contrast, which in turn improves scale robustness and detection performance. Thorough experiments on the VisDrone and UAVDT datasets demonstrate the effectiveness of SCLNet: its novel components are each effective, and the model is competitive with many CNN-based and transformer-based methods. Overall, SCLNet effectively addresses scale challenges and is a competitive model for object detection in UAV images.
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2024.3505425