Adaptive Knowledge Distillation With Attention-Based Multi-Modal Fusion for Robust Dim Object Detection

Bibliographic Details
Published in: IEEE Transactions on Multimedia, Vol. 27, pp. 2083-2096
Main Authors: Lan, Zhen; Li, Zixing; Yan, Chao; Xiang, Xiaojia; Tang, Dengqing; Zhou, Han; Lai, Jun
Format: Journal Article
Language: English
Published: IEEE, 01.01.2025
Summary: Automated object detection in aerial images is crucial in both civil and military applications. Existing computer vision-based object detection methods are not robust enough to precisely detect dim objects in aerial images because of cluttered backgrounds, varying observation angles, small object scales, and severe occlusions. Recently, electroencephalography (EEG)-based object detection methods have received increasing attention owing to the advanced cognitive capabilities of human vision. However, how to combine human intelligence with computer intelligence to achieve robust dim object detection is still an open question. In this paper, we propose a novel approach to efficiently fuse and exploit the properties of multi-modal data for dim object detection. Specifically, we first design a brain-computer interface (BCI) paradigm called eye-tracking-based slow serial visual presentation (ESSVP) to simultaneously collect paired EEG and image data while subjects search for dim objects in aerial images. Then, we develop an attention-based multi-modal fusion network to selectively aggregate the learned features of the EEG and image modalities. Furthermore, we propose an adaptive multi-teacher knowledge distillation method to efficiently train the multi-modal dim object detector for better performance. To evaluate the effectiveness of our method, we conduct extensive experiments on the collected dataset in subject-dependent and subject-independent tasks. The experimental results demonstrate that the proposed dim object detection method is more effective and more robust than the baselines and state-of-the-art methods.
ISSN: 1520-9210, 1941-0077
DOI: 10.1109/TMM.2024.3521793
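
Illustrative note: the summary names two techniques, attention-based fusion of EEG and image features and adaptive multi-teacher knowledge distillation, without giving their formulation. The sketch below is only one plausible reading, not the authors' method. The function names, the dot-product scoring against a `query` vector, the rule that weights each teacher by its cross-entropy on the true label, and the `temperature` value are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(eeg_feat, img_feat, query):
    """Fuse two same-length feature vectors with attention weights.

    Assumption: each modality is scored by its dot product with a
    (hypothetical) query vector; the softmax of the two scores gives
    the weights of a convex combination.
    """
    scores = [sum(q * f for q, f in zip(query, feat))
              for feat in (eeg_feat, img_feat)]
    weights = softmax(scores)
    fused = [weights[0] * e + weights[1] * i
             for e, i in zip(eeg_feat, img_feat)]
    return fused, weights


def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy H(target, pred) between two distributions."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, pred_probs))


def adaptive_multi_teacher_kd_loss(student_logits, teacher_logits_list,
                                   label, temperature=2.0):
    """Weighted distillation loss over several teachers.

    Assumption: a teacher with lower cross-entropy on the true label
    receives a larger weight (softmax over negative cross-entropies).
    """
    num_classes = len(student_logits)
    one_hot = [1.0 if k == label else 0.0 for k in range(num_classes)]

    # Per-sample teacher reliability, turned into adaptive weights.
    teacher_ce = [cross_entropy(one_hot, softmax(t))
                  for t in teacher_logits_list]
    teacher_weights = softmax([-ce for ce in teacher_ce])

    # Soft-target cross-entropy between each teacher and the student.
    student_soft = softmax([z / temperature for z in student_logits])
    loss = 0.0
    for w, t_logits in zip(teacher_weights, teacher_logits_list):
        teacher_soft = softmax([z / temperature for z in t_logits])
        loss += w * cross_entropy(teacher_soft, student_soft)
    return loss, teacher_weights
```

In this sketch, a teacher that predicts the true label confidently dominates the distillation target for that sample, while an unreliable teacher is down-weighted rather than discarded. A real detector would operate on learned feature maps and bounding-box outputs rather than on short vectors.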