Small Object Detection Algorithm Based on Improved Attention Mechanism and Feature Fusion of YOLOv8

Bibliographic Details
Published in	Journal of Advanced Computational Intelligence and Intelligent Informatics Vol. 29; no. 4; pp. 941-955
Main Authors	Fang, Mingxing; Rui, Xinyu; Cheng, Hongyu; Liu, Xinke; She, Jinhua; Du, Youwu; Tan, Haoran
Format	Journal Article
Language	English
Published	Tokyo: Fuji Technology Press Co. Ltd, 20.07.2025
Subjects	Algorithms; Background noise; Datasets; Effectiveness; Modules; Object recognition; Real time

More Information
Summary:	Small objects carry little semantic information and are highly susceptible to background noise, which makes them difficult to detect. To address these challenges, this paper proposes YOLOv8-FE, an improved algorithm based on YOLOv8. First, a P2-scale detection layer designed for small objects is integrated into the model to increase the network’s sensitivity to them. Second, because traditional convolutional layers can lose information during downsampling, a downsampling module named RFAC-SPD is designed to capture and use small-object features more effectively. Third, to mitigate interference from background noise and strengthen the network’s focus on object information, a C2f-CBAM module is built on the convolutional block attention module (CBAM). Fourth, to integrate low-level feature information fully, minimize the loss of fine detail, and further enhance the network’s representational capability, an enhanced path aggregation network is proposed, which improves the effectiveness of feature fusion. Experiments on the VisDrone2019 dataset show that, compared with the baseline YOLOv8n, YOLOv8-FE increases mAP50 and mAP50-95 by 8.3% and 5.3%, respectively. At an inference speed of 77 frames per second, YOLOv8-FE also meets real-time requirements. Generalization experiments on the DOTA and Caltech Pedestrian datasets show mAP50 increases of 2.7% and 6.8%, respectively, supporting the generality of the proposed model.
ISSN:	1343-0130 1883-8014
DOI:	10.20965/jaciii.2025.p0941
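
Illustrative note on the downsampling point in the summary: the record does not describe the internals of RFAC-SPD, so the sketch below is not the authors' module. It assumes, purely for illustration, that "SPD" refers to a space-to-depth rearrangement, and it compares that rearrangement with a deliberately simple strided subsampling. The function names `space_to_depth` and `strided_subsample` and the 4x4 toy feature map are hypothetical.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W x C map into (H/block) x (W/block) x (C*block*block).

    Every input value is kept; spatial resolution is traded for channels.
    Assumption: this stands in for the "SPD" idea, not the paper's RFAC-SPD.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, height, block):
        row = []
        for j in range(0, width, block):
            cell = []
            # Stack the block x block neighbourhood along the channel axis.
            for di in range(block):
                for dj in range(block):
                    cell.extend(feature_map[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out


def strided_subsample(feature_map, stride=2):
    """Keep only every `stride`-th position (a simple stand-in for strided downsampling)."""
    return [row[::stride] for row in feature_map[::stride]]


# 4x4 single-channel map with a one-pixel "small object" at row 1, column 1.
fmap = [[[0.0] for _ in range(4)] for _ in range(4)]
fmap[1][1] = [1.0]

spd = space_to_depth(fmap)      # 2 x 2 x 4: the object survives in a channel
sub = strided_subsample(fmap)   # 2 x 2 x 1: the object is skipped entirely

spd_sum = sum(v for row in spd for cell in row for v in cell)
sub_sum = sum(v for row in sub for cell in row for v in cell)
```

In this toy case the one-pixel object is still present after the space-to-depth step but is absent after strided subsampling, which is one way to picture the information loss the summary refers to. A real strided convolution with a larger kernel would not discard pixels this bluntly.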