LYOLO: a lightweight object detection algorithm integrating label enhancement for high-quality prediction boxes
Published in | Pattern analysis and applications : PAA Vol. 28; no. 3 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | London, Springer London, 01.09.2025, Springer Nature B.V |
Summary: | In object detection, the relationship between predicted bounding boxes and ground truth boxes is often overlooked. Moreover, downsampling with conventional convolution reduces image resolution and often sacrifices detail and edge information, which impairs the precise determination of object positions. Meanwhile, the feature extraction capability of the backbone network in enhancement algorithms is crucial for the detection performance of the entire model. To address these issues, this paper proposes LYOLO, an object detection algorithm based on high-quality prediction boxes. It devises a Label Enhancement (LE) strategy that suppresses low-quality prediction boxes and enhances high-quality ones, effectively adjusting the weights of positive and negative samples. A lightweight downsampling method (Down) and a lightweight Feature Enhancement (FE) mechanism are also designed. The former enlarges the receptive field to improve the model’s ability to determine object positions, and the latter further allocates feature weights to generate stronger feature representations for the backbone network. Experimental results on the VOC and COCO datasets show that LYOLO performs well across all model sizes: it achieves the highest accuracy with the lowest number of parameters and computational complexity while maintaining low latency. For example, LYOLOn achieves an […] of 82.0% on the VOC dataset with only 2.28M parameters. Compared to the baseline model YOLO11n, it reduces the number of parameters by 11.9% while improving […] by 3.0%. In comparison with YOLOv8n, YOLOv9t, and YOLOv10n, LYOLOn achieves […] improvements of 3.4%, 2.3%, and 3.1%, respectively. The code and datasets used in this article can be obtained from https://github.com/lingzhiy/LYOLO. |
---|---|
ISSN: | 1433-7541, 1433-755X |
DOI: | 10.1007/s10044-025-01528-4 |
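
The summary describes weighting samples by how well a predicted box matches its ground truth box, but it does not give the formula used by the Label Enhancement strategy. The sketch below is therefore only an illustration of the general idea, not the authors' method: it computes the standard intersection over union (IoU) of two boxes and derives a soft weight from it. The function names `iou` and `quality_weight`, the `(x1, y1, x2, y2)` box layout, and the exponent `gamma` are all assumptions introduced for this example.

```python
from typing import Sequence

# Assumed box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Sequence[float]


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(pred[0], truth[0]), max(pred[1], truth[1])
    ix2, iy2 = min(pred[2], truth[2]), min(pred[3], truth[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - inter
    return inter / union if union > 0 else 0.0


def quality_weight(pred: Box, truth: Box, gamma: float = 2.0) -> float:
    """Hypothetical soft weight: IoU raised to gamma.

    With gamma > 1, boxes with low IoU are pushed toward zero faster
    than boxes with high IoU. This is an illustrative choice, not the
    weighting defined in the paper.
    """
    return iou(pred, truth) ** gamma
```

For instance, `quality_weight((0, 0, 2, 2), (0, 0, 2, 1))` has an IoU of 0.5 and, with the default `gamma`, a weight of 0.25, while a perfectly matching box keeps a weight of 1.0. A training loss could multiply each positive sample's target by such a weight; the actual scheme should be taken from the paper or the linked repository.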