An Improved YOLOv3 (E-YOLOv3) to Detect Objects and Comparative Analysis of YOLOv3, ResNet101-YOLOv3, YOLOv8 and DETR

Bibliographic Details
Published in	International Journal of Innovative Research in Computer Science and Technology Vol. 13; no. 4; pp. 15 - 31
Main Authors	Singh, Jashanpreet, Kumar, Rajiv
Format	Journal Article
Language	English
Published	01.07.2025
Online Access	Get full text
ISSN	2347-5552
DOI	10.55524/ijircst.2025.13.4.2

Summary:	Deep learning has substantially advanced object detection, and YOLO (You Only Look Once) stands out as a leading model family that delivers fast and precise real-time results. This research evaluates five object detection models trained on the COCO dataset: YOLOv3, ResNet101-based YOLOv3 (R-YOLO), EfficientNetB0-based YOLOv3 (E-YOLO, the proposed model), YOLOv8, and DETR. The evaluation used a test dataset of 19,960 images to measure Precision, Recall, F1 Score, and mean Average Precision (mAP). To assess robustness, all models were further tested on small subsets of 10 images each: blurred, low-light, and clean. Rigorous testing against the COCO benchmark showed that the modified E-YOLOv3 outperforms state-of-the-art detection models, particularly in blurred, clean, and low-light scenes. The proposed model achieved a mean Average Precision (mAP) of 96.85%. E-YOLO outperformed YOLOv3, R-YOLO, and YOLOv8 in both general and adverse conditions. Compared with DETR, E-YOLO achieved competitive accuracy while using significantly fewer computational resources and running inference faster, which makes it more suitable for real-time applications. DETR achieved better mAP and precision than E-YOLO in complex and overlapping scenes because of its transformer-based global attention, but its high resource requirements and slow inference speed limit its practical use. E-YOLO provides an excellent balance between accuracy and efficiency, delivering strong performance across varied scenarios at low computational cost. It offers practical and effective real-world object detection, especially where hardware is constrained.
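The Precision, Recall, and F1 Score named in the summary follow standard definitions. The sketch below is a minimal illustration of those definitions only. It assumes detections have already been matched to ground-truth boxes (for example by an IoU threshold, which this record does not state), and the names `DetectionCounts` and `precision_recall_f1` are hypothetical, not taken from the article. It does not compute mAP, which additionally requires ranking detections by confidence and averaging per-class precision over recall levels.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Hypothetical container for matching results on one test set."""
    true_positives: int   # detections matched to a ground-truth box
    false_positives: int  # detections with no matching ground-truth box
    false_negatives: int  # ground-truth boxes that no detection matched


def precision_recall_f1(counts: DetectionCounts) -> tuple[float, float, float]:
    """Return (precision, recall, F1) using the standard definitions.

    Each ratio falls back to 0.0 when its denominator is zero,
    which is one common convention (an assumption of this sketch).
    """
    tp = counts.true_positives
    fp = counts.false_positives
    fn = counts.false_negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the article.
    p, r, f = precision_recall_f1(DetectionCounts(8, 2, 2))
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Because F1 is a harmonic mean, a detector with high precision but low recall (or the reverse) scores well below the arithmetic average of the two, which is why the metric is often reported alongside mAP.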