State-of-the-Art Object Detection: An Overview of YOLO Variants and their Performance

Bibliographic Details
Published in: 2023 4th International Conference on Smart Electronics and Communication (ICOSEC), pp. 1018-1024
Main Authors: L, Dhruthi; Megharaj, Praveen K; P, Pranav; Kiran, Niharika; P, Asha Rani K; S, Gowrishankar
Format: Conference Proceeding
Language: English
Published: IEEE, 20.09.2023
Summary: Object detection is a fundamental component of computer vision, and the regression-based YOLO (You Only Look Once) algorithm has become one of the most effective techniques for real-time object detection in images and videos. This research study compares the performance of different YOLO object detection algorithms on the widely used Microsoft Common Objects in Context (MS COCO) dataset. It also provides a comprehensive analysis of how the YOLO algorithm has been implemented in successive iterations, from YOLOv3 to YOLOv7, for a range of object detection tasks. The work focuses on assessing these models, each of which was pretrained over multiple weights, by testing them on various datasets with various classifications. The primary objectives of this research are twofold: first, to contrast the performance of YOLO models across versions, and second, to assess their effectiveness on both CPU and GPU platforms. Frames per second (fps) and mean Average Precision (mAP), which measures the precision of identifying an object, serve as the foundation for the comparison. This study tests the capability of YOLO versions to infer classes from photos and videos by running the models on several datasets with dozens of classes. The results obtained offer insight into the advantages and disadvantages of each YOLO variant. The fps metric allows the computational efficiency of the models to be assessed, highlighting their real-time processing capabilities, while mAP serves as a measure of the accuracy and precision in detecting objects across multiple classes. By comparing the performance of different YOLO versions, this study intends to assist users in selecting the most suitable model for their specific requirements. The research findings contribute to a better understanding of the trade-offs between accuracy and computational efficiency, enabling informed decisions for practical applications of object detection.
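The two metrics named in the summary can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names (`iou`, `average_precision`, `mean_average_precision`, `measure_fps`), the `(x1, y1, x2, y2)` box format, and the single 0.5 IoU threshold are all assumptions made for illustration. The official MS COCO protocol is more involved; it averages over IoU thresholds from 0.50 to 0.95 and uses 101-point interpolation.

```python
import time
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: Sequence[Tuple[float, Box]],  # (confidence score, box)
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.5,  # assumed single threshold, not the COCO sweep
) -> float:
    """Simplified AP for one class: area under the precision-recall curve."""
    if not ground_truth:
        return 0.0
    matched = set()
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box may be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_threshold:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Running precision and recall at each rank.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing as recall grows (interpolation envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """mAP is the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0


def measure_fps(infer: Callable[[object], object], frames: Sequence[object]) -> float:
    """Frames per second of a hypothetical `infer` callable over `frames`."""
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

In a comparison like the one described, `measure_fps` would be run once per model and per device (CPU or GPU), and the resulting throughput set against that model's mAP to expose the accuracy versus speed trade-off.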
DOI:10.1109/ICOSEC58147.2023.10276030