YOLO-based Object Detection Models: A Review and its Applications


Bibliographic Details
Published in Multimedia Tools and Applications, Vol. 83, no. 35, pp. 83535-83574
Main Authors Vijayakumar, Ajantha; Vairavasundaram, Subramaniyaswamy
Format Journal Article
Language English
Published New York Springer US 01.10.2024
Springer Nature B.V

Abstract In computer vision, object detection is a classical and challenging problem: detecting objects accurately in images. With the significant advancement of deep learning techniques over the past decades, most researchers have worked on enhancing object detection, segmentation, and classification. Object detection performance is measured by both detection accuracy and inference time; two-stage detectors achieve better detection accuracy than single-stage detectors. The real-time object detection system YOLO was published in 2015 and has rapidly evolved through successive iterations, with the newest release, YOLOv8, appearing in January 2023. YOLO achieves high detection accuracy and fast inference with a single-stage detector, and many applications adopt YOLO versions because of their high inference speed. This paper presents a complete survey of YOLO versions up to YOLOv8. The article begins by explaining the performance metrics used in object detection, post-processing methods, dataset availability, and the most widely used object detection techniques; it then discusses the architectural design of each YOLO version. Finally, it reviews the diverse range of YOLO applications, highlighting each version's contributions.
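The performance metrics and post-processing methods the abstract refers to are conventionally built on intersection over union (IoU) and non-maximum suppression (NMS). The sketch below shows both as they are commonly defined in the object-detection literature; it is an illustrative sketch, not code from the surveyed article.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: visit boxes in descending score order,
    keeping a box only if it overlaps no already-kept box above the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, of two heavily overlapping detections of the same object, NMS keeps only the higher-scoring one, while a distant detection survives untouched.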
Author_xml – sequence: 1
  givenname: Ajantha
  surname: Vijayakumar
  fullname: Vijayakumar, Ajantha
  organization: School of Computing, SASTRA Deemed University
– sequence: 2
  givenname: Subramaniyaswamy
  surname: Vairavasundaram
  fullname: Vairavasundaram, Subramaniyaswamy
  email: vsubramaniyaswamy@gmail.com
  organization: School of Computing, SASTRA Deemed University
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
DOI 10.1007/s11042-024-18872-y
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
Computer Science
EISSN 1573-7721
EndPage 83574
ISSN 1380-7501
IsPeerReviewed true
IsScholarly true
Issue 35
Keywords YOLO
Computer Vision
Dataset
Deep Learning
Object detection
PageCount 40
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Dordrecht
PublicationSubtitle An International Journal
PublicationTitle Multimedia tools and applications
PublicationTitleAbbrev Multimed Tools Appl
PublicationYear 2024
Publisher Springer US
Springer Nature B.V
References_xml – reference: Object detection- https://www.frontiersin.org/articles/10.3389/frobt.2015.00029/full. Accessed 11 Nov 2023
– reference: LabelBox (2018), https://labelbox.com/product/annotate/. Accessed 5 Oct 2023
– reference: RanaMBhushanMMachine learning and deep learning approach for medical image analysis: diagnosis to detectionMultimed Tools Appli20238217267312676910.1007/s11042-022-14305-w
– reference: IoU- https://towardsdatascience.com/map-mean-average-precision-might-confuse-you-5956f1bfa9e2. Accessed 12 Sept 2023
– reference: Ghiasi G, Cui Y, Srinivas A, Qian R, Lin TY, Cubuk ED, ..., Zoph B (2021). Simple copy-paste is a strong data augmentation method for instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2918-2928
– reference: VoTT (visual object tagging tool) (2019), https://github.com/microsoft/VoTT/blob/master/README.md. Accessed 11 Oct 2023
– reference: Wang CY, Liao HYM, Yeh IH (2022) Designing network design strategies through gradient path analysis. arXiv preprint arXiv:2211.04800
– reference: MaDFangHWangNZhangCDongJHuHAutomatic detection and counting system for pavement cracks based on PCGAN and YOLO-MFIEEE Trans Intel Transport Syst20222311221662217810.1109/TITS.2022.3161960
– reference: AzizLSalamMSBHSheikhUUAyubSExploring deep learning-based architecture, strategies, applications and current trends in generic object detection: A comprehensive reviewIEEE Access2020817046117049510.1109/ACCESS.2020.3021508
– reference: Shao S, Li Z, Zhang T, Peng C, Yu G, Zhang X, ..., & Sun J (2019). Objects365: A large-scale, high-quality dataset for object detection. In: Proceedings of the IEEE/CVF international conference on computer vision (pp. 8430-8439).
– reference: LiKWanGChengGMengLHanJObject detection in optical remote sensing images: A survey and a new benchmarkISPRS Journal of Photogram Remote Sens202015929630710.1016/j.isprsjprs.2019.11.023
– reference: Ding X, Zhang X, Ma N, Han J, Ding G, Sun J (2021). Repvgg: Making vgg-style convnets great again. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 13733-13742
– reference: Ren S, He K Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28.
– reference: Sahin O, Ozer S (2021) Yolodrone: Improved yolo architecture for object detection in drone images. In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP) (pp. 361-365). IEEE
– reference: Zhang S, Benenson R, Schiele B (2017) Citypersons: A diverse dataset for pedestrian detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3213-3221
– reference: He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision. 2961-2969
– reference: Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 580-587
– reference: Zhou J, Tian Y, Li W, Wang R, Luan Z, Qian D (2019) LADet: A light-weight and adaptive network for multi-scale object detection. In Asian Conference on Machine Learning. 912-923. PMLR
– reference: F1 score- https://encord.com/blog/f1-score-in-machine-learning/#:~:text=This%20is%20because%20the%20regular,the%20majority%20class's%20strong%20influence. Accessed 20 Jan 2024
– reference: CIFAR-10 Dataset. https://www.cs.toronto.edu/~kriz/cifar.html. Accessed 25 Oct 2023
– reference: NayagamMGRamarKA survey on real time object detection and tracking algorithmsInt J Appl Eng Res201510982908297
– reference: KangLLuZMengLGaoZYOLO-FA: Type-1 fuzzy attention based YOLO detector for vehicle detectionExpert Syst Appli202423712120910.1016/j.eswa.2023.121209
– reference: Søgaard A, Plank B, Hovy D (2014) Selection bias, label bias, and bias in ground truth. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts. pp. 11-13
– reference: Veit A, Matera T, Neumann L, Matas J, Belongie S (2016) Coco-text: Dataset and benchmark for text detection and recognition in natural images. arXiv preprint arXiv:1601.07140
– reference: IoU loss function: https://learnopencv.com/iou-loss-functions-object-detection/#ciou-complete-iou-loss. Accessed 14 Nov 2023
– reference: Feng C, Zhong Y, Gao Y, Scott MR, Huang W (2021) Tood: Task-aligned one-stage object detection. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 3490-3499). IEEE Computer Society
– reference: YanLWangQMaSWangJYuCSolve the puzzle of instance segmentation in videos: A weakly supervised framework with spatio-temporal collaborationIEEE Trans Circuits Syst Video Technol202233139340610.1109/TCSVT.2022.3202574
– reference: ZhangHHongXRecent progresses on object detection: a brief reviewMultimed Tools Appli201978278092784710.1007/s11042-019-07898-2
– reference: Varma S, Sreeraj M (2013). Object detection and classification in surveillance system. In 2013 IEEE Recent Advances in Intelligent Computational Systems (RAICS) (pp. 299-303). IEEE
– reference: Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. 779-788
– reference: Verma NK, Sharma T, Rajurkar SD, Salour A (2016). Object identification for inventory management using convolutional neural network. In 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) (pp. 1-6). IEEE
– reference: Deng J, Dong W, Socher R, Li L. J., Li K, Fei-Fei L (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.
– reference: CVAT (2023) https://github.com/opencv/cvat. Accessed 5 Oct 2023
– reference: Ding X, Chen H, Zhang X, Huang, K, Han J, Ding G (2022) Re-parameterizing your optimizers rather than architectures. arXiv preprint arXiv:2205.15242
– reference: Wang CY, Bochkovskiy A, Liao HYM (2023) YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7464-7475
– reference: Soviany P, Ionescu RT (2018). Optimizing the trade-off between single-stage and two-stage deep object detectors using image difficulty prediction. In: 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) (pp. 209-214). IEEE
– reference: Doon R, Rawat TK, Gautam S (2018) Cifar-10 classification using deep convolutional neural network. In 2018 IEEE Punecon (pp. 1-5). IEEE
– reference: Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: Single shot multibox detector. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14 (pp. 21-37). Springer International Publishing
– reference: Cheng G, Han J (2016) A survey on object detection in optical remote sensing images. ISPRS J Photogram Remote Sens 117:11–28. doi: 10.1016/j.isprsjprs.2016.03.014
– reference: Girshick R, Donahue J, Darrell T, Malik J (2015) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Patt Analy Machine Intel 38(1):142–158. doi: 10.1109/TPAMI.2015.2437384
– reference: Manikandan NS, Ganesan K (2019). Deep learning based automatic video annotation tool for self-driving car. arXiv preprint arXiv:1904.12618
– reference: Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vision 57:137–154. doi: 10.1023/B:VISI.0000013087.49260.fb
– reference: Jamtsho Y, Riyamongkol P, Waranusast R (2021) Real-time license plate detection for non-helmeted motorcyclist using YOLO. ICT Express 7(1):104–109. doi: 10.1016/j.icte.2020.07.008
– reference: Bhambani K, Jain T, Sultanpure KA (2020) Real-time face mask and social distancing violation detection system using YOLO. In: 2020 IEEE Bangalore Humanitarian Technology Conference (B-HTC) (pp. 1-6). IEEE
– reference: Wu D, Lv S, Jiang M, Song H (2020) Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments. Comput Electron Agriculture 178:105742. doi: 10.1016/j.compag.2020.105742
– reference: Ficzere M, Mészáros LA, Kállai-Szabó N, Kovács A, Antal I, Nagy ZK, Galata DL (2022) Real-time coating thickness measurement and defect recognition of film coated tablets with machine vision and deep learning. Int J Pharm 623:121957. doi: 10.1016/j.ijpharm.2022.121957
– reference: Dollar P, Wojek C, Schiele B, Perona P (2011) Pedestrian detection: An evaluation of the state of the art. IEEE Trans Patt Analy Machine Intel 34(4):743–761. doi: 10.1109/TPAMI.2011.155
– reference: Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767
– reference: Yan L, Ma S, Wang Q, Chen Y, Zhang X, Savakis A, Liu D (2022) Video captioning using global-local representation. IEEE Trans Circuits Syst for Video Technol 32(10):6642–6656. doi: 10.1109/TCSVT.2022.3177320
– reference: Raab D, Fezer E, Breitenbach J, Baumgartl H, Sauter D, Buettner R (2022). A Deep Learning-Based Model for Automated Quality Control in the Pharmaceutical Industry. In: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) (pp. 266-271). IEEE
– reference: Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, ..., & Zitnick CL (2014). Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13 (pp. 740-755). Springer International Publishing.
– reference: Hosang J, Benenson R, Schiele B (2017) Learning non-maximum suppression. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4507-4515
– reference: Fregin A, Muller J, Krebel U, Dietmayer K (2018) The driveu traffic light dataset: Introduction and comparison with existing datasets. In 2018 IEEE international conference on robotics and automation (ICRA) (pp. 3376-3383). IEEE.
– reference: Razakarivony S, Jurie F (2016) Vehicle detection in aerial imagery: A small target detection benchmark. J Vis Commun Image Represent 34:187–203. doi: 10.1016/j.jvcir.2015.11.002
– reference: Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Wei X (2022). YOLOv6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976
– reference: He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Patt Analy Machine Intel 37(9):1904–1916. doi: 10.1109/TPAMI.2015.2389824
– reference: Umer S, Rout RK, Pero C, Nappi M (2022). Facial expression recognition with trade-offs between data augmentation and deep learning features. J Ambient Intel Humanized Comput. 1-15
– reference: Makesense (2021), https://github.com/peng-zhihui/Make-Sense. Accessed 29 Sept 2023
– reference: Nguyen HAT, Sophea T, Gheewala SH, Rattanakom R, Areerob T, Prueksakorn K (2021) Integrating remote sensing and machine learning into environmental monitoring and assessment of land use change. Sustain Prod Consumpt 27:1239–1254. doi: 10.1016/j.spc.2021.02.025
– reference: Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: A retrospective. Int J Comput Vis 111:98–136. doi: 10.1007/s11263-014-0733-5
– reference: Cui Y, Yan L, Cao Z, Liu D. (2021). Tf-blender: Temporal feature blender for video object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 8138-8147)
– reference: Jocher G (2020) YOLOv5 by Ultralytics. https://github.com/ultralytics/yolov5. Accessed 12 Jan 2024
– reference: Fu J, Zhao C, Xia Y, Liu W (2020) Vehicle and wheel detection: a novel SSD-based approach and associated large-scale benchmark dataset. Multimed Tools Appl 79:12615–12634. doi: 10.1007/s11042-019-08523-y
– reference: Han X, Chang J, Wang K (2021) Real-time object detection based on YOLO-v2 for tiny vehicle object. Procedia Comput Sci 183:61–72. doi: 10.1016/j.procs.2021.02.031
– reference: Wang Y, Wang H, Xin Z (2022) Efficient detection model of steel strip surface defects based on YOLO-V7. IEEE Access 10:133936–133944. doi: 10.1109/ACCESS.2022.3230894
– reference: Wu X, Sahoo D, Hoi SC (2020) Recent advances in deep learning for object detection. Neurocomputing 396:39–64. doi: 10.1016/j.neucom.2020.01.085
– reference: Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision. 2980-2988
– reference: Grosicki E, El-Abed H (2011) Icdar 2011-french handwriting recognition competition. In 2011 International Conference on Document Analysis and Recognition (pp. 1459-1463). IEEE.
– reference: Labelimg (2022), https://github.com/HumanSignal/labelImg. Accessed 28 Sept 2023
– reference: Kachouane M, Sahki S, Lakrouf M, Ouadah N (2012) HOG based fast human detection. In: 2012 24th International Conference on Microelectronics (ICM) (pp. 1-4). IEEE.
– reference: Lingani GM, Rawat DB Garuba M (2019). Smart traffic management system using deep learning for smart city applications. In:2019 IEEE 9th annual computing and communication workshop and conference (CCWC) (pp. 0101-0106). IEEE.
– reference: Padilla R, Netto SL, Da Silva EA (2020). A survey on performance metrics for object-detection algorithms. In 2020 international conference on systems, signals and image processing (IWSSIP) (pp. 237-242). IEEE.
– reference: Cucliciu T, Lin CY, Muchtar K (2017). A DPM based object detector using HOG-LBP features. In: 2017 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW) (pp. 315-316). IEEE
– reference: Zhang H, Wang Y, Dayoub F, Sunderhauf N (2021) Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 8514-8523
– reference: Ch'ng CK, Chan CS (2017) Total-text: A comprehensive dataset for scene text detection and recognition. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR) (Vol. 1, pp. 935-942). IEEE.
– reference: Jiang Y, Qiu H, McCartney M, Sukhatme G, Gruteser M, Bai F, ..., Govindan R (2015). Carloc: Precise positioning of automobiles. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (pp. 253-265)
– reference: Furusho Y, Ikeda K (2020) Theoretical analysis of skip connections and batch normalization from generalization and optimization perspectives. APSIPA Transactions on Signal and Information Processing 9
– reference: Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7263-7271
– reference: Shu C, Liu Y, Gao J, Yan Z, Shen C (2021) Channel-wise knowledge distillation for dense prediction. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5311-5320
– reference: Tousch AM, Herbin S, Audibert JY (2012) Semantic hierarchies for image annotation: A survey. Patt Recog 45(1):333–345. doi: 10.1016/j.patcog.2011.05.017
– reference: Durai SKS, Shamili MD (2022) Smart farming using machine learning and deep learning techniques. Decision Analy J 3:100041. doi: 10.1016/j.dajour.2022.100041
– reference: Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2017) mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412
– reference: Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, ..., Ferrari V (2020) The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale. International Journal of Computer Vision 128(7), 1956-1981
– reference: Viola P, Jones M (2001) Robust real-time object detection. Int J Comput Vision 4(34–47):4
– reference: Girshick R (2015). Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision. 1440-1448
– reference: Jaderberg M, Simonyan K, Vedaldi A, Zisserman A (2014). Synthetic data and artificial neural networks for natural scene text recognition. arXiv preprint arXiv:1406.2227
– reference: Li X, Wang W, Wu L, Chen S, Hu X, Li J, ..., Yang J (2020) Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Advances in Neural Information Processing Systems, 33, 21002-21012.
– reference: Russell BC, Torralba A, Murphy KP, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vision 77:157–173. doi: 10.1007/s11263-007-0090-8
– reference: Yuan L, Lu F (2018). Real-time ear detection based on embedded systems. In: 2018 International Conference on Machine Learning and Cybernetics (ICMLC) (Vol. 1, pp. 115-120). IEEE
– reference: Khurana K, Awasthi R (2013) Techniques for object recognition in images and multi-object detection. Int J Adv Res Comput Eng Technol (IJARCET) 2(4):1383–1388
– reference: Matsuzaka Y, Yashiro R (2023). AI-Based Computer Vision Techniques and Expert Systems. AI, 4(1), 289-302.
– reference: Vennelakanti A, Shreya S, Rajendran R, Sarkar, Muddegowda D, Hanagal P (2019) Traffic sign detection and recognition using a CNN ensemble. In 2019 IEEE international conference on consumer electronics (ICCE) (pp. 1-4). IEEE
– reference: Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D (2020) Distance-IoU loss: Faster and better learning for bounding box regression. In: Proceedings of the AAAI conference on artificial intelligence 34(07): 12993-13000
– reference: Zhao ZQ, Zheng P, Xu ST, Wu X (2019) Object detection with deep learning: A review. IEEE Trans Neural Netw Learn Syst 30(11):3212–3232. doi: 10.1109/TNNLS.2018.2876865
– reference: Nanni L, Ghidoni S, Brahnam S (2017) Handcrafted vs. non-handcrafted features for computer vision classification. Pattern Recog 71:158–172. doi: 10.1016/j.patcog.2017.05.025
– reference: Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001 (Vol. 1, pp. I-I). IEEE
– reference: YOLOv8: https://sandar-ali.medium.com/ultralytics-unveiled-yolov8-on-january-10-2023-which-has-garnered-over-one-million-downloads-338d8f11ec5. Accessed 20 Jan 2024
– reference: Wei X, Zhang H, Liu S, Lu Y (2020) Pedestrian detection in underground mines via parallel feature transfer network. Pattern Recog 103:107195. doi: 10.1016/j.patcog.2020.107195
– reference: Nguyen ND, Do T, Ngo TD, Le DD (2020) An evaluation of deep learning methods for small object detection. J Electric Comput Eng 2020:1–18. doi: 10.1155/2020/3189691
– reference: Imagenet Dataset, https://www.image-net.org/download.php. Accessed 28 Oct 2023
– reference: Roboflow (2020), https://roboflow.com/. Accessed 29 Sept 2023
– reference: Salari A, Djavadifar A, Liu X, Najjaran H (2022) Object recognition datasets and challenges: A review. Neurocomputing 495:129–152. doi: 10.1016/j.neucom.2022.01.022
– reference: Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934
– reference: Neumann L, Karg M, Zhang S, Scharfenberger C, Piegert E, Mistr S, ... ,Schiele B (2019). Nightowls: A pedestrians at night dataset. In Computer Vision–ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, December 2–6, 2018, Revised Selected Papers, Part I 14 (pp. 691-705). Springer International Publishing.
– reference: Jocher G, Chaurasia A, Qiu J (2023) YOLO by Ultralytics. https://github.com/ultralytics/ultralytics. Accessed 21 Jan 2024
– reference: Harzallah H, Jurie F, Schmid C (2009). Combining efficient object localization and image classification. In 2009 IEEE 12th international conference on computer vision (pp. 237-244). IEEE.
– reference: Dewi C, Chen RC, Jiang X, Yu H (2022) Deep convolutional neural network for enhancing traffic sign recognition developed on Yolo V4. Multimed Tools Appl 81(26):37821–37845. doi: 10.1007/s11042-022-12962-5
– ident: 18872_CR64
  doi: 10.1109/ICCV.2015.169
– volume: 45
  start-page: 333
  issue: 1
  year: 2012
  ident: 18872_CR27
  publication-title: Patt Recog
  doi: 10.1016/j.patcog.2011.05.017
– volume: 159
  start-page: 296
  year: 2020
  ident: 18872_CR43
  publication-title: ISPRS Journal of Photogram Remote Sens
  doi: 10.1016/j.isprsjprs.2019.11.023
– ident: 18872_CR38
– volume: 178
  start-page: 105742
  year: 2020
  ident: 18872_CR97
  publication-title: Comput Electron Agriculture
  doi: 10.1016/j.compag.2020.105742
– ident: 18872_CR105
  doi: 10.1109/ICCV48922.2021.00803
– ident: 18872_CR19
  doi: 10.1109/IWSSIP48289.2020.9145130
– volume: 34
  start-page: 187
  year: 2016
  ident: 18872_CR44
  publication-title: J Vis Commun Image Represent
  doi: 10.1016/j.jvcir.2015.11.002
– ident: 18872_CR9
  doi: 10.1109/AIPR.2016.8010578
– ident: 18872_CR22
  doi: 10.1109/ICCE.2019.8662019
– ident: 18872_CR82
– ident: 18872_CR29
– volume: 32
  start-page: 6642
  issue: 10
  year: 2022
  ident: 18872_CR106
  publication-title: IEEE Trans Circuits Syst for Video Technol
  doi: 10.1109/TCSVT.2022.3177320
– ident: 18872_CR24
  doi: 10.1109/ICCV.2019.00852
– ident: 18872_CR67
  doi: 10.1109/ICM.2012.6471380
– ident: 18872_CR13
  doi: 10.1109/CCWC.2019.8666539
– ident: 18872_CR35
– volume: 495
  start-page: 129
  year: 2022
  ident: 18872_CR69
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2022.01.022
– ident: 18872_CR87
– volume: 57
  start-page: 137
  year: 2004
  ident: 18872_CR53
  publication-title: Int J Comput Vision
  doi: 10.1023/B:VISI.0000013087.49260.fb
– ident: 18872_CR75
  doi: 10.1017/ATSIP.2020.7
– volume: 78
  start-page: 27809
  year: 2019
  ident: 18872_CR55
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-019-07898-2
– ident: 18872_CR30
– ident: 18872_CR76
– ident: 18872_CR25
  doi: 10.1109/ICRA.2018.8460737
– ident: 18872_CR23
  doi: 10.1007/s12652-020-02845-8
– volume: 183
  start-page: 61
  year: 2021
  ident: 18872_CR94
  publication-title: Procedia Comput Sci
  doi: 10.1016/j.procs.2021.02.031
– volume: 2020
  start-page: 1
  year: 2020
  ident: 18872_CR59
  publication-title: J Electric Comput Eng
  doi: 10.1155/2020/3189691
– ident: 18872_CR17
– ident: 18872_CR11
  doi: 10.1109/COMPSAC54236.2022.00045
– volume: 4
  start-page: 4
  issue: 34–47
  year: 2001
  ident: 18872_CR12
  publication-title: Int J Comput Vision
– volume: 33
  start-page: 393
  issue: 1
  year: 2022
  ident: 18872_CR107
  publication-title: IEEE Trans Circuits Syst Video Technol
  doi: 10.1109/TCSVT.2022.3202574
– ident: 18872_CR26
  doi: 10.1109/CVPR.2009.5206848
– ident: 18872_CR65
– volume: 117
  start-page: 11
  year: 2016
  ident: 18872_CR42
  publication-title: ISPRS J Photogram Remote Sens
  doi: 10.1016/j.isprsjprs.2016.03.014
– volume: 3
  start-page: 100041
  year: 2022
  ident: 18872_CR14
  publication-title: Decision Analy J
  doi: 10.1016/j.dajour.2022.100041
– ident: 18872_CR79
– volume: 23
  start-page: 22166
  issue: 11
  year: 2022
  ident: 18872_CR96
  publication-title: IEEE Trans Intel Transport Syst
  doi: 10.1109/TITS.2022.3161960
– volume: 10
  start-page: 8290
  issue: 9
  year: 2015
  ident: 18872_CR7
  publication-title: Int J Appl Eng Res
– volume: 103
  start-page: 107195
  year: 2020
  ident: 18872_CR21
  publication-title: Pattern Recog
  doi: 10.1016/j.patcog.2020.107195
– volume: 37
  start-page: 1904
  issue: 9
  year: 2015
  ident: 18872_CR63
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2015.2389824
– ident: 18872_CR89
  doi: 10.1109/CVPR46437.2021.01352
– ident: 18872_CR20
  doi: 10.1109/CVPR.2017.685
– volume: 34
  start-page: 743
  issue: 4
  year: 2011
  ident: 18872_CR50
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2011.155
– ident: 18872_CR99
  doi: 10.1109/B-HTC50970.2020.9297902
– ident: 18872_CR83
  doi: 10.1109/CVPR46437.2021.00841
– volume: 8
  start-page: 170461
  year: 2020
  ident: 18872_CR61
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2020.3021508
– ident: 18872_CR39
  doi: 10.1007/978-3-319-10602-1_48
– ident: 18872_CR73
  doi: 10.1109/CVPR.2017.690
– volume: 81
  start-page: 37821
  issue: 26
  year: 2022
  ident: 18872_CR98
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-022-12962-5
– volume: 27
  start-page: 1239
  year: 2021
  ident: 18872_CR15
  publication-title: Sustain Prod Consumpt
  doi: 10.1016/j.spc.2021.02.025
– volume: 79
  start-page: 12615
  year: 2020
  ident: 18872_CR57
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-019-08523-y
– ident: 18872_CR37
  doi: 10.1109/PUNECON.2018.8745428
– ident: 18872_CR6
  doi: 10.1109/ICMLC.2018.8526987
– ident: 18872_CR71
  doi: 10.1109/CVPR.2016.91
– ident: 18872_CR32
– ident: 18872_CR74
– ident: 18872_CR86
  doi: 10.1109/ICCV48922.2021.00526
– ident: 18872_CR60
– volume: 77
  start-page: 157
  year: 2008
  ident: 18872_CR33
  publication-title: Int J Comput Vision
  doi: 10.1007/s11263-007-0090-8
– ident: 18872_CR68
  doi: 10.1109/ICCE-China.2017.7991122
– volume: 623
  start-page: 121957
  year: 2022
  ident: 18872_CR100
  publication-title: Int J Pharm
  doi: 10.1016/j.ijpharm.2022.121957
– ident: 18872_CR18
  doi: 10.1145/2809695.2809725
– ident: 18872_CR40
– volume: 7
  start-page: 104
  issue: 1
  year: 2021
  ident: 18872_CR93
  publication-title: Ict Express
  doi: 10.1016/j.icte.2020.07.008
– volume: 10
  start-page: 133936
  year: 2022
  ident: 18872_CR102
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2022.3230894
– ident: 18872_CR1
  doi: 10.3390/ai4010013
– volume: 237
  start-page: 121209
  year: 2024
  ident: 18872_CR101
  publication-title: Expert Syst Appli
  doi: 10.1016/j.eswa.2023.121209
– ident: 18872_CR31
– ident: 18872_CR104
– ident: 18872_CR41
  doi: 10.1007/s11263-020-01316-z
– volume: 82
  start-page: 26731
  issue: 17
  year: 2023
  ident: 18872_CR10
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-022-14305-w
– ident: 18872_CR56
  doi: 10.1007/978-3-319-46448-0_2
– ident: 18872_CR46
  doi: 10.1109/ICDAR.2011.290
– volume: 38
  start-page: 142
  issue: 1
  year: 2015
  ident: 18872_CR62
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2015.2437384
– ident: 18872_CR34
– ident: 18872_CR51
– ident: 18872_CR28
– ident: 18872_CR88
  doi: 10.1109/CVPR52729.2023.00721
– ident: 18872_CR3
  doi: 10.1109/ICCV.2009.5459257
– ident: 18872_CR45
  doi: 10.1109/ICDAR.2017.157
– ident: 18872_CR58
  doi: 10.1109/ICCV.2017.324
– ident: 18872_CR90
– ident: 18872_CR36
– ident: 18872_CR84
– ident: 18872_CR49
  doi: 10.1007/978-3-030-20887-5_43
– ident: 18872_CR80
  doi: 10.1109/CVPR46437.2021.00294
– ident: 18872_CR66
  doi: 10.1109/ICCV.2017.322
– volume: 111
  start-page: 98
  year: 2015
  ident: 18872_CR72
  publication-title: Int J Comput Vis
  doi: 10.1007/s11263-014-0733-5
– volume: 30
  start-page: 3212
  issue: 11
  year: 2019
  ident: 18872_CR4
  publication-title: IEEE Trans Neural Netw Learn Syst
  doi: 10.1109/TNNLS.2018.2876865
– ident: 18872_CR77
  doi: 10.1609/aaai.v34i07.6999
– ident: 18872_CR85
  doi: 10.1109/ICCV48922.2021.00349
– volume: 71
  start-page: 158
  year: 2017
  ident: 18872_CR91
  publication-title: Pattern Recog
  doi: 10.1016/j.patcog.2017.05.025
– ident: 18872_CR2
  doi: 10.1109/SYNASC.2018.00041
– ident: 18872_CR95
  doi: 10.1109/TSP52935.2021.9522653
– volume: 2
  start-page: 1383
  issue: 4
  year: 2013
  ident: 18872_CR5
  publication-title: Int J Adv Res Comput Eng Technol (IJARCET)
– ident: 18872_CR16
– volume: 396
  start-page: 39
  year: 2020
  ident: 18872_CR52
  publication-title: Neurocomput
  doi: 10.1016/j.neucom.2020.01.085
– ident: 18872_CR70
– ident: 18872_CR47
– ident: 18872_CR54
  doi: 10.1109/CVPR.2001.990517
– ident: 18872_CR92
  doi: 10.1109/CVPR.2014.81
– ident: 18872_CR78
– ident: 18872_CR48
  doi: 10.1109/CVPR.2017.474
– ident: 18872_CR81
– ident: 18872_CR103
– ident: 18872_CR8
  doi: 10.1109/RAICS.2013.6745491
SourceID proquest
crossref
springer
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 83535
SubjectTerms Accuracy
Computer Communication Networks
Computer Science
Computer vision
Data Structures and Information Theory
Detectors
Inference
Multimedia Information Systems
Object recognition
Performance measurement
Real time
Special Purpose and Application-Based Systems
Telematics
Time measurement
Track 6: Computer Vision for Multimedia Applications
Title YOLO-based Object Detection Models: A Review and its Applications
URI https://link.springer.com/article/10.1007/s11042-024-18872-y
https://www.proquest.com/docview/3115599379
Volume 83
linkProvider Springer Nature