YOLO-based Object Detection Models: A Review and its Applications


Bibliographic Details
Published in Multimedia Tools and Applications, Vol. 83, no. 35, pp. 83535-83574
Main Authors Vijayakumar, Ajantha; Vairavasundaram, Subramaniyaswamy
Format Journal Article
Language English
Published New York Springer US 01.10.2024
Springer Nature B.V

Abstract In computer vision, object detection is a classical and challenging problem: detecting objects accurately in images. With the significant advancement of deep learning techniques over the past decades, most researchers have worked on enhancing object detection, segmentation, and classification. Object detection performance is measured by both detection accuracy and inference time; two-stage detectors achieve better detection accuracy than single-stage detectors. The real-time object detection system YOLO was published in 2015 and has rapidly evolved through successive iterations, with the newest release, YOLOv8, appearing in January 2023. YOLO achieves high detection accuracy and fast inference with a single-stage detector, and many applications adopt YOLO versions because of their high inference speed. This paper presents a complete survey of YOLO versions up to YOLOv8. The article begins by explaining the performance metrics used in object detection, post-processing methods, dataset availability, and the most widely used object detection techniques; it then discusses the architectural design of each YOLO version. Finally, it reviews the diverse range of YOLO applications, highlighting each version's contributions.
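The performance metrics and post-processing methods the abstract refers to are conventionally built on intersection over union (IoU) and non-maximum suppression (NMS). The sketch below shows both as they are commonly defined in the object-detection literature; it is an illustrative sketch, not code from the surveyed article.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: visit boxes in descending score order,
    keeping a box only if it overlaps no already-kept box above the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, of two heavily overlapping detections of the same object, NMS keeps only the higher-scoring one, while a distant detection survives untouched.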
Author_xml – sequence: 1
  givenname: Ajantha
  surname: Vijayakumar
  fullname: Vijayakumar, Ajantha
  organization: School of Computing, SASTRA Deemed University
– sequence: 2
  givenname: Subramaniyaswamy
  surname: Vairavasundaram
  fullname: Vairavasundaram, Subramaniyaswamy
  email: vsubramaniyaswamy@gmail.com
  organization: School of Computing, SASTRA Deemed University
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
DOI 10.1007/s11042-024-18872-y
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
Computer Science
EISSN 1573-7721
EndPage 83574
ISSN 1380-7501
IsPeerReviewed true
IsScholarly true
Issue 35
Keywords YOLO
Computer Vision
Dataset
Deep Learning
Object detection
PageCount 40
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Dordrecht
PublicationSubtitle An International Journal
PublicationTitle Multimedia tools and applications
PublicationTitleAbbrev Multimed Tools Appl
PublicationYear 2024
Publisher Springer US
Springer Nature B.V
References_xml – reference: Object detection- https://www.frontiersin.org/articles/10.3389/frobt.2015.00029/full. Accessed 11 Nov 2023
– reference: LabelBox (2018), https://labelbox.com/product/annotate/. Accessed 5 Oct 2023
– reference: RanaMBhushanMMachine learning and deep learning approach for medical image analysis: diagnosis to detectionMultimed Tools Appli20238217267312676910.1007/s11042-022-14305-w
– reference: IoU- https://towardsdatascience.com/map-mean-average-precision-might-confuse-you-5956f1bfa9e2. Accessed 12 Sept 2023
– reference: Ghiasi G, Cui Y, Srinivas A, Qian R, Lin TY, Cubuk ED, ..., Zoph B (2021). Simple copy-paste is a strong data augmentation method for instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2918-2928
– reference: VoTT (visual object tagging tool) (2019), https://github.com/microsoft/VoTT/blob/master/README.md. Accessed 11 Oct 2023
– reference: Wang CY, Liao HYM, Yeh IH (2022) Designing network design strategies through gradient path analysis. arXiv preprint arXiv:2211.04800
– reference: MaDFangHWangNZhangCDongJHuHAutomatic detection and counting system for pavement cracks based on PCGAN and YOLO-MFIEEE Trans Intel Transport Syst20222311221662217810.1109/TITS.2022.3161960
– reference: AzizLSalamMSBHSheikhUUAyubSExploring deep learning-based architecture, strategies, applications and current trends in generic object detection: A comprehensive reviewIEEE Access2020817046117049510.1109/ACCESS.2020.3021508
– reference: Shao S, Li Z, Zhang T, Peng C, Yu G, Zhang X, ..., & Sun J (2019). Objects365: A large-scale, high-quality dataset for object detection. In: Proceedings of the IEEE/CVF international conference on computer vision (pp. 8430-8439).
– reference: LiKWanGChengGMengLHanJObject detection in optical remote sensing images: A survey and a new benchmarkISPRS Journal of Photogram Remote Sens202015929630710.1016/j.isprsjprs.2019.11.023
– reference: Ding X, Zhang X, Ma N, Han J, Ding G, Sun J (2021). Repvgg: Making vgg-style convnets great again. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 13733-13742
– reference: Ren S, He K Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28.
– reference: Sahin O, Ozer S (2021) Yolodrone: Improved yolo architecture for object detection in drone images. In: 2021 44th International Conference on Telecommunications and Signal Processing (TSP) (pp. 361-365). IEEE
– reference: Zhang S, Benenson R, Schiele B (2017) Citypersons: A diverse dataset for pedestrian detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3213-3221
– reference: He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision. 2961-2969
– reference: Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 580-587
– reference: Zhou J, Tian Y, Li W, Wang R, Luan Z, Qian D (2019) LADet: A light-weight and adaptive network for multi-scale object detection. In Asian Conference on Machine Learning. 912-923. PMLR
– reference: F1 score- https://encord.com/blog/f1-score-in-machine-learning/#:~:text=This%20is%20because%20the%20regular,the%20majority%20class's%20strong%20influence. Accessed 20 Jan 2024
– reference: CIFAR-10 Dataset. https://www.cs.toronto.edu/~kriz/cifar.html. Accessed 25 Oct 2023
– reference: NayagamMGRamarKA survey on real time object detection and tracking algorithmsInt J Appl Eng Res201510982908297
– reference: KangLLuZMengLGaoZYOLO-FA: Type-1 fuzzy attention based YOLO detector for vehicle detectionExpert Syst Appli202423712120910.1016/j.eswa.2023.121209
– reference: Søgaard A, Plank B, Hovy D (2014) Selection bias, label bias, and bias in ground truth. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts. pp. 11-13
– reference: Veit A, Matera T, Neumann L, Matas J, Belongie S (2016) Coco-text: Dataset and benchmark for text detection and recognition in natural images. arXiv preprint arXiv:1601.07140
– reference: IoU loss function: https://learnopencv.com/iou-loss-functions-object-detection/#ciou-complete-iou-loss. Accessed 14 Nov 2023
– reference: Feng C, Zhong Y, Gao Y, Scott MR, Huang W (2021) Tood: Task-aligned one-stage object detection. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 3490-3499). IEEE Computer Society
– reference: YanLWangQMaSWangJYuCSolve the puzzle of instance segmentation in videos: A weakly supervised framework with spatio-temporal collaborationIEEE Trans Circuits Syst Video Technol202233139340610.1109/TCSVT.2022.3202574
– reference: ZhangHHongXRecent progresses on object detection: a brief reviewMultimed Tools Appli201978278092784710.1007/s11042-019-07898-2
– reference: Varma S, Sreeraj M (2013). Object detection and classification in surveillance system. In 2013 IEEE Recent Advances in Intelligent Computational Systems (RAICS) (pp. 299-303). IEEE
– reference: Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. 779-788
– reference: Verma NK, Sharma T, Rajurkar SD, Salour A (2016). Object identification for inventory management using convolutional neural network. In 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) (pp. 1-6). IEEE
– reference: Deng J, Dong W, Socher R, Li L. J., Li K, Fei-Fei L (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition (pp. 248-255). IEEE.
– reference: CVAT (2023) https://github.com/opencv/cvat. Accessed 5 Oct 2023
– reference: Ding X, Chen H, Zhang X, Huang, K, Han J, Ding G (2022) Re-parameterizing your optimizers rather than architectures. arXiv preprint arXiv:2205.15242
– reference: Wang CY, Bochkovskiy A, Liao HYM (2023) YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7464-7475
– reference: Soviany P, Ionescu RT (2018). Optimizing the trade-off between single-stage and two-stage deep object detectors using image difficulty prediction. In: 2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) (pp. 209-214). IEEE
– reference: Doon R, Rawat TK, Gautam S (2018) Cifar-10 classification using deep convolutional neural network. In 2018 IEEE Punecon (pp. 1-5). IEEE
– reference: Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: Single shot multibox detector. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14 (pp. 21-37). Springer International Publishing
– reference: Cheng G, Han J (2016) A survey on object detection in optical remote sensing images. ISPRS J Photogram Remote Sens 117:11–28. doi: 10.1016/j.isprsjprs.2016.03.014
– reference: Girshick R, Donahue J, Darrell T, Malik J (2015) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Patt Analy Machine Intel 38(1):142–158. doi: 10.1109/TPAMI.2015.2437384
– reference: Manikandan NS, Ganesan K (2019). Deep learning based automatic video annotation tool for self-driving car. arXiv preprint arXiv:1904.12618
– reference: Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vision 57:137–154. doi: 10.1023/B:VISI.0000013087.49260.fb
– reference: Jamtsho Y, Riyamongkol P, Waranusast R (2021) Real-time license plate detection for non-helmeted motorcyclist using YOLO. ICT Express 7(1):104–109. doi: 10.1016/j.icte.2020.07.008
– reference: Bhambani K, Jain T, Sultanpure KA (2020) Real-time face mask and social distancing violation detection system using YOLO. In: 2020 IEEE Bangalore Humanitarian Technology Conference (B-HTC) (pp. 1-6). IEEE
– reference: Wu D, Lv S, Jiang M, Song H (2020) Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments. Comput Electron Agriculture 178:105742. doi: 10.1016/j.compag.2020.105742
– reference: Ficzere M, Mészáros LA, Kállai-Szabó N, Kovács A, Antal I, Nagy ZK, Galata DL (2022) Real-time coating thickness measurement and defect recognition of film coated tablets with machine vision and deep learning. Int J Pharm 623:121957. doi: 10.1016/j.ijpharm.2022.121957
– reference: Dollar P, Wojek C, Schiele B, Perona P (2011) Pedestrian detection: An evaluation of the state of the art. IEEE Trans Patt Analy Machine Intel 34(4):743–761. doi: 10.1109/TPAMI.2011.155
– reference: Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767
– reference: Yan L, Ma S, Wang Q, Chen Y, Zhang X, Savakis A, Liu D (2022) Video captioning using global-local representation. IEEE Trans Circuits Syst for Video Technol 32(10):6642–6656. doi: 10.1109/TCSVT.2022.3177320
– reference: Raab D, Fezer E, Breitenbach J, Baumgartl H, Sauter D, Buettner R (2022). A Deep Learning-Based Model for Automated Quality Control in the Pharmaceutical Industry. In: 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) (pp. 266-271). IEEE
– reference: Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, ..., & Zitnick CL (2014). Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13 (pp. 740-755). Springer International Publishing.
– reference: Hosang J, Benenson R, Schiele B (2017) Learning non-maximum suppression. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4507-4515
– reference: Fregin A, Muller J, Krebel U, Dietmayer K (2018) The driveu traffic light dataset: Introduction and comparison with existing datasets. In 2018 IEEE international conference on robotics and automation (ICRA) (pp. 3376-3383). IEEE.
– reference: Razakarivony S, Jurie F (2016) Vehicle detection in aerial imagery: A small target detection benchmark. J Vis Commun Image Represent 34:187–203. doi: 10.1016/j.jvcir.2015.11.002
– reference: Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Wei X (2022). YOLOv6: A single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976
– reference: He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Patt Analy Machine Intel 37(9):1904–1916. doi: 10.1109/TPAMI.2015.2389824
– reference: Umer S, Rout RK, Pero C, Nappi M (2022). Facial expression recognition with trade-offs between data augmentation and deep learning features. J Ambient Intel Humanized Comput. 1-15
– reference: Makesense (2021), https://github.com/peng-zhihui/Make-Sense. Accessed 29 Sept 2023
– reference: Nguyen HAT, Sophea T, Gheewala SH, Rattanakom R, Areerob T, Prueksakorn K (2021) Integrating remote sensing and machine learning into environmental monitoring and assessment of land use change. Sustain Prod Consumpt 27:1239–1254. doi: 10.1016/j.spc.2021.02.025
– reference: Everingham M, Eslami SA, Van Gool L, Williams CK, Winn J, Zisserman A (2015) The pascal visual object classes challenge: A retrospective. Int J Comput Vis 111:98–136. doi: 10.1007/s11263-014-0733-5
– reference: Cui Y, Yan L, Cao Z, Liu D. (2021). Tf-blender: Temporal feature blender for video object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 8138-8147)
– reference: Jocher G (2020) YOLOv5 by Ultralytics. https://github.com/ultralytics/yolov5. Accessed 12 Jan 2024
– reference: Fu J, Zhao C, Xia Y, Liu W (2020) Vehicle and wheel detection: a novel SSD-based approach and associated large-scale benchmark dataset. Multimed Tools Appl 79:12615–12634. doi: 10.1007/s11042-019-08523-y
– reference: Han X, Chang J, Wang K (2021) Real-time object detection based on YOLO-v2 for tiny vehicle object. Procedia Comput Sci 183:61–72. doi: 10.1016/j.procs.2021.02.031
– reference: Wang Y, Wang H, Xin Z (2022) Efficient detection model of steel strip surface defects based on YOLO-V7. IEEE Access 10:133936–133944. doi: 10.1109/ACCESS.2022.3230894
– reference: Wu X, Sahoo D, Hoi SC (2020) Recent advances in deep learning for object detection. Neurocomputing 396:39–64. doi: 10.1016/j.neucom.2020.01.085
– reference: Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision. 2980-2988
– reference: Grosicki E, El-Abed H (2011) Icdar 2011-french handwriting recognition competition. In 2011 International Conference on Document Analysis and Recognition (pp. 1459-1463). IEEE.
– reference: Labelimg (2022), https://github.com/HumanSignal/labelImg. Accessed 28 Sept 2023
– reference: Kachouane M, Sahki S, Lakrouf M, Ouadah N (2012) HOG based fast human detection. In: 2012 24th International Conference on Microelectronics (ICM) (pp. 1-4). IEEE.
– reference: Lingani GM, Rawat DB Garuba M (2019). Smart traffic management system using deep learning for smart city applications. In:2019 IEEE 9th annual computing and communication workshop and conference (CCWC) (pp. 0101-0106). IEEE.
– reference: Padilla R, Netto SL, Da Silva EA (2020). A survey on performance metrics for object-detection algorithms. In 2020 international conference on systems, signals and image processing (IWSSIP) (pp. 237-242). IEEE.
– reference: Cucliciu T, Lin CY, Muchtar K (2017). A DPM based object detector using HOG-LBP features. In: 2017 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW) (pp. 315-316). IEEE
– reference: Zhang H, Wang Y, Dayoub F, Sunderhauf N (2021) Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 8514-8523
– reference: Ch'ng CK, Chan CS (2017) Total-text: A comprehensive dataset for scene text detection and recognition. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR) (Vol. 1, pp. 935-942). IEEE.
– reference: Jiang Y, Qiu H, McCartney M, Sukhatme G, Gruteser M, Bai F, ..., Govindan R (2015). Carloc: Precise positioning of automobiles. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (pp. 253-265)
– reference: Furusho Y, Ikeda K (2020) Theoretical analysis of skip connections and batch normalization from generalization and optimization perspectives. APSIPA Transactions on Signal and Information Processing 9
– reference: Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7263-7271
– reference: Shu C, Liu Y, Gao J, Yan Z, Shen C (2021) Channel-wise knowledge distillation for dense prediction. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5311-5320
– reference: Tousch AM, Herbin S, Audibert JY (2012) Semantic hierarchies for image annotation: A survey. Patt Recog 45(1):333–345. doi: 10.1016/j.patcog.2011.05.017
– reference: Durai SKS, Shamili MD (2022) Smart farming using machine learning and deep learning techniques. Decision Analy J 3:100041. doi: 10.1016/j.dajour.2022.100041
– reference: Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2017) mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412
– reference: Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, ..., Ferrari V (2020) The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale. International Journal of Computer Vision 128(7), 1956-1981
– reference: Viola P, Jones M (2001) Robust real-time object detection. Int J Comput Vision 4(34–47):4
– reference: Girshick R (2015). Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision. 1440-1448
– reference: Jaderberg M, Simonyan K, Vedaldi A, Zisserman A (2014). Synthetic data and artificial neural networks for natural scene text recognition. arXiv preprint arXiv:1406.2227
– reference: Li X, Wang W, Wu L, Chen S, Hu X, Li J, ..., Yang J (2020) Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Advances in Neural Information Processing Systems, 33, 21002-21012.
– reference: Russell BC, Torralba A, Murphy KP, Freeman WT (2008) LabelMe: a database and web-based tool for image annotation. Int J Comput Vision 77:157–173. doi: 10.1007/s11263-007-0090-8
– reference: Yuan L, Lu F (2018). Real-time ear detection based on embedded systems. In: 2018 International Conference on Machine Learning and Cybernetics (ICMLC) (Vol. 1, pp. 115-120). IEEE
– reference: Khurana K, Awasthi R (2013) Techniques for object recognition in images and multi-object detection. Int J Adv Res Comput Eng Technol (IJARCET) 2(4):1383–1388
– reference: Matsuzaka Y, Yashiro R (2023). AI-Based Computer Vision Techniques and Expert Systems. AI, 4(1), 289-302.
– reference: Vennelakanti A, Shreya S, Rajendran R, Sarkar, Muddegowda D, Hanagal P (2019) Traffic sign detection and recognition using a CNN ensemble. In 2019 IEEE international conference on consumer electronics (ICCE) (pp. 1-4). IEEE
– reference: Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D (2020) Distance-IoU loss: Faster and better learning for bounding box regression. In: Proceedings of the AAAI conference on artificial intelligence 34(07): 12993-13000
– reference: Zhao ZQ, Zheng P, Xu ST, Wu X (2019) Object detection with deep learning: A review. IEEE Trans Neural Netw Learn Syst 30(11):3212–3232. doi: 10.1109/TNNLS.2018.2876865
– reference: Nanni L, Ghidoni S, Brahnam S (2017) Handcrafted vs. non-handcrafted features for computer vision classification. Pattern Recog 71:158–172. doi: 10.1016/j.patcog.2017.05.025
– reference: Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001 (Vol. 1, pp. I-I). IEEE
– reference: YOLOv8: https://sandar-ali.medium.com/ultralytics-unveiled-yolov8-on-january-10-2023-which-has-garnered-over-one-million-downloads-338d8f11ec5. Accessed 20 Jan 2024
– reference: Wei X, Zhang H, Liu S, Lu Y (2020) Pedestrian detection in underground mines via parallel feature transfer network. Pattern Recog 103:107195. doi: 10.1016/j.patcog.2020.107195
– reference: Nguyen ND, Do T, Ngo TD, Le DD (2020) An evaluation of deep learning methods for small object detection. J Electric Comput Eng 2020:1–18. doi: 10.1155/2020/3189691
– reference: Imagenet Dataset, https://www.image-net.org/download.php. Accessed 28 Oct 2023
– reference: Roboflow (2020), https://roboflow.com/. Accessed 29 Sept 2023
– reference: Salari A, Djavadifar A, Liu X, Najjaran H (2022) Object recognition datasets and challenges: A review. Neurocomputing 495:129–152. doi: 10.1016/j.neucom.2022.01.022
– reference: Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934
– reference: Neumann L, Karg M, Zhang S, Scharfenberger C, Piegert E, Mistr S, ... ,Schiele B (2019). Nightowls: A pedestrians at night dataset. In Computer Vision–ACCV 2018: 14th Asian Conference on Computer Vision, Perth, Australia, December 2–6, 2018, Revised Selected Papers, Part I 14 (pp. 691-705). Springer International Publishing.
– reference: Jocher G, Chaurasia A, Qiu J (2023) YOLO by Ultralytics. https://github.com/ultralytics/ultralytics. Accessed 21 Jan 2024
– reference: Harzallah H, Jurie F, Schmid C (2009). Combining efficient object localization and image classification. In 2009 IEEE 12th international conference on computer vision (pp. 237-244). IEEE.
– reference: Dewi C, Chen RC, Jiang X, Yu H (2022) Deep convolutional neural network for enhancing traffic sign recognition developed on Yolo V4. Multimed Tools Appl 81(26):37821–37845. doi: 10.1007/s11042-022-12962-5
– ident: 18872_CR64
  doi: 10.1109/ICCV.2015.169
– volume: 45
  start-page: 333
  issue: 1
  year: 2012
  ident: 18872_CR27
  publication-title: Patt Recog
  doi: 10.1016/j.patcog.2011.05.017
– volume: 159
  start-page: 296
  year: 2020
  ident: 18872_CR43
  publication-title: ISPRS Journal of Photogram Remote Sens
  doi: 10.1016/j.isprsjprs.2019.11.023
– ident: 18872_CR38
– volume: 178
  start-page: 105742
  year: 2020
  ident: 18872_CR97
  publication-title: Comput Electron Agriculture
  doi: 10.1016/j.compag.2020.105742
– ident: 18872_CR105
  doi: 10.1109/ICCV48922.2021.00803
– ident: 18872_CR19
  doi: 10.1109/IWSSIP48289.2020.9145130
– volume: 34
  start-page: 187
  year: 2016
  ident: 18872_CR44
  publication-title: J Vis Commun Image Represent
  doi: 10.1016/j.jvcir.2015.11.002
– ident: 18872_CR9
  doi: 10.1109/AIPR.2016.8010578
– ident: 18872_CR22
  doi: 10.1109/ICCE.2019.8662019
– ident: 18872_CR82
– ident: 18872_CR29
– volume: 32
  start-page: 6642
  issue: 10
  year: 2022
  ident: 18872_CR106
  publication-title: IEEE Trans Circuits Syst for Video Technol
  doi: 10.1109/TCSVT.2022.3177320
– ident: 18872_CR24
  doi: 10.1109/ICCV.2019.00852
– ident: 18872_CR67
  doi: 10.1109/ICM.2012.6471380
– ident: 18872_CR13
  doi: 10.1109/CCWC.2019.8666539
– ident: 18872_CR35
– volume: 495
  start-page: 129
  year: 2022
  ident: 18872_CR69
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2022.01.022
– ident: 18872_CR87
– volume: 57
  start-page: 137
  year: 2004
  ident: 18872_CR53
  publication-title: Int J Comput Vision
  doi: 10.1023/B:VISI.0000013087.49260.fb
– ident: 18872_CR75
  doi: 10.1017/ATSIP.2020.7
– volume: 78
  start-page: 27809
  year: 2019
  ident: 18872_CR55
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-019-07898-2
– ident: 18872_CR30
– ident: 18872_CR76
– ident: 18872_CR25
  doi: 10.1109/ICRA.2018.8460737
– ident: 18872_CR23
  doi: 10.1007/s12652-020-02845-8
– volume: 183
  start-page: 61
  year: 2021
  ident: 18872_CR94
  publication-title: Procedia Comput Sci
  doi: 10.1016/j.procs.2021.02.031
– volume: 2020
  start-page: 1
  year: 2020
  ident: 18872_CR59
  publication-title: J Electric Comput Eng
  doi: 10.1155/2020/3189691
– ident: 18872_CR17
– ident: 18872_CR11
  doi: 10.1109/COMPSAC54236.2022.00045
– volume: 4
  start-page: 4
  issue: 34–47
  year: 2001
  ident: 18872_CR12
  publication-title: Int J Comput Vision
– volume: 33
  start-page: 393
  issue: 1
  year: 2022
  ident: 18872_CR107
  publication-title: IEEE Trans Circuits Syst Video Technol
  doi: 10.1109/TCSVT.2022.3202574
– ident: 18872_CR26
  doi: 10.1109/CVPR.2009.5206848
– ident: 18872_CR65
– volume: 117
  start-page: 11
  year: 2016
  ident: 18872_CR42
  publication-title: ISPRS J Photogram Remote Sens
  doi: 10.1016/j.isprsjprs.2016.03.014
– volume: 3
  start-page: 100041
  year: 2022
  ident: 18872_CR14
  publication-title: Decision Analy J
  doi: 10.1016/j.dajour.2022.100041
– ident: 18872_CR79
– volume: 23
  start-page: 22166
  issue: 11
  year: 2022
  ident: 18872_CR96
  publication-title: IEEE Trans Intel Transport Syst
  doi: 10.1109/TITS.2022.3161960
– volume: 10
  start-page: 8290
  issue: 9
  year: 2015
  ident: 18872_CR7
  publication-title: Int J Appl Eng Res
– volume: 103
  start-page: 107195
  year: 2020
  ident: 18872_CR21
  publication-title: Pattern Recog
  doi: 10.1016/j.patcog.2020.107195
– volume: 37
  start-page: 1904
  issue: 9
  year: 2015
  ident: 18872_CR63
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2015.2389824
– ident: 18872_CR89
  doi: 10.1109/CVPR46437.2021.01352
– ident: 18872_CR20
  doi: 10.1109/CVPR.2017.685
– volume: 34
  start-page: 743
  issue: 4
  year: 2011
  ident: 18872_CR50
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2011.155
– ident: 18872_CR99
  doi: 10.1109/B-HTC50970.2020.9297902
– ident: 18872_CR83
  doi: 10.1109/CVPR46437.2021.00841
– volume: 8
  start-page: 170461
  year: 2020
  ident: 18872_CR61
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2020.3021508
– ident: 18872_CR39
  doi: 10.1007/978-3-319-10602-1_48
– ident: 18872_CR73
  doi: 10.1109/CVPR.2017.690
– volume: 81
  start-page: 37821
  issue: 26
  year: 2022
  ident: 18872_CR98
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-022-12962-5
– volume: 27
  start-page: 1239
  year: 2021
  ident: 18872_CR15
  publication-title: Sustain Prod Consumpt
  doi: 10.1016/j.spc.2021.02.025
– volume: 79
  start-page: 12615
  year: 2020
  ident: 18872_CR57
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-019-08523-y
– ident: 18872_CR37
  doi: 10.1109/PUNECON.2018.8745428
– ident: 18872_CR6
  doi: 10.1109/ICMLC.2018.8526987
– ident: 18872_CR71
  doi: 10.1109/CVPR.2016.91
– ident: 18872_CR32
– ident: 18872_CR74
– ident: 18872_CR86
  doi: 10.1109/ICCV48922.2021.00526
– ident: 18872_CR60
– volume: 77
  start-page: 157
  year: 2008
  ident: 18872_CR33
  publication-title: Int J Comput Vision
  doi: 10.1007/s11263-007-0090-8
– ident: 18872_CR68
  doi: 10.1109/ICCE-China.2017.7991122
– volume: 623
  start-page: 121957
  year: 2022
  ident: 18872_CR100
  publication-title: Int J Pharm
  doi: 10.1016/j.ijpharm.2022.121957
– ident: 18872_CR18
  doi: 10.1145/2809695.2809725
– ident: 18872_CR40
– volume: 7
  start-page: 104
  issue: 1
  year: 2021
  ident: 18872_CR93
  publication-title: Ict Express
  doi: 10.1016/j.icte.2020.07.008
– volume: 10
  start-page: 133936
  year: 2022
  ident: 18872_CR102
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2022.3230894
– ident: 18872_CR1
  doi: 10.3390/ai4010013
– volume: 237
  start-page: 121209
  year: 2024
  ident: 18872_CR101
  publication-title: Expert Syst Appli
  doi: 10.1016/j.eswa.2023.121209
– ident: 18872_CR31
– ident: 18872_CR104
– ident: 18872_CR41
  doi: 10.1007/s11263-020-01316-z
– volume: 82
  start-page: 26731
  issue: 17
  year: 2023
  ident: 18872_CR10
  publication-title: Multimed Tools Appli
  doi: 10.1007/s11042-022-14305-w
– ident: 18872_CR56
  doi: 10.1007/978-3-319-46448-0_2
– ident: 18872_CR46
  doi: 10.1109/ICDAR.2011.290
– volume: 38
  start-page: 142
  issue: 1
  year: 2015
  ident: 18872_CR62
  publication-title: IEEE Trans Patt Analy Machine Intel
  doi: 10.1109/TPAMI.2015.2437384
– ident: 18872_CR34
– ident: 18872_CR51
– ident: 18872_CR28
– ident: 18872_CR88
  doi: 10.1109/CVPR52729.2023.00721
– ident: 18872_CR3
  doi: 10.1109/ICCV.2009.5459257
– ident: 18872_CR45
  doi: 10.1109/ICDAR.2017.157
– ident: 18872_CR58
  doi: 10.1109/ICCV.2017.324
– ident: 18872_CR90
– ident: 18872_CR36
– ident: 18872_CR84
– ident: 18872_CR49
  doi: 10.1007/978-3-030-20887-5_43
– ident: 18872_CR80
  doi: 10.1109/CVPR46437.2021.00294
– ident: 18872_CR66
  doi: 10.1109/ICCV.2017.322
– volume: 111
  start-page: 98
  year: 2015
  ident: 18872_CR72
  publication-title: Int J Comput Vis
  doi: 10.1007/s11263-014-0733-5
– volume: 30
  start-page: 3212
  issue: 11
  year: 2019
  ident: 18872_CR4
  publication-title: IEEE Trans Neural Netw Learn Syst
  doi: 10.1109/TNNLS.2018.2876865
– ident: 18872_CR77
  doi: 10.1609/aaai.v34i07.6999
– ident: 18872_CR85
  doi: 10.1109/ICCV48922.2021.00349
– volume: 71
  start-page: 158
  year: 2017
  ident: 18872_CR91
  publication-title: Pattern Recog
  doi: 10.1016/j.patcog.2017.05.025
– ident: 18872_CR2
  doi: 10.1109/SYNASC.2018.00041
– ident: 18872_CR95
  doi: 10.1109/TSP52935.2021.9522653
– volume: 2
  start-page: 1383
  issue: 4
  year: 2013
  ident: 18872_CR5
  publication-title: Int J Adv Res Comput Eng Technol (IJARCET)
– ident: 18872_CR16
– volume: 396
  start-page: 39
  year: 2020
  ident: 18872_CR52
  publication-title: Neurocomput
  doi: 10.1016/j.neucom.2020.01.085
– ident: 18872_CR70
– ident: 18872_CR47
– ident: 18872_CR54
  doi: 10.1109/CVPR.2001.990517
– ident: 18872_CR92
  doi: 10.1109/CVPR.2014.81
– ident: 18872_CR78
– ident: 18872_CR48
  doi: 10.1109/CVPR.2017.474
– ident: 18872_CR81
– ident: 18872_CR103
– ident: 18872_CR8
  doi: 10.1109/RAICS.2013.6745491
SourceID proquest
crossref
springer
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 83535
SubjectTerms Accuracy
Computer Communication Networks
Computer Science
Computer vision
Data Structures and Information Theory
Detectors
Inference
Multimedia Information Systems
Object recognition
Performance measurement
Real time
Special Purpose and Application-Based Systems
Telematics
Time measurement
Track 6: Computer Vision for Multimedia Applications
Title YOLO-based Object Detection Models: A Review and its Applications
URI https://link.springer.com/article/10.1007/s11042-024-18872-y
https://www.proquest.com/docview/3115599379
Volume 83
linkProvider Springer Nature