DW-YOLO: An Efficient Object Detector for Drones and Self-driving Vehicles

Bibliographic Details
Published in Arabian Journal for Science and Engineering (2011), Vol. 48, No. 2, pp. 1427-1436
Main Authors Chen, Yunfan; Zheng, Wenqi; Zhao, Yangyi; Song, Tae Hun; Shin, Hyunchul
Format Journal Article
Language English
Published Berlin/Heidelberg: Springer Berlin Heidelberg (Springer Nature B.V.), 01.02.2023
Abstract Object detection is often challenging because objects provide only weak visual cues in an image. In this paper, a new efficient deep-learning-based detection method, named deeper and wider YOLO (DW-YOLO), is proposed for objects of various sizes viewed from various perspectives. DW-YOLO is based on YOLOv5, and two enhancements are developed to make the entire network deeper and wider. First, the residual blocks in each cross-stage-partial (CSP) structure are optimized to strengthen feature extraction in high-resolution drone images. Second, the network is widened by increasing the number of convolution kernels, which yields more discriminative features for fitting complex data. The learning ability of a CNN is related to its complexity: making the network deeper increases its complexity, so feature extraction improves and relationships among high-dimensional features become easier to learn, while increasing the width lets each layer learn richer features in different directions and frequencies. Furthermore, a new large and diverse drone dataset named HDrone is introduced for object detection in real drone-view scenarios. It provides six types of annotations across a wide range of scenes and is not limited to traffic scenarios. Experimental results on three datasets, HDrone and VisDrone for drone vision and KITTI for self-driving, show that DW-YOLO achieves state-of-the-art results and detects small-scale objects well along with large-scale objects.
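To make the depth and width scaling described in the abstract concrete, the PyTorch sketch below shows how a YOLOv5-style cross-stage-partial stage could expose both knobs: a block count for depth and a channel multiplier for width. This is an illustrative sketch only, not the authors' implementation; the class and parameter names (CSPStage, n_blocks, width_mult) are hypothetical, and the exact block layout in DW-YOLO may differ.

import torch
import torch.nn as nn


class ConvBNAct(nn.Module):
    """k x k convolution followed by batch norm and SiLU activation."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class Residual(nn.Module):
    """Residual bottleneck: a 1x1 and a 3x3 conv plus an identity shortcut."""
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(ConvBNAct(c, c, 1), ConvBNAct(c, c, 3))

    def forward(self, x):
        return x + self.body(x)


class CSPStage(nn.Module):
    """Cross-stage-partial stage: split channels into two branches, run the
    residual blocks on one branch, concatenate with the other, then fuse."""
    def __init__(self, c_in, c_out, n_blocks=1, width_mult=1.0):
        super().__init__()
        c_out = int(round(c_out * width_mult))   # "wider": more convolution kernels
        c_hid = c_out // 2
        self.short = ConvBNAct(c_in, c_hid, 1)
        self.main = nn.Sequential(
            ConvBNAct(c_in, c_hid, 1),
            *[Residual(c_hid) for _ in range(n_blocks)],  # "deeper": more residual blocks
        )
        self.fuse = ConvBNAct(2 * c_hid, c_out, 1)

    def forward(self, x):
        return self.fuse(torch.cat((self.short(x), self.main(x)), dim=1))


# Baseline stage vs. a deeper-and-wider variant of the same stage.
baseline = CSPStage(64, 128, n_blocks=3, width_mult=1.0)
deeper_wider = CSPStage(64, 128, n_blocks=6, width_mult=1.25)
print(deeper_wider(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 160, 80, 80])

Raising n_blocks deepens each stage without touching the rest of the network, while width_mult greater than 1 widens every layer of the stage; at a high level these two knobs correspond to the paper's "deeper" and "wider" enhancements.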
Authors and Affiliations
1. Chen, Yunfan: School of Electrical and Electronic Engineering, Hubei University of Technology; Department of Electrical Engineering, Hanyang University
2. Zheng, Wenqi: Department of Electrical Engineering, Hanyang University
3. Zhao, Yangyi: Department of Electrical Engineering, Hanyang University
4. Song, Tae Hun: Huins Co
5. Shin, Hyunchul (email: shin@hanyang.ac.kr; ORCID 0000-0003-3020-5130): Department of Electrical Engineering, Hanyang University
Copyright King Fahd University of Petroleum & Minerals 2022
DOI 10.1007/s13369-022-06874-7
Discipline Engineering
EISSN 2191-4281
EndPage 1436
ISSN 2193-567X
1319-8025
Issue 2
Keywords Deep learning
Self-driving
Drone vision
Optimization
Object detection
ORCID 0000-0003-3020-5130
PageCount 10
PublicationDate 2023-02-01
PublicationPlace Berlin/Heidelberg
PublicationTitle Arabian journal for science and engineering (2011)
PublicationTitleAbbrev Arab J Sci Eng
PublicationYear 2023
Publisher Springer Berlin Heidelberg
Springer Nature B.V
StartPage 1427
SubjectTerms Annotations
Autonomous cars
Complexity
Datasets
Deep learning
Drone vehicles
Engineering
Feature extraction
Humanities and Social Sciences
Image resolution
Machine learning
multidisciplinary
Object recognition
Research Article-Computer Engineering and Computer Science
Science
Visual tasks
URI https://link.springer.com/article/10.1007/s13369-022-06874-7
https://www.proquest.com/docview/2774560426
Volume 48