DW-YOLO: An Efficient Object Detector for Drones and Self-driving Vehicles

Object detection is frequently a challenging task due to poor visual cues of objects in an image. In this paper, a new efficient deep learning-based detection method, named as deeper and wider YOLO (DW-YOLO), has been proposed for various-sized objects from various perspectives. DW-YOLO is based on...

Full description

Saved in:
Bibliographic Details
Published inArabian journal for science and engineering (2011) Vol. 48; no. 2; pp. 1427 - 1436
Main Authors Chen, Yunfan, Zheng, Wenqi, Zhao, Yangyi, Song, Tae Hun, Shin, Hyunchul
Format Journal Article
LanguageEnglish
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.02.2023
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Object detection is frequently a challenging task due to poor visual cues of objects in an image. In this paper, a new efficient deep learning-based detection method, named as deeper and wider YOLO (DW-YOLO), has been proposed for various-sized objects from various perspectives. DW-YOLO is based on YOLOv5 and the following two enhancements have been developed to make the entire network deeper and wider. First, residual blocks in each cross stage partial structure are optimized to strengthen the ability of feature extraction in high-resolution drone images. Second, the entire network becomes wider by increasing the number of convolution kernels, aiming to obtain more discriminative features to fit complex data. The learning ability of a CNN model is related to its complexity. Making the network deeper can increase its complexity so that the ability of feature extraction is improved and the relationship between high-dimensional features can be easily learned. Increasing the network width can make each layer learn richer features in different directions and frequencies. Furthermore, a new large and diverse drone dataset named HDrone for object detection in real drone-view scenarios is introduced. This dataset involves six types of annotations in a wide range of scenarios, which is not limited to the traffic scenario. The experimental results on three datasets among which HDrone and VisDrone are the datasets for drone vision, and KITTI is the dataset for self-driving showing that the proposed DW-YOLO achieves the state-of-the-art results and can detect small-scaled objects well along with large-scaled objects.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2193-567X
1319-8025
2191-4281
DOI:10.1007/s13369-022-06874-7