HoughLaneNet: Lane Detection with Deep Hough Transform and Dynamic Convolution

The task of lane detection has garnered considerable attention in the field of autonomous driving due to its complexity. Lanes can present difficulties for detection, as they can be narrow, fragmented, and often obscured by heavy traffic. However, it has been observed that the lanes have a geometric...

Full description

Saved in:

Bibliographic Details
Published in	arXiv.org
Main Authors	Jia-Qi, Zhang, Hao-Bin Duan, Jun-Long, Chen, Shamir, Ariel, Wang, Miao
Format	Paper
Language	English
Published	Ithaca Cornell University Library, arXiv.org 07.07.2023
Subjects	Computer architecture Computer networks Convolution Hough transformation Modules Parameters Straight lines
Online Access	Get full text

Cover

Loading…

More Information
Summary:	The task of lane detection has garnered considerable attention in the field of autonomous driving due to its complexity. Lanes can present difficulties for detection, as they can be narrow, fragmented, and often obscured by heavy traffic. However, it has been observed that the lanes have a geometrical structure that resembles a straight line, leading to improved lane detection results when utilizing this characteristic. To address this challenge, we propose a hierarchical Deep Hough Transform (DHT) approach that combines all lane features in an image into the Hough parameter space. Additionally, we refine the point selection method and incorporate a Dynamic Convolution Module to effectively differentiate between lanes in the original image. Our network architecture comprises a backbone network, either a ResNet or Pyramid Vision Transformer, a Feature Pyramid Network as the neck to extract multi-scale features, and a hierarchical DHT-based feature aggregation head to accurately segment each lane. By utilizing the lane features in the Hough parameter space, the network learns dynamic convolution kernel parameters corresponding to each lane, allowing the Dynamic Convolution Module to effectively differentiate between lane features. Subsequently, the lane features are fed into the feature decoder, which predicts the final position of the lane. Our proposed network structure demonstrates improved performance in detecting heavily occluded or worn lane images, as evidenced by our extensive experimental results, which show that our method outperforms or is on par with state-of-the-art techniques.
ISSN:	2331-8422