Region-aligned single-stage point cloud object detector with direct feature compression and cross-semantic attention mechanism
Published in | Pattern analysis and applications : PAA Vol. 28; no. 2 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | London, Springer London, 01.06.2025, Springer Nature B.V |
Subjects | |
Summary: | Lidar has become increasingly crucial for perception in autonomous driving due to its indispensable advantages. Current voxel-based single-stage detectors (SSDs) use a 3D sparse backbone followed by a 2D Bird’s Eye View (BEV) backbone to predict targets in the point cloud. However, the sparse-to-dense layer between these two stages not only complicates model design but also deforms the height structure in the feature representation, limiting the construction of the downstream backbone and the detector's ability to perceive objects. We therefore propose a Directly Sparse Feature Compression (DSFC) Block to make better use of 3D features when transforming them into 2D features. In addition, voxel-based SSDs are limited by the weak correlation between regression and semantic features and by a lack of global feature extraction. To address this, we propose a Cross-semantic Cross-dimension Multi-head Attention (CDMHA) Block, which uses regression features to strengthen the semantic branch. Experiments on the KITTI dataset show that the DSFC Block is more effective than the vanilla approach, and that the Cross-semantic CDMHA Block, built on the CDMHA mechanism, improves the object detection capability of various mainstream voxel-based SSDs. We also design a network named RA-SSD to demonstrate the compatibility of the proposed methods. Experiments show that RA-SSD improves on the baseline model in all categories. |
---|---|
ISSN: | 1433-7541 1433-755X |
DOI: | 10.1007/s10044-025-01467-0 |
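
For readers unfamiliar with the sparse-to-dense layer that the summary criticizes, the following NumPy sketch shows one common way such a layer is written in voxel-based detectors: sparse voxel features are scattered into a dense grid and the height axis is folded into the channel axis. It illustrates only the vanilla baseline, not the paper's DSFC Block, whose design is not described in this record. The function name `sparse_to_dense_bev`, the (z, y, x) index order, and the toy shapes are assumptions made for this example.

```python
import numpy as np


def sparse_to_dense_bev(coords, feats, grid_shape):
    """Scatter sparse voxel features into a dense grid, then fold height into channels.

    coords:     (N, 3) integer voxel indices, assumed to be ordered (z, y, x)
    feats:      (N, C) feature vector for each non-empty voxel
    grid_shape: (D, H, W) size of the voxel grid
    returns:    (C * D, H, W) bird's-eye-view feature map
    """
    depth, height, width = grid_shape
    channels = feats.shape[1]

    # Dense 4D volume; empty voxels stay zero.
    dense = np.zeros((channels, depth, height, width), dtype=feats.dtype)
    z, y, x = coords[:, 0], coords[:, 1], coords[:, 2]
    dense[:, z, y, x] = feats.T  # (C, N) written at the N occupied cells

    # Height compression: each BEV cell now stacks all height slices as channels,
    # so the explicit vertical axis is no longer available downstream.
    return dense.reshape(channels * depth, height, width)


# Toy example: two voxels stacked vertically at the same (y, x) location.
coords = np.array([[0, 1, 2], [1, 1, 2]])
feats = np.array([[1.0, 2.0], [3.0, 4.0]])
bev = sparse_to_dense_bev(coords, feats, grid_shape=(2, 3, 4))
print(bev.shape)      # (4, 3, 4)
print(bev[:, 1, 2])   # [1. 3. 2. 4.]
```

In the toy output, the two vertically stacked voxels end up interleaved along a single channel axis. This is the kind of flattening of height structure that the summary refers to.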