PETR: Position Embedding Transformation for Multi-view 3D Object Detection
Published in | Computer Vision - ECCV 2022, Vol. 13687, pp. 531-548 |
---|---|
Main Authors | , , , |
Format | Book Chapter |
Language | English |
Published | Switzerland, Springer, 2022, Springer Nature Switzerland |
Series | Lecture Notes in Computer Science |
Summary: | In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks first on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR. |
---|---|
Bibliography: | Y. Liu and T. Wang—Equal contribution. |
ISBN: | 9783031198113 3031198115 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-031-19812-0_31 |