BEVFormer: Learning Bird's-Eye-View Representation from Multi-camera Images via Spatiotemporal Transformers

Bibliographic Details
Published in: Computer Vision - ECCV 2022, Vol. 13669, pp. 1-18
Main Authors: Li, Zhiqi; Wang, Wenhai; Li, Hongyang; Xie, Enze; Sima, Chonghao; Lu, Tong; Qiao, Yu; Dai, Jifeng
Format: Book Chapter
Language: English
Published: Springer Nature Switzerland, 2022
Series: Lecture Notes in Computer Science
Summary: 3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems. In this work, we present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries. To aggregate spatial information, we design spatial cross-attention, in which each BEV query extracts spatial features from its regions of interest across camera views. For temporal information, we propose temporal self-attention to recurrently fuse the history BEV information. Our approach achieves a new state of the art of 56.9% on the NDS metric of the nuScenes test set, 9.0 points higher than the previous best result and on par with the performance of LiDAR-based baselines. The code is available at https://github.com/zhiqi-li/BEVFormer.
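The query/attention flow described in the summary can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the released code at https://github.com/zhiqi-li/BEVFormer uses deformable attention, while standard multi-head attention stands in for it here, and all module names, shapes, and hyperparameters below are illustrative assumptions.

# Minimal sketch of grid-shaped BEV queries with temporal self-attention
# and spatial cross-attention, assuming standard multi-head attention in
# place of the paper's deformable attention. All names and shapes are
# hypothetical.
import torch
import torch.nn as nn

class BEVFormerLayerSketch(nn.Module):
    def __init__(self, embed_dim=256, num_heads=8, bev_h=20, bev_w=20):
        super().__init__()
        # Grid-shaped BEV queries: one learnable embedding per BEV cell.
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, embed_dim))
        # Temporal self-attention: current queries attend to the history BEV.
        self.temporal_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Spatial cross-attention: queries extract features from camera views.
        self.spatial_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, cam_feats, prev_bev=None):
        # cam_feats: (B, num_cams * H * W, C) flattened multi-camera features.
        B = cam_feats.shape[0]
        q = self.bev_queries.unsqueeze(0).expand(B, -1, -1)
        if prev_bev is None:
            prev_bev = q  # first frame: degenerates to plain self-attention
        # Recurrently fuse history BEV information (temporal self-attention).
        q, _ = self.temporal_attn(q, prev_bev, prev_bev)
        # Aggregate spatial features across camera views (spatial cross-attention).
        bev, _ = self.spatial_attn(q, cam_feats, cam_feats)
        return bev  # (B, bev_h * bev_w, C): unified BEV representation

# Usage: 6 surround-view cameras with toy 10x10 feature maps, batch of 2.
layer = BEVFormerLayerSketch()
cams = torch.randn(2, 6 * 10 * 10, 256)
bev_t0 = layer(cams)                   # first frame, no history
bev_t1 = layer(cams, prev_bev=bev_t0)  # recurrent temporal fusion
print(bev_t1.shape)                    # torch.Size([2, 400, 256])

In the paper, spatial cross-attention samples only the camera views and image regions that each BEV grid cell projects onto; the dense attention above ignores that efficiency detail and serves only to show the data flow between history BEV, BEV queries, and multi-camera features.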
Bibliography: Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-20077-9_1.
Z. Li, W. Wang, and H. Li: equal contribution.
ISBN: 9783031200762; 3031200764
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-031-20077-9_1