AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 01.07.2024 |
Summary: | In this technical report, we present our solution for the Vision-Centric 3D
Occupancy and Flow Prediction track in the nuScenes Open-Occ Dataset Challenge
at CVPR 2024. Our approach is a two-stage framework that
improves 3D occupancy and flow prediction by incorporating adaptive forward
view transformation and flow modeling. We first train the occupancy model
independently, then train flow prediction using sequential frame
integration. Our method combines regression with classification to address
scale variations across scenes, and uses the predicted flow to warp
current voxel features to future frames, guided by future-frame ground truth.
Experimental results on the nuScenes dataset demonstrate significant
improvements in accuracy and robustness, showcasing the effectiveness of our
approach in real-world scenarios. Our single model based on Swin-Base ranks
second on the public leaderboard, validating the potential of our method in
advancing autonomous driving perception systems. |
---|---|
DOI: | 10.48550/arxiv.2407.01436 |
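
The summary mentions using predicted flow to warp current voxel features to future frames. The following is a minimal NumPy sketch of one way such a forward warp could look; it is not taken from the report. The function name `warp_voxels_forward`, the `(C, X, Y, Z)` feature layout, the 3-channel flow expressed in voxel units per frame, nearest-voxel rounding, and averaging of colliding voxels are all assumptions made for illustration.

```python
import numpy as np


def warp_voxels_forward(feats, flow):
    """Forward-warp voxel features by a per-voxel flow field (illustrative only).

    feats: (C, X, Y, Z) array of current-frame voxel features.
    flow:  (3, X, Y, Z) array, assumed displacement in voxel units to the next frame.
    Returns a (C, X, Y, Z) array. Target voxels hit by several sources hold the
    mean of those sources; target voxels hit by none stay zero.
    """
    C, X, Y, Z = feats.shape
    # Integer coordinates of every source voxel, shape (3, X, Y, Z).
    grid = np.stack(
        np.meshgrid(np.arange(X), np.arange(Y), np.arange(Z), indexing="ij")
    )
    # Nearest target voxel for each source voxel.
    target = np.rint(grid + flow).astype(np.int64)
    bounds = np.array([X, Y, Z]).reshape(3, 1, 1, 1)
    # Keep only sources whose target lies inside the grid.
    valid = np.all((target >= 0) & (target < bounds), axis=0)
    flat = np.ravel_multi_index(
        (target[0][valid], target[1][valid], target[2][valid]), (X, Y, Z)
    )
    # Scatter-add features and hit counts, then average collisions.
    summed = np.zeros((X * Y * Z, C), dtype=np.float64)
    counts = np.zeros(X * Y * Z, dtype=np.float64)
    np.add.at(summed, flat, feats[:, valid].T)
    np.add.at(counts, flat, 1.0)
    summed /= np.maximum(counts, 1.0)[:, None]
    return summed.T.reshape(C, X, Y, Z)


# Usage: shift every voxel one cell along x.
feats = np.zeros((1, 4, 4, 2))
feats[0, 1, 1, 0] = 5.0
flow = np.zeros((3, 4, 4, 2))
flow[0] = 1.0
warped = warp_voxels_forward(feats, flow)
```

In a training setup like the one the summary describes, the warped features would presumably be decoded and compared against the future frame's ground truth; that supervision step is not shown here, and a differentiable (for example trilinear) warp would be needed in place of rounding for gradients to reach the flow.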