Optimization-Based Monocular 3D Object Tracking via Combined Ellipsoid-Cuboid Representation

Bibliographic Details
Published in IEEE Access Vol. 12; pp. 109281-109292
Main Authors Kim, Gyeong Chan; Jang, Youngseok; Kim, H. Jin
Format Journal Article
Language English
Published IEEE 2024
Summary: Monocular 3D object tracking is challenging because a monocular image lacks the depth information necessary for 3D scene understanding. Modern methods typically rely on deep learning to reconstruct 3D information from learned priors, which demands strenuous effort to acquire ground-truth annotated data and does not generalize across camera settings. We present a method that continuously tracks the 3D location and orientation of a target object from a monocular image sequence, using the output of 2D instance segmentation methods. We reconstruct the structure and trajectory of the objects using factor graph optimization that incorporates the reprojection error of keypoint tracks, a kinematic motion model, and bounding box constraints. We propose a combined ellipsoid-cuboid object representation and a bounding box constraint to model the object dimensions. We evaluate our algorithm on a simulation dataset generated using CARLA, and the results indicate that the method is robust to 2D bounding box error and that the proposed object representation yields more accurate pose and size estimates than using either representation alone.
ISSN:2169-3536
DOI:10.1109/ACCESS.2024.3438162