Optimization-Based Monocular 3D Object Tracking via Combined Ellipsoid-Cuboid Representation


Bibliographic Details
Published in	IEEE Access, Vol. 12, pp. 109281-109292
Main Authors	Kim, Gyeong Chan; Jang, Youngseok; Kim, H. Jin
Format	Journal Article
Language	English
Published	IEEE 2024
Subjects	3D object tracking; Accuracy; Ellipsoids; Estimation; Graph optimization; Monocular vision; Object tracking; Shape; Solid modeling; Three-dimensional displays

Summary:	Monocular 3D object tracking is challenging because a monocular image lacks the depth information necessary for 3D scene understanding. Modern methods typically rely on deep learning to reconstruct 3D information from learned priors, which demands considerable effort to acquire ground-truth annotated data and does not generalize across camera settings. We present a method that continuously tracks the 3D location and orientation of a target object from a monocular image sequence, using the output of 2D instance segmentation methods. We reconstruct the structure and trajectory of objects using factor graph optimization that incorporates the reprojection error of keypoint tracks, a kinematic motion model, and bounding box constraints. We propose a combined ellipsoid-cuboid object representation and a bounding box constraint to model the object dimensions. We evaluate our algorithm on a simulation dataset generated using CARLA; the results indicate that the method is robust to 2D bounding box error and that the proposed object representation yields more accurate pose and size estimates than either representation alone.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2024.3438162