Tracking by segmentation with future motion estimation applied to person-following robots


Bibliographic Details
Published in Frontiers in Neurorobotics Vol. 17; p. 1255085
Main Authors Jiang, Shenlu, Cui, Runze, Wei, Runze, Fu, Zhiyang, Hong, Zhonghua, Feng, Guofu
Format Journal Article
Language English
Published Lausanne Frontiers Research Foundation 28.08.2023
Frontiers Media S.A
Summary: Person-following is a crucial capability for service robots, and vision is a leading approach to building environmental understanding. While most existing methodologies rely on a tracking-by-detection strategy, which requires extensive training datasets and yet remains susceptible to environmental noise, we propose a novel approach: real-time tracking-by-segmentation with a future motion estimation framework. This framework facilitates pixel-level tracking of a target individual and predicts their future motion. Our strategy leverages a single-shot segmentation tracking neural network for precise foreground segmentation to track the target, overcoming the limitations of using a rectangular region of interest (ROI). Here we clarify that, while the ROI provides a broad context, the segmentation within this bounding box offers a detailed and more accurate position of the human subject. To further improve our approach, a classification-lock pre-trained layer is utilized to form a constraint that curbs feature outliers originating from the person being tracked. A discriminative correlation filter estimates the potential target region in the scene to prevent foreground misrecognition, while a motion estimation neural network anticipates the target's future motion for use in the control module. We validated our proposed methodology using the VOT, LaSOT, YouTube-VOS, and DAVIS tracking datasets, demonstrating its effectiveness. Notably, our framework supports long-term person-following tasks in indoor environments, showing promise for practical implementation in service robots.
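The summary's point that a segmentation mask locates the person more accurately than a rectangular ROI can be illustrated with a toy example. The sketch below is not taken from the article: the binary mask, the function names `roi_center` and `mask_centroid`, and the use of a simple centroid as the "position" are all assumptions made purely for illustration. It compares the center of the tightest bounding box with the centroid of the foreground pixels for a figure with one outstretched arm.

```python
# Toy comparison of a bounding-box (ROI) center and a segmentation-mask centroid.
# The mask and helper names are hypothetical; the article's network is not reproduced.

def roi_center(mask):
    """Center (row, col) of the tightest bounding box around the foreground."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return ((rows[0] + rows[-1]) / 2, (cols[0] + cols[-1]) / 2)

def mask_centroid(mask):
    """Mean (row, col) of all foreground pixels."""
    pts = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    n = len(pts)
    return (sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n)

# 1 = person pixel. The torso occupies columns 1-2; an arm extends to column 5.
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
]

if __name__ == "__main__":
    print("ROI center:   ", roi_center(MASK))
    print("mask centroid:", mask_centroid(MASK))
```

In this toy case the arm stretches the bounding box to the right, so the ROI center drifts away from the torso, while the mask centroid stays closer to where most of the person's pixels lie.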
Edited by: Hang Su, Fondazione Politecnico di Milano, Italy
Reviewed by: Francesco Calimeri, Calhoun Community College, United States; Xianghuan Luo, Shenzhen University, China
ISSN: 1662-5218
DOI: 10.3389/fnbot.2023.1255085