Vision-Based Gesture Recognition in Human-Robot Teams Using Synthetic Data


Bibliographic Details
Published in: Proceedings of the ... IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 10278 - 10284
Main Authors: de Melo, Celso M.; Rothrock, Brandon; Gurram, Prudhvi; Ulutan, Oytun; Manjunath, B.S.
Format: Conference Proceeding
Language: English
Published: IEEE, 24.10.2020
Summary: Building successful collaboration between humans and robots requires efficient, effective, and natural communication. Here we study an RGB-based deep learning approach for controlling robots through gestures (e.g., "follow me"). To address the challenge of collecting high-quality annotated data from human subjects, synthetic data is considered for this domain. We contribute a dataset of gestures that includes real videos with human subjects and synthetic videos from our custom simulator. A solution is presented for gesture recognition based on the state-of-the-art I3D model. Comprehensive testing was conducted to optimize the parameters for this model. Finally, to gather insight on the value of synthetic data, several experiments are described that systematically study the properties of synthetic data (e.g., gesture variations, character variety, generalization to new gestures). We discuss practical implications for the design of effective human-robot collaboration and the usefulness of synthetic data for deep learning.
ISSN: 2153-0866
DOI: 10.1109/IROS45743.2020.9340728
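
The summary above describes recognizing gestures from RGB video clips with a 3D-CNN classifier (the paper uses the I3D model). The sketch below is a minimal, hypothetical illustration of that setup, not the authors' implementation: it uses torchvision's r3d_18 as a readily available stand-in for I3D, and the gesture count, clip length, and resolution are assumptions rather than values from the paper.

# Hypothetical sketch of 3D-CNN gesture classification from RGB clips.
# The paper builds on I3D; r3d_18 is used here only as an accessible stand-in.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

NUM_GESTURES = 7      # assumed number of gesture classes (e.g., "follow me", "halt", ...)
CLIP_FRAMES = 32      # assumed number of frames sampled per video clip
CLIP_SIZE = 112       # assumed spatial resolution of each frame

# Load a video backbone pretrained on Kinetics and swap in a gesture classification head.
model = r3d_18(weights="DEFAULT")
model.fc = nn.Linear(model.fc.in_features, NUM_GESTURES)
model.eval()

# A dummy batch of RGB clips with shape (batch, channels, frames, height, width).
clips = torch.randn(2, 3, CLIP_FRAMES, CLIP_SIZE, CLIP_SIZE)

with torch.no_grad():
    logits = model(clips)              # (batch, NUM_GESTURES)
    predictions = logits.argmax(dim=1) # predicted gesture index per clip
print(predictions)

In practice the classifier head would be fine-tuned on the real and synthetic gesture videos; the forward pass shown here only demonstrates the expected clip tensor layout and output shape.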