Expanding human visual field: online learning of assistive camera views by an aerial co-robot

Bibliographic Details
Published in: Autonomous Robots, Vol. 46, No. 8, pp. 949–970
Main Authors: Bentz, William; Qian, Long; Panagou, Dimitra
Format: Journal Article
Language: English
Published: New York: Springer US, 01.12.2022 (Springer Nature B.V.)

More Information
Summary: We present a novel method by which an aerial robot can learn sequences of task-relevant camera views within a multitasking environment. The robot learns these views by tracking the visual gaze of a human collaborator wearing an augmented reality headset. The spatial footprint of the human’s visual field is integrated in time and then fit to a Gaussian mixture model via expectation maximization. The modes of this model represent the visual-interest regions of the environment, with each visual-interest region containing one human task. Using Q-learning, the robot is trained as to which visual-interest region it should photograph given the human’s most recent sequence of K tasks. This sequence of K tasks forms one state of a Markov Decision Process whose entry triggers an action: the robot’s selection of visual-interest region. The robot’s camera view is continuously streamed to the human’s augmented reality headset in order to artificially expand the human’s visual field of view. The intent is to increase the human’s multitasking performance and decrease their physical and mental effort. An experimental study is presented in which 24 humans were asked to complete toy construction tasks in parallel with spatially separated persistent monitoring tasks (e.g., buttons which would flash at random times to request input). Subjects participated in four 2-h sessions over multiple days. The efficacy of the autonomous view selection system is compared against control trials containing no assistance as well as supervised trials in which the subjects could directly command the robot to switch between views. The merits of this system were evaluated through both subjective measures, e.g., System Usability Scale and NASA Task Load Index, as well as objective measures, e.g., task completion time, reflex time, and head angular velocity. This algorithm is applicable to multitasking environments that require persistent monitoring of regions outside of a human’s (possibly restricted) field of view, e.g., spacecraft extravehicular activity.
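The summary describes a two-stage learning pipeline: expectation-maximization fitting of a Gaussian mixture model to the time-integrated gaze footprint, followed by tabular Q-learning over states formed from the human's K most recent tasks. The Python sketch below illustrates those two stages under illustrative assumptions only; the synthetic gaze data, the fixed number of regions, the history length K = 2, the learning parameters, and the match-the-next-task reward are stand-ins and are not taken from the paper.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stage 1: fit a Gaussian mixture model (via EM) to the accumulated spatial
# footprint of the human's visual field. The footprint is simulated here as
# 2-D samples clustered around three hypothetical task locations.
gaze_points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.2, size=(300, 2)),
    rng.normal(loc=[2.0, 1.0], scale=0.2, size=(300, 2)),
    rng.normal(loc=[-1.0, 2.0], scale=0.2, size=(300, 2)),
])
n_regions = 3  # assumed known here; the paper derives the regions from gaze data
gmm = GaussianMixture(n_components=n_regions, covariance_type="full").fit(gaze_points)
# Each mixture mode stands in for one visual-interest region (one human task).
print("region index of a new gaze sample:", int(gmm.predict([[1.9, 1.1]])[0]))

# Stage 2: tabular Q-learning in which a state is the sequence of the K most
# recent tasks and an action is the visual-interest region to photograph.
K = 2                      # task-history length (assumed value)
alpha, gamma, eps = 0.1, 0.9, 0.1
Q = {}                     # state tuple -> array of action values

def q_values(state):
    return Q.setdefault(state, np.zeros(n_regions))

def choose_view(state):
    # Epsilon-greedy choice of which region the robot should photograph.
    if rng.random() < eps:
        return int(rng.integers(n_regions))
    return int(np.argmax(q_values(state)))

# Toy training loop: the reward model (1 if the chosen view matches the
# human's next task) is an illustrative assumption, not the paper's design.
state = tuple(int(t) for t in rng.integers(n_regions, size=K))
for _ in range(5000):
    action = choose_view(state)
    next_task = int(rng.integers(n_regions))   # simulated next human task
    reward = 1.0 if action == next_task else 0.0
    next_state = state[1:] + (next_task,)
    q_values(state)[action] += alpha * (
        reward + gamma * q_values(next_state).max() - q_values(state)[action]
    )
    state = next_state

print("greedy view choice per 2-task history:")
for s in sorted(Q):
    print(s, "->", int(np.argmax(Q[s])))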
ISSN: 0929-5593, 1573-7527
DOI: 10.1007/s10514-022-10059-4