Expanding human visual field: online learning of assistive camera views by an aerial co-robot

Bibliographic Details
Published in: Autonomous Robots, Vol. 46, No. 8, pp. 949–970
Main Authors: Bentz, William; Qian, Long; Panagou, Dimitra
Format: Journal Article
Language: English
Published: New York: Springer US, 01.12.2022 (Springer Nature B.V.)

More Information
Summary: We present a novel method by which an aerial robot can learn sequences of task-relevant camera views within a multitasking environment. The robot learns these views by tracking the visual gaze of a human collaborator wearing an augmented reality headset. The spatial footprint of the human’s visual field is integrated in time and then fit to a Gaussian mixture model via expectation maximization. The modes of this model represent the visual-interest regions of the environment, with each visual-interest region containing one human task. Using Q-learning, the robot is trained as to which visual-interest region it should photograph given the human’s most recent sequence of K tasks. This sequence of K tasks forms one state of a Markov Decision Process whose entry triggers an action: the robot’s selection of visual-interest region. The robot’s camera view is continuously streamed to the human’s augmented reality headset in order to artificially expand the human’s visual field of view. The intent is to increase the human’s multitasking performance and decrease their physical and mental effort. An experimental study is presented in which 24 humans were asked to complete toy construction tasks in parallel with spatially separated persistent monitoring tasks (e.g., buttons which would flash at random times to request input). Subjects participated in four 2-h sessions over multiple days. The efficacy of the autonomous view selection system is compared against control trials containing no assistance as well as supervised trials in which the subjects could directly command the robot to switch between views. The merits of this system were evaluated through both subjective measures, e.g., System Usability Scale and NASA Task Load Index, as well as objective measures, e.g., task completion time, reflex time, and head angular velocity. This algorithm is applicable to multitasking environments that require persistent monitoring of regions outside of a human’s (possibly restricted) field of view, e.g., spacecraft extravehicular activity.
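The summary describes a two-stage learning pipeline: expectation-maximization fitting of a Gaussian mixture model to the time-integrated gaze footprint, followed by tabular Q-learning over states formed from the human's K most recent tasks. The Python sketch below illustrates those two stages under illustrative assumptions only; the synthetic gaze data, the fixed number of regions, the history length K = 2, the learning parameters, and the match-the-next-task reward are stand-ins and are not taken from the paper.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stage 1: fit a Gaussian mixture model (via EM) to the accumulated spatial
# footprint of the human's visual field. The footprint is simulated here as
# 2-D samples clustered around three hypothetical task locations.
gaze_points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.2, size=(300, 2)),
    rng.normal(loc=[2.0, 1.0], scale=0.2, size=(300, 2)),
    rng.normal(loc=[-1.0, 2.0], scale=0.2, size=(300, 2)),
])
n_regions = 3  # assumed known here; the paper derives the regions from gaze data
gmm = GaussianMixture(n_components=n_regions, covariance_type="full").fit(gaze_points)
# Each mixture mode stands in for one visual-interest region (one human task).
print("region index of a new gaze sample:", int(gmm.predict([[1.9, 1.1]])[0]))

# Stage 2: tabular Q-learning in which a state is the sequence of the K most
# recent tasks and an action is the visual-interest region to photograph.
K = 2                      # task-history length (assumed value)
alpha, gamma, eps = 0.1, 0.9, 0.1
Q = {}                     # state tuple -> array of action values

def q_values(state):
    return Q.setdefault(state, np.zeros(n_regions))

def choose_view(state):
    # Epsilon-greedy choice of which region the robot should photograph.
    if rng.random() < eps:
        return int(rng.integers(n_regions))
    return int(np.argmax(q_values(state)))

# Toy training loop: the reward model (1 if the chosen view matches the
# human's next task) is an illustrative assumption, not the paper's design.
state = tuple(int(t) for t in rng.integers(n_regions, size=K))
for _ in range(5000):
    action = choose_view(state)
    next_task = int(rng.integers(n_regions))   # simulated next human task
    reward = 1.0 if action == next_task else 0.0
    next_state = state[1:] + (next_task,)
    q_values(state)[action] += alpha * (
        reward + gamma * q_values(next_state).max() - q_values(state)[action]
    )
    state = next_state

print("greedy view choice per 2-task history:")
for s in sorted(Q):
    print(s, "->", int(np.argmax(Q[s])))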
ISSN: 0929-5593, 1573-7527
DOI: 10.1007/s10514-022-10059-4