Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time

Bibliographic Details
Published in	Computational Theory of Mind for Human-Machine Teams, Vol. 13775, pp. 205-219
Main Authors	Zhang, Liang; Lieffers, Justin; Pyarelal, Adarsh
Format	Book Chapter
Language	English
Published	Switzerland: Springer, 2023 (Springer Nature Switzerland)
Series	Lecture Notes in Computer Science
Subjects	Neural networks; Theory of mind; Urban search and rescue
ISBN	3031216709 9783031216701
ISSN	0302-9743 1611-3349
DOI	10.1007/978-3-031-21671-8_13

Summary:	When performing complex tasks, humans naturally reason at multiple temporal and spatial resolutions simultaneously. We contend that for an artificially intelligent agent to effectively model human teammates, i.e., demonstrate computational theory of mind (ToM), it should do the same. In this paper, we present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time and evaluate it on data collected from human subjects performing simulated urban search and rescue (USAR) missions in a Minecraft-based environment. Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously. The high-resolution extractor encodes dynamically changing goals robustly by taking as input the Manhattan distance difference between the humans’ Minecraft avatars and candidate goals in the environment for the latest few actions, computed from a high-resolution gridworld representation. In contrast, the low-resolution extractor encodes participants’ historical behavior using a historical state matrix computed from a low-resolution graph representation. Through supervised learning, our model acquires a robust prior for human behavior prediction, and can effectively deal with long-term observations. Our experimental results demonstrate that our method significantly improves prediction accuracy compared to approaches that only use high-resolution information.
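
The high-resolution input described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `window` parameter, the (x, z) cell representation, and the sign convention (negative means the avatar moved closer to a goal) are all assumptions made for illustration. The sketch computes, for each of the latest few moves on a gridworld, the change in Manhattan distance between the avatar and every candidate goal.

```python
from typing import List, Sequence, Tuple

# A gridworld cell as (x, z) integer coordinates (assumed representation).
Cell = Tuple[int, int]


def manhattan(a: Cell, b: Cell) -> int:
    """Manhattan (L1) distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def distance_differences(
    trajectory: Sequence[Cell],
    goals: Sequence[Cell],
    window: int = 3,
) -> List[List[int]]:
    """Change in Manhattan distance to each goal over the latest moves.

    Returns one row per move (at most `window` rows, oldest first) and one
    column per candidate goal. A negative entry means that move brought the
    avatar closer to that goal (hypothetical sign convention).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # The last `window` moves need the last `window + 1` positions.
    recent = trajectory[-(window + 1):]
    rows: List[List[int]] = []
    for prev, curr in zip(recent, recent[1:]):
        rows.append([manhattan(curr, g) - manhattan(prev, g) for g in goals])
    return rows


if __name__ == "__main__":
    path = [(0, 0), (1, 0), (2, 0), (2, 1)]
    candidate_goals = [(3, 1), (0, 3)]
    print(distance_differences(path, candidate_goals, window=2))
```

A column that stays negative across the window suggests the avatar is heading toward that goal. In a setup like the one the summary describes, a matrix of this kind would be the input to the high-resolution feature extractor rather than a prediction in itself.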
Bibliography:	This research was conducted as part of DARPA’s Artificial Social Intelligence for Successful Teams (ASIST) program, and was sponsored by the Army Research Office and was accomplished under Grant Number W911NF-20-1-0002. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.