A Dynamic Game Framework for Rational and Persistent Robot Deception With an Application to Deceptive Pursuit-Evasion

Bibliographic Details
Published in	IEEE transactions on automation science and engineering Vol. 19; no. 4; pp. 2918 - 2932
Main Authors	Huang, Linan, Zhu, Quanyan
Format	Journal Article
Language	English
Published	New York IEEE 01.10.2022 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Adaptive control; Algorithms; Bayes methods; Bayesian analysis; Deception; Discrete-time Riccati equations; Game theory; Games; Intelligent robots; linear-quadratic (LQ) games; Machine learning; Maneuverability; Minimum cost; Multiagent systems; Nonlinear control; Optimal control; perfect Bayesian equilibrium; Players; Probability distribution; pursuit-evasion; Pursuit-evasion games; Random variables; Riccati equation; Riccati equations; robot deception; Robotics; Robots; Robustness (mathematics); Security; Stochastic processes

More Information
Summary:	This article studies rational and persistent deception among intelligent robots to enhance security and operational efficiency. We present an $N$-player $K$-stage game with an asymmetric information structure where each robot's private information is modeled as a random variable or its type. The deception is persistent as each robot's private type remains unknown to other robots for all stages. The deception is rational as robots aim to achieve their deception goals at minimum cost. Each robot forms a dynamic belief of others' types based on intrinsic or extrinsic information. Perfect Bayesian Nash equilibrium (PBNE) is a natural solution concept for dynamic games of incomplete information. Due to its requirements of sequential rationality and belief consistency, PBNE provides a reliable prediction of players' actions, beliefs, and expected cumulative costs over the entire $K$ stages. The contribution of this work is fourfold. First, we identify the PBNE computation as a nonlinear stochastic control problem and characterize the structures of players' actions and costs under PBNE. We further derive a set of extended Riccati equations with cognitive coupling under the linear-quadratic (LQ) setting and extrinsic belief dynamics. Second, we develop a receding-horizon algorithm with low temporal and spatial complexity to compute PBNE under intrinsic belief dynamics. Third, we investigate a deceptive pursuit-evasion game as a case study and use numerical experiments to corroborate the results. Finally, we propose metrics, such as deceivability, reachability, and the price of deception (PoD), to evaluate the strategy design and the system performance under deception.
Note to Practitioners: Recent advances in automation and adaptive control in multi-agent systems enable robots to use deception to accomplish their objectives. Deception involves intentional information hiding to compromise the security and operational efficiency of robotic systems. This work proposes a dynamic game framework to quantify the impact of deception, understand the robots' behaviors and intentions, and design cost-efficient strategies under deception that persists over stages. Existing research on robot deception has relied on experiments, while this work aims to lay a theoretical foundation of deception with quantitative metrics, such as deceivability and the PoD. The proposed model has wide applications, including cooperative robots, pursuit and evasion, and human-robot teaming. Pursuit-evasion games are used as case studies to show how the deceiver can amplify the deception by belief manipulation and how the deceived robots can reduce the negative impact of deception by enhanced maneuverability and Bayesian learning. Future work would focus on designing cooperative deception among swarm robots and robotic systems that are robust to, or further benefit from, the deception.
ISSN:	1545-5955 1558-3783
DOI:	10.1109/TASE.2021.3097286