A Monte Carlo Tree Search approach to finding efficient patrolling schemes on graphs
Published in | European journal of operational research Vol. 277; no. 1; pp. 255 - 268 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 16.08.2019 |
Subjects | |
Summary: | • Stackelberg Equilibrium in multi-step patrolling games is approximated.
• Monte Carlo Tree Search sampling is applied.
• Better scalability than state-of-the-art exact methods.
• Optimal or close-to-optimal solutions in the vast majority of test cases.
In this paper, we propose an evader-defender game for modeling multi-step patrolling scenarios on a graph. The game uses a specifically designed graph-based setting that captures the spatial arrangement of the protected area, for instance industrial premises or warehouses in which valuable assets are stored. The game is played by two sides: the evader, who attempts to steal or destroy the assets, and the defender, whose aim is to intercept the evader and prevent that goal from being accomplished.
In real-life settings the two sides have asymmetric information, since the evader can usually observe the defender’s patrolling schedules before deciding to attack. For this reason, we model the game using Stackelberg Game principles and focus on approximating the Stackelberg Equilibrium during the solution process. To this end, we propose a novel approach, called Mixed-UCT, which relies on the Upper Confidence Bounds applied to Trees (UCT) algorithm, a variant of Monte Carlo Tree Search.
The efficacy of the proposed solution method is experimentally evaluated on randomly generated games played in a warehouse-like industrial environment. The results show that Mixed-UCT is efficient and scales very well for multi-step games with a reasonable number of steps, leading to optimal or close-to-optimal strategies. |
---|---|
ISSN: | 0377-2217 1872-6860 |
DOI: | 10.1016/j.ejor.2019.02.017 |
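
The summary names UCT as the basis of Mixed-UCT without describing it. As a rough illustration only, the sketch below shows the standard UCB1 child-selection rule that generic UCT applies at each tree node; it is not the paper's Mixed-UCT, and the function name `ucb1_select`, the `(visits, total_reward)` representation, and the default exploration constant are assumptions made for this example.

```python
import math


def ucb1_select(children, exploration=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    children: list of (visits, total_reward) pairs, one per child node
    (a hypothetical representation chosen for this sketch).
    """
    total_visits = sum(visits for visits, _ in children)
    best_index, best_score = None, -math.inf
    for i, (visits, total_reward) in enumerate(children):
        if visits == 0:
            # An unvisited child is tried before any scores are compared.
            return i
        mean_reward = total_reward / visits
        # Exploration bonus shrinks as this child accumulates visits.
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        score = mean_reward + bonus
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With equal visit counts the rule prefers the child with the higher mean reward; a rarely visited child can still be chosen because its exploration bonus is larger.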