Monte Carlo tree search with temporal-difference learning for general video game playing

Bibliographic Details
Published in	IEEE Conference on Computational Intelligence and Games (Print), pp. 317-324
Main Authors	Ilhan, Ercument; Etaner-Uyar, A. Sima
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2017
Subjects	Computational intelligence; Conferences; Games; Learning systems; Mathematical model; Monte Carlo methods
ISSN	2325-4289
DOI	10.1109/CIG.2017.8080453

Summary:	General Video Game Playing (GVGP) is the problem of creating an agent that can successfully play multiple games with different properties without prior knowledge of them. As an important sub-field of General Artificial Intelligence, GVGP has drawn considerable interest, and research in the field intensified with the release of the General Video Game AI framework and competition. Although the problem has been approached with many different techniques, it is still far from solved. Monte Carlo Tree Search (MCTS) is one of the most promising baseline approaches in the literature. In this study, the MCTS algorithm is enhanced with a recently developed temporal-difference learning method, True Online Sarsa(lambda), so that it can exploit domain knowledge gained from past experience. Experiments show that the proposed modifications significantly improve the performance of MCTS in GVGP, and that applying reinforcement learning techniques in this domain is a promising subject for further research.
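
The summary names True Online Sarsa(lambda) as the temporal-difference method combined with MCTS. As a rough illustration only, the following is a minimal sketch of a single True Online Sarsa(lambda) update with linear function approximation, in the form the algorithm is commonly presented. The function name, parameter names, and default values are hypothetical, and the sketch does not show how the paper integrates the update into MCTS, which this record does not describe.

```python
from typing import List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length feature/weight vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def true_online_sarsa_step(
    w: List[float],        # weight vector (hypothetical name)
    z: List[float],        # dutch-style eligibility trace
    x: List[float],        # features of the current state-action pair
    x_next: List[float],   # features of the next pair (all zeros if terminal)
    reward: float,
    q_old: float,          # value estimate carried over from the previous step
    alpha: float = 0.1,    # step size (assumed value)
    gamma: float = 0.9,    # discount factor (assumed value)
    lam: float = 0.8,      # trace decay, the "lambda" in Sarsa(lambda)
) -> Tuple[List[float], List[float], float]:
    """Apply one True Online Sarsa(lambda) update; returns (w, z, q_old)."""
    q = dot(w, x)
    q_next = dot(w, x_next)
    delta = reward + gamma * q_next - q  # TD error

    # Dutch trace update; z^T x is computed with the trace before the update.
    zx = dot(z, x)
    z = [
        gamma * lam * zi + (1.0 - alpha * gamma * lam * zx) * xi
        for zi, xi in zip(z, x)
    ]

    # Weight update with the correction term that makes the method "true online".
    dq = q - q_old
    w = [
        wi + alpha * (delta + dq) * zi - alpha * dq * xi
        for wi, zi, xi in zip(w, z, x)
    ]
    return w, z, q_next


# Usage: one step from zero weights into a terminal state with reward 1.
w, z, q_old = true_online_sarsa_step(
    w=[0.0, 0.0], z=[0.0, 0.0], x=[1.0, 0.0], x_next=[0.0, 0.0],
    reward=1.0, q_old=0.0, alpha=0.5,
)
```

At the start of each episode the trace `z` and `q_old` would be reset to zero, while the weights `w` persist across episodes; the persistent weights are what lets past experience carry forward.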