Monte Carlo tree search with temporal-difference learning for general video game playing

Bibliographic Details
Published in IEEE Conference on Computational Intelligence and Games (Print) pp. 317 - 324
Main Authors Ilhan, Ercument; Etaner-Uyar, A. Sima
Format Conference Proceeding
Language English
Published IEEE 01.08.2017
ISSN 2325-4289
DOI 10.1109/CIG.2017.8080453

Summary: General Video Game Playing (GVGP) is the problem of creating an agent that can successfully play multiple games with different properties, with no prior knowledge about them. As an important sub-field of General Artificial Intelligence, GVGP has drawn considerable interest, and research in the field intensified with the release of the General Video Game AI framework and competition. Although the problem has been approached with many different techniques, it is still far from being solved. Monte Carlo Tree Search (MCTS) is one of the most promising baseline approaches in the literature. In this study, the MCTS algorithm is enhanced with a recently developed temporal-difference learning method, True Online Sarsa(lambda), so that it can exploit domain knowledge by using past experience. Experiments show that the proposed modifications significantly improve the performance of MCTS in GVGP, and that the application of reinforcement learning techniques in this domain is a promising subject that needs further research.
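
For readers unfamiliar with the learning method named in the summary, the following is a minimal sketch of the generic True Online Sarsa(lambda) update with linear function approximation, as the rule is commonly stated in the reinforcement learning literature. It is not the authors' implementation and does not show how the paper couples the learner with MCTS. The class name, the feature vectors, and the parameter values are assumptions chosen for illustration.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class TrueOnlineSarsaLambda:
    """Generic linear True Online Sarsa(lambda) learner (illustrative sketch).

    Hypothetical helper class: the names and defaults here are assumptions,
    not taken from the paper.
    """

    def __init__(self, n_features, alpha=0.1, gamma=0.9, lam=0.8):
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.w = [0.0] * n_features   # weight vector
        self.z = [0.0] * n_features   # dutch-style eligibility trace
        self.q_old = 0.0              # previous step's estimate of Q(s', a')

    def start_episode(self):
        """Reset the trace and the stored value at the start of an episode."""
        self.z = [0.0] * len(self.w)
        self.q_old = 0.0

    def value(self, x):
        """Estimated action value for feature vector x."""
        return dot(self.w, x)

    def update(self, x, reward, x_next):
        """Apply one update.

        x      : features of the current state-action pair
        x_next : features of the next state-action pair, or None if terminal
        """
        a, g, l = self.alpha, self.gamma, self.lam
        q = dot(self.w, x)
        q_next = 0.0 if x_next is None else dot(self.w, x_next)
        delta = reward + g * q_next - q            # TD error
        zx = dot(self.z, x)
        # Dutch trace update
        self.z = [g * l * zi + (1.0 - a * g * l * zx) * xi
                  for zi, xi in zip(self.z, x)]
        dq = q - self.q_old
        # Weight update with the true-online correction terms
        self.w = [wi + a * (delta + dq) * zi - a * dq * xi
                  for wi, zi, xi in zip(self.w, self.z, x)]
        self.q_old = q_next


# Usage: one terminal transition with reward 1 from zero-initialised weights.
learner = TrueOnlineSarsaLambda(n_features=2, alpha=0.5, gamma=1.0, lam=0.0)
learner.start_episode()
learner.update([1.0, 0.0], reward=1.0, x_next=None)
```

In an MCTS setting, a learner of this kind could supply value estimates that persist across simulations, which is one way past experience might be reused; the specific integration used in the study is described in the full paper.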