Monte Carlo tree search with temporal-difference learning for general video game playing

Bibliographic Details
Published in	IEEE Conference on Computational Intelligence and Games (Print), pp. 317-324
Main Authors	Ilhan, Ercument; Etaner-Uyar, A. Sima
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2017
Subjects	Computational intelligence; Conferences; Games; Learning systems; Mathematical model; Monte Carlo methods
ISSN	2325-4289
DOI	10.1109/CIG.2017.8080453

Abstract	General Video Game Playing (GVGP) is the problem of creating an agent that can successfully play multiple games with different properties without any prior knowledge of them. As an important subfield of general artificial intelligence, GVGP has drawn considerable interest, and research in the field intensified with the release of the General Video Game AI framework and competition. Although the problem has been approached with many different techniques, it is still far from solved. Monte Carlo Tree Search (MCTS) is one of the most promising baseline approaches in the literature. In this study, the MCTS algorithm is enhanced with a recently developed temporal-difference learning method, True Online Sarsa(λ), so that it can exploit domain knowledge gained from past experience. Experiments show that the proposed modifications significantly improve the performance of MCTS in GVGP, and that applying reinforcement learning techniques in this domain is a promising subject that needs further research.
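The abstract names True Online Sarsa(λ) but this record does not describe how the method is combined with MCTS. As a minimal sketch only, the block below shows the textbook True Online Sarsa(λ) weight update with linear function approximation, as it is commonly presented in the reinforcement learning literature. The class name, the feature vectors, and the hyperparameter values are illustrative assumptions and are not taken from the paper.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class TrueOnlineSarsaLambda:
    """Hypothetical helper: linear action-value estimate q(s, a) = w . x(s, a)."""

    def __init__(self, n_features: int, alpha: float = 0.1,
                 gamma: float = 0.9, lam: float = 0.8) -> None:
        self.alpha = alpha      # step size (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.lam = lam          # trace decay rate (assumed value)
        self.w = [0.0] * n_features
        self.z = [0.0] * n_features   # dutch-style eligibility trace
        self.q_old = 0.0

    def start_episode(self) -> None:
        """Reset the trace and the stored previous estimate; keep the weights."""
        self.z = [0.0] * len(self.w)
        self.q_old = 0.0

    def value(self, x: List[float]) -> float:
        return dot(self.w, x)

    def update(self, x: List[float], reward: float, x_next: List[float]) -> None:
        """One update for the transition (x, reward, x_next).

        Pass an all-zero x_next when the next state is terminal.
        """
        a, g, l = self.alpha, self.gamma, self.lam
        q = dot(self.w, x)
        q_next = dot(self.w, x_next)
        delta = reward + g * q_next - q
        # Dutch trace: z <- g*l*z + (1 - a*g*l*(z . x)) * x
        coef = 1.0 - a * g * l * dot(self.z, x)
        self.z = [g * l * zi + coef * xi for zi, xi in zip(self.z, x)]
        # Weight update with the true-online correction term.
        diff = q - self.q_old
        self.w = [wi + a * (delta + diff) * zi - a * diff * xi
                  for wi, zi, xi in zip(self.w, self.z, x)]
        self.q_old = q_next
```

In an MCTS setting, one plausible use (again an assumption, not the paper's design) is to call `update` along the state-action features visited during a simulation and to use `value` as a learned estimate when choosing or evaluating actions.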
Author	Etaner-Uyar, A. Sima; Ilhan, Ercument
Author_xml	1. Ilhan, Ercument (ilhane@itu.edu.tr), Grad. Sch. of Sci., Istanbul Tech. Univ., Istanbul, Turkey; 2. Etaner-Uyar, A. Sima (etaner@itu.edu.tr), Dept. of Comput. Eng., Istanbul Tech. Univ., Istanbul, Turkey
Discipline	Computer Science
EISBN	9781538632338 1538632330
EISSN	2325-4289
EndPage	324
ExternalDocumentID	8080453
Genre	orig-research
IsPeerReviewed	false
IsScholarly	false
Language	English
PageCount	8
PublicationCentury	2000
PublicationDate	2017-Aug.
PublicationDateYYYYMMDD	2017-08-01
PublicationDate_xml	– month: 08 year: 2017 text: 2017-Aug.
PublicationDecade	2010
PublicationTitle	IEEE Conference on Computational Intelligence and Games (Print)
PublicationTitleAbbrev	CIG
PublicationYear	2017
Publisher	IEEE
Publisher_xml	– name: IEEE
StartPage	317
SubjectTerms	Computational intelligence; Conferences; Games; Learning systems; Mathematical model; Monte Carlo methods
Title	Monte Carlo tree search with temporal-difference learning for general video game playing
URI	https://ieeexplore.ieee.org/document/8080453