Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 28.05.2019 |
Summary: | There has been growing interest in solving combinatorial optimization
problems with machine learning. Khalil et al. proposed an end-to-end
reinforcement learning framework, S2V-DQN, which automatically learns graph
embeddings to construct solutions to a wide range of problems. To improve the
generalization ability of their Q-learning method, we propose a novel learning
strategy based on AlphaGo Zero, a Go engine that achieved superhuman play
without domain knowledge of the game. Our framework is redesigned for
combinatorial problems, where the final reward may be any real number rather
than a binary win/lose outcome. In experiments on five kinds of NP-hard
problems, including MinimumVertexCover and MaxCut, our method is shown to
generalize better to various graphs than S2V-DQN. Furthermore, our method can
be combined with recently developed graph neural network (GNN) models such as
the *Graph Isomorphism Network*, resulting in even better performance. These
experiments also give insight into a suitable choice of GNN model for each
task. |
---|---|
DOI: | 10.48550/arxiv.1905.11623 |
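The abstract's key point is that, unlike Go, these combinatorial tasks end with a real-valued terminal reward (e.g. the negative size of a vertex cover) rather than a binary win/lose signal. A minimal sketch of that setup for MinimumVertexCover, with a random rollout standing in for the learned AlphaGo Zero-style policy (all names here are illustrative, not from the paper's code):

```python
import random

def uncovered_edges(edges, cover):
    """Edges with neither endpoint in the current cover."""
    return [(u, v) for (u, v) in edges if u not in cover and v not in cover]

def rollout(edges, policy, rng):
    """Build a cover by repeatedly letting `policy` pick a vertex.

    Returns (cover, terminal_reward), where the reward is the real
    number -|cover| — smaller covers score higher — not a win/lose flag.
    """
    cover = set()
    while True:
        remaining = uncovered_edges(edges, cover)
        if not remaining:
            return cover, -len(cover)  # real-valued terminal reward
        # Candidate actions: endpoints of still-uncovered edges.
        candidates = sorted({v for e in remaining for v in e})
        cover.add(policy(candidates, rng))

def random_policy(candidates, rng):
    """Stand-in for a learned policy/value network."""
    return rng.choice(candidates)

if __name__ == "__main__":
    rng = random.Random(0)
    # 5-cycle: the optimal vertex cover has size 3.
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    best = max((rollout(edges, random_policy, rng) for _ in range(200)),
               key=lambda cr: cr[1])
    print(best)  # e.g. a size-3 cover with reward -3
```

In the paper's framework a trained network would replace `random_policy`, and a search procedure would use the real-valued rewards to rank candidate moves; the sketch only shows the episode structure and reward convention.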