A modified reinforcement learning algorithm for solving coordinated signalized networks

Bibliographic Details
Published in	Transportation research. Part C, Emerging technologies Vol. 54; pp. 40 - 55
Main Authors	Ozan, Cenk, Baskan, Ozgur, Haldenbilen, Soner, Ceylan, Halim
Format	Journal Article
Language	English
Published	Elsevier India Pvt Ltd 01.05.2015
Subjects	Algorithms; Coordinated signalized network; Learning; Mathematical models; Networks; Optimization; Performance indices; Reinforcement; Reinforcement learning; Signal timing optimization; Time measurements; TRANSYT-7F
Summary:	•The modified RL is developed to improve the performance of the RL algorithm. •The MORELTRANS model is proposed to solve the signal optimization problem in CSN. •MORELTRANS improves network performance, especially under heavy demand conditions. This study proposes a Reinforcement Learning (RL) based algorithm for finding optimum signal timings in Coordinated Signalized Networks (CSN) for a fixed set of link flows. For this purpose, the MOdified REinforcement Learning algorithm with TRANSYT-7F (MORELTRANS) model is proposed, combining the RL algorithm with TRANSYT-7F. The modified RL differs from other RL algorithms in that it exploits the best solution obtained in the previous learning episode: at each learning episode it generates a sub-environment of the same size as the original environment. The TRANSYT-7F traffic model is used to determine the network performance index, namely the disutility index. A numerical application is conducted on a medium-sized coordinated signalized road network. Results indicate that MORELTRANS produced slightly better objective function values than the Genetic Algorithm (GA) in signal timing optimization and outperformed Hill Climbing (HC). To show the capability of the proposed model under heavy demand, two cases are considered in which link flows are increased by 20% and 50% relative to the base case. MORELTRANS is found to reach good solutions for signal timing optimization even when demand is increased.
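The sub-environment idea described in the summary can be illustrated with a rough sketch. This is not the authors' MORELTRANS algorithm: the summary does not give the learning rules, so the loop below is a plain best-guided random search, and every name in it (`disutility_index`, `sub_environment`, `optimize`, the `radius` and `shrink` parameters, the timing bounds) is hypothetical. In particular, `disutility_index` is a made-up stand-in for the performance index that TRANSYT-7F would compute.

```python
import random


def disutility_index(timings):
    """Hypothetical stand-in for the TRANSYT-7F disutility index (lower is better).

    The real index comes from a traffic model; this toy quadratic only
    gives the search something to minimize.
    """
    target = [90.0, 30.0, 45.0]  # made-up values, for illustration only
    return sum((t - g) ** 2 for t, g in zip(timings, target))


def random_plan(bounds, rng):
    """Draw one signal timing plan uniformly within the given bounds."""
    return [rng.uniform(lo, hi) for lo, hi in bounds]


def sub_environment(best, bounds, size, radius, rng):
    """Generate `size` candidate plans around `best`, clipped to the bounds."""
    plans = []
    for _ in range(size):
        plan = [
            min(hi, max(lo, b + rng.uniform(-radius, radius) * (hi - lo)))
            for b, (lo, hi) in zip(best, bounds)
        ]
        plans.append(plan)
    return plans


def optimize(bounds, env_size=20, episodes=50, radius=0.2, shrink=0.95, seed=0):
    """Best-guided search: each episode builds a sub-environment, with the
    same number of plans as the original environment, around the best plan
    found so far."""
    rng = random.Random(seed)
    environment = [random_plan(bounds, rng) for _ in range(env_size)]
    best = min(environment, key=disutility_index)
    history = [disutility_index(best)]
    for _ in range(episodes):
        candidates = sub_environment(best, bounds, env_size, radius, rng)
        challenger = min(candidates, key=disutility_index)
        if disutility_index(challenger) < disutility_index(best):
            best = challenger  # keep the best solution across episodes
        history.append(disutility_index(best))
        radius *= shrink  # assumption: narrow the search as episodes progress
    return best, history


if __name__ == "__main__":
    # Hypothetical bounds for (cycle time, offset, green time) in seconds.
    plan, trace = optimize([(60.0, 120.0), (0.0, 60.0), (10.0, 50.0)])
    print(plan, trace[-1])
```

In a real setting the stand-in objective would be replaced by a call to the traffic model, and the acceptance and update rules would follow the paper rather than this simplified loop.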
ISSN:	0968-090X 1879-2359
DOI:	10.1016/j.trc.2015.03.010