Improving Traffic Efficiency in a Road Network by Adopting Decentralised Multi-Agent Reinforcement Learning and Smart Navigation

Bibliographic Details
Published in	Promet, Vol. 35, no. 5, pp. 755-771
Main Authors	Trinh, Hung Tuan; Bae, Sang-Hoon; Tran, Quang Duy
Format	Journal Article Paper
Language	English
Published	Fakultet prometnih znanosti Sveučilišta u Zagrebu (University of Zagreb, Faculty of Transport and Traffic Sciences), 01.01.2023
Subjects	connected and autonomous vehicles (CAVs); deep neural network (DNN); deep reinforcement learning (DRL); multi-agent advantage actor-critic (MA-A2C); multi-agent reinforcement learning (MARL); traffic signal control

More Information
Summary:	In the future, mixed traffic flow will consist of human-driven vehicles (HDVs) and connected autonomous vehicles (CAVs). Effective traffic management is a global challenge, especially in urban areas with many intersections. Much research has focused on solving this problem to increase the performance of intersection networks. Reinforcement learning (RL) is a new approach to optimising traffic signal lights that overcomes the disadvantages of traditional methods. In this paper, we propose an integrated approach that combines multi-agent advantage actor-critic (MA-A2C) and smart navigation (SN) to solve the congestion problem in a road network under mixed traffic conditions. The A2C algorithm combines the advantages of value-based and policy-based methods to stabilise training by reducing variance. It also overcomes the limitations of centralised and independent MARL. In addition, the SN technique reroutes traffic load to alternate paths to avoid congestion at intersections. To evaluate the robustness of our approach, we compare our model against independent-A2C (I-A2C) and max pressure (MP). The results show that our proposed approach performs more efficiently than the others in terms of average waiting time, speed and queue length. The simulation results also suggest that the model is effective when the CAV penetration rate is greater than 20%.
Bibliography:	309457
ISSN:	0353-5320 1848-4069
DOI:	10.7307/ptt.v35i5.246