Modeling the Effects of Autonomous Vehicles on Human Driver Car-Following Behaviors Using Inverse Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, Vol. 24, no. 12, pp. 13903-13915
Main Authors: Wen, Xiao; Jian, Sisi; He, Dengbo
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2023
Summary: The development of autonomous driving technology will lead to a transition period during which human-driven vehicles (HVs) will share the road with autonomous vehicles (AVs). Understanding the interactions between AVs and HVs is critical for traffic safety and efficiency. Previous studies have used traffic/numerical simulations and field experiments to investigate HVs' behavioral changes when following AVs. However, such approaches simplify the actual scenarios and may result in biased results. Therefore, the objective of this study is to realistically model HV-following-AV dynamics and their microscopic interactions, which are important for intelligent transportation applications. HV-following-AV and HV-following-HV events are extracted from the high-resolution (10 Hz) Waymo Open Dataset. Statistical test results reveal significant differences in calibrated intelligent driver model (IDM) parameters between HV-following-AV and HV-following-HV events. An inverse reinforcement learning model (Inverse soft-Q Learning) is proposed to retrieve HVs' reward functions in HV-following-AV events. A deep reinforcement learning (DRL) approach, soft actor-critic (SAC), is adopted to estimate the optimal policy for HVs when following AVs. The results show that, compared with other conventional and data-driven car-following models, the proposed model leads to significantly more accurate trajectory predictions. In addition, the recovered reward functions indicate that drivers' preferences when following AVs differ from those when following HVs.
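
As background for the IDM calibration mentioned in the summary, the following is a minimal Python sketch of the standard intelligent driver model acceleration equation. The parameter values and the example inputs are common illustrative defaults, not the calibrated values or data reported in the paper.

import numpy as np

def idm_acceleration(v, delta_v, s,
                     v0=30.0, T=1.5, a_max=1.0, b=1.5, s0=2.0, delta=4.0):
    # Standard IDM: v = follower speed (m/s), delta_v = v - v_lead (m/s),
    # s = bumper-to-bumper gap to the leader (m).
    # Parameters (v0, T, a_max, b, s0, delta) are illustrative defaults,
    # not the values calibrated from the Waymo Open Dataset in the paper.
    s_star = s0 + max(0.0, v * T + v * delta_v / (2.0 * np.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)

# Example: follower at 25 m/s closing at 2 m/s with a 30 m gap.
print(idm_acceleration(v=25.0, delta_v=2.0, s=30.0))

Calibrating the IDM amounts to fitting these parameters to observed car-following trajectories; the paper compares the fitted parameters between HV-following-AV and HV-following-HV events.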
ISSN: 1524-9050, 1558-0016
DOI: 10.1109/TITS.2023.3298150