Real-Time Energy Management for a Hybrid Electric Vehicle Based on Heuristic Search

Bibliographic Details
Published in: IEEE Transactions on Vehicular Technology, Vol. 71, no. 12, pp. 12635-12647
Main Authors: Yang, Ningkang; Han, Lijin; Xiang, Changle; Liu, Hui; Ma, Tian; Ruan, Shumin
Format: Journal Article
Language: English
Published: New York: IEEE, 01.12.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: To address air pollution and reduce energy costs, hybrid electric vehicles (HEVs) are being adopted on an ever-larger scale, and their energy management strategies (EMSs) are essential for achieving high fuel efficiency. This paper proposes a reinforcement learning (RL) based real-time EMS for an HEV that requires no prior knowledge of future driving conditions. First, since practical driving conditions are time-varying, an online recursive Markov chain model is constructed, which depicts the stochastic environment more accurately. Then, a novel RL framework, heuristic search, is introduced, which uses extra knowledge in the search process. It aims to find an optimal action for a specific state rather than learn a policy for an entire driving cycle, as traditional RL algorithms do. Consequently, the learning process is significantly more efficient and real-time optimization can be realized. Moreover, a Q-table is learned offline and used online as a heuristic function to direct the search, which accelerates convergence and further improves real-time performance. In simulation, the effectiveness of the recursive Markov chain model and the superiority of the heuristic function are verified. A comparative simulation of heuristic search, Q-learning, dynamic programming (DP), and the equivalent consumption minimization strategy is then conducted and evaluated. The proposed method consumes only 1.63% more fuel than DP in a completely unknown driving cycle, demonstrating its control performance and real-time capability.
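The online recursive Markov chain mentioned in the summary can be sketched in a few lines. The abstract does not give the paper's exact update rule, so the sketch below is a generic recursive transition-probability estimator with a forgetting factor over discretized power-demand states; the class name, state discretization, and forgetting parameter are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

class RecursiveMarkovChain:
    """Online estimate of a transition-probability matrix for a
    discretized driving-condition signal (e.g. power demand).
    Old observations are discounted by a forgetting factor so the
    model tracks changing driving conditions over time."""

    def __init__(self, n_states, forgetting=0.98):
        self.lam = forgetting                         # weight kept on past evidence
        self.counts = np.ones((n_states, n_states))   # uniform prior avoids zero rows

    def update(self, s_prev, s_curr):
        # Discount the old evidence for this source state,
        # then count the newly observed transition.
        self.counts[s_prev] *= self.lam
        self.counts[s_prev, s_curr] += 1.0

    def transition_probs(self, s):
        # Normalized row = current transition-probability estimate.
        row = self.counts[s]
        return row / row.sum()

# Example: feed a short sequence of discretized power-demand states.
mc = RecursiveMarkovChain(n_states=4)
sequence = [0, 1, 1, 2, 1, 0, 0, 1]
for a, b in zip(sequence, sequence[1:]):
    mc.update(a, b)
probs = mc.transition_probs(0)
```

Because each row is renormalized on demand, the estimate always stays a valid probability distribution, while the forgetting factor makes recent transitions dominate, which is the point of a recursive (rather than batch) Markov chain model for mutative driving conditions.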
ISSN: 0018-9545, 1939-9359
DOI: 10.1109/TVT.2022.3195769