Real-Time Energy Management for a Hybrid Electric Vehicle Based on Heuristic Search
Published in | IEEE Transactions on Vehicular Technology, Vol. 71, No. 12, pp. 12635–12647 |
---|---|
Main Authors | |
Format | Journal Article |
Language | English |
Published | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.12.2022 |
Summary | To reduce air pollution and energy costs, hybrid electric vehicles (HEVs) are being deployed at increasingly large scale, and their energy management strategies (EMS) are essential for achieving high fuel efficiency. This paper proposes a reinforcement learning (RL) based real-time EMS for an HEV that requires no prior knowledge of future driving conditions. First, because practical driving conditions are time-varying, an online recursive Markov Chain model is constructed, which depicts the stochastic environment more accurately. Then a novel RL framework, heuristic search, is introduced, meaning that extra knowledge is used during the search process. It aims to find an optimal action for a specific state rather than learning a policy for an entire driving cycle, as traditional RL algorithms do. Consequently, the learning process is significantly more efficient and real-time optimization can be realized. Moreover, a Q-table is learned offline and used online as a heuristic function to direct the search, which accelerates convergence and further improves real-time performance. In simulation, the effectiveness of the recursive Markov Chain model and the superiority of the heuristic function are verified. A comparative simulation of heuristic search, Q-learning, dynamic programming (DP), and the equivalent consumption minimization strategy is then conducted and evaluated. The proposed method consumes only 1.63% more fuel than DP on a completely unknown driving cycle, demonstrating its control performance and real-time capability. |
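The two core ideas in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the state/action discretizations, the forgetting factor, the toy fuel model, and all function names are assumptions made here for the sake of a runnable example.

```python
import numpy as np

# Sketch of (1) an online recursive Markov Chain whose transition counts
# are discounted by a forgetting factor so recent driving conditions
# dominate, and (2) a one-step heuristic search that picks the action
# minimizing immediate fuel cost plus an offline-learned Q-table used as
# a heuristic estimate of future cost.

N_DEMAND = 4          # discretized power-demand states (assumed)
N_ACTIONS = 3         # discretized engine power-split actions (assumed)
FORGETTING = 0.95     # forgetting factor for the recursive update (assumed)

class RecursiveMarkovChain:
    """Re-estimates demand transition probabilities as driving data arrive."""
    def __init__(self, n_states):
        self.counts = np.ones((n_states, n_states))  # Laplace smoothing

    def update(self, s, s_next):
        self.counts[s] *= FORGETTING          # discount old evidence
        self.counts[s, s_next] += 1.0         # add the new transition

    def probs(self, s):
        return self.counts[s] / self.counts[s].sum()

def fuel_cost(demand, action):
    """Toy stand-in for an instantaneous fuel-consumption model."""
    return abs(demand - action) + 0.1 * action

def heuristic_search(mc, q_table, demand):
    """One-step search: immediate fuel cost plus the Q-table heuristic,
    averaged over next demand states predicted by the Markov model."""
    p_next = mc.probs(demand)
    best_a, best_cost = None, float("inf")
    for a in range(N_ACTIONS):
        future = p_next @ q_table[:, a]       # expected heuristic cost-to-go
        cost = fuel_cost(demand, a) + future
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a

# Offline-learned Q-table (placeholder zeros here, i.e. an uninformative
# heuristic, so the search reduces to greedy minimization of fuel cost).
q_offline = np.zeros((N_DEMAND, N_ACTIONS))

mc = RecursiveMarkovChain(N_DEMAND)
for s, s_next in [(0, 1), (1, 2), (2, 2), (2, 1)]:
    mc.update(s, s_next)

action = heuristic_search(mc, q_offline, demand=2)
```

With a populated Q-table the `future` term would steer the search toward actions with low estimated long-run cost, which is the sense in which the offline table "directs" the online search in the paper.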
ISSN: | 0018-9545 1939-9359 |
DOI: | 10.1109/TVT.2022.3195769 |