Real-Time Energy Management for a Hybrid Electric Vehicle Based on Heuristic Search

Bibliographic Details
Published in: IEEE Transactions on Vehicular Technology, Vol. 71, no. 12, pp. 12635-12647
Main Authors: Yang, Ningkang; Han, Lijin; Xiang, Changle; Liu, Hui; Ma, Tian; Ruan, Shumin
Format: Journal Article
Language: English
Published: New York: IEEE, 01.12.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: To address air pollution and reduce energy costs, hybrid electric vehicles (HEVs) are being adopted on an ever-larger scale, and their energy management strategies (EMSs) are essential for achieving high fuel efficiency. This paper proposes a reinforcement learning (RL) based real-time EMS for an HEV that requires no prior knowledge of future driving conditions. First, since practical driving conditions are time-varying, an online recursive Markov chain model is constructed, which depicts the stochastic environment more accurately. Then, a novel RL framework, heuristic search, is introduced, which uses extra knowledge in the search process. It aims to find an optimal action for a specific state rather than learn a policy for an entire driving cycle, as traditional RL algorithms do. Consequently, the learning process is significantly more efficient and real-time optimization can be realized. Moreover, a Q-table is learned offline and used online as a heuristic function to direct the search, which accelerates convergence and further improves real-time performance. In simulation, the effectiveness of the recursive Markov chain model and the superiority of the heuristic function are verified. A comparative simulation of heuristic search, Q-learning, dynamic programming (DP), and the equivalent consumption minimization strategy is then conducted and evaluated. The proposed method consumes only 1.63% more fuel than DP in a completely unknown driving cycle, demonstrating its control performance and real-time capability.
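The online recursive Markov chain mentioned in the summary can be sketched in a few lines. The abstract does not give the paper's exact update rule, so the sketch below is a generic recursive transition-probability estimator with a forgetting factor over discretized power-demand states; the class name, state discretization, and forgetting parameter are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

class RecursiveMarkovChain:
    """Online estimate of a transition-probability matrix for a
    discretized driving-condition signal (e.g. power demand).
    Old observations are discounted by a forgetting factor so the
    model tracks changing driving conditions over time."""

    def __init__(self, n_states, forgetting=0.98):
        self.lam = forgetting                         # weight kept on past evidence
        self.counts = np.ones((n_states, n_states))   # uniform prior avoids zero rows

    def update(self, s_prev, s_curr):
        # Discount the old evidence for this source state,
        # then count the newly observed transition.
        self.counts[s_prev] *= self.lam
        self.counts[s_prev, s_curr] += 1.0

    def transition_probs(self, s):
        # Normalized row = current transition-probability estimate.
        row = self.counts[s]
        return row / row.sum()

# Example: feed a short sequence of discretized power-demand states.
mc = RecursiveMarkovChain(n_states=4)
sequence = [0, 1, 1, 2, 1, 0, 0, 1]
for a, b in zip(sequence, sequence[1:]):
    mc.update(a, b)
probs = mc.transition_probs(0)
```

Because each row is renormalized on demand, the estimate always stays a valid probability distribution, while the forgetting factor makes recent transitions dominate, which is the point of a recursive (rather than batch) Markov chain model for mutative driving conditions.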
ISSN: 0018-9545, 1939-9359
DOI: 10.1109/TVT.2022.3195769