Hierarchical Q-learning network for online simultaneous optimization of energy efficiency and battery life of the battery/ultracapacitor electric vehicle

Bibliographic Details
Published in Journal of energy storage Vol. 46; p. 103925
Main Authors Xu, Bin; Zhou, Quan; Shi, Junzhe; Li, Sixu
Format Journal Article
Language English
Published Elsevier Ltd 01.02.2022
Summary:
• Hierarchical Q-learning is proposed to control battery/ultracapacitor split ratio and engagement.
• Results from hierarchical Q-learning are extensively analyzed.
• Hierarchical Q-learning reduces battery degradation by 8–12%.
• The proposed method is validated in different driving cycles considering measurement noise.

Reinforcement learning has been gaining attention in energy management of hybrid power systems for its low computation cost and strong energy-saving performance. However, the potential of reinforcement learning (RL) has not been fully explored in electric vehicle (EV) applications because most RL studies have focused on a single design target. This paper studies online optimization of the supervisory control system of an EV (powered by a battery and an ultracapacitor) with two design targets: maximizing energy efficiency and maximizing battery life. Based on a widely used reinforcement learning method, Q-learning, a hierarchical learning network is proposed. Within the hierarchical Q-learning network, two independent Q tables, Q1 and Q2, are allocated to two control layers. In addition to the baseline power-split layer, which determines the power split ratio between battery and ultracapacitor based on the knowledge stored in Q1, an upper layer is developed to trigger the engagement of the ultracapacitor based on Q2. In the learning process, Q1 and Q2 are updated during real driving using the measured signals of states, actions, and rewards. The hierarchical Q-learning network is developed and evaluated using a full propulsion system model. With the single-layer Q-learning based method and the rule-based method introduced as two baselines, the performance of the EV under the three control methods (i.e., two baselines and one proposed) is simulated under different driving cycles. The results show that the addition of an ultracapacitor in the electric vehicle reduces the battery capacity loss by 12%. The proposed hierarchical Q-learning network is shown to be superior to the two baseline methods, reducing battery capacity loss by 8%. The vehicle range is slightly extended along with the battery life extension. Moreover, the proposed strategy is validated under different driving cycles and with measurement noise. The proposed hierarchical strategy can be adapted and applied to reinforcement learning based energy management in different hybrid power systems.
ISSN: 2352-152X, 2352-1538
DOI: 10.1016/j.est.2021.103925
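
The two-layer scheme described in the summary can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the summary does not give the state discretization, the action sets, the reward definition, or the learning hyperparameters, so everything named below (the class `HierarchicalQ`, the integer state index, the `split_ratios` grid, a single reward shared by both layers, and the values of `alpha`, `gamma`, and `epsilon`) is an assumption made for illustration. The sketch only shows the structure: an upper table Q2 decides whether the ultracapacitor is engaged, and a lower table Q1 picks the battery/ultracapacitor split ratio when it is.

```python
import numpy as np


class HierarchicalQ:
    """Illustrative two-layer tabular Q-learner (hypothetical, not from the paper)."""

    def __init__(self, n_states, split_ratios, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        # Assumed action grid: fraction of demanded power drawn from the battery.
        self.split_ratios = np.asarray(split_ratios, dtype=float)
        # Q1: lower layer, one column per candidate split ratio.
        self.q1 = np.zeros((n_states, len(self.split_ratios)))
        # Q2: upper layer, column 0 = ultracapacitor bypassed, 1 = engaged.
        self.q2 = np.zeros((n_states, 2))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = np.random.default_rng(seed)

    def _eps_greedy(self, q_row):
        # Explore with probability epsilon, otherwise take the greedy action.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(q_row)))
        return int(np.argmax(q_row))

    def act(self, state):
        # Upper layer first; the lower layer is consulted only when engaged.
        engage = self._eps_greedy(self.q2[state])
        split = self._eps_greedy(self.q1[state]) if engage else None
        return engage, split

    def battery_share(self, engage, split):
        # With the ultracapacitor bypassed, the battery supplies all power.
        return float(self.split_ratios[split]) if engage else 1.0

    def update(self, state, engage, split, reward, next_state):
        # Standard one-step Q-learning update for the upper table.
        td2 = reward + self.gamma * self.q2[next_state].max() - self.q2[state, engage]
        self.q2[state, engage] += self.alpha * td2
        # The lower table learns only from steps where its action was applied.
        if engage:
            td1 = reward + self.gamma * self.q1[next_state].max() - self.q1[state, split]
            self.q1[state, split] += self.alpha * td1


if __name__ == "__main__":
    agent = HierarchicalQ(n_states=3, split_ratios=[0.5, 0.75, 1.0], epsilon=0.0)
    agent.update(state=0, engage=1, split=0, reward=1.0, next_state=1)
    engage, split = agent.act(0)
    print(engage, split, agent.battery_share(engage, split))
```

In an online setting, `act` would be called at each control step with the measured (discretized) state, and `update` would be called once the resulting reward and next state are observed. Feeding the same reward to both tables is a simplification chosen here; a real design could give each layer its own reward signal.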