Energy management strategy for fuel cell vehicles via soft actor-critic-based deep reinforcement learning considering powertrain thermal and durability characteristics

Bibliographic Details
Published in: Energy Conversion and Management, Vol. 283, p. 116921
Main Authors: Zhang, Yuanzhi; Zhang, Caizhi; Fan, Ruijia; Deng, Chenghao; Wan, Song; Chaoui, Hicham
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.05.2023
Summary:
• A learning-based method continuously controls the energy management system.
• Soft actor-critic reduces the hyperparameter tuning effort of energy management.
• On-line health and thermal models of the fuel cell and battery are incorporated into the reward.
• Fuel economy and the health of the fuel cell and battery are improved.
• Temperatures of the fuel cell and battery can be effectively regulated.

Temperature significantly affects the water equilibrium, electrochemical kinetics, and mass transport in a proton exchange membrane fuel cell (PEMFC) stack, and it also affects the lifespan and safety of a lithium-ion battery (LIB). Yet energy management strategies (EMSs) for fuel cell vehicles (FCVs) rarely consider the durability of the LIB and PEMFC stack together with their thermal behavior under real-world driving scenarios. This study therefore proposes a deep reinforcement learning (DRL)-based EMS that minimizes the transient costs of the LIB and PEMFC stack, comprising their state-of-health (SOH) decrements and overtemperature penalties. These transient costs are folded into an overall cost that also includes the hydrogen consumption rate of the PEMFC stack and a penalty for deviating from the target LIB state-of-charge (SOC). The soft actor-critic (SAC) algorithm is adopted for the DRL-based EMS because it is stable across different random environments and requires no meticulous hyperparameter calibration. Specifically, the proposed EMS allocates the direct-current (DC) bus power of the FCV in real time so as to maximize a multi-objective reward, defined as the negative overall cost, as a function of the FCV states. Long-term real-world driving scenarios in Chongqing, China, are then used for off-line training and real-time control to improve the adaptability of the proposed EMS. The results show that, compared with a deep Q-network (DQN)-based EMS that considers powertrain temperature and durability, and a SAC-based EMS that neglects them, the proposed strategy achieves overall powertrain SOH improvements of up to 14.01 % and 3.45 %, respectively, while limiting the maximum temperatures of the PEMFC stack and LIB. In addition, the generalization of the proposed EMS is verified by applying the trained model to another FCV and to other driving cycles, where it achieves similar effectiveness. The proposed strategy thus improves the lifespan durability and thermal stability of the powertrain system.
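As a reading aid, the reward structure described in the summary can be sketched in a few lines. The Python snippet below is a hypothetical reconstruction, not the paper's implementation: the weight coefficients, SOC reference, temperature limits, maximum stack power, and the action-to-power mapping are all illustrative assumptions, since this record does not give the actual cost terms.

    def ems_reward(h2_rate, soc, soh_drop_fc, soh_drop_bat, t_fc, t_bat,
                   soc_ref=0.6, t_fc_max=353.0, t_bat_max=313.0,
                   w_h2=1.0, w_soc=50.0, w_soh=1e4, w_temp=10.0):
        """Negative overall cost: hydrogen consumption rate, SOC-maintenance
        penalty, SOH decrements, and overtemperature penalties (all weights
        and limits here are illustrative assumptions)."""
        cost = w_h2 * h2_rate                         # hydrogen consumption rate
        cost += w_soc * (soc - soc_ref) ** 2          # penalty for drifting from target SOC
        cost += w_soh * (soh_drop_fc + soh_drop_bat)  # per-step SOH decrements
        cost += w_temp * max(0.0, t_fc - t_fc_max)    # PEMFC stack overtemperature penalty
        cost += w_temp * max(0.0, t_bat - t_bat_max)  # LIB overtemperature penalty
        return -cost                                  # reward is the negative overall cost

    def split_bus_power(action, p_demand, p_fc_max=60e3):
        """Map a continuous SAC action in [-1, 1] to the PEMFC share of the
        DC bus power demand; the LIB covers the remainder. The mapping and
        p_fc_max are assumptions for illustration."""
        p_fc = 0.5 * (action + 1.0) * p_fc_max  # rescale action to [0, p_fc_max]
        p_bat = p_demand - p_fc                 # battery supplies (or absorbs) the rest
        return p_fc, p_bat

Because SAC learns a stochastic policy over a continuous action space, a split of this kind can be adjusted continuously at every control step, which is what distinguishes it from the discretized action set of the DQN baseline mentioned in the summary.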
ISSN: 0196-8904, 1879-2227
DOI: 10.1016/j.enconman.2023.116921