Research on temperature control of proton exchange membrane electrolysis cell based on MO‐TD3

Bibliographic Details
Published in: IET Renewable Power Generation, Vol. 18, no. 9-10, pp. 1597-1610
Main Authors: Ma, Libo; Zhao, Hongshan; Pan, Sichao
Format: Journal Article
Language: English
Published: John Wiley & Sons, Inc / Wiley, 01.07.2024
Summary: To solve the problem of temperature control in the proton exchange membrane electrolysis cell (PEMEC), this paper presents a temperature control method based on multi-experience-pool probabilistic replay and the Ornstein-Uhlenbeck noise twin delayed deep deterministic policy gradient (MO-TD3). Firstly, considering the influence of the water supply, the anode and cathode pressure, and natural heat dissipation on temperature, a refined thermal model of the PEMEC is established and transformed into a Markov decision process under the framework of deep reinforcement learning (DRL). Then, to address the training instability and poor control performance of DRL caused by the inertial delay of the PEMEC temperature control system, multi-experience-pool probabilistic replay and Ornstein-Uhlenbeck random process noise are introduced on top of the traditional DRL method. Finally, simulation and hardware-in-the-loop experiment results show that the proposed method outperforms other advanced methods.
1. Building on traditional thermal models, a detailed thermal model of the PEMEC is established that accounts for the temperature rise of the water supply, the variation of anode and cathode pressure, and natural heat dissipation.
2. A PEMEC temperature control framework based on deep reinforcement learning is proposed.
3. A multi-experience-pool probabilistic replay and Ornstein-Uhlenbeck noise twin delayed deep deterministic policy gradient (MO-TD3) method is proposed. Building on the popular TD3 algorithm, MO-TD3 introduces multi-experience-pool probabilistic replay and Ornstein-Uhlenbeck random process noise to mitigate the training instability caused by the inertial delay of the PEMEC temperature control system.
ISSN: 1752-1416
1752-1424
DOI: 10.1049/rpg2.12997
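
As a rough illustration of the two TD3 extensions named in the abstract, the following Python sketch implements an Ornstein-Uhlenbeck exploration-noise process and a multi-experience-pool buffer with probabilistic replay. It is not the authors' implementation: the number of pools, the reward-sign routing rule, and the 70/30 sampling split are assumptions made only for this example.

```python
import random
from collections import deque

import numpy as np


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1.0):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = np.full(dim, mu, dtype=np.float64)

    def reset(self):
        self.x[:] = self.mu

    def sample(self):
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape))
        self.x += dx
        return self.x.copy()


class MultiPoolReplay:
    """Multiple experience pools sampled with fixed probabilities.

    Transitions are routed to a 'high-reward' or 'low-reward' pool by the
    sign of the reward; this routing rule and the sampling probabilities are
    assumptions for illustration, not the paper's exact scheme.
    """

    def __init__(self, capacity=100_000, probs=(0.7, 0.3)):
        self.pools = [deque(maxlen=capacity) for _ in probs]
        self.probs = list(probs)

    def add(self, state, action, reward, next_state, done):
        idx = 0 if reward >= 0 else 1          # assumed routing criterion
        self.pools[idx].append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = []
        for _ in range(batch_size):
            pool = random.choices(self.pools, weights=self.probs, k=1)[0]
            if not pool:                        # fall back to any non-empty pool
                pool = next(p for p in self.pools if p)
            batch.append(random.choice(pool))
        return batch


# Example: perturb a deterministic actor output with OU noise during training.
noise = OUNoise(dim=1, sigma=0.1)
buffer = MultiPoolReplay()
action = 0.5 + noise.sample()[0]               # placeholder actor output plus exploration noise
```

In a TD3-style training loop, the OU noise would perturb the actor's action at every environment step, and the multi-pool buffer would supply the minibatches for the critic and delayed actor updates.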