Stochastic dispatch of energy storage in microgrids: An augmented reinforcement learning approach

Bibliographic Details
Published in: Applied Energy, Vol. 261, p. 114423
Main Authors: Shang, Yuwei; Wu, Wenchuan; Guo, Jianbo; Ma, Zhao; Sheng, Wanxing; Lv, Zhe; Fu, Chenran
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.03.2020
Summary:
• A stochastic model for dispatching the BESS in microgrids is formulated.
• An augmented reinforcement learning method is proposed to incorporate uncertainty.
• Dispatching rules are expressed as soft logic to reduce infeasible explorations.
• Results show the proposed algorithm outperforms the baseline RL algorithms.

The dynamic dispatch (DD) of battery energy storage systems (BESSs) in microgrids integrated with volatile energy resources is essentially a multiperiod stochastic optimization problem (MSOP). Because the life span of a BESS is significantly affected by its charging and discharging behavior, its lifecycle degradation costs should be incorporated into the DD model, which makes the model non-convex. In general, this MSOP is intractable. To solve it, we propose a reinforcement learning (RL) solution augmented with Monte-Carlo tree search (MCTS) and domain knowledge expressed as dispatching rules. In this solution, Q-learning with function approximation is employed as the basic learning architecture, which allows multistep bootstrapping and continuous policy learning. To improve the computational efficiency of randomized multistep simulations, we employ MCTS to estimate the expected maximum action values. Moreover, we embed a few dispatching rules in the RL agent as probabilistic logics to reduce infeasible action explorations, which can improve the quality of the data-driven solution. Numerical test results show that the proposed algorithm outperforms the baseline RL algorithms in all cases tested.
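To make the learning architecture in the summary concrete, the following minimal, self-contained sketch pairs epsilon-greedy Q-learning with a linear function approximator and a rule-based feasibility mask on a toy BESS arbitrage model. All parameter values (SOC limits, efficiency, degradation cost), the price model, and the hard action mask are illustrative assumptions, not the paper's implementation; the MCTS lookahead and the probabilistic (soft-logic) encoding of the rules are omitted for brevity.

    # Illustrative sketch only: shows Q-learning with linear function
    # approximation plus a rule-based action mask on a toy BESS model.
    # All names and parameters here are hypothetical assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    ACTIONS = np.array([-1.0, 0.0, 1.0])   # discharge / idle / charge (p.u. power)
    SOC_MIN, SOC_MAX = 0.1, 0.9            # assumed state-of-charge limits
    ETA = 0.95                             # assumed charging/discharging efficiency
    DEG_COST = 0.05                        # assumed per-unit degradation cost
    GAMMA, ALPHA = 0.95, 1e-3              # discount factor, learning rate

    def features(soc, price, action):
        """Hand-crafted features for the linear Q-function approximator."""
        return np.array([1.0, soc, price, action, soc * action, price * action])

    w = np.zeros(6)  # weights of the linear Q-function

    def q_value(soc, price, action):
        return features(soc, price, action) @ w

    def next_state_of_charge(soc, action):
        return soc + 0.1 * action * (ETA if action > 0 else 1.0 / ETA)

    def feasible(soc, action):
        """Rule-based filter: reject actions that would violate SOC limits.
        (The paper embeds such rules as probabilistic logics; a hard mask
        is used here for brevity.)"""
        return SOC_MIN <= next_state_of_charge(soc, action) <= SOC_MAX

    def step(soc, price, action):
        """Toy transition: arbitrage revenue minus a degradation penalty."""
        reward = -price * action - DEG_COST * abs(action)    # buy when charging, sell when discharging
        next_price = max(0.0, price + rng.normal(0.0, 0.1))  # random-walk price
        return next_state_of_charge(soc, action), next_price, reward

    for episode in range(200):
        soc, price = 0.5, 1.0
        for t in range(24):  # one day, hourly steps
            allowed = [a for a in ACTIONS if feasible(soc, a)]
            if rng.random() < 0.1:  # epsilon-greedy exploration over feasible actions
                action = rng.choice(allowed)
            else:
                action = max(allowed, key=lambda a: q_value(soc, price, a))
            next_soc, next_price, reward = step(soc, price, action)
            next_allowed = [a for a in ACTIONS if feasible(next_soc, a)]
            target = reward + GAMMA * max(q_value(next_soc, next_price, a)
                                          for a in next_allowed)
            td_error = target - q_value(soc, price, action)
            w += ALPHA * td_error * features(soc, price, action)  # semi-gradient update
            soc, price = next_soc, next_price

The mask plays the role the highlights assign to the dispatching rules: infeasible charge/discharge actions are pruned both during exploration and inside the bootstrapped max, which is what reduces wasted exploration of infeasible actions.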
ISSN: 0306-2619, 1872-9118
DOI: 10.1016/j.apenergy.2019.114423