REINFORCEMENT LEARNING PROGRAM, REINFORCEMENT LEARNING METHOD, AND REINFORCEMENT LEARNING DEVICE

Bibliographic Details
Main Authors SHIGEZUMI JUNICHI, IWANE HIDENAO, OCHITANI AKIRA, ITO TOSHIO, YANAMI HITOSHI
Format Patent
Language English, Japanese
Published 11.06.2020
Summary: PROBLEM: To properly control a wind power generation system. SOLUTION: A reinforcement learning device 100 performs reinforcement learning. In the reinforcement learning, the reinforcement learning device 100 observes, for example, the wind speed and the output power from a generator 120 that uses a wind turbine 110. When the observed output power exceeds the rated output power, the reinforcement learning device 100 performs, for example, first learning using a reward corresponding to the difference obtained by subtracting the observed output power from the rated output power. When the observed output power does not exceed the rated output power, the reinforcement learning device 100 performs, for example, second learning using a reward corresponding to the difference obtained by subtracting, from the observed output power, the scheduled output power that a characteristic function specifies for the observed wind speed. SELECTED DRAWING: Figure 1
Bibliography: Application Number: JP20180226913
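
The two-branch reward described in the summary can be illustrated with a minimal Python sketch. The record gives no concrete characteristic function, units, or parameter values, so everything named here is an assumption made for illustration: the cubic power curve in `scheduled_power`, its cut-in, rated, and cut-out wind speeds, the 2000 kW rating, and the function names themselves. Only the branching rule and the two subtractions come from the summary.

```python
from typing import Callable


def scheduled_power(wind_speed: float,
                    rated_power: float = 2000.0,  # kW, assumed value
                    cut_in: float = 3.0,          # m/s, assumed value
                    rated_speed: float = 12.0,    # m/s, assumed value
                    cut_out: float = 25.0) -> float:  # m/s, assumed value
    """Hypothetical characteristic function: a simple cubic power curve.

    The patent record does not specify this function; it stands in for
    whatever curve maps wind speed to scheduled output power.
    """
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0
    if wind_speed >= rated_speed:
        return rated_power
    frac = (wind_speed ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_power * frac


def reward(observed_power: float,
           wind_speed: float,
           rated_power: float,
           characteristic: Callable[[float], float]) -> float:
    """Reward following the two cases in the summary."""
    if observed_power > rated_power:
        # First learning: rated output power minus observed output power.
        # This is negative whenever the rated output is exceeded.
        return rated_power - observed_power
    # Second learning: observed output power minus the scheduled output
    # power that the characteristic function gives for the wind speed.
    return observed_power - characteristic(wind_speed)


if __name__ == "__main__":
    # Above the rating: reward is rated minus observed.
    print(reward(2100.0, 14.0, 2000.0, scheduled_power))
    # At or below the rating: reward is observed minus scheduled.
    print(reward(1500.0, 12.0, 2000.0, scheduled_power))
```

Passing the characteristic function as an argument keeps the reward rule independent of any particular power curve, so a measured or manufacturer-supplied curve could be substituted for the assumed cubic one.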