REINFORCEMENT LEARNING METHODS, REINFORCEMENT LEARNING PROGRAMS, AND REINFORCEMENT LEARNING SYSTEMS

Bibliographic Details
Main Authors	IWANE HIDENAO, OKAWA YOSHIHIRO, SASAKI TOMOTAKE, YANAMI HITOSHI
Format	Patent
Language	English, Japanese
Published	10.09.2020
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; CONTROL OR REGULATING SYSTEMS IN GENERAL; CONTROLLING; COUNTING; DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS; PHYSICS; REGULATING; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
Summary:	PROBLEM TO BE SOLVED: To improve the probability of satisfying a constraint condition. SOLUTION: When determining the control input to a control target 110, an information processor 100 calculates, based on a predicted value of the state of the control target 110 at a future time point, the degree of risk of the current state of the control target 110 with respect to a constraint condition on the state of the control target 110. The information processor 100 then determines the control input to the control target 110 at the current time point from within a range that is determined according to the calculated degree of risk. SELECTED DRAWING: Figure 2
Bibliography:	Application Number: JP20190039032
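
The two steps named in the summary (compute a degree of risk from a predicted future state, then pick the control input from a range that depends on that risk) can be illustrated with a minimal sketch. The summary gives no formulas, so everything concrete here is an assumption made for illustration only: the one-dimensional linear model in `predict_next_state`, the upper-bound constraint `x <= x_max`, the clamped ratio used as the risk measure, and the rule that the search range around a nominal input shrinks linearly as risk grows. None of these names or choices come from the patent.

```python
import random


def predict_next_state(x, u, a=0.9, b=0.5):
    """Hypothetical linear model x' = a*x + b*u (an assumption, not from the patent)."""
    return a * x + b * u


def degree_of_risk(x_pred, x_max):
    """Assumed risk measure in [0, 1] for the constraint x <= x_max.

    0 when the predicted state is at or below zero, 1 when it reaches or
    exceeds the bound, linear in between.
    """
    if x_max <= 0:
        raise ValueError("x_max must be positive in this sketch")
    return min(1.0, max(0.0, x_pred / x_max))


def input_range(u_nominal, risk, max_half_width=1.0):
    """Assumed rule: the range around the nominal input narrows as risk grows."""
    half_width = max_half_width * (1.0 - risk)
    return (u_nominal - half_width, u_nominal + half_width)


def choose_input(x, u_nominal, x_max, rng):
    """Predict the future state, score the risk, then sample an input from the range."""
    x_pred = predict_next_state(x, u_nominal)
    risk = degree_of_risk(x_pred, x_max)
    low, high = input_range(u_nominal, risk)
    return rng.uniform(low, high), risk, (low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (0.0, 5.0, 20.0):
        u, risk, bounds = choose_input(x, u_nominal=1.0, x_max=10.0, rng=rng)
        print(f"x={x:5.1f}  risk={risk:.2f}  range={bounds}  u={u:.3f}")
```

In this sketch a state far from the bound leaves the full range open for exploration, while a state whose prediction reaches the bound collapses the range to the nominal input, which is one simple way a risk-dependent range could raise the chance of staying within the constraint.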