Estimating Objective Weights of Pareto-Optimal Policies for Multi-Objective Sequential Decision-Making
Published in | Journal of advanced computational intelligence and intelligent informatics Vol. 28; no. 2; pp. 393 - 402 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Tokyo: Fuji Technology Press Co. Ltd, 01.03.2024 |
Summary: | Sequential decision-making under multiple objective functions involves two problems: exhaustively searching for Pareto-optimal policies, and selecting a policy from the resulting set of Pareto-optimal policies based on the decision-maker’s preferences. This paper focuses on the latter problem. Selecting a policy that reflects the decision-maker’s preferences requires ordering these policies, which is problematic because those preferences are generally tacit knowledge. Furthermore, the policies are difficult to order quantitatively. For this reason, conventional methods have mainly elicited preferences through dialogue with decision-makers and through one-to-one comparisons. In contrast, this paper proposes a method based on inverse reinforcement learning that estimates the weight of each objective from the decision-making sequence. The estimated weights can be used to quantitatively evaluate the Pareto-optimal policies from the viewpoint of the decision-maker’s preferences. We applied the proposed method to a multi-objective reinforcement learning benchmark problem and verified its effectiveness as a method for eliciting the weight of each objective function. |
---|---|
ISSN: | 1343-0130 1883-8014 |
DOI: | 10.20965/jaciii.2024.p0393 |
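
The summary states that the estimated objective weights can be used to evaluate the Pareto-optimal policies quantitatively. The record does not say how the weights are combined with the objectives, so the following is only a minimal sketch that assumes a linear weighted sum of each policy's expected return per objective. The function name `rank_policies` and all numeric values are hypothetical illustrations, not taken from the paper, and the inverse reinforcement learning step that would produce the weights is not shown.

```python
def rank_policies(returns, weights):
    """Rank policies by a weighted sum of their per-objective expected returns.

    Assumption (not confirmed by the source): preferences are modeled as a
    linear scalarization with non-negative weights.
    """
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Normalize so the weights sum to 1; this does not change the ranking.
    total = sum(weights)
    norm = [w / total for w in weights]

    scores = []
    for r in returns:
        if len(r) != len(norm):
            raise ValueError("each return vector needs one entry per objective")
        scores.append(sum(w * x for w, x in zip(norm, r)))

    # Policy indices ordered from highest to lowest scalarized score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical two-objective return vectors for three mutually
# non-dominated policies (e.g. reward gained vs. negative time cost).
pareto_returns = [[1.0, -1.0], [8.0, -5.0], [20.0, -12.0]]

# Hypothetical weights standing in for the output of the estimation step.
order_a, scores_a = rank_policies(pareto_returns, [0.7, 0.3])
order_b, scores_b = rank_policies(pareto_returns, [0.3, 0.7])
```

Under these made-up numbers the two weight vectors produce opposite orderings of the same policy set, which is why an estimate of the weights is needed before a single policy can be selected.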