Multi-objective fuzzy Q-learning to solve continuous state-action problems
Published in: Neurocomputing (Amsterdam), Vol. 516, pp. 115-132
Format: Journal Article
Language: English
Publisher: Elsevier B.V.
Published: 07.01.2023
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2022.10.035
Summary: Many real-world problems are multi-objective, so the need for multi-objective learning and optimization algorithms is inevitable. While multi-objective optimization algorithms are well studied, multi-objective learning algorithms have attracted less attention. In this paper, a fuzzy multi-objective reinforcement learning algorithm is proposed, referred to as the multi-objective fuzzy Q-learning (MOFQL) algorithm, and it is applied to solve a bi-objective reach-avoid game. Most multi-objective reinforcement learning algorithms proposed so far address problems in discrete state-action domains, whereas MOFQL can also handle problems in a continuous state-action domain. A fuzzy inference system (FIS) is implemented to estimate the value function of the bi-objective problem, and a temporal difference (TD) approach is used to update the fuzzy rules. The proposed method is a multi-policy multi-objective algorithm and can find the non-convex regions of the Pareto front.
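To make the summary's description concrete, the sketch below illustrates the general idea of fuzzy Q-learning with vector-valued (multi-objective) rewards: triangular fuzzy rules partition a continuous state space, each rule stores a Q-vector per discrete action, and the TD error is distributed over the rules in proportion to their firing strengths. This is a minimal illustration, not the paper's MOFQL algorithm; the toy environment, the membership functions, the linear scalarization used for action selection, and all parameter values are assumptions introduced here for clarity, and the paper's multi-policy mechanism for recovering non-convex Pareto points is not reproduced.

```python
# Minimal sketch (assumed, illustrative): fuzzy Q-learning with a vector-valued
# reward. Not the paper's MOFQL implementation.
import numpy as np

N_RULES = 7          # fuzzy rules partitioning a 1-D continuous state in [0, 1]
N_ACTIONS = 3        # small discrete action set, for simplicity
N_OBJECTIVES = 2     # bi-objective reward vector
ALPHA, GAMMA = 0.1, 0.95

# Triangular membership functions with evenly spaced centers.
centers = np.linspace(0.0, 1.0, N_RULES)
width = centers[1] - centers[0]

def firing_strengths(state):
    """Normalized triangular membership degree of each rule for `state`."""
    mu = np.maximum(0.0, 1.0 - np.abs(state - centers) / width)
    return mu / mu.sum()

# Rule consequents: one Q-vector per (rule, action) pair.
q = np.zeros((N_RULES, N_ACTIONS, N_OBJECTIVES))

def q_values(state):
    """FIS output: firing-strength-weighted combination of rule consequents."""
    phi = firing_strengths(state)                # shape (rules,)
    return np.einsum('r,rao->ao', phi, q), phi   # shape (actions, objectives)

def select_action(state, weights, eps=0.1):
    """Epsilon-greedy over a linearly scalarized Q (an assumption here; the
    paper uses a multi-policy scheme rather than a fixed scalarization)."""
    qv, _ = q_values(state)
    if np.random.rand() < eps:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax(qv @ weights))

def td_update(state, action, reward_vec, next_state, weights):
    """Spread the vector TD error over the rules by their firing strengths."""
    qv, phi = q_values(state)
    next_qv, _ = q_values(next_state)
    best_next = int(np.argmax(next_qv @ weights))
    td_error = reward_vec + GAMMA * next_qv[best_next] - qv[action]
    q[:, action, :] += ALPHA * np.outer(phi, td_error)

# Toy usage on a 1-D random walk with two conflicting goals (illustrative only).
state, weights = 0.5, np.array([0.7, 0.3])
for _ in range(1000):
    a = select_action(state, weights)
    next_state = float(np.clip(state + 0.1 * (a - 1) + 0.02 * np.random.randn(), 0.0, 1.0))
    reward = np.array([-abs(next_state - 1.0), -abs(next_state - 0.0)])
    td_update(state, a, reward, next_state, weights)
    state = next_state
```

The update that spreads the TD error across rules in proportion to their firing strengths follows the usual fuzzy Q-learning pattern; how MOFQL actually defines the rule consequents, handles the two objectives, and maintains the Pareto front is detailed only in the full text.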