Multi-objective fuzzy Q-learning to solve continuous state-action problems

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 516, pp. 115-132
Main Authors: Asgharnia, Amirhossein; Schwartz, Howard; Atia, Mohamed
Format: Journal Article
Language: English
Published: Elsevier B.V., 07.01.2023
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2022.10.035

More Information
Summary: Many real-world problems are multi-objective, so the need for multi-objective learning and optimization algorithms is inevitable. Although multi-objective optimization algorithms are well studied, multi-objective learning algorithms have attracted less attention. In this paper, a fuzzy multi-objective reinforcement learning algorithm is proposed, which we refer to as the multi-objective fuzzy Q-learning (MOFQL) algorithm. The algorithm is implemented to solve a bi-objective reach-avoid game. Most of the multi-objective reinforcement learning algorithms proposed to date address problems in the discrete state-action domain; the MOFQL algorithm can also handle problems in a continuous state-action domain. A fuzzy inference system (FIS) is implemented to estimate the value function for the bi-objective problem, and a temporal difference (TD) approach is used to update the fuzzy rules. The proposed method is a multi-policy multi-objective algorithm and can find the non-convex regions of the Pareto front.
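The summary describes a fuzzy Q-learning scheme in which a rule base approximates a vector-valued Q-function and the rules are adjusted by a temporal-difference error. The sketch below is only an illustration of that general idea, not the paper's exact formulation: the firing-strength stub, the scalarization weights `w`, and all variable names are assumptions introduced here for clarity.

```python
import numpy as np

# Illustrative sketch: TD update for fuzzy Q-learning with bi-objective
# (vector-valued) q-values attached to each (rule, action) pair.
n_rules, n_actions, n_objectives = 9, 5, 2
alpha, gamma = 0.1, 0.95

# One q-vector per (rule, action): q[i, a] is in R^n_objectives.
q = np.zeros((n_rules, n_actions, n_objectives))

def firing_strengths(state):
    """Normalized rule activations from the fuzzy inference system (stub)."""
    phi = np.random.rand(n_rules)   # placeholder for membership products
    return phi / phi.sum()

def greedy_action(phi, w):
    """Action maximizing the scalarized, rule-weighted Q-value."""
    scalar_q = (phi[:, None, None] * q).sum(axis=0) @ w   # shape (n_actions,)
    return int(np.argmax(scalar_q))

def td_update(phi, a, reward_vec, phi_next, w):
    """One temporal-difference step, distributing the error by firing strength."""
    q_sa = (phi[:, None] * q[:, a, :]).sum(axis=0)              # current Q-vector
    a_next = greedy_action(phi_next, w)
    q_next = (phi_next[:, None] * q[:, a_next, :]).sum(axis=0)  # bootstrap target
    delta = reward_vec + gamma * q_next - q_sa                  # vector TD error
    q[:, a, :] += alpha * phi[:, None] * delta                  # rule-wise update
```

Because the q-values are kept as vectors and only scalarized at action-selection time, sweeping the weight vector w (or maintaining several weight vectors in parallel) is one plausible way such a multi-policy method could trace out different points of the Pareto front.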