Estimating Objective Weights of Pareto-Optimal Policies for Multi-Objective Sequential Decision-Making
Published in | Journal of advanced computational intelligence and intelligent informatics Vol. 28; no. 2; pp. 393 - 402 |
---|---|
Main Authors | Ikenaga, Akiko; Arai, Sachiyo |
Format | Journal Article |
Language | English |
Published | Tokyo: Fuji Technology Press Co. Ltd, 01.03.2024 |
ISSN | 1343-0130; EISSN 1883-8014 |
DOI | 10.20965/jaciii.2024.p0393 |
Abstract | Sequential decision-making under multiple objective functions involves two problems: exhaustively searching for Pareto-optimal policies, and selecting a policy from the resulting set of Pareto-optimal policies based on the decision maker’s preferences. This paper focuses on the latter problem. To select a policy that reflects the decision maker’s preferences, it is necessary to order these policies, which is problematic because the decision maker’s preferences are generally tacit knowledge. Furthermore, it is difficult to order the policies quantitatively. For this reason, conventional methods have mainly elicited preferences through dialogue with decision makers and through one-to-one comparisons. In contrast, this paper proposes a method based on inverse reinforcement learning that estimates the weight of each objective from the decision-making sequence. The estimated weights can be used to quantitatively evaluate the Pareto-optimal policies from the viewpoint of the decision maker’s preferences. We applied the proposed method to a multi-objective reinforcement learning benchmark problem and verified its effectiveness as a method for eliciting the weight of each objective function. |
---|---|
Author | Ikenaga, Akiko Arai, Sachiyo |
Author_xml | 1. Ikenaga, Akiko (Graduate School of Science and Engineering, Chiba University, 1-33 Yayoi-cho, Inage-ku, Chiba 263-8522, Japan); 2. Arai, Sachiyo (ORCID 0000-0002-8899-645X; Graduate School of Science and Engineering, Chiba University, 1-33 Yayoi-cho, Inage-ku, Chiba 263-8522, Japan) |
Copyright | Copyright © 2024 Fuji Technology Press Ltd. |
Discipline | Computer Science |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
PageCount | 10 |
SubjectTerms | Decision making; Multiple objective analysis; Pareto optimum; Policies |
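
The abstract notes that the estimated weights can be used to quantitatively evaluate the Pareto-optimal policies. The sketch below illustrates one way such an evaluation could look. It assumes linear scalarization (a weighted sum of each policy's per-objective returns), which this record does not confirm as the paper's method. The function names `scalarized_scores` and `rank_policies`, the return vectors, and the weights are all hypothetical values chosen for illustration, not results from the paper.

```python
def scalarized_scores(returns, weights):
    """Weighted-sum score for each policy's return vector.

    Assumption: preferences are modeled by linear scalarization.
    Weights are normalized to sum to 1 before scoring.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    scores = []
    for vec in returns:
        if len(vec) != len(norm):
            raise ValueError("each return vector needs one entry per objective")
        scores.append(sum(w * r for w, r in zip(norm, vec)))
    return scores


def rank_policies(returns, weights):
    """Indices of the policies, ordered from best to worst score."""
    scores = scalarized_scores(returns, weights)
    return sorted(range(len(returns)), key=lambda i: scores[i], reverse=True)


# Hypothetical two-objective returns for three Pareto-optimal policies,
# e.g. (reward collected, negative time cost). Illustrative numbers only.
pareto_returns = [(1.0, -1.0), (8.0, -5.0), (20.0, -12.0)]

# Hypothetical weights, standing in for weights estimated from a
# decision-making sequence.
time_averse = (0.3, 0.7)
reward_seeking = (0.9, 0.1)

print(rank_policies(pareto_returns, time_averse))
print(rank_policies(pareto_returns, reward_seeking))
```

Under this assumption, different weight vectors induce different orderings over the same Pareto set, which is why an estimate of the weights is enough to pick a policy without pairwise comparisons.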