Deep Reinforcement Learning for Multiobjective Optimization

Bibliographic Details
Published in IEEE Transactions on Cybernetics, Vol. 51, No. 6, pp. 3103-3114
Main Authors Li, Kaiwen; Zhang, Tao; Wang, Rui
Format Journal Article
Language English
Published United States: IEEE, 01.06.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online Access Get full text

Abstract This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), which we call the DRL-based multiobjective optimization algorithm (DRL-MOA). The idea of decomposition is adopted to decompose the MOP into a set of scalar optimization subproblems. Then, each subproblem is modeled as a neural network. Model parameters of all the subproblems are optimized collaboratively according to a neighborhood-based parameter-transfer strategy and the DRL training algorithm. Pareto-optimal solutions can be directly obtained through the trained neural-network models. Specifically, the multiobjective traveling salesman problem (MOTSP) is solved in this article using the DRL-MOA method by modeling each subproblem as a Pointer Network. Extensive experiments have been conducted to study the DRL-MOA, comparing it with various benchmark methods. It is found that once the trained model is available, it can scale to newly encountered problems without retraining. The solutions can be directly obtained by a simple forward calculation of the neural network; thus, no iteration is required, and the MOP can always be solved within a reasonable time. The proposed method provides a new way of solving the MOP by means of DRL. It exhibits new characteristics, such as strong generalization ability and fast solving speed, in comparison with existing multiobjective optimization methods. The experimental results show the effectiveness and competitiveness of the proposed method in terms of model performance and running time.
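
The decomposition step described in the abstract can be made concrete before any learning enters the picture. Below is a minimal sketch, assuming the common bi-objective Euclidean MOTSP benchmark in which each city carries two independent 2-D coordinate sets, one per objective; the weighted-sum scalarization follows the paper, but the helper names, weight count, and instance size are illustrative.

```python
import numpy as np

def tour_length(tour, coords):
    """Euclidean length of the closed tour over one coordinate set."""
    c = coords[tour]
    return float(np.linalg.norm(c - np.roll(c, -1, axis=0), axis=1).sum())

def scalarized_cost(tour, coords1, coords2, weight):
    """Weighted-sum subproblem cost g(tour | w) = w1*f1(tour) + w2*f2(tour)."""
    return (weight[0] * tour_length(tour, coords1)
            + weight[1] * tour_length(tour, coords2))

# Decompose the bi-objective MOTSP into five scalar subproblems via
# evenly spread weight vectors -- one subproblem per (future) model.
weights = [(i / 4, 1 - i / 4) for i in range(5)]

rng = np.random.default_rng(0)
coords1, coords2 = rng.random((20, 2)), rng.random((20, 2))  # 20-city instance
tour = list(rng.permutation(20))
for w in weights:
    print(w, scalarized_cost(tour, coords1, coords2, w))
```

Each weight vector defines one scalar subproblem; in DRL-MOA, each such subproblem is then assigned its own neural-network model.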
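The neighborhood-based parameter-transfer strategy can be sketched as follows: the first subproblem's policy is trained from scratch, and each subsequent model is warm-started from its trained neighbor, so only the first model needs a long training run. In this sketch, `SimplePolicy` is a toy stand-in for the paper's Pointer Network, and plain REINFORCE stands in for its actor-critic trainer; both are assumptions for illustration, not the authors' implementation.

```python
import copy
import torch
import torch.nn as nn

class SimplePolicy(nn.Module):
    """Toy stand-in for the Pointer Network: scores cities from their
    two 2-D coordinate sets (4 input features per city)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, feats):               # feats: (n_cities, 4)
        return self.net(feats).squeeze(-1)  # unnormalized per-city scores

def rollout(policy, feats):
    """Sample a tour city by city; return the tour and its log-probability."""
    n = feats.shape[0]
    visited = torch.zeros(n, dtype=torch.bool)
    logp, tour = torch.zeros(()), []
    for _ in range(n):
        scores = policy(feats).masked_fill(visited, float("-inf"))
        dist = torch.distributions.Categorical(logits=scores)
        city = dist.sample()
        logp = logp + dist.log_prob(city)
        visited[city] = True
        tour.append(int(city))
    return tour, logp

def weighted_cost(tour, feats, w):
    """Weighted sum of the two Euclidean tour lengths (one per objective)."""
    idx = torch.tensor(tour + tour[:1])     # close the loop
    seg = feats[idx[1:]] - feats[idx[:-1]]
    return (w[0] * seg[:, :2].norm(dim=1).sum()
            + w[1] * seg[:, 2:].norm(dim=1).sum())

def train_subproblem(policy, w, steps):
    """Plain REINFORCE on the scalarized cost of random 20-city instances."""
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(steps):
        feats = torch.rand(20, 4)           # a fresh random MOTSP instance
        tour, logp = rollout(policy, feats)
        cost = weighted_cost(tour, feats, w).detach()
        opt.zero_grad()
        (logp * cost).backward()            # policy-gradient step, no baseline
        opt.step()
    return policy

# Solve the subproblems in weight order, warm-starting each model from
# its trained neighbor; only the first model trains from scratch.
weights = [(i / 4, 1 - i / 4) for i in range(5)]
models, prev = [], None
for w in weights:
    model = SimplePolicy() if prev is None else copy.deepcopy(prev)
    prev = train_subproblem(model, w, steps=200 if prev is None else 50)
    models.append(prev)
```

At inference, an approximate Pareto front is read off with one rollout per trained model on the new instance; this single forward pass per subproblem is what underlies the abstract's claim that no iterative search is required.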
Author Li, Kaiwen
Zhang, Tao
Wang, Rui
Author_xml – sequence: 1
  givenname: Kaiwen
  orcidid: 0000-0003-1550-5987
  surname: Li
  fullname: Li, Kaiwen
  email: kaiwenli_nudt@foxmail.com
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
– sequence: 2
  givenname: Tao
  surname: Zhang
  fullname: Zhang, Tao
  email: zhangtao@nudt.edu.cn
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
– sequence: 3
  givenname: Rui
  orcidid: 0000-0001-9048-2979
  surname: Wang
  fullname: Wang, Rui
  email: ruiwangnudt@gmail.com
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
BackLink https://www.ncbi.nlm.nih.gov/pubmed/32191907 (View this record in MEDLINE/PubMed)
CODEN ITCEB8
CitedBy_id crossref_primary_10_1109_TCDS_2023_3307722
crossref_primary_10_1016_j_rico_2022_100195
crossref_primary_10_1016_j_cie_2023_109605
crossref_primary_10_1109_JIOT_2021_3078462
crossref_primary_10_1109_MCI_2023_3277768
crossref_primary_10_1109_JIOT_2024_3440017
crossref_primary_10_3390_math12193025
crossref_primary_10_1016_j_asr_2024_06_003
crossref_primary_10_1016_j_asoc_2024_112575
crossref_primary_10_1016_j_apenergy_2023_121332
crossref_primary_10_1016_j_aei_2023_102197
crossref_primary_10_1109_TNNLS_2021_3105937
crossref_primary_10_1109_JAS_2024_124548
crossref_primary_10_1109_TNNLS_2024_3371706
crossref_primary_10_1145_3470971
crossref_primary_10_1016_j_knosys_2023_110801
crossref_primary_10_1016_j_swevo_2025_101892
crossref_primary_10_1016_j_matdes_2023_112556
crossref_primary_10_1109_JSAC_2024_3365902
crossref_primary_10_1109_TEVC_2022_3199045
crossref_primary_10_4018_IJCINI_361012
crossref_primary_10_1007_s11590_023_02009_5
crossref_primary_10_1016_j_asoc_2024_111751
crossref_primary_10_1109_OJPEL_2020_3012777
crossref_primary_10_3390_app13179667
crossref_primary_10_3390_jlpea12040053
crossref_primary_10_1016_j_eswa_2022_118658
crossref_primary_10_1109_TNNLS_2022_3148435
crossref_primary_10_1016_j_ins_2023_119253
crossref_primary_10_1109_TCYB_2022_3150802
crossref_primary_10_1016_j_compenvurbsys_2023_101966
crossref_primary_10_1016_j_swevo_2022_101225
crossref_primary_10_1007_s10489_023_05013_5
crossref_primary_10_1109_TCYB_2021_3107202
crossref_primary_10_1109_TETCI_2023_3293696
crossref_primary_10_3390_drones7010010
crossref_primary_10_1109_TETCI_2021_3115666
crossref_primary_10_3390_electronics11162513
crossref_primary_10_1007_s12293_022_00366_9
crossref_primary_10_1016_j_isatra_2023_04_004
crossref_primary_10_1109_TITS_2024_3448627
crossref_primary_10_1109_TNSM_2021_3086721
crossref_primary_10_1016_j_jmsy_2024_11_013
crossref_primary_10_1109_TCYB_2023_3283771
crossref_primary_10_1109_MCI_2023_3245719
crossref_primary_10_3390_electronics12194167
crossref_primary_10_1016_j_conbuildmat_2024_135206
crossref_primary_10_3390_s25061707
crossref_primary_10_1109_TEVC_2023_3250350
crossref_primary_10_1016_j_apenergy_2023_122287
crossref_primary_10_1016_j_asoc_2023_110330
crossref_primary_10_1109_ACCESS_2022_3233474
crossref_primary_10_1016_j_neunet_2024_106359
crossref_primary_10_3390_jmse12101765
crossref_primary_10_1109_TCYB_2021_3103811
crossref_primary_10_1109_TWC_2023_3240425
crossref_primary_10_1016_j_ejor_2023_11_038
crossref_primary_10_1016_j_swevo_2023_101253
crossref_primary_10_1016_j_swevo_2024_101694
crossref_primary_10_1016_j_eswa_2024_123592
crossref_primary_10_1109_LRA_2025_3534070
crossref_primary_10_3390_electronics13193842
crossref_primary_10_1007_s40747_021_00514_7
crossref_primary_10_1016_j_energy_2024_133412
crossref_primary_10_1016_j_ejor_2025_01_012
crossref_primary_10_1109_TMC_2024_3357796
crossref_primary_10_2166_hydro_2024_191
crossref_primary_10_3390_app15052679
crossref_primary_10_1002_aic_18012
crossref_primary_10_1109_TIV_2023_3236104
crossref_primary_10_1016_j_engappai_2023_107564
crossref_primary_10_1080_03088839_2024_2326635
crossref_primary_10_1016_j_buildenv_2025_112864
crossref_primary_10_1016_j_compeleceng_2024_109603
crossref_primary_10_3390_biomimetics9120718
crossref_primary_10_1016_j_jii_2024_100727
crossref_primary_10_1109_TCYB_2023_3312476
crossref_primary_10_1109_JAS_2023_123609
crossref_primary_10_1016_j_swevo_2024_101605
crossref_primary_10_1109_ACCESS_2024_3505436
crossref_primary_10_1016_j_swevo_2024_101606
crossref_primary_10_1155_2021_9032206
crossref_primary_10_3390_sym16081030
crossref_primary_10_1016_j_swevo_2025_101866
crossref_primary_10_1016_j_ins_2023_119472
crossref_primary_10_1016_j_swevo_2023_101387
crossref_primary_10_1364_JOCN_460629
crossref_primary_10_1109_JAS_2023_123687
crossref_primary_10_1109_JAS_2023_124113
crossref_primary_10_1109_TITS_2023_3334976
crossref_primary_10_1155_2021_6694695
crossref_primary_10_1016_j_jmsy_2024_04_003
crossref_primary_10_1109_TSTE_2023_3341632
crossref_primary_10_1007_s12204_023_2679_7
crossref_primary_10_1364_JOCN_483733
crossref_primary_10_26599_TST_2023_9010076
crossref_primary_10_1016_j_jmsy_2024_04_007
crossref_primary_10_1109_ACCESS_2021_3060323
crossref_primary_10_1162_dint_a_00246
crossref_primary_10_3390_rs14061304
crossref_primary_10_1016_j_engappai_2023_107381
crossref_primary_10_1016_j_ins_2023_04_003
crossref_primary_10_1109_TAI_2024_3409520
crossref_primary_10_1109_TII_2024_3495788
crossref_primary_10_3390_math12142283
crossref_primary_10_1109_TETCI_2022_3146882
crossref_primary_10_1109_TEVC_2022_3179256
crossref_primary_10_1016_j_swevo_2023_101398
crossref_primary_10_1007_s40747_022_00799_2
crossref_primary_10_1109_TCYB_2021_3121542
crossref_primary_10_1109_TSTE_2024_3393764
crossref_primary_10_1016_j_cie_2023_109115
crossref_primary_10_3390_app131910689
crossref_primary_10_1109_TETCI_2022_3145706
crossref_primary_10_1109_TCYB_2022_3226744
crossref_primary_10_1109_JAS_2024_124341
crossref_primary_10_1007_s40747_021_00635_z
crossref_primary_10_1007_s40747_024_01469_1
crossref_primary_10_3390_rs15163932
crossref_primary_10_1016_j_autcon_2024_105598
crossref_primary_10_1016_j_oceaneng_2024_118398
crossref_primary_10_3233_JCM_226740
crossref_primary_10_1016_j_swevo_2024_101788
crossref_primary_10_1109_TCYB_2022_3164285
crossref_primary_10_1016_j_eswa_2024_124296
crossref_primary_10_1016_j_disopt_2025_100879
crossref_primary_10_1109_ACCESS_2022_3181164
crossref_primary_10_1109_TCYB_2021_3089179
crossref_primary_10_1109_JAS_2022_105677
crossref_primary_10_1371_journal_pcbi_1012229
crossref_primary_10_1109_TITS_2024_3438788
crossref_primary_10_1016_j_jhydrol_2024_130904
crossref_primary_10_1109_JLT_2022_3176473
crossref_primary_10_1109_TNSM_2024_3427403
crossref_primary_10_1109_TCYB_2022_3163816
crossref_primary_10_1109_TITS_2024_3515997
crossref_primary_10_1016_j_aei_2023_101965
crossref_primary_10_1016_j_energy_2024_130999
crossref_primary_10_1109_TNSM_2021_3127685
crossref_primary_10_3390_a17120579
crossref_primary_10_1145_3715700
crossref_primary_10_1007_s40747_023_01308_9
crossref_primary_10_1007_s11227_024_06439_5
crossref_primary_10_3233_IDA_230480
crossref_primary_10_1007_s43674_023_00055_1
crossref_primary_10_1109_TITS_2023_3313688
crossref_primary_10_3390_f15122181
crossref_primary_10_1109_TCYB_2023_3234077
crossref_primary_10_1109_JIOT_2024_3406531
crossref_primary_10_1109_JIOT_2021_3133278
crossref_primary_10_1016_j_ins_2024_121081
crossref_primary_10_3390_math11020437
crossref_primary_10_1088_1742_6596_2450_1_012081
crossref_primary_10_1145_3501803
crossref_primary_10_1016_j_eij_2023_100388
crossref_primary_10_1109_TNSM_2024_3446248
crossref_primary_10_3390_ai5040085
crossref_primary_10_1016_j_engappai_2025_110337
crossref_primary_10_1021_acs_jcim_2c00671
crossref_primary_10_1109_TNSM_2024_3387987
Cites_doi 10.1287/ijoc.3.4.376
10.1287/opre.21.2.498
10.1109/TEVC.2014.2350995
10.3115/v1/D14-1179
10.1109/MCI.2017.2742868
10.1109/4235.585893
10.1109/TEVC.2016.2600642
10.1007/978-3-319-93031-2_12
10.1007/978-3-540-88051-6_14
10.1007/978-3-030-12598-1_14
10.1109/TSMCB.2012.2231860
10.1109/TEVC.2007.892759
10.1109/TCYB.2013.2295886
10.1109/TEVC.2014.2373386
10.1007/3-540-44719-9_6
10.1109/TEVC.2017.2671462
10.1109/5326.704576
10.1007/978-3-642-17144-4_6
10.1016/S0377-2217(01)00104-7
10.1109/TEVC.2013.2281535
10.1109/CEC.2016.7743866
10.1109/TEVC.2002.802873
10.1109/TEVC.2016.2611642
10.1007/978-3-642-11218-8_6
10.1109/4235.996017
10.1109/TEVC.2016.2521175
10.1109/TCYB.2018.2849403
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DBID 97E
RIA
RIE
AAYXX
CITATION
NPM
7SC
7SP
7TB
8FD
F28
FR3
H8D
JQ2
L7M
L~C
L~D
7X8
DOI 10.1109/TCYB.2020.2977661
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
PubMed
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Mechanical & Transportation Engineering Abstracts
Technology Research Database
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
Aerospace Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
DatabaseTitle CrossRef
PubMed
Aerospace Database
Technology Research Database
Computer and Information Systems Abstracts – Academic
Mechanical & Transportation Engineering Abstracts
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Engineering Research Database
Advanced Technologies Database with Aerospace
ANTE: Abstracts in New Technology & Engineering
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
DatabaseTitleList MEDLINE - Academic
Aerospace Database
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: RIE
  name: IEEE Xplore
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Sciences (General)
EISSN 2168-2275
EndPage 3114
ExternalDocumentID 32191907
10_1109_TCYB_2020_2977661
9040280
Genre orig-research
Journal Article
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61773390; 71571187
  funderid: 10.13039/501100001809
GroupedDBID 0R~
4.4
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACIWK
AENEX
AGQYO
AGSQL
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
HZ~
IFIPE
IPLJI
JAVBF
M43
O9-
OCL
PQQKQ
RIA
RIE
RNS
AAYXX
CITATION
RIG
NPM
7SC
7SP
7TB
8FD
F28
FR3
H8D
JQ2
L7M
L~C
L~D
7X8
ID FETCH-LOGICAL-c349t-bdf754122d742d03cee755bcd8d20db5e0dda4c6d00b94afd6ea6f374e0625533
IEDL.DBID RIE
ISSN 2168-2267
2168-2275
IngestDate Thu Jul 10 19:15:11 EDT 2025
Mon Jun 30 04:41:26 EDT 2025
Thu Apr 03 07:00:18 EDT 2025
Tue Jul 01 00:53:55 EDT 2025
Thu Apr 24 22:52:01 EDT 2025
Wed Aug 27 02:30:24 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c349t-bdf754122d742d03cee755bcd8d20db5e0dda4c6d00b94afd6ea6f374e0625533
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ORCID 0000-0003-1550-5987
0000-0001-9048-2979
PMID 32191907
PQID 2528944230
PQPubID 85422
PageCount 12
ParticipantIDs ieee_primary_9040280
crossref_citationtrail_10_1109_TCYB_2020_2977661
proquest_miscellaneous_2381624843
proquest_journals_2528944230
crossref_primary_10_1109_TCYB_2020_2977661
pubmed_primary_32191907
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2021-06-01
PublicationDateYYYYMMDD 2021-06-01
PublicationDate_xml – month: 06
  year: 2021
  text: 2021-06-01
  day: 01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: Piscataway
PublicationTitle IEEE transactions on cybernetics
PublicationTitleAbbrev TCYB
PublicationTitleAlternate IEEE Trans Cybern
PublicationYear 2021
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref13
nazari (ref17) 2018
ref34
ref12
ref37
ref15
ref36
ref14
ref31
ref32
konda (ref35) 2000
ref10
bahdanau (ref19) 2014
ref2
ref1
ref16
mnih (ref25) 2016
miettinen (ref30) 2012; 12
mossalam (ref24) 2016
johnson (ref7) 1990
ref26
ref41
bello (ref20) 2016
ref22
glorot (ref39) 2010
cai (ref11) 2015; 19
vinyals (ref18) 2015
chen (ref28) 2017; 21
ref27
kool (ref21) 2018
kingma (ref38) 2014
ref29
hsu (ref23) 2018
ref8
ref9
ref4
ref3
ref6
sutskever (ref33) 2014
ref5
ref40
References_xml – ident: ref37
  doi: 10.1287/ijoc.3.4.376
– ident: ref6
  doi: 10.1287/opre.21.2.498
– year: 2018
  ident: ref21
  publication-title: Attention, learn to solve routing problems!
– volume: 12
  year: 2012
  ident: ref30
  publication-title: Nonlinear Multiobjective Optimization
– volume: 19
  start-page: 508
  year: 2015
  ident: ref11
  article-title: An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization
  publication-title: IEEE Trans Evol Comput
  doi: 10.1109/TEVC.2014.2350995
– start-page: 3104
  year: 2014
  ident: ref33
  article-title: Sequence to sequence learning with neural networks
  publication-title: Proc Adv Neural Inf Process Syst
– year: 2016
  ident: ref24
  article-title: Multi-objective deep reinforcement learning
– start-page: 9839
  year: 2018
  ident: ref17
  article-title: Reinforcement learning for solving the vehicle routing problem
  publication-title: Proc Adv Neural Inf Process Syst
– start-page: 443
  year: 1990
  ident: ref7
  article-title: Local optimization and the traveling salesman problem
  publication-title: Proc 17th Int Colloquium Automata Lang Program Lecture Notes Comput Sci
– ident: ref34
  doi: 10.3115/v1/D14-1179
– ident: ref36
  doi: 10.1109/MCI.2017.2742868
– ident: ref16
  doi: 10.1109/4235.585893
– ident: ref14
  doi: 10.1109/TEVC.2016.2600642
– ident: ref22
  doi: 10.1007/978-3-319-93031-2_12
– ident: ref5
  doi: 10.1007/978-3-540-88051-6_14
– ident: ref15
  doi: 10.1007/978-3-030-12598-1_14
– ident: ref3
  doi: 10.1109/TSMCB.2012.2231860
– ident: ref2
  doi: 10.1109/TEVC.2007.892759
– ident: ref10
  doi: 10.1109/TCYB.2013.2295886
– ident: ref27
  doi: 10.1109/TEVC.2014.2373386
– start-page: 1928
  year: 2016
  ident: ref25
  article-title: Asynchronous methods for deep reinforcement learning
  publication-title: Proc Int Conf Mach Learn
– start-page: 2692
  year: 2015
  ident: ref18
  article-title: Pointer networks
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref26
  doi: 10.1007/3-540-44719-9_6
– volume: 21
  start-page: 714
  year: 2017
  ident: ref28
  article-title: DMOEA-εC: Decomposition-based multiobjective evolutionary algorithm with the ε-constraint framework
  publication-title: IEEE Trans Evol Comput
  doi: 10.1109/TEVC.2017.2671462
– ident: ref40
  doi: 10.1109/5326.704576
– start-page: 249
  year: 2010
  ident: ref39
  article-title: Understanding the difficulty of training deep feedforward neural networks
  publication-title: Proc 13th Int Conf Artif Intell Stat
– year: 2016
  ident: ref20
  article-title: Neural combinatorial optimization with reinforcement learning
– ident: ref8
  doi: 10.1007/978-3-642-17144-4_6
– ident: ref41
  doi: 10.1016/S0377-2217(01)00104-7
– year: 2018
  ident: ref23
  article-title: MONAS: Multi-objective neural architecture search using reinforcement learning
– ident: ref29
  doi: 10.1109/TEVC.2013.2281535
– ident: ref4
  doi: 10.1109/CEC.2016.7743866
– year: 2014
  ident: ref38
  publication-title: Adam: A method for stochastic optimization
– ident: ref9
  doi: 10.1109/TEVC.2002.802873
– ident: ref31
  doi: 10.1109/TEVC.2016.2611642
– start-page: 1008
  year: 2000
  ident: ref35
  article-title: Actor-Critic algorithms
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref13
  doi: 10.1007/978-3-642-11218-8_6
– ident: ref1
  doi: 10.1109/4235.996017
– year: 2014
  ident: ref19
  article-title: Neural machine translation by jointly learning to align and translate
– ident: ref32
  doi: 10.1109/TEVC.2016.2521175
– ident: ref12
  doi: 10.1109/TCYB.2018.2849403
SSID ssj0000816898
Score 2.6608922
Snippet This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), that we call...
SourceID proquest
pubmed
crossref
ieee
SourceType Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 3103
SubjectTerms Algorithms
Decomposition
Deep learning
Deep reinforcement learning (DRL)
Iterative methods
Machine learning
Mathematical models
Modeling
Mopping
multiobjective optimization
Multiple objective analysis
Neural networks
Optimization
Parameters
Pareto optimization
Pointer Network
Reinforcement learning
Training
Traveling salesman problem
Traveling salesman problems
Urban areas
Title Deep Reinforcement Learning for Multiobjective Optimization
URI https://ieeexplore.ieee.org/document/9040280
https://www.ncbi.nlm.nih.gov/pubmed/32191907
https://www.proquest.com/docview/2528944230
https://www.proquest.com/docview/2381624843
Volume 51
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3dSxwxEB_UB7kX67dXbVnBBy3umU2yuxf61KrHIaggJ9inZZNMBG3vpN699K_vJJtbsFTxLWyyX5mZzEwm8xuAAyUx4xma1GlaAqVTLq2dkWnfKa5rIrsJ2J2XV8XwVl7c5XcLcNzmwiBiOHyGPd8MsXw7MTO_VXaiiON4nxz0RXLcmlytdj8lFJAIpW85NVKyKsoYxMyYOhmd_vhOziBnPU4GD-mkDiwLElZSh-ULjRRKrLxubQatM_gAl_PvbQ6bPPZmU90zf_6BcnzvD63CSjQ_k28Nv6zBAo7XYS0K-HNyGFGojzbg6xniU3KDAVnVhE3EJIKx3id0KQmpuxP90KyYyTWtPb9iUucm3A7OR6fDNFZaSI2Qappq68pcZpxb8pQtE6Q5yzzXxvYtZ1bnyKytpSksY1rJ2tkC68KJUiIj_4ksxi1YGk_GuAOJ5qIURtZCYS5dbjWRvLBKeKB4JnXWBTaf7cpEGHJfDeNnFdwRpipPq8rTqoq06sKX9panBoPjrcEbfp7bgXGKu7A3J2kVpfS54jm5m5IMSureb7tJvnzQpB7jZEZjfGSVy74UXdhuWKF99pyDPv7_nbvQ4f4ETNiz2YOl6e8ZfiITZqo_B979Cxw_6Qw
linkProvider IEEE
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3dTxQxEJ8QSJQXBfHjBHRNfFDjHt1-7F7DE6LkVA4TcyTwtNm2UxM-7gjcvfDXM-32NpGo8a3Zdr86M52ZTuc3AG-1xIIXaHNvaAmUXvu88VbmA6-5aYjsNmJ3jo7K4bH8dqJOluBjlwuDiPHwGfZDM8by3dTOw1bZjiaO4wNy0FdI7yveZmt1OyqxhEQsfsupkZNdUaUwZsH0znj_9BO5g5z1OZk8pJVW4YEgcSWFWP2mk2KRlb_bm1HvHDyG0eKL2-Mm5_35zPTt7T0wx__9pTV4lAzQbK_lmHVYwskTWE8ifpO9SzjU7zdg9zPiVfYTI7aqjduIWYJj_ZXRpSwm707NWbtmZj9o9blMaZ1P4fjgy3h_mKdaC7kVUs9y43ylZMG5I1_ZMUG6s1LKWDdwnDmjkDnXSFs6xoyWjXclNqUXlURGHhTZjM9geTKd4AvIDBeVsLIRGpX0yhkieum0CFDxTJqiB2wx27VNQOShHsZFHR0SputAqzrQqk606sGH7parFoXjX4M3wjx3A9MU92BrQdI6yelNzRU5nJJMSup-03WThIWwSTPB6ZzGhNgqlwMpevC8ZYXu2QsOevnnd76Gh8Px6LA-_Hr0fRNWeTgPE3dwtmB5dj3HbTJoZuZV5OM7cizsVg
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Deep+Reinforcement+Learning+for+Multiobjective+Optimization&rft.jtitle=IEEE+transactions+on+cybernetics&rft.au=Li%2C+Kaiwen&rft.au=Zhang%2C+Tao&rft.au=Wang%2C+Rui&rft.date=2021-06-01&rft.pub=The+Institute+of+Electrical+and+Electronics+Engineers%2C+Inc.+%28IEEE%29&rft.issn=2168-2267&rft.eissn=2168-2275&rft.volume=51&rft.issue=6&rft.spage=3103&rft_id=info:doi/10.1109%2FTCYB.2020.2977661&rft.externalDBID=NO_FULL_TEXT