Deep Reinforcement Learning for Multiobjective Optimization

Bibliographic Details
Published in IEEE Transactions on Cybernetics, Vol. 51, No. 6, pp. 3103-3114
Main Authors Li, Kaiwen; Zhang, Tao; Wang, Rui
Format Journal Article
Language English
Published United States: IEEE, 01.06.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online Access Get full text

Abstract This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), which we call the DRL-based multiobjective optimization algorithm (DRL-MOA). The idea of decomposition is adopted to decompose the MOP into a set of scalar optimization subproblems. Then, each subproblem is modeled as a neural network. Model parameters of all the subproblems are optimized collaboratively according to a neighborhood-based parameter-transfer strategy and the DRL training algorithm. Pareto-optimal solutions can be directly obtained through the trained neural-network models. Specifically, the multiobjective traveling salesman problem (MOTSP) is solved in this article using the DRL-MOA method by modeling each subproblem as a Pointer Network. Extensive experiments have been conducted to study the DRL-MOA, comparing it with various benchmark methods. It is found that once the trained model is available, it can scale to newly encountered problems without retraining. The solutions can be directly obtained by a simple forward calculation of the neural network; thus, no iteration is required, and the MOP can always be solved within a reasonable time. The proposed method provides a new way of solving the MOP by means of DRL. It exhibits new characteristics, such as strong generalization ability and fast solving speed, in comparison with existing multiobjective optimization methods. The experimental results show the effectiveness and competitiveness of the proposed method in terms of model performance and running time.
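
The decomposition step described in the abstract can be made concrete before any learning enters the picture. Below is a minimal sketch, assuming the common bi-objective Euclidean MOTSP benchmark in which each city carries two independent 2-D coordinate sets, one per objective; the weighted-sum scalarization follows the paper, but the helper names, weight count, and instance size are illustrative.

```python
import numpy as np

def tour_length(tour, coords):
    """Euclidean length of the closed tour over one coordinate set."""
    c = coords[tour]
    return float(np.linalg.norm(c - np.roll(c, -1, axis=0), axis=1).sum())

def scalarized_cost(tour, coords1, coords2, weight):
    """Weighted-sum subproblem cost g(tour | w) = w1*f1(tour) + w2*f2(tour)."""
    return (weight[0] * tour_length(tour, coords1)
            + weight[1] * tour_length(tour, coords2))

# Decompose the bi-objective MOTSP into five scalar subproblems via
# evenly spread weight vectors -- one subproblem per (future) model.
weights = [(i / 4, 1 - i / 4) for i in range(5)]

rng = np.random.default_rng(0)
coords1, coords2 = rng.random((20, 2)), rng.random((20, 2))  # 20-city instance
tour = list(rng.permutation(20))
for w in weights:
    print(w, scalarized_cost(tour, coords1, coords2, w))
```

Each weight vector defines one scalar subproblem; in DRL-MOA, each such subproblem is then assigned its own neural-network model.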
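The neighborhood-based parameter-transfer strategy can be sketched as follows: the first subproblem's policy is trained from scratch, and each subsequent model is warm-started from its trained neighbor, so only the first model needs a long training run. In this sketch, `SimplePolicy` is a toy stand-in for the paper's Pointer Network, and plain REINFORCE stands in for its actor-critic trainer; both are assumptions for illustration, not the authors' implementation.

```python
import copy
import torch
import torch.nn as nn

class SimplePolicy(nn.Module):
    """Toy stand-in for the Pointer Network: scores cities from their
    two 2-D coordinate sets (4 input features per city)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, feats):               # feats: (n_cities, 4)
        return self.net(feats).squeeze(-1)  # unnormalized per-city scores

def rollout(policy, feats):
    """Sample a tour city by city; return the tour and its log-probability."""
    n = feats.shape[0]
    visited = torch.zeros(n, dtype=torch.bool)
    logp, tour = torch.zeros(()), []
    for _ in range(n):
        scores = policy(feats).masked_fill(visited, float("-inf"))
        dist = torch.distributions.Categorical(logits=scores)
        city = dist.sample()
        logp = logp + dist.log_prob(city)
        visited[city] = True
        tour.append(int(city))
    return tour, logp

def weighted_cost(tour, feats, w):
    """Weighted sum of the two Euclidean tour lengths (one per objective)."""
    idx = torch.tensor(tour + tour[:1])     # close the loop
    seg = feats[idx[1:]] - feats[idx[:-1]]
    return (w[0] * seg[:, :2].norm(dim=1).sum()
            + w[1] * seg[:, 2:].norm(dim=1).sum())

def train_subproblem(policy, w, steps):
    """Plain REINFORCE on the scalarized cost of random 20-city instances."""
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(steps):
        feats = torch.rand(20, 4)           # a fresh random MOTSP instance
        tour, logp = rollout(policy, feats)
        cost = weighted_cost(tour, feats, w).detach()
        opt.zero_grad()
        (logp * cost).backward()            # policy-gradient step, no baseline
        opt.step()
    return policy

# Solve the subproblems in weight order, warm-starting each model from
# its trained neighbor; only the first model trains from scratch.
weights = [(i / 4, 1 - i / 4) for i in range(5)]
models, prev = [], None
for w in weights:
    model = SimplePolicy() if prev is None else copy.deepcopy(prev)
    prev = train_subproblem(model, w, steps=200 if prev is None else 50)
    models.append(prev)
```

At inference, an approximate Pareto front is read off with one rollout per trained model on the new instance; this single forward pass per subproblem is what underlies the abstract's claim that no iterative search is required.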
Author Li, Kaiwen
Zhang, Tao
Wang, Rui
Author_xml – sequence: 1
  givenname: Kaiwen
  orcidid: 0000-0003-1550-5987
  surname: Li
  fullname: Li, Kaiwen
  email: kaiwenli_nudt@foxmail.com
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
– sequence: 2
  givenname: Tao
  surname: Zhang
  fullname: Zhang, Tao
  email: zhangtao@nudt.edu.cn
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
– sequence: 3
  givenname: Rui
  orcidid: 0000-0001-9048-2979
  surname: Wang
  fullname: Wang, Rui
  email: ruiwangnudt@gmail.com
  organization: College of Systems Engineering, National University of Defense Technology, Changsha, China
BackLink https://www.ncbi.nlm.nih.gov/pubmed/32191907 (View this record in MEDLINE/PubMed)
CODEN ITCEB8
CitedBy_id crossref_primary_10_1109_TCDS_2023_3307722
crossref_primary_10_1016_j_rico_2022_100195
crossref_primary_10_1016_j_cie_2023_109605
crossref_primary_10_1109_JIOT_2021_3078462
crossref_primary_10_1109_MCI_2023_3277768
crossref_primary_10_1109_JIOT_2024_3440017
crossref_primary_10_3390_math12193025
crossref_primary_10_1016_j_asr_2024_06_003
crossref_primary_10_1016_j_asoc_2024_112575
crossref_primary_10_1016_j_apenergy_2023_121332
crossref_primary_10_1016_j_aei_2023_102197
crossref_primary_10_1109_TNNLS_2021_3105937
crossref_primary_10_1109_JAS_2024_124548
crossref_primary_10_1109_TNNLS_2024_3371706
crossref_primary_10_1145_3470971
crossref_primary_10_1016_j_knosys_2023_110801
crossref_primary_10_1016_j_swevo_2025_101892
crossref_primary_10_1016_j_matdes_2023_112556
crossref_primary_10_1109_JSAC_2024_3365902
crossref_primary_10_1109_TEVC_2022_3199045
crossref_primary_10_4018_IJCINI_361012
crossref_primary_10_1007_s11590_023_02009_5
crossref_primary_10_1016_j_asoc_2024_111751
crossref_primary_10_1109_OJPEL_2020_3012777
crossref_primary_10_3390_app13179667
crossref_primary_10_3390_jlpea12040053
crossref_primary_10_1016_j_eswa_2022_118658
crossref_primary_10_1109_TNNLS_2022_3148435
crossref_primary_10_1016_j_ins_2023_119253
crossref_primary_10_1109_TCYB_2022_3150802
crossref_primary_10_1016_j_compenvurbsys_2023_101966
crossref_primary_10_1016_j_swevo_2022_101225
crossref_primary_10_1007_s10489_023_05013_5
crossref_primary_10_1109_TCYB_2021_3107202
crossref_primary_10_1109_TETCI_2023_3293696
crossref_primary_10_3390_drones7010010
crossref_primary_10_1109_TETCI_2021_3115666
crossref_primary_10_3390_electronics11162513
crossref_primary_10_1007_s12293_022_00366_9
crossref_primary_10_1016_j_isatra_2023_04_004
crossref_primary_10_1109_TITS_2024_3448627
crossref_primary_10_1109_TNSM_2021_3086721
crossref_primary_10_1016_j_jmsy_2024_11_013
crossref_primary_10_1109_TCYB_2023_3283771
crossref_primary_10_1109_MCI_2023_3245719
crossref_primary_10_3390_electronics12194167
crossref_primary_10_1016_j_conbuildmat_2024_135206
crossref_primary_10_3390_s25061707
crossref_primary_10_1109_TEVC_2023_3250350
crossref_primary_10_1016_j_apenergy_2023_122287
crossref_primary_10_1016_j_asoc_2023_110330
crossref_primary_10_1109_ACCESS_2022_3233474
crossref_primary_10_1016_j_neunet_2024_106359
crossref_primary_10_3390_jmse12101765
crossref_primary_10_1109_TCYB_2021_3103811
crossref_primary_10_1109_TWC_2023_3240425
crossref_primary_10_1016_j_ejor_2023_11_038
crossref_primary_10_1016_j_swevo_2023_101253
crossref_primary_10_1016_j_swevo_2024_101694
crossref_primary_10_1016_j_eswa_2024_123592
crossref_primary_10_1109_LRA_2025_3534070
crossref_primary_10_3390_electronics13193842
crossref_primary_10_1007_s40747_021_00514_7
crossref_primary_10_1016_j_energy_2024_133412
crossref_primary_10_1016_j_ejor_2025_01_012
crossref_primary_10_1109_TMC_2024_3357796
crossref_primary_10_2166_hydro_2024_191
crossref_primary_10_3390_app15052679
crossref_primary_10_1002_aic_18012
crossref_primary_10_1109_TIV_2023_3236104
crossref_primary_10_1016_j_engappai_2023_107564
crossref_primary_10_1080_03088839_2024_2326635
crossref_primary_10_1016_j_buildenv_2025_112864
crossref_primary_10_1016_j_compeleceng_2024_109603
crossref_primary_10_3390_biomimetics9120718
crossref_primary_10_1016_j_jii_2024_100727
crossref_primary_10_1109_TCYB_2023_3312476
crossref_primary_10_1109_JAS_2023_123609
crossref_primary_10_1016_j_swevo_2024_101605
crossref_primary_10_1109_ACCESS_2024_3505436
crossref_primary_10_1016_j_swevo_2024_101606
crossref_primary_10_1155_2021_9032206
crossref_primary_10_3390_sym16081030
crossref_primary_10_1016_j_swevo_2025_101866
crossref_primary_10_1016_j_ins_2023_119472
crossref_primary_10_1016_j_swevo_2023_101387
crossref_primary_10_1364_JOCN_460629
crossref_primary_10_1109_JAS_2023_123687
crossref_primary_10_1109_JAS_2023_124113
crossref_primary_10_1109_TITS_2023_3334976
crossref_primary_10_1155_2021_6694695
crossref_primary_10_1016_j_jmsy_2024_04_003
crossref_primary_10_1109_TSTE_2023_3341632
crossref_primary_10_1007_s12204_023_2679_7
crossref_primary_10_1364_JOCN_483733
crossref_primary_10_26599_TST_2023_9010076
crossref_primary_10_1016_j_jmsy_2024_04_007
crossref_primary_10_1109_ACCESS_2021_3060323
crossref_primary_10_1162_dint_a_00246
crossref_primary_10_3390_rs14061304
crossref_primary_10_1016_j_engappai_2023_107381
crossref_primary_10_1016_j_ins_2023_04_003
crossref_primary_10_1109_TAI_2024_3409520
crossref_primary_10_1109_TII_2024_3495788
crossref_primary_10_3390_math12142283
crossref_primary_10_1109_TETCI_2022_3146882
crossref_primary_10_1109_TEVC_2022_3179256
crossref_primary_10_1016_j_swevo_2023_101398
crossref_primary_10_1007_s40747_022_00799_2
crossref_primary_10_1109_TCYB_2021_3121542
crossref_primary_10_1109_TSTE_2024_3393764
crossref_primary_10_1016_j_cie_2023_109115
crossref_primary_10_3390_app131910689
crossref_primary_10_1109_TETCI_2022_3145706
crossref_primary_10_1109_TCYB_2022_3226744
crossref_primary_10_1109_JAS_2024_124341
crossref_primary_10_1007_s40747_021_00635_z
crossref_primary_10_1007_s40747_024_01469_1
crossref_primary_10_3390_rs15163932
crossref_primary_10_1016_j_autcon_2024_105598
crossref_primary_10_1016_j_oceaneng_2024_118398
crossref_primary_10_3233_JCM_226740
crossref_primary_10_1016_j_swevo_2024_101788
crossref_primary_10_1109_TCYB_2022_3164285
crossref_primary_10_1016_j_eswa_2024_124296
crossref_primary_10_1016_j_disopt_2025_100879
crossref_primary_10_1109_ACCESS_2022_3181164
crossref_primary_10_1109_TCYB_2021_3089179
crossref_primary_10_1109_JAS_2022_105677
crossref_primary_10_1371_journal_pcbi_1012229
crossref_primary_10_1109_TITS_2024_3438788
crossref_primary_10_1016_j_jhydrol_2024_130904
crossref_primary_10_1109_JLT_2022_3176473
crossref_primary_10_1109_TNSM_2024_3427403
crossref_primary_10_1109_TCYB_2022_3163816
crossref_primary_10_1109_TITS_2024_3515997
crossref_primary_10_1016_j_aei_2023_101965
crossref_primary_10_1016_j_energy_2024_130999
crossref_primary_10_1109_TNSM_2021_3127685
crossref_primary_10_3390_a17120579
crossref_primary_10_1145_3715700
crossref_primary_10_1007_s40747_023_01308_9
crossref_primary_10_1007_s11227_024_06439_5
crossref_primary_10_3233_IDA_230480
crossref_primary_10_1007_s43674_023_00055_1
crossref_primary_10_1109_TITS_2023_3313688
crossref_primary_10_3390_f15122181
crossref_primary_10_1109_TCYB_2023_3234077
crossref_primary_10_1109_JIOT_2024_3406531
crossref_primary_10_1109_JIOT_2021_3133278
crossref_primary_10_1016_j_ins_2024_121081
crossref_primary_10_3390_math11020437
crossref_primary_10_1088_1742_6596_2450_1_012081
crossref_primary_10_1145_3501803
crossref_primary_10_1016_j_eij_2023_100388
crossref_primary_10_1109_TNSM_2024_3446248
crossref_primary_10_3390_ai5040085
crossref_primary_10_1016_j_engappai_2025_110337
crossref_primary_10_1021_acs_jcim_2c00671
crossref_primary_10_1109_TNSM_2024_3387987
Cites_doi 10.1287/ijoc.3.4.376
10.1287/opre.21.2.498
10.1109/TEVC.2014.2350995
10.3115/v1/D14-1179
10.1109/MCI.2017.2742868
10.1109/4235.585893
10.1109/TEVC.2016.2600642
10.1007/978-3-319-93031-2_12
10.1007/978-3-540-88051-6_14
10.1007/978-3-030-12598-1_14
10.1109/TSMCB.2012.2231860
10.1109/TEVC.2007.892759
10.1109/TCYB.2013.2295886
10.1109/TEVC.2014.2373386
10.1007/3-540-44719-9_6
10.1109/TEVC.2017.2671462
10.1109/5326.704576
10.1007/978-3-642-17144-4_6
10.1016/S0377-2217(01)00104-7
10.1109/TEVC.2013.2281535
10.1109/CEC.2016.7743866
10.1109/TEVC.2002.802873
10.1109/TEVC.2016.2611642
10.1007/978-3-642-11218-8_6
10.1109/4235.996017
10.1109/TEVC.2016.2521175
10.1109/TCYB.2018.2849403
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DBID 97E
RIA
RIE
AAYXX
CITATION
NPM
7SC
7SP
7TB
8FD
F28
FR3
H8D
JQ2
L7M
L~C
L~D
7X8
DOI 10.1109/TCYB.2020.2977661
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore
CrossRef
PubMed
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Mechanical & Transportation Engineering Abstracts
Technology Research Database
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
Aerospace Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
DatabaseTitle CrossRef
PubMed
Aerospace Database
Technology Research Database
Computer and Information Systems Abstracts – Academic
Mechanical & Transportation Engineering Abstracts
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Engineering Research Database
Advanced Technologies Database with Aerospace
ANTE: Abstracts in New Technology & Engineering
Computer and Information Systems Abstracts Professional
MEDLINE - Academic
DatabaseTitleList MEDLINE - Academic
Aerospace Database
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: RIE
  name: IEEE Xplore
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Sciences (General)
EISSN 2168-2275
EndPage 3114
ExternalDocumentID 32191907
10_1109_TCYB_2020_2977661
9040280
Genre orig-research
Journal Article
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61773390; 71571187
  funderid: 10.13039/501100001809
GroupedDBID 0R~
4.4
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACIWK
AENEX
AGQYO
AGSQL
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
HZ~
IFIPE
IPLJI
JAVBF
M43
O9-
OCL
PQQKQ
RIA
RIE
RNS
AAYXX
CITATION
RIG
NPM
7SC
7SP
7TB
8FD
F28
FR3
H8D
JQ2
L7M
L~C
L~D
7X8
ID FETCH-LOGICAL-c349t-bdf754122d742d03cee755bcd8d20db5e0dda4c6d00b94afd6ea6f374e0625533
IEDL.DBID RIE
ISSN 2168-2267
2168-2275
IngestDate Thu Jul 10 19:15:11 EDT 2025
Mon Jun 30 04:41:26 EDT 2025
Thu Apr 03 07:00:18 EDT 2025
Tue Jul 01 00:53:55 EDT 2025
Thu Apr 24 22:52:01 EDT 2025
Wed Aug 27 02:30:24 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c349t-bdf754122d742d03cee755bcd8d20db5e0dda4c6d00b94afd6ea6f374e0625533
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ORCID 0000-0003-1550-5987
0000-0001-9048-2979
PMID 32191907
PQID 2528944230
PQPubID 85422
PageCount 12
ParticipantIDs ieee_primary_9040280
crossref_citationtrail_10_1109_TCYB_2020_2977661
proquest_miscellaneous_2381624843
proquest_journals_2528944230
crossref_primary_10_1109_TCYB_2020_2977661
pubmed_primary_32191907
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2021-06-01
PublicationDateYYYYMMDD 2021-06-01
PublicationDate_xml – month: 06
  year: 2021
  text: 2021-06-01
  day: 01
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: Piscataway
PublicationTitle IEEE transactions on cybernetics
PublicationTitleAbbrev TCYB
PublicationTitleAlternate IEEE Trans Cybern
PublicationYear 2021
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref13
nazari (ref17) 2018
ref34
ref12
ref37
ref15
ref36
ref14
ref31
ref32
konda (ref35) 2000
ref10
bahdanau (ref19) 2014
ref2
ref1
ref16
mnih (ref25) 2016
miettinen (ref30) 2012; 12
mossalam (ref24) 2016
johnson (ref7) 1990
ref26
ref41
bello (ref20) 2016
ref22
glorot (ref39) 2010
cai (ref11) 2015; 19
vinyals (ref18) 2015
chen (ref28) 2017; 21
ref27
kool (ref21) 2018
kingma (ref38) 2014
ref29
hsu (ref23) 2018
ref8
ref9
ref4
ref3
ref6
sutskever (ref33) 2014
ref5
ref40
References_xml – ident: ref37
  doi: 10.1287/ijoc.3.4.376
– ident: ref6
  doi: 10.1287/opre.21.2.498
– year: 2018
  ident: ref21
  publication-title: Attention, learn to solve routing problems!
– volume: 12
  year: 2012
  ident: ref30
  publication-title: Nonlinear Multiobjective Optimization
– volume: 19
  start-page: 508
  year: 2015
  ident: ref11
  article-title: An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization
  publication-title: IEEE Trans Evol Comput
  doi: 10.1109/TEVC.2014.2350995
– start-page: 3104
  year: 2014
  ident: ref33
  article-title: Sequence to sequence learning with neural networks
  publication-title: Proc Adv Neural Inf Process Syst
– year: 2016
  ident: ref24
  article-title: Multi-objective deep reinforcement learning
– start-page: 9839
  year: 2018
  ident: ref17
  article-title: Reinforcement learning for solving the vehicle routing problem
  publication-title: Proc Adv Neural Inf Process Syst
– start-page: 443
  year: 1990
  ident: ref7
  article-title: Local optimization and the traveling salesman problem
  publication-title: Proc 17th Int Colloquium Automata Lang Program Lecture Notes Comput Sci
– ident: ref34
  doi: 10.3115/v1/D14-1179
– ident: ref36
  doi: 10.1109/MCI.2017.2742868
– ident: ref16
  doi: 10.1109/4235.585893
– ident: ref14
  doi: 10.1109/TEVC.2016.2600642
– ident: ref22
  doi: 10.1007/978-3-319-93031-2_12
– ident: ref5
  doi: 10.1007/978-3-540-88051-6_14
– ident: ref15
  doi: 10.1007/978-3-030-12598-1_14
– ident: ref3
  doi: 10.1109/TSMCB.2012.2231860
– ident: ref2
  doi: 10.1109/TEVC.2007.892759
– ident: ref10
  doi: 10.1109/TCYB.2013.2295886
– ident: ref27
  doi: 10.1109/TEVC.2014.2373386
– start-page: 1928
  year: 2016
  ident: ref25
  article-title: Asynchronous methods for deep reinforcement learning
  publication-title: Proc Int Conf Mach Learn
– start-page: 2692
  year: 2015
  ident: ref18
  article-title: Pointer networks
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref26
  doi: 10.1007/3-540-44719-9_6
– volume: 21
  start-page: 714
  year: 2017
  ident: ref28
  article-title: DMOEA-εC: Decomposition-based multiobjective evolutionary algorithm with the ε-constraint framework
  publication-title: IEEE Trans Evol Comput
  doi: 10.1109/TEVC.2017.2671462
– ident: ref40
  doi: 10.1109/5326.704576
– start-page: 249
  year: 2010
  ident: ref39
  article-title: Understanding the difficulty of training deep feedforward neural networks
  publication-title: Proc 13th Int Conf Artif Intell Stat
– year: 2016
  ident: ref20
  article-title: Neural combinatorial optimization with reinforcement learning
– ident: ref8
  doi: 10.1007/978-3-642-17144-4_6
– ident: ref41
  doi: 10.1016/S0377-2217(01)00104-7
– year: 2018
  ident: ref23
  article-title: MONAS: Multi-objective neural architecture search using reinforcement learning
– ident: ref29
  doi: 10.1109/TEVC.2013.2281535
– ident: ref4
  doi: 10.1109/CEC.2016.7743866
– year: 2014
  ident: ref38
  publication-title: Adam: A method for stochastic optimization
– ident: ref9
  doi: 10.1109/TEVC.2002.802873
– ident: ref31
  doi: 10.1109/TEVC.2016.2611642
– start-page: 1008
  year: 2000
  ident: ref35
  article-title: Actor-Critic algorithms
  publication-title: Proc Adv Neural Inf Process Syst
– ident: ref13
  doi: 10.1007/978-3-642-11218-8_6
– ident: ref1
  doi: 10.1109/4235.996017
– year: 2014
  ident: ref19
  article-title: Neural machine translation by jointly learning to align and translate
– ident: ref32
  doi: 10.1109/TEVC.2016.2521175
– ident: ref12
  doi: 10.1109/TCYB.2018.2849403
SSID ssj0000816898
Score 2.6608922
Snippet This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), that we call...
SourceID proquest
pubmed
crossref
ieee
SourceType Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 3103
SubjectTerms Algorithms
Decomposition
Deep learning
Deep reinforcement learning (DRL)
Iterative methods
Machine learning
Mathematical models
Modeling
Mopping
multiobjective optimization
Multiple objective analysis
Neural networks
Optimization
Parameters
Pareto optimization
Pointer Network
Reinforcement learning
Training
Traveling salesman problem
Traveling salesman problems
Urban areas
Title Deep Reinforcement Learning for Multiobjective Optimization
URI https://ieeexplore.ieee.org/document/9040280
https://www.ncbi.nlm.nih.gov/pubmed/32191907
https://www.proquest.com/docview/2528944230
https://www.proquest.com/docview/2381624843
Volume 51
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3dSxwxEB_UB7kX67dXbVnBBy3umU2yuxf61KrHIaggJ9inZZNMBG3vpN699K_vJJtbsFTxLWyyX5mZzEwm8xuAAyUx4xma1GlaAqVTLq2dkWnfKa5rIrsJ2J2XV8XwVl7c5XcLcNzmwiBiOHyGPd8MsXw7MTO_VXaiiON4nxz0RXLcmlytdj8lFJAIpW85NVKyKsoYxMyYOhmd_vhOziBnPU4GD-mkDiwLElZSh-ULjRRKrLxubQatM_gAl_PvbQ6bPPZmU90zf_6BcnzvD63CSjQ_k28Nv6zBAo7XYS0K-HNyGFGojzbg6xniU3KDAVnVhE3EJIKx3id0KQmpuxP90KyYyTWtPb9iUucm3A7OR6fDNFZaSI2Qappq68pcZpxb8pQtE6Q5yzzXxvYtZ1bnyKytpSksY1rJ2tkC68KJUiIj_4ksxi1YGk_GuAOJ5qIURtZCYS5dbjWRvLBKeKB4JnXWBTaf7cpEGHJfDeNnFdwRpipPq8rTqoq06sKX9panBoPjrcEbfp7bgXGKu7A3J2kVpfS54jm5m5IMSureb7tJvnzQpB7jZEZjfGSVy74UXdhuWKF99pyDPv7_nbvQ4f4ETNiz2YOl6e8ZfiITZqo_B979Cxw_6Qw
linkProvider IEEE
linkToHtml http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3dTxQxEJ8QSJQXBfHjBHRNfFDjHt1-7F7DE6LkVA4TcyTwtNm2UxM-7gjcvfDXM-32NpGo8a3Zdr86M52ZTuc3AG-1xIIXaHNvaAmUXvu88VbmA6-5aYjsNmJ3jo7K4bH8dqJOluBjlwuDiPHwGfZDM8by3dTOw1bZjiaO4wNy0FdI7yveZmt1OyqxhEQsfsupkZNdUaUwZsH0znj_9BO5g5z1OZk8pJVW4YEgcSWFWP2mk2KRlb_bm1HvHDyG0eKL2-Mm5_35zPTt7T0wx__9pTV4lAzQbK_lmHVYwskTWE8ifpO9SzjU7zdg9zPiVfYTI7aqjduIWYJj_ZXRpSwm707NWbtmZj9o9blMaZ1P4fjgy3h_mKdaC7kVUs9y43ylZMG5I1_ZMUG6s1LKWDdwnDmjkDnXSFs6xoyWjXclNqUXlURGHhTZjM9geTKd4AvIDBeVsLIRGpX0yhkieum0CFDxTJqiB2wx27VNQOShHsZFHR0SputAqzrQqk606sGH7parFoXjX4M3wjx3A9MU92BrQdI6yelNzRU5nJJMSup-03WThIWwSTPB6ZzGhNgqlwMpevC8ZYXu2QsOevnnd76Gh8Px6LA-_Hr0fRNWeTgPE3dwtmB5dj3HbTJoZuZV5OM7cizsVg
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Deep+Reinforcement+Learning+for+Multiobjective+Optimization&rft.jtitle=IEEE+transactions+on+cybernetics&rft.au=Li%2C+Kaiwen&rft.au=Zhang%2C+Tao&rft.au=Wang%2C+Rui&rft.date=2021-06-01&rft.pub=The+Institute+of+Electrical+and+Electronics+Engineers%2C+Inc.+%28IEEE%29&rft.issn=2168-2267&rft.eissn=2168-2275&rft.volume=51&rft.issue=6&rft.spage=3103&rft_id=info:doi/10.1109%2FTCYB.2020.2977661&rft.externalDBID=NO_FULL_TEXT