Adaptive Learning: A New Decentralized Reinforcement Learning Approach for Cooperative Multiagent Systems
Published in | IEEE Access, Vol. 8, pp. 99404-99421 |
---|---|
Main Authors | Li, Meng-Lin; Chen, Shaofei; Chen, Jing |
Format | Journal Article |
Language | English |
Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020 |
Subjects | Adaptive systems; Algorithms; Communication; Coordination; Games; Heuristic algorithms; Intelligent control; Learning (artificial intelligence); Machine learning; Multiagent systems; Parallel processing; Reinforcement learning; Roads; Robotics; Training; Urban areas |
Online Access | https://ieeexplore.ieee.org/document/9102277 ; https://www.proquest.com/docview/2454419686 ; https://doaj.org/article/bc6f2a1503774c0ea923a2e815582f49 |
ISSN | 2169-3536 |
EISSN | 2169-3536 |
DOI | 10.1109/ACCESS.2020.2997899 |
Abstract | Multiagent systems (MASs) have received extensive attention in a variety of domains, such as robotics and distributed control. This paper focuses on how independent learners (ILs, structures used in decentralized reinforcement learning) decide on their individual behaviors to achieve coherent joint behavior. To date, reinforcement learning (RL) approaches for ILs have not guaranteed convergence to the optimal joint policy in scenarios in which communication is difficult. In particular, a decentralized algorithm cannot distinguish what share of the credit for a joint outcome belongs to a single agent's action, which can lead to miscoordination of joint actions. Studying coordination mechanisms between agents in MASs is therefore highly significant. Most previous coordination mechanisms work by modeling the communication mechanism and the policies of the other agents. Such methods are applicable only to a particular system and therefore do not generalize, especially when there are dozens or more agents. This paper accordingly focuses on MASs containing more than a dozen agents, and by employing parallel computation it brings the experimental environment closer to real application scenarios. Building on the paradigm of centralized training and decentralized execution (CTDE), we propose a multiagent reinforcement learning algorithm that achieves implicit coordination based on the TD error. The new algorithm dynamically adjusts each agent's learning rate, motivated by an analysis of the dissonance problem in matrix games and its extension to multiagent environments; adjusting the learning rates dynamically across agents allows their strategies to be coordinated. Experimental results show that the proposed algorithm effectively improves the coordination ability of a MAS, and the variance of the training results is more stable than that of the hysteretic Q-learning (HQL) algorithm. Hence, miscoordination in a MAS can be avoided to some extent without additional communication. Our work provides a new way to solve the miscoordination problem for reinforcement learning at the scale of dozens of agents or more. As a new IL-structure algorithm, it merits extension and further study. |
---|---|
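The abstract's baseline, hysteretic Q-learning (HQL), is the standard point of comparison for independent learners that must cope with teammates' exploration. For orientation, below is a minimal sketch of the hysteretic Q-update in Python; the tabular interface, the rate values, and the class name are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

class HystereticQLearner:
    """Independent learner with a hysteretic Q-update.

    Positive TD errors are applied at the full rate `alpha`; negative
    TD errors, which in cooperative games are often caused by a
    teammate's exploratory action rather than one's own choice, are
    applied at the smaller rate `beta` < `alpha`.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, beta=0.01, gamma=0.95):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * self.q[s_next].max()
        td_error = target - self.q[s, a]
        # Hysteresis: shrink the step size when the TD error is negative.
        rate = self.alpha if td_error >= 0 else self.beta
        self.q[s, a] += rate * td_error
        return td_error
```

With `beta` far below `alpha`, each learner largely ignores low returns produced by an exploring partner, which is exactly the miscoordination failure mode the abstract discusses; the trade-off, per the abstract, is higher variance across training runs.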
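The abstract specifies the proposed method only at a high level: each IL's learning rate is adjusted dynamically as a function of the TD error rather than switched between HQL's two fixed constants. The sketch below is speculative; the running-average signal, the smoothing constant `tau`, and the interpolation between the two rates are assumptions of ours, since the abstract does not state the actual update rule.

```python
import numpy as np

class AdaptiveQLearner:
    """Hypothetical TD-error-driven learning-rate schedule (not the
    paper's published rule).

    The step size is interpolated between a cautious rate `beta` and an
    optimistic rate `alpha` according to a running estimate of how often
    this agent's TD errors are positive.
    """

    def __init__(self, n_states, n_actions,
                 alpha=0.1, beta=0.01, gamma=0.95, tau=0.05):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.tau = tau        # smoothing constant (illustrative)
        self.optimism = 0.5   # running fraction of positive TD errors

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * self.q[s_next].max()
        td_error = target - self.q[s, a]
        # Exponential average of the sign of recent TD errors.
        self.optimism += self.tau * (float(td_error >= 0) - self.optimism)
        # Dynamic learning rate: blend cautious and optimistic rates.
        rate = self.beta + (self.alpha - self.beta) * self.optimism
        self.q[s, a] += rate * td_error
        return td_error
```

In a cooperative matrix game with a dissonance problem (e.g., the climbing game), a schedule of this shape lets an agent stay optimistic while the joint policy is still improving and become cautious once negative surprises dominate, which is one plausible reading of how dynamic rates could stabilize training variance relative to HQL's fixed pair.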
Authors | Li, Meng-Lin (ORCID: 0000-0003-3307-5490); Chen, Shaofei (email: chensf005@163.com); Chen, Jing (email: chenjing001@vip.sina.com). All authors: College of Intelligence Science and Technology, National University of Defence Technology, Changsha, China
CODEN | IAECCG |
Copyright | The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020
Funding | National Natural Science Foundation of China, Grant 61702528 (funder DOI: 10.13039/501100001809)
Open Access | Yes
Peer Reviewed | Yes
License | Creative Commons Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/legalcode