Learning in multi-agent systems with asymmetric information structure
In this paper, we study multi-agent systems with an asymmetric information structure. Due to limited channel capacity in the communication network, information along the routing path suffers a transmission delay. Instead of the game-theoretic setting, we formulate the problem as an online quadratic optimizati...
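
The abstract of this record describes regret as the cumulative gap between the online (noise statistics unknown) and offline (noise statistics known) index values, and states that it grows sub-linearly with an O(ln T) bound. A minimal sketch of that statement follows. The symbols R_T, J_T and C are assumed notation for illustration and are not taken from the record, as is the sign convention (online minus offline).

```latex
% Assumed notation: J_T denotes the cumulative quadratic index value up to horizon T.
% Sign convention (online minus offline) is an assumption.
R_T \;=\; J_T^{\mathrm{online}} - J_T^{\mathrm{offline}},
\qquad
\lim_{T \to \infty} \frac{R_T}{T} = 0 \quad \text{(sub-linear growth)},
\qquad
R_T \le C \ln T \quad \text{for some constant } C > 0 .
```

Under this reading, the logarithmic bound implies the sub-linear rate, since (ln T)/T tends to zero as T grows.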
Published in | Neurocomputing (Amsterdam) Vol. 412; pp. 351 - 359 |
---|---|
Main Authors | Tan, Cheng; Qi, Qingyuan; Wong, Wing Shing |
Format | Journal Article |
Language | English |
Published | Elsevier B.V, 28.10.2020 |
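Subjects | Asymmetric information; Learning based control policy; Linear minimum mean square unbiased estimation; Online quadratic optimization; Regret |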
ISSN | 0925-2312 1872-8286 |
DOI | 10.1016/j.neucom.2019.08.112 |
Abstract | In this paper, we study multi-agent systems with an asymmetric information structure. Due to limited channel capacity in the communication network, information along the routing path suffers a transmission delay. Instead of adopting a game-theoretic setting, we formulate the problem as an online quadratic optimization problem subject to stochastic systems with input delay. Since the statistics of the system noise are unknown, the decision-maker cannot apply traditional optimal control strategies. Motivated by online convex optimization theory, we introduce the notion of regret, which measures the cumulative performance difference between the optimal index value when the noise statistics are known (offline) and the index value when they are unknown (online). The contributions of this paper are twofold. First, utilizing the linear minimum mean square unbiased estimate, we derive a learning-based control policy and characterize its behavior. Second, under some basic assumptions, we prove that the regret grows at a sub-linear rate and is explicitly bounded by O(ln T). |
---|---|
Author | Qi, Qingyuan; Wong, Wing Shing; Tan, Cheng |
Author_xml | 1. Tan, Cheng (tancheng1987love@163.com), Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong; 2. Qi, Qingyuan, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; 3. Wong, Wing Shing, Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong |
CitedBy_id | crossref_primary_10_1016_j_neucom_2021_10_033 |
Cites_doi | 10.1016/j.automatica.2014.10.022 10.1086/262121 10.1016/0022-0531(85)90059-6 10.1137/0402013 10.1561/2200000024 10.1109/TCYB.2017.2750221 10.1080/00207543.2014.998789 10.1016/j.neucom.2017.10.008 10.1016/j.physrep.2014.09.006 10.1137/0306011 10.1049/iet-cta.2017.0398 10.1109/TAC.2016.2614887 10.1002/asjc.61 10.1109/TAC.2013.2275670 10.1109/TAC.1981.1102802 10.1007/s11424-015-4201-2 10.1109/TAC.2016.2627401 10.1137/060654396 10.1561/2400000013 10.1109/TWC.2014.040714.131695 10.1007/s12555-015-0479-z 10.1109/TCST.2009.2017934 |
ContentType | Journal Article |
Copyright | 2020 Elsevier B.V. |
Copyright_xml | – notice: 2020 Elsevier B.V. |
DBID | AAYXX CITATION |
DOI | 10.1016/j.neucom.2019.08.112 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISSN | 1872-8286 |
EndPage | 359 |
ExternalDocumentID | 10_1016_j_neucom_2019_08_112 S0925231220310377 |
IEDL.DBID | AIKHN |
ISSN | 0925-2312 |
IngestDate | Thu Apr 24 23:10:35 EDT 2025 Tue Jul 01 01:46:51 EDT 2025 Fri Feb 23 02:46:16 EST 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Asymmetric information; Regret; Learning based control policy; Linear minimum mean square unbiased estimation; Online quadratic optimization |
Language | English |
LinkModel | DirectLink |
PageCount | 9 |
ParticipantIDs | crossref_primary_10_1016_j_neucom_2019_08_112 crossref_citationtrail_10_1016_j_neucom_2019_08_112 elsevier_sciencedirect_doi_10_1016_j_neucom_2019_08_112 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2020-10-28 |
PublicationDateYYYYMMDD | 2020-10-28 |
PublicationDate_xml | – month: 10 year: 2020 text: 2020-10-28 day: 28 |
PublicationDecade | 2020 |
PublicationTitle | Neurocomputing (Amsterdam) |
PublicationYear | 2020 |
Publisher | Elsevier B.V |
Publisher_xml | – name: Elsevier B.V |
SSID | ssj0017129 |
Score | 2.3016534 |
SourceID | crossref elsevier |
SourceType | Enrichment Source Index Database Publisher |
StartPage | 351 |
SubjectTerms | Asymmetric information; Learning based control policy; Linear minimum mean square unbiased estimation; Online quadratic optimization; Regret |
Title | Learning in multi-agent systems with asymmetric information structure |
URI | https://dx.doi.org/10.1016/j.neucom.2019.08.112 |
Volume | 412 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | Elsevier |