Learning in multi-agent systems with asymmetric information structure

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 412, pp. 351-359
Main Authors: Tan, Cheng; Qi, Qingyuan; Wong, Wing Shing
Format: Journal Article
Language: English
Published: Elsevier B.V., 28.10.2020
ISSN: 0925-2312
EISSN: 1872-8286
DOI: 10.1016/j.neucom.2019.08.112


Abstract: In this paper, we study multi-agent systems with an asymmetric information structure. Because the channel capacity of the communication network is limited, information along the routing path suffers a transmission delay. Instead of adopting a game-theoretic setting, we formulate the problem as an online quadratic optimization problem subject to stochastic systems with input delay. Since the probability statistics of the system noise are unknown, the decision-maker cannot use traditional optimal control strategies. Motivated by online convex optimization theory, we introduce the notion of regret, which measures the cumulative performance difference between the optimal index value when the noise statistics are known (offline) and the index value when they are unknown (online). The contributions of this paper are twofold. First, utilizing the linear minimum mean square unbiased estimate, we derive a learning-based control policy and characterize its behavior. Second, under some basic assumptions, we prove that the regret grows at a sub-linear rate and is explicitly bounded by O(ln T).
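
Illustrative note (not taken from the paper): the O(ln T) regret rate mentioned in the abstract can be made concrete with a deliberately simplified toy problem. The sketch below assumes a scalar quadratic loss with no system dynamics and no input delay, i.i.d. noise with unknown mean and variance `sigma2`, and an online decision at step t equal to the sample mean of t earlier observations. Under these assumptions the expected extra cost at step t, relative to a decision-maker who knows the true mean, is `sigma2 / t`, so the expected cumulative regret is a harmonic sum. The function names `expected_regret` and `log_bound` are hypothetical and chosen only for this example.

```python
import math


def expected_regret(T, sigma2=1.0):
    """Expected cumulative regret of the sample-mean decision after T steps.

    Toy assumption: the expected excess quadratic loss at step t equals the
    variance of a sample mean built from t observations, i.e. sigma2 / t.
    """
    return sum(sigma2 / t for t in range(1, T + 1))


def log_bound(T, sigma2=1.0):
    """Logarithmic upper bound on the harmonic sum: sigma2 * (1 + ln T)."""
    return sigma2 * (1.0 + math.log(T))


if __name__ == "__main__":
    for T in (10, 100, 1000, 10000):
        r = expected_regret(T)
        # Columns: horizon, cumulative regret, log bound, average regret per step
        print(T, round(r, 3), round(log_bound(T), 3), round(r / T, 6))
```

In this toy setting the cumulative regret stays below the logarithmic bound and the average regret per step shrinks toward zero as T grows, which is what sub-linear growth means. The paper's actual model (stochastic systems with input delay) is more involved, and this sketch does not reproduce its derivation.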
Author Affiliations
1. Tan, Cheng (tancheng1987love@163.com): Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
2. Qi, Qingyuan: School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
3. Wong, Wing Shing: Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Copyright: 2020 Elsevier B.V.
Discipline: Computer Science
Peer Reviewed: true
Keywords: Asymmetric information; Regret; Learning based control policy; Linear minimum mean square unbiased estimation; Online quadratic optimization