Learning with Delayed Payoffs in Population Games using Kullback-Leibler Divergence Regularization
Published in | IEEE Transactions on Automatic Control, pp. 1 - 16 |
Main Authors | Park, Shinkyu; Leonard, Naomi Ehrich |
Format | Journal Article |
Language | English |
Published | IEEE, 2025 |
Online Access | https://ieeexplore.ieee.org/document/10966158 |
Abstract | We study a multi-agent decision problem in large population games. Agents from multiple populations select strategies for repeated interactions with one another. At each stage of these interactions, agents use their decision-making model to revise their strategy selections based on payoffs determined by an underlying game. Their goal is to learn the strategies that correspond to the Nash equilibrium of the game. However, when games are subject to time delays, conventional decision-making models from the population game literature may result in oscillations in the strategy revision process or convergence to an equilibrium other than the Nash. To address this problem, we propose the Kullback-Leibler Divergence Regularized Learning (KLD-RL) model, along with an algorithm that iteratively updates the model's regularization parameter across a network of communicating agents. Using passivity-based convergence analysis techniques, we show that the KLD-RL model achieves convergence to the Nash equilibrium without oscillations, even for a class of population games that are subject to time delays. We demonstrate our main results numerically on a two-population congestion game and a two-population zero-sum game. |
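The abstract describes a learning model that regularizes each agent's strategy choice with a Kullback-Leibler divergence penalty toward a reference distribution. As a rough illustration only, and not the paper's KLD-RL algorithm (which also iteratively updates the regularization parameter over a network of communicating agents), the sketch below assumes the standard closed form x_i ∝ μ_i exp(p_i/θ) for the maximizer of x·p − θ·KL(x‖μ) and runs it on a toy two-route congestion game; the names kl_regularized_choice, route_payoffs, theta, mu, the payoff values, and the step size are all hypothetical choices made for this example.

```python
import numpy as np

def kl_regularized_choice(payoffs, reference, theta):
    # Maximizer of  x . payoffs - theta * KL(x || reference)  over the simplex:
    # x_i is proportional to reference_i * exp(payoffs_i / theta).
    scaled = (payoffs - payoffs.max()) / theta   # shift for numerical stability
    weights = reference * np.exp(scaled)
    return weights / weights.sum()

def route_payoffs(x):
    # Toy two-route congestion game: each route's cost grows with its load,
    # and payoffs are negative costs. Values are made up for illustration.
    return -np.array([2.0 * x[0], 1.0 + x[1]])

x = np.array([0.5, 0.5])        # population state (strategy shares)
mu = np.array([0.5, 0.5])       # reference distribution in the KL term
theta = 0.1                     # hypothetical regularization weight
for _ in range(200):
    target = kl_regularized_choice(route_payoffs(x), mu, theta)
    x = x + 0.05 * (target - x)  # Euler step of a simple revision dynamic
print(np.round(x, 3))  # settles near the cost-equalizing split (~[0.65, 0.35] here)
```

Smaller values of theta push the choice closer to a best response to the current payoffs, while larger values keep it closer to the reference distribution mu; this trade-off is the role the regularization parameter plays in the model described above.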
Author | Leonard, Naomi Ehrich; Park, Shinkyu |
Author_xml | – sequence: 1; givenname: Shinkyu; surname: Park; fullname: Park, Shinkyu; email: shinkyu7275@gmail.com; organization: Electrical and Computer Engineering, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
– sequence: 2; givenname: Naomi Ehrich; surname: Leonard; fullname: Leonard, Naomi Ehrich; email: naomi@princeton.edu; organization: Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA |
CODEN | IETAA9 |
ContentType | Journal Article |
DOI | 10.1109/TAC.2025.3561108 |
Discipline | Engineering |
EISSN | 1558-2523 |
EndPage | 16 |
Genre | orig-research |
ISSN | 0018-9286 |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
PageCount | 16 |
PublicationDate | 2025 |
PublicationTitle | IEEE transactions on automatic control |
PublicationTitleAbbrev | TAC |
PublicationYear | 2025 |
Publisher | IEEE |
StartPage | 1 |
SubjectTerms | Convergence; decision making; Delay effects; Demand response; evolutionary dynamics; game theory; Games; Multi-agent systems; Nash equilibrium; nonlinear systems; Oscillators; Roads; Stability analysis; Training; Vectors |
Title | Learning with Delayed Payoffs in Population Games using Kullback-Leibler Divergence Regularization |
URI | https://ieeexplore.ieee.org/document/10966158 |