Exponential asymptotic optimality of Whittle index policy
We evaluate the performance of Whittle index policy for restless Markovian bandit. It is shown in Weber and Weiss (J Appl Probab 27(3):637–648, 1990) that if the bandit is indexable and the associated deterministic system has a global attractor fixed point, then the Whittle index policy is asymptoti...
Published in | Queueing systems Vol. 104; no. 1-2; pp. 107 - 150 |
---|---|
Main Authors | Gast, Nicolas; Gaujal, Bruno; Yan, Chen |
Format | Journal Article |
Language | English |
Published | New York; Springer US; 01.06.2023; Springer Nature B.V; Springer Verlag |
Subjects | |
ISSN | 0257-0130 1572-9443 |
DOI | 10.1007/s11134-023-09875-x |
Abstract | We evaluate the performance of Whittle index policy for restless Markovian bandit. It is shown in Weber and Weiss (J Appl Probab 27(3):637–648, 1990) that if the bandit is indexable and the associated deterministic system has a global attractor fixed point, then the Whittle index policy is asymptotically optimal in the regime where the arm population grows proportionally with the number of activation arms. In this paper, we show that, under the same conditions, this convergence rate is exponential in the arm population, unless the fixed point is singular (to be defined later), which almost never happens in practice. Our result holds for the continuous-time model of Weber and Weiss (1990) and for a discrete-time model in which all bandits make synchronous transitions. Our proof is based on the nature of the deterministic equation governing the stochastic system: We show that it is a piecewise affine continuous dynamical system inside the simplex of the empirical measure of the arms. Using simulations and numerical solvers, we also investigate the singular cases, as well as how the level of singularity influences the (exponential) convergence rate. We illustrate our theorem on a Markovian fading channel model. |
Author | Gast, Nicolas; Yan, Chen; Gaujal, Bruno |
Author_xml | 1. Gast, Nicolas (Inria, CNRS, Grenoble INP, LIG, Univ. Grenoble Alpes); 2. Gaujal, Bruno (Inria, CNRS, Grenoble INP, LIG, Univ. Grenoble Alpes); 3. Yan, Chen (ORCID 0000-0002-0551-4786; email chen.yan@inria.fr; Inria, CNRS, Grenoble INP, LIG, Univ. Grenoble Alpes) |
BackLink | https://inria.hal.science/hal-03041176$$DView record in HAL |
CitedBy_id | crossref_primary_10_1287_moor_2022_0101 crossref_primary_10_1109_TNET_2024_3408673 |
Cites_doi | 10.1287/mnsc.2019.3342 10.1109/TIT.2010.2068950 10.1109/INFOCOM.2008.217 10.1007/978-1-4615-8181-9 10.1145/2745844.2745851 10.2307/3214163 10.1002/9780470316887 10.1007/s00186-023-00821-4 10.1109/TCNS.2016.2619066 10.1145/3154491 10.1111/j.2517-6161.1979.tb01068.x 10.1145/3179410 10.1017/apr.2019.29 10.2307/3214547 10.1109/INFCOM.2012.6195483 10.1214/07-PS121 10.1287/moor.2022.0101 10.1016/0304-4149(78)90020-0 10.1145/3078505.3078523 10.1006/jcss.2000.1737 10.1287/moor.24.2.293 10.1109/TAC.2018.2799521 10.1007/s001860200257 10.1002/9780470980033 10.1109/TNET.2016.2562564 10.2307/1427757 10.1214/15-AAP1137 10.1016/B978-1-55860-377-6.50034-7 10.1016/j.peva.2018.05.002 10.1017/9781108571401 10.1239/aap/1444308876 |
ContentType | Journal Article |
Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. Attribution |
DOI | 10.1007/s11134-023-09875-x |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Engineering; Mathematics; Computer Science |
EISSN | 1572-9443 |
EndPage | 150 |
ExternalDocumentID | oai_HAL_hal_03041176v2 10_1007_s11134_023_09875_x |
GrantInformation_xml | – fundername: ANR grantid: ANR-19-CE23-0015 |
ISSN | 0257-0130 |
IngestDate | Fri May 09 12:21:45 EDT 2025 Sat Aug 23 14:32:57 EDT 2025 Tue Jul 01 02:58:45 EDT 2025 Thu Apr 24 22:58:17 EDT 2025 Fri Feb 21 02:43:46 EST 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1-2 |
Keywords | Multi-armed bandits; Whittle index; Asymptotic optimality; MSC2020 subject classifications: Primary 90C40, 68M20; secondary 37H12, 60F10; further codes listed: 90B18, 90C05 |
Language | English |
License | Attribution: http://creativecommons.org/licenses/by |
LinkModel | DirectLink |
ORCID | 0000-0002-0551-4786 0000-0001-6884-8698 0000-0001-9081-8401 |
OpenAccessLink | https://inria.hal.science/hal-03041176 |
PageCount | 44 |
PublicationCentury | 2000 |
PublicationDate | 2023-06 |
PublicationDateYYYYMMDD | 2023-06-01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationSubtitle | Theory and Applications |
PublicationTitle | Queueing systems |
PublicationTitleAbbrev | Queueing Syst |
PublicationYear | 2023 |
Publisher | Springer US; Springer Nature B.V; Springer Verlag |
References | Weber, R.R., Weiss, G.: Addendum to: On an index policy for restless bandits. Adv. Appl. Probab. 23(2), 429–430 (1991). https://doi.org/10.2307/1427757; Duran, S., Verloop, M.: Asymptotic optimal control of markov-modulated restless bandits. In: International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2018), vol 2. ACM: Association for Computing Machinery, Irvine, US, pp. 7:1–7:25 (2018); Larranaga, M., Ayesta, U., Verloop, I.M.: Dynamic control of birth-and-death restless bandits: application to resource-allocation problems. IEEE/ACM Trans. Netw. 24(6), 3812–3825 (2016). https://doi.org/10.1109/TNET.2016.2562564; Brown, D.B., Smith, J.E.: Index policies and performance bounds for dynamic selection problems. Manag. Sci. 66, 3029–3050 (2020). https://doi.org/10.1287/mnsc.2019.3342; Aalto, S., Lassila, P., Osti, P.: Whittle index approach to size-aware scheduling with time-varying channels. In: Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp. 57–69 (2015); Gittins, J.C.: Bandit processes and dynamic allocation indices. J. R. Stat. Soc. Ser. B 148–177 (1979); Lattimore, T., Szepesvári, C.: Bandit Algorithms. Cambridge University Press, Cambridge (2020). https://doi.org/10.1017/9781108571401; Gittins, J., Glazebrook, K., Weber, R.: Multi-armed Bandit Allocation Indices. John Wiley & Sons, Hoboken (2011). https://doi.org/10.1002/9780470980033; Whittle, P.: Restless bandits: activity allocation in a changing world. J. Appl. Probab. 25A, 287–298 (1988). https://doi.org/10.2307/3214163; Weber, R.R., Weiss, G.: On an index policy for restless bandits. J. Appl. Probab. 27(3), 637–648 (1990). https://doi.org/10.2307/3214547; Ansell, P., Glazebrook, K.D., Nino-Mora, J.: Whittle’s index policy for a multi-class queueing system with convex holding costs. Math. Methods Oper. Res. 57(1), 21–39 (2003). https://doi.org/10.1007/s001860200257; Gast, N., Gaujal, B., Khun, K.: Computing whittle (and gittins) index in subcubic time. arXiv preprint arXiv:2203.05207 (2022); Meshram, R., Manjunath, D., Gopalan, A.: On the whittle index for restless multiarmed hidden Markov bandits. IEEE Trans. Autom. Control 63(9), 3046–3053 (2018). https://doi.org/10.1109/TAC.2018.2799521; Zhang, X., Frazier, P.I.: Restless bandits with many arms: beating the central limit theorem (2021); Gast, N., Gaujal, B., Yan, C.: Lp-based policies for restless bandits: necessary and sufficient conditions for (exponentially fast) asymptotic optimality (2022); Hodge, D.J., Glazebrook, K.D.: On the asymptotic optimality of greedy index heuristics for multi-action restless bandits. Adv. Appl. Probab. 47(3), 652–667 (2015). https://doi.org/10.1239/aap/1444308876; Ouyang, W., Eryilmaz, A., Shroff, N.B.: Asymptotically optimal downlink scheduling over Markovian fading channels. In: 2012 Proceedings IEEE INFOCOM, IEEE, pp. 1224–1232 (2012); Duff, M.O.: Q-learning for bandit problems. In: Proceedings of the Twelfth International Conference on International Conference on Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, ICML’95, pp. 209–217 (1995); Gast, N., Bortolussi, L., Tribastone, M.: Size expansions of mean field approximation: transient and steady-state analysis. In: 2018–36th International Symposium on Computer Performance, Modeling, Measurements and Evaluation, Toulouse, France, pp. 1–2 (2018); Zayas-Caban, G., Jasin, S., Wang, G.: An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits. Adv. Appl. Probab. 51(3), 745–772 (2019). https://doi.org/10.1017/apr.2019.29; Kurtz, T.G.: Strong approximation theorems for density dependent Markov chains. Stoch. Process. Appl. 6(3), 223–240 (1978). https://doi.org/10.1016/0304-4149(78)90020-0; Darling, R., Norris, J.: Differential equation approximations for Markov chains. Probab. Surv. 5, 37–79 (2008). https://doi.org/10.1214/07-PS121; Hu, W., Frazier, P.: An asymptotically optimal index policy for finite-horizon restless bandits (2017); Zhang, X., Frazier, P.I.: Near-optimality for infinite-horizon restless bandits with many arms. arXiv preprint arXiv:2203.15853 (2022); Avrachenkov, K.E., Borkar, V.S.: Whittle index policy for crawling ephemeral content. IEEE Trans. Control Netw. Syst. 5(1), 446–455 (2016). https://doi.org/10.1109/TCNS.2016.2619066; Kifer, Y.: Random Perturbations of Dynamical Systems. Progress in Probability. Birkhäuser, Boston (1988). https://doi.org/10.1007/978-1-4615-8181-9; Papadimitriou, C.H., Tsitsiklis, J.N.: The complexity of optimal queuing network control. Math. Oper. Res. 293–305 (1999); Verloop, M.: Asymptotically optimal priority policies for indexable and nonindexable restless bandits. Ann. Appl. Probab. 26(4), 1947–1995 (2016). https://doi.org/10.1214/15-AAP1137; Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st edn. John Wiley & Sons Inc, New York (1994). https://doi.org/10.1002/9780470316887; Niño-Mora, J., Villar, S.S.: Sensor scheduling for hunting elusive hiding targets via whittle’s restless bandit index policy. In: International Conference on NETwork Games, Control and Optimization (NetGCooP 2011). IEEE, pp. 1–8 (2011); Gast, N., Latella, D., Massink, M.: A refined mean field approximation of synchronous discrete-time population models. Perform. Eval. 126, 1–21 (2018). https://doi.org/10.1016/j.peva.2018.05.002; Gast, N.: Expected Values Estimated via Mean-Field Approximation are 1/N-Accurate. In: ACM SIGMETRICS/ International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’17, Urbana-Champaign, United States, p. 26 (2017); Raghunathan, V., Borkar, V., Cao, M., et al.: Index policies for real-time multicast scheduling for wireless broadcast systems. In: IEEE INFOCOM 2008-The 27th Conference on Computer Communications, IEEE, pp. 1570–1578 (2008); Blondel, V.D., Bournez, O., Koiran, P.: The stability of saturated linear dynamical systems is undecidable. J. Comput. Syst. Sci. 62(3), 442–462 (2001). https://doi.org/10.1006/jcss.2000.1737; Liu, K., Zhao, Q.: Indexability of restless bandit problems and optimality of whittle index for dynamic multichannel access. IEEE Trans. Inf. Theory 56(11), 5547–5567 (2010). https://doi.org/10.1109/TIT.2010.2068950; Ying, L.: Stein’s method for mean field approximations in light and heavy traffic regimes. POMACS 1(1), 1–27 (2017); Gast, N., Van Houdt, B.: A refined mean field approximation. In: Proceedings of the ACM on Measurement and Analysis of Computing Systems 1(28) (2017) |
Snippet | We evaluate the performance of Whittle index policy for restless Markovian bandit. It is shown in Weber and Weiss (J Appl Probab 27(3):637–648, 1990) that if... |
SourceID | hal proquest crossref springer |
SourceType | Open Access Repository; Aggregation Database; Enrichment Source; Index Database; Publisher |
StartPage | 107 |
SubjectTerms | Approximation; Asymptotic properties; Business and Management; Computer Communication Networks; Computer Science; Continuous time systems; Control; Convergence; Dynamical systems; Fixed points (mathematics); Mathematics; Neighborhoods; Operations Research/Decision Theory; Optimization; Optimization and Control; Ordinary differential equations; Performance evaluation; Probability; Probability Theory and Stochastic Processes; Stochastic systems; Supply Chain Management; Systems Theory |
Title | Exponential asymptotic optimality of Whittle index policy |
URI | https://link.springer.com/article/10.1007/s11134-023-09875-x https://www.proquest.com/docview/2825592045 https://inria.hal.science/hal-03041176 |
Volume | 104 |