Stochastically constrained best arm identification with Thompson sampling
We consider the problem of the best arm identification in the presence of stochastic constraints, where there is a finite number of arms associated with multiple performance measures. The goal is to identify the arm that optimizes the objective measure subject to constraints on the remaining measure...
Published in | Automatica (Oxford) Vol. 176; p. 112223 |
---|---|
Main Authors | Yang, Le; Gao, Siyang; Li, Cheng; Wang, Yi |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.06.2025 |
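Subjects | Best feasible arm identification; Rate of posterior convergence; Thompson sampling; Top-two algorithm |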
ISSN | 0005-1098 |
DOI | 10.1016/j.automatica.2025.112223 |
Abstract | We consider the problem of the best arm identification in the presence of stochastic constraints, where there is a finite number of arms associated with multiple performance measures. The goal is to identify the arm that optimizes the objective measure subject to constraints on the remaining measures. We will explore the popular idea of Thompson sampling (TS) as a means to solve it. To the best of our knowledge, it is the first attempt to extend TS to this problem. We will design a TS-based sampling algorithm, establish its asymptotic optimality in the rate of posterior convergence, and demonstrate its superior performance using numerical examples. |
ArticleNumber | 112223 |
Author | Wang, Yi; Li, Cheng; Gao, Siyang; Yang, Le |
Author_xml | 1. Yang, Le (lyang272-c@my.cityu.edu.hk), Department of Systems Engineering, City University of Hong Kong, Hong Kong, China; 2. Gao, Siyang (siyangao@cityu.edu.hk), Department of Systems Engineering, City University of Hong Kong, Hong Kong, China; 3. Li, Cheng (stalic@nus.edu.sg), Department of Statistics and Data Science, National University of Singapore, Singapore; 4. Wang, Yi (yiwang@eee.hku.hk), Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China |
ContentType | Journal Article |
Copyright | 2025 |
Discipline | Engineering |
ExternalDocumentID | 10_1016_j_automatica_2025_112223 S0005109825001153 |
GrantInformation_xml | National Natural Science Foundation of China, grants 72371214 and 72331004 (funder ID: http://dx.doi.org/10.13039/501100001809) |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Top-two algorithm; Rate of posterior convergence; Best feasible arm identification; Thompson sampling |
PublicationDate | June 2025 |
PublicationTitle | Automatica (Oxford) |
PublicationYear | 2025 |
Publisher | Elsevier Ltd |
StartPage | 112223 |
SubjectTerms | Best feasible arm identification; Rate of posterior convergence; Thompson sampling; Top-two algorithm |
Title | Stochastically constrained best arm identification with Thompson sampling |
URI | https://dx.doi.org/10.1016/j.automatica.2025.112223 |
Volume | 176 |
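
The abstract in this record mentions a TS-based sampling algorithm for finding the best feasible arm, but the record itself gives no algorithmic detail. As a rough illustration only, and not the authors' algorithm, the following is a minimal sketch of a generic top-two Thompson sampling loop with a feasibility check added. It assumes Gaussian observations with known noise `sigma`, flat priors, an objective stored as measure 0 (to be maximized), and constraints of the form "mean of measure j <= threshold". All function names and parameters (`best_feasible`, `choose_arm`, `run`, `beta`, `n0`, `max_resample`) are hypothetical.

```python
import math
import random


def best_feasible(theta, thresholds):
    """Return the index of the arm with the largest objective value
    (measure 0) among arms whose constraint values (measures 1..m-1)
    are all <= their thresholds; return None if no arm is feasible."""
    best, best_val = None, -math.inf
    for i, row in enumerate(theta):
        feasible = all(v <= c for v, c in zip(row[1:], thresholds))
        if feasible and row[0] > best_val:
            best, best_val = i, row[0]
    return best


def posterior_draw(sums, counts, sigma, rng):
    """Sample every (arm, measure) mean from N(sample mean, sigma^2 / n).
    Assumption: known noise level and a flat prior on each mean."""
    return [
        [rng.gauss(s / n, sigma / math.sqrt(n)) for s in row]
        for row, n in zip(sums, counts)
    ]


def choose_arm(sums, counts, thresholds, sigma, beta, rng, max_resample=100):
    """Top-two style choice: play the sampled best feasible arm with
    probability beta, otherwise resample to find a different challenger."""
    leader = best_feasible(posterior_draw(sums, counts, sigma, rng), thresholds)
    if leader is None:  # no feasible arm in this draw: explore uniformly
        return rng.randrange(len(counts))
    if rng.random() < beta:
        return leader
    for _ in range(max_resample):
        challenger = best_feasible(
            posterior_draw(sums, counts, sigma, rng), thresholds)
        if challenger is not None and challenger != leader:
            return challenger
    return leader  # posterior is concentrated: fall back to the leader


def run(true_means, thresholds, budget, sigma=1.0, beta=0.5, n0=5, seed=0):
    """Simulate the sampling loop; return (recommended arm, pull counts)."""
    rng = random.Random(seed)
    k, m = len(true_means), len(true_means[0])
    sums = [[0.0] * m for _ in range(k)]
    counts = [0] * k

    def pull(i):
        # One pull observes a noisy value of every measure of arm i.
        for j in range(m):
            sums[i][j] += rng.gauss(true_means[i][j], sigma)
        counts[i] += 1

    for i in range(k):  # warm-up: n0 pulls per arm
        for _ in range(n0):
            pull(i)
    for _ in range(budget - k * n0):
        pull(choose_arm(sums, counts, thresholds, sigma, beta, rng))

    post_means = [[s / n for s in row] for row, n in zip(sums, counts)]
    return best_feasible(post_means, thresholds), counts
```

For example, `run([[5.0, 3.0], [2.0, -3.0], [-2.0, -3.0]], [0.0], budget=300)` simulates three arms where arm 0 has the largest objective but violates the constraint. The uniform fallback when no sampled arm is feasible and the resampling cap are design choices made for this sketch, not something taken from the article.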