Multi-Task Off-Policy Learning from Bandit Feedback
Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad
Year of Publication 09.12.2022
Journal Article
Compositional Generalization and Decomposition in Neural Program Synthesis
Shi, Kensen, Hong, Joey, Zaheer, Manzil, Yin, Pengcheng, Sutton, Charles
Year of Publication 07.04.2022
Journal Article
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
Abdulhai, Marwa, White, Isadora, Snell, Charlie, Sun, Charles, Hong, Joey, Zhai, Yuexiang, Xu, Kelvin, Levine, Sergey
Year of Publication 29.11.2023
Journal Article
Deep Hierarchy in Bandits
Hong, Joey, Kveton, Branislav, Katariya, Sumeet, Zaheer, Manzil, Ghavamzadeh, Mohammad
Year of Publication 03.02.2022
Journal Article
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
Shi, Kensen, Hong, Joey, Deng, Yinlin, Yin, Pengcheng, Zaheer, Manzil, Sutton, Charles
Year of Publication 25.07.2023
Journal Article
Thompson Sampling with a Mixture Prior
Hong, Joey, Kveton, Branislav, Zaheer, Manzil, Ghavamzadeh, Mohammad, Boutilier, Craig
Year of Publication 10.06.2021
Journal Article