Batch Policy Learning in Average Reward Markov Decision Processes
Liao, Peng, Qi, Zhengling, Wan, Runzhe, Klasnja, Predrag, Murphy, Susan A
Published in The Annals of Statistics (01.12.2022)
Journal Article
Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
Liu, Yu, Wan, Runzhe, McQueen, James, Hains, Doug, Gu, Jinxiang, Song, Rui
Year of Publication 20.12.2023
Journal Article
Batch Policy Learning in Average Reward Markov Decision Processes
Liao, Peng, Qi, Zhengling, Wan, Runzhe, Klasnja, Predrag, Murphy, Susan
Year of Publication 22.07.2020
Journal Article
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
Shi, Chengchun, Wan, Runzhe, Song, Ge, Luo, Shikai, Song, Rui, Zhu, Hongtu
Year of Publication 21.02.2022
Journal Article