Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
Dulac-Arnold, Gabriel, Levine, Nir, Mankowitz, Daniel J., Li, Jerry, Paduraru, Cosmin, Gowal, Sven, Hester, Todd
Published in Machine learning (01.09.2021)
Journal Article
Competition-level code generation with AlphaCode
Li, Yujia, Choi, David, Chung, Junyoung, Kushman, Nate, Schrittwieser, Julian, Leblond, Rémi, Eccles, Tom, Keeling, James, Gimeno, Felix, Dal Lago, Agustin, Hubert, Thomas, Choy, Peter, de Masson d'Autume, Cyprien, Babuschkin, Igor, Chen, Xinyun, Huang, Po-Sen, Welbl, Johannes, Gowal, Sven, Cherepanov, Alexey, Molloy, James, Mankowitz, Daniel J, Sutherland Robson, Esme, Kohli, Pushmeet, de Freitas, Nando, Kavukcuoglu, Koray, Vinyals, Oriol
Published in Science (American Association for the Advancement of Science) (09.12.2022)
Journal Article
Faster sorting algorithms discovered using deep reinforcement learning
Mankowitz, Daniel J, Michi, Andrea, Zhernov, Anton, Gelmi, Marco, Selvi, Marco, Paduraru, Cosmin, Leurent, Edouard, Iqbal, Shariq, Lespiau, Jean-Baptiste, Ahern, Alex, Köppe, Thomas, Millikin, Kevin, Gaffney, Stephen, Elster, Sophie, Broshear, Jackson, Gamble, Chris, Milan, Kieran, Tung, Robert, Hwang, Minjae, Cemgil, Taylan, Barekatain, Mohammadamin, Li, Yujia, Mandhane, Amol, Hubert, Thomas, Schrittwieser, Julian, Hassabis, Demis, Kohli, Pushmeet, Riedmiller, Martin, Vinyals, Oriol, Silver, David
Published in Nature (London) (08.06.2023)
Journal Article
Training rate control neural networks through reinforcement learning
GU CHENJIE, HUBERT THOMAS KEISUKE, WANG MIAOSEN, ZHERNOV ANTON, MANKOWITZ DANIEL J, SCHRITTWIESER JULIAN, RAUH MARY ELIZABETH, MANDHANE AMOL BALKISHAN
Year of Publication 24.10.2023
Patent
Towards practical reinforcement learning for tokamak magnetic control
Tracey, Brendan D., Michi, Andrea, Chervonyi, Yuri, Davies, Ian, Paduraru, Cosmin, Lazic, Nevena, Felici, Federico, Ewalds, Timo, Donner, Craig, Galperti, Cristian, Buchli, Jonas, Neunert, Michael, Huber, Andrea, Evens, Jonathan, Kurylowicz, Paula, Mankowitz, Daniel J., Riedmiller, Martin
Published in Fusion engineering and design (01.03.2024)
Journal Article
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation
Lee, Jongmin, Paduraru, Cosmin, Mankowitz, Daniel J, Heess, Nicolas, Precup, Doina, Kim, Kee-Eung, Guez, Arthur
Published in arXiv.org (19.04.2022)
Paper
Journal Article
Discovering a set of policies for the worst case reward
Zahavy, Tom, Barreto, Andre, Mankowitz, Daniel J, Hou, Shaobo, O'Donoghue, Brendan, Kemaev, Iurii, Singh, Satinder
Published in arXiv.org (10.12.2021)
Paper
Journal Article
Nash Learning from Human Feedback
Munos, Rémi, Valko, Michal, Calandriello, Daniele, Azar, Mohammad Gheshlaghi, Rowland, Mark, Guo, Zhaohan Daniel, Tang, Yunhao, Geist, Matthieu, Mesnard, Thomas, Michi, Andrea, Selvi, Marco, Girgin, Sertan, Momchev, Nikola, Bachem, Olivier, Mankowitz, Daniel J, Precup, Doina, Piot, Bilal
Published in arXiv.org (11.06.2024)
Paper
Journal Article
An empirical investigation of the challenges of real-world reinforcement learning
Dulac-Arnold, Gabriel, Levine, Nir, Mankowitz, Daniel J, Li, Jerry, Paduraru, Cosmin, Gowal, Sven, Hester, Todd
Published in arXiv.org (04.03.2021)
Paper
Journal Article
Balancing Constraints and Rewards with Meta-Gradient D4PG
Calian, Dan A, Mankowitz, Daniel J, Zahavy, Tom, Xu, Zhongwen, Oh, Junhyuk, Levine, Nir, Mann, Timothy
Published in arXiv.org (27.11.2020)
Paper
Journal Article
Active Offline Policy Selection
Konyushkova, Ksenia, Chen, Yutian, Paine, Tom Le, Gulcehre, Caglar, Paduraru, Cosmin, Mankowitz, Daniel J, Denil, Misha, de Freitas, Nando
Published in arXiv.org (06.05.2022)
Paper
Journal Article
Soft-Robust Actor-Critic Policy-Gradient
Derman, Esther, Mankowitz, Daniel J, Mann, Timothy A, Mannor, Shie
Published in arXiv.org (24.10.2018)
Paper
Journal Article
Generating Positional Encodings of Directed Graphs
CEMGIL, Ali Taylan, PADURARU, Cosmin, LI, Yujia, GEISLER, Simon Markus, MANKOWITZ, Daniel J
Year of Publication 10.05.2024
Patent
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification
Mankowitz, Daniel J, Calian, Dan A, Jeong, Rae, Paduraru, Cosmin, Heess, Nicolas, Dathathri, Sumanth, Riedmiller, Martin, Mann, Timothy
Published in arXiv.org (03.03.2021)
Paper
Journal Article
Optimizing Memory Mapping Using Deep Reinforcement Learning
Wang, Pengming, Sazanovich, Mikita, Ilbeyi, Berkin, Phothilimthana, Phitchaya Mangpo, Purohit, Manish, Tay, Han Yang, Vũ, Ngân, Wang, Miaosen, Paduraru, Cosmin, Leurent, Edouard, Zhernov, Anton, Huang, Po-Sen, Schrittwieser, Julian, Hubert, Thomas, Tung, Robert, Kurylowicz, Paula, Milan, Kieran, Vinyals, Oriol, Mankowitz, Daniel J
Published in arXiv.org (17.10.2023)
Paper
Journal Article
Towards practical reinforcement learning for tokamak magnetic control
Tracey, Brendan D, Michi, Andrea, Chervonyi, Yuri, Davies, Ian, Paduraru, Cosmin, Lazic, Nevena, Felici, Federico, Ewalds, Timo, Donner, Craig, Galperti, Cristian, Buchli, Jonas, Neunert, Michael, Huber, Andrea, Evens, Jonathan, Kurylowicz, Paula, Mankowitz, Daniel J, Riedmiller, Martin, The TCV Team
Published in arXiv.org (05.10.2023)
Paper
Journal Article
Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces
Tessler, Chen, Zahavy, Tom, Cohen, Deborah, Mankowitz, Daniel J, Mannor, Shie
Published in arXiv.org (09.02.2020)
Paper
Journal Article