Joint semantic utterance classification and slot filling with recursive neural networks
Guo, Daniel; Tur, Gokhan; Yih, Wen-tau; Zweig, Geoffrey
Published in 2014 IEEE Spoken Language Technology Workshop (SLT) (01.12.2014)
Conference Proceeding
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
Chandak, Yash; Thakoor, Shantanu; Guo, Zhaohan Daniel; Tang, Yunhao; Munos, Remi; Dabney, Will; Borsa, Diana L
Published in arXiv.org (02.05.2023)
Paper
Journal Article
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning
Khetarpal, Khimya; Guo, Zhaohan Daniel; Pires, Bernardo Avila; Tang, Yunhao; Lyle, Clare; Rowland, Mark; Heess, Nicolas; Borsa, Diana; Guez, Arthur; Dabney, Will
Published in arXiv.org (04.06.2024)
Paper
Journal Article
Understanding the performance gap between online and offline alignment algorithms
Tang, Yunhao; Guo, Daniel Zhaohan; Zheng, Zeyu; Calandriello, Daniele; Cao, Yuan; Tarassov, Eugene; Munos, Rémi; Pires, Bernardo Ávila; Valko, Michal; Cheng, Yong; Dabney, Will
Published in arXiv.org (14.05.2024)
Paper
Journal Article
Nash Learning from Human Feedback
Munos, Rémi; Valko, Michal; Calandriello, Daniele; Azar, Mohammad Gheshlaghi; Rowland, Mark; Guo, Zhaohan Daniel; Tang, Yunhao; Geist, Matthieu; Mesnard, Thomas; Michi, Andrea; Selvi, Marco; Girgin, Sertan; Momchev, Nikola; Bachem, Olivier; Mankowitz, Daniel J; Precup, Doina; Piot, Bilal
Published in arXiv.org (11.06.2024)
Paper
Journal Article
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Tang, Yunhao; Guo, Zhaohan Daniel; Zheng, Zeyu; Calandriello, Daniele; Munos, Rémi; Rowland, Mark; Richemond, Pierre Harvey; Valko, Michal; Pires, Bernardo Ávila; Piot, Bilal
Published in arXiv.org (08.02.2024)
Paper
Journal Article
BYOL-Explore: Exploration by Bootstrapped Prediction
Guo, Zhaohan Daniel; Thakoor, Shantanu; Pîslar, Miruna; Pires, Bernardo Avila; Altché, Florent; Tallec, Corentin; Saade, Alaa; Calandriello, Daniele; Grill, Jean-Bastien; Tang, Yunhao; Valko, Michal; Munos, Rémi; Azar, Mohammad Gheshlaghi; Piot, Bilal
Published in arXiv.org (16.06.2022)
Paper
Journal Article
Neural Predictive Belief Representations
Guo, Zhaohan Daniel; Azar, Mohammad Gheshlaghi; Piot, Bilal; Pires, Bernardo A; Munos, Rémi
Published in arXiv.org (19.08.2019)
Paper
Journal Article
Understanding Self-Predictive Learning for Reinforcement Learning
Tang, Yunhao; Guo, Zhaohan Daniel; Richemond, Pierre Harvey; Pires, Bernardo Ávila; Chandak, Yash; Munos, Rémi; Rowland, Mark; Azar, Mohammad Gheshlaghi; Le Lan, Charline; Lyle, Clare; György, András; Thakoor, Shantanu; Dabney, Will; Piot, Bilal; Calandriello, Daniele; Valko, Michal
Published in arXiv.org (06.12.2022)
Paper
Journal Article
Geometric Entropic Exploration
Guo, Zhaohan Daniel; Azar, Mohammad Gheshlaghi; Saade, Alaa; Thakoor, Shantanu; Piot, Bilal; Pires, Bernardo Avila; Valko, Michal; Mesnard, Thomas; Lattimore, Tor; Munos, Rémi
Published in arXiv.org (07.01.2021)
Paper
Journal Article
Bootstrap your own latent: A new approach to self-supervised Learning
Grill, Jean-Bastien; Strub, Florian; Altché, Florent; Tallec, Corentin; Richemond, Pierre H; Buchatskaya, Elena; Doersch, Carl; Pires, Bernardo Avila; Guo, Zhaohan Daniel; Azar, Mohammad Gheshlaghi; Piot, Bilal; Kavukcuoglu, Koray; Munos, Rémi; Valko, Michal
Published in arXiv.org (10.09.2020)
Paper
Journal Article