Language Model Alignment with Elastic Reset
Noukhovitch, Michael, Lavoie, Samuel, Strub, Florian, Courville, Aaron
Year of Publication 06.12.2023
Journal Article
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Noukhovitch, Michael, Huang, Shengyi, Xhonneux, Sophie, Hosseini, Arian, Agarwal, Rishabh, Courville, Aaron
Year of Publication 23.10.2024
Journal Article
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization
Huang, Shengyi, Noukhovitch, Michael, Hosseini, Arian, Rasul, Kashif, Wang, Weixun, Tunstall, Lewis
Year of Publication 23.03.2024
Journal Article
Emergent Communication under Competition
Noukhovitch, Michael, LaCroix, Travis, Lazaridou, Angeliki, Courville, Aaron
Year of Publication 25.01.2021
Journal Article
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Lavoie, Samuel, Tsirigotis, Christos, Schwarzer, Max, Vani, Ankit, Noukhovitch, Michael, Kawaguchi, Kenji, Courville, Aaron
Year of Publication 01.04.2022
Journal Article
Learning Multi-Agent Communication with Contrastive Learning
Yat Long Lo, Sengupta, Biswa, Foerster, Jakob, Noukhovitch, Michael
Published in arXiv.org (01.02.2024)
Paper
Language Model Alignment with Elastic Reset
Noukhovitch, Michael, Lavoie, Samuel, Strub, Florian, Courville, Aaron
Published in arXiv.org (06.12.2023)
Paper
Pretraining Representations for Data-Efficient Reinforcement Learning
Schwarzer, Max, Rajkumar, Nitarshan, Noukhovitch, Michael, Anand, Ankesh, Charlin, Laurent, Hjelm, Devon, Bachman, Philip, Courville, Aaron
Year of Publication 09.06.2021
Journal Article
Systematic Generalization: What Is Required and Can It Be Learned?
Bahdanau, Dzmitry, Murty, Shikhar, Noukhovitch, Michael, Nguyen, Thien Huu, de Vries, Harm, Courville, Aaron
Year of Publication 30.11.2018
Journal Article
Emergent Communication under Competition
Noukhovitch, Michael, LaCroix, Travis, Lazaridou, Angeliki, Courville, Aaron
Published in arXiv.org (25.01.2021)
Paper
Commonsense mining as knowledge base completion? A study on the impact of novelty
Jastrzębski, Stanisław, Bahdanau, Dzmitry, Hosseini, Seyedarian, Noukhovitch, Michael, Bengio, Yoshua, Cheung, Jackie Chi Kit
Year of Publication 24.04.2018
Journal Article