Search Results - "Léonard Hussenot" :: K.UTB search portal

WARM: On the Benefits of Weight Averaged Reward Models

by Ramé, Alexandre, Vieillard, Nino, Hussenot, Léonard, Dadashi, Robert, Cideron, Geoffrey, Bachem, Olivier, Ferret, Johan
Year of Publication 22.01.2024

Journal Article

CopyCAT: Taking Control of Neural Policies with Constant Attacks

by Hussenot, Léonard, Geist, Matthieu, Pietquin, Olivier
Year of Publication 29.05.2019

Journal Article

WARP: On the Benefits of Weight Averaged Rewarded Policies

by Ramé, Alexandre, Ferret, Johan, Vieillard, Nino, Dadashi, Robert, Hussenot, Léonard, Cedoz, Pierre-Louis, Sessa, Pier Giuseppe, Girgin, Sertan, Douillard, Arthur, Bachem, Olivier
Year of Publication 24.06.2024

Journal Article

vec2text with Round-Trip Translations

by Cideron, Geoffrey, Girgin, Sertan, Raichuk, Anton, Pietquin, Olivier, Bachem, Olivier, Hussenot, Léonard
Year of Publication 14.09.2022

Journal Article

Learning Energy Networks with Generalized Fenchel-Young Losses

by Blondel, Mathieu, Llinares-López, Felipe, Dadashi, Robert, Hussenot, Léonard, Geist, Matthieu
Year of Publication 19.05.2022

Journal Article

MusicRL: Aligning Music Generation to Human Preferences

by Cideron, Geoffrey, Girgin, Sertan, Verzetti, Mauro, Vincent, Damien, Kastelic, Matej, Borsos, Zalán, McWilliams, Brian, Ungureanu, Victor, Bachem, Olivier, Pietquin, Olivier, Geist, Matthieu, Hussenot, Léonard, Zeghidour, Neil, Agostinelli, Andrea
Year of Publication 06.02.2024

Journal Article

Show me the Way: Intrinsic Motivation from Demonstrations

by Hussenot, Léonard, Dadashi, Robert, Geist, Matthieu, Pietquin, Olivier
Year of Publication 23.06.2020

Journal Article

Primal Wasserstein Imitation Learning

by Dadashi, Robert, Hussenot, Léonard, Geist, Matthieu, Pietquin, Olivier
Year of Publication 08.06.2020

Journal Article

Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning

by Wang, Kaiwen, Kidambi, Rahul, Sullivan, Ryan, Agarwal, Alekh, Dann, Christoph, Michi, Andrea, Gelmi, Marco, Li, Yunxuan, Gupta, Raghav, Dubey, Avinava, Ramé, Alexandre, Ferret, Johan, Cideron, Geoffrey, Hou, Le, Yu, Hongkun, Ahmed, Amr, Mehta, Aranyak, Hussenot, Léonard, Bachem, Olivier, Leurent, Edouard
Year of Publication 22.07.2024

Journal Article

BOND: Aligning LLMs with Best-of-N Distillation

by Sessa, Pier Giuseppe, Dadashi, Robert, Hussenot, Léonard, Ferret, Johan, Vieillard, Nino, Ramé, Alexandre, Shariari, Bobak, Perrin, Sarah, Friesen, Abe, Cideron, Geoffrey, Girgin, Sertan, Stanczyk, Piotr, Michi, Andrea, Sinopalnikov, Danila, Ramos, Sabela, Héliou, Amélie, Severyn, Aliaksei, Hoffman, Matt, Momchev, Nikola, Bachem, Olivier
Year of Publication 19.07.2024

Journal Article

Continuous Control with Action Quantization from Demonstrations

by Dadashi, Robert, Hussenot, Léonard, Vincent, Damien, Girgin, Sertan, Raichuk, Anton, Geist, Matthieu, Pietquin, Olivier
Year of Publication 19.10.2021

Journal Article

Offline Reinforcement Learning as Anti-Exploration

by Rezaeifar, Shideh, Dadashi, Robert, Vieillard, Nino, Hussenot, Léonard, Bachem, Olivier, Pietquin, Olivier, Geist, Matthieu
Year of Publication 11.06.2021

Journal Article

Offline Reinforcement Learning with Pseudometric Learning

by Dadashi, Robert, Rezaeifar, Shideh, Vieillard, Nino, Hussenot, Léonard, Pietquin, Olivier, Geist, Matthieu
Year of Publication 02.03.2021

Journal Article

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

by Roit, Paul, Ferret, Johan, Shani, Lior, Aharoni, Roee, Cideron, Geoffrey, Dadashi, Robert, Geist, Matthieu, Girgin, Sertan, Hussenot, Léonard, Keller, Orgad, Momchev, Nikola, Ramos, Sabela, Stanczyk, Piotr, Vieillard, Nino, Bachem, Olivier, Elidan, Gal, Hassidim, Avinatan, Pietquin, Olivier, Szpektor, Idan
Year of Publication 31.05.2023

Journal Article

What Matters for Adversarial Imitation Learning?

by Orsini, Manu, Raichuk, Anton, Hussenot, Léonard, Vincent, Damien, Dadashi, Robert, Girgin, Sertan, Geist, Matthieu, Bachem, Olivier, Pietquin, Olivier, Andrychowicz, Marcin
Year of Publication 01.06.2021

Journal Article

RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

by Ramos, Sabela, Girgin, Sertan, Hussenot, Léonard, Vincent, Damien, Yakubovich, Hanna, Toyama, Daniel, Gergely, Anita, Stanczyk, Piotr, Marinier, Raphael, Harmsen, Jeremiah, Pietquin, Olivier, Momchev, Nikola
Year of Publication 04.11.2021

Journal Article

WARM: On the Benefits of Weight Averaged Reward Models

by Ramé, Alexandre, Vieillard, Nino, Hussenot, Léonard, Dadashi, Robert, Cideron, Geoffrey, Bachem, Olivier, Ferret, Johan
Published in arXiv.org (22.01.2024)

Paper

Gemma 2: Improving Open Language Models at a Practical Size

by Sessa, Pier Giuseppe, Hardin, Cassidy, Bhupatiraju, Surya, Hussenot, Léonard, Shahriari, Bobak, Ramé, Alexandre, Ferret, Johan, Friesen, Abe, Tsitsulin, Anton, Vieillard, Nino, Girgin, Sertan, Hoffman, Matt, Grill, Jean-Bastien, Neyshabur, Behnam, Abdagic, Alvin, Carl, Amanda, Brock, Andy, Paterson, Antonia, Royal, Brandon, Choquette-Choo, Christopher A, Weinberger, David, Vijaykumar, Dimple, Herbison, Dustin, Bandy, Elisa, Wang, Emma, Noland, Eric, Moreira, Erica, Senter, Evan, Eltyshev, Evgenii, Rasskin, Gabriel, Wei, Gary, Cameron, Glenn, Martins, Gus, Hashemi, Hadi, Klimczak-Plucińska, Hanna, Zhou, Jack, Stanway, Jeff, Chan, Jetha, Becker, Jocelyn, Fernandez, Joe, Gordon, Josh, Lipschultz, Josh, Newlan, Josh, Ji, Ju-yeong, Mohamed, Kareem, Badola, Kartikeya, Black, Kat, Millican, Katie, Greene, Kish, Sjoesund, Lars Lowe, Usui, Lauren, Kilpatrick, Logan, Dixon, Lucas, Reid, Machel, Iverson, Mark, Miller, Matt, Rahtz, Matthew, Risdal, Meg, Rahman, Mofi, Khatwani, Mohit, Bardoliwalla, Nenshad, Dumai, Neta, Botarda, Pankil, Barham, Paul, Culliton, Phil, Comanescu, Ramona, Jana, Reena, Agarwal, Rishabh, Saadat, Samaneh, Cogan, Sarah, Perrin, Sarah, Arnold, Sébastien M. R, Krause, Sebastian, Garg, Shruti, Sheth, Shruti, Chan, Susan, Yu, Ting, Kocisky, Tomas, Jain, Vihan, Yadav, Vikas, Meshram, Vilobh, Dharmadhikari, Vishal, Barkley, Warren, Shen, Zhe, Gong, Zhitao, Kirk, Phoebe, Rao, Anand, Warkentin, Tris, Ghahramani, Zoubin, Hadsell, Raia, Banks, Jeanine, Dragan, Anca, Vinyals, Oriol, Dean, Jeff, Kavukcuoglu, Koray, Farabet, Clement, Fiedel, Noah, Kenealy, Kathleen, Dadashi, Robert, Andreev, Alek
Year of Publication 31.07.2024

Journal Article

CopyCAT: Taking Control of Neural Policies with Constant Attacks

by Hussenot, Léonard, Geist, Matthieu, Pietquin, Olivier
Published in arXiv.org (21.01.2020)

Paper

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

by Botev, Aleksandar, De, Soham, Smith, Samuel L, Fernando, Anushan, Muraru, George-Cristian, Haroun, Ruba, Berrada, Leonard, Pascanu, Razvan, Sessa, Pier Giuseppe, Dadashi, Robert, Hussenot, Léonard, Ferret, Johan, Girgin, Sertan, Bachem, Olivier, Andreev, Alek, Kenealy, Kathleen, Mesnard, Thomas, Hardin, Cassidy, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Kale, Mihir Sanjay, Love, Juliette, Tafti, Pouya, Joulin, Armand, Fiedel, Noah, Senter, Evan, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Budden, David, Doucet, Arnaud, Vikram, Sharad, Paszke, Adam, Gale, Trevor, Borgeaud, Sebastian, Chen, Charlie, Brock, Andy, Paterson, Antonia, Brennan, Jenny, Risdal, Meg, Gundluru, Raj, Devanathan, Nesh, Mooney, Paul, Chauhan, Nilay, Culliton, Phil, Martins, Luiz Gustavo, Bandy, Elisa, Huntsperger, David, Cameron, Glenn, Zucker, Arthur, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Ghahramani, Zoubin, Farabet, Clément, Kavukcuoglu, Koray, Hassabis, Demis, Hadsell, Raia, Teh, Yee Whye, de Frietas, Nando
Year of Publication 11.04.2024

Journal Article
