Search Results - "Hussenot, Léonard" :: K.UTB search portal

WARM: On the Benefits of Weight Averaged Reward Models

by Ramé, Alexandre, Vieillard, Nino, Léonard Hussenot, Dadashi, Robert, Cideron, Geoffrey, Bachem, Olivier, Ferret, Johan
Published in arXiv.org (22.01.2024)

Paper Journal Article

Primal Wasserstein Imitation Learning

by Dadashi, Robert, Léonard Hussenot, Geist, Matthieu, Pietquin, Olivier
Published in arXiv.org (17.03.2021)

Paper Journal Article

Learning Energy Networks with Generalized Fenchel-Young Losses

by Blondel, Mathieu, Llinares-López, Felipe, Dadashi, Robert, Léonard Hussenot, Geist, Matthieu
Published in arXiv.org (12.10.2022)

Paper Journal Article

Show me the Way: Intrinsic Motivation from Demonstrations

by Léonard Hussenot, Dadashi, Robert, Geist, Matthieu, Pietquin, Olivier
Published in arXiv.org (13.01.2021)

Paper Journal Article

WARP: On the Benefits of Weight Averaged Rewarded Policies

by Ramé, Alexandre, Ferret, Johan, Vieillard, Nino, Dadashi, Robert, Léonard Hussenot, Pierre-Louis Cedoz, Sessa, Pier Giuseppe, Girgin, Sertan, Douillard, Arthur, Bachem, Olivier
Published in arXiv.org (24.06.2024)

Paper Journal Article

vec2text with Round-Trip Translations

by Cideron, Geoffrey, Girgin, Sertan, Raichuk, Anton, Pietquin, Olivier, Bachem, Olivier, Léonard Hussenot
Published in arXiv.org (14.09.2022)

Paper Journal Article

Continuous Control with Action Quantization from Demonstrations

by Dadashi, Robert, Léonard Hussenot, Vincent, Damien, Girgin, Sertan, Raichuk, Anton, Geist, Matthieu, Pietquin, Olivier
Published in arXiv.org (03.06.2022)

Paper Journal Article

CopyCAT: Taking Control of Neural Policies with Constant Attacks

by Léonard Hussenot, Geist, Matthieu, Pietquin, Olivier
Published in arXiv.org (21.01.2020)

Paper Journal Article

Offline Reinforcement Learning as Anti-Exploration

by Rezaeifar, Shideh, Dadashi, Robert, Vieillard, Nino, Léonard Hussenot, Bachem, Olivier, Pietquin, Olivier, Geist, Matthieu
Published in arXiv.org (11.06.2021)

Paper Journal Article

MusicRL: Aligning Music Generation to Human Preferences

by Cideron, Geoffrey, Girgin, Sertan, Verzetti, Mauro, Vincent, Damien, Kastelic, Matej, Borsos, Zalán, McWilliams, Brian, Ungureanu, Victor, Bachem, Olivier, Pietquin, Olivier, Geist, Matthieu, Léonard Hussenot, Zeghidour, Neil, Agostinelli, Andrea
Published in arXiv.org (06.02.2024)

Paper Journal Article

Offline Reinforcement Learning with Pseudometric Learning

by Dadashi, Robert, Rezaeifar, Shideh, Vieillard, Nino, Léonard Hussenot, Pietquin, Olivier, Geist, Matthieu
Published in arXiv.org (02.06.2021)

Paper Journal Article

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

by Roit, Paul, Ferret, Johan, Shani, Lior, Aharoni, Roee, Cideron, Geoffrey, Dadashi, Robert, Geist, Matthieu, Girgin, Sertan, Léonard Hussenot, Keller, Orgad, Momchev, Nikola, Ramos, Sabela, Stanczyk, Piotr, Vieillard, Nino, Bachem, Olivier, Elidan, Gal, Hassidim, Avinatan, Pietquin, Olivier, Szpektor, Idan
Published in arXiv.org (31.05.2023)

Paper Journal Article

RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

by Ramos, Sabela, Girgin, Sertan, Léonard Hussenot, Vincent, Damien, Yakubovich, Hanna, Toyama, Daniel, Gergely, Anita, Stanczyk, Piotr, Marinier, Raphael, Harmsen, Jeremiah, Pietquin, Olivier, Momchev, Nikola
Published in arXiv.org (04.11.2021)

Paper Journal Article

What Matters for Adversarial Imitation Learning?

by Orsini, Manu, Raichuk, Anton, Léonard Hussenot, Vincent, Damien, Dadashi, Robert, Girgin, Sertan, Geist, Matthieu, Bachem, Olivier, Pietquin, Olivier, Andrychowicz, Marcin
Published in arXiv.org (01.06.2021)

Paper Journal Article

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

by Botev, Aleksandar, De, Soham, Smith, Samuel L, Anushan Fernando, George-Cristian Muraru, Haroun, Ruba, Berrada, Leonard, Pascanu, Razvan, Sessa, Pier Giuseppe, Dadashi, Robert, Léonard Hussenot, Ferret, Johan, Girgin, Sertan, Bachem, Olivier, Andreev, Alek, Kenealy, Kathleen, Mesnard, Thomas, Cassidy Hardin, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Mihir Sanjay Kale, Love, Juliette, Tafti, Pouya, Joulin, Armand, Fiedel, Noah, Senter, Evan, Chen, Yutian, Srinivasan, Srivatsan, Desjardins, Guillaume, Budden, David, Doucet, Arnaud, Vikram, Sharad, Paszke, Adam, Gale, Trevor, Borgeaud, Sebastian, Chen, Charlie, Brock, Andy, Paterson, Antonia, Brennan, Jenny, Risdal, Meg, Gundluru, Raj, Devanathan, Nesh, Mooney, Paul, Chauhan, Nilay, Culliton, Phil, Martins, Luiz Gustavo, Bandy, Elisa, Huntsperger, David, Cameron, Glenn, Zucker, Arthur, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Ghahramani, Zoubin, Farabet, Clément, Kavukcuoglu, Koray, Hassabis, Demis, Hadsell, Raia, Teh, Yee Whye, Nando de Freitas
Published in arXiv.org (28.08.2024)

Paper Journal Article

Gemma 2: Improving Open Language Models at a Practical Size

by Team, Gemma, Léonard Hussenot, Shahriari, Bobak, Ramé, Alexandre, Ferret, Johan, Friesen, Abe, Tsitsulin, Anton, Vieillard, Nino, Girgin, Sertan, Hoffman, Matt, Neyshabur, Behnam, Abdagic, Alvin, Carl, Amanda, Brock, Andy, Paterson, Antonia, Royal, Brandon, Weinberger, David, Vijaykumar, Dimple, Herbison, Dustin, Bandy, Elisa, Wang, Emma, Noland, Eric, Moreira, Erica, Senter, Evan, Eltyshev, Evgenii, Rasskin, Gabriel, Wei, Gary, Cameron, Glenn, Martins, Gus, Hashemi, Hadi, Hanna Klimczak-Plucińska, Zhou, Jack, Stanway, Jeff, Chan, Jetha, Jin Peng Zhou, Becker, Jocelyn, Fernandez, Joe, Joost van Amersfoort, Gordon, Josh, Lipschultz, Josh, Newlan, Josh, Badola, Kartikeya, Black, Kat, Millican, Katie, Greene, Kish, Lars Lowe Sjoesund, Usui, Lauren, Lilly McNealus, Livio Baldini Soares, Kilpatrick, Logan, Dixon, Lucas, Reid, Machel, Iverson, Mark, Miller, Matt, Rahtz, Matthew, Risdal, Meg, Rahman, Mofi, Khatwani, Mohit, Bardoliwalla, Nenshad, Dumai, Neta, Botarda, Pankil, Barham, Paul, Culliton, Phil, Comanescu, Ramona, Jana, Reena, Agarwal, Rishabh, Samaneh Saadat, Sara Mc Carthy, Perrin, Sarah, Arnold, Sébastien M R, Garg, Shruti, Chan, Susan, Eccles, Tom, Hennigan, Tom, Kocisky, Tomas, Doshi, Tulsee, Jain, Vihan, Yadav, Vikas, Dharmadhikari, Vishal, Barkley, Warren, Ye, Wenming, Han, Woohyun, Xu, Xiang, Shen, Zhe, Gong, Zhitao, Zichuan Wei, Rao, Anand, Peran, Ludovic, Warkentin, Tris, Collins, Eli, Barral, Joelle, Ghahramani, Zoubin, Dragan, Anca, Vinyals, Oriol, Kavukcuoglu, Koray, Clement Farabet, Fiedel, Noah, Kenealy, Kathleen, Dadashi, Robert, Andreev, Alek
Published in arXiv.org (02.08.2024)

Paper Journal Article

Gemma: Open Models Based on Gemini Research and Technology

by Team, Gemma, Mesnard, Thomas, Cassidy Hardin, Dadashi, Robert, Bhupatiraju, Surya, Pathak, Shreya, Sifre, Laurent, Rivière, Morgane, Mihir Sanjay Kale, Love, Juliette, Tafti, Pouya, Léonard Hussenot, Sessa, Pier Giuseppe, Chowdhery, Aakanksha, Roberts, Adam, Barua, Aditya, Castro-Ros, Alex, Slone, Ambrose, Tacchetti, Andrea, Bulanova, Anna, Paterson, Antonia, Tsai, Beth, Shahriari, Bobak, Charline Le Lan, Choquette-Choo, Christopher A, Crepy, Clément, Cer, Daniel, Ippolito, Daphne, Reid, David, Buchatskaya, Elena, Noland, Eric, Geng, Yan, Tucker, George, George-Christian Muraru, Rozhdestvenskiy, Grigory, Michalewski, Henryk, Tenney, Ian, Austin, Jacob, Keeling, James, Jean-Baptiste Lespiau, Stanway, Jeff, Brennan, Jenny, Chen, Jeremy, Ferret, Johan, Chiu, Justin, Mao-Jones, Justin, Lee, Katherine, Yu, Kathy, Millican, Katie, Lars Lowe Sjoesund, Lee, Lisa, Dixon, Lucas, Reid, Machel, Mikuła, Maciej, Wirth, Mateo, Sharman, Michael, Chinaev, Nikolai, Thain, Nithum, Bachem, Olivier, Wahltinez, Oscar, Bailey, Paige, Michel, Paul, Yotov, Petko, Chaabouni, Rahma, Comanescu, Ramona, Jana, Reena, Rohan, Anil, McIlroy, Ross, Smith, Samuel L, Borgeaud, Sebastian, Girgin, Sertan, Sholto Douglas, Pandya, Shree, Shakeri, Siamak, De, Soham, Klimenko, Ted, Hennigan, Tom, Feinberg, Vlad, Stokowiec, Wojciech, Yu-hui, Chen, Zafarali Ahmed, Gong, Zhitao, Warkentin, Tris, Peran, Ludovic, Giang, Minh, Farabet, Clément, Vinyals, Oriol, Dean, Jeff, Kavukcuoglu, Koray, Hassabis, Demis, Ghahramani, Zoubin, Eck, Douglas, Barral, Joelle, Pereira, Fernando, Collins, Eli, Joulin, Armand, Fiedel, Noah, Senter, Evan, Andreev, Alek, Kenealy, Kathleen
Published in arXiv.org (16.04.2024)

Paper Journal Article

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study

by Andrychowicz, Marcin, Raichuk, Anton, Stańczyk, Piotr, Orsini, Manu, Girgin, Sertan, Marinier, Raphael, Léonard Hussenot, Geist, Matthieu, Pietquin, Olivier, Michalski, Marcin, Gelly, Sylvain, Bachem, Olivier
Published in arXiv.org (10.06.2020)

Paper Journal Article

Acme: A Research Framework for Distributed Reinforcement Learning

by Hoffman, Matthew W, Shahriari, Bobak, Aslanides, John, Barth-Maron, Gabriel, Momchev, Nikola, Sinopalnikov, Danila, Stańczyk, Piotr, Ramos, Sabela, Raichuk, Anton, Vincent, Damien, Léonard Hussenot, Dadashi, Robert, Dulac-Arnold, Gabriel, Orsini, Manu, Jacq, Alexis, Ferret, Johan, Vieillard, Nino, Seyed Kamyar Seyed Ghasemipour, Girgin, Sertan, Pietquin, Olivier, Behbahani, Feryal, Norman, Tamara, Abdolmaleki, Abbas, Cassirer, Albin, Yang, Fan, Baumli, Kate, Henderson, Sarah, Friesen, Abe, Haroun, Ruba, Novikov, Alex, Sergio Gómez Colmenarejo, Cabi, Serkan, Gulcehre, Caglar, Tom Le Paine, Srinivasan, Srivatsan, Cowie, Andrew, Wang, Ziyu, Piot, Bilal, Nando de Freitas
Published in arXiv.org (20.09.2022)

Paper Journal Article

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

by Wang, Kaiwen, Kidambi, Rahul, Sullivan, Ryan, Agarwal, Alekh, Dann, Christoph, Michi, Andrea, Gelmi, Marco, Li, Yunxuan, Gupta, Raghav, Dubey, Avinava, Ramé, Alexandre, Ferret, Johan, Cideron, Geoffrey, Hou, Le, Yu, Hongkun, Ahmed, Amr, Mehta, Aranyak, Hussenot, Léonard, Bachem, Olivier, Leurent, Edouard
Published 22.07.2024

Journal Article
