Training Language Models with Language Feedback at Scale

Pretrained language models often generate outputs that are not in line with human preferences, such as harmful text or factually incorrect summaries. Recent work approaches the above issues by learning from a simple form of human feedback: comparisons between pairs of model-generated outputs. Howeve...

Bibliographic Details
Published in: arXiv.org
Main Authors: Scheurer, Jérémy; Campos, Jon Ander; Korbak, Tomasz; Chan, Jun Shern; Chen, Angelica; Cho, Kyunghyun; Perez, Ethan
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 22.02.2024