Training Language Models with Language Feedback at Scale

Pretrained language models often generate outputs that are not in line with human preferences, such as harmful text or factually incorrect summaries. Recent work approaches the above issues by learning from a simple form of human feedback: comparisons between pairs of model-generated outputs. Howeve...

Bibliographic Details
Published in: arXiv.org
Main Authors: Scheurer, Jérémy; Campos, Jon Ander; Korbak, Tomasz; Chan, Jun Shern; Chen, Angelica; Cho, Kyunghyun; Perez, Ethan
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 22.02.2024