Data-Centric AI in the Age of Large Language Models

This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs, and ye...

Bibliographic Details
Published in arXiv.org
Main Authors Xu, Xinyi; Wu, Zhaoxuan; Qiao, Rui; Verma, Arun; Yao Shu; Wang, Jingtan; Niu, Xinyuan; He, Zhenfeng; Chen, Jiangwei; Zhou, Zijian; Gregory Kang Ruey Lau; Dao, Hieu; Lucas Agussurja; Rachael Hwee Ling Sim; Lin, Xiaoqiang; Hu, Wenyang; Dai, Zhongxiang; Pang Wei Koh; Bryan Kian Hsiang Low
Format Paper
Language English
Published Ithaca Cornell University Library, arXiv.org 20.06.2024