Rethinking Optimization and Architecture for Tiny Language Models
Tang, Yehui; Liu, Fangcheng; Ni, Yunsheng; Tian, Yuchuan; Bai, Zheyuan; Hu, Yi-Qi; Liu, Sichao; Jui, Shangling; Han, Kai; Wang, Yunhe
Published in arXiv.org (06.02.2024)
Paper