"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization
Kurtic, Eldar; Marques, Alexandre; Pandit, Shubhra; Kurtz, Mark; Alistarh, Dan
Year of Publication 04.11.2024
Journal Article
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Agarwalla, Abhinav; Gupta, Abhay; Marques, Alexandre; Pandit, Shubhra; Goin, Michael; Kurtic, Eldar; Leong, Kevin; Nguyen, Tuan; Salem, Mahmoud; Alistarh, Dan; Lie, Sean; Kurtz, Mark
Year of Publication 06.05.2024
Journal Article
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment
Agarwalla, Abhinav; Gupta, Abhay; Marques, Alexandre; Pandit, Shubhra; Goin, Michael; Kurtic, Eldar; Leong, Kevin; Nguyen, Tuan; Salem, Mahmoud; Alistarh, Dan; Lie, Sean; Kurtz, Mark
Published in arXiv.org (06.05.2024)
Paper