EXAQ: Exponent Aware Quantization For LLMs Acceleration
Shkolnik, Moran, Fishman, Maxim, Chmiel, Brian, Ben-Yaacov, Hilla, Banner, Ron, Levy, Kfir Yehuda
Year of Publication 04.10.2024
Journal Article
DropCompute: simple and more robust distributed synchronous training via compute variance reduction
Giladi, Niv, Gottlieb, Shahar, Shkolnik, Moran, Karnieli, Asaf, Banner, Ron, Hoffer, Elad, Levy, Kfir Yehuda, Soudry, Daniel
Year of Publication 18.06.2023
Journal Article
EXAQ: Exponent Aware Quantization For LLMs Acceleration
Shkolnik, Moran, Fishman, Maxim, Chmiel, Brian, Ben-Yaacov, Hilla, Banner, Ron, Levy, Kfir Yehuda
Published in arXiv.org (04.10.2024)
Paper
Neural gradients are near-lognormal: improved quantized and sparse training
Chmiel, Brian, Ben-Uri, Liad, Shkolnik, Moran, Hoffer, Elad, Banner, Ron, Soudry, Daniel
Year of Publication 15.06.2020
Journal Article
Robust Quantization: One Model to Rule Them All
Shkolnik, Moran, Chmiel, Brian, Banner, Ron, Shomron, Gil, Nahshan, Yury, Bronstein, Alex, Weiser, Uri
Year of Publication 18.02.2020
Journal Article
Robust Quantization: One Model to Rule Them All
Shkolnik, Moran, Chmiel, Brian, Banner, Ron, Shomron, Gil, Nahshan, Yury, Bronstein, Alex, Weiser, Uri
Published in arXiv.org (22.10.2020)
Paper
Neural gradients are near-lognormal: improved quantized and sparse training
Chmiel, Brian, Ben-Uri, Liad, Shkolnik, Moran, Hoffer, Elad, Banner, Ron, Soudry, Daniel
Published in arXiv.org (12.10.2020)
Paper