Method and apparatus for compressing neural network model

Bibliographic Details
Main Authors: Jia, Lei; Dong, Hao; Cong, Shijun; Wang, Guibin
Format: Patent
Language: English
Published: 02.01.2024
Summary: A method for compressing a neural network model includes acquiring a to-be-compressed neural network model. A first bit width, a second bit width and a target thinning rate corresponding to the to-be-compressed neural network model are determined. A target value is obtained according to the first bit width, the second bit width and the target thinning rate. Then the to-be-compressed neural network model is compressed using the target value, the first bit width and the second bit width to obtain a compression result of the to-be-compressed neural network model.
Bibliography: Application Number: US202217968688