Method and apparatus for compressing neural network model

Bibliographic Details
Main Authors	Jia, Lei; Dong, Hao; Cong, Shijun; Wang, Guibin
Format	Patent
Language	English
Published	02.01.2024
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS
Summary:	A method for compressing a neural network model includes acquiring a to-be-compressed neural network model. A first bit width, a second bit width and a target thinning rate corresponding to the to-be-compressed neural network model are determined. A target value is obtained according to the first bit width, the second bit width and the target thinning rate. Then the to-be-compressed neural network model is compressed using the target value, the first bit width and the second bit width to obtain a compression result of the to-be-compressed neural network model.
Bibliography:	Application Number: US202217968688
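
The summary names three inputs (a first bit width, a second bit width and a target thinning rate) and a derived target value, but this record does not say how the target value is computed or how it is used. The following is only an illustrative sketch under stated assumptions, not the patented method: it assumes the first bit width is a machine word width, the second bit width is the per-weight quantized width, and the target value is the number of weights kept in each word-sized group. The function names `target_value` and `compress` are hypothetical.

```python
import numpy as np


def target_value(first_bits: int, second_bits: int, thinning_rate: float) -> int:
    """Hypothetical rule (an assumption, not taken from the patent record):
    the number of weights kept in each group of first_bits // second_bits weights."""
    group = first_bits // second_bits
    return max(1, round(group * (1.0 - thinning_rate)))


def compress(weights, first_bits: int, second_bits: int, thinning_rate: float):
    """Thin (sparsify) each group by magnitude, then quantize to second_bits.

    Assumes second_bits <= 8 and that the number of weights is divisible
    by the group size first_bits // second_bits.
    """
    assert 2 <= second_bits <= 8
    group = first_bits // second_bits
    keep = target_value(first_bits, second_bits, thinning_rate)

    w = np.asarray(weights, dtype=np.float64).reshape(-1, group)

    # Thinning: keep the `keep` largest-magnitude weights in every group.
    order = np.argsort(-np.abs(w), axis=1)
    mask = np.zeros(w.shape, dtype=bool)
    np.put_along_axis(mask, order[:, :keep], True, axis=1)
    sparse = np.where(mask, w, 0.0)

    # Symmetric uniform quantization of the surviving weights.
    qmax = 2 ** (second_bits - 1) - 1
    max_abs = np.abs(sparse).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(sparse / scale), -qmax, qmax).astype(np.int8)
    return q.reshape(np.shape(weights)), scale


# Example: 32-bit words, 4-bit weights, 75% of each group thinned away.
example = np.array([0.1, -0.9, 0.3, 0.05, 0.7, -0.2, 0.0, 0.4])
quantized, scale = compress(example, first_bits=32, second_bits=4, thinning_rate=0.75)
```

Tying the group size to the ratio of the two bit widths is one plausible reason a method would take both widths as inputs, since each group of low-bit weights then packs into a single word; the full patent text should be consulted for the actual definition.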