Optimal Brain Surgeon and general network pruning


Bibliographic Details
Published in	IEEE International Conference on Neural Networks, vol. 1, pp. 293-299
Main Authors	Hassibi, B., Stork, D.G., Wolff, G.J.
Format	Conference Proceeding
Language	English
Published	IEEE 1993
Subjects	Backpropagation; Benchmark testing; Biological neural networks; Data mining; Hardware; Machine learning; Pattern recognition; Statistics; Surges; Training data

Summary:	The use of information from all second-order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and, in some cases, enable rule extraction, is investigated. The method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage, which often remove the wrong weights. OBS permits pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix $H^{-1}$ from training data and structural information of the net. OBS deletes the correct weights from a trained XOR network in every case.
ISBN:	0780309995 9780780309999
DOI:	10.1109/ICNN.1993.298572
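
Illustrative sketch:	The summary names two ingredients, a recursion for the inverse Hessian and a rule for choosing which weight to delete. The following is a minimal sketch of both, not the authors' code. It assumes the Hessian is approximated as alpha*I plus the average of outer products of per-example gradient vectors, so that the inverse can be built one example at a time with the Sherman-Morrison identity. The function names `inverse_hessian` and `obs_step`, the parameter `alpha`, and the toy numbers are hypothetical and chosen only for illustration.

```python
def inverse_hessian(grads, alpha=1e-4):
    """Recursively build the inverse of H = alpha*I + (1/P) * sum_k x_k x_k^T.

    grads: list of P per-example gradient vectors x_k (assumed input).
    alpha: small positive constant that makes the starting matrix invertible.
    """
    P, n = len(grads), len(grads[0])
    # Start from the inverse of alpha*I.
    h_inv = [[(1.0 / alpha if i == j else 0.0) for j in range(n)] for i in range(n)]
    for x in grads:
        # hx = H^-1 x; H^-1 stays symmetric, so x^T H^-1 is the same vector.
        hx = [sum(h_inv[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = P + sum(x[i] * hx[i] for i in range(n))
        # Sherman-Morrison rank-one update for adding (1/P) * x x^T to H.
        h_inv = [[h_inv[i][j] - hx[i] * hx[j] / denom for j in range(n)]
                 for i in range(n)]
    return h_inv


def obs_step(w, h_inv):
    """Delete the single weight with the smallest saliency and adjust the rest.

    Saliency of weight q: w_q^2 / (2 * [H^-1]_qq).
    Returns (index of deleted weight, its saliency, updated weight vector).
    """
    n = len(w)
    saliency = [w[q] ** 2 / (2.0 * h_inv[q][q]) for q in range(n)]
    q = min(range(n), key=saliency.__getitem__)
    scale = w[q] / h_inv[q][q]
    # Update all weights along column q of H^-1; this drives w_q to zero.
    new_w = [w[i] - scale * h_inv[i][q] for i in range(n)]
    new_w[q] = 0.0  # remove floating-point residue on the deleted weight
    return q, saliency[q], new_w


if __name__ == "__main__":
    # Toy inverse Hessian: weight 1 is larger in magnitude but cheaper to delete.
    q, s, new_w = obs_step([0.5, 1.0], [[1.0, 0.5], [0.5, 100.0]])
    print(q, s, new_w)
```

In the toy call, the weight with the larger magnitude is the one selected for deletion, which is the kind of case where a magnitude-based rule and a second-order rule disagree. The sketch performs a single deletion; repeated pruning would need to skip weights that are already zero and recompute the inverse Hessian for the reduced network.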