A Mixed-Integer Programming Approach to Training Dense Neural Networks

Bibliographic Details
Main Authors: Patil, Vrishabh; Mintz, Yonatan
Format: Journal Article
Language: English
Published: 03.01.2022

Summary: Artificial Neural Networks (ANNs) are prevalent machine learning models applied across a wide range of real-world classification tasks. However, training ANNs is time-consuming, and the resulting models require substantial memory to deploy. To train more parsimonious ANNs, we propose a novel mixed-integer programming (MIP) formulation for training fully-connected ANNs. Our formulations can account for both binary and rectified linear unit (ReLU) activations, and for the use of a log-likelihood loss. We present numerical experiments comparing our MIP-based methods against existing approaches and show that we achieve competitive out-of-sample performance with more parsimonious models.
DOI: 10.48550/arxiv.2201.00723
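
The abstract mentions MIP formulations that account for ReLU activations. The record does not reproduce the paper's constraints, but the standard way to express y = max(0, x) with mixed-integer linear constraints is the big-M linearization; the sketch below (a plain-Python illustration, not the authors' formulation — the bound M and function names are assumptions) checks that, for a fixed input x and a binary indicator z, the constraint set pins y down to exactly max(0, x).

```python
# Standard big-M encoding of a ReLU activation y = max(0, x) as
# mixed-integer linear constraints, assuming a valid bound |x| <= M:
#     y >= x,  y >= 0,  y <= x + M*(1 - z),  y <= M*z,  z in {0, 1}.
# This is an illustrative sketch of the generic technique, not the
# specific formulation proposed in the paper.

def relu_feasible_interval(x, z, M):
    """Interval of y values satisfying the big-M constraints for a
    fixed input x and fixed binary indicator z (z = 1 <=> x >= 0).
    Returns None when the constraints are infeasible for this z."""
    lower = max(x, 0.0)                      # y >= x and y >= 0
    upper = min(x + M * (1 - z), M * z)      # the two big-M upper bounds
    return (lower, upper) if lower <= upper + 1e-9 else None

def relu_via_mip(x, M=10.0):
    """Enumerate z in {0, 1} and collect every feasible y; the encoding
    is exact, so the feasible set collapses to the single point max(0, x)."""
    ys = set()
    for z in (0, 1):
        interval = relu_feasible_interval(x, z, M)
        if interval is not None:
            lo, hi = interval
            assert abs(hi - lo) < 1e-9  # interval degenerates to one point
            ys.add(round(lo, 9))
    assert len(ys) == 1
    return ys.pop()
```

Note the role of z: z = 1 forces y = x (only feasible when x >= 0), while z = 0 forces y = 0 (only feasible when x <= 0), so any feasible solution satisfies y = max(0, x) exactly, provided M genuinely bounds |x|.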