Optimal Mixed Integer Linear Optimization Trained Multivariate Classification Trees
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 02.08.2024 |
DOI | 10.48550/arxiv.2408.01297 |
Summary: Multivariate decision trees are powerful machine learning tools for classification and regression that attract many researchers and industry professionals. An optimal binary tree has two types of vertices: (i) branching vertices, which have exactly two children and at which datapoints are assessed on a set of discrete features, and (ii) leaf vertices, at which datapoints are given a prediction. Such a tree can be obtained by solving a biobjective optimization problem that seeks to (i) maximize the number of correctly classified datapoints and (ii) minimize the number of branching vertices. In a multivariate tree, each branching vertex splits on a linear combination of the training features and can therefore be thought of as a hyperplane. In this paper, we propose two cut-based mixed integer linear optimization (MILO) formulations for designing optimal binary classification trees (trees whose leaf vertices assign discrete classes). Our models leverage on-the-fly identification of minimal infeasible subsystems (MISs), from which we derive cutting planes that take the form of packing constraints. We show theoretical improvements over the strongest flow-based MILO formulation currently in the literature and conduct experiments on publicly available datasets to demonstrate our models' ability to scale, their strength relative to traditional branch-and-bound approaches, and their robustness in out-of-sample test performance. Our code and data are available on GitHub.
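
The hyperplane branching rule described in the summary can be made concrete with a small sketch: each branching vertex compares a linear combination of a datapoint's features against a threshold and routes the point left or right until a leaf assigns a class. The snippet below is a minimal illustration of that routing rule under assumed names (`Branch`, `Leaf`, `predict`); it is not the paper's MILO formulation or implementation.

```python
import numpy as np

# Minimal sketch of routing a datapoint through a multivariate
# (hyperplane-split) binary classification tree. Class and field
# names are hypothetical illustrations, not the paper's code.

class Leaf:
    def __init__(self, label):
        self.label = label          # discrete class assigned at this leaf

class Branch:
    def __init__(self, a, b, left, right):
        self.a = np.asarray(a)      # hyperplane coefficients (one per feature)
        self.b = b                  # threshold
        self.left = left            # child taken when a @ x <= b
        self.right = right          # child taken when a @ x >  b

def predict(node, x):
    """Route a single datapoint x to a leaf and return its class."""
    while isinstance(node, Branch):
        node = node.left if node.a @ x <= node.b else node.right
    return node.label

# Example: a depth-1 tree whose single branching vertex is the
# hyperplane x1 + 2*x2 <= 1.
tree = Branch(a=[1.0, 2.0], b=1.0, left=Leaf(0), right=Leaf(1))
print(predict(tree, np.array([0.2, 0.1])))  # -> 0
print(predict(tree, np.array([1.0, 0.5])))  # -> 1
```

In the paper's setting, the hyperplane coefficients and thresholds at the branching vertices are decision variables chosen by the MILO model, not fixed quantities as in this illustration.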