A class boundary preserving algorithm for data condensation
Published in | Pattern recognition Vol. 44; no. 3; pp. 704 - 715 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Kidlington: Elsevier Ltd, 01.03.2011 |
Summary: | In instance-based machine learning, algorithms often suffer from storing large numbers of training instances. This results in large computer memory usage, long response time, and often oversensitivity to noise. In order to overcome such problems, various instance reduction algorithms have been developed to remove noisy and surplus instances. This paper discusses existing algorithms in the field of instance selection and abstraction, and introduces a new approach, the Class Boundary Preserving Algorithm (CBP), which is a multi-stage method for pruning the training set, based on a simple but very effective heuristic for instance removal. CBP is tested with a large number of datasets and comparatively evaluated against eight of the most successful instance-based condensation algorithms. Experiments showed that our algorithm achieved similar classification accuracies, with much improved storage reduction and competitive execution speeds. |
---|---|
ISSN: | 0031-3203 1873-5142 |
DOI: | 10.1016/j.patcog.2010.08.014 |