A class boundary preserving algorithm for data condensation

Bibliographic Details
Published in Pattern Recognition Vol. 44; no. 3; pp. 704-715
Main Authors Nikolaidis, K., Goulermas, J.Y., Wu, Q.H.
Format Journal Article
Language English
Published Kidlington: Elsevier Ltd, 01.03.2011
Summary: In instance-based machine learning, algorithms often suffer from storing large numbers of training instances. This results in large computer memory usage, long response time, and often oversensitivity to noise. In order to overcome such problems, various instance reduction algorithms have been developed to remove noisy and surplus instances. This paper discusses existing algorithms in the field of instance selection and abstraction, and introduces a new approach, the Class Boundary Preserving Algorithm (CBP), which is a multi-stage method for pruning the training set, based on a simple but very effective heuristic for instance removal. CBP is tested with a large number of datasets and comparatively evaluated against eight of the most successful instance-based condensation algorithms. Experiments showed that our algorithm achieved similar classification accuracies, with much improved storage reduction and competitive execution speeds.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2010.08.014