Data Reduction

Bibliographic Details
Published in	Multiple Instance Learning, pp. 169-189
Main Authors	Bello, Rafael; Zafra, Amelia; Vluymans, Sarah; Sánchez-Tarragó, Dánel; Cornelis, Chris; Herrera, Francisco; Ventura, Sebastián
Format	Book Chapter
Language	English
Published	Switzerland: Springer International Publishing AG, 2016
Subjects	Algorithms & data structures; Artificial intelligence; Candidate Memory Cell; Image processing; Minimum Hausdorff Distance; Negative Bags; Positive Bag; Prototypical Instances
ISBN	3319477587 9783319477589
DOI	10.1007/978-3-319-47759-6_8

Summary:	As datasets grow in dimensionality and size, computational complexity increases and estimation errors become more likely. Data reduction methods address this by constructing a new, more compact data subset that retains the most representative information and removes redundant, irrelevant, and/or noisy information. The inherent uncertainty of multiple instance learning (MIL) makes data reduction more difficult: each positive bag is composed of several instances, only some of which approximate the positive concept, and no information is available on which instances are positive. This chapter first provides an introduction to data reduction and then considers two main strategies for reducing MIL data. Section 8.2 describes the main concepts of feature selection and the methods that reduce the number of features in MIL problems. Section 8.3 considers bag prototype selection and analyzes the corresponding multi-instance methods.