Data Reduction

Bibliographic Details
Published in: Multiple Instance Learning, pp. 169-189
Main Authors: Bello, Rafael; Zafra, Amelia; Vluymans, Sarah; Sánchez-Tarragó, Dánel; Cornelis, Chris; Herrera, Francisco; Ventura, Sebastián
Format: Book Chapter
Language: English
Published: Switzerland: Springer International Publishing AG, 2016
ISBN: 3319477587; 9783319477589
DOI: 10.1007/978-3-319-47759-6_8

Summary: An increase in dataset dimensionality and size implies higher computational complexity and possible estimation errors. In this context, data reduction methods try to construct a new and more compact data subset. This subset should maintain the most representative information and remove redundant, irrelevant, and/or noisy information. The inherent uncertainty of MIL makes the data reduction process more difficult. Each positive bag is composed of several instances, only some of which approximate the positive concept, and no information is available on which instances are positive. In this chapter, we first provide an introduction to data reduction. Next, two main strategies to reduce MIL data are considered. Section 8.2 describes the main concepts of feature selection as well as methods that try to reduce the number of features in MIL problems. Section 8.3 considers bag prototype selection and analyzes the corresponding multi-instance methods.