Multilabel Classification: Problem Analysis, Metrics and Techniques

This book offers a comprehensive review of multilabel techniques widely used to classify and label texts, pictures, videos and music on the Internet. A deep review of the specialized literature in the field includes the available software needed to work with this kind of data. It provides the user w...

Bibliographic Details
Main Authors Herrera, Francisco; Charte, Francisco; Rivera, Antonio J.; del Jesus, María J.
Format eBook Book
Language English
Published Cham Springer 2016
Springer International Publishing AG
Springer International Publishing
Edition 1
Subjects
ISBN 9783319411101
3319411101
DOI 10.1007/978-3-319-41111-8

Table of Contents:
  • Intro -- Preface -- Contents -- Acronyms -- 1 Introduction -- 1.1 Overview -- 1.2 The Knowledge Discovery in Databases Process -- 1.3 Data Preprocessing -- 1.4 Data Mining -- 1.4.1 DM Methods Attending to Available Data -- 1.4.2 DM Methods Attending to Target Objective -- 1.4.3 DM Methods Attending to Knowledge Representation -- 1.5 Classification -- 1.5.1 Binary Classification -- 1.5.2 Multiclass Classification -- 1.5.3 Multilabel Classification -- 1.5.4 Multidimensional Classification -- 1.5.5 Multiple Instance Learning -- References -- 2 Multilabel Classification -- 2.1 Introduction -- 2.2 Problem Formal Definition -- 2.2.1 Definitions -- 2.2.2 Symbols -- 2.2.3 Terminology -- 2.3 Applications of Multilabel Classification -- 2.3.1 Text Categorization -- 2.3.2 Labeling of Multimedia Resources -- 2.3.3 Genetics/Biology -- 2.3.4 Other Application Fields -- 2.3.5 MLDs Repositories -- 2.4 Learning from Multilabel Data -- 2.4.1 The Data Transformation Approach -- 2.4.2 The Method Adaptation Approach -- 2.4.3 Ensembles of Classifiers -- 2.4.4 Label Correlation Information -- 2.4.5 High Dimensionality -- 2.4.6 Label Imbalance -- 2.5 Multilabel Data Tools -- References -- 3 Case Studies and Metrics -- 3.1 Overview -- 3.2 Case Studies -- 3.2.1 Text Categorization -- 3.2.2 Labeling of Multimedia Resources -- 3.2.3 Genetics/Biology -- 3.2.4 Synthetic MLDs -- 3.3 MLD Characteristics -- 3.3.1 Basic Metrics -- 3.3.2 Imbalance Metrics -- 3.3.3 Other Metrics -- 3.3.4 Summary of Characterization Metrics -- 3.4 Multilabel Classification by Example -- 3.4.1 The ML-kNN Algorithm -- 3.4.2 Experimental Configuration and Results -- 3.5 Assessing Classifiers Performance -- 3.5.1 Example-Based Metrics -- 3.5.2 Label-Based Metrics -- References -- 4 Transformation-Based Classifiers -- 4.1 Introduction -- 4.2 Multilabel Data Transformation Approaches
  • 4.3 Binary Classification-Based Methods -- 4.3.1 OVO Versus OVA Approaches -- 4.3.2 Ensembles of Binary Classifiers -- 4.4 Multiclass Classification-Based Methods -- 4.4.1 Labelsets and Pruned Labelsets -- 4.4.2 Ensembles of Multiclass Classifiers -- 4.5 Data Transformation Methods in Practice -- 4.5.1 Experimental Configuration -- 4.5.2 Classification Results -- 4.6 Summarizing Comments -- References -- 5 Adaptation-Based Classifiers -- 5.1 Overview -- 5.2 Tree-Based Methods -- 5.2.1 Multilabel C4.5, ML-C4.5 -- 5.2.2 Multilabel Alternating Decision Trees, ADTBoost.MH -- 5.2.3 Other Tree-Based Proposals -- 5.3 Neural Network-Based Methods -- 5.3.1 Multilabel Back-Propagation, BP-MLL -- 5.3.2 Multilabel Radial Basis Function Network, ML-RBF -- 5.3.3 Canonical Correlation Analysis and Extreme Learning Machine, CCA-ELM -- 5.4 Support Vector Machine-Based Methods -- 5.4.1 MODEL-x -- 5.4.2 Multilabel SVMs Based on Ranking, Rank-SVM and SCRank-SVM -- 5.5 Instance-Based Methods -- 5.5.1 Multilabel kNN, ML-kNN -- 5.5.2 Instance-Based and Logistic Regression, IBLR-ML -- 5.5.3 Other Instance-Based Classifiers -- 5.6 Probabilistic Methods -- 5.6.1 Collective Multilabel Classifiers, CML and CMLF -- 5.6.2 Probabilistic Generative Models, PMM1 and PMM2 -- 5.6.3 Probabilistic Classifier Chains, PCC -- 5.6.4 Bayesian and Tree Naïve Bayes Classifier Chains, BCC and TNBCC -- 5.6.5 Conditional Restricted Boltzmann Machines, CRBM -- 5.7 Other MLC Adaptation-Based Methods -- 5.8 Adapted Methods in Practice -- 5.8.1 Experimental Configuration -- 5.8.2 Classification Results -- 5.9 Summarizing Comments -- References -- 6 Ensemble-Based Classifiers -- 6.1 Introduction -- 6.2 Ensembles of Binary Classifiers -- 6.2.1 Ensemble of Classifier Chains, ECC -- 6.2.2 Ranking by Pairwise Comparison, RPC -- 6.2.3 Calibrated Label Ranking, CLR -- 6.3 Ensembles of Multiclass Classifiers
  • 6.3.1 Ensemble of Pruned Sets, EPS -- 6.3.2 Random k-Labelsets, RAkEL -- 6.3.3 Hierarchy of Multilabel Classifiers, HOMER -- 6.4 Other Ensembles -- 6.5 Ensemble Methods in Practice -- 6.5.1 Experimental Configuration -- 6.5.2 Classification Results -- 6.5.3 Training and Testing Times -- 6.6 Summarizing Comments -- References -- 7 Dimensionality Reduction -- 7.1 Overview -- 7.1.1 High-Dimensional Input Space -- 7.1.2 High-Dimensional Output Space -- 7.2 Feature Space Reduction -- 7.2.1 Feature Engineering Approaches -- 7.2.2 Multilabel Supervised Feature Selection -- 7.2.3 Experimentation -- 7.3 Label Space Reduction -- 7.3.1 Sparseness and Dependencies Among Labels -- 7.3.2 Proposals for Reducing Label Space Dimensionality -- 7.3.3 Experimentation -- 7.4 Summarizing Comments -- References -- 8 Imbalance in Multilabel Datasets -- 8.1 Introduction -- 8.2 Imbalanced MLD Specificities -- 8.2.1 How to Measure the Imbalance Level -- 8.2.2 Concurrence Among Imbalanced Labels -- 8.3 Facing Imbalanced Multilabel Classification -- 8.3.1 Classifier Adaptation -- 8.3.2 Resampling Techniques -- 8.3.3 The Ensemble Approach -- 8.4 Multilabel Imbalanced Learning in Practice -- 8.4.1 Experimental Configuration -- 8.4.2 Classification Results -- 8.5 Summarizing Comments -- References -- 9 Multilabel Software -- 9.1 Overview -- 9.2 Working with Multilabel Data -- 9.2.1 Multilabel Data File Formats -- 9.2.2 Multilabel Data Repositories -- 9.2.3 The mldr.datasets Package -- 9.2.4 Generating Synthetic MLDs -- 9.3 Exploratory Analysis of MLDs -- 9.3.1 MEKA -- 9.3.2 The mldr Package -- 9.4 Conducting Multilabel Experiments -- 9.4.1 MEKA -- 9.4.2 MULAN -- 9.4.3 The RunMLClassifier Utility -- 9.5 Summarizing Comments -- References -- Glossary