Support Vector Machines for Pattern Classification
A guide to the use of support vector machines (SVMs) in pattern classification, including a rigorous performance comparison of classifiers and regressors. The book presents architectures for multiclass classification and function approximation problems, as well as evaluation criteria for classifiers and regressors. Features:...
Main Author | |
---|---|
Format | eBook |
Language | English, German |
Published | London: Springer Nature, 2010 |
Edition | 2nd ed. |
Series | Advances in Pattern Recognition |
Subjects | |
Table of Contents:
- Intro
- Preface
- Acknowledgments
- Symbols
- 1 Introduction
  - 1.1 Decision Functions
    - 1.1.1 Decision Functions for Two-Class Problems
    - 1.1.2 Decision Functions for Multiclass Problems
  - 1.2 Determination of Decision Functions
  - 1.3 Data Sets Used in the Book
  - 1.4 Classifier Evaluation
  - References
- 2 Two-Class Support Vector Machines
  - 2.1 Hard-Margin Support Vector Machines
  - 2.2 L1 Soft-Margin Support Vector Machines
  - 2.3 Mapping to a High-Dimensional Space
    - 2.3.1 Kernel Tricks
    - 2.3.2 Kernels
    - 2.3.3 Normalizing Kernels
    - 2.3.4 Properties of Mapping Functions Associated with Kernels
    - 2.3.5 Implicit Bias Terms
    - 2.3.6 Empirical Feature Space
  - 2.4 L2 Soft-Margin Support Vector Machines
  - 2.5 Advantages and Disadvantages
    - 2.5.1 Advantages
    - 2.5.2 Disadvantages
  - 2.6 Characteristics of Solutions
    - 2.6.1 Hessian Matrix
    - 2.6.2 Dependence of Solutions on C
    - 2.6.3 Equivalence of L1 and L2 Support Vector Machines
    - 2.6.4 Nonunique Solutions
    - 2.6.5 Reducing the Number of Support Vectors
    - 2.6.6 Degenerate Solutions
    - 2.6.7 Duplicate Copies of Data
    - 2.6.8 Imbalanced Data
    - 2.6.9 Classification for the Blood Cell Data
  - 2.7 Class Boundaries for Different Kernels
  - 2.8 Developing Classifiers
    - 2.8.1 Model Selection
    - 2.8.2 Estimating Generalization Errors
    - 2.8.3 Sophistication of Model Selection
    - 2.8.4 Effect of Model Selection by Cross-Validation
  - 2.9 Invariance for Linear Transformation
  - References
- 3 Multiclass Support Vector Machines
  - 3.1 One-Against-All Support Vector Machines
    - 3.1.1 Conventional Support Vector Machines
    - 3.1.2 Fuzzy Support Vector Machines
    - 3.1.3 Equivalence of Fuzzy Support Vector Machines and Support Vector Machines with Continuous Decision Functions
    - 3.1.4 Decision-Tree-Based Support Vector Machines
  - 3.2 Pairwise Support Vector Machines
    - 3.2.1 Conventional Support Vector Machines
    - 3.2.2 Fuzzy Support Vector Machines
    - 3.2.3 Performance Comparison of Fuzzy Support Vector Machines
    - 3.2.4 Cluster-Based Support Vector Machines
    - 3.2.5 Decision-Tree-Based Support Vector Machines
    - 3.2.6 Pairwise Classification with Correcting Classifiers
  - 3.3 Error-Correcting Output Codes
    - 3.3.1 Output Coding by Error-Correcting Codes
    - 3.3.2 Unified Scheme for Output Coding
    - 3.3.3 Equivalence of ECOC with Membership Functions
    - 3.3.4 Performance Evaluation
  - 3.4 All-at-Once Support Vector Machines
  - 3.5 Comparisons of Architectures
    - 3.5.1 One-Against-All Support Vector Machines
    - 3.5.2 Pairwise Support Vector Machines
    - 3.5.3 ECOC Support Vector Machines
    - 3.5.4 All-at-Once Support Vector Machines
    - 3.5.5 Training Difficulty
    - 3.5.6 Training Time Comparison
  - References
- 4 Variants of Support Vector Machines
  - 4.1 Least-Squares Support Vector Machines
    - 4.1.1 Two-Class Least-Squares Support Vector Machines
    - 4.1.2 One-Against-All Least-Squares Support Vector Machines
    - 4.1.3 Pairwise Least-Squares Support Vector Machines
    - 4.1.4 All-at-Once Least-Squares Support Vector Machines
    - 4.1.5 Performance Comparison
  - 4.2 Linear Programming Support Vector Machines
    - 4.2.1 Architecture
    - 4.2.2 Performance Evaluation
  - 4.3 Sparse Support Vector Machines
    - 4.3.1 Several Approaches for Sparse Support Vector Machines
    - 4.3.2 Idea
    - 4.3.3 Support Vector Machines Trained in the Empirical Feature Space
    - 4.3.4 Selection of Linearly Independent Data
    - 4.3.5 Performance Evaluation
  - 4.4 Performance Comparison of Different Classifiers
  - 4.5 Robust Support Vector Machines
  - 4.6 Bayesian Support Vector Machines
    - 4.6.1 One-Dimensional Bayesian Decision Functions
    - 4.6.2 Parallel Displacement of a Hyperplane
    - 4.6.3 Normal Test
  - 4.7 Incremental Training
    - 4.7.1 Overview
    - 4.7.2 Incremental Training Using Hyperspheres
  - 4.8 Learning Using Privileged Information
  - 4.9 Semi-Supervised Learning
  - 4.10 Multiple Classifier Systems
  - 4.11 Multiple Kernel Learning
  - 4.12 Confidence Level
  - 4.13 Visualization
  - References
- 5 Training Methods
  - 5.1 Preselecting Support Vector Candidates
    - 5.1.1 Approximation of Boundary Data
    - 5.1.2 Performance Evaluation
  - 5.2 Decomposition Techniques
  - 5.3 KKT Conditions Revisited
  - 5.4 Overview of Training Methods
  - 5.5 Primal–Dual Interior-Point Methods
    - 5.5.1 Primal–Dual Interior-Point Methods for Linear Programming
    - 5.5.2 Primal–Dual Interior-Point Methods for Quadratic Programming
    - 5.5.3 Performance Evaluation
  - 5.6 Steepest Ascent Methods and Newton's Methods
    - 5.6.1 Solving Quadratic Programming Problems Without Constraints
    - 5.6.2 Training of L1 Soft-Margin Support Vector Machines
    - 5.6.3 Sequential Minimal Optimization
    - 5.6.4 Training of L2 Soft-Margin Support Vector Machines
    - 5.6.5 Performance Evaluation
  - 5.7 Batch Training by Exact Incremental Training
    - 5.7.1 KKT Conditions
    - 5.7.2 Training by Solving a Set of Linear Equations
    - 5.7.3 Performance Evaluation
  - 5.8 Active Set Training in Primal and Dual
    - 5.8.1 Training Support Vector Machines in the Primal
    - 5.8.2 Comparison of Training Support Vector Machines in the Primal and the Dual
    - 5.8.3 Performance Evaluation
  - 5.9 Training of Linear Programming Support Vector Machines
    - 5.9.1 Decomposition Techniques
    - 5.9.2 Decomposition Techniques for Linear Programming Support Vector Machines
    - 5.9.3 Computer Experiments
  - References
- 6 Kernel-Based Methods
  - 6.1 Kernel Least Squares
    - 6.1.1 Algorithm
    - 6.1.2 Performance Evaluation
  - 6.2 Kernel Principal Component Analysis
  - 6.3 Kernel Mahalanobis Distance
    - 6.3.1 SVD-Based Kernel Mahalanobis Distance
    - 6.3.2 KPCA-Based Mahalanobis Distance
  - 6.4 Principal Component Analysis in the Empirical Feature Space
  - 6.5 Kernel Discriminant Analysis
    - 6.5.1 Kernel Discriminant Analysis for Two-Class Problems
    - 6.5.2 Linear Discriminant Analysis for Two-Class Problems in the Empirical Feature Space
    - 6.5.3 Kernel Discriminant Analysis for Multiclass Problems
  - References
- 7 Feature Selection and Extraction
  - 7.1 Selecting an Initial Set of Features
  - 7.2 Procedure for Feature Selection
  - 7.3 Feature Selection Using Support Vector Machines
    - 7.3.1 Backward or Forward Feature Selection
    - 7.3.2 Support Vector Machine-Based Feature Selection
    - 7.3.3 Feature Selection by Cross-Validation
  - 7.4 Feature Extraction
  - References
- 8 Clustering
  - 8.1 Domain Description
  - 8.2 Extension to Clustering
  - References
- 9 Maximum-Margin Multilayer Neural Networks
  - 9.1 Approach
  - 9.2 Three-Layer Neural Networks
  - 9.3 CARVE Algorithm
  - 9.4 Determination of Hidden-Layer Hyperplanes
    - 9.4.1 Rotation of Hyperplanes
    - 9.4.2 Training Algorithm
  - 9.5 Determination of Output-Layer Hyperplanes
  - 9.6 Determination of Parameter Values
  - 9.7 Performance Evaluation
  - References
- 10 Maximum-Margin Fuzzy Classifiers
  - 10.1 Kernel Fuzzy Classifiers with Ellipsoidal Regions
    - 10.1.1 Conventional Fuzzy Classifiers with Ellipsoidal Regions
    - 10.1.2 Extension to a Feature Space
    - 10.1.3 Transductive Training
    - 10.1.4 Maximizing Margins
    - 10.1.5 Performance Evaluation
  - 10.2 Fuzzy Classifiers with Polyhedral Regions
    - 10.2.1 Training Methods
    - 10.2.2 Performance Evaluation
  - References
- 11 Function Approximation
  - 11.1 Optimal Hyperplanes
  - 11.2 L1 Soft-Margin Support Vector Regressors
  - 11.3 L2 Soft-Margin Support Vector Regressors
  - 11.4 Model Selection
  - 11.5 Training Methods
    - 11.5.1 Overview
    - 11.5.2 Newton's Methods
    - 11.5.3 Active Set Training
  - 11.6 Variants of Support Vector Regressors
    - 11.6.1 Linear Programming Support Vector Regressors
    - 11.6.2 ν-Support Vector Regressors
    - 11.6.3 Least-Squares Support Vector Regressors
  - 11.7 Variable Selection
    - 11.7.1 Overview
    - 11.7.2 Variable Selection by Block Deletion
    - 11.7.3 Performance Evaluation
  - References
- A Conventional Classifiers
  - A.1 Bayesian Classifiers
  - A.2 Nearest-Neighbor Classifiers
  - References
- B Matrices
  - B.1 Matrix Properties
  - B.2 Least-Squares Methods and Singular Value Decomposition
  - B.3 Covariance Matrices
  - References
- C Quadratic Programming
  - C.1 Optimality Conditions
  - C.2 Properties of Solutions
- D Positive Semidefinite Kernels and Reproducing Kernel Hilbert Space
  - D.1 Positive Semidefinite Kernels
  - D.2 Reproducing Kernel Hilbert Space
  - References
- Index