Automatic classifier selection for non-experts

Bibliographic Details
Published in Pattern analysis and applications : PAA Vol. 17; no. 1; pp. 83-96
Main Authors Reif, Matthias; Shafait, Faisal; Goldstein, Markus; Breuel, Thomas; Dengel, Andreas
Format Journal Article
Language English
Published London Springer London 01.02.2014
Springer
Summary: Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Since a large variety of classification algorithms have been proposed in the literature, non-experts do not know which method should be used in order to obtain good classification results on their data. Meta-learning tries to address this problem by recommending promising classifiers based on meta-features computed from a given dataset. In this paper, we empirically evaluate five different categories of state-of-the-art meta-features for their suitability in predicting classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open source meta-learning system that is capable of accurately predicting accuracies of target classifiers. The user provides a dataset as input and gets an automatically created, high-performance, ready-to-use pattern recognition system in a few simple steps. A user study of the system with non-experts showed that the users were able to develop more accurate pattern recognition systems in significantly less development time when using our system than when using state-of-the-art data mining software.
ISSN:1433-7541
1433-755X
DOI:10.1007/s10044-012-0280-z
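
Illustration: The summary describes meta-learning as predicting classifier accuracies from meta-features computed on a dataset. The sketch below shows that general idea only and is not the authors' system. The four meta-features, the nearest-neighbour predictor, and all function names (`simple_meta_features`, `predict_accuracy`, `recommend`) are assumptions chosen for brevity. The record does not say which meta-features or prediction model the paper uses.

```python
import math
from collections import Counter


def simple_meta_features(X, y):
    """Compute a few basic meta-features of a labelled dataset.

    X is a list of feature rows, y the list of class labels.
    The choice of features here is illustrative, not taken from the paper.
    """
    n_samples = len(X)
    n_features = len(X[0])
    counts = Counter(y)
    probs = [c / n_samples for c in counts.values()]
    class_entropy = -sum(p * math.log2(p) for p in probs)
    return [
        math.log10(n_samples),   # dataset size (log scale)
        math.log10(n_features),  # dimensionality (log scale)
        float(len(counts)),      # number of classes
        class_entropy,           # class balance in bits
    ]


def predict_accuracy(query, meta_base, k=2):
    """Average the known accuracies of the k datasets closest to `query`.

    meta_base is a list of (meta_features, accuracy) pairs collected
    beforehand for ONE classifier on previously seen datasets.
    """
    ranked = sorted(meta_base, key=lambda entry: math.dist(entry[0], query))
    nearest = ranked[:k]
    return sum(acc for _, acc in nearest) / len(nearest)


def recommend(query, meta_bases, k=2):
    """Return the classifier with the highest predicted accuracy.

    meta_bases maps a classifier name to its own meta_base list.
    """
    predicted = {
        name: predict_accuracy(query, base, k)
        for name, base in meta_bases.items()
    }
    best = max(predicted, key=predicted.get)
    return best, predicted
```

In use, `meta_bases` would be filled with accuracies measured on earlier datasets. A new dataset is then summarised with `simple_meta_features` and passed to `recommend`, which returns the most promising classifier without training any of the candidates on the new data.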