Stability and Classification Performance of Feature Selection Techniques

Bibliographic Details
Published in: 2011 10th International Conference on Machine Learning and Applications and Workshops, Vol. 1, pp. 151-156
Main Authors: Huanjing Wang, Khoshgoftaar, T. M., Qianhui Liang
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2011
Summary: Feature selection techniques can be evaluated based on either model performance or the stability (robustness) of the technique. The ideal situation is to choose a feature selection technique that is robust to change, while also ensuring that models built with the selected features perform well. One domain where feature selection is especially important is software defect prediction, where large numbers of metrics collected from previous software projects are used to help engineers focus their efforts on the most faulty modules. This study presents a comprehensive empirical examination of seven filter-based feature ranking techniques (rankers) applied to nine real-world software measurement datasets of different sizes. Experimental results demonstrate that the signal-to-noise ranker performed moderately in terms of robustness and was the best ranker in terms of model performance. The study also shows that although Relief was the most stable feature selection technique, it performed significantly worse than the other rankers in terms of model performance.
ISBN: 9781457721342, 1457721341
DOI: 10.1109/ICMLA.2011.133
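
The summary above refers to the signal-to-noise (S2N) ranker, a filter that scores each software metric by how well it separates fault-prone from non-fault-prone modules. The paper itself does not include code; the following is a minimal Python sketch of the commonly used S2N score |mean_fault - mean_clean| / (std_fault + std_clean), with the synthetic dataset, labels, and top-5 cutoff chosen here purely for illustration.

import numpy as np

def signal_to_noise_ranking(X, y):
    # Score each feature by |mean(fault) - mean(clean)| / (std(fault) + std(clean)),
    # then return feature indices ordered from most to least discriminative.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    fault, clean = X[y == 1], X[y == 0]
    eps = 1e-12  # avoid division by zero for constant-valued metrics
    s2n = np.abs(fault.mean(axis=0) - clean.mean(axis=0)) / (fault.std(axis=0) + clean.std(axis=0) + eps)
    return np.argsort(s2n)[::-1]

# Toy usage: rank 20 synthetic "metrics" for 100 modules and keep the top 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = rng.integers(0, 2, size=100)
print(signal_to_noise_ranking(X, y)[:5])

Stability, the other evaluation axis in the study, is typically assessed by measuring the overlap (e.g., pairwise similarity) of the top-k feature subsets selected from perturbed samples of the same dataset.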