A novel adaptive feature selector for supervised classification

Bibliographic Details
Published in Information Processing Letters, Vol. 117, pp. 25-34
Main Authors Sasikala, S., Appavu alias Balamurugan, S., Geetha, S.
Format Journal Article
Language English
Published Amsterdam: Elsevier B.V. (Elsevier Sequoia S.A.), 01.01.2017

Summary: Optimal feature selection is an important area of research in medical data mining systems, and is a key factor in boosting classification accuracy. In this paper we propose an adaptive feature selector based on game theory and an optimization approach to investigate improvements in detection accuracy and optimal feature subset selection. In particular, the embedded Shapley Value includes two memetic operators, namely Include and Remove features (or genes), to realize the genetic algorithm (GA) solution. The use of the GA for feature selection enables rapid improvement of the solution through a fine-tuned search. An extensive experimental comparison of the proposed method with conventional learning methods such as Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbor (KNN), J48 (C4.5) and Artificial Neural Network (ANN) on 22 benchmark datasets (both synthetic and microarray) from the UCI Machine Learning repository and the Kent Ridge repository confirms that the proposed SVEGA strategy is effective and efficient in removing irrelevant and redundant features. We also compare against representative wrapper, filter and conventional GA methods, and show that this novel memetic algorithm, SVEGA, yields overall promising results over conventional feature selection methods in terms of classification accuracy, number of selected genes, running time and other metrics.
ISSN:0020-0190
1872-6119
DOI:10.1016/j.ipl.2016.08.003
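The record above describes SVEGA only at the level of the abstract. As a rough illustration of the kind of method it outlines, the Python sketch below implements a generic memetic genetic algorithm for feature selection: each chromosome is a binary feature mask, fitness is cross-validated classifier accuracy, and an Include/Remove local-refinement step is guided by simple per-feature marginal-contribution estimates used here as a stand-in for the Shapley values embedded in the paper's GA. The dataset (scikit-learn's breast-cancer data), the k-NN fitness classifier, and all hyperparameters are assumptions chosen for the demonstration, not details taken from the publication.

```python
# Illustrative sketch only: a memetic GA over binary feature masks whose
# Include/Remove refinement is guided by crude per-feature marginal-
# contribution estimates standing in for Shapley values. Dataset, classifier
# and parameters are assumptions for demonstration, not the authors' setup.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)   # stand-in medical dataset
n_features = X.shape[1]
clf = KNeighborsClassifier(n_neighbors=5)    # stand-in fitness classifier


def fitness(mask):
    """Cross-validated accuracy of the classifier on the selected features."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=3).mean()


def feature_values(mask):
    """One-feature-at-a-time marginal contributions: a cheap proxy for the
    Shapley values the paper embeds into the GA."""
    base = fitness(mask)
    vals = np.zeros(n_features)
    for j in range(n_features):
        flipped = mask.copy()
        flipped[j] = 1 - flipped[j]
        delta = fitness(flipped) - base
        # Positive value: feature j helps when present / hurts when absent.
        vals[j] = delta if mask[j] == 0 else -delta
    return vals


def memetic_refine(mask):
    """Include the most valuable excluded feature, remove the least valuable
    included one, and keep the refinement only if fitness improves."""
    vals = feature_values(mask)
    refined = mask.copy()
    excluded = np.where(refined == 0)[0]
    if len(excluded) > 0:                       # Include operator
        refined[excluded[np.argmax(vals[excluded])]] = 1
    included = np.where(refined == 1)[0]
    if len(included) > 1:                       # Remove operator
        refined[included[np.argmin(vals[included])]] = 0
    return refined if fitness(refined) > fitness(mask) else mask


def run_ga(pop_size=8, generations=3, mutation_rate=0.05):
    """Plain GA (truncation selection, one-point crossover, bit-flip mutation)
    with the memetic refinement applied to every child."""
    pop = rng.integers(0, 2, size=(pop_size, n_features))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(-scores)][: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_features)
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_features) < mutation_rate
            child = np.where(flip, 1 - child, child)
            children.append(memetic_refine(child))
        pop = np.vstack([parents, children])
    best = max(pop, key=fitness)
    return best, fitness(best)


best_mask, best_acc = run_ga()
print(f"selected {int(best_mask.sum())} of {n_features} features, "
      f"CV accuracy {best_acc:.3f}")
```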