Gene selection using information gain and improved simplified swarm optimization

Bibliographic Details
Published in Neurocomputing (Amsterdam) Vol. 218; pp. 331-338
Main Authors Lai, Chyh-Ming; Yeh, Wei-Chang; Chang, Chung-Yi
Format Journal Article
Language English
Published Elsevier B.V. 19.12.2016
Summary: Gene selection (also called feature selection in data mining) has recently come to play an important role in the development of efficient cancer diagnosis and classification, because gene expression data contain a very large number of measured variables (genes), of which only a small number show distinct profiles for different classes of samples. The gene selection problem involves removing irrelevant, redundant and noisy genes and identifying the most discriminative genes in order to improve classification accuracy. In this paper, a hybrid filter/wrapper method called IG-ISSO is proposed for the gene selection problem. In this method, information gain (IG) is applied as a filter to select the most informative genes, and an improved simplified swarm optimization (ISSO) is proposed as a gene search engine to guide the search for an optimal gene subset. A support vector machine (SVM) with a linear kernel serves as the classifier in IG-ISSO. To evaluate the performance of the proposed method empirically, experiments are conducted on ten gene expression datasets, and the results are compared with those of recent works. The statistical analysis indicates that the proposed method outperforms its competitors.
• The first work to apply simplified swarm optimization to a gene selection problem.
• The GPS helps IG-ISSO to identify a smaller gene set with a higher accuracy.
• Statistical results indicate IG-ISSO is better than other algorithms.
• GPS: gene pruning strategy.
ISSN: 0925-2312, 1872-8286
DOI:10.1016/j.neucom.2016.08.089
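
The filter stage named in the summary, information gain, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say how expression values are discretized or which cut-off is used, so the "low"/"high" discretization, the `threshold` parameter, the toy data, and all function names below are assumptions made for illustration only. The sketch computes IG(Y; X) = H(Y) - H(Y | X) for each gene and keeps the genes whose score exceeds the threshold.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature X."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature_values, labels):
        groups.setdefault(x, []).append(y)
    # Weighted average of the class entropy within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_genes_by_ig(expression, labels, threshold=0.0):
    """Return (gene_index, ig) pairs with IG > threshold, best first.

    `threshold` is a hypothetical cut-off chosen for this sketch.
    """
    n_genes = len(expression[0])
    scores = []
    for j in range(n_genes):
        column = [row[j] for row in expression]
        ig = information_gain(column, labels)
        if ig > threshold:
            scores.append((j, ig))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data (assumed): 4 samples x 3 genes, already discretized.
expression = [
    ["high", "low", "low"],
    ["high", "high", "low"],
    ["low", "low", "low"],
    ["low", "high", "low"],
]
labels = ["tumor", "tumor", "normal", "normal"]

selected = rank_genes_by_ig(expression, labels)
print(selected)
```

In this toy data only gene 0 separates the two classes, so it is the only gene that survives the filter. In a hybrid filter/wrapper scheme such as the one the summary describes, the surviving genes would then be passed to the wrapper stage, where a search procedure proposes subsets and a classifier scores them.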