Correlation feature selection based improved-Binary Particle Swarm Optimization for gene selection and cancer classification
Published in | Applied soft computing Vol. 62; pp. 203 - 215 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Elsevier B.V 01.01.2018 |
Subjects | |
Summary: |
• A two-phase hybrid model based on improved-Binary Particle Swarm Optimization (iBPSO) is proposed for cancer diagnosis and classification using DNA microarray technology.
• The model is evaluated on 11 cancer microarray datasets; it classifies the samples with 100% accuracy on seven datasets and with more than 92% accuracy on the remaining datasets.
• The proposed model is compared with seven other benchmark methods and exhibits superior performance.
• The model also selects a small number (<1.5%) of highly relevant genes responsible for cancer classification, which facilitates early prognosis of the disease.
• The proposed improved-BPSO also addresses the inherent local optimum problem of traditional BPSO.
DNA microarray technology has emerged as a prospective tool for the diagnosis and classification of cancer. It provides better insight into the many cancer-associated genetic mutations that occur within a cell. However, the thousands of gene expression values measured for each biological sample pose a great challenge. Many statistical and machine learning methods have been applied to select the most relevant genes prior to cancer classification. This work proposes a two-phase hybrid model for cancer classification that integrates Correlation-based Feature Selection (CFS) with improved-Binary Particle Swarm Optimization (iBPSO). The model selects a low-dimensional set of prognostic genes to classify biological samples of binary and multi-class cancers using a Naive Bayes classifier with stratified 10-fold cross-validation. The proposed iBPSO also addresses the early convergence to a local optimum seen in traditional BPSO. The model has been evaluated on 11 benchmark microarray datasets of different cancer types. Experimental results are compared with seven other well-known methods, and the model gives better results in terms of classification accuracy and number of selected genes in most cases. In particular, it achieves 100% classification accuracy on seven of the eleven datasets, with a very small prognostic gene subset (<1.5%) on all eleven datasets. |
---|---|
ISSN: | 1568-4946 1872-9681 |
DOI: | 10.1016/j.asoc.2017.09.038 |
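The two ingredients named in the summary, a CFS filter and a binary PSO search, can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not describe what the "improved" variant changes, so the code shows only the standard CFS merit score and a plain binary PSO with a sigmoid transfer function. The names `cfs_merit` and `binary_pso`, the parameter defaults, and the velocity clamp are assumptions made for illustration. The `fitness` callable is a stand-in for whatever the model actually optimizes (assumed here to be something like the cross-validated Naive Bayes accuracy of a gene subset).

```python
import math
import random


def cfs_merit(feature_class_corr, feature_feature_corr, subset):
    """Standard CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean absolute feature-class correlation of the subset and
    r_ff the mean absolute pairwise feature-feature correlation.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(feature_class_corr[i]) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(feature_feature_corr[i][j]) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def binary_pso(fitness, n_bits, n_particles=20, n_iter=50,
               w=0.9, c1=2.0, c2=2.0, v_max=6.0, seed=0):
    """Plain binary PSO (maximization); each bit marks a gene as selected.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                vel[i][d] = max(-v_max, min(v_max, v))
                # The sigmoid of the velocity is the probability of a 1 bit.
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

In a two-phase pipeline of the kind the summary describes, a score such as `cfs_merit` would first shrink the gene pool, and `binary_pso` would then search over bit masks of the remaining genes, with `fitness` wrapping a classifier evaluation.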