Genetic algorithm-based feature selection with manifold learning for cancer classification using microarray data
Published in | BMC Bioinformatics Vol. 24; no. 1; pp. 139 - 22 |
---|---|
Main Authors | , , , , , |
Format | Journal Article |
Language | English |
Published | England: BioMed Central Ltd (BMC), 08.04.2023 |
Summary: | Microarray data have been widely used for cancer classification. Their main characteristic is "large p and small n": the data contain a small number of subjects but a large number of genes, which may affect the validity of the classification. Thus, there is a pressing demand for techniques that can select genes relevant to cancer classification.
This study proposed a novel feature (gene) selection method, Iso-GA, for cancer classification. Iso-GA integrates the manifold learning algorithm Isomap into the genetic algorithm (GA) to account for the latent nonlinear structure of gene expression in microarray data. The Davies-Bouldin index is adopted to evaluate the candidate solutions in Isomap and to avoid the classifier dependency problem. Additionally, a probability-based framework is introduced to reduce the possibility of genes being randomly selected by the GA. The performance of Iso-GA was evaluated on eight benchmark cancer microarray datasets. Iso-GA outperformed the other benchmark gene selection methods, achieving good classification accuracy with fewer critical genes selected.
The proposed Iso-GA method can effectively select fewer but critical genes from microarray data to achieve competitive classification performance. |
---|---|
ISSN: | 1471-2105 |
DOI: | 10.1186/s12859-023-05267-3 |
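The summary names the Davies-Bouldin index as the classifier-independent criterion for scoring candidate gene subsets. As a rough illustration only, the sketch below computes that index for a subset encoded as a 0/1 mask. It is not the authors' implementation: the Isomap embedding step is omitted, so the index is applied directly to the selected expression values, and the use of class labels as groups, the names `davies_bouldin` and `subset_fitness`, and the toy data are all assumptions made for this example.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Davies-Bouldin index: lower values mean tighter, better-separated groups."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    if len(groups) < 2:
        raise ValueError("need at least two groups")

    # Centroid and mean distance-to-centroid (scatter) for each group.
    centroids, scatter = {}, {}
    for label, members in groups.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        centroids[label] = centroid
        scatter[label] = sum(math.dist(p, centroid) for p in members) / len(members)

    # For each group, keep its worst (largest) similarity ratio to any other group.
    # Note: this sketch does not handle two groups with identical centroids.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
        for i in groups
    ]
    return sum(worst) / len(worst)


def subset_fitness(samples, labels, mask):
    """Score a candidate gene subset given as a 0/1 mask (hypothetical GA chromosome)."""
    if not any(mask):
        raise ValueError("mask must select at least one gene")
    selected = [[value for value, keep in zip(row, mask) if keep] for row in samples]
    return davies_bouldin(selected, labels)


# Toy data (invented): 4 subjects x 2 genes; gene 0 separates the classes, gene 1 does not.
samples = [[1.0, 5.0], [1.2, 1.0], [5.0, 5.4], [5.2, 0.8]]
labels = [0, 0, 1, 1]

informative = subset_fitness(samples, labels, [1, 0])
uninformative = subset_fitness(samples, labels, [0, 1])
print(informative < uninformative)
```

Because a lower index indicates better separation, a search procedure built on this score would minimize it. In a fuller version, the selected columns would first be passed through a manifold embedding before the index is computed.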