Gene Selection and Classification of scRNA-seq Data Combining Information Gain Ratio and Genetic Algorithm with Dynamic Crossover

Bibliographic Details
Published in Wireless Communications and Mobile Computing, Vol. 2022; pp. 1-16
Main Authors Feng, Junhong; Niu, Xishuan; Zhang, Jie; Wang, Jian-Hong
Format Journal Article
Language English
Published Oxford Hindawi 31.01.2022
Hindawi Limited
Summary: Single-cell RNA sequencing (scRNA-seq) is emerging as a promising technology. A scRNA-seq dataset contains a huge number of genes. Some of these genes are of high quality, whereas others are noisy or irrelevant for nonspecific technical reasons. Such noisy and irrelevant genes may strongly influence downstream data analyses, such as cell classification, gene function analysis, and cancer biomarker detection. It is therefore important to remove these irrelevant genes and select high-quality genes by means of gene selection methods. In this study, a novel gene selection and classification method is presented that combines the information gain ratio with a genetic algorithm with dynamic crossover (abbreviated as IGRDCGA). The information gain ratio (IGR) is employed to coarsely eliminate irrelevant genes and obtain a preliminary gene subset, and the genetic algorithm with dynamic crossover (DCGA) is then used to finely select high-quality genes from this preliminary subset. The main difference between IGRDCGA and existing methods is that it is the first to integrate DCGA and IGR and apply them to gene selection from scRNA-seq data. We evaluate IGRDCGA and several competing methods on real-world scRNA-seq datasets. The results demonstrate that IGRDCGA selects high-quality genes effectively and efficiently and outperforms the competing methods in terms of both dimensionality reduction and classification accuracy.
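The two-stage idea in the summary (a coarse IGR filter followed by a finer genetic-algorithm search) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic details, so the equal-width binning, the nearest-centroid fitness with a size penalty, the linearly decaying crossover probability, and every function and parameter name below (`gain_ratio`, `igr_filter`, `dcga`, `pc_start`, `pc_end`, and so on) are assumptions chosen only to make the example runnable.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(feature, labels, bins=3):
    """Information gain ratio of one gene; equal-width binning is an assumption."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])
    split_info = entropy(binned)
    if split_info == 0.0:          # constant gene carries no information
        return 0.0
    cond = sum((binned == v).mean() * entropy(labels[binned == v])
               for v in np.unique(binned))
    return (entropy(labels) - cond) / split_info

def igr_filter(X, y, k):
    """Stage 1: indices of the k genes (columns of X) with the highest IGR."""
    scores = np.array([gain_ratio(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]

def fitness(mask, X, y, penalty=0.01):
    """Assumed fitness: nearest-centroid training accuracy minus a size penalty."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    classes = np.unique(y)
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in classes])
    dists = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    acc = (classes[dists.argmin(axis=1)] == y).mean()
    return float(acc - penalty * mask.mean())

def dcga(X, y, pop_size=20, generations=30, pc_start=0.9, pc_end=0.5,
         pm=0.02, seed=0):
    """Stage 2: binary GA over gene masks; needs at least 2 genes in X."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    best, best_fit = pop[0].copy(), -np.inf
    for g in range(generations):
        fits = np.array([fitness(ind, X, y) for ind in pop])
        if fits.max() > best_fit:
            best_fit, best = float(fits.max()), pop[fits.argmax()].copy()
        # Assumed "dynamic" schedule: crossover probability decays linearly.
        pc = pc_start + (pc_end - pc_start) * g / max(generations - 1, 1)
        # Binary tournament selection of parents.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = pop[np.where(fits[a] >= fits[b], a, b)]
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            if rng.random() < pc:  # one-point crossover between neighbours
                cut = rng.integers(1, n)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        children ^= rng.random(children.shape) < pm  # bit-flip mutation
        children[0] = best                           # elitism
        pop = children
    return best, best_fit

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 30)            # synthetic labels, two cell types
    X = rng.normal(size=(60, 50))        # synthetic expression matrix
    X[:, :3] += 3.0 * y[:, None]         # three informative genes
    keep = igr_filter(X, y, k=10)        # coarse filter
    mask, fit = dcga(X[:, keep], y)      # fine search on the reduced set
    print("selected genes:", sorted(keep[mask].tolist()), "fitness:", fit)
```

In this sketch the filter shrinks the search space before the more expensive wrapper search runs, which is the division of labour the summary describes; any real use would replace the training-accuracy fitness with a cross-validated classifier score.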
ISSN:1530-8669
1530-8677
DOI:10.1155/2022/9639304