Deep learning identified glioblastoma subtypes based on internal genomic expression ranks

Bibliographic Details
Published in	BMC cancer Vol. 22; no. 1; p. 86
Main Authors	Mao, Xing-Gang, Xue, Xiao-Yan, Wang, Ling, Lin, Wei, Zhang, Xiang
Format	Journal Article
Language	English
Published	England: BioMed Central Ltd, 20.01.2022
Subjects	Algorithms; Analysis; Brain cancer; Classical; Classification; Databases, Genetic; Datasets; Deep Learning; Deep neural network; Diagnosis; Gene expression; Gene Expression Profiling - methods; Gene Expression Regulation, Neoplastic; Genetic aspects; Genomics; Glioblastoma; Glioblastoma - classification; Glioblastoma multiforme; Health aspects; Humans; Machine learning; Mesenchymal; Mesenchyme; Neural; Neural networks; Neural Networks, Computer; Normal Distribution; Proneural; Risk factors; Statistical analysis; Tumors; China; Support vector machines; Glioma; Molecular subtype; Artificial intelligence

More Information
Summary:	Glioblastoma (GBM) can be divided into subtypes according to genomic features: Proneural (PN), Neural (NE), Classical (CL) and Mesenchymal (ME). However, it is difficult to unify genomic expression profiles that were standardized with different procedures in different studies, and to manually classify a given GBM sample into a subtype. An algorithm was developed to unify the genomic profiles of GBM samples into a standardized normal distribution (SND), based on their internal expression ranks. Deep neural network (DNN) and convolutional DNN (CDNN) models were trained on original and SND data. In addition, SND data expanded by combining various datasets from The Cancer Genome Atlas (TCGA) were used to improve the robustness and generalization capacity of the CDNN models. The SND data retained a unimodal distribution similar to that of the original data and preserved the internal expression ranks of all genes within each sample. CDNN models trained on the SND data showed significantly higher accuracy than DNN and CDNN models trained on the primary expression data. Interestingly, the CDNN models classified the NE subtype with the lowest accuracy in the GBM datasets, in the expanded datasets and in IDH wild-type GBMs, consistent with recent studies suggesting that the NE subtype should be excluded. Furthermore, the CDNN models also recognized independent GBM datasets, even with a small set of genomic expressions. GBM expression profiles can be transformed into unified SND data, which can be used to train CDNN models with high accuracy and generalization capacity. These models suggest that the NE subtype may not be compatible with the four-subtype classification system.
ISSN:	1471-2407
DOI:	10.1186/s12885-022-09191-2
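
Illustrative sketch (not taken from the article): the Summary describes mapping each sample's expression values onto a standardized normal distribution using only the within-sample expression ranks. The record does not give the authors' exact procedure, so the code below is a generic rank-based inverse normal transform written under stated assumptions: the helper name rank_to_snd is hypothetical, ties receive the average of their ranks, and a 1-based rank r out of n genes is mapped to the normal quantile at (r - 0.5) / n. The expression values are made up for demonstration.

```python
from statistics import NormalDist


def rank_to_snd(expression):
    """Map one sample's expression values to standard-normal scores by rank.

    Hypothetical helper, not the published algorithm. Tied values share the
    average of their ranks, and rank r (1-based) out of n values is mapped to
    the standard-normal quantile at (r - 0.5) / n.
    """
    n = len(expression)
    # Indices of the values, sorted from lowest to highest expression.
    order = sorted(range(n), key=lambda i: expression[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the whole group of values tied with position i.
        j = i
        while j + 1 < n and expression[order[j + 1]] == expression[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Two made-up samples on different scales but with the same gene ordering.
sample_a = [5.1, 0.3, 12.7, 8.4]
sample_b = [510.0, 30.0, 1270.0, 840.0]

# Both samples are compared after the transform; only ranks matter.
print(rank_to_snd(sample_a) == rank_to_snd(sample_b))
```

Because the transform depends only on the ordering of genes within a sample, profiles that were normalized on different scales but share the same internal ranks end up with identical scores, which is the property the Summary relies on when unifying datasets from different studies.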