Molecular Subtyping of Cancer Based on Robust Graph Neural Network and Multi-Omics Data Integration

Bibliographic Details
Published in Frontiers in Genetics Vol. 13; p. 884028
Main Authors Yin, Chaoyi; Cao, Yangkun; Sun, Peishuo; Zhang, Hengyuan; Li, Zhi; Xu, Ying; Sun, Huiyan
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 13.05.2022
Summary: Accurate prediction of molecular subtypes in cancer patients is important for personalized cancer diagnosis and treatment. The large amount of available multi-omics data and advances in data-driven methods are expected to facilitate molecular subtyping of cancer. Most existing machine learning-based methods classify samples according to single-omics data, fail to integrate multi-omics data to learn comprehensive representations of the samples, and ignore that information transfer and aggregation among samples can better represent them and ultimately help in classification. We propose a novel framework named multi-omics graph convolutional network (M-GCN) for molecular subtyping, based on robust graph convolutional networks that integrate multi-omics data. We first apply the Hilbert–Schmidt independence criterion least absolute shrinkage and selection operator (HSIC Lasso) to select molecular subtype-related transcriptomic features and then use these features to construct a low-noise sample–sample similarity graph. Next, we take the selected gene expression, single nucleotide variant (SNV), and copy number variation (CNV) data as input and learn multi-view representations of the samples. On this basis, a robust variant of the graph convolutional network (GCN) model is developed to obtain new representations of the samples by aggregating their subgraphs. Experimental results on breast and stomach cancer demonstrate that the classification performance of M-GCN is superior to that of other existing methods. Moreover, the identified subtype-specific biomarkers are highly consistent with current clinical understanding and show promise for assisting accurate diagnosis and targeted drug development.
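The graph-based part of the summary can be illustrated with a minimal sketch. It is not the authors' M-GCN: it assumes a cosine-similarity k-nearest-neighbour graph built from the selected expression features, plain concatenation of the three omics views, and the standard (non-robust) GCN propagation rule. The HSIC Lasso selection step is omitted, and all names, sizes, and the random data are hypothetical.

```python
import numpy as np

def knn_similarity_graph(X, k=2):
    """Symmetric k-nearest-neighbour graph from cosine similarity between samples (rows of X)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)      # unit-length rows
    S = Xn @ Xn.T                             # cosine similarity matrix
    np.fill_diagonal(S, -np.inf)              # exclude each sample from its own neighbours
    nbrs = np.argsort(-S, axis=1)[:, :k]      # indices of the k most similar samples
    A = np.zeros_like(S)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                 # keep an edge if either endpoint selected it

def gcn_layer(A, H, W):
    """One standard GCN step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Hypothetical toy data: 6 samples with three omics views.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 5))                        # selected gene expression features
snv = rng.integers(0, 2, size=(6, 4)).astype(float)   # binary mutation indicators
cnv = rng.normal(size=(6, 3))                         # copy number values

A = knn_similarity_graph(expr, k=2)           # graph built from expression features only
H = np.concatenate([expr, snv, cnv], axis=1)  # assumption: views are simply concatenated
W = rng.normal(size=(H.shape[1], 4))          # untrained weights, for shape illustration only
Z = gcn_layer(A, H, W)                        # new sample representations, shape (6, 4)
```

Each row of `Z` mixes a sample's own features with those of its graph neighbours, which is the "information transfer and aggregation among samples" the summary refers to. In a real classifier the weights would be trained against subtype labels.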
Bibliography: This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics
Edited by: Jianpeng Sheng, Nanyang Technological University, Singapore
Reviewed by: Jiazhou Chen, South China University of Technology, China
Massimo La Rosa, National Research Council (CNR), Italy
These authors have contributed equally to this work and share first authorship
ISSN: 1664-8021
DOI: 10.3389/fgene.2022.884028