MoGCN: A Multi-Omics Integration Method Based on Graph Convolutional Network for Cancer Subtype Analysis


Bibliographic Details
Published in Frontiers in Genetics Vol. 13; p. 806842
Main Authors Li, Xiao; Ma, Jie; Leng, Ling; Han, Mingfei; Li, Mansheng; He, Fuchu; Zhu, Yunping
Format Journal Article
Language English
Published Switzerland Frontiers Media S.A 02.02.2022
Summary: In light of the rapid accumulation of large-scale omics datasets, numerous studies have attempted to characterize the molecular and clinical features of cancers from a multi-omics perspective. However, integrating multi-omics data with machine learning methods for cancer subtype classification remains highly challenging. In this study, MoGCN, a multi-omics integration model based on a graph convolutional network (GCN), was developed for cancer subtype classification and analysis. Genomics, transcriptomics and proteomics datasets for 511 breast invasive carcinoma (BRCA) samples were downloaded from The Cancer Genome Atlas (TCGA). An autoencoder (AE) was used to reduce dimensionality, and similarity network fusion (SNF) was used to construct the patient similarity network (PSN). The resulting feature vectors and the PSN were then input into the GCN for training and testing. Feature extraction and network visualization were used for further biological knowledge discovery and subtype classification. In the analysis of multi-dimensional omics data of the BRCA samples in TCGA, MoGCN achieved the highest accuracy in cancer subtype classification compared with several popular algorithms. Moreover, MoGCN can extract the most significant features of each omics layer and provide candidate functional molecules for further analysis of their biological effects. Network visualization also showed that MoGCN could support clinically intuitive diagnosis. The generality of MoGCN was demonstrated on the TCGA pan-kidney cancer datasets. MoGCN and the datasets are publicly available at https://github.com/Lifoof/MoGCN. Our study shows that MoGCN performs well in heterogeneous data integration and in the interpretability of classification results, which gives it great potential for applications in biomarker identification and clinical diagnosis.
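Illustration: the summary describes feeding latent feature vectors and a patient similarity network into a GCN. The following is a minimal sketch of a single graph-convolution step, assuming the widely used symmetric normalization with self-loops. It is not the authors' implementation (see the repository linked above for that). The function names, the toy three-patient network, the latent features and the weight matrix are all hypothetical and chosen only to show how each patient's features are mixed with those of similar patients.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square matrix given as nested lists."""
    n = len(adj)
    # Add self-loops so every patient keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degree]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ X @ W)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]

# Hypothetical toy patient similarity network: patients 0 and 1 are similar,
# patient 2 is isolated. A fused SNF network would carry real-valued weights.
psn = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]

# Hypothetical 2-dimensional latent features (stand-in for autoencoder output).
latent = [[1.0, 0.0],
          [0.0, 1.0],
          [2.0, 2.0]]

# Hypothetical weight matrix; in practice it is learned during training.
weights = [[1.0, -1.0],
           [0.5, 1.0]]

out = gcn_layer(psn, latent, weights)
for patient, row in enumerate(out):
    print(patient, row)
```

In this toy case the two connected patients end up with the same hidden representation because their features are averaged over the same neighbourhood, while the isolated patient depends only on its own features. A classifier would typically stack two such layers and end with a softmax over subtype labels.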
Bibliography: Edited by: Joel Correa Da Rosa, Icahn School of Medicine at Mount Sinai, United States
This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics
Reviewed by: Bolin Chen, Northwestern Polytechnical University, China
Jiazhou Chen, South China University of Technology, China
These authors have contributed equally to this work
ISSN: 1664-8021
DOI: 10.3389/fgene.2022.806842