THE SCALABLE BIRTH-DEATH MCMC ALGORITHM FOR MIXED GRAPHICAL MODEL LEARNING WITH APPLICATION TO GENOMIC DATA INTEGRATION

Bibliographic Details
Published in The annals of applied statistics Vol. 17; no. 3; p. 1958
Main Authors Wang, Nanwei, Massam, Hélène, Gao, Xin, Briollais, Laurent
Format Journal Article
Language English
Published United States 01.09.2023
Abstract Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific types of tissue or cell are available. In cancer research, the challenge is now to perform integrative analyses of high-dimensional multi-omic data with the goal of better understanding genomic processes that correlate with cancer outcomes, e.g., elucidating gene networks that discriminate specific cancer subgroups (cancer subtyping) or discovering gene networks that overlap across different cancer types (pan-cancer studies). In this paper, we propose a novel mixed graphical model approach to analyze multi-omic data of different types (continuous, discrete and count) and perform model selection by extending the Birth-Death MCMC (BDMCMC) algorithm initially proposed by Stephens (2000) and later developed by Mohammadi and Wit (2015). We compare the performance of our method to the LASSO method and the standard BDMCMC method using simulations and find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results. Finally, an application to the TCGA breast cancer data shows that integrating genomic information at different levels (mutation and expression data) leads to better subtyping of breast cancers.
Author_xml – sequence: 1
  givenname: Nanwei
  surname: Wang
  fullname: Wang, Nanwei
  organization: Department of Mathematics and Statistics, University of New Brunswick, Toronto, Canada
– sequence: 2
  givenname: Hélène
  surname: Massam
  fullname: Massam, Hélène
  organization: Department of Mathematics and Statistics, York University, Toronto, Canada
– sequence: 3
  givenname: Xin
  surname: Gao
  fullname: Gao, Xin
  organization: Department of Mathematics and Statistics, York University, Toronto, Canada
– sequence: 4
  givenname: Laurent
  surname: Briollais
  fullname: Briollais, Laurent
  organization: Lunenfeld-Tanenbaum Research Institute of Sinai Health System and Dalla Lana School of Public Health (Biostatistics), University of Toronto, Toronto, Canada
DOI 10.1214/22-aoas1701
Discipline Mathematics
GrantInformation_xml – fundername: NCATS NIH HHS
  grantid: KL2 TR001874
– fundername: NCI NIH HHS
  grantid: U01 CA164920
ISSN 1932-6157
Keywords Genomic integration
SBDMCMC
TCGA
Mixed graphical models
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/10569451
PMID 37830084