THE SCALABLE BIRTH-DEATH MCMC ALGORITHM FOR MIXED GRAPHICAL MODEL LEARNING WITH APPLICATION TO GENOMIC DATA INTEGRATION
Published in | The annals of applied statistics Vol. 17; no. 3; p. 1958 |
---|---|
Main Authors | Nanwei Wang, Hélène Massam, Xin Gao, Laurent Briollais |
Format | Journal Article |
Language | English |
Published | United States, 01.09.2023 |
Abstract | Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific types of tissue or cell are available. In cancer research, the challenge is now to perform integrative analyses of high-dimensional multi-omic data with the goal of better understanding genomic processes that correlate with cancer outcomes, e.g., elucidating gene networks that discriminate specific cancer subgroups (cancer subtyping) or discovering gene networks that overlap across different cancer types (pan-cancer studies). In this paper, we propose a novel mixed graphical model approach to analyze multi-omic data of different types (continuous, discrete and count) and perform model selection by extending the Birth-Death MCMC (BDMCMC) algorithm initially proposed by Stephens (2000) and later developed by Mohammadi and Wit (2015). We compare the performance of our method to the LASSO method and the standard BDMCMC method using simulations and find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results. Finally, an application to the TCGA breast cancer data shows that integrating genomic information at different levels (mutation and expression data) leads to better subtyping of breast cancers. |
---|---|
Author | Gao, Xin; Wang, Nanwei; Briollais, Laurent; Massam, Hélène |
Author_xml | 1. Wang, Nanwei (Department of Mathematics and Statistics, University of New Brunswick, Toronto, Canada); 2. Massam, Hélène (Department of Mathematics and Statistics, York University, Toronto, Canada); 3. Gao, Xin (Department of Mathematics and Statistics, York University, Toronto, Canada); 4. Briollais, Laurent (Lunenfeld-Tanenbaum Research Institute of Sinai Health System and Dalla Lana School of Public Health (Biostatistics), University of Toronto, Toronto, Canada) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37830084$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_3390_math13010065 |
ContentType | Journal Article |
DOI | 10.1214/22-aoas1701 |
Discipline | Mathematics |
ExternalDocumentID | 37830084 |
Genre | Journal Article |
GrantInformation_xml | NCATS NIH HHS (grant KL2 TR001874); NCI NIH HHS (grant U01 CA164920) |
ISSN | 1932-6157 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Keywords | Genomic integration; SBDMCMC; TCGA; Mixed graphical models |
Language | English |
OpenAccessLink | https://www.ncbi.nlm.nih.gov/pmc/articles/10569451 |
PMID | 37830084 |
PublicationDate | 2023-Sep |
PublicationDateYYYYMMDD | 2023-09-01 |
PublicationPlace | United States |
PublicationTitle | The annals of applied statistics |
PublicationTitleAlternate | Ann Appl Stat |
PublicationYear | 2023 |
StartPage | 1958 |
Title | THE SCALABLE BIRTH-DEATH MCMC ALGORITHM FOR MIXED GRAPHICAL MODEL LEARNING WITH APPLICATION TO GENOMIC DATA INTEGRATION |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37830084 |
Volume | 17 |