Multiple Linear Regression: Bayesian Inference for Distributed and Big Data in the Medical Informatics Platform of the Human Brain Project
We propose a Multiple Linear Regression (MLR) methodology for the analysis of distributed and Big Data in the framework of the Medical Informatics Platform (MIP) of the Human Brain Project (HBP). MLR is a very versatile model, and is considered one of the workhorses for estimating dependences betwee...
Published in | bioRxiv |
---|---|
Main Authors | Melie-Garcia, Lester; Draganski, Bogdan; Ashburner, John; Kherif, Ferath |
Format | Paper |
Language | English |
Published | Cold Spring Harbor; Cold Spring Harbor Laboratory Press; 05.01.2018; Cold Spring Harbor Laboratory |
Edition | 1.1 |
Subjects | |
ISSN | 2692-8205 |
DOI | 10.1101/242883 |
Abstract | We propose a Multiple Linear Regression (MLR) methodology for the analysis of distributed and Big Data in the framework of the Medical Informatics Platform (MIP) of the Human Brain Project (HBP). MLR is a very versatile model and is considered one of the workhorses for estimating dependencies between clinical, neuropsychological and neurophysiological variables in the field of neuroimaging. One of the main concepts behind the MIP is to federate data that is stored locally at geographically distributed sites (hospitals, customized databases, etc.) around the world. We refrain from using a single federation node for two main reasons: first, the maintenance of data privacy, and second, the efficient management of large volumes of data in terms of the latency and storage resources needed at the federation node. Given these conditions and the distributed nature of the data, MLR cannot be estimated in the classical way, so the standard algorithms must be modified. We use the Bayesian formalism, which provides the tools necessary to implement the MLR methodology for distributed Big Data. It allows us to account for the heterogeneity of the possible mechanisms that explain the data sets across sites, expressed through different models of explanatory variables. This approach enables the integration of highly heterogeneous data coming from different subjects and hospitals across the globe. Additionally, it offers general and sophisticated ways, extendable to other statistical models, to handle high-dimensional and distributed multimodal data. This work forms part of a series of papers related to the methodological developments embedded in the MIP. |
---|---|
Author | Melie-Garcia, Lester; Draganski, Bogdan; Ashburner, John; Kherif, Ferath |
Copyright | 2018. This article is published under http://creativecommons.org/licenses/by/4.0/ ( the License ). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. 2018, Posted by Cold Spring Harbor Laboratory |
Discipline | Biology |
EISSN | 2692-8205 |
ExternalDocumentID | 242883v1 |
Genre | Working Paper/Pre-Print |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Keywords | Bayesian linear model; Bayesian linear regression; parallel computation; model averaging; linear regression; distributed computation; variational Bayes; Human Brain Project; general linear model; MLR; Bayesian modeling; linear model; multiple linear regression |
License | This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at http://creativecommons.org/licenses/by/4.0 |
ORCID | 0000-0001-7605-2518; 0000-0001-5602-8916; 0000-0002-5159-5919; 0000-0001-5698-0413 |
PageCount | 18 |
PublicationDateYYYYMMDD | 2018-01-05 |
References | 1. Beal (2003). PhD thesis, Gatsby Computational Neuroscience Unit. 2. Broderick, Boyd, Wibisono, Wilson, Jordan (2013). Streaming Variational Bayes. Advances in Neural Information Processing Systems, 1727-1735. 3. Hoeting, Madigan, Raftery, Volinsky (1999). Bayesian model averaging: a tutorial. Stat. Sci., 382-401. 4. Lappalainen, Miskin (2000). Ensemble learning. Advances in Independent Component Analysis. 5. MacKay (1992). Bayesian Interpolation. Neural Comput. 4, 415-447. doi:10.1162/neco.1992.4.3.415. 6. Penny, Mattout, Trujillo-Barreto (2006). Stat. Parametr. Mapp. Anal. Funct. brain images. 7. Penny, Stephan, Daunizeau, Rosa, Friston, Schofield, Leff (2010). Comparing Families of Dynamic Causal Models. PLoS Comput. Biol. 6, e1000709. doi:10.1371/journal.pcbi.1000709. 8. Penny, Trujillo-Barreto, Friston (2005). Bayesian fMRI time series analysis with spatial priors. Neuroimage 24, 350-362. doi:10.1016/j.neuroimage.2004.08.034. 9. Stephan, Marshall, Penny, Friston, Fink (2007). Interhemispheric Integration of Visual Processing during Task-Driven Lateralization. J. NeuroSci., start page 27. 10. Stephan, Penny, Daunizeau, Moran, Friston (2009). Bayesian model selection for group studies. Neuroimage 46, 1004-1017. doi:10.1016/j.neuroimage.2009.03.025. 11. Trujillo-Barreto, Aubert-Vázquez, Penny (2008). Bayesian M/EEG source reconstruction with spatio-temporal priors. Neuroimage 39, 318-335. doi:10.1016/j.neuroimage.2007.07.062. 12. Tzikas, Likas, Galatsanos (2008). The variational approximation for Bayesian inference. IEEE Signal Process. Mag. 25, 131-146. doi:10.1109/MSP.2008.929620. |
SubjectTerms | Bayesian analysis; Big Data; Brain; Health informatics; Hospitals; Informatics; Latency; Mathematical models; Neuroimaging; Neuroscience; Regression analysis; Statistical analysis |
URI | https://www.proquest.com/docview/2071124523 https://www.biorxiv.org/content/10.1101/242883 |
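
The abstract states that data stay at the local sites and that MLR therefore cannot be estimated in the classical, pooled way. A minimal sketch of one general way to do this is shown below. It is not the authors' algorithm (the record's keywords point to variational Bayes and model averaging, which are not reproduced here). It assumes a conjugate Gaussian prior on the regression weights and a known noise precision, and all names (`local_stats`, `federated_posterior`, `prior_precision`, `noise_precision`) are hypothetical. Each site sends only the aggregates X'X and X'y, and the combining step sums them to obtain the same posterior that pooling the rows would give.

```python
import numpy as np


def local_stats(X, y):
    """Computed at each site: aggregate statistics only, no subject-level rows."""
    return X.T @ X, X.T @ y


def federated_posterior(stats, prior_precision=1e-2, noise_precision=1.0):
    """Gaussian posterior over weights from summed per-site statistics.

    Assumes a zero-mean isotropic Gaussian prior and a known noise
    precision (both are simplifying assumptions of this sketch).
    """
    xtx = sum(s[0] for s in stats)
    xty = sum(s[1] for s in stats)
    n_features = xtx.shape[0]
    precision = prior_precision * np.eye(n_features) + noise_precision * xtx
    cov = np.linalg.inv(precision)
    mean = noise_precision * cov @ xty
    return mean, cov


# Synthetic data for three hypothetical sites of different sizes.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
sites = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    sites.append((X, y))

# Federated estimate: combine per-site aggregates.
mean_fed, cov_fed = federated_posterior(
    [local_stats(X, y) for X, y in sites], noise_precision=100.0
)

# Reference estimate: pool all rows in one place (what federation avoids).
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
mean_pooled, cov_pooled = federated_posterior(
    [local_stats(X_all, y_all)], noise_precision=100.0
)

print(np.allclose(mean_fed, mean_pooled), np.allclose(cov_fed, cov_pooled))
```

The two estimates agree because X'X and X'y are additive across sites. Sharing these aggregates reduces what leaves each site, but this sketch makes no claim about formal privacy guarantees, and it does not model the between-site heterogeneity that the abstract describes.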