Countering reproducibility issues in mathematical models with software engineering techniques: A case study using a one-dimensional mathematical model of the atrioventricular node
Published in | bioRxiv |
---|---|
Main Authors | Schölzel, Christopher; Blesius, Valeria; Ernst, Gernot; Goesmann, Alexander; Dominik, Andreas |
Format | Paper |
Language | English |
Published | Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 01.03.2021 |
Subjects | |
Abstract | One might assume that in silico experiments in systems biology are less susceptible to reproducibility issues than their wet-lab counterparts, because they are free from natural biological variations and their environment can be fully controlled. However, recent studies show that only half of the published mathematical models of biological systems can be reproduced without substantial effort. In this article we examine the potential causes for failed or cumbersome reproductions in a case study of a one-dimensional mathematical model of the atrioventricular node, which took us four months to reproduce. The model features almost all common types of reproducibility issues, including missing information, errors in equations and parameters, a lack of available data files, non-executable code, missing or incomplete experiment protocols, and missing semantic information about the rationale behind equations. Many of these issues seem similar to problems that have already been solved in software engineering using techniques such as unit testing, regression tests, continuous integration, version control, archival services, and a thorough modular design with extensive documentation. Applying these techniques, we reimplement the examined model using the modeling language Modelica. The resulting workflow can be applied to any mathematical model. It guarantees methods reproducibility by executing automated tests in a virtual machine on a server that is physically separated from the development environment. Additionally, it facilitates results reproducibility, because the model is more understandable and because the complete model code, experiment protocols, and simulation data are published and can be accessed in the exact version that was used in this article. While the increased attention to design aspects and documentation required considerable effort, we found it justified, even just considering the immediate benefits during development such as easier and faster debugging, increased understandability of equations, and a reduced requirement for looking up details from the literature. |
---|---|
Author summary | Reproducibility is one of the cornerstones of the scientific method. In order to draw reliable conclusions, an experiment must yield the same results when it is repeated using the same methods. However, biological systems are complex, which makes experiments cumbersome. It is therefore desirable to build a mathematical representation of the biological system, which captures its essential behavior in a set of variables and equations and allows for easier and faster experimentation. Unfortunately, recent studies have shown that half of the published mathematical models are not immediately reproducible due to missing information, mathematical errors, and incomplete documentation. These issues are similar to those faced in software engineering: a single missing file or a buggy line of code can render any kind of software useless. Software engineering has turned to rigorous software testing, automated development pipelines, and version control systems to overcome these challenges, but these techniques are not yet widely applied to mathematical modeling. In this paper we demonstrate their benefit for the reproducibility of a large mathematical model of the atrioventricular node. The software engineering solutions that we employ can be applied to any mathematical model and could therefore facilitate scientific progress by encouraging and simplifying model reuse. |
Competing Interest Statement | The authors have declared no competing interest. |
Footnotes | Corrects author information and adds supplement. https://github.com/CSchoel/inamo |
Author | Schölzel, Christopher; Blesius, Valeria; Ernst, Gernot; Goesmann, Alexander; Dominik, Andreas |
ContentType | Paper |
Copyright | 2021. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1101/2021.02.19.431951 |
Genre | Working Paper/Pre-Print |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PublicationCentury | 2000 |
PublicationDate | 20210301 |
PublicationDateYYYYMMDD | 2021-03-01 |
PublicationDecade | 2020 |
PublicationPlace | Cold Spring Harbor |
PublicationTitle | bioRxiv |
PublicationYear | 2021 |
Publisher | Cold Spring Harbor Laboratory Press |
SubjectTerms | Automation; Case studies; Computer programs; Documentation; Experiments; Mathematical models; Reproducibility; Software; Software engineering |
Title | Countering reproducibility issues in mathematical models with software engineering techniques: A case study using a one-dimensional mathematical model of the atrioventricular node |
URI | https://www.proquest.com/docview/2505151464 |
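
The abstract names unit tests and regression tests as techniques for keeping a mathematical model reproducible. A minimal sketch of that idea follows, in Python rather than the Modelica used by the authors. Everything in it is an illustrative assumption and not taken from the paper: the toy model (exponential decay), the function names `simulate_decay` and `max_abs_error`, the stored values in `REFERENCE`, and the tolerances.

```python
import math

# Hypothetical stored reference output for x0=1.0, rate=0.5, dt=0.1, 3 steps.
# In a real project this would be loaded from a versioned data file.
REFERENCE = [1.0, 0.95, 0.9025, 0.857375]


def simulate_decay(x0, rate, dt, steps):
    """Integrate dx/dt = -rate * x with the explicit Euler method."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * (-rate * xs[-1]))
    return xs


def max_abs_error(result, reference):
    """Largest pointwise deviation between two trajectories of equal length."""
    if len(result) != len(reference):
        raise ValueError("trajectories differ in length")
    return max(abs(a - b) for a, b in zip(result, reference))


def test_unit_matches_analytic_solution():
    # Unit test: with a small step size the numerical result should be
    # close to the known analytic solution x(t) = x0 * exp(-rate * t).
    xs = simulate_decay(1.0, 0.5, 0.001, 1000)
    assert abs(xs[-1] - math.exp(-0.5)) < 1e-3


def test_regression_against_reference():
    # Regression test: the output must not drift from the stored reference.
    xs = simulate_decay(1.0, 0.5, 0.1, 3)
    assert max_abs_error(xs, REFERENCE) < 1e-12


if __name__ == "__main__":
    test_unit_matches_analytic_solution()
    test_regression_against_reference()
    print("all tests passed")
```

Running such tests automatically on a separate machine for every committed change is one way to obtain the kind of check the abstract describes; the concrete setup used by the authors is documented in their repository.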