Countering reproducibility issues in mathematical models with software engineering techniques: A case study using a one-dimensional mathematical model of the atrioventricular node

Bibliographic Details
Published in bioRxiv
Main Authors Schölzel, Christopher; Blesius, Valeria; Ernst, Gernot; Goesmann, Alexander; Dominik, Andreas
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 01.03.2021

Abstract

One should assume that in silico experiments in systems biology are less susceptible to reproducibility issues than their wet-lab counterparts, because they are free from natural biological variations and their environment can be fully controlled. However, recent studies show that only half of the published mathematical models of biological systems can be reproduced without substantial effort. In this article we examine the potential causes for failed or cumbersome reproductions in a case study of a one-dimensional mathematical model of the atrioventricular node, which took us four months to reproduce. The model features almost all common types of reproducibility issues including missing information, errors in equations and parameters, a lack of available data files, non-executable code, missing or incomplete experiment protocols, and missing semantic information about the rationale behind equations.

Many of these issues seem similar to problems that have already been solved in software engineering using techniques such as unit testing, regression tests, continuous integration, version control, archival services, and a thorough modular design with extensive documentation. Applying these techniques, we reimplement the examined model using the modeling language Modelica. The resulting workflow can be applied to any mathematical model. It guarantees methods reproducibility by executing automated tests in a virtual machine on a server that is physically separated from the development environment. Additionally, it facilitates results reproducibility, because the model is more understandable and because the complete model code, experiment protocols, and simulation data are published and can be accessed in the exact version that was used in this article.

While the increased attention to design aspects and documentation required considerable effort, we found it justified, even just considering the immediate benefits during development such as easier and faster debugging, increased understandability of equations, and a reduced requirement for looking up details from the literature.

Author summary

Reproducibility is one of the cornerstones of the scientific method. In order to draw reliable conclusions, an experiment must yield the same results when it is repeated using the same methods. However, biological systems are complex, which makes experiments cumbersome. It is therefore desirable to build a mathematical representation of the biological system, which captures its essential behavior in a set of variables and equations and allows for easier and faster experimentation. Unfortunately, recent studies have shown that half of the published mathematical models are not immediately reproducible due to missing information, mathematical errors, and incomplete documentation. These issues are similar to those faced in software engineering: A single missing file or a buggy line of code can render any kind of software useless. Software engineering has turned to rigorous software testing, automated development pipelines, and version control systems to overcome these challenges, but these techniques are not yet widely applied to mathematical modeling. In this paper we demonstrate their benefit for the reproducibility of a large mathematical model of the atrioventricular node. The software engineering solutions that we employ can be applied to any mathematical model and could therefore facilitate scientific progress by encouraging and simplifying model reuse.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Corrects author information and adds supplement.
* https://github.com/CSchoel/inamo
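The unit and regression testing mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' Modelica workflow and does not model the atrioventricular node: it assumes a toy decay model dx/dt = -k*x, and the names `simulate_decay`, `max_abs_error`, and `matches_reference` are hypothetical helpers introduced only for this illustration. The idea is that a unit test checks the simulation against a known analytic result, while a regression test checks it against an archived reference trajectory.

```python
import math


def simulate_decay(x0, k, dt, steps):
    """Integrate the toy model dx/dt = -k*x with the classical RK4 scheme.

    Returns the list of states, starting with the initial value x0.
    (Toy stand-in for a real model; an assumption of this sketch.)
    """
    def rhs(x):
        return -k * x

    xs = [x0]
    x = x0
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        xs.append(x)
    return xs


def max_abs_error(a, b):
    """Largest pointwise absolute difference between two trajectories."""
    return max(abs(p - q) for p, q in zip(a, b))


def matches_reference(trajectory, reference, tol):
    """Regression check: same length and pointwise agreement within tol."""
    if len(trajectory) != len(reference):
        return False
    return max_abs_error(trajectory, reference) <= tol


def test_unit_against_analytic_solution():
    # Unit test: the numerical result must stay close to x0 * exp(-k*t).
    traj = simulate_decay(x0=1.0, k=1.0, dt=0.1, steps=10)
    exact = [math.exp(-0.1 * i) for i in range(11)]
    assert max_abs_error(traj, exact) < 1e-6


def test_regression_against_reference():
    # Regression test: in a real project the reference would be loaded from
    # an archived, version-controlled data file; here it is regenerated in
    # place, which is an assumption made to keep the sketch self-contained.
    reference = simulate_decay(x0=1.0, k=1.0, dt=0.1, steps=10)
    current = simulate_decay(x0=1.0, k=1.0, dt=0.1, steps=10)
    assert matches_reference(current, reference, tol=1e-12)
```

Run on every commit by a continuous integration server, checks of this kind would flag an accidentally changed parameter or equation as soon as it is introduced.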
Copyright 2021. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1101/2021.02.19.431951
Genre Working Paper/Pre-Print
DOI Open Access Yes
Open Access Yes
Peer Reviewed No
Scholarly No
Subjects Automation; Case studies; Computer programs; Documentation; Experiments; Mathematical models; Reproducibility; Software; Software engineering
URI https://www.proquest.com/docview/2505151464