Improving Numerical Reproducibility of Scientific Software in Parallel Systems
Recently, numerical reproducibility has received increased emphasis from the scientific community. Software results that are not reproducible make it difficult to examine the science the software supports. A common source of numerical reproducibility errors in computational science occurs during flo...
Published in | 2020 IEEE International Conference on Electro Information Technology (EIT) pp. 066 - 074 |
---|---|
Main Authors | Sara Faraji Jalal Apostal, David Apostal, Ronald Marsh |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.07.2020 |
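Subjects | Codes; Parallel programming; Reproducibility of results; Runtime; Software; Software algorithms; Software libraries |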
Abstract | Numerical reproducibility has recently received increased emphasis from the scientific community. Software results that are not reproducible make it difficult to examine the science the software supports. A common source of numerical reproducibility errors in computational science is floating-point arithmetic. Finite precision and limited storage for floating-point numbers require computers to truncate and round the results of some math operations, so an approximate value is stored instead of the exact result. One programming idiom that is not always reproducible is the global sum reduction of a distributed array. Changing the number of compute units changes the order in which array elements are added together, which in turn changes the truncation and rounding. This may change the result of individual add operations and of the resulting global sum, because floating-point addition is not always associative. This research improves the numerical reproducibility of scientific applications on parallel systems; its innovative contribution is automating that improvement. Two reproducible global sum reduction functions have been implemented and packaged in a software library. A source code scanner recognizes certain MPI-based global sum reductions that may have reproducibility errors and replaces them with calls to the reproducible library functions. Reproducibility and performance testing have demonstrated the effectiveness of the system. This will extend the usefulness of legacy software and can lead to faster rates of discovery and more efficient use of scientists' time. |
---|---|
Author | Jalal Apostal, Sara Faraji; Marsh, Ronald; Apostal, David |
Author_xml | – sequence: 1 givenname: Sara Faraji surname: Jalal Apostal fullname: Jalal Apostal, Sara Faraji organization: University of North Dakota,Department of Computer Science – sequence: 2 givenname: David surname: Apostal fullname: Apostal, David organization: University of North Dakota,Department of Computer Science – sequence: 3 givenname: Ronald surname: Marsh fullname: Marsh, Ronald organization: University of North Dakota,Department of Computer Science |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/EIT48999.2020.9208338 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
EISBN | 1728153174 9781728153179 |
EISSN | 2154-0373 |
EndPage | 074 |
ExternalDocumentID | 9208338 |
Genre | orig-research |
GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IN AAJGR ABLEC ACGFS ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL RNS |
ID | FETCH-LOGICAL-i250t-59c1501b6ab24fbeb440a9244287826ec8f7d7579c1f667ff38b6e11c31310423 |
IEDL.DBID | RIE |
IngestDate | Wed Jun 26 19:26:31 EDT 2024 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-i250t-59c1501b6ab24fbeb440a9244287826ec8f7d7579c1f667ff38b6e11c31310423 |
OpenAccessLink | https://commons.und.edu/cgi/viewcontent.cgi?article=4097&context=theses |
PageCount | 9 |
ParticipantIDs | ieee_primary_9208338 |
PublicationCentury | 2000 |
PublicationDate | 2020-July |
PublicationDateYYYYMMDD | 2020-07-01 |
PublicationDate_xml | – month: 07 year: 2020 text: 2020-July |
PublicationDecade | 2020 |
PublicationTitle | 2020 IEEE International Conference on Electro Information Technology (EIT) |
PublicationTitleAbbrev | EIT |
PublicationYear | 2020 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001096928 |
Score | 1.8130394 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 066 |
SubjectTerms | Codes; Parallel programming; Reproducibility of results; Runtime; Software; Software algorithms; Software libraries |
Title | Improving Numerical Reproducibility of Scientific Software in Parallel Systems |
URI | https://ieeexplore.ieee.org/document/9208338 |
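
The order dependence described in the abstract can be illustrated with a small sketch. It is not the paper's library or its MPI code, which this record does not describe. The function names `naive_global_sum` and `exact_global_sum` are hypothetical, ranks are simulated by slicing a list, and exact rational accumulation with Python's `fractions.Fraction` stands in for whatever reproducible technique the authors actually implemented.

```python
from fractions import Fraction


def _slices(values, num_ranks):
    """Split values into num_ranks contiguous slices, one per simulated rank."""
    n = len(values)
    bounds = [n * r // num_ranks for r in range(num_ranks + 1)]
    return [values[bounds[r]:bounds[r + 1]] for r in range(num_ranks)]


def _naive_sum(xs):
    """Plain left-to-right floating-point addition, rounding after every add."""
    total = 0.0
    for x in xs:
        total += x
    return total


def naive_global_sum(values, num_ranks):
    """Each simulated rank sums its slice, then the partial sums are added."""
    partials = [_naive_sum(chunk) for chunk in _slices(values, num_ranks)]
    return _naive_sum(partials)


def exact_global_sum(values, num_ranks):
    """Accumulate exactly as rationals and round once at the very end.

    Exact addition is associative, so the result cannot depend on how the
    array was distributed across ranks.
    """
    partials = [
        sum((Fraction(x) for x in chunk), Fraction(0))
        for chunk in _slices(values, num_ranks)
    ]
    return float(sum(partials, Fraction(0)))


values = [1e16, 1.0, -1e16, 1.0]  # the exact sum is 2
for ranks in (1, 2, 4):
    print(ranks, naive_global_sum(values, ranks), exact_global_sum(values, ranks))
```

With IEEE 754 double precision, the naive version returns 1.0 on one simulated rank and 0.0 on two, while the exact version returns 2.0 for any rank count. Exact rational arithmetic is far too slow for production use; it is chosen here only because it makes the order independence easy to verify.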