Analyzing the Impact of Lossy Compressor Variability on Checkpointing Scientific Simulations

Bibliographic Details
Published in: 2019 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1-5
Main Authors: Triantafyllides, Pavlo; Reza, Tasmia; Calhoun, Jon C.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.09.2019
Summary: Lossy compression algorithms are effective tools to reduce the size of high-performance computing data sets. As established lossy compressors such as SZ and ZFP evolve, they seek to improve compression/decompression bandwidth and the compression ratio. Algorithm improvements may alter the spatial distribution of errors in the compressed data even when using the same error bound and error bound type. If HPC applications are to compute on lossy compressed data, application users require an understanding of how the performance and the spatial distribution of error change. We explore how the spatial distribution of error, the compression/decompression bandwidth, and the compression ratio change for HPC data sets from the applications PlasComCM and Nek5000 across multiple versions of SZ and ZFP. In addition, we explore how the spatial distribution of error impacts application correctness when restarting from lossy compressed checkpoints. We verify that known approaches to selecting error tolerances for lossy compressed checkpointing are robust both to compressor selection and to changes in the distribution of error.
ISSN: 2168-9253
DOI: 10.1109/CLUSTER.2019.8891052
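
The summary refers to three measurable quantities: the spatial distribution of pointwise error under a fixed error bound, the compression ratio, and the compression bandwidth. As a rough, hypothetical sketch of how such quantities are typically computed (not taken from the paper; the arrays, sizes, and timings below are placeholders standing in for output from a compressor such as SZ or ZFP), in Python/NumPy:

```python
import numpy as np

# Placeholder data before and after a lossy round trip; in practice
# `decompressed` would come from decompressing SZ- or ZFP-compressed data.
rng = np.random.default_rng(0)
original = rng.standard_normal((64, 64))
decompressed = original + rng.uniform(-1e-3, 1e-3, original.shape)

# Pointwise (spatial) error field and summary statistics.
err = np.abs(original - decompressed)
abs_bound = 1e-3  # an absolute error bound, one common error-bound type
print("max error:", err.max())
print("mean error:", err.mean())
print("bound respected:", bool(err.max() <= abs_bound))

# Compression ratio and bandwidth from measured sizes and times
# (the byte counts and timing here are illustrative, not measured).
original_bytes = original.nbytes
compressed_bytes = 4096
compress_seconds = 0.002
print("compression ratio:", original_bytes / compressed_bytes)
print("compress bandwidth (MB/s):", original_bytes / compress_seconds / 1e6)
```

The check against `abs_bound` corresponds to an absolute error-bound mode; SZ and ZFP also offer other bound types, and the paper's point is that even with the same bound and bound type, different compressor versions can distribute the error differently in space.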