Fast and Efficient Compression of Floating-Point Data

Bibliographic Details
Published in	IEEE Transactions on Visualization and Computer Graphics, Vol. 12, no. 5, pp. 1245-1250
Main Authors	Lindstrom, P., Isenburg, M.
Format	Journal Article
Language	English
Published	United States IEEE 01.09.2006 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Analytical models; Bandwidth; Central processing units; Compressing; Computer simulation; Data compression; Data visualization; Entropy; fast entropy coding; file compaction for I/O efficiency; File systems; Floating point arithmetic; High throughput; Image coding; Integers; large scale simulation and visualization; Large-scale systems; lossless compression; predictive coding; Predictive models; range coder; Studies; Throughput; Visualization
Summary:	Large-scale scientific simulation codes typically run on a cluster of CPUs that write/read time steps to/from a single file system. As data sets are constantly growing in size, this increasingly leads to I/O bottlenecks. When the rate at which data is produced exceeds the available I/O bandwidth, the simulation stalls and the CPUs are idle. Data compression can alleviate this problem by using some CPU cycles to reduce the amount of data that needs to be transferred. Most compression schemes, however, are designed to operate offline and seek to maximize compression, not throughput. Furthermore, they often require quantizing floating-point values onto a uniform integer grid, which disqualifies their use in applications where exact values must be retained. We propose a simple scheme for lossless, online compression of floating-point data that transparently integrates into the I/O of many applications. A plug-in scheme for data-dependent prediction makes our scheme applicable to a wide variety of data used in visualization, such as unstructured meshes, point sets, images, and voxel grids. We achieve state-of-the-art compression rates and speeds, the latter in part due to an improved entropy coder. We demonstrate that this significantly accelerates I/O throughput in real simulation runs. Unlike previous schemes, our method also adapts well to variable-precision floating-point and integer data.
ISSN:	1077-2626, 1941-0506
DOI:	10.1109/TVCG.2006.143
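
Illustration:	The summary mentions lossless predictive coding of floating-point values without quantization. The record does not describe the authors' actual predictor or entropy coder, so the following is only a minimal sketch of the general idea under stated assumptions: each 64-bit float is mapped to an integer that preserves ordering, a hypothetical "previous value" predictor is used, and the integer residuals are kept exactly. All function names are made up for this illustration, and the entropy-coding stage (a range coder, according to the subject terms) is not shown.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF
SIGN = 1 << 63


def float_to_ordered_int(x: float) -> int:
    """Map a 64-bit float to an unsigned int whose order matches float order."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Negative floats: flip every bit. Non-negative floats: set the sign bit.
    return bits ^ MASK64 if bits & SIGN else bits | SIGN


def ordered_int_to_float(u: int) -> float:
    """Exact inverse of float_to_ordered_int."""
    bits = u & ~SIGN if u & SIGN else u ^ MASK64
    (x,) = struct.unpack("<d", struct.pack("<Q", bits))
    return x


def encode(values):
    """Return integer residuals against a previous-value predictor (assumed)."""
    residuals = []
    prev = float_to_ordered_int(0.0)  # assumed initial prediction
    for v in values:
        cur = float_to_ordered_int(v)
        residuals.append(cur - prev)  # small when neighbours are close
        prev = cur
    return residuals


def decode(residuals):
    """Rebuild the original floats bit for bit from the residuals."""
    values = []
    prev = float_to_ordered_int(0.0)
    for r in residuals:
        cur = prev + r
        values.append(ordered_int_to_float(cur))
        prev = cur
    return values


if __name__ == "__main__":
    data = [1.0, 1.0000001, 1.0000002, -3.5, -0.0, 0.1 + 0.2]
    res = encode(data)
    restored = decode(res)
    # Compare raw bytes so that -0.0 and 0.0 are distinguished.
    same = all(struct.pack("<d", a) == struct.pack("<d", b)
               for a, b in zip(data, restored))
    print("lossless round trip:", same)
    print("residual bit lengths:", [abs(r).bit_length() for r in res])
```

Because no value is quantized, the round trip is exact. Nearby values produce residuals that need far fewer than 64 bits, which is what a subsequent entropy coder would exploit.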