Data Compression Accelerator on IBM POWER9 and z15 Processors : Industrial Product
Lossless data compression is highly desirable in enterprise and cloud environments for storage and memory cost savings and improved utilization I/O and network. While the value provided by compression is recognized, its application in practice is often limited because it's a processor intensive...
Saved in:
Published in | 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) pp. 1 - 14 |
---|---|
Main Authors | , , , , , , , , , , , |
Format | Conference Proceeding |
Language | English |
Published |
IEEE
01.05.2020
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | Lossless data compression is highly desirable in enterprise and cloud environments for storage and memory cost savings and improved utilization I/O and network. While the value provided by compression is recognized, its application in practice is often limited because it's a processor intensive operation resulting low throughput and high elapsed time for compression intense workloads.The IBM POWER9 and IBM z15 systems overcome the shortcomings of existing approaches by including a novel on-chip integrated data compression accelerator. The accelerator reduces processor cycles, I/O traffic, memory and storage footprint of many applications practically with zero hardware cost. The accelerator also eliminates the cost and I/O slots that would have been necessary with FPGA/ASIC based compression adapters. On the POWER9 chip, a single accelerator uses less than 0.5% of the processor chip area, but provides a 388x speedup factor over the zlib compression software running on a general-purpose core and provides a 13x speedup factor over the entire chip of cores. On a POWER9 system, the accelerators provide an end-to-end 23% speedup to Apache Spark TPC-DS workload compared to the software baseline. The z15 chip doubles the compression rate of POWER9 resulting in even much higher speedup factors over the compression software running on general-purpose cores. On a maximally configured z15 system topology, on-chip compression accelerators provide up to 280 GB/s data compression rate, the highest in the industry. Overall, the on-chip accelerators significantly advance the state of the art in terms of area, throughput, latency, compression ratio, reduced processor utilization, power/energy efficiency, and integration into the system stack.This paper describes the architecture, and novel elements of the POWER9 and z15 compression/decompression accelerators with emphasis on trade-offs that made the on-chip implementation possible. |
---|---|
DOI: | 10.1109/ISCA45697.2020.00012 |