Enabling zero knowledge proof by accelerating zk-SNARK kernels on GPU
Published in | Journal of Parallel and Distributed Computing, Vol. 173, pp. 20-31 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Elsevier Inc, 01.03.2023 |
Summary: | As a recent cryptographic protocol, the Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARK) allows one party to prove that it possesses certain information without revealing that information to untrusted verifiers. This mechanism makes it possible to construct and verify proofs of information integrity without leaking private data. However, the computation kernels of zk-SNARK are computationally expensive and become a significant performance bottleneck as data volumes and security requirements grow. In this paper, we use the Graphics Processing Unit (GPU) to improve zk-SNARK efficiency by accelerating the most time-consuming computation kernels, modular multiplication and the Number-Theoretic Transform (NTT)/Inverse Number-Theoretic Transform (INTT) in Elliptic Curve Cryptography (ECC) pairing, with two major improvements: (1) an interval-limb multiply-add quaternary operation that directly accelerates ECC pairing by taking full advantage of the information entropy within the limited hardware bit width; (2) data layout and shuffle methods in GPU global memory and shared memory that maintain data space consistency and accelerate NTT/INTT, which indirectly benefits ECC pairing. To the best of our knowledge, our work is the first to explore these improvements on GPU. The measured results show that our methods accelerate modular multiplication and NTT/INTT by 1.22× and 4.67×, respectively, compared with the previous GPU implementation. With these accelerated kernels, we achieve a 3.14× speedup for Groth16, the most efficient zk-SNARK implementation working on the BLS12-381 ECC field. With the bottleneck tackled, our work will expand the deployment scenarios of zk-SNARK in Zero Knowledge Proof (ZKP).
• To accelerate zk-SNARK, we propose one mathematical scheme and its implementation.
• We improve the Montgomery algorithm for modular exponentiation and NTT/INTT as ECC kernels.
• Our mathematical scheme makes good use of the hardware bit width.
• We propose a large-data transposition method for NTT/INTT parallelism in GPU memory.
• The improved zk-SNARK and the multiplication-reduction-based kernels are shown to be effective. |
---|---|
ISSN: | 0743-7315 1096-0848 |
DOI: | 10.1016/j.jpdc.2022.10.009 |
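
The two kernels named in the summary, Montgomery modular multiplication and NTT/INTT, can be illustrated with a minimal CPU-side sketch in Python. This is not the paper's GPU implementation and does not reproduce its interval-limb or data-layout schemes. The prime `P`, the primitive root `G`, the 32-bit word size used in the usage example, and all function names are assumptions chosen for illustration; the BLS12-381 fields are far larger and need multi-limb arithmetic.

```python
# Toy parameters (assumptions, not taken from the paper).
P = 998244353   # NTT-friendly prime: P - 1 = 119 * 2**23
G = 3           # a primitive root modulo P


def mont_setup(n, r_bits):
    """Return n_prime with n * n_prime ≡ -1 (mod 2**r_bits); n must be odd."""
    r = 1 << r_bits
    return (-pow(n, -1, r)) % r


def mont_mul(a_bar, b_bar, n, r_bits, n_prime):
    """Montgomery product: a_bar * b_bar * R^-1 mod n, with R = 2**r_bits > n."""
    mask = (1 << r_bits) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask   # makes t + m*n divisible by R
    u = (t + m * n) >> r_bits           # exact division by R
    return u - n if u >= n else u       # single conditional subtraction


def to_mont(a, n, r_bits):
    """Map a to Montgomery form a * R mod n."""
    return (a << r_bits) % n


def from_mont(a_bar, n, r_bits, n_prime):
    """Map a Montgomery-form value back to the ordinary residue."""
    return mont_mul(a_bar, 1, n, r_bits, n_prime)


def ntt(values, invert=False):
    """Iterative radix-2 NTT modulo P; len(values) must be a power of two."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation so butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(G, (P - 1) // length, P)   # primitive length-th root of unity
        if invert:
            w_len = pow(w_len, -1, P)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % P
                a[start + k] = (u + v) % P
                a[start + k + half] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, -1, P)                  # INTT scales by 1/n
        a = [x * n_inv % P for x in a]
    return a


# Usage: multiply (1 + 2x) by (3 + 4x) via pointwise products in the NTT domain.
fa, fb = ntt([1, 2, 0, 0]), ntt([3, 4, 0, 0])
product = ntt([x * y % P for x, y in zip(fa, fb)], invert=True)

# Usage: one modular multiplication carried out in Montgomery form.
R_BITS = 32
N_PRIME = mont_setup(P, R_BITS)
x_bar = to_mont(123456789, P, R_BITS)
y_bar = to_mont(987654321, P, R_BITS)
xy = from_mont(mont_mul(x_bar, y_bar, P, R_BITS, N_PRIME), P, R_BITS, N_PRIME)
```

In this sketch the inner NTT loop uses Python's `%` operator for clarity; a GPU kernel would typically keep values in Montgomery form and replace each `%` with a reduction like `mont_mul`, which is why the two kernels are usually optimized together.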