Gaussian differential privacy
Published in | Journal of the Royal Statistical Society. Series B, Statistical Methodology, Vol. 84, No. 1, pp. 3–37
---|---
Main Authors | , ,
Format | Journal Article
Language | English
Published | Oxford: Oxford University Press, 01.02.2022
Summary: In the past decade, differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analysing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation of differential privacy, which we term 'f-differential privacy' (f-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, f-DP faithfully preserves the hypothesis testing interpretation of differential privacy, thereby making the privacy guarantees easily interpretable. In addition, f-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for the original differential privacy definition to f-DP and, as an application of this technique, obtain a simple and easy-to-interpret theorem of privacy amplification by subsampling for f-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the f-DP class, referred to as 'Gaussian differential privacy' (GDP), defined via hypothesis testing of two shifted Gaussian distributions. GDP is the focal privacy definition among the family of f-DP guarantees due to a central limit theorem for differential privacy that we prove. More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including the original differential privacy definition) converge to GDP in the limit under composition.
We also prove a Berry–Esseen-style version of the central limit theorem, which gives a computationally inexpensive tool for tractably analysing the exact composition of private algorithms. Taken together, this collection of attractive properties renders f-DP a mathematically coherent, analytically tractable and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved analysis of the privacy guarantees of noisy stochastic gradient descent.
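As a concrete illustration of the notions in the summary, the Gaussian trade-off function underlying μ-GDP is G_μ(α) = Φ(Φ⁻¹(1 − α) − μ), the optimal Type II error at Type I error α when testing N(0, 1) against N(μ, 1); the paper's composition theorem states that composing μ₁-, …, μₙ-GDP mechanisms yields √(μ₁² + ⋯ + μₙ²)-GDP. Below is a minimal, illustrative sketch of these two formulas (function names are our own, not from the paper), using only the Python standard library:

```python
from math import sqrt
from statistics import NormalDist

_std = NormalDist()  # standard normal N(0, 1)

def gaussian_tradeoff(alpha: float, mu: float) -> float:
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu): the smallest achievable
    Type II error at Type I error alpha when testing N(0,1) vs N(mu,1)."""
    return _std.cdf(_std.inv_cdf(1 - alpha) - mu)

def compose_gdp(mus) -> float:
    """Composition theorem for GDP: running mu_i-GDP mechanisms in sequence
    gives sqrt(sum of mu_i^2)-GDP overall."""
    return sqrt(sum(m * m for m in mus))
```

For example, `gaussian_tradeoff(alpha, 0.0)` returns `1 - alpha` (μ = 0 means the two hypotheses are indistinguishable, i.e. perfect privacy), and `compose_gdp([1.0, 1.0])` shows that two 1-GDP mechanisms compose to √2-GDP.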
ISSN: 1369-7412; 1467-9868
DOI: 10.1111/rssb.12454