Differentially Private SQL with Bounded User Contribution

Bibliographic Details
Published in: Proceedings on Privacy Enhancing Technologies, Vol. 2020, no. 2, pp. 230-250
Main Authors: Wilson, Royce J; Zhang, Celia Yuxin; Lam, William; Desfontaines, Damien; Simmons-Marengo, Daniel; Gipson, Bryant
Format: Journal Article
Language: English
Published: Sciendo, 01.04.2020
Summary: Differential privacy (DP) provides formal guarantees that the output of a database query does not reveal too much information about any individual present in the database. While many differentially private algorithms have been proposed in the scientific literature, there are only a few end-to-end implementations of differentially private query engines. Crucially, existing systems assume that each individual is associated with at most one database record, which is unrealistic in practice. We propose a generic and scalable method to perform differentially private aggregations on databases, even when individuals can each be associated with arbitrarily many rows. We express this method as an operator in relational algebra, and implement it in an SQL engine. To validate this system, we test the utility of typical queries on industry benchmarks, and verify its correctness with a stochastic test framework we developed. We highlight the promises and pitfalls learned when deploying such a system in practice, and we publish its core components as open-source software.
ISSN: 2299-0984
DOI: 10.2478/popets-2020-0025
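
Illustration: The summary's central idea, aggregating privately when one individual may own many rows, can be sketched for a single COUNT. The Python below is a minimal sketch under stated assumptions, not the paper's relational operator or its open-source code: the names `bounded_user_count`, `dp_count` and `max_rows_per_user` are hypothetical, rows are assumed to be `(user_id, value)` pairs, and noise comes from a plain floating-point Laplace sampler that is not hardened for production use. Each user's contribution is capped first, so adding or removing one user changes the count by at most the cap, and the Laplace noise is scaled to that cap.

```python
import random
from collections import Counter


def bounded_user_count(rows, max_rows_per_user):
    """Count rows after capping each user's contribution (hypothetical helper)."""
    if max_rows_per_user < 1:
        raise ValueError("max_rows_per_user must be at least 1")
    # rows are assumed to be (user_id, value) pairs.
    rows_per_user = Counter(user_id for user_id, _ in rows)
    return sum(min(n, max_rows_per_user) for n in rows_per_user.values())


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    # Two i.i.d. exponentials with mean `scale`; their difference is Laplace.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(rows, epsilon, max_rows_per_user, rng=None):
    """Noisy count whose user-level sensitivity is max_rows_per_user."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    bounded = bounded_user_count(rows, max_rows_per_user)
    # One user can shift the bounded count by at most max_rows_per_user,
    # so the Laplace scale is sensitivity / epsilon.
    return bounded + laplace_noise(max_rows_per_user / epsilon, rng)


# Example data: alice owns 5 rows, bob 2, carol 1.
example_rows = [("alice", 1)] * 5 + [("bob", 1)] * 2 + [("carol", 1)]
noisy = dp_count(example_rows, epsilon=1.0, max_rows_per_user=3)
```

With a cap of 3, alice's five rows count as three, so the bounded count is 6 before noise. A smaller cap lowers the noise scale but discards more data from heavy contributors; a larger cap does the reverse.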