Adapting Differentially Private Synthetic Data to Relational Databases
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 28.05.2024 |
Subjects | |
Summary: | Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. In this paper, we introduce the first-of-its-kind algorithm that can be combined with any existing DP mechanism to generate synthetic relational databases. Our algorithm iteratively refines the relationship between individual synthetic tables to minimize their approximation errors in terms of low-order marginal distributions while maintaining referential integrity. Finally, we provide both DP and theoretical utility guarantees for our algorithm. |
---|---|
DOI: | 10.48550/arxiv.2405.18670 |
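
The summary describes refining the links between individually generated synthetic tables so that low-order cross-table marginals are matched while every foreign key still points at an existing row. The following is only a toy sketch of that general idea, not the algorithm from the paper: it uses a plain local search, and all names (`refine_links`, `marginal`, `l1_error`, the example attributes) are hypothetical. It also assumes the target two-way marginal is already available; in a DP setting that target would be a noisy measurement, and no privacy mechanism is implemented here.

```python
import random
from collections import Counter

def marginal(parents, children, fks):
    """Count (parent attribute, child attribute) pairs induced by the links."""
    return Counter((parents[pid], b) for pid, b in zip(fks, children))

def l1_error(counts, target):
    """L1 distance between the induced marginal and the target marginal."""
    cells = set(counts) | set(target)
    return sum(abs(counts[c] - target.get(c, 0)) for c in cells)

def refine_links(parents, children, target, sweeps=5, seed=0):
    """Hypothetical local search: start from random links, then repeatedly
    re-link single child rows whenever that lowers the L1 error.
    Every link always points at an existing parent id."""
    rng = random.Random(seed)
    parent_ids = sorted(parents)
    fks = [rng.choice(parent_ids) for _ in children]
    counts = marginal(parents, children, fks)
    for _ in range(sweeps):
        improved = False
        for i, b in enumerate(children):
            old = fks[i]
            best_pid, best_err = old, l1_error(counts, target)
            for pid in parent_ids:
                if parents[pid] == parents[old]:
                    continue  # same attribute value, marginal unchanged
                # Tentatively move child i to this parent and score it.
                counts[(parents[old], b)] -= 1
                counts[(parents[pid], b)] += 1
                err = l1_error(counts, target)
                counts[(parents[pid], b)] -= 1
                counts[(parents[old], b)] += 1
                if err < best_err:
                    best_pid, best_err = pid, err
            if best_pid != old:
                counts[(parents[old], b)] -= 1
                counts[(parents[best_pid], b)] += 1
                fks[i] = best_pid
                improved = True
        if not improved:
            break  # no single re-link helps any more
    return fks

# Made-up example: synthetic parent rows (id -> region) and child rows (age group).
parents = {"p1": "urban", "p2": "urban", "p3": "rural"}
children = ["young", "young", "old", "old", "old"]
# Assumed target marginal, consistent with the child table's own counts.
target = {("urban", "young"): 2, ("urban", "old"): 1, ("rural", "old"): 2}

fks = refine_links(parents, children, target)
error = l1_error(marginal(parents, children, fks), target)
```

Because each child row is only ever assigned an id drawn from `parents`, referential integrity holds by construction; the search changes only which existing parent a child points to.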