Statistically Valid Inferences from Privacy-Protected Data

Bibliographic Details
Published in: The American Political Science Review, Vol. 117, no. 4, pp. 1275-1290
Main Authors: Georgina Evans, Gary King, Margaret Schwenzfeier, Abhradeep Thakurta
Format: Journal Article
Language: English
Published: New York, USA: Cambridge University Press, 01.11.2023
ISSN: 0003-0554, 1537-5943
DOI: 10.1017/S0003055422001411

Summary: Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of privacy concerns. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for research subjects, and statistical validity guarantees for researchers seeking social science insights. We build on the standard of “differential privacy,” correct for biases induced by the privacy-preserving procedures, provide a proper accounting of uncertainty, and impose minimal constraints on the choice of statistical methods and quantities estimated. We illustrate by replicating key analyses from two recent published articles and show how we can obtain approximately the same substantive results while simultaneously protecting privacy. Our approach is simple to use and computationally efficient; we also offer open-source software that implements all our methods.