Yahtzee: An Anonymized Group Level Matching Procedure


Bibliographic Details
Published in PLoS ONE Vol. 8; no. 2; p. e55760
Main Authors Jones, Jason J.; Bond, Robert M.; Fariss, Christopher J.; Settle, Jaime E.; Kramer, Adam D. I.; Marlow, Cameron; Fowler, James H.
Format Journal Article
Language English
Published United States: Public Library of Science (PLoS), 05.02.2013
Summary: Researchers often face the problem of needing to protect the privacy of subjects while also needing to integrate data that contains personal information from diverse data sources. The advent of computational social science and the enormous amount of data about people that is being collected makes protecting the privacy of research subjects ever more important. However, strict privacy procedures can hinder the process of joining diverse sources of data that contain information about specific individual behaviors. In this paper we present a procedure to keep information about specific individuals from being "leaked" or shared in either direction between two sources of data without need of a trusted third party. To achieve this goal, we randomly assign individuals to anonymous groups before combining the anonymized information between the two sources of data. We refer to this method as the Yahtzee procedure, and show that it performs as predicted by theoretical analysis when we apply it to data from Facebook and public voter records.
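
The group-level idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that both data holders agree on a shared salt and a number of groups, and it uses a salted hash as a stand-in for the random assignment the summary mentions. The names group_id, group_totals and join_on_group, the record format, and the parameters are all hypothetical. Each side aggregates its own records per group, and only those group totals are combined.

```python
import hashlib
from collections import defaultdict


def group_id(identifier: str, shared_salt: str, n_groups: int) -> int:
    """Map an identifier to a group number in [0, n_groups).

    Assumption: a salted SHA-256 hash stands in for random assignment,
    so both data holders place the same person in the same group
    without exchanging the identifier itself.
    """
    digest = hashlib.sha256((shared_salt + "|" + identifier).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_groups


def group_totals(records: dict, shared_salt: str, n_groups: int) -> dict:
    """Aggregate one data source to group level.

    records maps an identifier to a numeric value (hypothetical format).
    Returns {group: (member_count, value_sum)}; no identifiers leave this function.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for identifier, value in records.items():
        g = group_id(identifier, shared_salt, n_groups)
        totals[g][0] += 1
        totals[g][1] += value
    return {g: (count, total) for g, (count, total) in totals.items()}


def join_on_group(totals_a: dict, totals_b: dict) -> dict:
    """Combine the two sources on group number only."""
    return {g: (totals_a[g], totals_b[g]) for g in totals_a.keys() & totals_b.keys()}


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    site_a = {"alice|1980-01-01": 1, "bob|1975-05-05": 0}
    site_b = {"alice|1980-01-01": 1, "carol|1990-09-09": 1}
    a = group_totals(site_a, shared_salt="example-salt", n_groups=4)
    b = group_totals(site_b, shared_salt="example-salt", n_groups=4)
    print(join_on_group(a, b))
```

In a sketch like this, the number of groups controls the trade-off: fewer, larger groups reveal less about any one person but make the group-level comparison coarser. A salted hash alone is not a privacy guarantee, so this should be read as an illustration of the data flow rather than a vetted protocol.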
Competing Interests: Two of the authors on this paper are employed by a commercial company, Facebook Inc. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials. The official data sharing policy at Facebook is that they will work with researchers who want to replicate published findings. An earlier version of the paper is publicly available here: http://arxiv.org/abs/1112.1038. The authors have declared that no competing interests exist.
Conceived and designed the experiments: JJJ RMB CJF JES ADIK CM JHF. Performed the experiments: JJJ RMB CJF JES ADIK CM JHF. Analyzed the data: JJJ RMB CJF JES ADIK CM JHF. Wrote the paper: JJJ RMB CJF JES ADIK CM JHF.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0055760