An examination of data confidentiality and disclosure issues related to publication of empirical ROC curves
Published in | Academic radiology Vol. 20; no. 7; p. 889 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | United States, 01.07.2013 |
Subjects | |
Summary: | Grant funding institutions often require organizations to share their collected data as widely as possible while safeguarding the privacy of individuals. Summaries based on these data are often released. Here, the receiver operating characteristic (ROC) curve is explored for potential statistical disclosures in the presence of auxiliary data.
Formulas are introduced for calculating the missing data points of the full data set, given that a user has an empirical ROC curve and a subset of the data used to generate that curve. The plausibility of this scenario is also discussed.
Diagnostic test data were simulated and an ROC curve was produced. Using a subset of the true data and the points on the empirical ROC curve, an attempt was made to reproduce the missing parts of the data. Disease statuses could be determined exactly, whereas test scores could be recovered only up to their rank.
If an individual or organization possesses the points of an empirical ROC curve and a subset of the true data, the true data underlying the ROC curve can be reproduced relatively accurately. Releasing data summaries, including the ROC curve, therefore requires careful thought from a statistical disclosure perspective. |
---|---|
ISSN: | 1878-4046 |
DOI: | 10.1016/j.acra.2013.04.011 |
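
The summary's claim that disease statuses can be read off an empirical ROC curve can be illustrated with a minimal sketch. This is not the article's formulas; it is an assumed, simplified setting in which there are no tied scores and every vertex of the curve is published. The function names (`empirical_roc`, `statuses_by_rank`) and the simulation parameters are hypothetical. Under those assumptions, each subject contributes one step to the curve: a vertical step means a diseased subject, a horizontal step a non-diseased one, so the sequence of steps gives the disease status at each rank of the test score.

```python
import random


def empirical_roc(scores, labels):
    """Return every vertex (FPR, TPR) of the empirical ROC curve.

    Assumes no tied scores, so each subject adds exactly one step.
    """
    n1 = sum(labels)            # number of diseased subjects
    n0 = len(labels) - n1       # number of non-diseased subjects
    # Sweep the threshold from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n0, tp / n1))
    return points


def statuses_by_rank(points):
    """Recover disease status at each rank (highest score first).

    A rise in TPR between consecutive vertices marks a diseased subject;
    otherwise the step was horizontal and the subject is non-diseased.
    """
    recovered = []
    for (_, tpr_prev), (_, tpr_next) in zip(points, points[1:]):
        recovered.append(1 if tpr_next > tpr_prev else 0)
    return recovered


# Simulated diagnostic test data (illustrative parameters, not the article's).
random.seed(0)
labels = [1] * 8 + [0] * 12
scores = [random.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

points = empirical_roc(scores, labels)
recovered = statuses_by_rank(points)

# Ground truth: statuses listed in descending order of score.
order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
truth = [labels[i] for i in order]
print(recovered == truth)
```

The sketch recovers statuses only in rank order; it does not recover the scores themselves. In the scenario the summary describes, a known subset of the true data would additionally let a user place the unknown scores between known ones, which is consistent with scores being recoverable only up to their rank.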