Distributed non-disclosive validation of predictive models by a modified ROC-GLM
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 21.03.2022 |
Summary: | Distributed statistical analyses provide a promising approach for privacy
protection when analysing data distributed over several databases. They bring
the analysis to the data and not the data to the analysis. The analyst receives
anonymous summary statistics, which are combined into an aggregated result. We are
interested in calculating the AUC of a prediction score with a distributed
approach, without access to the individual-level data of the subjects
distributed over different databases. We use DataSHIELD as the technology to
carry out distributed analyses and use newly developed algorithms to perform
the validation of the prediction score. Calibration can easily be implemented
in the distributed setting. However, discrimination, represented by the ROC
curve and its AUC, is challenging. We base our approach on the ROC-GLM algorithm
as well as on ideas of differential privacy. The proposed algorithms are
evaluated in a simulation study. A real-world application is described: the
audit use case of DIFUTURE (Medical Informatics Initiative), with the goal of
validating a treatment prediction rule for patients with newly diagnosed multiple
sclerosis. |
---|---|
DOI: | 10.48550/arxiv.2203.10828 |
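
The summary's remark that calibration is easy to implement in the distributed setting can be illustrated with a minimal Python sketch. It assumes one simple calibration check, calibration-in-the-large (mean predicted probability minus observed event rate), which is not necessarily the check used in the paper. The names `SiteSummary`, `summarize_site` and `pooled_calibration_in_the_large` are hypothetical and are not part of DataSHIELD. Each site shares only a count and two sums, and the analyst combines them.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates shared by one site; no individual-level rows leave the site."""
    n: int            # number of subjects at the site
    sum_pred: float   # sum of predicted probabilities
    sum_obs: int      # number of observed events (outcome == 1)


def summarize_site(preds: Sequence[float], outcomes: Sequence[int]) -> SiteSummary:
    """Runs at the site (hypothetical helper): reduce local data to three numbers."""
    if len(preds) != len(outcomes):
        raise ValueError("preds and outcomes must have the same length")
    return SiteSummary(len(preds), float(sum(preds)), int(sum(outcomes)))


def pooled_calibration_in_the_large(summaries: Iterable[SiteSummary]) -> float:
    """Runs at the analyst: mean predicted probability minus observed event rate."""
    summaries = list(summaries)
    n_total = sum(s.n for s in summaries)
    if n_total == 0:
        raise ValueError("no subjects in any site summary")
    mean_pred = sum(s.sum_pred for s in summaries) / n_total
    event_rate = sum(s.sum_obs for s in summaries) / n_total
    return mean_pred - event_rate


# Toy data for two sites (illustrative values only, not from the paper).
site_a = summarize_site([0.2, 0.8, 0.6], [0, 1, 1])
site_b = summarize_site([0.1, 0.9], [0, 1])
gap = pooled_calibration_in_the_large([site_a, site_b])
print(f"calibration-in-the-large: {gap:+.2f}")
```

The sums add up across sites, so the pooled value equals what a single pooled database would give. The AUC has no such additive form, because it depends on comparing scores of subjects across sites, which is why the summary singles out discrimination as the harder part.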