Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

Bibliographic Details
Main Authors	Wang, Kaiwen; Dick, Travis; Balcan, Maria-Florina
Format	Journal Article
Language	English
Published	19.12.2020
Subjects	Computer Science - Cryptography and Security; Computer Science - Learning; Statistics - Machine Learning
Summary:	This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting (Balcan et al., 2012). We propose DP-TopDown, a general privacy-preserving decision tree learning algorithm, and present two distributed implementations. Our first method, NoisyCounts, naturally extends the single-machine algorithm by using the Laplace mechanism. Our second method, LocalRNM, significantly reduces communication and added noise by performing local optimization at each data holder. We provide the first utility guarantees for differentially private top-down decision tree learning in both the single-machine and distributed settings. These guarantees show that the error of the privately learned decision tree quickly goes to zero, provided that the dataset is sufficiently large. Our extensive experiments on real datasets illustrate the trade-offs among privacy, accuracy, and generalization when learning private decision trees in the distributed setting.
DOI:	10.48550/arxiv.2012.10602
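
The summary says NoisyCounts relies on the Laplace mechanism. As a rough illustration of that general mechanism only, and not of the paper's actual algorithm, the sketch below perturbs per-holder counts with Laplace noise before summing them. The function names, the example counts, the choice of sensitivity 1 (a counting query under add/remove-one-record neighbors), and the assumption that each data holder adds its own noise over disjoint records are all hypothetical and are not taken from the record above.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace noise of scale 1/epsilon.

    Assumption: the count has sensitivity 1, i.e. adding or removing
    one record changes it by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


def aggregate_noisy_counts(local_counts, epsilon, rng):
    """Sum counts that each (hypothetical) data holder perturbed locally."""
    return sum(noisy_count(c, epsilon, rng) for c in local_counts)


# Hypothetical example: three data holders report how many of their
# records fall on one side of a candidate split.
rng = random.Random(0)
local_counts = [120, 85, 203]
estimate = aggregate_noisy_counts(local_counts, epsilon=1.0, rng=rng)
print(f"true total = {sum(local_counts)}, noisy total = {estimate:.2f}")
```

Smaller values of `epsilon` give stronger privacy but larger noise. Because every holder adds independent noise in this sketch, the variance of the total grows with the number of holders, which is consistent with the summary's remark that reducing added noise is a goal in the distributed setting.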