Privacy-Preserving Decision Trees Training and Prediction
In the era of cloud computing and machine learning, data has become a highly valuable resource. Recent history has shown that the benefits brought forth by this data driven culture come at a cost of potential data leakage. Such breaches have a devastating impact on individuals and industry, and lead...
Saved in:
Published in | Machine Learning and Knowledge Discovery in Databases Vol. 12457; pp. 145 - 161 |
---|---|
Main Authors | , , , , , |
Format | Book Chapter |
Language | English |
Published |
Switzerland
Springer International Publishing AG
2021
Springer International Publishing |
Series | Lecture Notes in Computer Science |
Subjects | |
Online Access | Get full text |
ISBN | 9783030676575 3030676579 |
ISSN | 0302-9743 1611-3349 |
DOI | 10.1007/978-3-030-67658-2_9 |
Cover
Loading…
Summary: | In the era of cloud computing and machine learning, data has become a highly valuable resource. Recent history has shown that the benefits brought forth by this data driven culture come at a cost of potential data leakage. Such breaches have a devastating impact on individuals and industry, and lead the community to seek privacy preserving solutions. A promising approach is to utilize Fully Homomorphic Encryption (FHE $$\mathsf {FHE }$$ ) to enable machine learning over encrypted data, thus providing resiliency against information leakage. However, computing over encrypted data incurs a high computational overhead, thus requiring the redesign of algorithms, in an “FHE $$\mathsf {FHE }$$ -friendly” manner, to maintain their practicality.
In this work we focus on the ever-popular tree based methods (e.g., boosting, random forests), and propose a new privacy-preserving solution to training and prediction for trees. Our solution employs a low-degree approximation for the step-function together with a lightweight interactive protocol, to replace components of the vanilla algorithm that are costly over encrypted data. Our protocols for decision trees achieve practical usability demonstrated on standard UCI datasets, encrypted with fully homomorphic encryption. In addition, the communication complexity of our protocols is independent of the tree size and dataset size in prediction and training, respectively, which significantly improves on prior works. |
---|---|
Bibliography: | The first author thanks the Israel Science Foundation (grant 3380/19) and Israel National Cyber Directorate via the Haifa, BIU and Tel-Aviv cyber centers for their support. The authors wish to thank Yaron Sheffer for helpful discussions. Original Abstract: In the era of cloud computing and machine learning, data has become a highly valuable resource. Recent history has shown that the benefits brought forth by this data driven culture come at a cost of potential data leakage. Such breaches have a devastating impact on individuals and industry, and lead the community to seek privacy preserving solutions. A promising approach is to utilize Fully Homomorphic Encryption (FHE\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {FHE }$$\end{document}) to enable machine learning over encrypted data, thus providing resiliency against information leakage. However, computing over encrypted data incurs a high computational overhead, thus requiring the redesign of algorithms, in an “FHE\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {FHE }$$\end{document}-friendly” manner, to maintain their practicality. In this work we focus on the ever-popular tree based methods (e.g., boosting, random forests), and propose a new privacy-preserving solution to training and prediction for trees. Our solution employs a low-degree approximation for the step-function together with a lightweight interactive protocol, to replace components of the vanilla algorithm that are costly over encrypted data. Our protocols for decision trees achieve practical usability demonstrated on standard UCI datasets, encrypted with fully homomorphic encryption. In addition, the communication complexity of our protocols is independent of the tree size and dataset size in prediction and training, respectively, which significantly improves on prior works. |
ISBN: | 9783030676575 3030676579 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-67658-2_9 |