Optimizing Privacy-Preserving Outsourced Convolutional Neural Network Predictions

Bibliographic Details
Published in: IEEE Transactions on Dependable and Secure Computing, Vol. 19, no. 3, pp. 1592-1604
Main Authors: Li, Minghui; Chow, Sherman S. M.; Hu, Shengshan; Yan, Yuejing; Shen, Chao; Wang, Qian
Format: Journal Article
Language: English
Published: Washington: IEEE, 01.05.2022 (IEEE Computer Society)
ISSN: 1545-5971
eISSN: 1941-0018
DOI: 10.1109/TDSC.2020.3029899

Summary: Convolutional neural networks (CNNs) are a popular architecture in machine learning, prized for their predictive power, notably in computer vision and medical image analysis. This predictive power requires extensive computation, which encourages model owners to host the prediction service on a cloud platform. This article proposes a CNN prediction scheme that preserves privacy in the outsourced setting, i.e., the model-hosting server cannot learn the query, the (intermediate) results, or the model. Similar to SecureML (S&P'17), a representative work that provides model privacy, we employ two non-colluding servers with secret sharing and triplet generation to minimize the use of heavyweight cryptography. We make the following optimizations for both overall latency and accuracy. 1) We adopt asynchronous computation and SIMD for offline triplet generation and parallelizable online computation. 2) Like MiniONN (CCS'17) and its improvement by the generic EzPC compiler (EuroS&P'19), we use a garbled circuit for the non-polynomial ReLU activation to keep the same accuracy as the underlying network (instead of approximating it, as in SecureML prediction). 3) For the pooling layers in the CNN, we employ (linear) average-pooling, which achieves almost the same accuracy as the (non-linear, and hence less efficient) max-pooling used by MiniONN and EzPC.
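The two-server design with secret sharing and offline triplet generation follows the standard Beaver-triple technique: a multiplication triple produced offline lets the servers multiply secret-shared values online with only cheap openings. The following is a minimal sketch of that technique over the ring Z_{2^32} (the ring size, helper names, and a dealer standing in for the triplet-generation protocol are all illustrative assumptions, not the paper's actual implementation):

```python
import random

MOD = 1 << 32  # additive sharing over the ring Z_{2^32} (assumed for illustration)

def share(x):
    """Split x into two additive shares: x = x0 + x1 (mod 2^32)."""
    x0 = random.randrange(MOD)
    return x0, (x - x0) % MOD

def reconstruct(s0, s1):
    """Combine the two servers' shares back into the plaintext value."""
    return (s0 + s1) % MOD

# Offline phase: a dealer (standing in for the triplet-generation
# protocol) produces a shared multiplication triple with c = a * b.
a, b = random.randrange(MOD), random.randrange(MOD)
a_sh, b_sh, c_sh = share(a), share(b), share((a * b) % MOD)

def beaver_mul(x_sh, y_sh):
    """Online phase: the masked differences e = x - a and f = y - b are
    opened; since x*y = c + e*b + f*a + e*f, each server derives its
    product share locally (the public e*f term goes to one server)."""
    e = reconstruct((x_sh[0] - a_sh[0]) % MOD, (x_sh[1] - a_sh[1]) % MOD)
    f = reconstruct((y_sh[0] - b_sh[0]) % MOD, (y_sh[1] - b_sh[1]) % MOD)
    z0 = (c_sh[0] + e * b_sh[0] + f * a_sh[0] + e * f) % MOD
    z1 = (c_sh[1] + e * b_sh[1] + f * a_sh[1]) % MOD
    return z0, z1

x_sh, y_sh = share(7), share(6)
z_sh = beaver_mul(x_sh, y_sh)
assert reconstruct(*z_sh) == 42  # neither server ever saw 7, 6, or 42
```

Because one triple is consumed per multiplication, generating triples asynchronously and in SIMD batches, as the summary describes, moves essentially all of this cost off the online critical path.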
Considering both offline and online costs, our experiments on the MNIST dataset show a latency reduction of 122×, 14.63×, and 36.69× compared to SecureML, MiniONN, and EzPC, respectively, and a reduction in communication costs of 1.09×, 36.69×, and 31.32×, respectively.
On the CIFAR dataset, our scheme achieves lower latency by 7.14× and 3.48×, and lower communication costs by 13.88× and 77.46×, when compared with MiniONN and EzPC, respectively.
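The efficiency of average-pooling in this setting comes from its linearity: each server can pool its own additive share locally, with no interaction, whereas max-pooling needs a comparison protocol (e.g., a garbled circuit). A minimal sketch of that property, under the same assumed ring as above and with division by the window size deferred to stay in integer arithmetic (a common trick, not necessarily the paper's exact choice):

```python
import random

MOD = 1 << 32  # additive sharing over Z_{2^32} (assumed for illustration)

def share_vec(v):
    """Additively share a flattened pooling window between two servers."""
    s0 = [random.randrange(MOD) for _ in v]
    s1 = [(x - r) % MOD for x, r in zip(v, s0)]
    return s0, s1

def sum_pool(window_share):
    """Sum-pooling applied locally to one share; the division by the
    window size is deferred (or folded into the next layer's weights)."""
    return sum(window_share) % MOD

v = [3, 1, 4, 1]            # one 2x2 pooling window, flattened
s0, s1 = share_vec(v)
p0, p1 = sum_pool(s0), sum_pool(s1)   # no communication needed
assert (p0 + p1) % MOD == sum(v)      # reconstructs 9, i.e., 4 * average
```

This local evaluation is why the summary calls average-pooling "linear, and hence more efficient" than the max-pooling used by MiniONN and EzPC.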