Differential Privacy-Based Genetic Matching in Personalized Medicine

Bibliographic Details
Published in: IEEE transactions on emerging topics in computing, Vol. 9, no. 3, pp. 1109-1125
Main Authors: Wei, Jianhao; Lin, Yaping; Yao, Xin; Zhang, Jin; Liu, Xinbo
Format: Journal Article
Language: English
Published: New York, IEEE, 01.07.2021
Publisher: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Genetic matching in personalized medicine is becoming more popular in cloud computing, whereby a cloud server performs genetic matching on the genetic data outsourced by a gene provider (e.g., a genetic lab) and an authorized party (e.g., a doctor) to diagnose patients' diseases. Because genetic data is highly sensitive, it should be protected before it is outsourced to the untrusted cloud. However, traditional differential privacy schemes do not support genetic matching, and ciphertext-based methods hinder data availability. In this article, we propose a differential privacy-based genetic matching (DPGM) scheme to achieve effective genetic matching while protecting genetic privacy. Specifically, DPGM first uses a DP-based EIGENSTRAT (DPE) algorithm to construct a published sequence that contains significantly noisy single-nucleotide polymorphisms (SNPs) associated with diseases, thereby ensuring the privacy of the outsourced genetic data. Second, DPGM adopts a DP-based N-order Markov (DPNM) algorithm to generate a noisy query sequence, which accounts for both query privacy and the similarity between the noisy query and the actual query. Finally, DPGM calculates the longest common subsequence (LCS) using a dynamic programming algorithm, which yields effective matching results. Detailed theoretical analysis proves that our DPGM scheme achieves ε-differential privacy. Extensive experiments on real genetic datasets demonstrate that our scheme achieves high efficiency and data utility.
ISSN: 2168-6750
DOI: 10.1109/TETC.2020.2970094
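
The summary's final step, computing a longest common subsequence (LCS) by dynamic programming, can be illustrated with a minimal sketch. This is the textbook LCS algorithm, not the authors' implementation: the function name `lcs` and the example sequences are hypothetical, and the differential-privacy noise that DPGM adds to the published and query sequences (DPE and DPNM) is not modeled here.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (textbook DP)."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    # Hypothetical sequences standing in for a published and a query sequence.
    published = "ACGTAC"
    query = "AGTTC"
    print(lcs(published, query))
```

The table takes O(mn) time and space for sequences of lengths m and n. In a matching setting, the length of the returned subsequence relative to the query length could serve as a similarity score, though how DPGM scores matches is not stated in this record.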