Multi-Keyword search over encrypted data with scoring and search pattern obfuscation

Search over encrypted data recently became a critical operation that raised a considerable amount of interest in both academia and industry. Especially, as outsourcing, sensitive data to cloud prove to be a strong trend to benefit from the unmatched storage and computing capacities thereof. Indeed,...

Full description

Saved in:
Bibliographic Details
Published inInternational journal of information security Vol. 15; no. 3; pp. 251 - 269
Main Authors Orencik, Cengiz, Selcuk, Ayse, Savas, Erkay, Kantarcioglu, Murat
Format Journal Article
LanguageEnglish
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.06.2016
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Search over encrypted data recently became a critical operation that raised a considerable amount of interest in both academia and industry. Especially, as outsourcing, sensitive data to cloud prove to be a strong trend to benefit from the unmatched storage and computing capacities thereof. Indeed, privacy-preserving search over encrypted data, an apt term to address privacy-related issues concomitant in outsourcing sensitive data, have been widely investigated in the literature under different models and assumptions. In this work, we propose an efficient scheme that allows privacy-preserving search over encrypted data using queries with multiple keywords. Most important contributions of this work are as follows. Firstly, using a property referred as δ - mean query obfuscation , the proposed scheme hides the search patterns, which are allowed to leak in many works in the literature including our preliminary work on the subject Orencik et al. (2013) [ 1 ]. Secondly, a two-server setting is employed to eliminate the correlation between the queries and matching documents sent to the user under the assumption that the two servers are not colluding. Thirdly, we propose a novel compression scheme that reduces both the communication cost between the two servers and the computation cost of the search operation more than 55 times compared to the standard approach. And finally, the proposed scheme also provides an effective scoring and ranking capability that is based on term frequency–inverse document frequency (tf-idf) weights of keyword–document pairs. Our analyses demonstrate that the proposed scheme is privacy-preserving, efficient and effective.
Bibliography:SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
ObjectType-Article-1
ObjectType-Feature-2
content type line 23
ISSN:1615-5262
1615-5270
DOI:10.1007/s10207-015-0294-9