PKIS: practical keyword index search on cloud datacenter

Park, Hyun-A; Park, Jae Hyun; Lee, Dong Hoon

doi:10.1186/1687-1499-2011-64

Research
Open access
Published: 17 August 2011

PKIS: practical keyword index search on cloud datacenter

Hyun-A Park¹,
Jae Hyun Park² &
Dong Hoon Lee¹

EURASIP Journal on Wireless Communications and Networking volume 2011, Article number: 64 (2011) Cite this article

6163 Accesses
4 Citations
Metrics details

Abstract

This paper highlights the importance of the interoperability of the encrypted DB in terms of the characteristics of DB and efficient schemes. Although most prior researches have developed efficient algorithms under the provable security, they do not focus on the interoperability of the encrypted DB. In order to address this lack of practical aspects, we conduct two practical approaches--efficiency and group search in cloud datacenter. The process of this paper is as follows: first, we create two schemes of efficiency and group search--practical keyword index search--I and II; second, we define and analyze group search secrecy and keyword index search privacy in our schemes; third, we experiment on efficient performances over our proposed encrypted DB. As the result, we summarize two major results: (1)our proposed schemes can support a secure group search without re-encrypting all documents under the group-key update and (2)our experiments represent that our scheme is approximately 935 times faster than Golle's scheme and about 16 times faster than Song's scheme for 10,000 documents. Based on our experiments and results, this paper has the following contributions: (1) in the current cloud computing environments, our schemes provide practical, realistic, and secure solutions over the encrypted DB and (2) this paper identifies the importance of interoperability with database management system for designing efficient schemes.

1 Introduction

Cloud computing technologies have become a central issue in order to open a new digitalized information society by heterogeneous services and convergence of technologies. In the era of cloud computing, personal computer and storage have changed their functions and features in socio-technical perspectives: the functions of personal computers have changed their concerns from individual to centralized managerial ones; the features of storage have also transformed its boundaries from personal databases or Enterprise Resource Planning (ERP) severs to the datacenter in social storage systems [1, 2].

In the cloud computing era, security research also encounters a variety of challenges and issues. Because the datacenter is made up of complex private information, and the datacenter is faced with the risks of information leakages and intruders or insiders' attacks. With these reasons, prior researchers have considered encryption as the most substantial way for protecting sensitive information as the last line of database defense.

1.1 Problem identification

In DB encryption, previous researchers have conducted the keyword index search over encrypted documents with various scenarios; however, the keyword index search scheme is inefficient and impractical aspects in a real world. The keyword index search enables a legitimate queries to search the encrypted documents with an encrypted keyword over the encrypted indexes without revealing any information on the query and documents, even to the server.

In most prior research, we find that the indexes of each data are stored by a row, not by a field (column) as another inefficient respect. The keyword index search schemes require at least a verifying test for every row of each data, so that the computational complexity of the previous schemes requires at least O(n) if the total number of stored data is n. The computation or scanning over many fields within one row is not fast, while the computation or scanning within one field is relatively faster than in one row. Moreover, encryption algorithm needs many random factors, which makes it hard to apply efficient DB schema^a to encrypted databases.

Our schemes are in the line of the keyword index search area, and this paper focuses on more practical approaches over the encrypted database to resolve the problems--the efficiency and group search of the encrypted database in the cloud datacenter service.

In this paper, we extend the search scope from between a server and a single user to the search between a server and group members (multiple users) in the cloud datacenter services, because current changing cloud computing technologies call for a variety of collaborations and cooperation among users in a certain social networking environment. These changing social networking environments require multiple users' information sharing in a certain organization; therefore, we propose the group key search of database encryption, when a group member shares his or her sensitive information among multiple users. Especially, sharing sensitive information should be encrypted by a group key in group search of database encryption. On the other hand, a group key has some problems to be used as a search key, because the group key has a dynamic property, i.e., a person may join or leave from the group. When a member leaves from a group, all data accessible to the group should not be accessible any more. It could be resolved by updating a group key, and the leaving member must not compute a new group key. On the other hand, when a member joins a group, he or she should obtain all of the previous group keys in order to access all of the group data. This problem, a member joins a group, makes design much harder. A naive solution is to decrypt all documents of the group and re-encrypt the documents by the new group key according to every membership change. Yet this solution entails a large amount of computational overheads.

In prior research, most schemes have not considered practical usages, while [3, 4] worked on the search schemes of dynamic group membership changes without re-encrypting documents. Park et al.'s scheme [3] is relatively faster than that of Wang et al. [4]. Wang et al.'s is based on bilinear, while Park et al. utilized the reversed hash key chains and bloom filters. The faster Park et al.'s scheme has a potential problem related to 'group member leave'. This paper, therefore, seeks to fix this proposed problem from Park et al.'s scheme--the reversed hash key chains, and it also develops novel efficient schemes with the experiments.

1.2 Key idea and contribution

The previous schemes have focused on the development of new encryption algorithms, while we apply general DB schema to the encrypted database instead of developing an efficient encryption algorithm. Based on this key idea, we devise two tables and store all indexes for all documents in one field (column). The two tables enable to build database normalization^b by applying primary keys and foreign keys into the tables. These properties of two tables enable the server to directly access the data that a user wants to search without any verification processes for every row.

Based on these two tables for efficiency, we construct PKIS-I with the reversed one-way hash key chain and PKIS-II with the key matching table, for the group search.

Through PKIS-I and PKIS-II, we summarize the results as follows:

1)
Efficiency
- Compared to computational complexity during the search process, our schemes' is O(1), while other previous papers' is at least O(n).
- Our experiments represent our scheme is approximately 935 times faster than Golle's scheme and about 16 times faster than Song's scheme for 10,000 documents.
2)
Group search
- By re-encrypting keywords or documents with the group manager (GM)'s secret key k_c, we resolved the encrypted database group search problem in cloud service.
- Whenever every membership change, our schemes can support a secure group search without re-encrypting all documents.
3)
Security
- We made definitions on group search secrecy and keyword index search privacy and analyzed them.

Therefore, this paper has two contributions as follows: (1) our schemes provide practical and realistic encrypted DB solutions in the cloud computing environments and (2) this paper identifies the importance of interoperability with DBMS as well as developing algorithms, to design efficient schemes.

1.3 Related works

The search systems research of encrypted data has been regarded as an active area with various scenarios. In this section, we review the prior papers in search systems on encrypted database.

Song et al. [5] firstly proposed a sequential scanning search algorithm, searchable symmetric key encryption, over entire documents by using stream and block ciphers. Following this idea, most researches have been conducted on the keyword index search. Boneh et al. [6] proposed a keyword search with a public key system, where they defined the concept of a public key encryption with keyword search (PEKS) and showed that PEKS implies identity-based encryption; however, the converse is currently an open problem. Chang et al. [7] suggested two index search schemes with the idea of pre-built dictionaries. Goh [8] formulated a security model for indexes known as semantic security (or indistinguishability) against an adaptive chosen keyword attack (IND-CKA), and they also proposed an secure index scheme in the model. Waters et al. [9] published the building of an encrypted and a searchable audit log, which searches the encrypted log with extracted keywords. Byun et al. [10] raised a serious vulnerability of public key-based keyword search schemes, which are susceptible to an off-line keyword guessing attack through much smaller space than passwords.

In addition, some proposed schemes extend the types of encrypted data queries. Boneh and Waters [11] suggested a public key system in order to support queries for testing any predicate on encrypted data with tokens produced by a secret key. They constructed comparison systems, subset queries, and conjunctive versions of these predicates, which introduce a primitive, hidden vector encryption. Hacigumüs et al. [12] proposed the method of range queries on encrypted data in the Database As a Service (DAS) model by using privacy homomorphism that allows basic arithmetic (+, -, ×) on encrypted data. Golle et al. [13] firstly proposed an efficient conjunctive keyword search over encrypted data and their scheme constructs a keyword field.

Hwang et al. [14] constructed a conjunctive keyword search scheme for group users, based on the public key. Wang et al. [4] developed threshold privacy preserving keyword search scheme. These schemes cannot support dynamic groups, while Park et al. [3] firstly proposed search schemes of dynamic groups, and their search schemes deal with membership changes without re-encrypting documents for each change of membership. Later, Wang et al. [15] built conjunctive keyword searches on encrypted data without keyword fields, and they applied these searches to the setting of dynamic groups.

Zerr et al. [16] worked on the problem of supporting keyword search for sensitive unstructured documents shared within collaboration groups. They proposed r-confidential Zerber indexing facility for sensitive documents, and they utilized secret splitting and term merging to provide tunable limits on information leakage, even under statistical attacks. As they admitted, this proposed indexing scheme would be unattainable in practice, and their scheme is inefficient. In succession, Zerr et al. [17] published Top-K retrieval algorithm from ZERBER^+R. In this work, they focused on ranked keyword search, term frequencies, and a novel relevance score transformation function. Here, the function in novel relevance score transformation hides the term-specific distribution of relevance score values, and it makes the scores of different terms indistinguishable. The authors of [18, 19] also handled with the same problems.

Wang et al. [20] considered the problem, concerning effective yet secure ranked keyword search over encrypted cloud data. In order to achieve practical performance, Wang et al. proposed a definition for ranked searchable symmetric encryption and used order-preserving symmetric encryption. Yet [20] is not a design for the group search. Cao et al. firstly explored the problem of multi-keyword ranked search over encrypted cloud data (MRSE), and they established a set of strict privacy requirements for such a secure cloud data utilization system to become a reality [21]. They proposed a basic MRSE scheme using secure inner product and then improved this scheme in order to meet different privacy requirements in two levels of threat models. Additionally, Zerr et al.'s schemes are not Boolean operation on multiple keywords searches in traditional searchable encryption schemes but they are ranked search operation. The evaluation methods and security requirements such as term frequency^c are different. Hence, the comparisons with our schemes are actually meaningless.

As for the papers about encrypted data in cloud computing, additionally, there are Li et al.'s [22] and Yu et al.'s [23]. Li et al. handled with the problem of authorized private keyword searches (APKS) over encrypted data in cloud computing, where multiple data owners encrypt their records along with a keyword index to allow searches by multiple users. Their two novel solutions for APKS are based on hierarchical predicate encryption, which uses pairing-based cryptography. Yu et al. proposed a secure and scalable fine-grained data access control scheme for cloud computing. In order to achieve this goal, they combined the techniques of attribute-based encryption, proxy re-encryption, and lazy re-encryption, which are also pairing-based cryptography.

2 Preliminaries

2.1 Keyword index search scheme

In general, keyword index search schemes consist of setup and searching processes. In the setup process, a client uploads encrypted data together with its indexes (also called searchable information) on a database server, and the indexes are encrypted keywords for searching the data. To search data with a keyword in the searching process, a user generates a trapdoor and sends it to the server. Here, the trapdoor is the encryption of the keyword and provides only search capabilities to the server without revealing any information about the keyword. The database manager runs the test algorithm with the indexes and the trapdoor as input to find the corresponding data. That is, this searching verification is performed on the indexes rather than on the encrypted data. The results are returned to the client, and the client finally decrypts the results and sends them back to the user.

2.2 System environments

2.2.1 Multiple user setting

Our system is devised for a certain group organization, which includes many departments such as government offices, organizations, or enterprises. This group includes subgroups (g₁, g₂, ..., g₇) and their members (p₁, p₂, ..., p₁₅). This paper identifies a group as a set of people with the same aims, and the group organizes the people working together. In this paper, we focus on a group search, because private search is possible through the same process as well.

2.2.2 Cloud datacenter service and modified DAS model

Our application storage system is a datacenter for the cloud storage service.^d The users of group members store their sharing documents in a datacenter, not their own server. In this case, we cannot guarantee that the datacenter server managers are trust; therefore, we utilize the cryptographic method for the data. This is similar to DAS model of [12]. In the DAS model, a client is trustworthy, while users' data are stored in and managed by an untrustworthy server. A client has a restricted computational power and storage and relies on the server for a mass computational power and storage. A server can be an inside attacker and is not allowed to read the data. Hence, the encryption key should not be known to the server (or the database administrator). Data privacy is assured under the conditions that a client does not share encryption keys, metadata or original data with any party.

Here, we modify the DAS model into our application system. Our scheme is made up of three parties: (1) users of group members, (2) a group manager GM, and (3) a datacenter server DS.

Users of group members are the owners of documents, and they are registered in their organization. GM plays a similar role of a client server, and it is a trusted party in our scheme. In our scheme, the GM manages the group session keys and the search keys of all groups, for secure communication and secure keyword index search.

DS is not a trustable party in our scheme. Hence, all of the documents in a server should be encrypted and querying keywords should be also encrypted. One of the most important things is that there is no decryption by a server through all processes.

2.3 Notations

TG: a huge hierarchical group
g_i: i th small group of G
$g_{i}^{j}$ : a small group g_iat j th session
D_n: n th documents
W_n: keywords list of D_n
$w_{n}^{i}$ : i th keyword of W_n
d_n: identifier of D_n
gk_i: group session key of a small group g_i
ik_i: index generation key of a small group g_i
dk_i: documents encryption key of a small group g_i
$g k_{i}^{j}$ : group session key of g_iat j th session
$i k_{i}^{j}$ : index generation key of g_iat j th session
$d k_{i}^{j}$ : documents encryption key of g_iat j th session
k_c: GM's secret key
f (·): pseudorandom function (PRF)
h(·): one-way hash function

2.4 Definitions

Definition 1. One-Way Hash Key Chain

It is generated by selecting the last value at random and applying a one-way hash function h repeatedly. Note that the initially chosen value is the last value of the key chain. The followings are two properties of a one-way hash chain [24].

Property 1 : Anybody can deduce that an earlier value k_ibelongs to the one-way key chain by using the later value k_jof the chain and by checking h^j-i(k_j) which equals k_iwith the later value k_j.
Property 2 : Given the latest released value k_iof a one-way key chain, an adversary cannot find a later value k_jsuch that h^j-i(k_j) equals k_i. Even when value k_i+1is released, the second pre-image collision resistant property prevents an adversary from finding $k_{i + 1}^{'}$ different from k_i+1such that h(k_i+1) equals k_i.

Definition 2. PRF We say that 'F : K_f × X → Y is (t, q, e)-secure PRF' if every oracle algorithm A making at most q oracle queries and with running time at most t has advantage Adv_A < e. The advantage is defined as $A d v_{A} = | P r [A^{F_{k}} = 1] - P r [A^{R} = 1] |$ where R represents a random function selected uniformly from the set of all maps from X to Y, in which the probabilities are taken over the choice of k and R[5].

2.5 Algorithm

SysPara( 1^k). It takes an input as a security parameter k and outputs a system parameter λ. λ determines elements in order to set the encrypted database system such as the size of database, encryption/decryption algorithm, functions, the size of parameters, and so on.
KeyGen( λ ). Taking λ as an input, this algorithm generates users' group session key set {g_k}, index generation key set {ik}, and document encryption key set {dk}.
IndGen( ik, W ). Inputs of algorithm IndGen are an index generation key ik and a keyword set W. Output is index list table.
DocEnc( dk, D ). Given a document encryption key dk and a document D, this algorithm outputs an encrypted document.
TrapGen( w, ik ). This algorithm takes a keyword w and index generation key ik. It encrypts the keyword w with index generation key ik and returns the encryption value, which is the trapdoor T_wfor the keyword w.
Retrieval( T_w). This algorithm takes input as trapdoor T_w. If there exist matching values to the trapdoor T_win an index list, then it outputs the encrypted documents that are mapped to the identifiers of the matching values in the index list table.
Dec( E(D), dk ). Given a document encryption key dk and encrypted document E(D), it outputs a plaintext document D.

3 Construction Of Practical Keyword Index Search-I (PKIS-I)

Our scheme PKIS largely comprises of two parts; (1) uploading phase and (2) downloading phase. The uploading phase consists of four algorithms of SysPara; KeyGen; IndGen; DocEnc. The downloading phase is composed of three algorithms of TrapGen; Retrieval; Dec.

PKIS-I's group key generation method is based on [3]. However, in [3], SIS-G has a big potential problem. If one of group members would reveal his/her group key to a server, the server could know all of the previous documents of the group members. In order to resolve this problem, we add a re-encryption process through GM and propose a new practical scheme with normalized database tables over encrypted documents in a keyword index search protocol area.

3.1 Uploading phase

3.1.1 SysPara(1^k) construction

With the algorithm SysPara(1 ^k ), GM generates system parameter λ = (f (·), h(·), q). f : {0, 1} ^k × {0, 1}* → {0, 1} ^k is a PRF and h : {0, 1}* → {0, 1} ^k is one-way hash function. q is the length of one-way hash key chain.

3.1.2 KeyGen(λ) construction

In this construction, group search keys are generated. With system parameter λ, GM generates group session keys ${g k_{i}^{j}}$ , index generation keys ${i k_{i}^{j}}$ , and document encryption keys ${d k_{i}^{j}}$ , where index generation keys and document encryption keys are called as search keys. The search keys are reversely generated by one-way hash key chains. At first, the last key of a key chain is selected (i.e. $i k_{1}^{q}$ and $d k_{1}^{q}$ , if the length of a key chain is q). GM applies the last key to a hash function repeatedly and computes all other keys until the first key comes out. It can be expressed like this: $i k_{1}^{i} = h (i k_{1}^{i + 1})$ , $d k_{1}^{i} = h (d k_{1}^{i + 1})$ where i ∈ [1,q - 1]. In more detail;

\begin{array}{l} {i k_{1}^{i}} = {i k_{1}^{q} \in_{R} {0, 1}^{k}, \\ h (i k_{1}^{q}) = i k_{1}^{q - 1}, \\ h (i k_{1}^{q - 1}) = i k_{1}^{q - 2}, \\ \dots \dots \\ h (i k_{1}^{4}) = i k_{1}^{3}, \\ h (i k_{1}^{3}) = i k_{1}^{2}, \\ h (i k_{1}^{2}) = i k_{1}^{1}} . \end{array}

\begin{array}{l} {d k_{1}^{i}} = {d k_{1}^{q} \in R {0, 1}^{k}, \\ h (d k_{1}^{q}) = d k_{1}^{q - 1}, \\ h (d k_{1}^{q - 1}) = d k_{1}^{q - 2}, \\ \dots \dots \\ h (d k_{1}^{4}) = d k_{1}^{3}, \\ h (d k_{1}^{3}) = d k_{1}^{2}, \\ h (d k_{1}^{2}) = d k_{1}^{1}} . \end{array}

For example, if an event of a session-change happens for a subgroup g₁, the first session is changed into the second session and then the group session key, a document encryption key, and an index generation key are changed like this: $g k_{1}^{1} \to g k_{1}^{2}, d k_{1}^{1} \to d k_{1}^{2}, i k_{1}^{1} \to i k_{1}^{2}$ .

One-way hash function h plays the important role of group search key in PKIS-I. One-wayness property of hash function can prohibit a leaving member from computing new keys after leaving the group. But any newly joining member can obtain all previous keys through applying the current key to hash function h repeatedly. This eliminates decryption and re-encryption of the previous documents.

These search keys are distributed to all of the group members every membership change. For example, in the second session, a member of subgroup g₁ receives a new group session key $g k_{1}^{2}$ at first. This group session key can be distributed by GM with well-known group key protocols, such as one in [25]. Then, $d k_{1}^{2}$ and $i k_{1}^{2}$ , which are computed in advance by the hash key chain, are encrypted with $g k_{1}^{2}$ and transferred to all members of subgroup g₁. It is illustrated in Figure 1.

3.1.3 IndGen(ik, W) and DocEnc(dk, D) construction

When a user stores documents D_n and its keywords W_n = {w_n,1, w_n,2,...} in a server, he encrypts the document and keywords with the algorithms DocEnc and IndGen. For a member of a small group g_i in the j th session, the encrypted document and indexes are generated as follows;

{d_{n}, f_{d k_{i}^{j}} (D_{n}), f_{i k_{i}^{j}} (w_{n, 1}), f_{i k_{i}^{j}} (w_{n, 2}), \dots}

$f_{i k_{i}^{j}} (w_{n, 1}), f_{i k_{i}^{j}} (w_{n, 2}), \dots$ are indexes that are the encrypted keywords. The user sends the encrypted document and indexes to GM.

3.1.4 Database update

Receiving the encrypted document and its indexes, GM re-encrypts them with his security key k_c . After this, GM sends them to a datacenter server DS. DS adds the received data to the tables of 'Index List' and 'Encrypted Document' every uploading time. 'Index List' is composed of indexes and their document identifiers as follows: $f_{k_{c}} (f_{i k_{i}^{j}} (w_{n, 1}))$ , $f_{k_{c}} (d_{n})$ ; $f_{k_{c}} (f_{i k_{i}^{j}} (w_{n, 2}))$ , $f_{k_{c}} (d_{n}); \dots; f_{k_{c}} (f_{i k_{i}^{j}} (w_{n, t}))$ , $f_{k_{c}} (d_{n})$ . Table 1 shows some parts of index list table. Then, DS stores an identifier $f_{k_{c}} (d_{n})$ and encrypted documents $f_{k_{c}} (f_{d k_{1}^{2}} (D_{n}))$ in a row like Table 2. Namely, PKIS is composed of two tables, where $f_{k_{c}} (d_{n})$ plays a role of a pointer as well as an identifier of D_n .

Table 1 Index list

Full size table

Table 2 Encrypted document

Full size table

Since an index list is made by this way, we can make a relational DB by applying primary key and foreign key into PKIS. The 'Index' and 'Identifier of Document' of Table 1 are defined as 'primary key', and 'Identifier of Document' of Table 2 is defined as 'foreign key'. There is no computation to test and to search in a datacenter server. We can diminish the gap from general plaintext search systems through minimizing computational overhead in the retrieval stage and applying efficient DB schema.

3.2 Downloading phase

3.2.1 TrapGen(w, ik) construction

Algorithm TrapGen(w, ik) outputs trapdoors for a keyword w. We assume again that the user of group g₁ at the second session wants to search a keyword w. The keyword w may be included in the document at the second session or/and the first session. Therefore, the user has to generate two trapdoors encrypted with $i k_{1}^{1}$ and $i k_{1}^{2}$ . That is, a user has to generate the trapdoors as many as the number of session-changes, which is possible because a user can compute all the previous search keys by applying the current search key to hash function h repeatedly. Then, the user computes trapdoors using the same method as index generation and sends them to GM. GM re-encrypts them with his secret key and then queries a datacenter server DS with the trapdoors. For a member of a small group g_i in the j th session, the trapdoors for a keyword w are as follows;

\begin{align} T_{w} & = {f_{k_{c}} (f_{i k_{i}^{s}} (w)), 1 \leq s \leq j} & (1) \\ = {f_{k_{c}} (f_{i k_{i}^{1}} (w)), f_{k_{c}} (f_{i k_{i}^{2}} (w)), \dots, f_{k_{c}} (f_{i k_{i}^{j}} (w))} & (2) \\ (3) \end{align}

3.2.2 Retrieval(T_w) and Dec(E(D), dk) construction

By the algorithm Retrieval, at first, DS searches the same values as the querying trapdoors in the 'Index' field of Table 1 and finds out the matching values to 'Index' and 'Identifier of Document'. Then, DS searches the same values as 'Identifier of Document' in Table 2 and returns the matching 'Encrypted Document's to GM. GM decrypts them with his secure key k_c and sends them to the user again. The user decrypts them with his/her group document encryption key.

Figure 1 describes the whole process of PKIS-I.

4 Construction Of Practical Keyword Index Search--II (PKIS-II)

In PKIS-II, the main difference from PKIS-I is that the search keys are not changed but fixed, irrespectively of membership changes. GM keeps the key matching information for groups, which consists of all of the group session keys and group search keys for each group. All users of group members do not know their group search keys. The only thing they know is a group session key. Instead, GM takes users' places for search processes.

The operative processes are similar to PKIS-I.

4.1 Uploading phase

4.1.1 SysPara(1^k) construction

This process is the same as PKIS-I.

4.1.2 KeyGen(λ) construction

GM generates group session keys, index generation keys, and document encryption keys for each group and stores them in a key matching table. In PKIS-II, if a session-change happens, for example of a subgroup g₁ from the first session to the second session, then the group session key is changed from $g k_{1}^{1}$ to $g k_{1}^{2}$ . However, the search keys of document encryption key dk₁ and index encryption key ik₁ are unchanged and remain still as dk₁ and ik₁. When needed, they can be encrypted with GM's secret key k_c .

4.1.3 IndGen(ik, W) and DocEnc(dk, D) construction

When a user stores a document D_n and its keywords {w_n,1, w_n,2,...} in a server, he encrypts the document and keywords with his group session key. For a member of a small group g_i in the j th session, the encrypted document and indexes in PKI-II are generated as follows;

{f_{g k_{i}^{j}} (d_{n}), f_{g k_{i}^{j}} (D_{n}), f_{g k_{i}^{j}} (w_{n, 1}), f_{g k_{i}^{j}} (w_{n, 2}), \dots}

The user sends these to GM.

4.1.4 Database update

Receiving the encrypted document and its indexes, GM decrypts them with the group g_i 's session key and then re-encrypts with the group search keys (index encryption key and document encryption key) and GM's secret key. Then, GM sends them to a server as follows:

{f_{k_{c}} (d_{n}), f_{d k_{j}} (D_{n}), f_{i k_{i}} (w_{n, 1}), f_{i k_{j}} (w_{n, 2}), \dots}

The next process is the same as PKIS-I.

4.2 Downloading phase

4.2.1 TrapGen(w, ik) construction

Main difference from PKIS-I in the construction of algorithm TrapGen(w, ik) is that PKIS-II does not need to generate trapdoors as many as the number of session-changes. If a user wants to search a keyword w, the user encrypts the keyword with his group session key and sends the trapdoor to GM. Like the Database Update Stage, GM decrypts and re-encrypts them. Then, GM queries DS with it. For a member of a small group g_i , the trapdoor for a keyword w in PKIS-II is only one for every time like this;

T_{w} = (f_{i k_{i}} (w))

4.2.2 Retrieval(T_w) and Dec(E(D), dk) construction

The retrieval stage is also the same as PKIS-I. Receiving the results (encrypted documents) from DS, GM decrypts them with data encryption key dk_i and re-encrypts with group session key $g k_{i}^{j}$ . And then, GM sends them to the user again. The user decrypts them with his group session key $g k_{i}^{j}$ .

Figure 2 shows the whole process of PKIS-II.

5 Security Analysis

5.1 Group search secrecy

Our retrieval system is the group key-based cryptographic searching method on encrypted documents. Therefore, in this section, we discuss group key secrecy. The following are group key security requirements in [26].

○ Group key secrecy: It must be computationally infeasible for a passive adversary to discover any secret group key.
○ Forward secrecy: Any passive adversary being in possession of a subset of old group keys must not be able to discover any subsequent group key.
○ Backward secrecy: Any passive adversary being in possession of a subset of subsequent group keys must not be able to discover any preceding group key.
○ Key independence: Any passive adversary being in possession of any subset of group keys must not be able to discover any other group key.
○ Forward secrecy provides security for subtractive events (leave), since it prevents former group members from computing the updated group key. Similarly, backward secrecy provides security for additive events (join), because it prevents new members from discovering the previously used group keys [27].

In this paper, the term 'negligible function' refers to a function η : N → R such that for any c ∈ N, there exists n_c ∈ N, such that $η (n) < \frac{1}{n_{c}}$ for all n ≥ n_c[13].

However, group key-based search system should not follow the above properties because a new joiner to the group such as a company or a government office should be able to search all of the previous documents to perform their successive tasks of the group. Namely, backward secrecy must not be a security requirement for our group search system. In this paper, we define group search secrecy as follows.

Forward search secrecy : For any group $g_{i}^{j}$ , the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors for (j +1)th session is negligible when the participant knows valid group search key $K_{i}^{j}$ , where $p \notin g_{i}^{j + 1}$ and 0 < j < q. $i k_{i}^{j}$ and $d k_{i}^{j}$ fall under $K_{i}^{j}$ in PKIS-I and $g k_{i}^{j}$ falls under $K_{i}^{j}$ in PKIS-II.

It means that all leaving members from a group should not access to all of the next documents of the group any more.

Backward search accessibility : For any group $g_{i}^{j}$ , the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors for (j - l)th session is 1 - η (n) when the participant knows valid group search key $K_{i}^{j}$ , where $p \notin g_{i}^{j - l}$ and 0 < l < j. $i k_{i}^{j}$ and $d k_{i}^{j}$ fall under $K_{i}^{j}$ in PKIS-I and $g k_{i}^{j}$ falls under $K_{i}^{j}$ in PKIS-II.

Namely, all joining members to a group can access to all of the previous documents of the group.

Group search secrecy: For a datacenter server DS, when a revelation of group search key $K_{i}^{j}$ happens, the probability that DS can guess correctly the encrypted documents of group g_iat the j th session is negligible.

It must be computationally infeasible for DS to know or guess correctly the contents of the encrypted documents and trapdoors even if a leaving member or another member in a group reveals his group search keys.

5.1.1 PKIS-I

In PKIS-I, group search keys are reversely generated by the one-way hash key chain. Our scheme PKIS-I satisfies with Group Search Secrecy as follows.

Forward search secrecy: By the Property 2 of Definition 1, if the latest released group search key is $K_{i}^{j}$ , any participant cannot know a later value $K_{i}^{l}$ such that $h^{l - j} (K_{i}^{l}) = K_{i}^{j}$ . Therefore, the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors for the next (j + 1)th session is negligible, where $p \notin g_{i}^{j + 1}$ .
Backward search accessibility: By the Property 1 of Definition 1, if the latest released group search key is $K_{i}^{j}$ , any participant can deduce an earlier value $K_{i}^{l}$ by applying the later value $K_{i}^{j}$ to one-way hash key chain like this; $h^{j - l} (K_{i}^{j}) = K_{i}^{l}$ . Therefore, the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors for (j - l)th session is 1 - η(n), where $p \notin g_{i}^{j - l}$ and 0 < l < j.
Group search secrecy: In PKIS-I, GM re-encrypts all documents and indexes including trapdoors with his secret key k_c. Although one of group members reveals his/her group search keys to a datacenter server DS, DS cannot learn anything because DS does not know GM's secret key k_c. Therefore, the probability that DS can guess correctly the encrypted documents of group g_iat the j th session is negligible when $K_{i}^{j}$ is revealed to DS.

5.1.2 PKIS-II

Group search keys ik and dk are unchangeable in PKIS-II and actual group search secrecy depends on group session key gk. When a user queries GM with a keyword, the keyword is encrypted by his/her group session key. If the user is a valid member of a certain group, GM can decrypt the querying keyword and then can generate a valid trapdoor for the user with his/her group search key. In this respect, it is proper that we regard a group session key as a group search key in PKIS-II. Thus, group search secrecy is up to the security of a group key agreement protocol.

Forward search secrecy: If membership changes occur, a new group session key is generated and distributed securely to valid members according to a given protocol, and leaving members cannot get a new group session key. Hence, the leaving member cannot generate the valid trapdoor for a new session because GM decrypts a trapdoor with the group's newly updated session key.

We assume that a given group key agreement protocol satisfies with forward secrecy with the probability of 1 - η (n). Then, the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors in the next (j +1) session is negligible (or follows negligible function) when the participant knows the j th valid group search key $K_{i}^{j} (= g k_{i}^{j})$ .

Backward search accessibility: For joining members, a new group session key is generated and distributed securely to valid members according to a given protocol, and the new joiners can also retrieve all of the previous documents because group search keys ik and dk are unchangeable in PKIS-II. If a joiner is authenticated as a valid user with his/her group session key, GM queries DS with a trapdoor instead of the user. The trapdoor is encrypted by unchangeable index generation key ik.

We assume again that the given group key agreement protocol satisfies with backward secrecy with the probability of 1 - η (n). Then, the probability that a participant $p \in g_{i}^{j}$ can generate valid trapdoors for (j -l) - th session is 1 - η(n) when the participant knows valid group search key $K_{i}^{j} (= g k_{i}^{j})$ , where $p \notin g_{i}^{j - l}$ and 0 < l < j.

Group search secrecy: Members of a group cannot know their group search keys ik and dk in PKIS-II and only GM knows them. Even if a leaving member or another malicious member reveals his group session key gk to DS, DS cannot know the contents of the documents or trapdoor because they are encrypted with the group search keys ik and dk that group members do not know. Therefore, the probability that a datacenter server DS can guess correctly the encrypted data of a group g_iat the j th session is negligible when $K_{i}^{j} (= g k_{i}^{j})$ is revealed to DS.

5.2 Keyword index search privacy

Song et al. [5] firstly proposed a cryptographic scheme which queries with encrypted keyword over encrypted data without decrypting anything by a server. They introduced four security requirements under an untrustworthy server. They are 'provable secrecy' (an untrustworthy server cannot learn anything about the plaintext given only the ciphertext), 'controlled searching' (an untrustworthy server cannot search for a word without the user's authorization), 'hidden queries' (an user may ask the untrustworthy server to search for a secret word without revealing the word to the server), and 'query isolation' (an untrustworthy server learns nothing more than the search result about the plaintext). However, Song's scheme is not for an index search system so that 'indistinguishability of indexes' have been considered additionally in other keyword index search schemes as well as the Song's requirements.

In our scheme, we assume an untrustworthy server as an adversary and our goal is to prevent a server from revealing or misusing users' information without users' consent. We accomplish our goal by encrypting documents and querying keywords. With relation to this goal, we define our security requirements using the term of 'Privacy'. The privacy is the ability to control private information, which includes identity and identifiers, and sensitive information [28], i.e., self-control for his/her information. The following is our definition about keyword index search privacy.

5.2.1 Retrieval access control

User access control.

For participants p ∈ g_i , the probability that p can search for the documents of gt is negligible, where i, t ≥ 1, t ≠ i. It means that all of the users encrypt their documents with their secret key and can retrieve only their documents. It is because only a legitimate user who has a valid key can generate valid trapdoors and decrypt the retrieved data, where valid trapdoors mean the querying keywords to GM, generated by valid users.

1)
PKIS-I: If a user p ∈ g_i tries to retrieve some documents of a group g_t in the second session, p should know $i k_{t}^{1}$ , $i k_{t}^{2}$ and $d k_{t}^{1}$ , $d k_{t}^{2}$ , which are encrypted with each group session keys and transferred to the group members of g_t like this: $f_{g k_{t}^{1}} (i k_{t}^{1}, d k_{t}^{1})$ , $f_{g k_{t}^{2}} (i k_{t}^{2}, d k_{t}^{2})$ . Refer to Figure 2. The only users that know the search keys $i k_{t}^{1}$ and $i k_{t}^{2}$ can generate valid trapdoors. Then, the users query GM with the trapdoors. Except for the members of a group g_t , nobody knows the values $i k_{t}^{1}$ , $i k_{t}^{2}$ and $d k_{t}^{1}$ , $d k_{t}^{2}$ because of the security of PRF f.

We assume that f is (t, q, e)-secure PRF and a user p ∈ g_i tries to retrieve the documents of a group g_t in the j th session, where i, t ≥ 1, t ≠ i. Then, by Definition 2, we know AdvA < e^j , 0 < e < 1. Therefore, we can say that the probability of retrieval is negligible.

In addition, if malicious leaving members from g_t reveal their group search keys to other groups' members when a session is changed from the second to the third, other users can know only $i k_{t}^{1}$ , $i k_{t}^{2}$ and $d k_{t}^{1}$ , $d k_{t}^{2}$ . Because they cannot know new session's keys $i k_{t}^{3}$ , $d k_{t}^{3}$ , they cannot generate valid trapdoors for the third session so that they cannot be authenticated as valid users to GM.

This problem falls under Forward Search Secrecy.

2)
PKIS-II: A user p ∈ g_i should know $g k_{t}^{j}$ to retrieve the documents of a group g_t in the j th session. This is because valid users generate trapdoors with their group session key and then query GM with the trapdoors in PKIS-II. The group session keys are distributed to the group members securely according to a given group key agreement protocol. We assume that a given group key agreement protocol is secure for key distribution with the probability of 1 - η(n). Therefore, the probability that a participant p ∈ g_i can retrieve the documents of g_t follows negligible function η (n), where i, t ≥ 1, t ≠ i.

Server search control.

For a datacenter server DS, when DS generates trapdoors with a random selected keyword and search keys, the probability that a server succeeds in retrieving is negligible.

It is the similar concept to 'controlled searching' of [5] and 'capability' of [13]. An untrustworthy server cannot search for a word without given 'searching ability' from users. In our schemes, the concept is the same meaning as a valid trapdoor. The valid trapdoor generation requires that a user should know secret key values. Here, valid trapdoors mean the querying keywords generated by GM to a datacenter server DS.

1)
PKIS-I: Valid trapdoors are generated by the secret values of each session in PKIS-I: an index generation key ik and GM's secret key k_c . The two values are secret keys for PRF f. By Definition 2, if DS generates trapdoors with a random selected keyword and search keys, the probability that a server can succeed in retrieving is e ², negligible.
2)
PKIS-II: Valid trapdoors are generated by an unchanging index generation key ik. In PKIS-II, ik is the secret key which any user does not know but only GM knows that. The key is also a secret key for PRF f. Therefore, by Definition 2, if DS generates trapdoors with a random selected keyword and search keys, the probability that a server can succeed in retrieving is e, negligible.

5.2.2 Unobservability

Generally, unobservability means that when a user utilizes a resource or service, the others cannot know the resource or service is being used [29]. If f is a pseudorandom function, h is one-way hash function, and all processes are performed according to the given protocol, all attackers(including insiders such as a datacenter server DS) cannot learn anything about the contents of encrypted documents by querying with encrypted keywords. It is because all the search processes by DS are implemented without decrypting anything.

We assume that f is (t, q, e)-secure PRF as we define earlier, h is (t, e_h ) one-way hash function such that any attack algorithm A running in time t has success probability at most e_h , and a given group key agreement protocol is secure with the probability of 1 - η (n). We choose the key material as described above, and all processes are done according to the given protocol. Then, our scheme PKIS-I can guarantee the security at least 1 - {e_h + (2e² + e) + e²} through whole processes in that an adversary cannot learn anything about the contents of encrypted documents except for the results.^e PKIS-II can guarantee the security at least 1 - {η (n)+3e +2e}.

5.2.3 Unlinkability--index indistinguishability

Unlinkability means that when resources and services are used by someone, the others cannot link these being correlated or used together. In keyword index search system, it can be regarded as index indistinguishability.

Since Goh [8] formulated IND-CKA for indexes known as semantic security, most researchers have followed Goh's security definition and proof in this area. 'Indistinguishability for Indexes' guarantees that an adversary cannot deduce data's contents from its index list. An adversary cannot know even the fact whether two documents have the common keyword or not. Given two word lists W₀ and W₁, we say that the search scheme provides 'Index Indistinguishability' if a server S cannot distinguish the index list I₀ from I₁ for W₀ and W₁ with non-negligible advantage.

However, our schemes do not guarantee this property. In our scheme, the common keywords in different documents for a certain group have the same index values. Even if an adversary does not know what the keywords mean, the adversary can know that the keywords have something in common. An adversary might guess that two documents have something correlated. This is because we use only deterministic symmetric functions that have the same encryption value under the same data and the same key. And we did not use any random factor in our schemes. It makes our schemes more efficient than any other schemes because we can apply the database schema of 'primary key' and 'foreign key'. The details are addressed in the next section.

Consequently, our schemes can guarantee 'Retrieval Access Control' and 'Unobservability' but not 'Unlinkability'. However, in a common real world, users would like to choose practical schemes under the appropriate control of security other than the scheme which is hard to apply a real world due to inefficiency from the high level of security.

6 Experiments Of Performance

In this section, we describe the experiments of our proposed schemes.

6.1 Setting of experiments

Our system processes the transactions on an Intel Pentium 4 CPU 2.66 GHz processor with 512 MB RAM. We use MS SQL Server 2000 as the database system and use WinAPI C Library and MS-SQL DB Library for C. These experiments use OpenSSL cryptography modules for cryptographic operations such as SHA-1 and AES. Table 3 describes the detailed implementation parameters. We assume different documents contain common keywords, and we set that a common keyword repeats at least every 435 documents among 10,000 documents.

Table 3 Implementation environment and parameters

Full size table

Through our experiments, group search and efficiency can be identified as primary results of our schemes. Consequently, our experiments consist of largely two parts: Sections 6.2 and 6.3. Section 6.2 deals with the analysis of our schemes in group search. Section 6.3 deals with comparisons of our scheme PKIS-II with other schemes in order to show the efficiency of our schemes.

6.2 Analysis on PKIS-I and PKIS-II

We experiment with respect to the number of documents and the number of sessions. For example, the search process of PKIS-I takes about 7.9 ms (0.0079 s) at the first session and PKIS-II takes about 8.8 ms (0.0088 s) for 10,000 documents. Refer to Table 4. The main difference between PKIS-I and PKIS-II is key management.

Table 4 Searching time according to session-changes (time unit: ms)

Full size table

In PKIS-I, group search keys ik and dk are reversely generated with hash key chains by GM, which are dynamic to session-changes. The group search keys for each session are encrypted with a group session key and then transferred to group members. Actual encryption keys for indexes and documents in database tables are made up of the group search keys and GM's secret key. This means that secret values are managed together by group members and GM. Especially, the more number of sessions have passed, the more trapdoors for one keyword query should be generated in PKIS-I, because group search keys ik and dk are updated dynamically to session-changes. Nevertheless, the searching time of PKIS-I is only within 53 ms (0.053 s) when a session is the 1000th. In fact, the current session may be over 1000 in some environments such as mobile environments, and it would require more time and computational overheads. However, our applications are for organizations such as companies or municipal offices, so that our performance can manage these applications (group organizations) sufficient.

In PKIS-II, group search keys ik and dk are unchanging irrespectively of session-changes. GM keeps a key matching information for groups, where group search keys ik and dk are matched to the dynamic group's session keys. When group members query GM with some data, the data should be encrypted with the group's session key, whereby a group member can be authenticated as a valid group member. Once a member passes the authentication, most processes are implemented by GM instead of the member. Receiving some data from a group member or a server, GM decrypts and re-encrypts the received data, so that GM gets to know all of the contents of documents and trapdoors every query time. However, only one trapdoor is sufficient for one keyword due to unchanging group search keys independently of session-changes. The invariable searching time is required irrespectively of session-changes. If the current number of session is high, the performance of PKIS-II is more efficient than PKIS-I as described in Table 4.

6.3 Comparison of our scheme with other schemes

6.3.1 The results of implementation

In order to evaluate our scheme's performance with objective validity, we experiment the following four previous schemes: (1) Song et al.'s [5]; (2) Golle et al.'s [13]; (3) Waters et al.'s [9] variation; and (4) Park et al.'s [30]. Song et al. deal with the symmetric cryptographic method as a pioneering work in this area. Golle et al. conduct the most secure scheme, which satisfies 'query isolation' and index indistinguishability as well. Waters et al. deal with audit log server; however, we assume that their server is as a general database server, because their keyword search technique on the encrypted data has wide applications beyond searchable audit logs. We experiment only one, symmetric scheme of their two symmetric and asymmetric schemes, because symmetric scheme is much faster. Park et al.'s schemes also deal with symmetric methods. They work on similarity search, and their schemes are the encrypted characters by characters. The searching method is approximate string matching test by hamming distance, i.e., we can expect the schemes would be inefficient. However, Park et al. maintain Golle et al.'s security and improve Golle et al.'s inefficiency in spite of the characterwise encryption method. In their paper [30], they did not show the formal security proof and the experimental proof. Therefore, this paper compares Golle et al.'s and Park et al.'s with our schemes.

Although there are many papers as the recent schemes such as [18, 20–23], [18, 20, 21] do not deal with the Boolean operation on keyword searches as the traditional searchable encryption schemes, but the ranked search operation. As we mentioned earlier, the comparison with our method is meaningless, because their evaluation method and security requirements are different. In addition, these schemes of [22, 23] are also not appropriate to compare with our schemes, because [22, 23] deal with asymmetric schemes based on pairing-based cryptography. Section 6.3.3 demonstrates the detailed reasons.

In order to evaluate the efficiency of encrypted search systems more precisely, we also perform experiments on the plaintext version (PKISIIP) without encryption. We compared only PKIS-II with other schemes, because our schemes take the multiple user setting of group search. On the other hand, PKIS-II has the similar search processes to other schemes, because it does not require the group search key changes such as PKIS-I.

Table 5 shows the result of our experiments. The performance of our scheme is much better than the existing schemes. For instance, the performance of PKIS-II is about 935 times faster than Golle's scheme and about 16 times faster than Song et al.'s scheme for 10,000 documents. Park et al.'s schemes, SSS-I and SSS-II are not fast but their schemes are faster than Golle's as they claimed.

Table 5 Searching time comparison with other schemes (time unit: ms)

Full size table

In the search process, PKIS-II needs very slight computational overheads, within 10 ms (0.01 s). With the respect to time consumption, a search process is the most important factor. The search process of PKIS-II is similar to general plaintext search system because it can directly access the data without verifying for every row. It needs the additional time only to generate a trapdoor and to decrypt returned documents. The used cryptographic function in PKIS is also very fast.

From the next subsection, we analyze our results in two respects of the applicability of DB schema and the influence of functions.

6.3.2 The applicability of DB schema

In most existing schemes, the indexes of each document are encrypted with random factors for indistinguishability and the encrypted indexes are stored by a row. Hence, a server should implement at least one computation for each document every row to verify whether this document contains the querying keyword or not. This makes it difficult to apply DB schemas into encrypted database search systems. Accordingly, the computational complexity of previous schemes requires at least O(n) if the number of documents is n. In addition, most previous schemes store a document's indexes by a row not in a field (column). The computation or scanning within one field is relatively faster than within one row. In contrast, the computation or scanning for many fields within one row is not fast.

Our schemes solved these problems by different database structures from other schemes. In Table 1 Index List, all of the indexes for all documents are stored in one field. Generally, the row size limitation is strict but the field size of database is at least 4 TB or more, i.e. relatively unrestricted. For example, the maximum number of bytes per row of MS SQL 2000 is only 8 kB and MS SQL 2005 is 2 GB [31]. Hence, setting an index column for all indexes does not have any problem in our schemes, and the encrypted documents and their identifiers are stored in another table.

We achieved database 'normalization' with 'primary key' and 'foreign key'. This is possible because we use different database table structure and deterministic functions. We do not use any random factors. Consequently, these properties enable a server to directly access the data that a user wants. Thus, there is no computation to test whether this document contains the querying keyword or not for every row.

6.3.3 The influence of function

The kind of applied functions greatly influences on the search time. There are many schemes dealing with bilinear function such as [13, 22, 23, 32–37] among the recently proposed keyword search schemes. For example, in the experiment of [35], searching 10,000 indexes requires approximately 720 s (720000 ms). Compared with symmetric cryptographic method, the calculation of one pairing takes much more time. Consequently, bilinear function is not appropriate for real-world applications. On the other hand, our proposed schemes are based on the only symmetric cryptographic function.

7 Conclusion

In cloud computing environments, DAS model is the most realistic to manage sensitive information with safety, because a server manager is considered untrustworthy. Encryption over database is also one of the most substantial ways in order to accomplish the goal of the DAS model. Although the encryption method has some negative effects such as inefficiency and hardness of applying DB schemas, we should not hinder the performance or general operations of database because of the encryption for security and privacy.

Considering prior researchers' endeavors in the individual setting between a server and a user, this paper focuses on more realistic applications and environments with two aspects: the group search and efficiency. To do this, firstly, we conduct a group search rather than a private setting. This group search does not require re-encrypting all documents under the key update from session-change. Secondly, for more efficient application in a real world, we develop the database table in order to apply the efficient DB schemas (normalization using primary key and foreign key) to encrypted documents. Also, we define and analyze the group search secrecy and keyword index search privacy. Moreover, this paper represents our scheme's efficiency through experiments.

This paper realizes efficient performances by developing two novel encrypted database tables. These two encrypted database tables make it possible a server to access data directly. Prior papers'computational complexity is at least O(n), while our schemes' computational complexity is O(1) during a search process. Therefore, our scheme is approximately 935 times faster than Golle's scheme and around 16 times faster than Song's scheme for 10,000 documents.

As the result of our experiments, we maintain the characteristics of DB application layers, which supports the interoperability of DB applications in order to design efficient schemes. This paper has two contributions: (1) in the cloud datacenter service environments, our schemes provide practical and realistic encrypted DB solution and (2) identifying the importance of interoperability with DBMS for designing efficient schemes.

For future works, we need to focus on the more experiments of the performance in real mobile applications. In cloud computing environments, end-users require various types of usages with mobile applications such as PDA or mobile phone as many as PCs. Therefore, we believe 'interoperability' of a mobile application and 'compatibility' between mobile and DB applications as important factors to improve the efficiency of schemes.

9 Endnotes

^aDB schema is the structure of a database system, described in a formal language supported by the DBMS. In a relational database, the schema defines the tables, the fields in each table, and the relationships. ^bDatabase normalization can be defined as the practice to optimize table structures. Particularly concentrating on how these data are interrelated, optimization is the result of a investigation from the various pieces of data stored within the database. Considering the analysis of this data and its corresponding relationships, it is advantageous in two points: first, the analysis will be the result of substantial improvement of the speed when the tables are queried; second, it decreases the chance of the database integrity compromised due to tedious maintenance procedures. ^cIn ranked search, term frequency means a count of the number of times that term appears in that document [16]. ^dThe perspective of utility computing. The cloud computing technologies and services enables for providers and companies to offer a policy: pay-for-what-you-use such as that of electricity, fuel, and water. With these economic strengths, cloud computing has become a leading computing technology and expanded seamless services; however, security studies encounter new challenges and issues in cloud computing era. First of all, the datacenter of cloud storage services has high risk of information leakage by intruders or insiders. Especially, it cannot guaranteed that datacenter managers are trustful. Storing confidential information outside (datacenter) makes the data center risky in terms of the infringement of privacy and security. Cloud services are broadly divided into three categories: Infrastructure-as-a- Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS) [38]. ^eThe first part within a brace is for key generation, the second part is for database table, and the third part is for trapdoor.

References

Armbrust M, Fox A, Griffith R, Joseph AD, Katz RH, Konwinski A, Lee G: Above the clouds: a Berkeley view of cloud computing. Technical Report: EECS-2009-28 2009.
Google Scholar
Buyya R: Market-oriented cloud computing: vision, hype, and reality of delivering computing as the 5th utility. 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, ccgrid 2009, 1.
Google Scholar
Park H, Byun J, Lee D: Secure index search for groups. TrustBus 2005, LNCS3592 2005, 128-140.
Google Scholar
Wang P, Wang H, Pieprzyk J: Threshold privacy preserving key word searches. SOFSEM 2008, LNCS 4910 2008, 646-658.
Google Scholar
Song D, Wagner D, Perrig A: Practical techniques for searches on encrypted data. IEEE Symposium on Security and Privacy 2000, 44-55.
Google Scholar
Boneh D, Crescenzo GD, Ostrovsky R, Persiano G: Public-key encryption with keyword search. Eurocrypt04, LNCS 3027 2004, 506-522.
Google Scholar
Chang YC, Mitzenmacher M: Privacy preserving keyword searches on remote encrypted data. Cryptology (ePrint Archive) 2004.
Google Scholar
Goh E: Secure indexes. Cryptology (ePrint Archive) 2004.
Google Scholar
Waters B, Balfanz D, Durfee G, Smetters D: Building an encrypted and searchable audit log. NDSS04, The Internet Society 2004, 205-214.
Google Scholar
Byun J, Rhee H, Park H, Lee D: "Off-Line Keyword Guessing Attacks on Recent KeywordSearch Schemes over Encrypted Data". SDM2006, Lecture Notes in Computer Science 4165 2006, 75-83.
Google Scholar
Boneh D, Waters B: Conjunctive, subset, and range queries on encrypted data. Proceedings of TCC 07 2007.
Google Scholar
Hacigumus H, Iyer B, Mehrotra S: Efficient execution of aggregation queries over encrypted relational databases. DASFAA 2004, LNCS 2793 2004, 125-136.
Google Scholar
Golle P, Staddon J, Waters B: Secure conjunctive keyword search over encrypted data. ACNS04, LNCS 3089 2004, 31-45.
Google Scholar
Hwang Y, Lee P: Public key encryption with conjunctive keyword search and its extension to a multi-user system. Pairing 2007, LNCS 4575 2007, 2-22.
Google Scholar
Wang P, Wang H, Pieprzyk J: Keyword field-free conjunctive keyword searches on encrypted data and extension for dynamic groups. CANS 2008, LNCS 2008.
Google Scholar
Zerr S, Demidova E, Olmedilla D, Nejdl W, Winslett M, Mitra S: Zerber: r-confidential indexing for distributed documents. EDBT'08: Proceedings of the 11th international conference on Extending database technology 2008, 287-298.
Chapter Google Scholar
Zerr S, Olmedilla D, Nejdl W, Siberski W: Zerber+R: top-k retrieval from a confidential index. EDBT '09: Proc. of the 12th International Conference on Extending Database Technology: Advances in Database Technology 2009, 439-449.
Chapter Google Scholar
Pang H, Ding X, Xiao X: Embellishing text search queries to protect user privacy. PVLDB 3(1) 2010, 598-607.
Google Scholar
Swaminathan A, Mao Y, Su G-M, Gou H, Varna A, He S, Wu M, Oard D: Confidentiality-preserving rank-ordered search. Storage SS'07, in Proc. of the 2007 ACM workshop on Storage security and survivability 2007, 7-12.
Chapter Google Scholar
Wang C, Cao N, Li J, Ren K, Lou W: Secure ranked keyword search over encrypted cloud data. ICDCS'10, in Proc. of the 2010 IEEE 30th International Conference on Distributed Computing Systems 2010, 253-262.
Chapter Google Scholar
Cao N, Wang C, Li M, Ren K, Lou W: Privacy-preserving multikeyword ranked search over encrypted cloud data. IEEE INFOCOM 2011.
Google Scholar
Li M, Yu S, Cao N, Lou W: Authorized private keyword search over encrypted data in cloud computing. Proc of IEEE ICDCS'11 2011.
Google Scholar
Yu S, Wang C, Ren K, Lou W: "Achieving secure, scalable, and fine-grained data access control in cloud computing. IEEE INFOCOM'10 2010.
Google Scholar
Hu Y, Perrig A, Johnson DB: Efficient security mechanisms for routing protocols. Network and Distributed System Security Symposium, NDSS'03 2003, 57-73.
Google Scholar
Burrnester M, Desmedt Y: A secure and efficient conference key distribution system. The Advances in Cryptology--EUROCRYPT 1994.
Google Scholar
Kim Y, Perrig A, Tsudik G: Tree-based group key agreement. ACM Trans Inf Syst Secur 2004, 7(1):60-96. 10.1145/984334.984337
Article Google Scholar
Liao L, Manulis M: Tree-based group key agreement framework for mobile ad-hoc networks. Fut Gener Comput Syst 2007, 23(6):787-803. 10.1016/j.future.2007.01.001
Article Google Scholar
Burmester M, Desmedt Y, Wright RN, Yasinsac A: Accountable Privacy. Security Protocols 2004, LNCS 3957 2006, 83-95.
Google Scholar
Ontario: Office of the Information and Privacy Commissioner (IPC)and Netherlands Registratiekamer. Privacy-Enhancing Technologies: The Path to Anonymity, Information and Privacy Commissioner and Registratiekamer 1995. [http://www.ipc.on.ca/english/Resources/Discussion-Papers/Discussion-Papers-Summary/Default.aspx?id=329%26print=1]
Google Scholar
Park H, Kim B, Lee D, Chung Y, Zhan J: Secure similarity search. In Grc 2007. IEEE ComputerSociety Press; 2007:598-604.
Google Scholar
[http://blogs.msdn.com/msdnts/archive/2006/12/01/row-size-limitation-in-sql-2000-and-2005.aspx]
Abdalla M, Bellare M, Catalano D, Kiltz E, Kohno T, Lange T, Malone-Lee J, Neven G, Paillier P, Ashi H: Searchable encryption revisited: consistency properties, relation to anonymous IBE, and extensions. Crypto05, LNCS 3621 2005, 205-222.
Google Scholar
Bellovin S, Cheswick W: Privacy-enhanced searches using encrypted bloom filters. Cryptology ePrint Archive, R eport2004/022 2004.
Google Scholar
Ballard L, Green M, de Medeiros B, Monrose F: Correlation-resistant storage via keyword-searchable encryption. SPAR Technical Report TR-SP-BGMM-050705
Ballad L, Kamara S, Monrose F: Achieving efficient conjunctive keyword searches over encrypted data. ICICS 2005, LNCS3783 2005, 414-426.
Google Scholar
Ogata W, Kurosawa K: Oblivious keyword search. J Complexity 2004, 20: 356-371. 10.1016/j.jco.2003.08.023
Article MathSciNet Google Scholar
Park H, Hong J, Park J, Zhan J, Lee D: Combined authentication based multi-level access control in mobile application for DailyLifeService. IEEE Trans Mobile Comput 2010, 9(6):824-837.
Article Google Scholar
Park H, Park J, Choi J, Lee D: Toward an integrated system between cloud computing and smartcard application. In ICCIT 2010. IEEE Computer Society Press; 2010:580-587.
Google Scholar

Download references

10 Acknowledgements

This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-C1090-1001-0004).

Author information

Authors and Affiliations

Graduate School of Information and Security, Korea University, 5-Ka, Anam-dong, Sungbuk-ku, Seoul, 136-701, Korea
Hyun-A Park & Dong Hoon Lee
Department of Information Systems, Weatherhead School of Management, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH, 44106, USA
Jae Hyun Park

Authors

Hyun-A Park
View author publications
You can also search for this author in PubMed Google Scholar
Jae Hyun Park
View author publications
You can also search for this author in PubMed Google Scholar
Dong Hoon Lee
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dong Hoon Lee.

Additional information

8 Competing interests

1) The article-processing charge for this manuscript was supported by "ITRC" support program supervised by the NIPA (National IT Industry Promotion Agency) of Korea.

2) The two encrypted tables to build database normalization is 'Korean Registered Patent' by Hyun-A Park, Registered No.(10-0839220), June. 11. 2008.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Park, HA., Park, J.H. & Lee, D.H. PKIS: practical keyword index search on cloud datacenter. J Wireless Com Network 2011, 64 (2011). https://doi.org/10.1186/1687-1499-2011-64

Download citation

Received: 15 December 2010
Accepted: 17 August 2011
Published: 17 August 2011
DOI: https://doi.org/10.1186/1687-1499-2011-64

PKIS: practical keyword index search on cloud datacenter

Abstract

1 Introduction

1.1 Problem identification

1.2 Key idea and contribution

1.3 Related works

2 Preliminaries

2.1 Keyword index search scheme

2.2 System environments

2.2.1 Multiple user setting

2.2.2 Cloud datacenter service and modified DAS model

2.3 Notations

2.4 Definitions

2.5 Algorithm

3 Construction Of Practical Keyword Index Search-I (PKIS-I)

3.1 Uploading phase

3.1.1 SysPara(1k) construction

3.1.2 KeyGen(λ) construction

3.1.3 IndGen(ik, W) and DocEnc(dk, D) construction

3.1.4 Database update

3.2 Downloading phase

3.2.1 TrapGen(w, ik) construction

3.2.2 Retrieval(Tw) and Dec(E(D), dk) construction

4 Construction Of Practical Keyword Index Search--II (PKIS-II)

4.1 Uploading phase

4.1.1 SysPara(1k) construction

4.1.2 KeyGen(λ) construction

4.1.3 IndGen(ik, W) and DocEnc(dk, D) construction

4.1.4 Database update

4.2 Downloading phase

4.2.1 TrapGen(w, ik) construction

4.2.2 Retrieval(Tw) and Dec(E(D), dk) construction

5 Security Analysis

5.1 Group search secrecy

5.1.1 PKIS-I

5.1.2 PKIS-II

5.2 Keyword index search privacy

5.2.1 Retrieval access control

5.2.2 Unobservability

5.2.3 Unlinkability--index indistinguishability

6 Experiments Of Performance

6.1 Setting of experiments

6.2 Analysis on PKIS-I and PKIS-II

6.3 Comparison of our scheme with other schemes

6.3.1 The results of implementation

6.3.2 The applicability of DB schema

6.3.3 The influence of function

7 Conclusion

9 Endnotes

References

10 Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

8 Competing interests

Authors’ original submitted files for images

Authors’ original file for figure 1

Authors’ original file for figure 2

Rights and permissions

About this article

Cite this article

Share this article

Keywords

3.1.1 SysPara(1^k) construction

3.2.2 Retrieval(T_w) and Dec(E(D), dk) construction

4.1.1 SysPara(1^k) construction

4.2.2 Retrieval(T_w) and Dec(E(D), dk) construction