The research for protecting location privacy based on VW algorithm
 Xinyue Fan^{1},
 Jing Tu^{1},
 Chaolong Ye^{1} and
 Fei Zhou^{1}
https://doi.org/10.1186/168714992014202
© Fan et al.; licensee Springer. 2014
Received: 25 June 2014
Accepted: 19 October 2014
Published: 28 November 2014
Abstract
With the development of the mobile internet, protecting location privacy has become an important issue. Building on previous studies and on the drawbacks of traditional algorithms, this paper proposes a novel algorithm for protecting location privacy. The algorithm is based on the Voronoi map of a road network, takes side-weight inference into account, and uses information entropy as its metric. It can defend against both side-weight inference attacks and replay attacks. Finally, we verify the algorithm on real road network data. The experimental results show that the improved algorithm performs better on several key performance metrics.
Keywords
Location privacy; Voronoi map; Road network; Information entropy
1 Introduction
With the rapid development of mobile networks, the leakage of location privacy has become a problem that cannot be ignored, and many researchers have turned their attention to it. So far, research on protecting location privacy has focused mainly on two aspects: the infrastructure of a location privacy protection system and the method of protecting location privacy. The mainstream architectures of a location privacy protection system are usually divided into three categories: non-cooperative architecture, distributed peer-to-peer architecture, and centralized architecture [1].
The existing methods of location privacy protection mainly include false-name, landmark, false-address, and spatial-anonymity location privacy protection. False-name location privacy protection replaces the real identity of a user with a false name so that attackers cannot obtain the real query source from location-based service (LBS) servers [2]. Landmark location privacy protection uses a landmark position within a certain range, instead of the user’s real position, to interact with the LBS, so attackers cannot identify the user’s real position [3]. Later work on the landmark method not only replaces the real position of a target with a landmark position but also uses an algorithm to generate a false position that stands in for the real one. For instance, [4] adopts an incremental nearest neighbor query algorithm to realize location privacy protection, and [5] adopts an anonymous region transformation algorithm. In false-address location privacy protection, a set of positions that includes the user’s real position and a sequence of false positions (called dummies) is sent to the LBS, so attackers cannot distinguish the real position from the rest of the received information [6, 7]. In spatial-anonymity location privacy protection, an anonymous server anonymizes the user information to obtain a fuzzy space or user set and then uses that fuzzy space or user set as the requesting entity when interacting with the LBS server. The k-anonymity algorithm proposed by Marco Gruteser is the classic method of this kind [8]. Based on it, several improved algorithms have been proposed for different performance demands.
For instance, [9, 10] propose a personalized k-anonymity algorithm that adjusts the requested k value for each user according to the user’s security demand. To address the low anonymity success rate of the traditional algorithm, Xiao et al. designed an efficient cloaking algorithm based on a directed graph [11]. The Casper algorithm in [12] mainly considers large-scale request conditions and personalized anonymity demands.
However, current research mostly focuses on Euclidean space. In practice, the road network constrains user movement in most cases, so location privacy protection on road networks deserves further study. In [13], Rubner gives a method that transforms Euclidean space into the road network environment. The XSTAR algorithm in [14] regards each road crossing as a node and anonymizes users so that the anonymous set simultaneously satisfies two conditions: k-anonymity of location and l-diversity of roads. Depending on the road network environment, an anonymous ring, anonymous tree, or anonymous cell can also be adopted to realize location privacy protection [15, 16]. In addition, Zhao et al. give an anonymization method based on a Voronoi map [17], which completes the anonymization within a v zone subject to the k-anonymity and l-diversity conditions. These algorithms have their own advantages, but their shortcomings are also obvious. For example, the security of the algorithm in [13] is too low. The anonymity success rate in [14] is too low, and its computational complexity is too high. In [15], an overly large anonymous ring may seriously degrade the quality of service (QoS). The algorithms in [16, 17] do not consider side-weight inference attacks, so their security is insufficient. Therefore, in this paper, we take side-weight inference into account, use information entropy as the metric, and propose a novel Voronoi-weight (VW) algorithm for protecting location privacy.
The remainder of this paper is organized as follows. We analyze the related issues of a traditional algorithm and determine the design target of our algorithm in Section 2. The complete design is given in Section 3. Section 4 presents the performance evaluation result of our proposed algorithm. Finally, the paper is concluded in Section 5.
2 Problem proposed
2.1 The comparison of algorithms
Because of its simple architecture, the centralized architecture is the most popular. In this architecture, the interaction between the mobile terminal and the anonymous server is encrypted to guarantee the security of user requests, while the data exchange between the anonymous server and the LBS server uses plaintext transmission to save system resources.
In this paper, we mainly consider the shortcomings of the traditional anonymous ring and anonymous cell algorithms and improve the existing road-network-based location privacy algorithm. The advantage of the anonymous ring algorithm is that, whichever path the search starts from, the final anonymous ring or anonymous tree is the same, so the algorithm resists replay attacks well. However, the anonymous ring algorithm only checks whether there are mobile users on the sides of a ring and ignores how the users are distributed among them. If the user distribution differs from side to side, the user is easily exposed to a side-weight inference attack. In addition, if the ring or road is too long, the output anonymous road set covers too large a region and imposes a heavy computational burden.
These shortcomings raise two questions:
1) Assuming that an attacker has intercepted the information that the anonymous server requests from the LBS server, how can we guarantee that the attacker cannot obtain the correct location information?
2) How can we guarantee that the server defends effectively against replay attacks and side-weight inference attacks?
2.2 The design target of algorithm

Target 1 The anonymous set output by the algorithm satisfies the k-anonymity demand. Suppose that an attacker obtains the request that the anonymous server submits to the LBS server. The request does not contain the real position of the user, only the anonymous road section on which the user lies. Because the set contains at least k_{i} mobile users, the probability that the attacker identifies the position of the user is not higher than 1/k_{i}, that is, k=k_{i}, where k is the number of users in the anonymous road set and k_{i} denotes the number requested by the user.

Target 2 The output anonymous road set satisfies the l-diversity demand for roads. Suppose that an attacker intercepts the request and obtains the anonymous road set. The set contains at least l_{i} roads, so the probability that the attacker identifies the user’s correct road from the set is not higher than 1/l_{i}, that is, l=l_{i}, where l is the number of roads in the anonymous road set and l_{i} denotes the l-diversity demand of user U_{i}.

Target 3 The algorithm defends well against replay attacks.

Target 4 The algorithm defends well against side-weight inference attacks.
Replay attacks and side-weight inference attacks are the most common attacks on location privacy and also pose the greatest threat to it. Therefore, the ability to defend against them is the main design target of our algorithm.
3 Algorithm implementation
Before a user first submits a location service request to the anonymous server, the user must submit registration information in the format [ID, Loc(x,y), 0], where ID is the user identity, Loc(x,y) is the user’s real location, and 0 marks the message as registration information. The other kind of message that the user submits to the anonymous server is a location update in the format [ID, Loc(x,y), 2], where ID is the user identity, Loc(x,y) is the user’s new location, and 2 marks the message as a location update. The user needs to submit updates regularly so that the location stored on the server stays current, which ensures the quality of service.
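The two message types can be illustrated with a minimal Python sketch. The flag values 0 and 2 come from the text; the class name `LocationMessage`, the function `apply_message`, the dictionary used as the server-side table, and the rejection of updates from unregistered users are assumptions made only for illustration.

```python
from dataclasses import dataclass

REGISTER = 0  # flag used in the text for registration information
UPDATE = 2    # flag used in the text for a location update


@dataclass(frozen=True)
class LocationMessage:
    """One [ID, Loc(x,y), flag] message sent to the anonymous server."""
    user_id: str
    x: float
    y: float
    flag: int


def apply_message(locations, msg):
    """Record the position carried by msg in the server-side table.

    locations maps user_id -> (x, y). Rejecting an update from an
    unregistered user is an assumption of this sketch.
    """
    if msg.flag == REGISTER:
        locations[msg.user_id] = (msg.x, msg.y)
    elif msg.flag == UPDATE:
        if msg.user_id not in locations:
            raise KeyError(f"user {msg.user_id} is not registered")
        locations[msg.user_id] = (msg.x, msg.y)
    else:
        raise ValueError(f"unknown message flag: {msg.flag}")
    return locations
```

A registration followed by an update for the same ID simply overwrites the stored coordinates, which is the behaviour the regular location updates rely on.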
In Figure 2, to judge whether the user belongs to a valid anonymous road set, we check whether the set is still within its validity period, that is, whether t_{n}−t_{s}≤t_{i}, where t_{n} is the current time, t_{s} is the generation time of the anonymous set, and t_{i} is the user’s tolerance time. If the set has expired (t_{n}−t_{s}>t_{i}), the flow returns to the VW anonymization algorithm, which updates the anonymous road set of the corresponding user and then generates the request submitted to the LBS. If the set is still valid, the request can be generated directly from the existing anonymous set.
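The reuse decision can be sketched as follows, treating a set as still usable when t_{n}−t_{s}≤t_{i}, the tolerance condition stated at the end of Step IV. The cache layout and the names `anonymous_set_is_valid`, `get_anonymous_set`, and `rebuild` are hypothetical; `rebuild` stands in for a run of the VW algorithm.

```python
def anonymous_set_is_valid(t_now, t_set, t_tolerance):
    """True while the set's age t_now - t_set is within the tolerance."""
    return (t_now - t_set) <= t_tolerance


def get_anonymous_set(cache, user_id, t_now, t_tolerance, rebuild):
    """Return a cached anonymous road set, rebuilding it once it expires.

    cache maps user_id -> (generation_time, roads); rebuild(user_id)
    is a placeholder for running the anonymization algorithm again.
    """
    entry = cache.get(user_id)
    if entry is not None and anonymous_set_is_valid(t_now, entry[0], t_tolerance):
        return entry[1]
    roads = rebuild(user_id)
    cache[user_id] = (t_now, roads)
    return roads
```

Keeping the generation time next to the road set means an expired entry is replaced in place, so later requests inside the tolerance window reuse the fresh set.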
The key part of the anonymization system is the VW algorithm. Its main task is a three-stage search for an anonymous road set. The first stage, the preprocessing stage, uses a Voronoi map to divide the road network [17]. The second stage, the road search stage, searches within the road set output by the preprocessing stage. The third stage is the extended search stage: if the second stage cannot find suitable roads, the search is extended to the neighboring area.
Following the anonymous cell algorithm, the simplest way to guarantee road diversity is to divide the road network into areas so that each area includes at least l roads. The preprocessing stage therefore divides the road network into areas. In the VW algorithm, the road network is regarded as an undirected graph G(V,E), also called the road network map, which is composed of the edge set E={e_{1},e_{2},⋯,e_{n}} and the node set V={v_{1},v_{2},⋯,v_{n}}. The Voronoi map is also known as the Thiessen polygon and is used in the GIS field.
The VW algorithm uses the Voronoi map to divide the road network. This not only ensures that each user is linked to the corresponding Voronoi polygon, but also effectively reduces the search range for finding user U_{k} and improves the quality of service. For convenience, we denote the corresponding Voronoi map as V(V,E). Taking into account the real road network environment and the users’ demand for road diversity, the VW algorithm selects suitable nodes of G(V,E), whose node metric must be greater than 3, as the generating points. The node metric is defined as the number of roads that pass through the node. Each Voronoi polygon therefore contains at least three roads. Figure 3 shows the Voronoi map of a simple road model.
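A rough Python sketch of this preprocessing idea is given below. It is not the construction used in the paper: generating nodes are picked by node metric (degree), and each road is then assigned to the generating node closest to its midpoint in Euclidean distance. The midpoint rule, the function names, and the default `min_degree=4` (one reading of "greater than 3") are all assumptions.

```python
import math
from collections import defaultdict


def node_degrees(edges):
    """Count how many roads touch each node (the 'node metric')."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


def partition_roads(coords, edges, min_degree=4):
    """Group roads into zones around high-degree generating nodes.

    coords maps node -> (x, y); edges is a list of (node, node) roads.
    Each road goes to the generating node nearest to its midpoint,
    which is a simplification of a true network Voronoi partition.
    """
    degree = node_degrees(edges)
    sites = [n for n in coords if degree.get(n, 0) >= min_degree]
    if not sites:
        raise ValueError("no node reaches the required degree")
    zones = {site: [] for site in sites}
    for a, b in edges:
        mid = ((coords[a][0] + coords[b][0]) / 2,
               (coords[a][1] + coords[b][1]) / 2)
        nearest = min(sites, key=lambda s: math.dist(coords[s], mid))
        zones[nearest].append((a, b))
    return zones
```

Because every road incident to a generating node has its midpoint near that node, zones built this way tend to contain at least as many roads as the node metric, but the sketch does not enforce that guarantee.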
In the VW algorithm, we regard a Voronoi polygon as a v zone, and the searches in the second and third stages operate on v zones. Hence, in the preprocessing stage, we need to find the corresponding V(V,E) and map it onto G(V,E). We can then determine the roads contained in each v zone and, whenever the system receives a user’s registration or location update information, map the user position onto the corresponding G(V,E) and V(V,E). In other words, we can locate the user in the corresponding v zone and on the corresponding road. For instance, in Figure 3, if user U_{13} sends a request, the system can find the v zone where the user lies according to the user information and obtain the corresponding road set V_{i}={n_{4}n_{5},n_{5}n_{12},n_{12}n_{13},n_{12}n_{15},n_{15}n_{16}}. After road mapping, we may code the v zones with a quadtree so that neighboring v zones can be searched in the third stage.

Step I Locate the v zone where the user lies and sort all roads in the v zone by weight. The road weight w_{e} is the number of users on the road. For example, the result of road sorting in Figure 4 is as follows:
$e_{1}(7)\to e_{4}(6)\to e_{5}(6)\to e_{2}(5)\to e_{6}(5)\to e_{3}(3)\to e_{7}(2)$
where the value in parentheses is the road weight.
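The sorting in Step I amounts to ordering roads by descending user count. A minimal Python sketch, using the weights from the Figure 4 example (the function name `sort_roads_by_weight` is hypothetical, and breaking ties by original road order is an assumption that happens to match the example):

```python
def sort_roads_by_weight(weights):
    """Return (road, weight) pairs, heaviest road first.

    Python's sort is stable, so roads with equal weight keep the
    order in which they appear in the input mapping.
    """
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


figure4_weights = {"e1": 7, "e2": 5, "e3": 3, "e4": 6, "e5": 6, "e6": 5, "e7": 2}
ordered = sort_roads_by_weight(figure4_weights)
# ordered lists the roads as e1, e4, e5, e2, e6, e3, e7
```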

Step II Locate the road where the user lies and add it to the quasi-anonymous set S^{′}. For instance, if user U_{20} sends a request, the road where the user lies is e_{4}, so S^{′}={e_{4}}. Then take the maximum k and l over all users on that road and assign these values to the anonymity demand of the system, k_{s} and l_{s}. For example, assume that the anonymity demand of user U_{20} is k_{20}=10, l_{20}=3, while that of user U_{19} is k_{19}=12, l_{19}=3; the system then sets k_{s}=12, l_{s}=3.
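Taking the system demand as the element-wise maximum of the per-user demands can be written in a few lines of Python. The function name `system_demand` and the dictionary layout are assumptions; the numbers are those of the U_{19}/U_{20} example.

```python
def system_demand(user_demands):
    """Return (k_s, l_s), the largest k_i and l_i among the given users.

    user_demands maps a user ID to that user's (k_i, l_i) pair.
    """
    if not user_demands:
        raise ValueError("at least one user demand is required")
    k_s = max(k for k, _ in user_demands.values())
    l_s = max(l for _, l in user_demands.values())
    return k_s, l_s


k_s, l_s = system_demand({"U20": (10, 3), "U19": (12, 3)})
# k_s is 12 and l_s is 3, as in the example above
```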

Step III Randomly select a road from the l+∂ roads adjacent to the target road and add it to the quasi-anonymous set S^{′}, where l is the number of roads adjacent to the road where the user lies and ∂ is a random factor. Usually, the system assigns ∂ a fixed default value, which guarantees the randomness of the anonymous road set selected for the user and therefore defends against replay attacks. For instance, the candidate roads in the neighborhood of e_{4} are (e_{1},e_{4},e_{5},e_{2},e_{6},e_{3}), six roads in total, and roads are selected from them at random to join S^{′}. After each selection, k_{s} and l_{s} are updated to the maximum demand of all users in S^{′}, that is, k_{s}=max{k_{i}}, l_{s}=max{l_{i}}. When the selected road e_{2} joins the quasi-anonymous set, we obtain S^{′}={e_{4},e_{2}} and compare the k_{i} and l_{i} of all users on e_{4} and e_{2}. If, say, only U_{10} in S^{′} has l_{i}=4>l_{s}, the system updates l_{s} and selects the next road from the updated l_{s}+∂ candidate roads, where l_{s} now reflects the updated quasi-anonymous set S^{′} and ∂ is the random factor. After the anonymity demand of the system has been updated, the set must satisfy n_{k}≥k_{s} and n_{l}≥l_{s}, where n_{k} is the number of users and n_{l} is the number of roads in S^{′}. If it does not, the system selects another road from the remaining roads within l_{s}+∂. This process is repeated until n_{k}≥k_{s} and n_{l}≥l_{s} are both met. If these conditions cannot be satisfied even after the candidate road set has been enlarged, the system returns an anonymization failure message. Because of the random factor, the anonymization process falls into one of the following three cases:
i) l_{s}+∂≤n_{v}, where n_{v} is the total number of roads in the v zone. In this case, as the above example shows, the system may select a road from the l_{s}+∂ candidates directly.
ii) l_{s}+∂>n_{v} and l_{s}<n_{v}. In this case, we may ignore ∂ and randomly select the road to join S^{′} from all the roads in the v zone. In other words, the v zone need not be extended.
iii) l_{s}>n_{v}. In this case, we need to extend the v zone by merging it with a neighboring v zone and then repeat the procedure of case i) or ii) according to the updated l_{s}. In addition, if k_{s}>v_{k}, where v_{k} is the total number of users in the v zone, we also need to extend the v zone.
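The selection loop of Step III, restricted to a single v zone, might look like the Python sketch below. Several points are assumptions rather than statements from the paper: the candidate pool is taken as the first l_{s}+∂ entries of the weight-sorted road list (which matches the six-road example), the random factor is a fixed integer `extra`, and zone extension (case iii) is not modelled, so the function returns `None` where the full algorithm would merge a neighboring v zone or report failure. All names are hypothetical.

```python
import random


def current_demand(roads, demands):
    """Return (k_s, l_s): the largest k_i and l_i among users on the roads."""
    pairs = [pair for road in roads for pair in demands.get(road, [])]
    if not pairs:
        raise ValueError("no user demand found on the selected roads")
    return max(k for k, _ in pairs), max(l for _, l in pairs)


def build_quasi_set(user_road, zone_roads, weights, demands, extra=3, rng=None):
    """Grow the quasi-anonymous set S' until n_k >= k_s and n_l >= l_s.

    zone_roads: roads of the v zone, sorted by weight (Step I).
    weights:    road -> number of users on it.
    demands:    road -> list of (k_i, l_i) for the users on it.
    extra:      the random factor, here a fixed integer.
    Returns the list of selected roads, or None if this zone alone
    cannot satisfy the demand.
    """
    rng = rng or random.Random()
    chosen = [user_road]
    while True:
        k_s, l_s = current_demand(chosen, demands)
        n_k = sum(weights[road] for road in chosen)
        n_l = len(chosen)
        if n_k >= k_s and n_l >= l_s:
            return chosen
        n_v = len(zone_roads)
        if l_s > n_v:
            return None  # case iii: zone extension is not modelled here
        if l_s + extra <= n_v:
            pool = zone_roads[:l_s + extra]  # case i
        else:
            pool = zone_roads                # case ii: ignore the random factor
        remaining = [road for road in pool if road not in chosen]
        if not remaining:
            # Candidate pool exhausted: fall back to the rest of the zone.
            remaining = [road for road in zone_roads if road not in chosen]
        if not remaining:
            return None
        chosen.append(rng.choice(remaining))
```

Passing a seeded `random.Random` makes a run reproducible for testing, while the default unseeded generator gives the run-to-run variation that the random factor is meant to provide against replay.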

Step IV After obtaining a suitable quasi-anonymous set S^{′} that satisfies n_{k}≥k_{s} and n_{l}≥l_{s}, we need to determine whether S^{′} meets the condition for defending against side-weight inference attacks. The condition is as follows:
$\max(p_{ib})\le\delta,\quad i\in(1,n_{l})$ (5)
where p_{ib} is the probability of side-weight inference for road i and δ is an empirical value (here, δ=0.5) that serves as the probability threshold of side-weight inference. If the inference probability is greater than the threshold, the user’s location privacy may be compromised by side-weight inference.
In Equation 5, the side-weight inference probability of every road is computed. If the probability of every road does not exceed the threshold, the quasi-anonymous set S^{′} is taken as the anonymous set S. If some probability is greater than the threshold δ, we modify S^{′} by adding a new road from the l_{s}+∂ roads until Equation 5 is satisfied. If all of the l_{s}+∂ roads have been added to S^{′} and Equation 5 is still not satisfied, we select new roads from the rest of the v zone until Equation 5 is met. If all roads in the v zone have been used and Equation 5 still cannot be satisfied, the system returns an anonymization failure message. For example, assume that Step III yields the quasi-anonymous set S^{′}={e_{4},e_{5},e_{2},e_{3}} for user U_{20} and that max(p_{ib})=p_{4b}<δ. The system then takes S^{′} as the anonymous set of user U_{20} and updates the anonymous set of every user in it to S^{′}, so the anonymity condition of all users within the set is satisfied. If another user in the set submits a request within the time tolerance, t_{n}−t_{s}≤t_{i}, the system also uses S^{′} as the anonymous set of that requester.
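The threshold test of Equation 5 can be sketched in Python once a formula for p_{ib} is fixed. The definition of p_{ib} is not reproduced in this section, so the sketch assumes the simplest candidate: the share of the set's users who lie on road i, that is, w_{i} divided by the total weight of S^{′}. That formula and the function names are assumptions, not the paper's definition.

```python
def max_inference_probability(roads, weights):
    """Largest assumed p_ib over the set: max of w_i / sum of weights."""
    total = sum(weights[road] for road in roads)
    if total == 0:
        raise ValueError("the road set contains no users")
    return max(weights[road] / total for road in roads)


def resists_side_weight_inference(roads, weights, delta=0.5):
    """Check the condition of Equation 5, max(p_ib) <= delta."""
    return max_inference_probability(roads, weights) <= delta
```

With the Step I weights, the example set {e_{4},e_{5},e_{2},e_{3}} has weights 6, 6, 5, and 3, so under this assumed formula the maximum is 6/20 = 0.3, which is below δ=0.5 and is reached on e_{4}, consistent with max(p_{ib})=p_{4b}<δ. A set such as {e_{1},e_{7}} would give 7/9 and fail the test.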
4 Simulation and analysis
4.1 Experimental environment and data
Test platform parameters
Category           Parameter
Hardware platform  CPU Intel Core T660, main frequency 2.2 GHz
Software platform  Eclipse
Operating system   Windows XP Professional
Database           Oracle
Coding language    Java
4.2 Simulation result and analysis
4.2.1 The success rate of anonymity
4.2.2 The average time of anonymity
4.2.3 The relative anonymous rate
4.2.4 The average entropy of a replay attack
4.2.5 The average entropy of a sideweight inference attack
Finally, based on the above analysis, the proposed algorithm meets the design targets given in Section 2.2. It not only satisfies the k-anonymity and l-diversity demands but also achieves higher average entropy under replay attacks and side-weight inference attacks, which gives users better location privacy protection.
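As a reminder of how an entropy figure of this kind can be computed, the following Python sketch uses the standard Shannon entropy over the attacker's probability distribution across the roads of an anonymous set. The exact entropy definitions used in the experiments are not given here, so both the formula and the function name are assumptions.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    if not math.isclose(sum(probabilities), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

Under this definition, four equally likely roads give 2 bits, and any uneven distribution over the same four roads gives less. This is why a set whose users are spread evenly across its roads leaves an attacker with the most uncertainty.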
5 Conclusions
This paper first reviews the existing methods of location privacy protection. We then analyze the drawbacks of traditional algorithms and state the design targets of our algorithm. Finally, we propose and analyze the VW algorithm and compare it with traditional algorithms. The results show that our algorithm provides better location privacy protection.
Declarations
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61471077, 61301126), Fundamental and Frontier Research Project of Chongqing (cstc2013jcyjA40034, cstc2013jcyjA40041, cstc2013jcyjA40032), Science and Technology Project of Chongqing Municipal Education Commission (KJ1400413, KJ130528), Program for Changjiang Scholars and Innovative Research Team in University (IRT1299), and Special Fund of Chongqing Key Laboratory (CSTC).
References
[1] Cheng R, Zhang Y, Bertino E, Prabhakar S: Preserving user location privacy in mobile data management infrastructures. In Proceedings of Privacy Enhancing Technologies: 28–30 June 2006; Cambridge. Edited by: Danezis G, Golle P. Springer, Berlin Heidelberg; 2006:393–412.
[2] Aryan A, Singh S: Protecting location privacy in augmented reality using k-anonymization and pseudo-id. In Proceedings of the 2010 International Conference on Computer and Communication Technology: 17–19 September 2010; Allahabad. IEEE, Piscataway; 2010:119–124.
[3] Hong JI, Landay JA: An architecture for privacy-sensitive ubiquitous computing. In Proceedings of the 2nd International Conference on Mobile Systems, Applications, and Services: 6–9 June 2004; Boston. ACM, New York; 2004:177–189.
[4] Yiu ML, Jensen CS, Huang XG, Lu H: SpaceTwist: Managing the trade-offs among location privacy, query performance, and query accuracy in mobile services. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering: 7–12 April 2008; Cancun. ACM, New York; 2008:366–375.
[5] Xiao YF, Xu HY: A location privacy protection method based on anonymous region transformation. Comput. Eng 2013, 39(1):157–163.
[6] Lu H, Jensen CS, Yiu ML: PAD: privacy-area aware, dummy-based location privacy in mobile services. In Proceedings of the 7th ACM International Workshop on Data Engineering for Wireless and Mobile Access: 9–12 June 2008; Vancouver. ACM, New York; 2008:16–23.
[7] Kido H, Yanagisawa Y, Satoh T: An anonymous communication technique using dummies for location-based services. In Proceedings of the 2nd International Conference on Pervasive Services: 11–14 July 2005; Santorini. IEEE, Piscataway; 2005:88–97.
[8] Gruteser M, Grunwald D: Anonymous usage of location-based services through spatial and temporal cloaking. In Proceedings of the 2nd International Conference on Mobile Systems, Applications, and Services: 5–8 May 2003; San Francisco. ACM, New York; 2003:31–42.
[9] Gedik B, Liu L: A customizable k-anonymity model for protecting location privacy. In Proceedings of the International Conference on Distributed Computing Systems: 6–10 June 2005; Columbus. Georgia Institute of Technology, Georgia; 2005:1–12.
[10] Gedik B, Liu L: Protecting location privacy with personalized k-anonymity: architecture and algorithms. Mobile Comput 2008, 7(1):1–18.
[11] Xiao Z, Meng XF, Xu JL: Quality aware privacy protection for location-based services. Adv. Database: Concepts Syst. Appl 2007, 10(33):434–446.
[12] Mokbel MF, Chow CY, Aref WG: The new Casper: query processing for location services without compromising privacy. In Proceedings of the 32nd International Conference on Very Large Data Bases: 12–15 September 2006; Seoul. Edited by: Dayal U, Whang KY, Lomet D, Alonso G, Lohman G, Kersten M, Cha SK, Kim YK. ACM, New York; 2006:736–774.
[13] Rubner Y, Tomasi C, Guibas LJ: The earth mover’s distance as a metric for image retrieval. Comput. Vis 2000, 40(2):99–121. 10.1023/A:1026543900054
[14] Wang T, Liu L: Privacy-aware mobile services over road networks. In Proceeding of the 35th International Conference on Very Large Data Bases: 24–28 August 2009; Lyon. ACM, New York; 2009:1042–1053.
[15] Xue J, Liu XY, Yang XC, Wang B: A location privacy preserving approach on road network. Chin. J. Comput 2011, 34(5):865–878. 10.3724/SP.J.1016.2011.00865
[16] Xu J, Xu M, Lin X, Zheng N: Location privacy protection through anonymous cells in road network. J. Zhejiang University (Engineering Science) 2011, 45(3):429–434.
[17] Zhao P, Ma CG, Gao XB, Zhu W: Protecting location privacy with Voronoi diagram over road networks. Comput. Sci 2013, 40(7):116–120.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.