- Research
- Open Access

# Fingerprint indoor positioning algorithm based on affinity propagation clustering

*EURASIP Journal on Wireless Communications and Networking*
**volume 2013**, Article number: 272 (2013)

## Abstract

Fingerprint-based wireless local area network (WLAN) positioning has recently gained significant interest. This paper proposes a probability distribution-aided indoor positioning algorithm based on affinity propagation clustering. Unlike conventional fingerprint-based WLAN positioning algorithms, the proposed algorithm first uses affinity propagation clustering to reduce the search space of reference points (RPs) and then applies the probability distribution-aided positioning algorithm to obtain the target's refined position. Because the clustering reduces the computational cost of the RP search involved in the probability distribution-aided positioning algorithm, the proposed algorithm lowers both the complexity and the power consumption of estimating the user's position. Experimental results obtained in a real environment show that the proposed algorithm significantly improves the performance of the probability distribution-aided positioning algorithm in both positioning accuracy and real-time ability.

## 1 Introduction

Over the past decade, indoor wireless local area network (WLAN) positioning technology has attracted significant attention from a variety of universities and research institutes [1–3]. Time of arrival (ToA), angle of arrival (AoA), and received signal strength (RSS) are the three most representative measurements for position estimation. Compared with the ToA and AoA measurements, the RSS can be measured more easily, without any additional special hardware, in current open public WLAN networks. However, the most significant challenge with RSS readings is their irregular variation due to variable radio channel attenuation, signal shadowing, multi-path interference, and even variations in indoor temperature [4].

In general, two approaches are used by existing WLAN positioning techniques for position estimation. One effective solution is the k-nearest neighbor (kNN) algorithm, which estimates the mobile user's position as the centroid of the K closest neighbors. The closest neighbors are the reference points (RPs) that have the smallest RSS distance to the newly collected on-line RSS readings [5]. The kNN algorithm can be easily implemented on the widely deployed WLAN infrastructure, but its accuracy is limited. The alternative approach is based on statistical analysis: it calculates, for each candidate RP, the confidence probability that the RP is the mobile user's position [6, 7].
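The kNN estimate described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `knn_position`, the radio-map layout, and the demo values are all assumptions made for the example.

```python
import math

def knn_position(radio_map, online_rss, k=3):
    """Estimate a position as the centroid of the k RPs closest in RSS space.

    radio_map: list of ((x, y), [rss_1, ..., rss_L]) pairs, one per RP (assumed layout).
    online_rss: list of L averaged on-line RSS values in dBm.
    """
    # Rank the RPs by Euclidean distance between stored and on-line RSS vectors.
    ranked = sorted(radio_map, key=lambda rp: math.dist(rp[1], online_rss))
    nearest = ranked[:k]
    # The estimate is the centroid of the physical coordinates of the k nearest RPs.
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return (x, y)

# Hypothetical radio map: four RPs on a line, two APs.
demo_map = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((2.0, 0.0), [-45.0, -65.0]),
    ((4.0, 0.0), [-55.0, -55.0]),
    ((6.0, 0.0), [-70.0, -40.0]),
]
print(knn_position(demo_map, [-43.0, -67.0], k=2))  # centroid of the first two RPs
```

Because the estimate is a plain centroid, it can only fall inside the convex hull of the selected RPs, which is one reason the accuracy of kNN depends on the RP density.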

Clustering the RSS readings by a measure of their similarity is a critical step in fingerprint-based WLAN positioning. Performing the RSS clustering prior to the positioning process has two main advantages. First, it helps to mitigate the degradation in positioning accuracy caused by RSS deviations and potential outlier readings. Second, it reduces the computation cost, since only the cluster centers are considered for the positioning [8].

In this paper, we present a new accurate and scalable positioning algorithm to estimate the user's position with low computation cost in a public WLAN environment. Our algorithm consists of two steps: (1) the coarse positioning step is used to obtain the cluster which the user belongs to; and (2) the fine positioning step is utilized to calculate the accurate coordinates of the user.

The paper is organized as follows. Section 2 reviews related work on fingerprint indoor positioning algorithms. Section 3 discusses the overall structure of our algorithm. Section 4 addresses the detailed steps of the off-line affinity propagation clustering, the on-line cluster matching-based coarse positioning, and the probability distribution-aided fine positioning. The performance of our proposed algorithm is verified in Section 5. Finally, Section 6 concludes this paper.

## 2 Related work

The RSS fingerprints are the vectors of RSS values recorded from the hearable access points (APs) in the target area. Fingerprint indoor positioning using the set of pre-calibrated RSS fingerprints (also called the radio map) is normally classified into three categories: (1) the deterministic approach; (2) the probabilistic approach; and (3) the machine learning approach, as illustrated in Figure 1. The widely known RADAR system [9, 10] is one of the most representative deterministic approaches to WLAN fingerprint indoor positioning; it uses the K nearest neighboring RPs to infer the user's position.

The basic idea of the probabilistic approach is to pre-store the RSS distribution with respect to each hearable AP in a radio map and use it for position estimation. For example, the Horus system [11] estimates the user's position as the location with the largest posterior probability under Bayesian inference [12]. Although Bayesian inference achieves high positioning accuracy, recording the RSS distributions at the RPs is time consuming.

Due to the significant computation cost involved in the previously mentioned approaches, researchers have begun to pay more attention to the machine learning approach (e.g., the artificial neural network [13, 14], support vector machine [15], and fuzzy logic [16, 17]) to realize fingerprint positioning. The most important advantage of the machine learning approach is its real-time ability to infer the user's coordinates in the on-line phase. However, most current machine learning positioning systems are designed for a small-scale area (e.g., approximately 600 m^{2} [18]), and their positioning accuracy relies heavily on the training process in the off-line phase.

To improve the positioning accuracy further, fingerprint processing such as RSS clustering and dimension reduction is commonly considered. A conventional approach to RSS clustering is to select a set of cluster centers that minimizes the sum of the squared distances between the RSS readings and their corresponding centers. As a representative example, the widely used k-means clustering begins with an initial set of randomly selected centers and then iteratively refines this set to decrease the sum of the squared distances. However, k-means clustering is quite sensitive to the initial selection of centers, so it is usually run several times with different initial centers to find the best clustering result. This dependence on an arbitrary initial selection limits its practical use. To solve this problem, affinity propagation clustering creates the centers and the corresponding clusters by repeatedly exchanging similarity messages between the RPs [19] until a high-quality set of centers and corresponding clusters gradually emerges.

## 3 Overall structure of proposed positioning algorithm

The block diagram of our proposed indoor positioning algorithm is shown in Figure 2. The algorithm contains two phases: (1) in the off-line phase, we construct the radio map and conduct the affinity propagation clustering; and (2) in the on-line phase, the cluster matching-based coarse positioning and the probability distribution-aided fine positioning are performed in turn. The detailed steps of the block diagram in Figure 2 are analyzed in Section 4.

In the off-line phase, we first use a WLAN mobile device to collect the RSS readings from the hearable APs and construct the radio map of the area of interest. During the construction of the radio map Ψ (see details below), the corresponding physical coordinates of the RPs are also stored. Then, the affinity propagation clustering is applied to the radio map to cluster the raw RPs.

In the on-line phase, the first step is to use the mobile device to collect new on-line RSS readings. The coarse positioning is then performed before the fine positioning in order to reduce the on-line computation cost and to improve the accuracy of the probability distribution-aided positioning algorithm.

Due to the time-varying nature of radio propagation in indoor WLAN environments (e.g., multi-path effects, RSS shadowing, and adjacent channel interference), fingerprint-based positioning is preferred in practical use [20]. During the off-line phase, the RSS readings are collected at pre-calibrated positions, which are the RPs. We denote the *τ*-th RSS reading from AP_{i} at RP_{j} as {*ψ*_{i,j}(*τ*), *τ* = 1, …, *q*, *q* > 1}, where *q* is the number of RSS readings. Normally, the average of the RSS readings is computed and stored in a database known as the radio map Ψ, which describes the spatial distribution of the RSS in the target area:

$$\Psi = \begin{bmatrix} \psi_{1,1} & \psi_{1,2} & \cdots & \psi_{1,N} \\ \psi_{2,1} & \psi_{2,2} & \cdots & \psi_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ \psi_{L,1} & \psi_{L,2} & \cdots & \psi_{L,N} \end{bmatrix} \qquad (1)$$

where $\psi_{i,j}=\frac{1}{q}\sum_{\tau=1}^{q}\psi_{i,j}(\tau)$ ($i=1,2,\dots,L$; $j=1,2,\dots,N$) is the average of the RSS readings from AP_{i} at RP_{j} over the time domain. *L* and *N* are the numbers of access points (APs) and RPs, respectively. Each column of the radio map Ψ (i.e., $\overrightarrow{\psi_{j}}=\left[\psi_{1,j},\psi_{2,j},\dots,\psi_{L,j}\right]^{T}$, *j* = 1, 2, …, *N*) represents the sequence of RSS readings at RP_{j}. The superscript ‘*T*’ denotes the transposition operation.
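The construction of the radio map can be illustrated with a short sketch. The nested-list layout `readings[j][i]` and the function names are assumptions made for this example; only the time averaging over the *q* readings and the *L* × *N* shape follow the description above.

```python
def build_radio_map(readings):
    """Build the L x N radio map of time-averaged RSS values.

    readings[j][i] is the list of q raw RSS samples (dBm) from AP i at RP j
    (an assumed input layout). Returns psi with psi[i][j] = mean RSS of AP i at RP j.
    """
    n_rps = len(readings)
    n_aps = len(readings[0])
    psi = [[0.0] * n_rps for _ in range(n_aps)]
    for j, per_ap in enumerate(readings):
        for i, samples in enumerate(per_ap):
            # Average the q readings over the time domain.
            psi[i][j] = sum(samples) / len(samples)
    return psi

def fingerprint(psi, j):
    """Return column j of the radio map, i.e. the RSS fingerprint of RP j."""
    return [row[j] for row in psi]

# Hypothetical raw data: two RPs, two APs, q = 2 samples each.
demo_readings = [
    [[-40.0, -42.0], [-70.0, -72.0]],  # RP 0
    [[-50.0, -52.0], [-60.0, -62.0]],  # RP 1
]
demo_psi = build_radio_map(demo_readings)
print(demo_psi)
print(fingerprint(demo_psi, 0))
```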

## 4 Detailed steps of proposed positioning algorithm

Unlike the conventional K-means clustering [21], the affinity propagation clustering in our proposed algorithm uses the *preference* (*p*) to label the RPs, and the RPs with a larger preference are more likely to be selected as cluster centers. It outperforms the K-means clustering because it is independent of the initialization and selects better cluster centers [22].

For the affinity propagation clustering, we first use the pairwise similarity *s*(*i*, *j*) to describe the fitness of RP_{j} to be selected as the cluster center of RP_{i}. Based on Equation (1), we can denote the RSS vector of each RP_{j} as $\overrightarrow{\psi_{j}}+\overrightarrow{\delta_{j}}$, where $\overrightarrow{\delta_{j}}$ is the measurement noise, which obeys the Gaussian distribution. Therefore, the pairwise similarity *s*(*i*, *j*) can be defined as the negative squared Euclidean distance in Equation (2).

$$s(i,j) = -\left\|\overrightarrow{\psi_{i}}-\overrightarrow{\psi_{j}}\right\|^{2}, \quad i,j\in\{1,2,\dots,N\},\ i\neq j \qquad (2)$$
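As an illustration of the pairwise similarity, the following sketch computes a similarity matrix from a list of RSS fingerprints. It assumes the negative squared Euclidean distance that is customary for affinity propagation [23]; the function name and the demo fingerprints are hypothetical.

```python
def similarity_matrix(fingerprints):
    """Return s with s[i][j] = -||psi_i - psi_j||^2 for i != j.

    fingerprints: list of N RSS vectors (one per RP), each of length L.
    The diagonal is left at 0.0 here; it is later overwritten by the preference.
    """
    n = len(fingerprints)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                s[i][j] = -sum((a - b) ** 2
                               for a, b in zip(fingerprints[i], fingerprints[j]))
    return s

# Hypothetical fingerprints of three RPs seen by two APs.
demo_fps = [[-40, -70], [-43, -66], [-70, -40]]
demo_s = similarity_matrix(demo_fps)
print(demo_s[0][1], demo_s[0][2])
```

A similarity closer to zero means two RPs have more alike fingerprints, so RPs that are physically adjacent tend to obtain the largest (least negative) similarities.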

Furthermore, two types of messages are exchanged among the RPs in the affinity propagation clustering: (1) the responsibility message *r*(*i*, *j*), which carries information about the cluster center; and (2) the availability message *a*(*i*, *j*), which describes the attachment relations between the RPs and the clusters.

RP_{i} sends the responsibility message to each candidate cluster center RP_{j} to convey the accumulated fitness of RP_{j} to be selected as the cluster center of RP_{i}. By taking all the other potential cluster centers *j*′ of RP_{i} into account, we can obtain

$$r(i,j) = s(i,j) - \max_{j'\neq j}\left\{a(i,j') + s(i,j')\right\} \qquad (3)$$

where *a*(*i*, *j*) is the availability message, as defined in Equation (5). Meanwhile, the self-similarity *s*(*i*, *i*), which is known as the *preference* (*p*), is set to the median of the input similarities, which results in a moderate number of clusters:

$$p = s(i,i) = \mathrm{median}\left\{s(i,j),\ i,j\in\{1,2,\dots,N\},\ i\neq j\right\} \qquad (4)$$
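A small sketch of the median preference follows. It assumes the similarity matrix is stored as a list of lists and that only off-diagonal entries count as input similarities; the function name `median_preference` and the demo matrix are illustrative.

```python
import statistics

def median_preference(s):
    """Return the median of the off-diagonal entries of similarity matrix s."""
    n = len(s)
    off_diag = [s[i][j] for i in range(n) for j in range(n) if i != j]
    return statistics.median(off_diag)

# Hypothetical 3 x 3 similarity matrix (negative squared RSS distances).
demo_s = [
    [0, -25, -1800],
    [-25, 0, -1405],
    [-1800, -1405, 0],
]
print(median_preference(demo_s))
```

Using the same preference for every RP gives all RPs the same prior chance of becoming a cluster center; raising or lowering this value is the usual way to steer the number of clusters.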

The availability message *a*(*i*, *j*) is sent from each candidate cluster center RP_{j} to RP_{i}. It describes the accumulated fitness for RP_{i} to select RP_{j} as its center:

$$a(i,j) = \min\left\{0,\ r(j,j) + \sum_{i'\notin\{i,j\}} \max\left\{0, r(i',j)\right\}\right\} \qquad (5)$$

Similarly, the self-availability *a*(*j*, *j*) reflects the accumulated fitness for RP_{j} to be selected as a center. Because only the positive responsibilities are taken into account, we have

$$a(j,j) = \sum_{i'\neq j} \max\left\{0, r(i',j)\right\} \qquad (6)$$

These messages are exchanged among the RPs until the optimal cluster centers are found. When the messages are updated, they must be damped to avoid the numerical oscillations that arise in some circumstances: each message is set to λ times its value from the previous iteration plus 1 − λ times its prescribed updated value, where the damping factor λ is between 0 and 1. Each iteration involves three steps [23]: (1) updating the responsibilities by Equation (7); (2) updating the availabilities by Equation (8); and (3) combining the availabilities and responsibilities to determine the cluster centers by Equation (9). In our experiments, we repeat this process until the cluster centers have not changed for ten iterations or the number of iterations exceeds 300.
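The three steps of each iteration can be sketched in plain Python as follows. This is a simplified illustration of damped message passing in the style of [23], not the code used in the experiments: the function name, the default arguments (damping 0.65, at most 300 iterations, ten stable iterations), and the demo fingerprints are assumptions chosen to mirror the values mentioned in the text.

```python
import statistics

def affinity_propagation(s, damping=0.65, max_iter=300, stable_iter=10):
    """Cluster N points from an N x N similarity matrix s (list of lists).

    Returns a list whose i-th entry is the index of the center chosen for point i.
    """
    n = len(s)
    # Preference: median of the off-diagonal similarities, placed on the diagonal.
    p = statistics.median(s[i][j] for i in range(n) for j in range(n) if i != j)
    sim = [[p if i == j else s[i][j] for j in range(n)] for i in range(n)]
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    centers, unchanged = None, 0

    for _ in range(max_iter):
        # Step 1: damped responsibility update.
        for i in range(n):
            scores = [a[i][k] + sim[i][k] for k in range(n)]
            for j in range(n):
                best_other = max(scores[k] for k in range(n) if k != j)
                target = sim[i][j] - best_other
                r[i][j] = damping * r[i][j] + (1 - damping) * target
        # Step 2: damped availability update (self-availability has no min with 0).
        for j in range(n):
            total = sum(max(0.0, r[k][j]) for k in range(n) if k != j)
            for i in range(n):
                if i == j:
                    target = total
                else:
                    target = min(0.0, r[j][j] + total - max(0.0, r[i][j]))
                a[i][j] = damping * a[i][j] + (1 - damping) * target
        # Step 3: each point picks the candidate maximizing a(i, j) + r(i, j).
        new_centers = [max(range(n), key=lambda j: a[i][j] + r[i][j])
                       for i in range(n)]
        if new_centers == centers:
            unchanged += 1
            if unchanged >= stable_iter:
                break  # centers stable for stable_iter consecutive iterations
        else:
            centers, unchanged = new_centers, 0
    return centers

# Hypothetical fingerprints: two groups of three RPs each, two APs.
demo_fps = [[-40, -70], [-42, -69], [-41, -72], [-70, -40], [-69, -43], [-72, -41]]
demo_s = [[-sum((u - v) ** 2 for u, v in zip(f, g)) for g in demo_fps]
          for f in demo_fps]
demo_centers = affinity_propagation(demo_s)
print(demo_centers)
```

The sketch costs on the order of N² to N³ operations per iteration, which is acceptable for a few hundred RPs in an off-line phase; a production implementation would normally vectorize the updates.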

In the on-line phase, we collect the new RSS readings at unknown positions, $\overrightarrow{\psi_{r}}=\left[\psi_{1,r},\dots,\psi_{L,r}\right]^{T}$, where {*ψ*_{k,r}, *k* = 1, …, *L*} is the average of the new RSS readings from AP_{k}. We define H and C_{j} as the set of cluster centers and the set of RPs with the center RP_{j} ∈ H, respectively. After the coarse positioning, the candidate cluster selected for the fine positioning can be obtained by Equation (10).
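One plausible reading of the cluster matching step is sketched below. It assumes that the on-line RSS vector is compared with the fingerprint of each cluster center using the same negative squared Euclidean similarity, and that the cluster of the best-matching center is selected; this matching rule, the function name, and the demo values are assumptions, since the exact form of Equation (10) is not shown here.

```python
def coarse_position(online_rss, fingerprints, centers):
    """Select the candidate cluster for fine positioning.

    online_rss: averaged on-line RSS vector of length L.
    fingerprints: list of N RSS vectors, one per RP.
    centers: centers[j] is the index of the cluster center of RP j.
    Returns (index of the chosen center, list of RP indices in its cluster).
    """
    def sim(u, v):
        # Negative squared Euclidean distance between two RSS vectors.
        return -sum((a - b) ** 2 for a, b in zip(u, v))

    center_set = sorted(set(centers))  # the set H of cluster centers
    best = max(center_set, key=lambda h: sim(online_rss, fingerprints[h]))
    members = [j for j, c in enumerate(centers) if c == best]  # the set C_best
    return best, members

# Hypothetical clustering result: RPs 0-1 follow center 0, RPs 2-3 follow center 2.
demo_fps = [[-40, -70], [-42, -68], [-70, -40], [-68, -42]]
demo_centers = [0, 0, 2, 2]
print(coarse_position([-66, -45], demo_fps, demo_centers))
```

Only the members of the returned cluster are passed to the fine positioning step, which is where the reduction of the on-line search space comes from.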

For the fine positioning, we calculate the matching probability between the newly collected on-line RSS readings and the pre-stored fingerprints in the radio map. Assuming a Gaussian probability distribution of the RSS readings at each RP, the RSS values obey the normal distribution *N*(*μ*, *σ*^{2}) [7].

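First, for a set of *n* independent RSS readings *x*_{1}, …, *x*_{n}, the likelihood function is calculated by

$$L(\mu,\sigma^{2}) = \prod_{t=1}^{n}\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{(x_{t}-\mu)^{2}}{2\sigma^{2}}\right) \qquad (11)$$

Second, we can obtain the logarithmic equation in Equation (12).

$$\ln L(\mu,\sigma^{2}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^{2} - \frac{1}{2\sigma^{2}}\sum_{t=1}^{n}(x_{t}-\mu)^{2} \qquad (12)$$

Third, the likelihood equations are

$$\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^{2}}\sum_{t=1}^{n}(x_{t}-\mu) = 0, \qquad \frac{\partial \ln L}{\partial \sigma^{2}} = -\frac{n}{2\sigma^{2}} + \frac{1}{2\sigma^{4}}\sum_{t=1}^{n}(x_{t}-\mu)^{2} = 0 \qquad (13)$$

Last, by solving Equation (13), one has

$$\sum_{t=1}^{n}x_{t} = n\mu, \qquad \sum_{t=1}^{n}(x_{t}-\mu)^{2} = n\sigma^{2}$$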

The likelihood equations have a unique solution (*μ*^{*}, *σ*^{*2}), which is also a maximum point: because the non-negative function *L*(*μ*, *σ*^{2}) → 0 when |*μ*| → ∞, *σ*^{2} → 0, or *σ*^{2} → ∞, the maximum must be attained at this stationary point. Therefore, the maximum likelihood estimates of *μ* and *σ*^{2} are

$$\mu^{*} = \frac{1}{|X|}\sum_{x\in X}x, \qquad \sigma^{*2} = \frac{1}{|X|}\sum_{x\in X}\left(x-\mu^{*}\right)^{2}$$

where X is the set of RSS readings and |X| is the number of readings in X. Based on the statistical properties of the maximum likelihood estimates *μ*^{*} and *σ*^{*2}, we can approximately take *μ*^{*} as the RSS fingerprint at each RP. Accordingly, the mean RSS reading *e*_{i} and the corresponding variance *d*_{i} from each hearable AP_{i} should be calculated for the construction of the radio map in the off-line phase.
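The maximum likelihood estimates of the Gaussian parameters reduce to the sample mean and the biased (divide-by-n) sample variance. A minimal sketch, with a hypothetical function name and demo readings, is:

```python
def gaussian_mle(samples):
    """Return the maximum likelihood estimates (mean, variance) of a Gaussian.

    The variance divides by n, not n - 1, as required by the likelihood equations.
    """
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

# Hypothetical RSS readings (dBm) from one AP at one RP.
demo_samples = [-50.0, -52.0, -48.0, -50.0]
print(gaussian_mle(demo_samples))
```

In the off-line phase this pair would be computed once per AP and per RP and stored alongside the RP coordinates.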

In the on-line phase, after collecting the new RSS readings {rss_{i}, *i* = 1, 2, …, *L*}, we can calculate the probability *P*_{i}(*x*, *y*) of the RP at (*x*, *y*) with respect to the *i*-th AP according to Equation (18), in which *μ* = *e*_{i} and *σ*^{2} = *d*_{i}.

$$P_{i}(x,y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(\mathrm{rss}_{i}-\mu)^{2}}{2\sigma^{2}}\right) \qquad (18)$$

Then, the probability of each RP *P*(*x*, *y*) can be calculated. Finally, we will locate the user's position at the RP which has the maximum probability.
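The fine positioning step can be sketched as follows. The sketch assumes that the per-AP probabilities are combined as a product, i.e., that the RSS readings from different APs are treated as independent, and it sums log-probabilities for numerical stability; this combination rule, the function names, and the demo values are assumptions rather than the exact formulation of the paper.

```python
import math

def log_gaussian(x, mean, var):
    """Log of the Gaussian density N(mean, var) evaluated at x (var must be > 0)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def fine_position(online_rss, candidates):
    """Return the coordinates of the candidate RP with the highest probability.

    online_rss: list of L on-line RSS values, one per AP.
    candidates: list of ((x, y), means, variances) for the RPs of the cluster
                selected by the coarse positioning (assumed layout).
    """
    def score(rp):
        _, means, variances = rp
        # Sum of per-AP log-probabilities = log of the product of P_i(x, y).
        return sum(log_gaussian(r, m, v)
                   for r, m, v in zip(online_rss, means, variances))
    return max(candidates, key=score)[0]

# Hypothetical cluster with two RPs and two APs.
demo_candidates = [
    ((1.0, 1.0), [-40.0, -70.0], [4.0, 4.0]),
    ((3.0, 1.0), [-45.0, -65.0], [4.0, 4.0]),
]
print(fine_position([-44.0, -66.0], demo_candidates))
```

Because only the RPs of one cluster are scored, the cost of this step grows with the cluster size rather than with the total number of RPs.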

## 5 Experimental results

Figure 3 shows the target indoor WLAN positioning environment used for our testing. Using RSS readings collected from nine public APs (Cisco WRT54G), we compare the performance of our proposed algorithm with three other typical positioning algorithms: (1) the kNN positioning algorithm with K-means clustering (K-means + Knn); (2) the probability distribution-aided positioning algorithm with K-means clustering (K-means + Probability Distribution); and (3) the kNN positioning algorithm with affinity propagation clustering (Affinity Propagation + Knn). The testing area measures 66 m × 22 m.

### 5.1 Clustering results

In the experiments, we focus only on damping factors in the range [0.5, 0.9], because only a damping factor in this range can guarantee that the affinity propagation clustering has converged when the clustering process ends. Figures 4 and 5 show the numbers of iterations and clusters, respectively, as functions of the damping factor. For clarity, we compare the numbers of iterations and clusters for 9 APs (all the APs), 5 APs (about half the APs), and 3 APs (the three APs with the strongest RSS readings). As can be seen in Figures 4 and 5, the damping factor λ = 0.65 gives the lowest computation cost (the smallest number of iterations) while also yielding a stable number of clusters.

Figure 6 shows the number of clusters with respect to the value of the *preference* (the parameter *p*) for the affinity propagation clustering. A large value of *p* results in a small number of clusters, as expected. To give all the RPs an equal probability of being cluster centers, we first set *p* to the median of the input similarities to generate a proper number of clusters, as described in Equation (4). After that, we can tune the value of *p* for better clustering performance [21].

Figure 7 gives the results of the affinity propagation clustering of the RPs. The solid circles represent the calibrated RPs, and the RPs belonging to different clusters are shown in different colors. The 182 RPs have been grouped into seven clusters, and the RPs in the same cluster are physically adjacent.

### 5.2 Positioning results

By randomly selecting 81 test positions in the target area (see Figure 8), we compare the error performance of the K-means + Knn, K-means + Probability Distribution, Affinity Propagation + Knn, and our proposed algorithms in Table 1. The red line in the target area marks the path walked by the user. Furthermore, the cumulative distribution functions (CDFs) of the positioning errors are compared in Figure 9.

### 5.3 Error analysis

Based on the previous experimental results, which were obtained in a public indoor WLAN environment, we can observe that: (1) our proposed probability distribution-aided positioning algorithm reduces the mean error by 34.02%, 28.3%, and 16.17% compared with the K-means + Knn, K-means + Probability Distribution, and Affinity Propagation + Knn positioning algorithms, respectively; and (2) our proposed algorithm also increases the confidence probability of errors within 3 m to 80.49%, which is significantly larger than the probabilities of 43.9%, 62.2%, and 65.85% achieved by the K-means + Knn, K-means + Probability Distribution, and Affinity Propagation + Knn positioning algorithms, respectively.
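The two metrics used in this analysis, the relative reduction of the mean error and the probability of errors within a threshold (one point of the error CDF), can be computed as sketched below. The function names and all numerical values in the demo are hypothetical and are not the measurements of this paper.

```python
import math

def positioning_errors(estimates, truths):
    """Euclidean distance (m) between each estimated and true position."""
    return [math.dist(e, t) for e, t in zip(estimates, truths)]

def fraction_within(errors, threshold):
    """Empirical CDF value at threshold: share of errors not exceeding it."""
    return sum(1 for e in errors if e <= threshold) / len(errors)

def relative_reduction(baseline_mean, new_mean):
    """Percentage by which new_mean is lower than baseline_mean."""
    return (baseline_mean - new_mean) / baseline_mean * 100.0

# Hypothetical test positions and estimates (metres).
demo_truths = [(0.0, 0.0)] * 4
demo_estimates = [(3.0, 4.0), (1.0, 0.0), (0.0, 2.0), (0.0, 0.0)]
demo_errors = positioning_errors(demo_estimates, demo_truths)
print(sum(demo_errors) / len(demo_errors))  # mean error
print(fraction_within(demo_errors, 3.0))    # confidence probability within 3 m
print(relative_reduction(4.0, 3.0))         # reduction relative to a baseline mean
```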

## 6 Conclusion

This paper proposes a new probability distribution-aided indoor positioning algorithm based on affinity propagation clustering. Compared with conventional indoor fingerprint-based positioning algorithms, the positioning search space and the computing cost are reduced. Because the affinity propagation clustering can be regarded as a preprocessing step for conventional fingerprint-based positioning algorithms, the proposed method can also be applied to other fingerprint-based wireless positioning systems, such as RFID and mobile cellular networks. However, because the method depends significantly on the RSS distributions and the deployment of the RPs, future research is required to optimize the layout of the APs and the fingerprint modification in order to improve the positioning accuracy.

## References

1. Zhou M, Wong AK, Tian Z, Zhang VY, Yu X, Luo X: Adaptive mobility mapping for people tracking using unlabelled Wi-Fi shotgun reads. *IEEE Commun. Lett.* 2013, 17(1):87-90.
2. Gezici S: A survey on wireless position estimation. *Wirel. Pers. Commun.* 2008, 44(3):263-282. 10.1007/s11277-007-9375-z
3. Nicoli M, Gezici S, Sahinoglu Z, Wymeersch H: Localization in mobile wireless and sensor networks. *EURASIP J. Wirel. Commun. Netw.* 2011, 2011: 197.
4. Zhou M, Tian Z, Xu K, Yu X, Wu H: Theoretical entropy assessment of fingerprint-based Wi-Fi localization accuracy. *Expert. Syst. Appl.* 2013, 40(15):6136-6149. 10.1016/j.eswa.2013.05.038
5. Khodayari S, Maleki M, Hamedi E: A RSS-based fingerprinting method for positioning based on historical data. In *Proc. IEEE SPECTS*. Ottawa, ON, 11–14 July 2010:306-310.
6. Kushki A, Plataniotis KN, Venetsanopoulos AN: Kernel-based positioning in wireless local area networks. *IEEE Trans. on Mobile Computing* 2007, 6(6):689-705.
7. Youssef M, Agrawala A: The Horus WLAN location determination system. In *Mobile Systems, Applications and Services*. Seattle, WA, 06–08 June 2005:205-218.
8. Moragrega A, Closas P, Ibars C: LACFA: an algorithm for localization aware cluster formation in wireless sensor networks. *EURASIP J. Wirel. Commun. Netw.* 2011, 2011: 121. 10.1186/1687-1499-2011-121
9. Bahl P, Padmanabhan V: RADAR: An in-building RF-based user location and tracking system. *Proc. IEEE INFOCOM* 2000, 2: 775-784.
10. Bahl P, Padmanabhan V, Balachandran A: *Enhancements to the RADAR user location and tracking system*. Microsoft Research, no. MSR-TR-2000-12; 2000.
11. Youssef M, Agrawala A: The Horus location determination system. *Wirel. Netw.* 2008, 14(3):357-374. 10.1007/s11276-006-0725-7
12. Zang H, Baccelli F, Bolot J: Bayesian inference for localization in cellular networks. In *Proc. IEEE INFOCOM*. San Diego, CA, 14–19 March 2010:1-9.
13. Zhou M, Xu Y, Tang L: Multilayer ANN indoor location system with area division in WLAN environment. *J. Syst. Eng. Electron.* 2010, 21(5):914-926.
14. Ding X, Li H, Li F, Wu J: A novel infrastructure WLAN locating method based on neural network. In *Proc. AINTEC*. Bangkok, 18–20 November 2008:47-55.
15. Mundo L, Ansay R, Festin C, Ocampo R: A comparison of wireless fidelity (Wi-Fi) fingerprinting techniques. In *Proc. IEEE ICTC*. Seoul, 28–30 September 2011:20-25.
16. Teuber A, Eissfeller B, Pany T: A two-stage fuzzy logic approach for wireless LAN indoor positioning. In *Proc. IEEE PLANS*. San Diego, CA, 25–27 April 2006:730-738.
17. Zhou M, Xu Y, Ma L: Radio-map establishment based on fuzzy clustering for WLAN hybrid KNN/ANN indoor positioning. *China Commun.* 2010, 7(3):64-80.
18. Battiti R, Nhat T, Villani A: *Location-aware computing: a neural network model for determining location in wireless LANs*. University of Trento: Technical Report DIT-02-083, Ingegneria e Scienza dell’ Informazione; 2002.
19. Chan H, Luk M, Perrig A: Using clustering information for sensor network localization. In *Proc. IEEE DCOSS*. Marina del Rey, CA, 30 June–01 July 2005:109-125.
20. Zhou M, Xu Y, Ma L, Tian S: On the statistical errors of RADAR location sensor networks with built-in Wi-Fi Gaussian linear fingerprints. *Sensors* 2012, 12(2):3605-3626.
21. Gokcay E, Principe J: Information theoretic clustering. *IEEE Trans Pattern Anal Mach Intell* 2002, 24(2):158-172. 10.1109/34.982897
22. Wang Y, Li W, Sun Y: Wireless sensor network cluster locations: a probabilistic inference approach. In *Proc. IEEE ICAL*. Chongqing, 15–16 August 2011:76-80.
23. Frey BJ, Dueck D: Clustering by passing messages between data points. *Science* 2007, 315: 1.

## Acknowledgements

This work was supported in part by the National Science and Technology Major Project of China (2012ZX03006-002(3)), National Natural Science Foundation of China (61301126), Fundamental and Frontier Research Project of Chongqing (cstc2013jcyjA40032, cstc2013jcyjA40034 and cstc2013jcyjA40041), Special Fund of Chongqing Key Laboratory (CSTC), Science and Technology Project of Chongqing Municipal Education Commission (KJ130528), Startup Foundation for Doctors of CQUPT (A2012-33), and Science Foundation for Young Scientists of CQUPT (A2012-77).

## Additional information

### Competing interests

The authors declare that they have no competing interests.

## About this article

### Cite this article

Tian, Z., Tang, X., Zhou, M. *et al.* Fingerprint indoor positioning algorithm based on affinity propagation clustering. *J Wireless Com Network* **2013**, 272 (2013). doi:10.1186/1687-1499-2013-272

### Keywords

- WLAN indoor positioning
- Fingerprinting
- Affinity propagation clustering
- RSS
- Probability distribution