 Research
 Open Access
Indoor robot localization combining feature clustering with wireless sensor network
 Xiaoming Dong^{1},
 Benyue Su^{1} and
 Rong Jiang^{2}Email author
https://doi.org/10.1186/s1363801811791
© The Author(s). 2018
 Received: 7 May 2018
 Accepted: 11 June 2018
 Published: 11 July 2018
Abstract
Indoor robot localization is an indispensable ingredient for robots to perform autonomous services because GPS (Global Position System) information is not available. Natural features are usually used to implement this task, but it is difficult to solve the problem of localization robustness. A solution is proposed combining feature clustering and wireless sensor network to improve the effectiveness of robot localization: firstly, the SIFT (scalable invariable feature transform) features are extracted with feature clustering algorithm to estimate the robot position; secondly, the wireless sensor network is constructed to localize the robot from another independent way; finally, EKF (extended Kalman filter) is utilized to fuse the two kinds of localization results. The experiments demonstrate that this proposed method is effective and robust.
Keywords
 Robot
 EKF
 Localization
 Wireless sensor network
1 Introduction
GPS is commonly used to implement robot localization, but GPS cannot be used in some conditions such as indoor environments [1]. A common way to implement indoor localization is to estimate robots’ pose utilizing inertial sensors; however, this method is difficult to resolve the problem of robot’s wheel slor robot localization combining feature clusteipping, so the accumulated errors impact the estimating accuracy greatly [2]. In magazine [3], the NonCooperative Feature Points are extracted to implement the task of navigation, which is usually based on the satellite feature points. In magazine [4], the graphbased methods have been carried out for the robot localization and navigation. In magazine [5], the whole simulation process of robot navigation has been implemented. The drawback of the above methods is that it is difficult to obtain the robust feature points, which are easy to be affected by light changes. Recently, wireless sensors are used in mobile robot navigation as reference nodes; the methods can be divided into two categories [6–8]: One is to implement robot localization utilizing wireless sensor networks on their own sensor nodes [9], and the other is to carry out that on external targets [10]. This article focuses on the later that is to implement robot localization utilizing the external sensor targets. Vision features are another commonly used in robot localization and motion estimation. According to the specific localization mechanism, existing wireless sensor network localization methods can be divided into two categories [11]: A rangebased method and a rangefree method. The rangingbased positioning mechanism needs to measure the distance or angle information between the unknown node and the anchor node, and then use trilateration, triangulation, or maximum likelihood estimation to calculate the position of the unknown node. The nonrangingbased positioning mechanism does not require distance or angle information, or does not directly measure the information, and only implements node positioning based on network connectivity and other information. Localization based on ranging technology have the advantages that it can obtain higher accuracy, and the commonly used ranging technologies are RSSI (received signal strength indicator), TOA (time of arrival), TDOA (time difference of arrival) and AOA [12]. TOA (angle of arrival) is the time of arrival method. TDOA is the time difference of arrival positioning method. Both are positioning methods based on the propagation time of the waves. It is necessary to have three base stations known at the same time to assist in positioning. The basic principle of TOA is to get the distances from the equipment to the three base stations after getting the three arrival times, and then just build the equations and solve them according to the geometry, so as to obtain the location value. TDOA does not solve the distance immediately, but calculates the time difference first, then establishes the equation group through some clever mathematics algorithm and solves, thus obtains the location value. Since the TOA calculation is completely timedependent, the time synchronization of the system is required to be very high. Any smalltime error will be amplified many times. At the same time, due to the influence of multipath, it will bring a great error, so the pure TOA is rarely used in practice. DTOA can greatly improve the accuracy of positioning because the ingeniously designed difference process cancels out a large part of the time error and multipath effect. Because DTOA has relatively low network requirements and high precision, it has become a research hotspot. AOA is the angle of arrival method, which is a twobase station positioning method that performs positioning based on the incident angle of the signal. It determines the position by intersecting two straight lines, it is impossible to have multiple intersections, which avoids the ambiguity of positioning. However, in order to measure the incident angle of the electromagnetic wave, the receiver must be equipped with a highly directional antenna array. RSS positioning method is based on the strength of the received signal to achieve positioning. In the positioning process, the signal intensity of three different reference points is measured by the device, and three distance values are calculated according to the physical model. Then, a geometric solution method similar to TOA can be used to obtain the positioning point. Common RF chips all have RSSI measurement function, so the RSSI mechanism is easy to implement. The problem is it is susceptible to channel and noise, and it has large measurement error in longdistance positioning, which is mostly used in smallrange positioning [13]. The visual information is also commonly used to implement the robot localization. This series of methods can be divided into monocular visual localization method and stereoscopic visual localization method [14–16]. The purpose is to use the robot vision system to sense the environment, identify the location of the road signs and obtain local map information, and continuously use the obtained local map. The main problems existing in visual localization research are it is difficult to implement feature extraction and landmark recognition quickly and accurately through robot vision systems in complex environments [17, 18]; the amount of image features required for landmark recognition is too large, especially in a complex and largescale environment [19, 20]. In the case of global maps, there is a problem of information explosion. This increases the complexity and uncertainty of the localization task. Wireless sensors are flexible for robot localization, but it is difficult to achieve high accuracy only by themselves. On the other way, vision sensors are easy to be affected by many factors: the degree of image distortion correction, the tracking error of the feature points, and so on [21–23]. To improve the robustness of the robot localization, SIFT [24] and supersphere aggregation algorithm [25] are selected as the vision features. The object of this paper is to develop a method integrating distinguished features of indoor environment and wireless sensor network to improve the effect of robot localization.
This paper is organized as follows: Section 2 described the main methods; various experiments were implemented to verify the proposed method in Section 3. Section 4 gives conclusions.
2 Methods
The task of robot localization is to estimate the robot’s position as well as to set up a map of the surrounding environment according to the landmarks perceived. For the convenience of description, m represents landmark and x represents robot position; u represents the robot drive. In order to implement robot localization task, the distance that the robot has moved should be estimated and the position of the landmark should be calculated simultaneously. The robot updates the estimated position when the new landmark was found in the navigation process. Apparently, there are some errors in the robot localization between estimation position and real position, as well as that in the landmark estimation. The target of robot localization is to reduce the estimation errors of the robot position and surrounding environment.
2.1 Feature clustering for robot localization
Features can be clustered to improve the robustness in robot localization. The hypersphere soft assignment [25] is used to cluster the features in our design. The principle of the hypersphere soft assignment is to construct a hypersphere based on each clustering center in the feature space, and the feature points are assigned to the cluster centers corresponding to the hypersphere in the space position.
The hypersphere can be constructed in the feature space for each clustering center c_{i}, where the radius is radius_{i}. The feature points can be assigned to the corresponding cluster center according to the spatial position relative to all hyperspheres.
For a feature point x_{j} (x_{j}ϵR^{d}), a K dimensional attribution vector attrib_{j} = {attrib_{j1}, …, attrib_{jk}} is defined, where attib_{ji} indicates whether the feature point x_{j} is allocated to the clustering center: attrib_{ji} = 1 represents that x_{j} is allocated to c_{i}, attrib_{ij} = 0 represents that it is not allocated to c_{i}. The characteristic this soft allocation mechanism of the hypersphere is represented as follows:
Only when the Euclidean distance between the feature point x_{j} and the cluster center c_{i} is less than or equal to the corresponding radius of the super sphere, x_{j} would be allocated to c_{i}. The value of component attrib_{ji} in the corresponding vector attrib_{j} is set to 1.
a, b,…_{,}I are assumed as the clustering centers, and x_{1}, x_{2}, x_{3}, and x_{4} are the four feature points to be allocated. The policy of soft assignment can be described as follows: firstly, the hypersphere needs to be constructed for each clustering center, where the clustering center is set as the center of the corresponding hypersphere. The clustering centers are assumed as b, c, d, and e with the corresponding circles; secondly, x_{1}, x_{2}, x_{3}, and x_{4} will be assigned to a cluster center according to whether they are located within the corresponding circle. For example, x_{1} is only located within the circle corresponding to b, so x_{1} is only assigned to b; feature points as x_{3} and x_{4} are located within the circles corresponding to b and c, so the x_{3} and x_{4} are both allocated to b and c; x_{2} is located within the circles corresponding to c, d, and e, it will be allocated to the c, d, and e as three cluster centers.
d_{ij} is defined as the Euclidean distance, which is set from feature point x_{j} to clustering center c_{i}; σ is the standard deviation, which is assumed to be Gaussian mixture distribution. The optimal value can be determined through experiments. The weight vectors w_{j} needs to be L_{1} normalized, so that the weights of the cluster centers are calculated as follows.
VLAD is the abbreviation of “vector of locally aggregated descriptors,” which is an image descriptor aggregating the local features of the SIFT in the feature space. VLAD is proposed on the basis of the method of the BOF (bag of features) image descriptor, combined with the idea of the Fisher kernel. Similar to the BOF image descriptor, VLAD takes the Kmeans clustering algorithm as a training set to get k clustering centers. Each cluster center is called a visual word, and the set of K visual words is called a codebook C = {c_{1}, …c_{i}, …c_{k}}({c_{i}ϵR^{k}).
The soft assignment method can be combined with VLAD when it is used to generate the image descriptor SSAVLAD (Spherical Soft Assignment vector of locally aggregated descriptors). The way is similar to VLAD: a sample set is trained to obtain the codebook, which includes K cluster centers utilizing the Kmeans clustering algorithm. Differing from VLAD, the corresponding radius needs to be calculated for each visual word (the clustering center) in the training of C.

Step1: Using the hypersphere soft allocation method, each SIFT feature x_{j} is assigned to the adjacent visual words to obtain the attribution vector attrib_{j}.

Step2: Computing the weight vector w_{j} of the SIFT feature x_{j} relative to the nearest neighbor visual word.

Step3: The weighted vector residuals, which is the SIFT feature x_{j} relative to the nearest neighbor visual words, are calculated.

Step4: For each visual word C_{i}, allocate it for aggregation to the corresponding weighted residuals vector among all the features x_{j} belonging to the visual word.

Step5: The K aggregation vector v_{i} is connected in series to form an SSAVLAD vector {v_{1}, …v_{i}, …v_{k}}, where d is the vector dimension of the SIFT feature.
The vector dimension of SSAVLAD is the same as VLAD, which is k × d.
2.2 Robot localization based on wireless sensor network
The centroid localization algorithm is chosen as the auxiliary location method, which has the most important advantages: the little time consumption and the fast location speed. The essential of centroid algorithm is that the robot can use the geometric centroid of anchor nodes in the range of communication as its location estimation. The concrete process can be described as follows:
The robot position can be obtained according to the centroid coordinates, and the error of localization W_{t} is assumed as Gaussian distribution.
2.3 Robot localization combing feature clustering and WSN with EKF
2.3.1 EKF process in robot localization
Kalman filter is usually used as state estimation or parameter estimation for a system, which updates the mathematical expectation and covariance of Gaussian probability distribution under Bayesian theory framework. By minimizing the square of the deviation between the real state and the estimated state of the system, Kalman filter provides an effective estimation method for the system’s present and future state without the exact model of the system.
There is a requirement in the classic Kalman filter that a control system is linear, but most practical control systems are nonlinear. Therefore, the extended Kalman filter (EKF) is used in our design, where the nonlinear control process and the measurement model are linearized to make the traditional Kalman filter be appropriate for the nonlinear systems.
But EKF has two obvious shortcomings: first, derivation of the Jacobian matrix and linear estimation of nonlinear equations may be very complex, contributing to the difficulties in real applications; secondly, when the filter time step is not small enough, the linearization for nonlinear equations will lead to system instability.
The state prediction and the observation prediction are defined as follows respectively, where \( {\widehat{x}}_t^{} \) denotes the prediction value of robot position:
The EKF filtering algorithm is linearized through Taylor expansion to linearize the nonlinear motion model and observation model, and then, the update process can be obtained according to the similar way as Kalman filtering.
The time update equation in EKF can be represented as Eq. 12, where \( {\widehat{P}}_t^{} \) denotes the variance:
And the update equation of observation can be represented as Eq. 13:
Considering that the position of the measured landmark is not accurate absolutely. The appropriate motion model and observation model should be set up to analyze the robot localization error. In our designed experiment, the wheel robots are selected and its driven model can be set up as follows.
2.3.2 Robot model
Among which K_{x} and K_{y} are the robot drift coefficients, which are along the xaxis and the yaxis similarly, K_{Sθ} and K_{θθ} are the robot drift coefficients of angles. The values of the coefficients can represent the error in this model.
2.3.3 Observation model
In Eq. 20, the observation noise can be ordinarily assumed as Gaussian noise, among which the average is assumed to zero, and covariance matrix is assumed as diagonal matrix.
3 Experiment results and analysis
The Institute of Automation of Chinese Academy of Sciences has independently developed a versatile autonomous mobile robot platform called AIM based on its existing technology. The various theories and algorithms described in this paper are tested on the platform. The vision sensor is placed on the top of the mobile robot.
Standard deviation of different aggregation algorithm
Type  Position (cm)  Feature (cm) 

VLADR  9.5  11.2 
RNNVLADR  8.7  10.8 
SSAVLADR  7.3  9.2 
The average mean square values of feature error for different algorithms are shown in Fig. 1. The average feature error is smaller as dimension length increases. Among the three algorithms, the worst algorithm is the localization based on original VLAD, which has the largest feature estimation error. The best algorithm is the proposed method based on SSAVLAD, which has the smallest feature estimation error for the same dimension length.
The average position error values for different algorithms and different dimension length are shown in Fig. 2. It is shown that the average position errors of the three algorithms become smaller as dimension length increases. The worst algorithm is the method based on original VLAD, which has the largest average position error. The best algorithm is still the proposed method based on SSAVLAD, which has the smallest position estimation error for the same dimension length.
Figure 3 shows the impact of different dimension lengths in maximum mean square estimation of feature, and Fig. 4 shows that of position. In these conditions, the method based on SSAVLAD has the best results for each same dimension length, and the method based on RNNVLAD is better than the method based on original VLAD.
4 Conclusions
In this paper, the robot localization based on feature clustering algorithm was proposed, and robot localization utilizing wireless sensor network is integrated to improve the robustness. The performance of proposed algorithm was verified by reduced feature errors and robot position errors. The various experiment results show that the localization problem can be solved better, and the efficiency of localization was improved. Moreover, the proposed algorithm can still achieve satisfied estimation accuracy even when either feature clustering algorithm or wireless sensor network localization algorithm fails. In future work, the robustness of feature clustering algorithm will be analyzed further and different filter algorithms would be implemented to improve the localization efficacy in distinguished indoor environments.
Declarations
Acknowledgements
The paper was supported by the National Natural Science Foundation of China (nos. 61763048, 61263022, and 61303234), National Social Science Foundation of China (no. 12XTQ012), Science and Technology Foundation of Yunnan Province (no. 2017FB095, 201801PF00021), and the 18th Yunnan Young and Middleaged Academic and Technical Leaders Reserve Personnel Training Program (no. 2015HB038). It is also supported by the Foundation of University Research and Innovation Platform Team for Intelligent Perception and Computing of Anhui Province, key research project of natural science of Anhui Provincial Education Department (no. KJ2017A354). The authors would like to thank the anonymous reviewers and the editors for their suggestions.
Authors’ contributions
RJ and XD designed the experiments and wrote the paper. RJ and XD performed the experiments. BS analyzed the data. All authors have read and approved the final manuscript.
Author’s information
Xiaoming Dong received his M.S. and Ph.D. degree from Automation of Institution, Chinese Academy of Sciences in 2010. He is currently an associate professor at the school of computer and information, Anqing Normal University and the key laboratory of intelligent perception and computing of Anhui Province. His research interests lie in robot navigation, wireless sensor networks, pattern recognition, image processing, and Internet of Things (Email: dongxiaoming@aqnu.edu.cn).
Benyue Su received the Ph. D. degree from Hefei University of Technology in 2007. He is currently a professor at the school of computer and information, Anqing Normal University and the key laboratory of intelligent perception and computing of Anhui Province. His research interests lie in machine learning, pattern recognition, image processing, and computer graphics (Email: bysu@aqnu.edu.cn).
Rong Jiang* received his Ph.D. degree in system analysis and integration from the school of software at Yunnan University. He is currently a professor and Ph. D. supervisor at the school of information, Yunnan University of Finance and Economics, China. He has published more than 30 papers and ten books and has gotten more than 40 prizes in recent years. His main research interests include cloud computing, big data, software engineering, information management, and pattern recognition (Email: jiangrong@ynu.edu.cn).
Competing interests
The authors declare that they have no competing interests.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
 J Fuentespacheco, J Ruizascencio, JM Rendonmancha, et al., Visual simultaneous localization and mapping: a survey[J]. Artif. Intell. Rev. 43(1), 55–81 (2015).View ArticleGoogle Scholar
 A Kim, RM Eustice, Active visual SLAM for robotic area coverage: theory and experiment[J]. Int. J. Robot. Res. 34(4), 457–475 (2015).Google Scholar
 S Hara, D Anzai, in IEEE Vehicular Technology Conference. Experimental Performance Comparison of RSSI and TDOABased Location Estimation Methods, vol. 2008 (Vtc Spring, 2008) pp. 2651–2655.Google Scholar
 M Ning, S Zhang, S Wang, A noncooperative satellite feature point selection method for visionbased navigation system. Sensors 18, 854 (2018).View ArticleGoogle Scholar
 K Eckenhoff, L Paull, G Huang, in IEEE International Conference on Intelligent Robots and Systems(IROS), Decoupled, consistent node removal and edge sparsification for graphbased SLAM[C]. Vol. 2016. pp. 3275–3282.Google Scholar
 T Bailey, HF Durrantwhyte, Simultaneous localization and mapping (SLAM): part II[J]. IEEE Rob. Autom. Mag. 13(3), 108–117 (2006).View ArticleGoogle Scholar
 A Awad, T Frunzke, F Dressler, in Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools Adaptive Distance Estimation and Localization in WSN Using RSSI Measures. (DSD, Lubeck, 2007) pp. 471–478.Google Scholar
 Y Kong, Y Kwon, G Park, Robust localization over obstructed interferences for inbuilding wireless applications. IEEE Transactions on Consumer Electronics, 55(1), 105111 (2009).Google Scholar
 HS Ahn, W Yu. EnvironmentalAdaptive RSSIBased Indoor Localization[J]. IEEE Transact. Autom. Sci. Eng. 6(4):626–633 (2009)Google Scholar
 AS Paul, EA Wan, RSSIBased Indoor Localization and Tracking Using SigmaPoint Kalman Smoothers[J]. IEEE J. Sel. Top. Sig. Process. 3(5):860–873 (2009).Google Scholar
 H Ren, QH Meng, Power Adaptive Localization Algorithm for Wireless Sensor Networks Using Particle Filter[J]. IEEE Transact. Veh. Technol. 58(5):2498–2508 (2009).Google Scholar
 YY Cheng, YY Lin, A new received signal strength based location estimation scheme for wireless sensor network[J]. IEEE Transact.Consum. Electron. 55(3):1295–1299 (2009).Google Scholar
 K Kleisouris, Y Chen, J Yang, et al., Empirical Evaluation of Wireless Localization when Using Multiple Antennas[J]. IEEE Transact Parallel Distributed Syst. 21(11):1595–1610 (2010).Google Scholar
 C Jinwoo et al., in SICEICASE, 2006. International Joint Conference. A practical solution to SLAM and navigation in home environment (2006).Google Scholar
 V Indelman, No correlations involved: decision making under uncertainty in a conservative sparse information space[J]. IEEE Robot. Autom. Lett. 1(1), 407–414 (2017).View ArticleGoogle Scholar
 Jonathan, H Zhang, in The Fourth Canadian Conference on Computer and Robot Vision. Quantitative evaluation of feature extractors for visual SLAM (2007).Google Scholar
 J Wu, J Li, W Chen, Practical adaptive fuzzy tracking control for a class of perturbed nonlinear systems with backlash nonlinearity. Inf. Sci. 420, 517–531 (2017).View ArticleGoogle Scholar
 L Carlone, R Tron, K Daniilidis, et al., in International Conference on Robotics and Automation. Initialization techniques for 3D SLAM: a survey on rotation estimation and its use in pose graph optimization[C] (2015).Google Scholar
 H Carrillo, P Dames, V Kumar, et al., in International Conference on Robotics and Automation. Autonomous robotic exploration using occupancy grid maps and graph SLAM based on Shannon and Rényi entropy[C] (2015).Google Scholar
 G Silveira, E Malis, P Rives, in IEEE International Conference on Robotics and Automation. An efficient direct method for improving visual SLAM (2007).Google Scholar
 W Jian, ZG Wu*, J Li, G Wang, H Zhao*, W Chen, Practical adaptive fuzzy control of nonlinear purefeedback systems with quantized nonlinearity input. IEEE Trans. Syst. Man Cybern. (2018). https://doi.org/10.1109/TSMC.2018.2800783.
 A Sunghwan et al., in Robotics and Automation, 2007 IEEE International Conference on. SLAM with visual plane: extracting vertical plane by fusing stereo vision and ultrasonic sensor for indoor environment (2007).Google Scholar
 A Sunghwan et al., in IEEE/RSJ International Conference on Intelligent Robots and Systems. Data association using visual object recognition for EKFSLAM in home environment (2006).Google Scholar
 DG Lowe, Distinctive image features from scaleinvariant keypoints[J]. Int. J. Comput. Vis. 60(2), 91–110 (2004).View ArticleGoogle Scholar
 AI Liefu, YU Junqing, WU Zebing, Y He, T Guan, Optimized residual vector quantization for efficient approximate nearest neighbor search. Multimedia Systems 23(2), 169–181 (2017).View ArticleGoogle Scholar
 J Herve, D Matthijs, S Cordelia, et al., in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Aggregating local descriptors into a compact image representation (2010), pp. 3304–3311.Google Scholar
 P James, C Ondrej, I Michael, et al., in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Lost in quantization: improving particular object retrieval in large scale image databases (2008), pp. 1–8.Google Scholar