In this section, we present the details of the proposed hybrid localization scheme for outdoor environments. The major steps of the scheme are shown in Fig. 2. The scheme consists of three phases. The first is the traditional training phase, in which the system collects fingerprint data, including GPS status and the corresponding WiFi signal state, through crowdsourcing and builds a fingerprint database on the server. In the second phase, the server divides the map into map tiles using our map tile cache mechanism. The final phase is the online localization phase, in which the user's device calculates the dissimilarity between the collected patterns and the sample patterns in the previously built database. To reduce computing time, the sample patterns are restricted to certain map tiles, which are stored in the local cache and selected using sensor information. The location of the fingerprint with the minimum dissimilarity is then selected as the estimate of the device's location. In the rest of this section, each phase is presented in detail.

### Map tile mechanism

As mentioned earlier, in a typical positioning scheme frequent network requests incur a high energy cost and can suffer long response times when the network is congested. Although several offline applications have been proposed to solve this problem (e.g., big planet tracks [26]), their effect is not ideal. In this paper, we therefore propose a cache mechanism based on map tiles: the map is divided into many map tiles according to geographic features, and a local database cache is maintained on the client.

As Fig. 3 shows, we divide an area into *n* map tiles, each with sides of length *w* meters. Each uploaded fingerprint is allocated to a map tile according to its location (longitude and latitude). Only *m* map tiles are stored in the client-side local database; in Fig. 3, *m* is equal to 4. Using the device to sense the surrounding WiFi signal information presupposes that the user is roughly located in one of the map tiles in the local database, which we ensure through the sensor-assisted method described in the next section. When the rough location is absent from the local fingerprint database, the local cache is out of date, and we need to update it from the server before proceeding with the prefetching operation. When the total number of location requests is *r*, the typical fingerprint mechanism requires *r* network requests, whereas our map tile cache mechanism requires only *r*/*m* network requests. The proposed approach therefore saves energy and improves efficiency.
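To make the tile allocation and cache update concrete, the following is a minimal Python sketch and not the implementation used in this work. It assumes a local equirectangular approximation around a chosen origin to convert latitude and longitude into meters, and the names `tile_index`, `TileCache`, and the server callback `fetch_tiles` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumed value)


def tile_index(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg, w):
    """Return the (row, col) of the tile of side w meters containing a point.

    Assumption of this sketch: a local equirectangular approximation around
    the origin, which is only reasonable for city-scale areas.
    """
    meters_per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    meters_per_deg_lon = meters_per_deg_lat * math.cos(math.radians(origin_lat_deg))
    north = (lat_deg - origin_lat_deg) * meters_per_deg_lat
    east = (lon_deg - origin_lon_deg) * meters_per_deg_lon
    return (math.floor(north / w), math.floor(east / w))


class TileCache:
    """Client-side cache holding the fingerprints of the m tiles stored locally."""

    def __init__(self, fetch_tiles):
        # fetch_tiles(tile) is a hypothetical server call that returns a dict
        # {tile: fingerprints} for a block of m tiles containing `tile`.
        self.fetch_tiles = fetch_tiles
        self.tiles = {}
        self.network_requests = 0

    def lookup(self, tile):
        """Return the fingerprints of `tile`, refreshing the cache on a miss."""
        if tile not in self.tiles:
            self.tiles = self.fetch_tiles(tile)  # cache is out of date
            self.network_requests += 1
        return self.tiles[tile]
```

With a server that returns blocks of *m* = 4 tiles, four consecutive lookups inside the same block trigger a single network request in this sketch; how close real traffic comes to *r*/*m* depends on how the user moves across tile boundaries.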

### Sensor-assisted matching

Comparing the measured RSS fingerprint with the whole database is inefficient and unnecessary. In this section, we introduce a sensor-assisted matching method that restricts the matching operation to a small area using sensor information, including direction and travel distance.

Figure 4 gives an illustration. The current location can be calculated from the initial location *P*, which can be obtained by GPS or other localization methods. The distance and direction from the previous location to the current location are denoted as *l* and *θ*, respectively. Both can be estimated by dead reckoning with built-in inertial sensors such as the accelerometer, gyroscope, and compass. The accuracy of step counting in dead reckoning has been reported to reach 98% [27]. The moving distance *l* is obtained by multiplying the user's step length by the number of steps. The authors of [27] also describe in detail how to obtain the direction *θ* using the gyroscope and compass. Because of noisy sensors and heterogeneous devices, *l* and *θ* cannot be calculated precisely, so two variables *Δl* and *Δθ* are introduced as fault-tolerance ranges. With all of these variables determined, we obtain an annular region *ABCD* that contains the current position. Estimating the current location then requires finding which fingerprint points in the local cache lie in this restricted area. Here, we offer a solution based on the Haversine formula [28].

Given the latitude and longitude (*φ*, *λ*) of the start point, and the distance *d* and bearing *θ* from the start point, the destination point (*φ*^{′}, *λ*^{′}) can be calculated using Eq. (3), where *δ* = *d*/*R* is the angular distance and *R* denotes the Earth's radius.

$$ \left\{\begin{array}{l} \varphi^{\prime} = \text{arcsin}(\sin\varphi\cdot\cos\delta+\cos\varphi\cdot\sin\delta\cdot\cos\theta) \\ \lambda^{\prime} = \text{arctan}\left(\frac{\sin\theta\cdot\sin\delta\cdot\cos\varphi}{\cos\delta-\sin\varphi\cdot\sin\varphi^{\prime}}\right) + \lambda \end{array}\right. $$

(3)
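Eq. (3) can be evaluated directly. The Python sketch below is illustrative only: the value of the Earth's radius and the function name `destination_point` are assumptions, and `atan2` is used in place of the plain arctangent so that the correct quadrant is kept when the denominator is negative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius R in meters (assumed value)


def destination_point(lat_deg, lon_deg, distance_m, bearing_deg):
    """Destination (lat, lon) in degrees after moving distance_m along bearing_deg."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance d / R

    sin_phi2 = (math.sin(phi) * math.cos(delta)
                + math.cos(phi) * math.sin(delta) * math.cos(theta))
    sin_phi2 = max(-1.0, min(1.0, sin_phi2))  # guard against rounding error
    phi2 = math.asin(sin_phi2)

    lam2 = lam + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi),
                            math.cos(delta) - math.sin(phi) * sin_phi2)
    return math.degrees(phi2), math.degrees(lam2)
```

Calling this function with the distances *l* ± *Δl* and the bearings *θ* ± *Δθ* would give four boundary points of the restricted region, under the assumptions stated above.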

Using the above formula, a subset \(\mathcal {F}=\{f_{1}, f_{2}, \cdots, f_{n}\}\) of the local fingerprint database cache can be obtained. Within this subset, the location *L*_{i}(*f*_{i}) of every fingerprint *f*_{i} lies in the constrained area *ABCD*.
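One possible way to select this subset is sketched below in Python. It is an equivalent membership test and not necessarily the procedure used in this work: instead of constructing the corner points of *ABCD*, it computes the Haversine distance and the initial bearing from *P* to each cached fingerprint and keeps those within *l* ± *Δl* and *θ* ± *Δθ*. The fingerprint layout (a dict with a `"loc"` entry) and all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumed value)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees within [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def in_sector(start, point, l, theta_deg, dl, dtheta_deg):
    """True if `point` lies within l +/- dl and theta +/- dtheta of `start`."""
    dist = haversine_m(*start, *point)
    bearing = initial_bearing_deg(*start, *point)
    # Smallest absolute angular difference, handling the wrap at 360 degrees.
    diff = abs((bearing - theta_deg + 180.0) % 360.0 - 180.0)
    return abs(dist - l) <= dl and diff <= dtheta_deg


def restrict(fingerprints, start, l, theta_deg, dl, dtheta_deg):
    """Keep only the cached fingerprints whose stored location is in the sector."""
    return [f for f in fingerprints
            if in_sector(start, f["loc"], l, theta_deg, dl, dtheta_deg)]
```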

### High-accuracy localization

The observation in Section 3.2 shows that the probabilistic fingerprinting algorithm takes a long time to collect samples for estimating the current position. Therefore, we introduce a modified deterministic framework here. The basic idea is to estimate the result using the weighted K-nearest neighbor method. The weight of each fingerprint reflects two aspects: the GPS signal state and the dissimilarity in RSS. We first introduce how to estimate the location \(\hat {L}\) and then describe the calculation of the weights.

The algorithm compares each fingerprint sample \(f \in \mathcal {F}\) with the query fingerprint *f*^{′}. The final estimate of the current location \(\hat {L}\) is then computed with the weighted K-nearest neighbor method [29] using the *K* most similar points.

$$ \hat{L} = \sum\limits_{i=1}^{K} \frac{w_{i}}{{\sum\nolimits}_{j=1}^{K} w_{j}} L_{i} $$

(4)
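As an illustration of Eq. (4), the Python sketch below averages the stored locations of the *K* highest-weight fingerprints. Averaging latitude and longitude linearly is an assumption that is reasonable only over small areas, and the function name `weighted_knn` and the input layout are hypothetical.

```python
def weighted_knn(candidates, k):
    """Estimate a location from (weight, (lat, lon)) pairs as in Eq. (4).

    Only the k candidates with the largest weights (the most similar
    fingerprints) contribute to the weighted mean.
    """
    top = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    total = sum(w for w, _ in top)
    if total <= 0:
        raise ValueError("at least one candidate must have a positive weight")
    lat = sum(w * loc[0] for w, loc in top) / total
    lon = sum(w * loc[1] for w, loc in top) / total
    return lat, lon
```

For example, with weights 3 and 1 at locations (0, 0) and (4, 8) and *K* = 2, the estimate is (1, 2), which lies three times closer to the higher-weight fingerprint.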

In the above equation, all weights *w*_{i} are nonnegative and reflect the dissimilarity in RSS and the stored GPS state, such as the number of satellites *n* and the signal-to-noise ratio *snr*.

Different WiFi APs have different levels of discrimination. Since the discrimination factor depends on the distance between the AP and the user device, we use the log-distance path loss model \(P_{d} = P_{0} - 10 \gamma \log \frac{d}{d_{0}}\) to specify the discrimination factor *ρ*_{i} of the *i*-th AP as in (5).

$$ \rho_{i}=\frac{1}{d_{i}}=10^{\frac{r_{i}-P_{0}}{10\gamma}} $$

(5)
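Eq. (5) can be computed as in the short Python sketch below. The reference power of −40 dBm and the exponent 2 used in the usage note are illustrative values only, not parameters reported in this work, and the function name is hypothetical.

```python
def discrimination_factor(rss_dbm, p0_dbm, gamma):
    """Discrimination factor rho_i = 10 ** ((r_i - P_0) / (10 * gamma)), Eq. (5).

    rss_dbm: RSS r_i sensed from the i-th AP, in dBm.
    p0_dbm:  RSS P_0 at the reference distance (assumed to be the unit distance).
    gamma:   path loss exponent.
    """
    return 10.0 ** ((rss_dbm - p0_dbm) / (10.0 * gamma))
```

With these illustrative values, an AP received at −60 dBm gets a factor of about 0.1, and a weaker AP gets a smaller factor, so nearby APs contribute more to the comparison.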

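In (5), *P*_{0} is the RSS value received at the reference distance *d*_{0}, *γ* is the path loss exponent, and *P*_{d} is the RSS sensed at distance *d*. Eq. (5) follows from the model by substituting the RSS *r*_{i} of the *i*-th AP for *P*_{d} and taking *d*_{0} as the unit distance. Then, a normalized form \(\rho _{i}^{N}\), obtained by normalizing *ρ*_{i} with \({\sum \nolimits }_{i=1}^{p}\rho _{i}\), can be used to calculate the modified dissimilarity in RSS between the query fingerprint and a stored one, as in (6).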

$$ h(f,f^{\prime})=\sqrt{\sum\limits_{i=1}^{p}\left(\rho_{i}^{N}\cdot\sigma_{i}\right)^{2}} $$

(6)
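A minimal Python sketch of Eq. (6) follows. It rests on two assumptions that go beyond the text: *σ*_{i} is taken to be the per-AP RSS difference between the stored and the query fingerprint, and the normalization divides each *ρ*_{i} by the sum of all *ρ*_{i}. The function name is hypothetical.

```python
import math


def rss_dissimilarity(rho, sigma):
    """Modified RSS dissimilarity h(f, f') of Eq. (6).

    rho:   discrimination factors rho_i of the p APs (unnormalized).
    sigma: per-AP RSS differences sigma_i between the two fingerprints
           (an assumed interpretation of sigma_i).
    """
    if len(rho) != len(sigma) or not rho:
        raise ValueError("rho and sigma must be non-empty and of equal length")
    total = sum(rho)
    # Weight each difference by the normalized factor rho_i / sum(rho).
    return math.sqrt(sum((r / total * s) ** 2 for r, s in zip(rho, sigma)))
```

For two equally discriminative APs with differences of 6 dB and 8 dB, each normalized factor is 0.5 and the dissimilarity is 5.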

Moreover, the GPS satellite signal health is also an important part of the ultimate synthesized discrimination. In our research, GPS satellite statistics are incorporated into a synthesized influence factor. The GPS health is represented by the number of effective satellites *n* and the corresponding signal-to-noise ratios *snr*. The synthesized influence factor is expressed as follows:

$$ \psi=\frac{{\sum\nolimits}_{j=1}^{n}snr_{j}}{10n} $$

(7)

Note that as *ψ* increases, the reliability of the stored location of the sample fingerprint increases.

Combining all these factors, the ultimate synthesized dissimilarity metric is formulated in (8), where *p*=|*A*∪*A*^{′}| and *q*=|*A*∩*A*^{′}|, with *A* and *A*^{′} the sets of APs in the stored fingerprint *f* and the query fingerprint *f*^{′}. Thus *q* denotes the number of common APs in *f* and *f*^{′}, and the dissimilarity of two fingerprints with fewer common APs is amplified by *p*/*q*.

$$ \eta=\frac{h(f,f^{\prime})}{\psi}\cdot\frac{p}{q} $$

(8)

Finally, the absolute weight is obtained by (9), which can be used to calculate the current position in (4).

$$ w_{i}=\frac{1}{\eta} $$

(9)
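Eqs. (7) to (9) can be chained as in the Python sketch below, which is illustrative only. The handling of the edge cases (no common APs, and a dissimilarity of exactly zero) is an assumption of the sketch, since the text does not specify it, and all function names are hypothetical.

```python
import math


def gps_factor(snr_values):
    """Synthesized influence factor psi of Eq. (7) from per-satellite SNRs."""
    n = len(snr_values)
    if n == 0:
        raise ValueError("at least one effective satellite is required")
    return sum(snr_values) / (10.0 * n)


def synthesized_dissimilarity(h, psi, aps_stored, aps_query):
    """Dissimilarity eta of Eq. (8); the AP sets give p (union) and q (common)."""
    p = len(aps_stored | aps_query)
    q = len(aps_stored & aps_query)
    if q == 0:
        return math.inf  # assumption: no common AP means no usable match
    return h / psi * p / q


def weight(eta, eps=1e-9):
    """Weight w_i = 1 / eta of Eq. (9); eps guards against eta = 0 (assumption)."""
    return 1.0 / max(eta, eps)
```

For instance, SNRs of 30, 40, and 50 give *ψ* = 4; with *h* = 5 and two fingerprints that share 2 of 4 distinct APs, *η* = 2.5 and the weight is 0.4. A fingerprint with no common AP receives a weight of zero in this sketch.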