In this section, we describe our experimental methodology and present the results that evaluate the effectiveness of our approaches.

### 6.1. Methodology

We would like to evaluate the feasibility of our approach in an environment close to real applications (e.g., status monitoring of patrol officers). Using mobile wireless networks, we conducted experiments based on mobile devices generated from a city environment and its vicinity in Germany [6, 7], as shown in Figure 3. The size of the area is 25000 m × 25000 m. We created two simulation scenarios: one at a walking speed of 4 ft/s and the other at a vehicular driving speed of 50 ft/s. For the walking scenario, two data sets, referred to below as the small and the large data set, were obtained from this simulation environment with 1000 and 5000 nodes, respectively, placed randomly inside the city.

For the vehicular driving scenario, one data set was created through the simulation environment with 1000 nodes placed randomly inside the city. Over the duration of our study, some new nodes may move into the city environment and some existing nodes may move out of it. No node has a pre-defined trajectory. However, groups of nodes may travel together to common destinations (e.g., the city center). Figure 4 presents the average number of neighbors for the small data set in the walking scenario over the study time, which is expressed as a fraction from 0 to 1, for 600 nodes and 100 nodes, respectively.

The 100 nodes are randomly chosen from the 600 nodes. We observed that the average number of neighbors increases from a few nodes to around 14 nodes as the study time advances, indicating that groups of nodes gradually form and travel together to similar destinations. The vehicular driving scenario shows a similar trend to the walking scenario. This is in line with our co-movement assumption. Thus, these data sets are suitable for our neighborhood prediction study.

### 6.2. Metrics

We use the following performance metrics to evaluate the effectiveness of PARIS in predicting near likely nodes.

Prediction Accuracy

As described in Section 4.3, the Prediction Accuracy metric measures the statistical characteristics of neighborhood prediction using the KL-divergence and, in addition, the Cumulative Distribution Function (CDF) of the difference between the future probability and the past probability of the near likely node. We split our study time into a past time window used for prediction and a future time window used to evaluate the prediction. In the following discussion, we measure window sizes as a fraction of the study time.

We investigate the impact of different window sizes of the past as well as the future on the prediction accuracy using both point-based and trace-based schemes.
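As a point of reference for the metric above, the following minimal sketch computes the KL-divergence between two discrete belief distributions. The direction KL(future || past), the function names, and the example probabilities are assumptions made for illustration; they are not taken from the experiments.

```python
import math

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]

def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumed smoothing choice) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total

# Hypothetical belief probabilities over three candidate neighbors.
past = normalize([5, 3, 2])     # estimated in the past window
future = normalize([4, 4, 2])   # observed in the future window
print(kl_divergence(future, past))
```

A value near zero means the future distribution is close to the past one, which is the sense in which small KL-divergence values are read as accurate prediction.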

Time Performance

By measuring the time that each scheme needs to produce its prediction results, we evaluate the feasibility of applying these schemes to nodes that usually have limited computational power and memory. The Time Performance metric helps to benchmark our approaches in the simulation environment and further indicates whether they can be implemented on real wireless devices.

### 6.3. Results

KL-Divergence

We first study the neighborhood prediction accuracy of our proposed mechanism for both the walking and the vehicular driving scenarios. Figure 5 presents the KL-divergence values for different past window sizes with the future window size fixed, whereas Figure 6 presents the KL-divergence values for different future window sizes with the past window size fixed. Both figures cover the point-based and the trace-based schemes for the case in which the average number of neighbors is 5.

For the walking scenario, we observed small KL-divergence values that are always less than 0.5. This is encouraging, as smaller KL-divergence values indicate that the distribution of the belief probability in the future is close to that in the past. Further, as shown in Figures 5(a) and 5(b), when the future time window is fixed at 0.2 and 0.4 of the total study time, respectively, the KL-divergence value of the point-based scheme shows an overall decreasing trend as the past time window grows. This means that, for the point-based scheme, a larger past window tends to yield a more accurate prediction of the near likely node. For the trace-based scheme, however, we observed that the KL-divergence value fluctuates. This is interesting because it shows that, for the trace-based scheme, simply increasing the past window size does not increase the accuracy, which indicates that adaptive adjustment of the window size needs both expansion and shrinkage.

On the other hand, when the past time window is fixed at 0.2 and 0.4 of the total study time, respectively, as presented in Figures 6(a) and 6(b), we observed an increasing trend of the KL-divergence value for both schemes as the future window grows (with an average of 5 neighbors). This indicates that the near likely node may gradually move away from the collector node when the future window is long enough.

We also investigate the neighborhood prediction accuracy of our proposed mechanism for the vehicular driving scenario, presented in Figures 5(c) and 6(c). First, as in the walking scenario, the KL-divergence values are less than 0.5, which indicates that our scheme predicts as accurately in the vehicular driving scenario as in the walking scenario. We also observed trends similar to those of the walking scenario.

In particular, as shown in Figure 5(c), when the future window size is fixed at 0.4 of the total study time, the KL-divergence value shows an overall decreasing trend as the past window size increases, which is similar to the trend in Figure 5(b). When the past window size is fixed at 0.4, as shown in Figure 6(c), we observed an increasing trend and KL-divergence values similar to those in Figure 6(b). Further, the increase in the KL-divergence values is always small (around 0.05). These results indicate that our proposed schemes are appropriate for different mobility scenarios.

In general, we found that the KL-divergence values of the trace-based scheme are smaller than those of the point-based scheme for both the walking and the vehicular driving scenarios. Therefore, the trace-based scheme has better prediction accuracy than the point-based scheme. Moreover, for the walking scenario, we observed similar results when the average number of neighbors increases to 15 and 45; these results are omitted due to space limitations.

Further, we compared the values of KL-divergence between the small and the large data sets in Figure 7. In order to compare these two different data sets directly, we used the same transmission range of the nodes in each data set, which is under 300 m and 600 m, respectively.

We observed similar behavior for the large and the small data sets: the KL-divergence value shows a clear decreasing trend when the past window size increases and the future window size decreases simultaneously. Furthermore, the KL-divergence values are smaller for the large data set. This is because the large data set contains more nodes, which form larger neighborhoods and thereby provide better prediction results. In the sequel, due to space limitations, we present only the results obtained from the small data set.

Cumulative Distribution Function (CDF)

We now turn to the CDF of the difference between the future probability and the past probability of the near likely node. Figure 8 presents the CDF of the probability difference for both the point-based and the trace-based schemes when the future window size is fixed at 0.2 of the total study time and the past window size changes from 0.2 to 0.4 of the total study time. We found that, for both the positive difference and the negative difference, the CDF curve of the trace-based scheme lies to the left of that of the point-based scheme.

This shows that, in terms of neighborhood prediction accuracy, the trace-based scheme outperforms the point-based scheme, which is in line with the results obtained from the KL-divergence.

Moreover, we investigated the prediction accuracy for different average numbers of neighbors in the neighborhood, namely 5, 15, and 45. Figure 9 presents the CDF of the probability difference for both of our schemes. We observed that, for each scheme, the curves for the different average numbers of neighbors are close to each other, suggesting that the prediction accuracy is not sensitive to the average number of neighbors. These results are very encouraging, as they indicate that, for a given prediction scheme, the prediction accuracy depends mainly on the window size.
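To make the CDF comparison concrete, the sketch below builds an empirical CDF from a list of probability differences. The function name and the sample values are hypothetical and purely illustrative; the values are not measurements from the experiments, and magnitudes of the differences are used for simplicity.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted samples are <= x.
        return bisect_right(ordered, x) / n

    return cdf

# Illustrative magnitudes of (future - past) probability differences.
point_diffs = [0.05, 0.10, 0.20, 0.30]   # made-up, point-based
trace_diffs = [0.02, 0.05, 0.10, 0.15]   # made-up, trace-based

point_cdf = empirical_cdf(point_diffs)
trace_cdf = empirical_cdf(trace_diffs)
for x in (0.05, 0.10, 0.20):
    print(x, point_cdf(x), trace_cdf(x))
```

With these made-up inputs the trace-based CDF is at least as high as the point-based one at every x, which is what a curve lying further to the left looks like: a larger share of the differences is small.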

Reinforcement Learning

We next examine the effects of reinforcement learning on prediction accuracy using the WINTER algorithm. Figure 10 presents the expansion and shrinkage of the prediction window (i.e., the past window) according to the obtained KL-divergence value. In Figure 10(a), we observed that, with the future testing window fixed at 0.12, the prediction window size is adjusted adaptively based on the KL-divergence values: when the KL-divergence value decreases from 0.043 to 0.02, the prediction window expands from 0.12 to 0.2, and when the KL-divergence value increases from 0.02 to 0.039, the prediction window shrinks from 0.2 to 0.08. We observed similar window adjustment behavior in Figure 10(b). Further, we found that with adaptive adjustment the KL-divergence values are always less than 0.05 (and even less than 0.016 in Figure 10(b)). These results are encouraging, as they indicate that adaptive adjustment through reinforcement learning is effective in improving prediction accuracy at runtime.

We further investigate the behavior of adaptive adjustment through reinforcement learning by doubling the study time. Figure 11 presents how the KL-divergence values change during the expansion and shrinkage of the prediction window. First, we observed that the KL-divergence values are always less than 0.05, which supports the effectiveness of our adaptive adjustment approach. Second, although the study time in both Figures 11(a) and 11(b) is scaled from 0 to 1, the window size adjustment behaves similarly to the shorter study time (Figure 10). For example, in Figure 11(a), with the future testing window fixed at 0.12, the prediction window shrinks from 0.12 to 0.08 when the KL-divergence value increases from 0.02 to 0.038, and it expands from 0.08 to 0.16 when the KL-divergence value decreases from 0.038 to 0.022. This demonstrates that our adaptive adjustment approach through reinforcement learning works for larger time windows as well.
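The expansion and shrinkage pattern reported for Figures 10 and 11 can be imitated with a very simple rule. The sketch below is not the WINTER algorithm and involves no learning; it is a hand-written stand-in that expands the window when the KL-divergence drops and shrinks it when the KL-divergence rises. The function name, the step size, and the bounds are all hypothetical.

```python
def adjust_window(window, prev_kl, curr_kl, step=0.04, lo=0.04, hi=0.6):
    """Return the next past-window size (as a fraction of study time).

    Assumed rule: expand by `step` if the KL-divergence decreased,
    shrink by `step` if it increased, and keep the size otherwise.
    The result is clamped to [lo, hi] and rounded to limit float drift.
    """
    if curr_kl < prev_kl:
        window += step
    elif curr_kl > prev_kl:
        window -= step
    return round(min(hi, max(lo, window)), 6)

# Replay a short, made-up sequence of KL-divergence observations.
window = 0.12
kl_values = [0.043, 0.030, 0.020, 0.039]
for prev_kl, curr_kl in zip(kl_values, kl_values[1:]):
    window = adjust_window(window, prev_kl, curr_kl)
    print(curr_kl, window)
```

A fixed step is the simplest possible choice; a learned policy such as the one described in the text could choose different step sizes depending on how much the KL-divergence changed.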

Time Performance

Figure 12 compares the time measurements under various setups, including different average numbers of neighbors and various window sizes, for both the small and the large data sets. We found that the time to perform neighborhood prediction is on the order of milliseconds for both schemes.

We observed that the point-based scheme consistently runs about twice as fast as the trace-based scheme across the different average numbers of neighbors and window sizes. This is because the trace-based scheme needs to calculate correlation coefficients for both the x and y dimensions, whereas the point-based scheme calculates only the correlation coefficient for the gradient. Further, the time measurements for the large data set are also on the order of milliseconds, as shown in Figure 12(b). This indicates that our schemes can efficiently predict the near likely nodes even when a node has a large number of neighbors.
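The cost difference can be illustrated with a small sketch. It assumes that the two dimensions are the x and y coordinates of a trajectory and that the coefficient is a Pearson correlation; the text does not state which coefficient is used, and the function names below are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumed convention for a constant sequence
    return cov / math.sqrt(var_a * var_b)

def trace_correlations(trace_a, trace_b):
    """Correlate two traces of (x, y) points dimension by dimension.

    This performs two correlation computations per pair of nodes,
    whereas a single-series comparison would need only one.
    """
    xs_a, ys_a = zip(*trace_a)
    xs_b, ys_b = zip(*trace_b)
    return pearson(xs_a, xs_b), pearson(ys_a, ys_b)

# Two made-up traces that move in the same direction.
node_a = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
node_b = [(5.0, 1.0), (6.0, 3.0), (7.0, 5.0)]
print(trace_correlations(node_a, node_b))
```

Under these assumptions, the trace-based comparison does roughly twice the arithmetic of a single-coefficient comparison, which is consistent with the factor of about two reported above.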

Communication Overhead

Next, we measure the communication overhead incurred by collecting the trajectory information from a collector's neighbors. Let us consider transmission packets of 512 bytes [25]. We assume that each trajectory record consists of a pair of (x, y) coordinates and a timestamp, each of float type. In other words, each trajectory record occupies 12 bytes. Therefore, one transmission packet can contain at most 42 trajectory records. If one node records its trajectory every seconds, then the trajectory information of seconds can be stored in packets.

Moreover, we assume that for every seconds, to apply the neighborhood prediction mechanism, the collector node needs to collect the trajectory of seconds from its neighbors. Therefore, there are packets transmitted to the collector node from its neighbors. Assume the collector node needs to transfer its data of size to the storage node within these seconds. Then the size of the transmission packets needed for collecting trajectory information is of the data sent to the identified near likely node by the collector node. We assume that is less than 1.
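The packet arithmetic above can be checked with a short sketch. The 512-byte packet and the 12-byte record come from the text; the parameter names, the rounding up to whole packets, and the example inputs are assumptions for illustration and are not claimed to reproduce the values plotted in the figures.

```python
import math

PACKET_BYTES = 512          # packet size used in the text
RECORD_BYTES = 3 * 4        # x, y, and timestamp as 4-byte floats
RECORDS_PER_PACKET = PACKET_BYTES // RECORD_BYTES   # 42

def packets_for_history(history_s, interval_s):
    """Packets needed for one node's trajectory history.

    history_s: length of the collected trajectory in seconds (assumed name)
    interval_s: seconds between two trajectory records (assumed name)
    """
    records = math.ceil(history_s / interval_s)
    return math.ceil(records / RECORDS_PER_PACKET)

def overhead_fraction(neighbors, history_s, interval_s, data_bytes):
    """Trajectory bytes received by the collector, relative to its data size."""
    packets = neighbors * packets_for_history(history_s, interval_s)
    return packets * PACKET_BYTES / data_bytes

# Example: 5 neighbors, 60 s of history, one record every 2 s, 5 MiB of data.
frac = overhead_fraction(5, 60, 2, 5 * 1024 * 1024)
print(RECORDS_PER_PACKET, f"{frac:.4%}")
```

Because the result depends only on the record size, the sampling interval, and the number of neighbors, this calculation contains no term for node speed.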

Based on our analysis, we can see that both the packet size and the percentage of transmission packets are not affected by the moving speed of the mobile nodes. Thus, the communication overhead incurred from the trajectory information exchange in our approach is not sensitive to the mobility model.

We further study how the overhead for collecting trajectory information varies with the data size under different network sizes. The results, shown in Figure 13, indicate that the overhead is negligible compared with the size of the transferred data. In particular, Figure 13(a) presents the results with the number of neighbors = 5, the trajectories of of seconds to be collected, the data size varying from 1 MB to 5 MB, and each node recording its trajectories every seconds. We observed small overhead fractions that are always less than 0.7%. Specifically, the overhead is as small as 0.05% for a data size of 5 MB and a trajectory of 60 seconds. Furthermore, we noticed that the larger the data size, the smaller the fraction. The same trend is also observed in Figure 13(b), which presents a larger network with 15 neighbors. These results show that the communication overhead incurred by collecting the trajectory information is negligible compared with the total size of the transferred data.

Furthermore, there is a tradeoff between the communication overhead and the frequency of data updates. The more frequently the data is updated, the higher the prediction accuracy that may be achieved, but the higher the communication overhead. We note that in our scheme the data update is performed on demand, and thus the update frequency can be configured.

Finally, we study the communication overhead incurred in terms of hop counts during data retrieval. Figure 14 presents the number of hops traveled with and without using the scheme we proposed over the study time. We found that under our proposed scheme the number of hops traveled for data retrieval is less than half of that without using it, indicating that using our scheme can significantly reduce the communication overhead and thus reduce the overall energy consumption of wireless devices. We will quantify the savings of energy consumption in our future work.

In summary, the results of our experimental evaluation in terms of prediction accuracy, time performance, and communication overhead are highly encouraging, as they indicate that our prediction schemes for near likely nodes can perform future neighborhood prediction both effectively and efficiently. Our results also point out a tradeoff between prediction accuracy and time efficiency when choosing a prediction scheme: the scheme that provides better prediction accuracy runs slower.