Mobile user-preference-based data dissemination on mobile P2P networks

A considerable number of studies have recently been performed on mobile peer-to-peer networks (MOPNETs), as the number of services based on mobile devices has increased. However, existing approaches still have numerous shortcomings, such as bandwidth overhead and redundant transmission caused by multi-broadcast between peers. Owing to the characteristics of MOPNETs, it is particularly important to determine the broadcast size and to disseminate data within the limited environment of the network, because this mechanism directly affects how well resource information in a mobile device is discovered and how data are transmitted. In this context, it is vital to disseminate data efficiently, so that the resource information of each mobile device can be arranged in hierarchical sequences for better search performance, and to determine the broadcast size in consideration of the consumption patterns of mobile users. In this article, we propose an adaptable algorithm that determines weighted values and disseminates data using the high-order Markov chain (HoMC). We apply weighted values in consideration of the MOPNET environment. The proposed HoMC-based Mobile User-preference-based Data Dissemination algorithm was simulated with the Qualnet simulator. Results show that the proposed algorithm performs 17.3% better, on average, in terms of data dissemination, than the existing dissemination methods.


Introduction
A mobile peer-to-peer network (MOPNET) [1] is an emerging infrastructure designed to facilitate information sharing among mobile devices. It has been adopted for diverse purposes, such as social networks, transportation, mobile electronic commerce, emergency response [2], and homeland security [3]. Data dissemination is an essential component for publishing or subscribing to resource information among dynamic mobile users on a MOPNET, since resource discovery is a main issue of MOPNETs [4,5]. However, many communication problems have been reported in the MOPNET environment, including non-adaptiveness, implosiveness, query overlaps, resource blindness, and communication overhead. In addition, transmission of unnecessary data in a limited network environment often wastes a mobile system's power capacity and bandwidth, and causes delay and loss of packets. Hence, it is necessary to choose an efficient data dissemination algorithm for resource discovery, and to disseminate data in a stable and timely fashion. In particular, rank-based broadcast (RBB) [6,7] disseminates data only after the ranks (i.e., priority of broadcast) and the broadcast sizes have been determined for the queries and resource information (i.e., reports) occurring on a MOPNET. However, knowing the number of queries and reports being called is insufficient for exact broadcasting. Analyzing a mobile user's queries and reports makes it possible to determine the exact broadcast size and thus to overcome the limitations of the traditional RBB. The new approach also secures high accuracy and match throughput in dissemination. This study proposes an adaptive method that takes account of user preferences, which are tallied from the frequency of user query requests during data dissemination. In contrast with traditional methods such as the RBB, it reduces unnecessary transmission by sharing the resources that are most frequently requested by users.
In this respect, our proposed mechanism facilitates efficient sharing and transmission of low bandwidth and limited content seeds in a limited network environment such as the MOPNET. This article proposes a mobile user-preference-based data dissemination (MUDD) method based on the high-order Markov chain (HoMC) [8,9]. Hardware characteristics limit the number of content items available on a MOPNET for each mobile user's communication purposes. Thus, applying the HoMC, which can learn from a limited amount of data, improves the accuracy and the volume of data processing. Moreover, the proposed algorithm is further optimized with weighted values to maximize query hit rates.
The remainder of this article is organized as follows: Section 2 describes related work on data dissemination methods, such as the traditional flooding-based method, the gossiping method, and the RBB; Section 3 explains the proposed technique, MUDD using the HoMC; Section 4 describes the simulation results obtained with Qualnet, a network simulator; and Section 5 concludes this article and suggests future research.

Related work
Various approaches have been proposed to perform data dissemination efficiently in limited wireless network environments.
Chatzigiannakis et al. [10] proposed a data dissemination protocol for wireless sensor networks that gathers additional knowledge about the network to improve data forwarding toward the sink. This extra information is still local, limited, and obtained in a distributed manner. Because it is acquired by only a small fraction of the sensors, the extra energy cost only marginally affects the overall protocol efficiency. The protocol has low latency and manages to propagate data successfully, even in the case of low densities. However, the mechanism does not support adaptive data dissemination, because it does not reflect the real network condition. In contrast, our proposed model features adaptive data dissemination based on self-adjusting its parameters according to the actual network conditions.
Prasad Sistla et al. [11] examined the dissemination of availability reports on resources in MOPNETs, where moving objects communicate with each other via short-range wireless transmission. Each disseminated report represents an observed spatial-temporal event, and the relevance of the report to a moving object decays as the age of the reported resource and the distance from its location increase. In addition, they proposed an opportunistic approach, in which an object propagates the reports it carries (namely, the information that it has about these resources) to the objects it encounters and obtains new reports in exchange. The least relevant reports are discarded after each exchange to limit the communication data volume of future exchanges. Their theoretical and experimental analysis indicates that the opportunistic dissemination algorithm automatically limits the global distribution of a report to a bounded spatial area and to the duration for which it is of interest.
To some extent, our solution is similar to the Opportunistic Resource Exchange model, but because the application requirements differ, we use a different propagation manner: a probability model based on user-preference training. That is, in a limited network environment it is important to broadcast packets to other nodes more precisely.
Li et al. [12] presented EMS, an efficient multi-source data dissemination scheme for P2P networks. Nodes are organized in a two-layer structure by leveraging node heterogeneity: the upper layer is based on the DHT protocol and composed of powerful and stable nodes, while the nodes at the lower layer attach to physically close upper-layer nodes. Data objects are replicated and forwarded along implicit trees, based on the DHT route table. The average path length for delivering objects to all nodes converges to O(log n) automatically, where n is the number of nodes in the application. However, the performance of this approach still has to be evaluated with different capacity profiles. In contrast, this article presents a comparative analysis of the proposed model and existing algorithms.
In general, several methods have been devised to disseminate data. The flooding-based method [13,14], the gossiping-based method [15,16], and the RBB [6,7] are typical examples.
A flooding-based method emits queries and reports on the content list to randomly chosen targets over the network. The method serves as a foundation for the other, more advanced methods. It fails, however, to overcome several inherent problems, such as data implosion, query overlaps, and resource blindness. The gossiping-based method was proposed to reduce the frequency of implosion and the system overhead. Under this approach, reports are transmitted to randomly selected targets that are concerned with a particular content. However, some mobile users on a large network may not receive the message at all, since a neighbor is selected randomly. Thus, the gossiping method does not provide reliable data dissemination.
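The trade-off between the two baseline methods can be illustrated with a minimal sketch. It assumes a static neighbor graph given as a dictionary of lists and counts one transmission per forwarded copy; the function names and the single-walk gossip model are illustrative assumptions, not the protocols evaluated in [13-16].

```python
import random
from collections import deque


def flood(adjacency, source):
    """Flooding: every receiver forwards the message to all of its neighbors."""
    received = {source}
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            transmissions += 1  # counted even when the neighbor already has it
            if neighbor not in received:
                received.add(neighbor)
                queue.append(neighbor)
    return received, transmissions


def gossip(adjacency, source, max_hops, rng):
    """Gossiping: each receiver forwards to one randomly chosen neighbor."""
    received = {source}
    transmissions = 0
    node = source
    for _ in range(max_hops):
        if not adjacency[node]:
            break
        node = rng.choice(adjacency[node])
        transmissions += 1
        received.add(node)
    return received, transmissions
```

On a fully connected graph, flooding reaches every node but sends many duplicate copies, whereas gossiping sends one copy per hop and may leave some nodes without the message.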
Another method, the RBB algorithm, was conceived to solve these problems. The algorithm ranks reports for the purpose of discovering local spatial-temporal resources in highly mobile environments. Under RBB, relevance functions are employed to prioritize reports and thereby downsize the broadcast. Reports are ranked based on their relevance to the queries received from other mobile users. Only the top-ranked reports (k-1) are sent in each broadcast round; that is, reports are ranked to prevent the overhead of existing flooding methods and to reduce the redundancy of data dissemination. However, the high communication overhead of an RBB with large resources degrades its search performance. The RBB algorithm is also inefficient because every mobile user calculates the expectation value to determine ranks, a process that is not always necessary. Thus, a need arises for a more efficient algorithm that reduces the overhead from communication and unnecessary computation.
To achieve higher efficiency, it is necessary to consider not all mobile users, but only those related to user preferences and the mobile users they affect. Table 1 compares studies on data dissemination methods in terms of pros and cons.

Mobile User-preference-based Data Dissemination (MUDD) on a MOPNET
The aim of data dissemination based on mobile user preferences is efficient sharing of information resources in a limited network environment, such as a MOPNET. An additional advantage of this approach is its support of personalization services. In particular, searching for data of interest through queries should be efficient, so that information resources can be shared continuously between mobile users. Thus, analysis of mobile users' preferences is required for efficient sharing of information resources. Accordingly, the proposed model analyzes a user's preferences using the HoMC to decide the broadcast size and then disseminates data to other nodes.
To make the terms used in our proposed model clear, we define them as follows. Rank refers to the matching sum of the queries received by each mobile user and the retained resource information. The query list refers to the sum of the queries retained by all users on a MOPNET. Report denotes the content list to which the rank values retained by each mobile user are applied. Preference refers to the value(s) resulting from the analysis of a user's content-consuming patterns through the HoMC function, at the rate of matching a mobile user's reports with the number of queries from other users.
Figure 1 illustrates how mobile users communicate with each other for the user-preference-based broadcast on a MOPNET. When a new mobile user (a) enters a MOPNET environment, the new user retains the native query and the broadcast size (k-1) on her or his report list, and broadcasts the modified report list to all users. The other mobile users (b) analyze their own preferences with the received report list and the native query from the newly inserted mobile user. Each user (c) then computes and updates the rank of each report with the analytic results. Then, all users rebroadcast their own query to other users, along with the modified report list. Once user preferences are trained, the broadcast algorithm becomes more efficient, since users tend to request their most frequently used queries repeatedly, especially in certain domains, such as social networks and transportation for commuters. With the help of the training algorithm, accuracy in data dissemination gradually increases as time passes. Figure 2 shows the pseudocode of the user-preference-aware broadcast algorithm.

Overview of MUDD
In the initial stage, n mobile users exchange and store their query lists and report lists to form a MOPNET.

1. A mobile user broadcasts to other users k-1 report lists and one query of her or his interest in the course of session formation on a MOPNET. (L5)
2. Each mobile user checks the received query and the k-1 reports against her or his own report lists to detect overlap. (L6)
3. Once overlap is detected, each user updates her or his report lists by analyzing her or his preferences. Otherwise, the user computes ranks, based on user preference, and registers the relevant information.

The target mobile users to which the reports are re-broadcast, and their number, are determined when queries and reports are sent with the broadcast size "k-1". The process is detailed in the following subsections.

Table 1 Comparison of studies on data dissemination methods

Approach: Flooding-based method [14,15]. This ensures that the data and the queries for data are sent all over the network.
Advantage: Simple and classical method.
Disadvantage: Data implosion, query overlaps, and resource blindness.

Approach: Gossiping-based method [16,17]. Mobile users receiving the packet forward it only to a single randomly selected neighbor, instead of sending it to all neighbors.
Advantage: It avoids the problem of implosion and does not waste as many network resources as flooding does.
Disadvantage: Some mobile users in a large network may not receive the message, since the neighbor is selected randomly.

Approach: RBB [6,7]. The RBB algorithm ranks reports to discover local spatial-temporal resources in highly mobile environments.

MUDD process
Each mobile user receives queries from the sensors of the other mobile users and disseminates them through broadcast. If a received query already exists on the list of the mobile user, the query and its related reports are re-ranked. Otherwise, the information on the query and its reports is added to the list of the mobile user O. Each received query is checked for duplicates, and only the copy with the smallest rank is kept. In this way, the process attempts to significantly reduce duplicated transmissions by disseminating only new data to old neighbors, or only old data to new neighbors.
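The duplicate-handling rule in this process can be sketched as follows. This is a minimal illustration that assumes each list is a mapping from a query identifier to a numeric rank; the name `merge_received` and the data layout are hypothetical, and the re-ranking step itself is not modeled here.

```python
def merge_received(local, received):
    """Merge received (query_id, rank) pairs into a copy of the local list.

    local: dict mapping query id -> rank held by this mobile user.
    received: iterable of (query_id, rank) pairs from other mobile users.
    """
    merged = dict(local)
    for query_id, rank in received:
        if query_id in merged:
            # Duplicate query: keep only the copy with the smallest rank.
            merged[query_id] = min(merged[query_id], rank)
        else:
            # Unknown query: add it to the list.
            merged[query_id] = rank
    return merged
```

Returning a new dictionary instead of mutating `local` keeps the sketch easy to test; an on-device implementation might update the list in place.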

Computation of rank
Equation (1) defines the report as the sum of all relevance values to rank a mobile user's preference. Foreign queries are the queries received from other mobile users; they constitute the non-native queries of a mobile user O. Let (Q_1, Q_2, ..., Q_n) be the queries related to a mobile user O. The report of each mobile user at time t, a(R_P), is defined by equation (1) in consideration of the ranks of mobile user preferences. The relevance of a report a(R_P) to a query Q is determined as follows: if Q(a(R_P)) = true, then match(a(R_P), Q) = 1; otherwise, Q(a(R_P)) = false and match(a(R_P), Q) = 0. This defines the meaning of "match" in equation (1). Finally, the mobile user (Q_1) sequentially prepares reports on the query and report list, and broadcasts queries and reports on the information required to be shared on the query and report list.
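The description of "match" and of the sum of relevance values can be read as the following sketch. It assumes that a query is a predicate over a report and that the rank is the plain sum of the 0/1 match values over all related queries; this is one reading of the text, not a reproduction of equation (1), and the report fields used in the example are hypothetical.

```python
def match(report, query):
    """Return 1 if the query predicate holds for the report, otherwise 0."""
    return 1 if query(report) else 0


def rank(report, queries):
    """Sum the match values of one report over all related queries."""
    return sum(match(report, query) for query in queries)


if __name__ == "__main__":
    # Hypothetical report and queries, for illustration only.
    report = {"topic": "parking", "age_s": 5}
    queries = [
        lambda r: r["topic"] == "parking",
        lambda r: r["age_s"] < 10,
        lambda r: r["topic"] == "fuel",
    ]
    print(rank(report, queries))  # prints 2
```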

Training of mobile user's preference
To train user-preference ranks, we designed the function P and computed mobile user preferences with a HoMC, as shown in equation (2), where x is the number of objects, r is the order of the Markov chain, n_x is the number of received queries, and X is an eigenfunction of user preference over time. Equation (2) assumes an r-th order Markov model. Under this assumption, the conditional probability of X(t+1) = x_j at time t+1 is obtained given that the r preceding state transitions have occurred.
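For readers unfamiliar with high-order Markov chains, the following sketch estimates the transition probabilities of an r-th order chain by counting how often each state follows each history of r states. The maximum-likelihood counting and the name `train_homc` are assumptions made for illustration; the function P of equation (2), including its weighting, may differ.

```python
from collections import Counter, defaultdict


def train_homc(sequence, order):
    """Estimate P(next state | previous `order` states) by relative frequency."""
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        history = tuple(sequence[i - order:i])
        counts[history][sequence[i]] += 1
    model = {}
    for history, next_counts in counts.items():
        total = sum(next_counts.values())
        model[history] = {state: n / total for state, n in next_counts.items()}
    return model
```

Because the number of possible histories grows exponentially with the order, a short observation sequence covers only a few of them, which is why higher orders need more training data.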

Decision of broadcast size
In addition, the size of the mobile user-preference probability, which serves as the upper bound, is determined in order to compute the broadcast size "k", as depicted in equation (3).
That is, the result of the "k" function is incorporated in the average receipt rate of queries (i.e., a random variable).

Expectation of broadcast size
When the reports on the topics of interest are prepared from the query and report list, equation (4) is used to determine the expectation values for pairing with queries. An expectation value, in turn, denotes how correctly the report information in the sending mobile user matches the queries in the receiving mobile user. In addition, it adapts to the length of time elapsed since the last broadcast of a mobile user O. That is, the expected value E(X) [7] of a random variable for the broadcast size "k" on the list is determined based on this idea, as depicted in equation (4), where the parameters are defined as shown in Table 2. Thus, a mobile user broadcasts, to other mobile users, the reports on a native query with a broadcast size k. Using the size k to determine the broadcast size makes it possible to reduce unnecessary broadcasts between users and to apply the size dynamically.
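Once a broadcast size k has been fixed, the final selection step amounts to picking the k highest-ranked reports. The sketch below assumes that ranks are held in a dictionary keyed by report identifier; it does not compute the expectation of equation (4), and the name `select_broadcast` is hypothetical.

```python
import heapq


def select_broadcast(ranked_reports, k):
    """Return the ids of the k highest-ranked reports, best first."""
    top = heapq.nlargest(k, ranked_reports.items(), key=lambda item: item[1])
    return [report_id for report_id, _ in top]
```

Using `heapq.nlargest` avoids sorting the whole report list when k is much smaller than the number of stored reports.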

Parameter optimization for HoMC
The accuracy of our model was compared for each of the parameter values (r,r, k) to estimate the optimal parameters of the HoMC-based RBB, as shown in Figure 3. In detail, r is the scaling factor, the optimal value of which is computed based on the empirical tests shown later. Moreover, r represents the degree of a HoMC, and k represents the continuation weighted-parameter "w" between the d-k and the d hour. These optimal parameters are derived from equation (5), where τ is the number of recommended reports. Accordingly, accuracy (j) is computed as depicted in equation (6), where ω is the number of packets of interest to the mobile user.
In the first-order Markov chain, the parameter values are r = 1, r = 1, and k = 0, and the mean accuracy is 33%. In addition, when the parameter values are r = 1, r = 2, and k = 0, the second-order Markov chain shows an accuracy of 30% on average. Thus, if the order of a Markov chain increases, then the number of combination patterns also increases. Consequently, coverage is low when the training reports are insufficient. Figure 3 shows the result of a HoMC-based MUDD with weighted values, when r > 1 and k > 1. Upon application of optimized parameters, when r = 2, r = 1, and k = 2, the accuracy increases and reaches a maximum above 68%. Therefore, when a set of optimized parameters, or (2, 1, 2), is selected, our proposed model performs 17.3% better in terms of accuracy than the existing RBB algorithm does. The optimal parameter set is formed and used dynamically, depending on relevant MOPNET components (e.g., the number of mobile users, the network bandwidth, the device performance, and the signal strength).

Table 2 Parameters used in equation (4)

p: The probability that a moving object attempts to start a broadcast
p': The probability that a moving object starts per second
T: The volume of transmitted packets per second
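The parameter search described in this section can be pictured as an exhaustive search over a small grid of parameter triples. The sketch below is a generic grid search under stated assumptions: `evaluate` is a hypothetical callable that returns the accuracy for one triple (here named scale, order, and k), and it stands in for equations (5) and (6), which are not reproduced.

```python
from itertools import product


def best_parameters(evaluate, scales, orders, weights):
    """Return (best_accuracy, (scale, order, k)) over the full parameter grid."""
    best = None
    for params in product(scales, orders, weights):
        accuracy = evaluate(*params)
        # Keep the first triple that reaches the highest accuracy seen so far.
        if best is None or accuracy > best[0]:
            best = (accuracy, params)
    return best
```

In practice `evaluate` would train the model on one part of the observed reports and measure accuracy on the rest, so the cost of the search grows with the product of the three candidate lists.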

Simulation Environment
The Qualnet simulator [17] was employed to construct the proposed model and to simulate efficient data dissemination based on user preference. A P2P network was created on a MOPNET that had 1000 mobile users, as depicted in Figure 4. At the start of the simulation, the 1000 users were each assigned a random position within an area of half a square mile. Eight pairs of source and destination users were selected; these users communicated at a constant bit rate. Each source user randomly sent 512-byte packets to destination users during the simulation. The average simulation duration was 25, with 0.1-s intervals inserted in between.
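The initial placement and pair selection can be mimicked outside the simulator with a few lines of plain Python. This is not Qualnet configuration; it assumes a square area (so the side is the square root of 0.5 mile), uniform placement, and source and destination users that are all distinct, none of which is stated in the text.

```python
import random


def place_users(n_users, side, rng):
    """Assign each user a uniformly random (x, y) position in a square area."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n_users)]


def pick_pairs(n_users, n_pairs, rng):
    """Pick n_pairs (source, destination) pairs of distinct user indices."""
    chosen = rng.sample(range(n_users), 2 * n_pairs)
    return list(zip(chosen[:n_pairs], chosen[n_pairs:]))


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for a repeatable layout
    side_miles = 0.5 ** 0.5         # square with an area of half a square mile
    positions = place_users(1000, side_miles, rng)
    pairs = pick_pairs(1000, 8, rng)
    print(len(positions), len(pairs))  # prints 1000 8
```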

Mobile User-preference-based Data Dissemination on a MOPNET
One of the 1000 mobile users was selected randomly. Then, the datasets on the selected mobile user's preferences were collected for six 10-min periods, using the HoMC on a MOPNET. Specifically, as shown in Table 3, the preference datasets of the mobile user are presented as r_1, r_2, ..., r_8 through the 8 × 8 transition matrix, which was designed to train a mobile user's preferences. All combination sets of two values c'(r_i, r_j) are assumed to be compared with each other via the 8 × 8 matrix to compute the mobile user's preferences. Then, the combination set with the two highest values was selected. From Table 3, the transition of the stochastic characteristics of the mobile user's tendency inferred along the time line is determined by equation (7). Based on the computational results of equation (7), c' = (r_1, r_4) was recommended for its two highest values, r_1 (0.4) and r_4 (0.33), since the other combinations, (r_1, r_2), (r_1, r_3), ..., (r_7, r_8), were smaller than (r_1, r_4).

Figure 4 Qualnet simulation environment. A P2P network was created on a MOPNET that had 1000 mobile users.

Furthermore, Table 4 illustrates how to measure the rank of a report to be broadcast, which has been adjusted by the weight value. Thus, r_1 (1.4) and r_3 (1.6) were chosen by a. r_3 was selected instead, however, since the weight value of the report (r_3) was higher than that of r_1 (1.6 > 1.4). Likewise, the above equation continuously computes the ranks of the other reports. Figure 5 compares the original RBB algorithm to RBB's values through the HoMC-based MUDD. In the ideal case, once a moving object enters the system, it immediately receives all the reports that have ever been generated in the system, and it instantaneously receives every report that is generated during its lifespan. In this context, our model performs better than the existing RBB algorithm in terms of the corresponding match throughput, and the HoMC-based MUDD optimization makes it more adaptable to the time duration and the size of data dissemination. Furthermore, in our model, dissemination and ranking of queries enable a mobile user to broadcast the selected reports in such a way as to maximize bandwidth utilization. When data were disseminated in consideration of the mobile user's preferences, as shown in Figure 5, the match throughput improved by 19% on average compared with the traditional RBB. In particular, when the number of users reached 500 or more, the processed volume was 30.1% higher on average. The findings indicate that the learning result via the Markov chain is in direct proportion to the number of users. In addition, as shown in Figure 6, the proposed model yields good performance in terms of precision, recall, and accuracy.
As illustrated in Figure 6, user-preference-based broadcasting produced a 97% true-positive rate and 96% precision. That is, the mobile users on a MOPNET broadcast accurate information to each other. Thus, it is fair to say that they are well trained.
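The three measures reported in Figure 6 follow the standard confusion-matrix definitions, sketched below. The counts in the usage example are made up for illustration and are not taken from the simulation.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0   # also called true-positive rate
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


if __name__ == "__main__":
    # Hypothetical counts: 8 relevant reports sent, 2 irrelevant sent,
    # 2 relevant missed, 8 irrelevant correctly withheld.
    print(classification_metrics(8, 2, 2, 8))  # prints (0.8, 0.8, 0.8)
```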

Conclusions
Many existing studies have tried to solve data dissemination problems, such as loss of a mobile system's power capacity and bandwidth, and delay and loss of packets. In spite of these efforts, many limitations remain in the MOPNET environment, such as non-adaptiveness, implosiveness, query overlaps, resource blindness, and communication overhead. To overcome these issues, we analyzed mobile users' preferences, that is, their content-consuming patterns. As a result, we can reduce the implosion impact caused by queries and resource information sent out from mobile users. This article proposed an adaptive method that takes account of user preferences, which are tallied from the frequency of user query requests during data dissemination. In contrast with traditional methods such as the RBB, it reduces unnecessary transmission by sharing the resources that are most frequently requested by users. In this respect, our proposed mechanism facilitates efficient sharing and transmission of low bandwidth and limited content sent in a limited network environment such as the MOPNET. Thus, we proposed a method to discover resources, considering a user's preferences to gain better insight into the performance of a mobile device. Discovery is conducted through the HoMC to build a MOPNET with each user. This model was then compared to the original RBB and juxtaposed with an ideal case. In addition, the proposed model sets priorities in data dissemination, based on predetermined conditions, such as a user's preferences, and corresponding responses, such as reports. The following results were achieved in this research. First, it was possible to rank a query report related to a user's preferences and then to create optimal parameter sets based on the accuracy of a MOPNET. Second, we implemented a dynamic network with the Qualnet simulator and simulated the proposed HoMC-based MUDD algorithm.
The results showed that the proposed algorithm performed 17.3% better, on average, in terms of data dissemination, than other existing dissemination methods. Third, we employed a broadcast methodology in our algorithm that was based on prioritized data to overcome resource limitations. It is recommended that future studies investigate various wireless networks, such as IEEE 802.11, 802.15, and 802.16. Furthermore, the proposed model will use optimized algorithms, such as a hidden Markov model, as the basis for increasing the number of users and queries on a MOPNET.