A novel dynamic reputation-based source routing protocol for mobile ad hoc networks

Routing in mobile ad hoc networks is based on the cooperation of the network’s nodes. The presence of selfish nodes that do not cooperate in this task drastically reduces the number of delivered packets. In order to find the better paths that include nodes willing to cooperate, we propose a new routing algorithm based on the reputation of the nodes. In our proposal, each node locally assigns a reputation value to the rest of the nodes in the network and next it uses the assigned reputation values to find out the better routing paths, in order to minimize the overall packet loss ratio. We assume that nodes have a stationary routing behavior, but we also include a mechanism to detect changes in their behavior. Our approach has been evaluated in the presence of selfish nodes, in order to compare it with the dynamic source routing algorithm, obtaining a reduction in the packet loss ratio at the expenses of a small increase in the number of hops taken by the packets to reach their destinations.


Introduction
A Mobile Ad hoc NETwork (MANET) is a group of self-organized wireless mobile nodes (MNs) able to communicate with each other without the need of any fixed network infrastructure nor centralized administrative support.Furthermore, the transmission range in such mobile devices is limited; thus, a packet is forwarded in a multihop path relying on the nodes in the routing path.Due to that, MANETs need the cooperation of every node in the path to achieve a successful packet delivery.However, depending on the MANET application, nodes are willing to cooperate with each other (e.g., rescuing services) since they are controlled by an authority, or might be reluctant to cooperate (e.g., data sharing [1], traffic monitoring [2], emergency assistance services [3,4], and multimedia data transmission [5]) trying to save their own resources.Since MANET nodes usually have limited power (i.e., battery) and scarce computational resources (CPUs), nodes might refuse to cooperate in order to save their limited resources.This misbehavior of nodes would drastically affect the routing protocol operation.
It is worth mentioning that the presence of just a few number of selfish nodes could severely degrade the MANET performance [6].Therefore, detecting selfish nodes is crucial to ensure an effective and efficient MANET routing operation.

Contribution and organization
In this paper, we aim to enhance the overall performance of MANETs in the presence of selfish nodes by designing an efficient and dynamic reputation-based algorithm to be used in any ad hoc routing protocol.We would like to highlight the fact that our reputation-based algorithm can be applied over any routing protocol for ad hoc networks that stablishes end-to-end forwarding paths.Using our approach, the routing protocol will establish the forwarding path with the highest nodes' reputation.We introduce a simple metric to estimate the nodes' reputation according to their packet forwarding behavior.
It is worth noting that in this work, we refer to the node's reputation as its availability to forward packets.Due to the nature of their operation without infrastructure, the functionality of ad hoc networks heavily depends on the collaboration among nodes.However, selfish nodes aim to save battery by not forwarding others' packets.Thus, avoiding selfish nodes which do not aim to collaborate lead to ensure a proper network performance.Recall that a collaborative node that temporarily causes packet losses due to, e.g., network congestion or medium collision will soon recover its reputation when it moves to another better area.Thus, our goal is to detect nodes with a persistent selfish behavior.It is out of the scope of the proposal to define any mechanism to reward or penalize the nodes of the network in order to force their cooperation.Actually, our approach consists in a reputation-based forwarding strategy for ad hoc routing protocols that avoid using forwarding paths that include selfish nodes that would not forward packets properly.Thus, the quality of service offered to the user improves.
In the following, we highlight the novel contributions of our paper: We propose a new method to score the forwarding behavior of MANET nodes based only on first-hand information of each source node.This feature is congruent with the fact that in networks without infrastructure such as MANETs, nodes gather only local information.We propose a novel method based on the wellknown Exponentially Weighted Moving Average (EWMA) [7] to detect any change in the nodes' forwarding behavior.The peak signal-to-noise ratio of the second difference of the EWMA is calculated to set the parameters of the EWMA that optimize that peak signal-to-noise ratio.This allows us to detect the limits of the stationary time windows in which the reputation of a node remains static.This way, we can properly update the measure of the forwarding behavior of a node by averaging the reputation samples in those time windows.Additionally, we propose a reputation strategy based on estimating the paths' reputations with the objective of finding the best forwarding path to be used by the sender node.Our novel reputationbased forwarding strategy for ad hoc routing protocols avoids using forwarding paths that include selfish nodes that would not forward packets properly; thus, the percentage of packet losses decreases.Most of the routing protocols already proposed in the literature use a metric based on the minimum distance.Alternatively, our proposal uses a metric based on reputation.This reputation is estimated using only local information, thus avoiding having to rely on the opinions of third parties.Furthermore: We analyze different moving average techniques in order to reduce the noise of the reputation sampled scores and to be able to assign a true reputation forwarding value to the network nodes.We carry out several simulations to check the accuracy of the theoretical results.Finally, we test the proposed forwarding strategy operating over the well-known Dynamic Source Routing (DSR) protocol [8].It is important to notice that the proposed strategy can be easily applied to any end-to-end routing protocol.Results show a significant reduction of packet losses in the network at the expenses of a small increase in the number of hops taken by the packets to reach their destinations.
The rest of the paper is structured as follows: Firstly, Section 2 includes some relevant related work concerning reputation systems in MANETs.Then, Section 3 gives a brief explanation of the proposed routing protocol based on reputation.Next, Section 4 describes the mathematical analysis of the different proposals to assign a reputation value to the MANET nodes.After that, experimental results and discussion are shown in Section 5. Finally, conclusions and future work are given in Section 6.

Methods and experimental
In Sections 3 and 4, we present our proposal, whereas in Section 5, we validate the model and show simulation results.In Section 3, we introduce our approach of a reputation-based routing protocol for MANETs, called dynamic reputation-based source routing (DrepSR).DrepSR includes algorithms to compute the reputation of nodes and paths.Section 4 includes our methodology to average the reputation forwarding values of the nodes, using a cumulative moving average (CMA) to average the reputation samples, and an exponentially weighted moving average (EWMA) to detect behavioral changes in the nodes.The methodologies used to validate our proposal are as follows: We have developed a mathematical calculation to attain the peak signal-to-noise ratio of the second difference of an EWMA.We use this calculation to define the parameter of the EWMA that maximizes the peak signal-to-noise ratio and optimizes behavioral changes in the nodes of a MANET (see Section 4.3).We perform simulations on a static topology in order to validate the mathematical model developed (see Section 5.1).
Once our theoretical model is properly validated in Section 5.1, showing a good accuracy level, we carry out a performance evaluation in Section 5.2.To do so, we present: A representative set of simulations of a MANET for different scenarios with mobile nodes, in which our proposal DrepSR is compared with the DSR protocol.Complete details of this performance evaluation are described in Section 5.2, including simulation settings and simulation results.

Related work
The cooperation issue of nodes in MANETs has received much attention from the research community over the last years.In general, reputation systems can be classified into two categories: price-based schemes and reputation-based schemes.In the following, we present some representative works related to each category.
Price-based schemes treat packet forwarding services as transactions that can be paid for and introduce virtual benefits as a form of rewards to those nodes that have been participating in packet forwarding activities.This means that nodes obtain virtual credits in exchange of offering packet forwarding.The paper [9] proposes a stimulation mechanism based on a counter in each MANET node.The counter increases when the node forwards packets for others and decreases, following a function that depends on the estimated hop count to destination, when the node sends its own packets.A micro-payment scheme for multi-hop cellular networks is presented in [10] to encourage collaboration in packet forwarding by letting users benefit from relaying others' packets.Also, they propose mechanisms for detecting and rewarding collaboration, while detecting and punishing cheating.The work [11] presents a fair and efficient incentive mechanism to stimulate the node cooperation in MANETs by charging source and destination nodes when both of them benefit from the communication.To implement this charging policy efficiently, hashing operations are used in the ACK packets to reduce the number of operations.It has also pointed out that MANET nodes should cooperate and forward packets for others; otherwise, it would be impossible to achieve a pure ad hoc network [12].
Reputation-based schemes basically focus on evaluating each node's behavior and on detecting misbehaving nodes according to their reputation values.In [13], authors propose a watchdog method to identify misbehaving nodes and a path rater technique that helps routing protocols to avoid choosing those nodes to forward any packet.Simulation results prove the benefits of using those two techniques by increasing the throughput to at least 17%.Authors in [14] propose a protocol called CONFIDANT (cooperation of nodes: fairness in dynamic ad hoc networks) able to detect and isolate misbehaving nodes.They show that using CONFIDANT, a network can perform well even with a high percentage of malicious nodes.In [15], the authors present a generic mechanism based on reputation to impose cooperation among MANET nodes to avoid selfish behavior.Authors in [16] propose the observation-based cooperation enforcement in ad hoc networks (OCEAN) to detect and mitigate misleading routing behavior in ad hoc networks.Simulation results show that OCEAN works very well in terms of throughput.However, OCEAN is not fully able to penalize misbehaving nodes severely even though it performs at least equally well in comparison to another more complex schemes.The proposal [17] employs a trustbased system to counteract malicious node's behaviors that consists of a reputation system and a watchdog technique.Their watchdog technique uses a positive feedback message (PFM) as an evidence of the forwarding node's behavior.Regarding [18], the authors proposed a reputation-based trust management for detecting and preventing MANET vulnerabilities.Results of their performance evaluation show scalability and robustness using the proposed scheme.
In [19], authors propose a novel routing strategy that works with nodes that were considered trustworthy, but due to loss of power, they are reconsidered as selfish ones.Their strategy considers that selfish nodes are not malicious by default, and they are forced to provide their current energy level.This helps the routing strategy to avoid nodes with very low power based on the assumption that a node with low energy could drop packets to save its energy consumption.The proposal [20] proposes a trust mechanism based on a trust vector model (TV).The main goal of this mechanism is to detect malicious nodes to later take actions on them.It is implemented in two well-known routing protocols: dynamic source routing (DSR) [8] and ad hoc on-demand distance vector (AODV) [21], producing two proposals called TV-DSR and TV-AODV, respectively.Simulation results on both modified routing protocols show that they can effectively detect malicious nodes and mitigate their attacks.Authors in [22] propose a new mechanism to detect selfish nodes.They consider as a good strategy that selfish nodes drop control packets to avoid themselves being asked to forward data and in this way they save resources for their own use.Nodes are expected to contribute to the network within a time frame.Those nodes who fail to will undergo a test for their suspicious behavior.Simulation results support their scheme by obtaining good results.
Game theory and mechanism design have been widely applied for selfish routing.Authors in [23] propose an indirect reciprocity framework based on a gametheoretic solution to enforce cooperation among nodes.
Based on the proposed model, they obtain the threshold of benefit-to-cost ratio to ensure the convergence of cooperation.Simulation results demonstrate that their game-theoretical solution enforces cooperation among nodes when the benefit-to-cost ratio of the unselfish exceeds the critical condition.The proposal [24] addresses reducing energy consumption of energy-constrained wireless nodes through a game-theoretical energy-aware cooperative relaying scheme with fair distribution of the payoff among players, to keep them satisfied and discourage them from quitting the coalition.This solution also proposes a credit based system to reward cooperative players, hence excluding selfish users from cooperative coalitions.
The approach presented in [25] consists of a dynamic reputation management system to detect and isolate misbehaving nodes in MANETs.They introduce a novel direct monitoring technique that evaluates the nodes' reputation in the network.This technique ensures that misbehaving nodes will be detected and isolated from the network, whereas the rest of the nodes that spend their energy in forwarding data will be allowed to accomplish their network activities.Simulation results show the effectiveness of their model in restraining and mitigating the effects of misbehaving nodes in MANETs.Authors in [26] show how the use of their proposal called neighborhood compressive sensing (NCS) helps in the reduction of resource consumption and at the same time protects the network from attacks and misbehavior.The NCS model compresses sparse data such as routing tables updates.In addition, as individual nodes only accept routing tables updates if it is coming from the leader node's processed information, NCS also prevents the network from attacks and misbehavior.Simulation results show how the NCS model outperforms DSR in terms of energy consumption, lifetime, and packet dropping ratio.
To the best our knowledge, the approach of considering the reputation of candidate MANET nodes to conform the forwarding scheme, using only first-hand information of the nodes to score their reputation and using an exponential weighted moving average model to detect any changes in nodes' reputation, is novel.In the next section, we present our proposal of a reputation-based routing protocol for MANETs.

Proposed routing protocol based on reputation
We would like to highlight the fact that our reputationbased forwarding scheme could be implemented over any ad hoc routing protocol that performs an end-toend forwarding scheme.For the sake of simplicity, in this work, we have selected the well-known DSR protocol, although similar benefits could be obtained over other ad hoc routing protocols.Briefly, DSR [8] looks for routes formed by intermediate nodes when a source needs to send a packet to a destination.To do so, DSR periodically sends route discovery packets to search paths to that destination.Route reply packets contain the whole route for that packet.To fulfill source routing, the routed packets contain the address of each node the packet will traverse.In the event of link breakages, route error packets are generated and alternative routes will be found looking for stored paths in the route cache or looking for new routes.
Our proposal, called dynamic reputation-based source routing (DrepSR), uses DSR as the routing protocol engine to discover the available routes to destination.In addition, DrepSR estimates the reputation of those discovered paths.Then, when a route stored in cache has to be chosen, DrepSR selects the most reputable one instead of the shortest one.The goal is to choose, among the set of available paths, those paths that produce the fewest packet losses.We assume that the main cause of packet loss is the non-forwarding behavior of selfish nodes present in the network.As mobile services increase (e.g., video-on-demand, gaming, social networks), users are worried with the battery lifetime of their smartphones and try to save battery.Relaying packets for others consumes battery; thus, users might feel reluctant to cooperate in the forwarding tasks because of worrying about offering their limited-energy battery.Anyway, although there were other factors for packet losses (e.g., congestion, collisions), it would also be beneficial to use our proposed algorithm.In DrepSR, each node locally assigns a reputation value to the other nodes with which it interacts, using only first-hand information based on its own experience.After that, the reputations of the available paths in cache are calculated using the estimated reputation of the individual nodes that form those paths.The reputation of a path is easily calculated as the product of reputations of those intermediate nodes that form that path.In this way, the algorithm assigns a reputation score to the paths discovered by the DSR mechanism.Finally, each time a node requires to send information to a specific destination and finds out more than one available path, the node will choose the path with the highest reputation.
To estimate the reputation of a node, it is necessary to have feedback of the percentage of packet losses in the path to which the node belongs.In fact, while a source node has no previous experience in the network, it cannot estimate the reputation of the other nodes.Initially, the source node estimates the fraction of packet losses (L p ) for each path p through which it transmits.To get the end-to-end percentage of packet losses, the feedback information of the transport layer is used.In the case of non-connection-oriented communications, the use of RTP/RTCP (Real Time Protocol/Real Time Control Protocol) over UDP (User Datagram Protocol) is assumed.RTCP provides the percentage of packet losses through each available path.For the case of connection-oriented communications, the most widely used protocol is TCP (Transmission Control Protocol).In this case, losses are estimated from the number of retransmissions done by the source node regarding a same message.The main reason for making this choice is its simplicity, and although it may seem a coarse estimation, it is accurate enough to select the best path in the network.With this feedback information about the packet losses, the node calculates L p as the ratio between the amount of lost packets and the amount of sent packets through path p.This way, our cross-layer proposal uses transport information to take forwarding decisions at the network layer.The L p is calculated periodically, updating then the reputation of the nodes that belong to that path.

Algorithm to compute the reputation of nodes and paths
Once the estimation of the fraction of packet losses for a path has been done, the reputation of the intermediate nodes is estimated by sharing out the losses equally among nodes.That is, the algorithm considers that all the nodes of a path, loss the same amount of packets, which is not ever true.Therefore, if a well-behaving node belongs to a path in which there are several selfish nodes, its reputation will be penalized.Although it may seem unfair from the point of view of the nodes, it is not, since the goal of the algorithm is not to estimate an accurate reputation value, but to choose the path with fewer losses.Consequently, the design of schemes to reward or penalize nodes so that we can force their cooperation in the forwarding tasks is not our goal in this present paper.Anyway, it is important to notice that due to the inherent mobility of MANET nodes, a cooperative node wrongly penalized for having been in the same forwarding path than a selfish node will soon recover its reputation when it moves to another area.
The forwarding reputation value (r n ) assigned to a node n represents the forwarding probability for node n, that is, the ratio between the number of packets forwarded by the node and the total number of packets it has received.In the same way, we define the reputation value (R p ) assigned to path p as the ratio between the packets received by the destination node and the packets sent by the source node.Then, the reputation of a path is obtained from its fraction of packets loss given by Eq. ( 1) A packet reaches its destination successfully if all the nodes of the path successfully forwarded it.Therefore, assuming that the forwarding behavior of each node is independent from the others, the reputation (R p ) of path p can be computed as the product of the reputations (r n ) of the nodes n that form that path p, see Eq. (2).
As said before, the proposed algorithm estimates the reputation of the nodes ( b r n ) sharing out the losses equally between the nodes that form the path, except for the origin node that is not supposed to generate losses.In this case, Eq. ( 2) can be rewritten as follows, being b r n the estimated reputation of node n and N p the number of intermediate nodes in the path p.
Each time the source node updates the fraction of packet losses for path p, the forwarding reputation value of each node belonging to that path is estimated.To do that, the reputation value of the path calculated in Eq. ( 1) is used in Eq. ( 3) to obtain Eq. ( 4).
As mentioned before, the fact of dividing the losses equally between the nodes that compose the path could negatively affect the reputation of nodes that have a collaborative behavior.Furthermore, this also might hide a selfish behavior (i.e., the node obtains a higher reputation than it deserves).For this reason, the reputation of nodes must be averaged taking into account the estimations made in all the paths in which the node has previously participated.In the next section, we propose a methodology to average the reputation samples of the nodes.
The main goal is that each node has a table with the estimated reputations of the other nodes.This way, nodes can assess the reputation of the different discovered paths and choose the most reputable one.The reputation of the paths is computed by using the estimated reputations of the nodes in Eq. (2).
A drawback of the proposed algorithm is that nodes with a bad reputation can experience difficulties to update their reputation in case of changing their behavior and cooperate hereinafter.This happens because those nodes have not been selected as forwarding nodes for a while (since source nodes try to avoid them due to their selfish behavior), so they cannot provide new updated information of their current well behavior.To tackle this issue, we make nodes to periodically forget all they have learnt about the rest of the nodes, as a kind of reset in the framework.
In the following, we present the algorithms proposed to select the best forwarding path.Algorithm 3 is the main algorithm in our methodology.This algorithm checks, among the set of active paths, if it has new information about the packet losses experienced in the path.If positive, it updates the reputation of the nodes that belong to that path using Algorithm 1 (UPDATENODESREPUTATION).In the case that it is necessary to find a route to a destination, a DSR route discovery is done to find out the available paths to that destination.Next, the reputation of each discovered path is calculated using Algorithm 2 (GETPATHRE-PUTATION), and the path with the highest reputation score is selected.Additionally, Algorithm 1 uses the AVERAGE function (described in Algorithm 4) to estimate the abovementioned nodes' reputations that will be explained in the next section.

Algorithm to average the reputation forwarding values
We assume that nodes have a stable forwarding behavior for an undetermined time.This entails that the average reputation forwarding value of the nodes is kept stationary during an arbitrary interval.Nevertheless, we also consider the possibility that the forwarding behavior of the nodes will change throughout time starting a new stationary interval.This may be due to the fact that the node either changes its behavior or its neighborhood, which affects its reputation.
As mentioned above, each time a node gets the reputation of a path, an estimation of the reputation of the nodes belonging to that path is updated.For convenience in the nomenclature, we name the set of k consecutive samples of the estimated behavior for a node n, as r 1 , r 2 , … , r k instead of c r n 1 ; c r n 2 ;…; c r n k .For the sake of simplicity and clear explanations, we ignore the subscript n to refer to the node n under evaluation.To assign a reputation to that node, we use a moving average.The moving averages that we have considered are the cumulative moving average (CMA) and the exponentially weighted moving average (EWMA) [7].

Cumulative moving average
In the case of using a CMA [7], we compute the average using all the reputation history of a node: In Eq. ( 5), c k represents the CMA of node n after its k-th reputation sample has been estimated.Since past samples are weighted equally than the present ones, this average does not reflect the updated current behavior of the node in case it changed.If the behavior of the node were always the same, this would be a correct average to be implemented.
We assume that the node has a stationary behavior within a time window.Otherwise, it would not have sense to average the reputation samples.Under this assumption, we can recalculate the CMA each time a new stationary window starts.Now, the problem is to obtain the size of that window, that is, we need to detect when a change occurs in the behavior of the node, so that the average only includes samples that have the same stationary behavior.By means of a mechanism that allows us to detect those behavioral changes, we could define the window in which we would average the reputation samples.Each time a new window starts, we forget the old samples, that is, we reset the i counter in Eq. ( 5).
The set of reputation samples r k of a node n that has a steady behavior can be expressed as a constant value r to which a noisy signal Δr k is superimposed.
The noisy signal is a zero mean stochastic process (E[Δr k ] = 0).Therefore, r is the expected value of the samples E[r k ], which remains constant for all k samples because of the stationary behavior of the nodes.The second moment of Δr k is the variance of the samples, which we assume to be constant for the temporary window of steady behavior: In the same way, we can express c k ¼ c k þ Δc k .Then, using Eq. ( 6) in Eq. ( 5), we obtain The expected value of the averaged samples using CMA does not depend on k and remains constant for the time window with steady behavior.Since Δr k is a zero mean process, Δc k is a zero mean process as well.In order to evaluate the deviation from the expected value, we calculate the variance of the CMA samples.We assume that forwarding is a memoryless process, that is, a node that forwards only the 50% of the packets throws a coin before each forwarding independently of the others.Under this assumption, we have that processes Δr i and Δr j are independent for i ≠ j and the variance is given by Eq. ( 10): Concluding, for the CMA under the assumption of steady behavior in a well-defined temporary window, the expected value of the reputation samples remains the same.Also, the variance of the reputation samples reduces inversely proportional to the number of samples.

Exponentially weighted moving average
Another widely used alternative to average samples is EWMA [7].While averaging, more emphasis to recent samples is placed reducing exponentially the weight of past samples.We denote w k as the value of the EWMA after processing the r k sample: In Eq. ( 11), β is a factor between 0 and 1 that weights the current sample r k .The higher the β, the more emphasis is given to the present.In general, EWMA has less lag than CMA and is therefore more sensitive to changes.
To calculate the EWMA of a node, it is necessary to take an initial reputation value.A priori we consider that all nodes in the network are collaborative.Therefore, the initial reputation for all nodes must be one.Nevertheless, this proposal is defined in a generic way for any other initial value.Using Eq. ( 11) recursively, we obtain the moving average as a function of all the reputation samples exponentially weighted, where w 0 is the initial value of the moving average: In the same way that we have done with r k in Eq. ( 6), we can express w k ¼ w k þ Δw k , being w k the asymptotic behavior of the EWMA and Δw k a stochastic process that superimposes to it.Using Eq. ( 6) in Eq. ( 12), we relate w k with r and Δw k with Δr k .
We can see that after k samples, the difference between the expected value of the sample and the asymptotic behavior of the moving average, that is w k −r , has reduced by a factor (1 − β) k .Since (1 − β) k < 1, when the number of samples k grows, this difference tends to zero and w k approaches asymptotically to r .Therefore, the greater the β factor, the faster the asymptotic approach to the desired value r.
Since Δr k is a zero mean process, Δw k is a zero mean process as well.Note that according to our definition of r k (see Eq. ( 6)), we have that Δr k = 0, ∀ k < 1.
In order to evaluate the deviation of the EWMA of the reputation samples, from the asymptotic behavior, we calculate the variance of the EWMA of the reputation samples.Using the same assumptions as in CMA, we attain the variance of the EWMA as a function of the variance of the reputation samples (see [7] for further details): Notice that although we suppose that σ 2 r is constant in the stationary window, σ 2 w depends on k.Since always <1, we have that σ 2 w ðkÞ < σ 2 r .This means that the EWMA reduces the deviation of the samples from their mean value.The smaller the β factor is, the smaller the variance σ 2 w ðkÞ is.If we compare the variances of EWMA and CMA, we can notice that for CMA, this variance tends to zero when the number of samples increases (Eq.( 10)).In contrast, for EWMA, when the number of samples grows, the variance tends to a value that depends on β which is shown in Eq. ( 16): Additionally, for EWMA, there is a trade-off between the asymptotic approaching speed and the variance.High values for β entail a quick approach to the average stationary value, but with a low variance reduction.Conversely, low values for β reduce considerably the variance, although with a slow approach.

Stationary time windows detection
CMA is the best option for averaging the reputation samples, in case the behavior of the nodes changes steeply and we are able to detect the moments in which the stationary time windows start and end.To do so, we propose a method to detect changes in the nodes' behaviors and delimit these stationary time windows.Our proposal is based on the combined use of an EWMA to detect the change of the nodes' behavior and a CMA to average their reputation values.
Figure 1 shows the EWMA for the reputation of a node that changes, defining two stationary time windows.If the asymptotic behavior of the EWMA was a continuous function, we would have a discontinuity in the derivative each time the stationary behavior of the node changes and there is a Dirac delta function in the second derivative.Since we have a discrete time function, we use the second difference of the EWMA instead of the second derivative in order to detect the end of a stationary time window.We define the first difference of the EWMA as: and the second difference With this definition, if a peak for d k occurs at k 0 , it means that the sample r k 0 belongs to a new stationary time window.The above expression has no sense for k < 3, since at least three samples are necessary to compute the second difference.
In order to find the values of β that optimize the peak detection, we relate the second difference d k with the samples r k .Using Eq. ( 12) on Eq. ( 18), we have: Under the assumption that in the stationary window To relate the noise of the second difference of the EWMA with the noise of the samples, we calculate the standard deviation of d k .As we have said above, we assume that E(Δr i • Δr j ) = 0, ∀ i ≠ j and EðΔr 2 i Þ ¼ σ 2 r ; ∀i > 0 in the stationary window.In this case: Plotting the factor that relates both variances as a function of β, using a different number of samples (see Fig. 2), we get practically the same graph being impossible to distinguish them, then we can approximate the above expression for the case k → ∞(in this case we approximate (1 − β) 2(k − 2) ≈ 0, since β < 1).
For large values of β, the noise σ 2 d increases, being greater than that of the samples σ 2 r .Therefore, we have to choose a small value for β if we want to reduce the noise and optimize the detection.
On the other hand, the peak of the second difference will be higher the value of β, being easier to detect.We must compare the value of the peak with the noise and find the value of β that optimizes the peak signal-to-noise ratio.To calculate the peak value, we assume that the reputation of the node changes from a value r 1 to a value r 2 , and this change occurs between the sample k 0 − 1 and k 0 so that 19) we get: Comparing this result with Eq. ( 20), we observe that the peak that appears in the second difference caused by the change of behavior is: Using Eqs. ( 23) and ( 25), we can define the peak signal-to-noise ratio as a function of β: Since (2 − β)/2(2 + β) is a decreasing function of β, we can conclude that the lower the β value, the better the detection of a new stationary time window.Obviously, we cannot use β = 0 because in this case, we were not averaging samples.

Proposed averaging algorithm
Our proposal is to average the samples of the reputation of a node using a CMA in the stationary period of steady behavior for each node.We use this average as the reputation of the nodes in order to find the best forwarding routes in the MANET.To delimit this stationary period, in parallel, we calculate the EWMA of the samples using a low value for β, whose value is discussed in the simulation results section.Also, we compute the second difference of that EWMA average obtaining the values d k .We suppose that the samples d k represent an ergodic process and therefore its mean value d k and its standard deviation σ d k are calculated as follows: In Eq. ( 27), k represents the number of samples that we have in the stationary time window.The more samples, the more accurate the result we achieve.
When the absolute value of a new sample d k + 1 is higher than a certain number of standard deviations added to the mean value, a peak detection event occurs.In this case, we start a new CMA with the new samples that belong to a new stationary time window.The peak detection condition is shown in Eq. ( 28): Since we do not know if the reputation of the node will increase or decrease, we do not know if the peak of the second difference will be positive or negative.Therefore, we use the absolute value of d k + 1 and d k .The factor F determines how much higher the peak should be with respect to the noise.A small F value may be the cause of false positives so that the noise in the samples may seem like a peak.On the other hand, a large F value may be the reason why small changes in the behavior of a node are not detected.As seen in Eq. ( 25), the amplitude of the peak depends on the difference of both reputations, the old and the new one.
To sum up, the proposed algorithm uses a CMA to average the reputation samples and the second difference of the EWMA with a small β to detect the limits of the stationary time windows.r and the variance of the second difference of the exponentially weighted moving average of the samples σ 2 d , as a function of β for different number of samples k (see Eq. ( 22))

Experimental results and discussion
In this section, we first validate our proposal in Section 5.1, and afterwards, we carry out a performance evaluation in Section 5.2.

Validation of the theoretical model
In order to validate the theoretical results in the detection of behavioral changes in the MANET nodes and to be able to define the temporal windows in which nodes have a stationary behavior, several Matlab R2017b [27] simulations have been performed with the topology shown in Fig. 3.It depicts a source node (S) that transmits to three different destinations (D i , i = 1, 2, 3).Every 1000 packets transmitted through a path, the loss probability for that path is calculated and a reputation sample is obtained for each one of the intermediate nodes that form that path.In total, 100,000 packets have been transmitted to each destination.All nodes forward all packets they received except node 2, which does not forward a 30% of the packets during the first half of the simulation and a 10% during the second half.That is, its real reputation goes from 0.7 to 0.9.Node 4 only participates in paths S-D 2 and S-D 3 while nodes 1, 2, and 3 participate in all paths.Thus, node S should assign the same reputation to nodes 1, 2, and 3. Initially, for different values of β, we check if the behavioral change of node 2 is clearly detected, and we also verify if node S correctly assigns their reputation value to nodes 1 and 4. The detection threshold has been set with F = 2, since this value showed a very good detection rate after conducting many simulations.This way, a detection takes place if jd kþ1 j > jd k j þ 2 Á σ d k .In Figs.4,5,6,7,8,9, and 10, the horizontal axis indicates the number of samples.Each time a peak of |d k | gets above the threshold, the average of the reputation samples is reset.
Since node 1 participates in different paths, its samples are noisier than those of node 4. The noise makes it difficult to detect the change in the behavior of node 2 if the threshold of detection is too high.In Fig. 4, it is observed that for a value of β = 0.01, the detection can perfectly be done, although false positives are obtained.A false positive occurs when |d k | gets above the threshold but actually, there was not a change in the behavior of the node.One of the false positives has been marked with a red circle in Fig. 4 in sample number 125.As it can be seen, a false positive produces an oscillation in the average value of the reputation due to the fact that the CMA is reset (see the graph in the top left in Fig. 4).Since the variance of CMA decreases inversely proportional to the number of samples (see Eq. ( 10)), when the CMA is reset, its variance is the same than the variance of the reputation samples, and it decreases as we average more reputation samples.In Fig. 5, we get similar results using a value of β = 0.1.Increasing the value of β to 0.3 (see Fig. 6), we are in the limit of detection.However, for β = 0.7 (Fig. 7) and β = 0.9 (Fig. 8), the peak-to-noise ratio is lower so that no change in behavior occurs for node 1 and the averaging is slower.As already mentioned in the previous section, high β values worsen the peak-to-noise ratio, which makes it difficult to detect a change in the node's behavior.
The value of the threshold F must be set experimentally.Similar problems can be found in the literature (e.g., control charts [7]) and researchers adjust this parameter based on experiments.To do so, several simulations have been performed with β = 0.1, changing the detection threshold value F in each simulation: F = 1 in Fig. 9, F = 2 in Fig. 5, and F = 3 in Fig. 10.We can see that low F thresholds produce false positives that generate small oscillations in the averaging of the reputation of the nodes (see the top graphs in Fig. 9).Despite this, it is preferable to detect false positives than not being able to detect the behavior change of the nodes.In case that false positives always occur, reputation samples are not averaged, since the CMA is constantly reset, but the changes of behavior are always detected.Therefore, it is more convenient to use low F values.On the other hand, in Fig. 10, we can see that higher values for F do not allow to detect the change of the behavior of the node with noisier reputation samples, and its reputation is not updated properly.As mentioned above, since node 4 only participates in one path, its reputation samples are less noisy than those of the other nodes, being possible in this case to use higher F thresholds in order to avoid the risk of committing false positives (what happens for low F values).After many representative simulations we have conducted, we have obtained an adequate value for the threshold F = 2. Fig. 3 Network topology used in the simulation with a source node (S) and three destinations (D 1 , D 2, and D 3 ).In the simulation, all nodes forward 100% of the packets, except node 2 that forwards only a 70% of the packets during the first half of the simulation.Afterwards, the node changes its behavior and forwards a 90% of the packets during the last half of the simulation

Performance evaluation of the proposed routing protocol
To verify the operation of the proposed routing algorithm, a simulator of a MANET has been developed using Matlab R2017b [27].The simulator implements DSR and also our proposed DrepSR forwarding algorithm.This way, we can compare the performance of DSR in both cases, either by using the minimum number of hops (vanilla DSR) or, conversely, by using our algorithm to select the route with the highest reputation (DrepSR).following a random waypoint movement pattern.Half of the nodes have been configured so that they behave selfishly by dropping packets they should forward.Here, we show results with selfish nodes that only forward 50% of the packets they should forward.The simulation settings are shown in Table 1.
For each simulation, we have obtained the packet loss probability in the MANET and the average number of hops, both for DSR and DrepSR routing protocols.These results are shown in Table 2.
Averaging the ten simulations, we obtain that the proposed DrepSR reduces losses by 20% (DSR shows a a b packet loss probability of 21% and DrepSR 16%) with respect to DSR.Also, we can see that the number of hops barely increases and can be considered negligible.The average of the results and the 95% confidence interval are shown in Fig. 11.Optionally, researchers can use open source datasets for MANETs to carry out the performance evaluation of proposals, like the one presented in [28].This kind of traces generated from real experiments can be very useful to evaluate protocols and algorithms accurately.

Conclusion and future work
In this paper, we have presented a novel forwarding strategy for ad hoc routing protocols called DrepSR.DrepSR is a modification of the vanilla DSR that includes our novel reputation-based forwarding algorithm.
Our approach is based on the reputation of MANET nodes where we consider the presence of selfish nodes which do not cooperate in the routing tasks.Each source node uses the well-known DSR protocol as the routing engine to discover the available routes to destination.Then, the source node estimates the reputation of each path selecting the most reputable one using only first-hand information.The advantage of this methodology is its simplicity, and although it may not calculate the real reputation of the single nodes, it is able to find those paths with less losses, which was our initial goal.Our main objective is to minimize the packet loss ratio when sending packets through different routes.Therefore, it is natural to use the observed packet loss ratio of each node as a metric of its reputation.Of course, packet losses may not always be caused by the selfish behavior of the nodes, but could also be due to traffic congestion or the random nature of wireless communications, or even because of the limited capabilities of nodes.However, the actual cause of a packet loss is not captured by our approach and we suppose it is always related to a selfish behavior, directly affecting the reputation of the nodes involved.We would like to highlight that although our reputation metric is not an accurate measure of selfish behavior for any individual node, it a b is a good estimation that helps choosing the best available routing path.We have validated our theoretical results through extensive simulations using Matlab R2017b [27].After validation, the novel strategy was implemented over DSR using Matlab R2017b and many simulations were carried out for different scenarios with mobile nodes.We compared the probability of packet loss and the average number of hops for the DSR protocol (routing using the shortest path) and the DrepSR protocol (routing using the most reputable path).Results show that DrepSR reduces the probability of packet losses by 20%, while it slightly increases the average number of hops by 8%.
As a future work, we are planning to use Expectation Maximization model to accurately estimate the paths' reputation and compare it to our proposed strategy developed in this work.We are also considering the use of open source datasets for MANETs to carry out the performance evaluation of our protocol and algorithms.

Fig. 1
Fig.1Average of reputation samples of a node if we use an when the node changes its behavior defining two stationary windows

Fig. 2
Fig.2Plot of the factor that relates the variance of the samples σ 2 r and the variance of the second difference of the exponentially weighted moving average of the samples σ 2 d , as a function of β for different number of samples k (see Eq. (22))

Fig. 4 Fig. 5
Fig. 4 Average value of the samples (figures at the top) and interval detection (figures below) when β = 0.01 and F = 2: a for node 1 and b for node 4

Fig. 6 Fig. 7
Fig. 6 Average of the samples (figures at the top) and interval detection (figures below) when β = 0.3 and F = 2: a for node 1 and b for node 4

Fig. 8 Fig. 9
Fig. 8 Average value of the samples (figures at the top) and interval detection (figures below) when β = 0.9 and F = 2: a for node 1 and b for node 4

Fig. 10 Fig. 11
Fig. 10 Average value of the samples (figures at the top) and interval detection (figures below) when β = 0.1 and F = 3: a for node 1 and b for node 4 Sample of estimated reputation for node n N p Number of intermediate nodes in path p r k k-th sample of a node reputation r Mean value of r k Δr k Deviation from the mean value of the sample r k σ r Standard deviation of r k c k Cumulative moving average (CMA) of the k samples of r k c k Mean value of c k Δc k Deviation from the mean value of c k σ c Standard deviation of c k w k Exponentially weighted moving average (EWMA) of the k samples of r k β Weight for the next sample to be averaged in the EWMA w 0 Initial value for the EWMA w k Mean value of w k Δw k Deviation from the mean value of w k σ w Standard deviation of w k f k First difference of the w k values d k Second difference of the w k values Guaya-Delgado et al.EURASIP Journal on Wireless Communications and Networking (2019) 2019:77

Table 1
Simulation settings

Table 2
Simulation results

Table 3
Notation and symbols