Opportunistic wireless network coding with relay node selection

The broadcast nature of wireless communications makes it possible to apply opportunistic network coding (OPNC) by overhearing packets transmitted from a source to sink nodes. However, it is difficult to apply network coding in a topology with multiple relay and sink nodes, because OPNC alone cannot guarantee a network coding gain there. We therefore propose relay node selection, which finds a proper node for network coding. The proposed system is a novel combination of wireless network coding and relay selection. Taking into account the channel state and the potential network coding gain, we propose several relay node selection techniques that outperform conventional OPNC and the conventional channel-based selection algorithm in terms of average system throughput.


Introduction
Channel coding is used to mitigate the influence of noise and interference in the physical layer. In [1], it was also shown that coding gain can be obtained in higher layers. In contrast to routing and scheduling techniques, which are devised to prevent bottlenecks of packets from different senders, Ahlswede et al. [2] showed how to exploit such bottlenecks: the achievable rate can be increased by applying certain in-network processing at an intermediate node when packets arrive at the node simultaneously. This type of in-network processing is called network coding. Routing can be treated as a special case of network coding in which the processing is a simple permutation. Network coding has received attention since it can enhance system throughput and reliability. For throughput, network coding can take advantage of the bottleneck effect of data at an intermediate node in wireless communication to improve the system throughput [3]. Ghaderi et al. [4] have shown that applying network coding in their system brings reliability benefits. Li et al. [5] showed that the maximum achievable rate can be attained by linearly combining input packets at an intermediate node. Random linear network coding (RLNC) [6] and opportunistic network coding (OPNC) [7] are known as practical implementations. RLNC randomly chooses elements from a finite field as the coefficients for a linear combination of packets.
OPNC performs a bitwise XOR of packets that are selected according to reception reports. RLNC is suitable for distributed systems and needs no reception report, since the header carries all the information required to decode the received packets at the receiver node. However, as the number of hops or the number of participants increases, the length of the header also increases, which might degrade the throughput. Although OPNC needs extra reports, their share is not significant compared to the original information, and coding and decoding are simple to implement. As a practical implementation of OPNC, Katti et al. [7] introduced a scheme, COPE, that takes advantage of the broadcast nature of wireless communications.
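The XOR operation that OPNC relies on can be illustrated with a minimal Python sketch. It assumes two equal-length packets and two sinks that have each overheard the packet intended for the other; the function name `xor_bytes` and the payloads are hypothetical and are not taken from COPE.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Return the bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length (pad the shorter one)")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical payloads: packet_a is wanted by sink x, packet_b by sink y.
packet_a = b"to-sink-x"
packet_b = b"to-sink-y"

# The relay sends one coded packet instead of two separate packets.
coded = xor_bytes(packet_a, packet_b)

# Each sink removes the packet it overheard earlier to recover its own packet.
recovered_at_x = xor_bytes(coded, packet_b)
recovered_at_y = xor_bytes(coded, packet_a)
```

One coded transmission serves both sinks here, which is the source of the coding gain; a sink that has not overheard the other packet cannot decode.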
COPE employs a practical network coding technique for unicast traffic in wireless mesh networks to improve total throughput. Experiments showed that OPNC significantly improves the throughput of wireless networks carrying UDP traffic. Recently, Fang et al. [8] analyzed COPE and argued that the key to COPE's success lies in the interaction between COPE and the MAC protocol. How the MAC protocol deals with competing nodes in a given network plays an important role in the performance improvement. In this paper, we consider two factors: the channel state information, which can affect the performance of a system, and the handling of multiple intermediate nodes that can perform network coding simultaneously. Without a decision method at the intermediate nodes, this kind of network cannot guarantee the throughput gain of network coding reported in [7].
An uplink model that consists of multiple users, multiple relays, and a single base station (receiver) was used in [9], which proposed finite field network coding with superposition coding at the relay nodes. In [10], which uses the same uplink system model, the relay nodes were replaced with a set of user nodes, and a diversity network coding scheme over finite fields was proposed for the resulting multi-user cooperative communication system. In [11], a downlink model was considered that consists of a single transmitter base station, a single relay node, and multiple receiving user nodes. The authors proposed an instantaneously decodable binary network coding scheme and showed its improved transmission efficiency compared to the existing ARQ and network-coding-based schemes. Bletsas et al. [12] dealt with a cooperative communication system consisting of a single source node, a single sink node, and multiple relay nodes, and introduced a distributed network path selection algorithm which performs opportunistic relaying by using an objective function based on the channel states at the relay nodes.
In this paper, we consider a system model that includes multiple relay nodes and multiple sink nodes. With this system model, we combine opportunistic relaying with network coding and propose a relay selection measure which considers the channel state between the relays and the destination nodes. We compare the performance of the proposed algorithms with conventional OPNC and opportunistic relaying in terms of throughput. The rest of this paper is organized as follows. The section "System model and scenario" describes the system model. The section "Proposed relay selection techniques for network-coded transmission" proposes several relay selection schemes and compares them with the conventional relay selection schemes. The section "Simulation results" verifies the results by simulation, and the last section draws our conclusions.

System model and scenario
The system scenario and the system model are introduced in this section. A source node has packets that need to be delivered to different destinations. There are multiple relay nodes, some of which might have better channels to a destination than the channel between the source and that destination. After the source broadcasts the packets, some packets may not reach their destination nodes successfully, and the missing packets must be retransmitted. Since some destination nodes overhear packets that are sent to other destination nodes, network coding can be effective in this scenario. With network coding and relay selection, the best intermediate node for retransmission is selected.

Transmission from source to neighbor nodes
We have a source node $S$, a set of relay nodes $R$, and a set of sink nodes $D$. Assume that $S$ has $n$ packets to transmit to the corresponding sink nodes (i.e., $S_a = \{a_1, \ldots, a_n\}$), $R$ includes $l$ nodes ($R = \{r_1, \ldots, r_l\}$), and $D$ includes $m$ nodes ($D = \{d_1, \ldots, d_m\}$). Each packet $a_i \in S_a$ carries its own destination address. We assume that all nodes in $R$ and $D$ are within communication range of $S$. At first, the source node $S$ broadcasts the $n$ packets to all the nodes in its range. Every neighbor node is assumed to be able to overhear the data traffic of other nodes as in OPNC and stores all the overheard packets in its buffer. A relay node $r_j$ receives a set of packets $\mathbf{a}_j$, and a sink node $d_i$ receives a set of packets $\mathbf{b}_i$. Both $\mathbf{a}_j$ and $\mathbf{b}_i$ are subsets of the original $n$ packets ($n \ge |\mathbf{b}_i|$ and $n \ge |\mathbf{a}_j|$ for all $d_i \in D$ and all $r_j \in R$).
After the source transmission is over, some packets may be lost at the sink nodes because of poor channels between the source and those nodes, so the missing packets must be retransmitted. If the source retransmits the data, the packet loss may occur again. If there exists a relay node $r_j$ with a better channel response than the source node $S$, it may be better for $r_j$ to retransmit the packet to the destination. It is assumed that the relay set $R$ collectively receives all the packets that the source sent. We then have

$$\bigcup_{j=1}^{l} \mathbf{a}_j = S_a \qquad (1)$$

This means that the union of the packets of all relay nodes is identical to the set of all the packets from the source $S$. The number of packets from the source ($n$) should be less than the buffer size to prevent overflow. When $n$ is larger than the buffer size, we can divide the $n$ packets into a number of groups as in a practical RLNC scheme [13].
Next, it will be shown that the probability of satisfying (1) is close to 1 in the high SNR regime. Let $h_0$ be the event that at least one node in $R$ correctly receives a packet from the source, and let $h_0^c$ be the complement of $h_0$. The two events are related by

$$P(h_0) = 1 - P(h_0^c) \qquad (2)$$

where $P(\cdot)$ denotes probability. $P(h_0^c)$ is the probability that no node in $R$ receives the packet correctly. Assuming that the channel responses are independent, we have

$$P(h_0^c) = \varepsilon_p^{\,l} \qquad (3)$$

where $\varepsilon_p = 1 - (1 - \varepsilon_{s,M})^N$ is the packet error rate, $l$ is the size of the set $R$, $\varepsilon_{s,M}$ is the symbol error rate of M-QAM, and $N$ is the number of symbols in a packet. We can calculate a lower bound of $P(h_0)$ using the upper bound of $\varepsilon_{s,M}$ from [14]. We then have

$$\varepsilon_{s,M} \le 4\,Q\!\left(\sqrt{\frac{3k\,E_{bav}}{(M-1)\,N_0}}\right) \qquad (4)$$

where $Q(\cdot)$ is the Gaussian tail function, $M$ is the modulation order of QAM, $k = \log_2 M$, $E_{bav}$ is the bit energy, and $N_0$ is the noise variance. Note that the upper bound in (4) is for an AWGN channel. By plugging (3) and (4) into (2), we can calculate $P(h_0)$. If $n$ packets are transmitted, the probability that every packet is received by at least one relay node is simply the $n$th power of (2), since the channels are independent. Let us denote the event that satisfies (1) by $h_1$. Then

$$P(h_1) = \left[P(h_0)\right]^n \qquad (5)$$

In (6), $E_s$ is the symbol energy, $h$ is the channel response, $P_s$ is the symbol transmit power, $B$ is the channel bandwidth, and $R_s$ is the symbol rate. From (2) to (5), $P(h_1)$ is lower-bounded by (7). Figure 1 shows the plots of Equation 7. We use $B = 5$ MHz, 16-QAM, $R_s = 2$ bps, and a Rayleigh fading channel for $h$. The plots indicate that the probability of having at least one relay node with a correctly received packet approaches 1 at high SNR. Therefore, it is sufficient for the relays, instead of the source, to retransmit the data.
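The chain of bounds above can be evaluated numerically. The following Python sketch assumes an AWGN channel without fading and uses the standard upper bound on the M-QAM symbol error rate, $\varepsilon_{s,M} \le 4Q(\sqrt{3kE_{bav}/((M-1)N_0)})$; the function names and the parameter values ($N = 100$ symbols, $l = 5$ relays, $n = 10$ packets) are illustrative assumptions and are not the settings behind Figure 1.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ser_upper_bound(ebn0_db: float, M: int) -> float:
    """Upper bound on the M-QAM symbol error rate over AWGN (assumed form)."""
    k = math.log2(M)
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return min(1.0, 4.0 * q_function(math.sqrt(3.0 * k * ebn0 / (M - 1))))


def prob_all_packets_covered(ebn0_db: float, M: int, N: int, l: int, n: int) -> float:
    """Lower bound on the probability that each of n packets reaches >= 1 of l relays."""
    eps_s = ser_upper_bound(ebn0_db, M)
    eps_p = 1.0 - (1.0 - eps_s) ** N   # packet error rate for N symbols
    p_h0 = 1.0 - eps_p ** l            # at least one relay receives the packet
    return p_h0 ** n                   # all n packets covered (independent channels)


for snr_db in (0, 10, 20):
    p = prob_all_packets_covered(snr_db, M=16, N=100, l=5, n=10)
    print(f"Eb/N0 = {snr_db:2d} dB -> lower bound {p:.4f}")
```

Because an upper bound on the symbol error rate is inserted, the result is a lower bound on the coverage probability, and it rises toward 1 as the SNR increases. Averaging over Rayleigh fading, as done for Figure 1, would require an additional step that is not sketched here.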

Retransmission procedures

Reception report from the destinations to the relays
Each relay and destination node operates in an opportunistic listening mode, in which it stores every received packet for a given period regardless of the packet's destination. The storing period is a system-dependent variable (500 ms in [7]). After the source transmission, each destination $d_i \in D$ creates a report packet and broadcasts it to all the relays, one destination after another. Since there are multiple sink nodes in $D$, each sink node uses a random access method such as CSMA/CA to avoid collisions. The report packet is sent to the source and the relay nodes. The information in the report packet consists of the source node ID, the current node ID, the original sink node IDs of the received packets, and a pilot signal, as shown in Figure 2.
The report packets are small compared to the information packets, as indicated in [7]. Let $n$ denote the number of packets at the source node, which is known to all the nodes. The report packet consists of a pilot, a source node ID, a current sink node ID, and the destination sink IDs of the $|\mathbf{b}_i|$ received (stored) packets. We assume that the source node ID and the current sink node ID together occupy 64 bits (two 32-bit addresses, as in the IPv4 format). Let us denote the pilot size by $l_1$, the packet size by $l_2$, and the number of network-coded packets by $l_3$. A report packet then needs at least $64 + n \log_2 n + l_1$ bits. Before the retransmission from a relay node to its destinations, the relay receives $m$ report packets, where $m$ is the number of sink nodes. The ratio $r$ of the overhead due to the report packets is given by (8). For example, if $n = 10$, $l_1 = 2$, $l_2 = 1$ kB/packet, $m = 10$, and $l_3 = 3$, we have $r = 4\%$. Note that the feedback is performed at the packet level instead of the symbol level, and the overhead of the report packets is small compared to the overall data traffic, as the example shows. The report packet transmitted from each sink node is overheard by each node in $R$. Based on the information in these report packets, each relay $r_j \in R$ estimates the channel state to each destination and calculates the objective function, which is used for selecting the retransmitting node in a distributed manner.
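The overhead figure in the example can be checked with a short calculation. The exact expression for the ratio is not reproduced in this text, so the Python sketch below assumes that the ratio is the total size of the $m$ report packets divided by the size of the $l_3$ retransmitted data packets, that the pilot size is given in bits, and that 1 kB equals 1000 bytes. The function names are hypothetical.

```python
import math


def report_bits(n: int, pilot_bits: int, id_bits: int = 64) -> float:
    """Size of one report packet: fixed IDs + up to n packet entries + pilot."""
    return id_bits + n * math.log2(n) + pilot_bits


def overhead_ratio(n: int, pilot_bits: int, packet_bits: int,
                   m: int, coded_packets: int) -> float:
    """Assumed form: bits of m report packets over bits of the coded data packets."""
    report_total = m * report_bits(n, pilot_bits)
    data_total = coded_packets * packet_bits
    return report_total / data_total


# Values from the example in the text; 1 kB is taken as 8 * 1000 bits.
r = overhead_ratio(n=10, pilot_bits=2, packet_bits=8 * 1000, m=10, coded_packets=3)
print(f"overhead ratio: {r:.1%}")
```

Under these assumptions the ratio comes out at roughly 4%, in line with the value quoted in the example. Dividing by the sum of report and data bits instead would change the result only slightly.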

Retransmission procedure from a relay node
After the packet report, each $r_j$ knows the packet set $\mathbf{b}_i$ of the destination $d_i$ and estimates the corresponding channel response $h_{ji}$ between $r_j$ and $d_i$ ($1 \le i \le m$, $1 \le j \le l$). Using that knowledge, each relay $r_j$ checks its buffer for possible network coding. If the buffer holds at least two packets, the relay checks whether they can be network-coded. If so, the relay node $r_j$ creates a network-coded packet using the OPNC algorithm. If network coding is not possible, the relay node simply retransmits one packet without network coding. Packets that no relay received from the source are delivered directly by the source during the next source broadcasting phase. In the OPNC algorithm, the optimal network coding is constructed based on how many packets $r_j$ can mix to create a network-coded packet (i.e., how many destinations would receive packets). However, since this operation does not consider the channel response between the relay node $r_j$ and its destination nodes in $D$, decoding failure may occur with high probability when the channel quality is poor. Such failures increase the number of retransmissions and degrade system performance such as throughput. To improve the throughput, we need to modify the selection rule by considering the channel state. We will define an objective function which depends on the number of packets that can be network-coded as well as on the channel state, and the retransmission node will be chosen by this function.
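The buffer check for possible network coding can be sketched as follows. This is a minimal brute-force illustration in Python, assuming the COPE-style rule of [7] that a set of packets may be XORed together only if every intended sink already holds all the other packets in the set. The names `is_decodable`, `best_coding_set`, `dest_of`, and `sink_has` are hypothetical, and the exhaustive search is chosen for clarity rather than efficiency.

```python
from itertools import combinations


def is_decodable(coded, dest_of, sink_has):
    """True if every destination of a packet in `coded` can decode its packet."""
    dests = [dest_of[p] for p in coded]
    if len(set(dests)) != len(dests):
        return False                      # at most one new packet per sink
    for p in coded:
        d = dest_of[p]
        if p in sink_has[d]:
            return False                  # sink already has p, nothing to gain
        if not (set(coded) - {p}) <= sink_has[d]:
            return False                  # sink lacks a packet needed to decode
    return True


def best_coding_set(relay_buffer, dest_of, sink_has):
    """Largest set of buffered packets that can be XORed (exhaustive search)."""
    packets = sorted(relay_buffer)
    for size in range(len(packets), 0, -1):
        for candidate in combinations(packets, size):
            if is_decodable(candidate, dest_of, sink_has):
                return set(candidate)
    return set()


# Hypothetical reception reports: which packets each sink has overheard.
dest_of = {"a": "dx", "b": "dy", "c": "dz"}
sink_has = {"dx": {"b"}, "dy": {"c"}, "dz": {"b"}}
coding_set = best_coding_set({"a", "b", "c"}, dest_of, sink_has)
```

In this made-up report state the relay can combine packets b and c, because each of the two sinks holds the other packet, while packet a would have to be sent on its own. The size of the returned set plays the role of the number of packets that can be network-coded at the relay.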
Opportunistic relaying was introduced in [12], which proposed a distributed relay selection algorithm for a system with multiple relays and a single sink node. The basic idea is that each relay node sets up an internal timer which triggers transmission. This timer is a function of the channel responses of the source-relay and relay-sink pairs, and it is given by

$$T_i = \frac{c}{h_i} \qquad (9)$$

where $T_i$ is the timer function of the relay $R_i$, and $c$ is a constant. A hidden node problem is possible; it can be mitigated by adjusting the constant $c$ in (9). Another way to reduce the hidden node effect is to use the minimum channel response instead of the harmonic mean [12]. Hence, $h_i$ is defined in (10) as the minimum of the channel responses of the $S$-$R_i$ and $R_i$-$D$ links. The relay whose timer expires first broadcasts a channel reservation message to its neighboring relays to prevent them from transmitting. In contrast to [12], in our model we do not need to consider the channel between the source and the relay node, since only the relay performs retransmission. This reduces the processing delay in relay node selection.

Figure 1 The probability that there exists a relay node which receives a given packet correctly approaches 1 at high SNR.

Figure 2 Report packet structure. Fields: Pilot, SrceID, PSID, OSID 1, OSID 2, ..., OSID $|\mathbf{b}_i|$. The pilot is used to estimate the channel state from a sender in $D$ to a receiver in $R$. SrceID is the source identification, PSID is the current sink node identification, and OSID $i$ is the destination node identification of the $i$th stored packet in the buffer of the current sink node. $\mathbf{b}_i$ is the set of the packets overheard by the current node $d_i$.

Based on this idea, we propose a new distributed relay node selection algorithm combined with OPNC for the topology of multiple relays and multiple sink nodes. If we use the channel state as the only variable to choose a relay node, as in the opportunistic relaying algorithm, the system performance may be poor. Figure 3 shows an example of the overall system scenario. There are a single source $S$, two relays, and three sink nodes. Each relay node has a different number of packets to deliver to its destinations. After the source $S$ broadcasts, the relays ($R_1$, $R_2$) and the sink nodes ($D_x$, $D_y$, $D_z$) overhear packets and store them in their buffers. The packets $a$, $b$, and $c$ are sent to $D_x$, $D_y$, and $D_z$, respectively. $R_1$ sends one packet to $D_x$, and $R_2$ sends two packets, network-coded together, to $D_y$ and $D_z$ in one time frame. Suppose $h_1 = h_{1x}$ and $h_2 = \min(h_{2y}, h_{2z})$. We can then calculate the theoretical throughputs $R_1$ and $R_2$ for the two channels:

$$R_1 = \log_2\!\left(1 + \|h_1\|^2 r\right) \qquad (11)$$

$$R_2 = \log_2\!\left(1 + \|h_{2y}\|^2 r\right) + \log_2\!\left(1 + \|h_{2z}\|^2 r\right) \ge 2\log_2\!\left(1 + \|h_2\|^2 r\right) \qquad (12)$$

where $r$ is the transmit signal-to-noise ratio. The multiplication factor of 2 in (12) is due to the network coding at $R_2$, and the inequality holds because the larger of the two channels has a larger capacity than the minimum channel. Suppose $\|h_1\| > \|h_2\|$; then the opportunistic relaying algorithm will choose $R_1$ to transmit packet $a$ to $D_x$. However, if $(1 + \|h_2\|^2 r)^2 > 1 + \|h_1\|^2 r$, it may be better to choose $R_2$ for retransmission.
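The comparison between the two relays can be made concrete with numbers. The Python sketch below assumes the rate of an uncoded relay is $\log_2(1 + \|h_1\|^2 r)$ and lower-bounds the rate of a relay that codes several packets by the number of packets times the rate of its weakest channel, which matches the condition $(1 + \|h_2\|^2 r)^2 > 1 + \|h_1\|^2 r$ for two packets. The channel gains and the SNR are made-up values, and the function names are hypothetical.

```python
import math


def rate_single(h: float, snr: float) -> float:
    """Rate of a relay that sends one uncoded packet over channel h."""
    return math.log2(1.0 + abs(h) ** 2 * snr)


def rate_coded_lower_bound(channels, snr: float) -> float:
    """Lower bound for a relay that codes len(channels) packets together."""
    h_min = min(abs(h) for h in channels)
    return len(channels) * math.log2(1.0 + h_min ** 2 * snr)


snr = 10.0                       # transmit SNR r (linear scale), illustrative
h1x = 1.0                        # relay R1 to sink Dx
h2y, h2z = 0.8, 0.7              # relay R2 to sinks Dy and Dz

r1 = rate_single(h1x, snr)
r2 = rate_coded_lower_bound([h2y, h2z], snr)

channel_only_choice = "R1" if abs(h1x) > min(abs(h2y), abs(h2z)) else "R2"
rate_based_choice = "R1" if r1 >= r2 else "R2"
print(channel_only_choice, rate_based_choice)
```

With these values the channel-only rule picks the relay with the stronger link, while the rate comparison favors the relay that serves two sinks at once, which is the mismatch that motivates combining the channel state with the coding gain.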

Proposed relay selection techniques for network-coded transmission
In this section, we propose relay selection techniques for network-coded transmission, which are based on a timer function. Let us denote the minimum channel response at the $j$th relay node by $h_j$ and the set of packets that can be network-coded by $K_j$. To improve throughput, we consider the channel state information ($h_j$) as well as the number of packets ($\|K_j\|$) that each $r_j$ can deliver simultaneously by network coding. We assume that the objective function at the relay node $r_j$ is a function of $h_j$ and $\|K_j\|$, denoted by $f(h_j, \|K_j\|)$, and that $f$ is an increasing function of each variable. The minimum channel response $h_j$ from relay $r_j$ to a sink node, given by (13), is a modified version of (10), since only the relay nodes can retransmit. The number of packets that the $j$th relay node $r_j$ uses to create a network-coded packet is denoted by $\|K_j\|$. Both variables, $h_j$ and $\|K_j\|$, may vary from one frame to another. Though $h_j$ can be defined differently from (13), Bletsas et al. [12] empirically showed that the minimum channel works well. Since the objective function increases with $h_j$ and $\|K_j\|$, a relay $r_j$ which has either a larger channel response or a larger number of packets that can be network-coded will have a high probability of using the channel. We can then define the internal timer value at the relay node $r_j$ as

$$T_j = \frac{c}{f(h_j, \|K_j\|)} \qquad (14)$$

We will use the timer value in (14) to choose a proper relay node for retransmission. A node with a smaller internal timer value transmits earlier than the other relays, which is a kind of decentralized selection scheme.

Figure 3 An example of opportunistic network coding with relay selection. There is one source node, two relay nodes, and three sink nodes. The source has three packets $a$, $b$, and $c$, each of which has its own sink node address. Packet $a$ is destined for $D_x$, and packets $b$ and $c$ are destined for $D_y$ and $D_z$, respectively. Intermediate relay nodes are capable of opportunistic network coding.

We compare five relay selection algorithms that use different internal timer functions. First, we set the objective function $f$ to a modified version of the opportunistic relaying algorithm of [12]. In this case, the function $f$ at a relay node $r_j$ depends only on the channel states between the relay and its corresponding sink nodes (13). Those sink nodes are the destinations of the packets that can be network-coded among all the overheard packets in $r_j$. As mentioned before, we use only the channel between a relay node and a destination node, unlike the original opportunistic relaying scheme of (10). Thus, the 1st kind of timer function, for the modified opportunistic relaying algorithm, is given by

$$T_j^{A} = \frac{c}{h_j} \qquad (15)$$

Second, following the way OPNC chooses the best network coding option to increase system throughput, we use only $\|K_j\|$ as the variable of the objective function. In this case, the 2nd kind of timer function is inversely proportional to $\|K_j\|$, which is given by

$$T_j^{B} = \frac{c}{\|K_j\|} \qquad (16)$$

This means that the relay whose $\|K_j\|$ is the largest would occupy the channel.
Let us now introduce the sum rate $R_j^{S}$, which is given by (17). Since $R_j^{S}$ depends on both $\|K_j\|$ and the channel response, we can use it as a variable in the objective function. The 3rd kind of objective function uses both $\|K_j\|$ and $R_j^{S}$. Because $R_j^{S}$ contains a channel-related variable, this objective function considers the effects of the channel and the throughput simultaneously, and the corresponding timer function is given by (18). In (18), we can replace the sum rate by the minimum channel, which yields the 4th kind of timer function, given by (19).
The 5th kind of timer function is based on the minimum channel $h_j$ and the sum rate $R_j^{S}$, and it is given by (20).
As mentioned above, $c$ is an empirical constant that controls collisions among the relay nodes. Typically, $c$ has a value of a few microseconds [12]. Each relay node $r_j$ uses $T_j$ as its internal timer value. The relay node whose internal timer expires first broadcasts a signal to its neighboring relays to stop their transmissions and reserve the channel, which is a first-come-first-served policy. The sink nodes that successfully overhear the network-coded packets decode them using their own stored data and update their decoding results. After that, the sink nodes transmit report packets again. The procedure is repeated until no more packets remain to be delivered from the relay nodes to the sink nodes.
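The timer-based selection described above can be sketched in a few lines. The Python example below assumes that each relay sets its timer to $c$ divided by its objective value and that the relay with the smallest timer wins the channel. The channel-only and count-only objectives follow the 1st and 2nd kinds of timer function; the product objective is a hypothetical stand-in for a combined measure and is not claimed to be one of the timer functions of this paper. All names and numbers are illustrative.

```python
def timer(c: float, objective: float) -> float:
    """Timer value: a larger objective gives a shorter wait."""
    if objective <= 0:
        return float("inf")        # a relay with nothing to send never fires
    return c / objective


def select_relay(relays, objective, c: float = 1e-6):
    """Return the relay whose timer expires first, plus all timer values."""
    timers = {name: timer(c, objective(state)) for name, state in relays.items()}
    winner = min(timers, key=timers.get)
    return winner, timers


# Hypothetical relay states: minimum channel response and codable packet count.
relays = {
    "r1": {"h_min": 1.0, "k": 1},
    "r2": {"h_min": 0.7, "k": 2},
}

channel_only = lambda s: s["h_min"]            # 1st kind: channel state only
count_only = lambda s: s["k"]                  # 2nd kind: number of coded packets
combined = lambda s: s["h_min"] * s["k"]       # hypothetical combined objective

winner_a, _ = select_relay(relays, channel_only)
winner_b, _ = select_relay(relays, count_only)
winner_x, _ = select_relay(relays, combined)
```

In a real deployment each relay would run only its own timer and listen for a reservation message from a faster relay; the central `min` here merely emulates which timer would expire first.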

Simulation results
The simulation environment is summarized in Table 1. The channel from one node to another is modeled as an independent Rayleigh fading channel. This is equivalent to the case where the relay nodes and the sink nodes are randomly distributed around the source node at equal distance. It is also assumed that the relay nodes and the sink nodes are within the communication range of the source node and that the feedback channel is error free. We use a large number of relay and sink nodes in the simulations in order to increase the possibility of network coding at the relay nodes.

Average transmission number
Five different timer function algorithms are compared in terms of the average number of transmissions in Figure 4, where the number of relay nodes and the number of sink nodes are both set to 50. The first algorithm in Figure 4, Algorithm A (with the timer function $T_j^{A}$), is based on the opportunistic relaying algorithm of (15), which chooses the relay with the maximum channel amplitude $h_j$ from the relay nodes in $R$. Algorithms B through E (with the timer functions $T_j^{B}$ through $T_j^{E}$) in Figure 4 are based on the timer functions of (16) through (20). From these results, it is observed that the relay node selection algorithm using the timer function of (15) needs the largest average number of transmissions, while the algorithm of (20) requires the fewest.
Algorithm B ($T_j^{B}$), based on the modified OPNC algorithm, chooses the relay node with the largest number of packets that can be network-coded. Note that this algorithm does not consider the channel response. If deep fading occurs on the path from the chosen relay node to its sink nodes, it is highly likely that the selected node will fail to deliver the information, which may increase the total number of transmissions. Algorithm C ($T_j^{C}$) is based on the sum rate and the number of packets to be network-coded ($\|K_j\|$). By using these two variables in the objective function, the performance is improved over the two previous algorithms. Algorithm D ($T_j^{D}$) is based on $\|K_j\|$ and $h_j$ simultaneously, so Algorithm D can be thought of as a combination of Algorithms A and B. In Figure 4, it is observed that Algorithm D performs better than the previous three algorithms. Algorithm E ($T_j^{E}$) is based on the channel response and the sum rate, with the timer function (20). Compared to the modified OPNC algorithm (Algorithm B), the sum rate measure lowers the possibility of choosing a node whose channel response $h_j$ is low. Compared to the modified opportunistic relaying algorithm (Algorithm A), Algorithm E considers the sum rate as a measure of throughput (related to $\|K_j\|$), so that this algorithm balances the measures $\|K_j\|$ and $h_j$. Figure 4 shows that Algorithm E has the lowest average number of transmissions, especially in the low SNR regime. In the high SNR regime, the transmission from the source to the sink nodes succeeds with high probability, as mentioned before. In other words, there is no noticeable difference between the algorithms in the high SNR regime.

Figure 5 compares the average system throughput of Algorithms A through E, and the plot is normalized by the total number of packets used in the simulation. The system throughput is defined as the total number of packets successfully delivered to the sink nodes per transmission.
In the simulations, the number of relay nodes and the number of sink nodes are both set to 50. It is observed in Figure 5 that Algorithm E performs the best in terms of throughput. It has a throughput gain of 10% to 15% over the modified OPNC algorithm (Algorithm B) and 13% to 25% over Algorithm A in the SNR range between 20 and 25 dB. The average throughput difference is relatively large in the medium SNR range, but it becomes negligible in the low and the high SNR regions. In the low SNR regime, the probability of error at a sink node increases, which increases the number of retransmissions from the relay nodes. This phenomenon is believed to be almost independent of the type of timer algorithm we use, which explains why there is little difference between the five algorithms in the low SNR regime. In the high SNR regime, most of the packets tend to be decoded successfully at the sink nodes in the source transmission (broadcast) phase. The contribution of the retransmission phase therefore decreases as the SNR increases, which also explains why there is little difference between the five algorithms in the high SNR regime.

Conclusion
In this paper, we proposed a new opportunistic wireless network coding scheme combined with distributed relay selection. By taking advantage of the opportunistic listening capability of wireless networks, several feedback-based retransmission schemes were proposed. The simulation results showed that the algorithm based on the minimum channel gain and the sum rate has the best performance in terms of the average number of transmissions and the system throughput. It was also observed that the proposed relay selection scheme performs better than the conventional schemes, especially in the medium SNR regime. The proposed approach appears promising in that it is a practical wireless network coding scheme with improved throughput.