Decoding techniques for alternate-relaying BICM cooperative systems

In this paper, we propose the use of bit-interleaved coded modulation in alternate-relaying decode-and-forward cooperative communication systems. At the destination, we exploit the interference signal, which results from the simultaneous transmission of data streams through both the direct channel and one of the relay channels, to develop an optimal detector. It is shown that the proposed detector can be implemented by the parallel concatenation of maximum a posteriori (MAP) algorithms and demappers with the decoders. The detector exchanges soft information between the decoders and the MAP algorithms iteratively to improve performance. The proposed optimal detector incurs a long delay, as the destination has to receive and store the entire frame before performing data detection. To avoid this, a sub-optimal detector is also proposed. Unlike the optimal detector, the sub-optimal one exploits two consecutive received packets to decode one packet. It turns out that the sub-optimal detector has reduced delay, complexity, memory size, and bandwidth loss, at the cost of a slight increase in the bit error rate. Extensive simulation results are presented to demonstrate the effectiveness of the proposed detectors.


Introduction
Cooperative technology constitutes a breakthrough in the design of wireless communication systems. This is due to its relatively simple implementation and its significant performance gains in terms of link reliability, system capacity, and transmission range [1,2]. In cooperative communications, multiple terminals in a wireless network cooperate to form a virtual antenna array in a distributed fashion. In this manner, spatial diversity gain can be achieved even when a local antenna array is not available. It is not surprising that cooperative communications have become a strong candidate for many wireless applications, such as cellular networks, wireless local area networks, mobile ad hoc networks, and wireless sensor networks [3].
Generally, there are two kinds of relaying modes: full duplex and half duplex. In a full-duplex mode, a relay transmits and receives simultaneously in the same band; however, the transmitted signal interferes at the relay with the received signal. In theory, it is possible for the relay to cancel out the interference because it knows the transmitted signal. However, in practice, a small error in the interference cancellation can be fatal because the transmitted signal is typically 100 to 150 dB stronger than the received signal, as indicated in [2]. This error results from inaccurate knowledge of the device characteristics or from the effects of quantization and finite precision processing. Therefore, the full-duplex mode is not commonly used. In a half-duplex mode, a relay cannot simultaneously transmit and receive. In other words, the source and relay transmissions must be orthogonal in order to eliminate any potential interference. Orthogonality can be in the time domain, in the frequency domain, or using any set of signals that are orthogonal over the time-frequency plane. A major problem of the half-duplex relaying mode is the reduction in the spectral efficiency [2].
*Correspondence: h.mostafa@mun.ca
1 Faculty of Engineering and Applied Science, Memorial University, St. John's, NL, A1B 3x5, Canada
Full list of author information is available at the end of the article
To combat this problem, different cooperative techniques have been introduced, such as non-orthogonal, two-way, and alternate relaying. In the non-orthogonal cooperative systems (e.g., [4,5]), the source is active all the time. In the first half of the transmission interval, the source sends data to a relay and the destination. However, since the relay is assumed to be half duplex, the relay does not receive what the source transmits in the second half of the transmission interval. This results in a reduction in the diversity order of the system. Furthermore, additional processing is required at the destination in order to separate the signals received simultaneously in the second half of the transmission interval. In the two-way cooperative systems, two sources exchange data with the aid of a shared relay (e.g., [6,7]). The two sources send simultaneously during the first time slot, while in the second time slot, the relay broadcasts the mixture of these two signals. Since parts of the signal sent by the relay are known at the destinations, each receiver can extract the data of the other partner. It is obvious that full-rate transmission can be attained since two time slots are required to transmit the data of the two partners; however, this cooperative transmission requires two partners sending messages to each other simultaneously.
http://jwcn.eurasipjournals.com/content/2013/1/236
In alternate-relaying transmission protocols (e.g., [8,9]), the source communicates with the destination via two relays. The basic idea behind these protocols is to use two successively forwarding relays to mimic a full-duplex relay. More specifically, at any time slot, the source sends its information to the destination and one of the relays, while the other relay forwards the information received from the source in the previous time slot to the destination. In this way, the source can continuously transmit data without being halted, and hence, the spectral efficiency loss is recovered.
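The slot-by-slot role swapping described above can be illustrated with a minimal sketch. The function name, the dictionary keys, and the choice of $R_1$ as the first listening relay are assumptions made only for this illustration; they are not taken from the protocol descriptions in [8,9].

```python
def alternate_relaying_schedule(num_packets):
    """List, per time slot, what the source and the two relays do.

    Packets are numbered from 1. The relays swap roles in every slot;
    which relay listens first is an arbitrary choice for this sketch.
    """
    slots = []
    for p in range(1, num_packets + 1):
        listening = "R1" if p % 2 == 1 else "R2"
        forwarding = "R2" if listening == "R1" else "R1"
        slots.append({
            "slot": p,
            "source_packet": p,                      # sent to D and the listening relay
            "listening_relay": listening,
            # In the first slot no relay has anything to forward yet.
            "forwarding_relay": forwarding if p > 1 else None,
            "forwarded_packet": p - 1 if p > 1 else None,
        })
    return slots


for slot in alternate_relaying_schedule(4):
    print(slot)
```

In every slot after the first, the forwarding relay is the one that listened in the previous slot, so the source never has to pause.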
The vast majority of research in alternate-relaying transmission protocols focuses on information-theoretic analysis to assess achievable rates, capacity bounds, and the diversity-multiplexing tradeoff (e.g., [8-12]). However, the major issue associated with these protocols is how to handle, in a simple way, the interference caused by the simultaneous transmission of the source and one of the relays. Basically, the way the interference is treated at the relays and destination depends on whether the relays employ amplify-and-forward (AF) or decode-and-forward (DF) relaying strategies.
For the AF alternate-relaying protocols, the authors in [8] propose successive decoding at the destination with partial cancellation of inter-relay interference. The authors in [9] introduce inter-relay self-interference cancellation, where the cancellation is performed at one of the relays; however, the detection process at the destination requires high computational complexity. The authors in [10] propose a full inter-relay interference cancellation algorithm, where the cancellation is performed at the destination in a simple manner; however, the noise accumulation associated with the algorithm limits the overall system performance.
Generally, the deployment of AF relaying strategy in alternate-relaying cooperative systems is challenging, as the interference and noise accumulation which results from the inter-relay link degrade the overall performance. Accordingly, the detection process at the destination has to be associated with sophisticated interference cancellation detectors. This increases the computational complexity significantly, as shown in [10]. Due to its symbol-by-symbol decision base, DF is an attractive relaying strategy to avoid interference accumulation at the destination for alternate-relaying cooperative systems.
For DF alternate-relaying protocols, it is usually assumed that the inter-relay link is either sufficiently weak that its contribution can be treated as extra noise, or sufficiently strong that it can be canceled through successive interference cancellation at the relay [8,11]. However, these two extreme scenarios may not always occur in practical systems. In [13,14], dirty paper coding based on interference pre-subtraction at the source is proposed to cancel the inter-relay interference. However, this requires high computational complexity and full knowledge of the channel state information of all the links at the source, which is not easy to achieve in practice. Beam-forming/smart antennas [15] and code division multiple-access techniques [16] are also proposed to eliminate the interference at the relays and destination. The former comes at the cost of complexity, whereas the latter comes at the cost of wasting resources. In [12], the authors propose employing multiple antennas at the relays to cancel out the inter-relay interference. However, implementing multiple antennas at the relays is not feasible for some wireless applications due to size, power, and cost constraints. A rotated signal constellation has been proposed to achieve full interference cancellation [17]. The idea is that no two symbols in the rotated constellation have the same real or imaginary value. Accordingly, an orthogonal transmission can be achieved between the source-relay and inter-relay links by assigning the real parts of the rotated symbols to one link and the imaginary parts to the other link. However, the direct link between the source and destination is not available, and there is significant degradation in the BER performance when compared with that of the original constellation.
All the previous works deal with uncoded transmission, which is not usually used in practice. To the best of our knowledge, this is the first work in the literature that applies error-correcting coding for alternate-relaying cooperative systems. In this paper, we exploit the interference signal at the destination to develop an optimal detector for alternate-relaying DF cooperative systems in conjunction with a bit-interleaved coded modulation (BICM) signal. Starting from the maximum a posteriori (MAP) principle, it is shown that the proposed detector can be implemented by combining the MAP algorithm with the decoder of BICM. Furthermore, to reduce the delay associated with the optimal detector, a sub-optimal one is also introduced. The proposed detectors can work with any receiver as long as this can compute the a posteriori probabilities of the data symbols.
The remainder of the paper is organized as follows. In Section 2, the system model and problem formulation are presented. In Sections 3 and 4, the optimal and sub-optimal detectors at the destination are proposed, respectively. Section 5 describes the detector used at the relays. The performance of the proposed detectors is evaluated through computer simulations in Section 6. Finally, the paper is concluded in Section 7.

Transmission protocol
We consider a simple cooperative communication network composed of four nodes: source node S, relay nodes $R_1$ and $R_2$, and destination node D. All the terminals are equipped with a single antenna, and the relays operate in half-duplex mode. The source transmission is divided into frames, each consisting of P packets. Each frame is generated as follows (see Figure 1). For the pth packet, $N_b$ information bits are encoded into the coded sequence $c^{(p)} = [c^{(p)}(1), c^{(p)}(2), \cdots, c^{(p)}(N_c)]$. For the sake of simplicity, we use a convolutional code. However, the proposed detectors in this paper are valid for other coding schemes. After interleaving, each m consecutive bits of the interleaved sequence are grouped to form the nth vector $v_n^{(p)} = [v_n^{(p)}(1), \cdots, v_n^{(p)}(m)]$, which is mapped to the symbol $d^{(p)}(n) = \mu(v_n^{(p)})$ chosen from an M-point signal constellation $\mathcal{X}$, where $\mu$ is the labeling map, and $M = 2^m$. Accordingly, one can describe the output of the mapper for the pth packet as $d^{(p)} = [d^{(p)}(1), d^{(p)}(2), \cdots, d^{(p)}(N_d)]$.
The frame is transmitted continuously, one packet per time slot, over a wireless channel to the destination. The transmission schedule for the P time slots for each frame is described in Table 1 and Figure 2. The transmission steps shown in Table 1 are continuously repeated until the P packets are transmitted by the source. In practice, in order to achieve diversity for the last packet, one additional time slot is required at the end of the transmission; S remains silent, while $R_1$ (or $R_2$) sends $d^{(P)}$. This slight loss in the rate, $1/P$, is asymptotically zero for large values of P. As one can observe, the destination receives two copies of each packet, one from the source and one from the relays. This implies that diversity gain can still be achieved by this protocol, while the source transmits continuously. As a result, the bandwidth efficiency is not sacrificed, and full-rate transmission is retained.
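As a rough illustration of the mapping step and of the tail-slot overhead, the following sketch groups interleaved bits into m-bit labels and maps them onto an M-PSK constellation. Natural binary labeling is assumed here purely for simplicity (the simulations in this paper use set-partition labeling), and all function names are hypothetical.

```python
import cmath
import math


def psk_constellation(m):
    """M-PSK points, M = 2**m, with natural labeling (label i -> angle 2*pi*i/M)."""
    M = 2 ** m
    return [cmath.exp(2j * math.pi * i / M) for i in range(M)]


def map_bits(interleaved_bits, m):
    """Group every m interleaved bits into a label and map it to a symbol."""
    if len(interleaved_bits) % m != 0:
        raise ValueError("bit sequence length must be a multiple of m")
    points = psk_constellation(m)
    symbols = []
    for start in range(0, len(interleaved_bits), m):
        label = 0
        for bit in interleaved_bits[start:start + m]:
            label = (label << 1) | bit      # first bit of the group is the MSB
        symbols.append(points[label])
    return symbols


def slot_overhead(P):
    """Extra time slots per frame, relative to the P data slots (one tail slot)."""
    return 1.0 / P


# 300 coded bits and m = 3 (8-PSK) give 100 symbols per packet.
print(len(map_bits([0] * 300, 3)), slot_overhead(20))
```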

Received signals at relays and destination
For each link, the channel is assumed to be frequency non-selective and modeled as a zero-mean independent complex Gaussian random variable. In addition, we consider that all the nodes have equal additive white Gaussian noise (AWGN) power spectral density of $N_0$. Finally, we suppose that perfect channel state information is available at the destination and relays.
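Without loss of generality, we presume that at time slot p, $R_2$ listens and $R_1$ sends. Referring to Figure 2, the nth received symbol of the pth packet at the destination, $y_D^{(p)}(n)$, and at relay $R_2$, $y_{R_2}^{(p)}(n)$, can be expressed, respectively, as
$$y_D^{(p)}(n) = h_{SD}^{(p)}(n)\, d^{(p)}(n) + h_{R_1 D}^{(p)}(n)\, d^{(p-1)}(n) + w_D^{(p)}(n) \tag{1}$$
and
$$y_{R_2}^{(p)}(n) = h_{SR_2}^{(p)}(n)\, d^{(p)}(n) + h_{R_1 R_2}^{(p)}(n)\, d^{(p-1)}(n) + w_{R_2}^{(p)}(n) \tag{2}$$
where $h_{AB}^{(p)}(n)$ is the channel coefficient between nodes A and B which corresponds to the nth data symbol of the pth packet, $A, B \in \{S, R_1, R_2, D\}$, and $w_D^{(p)}(n)$ and $w_{R_2}^{(p)}(n)$ are the AWGN contributions at the destination and relay $R_2$, respectively. The channel coefficients remain constant over the packet period and change to new independent values with each new packet; subsequently, the index n is dropped from the channel coefficient notation. Furthermore, at time slot p + 1, the received signals at the destination and relay $R_1$ can be written, respectively, as
$$y_D^{(p+1)}(n) = h_{SD}^{(p+1)}\, d^{(p+1)}(n) + h_{R_2 D}^{(p+1)}\, d^{(p)}(n) + w_D^{(p+1)}(n) \tag{3}$$
and
$$y_{R_1}^{(p+1)}(n) = h_{SR_1}^{(p+1)}\, d^{(p+1)}(n) + h_{R_2 R_1}^{(p+1)}\, d^{(p)}(n) + w_{R_1}^{(p+1)}(n) \tag{4}$$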
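A minimal simulation sketch of the signal model is given below. It assumes that each received sample is the superposition of the current source symbol, the previous-packet symbol forwarded by the active relay, and complex AWGN of variance $N_0$, with channel coefficients that stay constant over a packet. The helper names and the numerical values in the usage lines are hypothetical.

```python
import math
import random


def complex_gaussian(variance, rng):
    """Zero-mean circularly symmetric complex Gaussian sample."""
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def receive_packet(d_cur, d_prev, h_src, h_relay, n0, rng):
    """Per-symbol samples h_src*d_cur(n) + h_relay*d_prev(n) + w(n).

    d_cur   : symbols of the packet currently sent by the source
    d_prev  : symbols of the previous packet, sent by the forwarding relay
    h_relay : set to 0 when no relay is forwarding in this slot
    """
    return [h_src * a + h_relay * b + complex_gaussian(n0, rng)
            for a, b in zip(d_cur, d_prev)]


rng = random.Random(1)
h_sd = complex_gaussian(1.0, rng)   # source-destination coefficient
h_rd = complex_gaussian(1.0, rng)   # forwarding relay-destination coefficient
y = receive_packet([1, -1, 1j], [-1j, 1, 1], h_sd, h_rd, 0.1, rng)
print(y)
```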

Optimal decoding technique at the destination
For single-input single-output systems, Zehavi suggested a detection method using two separate steps: bit metric generation and soft-input soft-output (SISO) decoding, as shown in Figure 3 [18]. The demapper generates 2m bit metrics for each received symbol, associated with the m positions, each having binary values 0 and 1. These bit metrics are then de-interleaved and fed to the SISO decoder. The soft information provided by the decoder is used to enhance the bit metrics in a recursive manner.
In this section, we show how to exploit the interference signal at the destination to develop an optimal detector for DF alternate-relaying BICM cooperative systems. From (1) and (3), one can observe that the source and relays send their messages to the destination in a sequential form. For illustration, let us consider that the source transmits the frame $\cdots, d^{(p-1)}, d^{(p)}, d^{(p+1)}, \cdots$. Accordingly, the information sent by the source and the forwarding relay in consecutive time slots can be expressed as the pairs $(d^{(p)}, d^{(p-1)})$ and $(d^{(p+1)}, d^{(p)})$, where the packets are forwarded alternately by $R_x$ and $R_y$, $x, y \in \{1, 2\}$ and $x \neq y$. Consequently, in contrast to the relays, each symbol in each packet is received twice at the destination, through the source-destination link and the listening relay-destination link, but only if the listening relay was able to correctly detect the packet that contains this symbol; otherwise, the transmitted symbol is received only once through the direct link. The equivalent block diagram of the DF alternate-relaying cooperative system can be represented as shown in Figure 4. From this figure, it is clear that the equivalent model is analogous to a convolutional code with a constraint length of two. Hence, one can describe the transmitter of the equivalent system through a trellis diagram. The trellis consists of M states, where M is the modulation order. There are M branches leaving each state, corresponding to the M different input symbols. For example, the trellis diagram of a transmitter using 8-PSK modulation is shown in Figure 5, where $\{\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \alpha_7, \alpha_8\}$ denote the 8-PSK symbols. Each branch is labeled by $\alpha_i / \alpha_i \alpha_x$, where $\alpha_i$ is an input symbol, and $\alpha_i \alpha_x$ represent the two symbols transmitted through the direct and relay links, respectively.
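The trellis structure of the equivalent model can be enumerated in a few lines. In this sketch, symbols are represented by their indices 0 to M-1, the state is taken to be the index of the symbol sent by the source in the previous slot, and the tuple layout is an assumption made for illustration.

```python
def build_trellis(M):
    """Enumerate the branches of the equivalent M-state trellis.

    Each branch is (prev_state, input, next_state, (direct_idx, relay_idx)):
    the input symbol goes over the direct link, the state (the previous
    symbol) goes over the relay link, and the input becomes the next state.
    """
    branches = []
    for prev_state in range(M):
        for inp in range(M):
            branches.append((prev_state, inp, inp, (inp, prev_state)))
    return branches


trellis = build_trellis(8)          # 8-PSK: 8 states, 8 branches per state
print(len(trellis), trellis[:3])
```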
The proposed optimal detector consists of three major steps, as depicted in Figure 6. In the first step, we apply the MAP algorithms to compute the a posteriori probability of each transmitted symbol. In the second step, we forward these probabilities to the demappers to compute the bit metrics. Finally, the SISO decoders receive these metrics and extract the a posteriori information of the transmitted bits. The output of the decoder is fed back to the MAP algorithms for performance improvement. As one observes, the proposed detector iterates between the MAP algorithms and the SISO decoders.

MAP algorithms
The key idea of the proposed optimal detector is to employ $N_d$ parallel MAP algorithms, as shown in Figure 6. The nth symbols from each received packet, $Y(n) = [y_D^{(1)}(n), y_D^{(2)}(n), \cdots, y_D^{(P)}(n)]$, given in (1) and (3), are the input of the nth MAP algorithm. From (1) and (3), one can observe that the symbol $d^{(p)}(n)$ is included in both $y_D^{(p)}(n)$ and $y_D^{(p+1)}(n)$. Accordingly, in order to provide an optimal detector for the nth symbol in each packet, we have to rely on the received sequence $y_D^{(1)}(n), y_D^{(2)}(n), \cdots, y_D^{(P)}(n)$. That is why we mix the received packets, as shown in Figure 6. The nth MAP algorithm calculates the a posteriori probabilities $P(d^{(p)}(n) = \vartheta \mid Y(n), H)$ for $p = 1, \cdots, P$ and for all $\vartheta$ belonging to the constellation $\mathcal{X}$, where H collects the channel coefficients of the source-destination and relay-destination links, and $P(x_1 \mid x_2, x_3)$ is defined as the probability density function of event $x_1$ given the events $x_2$ and $x_3$. The MAP algorithm can be implemented based on the Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm [19], with the transition probability
$$\gamma_n^{(p)}(s', s) \propto \exp\left(-\frac{\left|y_D^{(p)}(n) - h_{SD}^{(p)}\, d^{(p)}(n) - h_{R_x D}^{(p)}\, d^{(p-1)}(n)\right|^2}{N_0}\right) P\left(d^{(p)}(n)\right) \tag{5}$$
where the pair $(d^{(p)}(n), d^{(p-1)}(n))$ represents the output associated with this transition. Note that $\gamma_n^{(p)}(s', s)$ represents the probability of a transition from the state $s'$ to the state $s$ for the nth symbol in the pth packet, given the received symbol $y_D^{(p)}(n)$. We employ the BCJR algorithm to implement the optimal demapper of the DF alternate-relaying transmission, which is analogous to a convolutional code with a constraint length of two, as the BCJR algorithm is the optimal decoding technique for convolutional codes [19]. The a priori probability $P(d^{(p)}(n))$ in (5) is unavailable at the first iteration. Therefore, in the initialization phase, it is assumed that all $d^{(p)}(n)$ are equally probable. The a posteriori probabilities computed using (5) are used as the input to the demapper, which then generates the bit metrics.
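To make the forward-backward recursion concrete, the following is a minimal sketch of one MAP algorithm (one symbol index n, all P packets). It assumes a Gaussian branch metric weighted by the symbol a priori probability, a uniform initial state distribution (harmless when the relay coefficient of the first slot is zero), and an optional final slot in which only a relay transmits. The function and argument names are hypothetical, and no log-domain arithmetic is used, so very small $N_0$ values may underflow.

```python
import math


def symbol_app_bcjr(y, h_sd, h_rd, const, n0, prior=None, tail=None):
    """Return app[p][i] = P(d^(p) = const[i] | all samples), p = 0..P-1.

    y[p]    : received sample in slot p
    h_sd[p] : source-destination coefficient in slot p
    h_rd[p] : relay-destination coefficient in slot p (0 if the relay is idle)
    prior   : prior[p][i], a priori symbol probabilities (uniform if None)
    tail    : optional (y_tail, h_tail) for a last slot with the relay only
    """
    P, M = len(y), len(const)
    if prior is None:
        prior = [[1.0 / M] * M for _ in range(P)]

    def likelihood(obs, mean):
        return math.exp(-abs(obs - mean) ** 2 / n0)

    # gamma[p][sp][s]: branch metric from state sp (previous symbol) to
    # state s (current symbol); the input symbol equals the next state.
    gamma = [[[likelihood(y[p], h_sd[p] * const[s] + h_rd[p] * const[sp])
               * prior[p][s] for s in range(M)] for sp in range(M)]
             for p in range(P)]

    # Forward recursion, normalised at every stage.
    alpha = [[1.0 / M] * M]
    for p in range(P):
        a = [sum(alpha[p][sp] * gamma[p][sp][s] for sp in range(M))
             for s in range(M)]
        total = sum(a)
        alpha.append([v / total for v in a])

    # Backward recursion, started from the tail-slot likelihood if given.
    if tail is None:
        end = [1.0] * M
    else:
        end = [likelihood(tail[0], tail[1] * const[s]) for s in range(M)]
    beta = [None] * (P + 1)
    beta[P] = [v / sum(end) for v in end]
    for p in range(P - 1, -1, -1):
        b = [sum(gamma[p][sp][s] * beta[p + 1][s] for s in range(M))
             for sp in range(M)]
        total = sum(b)
        beta[p] = [v / total for v in b]

    # The state reached after stage p is the symbol d^(p).
    app = []
    for p in range(P):
        post = [alpha[p + 1][s] * beta[p + 1][s] for s in range(M)]
        total = sum(post)
        app.append([v / total for v in post])
    return app
```

Passing the symbol probabilities fed back by the decoders as `prior` corresponds to the second and later iterations; leaving it as `None` corresponds to the equiprobable initialization.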

Demapper
The a posteriori probabilities provided by the $N_d$ MAP algorithms can be exploited to compute the bit metrics. Referring to Figure 6, after the MAP algorithms, the decoding process can be divided into P parallel branches, where the pth branch decodes the pth packet. Each branch contains a demapper, de-interleaver, decoder, interleaver, and symbol a posteriori computation unit. In order to forward to each branch its corresponding a posteriori probabilities, the outputs of the MAP algorithms are de-permuted; the pth output of each MAP algorithm is forwarded to the pth branch. Mathematically, the a posteriori probabilities $\{P(d^{(p)}(n) \mid Y(n), H)\}_{n=1}^{N_d}$ are passed to the pth branch. Accordingly, the bit metric, $\lambda(v_n^{(p)}(f) = b)$, for $f = 1, \cdots, m$ and $b = 0, 1$, can be computed as
$$\lambda\left(v_n^{(p)}(f) = b\right) = \sum_{\vartheta \in \mathcal{X}_b^f} P\left(d^{(p)}(n) = \vartheta \mid Y(n), H\right) \tag{6}$$
where the subset $\mathcal{X}_b^f$ contains the constellation points whose labels have the value b in the fth bit position. For more details on the bit metric concept, the reader is referred to [20-22].
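The demapping step can be sketched as a marginalization of the symbol a posteriori probabilities over the constellation subsets. The sketch assumes that symbols are indexed by their m-bit labels with the most significant bit as bit position 1; this convention and the function name are illustrative choices.

```python
def bit_metrics(symbol_app, m):
    """Bit metrics from the a posteriori probabilities of one symbol.

    symbol_app[label] is the probability of the symbol whose m-bit label
    is `label`. Returns metrics[f - 1][b] for f = 1..m and b in {0, 1}.
    """
    metrics = [[0.0, 0.0] for _ in range(m)]
    for label, prob in enumerate(symbol_app):
        for f in range(m):
            b = (label >> (m - 1 - f)) & 1   # value of bit position f + 1
            metrics[f][b] += prob
    return metrics


# Four symbols with labels 00, 01, 10, 11.
print(bit_metrics([0.1, 0.2, 0.3, 0.4], 2))
```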

Decoder
After de-interleaving, the bit metrics are passed to the decoder to provide the a posteriori probabilities of the coded bits. These probabilities are then interleaved and forwarded to the symbol a posteriori computation unit. Assuming that the probabilities $P(v_n^{(p)}(1)), \cdots, P(v_n^{(p)}(m))$ are independent, which holds with a good interleaver, the symbol a posteriori probability $P(d^{(p)}(n))$ can be computed as
$$P\left(d^{(p)}(n) = \vartheta\right) = \prod_{f=1}^{m} P\left(v_n^{(p)}(f) = v^f(\vartheta)\right)$$
where $v^f(\vartheta)$ denotes the fth bit of the label of $\vartheta$. These a posteriori symbol probabilities are provided to the MAP algorithms as a priori information, as seen in (5). At the last iteration, the final decoded outputs are the hard decisions based on the a posteriori probabilities.
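Under the independence assumption, the symbol probabilities fed back to the MAP algorithms are simple products of bit probabilities. A minimal sketch follows, again assuming that labels are m-bit integers with the most significant bit as bit position 1; the function name is hypothetical.

```python
def symbol_priors(bit_prob_one, m):
    """Symbol probabilities from independent bit probabilities.

    bit_prob_one[f] is the decoder's probability that bit position f + 1
    of the label equals 1. Returns one probability per m-bit label.
    """
    priors = []
    for label in range(2 ** m):
        prob = 1.0
        for f in range(m):
            b = (label >> (m - 1 - f)) & 1
            prob *= bit_prob_one[f] if b else 1.0 - bit_prob_one[f]
        priors.append(prob)
    return priors


print(symbol_priors([0.9, 0.2], 2))   # labels 00, 01, 10, 11
```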

Implementation aspects
The implementation aspects are as follows:
• An additional tail consisting of $N_d$ zero symbols is appended to the end of the transmitted frame to force the trellis path to return to the initial state, from which the decoding process starts.
• As commonly assumed in the literature, the relays forward the received packets only when they have been correctly decoded; otherwise, they remain idle. In practice, the relays send acknowledgment signals to the destination, indicating the status of each packet. For illustration, if the destination receives a negative acknowledgment for the pth packet, this implies that the relay $R_x$ is unable to decode and forward this packet. In this case, the channel coefficient that corresponds to the $R_x$ packet, $h_{R_x D}^{(p)}$, is set to zero in (5).
• The optimal detector requires a large memory for storing the entire frame before starting data detection, which in turn increases the required processing time for decoding. This may prevent the optimal detector from being implemented in practice; therefore, we propose a sub-optimal one.
Figure 6 Optimal detector at destination.

Sub-optimal decoding technique at the destination
Instead of computing the exact values of the symbol a posteriori probabilities $P(d^{(p)}(n) = \vartheta \mid Y(n), H)$ provided by the MAP algorithms, the proposed sub-optimal detector calculates these probabilities approximately. We employ P sub-optimal algorithms in the sub-optimal detector instead of the $N_d$ MAP algorithms in the optimal detector, as shown in Figure 7. Note that these P sub-optimal algorithms are applied in a serial fashion, while in the optimal detector, the $N_d$ MAP algorithms are applied in a parallel fashion.
The key principle of the sub-optimal detector is to employ two consecutive received packets, $y_D^{(p)}$ and $y_D^{(p+1)}$, given in (1) and (3), respectively, to detect the pth packet.
Figure 7 Sub-optimal detector at the destination.
Using (1), the received signal can be rewritten as in (9), where $e_D^{(p)}(n)$ includes the residual error that may result from the imperfect detection of the previous symbol. With (3) rewritten as in (10), and based on (9) and (10), one can write (11) and (12). After computing the a posteriori probabilities of the data symbols as in (11), (6) can be applied to produce the bit metrics. As for the optimal detector, these bit metrics are then de-interleaved and passed to the SISO decoder. The soft information provided by the decoder is fed back to (11) to refine the computation.
The procedure is then repeated for the following packets: by averaging over $d^{(p+2)}$, $d^{(p+1)}$ can be detected, and so on. Similar to the optimal detector, if the destination receives a negative acknowledgment from the relays about the status of the pth packet, we set $h_{R_y D}^{(p)} = 0$ in (13).
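The two-packet principle can be illustrated with a short sketch. It is one plausible reading of the idea rather than a reproduction of (9) to (13): the contribution of the previously detected symbol is subtracted from the current sample, and the next sample is averaged over the still unknown new source symbol. All names and the Gaussian likelihood form are assumptions made for this illustration.

```python
import math


def two_packet_app(y_cur, y_next, d_prev_hat, h_sd_cur, h_rd_cur,
                   h_sd_next, h_rd_next, const, n0,
                   prior_cur=None, prior_next=None):
    """Approximate a posteriori probabilities of the current symbol.

    y_cur, y_next : samples of the same symbol index in slots p and p + 1
    d_prev_hat    : decision on the symbol of packet p - 1 (relay-forwarded)
    h_rd_*        : relay-destination coefficients (0 if the relay is idle)
    """
    M = len(const)
    prior_cur = prior_cur or [1.0 / M] * M
    prior_next = prior_next or [1.0 / M] * M
    # Remove the estimated relay contribution from the current sample.
    z = y_cur - h_rd_cur * d_prev_hat
    post = []
    for i, a in enumerate(const):
        direct = math.exp(-abs(z - h_sd_cur * a) ** 2 / n0)
        # Average the next sample over the unknown next source symbol.
        relayed = sum(
            prior_next[j]
            * math.exp(-abs(y_next - h_sd_next * c - h_rd_next * a) ** 2 / n0)
            for j, c in enumerate(const))
        post.append(prior_cur[i] * direct * relayed)
    total = sum(post)
    return [v / total for v in post]
```

Only two samples per symbol index have to be held at any time in such a scheme, which is consistent with the reduced delay and memory requirements attributed to the sub-optimal detector.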
A comparison of the proposed optimal and sub-optimal receivers in terms of complexity, delay, memory size, and bandwidth loss is shown in Table 2. We evaluate the computational complexity of the proposed detectors in terms of the number of required floating point operations (flops). Floating point operations include any operations that involve fractional numbers. For more details on this issue, the reader is referred to [23]. In the table, $\Upsilon = P\,(M N_d m + (m - 1) N_d + F_{\mathrm{dec}})$ refers to the term that is common to both detectors, where $F_{\mathrm{dec}}$ is the number of flops required for one decoder; $F_{\mathrm{dec}}$ depends on the encoder parameters such as the constraint length, code rate, and generator polynomial. One can notice that the complexity of both detectors is directly proportional to the number of symbols per packet, $N_d$, and to the square of the modulation order, $M^2$. We also notice that the sub-optimal detector outperforms the optimal detector in terms of the required delay, memory size, and bandwidth loss, in addition to having a lower computational complexity.

Table 2 Comparison of the proposed optimal and sub-optimal detectors. Columns: complexity (flops), delay, MS, and BW loss. Delay is expressed by the number of data symbols that must be stored. MS, required memory size; BW loss, bandwidth loss.

Decoding technique at the relays
In DF alternate-relaying protocols, it is usually assumed that successive interference cancellation, where the strongest signal is detected first and its contribution is then subtracted from the received signal before detecting the other signal, can be employed at the relays [8,11]. In order to provide a reliable BER performance, this requires that the inter-relay link is either sufficiently weak or sufficiently strong when compared with the source-relay links. However, these two extreme scenarios may not always occur in practical systems. In this section, we show how the bit metric can be generated at the relays to achieve a better performance in the presence of the interference resulting from the forwarding relay transmission. Note that at the destination, each packet is received twice through the direct and relaying links, if the forwarding relay was able to correctly detect this packet. Otherwise, it is received only through the direct link. At the listening relay, each packet is received through the source-to-listening relay link, interfered by the data sent from the forwarding relay. As such, the proposed detectors at the destination are inapplicable at the listening relay.
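From (2) and (4), one can observe that the data received at a listening relay from the source is interfered by the data sent from the forwarding relay. This is because at any time slot, there is always a relay transmitting data simultaneously with the source. Without loss of generality, we can write the received signal at relay $R_x$ as
$$y_{R_x}^{(p)}(n) = h_{SR_x}^{(p)}\, d^{(p)}(n) + h_{R_y R_x}^{(p)}\, d^{(p-1)}(n) + w_{R_x}^{(p)}(n) \tag{15}$$
The demapper takes the received symbol $y_{R_x}^{(p)}(n)$ and the channel coefficients $h_{SR_x}^{(p)}$ and $h_{R_y R_x}^{(p)}$ as its inputs to compute the bit metric, $\lambda(v_n^{(p)}(f) = b)$, $\forall f = 1, \cdots, m$ and $b = 0, 1$, using the maximum a posteriori criterion as [22]
$$\lambda\left(v_n^{(p)}(f) = b\right) = \sum_{a \in \mathcal{X}_b^f} p\left(y_{R_x}^{(p)}(n) \mid d^{(p)}(n) = a\right) P(a) \tag{16}$$
Since the symbol $d^{(p-1)}(n)$ is not known at the relay $R_x$, we remove its contribution by averaging it out as
$$p\left(y_{R_x}^{(p)}(n) \mid d^{(p)}(n) = a\right) = \sum_{\vartheta \in \mathcal{X}} p\left(y_{R_x}^{(p)}(n) \mid d^{(p)}(n) = a, d^{(p-1)}(n) = \vartheta\right) P\left(d^{(p-1)}(n) = \vartheta\right) \tag{17}$$
We assume that the transmitted symbols are equally probable; thus, $P(d^{(p-1)}(n) = \vartheta) = 1/M$, and the bit metric becomes
$$\lambda\left(v_n^{(p)}(f) = b\right) \propto \sum_{a \in \mathcal{X}_b^f} \sum_{\vartheta \in \mathcal{X}} \exp\left(-\frac{\left|y_{R_x}^{(p)}(n) - h_{SR_x}^{(p)}\, a - h_{R_y R_x}^{(p)}\, \vartheta\right|^2}{N_0}\right) P(a) \tag{18}$$
where the subset $\mathcal{X}_b^f$ contains the constellation points whose labels have the value b in the fth bit position. The a priori probability $P(a)$ is unavailable on the first iteration of the demapping. Therefore, in the initialization phase, it is assumed that all a are equally probable. Equation (18) is used as the input to the SISO decoder, which then generates the a posteriori probabilities for the coded bits. On the second iteration, these probabilities are interleaved and fed back as a priori probabilities to the demapper, as shown in Figure 3.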
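The averaging over the unknown interfering symbol can be sketched as follows. The sketch assumes a Gaussian likelihood, an equiprobable interferer, and symbols indexed by their m-bit labels with the most significant bit as bit position 1; the function name and argument layout are hypothetical.

```python
import math


def relay_bit_metrics(y, h_src, h_int, const, m, n0, prior=None):
    """Bit metrics at a listening relay with the interferer averaged out.

    y      : received sample at the listening relay
    h_src  : source-to-listening-relay coefficient
    h_int  : forwarding-relay-to-listening-relay coefficient
    const  : const[label] is the point with the given m-bit label
    prior  : a priori probabilities of the desired symbol (uniform if None)
    """
    M = len(const)
    prior = prior or [1.0 / M] * M
    metrics = [[0.0, 0.0] for _ in range(m)]
    for label, a in enumerate(const):
        # Average the likelihood over the equiprobable interfering symbol.
        like = sum(math.exp(-abs(y - h_src * a - h_int * d) ** 2 / n0)
                   for d in const) / M
        for f in range(m):
            b = (label >> (m - 1 - f)) & 1
            metrics[f][b] += like * prior[label]
    return metrics
```

On later iterations, `prior` would be built from the probabilities fed back by the SISO decoder.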

Results
In this section, we validate the proposed detectors through Monte Carlo computer simulations. We consider an alternate-relaying DF cooperative communication system, using a convolutional code with constraint length 5, rate 1/2, and generator polynomials $(23)_8$ and $(35)_8$. The BCJR algorithm [19] is used for decoding. A frame-based transmission is assumed; each frame has 20 packets. A packet length of $N_b = 150$ information bits is chosen, leading to $N_c = 300$ coded bits. The coded bits are set-partition mapped onto an 8-PSK constellation, resulting in $N_d = 100$ symbols. For each link, the channel is assumed to be frequency non-selective and modeled as a zero-mean independent complex Gaussian random variable. To capture the effect of the path loss on the performance, we consider $E[|h_{AB}|^2] = (d_{SD}/d_{AB})^{\alpha}$ [2], where $h_{AB}$, $d_{SD}$, and $d_{AB}$ are the channel coefficient, the distance between source and destination, and the distance between the nodes A and B, respectively; $\alpha$ is the path loss exponent, and $E[\cdot]$ is the statistical average operator. Unless mentioned otherwise, the distance between the source and each of the two relays equals 0.4, while the distance between the two relays is 0.2. All these distances are normalized to the source-to-destination distance, and $\alpha$ is set to 2. Figures 8 and 9 depict, respectively, the bit error rate (BER) performances of the proposed optimal and sub-optimal detectors at the destination as a function of $E_b/N_0$, where $E_b$ and $N_0$ are the energy per bit contributed by the source and the noise power spectral density, respectively. Here we assume, without loss of generality, that $\sigma_{SD}^2 = 1$ and that the transmit pulse shaping satisfies the Nyquist criterion. Since two relays are used in the alternate-relaying cooperative systems, the BER performances of half-rate systems with one relay and with the best relay selected from a set of two relays are also shown.
For a fair comparison between the half-rate and full-rate (alternate-relaying) systems, we keep the same data rate and transmitted power for both systems. Hence, we use 64-PSK modulation for the half-rate systems and 8-PSK for the full-rate systems. As one can observe, for the full-rate cooperative systems, iterative processing achieves a significant performance improvement for the proposed optimal and sub-optimal detectors. Furthermore, it is seen that the proposed full-rate cooperative systems outperform the half-rate cooperative systems. Figure 10 compares the BER performance of the proposed optimal detector with that of the proposed sub-optimal detector. As one can observe, at the first iteration, the performance of the sub-optimal detector is much worse than that of the optimal one. However, as the number of iterations increases, the performance gap between the sub-optimal and optimal detectors narrows. This occurs because at the first iteration, the a posteriori probabilities provided by the sub-optimal detector are not reliable enough to initialize the SISO decoder. With the aid of iterative processing, the reliability of the sub-optimal detector increases, and its performance approaches that of the optimal detector. Figure 11 shows the BER performance of the proposed optimal and sub-optimal detectors for different relay positions, while the distance between the two relays is kept constant. As one can observe, the BER at the destination increases with $d_{SR_1}$ and $d_{SR_2}$. This is because the BER at the relays increases with $d_{SR_1}$ and $d_{SR_2}$, which, in turn, increases the BER at the destination. Figure 12 shows the BER performance of the detector used at the relays, described in Section 5, as a function of $E_b/N_0$.
Here, $E_b$ is the energy per bit contributed by the source at the listening relay $R_x$, $x = 1, 2$. For the sake of comparison, we show the performance when the interference signal is treated as additional noise; this is referred to as detector 1. In addition, we show the BER performance of the successive interference cancellation detector proposed in [8,11], which is based on detecting the strongest signal first and then subtracting its contribution from the received interfered signal before detecting the other signal. In the sequel, this is referred to as detector 2. As one can observe, both detector 1 and detector 2 lead to an unacceptable performance. We also notice that the detector used achieves a strong performance improvement after only two iterations. In addition, there is no significant improvement in the performance after three iterations for any $E_b/N_0$ value.
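The numerical consequences of the simulation setup can be checked with a few lines. The sketch assumes the path loss model in which the average channel power equals the normalized distance ratio raised to the path loss exponent, and it derives the packet sizes from the code rate and the number of bits per symbol; the function names are hypothetical.

```python
def channel_variance(d_ab, d_sd=1.0, path_loss_exponent=2.0):
    """Average channel power of link A-B: (d_SD / d_AB) ** exponent."""
    return (d_sd / d_ab) ** path_loss_exponent


def packet_sizes(info_bits, rate_num, rate_den, bits_per_symbol):
    """Coded bits and symbols per packet for a rate rate_num/rate_den code."""
    coded_bits = info_bits * rate_den // rate_num
    symbols = coded_bits // bits_per_symbol
    return coded_bits, symbols


print(channel_variance(0.4))        # source-relay links
print(channel_variance(0.2))        # inter-relay link
print(packet_sizes(150, 1, 2, 3))   # rate-1/2 code, 8-PSK
```

With these parameters, the inter-relay link is on average four times stronger than the source-relay links, which is why the interference at the listening relay cannot simply be treated as noise.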

Conclusions
The receiver design at the destination was studied for alternate-relaying decode-and-forward cooperative communication systems with BICM signals. The interference signal, which results from the simultaneous transmission of data streams through both the direct channel and one of the relay channels, was exploited as a beneficial resource to develop an optimal detector. We showed that the optimal detector can be implemented by the parallel concatenation of MAP algorithms and demappers with the decoders. In order to avoid the delay problem associated with the optimal detector, a sub-optimal detector was also developed. The sub-optimal detector achieved a performance close to that of the optimal one. Both the optimal and sub-optimal detectors exchange soft information between the decoders and the MAP algorithms in an iterative fashion to improve performance.