Low complexity frequency domain hybrid-ARQ chase combining for broadband MIMO CDMA systems

Chafnaji, Houda; Ait-Idir, Tarik; Saoudi, Samir; Vasilakos, Athanasios V

doi:10.1186/1687-1499-2012-134

Research
Open access
Published: 05 April 2012


EURASIP Journal on Wireless Communications and Networking volume 2012, Article number: 134 (2012)


Abstract

In this article, we investigate efficient minimum mean square error (MMSE) frequency domain equalization (FDE)-based iterative (turbo) packet combining for cyclic prefix (CP)-CDMA MIMO with Chase-type ARQ. We introduce two turbo packet combining schemes: (i) In the first scheme, namely "chip-level turbo packet combining", MMSE-FDE and packet combining are jointly performed at the chip level. (ii) In the second scheme, namely "symbol-level turbo packet combining", chip-level MMSE-FDE and despreading are carried out separately for each transmission, then packet combining is performed at the level of the soft demapper. The key idea of the proposed schemes is to exploit the diversity among all transmissions at very low cost by introducing new, recursively computed variables. Complexity and performance are evaluated for representative antenna configurations and load factors (i.e., number of orthogonal codes with respect to the spreading factor) to show the gains offered by the proposed techniques.

1. Introduction

Space-time (ST) multiplexing oriented multiple-input-multiple-output (MIMO) and hybrid-automatic repeat request (ARQ) protocols play a key role in the evolution of current wireless systems toward high data rate wireless broadband standards [1]. In ST multiplexing architectures, independent data streams are sent over multiple antennas to increase the transmission rate [2]. In hybrid-ARQ, erroneous data packets are kept in the receiver to help decode the retransmitted packet, using packet combining techniques (e.g., see [3] and references therein). Depending on the retransmitted information, hybrid-ARQ can be classified into Chase-type ARQ and incremental redundancy (IR). Chase-type ARQ is the simplest hybrid-ARQ scheme: the data packet is retransmitted in its entirety. In the more sophisticated IR hybrid-ARQ scheme, retransmissions carry only portions of the data packet, which increases the system throughput while keeping the error performance acceptable. In this study, we propose advanced receiver schemes that can only be used for hybrid-ARQ with Chase combining. Combining schemes for IR hybrid-ARQ are beyond the scope of this article.

To support heterogeneous data rates in CDMA systems, multiple spreading codes can be allocated simultaneously to the same user when a high data rate is requested [4]. This method is often referred to as "multi-code transmission," and has been considered in the high speed packet access (HSPA) system [5]. In MIMO CDMA systems, multi-code transmission offers a spectral efficiency that increases linearly with the number of spreading codes and transmit antennas. This is achieved by assigning the same spreading code group to all transmit antennas. However, in severely frequency selective fading wireless channels, the performance of this scheme can deteriorate dramatically due to co-antenna interference (CAI) and inter-chip interference (ICI). This results in a large delay (due to multiple transmissions) when an ARQ protocol is used in the link layer. Motivated by this limitation, we investigate efficient hybrid-ARQ receiver schemes that reduce the number of ARQ rounds required to correctly decode a data packet in MIMO CDMA ARQ systems with multi-code transmission.

Cyclic-prefix (CP) aided single carrier (SC) CDMA transmission with chip-level minimum mean square error (MMSE)-based frequency domain equalization (FDE) has been introduced in [6]. It is a transceiver scheme that achieves attractive performance at an affordable computational cost. Turbo MMSE-FDE for CP-CDMA has then been proposed to cope with severe ICI [7]. In [8], MMSE FDE has been applied to perform packet combining for multi-code CP-CDMA systems with ARQ operating over severely frequency selective fading channels. It has recently been demonstrated that ARQ is an important source of diversity in MIMO systems [9]. Interestingly, it has been shown in [9] that for both short and long-term static^a ARQ channel dynamics, multiple transmissions improve the diversity order of the corresponding MIMO ARQ channel. The case of block-fading MIMO ARQ, i.e., multiple fading blocks are observed within the same ARQ round, has been reported in [10]. Information rates and turbo MMSE packet combining strategies for the frequency selective fading MIMO ARQ channel have been investigated in [11]. Turbo MMSE packet combining for broadband MIMO ARQ systems with co-channel interference (CCI) has been reported in [12, 13] using time and frequency domain combining methods, respectively.

In this article, we investigate efficient turbo receiver schemes for single user multi-code CDMA systems with Chase-type ARQ operating over a broadband MIMO channel. We introduce two packet combining schemes in which all ARQ rounds are used jointly to decode the data packet. The first, referred to as the chip-level packet combining scheme, is an extension of the combining approach introduced in [11, 13] to the case of multi-antenna multi-code CDMA systems. In this combining scheme, we exploit the fact that both the CP chip-word and the data packet are retransmitted at each ARQ round. This allows us to view each transmission as a group of virtual receive antennas and to combine multiple transmissions jointly with chip-level soft MMSE FDE. In the second, referred to as the symbol-level packet combining scheme, frequency domain soft MMSE equalization is performed separately for each transmission, then demapping is performed jointly with packet combining. Our main contribution is to extend the two combining strategies to the case of multi-antenna multi-code CDMA systems and to propose a low complexity combining approach based on a recursive implementation strategy. Moreover, we present a comparative study of both combining schemes in terms of implementation cost and performance. Using complexity analysis and performance evaluation, we demonstrate that the choice of the best combining technique depends on the system configuration.

Throughout this article, (.)^⊤ and (.)^H denote the transpose and conjugate transpose of the argument, respectively. diag {x} ∈ ℂ^{n × n} and $diag \{X_{1}, \dots, X_{m}\} \in ℂ^{m n_{1} \times m n_{2}}$ denote the diagonal matrix and block diagonal matrix constructed from x ∈ ℂⁿ and $X_{1}, \dots, X_{m} \in ℂ^{n_{1} \times n_{2}}$ , respectively. For x ∈ ℂ^{TN}, x_f denotes the discrete Fourier transform (DFT) of x, i.e., x_f = U_{T, N} x, with U_{T, N} = U_T ⊗ I_N, where I_N is the N × N identity matrix, U_T is a unitary T × T matrix whose (m, n)th element is ${(U_{T})}_{m, n} = \frac{1}{\sqrt{T}} e^{- j (2 π m n / T)}$ , $j = \sqrt{- 1}$ , and ⊗ denotes the Kronecker product.

The rest of this article is organized as follows. In Section 2, we present the CP-CDMA MIMO ARQ transmission scheme and its corresponding communication model. We also present the architecture of a space-time turbo receiver with no packet combiner. In Section 3, we derive the two iterative soft MMSE FDE-aided packet combining schemes proposed in this article. Section 4 analyzes the complexity and memory size required by both schemes, then compares their block error rate (BLER) and throughput performance. The article is concluded in Section 5.

2. System description

2.1. CP-CDMA MIMO ARQ transmission scheme

We consider a single user multi-code CP-CDMA transmission scheme over a broadband MIMO channel with an ARQ protocol in the upper layer, where the ARQ delay is K (index k = 1, . . ., K). An information block is first encoded using a ρ-rate encoder, then interleaved with the aid of a semi-random interleaver Π, and spatially multiplexed over N_T transmit antennas (index t = 1, . . ., N_T) to produce the coded and interleaved frame b which is serial-to-parallel converted to N_T sub-streams $b_{1}, \dots, b_{N_{T}}$ , where

b_{t} ≜ [b_{t, 0, 1}, \dots, b_{t, j, m}, \dots, b_{t, T_{s} - 1, M}] \in {\{0, 1\}}^{M T_{s}} .

(1)

T_s denotes the length of the symbol block transmitted over each antenna (index j = 0, . . ., T_s - 1). Each sub-stream is then symbol mapped onto the elements of constellation $S$ where $|S| = 2^{M}$ . For each antenna, the symbol block is passed through a serial-to-parallel converter and a spreading module which consists of C orthogonal codes. The same spreading matrix

W ≜ [w_{1}^{⊤}, \dots, w_{C}^{⊤}] \in {\{\pm 1 / \sqrt{N}\}}^{N \times C}

(2)

is used for each transmit antenna, where

w_{n} ≜ [w_{1, n}, \dots, w_{N, n}], n = 1, \dots, C,

(3)

is a Walsh code of length N (i.e., spreading factor), and C ≤ N is the number of multiplexed codes. The rate of this space-time code (STC) is therefore

R = ρ M N_{T} C .

(4)

The C parallel chip-streams on each antenna are then added together to construct a block of $T_{c} = T_{s} \frac{N}{C}$ chips (index i = 0, . . ., T_c - 1). The chips at the output of the N_T transmit antennas are arranged in the N_T × T_c matrix

X ≜ [\begin{matrix} x_{1, 0} & \dots & x_{1, T_{c} - 1} \\ ⋮ & & ⋮ \\ x_{N_{T}, 0} & \dots & x_{N_{T}, T_{c} - 1} \end{matrix}] = [x_{0}, \dots, x_{T_{c} - 1}],

(5)

where

x_{t, i} ≜ \sum_{n = 1}^{C} s_{t, n, i} w_{p, n}, p = i mod N + 1,

(6)

and s_{t, n, i} denotes the symbol transmitted by antenna t at channel use (c.u.) i using Walsh code w_n. Transmitted chips are independent (infinitely deep interleaving assumption), and the chip energy is normalized to one, i.e., $E [{|x_{t, i}|}^{2}] = 1$ . A CP chip-word of length T_CP is appended to X to construct the N_T × (T_c + T_CP) chip matrix X' to be transmitted. We consider Chase-type ARQ: when the decoding outcome is erroneous at ARQ round k, the receiver feeds back a negative acknowledgment (NACK) message, and the transmitter retransmits the complete chip matrix X' in the next round. A successful decoding triggers the feedback of a positive acknowledgment (ACK) message. The transmitter then stops the transmission of the current frame and moves on to the next frame. Figure 1 depicts the considered CP-CDMA MIMO transmission scheme with ACK/NACK.
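As an illustration of the multi-code spreading in (6), the following minimal sketch spreads the symbols of one antenna with C Walsh codes and checks that despreading recovers them. It assumes a Sylvester-type Walsh-Hadamard construction and a hypothetical data layout (`symbols[n][q]` is the q-th symbol carried by code n); neither detail is specified in the article, and the function names are illustrative only.

```python
import math

def walsh_matrix(n):
    """Return an n x n Walsh-Hadamard matrix (n a power of two), scaled by 1/sqrt(n).

    Column n of the result plays the role of Walsh code w_n (assumption:
    Sylvester construction; the article does not say how codes are generated).
    """
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def spread(symbols, w, num_codes):
    """Multi-code spreading for one antenna, following (6) with 0-based p = i mod N.

    symbols[n][q] is the q-th symbol on code n (hypothetical layout);
    the symbol active at chip i is the one with index q = i // N.
    """
    n_chips = len(w)
    blocks = len(symbols[0])
    chips = []
    for i in range(blocks * n_chips):
        p, q = i % n_chips, i // n_chips
        chips.append(sum(symbols[n][q] * w[p][n] for n in range(num_codes)))
    return chips

def despread(chips, w, num_codes):
    """Correlate each block of N chips with every code to recover the symbols."""
    n_chips = len(w)
    blocks = len(chips) // n_chips
    return [[sum(chips[q * n_chips + p] * w[p][n] for p in range(n_chips))
             for q in range(blocks)]
            for n in range(num_codes)]
```

Because the codes are orthonormal, `despread(spread(s, w, C), w, C)` returns `s` up to rounding on an ideal (interference-free) channel; over a frequency selective channel this orthogonality is lost, which is why chip-level equalization precedes despreading.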

2.2. Communication model

The broadband MIMO propagation channel connecting the N_T transmit and the N_R receive antennas is composed of L chip-spaced taps (index l = 0, . . ., L - 1). We assume a quasi-static block fading channel, i.e., the channel is constant over an information block and independently changes from block to block. The N_R × N_T channel matrix characterizing the l th discrete tap at ARQ round k is denoted $H_{l}^{(k)}$ , and is made of zero-mean circularly symmetric complex Gaussian random entries. The average channel energy per receive antenna is normalized as

\sum_{l = 0}^{L - 1} \sum_{t = 1}^{N_{T}} E [{|h_{r, t, l}^{(k)}|}^{2}] = N_{T}, r = 1, \dots, N_{R},

(7)

where $h_{r, t, l}^{(k)}$ is the (r, t)th element of $H_{l}^{(k)}$ .

At the receiver side, after removing the CP-word at ARQ round k, the N_R × 1 received signal at discrete time i is expressed as,

y_{i}^{(k)} = \sum_{l = 0}^{L - 1} H_{l}^{(k)} x_{(i - l) mod T_{c}} + n_{i}^{(k)},

(8)

where $n_{i}^{(k)} ~ N (0_{N_{R} \times 1}, σ^{2} I_{N_{R}})$ is the thermal noise at the receiver side. The block communication model, at transmission k, can be written as,

y^{(k)} = H_{c}^{(k)} x + n^{(k)},

(9)

where $y^{(k)} ≜ {[y_{0}^{{(k)}^{⊤}}, \dots, y_{T_{c} - 1}^{{(k)}^{⊤}}]}^{⊤}$ , $n^{(k)} = {[n_{0}^{{(k)}^{⊤}}, \dots, n_{T_{c} - 1}^{{(k)}^{⊤}}]}^{⊤}$ and $H_{c}^{(k)} \in ℂ^{T_{c} N_{R} \times T_{c} N_{T}}$ is a block circulant matrix whose first T_c N_R × N_T column matrix is ${[H_{0}^{{(k)}^{⊤}}, \dots, H_{L - 1}^{{(k)}^{⊤}}, 0_{N_{T} \times (T_{c} - L) N_{R}}]}^{⊤}$ . As $H_{c}^{(k)}$ is block circulant, it can be block diagonalized in a Fourier basis as $H_{c}^{(k)} = U_{T_{c}, N_{R}}^{H} Λ^{(k)} U_{T_{c}, N_{T}}$ , where Λ^(k), the channel frequency response (CFR) matrix at ARQ round k, is given by

\{\begin{matrix} Λ^{(k)} ≜ diag \{Λ_{0}^{(k)}, \dots, Λ_{T_{c} - 1}^{(k)}\}, \\ Λ_{i}^{(k)} = \sum_{l = 0}^{L - 1} H_{l}^{(k)} e^{- j (2 π i l / T_{c})} . \end{matrix}

(10)

A discrete Fourier transform (DFT) is then applied to the received vector y^(k). This yields T_c frequency domain components grouped in block

y_{f}^{(k)} ≜ {[y_{f_{0}}^{{(k)}^{⊤}}, \dots, y_{f_{T_{c} - 1}}^{{(k)}^{⊤}}]}^{⊤},

(11)

which can be expressed as,

y_{f}^{(k)} = Λ^{(k)} x_{f} + n_{f}^{(k)},

(12)

where vectors $x_{f} ≜ {[x_{f_{0}}^{⊤}, \dots, x_{f_{T_{c} - 1}}^{⊤}]}^{⊤} \in ℂ^{T_{c} N_{T} \times 1}$ and $n_{f}^{(k)} ≜ {[n_{f_{0}}^{{(k)}^{⊤}}, \dots, n_{f_{T_{c} - 1}}^{{(k)}^{⊤}}]}^{⊤}$ group the DFTs of transmitted chips and thermal noise at round k, respectively. The channel frequency response (CFR) matrix Λ^(k) is given by (10).
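The step from the time domain model (8) to the per-bin model (12) can be checked numerically. The sketch below is restricted, as a simplifying assumption, to a single transmit and a single receive antenna and to a noise-free channel, so that every Λ_i is a scalar; the function names are illustrative and not taken from the article.

```python
import cmath
import math

def unitary_dft(x):
    """Unitary DFT: X_m = (1/sqrt(T)) * sum_n x_n * exp(-j*2*pi*m*n/T)."""
    t = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * m * n / t) for n in range(t))
            / math.sqrt(t)
            for m in range(t)]

def circular_channel(x, h):
    """Noise-free version of (8) for one antenna pair: y_i = sum_l h_l * x_{(i-l) mod T_c}."""
    t = len(x)
    return [sum(h[l] * x[(i - l) % t] for l in range(len(h))) for i in range(t)]

def cfr(h, t):
    """Channel frequency response of (10): Lambda_i = sum_l h_l * exp(-j*2*pi*i*l/T_c)."""
    return [sum(h[l] * cmath.exp(-2j * math.pi * i * l / t) for l in range(len(h)))
            for i in range(t)]
```

For any chip block `x` and tap vector `h`, `unitary_dft(circular_channel(x, h))[i]` equals `cfr(h, len(x))[i] * unitary_dft(x)[i]` up to rounding. This is the property the CP buys: after CP removal the linear convolution looks circular, so each frequency bin can be equalized independently.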

2.3. Turbo receiver with no packet combining for multi-antenna multi-code CP-CDMA

The conventional receiver for multi-antenna multi-code CP-CDMA, presented in this section, makes use of the ARQ principle with no packet combining at the receiver side. At transmission k, the receiver performs soft equalization and computes the extrinsic log-likelihood ratios (LLRs) of the coded and interleaved bits with the aid of the communication model (12) and the a priori information generated by the soft-input-soft-output (SISO) decoder at the previous iteration. Interference cancelation is performed starting from the first iteration: the conventional receiver makes use of the prior LLRs of coded and interleaved bits generated by the SISO decoder during the last iteration of the previous transmission k - 1. This idea was initially introduced in [14] in the context of single antenna coded systems with ARQ.

First, soft inter-chip interference (ICI) is canceled from the received signal vector $y_{f}^{(k)}$ . Then, the resulting soft ICI-free signal enters an unconditional MMSE filter. As presented in [15], soft interference cancelation and MMSE filtering can be implemented in the frequency domain using a forward filter and a backward filter. The MMSE estimate $z_{f}^{(k)}$ of x_f at transmission k is expressed as,

z_{f}^{(k)} = Φ^{(k)} y_{f}^{(k)} - Ψ^{(k)} {\tilde{x}}_{f},

(13)

where ${\tilde{x}}_{f}$ denotes the DFT of the conditional expectation (i.e., computed based on a-priori LLRs) of x and $Φ^{(k)} = diag \{Φ_{0}^{(k)}, \dots, Φ_{T_{c} - 1}^{(k)}\}$ and $Ψ^{(k)} = diag \{Ψ_{0}^{(k)}, \dots, Ψ_{T_{c} - 1}^{(k)}\}$ denote the forward and backward filters at round k, respectively, and are given by,

\{\begin{matrix} Φ_{i}^{(k)} ≜ \frac{1}{σ^{2}} \{I_{N_{T}} - D_{i}^{(k)} C_{i}^{{(k)}^{- 1}}\} Λ_{i}^{{(k)}^{H}}, \\ C_{i}^{(k)} = σ^{2} {\tilde{Ξ}}^{- 1} + D_{i}^{(k)}, \end{matrix}

(14)

\{\begin{matrix} Ψ_{i}^{(k)} ≜ Φ_{i}^{(k)} Λ_{i}^{(k)} - ϒ^{(k)}, \\ ϒ^{(k)} = \frac{1}{T_{c}} \sum_{i = 0}^{T_{c} - 1} Φ_{i}^{(k)} Λ_{i}^{(k)} . \end{matrix}

(15)

where $D_{i}^{(k)} = Λ_{i}^{{(k)}^{H}} Λ_{i}^{(k)}$ and $\tilde{Ξ}$ is the N_T × N_T unconditional covariance of transmitted chips, and is computed as the time average of conditional covariance matrices $Ξ_{i} ≜ diag \{σ_{1, i}^{2}, \dots, σ_{N_{T}, i}^{2}\}$ , where $σ_{t, i}^{2}$ is the conditional variance of x_{t, i}.
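To make the filter structure of (13) to (15) concrete, here is a minimal sketch for the scalar case N_T = N_R = 1, where Λ_i, D_i, C_i and the covariance Ξ̃ all reduce to numbers. The reduction to scalars, the argument name `xi` for the unconditional chip variance, and the requirement `xi > 0` are assumptions made here for illustration.

```python
def mmse_fde_filters(lam, sigma2, xi):
    """Forward (Phi) and backward (Psi) filters of (14)-(15), scalar case.

    lam    : list of per-bin channel frequency responses Lambda_i
    sigma2 : noise variance sigma^2
    xi     : unconditional chip variance (scalar stand-in for Xi-tilde), must be > 0
    """
    d = [abs(v) ** 2 for v in lam]                       # D_i = Lambda_i^H Lambda_i
    c = [sigma2 / xi + dv for dv in d]                   # C_i = sigma^2 Xi^-1 + D_i
    phi = [(1.0 / sigma2) * (1.0 - dv / cv) * v.conjugate()
           for v, dv, cv in zip(lam, d, c)]
    upsilon = sum(p * v for p, v in zip(phi, lam)) / len(lam)
    psi = [p * v - upsilon for p, v in zip(phi, lam)]    # Psi_i = Phi_i Lambda_i - Upsilon
    return phi, psi

def equalize(y_f, x_tilde_f, phi, psi):
    """Equalizer output of (13): z_f = Phi y_f - Psi x_tilde_f, bin by bin."""
    return [p * y - q * x for p, y, q, x in zip(phi, y_f, psi, x_tilde_f)]
```

In this scalar case the forward filter simplifies to conj(Λ_i) / (σ² + ξ |Λ_i|²), which for ξ = 1 (no a priori information) is the familiar linear MMSE-FDE tap. The backward taps sum to zero over the bins, so in the time domain the filter subtracts the soft interference estimate without removing the desired chip itself.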

After computing (13), the inverse DFT (IDFT) is then applied to $z_{f}^{(k)}$ to obtain the equalized time domain chip sequence,

z^{(k)} = U_{T_{c}, N_{T}}^{H} z_{f}^{(k)} .

(16)

The MMSE estimate $z_{t, i}^{(k)}$ corresponding to antenna t and channel use i after k transmissions can be simply extracted from z^(k) as $z_{t, i}^{(k)} = e_{t, i}^{H} z^{(k)}$ , where e_{t, i} denotes the (N_T i + t)th vector of the canonical basis. After despreading, the extrinsic LLR value $ϕ_{t, j, m}^{(e)}$ [16] corresponding to coded and interleaved bit b_{t, j, m} ∀ t, j, m is computed as,

ϕ_{t, j, m}^{(e)} = log \frac{\sum_{s \in S_{1}^{m}} exp \{ξ_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m'}^{(a)} λ_{m'} \{s\}\}}{\sum_{s \in S_{0}^{m}} exp \{ξ_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m'}^{(a)} λ_{m'} \{s\}\}},

(17)

where $ξ_{t, j}^{(k)} (s) = \frac{{|r_{t, j}^{(k)} - g_{t, j}^{(k)} s|}^{2}}{θ_{t, j}^{{(k)}^{2}}}$ , and $r_{t, j}^{(k)}$ , $g_{t, j}^{(k)}$ , and $θ_{t, j}^{{(k)}^{2}}$ are the despreading module output, gain, and residual interference variance, respectively. $ϕ_{t, j, m'}^{(a)}$ denotes the a priori LLR value corresponding to b_{t, j, m'}. λ_m'{s} is an operator that extracts the m'th bit labeling symbol $s \in S$ , and $S_{β}^{m}$ is the set of symbols whose m th bit is equal to β, i.e., $S_{β}^{m} = \{s : λ_{m} \{s\} = β\}$ . The obtained extrinsic LLR values are de-interleaved and fed to the SISO decoder. The block diagram of the conventional receiver at ARQ round k is depicted in Figure 2.
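The demapping rule (17) can be sketched for a single symbol as follows. Two points are assumptions made here rather than statements from the article: the Gray-labelled QPSK mapping in `QPSK` is hypothetical, and the distance metric is entered into the exponent with a negative sign, as in a Gaussian likelihood, so that symbols closer to the observation receive more weight.

```python
import math

# Hypothetical Gray-labelled, unit-energy QPSK mapping: bit tuple -> symbol.
QPSK = {
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(1, -1) / math.sqrt(2),
    (1, 0): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
}

def extrinsic_llrs(r, g, theta2, prior_llrs, constellation=QPSK):
    """Extrinsic LLRs of the bits labelling one symbol, in the spirit of (17).

    r          : despreading module output
    g          : equivalent gain
    theta2     : residual interference variance
    prior_llrs : a priori LLRs, one per bit of the label
    """
    num_bits = len(prior_llrs)
    llrs = []
    for m in range(num_bits):
        sums = [0.0, 0.0]  # sums[beta] accumulates over symbols whose m-th bit is beta
        for bits, s in constellation.items():
            # Assumption: negative sign so that the term acts as a log-likelihood.
            metric = -abs(r - g * s) ** 2 / theta2
            # Only the other bits' priors enter, which makes the output extrinsic.
            prior = sum(prior_llrs[k] * bits[k] for k in range(num_bits) if k != m)
            sums[bits[m]] += math.exp(metric + prior)
        llrs.append(math.log(sums[1] / sums[0]))
    return llrs
```

With `r` equal to the symbol labelled (1, 0), unit gain and zero priors, the first LLR is positive and the second negative, as expected. A production demapper would typically work in the log domain (log-sum-exp or max-log) to avoid underflow at high SNR; the direct form is kept here because it mirrors the equation.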

3. Iterative receivers for CP-CDMA MIMO ARQ

In this section, we present two efficient algorithms for performing turbo packet combining for CP-CDMA MIMO ARQ systems: (i) chip-level turbo packet combining, and (ii) symbol-level turbo packet combining. In both schemes, signals received in multiple ARQ rounds are processed using soft MMSE FDE.

3.1. Chip-level turbo packet combining

To exploit the diversity available in received signals $y_{f_{0}}^{(1)}, \dots, y_{f_{T_{c} - 1}}^{(k)}$ , we view each ARQ round k as an additional group of N_R virtual receive antennas. The MIMO ARQ system can therefore be considered as a point-to-point MIMO link with N_T transmit and kN_R receive antennas, where the T_c kN_R × 1 chip-level virtual received signal vector ${\underline{y}}_{f}^{(k)}$ is constructed as,

{\underline{y}}_{f}^{(k)} ≜ {[y_{f_{0}}^{{(1)}^{⊤}}, \dots, y_{f_{0}}^{{(k)}^{⊤}}, \dots, y_{f_{T_{c} - 1}}^{{(1)}^{⊤}}, \dots, y_{f_{T_{c} - 1}}^{{(k)}^{⊤}}]}^{⊤} .

(18)

The frequency domain communication model after k rounds is then given as,

{\underline{y}}_{f}^{(k)} = {\underline{Λ}}^{(k)} x_{f} + {\underline{n}}_{f}^{(k)},

(19)

where

{\underline{Λ}}^{(k)} ≜ diag \{[\begin{matrix} Λ_{0}^{(1)} \\ ⋮ \\ Λ_{0}^{(k)} \end{matrix}], \dots, [\begin{matrix} Λ_{T_{c} - 1}^{(1)} \\ ⋮ \\ Λ_{T_{c} - 1}^{(k)} \end{matrix}]\} \in ℂ^{T_{c} k N_{R} \times T_{c} N_{T}},

(20)

and

{\underline{n}}_{f}^{(k)} = {[n_{f_{0}}^{{(1)}^{⊤}}, \dots, n_{f_{0}}^{{(k)}^{⊤}}, \dots, n_{f_{T_{c} - 1}}^{{(1)}^{⊤}}, \dots, n_{f_{T_{c} - 1}}^{{(k)}^{⊤}}]}^{⊤} .

(21)

Soft ICI cancelation and frequency domain MMSE filtering are jointly performed over all ARQ rounds. We call this concept chip-level turbo packet combining. The multi-round MMSE estimate $z_{f}^{(k)}$ of x_f at transmission k is therefore expressed as,

z_{f}^{(k)} = {\underline{Φ}}^{(k)} {\underline{y}}_{f}^{(k)} - {\underline{Ψ}}^{(k)} {\tilde{x}}_{f},

(22)

where ${\underline{Φ}}^{(k)} = diag \{{\underline{Φ}}_{0}^{(k)}, \dots, {\underline{Φ}}_{T_{c} - 1}^{(k)}\}$ is the multi-round forward filter given by,

\{\begin{matrix} {\underline{Φ}}_{i}^{(k)} ≜ \frac{1}{σ^{2}} \{I_{N_{T}} - {\underline{D}}_{i}^{(k)} {\underline{C}}_{i}^{{(k)}^{- 1}}\} {\underline{Λ}}_{i}^{{(k)}^{H}}, \\ {\underline{C}}_{i}^{(k)} = σ^{2} {\tilde{Ξ}}^{- 1} + {\underline{D}}_{i}^{(k)}, \end{matrix}

(23)

and ${\underline{Ψ}}^{(k)} = diag \{{\underline{Ψ}}_{0}^{(k)}, \dots, {\underline{Ψ}}_{T_{c} - 1}^{(k)}\}$ is the multi-round backward filter given by,

\{\begin{matrix} {\underline{Ψ}}_{i}^{(k)} ≜ {\underline{Φ}}_{i}^{(k)} {\underline{Λ}}_{i}^{(k)} - {\underline{ϒ}}^{(k)}, \\ {\underline{ϒ}}^{(k)} = \frac{1}{T_{c}} \sum_{i = 0}^{T_{c} - 1} {\underline{Φ}}_{i}^{(k)} {\underline{Λ}}_{i}^{(k)} . \end{matrix}

(24)

Note that, to perform this combining scheme, all signals received at slots 1, . . ., k and their corresponding channel matrices $Λ_{0}^{(1)}, \dots, Λ_{T_{c} - 1}^{(k)}$ have to be stored in the receiver. This requires a memory size that scales linearly with the ARQ delay. To relax this memory constraint, we introduce the following frequency domain variables, ${\underline{\tilde{y}}}_{f}^{(k)}$ and ${\underline{D}}_{i}^{(k)}$ . The first variable ${\underline{\tilde{y}}}_{f}^{(k)}$ accumulates the received signals. It is calculated using the following recursion,

\{\begin{matrix} {\underline{\tilde{y}}}_{f}^{(k)} = {\underline{\tilde{y}}}_{f}^{(k - 1)} + Λ^{{(k)}^{H}} y_{f}^{(k)}, \\ {\underline{\tilde{y}}}_{f}^{(0)} = 0_{T_{c} N_{T} \times 1} . \end{matrix}

(25)

The second variable ${\underline{D}}_{i}^{(k)}$ is used to store CFRs. It is calculated as,

\{\begin{matrix} {\underline{D}}_{i}^{(k)} = {\underline{D}}_{i}^{(k - 1)} + Λ_{i}^{{(k)}^{H}} Λ_{i}^{(k)}, \\ {\underline{D}}_{i}^{(0)} = 0_{N_{T} \times N_{T}} . \end{matrix}

(26)
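The reason two accumulators are enough can be seen per frequency bin. Assuming, by analogy with the single-round definition of $D_{i}^{(k)}$ in Section 2.3, that ${\underline{D}}_{i}^{(k)}$ stands for the Gram matrix of the stacked channel ${\underline{Λ}}_{i}^{(k)}$ (the i th diagonal block in (20)), and writing ${\underline{y}}_{f_{i}}^{(k)}$ for the i th block of (18) and ${\underline{\tilde{y}}}_{f_{i}}^{(k)}$ for the i th N_T × 1 block of ${\underline{\tilde{y}}}_{f}^{(k)}$ (notation introduced here for illustration), the stacked products reduce to sums over rounds:

```latex
\begin{aligned}
\underline{\Lambda}_{i}^{(k)H}\, \underline{y}_{f_{i}}^{(k)}
  &= \sum_{u=1}^{k} \Lambda_{i}^{(u)H}\, y_{f_{i}}^{(u)}
   = \underline{\tilde{y}}_{f_{i}}^{(k-1)} + \Lambda_{i}^{(k)H}\, y_{f_{i}}^{(k)}
   = \underline{\tilde{y}}_{f_{i}}^{(k)}, \\
\underline{\Lambda}_{i}^{(k)H}\, \underline{\Lambda}_{i}^{(k)}
  &= \sum_{u=1}^{k} \Lambda_{i}^{(u)H}\, \Lambda_{i}^{(u)}
   = \underline{D}_{i}^{(k-1)} + \Lambda_{i}^{(k)H}\, \Lambda_{i}^{(k)}
   = \underline{D}_{i}^{(k)} .
\end{aligned}
```

Since the multi-round forward filter (23) ends with the factor ${\underline{Λ}}_{i}^{{(k)}^{H}}$ , it only ever sees the received signals through the first sum and the channels through the second, so neither the individual signals nor the individual CFRs of earlier rounds need to be kept.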

Using these recursive variables, the output of the soft MMSE packet combiner can be expressed as,

z_{f}^{(k)} = Γ^{(k)} {\underline{\tilde{y}}}_{f}^{(k)} - Ω^{(k)} {\tilde{x}}_{f},

(27)

where $Γ^{(k)} = diag \{Γ_{0}^{(k)}, \dots, Γ_{T_{c} - 1}^{(k)}\} \in ℂ^{T_{c} N_{T} \times T_{c} N_{T}}$ denotes the low complexity forward filter at ARQ round k and is defined as,

\{\begin{matrix} Γ_{i}^{(k)} ≜ \frac{1}{σ^{2}} \{I_{N_{T}} - {\underline{D}}_{i}^{(k)} {\underline{C}}_{i}^{{(k)}^{- 1}}\}, \\ {\underline{C}}_{i}^{(k)} = σ^{2} {\tilde{Ξ}}^{- 1} + {\underline{D}}_{i}^{(k)}, \end{matrix}

(28)

and $Ω^{(k)} = diag \{Ω_{0}^{(k)}, \dots, Ω_{T_{c} - 1}^{(k)}\} \in ℂ^{T_{c} N_{T} \times T_{c} N_{T}}$ denotes the low complexity backward filter at ARQ round k and is defined as,

\{\begin{matrix} Ω_{i}^{(k)} ≜ Γ_{i}^{(k)} {\underline{D}}_{i}^{(k)} - ϒ^{(k)}, \\ ϒ^{(k)} = \frac{1}{T_{c}} \sum_{i = 0}^{T_{c} - 1} Γ_{i}^{(k)} {\underline{D}}_{i}^{(k)} . \end{matrix}

(29)

The inverse DFT is then applied to $z_{f}^{(k)}$ to obtain the equalized time domain chip sequence. After despreading, extrinsic LLR values $ϕ_{t, j, m, n}^{(e)} (k)$ corresponding to coded and interleaved bits b_{t, j, m} ∀ t, j, m at iteration n of round k are computed similarly to (17). The output of the demapper is then de-interleaved and fed to the SISO decoder. The proposed low complexity algorithm is summarized in Table 1 and the block diagram is presented in Figure 3.
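A minimal sketch of the recursive chip-level combiner, under the simplifying assumption N_T = N_R = 1 (all per-bin quantities are scalars), is given below. The function names and the scalar variance argument `xi` (assumed positive) are illustrative and do not come from the article.

```python
def chip_level_update(y_acc, d_acc, lam_k, y_f_k):
    """One step of recursions (25)-(26) when round k arrives.

    y_acc, d_acc : accumulators from rounds 1..k-1 (all zeros before round 1)
    lam_k, y_f_k : per-bin CFR and received frequency domain signal of round k
    """
    y_new = [a + l.conjugate() * y for a, l, y in zip(y_acc, lam_k, y_f_k)]
    d_new = [a + abs(l) ** 2 for a, l in zip(d_acc, lam_k)]
    return y_new, d_new

def combined_estimate(y_acc, d_acc, x_tilde_f, sigma2, xi):
    """Low complexity combiner output of (27)-(29), scalar case.

    Only the two accumulators are needed; the signals and channels
    of earlier rounds are not accessed individually.
    """
    gamma = [(1.0 / sigma2) * (1.0 - d / (sigma2 / xi + d)) for d in d_acc]
    upsilon = sum(g * d for g, d in zip(gamma, d_acc)) / len(d_acc)
    return [g * y - (g * d - upsilon) * x
            for g, y, d, x in zip(gamma, y_acc, d_acc, x_tilde_f)]
```

At each new round the receiver calls `chip_level_update` once, then runs the turbo iterations by calling `combined_estimate` with the latest soft chip estimates. In a noise-free check with ξ = 1 and no a priori information, the output per bin is D_i / (σ² + D_i) times the transmitted value, where D_i is the channel energy summed over rounds; this is how each retransmission adds diversity to every bin.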

Table 1 Summary of the chip-level turbo combining algorithm


3.2. Symbol-level turbo packet combining

In this combining scheme, the receiver performs chip-level space-time frequency domain equalization separately for each ARQ round, then combines multiple transmissions at the level of the soft demapper. At each iteration of ARQ round k, soft ICI cancelation and MMSE filtering are performed similarly to (13) using communication model (12). The despreading module outputs at the current iteration of ARQ round k are then combined with those obtained at the last turbo iteration of previous rounds k - 1, . . ., 1. Let $r_{t, j}^{(k)} = {[r_{t, j}^{(1)}, \dots, r_{t, j}^{(k)}]}^{⊤}$ denote the t th antenna despreading module outputs at discrete time j corresponding to transmissions 1, . . ., k. Assuming independence between the outputs of the despreading module for different transmissions $r_{t, j}^{(1)}, \dots, r_{t, j}^{(k)}$ , the extrinsic LLR values $ϕ_{t, j, m, n}^{(e)} (k)$ corresponding to coded and interleaved bits b_{t, j, m} at iteration n of round k are expressed as,

ϕ_{t, j, m, n}^{(e)} (k) = log \frac{\sum_{s \in S_{1}^{m}} exp \{ξ_{t, j}^{(k)} {(s)}^{H} ξ_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m', n}^{(a)} (k) λ_{m'} \{s\}\}}{\sum_{s \in S_{0}^{m}} exp \{ξ_{t, j}^{(k)} {(s)}^{H} ξ_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m', n}^{(a)} (k) λ_{m'} \{s\}\}},

(30)

where $ξ_{t, j}^{(k)} (s) = |r_{t, j}^{(k)} - g_{t, j}^{(k)} s| θ_{t, j}^{{(k)}^{- 1}}$ , $g_{t, j}^{(k)} = {[g_{t, j}^{(1)}, \dots, g_{t, j}^{(k)}]}^{⊤}$ is the equivalent channel gain, and $θ_{t, j}^{(k)} = diag \{θ_{t, j}^{(1)}, \dots, θ_{t, j}^{(k)}\}$ is the residual interference covariance matrix corresponding to transmissions 1, . . ., k.

Implementation Aspects

To reduce the memory required for storing the despreading module outputs of different transmissions, we introduce the new variable ${\bar{ξ}}_{t, j}^{(k)} (s)$ computed according to the following recursion,

\{\begin{matrix} {\bar{ξ}}_{t, j}^{(k)} (s) = {\bar{ξ}}_{t, j}^{(k - 1)} (s) + \frac{{|r_{t, j}^{(k)} - g_{t, j}^{(k)} s|}^{2}}{θ_{t, j}^{{(k)}^{2}}}, \\ {\bar{ξ}}_{t, j}^{(0)} (s) = 0 . \end{matrix}

(31)

The extrinsic LLR $ϕ_{t, j, m, n}^{(e)} (k)$ in (30) is then expressed as,

ϕ_{t, j, m, n}^{(e)} (k) = log \frac{\sum_{s \in S_{1}^{m}} exp \{{\bar{ξ}}_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m', n}^{(a)} (k) λ_{m'} \{s\}\}}{\sum_{s \in S_{0}^{m}} exp \{{\bar{ξ}}_{t, j}^{(k)} (s) + \sum_{m' \neq m} ϕ_{t, j, m', n}^{(a)} (k) λ_{m'} \{s\}\}},

(32)

Recursion (31) is the key ingredient of the proposed symbol-level combining algorithm, since it makes both complexity and memory requirements largely insensitive to the ARQ delay. The proposed recursive algorithm is summarized in Table 2 and the block diagram is presented in Figure 4.
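Recursion (31) amounts to keeping one running metric per candidate symbol. The following minimal sketch handles a single antenna and symbol position; the function name and the list-based layout are assumptions made for illustration.

```python
def update_symbol_metrics(metrics, r, g, theta2, constellation):
    """One step of recursion (31) for one (antenna, symbol position) pair.

    metrics       : accumulated metrics from rounds 1..k-1, one per candidate symbol
                    (all zeros before the first round)
    r, g, theta2  : despreading output, gain and residual interference variance
                    of the current round
    constellation : list of candidate symbols, in the same order as `metrics`
    """
    return [m + abs(r - g * s) ** 2 / theta2
            for m, s in zip(metrics, constellation)]
```

One way to use it, following the description in Section 3.2, is to keep the metrics of rounds 1 to k - 1 fixed during the turbo iterations of round k, add the current iteration's term on the fly, and commit the result only after the last iteration of the round. The stored state is then one real number per candidate symbol, per symbol position and per antenna, regardless of how many rounds have been received.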

Table 2 Summary of the symbol-level turbo combining algorithm


4. Complexity and performance analysis

4.1. Complexity evaluation

In this section, we briefly analyze both the computational cost and the memory requirements of the proposed packet combining schemes. First, note that both combining schemes have essentially the same structure. The only difference comes from the variable updates in step 1.1 of Table 1 and step 1.1.3 of Table 2. Therefore, both techniques have approximately the same implementation cost. In the following, we focus on the number of arithmetic additions and the memory required to perform recursions (25), (26), and (31).

The main idea of the proposed algorithms is to exploit the diversity available in multiple transmissions without explicitly storing the soft channel outputs (i.e., signals and CFRs) or decisions (i.e., filter outputs) corresponding to all ARQ rounds. This is achieved with the aid of recursions (25), (26), and (31), and translates into a memory requirement of 2T_c N_T (N_T + 1) and T_s N_T 2^M real values for chip-level and symbol-level turbo combining, respectively. Note that in both schemes, the required memory size is insensitive to the ARQ delay. The number of rounds only influences the number of arithmetic additions required in the update procedures corresponding to recursions (25), (26), and (31). At each ARQ round, the chip-level turbo combining algorithm involves 2T_c N_T (N_T + 1) arithmetic additions to update ${\underline{\tilde{y}}}_{f}^{(k)}$ and ${\underline{D}}_{i}^{(k)}$ . The symbol-level turbo combining scheme requires T_s N_T N_iter 2^M arithmetic additions to update ${\bar{ξ}}_{t, j}^{(k)} (s)$ at each round, where N_iter denotes the number of turbo iterations. Table 3 summarizes the number of arithmetic additions and memory size required by both schemes.
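The two memory expressions are easy to evaluate for a given configuration. The sketch below simply encodes the formulas stated above; the function names and the example values used afterwards (N_T = 2, QPSK, N = C = 16, T_s = 256) are illustrative assumptions.

```python
def chip_level_memory(t_c, n_t):
    """Real values stored by chip-level combining: 2 * T_c * N_T * (N_T + 1).

    2 * T_c * N_T reals hold the complex accumulator of (25) and
    2 * T_c * N_T^2 reals hold the complex N_T x N_T matrices of (26).
    """
    return 2 * t_c * n_t * (n_t + 1)

def symbol_level_memory(t_s, n_t, bits_per_symbol):
    """Real values stored by symbol-level combining: T_s * N_T * 2^M."""
    return t_s * n_t * 2 ** bits_per_symbol

def chips_per_block(t_s, spreading_factor, num_codes):
    """T_c = T_s * N / C, as defined in Section 2.1."""
    return t_s * spreading_factor // num_codes
```

With the example values, `chip_level_memory(chips_per_block(256, 16, 16), 2)` gives 3072 and `symbol_level_memory(256, 2, 2)` gives 2048, i.e., a ratio of 1.5. Because T_c grows as C shrinks while the symbol-level figure does not depend on C, the chip-level memory footprint becomes relatively larger for lightly loaded systems.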

Table 3 Summary of the maximum number of arithmetic additions and memory size


4.2. Performance evaluation

In this section, we evaluate the performance of the proposed multi-antenna multi-code CP-CDMA receivers in terms of BLER and throughput η. Following [17], we define the throughput as $η ≜ \frac{E [R]}{E [K]}$ , where $R$ is a random variable (RV) that takes the value R when the packet is correctly received, or zero when the packet is still erroneous after K ARQ rounds. $K$ is an RV that denotes the number of rounds used for transmitting one data packet.
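In a simulation, this throughput definition can be estimated from per-packet outcomes by replacing the two expectations with sample means. The sketch below assumes a hypothetical record format, one `(decoded_ok, rounds_used)` pair per packet, which is not taken from the article.

```python
def throughput(outcomes, rate):
    """Estimate eta = E[R] / E[K] from simulated packets.

    outcomes : list of (decoded_ok, rounds_used) pairs, one per data packet
               (hypothetical format)
    rate     : the code rate R earned by a correctly decoded packet
    """
    n = len(outcomes)
    mean_reward = sum(rate if ok else 0.0 for ok, _ in outcomes) / n
    mean_rounds = sum(rounds for _, rounds in outcomes) / n
    return mean_reward / mean_rounds
```

For example, four packets with outcomes `[(True, 1), (True, 2), (False, 3), (True, 1)]` and `rate = 2.0` give a mean reward of 1.5 over a mean of 1.75 rounds. Note that the ratio of means is used, not the mean of per-packet ratios, which matches the definition above.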

The system used for the evaluation has N_T = 2 transmit antennas, N_R = {1, 2} receive antennas, spreading factor N = 16, Quadrature Phase Shift Keying (QPSK) modulation and a 16-state convolutional encoder with generator polynomials (35, 23)₈. The length of the coded frame is 1024 bits including tails. We assume a short-term static ARQ MIMO channel that has L = 10 chip-spaced paths with equally distributed power. The CP length is T_{C P} = 10. We employ the Max-Log-MAP version of the MAP decoding algorithm [18] for SISO decoding. The maximum number of transmissions is set to K = 3, and the E_c/N₀ ratio appearing in all figures is the SNR per chip per receive antenna. We have observed via simulations that no notable performance improvement is obtained when the number of iterations is greater than three. The turbo process is therefore stopped after three iterations for each transmission. The matched filter bound (MFB)^b is used to evaluate the diversity achieved by the proposed algorithms. We also use conventional LLR-level packet combining^c as a reference to evaluate the performance gain provided by the proposed combining strategies. In terms of complexity, the number of arithmetic additions is relatively insignificant compared with the overall computational cost of the receiver. Therefore, we consider the memory requirements as the main parameter when evaluating the implementation cost of the studied combining schemes.

We first investigate performance for balanced configurations, i.e., N_T = N_R = 2, with all codes assigned to one user (C = 16). Figure 5 compares the BLER performance of chip-level and symbol-level combining with the MFB and LLR-level combining. Due to the increase in the diversity order caused by virtual antennas, the proposed combining schemes clearly outperform LLR-level combining. The performance gap is more than 2 dB at 10^-2 BLER for both the second and third transmissions. Moreover, chip-level combining outperforms symbol-level combining, although the gap is less than 0.7 dB at 10^-2 BLER for both the second and third transmissions. Figure 5 also plots the MFB to evaluate the diversity achieved by the proposed combining schemes. We observe that chip-level combining achieves maximum diversity, and that the gap between the proposed combining scheme and the MFB is reduced from 4 dB in the first transmission to 1 dB in the third transmission at 10^-2 BLER. In Figure 6, we examine an overloaded configuration where N_T = 2 and N_R = 1. Chip-level combining significantly outperforms symbol-level combining: the gap between the two techniques is more than 5 dB for the second transmission and 3 dB for the third transmission at 10^-2 BLER. Chip-level combining is therefore more beneficial for overloaded configurations, where the receiver has to deal with more interference. Moreover, the ICI cancelation capability of the chip-level and symbol-level combiners is better than that of LLR-level combining. In fact, LLR-level combining performance curves tend to saturate for high E_c/N₀ values, while the BLER curves of the proposed combining schemes have steeper slopes that are similar to those of the MFB curves. This is mainly due to the fact that, at the second ARQ round, the proposed combiners construct a 2 × 2 virtual MIMO channel, while the MIMO configuration remains unbalanced in the case of LLR-level combining.

Now, we turn to the case where not all codes are necessarily assigned to one user. We start by evaluating the throughput of the considered system with N_T = N_R = 2. The simulation results are depicted in Figure 7, where three sets of curves are shown for C = 4, 8, and 16. In this configuration, both combining schemes yield quasi-identical performance: the gap between the proposed packet combining techniques is less than 0.7 dB. In terms of implementation cost, since both schemes have quasi-identical performance, the symbol-level combining scheme is the best candidate because it has the lowest memory requirements. We also evaluate multiple-input single-output transmission systems, which are of special interest for downlink radio mobile applications. Figure 8 plots the throughput for N_T = 2 and N_R = 1. The chip-level turbo combining scheme clearly outperforms the symbol-level turbo combining scheme. The performance gap is more than 5 dB for systems with high ICI, i.e., C = 16. For this configuration, chip-level combining requires only 50% more memory than the symbol-level combining scheme and can be chosen as the best candidate. However, when fewer multiplexed codes are used, i.e., C = 4, the performance gap between the proposed schemes is reduced to 1 dB while the complexity gap becomes huge (chip-level combining requires a memory size 12 times greater than the one required by symbol-level combining). In this case, the symbol-level turbo combining scheme becomes the best candidate.

5. Conclusions

In this article, efficient turbo receiver schemes for single-user multi-code CP-CDMA transmission with ARQ operating over a broadband MIMO channel were introduced. The key idea of the proposed schemes is to exploit the diversity among all transmissions at a very low cost by introducing new, recursively computed variables. Two packet combining algorithms were presented. The first algorithm performs packet combining jointly with frequency domain chip-level turbo equalization. The second algorithm performs packet combining jointly with turbo demapping. The complexity evaluation showed that either combining scheme can be the most attractive in terms of implementation cost, depending on the number of transmit antennas, the factor $\frac{N}{C}$, the constellation length, and the number of turbo iterations. Moreover, simulations demonstrated that both schemes have approximately the same performance for balanced MIMO configurations (same number of transmit and receive antennas). Hence, for receiver devices that cannot afford large complexity and storage requirements, it may be preferable to use symbol-level combining instead of chip-level combining. In the case of unbalanced configurations (more transmit than receive antennas), we demonstrated that chip-level combining clearly outperforms symbol-level combining. In that case, the system configuration should be considered before deciding on the best combining scheme.

Endnotes

^aThe short-term static ARQ channel dynamic corresponds to the case where two consecutive ARQ rounds observe independent channel realizations. In long-term static channels, all ARQ rounds corresponding to the same data packet observe the same channel realization.

^bThe MFB curves are obtained for each transmission assuming perfect ICI cancelation and maximum ratio combining (MRC) of all time, space, multipath, and delay diversity branches.

^cIn LLR-level combining, turbo equalization is separately performed for each transmission, and right before SISO decoding, extrinsic LLRs, at transmission k, are simply added together with those obtained at the last iteration of previous transmission k - 1.
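The LLR-level combining rule of endnote c can be sketched as a running sum of extrinsic LLRs across transmissions. This is a minimal illustration, not the simulation code behind the figures: the function names, the example LLR values, and the sign convention (a positive LLR is decided as bit 1) are assumptions, and keeping a running sum over all earlier rounds is one reading of the endnote.

```python
def llr_combine(previous_llrs, current_llrs):
    """Add the extrinsic LLRs of transmission k to those kept from transmission k - 1."""
    if previous_llrs is None:  # first transmission: nothing to combine with
        return list(current_llrs)
    if len(previous_llrs) != len(current_llrs):
        raise ValueError("LLR vectors must have the same length")
    return [p + c for p, c in zip(previous_llrs, current_llrs)]


def hard_decision(llrs):
    """Assumed convention: a positive LLR maps to bit 1, otherwise bit 0."""
    return [1 if value > 0 else 0 for value in llrs]


# Hypothetical extrinsic LLRs obtained at the last turbo iteration of each round.
per_round_llrs = [
    [0.5, -1.0, 0.25],   # transmission 1
    [-1.5, -0.5, 0.75],  # transmission 2 (Chase-type retransmission of the same packet)
]

combined = None
for round_llrs in per_round_llrs:
    combined = llr_combine(combined, round_llrs)
    # The combined LLRs would be passed to the SISO decoder at this point.

print(combined)                 # [-1.0, -1.5, 1.0]
print(hard_decision(combined))  # [0, 0, 1]
```

Because the equalizer of each round works on its own received signal only, this rule adds soft information without enlarging the effective MIMO channel, in contrast to the chip-level and symbol-level schemes.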

References

Peisa J, Wager S, Sagfors M, Torsner J, Goransson B, Fulghum T, Cozzo C, Grant S: High speed packet access evolution-concept and technologies. In Proc 65th IEEE veh tech conf VTC'07 Spring. Dublin, Ireland; 2007:819-824.
Wolniansky PW, Foschini GJ, Valenzuela GD: V-BLAST: An architecture for realizing very high data rates over the rich scattering wireless channel. In Proc Int Symp Signals, Systems, Electron. Pisa, Italy; 1998:295-300.
Harvey BA, Wicker SB: Packet combining system based on the Viterbi decoder. IEEE Trans Commun 1994, 42: 1544-1557. 10.1109/TCOMM.1994.582838
Chih-Lin I, Gitlin RD: Multi-code CDMA wireless personal communications networks. In Proc IEEE Int Conf Commun. Volume 2. Seattle, WA; 1995:1060-1064.
3GPP TS 25.212 v7.8.0, Multiplexing and channel coding (FDD), Release 7 2008.
Adachi F, Sao T, Itagaki T: Performance of multicode DS-CDMA using frequency domain equalisation in frequency selective fading channel. Electron Lett 2003, 39(2):239-241. 10.1049/el:20030160
Lee JK, Lee TJ, Chae HJ, Kim DK: Frequency domain turbo equalization for multicode DS-CDMA in frequency selective fading channel. In Proc, 19th Annual IEEE Symp Personal Indoor Mobile Radio Commun (PIMRC'07). Athens, Greece; 2007:1-5.
Garg D, Adachi F: Packet access using DS-CDMA with frequency-domain equalization. IEEE J Sel Areas Commun 2006, 24(1):161-170.
El Gamal H, Caire G, Damen MO: The MIMO ARQ channel: diversity-multiplexing-delay tradeoff. IEEE Trans Inf Theory 2006, 52(8):3601-3621.
Chuang A, Guillen i Fabregas A, Rasmussen LK, Collings IB: Optimal throughput-diversity-delay tradeoff in MIMO ARQ block-fading channels. IEEE Trans Inf Theory 2008, 54(9):3968-3986.
Ait-Idir T, Saoudi S: Turbo packet combining strategies for the MIMO-ISI ARQ channel. IEEE Trans Commun 2009, 57(12):3782-3793.
Ait-Idir T, Saoudi S: Turbo packet combining for MIMO-ISI channels with co-channel interference. In Proc, 19th Annual IEEE Symp Personal Indoor Mobile Radio Commun (PIMRC'08). Cannes, France; 2008:1-5.
Ait-Idir T, Chafnaji H, Saoudi S: Turbo packet combining for broadband space-time BICM hybrid-ARQ systems with co-channel interference. IEEE Trans Wirel Commun 2010, 9(5):1686-1697.
Narayanan K, Stuber G: A novel ARQ technique using the turbo coding principle. IEEE Commun Lett 1997, 1(3):49-51.
Visoz R, Berthet AO, Chtourou S: Frequency-domain block turbo-equalization for single-carrier transmission over MIMO broadband wireless channel. IEEE Trans Commun 2006, 54(12):2144-2149.
Tonello AM: Space-time bit-interleaved coded modulation with an iterative decoding strategy. In Proc 52th IEEE Veh tech conf VTS-Fall VTC 2000. Volume 1. Boston, USA; 2000:473-478.
Caire G, Tuninetti D: ARQ protocols for the Gaussian collision channel. IEEE Trans Inf Theory 2001, 47(4):1971-1988.
Bahl LR, Cocke J, Jelinek F, Raviv J: Optimal decoding of linear codes for minimizing symbol error rate. IEEE Trans Inf Theory 1974, IT-20: 284-287.

Author information

Authors and Affiliations

INPT, Madinat Al Irfane, Rabat, 10100, Morocco
Houda Chafnaji & Tarik Ait-Idir
TELECOM Bretagne\Labsticc, Technopole Brest-Iroise, CS 83818, 29238, Brest cedex 3, France
Houda Chafnaji, Tarik Ait-Idir & Samir Saoudi
Department of Computer and Telecommunications Engineering, University of Western Macedonia, Kozani, Greece
Athanasios V Vasilakos

Corresponding author

Correspondence to Houda Chafnaji.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Chafnaji, H., Ait-Idir, T., Saoudi, S. et al. Low complexity frequency domain hybrid-ARQ chase combining for broadband MIMO CDMA systems. J Wireless Com Network 2012, 134 (2012). https://doi.org/10.1186/1687-1499-2012-134

Received: 15 May 2011
Accepted: 05 April 2012
Published: 05 April 2012
DOI: https://doi.org/10.1186/1687-1499-2012-134
