Iterative channel estimation and data detection for MIMO-OFDM systems operating in time-frequency dispersive channels under unknown background noise

Zhong, Ke; Lei, Xia; Li, Shaoqian

doi:10.1186/1687-1499-2013-182

Research
Open access
Published: 06 July 2013

EURASIP Journal on Wireless Communications and Networking volume 2013, Article number: 182 (2013)

Abstract

In this paper, the challenging problem of joint channel estimation and data detection for multiple-input multiple-output orthogonal frequency division multiplexing systems operating in time-frequency dispersive channels under unknown background noise is investigated. Based on two different but equivalent signal models, two expectation-maximization algorithm-based iterative schemes for joint data detection and channel and noise variance estimation are proposed. The first scheme jointly detects data and estimates the channel and noise variance, but its computational complexity is high, owing to the simultaneous detection and estimation for all antennas. To reduce the computational complexity, a second scheme is proposed that detects data and estimates the channel for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last values; its performance degrades only slightly compared to the first scheme. Moreover, the estimates in both schemes are obtained as closed-form expressions, and therefore, the proposed schemes are free of exhaustive search. Simulation results demonstrate quick convergence of the proposed algorithms, and after convergence, their performance is close to that of the optimal channel estimation and data detection case, which assumes full training and perfect channel state information.

1 Introduction

Multiple-input multiple-output (MIMO) communication [1] can significantly increase the throughput without additional transmit power or bandwidth. Orthogonal frequency division multiplexing (OFDM) [2] can provide high data rate transmission capability and is robust against multipath (time-dispersive) fading channels. MIMO combined with OFDM (MIMO-OFDM) [3] has been adopted in various international standards such as 3GPP-LTE, WiMAX, and IMT-Advanced.

Meanwhile, vehicles with increased speeds, such as high-speed cars, subways, and trains that exceed 350 km/h, play an increasingly important role in people's lives.

Consequently, mobility support is widely regarded as one of the key features in current and future wireless communication systems. High mobility causes the transmission channel to change rapidly in time, which results in frequency dispersion of the channel. For coherent detection in MIMO-OFDM systems, channel state information (CSI) is indispensable [3].

CSI acquisition is particularly challenging in time-frequency (TF) dispersive channels because channel responses vary sample by sample, and therefore, the number of unknown channel parameters in an OFDM symbol period increases significantly (much greater than in frequency-nondispersive channels). Furthermore, in practical communication scenarios, the knowledge of the power of background noise is required to perform many signal processing algorithms, such as channel estimation [4] and decoding [5] in MIMO-OFDM systems.

In this paper, joint data detection and channel and noise variance estimation for MIMO-OFDM systems operating in TF dispersive channels under unknown background noise are investigated. We employ the expectation-maximization (EM) algorithm [6, 7], which is an iterative numerical method employed to compute the maximum likelihood (ML) estimates, to develop an iterative algorithm to solve this challenging problem.

For MIMO systems, the literature along these lines can be categorized as follows:

EM for channel estimation and data detection assuming the noise variance is known: EM-based joint channel estimation and data detection algorithms are proposed in [8–10] for time-nondispersive and frequency-nondispersive channels (TnDFnD channels) and in [11–13] for time-dispersive and frequency-nondispersive channels (TDFnD channels). However, the maximization step (M-step) for data detection proposed in these papers is not obtained as a closed-form solution, and therefore, a brute-force search over all possibilities is required.

EM for channel and noise variance estimation: In TnDFnD channels, EM-based joint channel and noise variance estimation algorithms are proposed in [14–16]. However, data detection is performed by an extra ML estimator in [14] and by an extra detector that maximizes the a posteriori probabilities (APP) in [15].

In [16], a full training sequence is adopted to perform the proposed EM algorithm, and therefore, no data detection is addressed. In TDFnD channels, EM-based joint channel and noise variance estimation algorithms are proposed in [17–19]. However, data detection is not addressed in these papers.

EM for data detection and noise variance estimation: In TnDFnD channels, an EM-based joint data detection and noise variance estimation algorithm is proposed in [20]. However, the channel estimate is only obtained by pilot symbols and is not included in the EM updating process.

EM only for data detection assuming the noise variance is known: In TnDFnD channels, EM-based data detection algorithms are proposed in [21, 22]. However, channel estimation is not addressed in [21], and the channel is assumed to be perfectly known at the receiver in [22]. In TF dispersive channels, an EM-based data detection algorithm is proposed in [23] to solve a maximum a posteriori probability (MAP) detection problem. However, the data estimate is not given in closed form, and therefore, an exhaustive search is required.

EM only for channel estimation assuming the noise variance is known: In TnDFnD channels, EM-based channel estimation algorithms are proposed in [24–28]. However, the data estimates are obtained by extra MAP estimators in [24–26] and by extra APP estimators in [27, 28]. In TDFnD channels, EM-based channel estimation algorithms are proposed in [29–32]. However, the data estimates are obtained by an extra BI-GDFE detector in [29], a minimum mean-squared error (MMSE) detector in [30], and a trellis-based approach in [31], and data detection is not addressed in [32].

In this paper, based on two different but equivalent signal models, two EM algorithm-based iterative schemes which integrate data detection and channel and noise variance estimation are proposed in a consistent way so as to iteratively improve the system performance.

The first scheme jointly detects data and estimates the channel and noise variance, but its computational complexity is high, owing to the simultaneous detection and estimation for all antennas. To reduce the computational complexity of the first scheme, a second scheme is proposed that performs data detection and channel estimation for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last values; its performance degrades only slightly compared to the first scheme. Furthermore, the estimates of data, channel, and noise variance are all obtained as closed-form results, and therefore, the proposed schemes are free of exhaustive search. Simulation results demonstrate quick convergence of the proposed algorithms, and after convergence, their performance is close to that of the optimal channel estimation and data detection case, which assumes full training and perfect CSI.

The remainder of this paper is organized as follows. The system model for MIMO-OFDM systems operating in TF dispersive channels under unknown background noise is introduced in Section 2.

In Section 3, an EM-based scheme for joint data detection and channel and noise variance estimation is proposed. In Section 4, a reduced complexity EM-based scheme is proposed. Section 5 gives some simulation results that demonstrate the effectiveness of the proposed schemes. Finally, conclusions are drawn in Section 6.

Notation: Matrices and vectors are represented by boldface uppercase and lowercase letters, respectively.

A hat over a variable (e.g., $\hat{x}$) indicates an estimate of the variable. $E \{\cdot\}$ denotes the expectation. Superscripts [ ·]^T, [ ·]⁻¹, and [ ·]^H denote the transpose, the matrix inverse, and the Hermitian transpose, respectively. I_N is the identity matrix of dimension N. diag{x} and Blkdiag{·} stand for the diagonal matrix with vector x on its diagonal and the block diagonal concatenation of the input arguments, respectively. The symbol ⊛ denotes convolution, and ⊗ stands for the Kronecker product. Tr{X} and |X| are the trace and the determinant of a square matrix X, respectively. ℜ{·} is the real part of the element in the bracket. <·>_K denotes the modulo-K operation. The matrix F is the normalized fast Fourier transform (FFT) matrix with ${[F]}_{m, n} = \frac{1}{\sqrt{N}} e^{- j 2 π m n / N}$ .

2 System model

2.1 Transmitted MIMO-OFDM systems with scattered pilots

We consider a MIMO-OFDM system with N_T transmit and N_R receive antennas. For the i th transmit antenna, the time domain signal sⁱ= [ sⁱ(0),sⁱ(1),...,sⁱ(N−1)]^T is generated by taking the N-point inverse FFT of the source signal in the frequency domain xⁱ= [ xⁱ(0),xⁱ(1),...,xⁱ(N−1)]^T as sⁱ=F^Hxⁱ.

In general, the elements of xⁱ can be categorized into:

x^{i} (m) = \begin{cases} x_{d}^{i} (m), & \forall m \in I_{d}^{i} \\ x_{p}^{i} (m), & \forall m \in I_{p}^{i} \end{cases}

(1)

where $I_{d}^{i}$ is the index set of subcarriers allocated for data symbols (with N_d elements), and $I_{p}^{i}$ is the index set of subcarriers allocated for pilot symbols (with N_p elements), respectively. Notice that N=N_d+N_p. From (1), we have $x^{i} = E_{d}^{i} x_{d}^{i} + E_{p}^{i} x_{p}^{i}$ , where $E_{d}^{i}$ and $E_{p}^{i}$ denote the matrices collecting columns of I_N corresponding to $I_{d}^{i}$ and $I_{p}^{i}$ , respectively, and $x_{d}^{i} = {[x_{d}^{i} (0), x_{d}^{i} (1), ..., x_{d}^{i} (N_{d} - 1)]}^{T}$ and $x_{p}^{i} = {[x_{p}^{i} (0), x_{p}^{i} (1), ..., x_{p}^{i} (N_{p} - 1)]}^{T}$ denote the data and pilot vectors, respectively.

A cyclic prefix (CP) with length N_cp larger than that of the longest channel response is inserted at the beginning of each OFDM symbol to prevent intersymbol interference.
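As an illustration of Section 2.1, the following minimal Python sketch maps pilot and data symbols onto subcarriers, applies sⁱ=F^Hxⁱ with the normalized FFT matrix, and prepends the CP. The QPSK alphabet, the pilot positions, the sizes N=8 and N_cp=2, and all function names are assumptions made for this example only.

```python
import cmath
import math

def idft_unitary(x):
    """Return s = F^H x, where [F]_{m,n} = exp(-j*2*pi*m*n/N) / sqrt(N)."""
    n_sub = len(x)
    scale = 1.0 / math.sqrt(n_sub)
    return [
        scale * sum(x[m] * cmath.exp(2j * math.pi * m * n / n_sub) for m in range(n_sub))
        for n in range(n_sub)
    ]

def build_ofdm_symbol(data, pilots, pilot_idx, n_sub, n_cp):
    """Place pilots and data on subcarriers, apply F^H, and prepend the CP."""
    pilot_set = set(pilot_idx)
    data_idx = [m for m in range(n_sub) if m not in pilot_set]
    assert len(data) == len(data_idx) and len(pilots) == len(pilot_idx)
    x = [0j] * n_sub
    for m, value in zip(pilot_idx, pilots):
        x[m] = value
    for m, value in zip(data_idx, data):
        x[m] = value
    s = idft_unitary(x)
    # The CP is a copy of the last n_cp time-domain samples.
    return x, s, s[len(s) - n_cp:] + s

# Hypothetical toy setup: N = 8 subcarriers, pilots on subcarriers 0 and 4.
qpsk = [(1 + 1j) / math.sqrt(2), (1 - 1j) / math.sqrt(2),
        (-1 + 1j) / math.sqrt(2), (-1 - 1j) / math.sqrt(2)]
data = [qpsk[m % 4] for m in range(6)]
x, s, tx = build_ofdm_symbol(data, [1 + 0j, 1 + 0j], [0, 4], n_sub=8, n_cp=2)
```

Because F is unitary, the time-domain vector s carries the same energy as x, which is a convenient sanity check for the scaling.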

2.2 TF dispersive channels under unknown background noise model

At the receive antenna j, assuming perfect timing and frequency synchronization are achieved, the n th sample of the received signal is given by:

y^{j} (n) = \sum_{i = 0}^{N_{T} - 1} h^{ji} (n, l) ⊛ s^{i} (n) + w^{j} (n),

(2)

where h^ji(n,l) is the gain at time n of the l th path of the TF dispersive channel of length L between the i th transmit antenna and the j th receive antenna, and w^j(n) denotes the unknown background noise, which is assumed to follow a complex Gaussian distribution with zero mean and unknown variance σ², the same across all receive antennas.

After discarding the CP and stacking all N samples, the received signal for a whole OFDM symbol at the receive antenna j can be expressed in a vector form as:

y^{j} = \sum_{i = 0}^{N_{T} - 1} H^{ji} s^{i} + w^{j},

(3)

where y^j= [ y^j(0),y^j(1),...,y^j(N−1)]^T and w^j= [ w^j(0),w^j(1),...,w^j(N−1)]^T denote the received signal at the receive antenna j and the corresponding noise, respectively.

H^ji represents the corresponding TF dispersive channel matrix and is expressed as:

H^{ji} = \left[\begin{array}{cccc} h^{ji} (0, 0) & 0 \dots & h^{ji} (0, L - 1) \dots & h^{ji} (0, 1) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ h^{ji} (L - 1, L - 1) & h^{ji} (L - 1, L - 2) \dots & h^{ji} (L - 1, 0) & 0 \dots \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 \dots & h^{ji} (N - 1, L - 1) & h^{ji} (N - 1, L - 2) \dots & h^{ji} (N - 1, 0) \end{array}\right]

(4)

It is observed from (4) that the number of unknowns in H^ji is NL, which is much larger than the number of received samples. Therefore, direct estimation of H^ji is almost impossible (i.e., this will give rise to the identifiability problem).

To overcome this problem, in this paper, a parsimonious (low-dimensional) representation of h^ji(n,l) using the basis expansion model (BEM) [33, 34] is adopted, i.e., each path l of h^ji(n,l) is expanded with respect to time n onto a basis $\{b_{n, q}\}_{q = 0}^{Q}$ as:

h^{ji} (n, l) = \sum_{q = 0}^{Q} β_{q, l}^{ji} b_{n, q},

(5)

where $β_{q, l}^{ji}$ is the q th BEM coefficient of the l th path associated with the channel between the i th transmit antenna and the j th receive antenna, b_n,q is the q th basis function, which captures the channel time variations, and Q+1 is the number of basis functions. The BEM is motivated by the observation that the temporal (n) variation of h^ji(n,l) is usually rather smooth due to the channel’s limited Doppler spread, and therefore $\{b_{n, q}\}_{q = 0}^{Q}$ can be chosen as a small set (i.e., Q≪N) of smooth functions. With this representation, the number of unknowns per transmit-receive antenna pair drops from NL to (Q+1)L.
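To make the dimensionality reduction of (5) concrete, the sketch below synthesizes one time-varying path from Q+1 coefficients and compares the number of unknowns with and without the BEM. The polynomial basis b_n,q=(n/N)^q, the coefficient values, and the sizes are assumptions chosen for illustration; the basis actually used is not specified in this part of the text.

```python
def poly_basis(n_samples, q_order):
    """B[n][q] = (n / N)^q: a hypothetical polynomial BEM basis (assumption)."""
    return [[(n / n_samples) ** q for q in range(q_order + 1)]
            for n in range(n_samples)]

def bem_tap(basis, beta_l):
    """h(n, l) = sum_q beta_{q,l} * b_{n,q} for a single path l, as in (5)."""
    return [sum(b * c for b, c in zip(row, beta_l)) for row in basis]

N, L, Q = 64, 4, 2
B = poly_basis(N, Q)
beta = [0.9 + 0.1j, -0.3 + 0.2j, 0.05j]   # illustrative coefficients for one path
h_l = bem_tap(B, beta)                    # N time samples of that path

unknowns_direct = N * L        # one gain per time sample and per path
unknowns_bem = (Q + 1) * L     # Q + 1 coefficients per path
```

With these toy sizes, the per-link unknowns shrink from 256 to 12, which is what makes the estimation problem identifiable from N received samples per antenna.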

Below, two equivalent expressions for the received signal will be derived, from which closed-form solution for data detection and channel estimation can be obtained, as will be shown in the following sections.

Notice that (3) can be rewritten as:

y^{j} = \sum_{i = 0}^{N_{T} - 1} G [s^{i}] h^{ji} + w^{j},

(6)

where $G [s^{i}] = [diag \{s_{cs, 0}^{i}\}, diag \{s_{cs, 1}^{i}\}, ..., diag \{s_{cs, L - 1}^{i}\}]$ with $s_{cs, l}^{i}$ denoting sⁱ cyclically shifted (cs) by l positions and $h^{ji} = {[{(h_{0}^{ji})}^{T}, {(h_{1}^{ji})}^{T}, ..., {(h_{L - 1}^{ji})}^{T}]}^{T}$ with $h_{l}^{ji} = {[h^{ji} (0, l), h^{ji} (1, l), ..., h^{ji} (N - 1, l)]}^{T}$ . Equation (6) can be put into a more compact form as:

y^{j} = G [s] h^{j} + w^{j},

(7)

where $G [s] = [G [s^{0}], G [s^{1}], ..., G [s^{N_{T} - 1}]]$ and $h^{j} = {[{(h^{j 0})}^{T}, {(h^{j 1})}^{T}, ..., {(h^{j (N_{T} - 1)})}^{T}]}^{T}$ . Using (5), $h_{l}^{ji}$ can be expressed in a vector form as

h_{l}^{ji} = B β_{l}^{ji},

(8)

where B= [ b₀,b₁,...,b_Q] with b_q= [ b_0,q,b_1,q,...,b_N−1,q]^T and $β_{l}^{ji} = {[β_{0, l}^{ji}, β_{1, l}^{ji} ..., β_{Q, l}^{ji}]}^{T}$ . Substituting (8) into (7), we obtain:

y^{j} = G [s] (I_{N_{T} L} \otimes B) β^{j} + w^{j},

(9)

where

β^{j} = {[{(β^{j 0})}^{T}, {(β^{j 1})}^{T}, ..., {(β^{j (N_{T} - 1)})}^{T}]}^{T}

with $β^{ji} = {[{(β_{0}^{ji})}^{T}, {(β_{1}^{ji})}^{T}, ..., {(β_{L - 1}^{ji})}^{T}]}^{T}$ . By stacking the received signals from all N_R receive antennas into a single vector using (9) and (3), two equivalent expressions of the received signal, which explicitly show the dependence on the unknown BEM coefficients and on the unknown signal, respectively, can be obtained as:

\begin{align} y & = Ξ [s] β + w \end{align}

(10a)

\begin{align} = Θ [β] s + w, \end{align}

(10b)

where $Ξ [s] = I_{N_{R}} \otimes (G [s] (I_{N_{T} L} \otimes B))$ , $β = {[{(β^{0})}^{T}, {(β^{1})}^{T}, \dots, {(β^{N_{R} - 1})}^{T}]}^{T}$ , $Θ [β] = [H^{0}, H^{1}, \dots, H^{N_{T} - 1}]$ with $H^{i} = {[{(H^{0 i})}^{H}, {(H^{1 i})}^{H}, \dots, {(H^{(N_{R} - 1) i})}^{H}]}^{H}$ , y= [ (y⁰)^T,(y¹)^T,…, ${(y^{N_{R} - 1})}^{T}]^{T}$ , $s = {[{(s^{0})}^{T}, {(s^{1})}^{T}, \dots, {(s^{N_{T} - 1})}^{T}]}^{T}$ and $w = {[{(w^{0})}^{T}, {(w^{1})}^{T}, \dots, {(w^{N_{R} - 1})}^{T}]}^{T}$ . Notice that Ξ[ s] represents a function of s and can be reconstructed by s through (6), (7), (8) and (9). Similarly, Θ[β] represents a function of β and can be reconstructed by β through (3), (4) and (5).
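The equivalence between the channel-matrix form (3) and the signal-matrix form (6) can be checked numerically. The sketch below does so for a single transmit-receive pair without noise; the channel gains, the signal values, and the helper names are hypothetical.

```python
def channel_matrix(h, n_samples):
    """Build H as in (4): row n holds h(n, l) in column (n - l) mod N."""
    H = [[0j] * n_samples for _ in range(n_samples)]
    for n in range(n_samples):
        for l, gain in enumerate(h[n]):
            H[n][(n - l) % n_samples] += gain
    return H

def apply_H(H, s):
    """Matrix-vector product H s."""
    return [sum(a * b for a, b in zip(row, s)) for row in H]

def apply_G(s, h, n_samples, n_paths):
    """G[s] h: sum over l of diag{s cyclically shifted by l} times h_l."""
    return [
        sum(s[(n - l) % n_samples] * h[n][l] for l in range(n_paths))
        for n in range(n_samples)
    ]

N, L = 6, 2
# h[n][l]: hypothetical gains that change with the sample index n.
h = [[1.0 + 0.1j * n, 0.5 - 0.05j * n] for n in range(N)]
s = [complex(n + 1, -n) for n in range(N)]

y_from_H = apply_H(channel_matrix(h, N), s)   # model (3): y = H s
y_from_G = apply_G(s, h, N, L)                # model (6): y = G[s] h
```

Both routes produce the same vector, which is why the same observation can be treated as linear in the channel (for estimation) or linear in the signal (for detection).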

3 Iterative data detection and channel and noise variance estimation

The ML solution of all unknown quantities in (??), i.e., s, β, and the variance σ² of w, involves multidimensional searches that incur prohibitively high computational complexity. In this and the next section, the EM algorithm is employed to iteratively compute the ML estimates, with different accuracy-versus-complexity trade-offs. As will be seen, the proposed schemes are not only computationally affordable but also yield closed-form solutions that are free of exhaustive search.

Using the EM terminology, we take y as the incomplete data, β as the unobservable or missing data, and (σ², s) as parameters of interest. The iterative algorithm includes two steps (the E-step and the M-step) at each iteration. In the E-step, an expectation is taken with respect to β conditional on the observed data y and the previous estimates of (σ², s), and an objective function depending only on (σ², s) is obtained. In the M-step, through maximizing the function obtained in the E-step, the effect of channel can be compensated, and the current updated estimates of (σ², s) can be obtained.

The two steps at the k th iteration are detailed as follows:

E-step: compute $ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) = E {log f (y, β | σ^{2}, s) | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}$ .

M-step: solve $({\hat{σ}}_{k}^{2}, {\hat{s}}_{k}) = arg max_{σ^{2}, s} ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ .

Note that conditioned upon y, the only unknown or random component in the complete data (y,β) is β; the expectation is therefore taken with respect to the conditional probability density function $f (β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ , while $({\hat{σ}}_{k}^{2}, {\hat{s}}_{k})$ are the estimates of σ² and s at the k th iteration.

More specifically, for the E-step: using Bayes’s rule, we have:

f (y, β | σ^{2}, s) = f (y | β, σ^{2}, s) f (β),

(11)

where the fact that β is independent of s and σ² has been used. From (11), the function $ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ in the E-step can be expressed as:

\begin{array}{l} ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) = E {log f (y | β, σ^{2}, s) | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}} \\ + E {log f (β) | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}, \end{array}

(12)

where the second term can be ignored in the following derivations, since it is not a function of parameters of interest, i.e., not a function of (σ², s) and therefore will not affect the following M-step. Using (10a), the likelihood function f(y|β,σ²,s) is obtained as:

\begin{array}{l} f (y | β, σ^{2}, s) = \frac{1}{{(π σ^{2})}^{N_{R} N}} \\ \times exp (- \frac{1}{σ^{2}} {(y - Ξ [s] β)}^{H} (y - Ξ [s] β)) . \end{array}

(13)

Substituting (13) into (12), we have:

\begin{array}{l} ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) \propto - N N_{R} log (π σ^{2}) \\ - \frac{1}{σ^{2}} (y^{H} y - 2 ℜ {y^{H} Ξ [s] E {β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}} \\ + E {β^{H} Ξ^{H} [s] Ξ [s] β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}) . \end{array}

(14)

Notice that the following equation holds true for any matrix A and vector a with compatible dimensions:

a^{H} A^{H} A a = Tr {A^{H} A a a^{H}} .

(15)
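Identity (15) is what allows the quadratic term in (14) to be written with the second moment of β. A quick numerical check is sketched below; the matrix and vector entries are arbitrary illustrative values, and the helper functions are not part of the paper.

```python
def ctranspose(M):
    """Hermitian (conjugate) transpose of a matrix stored as a list of rows."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[1 + 2j, 0.5j], [-1j, 2 + 0j], [0.3 + 0j, 1 - 1j]]   # 3 x 2 matrix
a = [[1 - 1j], [0.5 + 2j]]                                 # 2 x 1 column vector

gram = matmul(ctranspose(A), A)                        # A^H A
lhs = matmul(matmul(ctranspose(a), gram), a)[0][0]     # a^H A^H A a
rhs = trace(matmul(gram, matmul(a, ctranspose(a))))    # Tr{A^H A a a^H}
```

Both sides agree up to rounding and are real and non-negative, since they equal the squared norm of A a.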

Define the conditional mean of β in (14) as ${\hat{β}}_{k} = E {β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}$ , and using (15), we obtain:

\begin{array}{l} ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) \propto - N N_{R} log (π σ^{2}) \\ - \frac{1}{σ^{2}} (y^{H} y - 2 ℜ {y^{H} Ξ [s] {\hat{β}}_{k}} \\ + Tr {Ξ^{H} [s] Ξ [s] ({\hat{Υ}}_{k} + {\hat{β}}_{k} {\hat{β}}_{k}^{H})}), \end{array}

(16)

where ${\hat{Υ}}_{k} = E {(β - {\hat{β}}_{k}) {(β - {\hat{β}}_{k})}^{H} | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}}$ represents the corresponding conditional covariance matrix of β. It is shown in Appendix 1 that the conditional mean and covariance matrix are approximately given by:

{\hat{β}}_{k} = {(Ξ^{H} [{\hat{s}}_{k - 1}] Ξ [{\hat{s}}_{k - 1}])}^{- 1} Ξ^{H} [{\hat{s}}_{k - 1}] y,

(17)

{\hat{Υ}}_{k} = {\hat{σ}}_{k - 1}^{2} {(Ξ^{H} [{\hat{s}}_{k - 1}] Ξ [{\hat{s}}_{k - 1}])}^{- 1} .

(18)
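The E-step quantities (17) and (18) amount to a least-squares fit plus a scaled inverse Gram matrix. The sketch below evaluates them with a small Gauss-Jordan solver in pure Python; the 4×2 matrix standing in for Ξ[ŝ_{k−1}], the coefficient values, and the function names are hypothetical, and a full-column-rank Ξ is assumed.

```python
def ctranspose(M):
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def solve(M, rhs):
    """Solve M X = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[r]) + list(rhs[r]) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def e_step(Xi, y, sigma2_prev):
    """Return (beta_hat, upsilon_hat) following (17) and (18)."""
    XiH = ctranspose(Xi)
    gram = matmul(XiH, Xi)
    size = len(gram)
    identity = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    gram_inv = solve(gram, identity)                     # (Xi^H Xi)^{-1}
    beta_hat = matmul(gram_inv, matmul(XiH, y))          # (17)
    upsilon_hat = [[sigma2_prev * v for v in row] for row in gram_inv]   # (18)
    return beta_hat, upsilon_hat

# Hypothetical 4 x 2 stand-in for Xi[s_hat] and a noise-free observation.
Xi = [[1 + 0j, 1j], [1j, 1 + 0j], [2 + 0j, 0j], [0j, 1 - 1j]]
beta_true = [[0.5 - 0.5j], [1 + 2j]]
y = matmul(Xi, beta_true)
beta_hat, upsilon_hat = e_step(Xi, y, sigma2_prev=0.1)
```

In the noise-free case the fit returns the true coefficients, while the covariance shrinks as the Gram matrix grows, i.e., as more signal energy is available for estimation.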

M-step: in this step, we aim to maximize $ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ with respect to σ² and s. Differentiating (16) with respect to s and setting the result to zero, neglecting those irrelevant terms we have:

\begin{array}{l} \frac{∂ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})}{\partial s} \\ \propto \frac{\partial}{\partial s} \{2 ℜ {y^{H} Ξ [s] {\hat{β}}_{k}} - Tr {Ξ^{H} [s] Ξ [s] ({\hat{Υ}}_{k} + {\hat{β}}_{k} {\hat{β}}_{k}^{H})}\} . \end{array}

(19)

Since (19) depends on s only implicitly through Ξ[ s], direct maximization with respect to s is difficult because a multidimensional search is required. In what follows, an alternative expression for $ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ will be derived from which a closed-form solution for the maximizing value of s can be obtained. Since ${\hat{Υ}}_{k}$ is an N_TN_R(Q+1)L×N_TN_R(Q+1)L Hermitian matrix, based on eigendecomposition, we have ${\hat{Υ}}_{k} = \sum_{m = 0}^{N_{T} N_{R} (Q + 1) L - 1} λ_{m, k} μ_{m, k} μ_{m, k}^{H}$ , where λ_m,k is the m th eigenvalue of ${\hat{Υ}}_{k}$ , and μ_m,k is the m th eigenvector, associated with λ_m,k. Substituting the eigendecomposition of ${\hat{Υ}}_{k}$ into (19) and using the two equivalent equations derived in (10a) and (10b), we have:

\begin{array}{l} 2 ℜ {y^{H} Ξ [s] {\hat{β}}_{k}} - Tr {Ξ^{H} [s] Ξ [s] ({\hat{Υ}}_{k} + {\hat{β}}_{k} {\hat{β}}_{k}^{H})} \\ = y^{H} Θ [{\hat{β}}_{k}] s + s^{H} Θ^{H} [{\hat{β}}_{k}] y - \sum_{m = 0}^{N_{T} N_{R} (Q + 1) L - 1} λ_{m, k} s^{H} Θ^{H} [μ_{m, k}] \\ \times Θ [μ_{m, k}] s - s^{H} Θ^{H} [{\hat{β}}_{k}] Θ [{\hat{β}}_{k}] s . \end{array}

(20)

Notice that Ξ[ ·] and Θ[ ·] defined in (10a) and (10b) are not only applicable to s and β but also applicable to any vectors with compatible dimension.

Since (20) is a quadratic form of s, by setting the first derivative of (20) with respect to s to zero, the k th signal estimate is then given by:

\begin{align} {\tilde{s}}_{k} = {\left(\sum_{m = 0}^{N_{T} N_{R} (Q + 1) L - 1} λ_{m, k} Θ^{H} [μ_{m, k}] Θ [μ_{m, k}] + Θ^{H} [{\hat{β}}_{k}] Θ [{\hat{β}}_{k}]\right)}^{- 1} Θ^{H} [{\hat{β}}_{k}] y . \end{align}

(21)

Note that ${\tilde{s}}_{k} = {[{({\tilde{s}}_{k}^{0})}^{T}, {({\tilde{s}}_{k}^{1})}^{T}, ..., {({\tilde{s}}_{k}^{N_{T} - 1})}^{T}]}^{T}$ . After OFDM demodulation, the symbol from the i th transmit antenna can be obtained as:

{\tilde{x}}_{k}^{i} = F {\tilde{s}}_{k}^{i} .

(22)

Since the transmitted symbols are discrete, i.e., each belongs to the symbol constellation, every element of ${\tilde{x}}_{k}^{i}$ must be quantized to its nearest constellation point in each iteration. Consequently, constellation mapping is carried out to obtain the discrete symbol estimate as ${\hat{x}}_{k}^{i} = Qant \{{\tilde{x}}_{k}^{i}\}$ , where the Qant{·} operation denotes quantization of the elements in the bracket. The data symbol estimate is thus obtained by collecting the elements of ${\hat{x}}_{k}^{i}$ corresponding to $I_{d}^{i}$ .
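The quantization step can be sketched as a nearest-neighbour search over the constellation. A unit-energy QPSK alphabet is assumed here purely for illustration, and the function name is hypothetical.

```python
import math

# Assumed constellation: unit-energy QPSK.
QPSK = [complex(re, im) / math.sqrt(2) for re in (1, -1) for im in (1, -1)]

def quantize(soft_symbols, constellation=QPSK):
    """Map each soft estimate to the nearest constellation point (Euclidean distance)."""
    return [min(constellation, key=lambda c: abs(z - c)) for z in soft_symbols]

soft = [0.9 + 0.4j, -0.2 - 1.3j, 0.6 - 0.7j]   # illustrative soft estimates
hard = quantize(soft)
```

For larger alphabets the same search applies; only the `constellation` list changes.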

Finally, putting ${\hat{s}}_{k}^{i} = F^{H} {\hat{x}}_{k}^{i}, i = 0, 1, ..., N_{T} - 1$ into (16) and setting the first derivative of $ℚ (σ^{2}, s | {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ with respect to σ² to zero, the k th estimate of the unknown noise variance can be obtained as

\begin{array}{l} {\hat{σ}}_{k}^{2} = \frac{1}{N N_{R}} (y^{H} y - 2 ℜ {y^{H} Ξ [{\hat{s}}_{k}] {\hat{β}}_{k}} \\ + Tr {Ξ^{H} [{\hat{s}}_{k}] Ξ [{\hat{s}}_{k}] ({\hat{Υ}}_{k} + {\hat{β}}_{k} {\hat{β}}_{k}^{H})}) . \end{array}

(23)

In summary, starting from a suitable initial value, the proposed iterative EM-based scheme alternates among the explicitly closed-form results (17), (18), (21), (22), and (23) until convergence, i.e., until no significant changes are observed in the updates.
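To show how the updates chain together, the sketch below runs the loop for a heavily simplified special case: one transmit and one receive antenna, a single time-invariant path (L=1, Q=0), and detection directly in the time domain (the FFT in (22) is omitted). In that case Ξ[ s] reduces to the vector s and Θ[β] to β times the identity, so (17), (18), (21), and (23) become scalar formulas. The QPSK alphabet, the fixed pseudo-noise sequence, the two pilot positions, and all names are assumptions of this toy example, not the paper's simulation setup.

```python
import cmath
import math

QPSK = [complex(re, im) / math.sqrt(2) for re in (1, -1) for im in (1, -1)]

def quantize(z):
    return min(QPSK, key=lambda c: abs(z - c))

def em_scalar(y, s_init, sigma2_init, n_iter=5):
    """Toy EM loop for y[n] = beta * s[n] + w[n] (scalar stand-in for (17)-(23))."""
    s_hat, sigma2 = list(s_init), sigma2_init
    n = len(y)
    for _ in range(n_iter):
        # E-step, cf. (17)-(18): conditional mean and variance of the channel gain.
        energy = sum(abs(v) ** 2 for v in s_hat)
        beta_hat = sum(v.conjugate() * r for v, r in zip(s_hat, y)) / energy
        upsilon = sigma2 / energy
        # M-step for the data, cf. (21): the channel uncertainty regularizes the equalizer.
        gain = beta_hat.conjugate() / (upsilon + abs(beta_hat) ** 2)
        s_hat = [quantize(gain * r) for r in y]
        # M-step for the noise variance, cf. (23).
        energy = sum(abs(v) ** 2 for v in s_hat)
        cross = sum(r.conjugate() * v for r, v in zip(y, s_hat)) * beta_hat
        sigma2 = (sum(abs(r) ** 2 for r in y) - 2 * cross.real
                  + energy * (upsilon + abs(beta_hat) ** 2)) / n
    return beta_hat, s_hat, sigma2

beta_true = 0.8 + 0.6j
s_true = [QPSK[(3 * n) % 4] for n in range(8)]
noise = [0.05 * cmath.exp(2j * n) for n in range(8)]   # fixed pseudo-noise, reproducible
y = [beta_true * s + w for s, w in zip(s_true, noise)]

# Initialization from two pilots (positions 0 and 1 assumed known), then hard decisions.
beta0 = sum(s_true[n].conjugate() * y[n] for n in (0, 1)) / 2
s0 = [quantize(beta0.conjugate() * r / abs(beta0) ** 2) for r in y]
beta_hat, s_hat, sigma2 = em_scalar(y, s0, sigma2_init=1.0)
```

Even with a deliberately poor initial noise variance, the variance estimate contracts towards the residual power within a few iterations in this toy case, while the symbol decisions stay fixed.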

4 A reduced computational complexity scheme

The computational complexity of the EM-based iterative scheme proposed in Section 3 is summarized in Table 1. Notice that the computational burden mainly comes from performing joint detection and estimation simultaneously for all transmit antennas. If, in each iteration, detection and estimation can be completed one antenna at a time, the computational burden will be significantly reduced.

Table 1 Computational complexity of the proposed scheme in Section 3

Recalling (6) and (3), two alternative but equivalent expressions for y can be derived as:

\begin{align} y & = Ξ [s] β + w = \sum_{i = 0}^{N_{T} - 1} Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i} + w \end{align}

(24a)

\begin{align} = Θ [β] s + w = \sum_{i = 0}^{N_{T} - 1} H [β_{rc}^{i}] s^{i} + w, \end{align}

(24b)

where $Φ [s^{i}] = I_{N_{R}} \otimes G [s^{i}]$ , $H [β_{rc}^{i}] = {[{(H^{0 i})}^{H}, {(H^{1 i})}^{H}, ..., {(H^{(N_{R} - 1) i})}^{H}]}^{H}$ , and the BEM coefficients associated with the channel from transmit antenna i to the N_R receive antennas are represented by $β_{rc}^{i} = {[{(β^{0 i})}^{T}, {(β^{1 i})}^{T}, ..., {(β^{(N_{R} - 1) i})}^{T}]}^{T}$ with $β^{ji} = {[{(β_{0}^{ji})}^{T}, {(β_{1}^{ji})}^{T}, ..., {(β_{L - 1}^{ji})}^{T}]}^{T}$ . The subscript ‘rc’ is short for ‘reduced complexity’ to distinguish it from the β^j defined in Section 3, which represents the BEM coefficients associated with the channel from the N_T transmit antennas to the receive antenna j.

From (24a) and (24b), it is observed that by applying the mathematical framework of EM, an alternative way to choose the complete data, defined as ψ in this scheme, is by decomposing the observed data y into its signal components. The complete data ψ is obtained as:

\begin{align} ψ & = [\begin{array}{c} ψ^{0} \\ ψ^{1} \\ ⋮ \\ ψ^{N_{T} - 1} \end{array}] \\ = [\begin{array}{c} Φ [s^{0}] (I_{N_{R} L} \otimes B) β_{rc}^{0} \\ Φ [s^{1}] (I_{N_{R} L} \otimes B) β_{rc}^{1} \\ ⋮ \\ Φ [s^{N_{T} - 1}] (I_{N_{R} L} \otimes B) β_{rc}^{N_{T} - 1} \end{array}] + [\begin{array}{c} w^{0} \\ w^{1} \\ ⋮ \\ w^{N_{T} - 1} \end{array}] \end{align}

(25a)

\begin{align} = [\begin{array}{c} H [β_{rc}^{0}] s^{0} \\ H [β_{rc}^{1}] s^{1} \\ ⋮ \\ H [β_{rc}^{N_{T} - 1}] s^{N_{T} - 1} \end{array}] + [\begin{array}{c} w^{0} \\ w^{1} \\ ⋮ \\ w^{N_{T} - 1} \end{array}], \end{align}

(25b)

where $ψ^{i} = Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i} + w^{i} = H [β_{rc}^{i}] s^{i} + w^{i}, i = 0, 1, ..., N_{T} - 1,$ and wⁱ,i=0,1,...,N_T−1 are circularly symmetric and statistically independent Gaussian vectors satisfying $w = \sum_{i = 0}^{N_{T} - 1} w^{i}$ .

Similar to the E-step in Section 3, for the k th iteration, we need to compute the conditional expectation of the log-likelihood function for the complete data ψ. More specifically, for the E-step: using (25a), the likelihood function can be expressed as:

\begin{array}{l} f (ψ | β_{rc}, s) = \frac{1}{{(π)}^{N N_{R} N_{T}} | Υ_{ψ} |} \\ \times exp (- {(ψ - ϱ)}^{H} Υ_{ψ}^{- 1} (ψ - ϱ)), \end{array}

(26)

where $β_{rc} = {[{(β_{rc}^{0})}^{T}, {(β_{rc}^{1})}^{T}, ..., {(β_{rc}^{N_{T} - 1})}^{T}]}^{T}$ , $Υ_{ψ} = Blkdiag {ς^{0} σ^{2} I_{N N_{R}}, ς^{1} σ^{2} I_{N N_{R}}, ..., ς^{N_{T} - 1} σ^{2} I_{N N_{R}}}$ with $E {w^{i} {(w^{i})}^{H}} = ς^{i} σ^{2} I_{N N_{R}}$ and the ςⁱ’s being arbitrary non-negative and real-valued scalars satisfying $\sum_{i = 0}^{N_{T} - 1} ς^{i} = 1$ , $ϱ = {[{(Φ [s^{0}] (I_{N_{R} L} \otimes B) β_{rc}^{0})}^{T}, {(Φ [s^{1}] (I_{N_{R} L} \otimes B) β_{rc}^{1})}^{T}, ..., {(Φ [s^{N_{T} - 1}] (I_{N_{R} L} \otimes B) β_{rc}^{N_{T} - 1})}^{T}]}^{T}$ . Notice that in this scheme, we take (β_rc, s) as parameters of interest. Using (26) and neglecting those irrelevant terms, $E {log f (ψ | β_{rc}, s) | y, {\hat{β}}_{rc, k - 1}, {\hat{s}}_{k - 1}}$ can be expressed as:

\begin{array}{l} E {log f (ψ | β_{rc}, s) | y, {\hat{β}}_{rc, k - 1}, {\hat{s}}_{k - 1}} \\ \propto ϱ^{H} Υ_{ψ}^{- 1} {\hat{ψ}}_{k} + {\hat{ψ}}_{k}^{H} Υ_{ψ}^{- 1} ϱ - ϱ^{H} Υ_{ψ}^{- 1} ϱ, \end{array}

(27)

where ${\hat{ψ}}_{k}$ , the conditional mean of ψ, can be derived as:

\begin{array}{l} {\hat{ψ}}_{k} = E {ψ | y, {\hat{β}}_{rc, k - 1}, {\hat{s}}_{k - 1}} \\ = {\hat{ϱ}}_{k - 1} + Υ_{ψ} ℧^{H} {(℧ Υ_{ψ} ℧^{H})}^{- 1} (y - ℧ {\hat{ϱ}}_{k - 1}), \end{array}

(28)

where $℧ = [I_{N N_{R}}, I_{N N_{R}}, ..., I_{N N_{R}}]$ is a matrix with dimension N N_R×N N_RN_T that connects y and ψ as $y = \sum_{i = 0}^{N_{T} - 1} ψ^{i} = ℧ ψ$ . Substituting the corresponding components into the right-hand side of (28), after some manipulations we obtain:

{\hat{ψ}}_{k} = [\begin{array}{c} Φ [{\hat{s}}_{k - 1}^{0}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{0} + ς^{0} (y - \sum_{i = 0}^{N_{T} - 1} Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i}) \\ Φ [{\hat{s}}_{k - 1}^{1}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{1} + ς^{1} (y - \sum_{i = 0}^{N_{T} - 1} Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i}) \\ ⋮ \\ Φ [{\hat{s}}_{k - 1}^{N_{T} - 1}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{N_{T} - 1} + ς^{N_{T} - 1} (y - \sum_{i = 0}^{N_{T} - 1} Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i}) \end{array}] .

(29)

Substituting (29) into (27), finally we obtain:

\begin{array}{l} E {log f (ψ | β_{rc}, s) | y, {\hat{β}}_{rc, k - 1}, {\hat{s}}_{k - 1}} \\ \propto - \sum_{i = 0}^{N_{T} - 1} {({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})}^{H} \\ \times ({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i}), \end{array}

(30)

where

\begin{array}{l} {\hat{ψ}}_{k}^{i} = Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i} \\ + ς^{i} (y - \sum_{i = 0}^{N_{T} - 1} Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i}) . \end{array}

(31)
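The structure of (31) is easy to see in code: each component keeps its previous reconstruction and receives a share ςⁱ of the overall residual. The sketch below uses two hypothetical antenna contributions and an illustrative observation; the function and variable names are not from the paper.

```python
def split_observation(y, components, weights):
    """psi_hat^i = rho_hat^i + weight_i * (y - sum_g rho_hat^g), cf. (31)."""
    assert abs(sum(weights) - 1.0) < 1e-12, "weights must sum to one"
    total = [sum(vals) for vals in zip(*components)]
    residual = [obs - tot for obs, tot in zip(y, total)]
    return [[c + w * r for c, r in zip(comp, residual)]
            for comp, w in zip(components, weights)]

# Hypothetical reconstructed contributions of N_T = 2 transmit antennas.
rho_hat = [[1 + 1j, 2 - 1j, 0.5j], [-1 + 0j, 0.5 + 0.5j, 1 - 2j]]
y = [0.2 + 1.1j, 2.4 - 0.6j, 1.1 - 1.4j]

# All of the residual is assigned to antenna 0; antenna 1 is left untouched.
psi_hat = split_observation(y, rho_hat, weights=[1.0, 0.0])
```

Because the weights sum to one, the components always add up to the observation y. With all weight on one antenna, that antenna sees y minus the reconstructed contributions of the others, which is the interference-cancellation view of the scheme.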

It is noted from (30) that in the following M-step, the maximization of $E {log f (ψ | β_{rc}, s) | y, {\hat{β}}_{rc, k - 1}, {\hat{s}}_{k - 1}}$ with respect to β_rc and s is equivalent to the minimization of each of the single terms in (30), i.e., minimization of ${({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})}^{H} ({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})$ with respect to $β_{rc}^{i}$ and sⁱ for each i, separately.

Notice that the multidimensional minimization for each of the terms in (30) still remains a formidable task. To solve this problem, substituting (24a) and (24b) into (31), we obtain:

\begin{align} {\hat{ψ}}_{k}^{i} & = ς^{i} Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i} + χ_{k}^{i} \end{align}

(32a)

\begin{align} & = ς^{i} H [β_{rc}^{i}] s^{i} + χ_{k}^{i}, \end{align}

(32b)

where

\begin{array}{l} χ_{k}^{i} = Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i} \\ + ς^{i} (\sum_{g = 0, g \neq i}^{N_{T} - 1} Φ [s^{g}] (I_{N_{R} L} \otimes B) β_{rc}^{g} + w \\ - \sum_{i = 0}^{N_{T} - 1} Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{i}) . \end{array}

(33)

Set ςⁱ=1, for the i th transmit antenna we have:

\begin{align} {\hat{ψ}}_{k}^{i} & = Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i} + χ_{k}^{i} \end{align}

(34a)

\begin{align} = H [β_{rc}^{i}] s^{i} + χ_{k}^{i} . \end{align}

(34b)

Recalling that $\sum_{i = 0}^{N_{T} - 1} ς^{i} = 1$ , and by using (31), for the transmit antennas ${g}_{g = 0, g \neq i}^{N_{T} - 1}$ we have:

{\hat{ψ}}_{k}^{g} = Φ [{\hat{s}}_{k - 1}^{g}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k - 1}^{g} .

(35)

Using (??) and (35), (30) can be decomposed into N_T terms, each of which can be solved as follows:

M-step: for the i th transmit antenna, we have:

\begin{array}{l} [{\hat{β}}_{rc, k}^{i}, {\hat{s}}_{k}^{i}] = arg \min_{β_{rc}^{i}, s^{i}} \{{({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})}^{H} \\ \times ({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})\}, \end{array}

(36)

and for the transmit antennas ${g}_{g = 0, g \neq i}^{N_{T} - 1}$ , we have:

[{\hat{β}}_{rc, k}^{g}, {\hat{s}}_{k}^{g}] = [{\hat{β}}_{rc, k - 1}^{g}, {\hat{s}}_{k - 1}^{g}] .

(37)

Therefore, the proposed scheme iterates over k=0,1,2,..., and during the k th iteration, the active antenna index is set as $i = < k >_{N_{T}}$ , so that the transmit antennas are updated cyclically. It can be seen that we have split the estimation and detection problem for the MIMO case of Section 3 into estimation and detection problems for N_T single-input multiple-output (SIMO) cases, where, during each iteration, the parameters and data of only one transmit antenna are estimated and detected. Note that $χ_{k}^{i}$ given in (33) is a disturbance term that accounts for the background noise and the residual interference after the k th iteration, where the interference is linearly related to the signals of all transmit antennas. Then, assuming the interference is i.i.d. with zero mean, it follows from the central limit theorem [35] that the entries of $χ_{k}^{i}$ are nearly Gaussian distributed with zero mean and some variance $σ_{χ^{i}, k}^{2}$ . Under the above assumption, the minimization problem in (36) is equivalent to the ML estimation of $β_{rc}^{i}$ , sⁱ, and the unknown variance $σ_{χ^{i}, k}^{2}$ from the observation ${\hat{ψ}}_{k}^{i}$ . Comparing (34a) and (34b) with (10a) and (10b), it is easy to see that the same EM procedure proposed in Section 3 can be directly adopted to solve the optimization problem of (36), with details shown in Appendix 2.
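The antenna selection rule and the hold step (37) can be sketched as a simple round-robin loop. The per-antenna state is replaced by an update counter here, and the `update` callback is a hypothetical placeholder for the single-antenna EM update of (36).

```python
def active_antenna(k, n_tx):
    """Antenna updated at iteration k: i = <k>_{N_T}, i.e., k modulo N_T."""
    return k % n_tx

def run_schedule(n_tx, n_iter, update):
    """Update one antenna per iteration; all others keep their last value, cf. (37)."""
    state = [0] * n_tx   # stand-in for (beta_rc^i, s^i); here just an update counter
    order = []
    for k in range(n_iter):
        i = active_antenna(k, n_tx)
        state[i] = update(state[i])   # only antenna i changes in iteration k
        order.append(i)
    return order, state

order, state = run_schedule(n_tx=3, n_iter=7, update=lambda v: v + 1)
```

With three antennas and seven iterations, antenna 0 is visited three times and the others twice, so roughly N_T iterations of this scheme are needed to refresh every antenna once.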

The computationally feasible EM scheme is summarized as follows:

The computational complexity of the proposed iterative EM-based scheme with reduced complexity is summarized in Table 2.

Table 2 Computational complexity of the proposed scheme in Section 4


Note that compared to Table 1, the computational complexity of the proposed reduced-complexity iterative EM-based scheme is significantly lower than that of the EM-based scheme proposed in Section 3. However, this reduction comes at a price. As will be shown in Section 5, there is a minor performance degradation compared to the EM-based scheme proposed in Section 3. This degradation is due mainly to two reasons. First, the disturbance term in (34a) and (34b) contains the background noise as well as the residual interference from other transmit antennas, whereas the disturbance term in (10a) and (10b) contains only the background noise. Second, separate estimation and detection for each antenna is suboptimal compared to joint estimation and detection for all antennas, which is optimal in the sense of estimation and detection theory [36].

4.1 Initialization

The EM algorithm is guaranteed to obtain at least a local maximum after convergence [6, 7].

To provide an initial value, a least squares (LS) algorithm based on pilot symbols is used; as the simulations demonstrate, it provides a good initial estimate. Recalling (1), (10a), and (10b), we have:

y = Ξ [Ω_{p} x_{p}] β + Θ [β] Ω_{d} x_{d} + w,

(38)

where $Ω_{p} = Blkdiag \{F^{H} E_{p}^{0}, F^{H} E_{p}^{1}, ..., F^{H} E_{p}^{N_{T} - 1}\}$ , $Ω_{d} = Blkdiag \{F^{H} E_{d}^{0}, F^{H} E_{d}^{1}, ..., F^{H} E_{d}^{N_{T} - 1}\}$ , $x_{p} = {[{(x_{p}^{0})}^{T}, {(x_{p}^{1})}^{T}, ..., {(x_{p}^{N_{T} - 1})}^{T}]}^{T}$ , and $x_{d} = {[{(x_{d}^{0})}^{T}, {(x_{d}^{1})}^{T}, ..., {(x_{d}^{N_{T} - 1})}^{T}]}^{T}$ . By treating the term containing x_d as interference, the LS estimate of β is obtained as:

{\hat{β}}_{0} = {(Ξ^{H} [Ω_{p} x_{p}] Ξ [Ω_{p} x_{p}])}^{- 1} Ξ^{H} [Ω_{p} x_{p}] y .

(39)
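As a minimal numerical sketch of (39), the LS estimate can be computed with a standard least-squares solver instead of forming the inverse explicitly. The matrix `xi_p` below is a random stand-in for Ξ[Ω_p x_p] (its true structure is defined earlier in the paper and is not reproduced here), and the function name `ls_channel_estimate` is an assumption made for illustration.

```python
import numpy as np


def ls_channel_estimate(xi_p: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares solution of y ~ xi_p @ beta, as in (39).

    xi_p plays the role of Xi[Omega_p x_p] and must have full column rank.
    np.linalg.lstsq avoids forming the explicit inverse that appears in (39).
    """
    beta_hat, *_ = np.linalg.lstsq(xi_p, y, rcond=None)
    return beta_hat


# Toy check with a random stand-in matrix (not the paper's pilot structure).
rng = np.random.default_rng(0)
xi_p = rng.standard_normal((32, 6)) + 1j * rng.standard_normal((32, 6))
beta_true = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y_clean = xi_p @ beta_true  # no data interference and no noise in this toy case
beta_0 = ls_channel_estimate(xi_p, y_clean)
```

In this interference-free toy case the estimate recovers `beta_true` exactly; with the data term present, as in (38), the estimate is only a starting point that the EM iterations refine.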

Substituting (39) into (10b), the initial signal detection is obtained as:

\begin{array}{l} {\hat{s}}_{0} = (I_{N_{T}} \otimes F^{H}) Qant \{(I_{N_{T}} \otimes F) {(Θ^{H} [{\hat{β}}_{0}] Θ [{\hat{β}}_{0}])}^{- 1} \\ \times Θ^{H} [{\hat{β}}_{0}] y\} . \end{array}

(40)
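The hard-decision step in (40) can be illustrated as follows, under the assumption that Qant{·} denotes an element-wise mapping of each soft symbol to the nearest constellation point. The unit-power QPSK alphabet and the name `quantize` are choices made for this sketch only; the zero-forcing solve and the DFT operations of (40) are not shown.

```python
import numpy as np

# Unit-power QPSK alphabet (an assumption consistent with the simulation setup).
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def quantize(symbols: np.ndarray, constellation: np.ndarray = QPSK) -> np.ndarray:
    """Map each soft symbol to the nearest constellation point."""
    distances = np.abs(symbols[:, None] - constellation[None, :])
    return constellation[np.argmin(distances, axis=1)]


soft = np.array([0.9 + 0.6j, -0.2 - 1.1j, 0.1 - 0.3j])
hard = quantize(soft)
```

The same function works for 16 QAM by passing a different `constellation` array.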

Finally, the initial variances ${\hat{σ}}_{0}^{2}$ and $\{{\hat{σ}}_{χ^{i}, 0}^{2}\}_{i = 0}^{N_{T} - 1}$ are all set to 0.

5 Simulation results and discussions

In this section, the performance of the proposed algorithm is demonstrated by Monte Carlo simulations. In the simulations, the numbers of transmit and receive antennas are set as N_T=N_R=2, each OFDM symbol has 64 subcarriers (N=64), and the system occupies a bandwidth of 20 MHz. The sampling interval T_s is thus 50 ns. The length of the CP is N_cp=8.

The normalized maximal Doppler shift is set as N f_dT_s=0.075 and 0.15, respectively, where f_d represents the maximum Doppler frequency.
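For orientation, the quoted parameters can be converted into physical quantities. The short calculation below assumes critical sampling, T_s = 1/bandwidth, which matches the 50 ns stated above; the variable and function names are illustrative only.

```python
bandwidth_hz = 20e6
n_subcarriers = 64

# Sampling interval under the critical-sampling assumption T_s = 1 / bandwidth.
t_s = 1.0 / bandwidth_hz


def max_doppler_hz(normalized_doppler: float) -> float:
    """Invert N * f_d * T_s = normalized_doppler for f_d."""
    return normalized_doppler / (n_subcarriers * t_s)


f_d_low = max_doppler_hz(0.075)   # about 23.4 kHz
f_d_high = max_doppler_hz(0.15)   # about 46.9 kHz
```

Equivalently, since the subcarrier spacing is 1/(N T_s) = 312.5 kHz, the two settings correspond to maximum Doppler shifts of 7.5% and 15% of the subcarrier spacing.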

The channel has three taps (L=3) with an exponential power delay profile, namely $σ_{l}^{2} = exp (- κl) ((1 - exp (- κ)) / (1 - exp (- κL))), l = 0, 1, ..., L - 1$ with κ=1/3. In typical communication scenarios, only a few significant paths dominate the effect of the wireless channel [4]. Therefore, L=3 is a reasonable setting. Each tap coefficient follows a complex Gaussian distribution. The data are modulated by quadrature phase shift keying (QPSK) and 16-ary quadrature amplitude modulation (16 QAM), respectively, with unit power. The pilot cluster follows the structure in [37]; more specifically, seven pilot clusters are used for each transmit antenna. The clusters are equally spaced among the subcarriers, and in each cluster, one nonzero pilot is guarded by one zero pilot on each side. The nonzero pilots are generated as zero-mean complex Gaussian random variables with power three times that of the data symbols. Furthermore, the generalized complex exponential BEM (GCE-BEM) [34] is adopted.
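The power delay profile above can be evaluated directly; the sketch below simply codes the stated formula (the function name `exponential_pdp` is illustrative). The normalization factor makes the tap powers sum to one.

```python
import numpy as np


def exponential_pdp(num_taps: int, kappa: float) -> np.ndarray:
    """Tap powers sigma_l^2 = exp(-kappa*l) * (1 - exp(-kappa)) / (1 - exp(-kappa*L))."""
    taps = np.arange(num_taps)
    scale = (1.0 - np.exp(-kappa)) / (1.0 - np.exp(-kappa * num_taps))
    return np.exp(-kappa * taps) * scale


pdp = exponential_pdp(num_taps=3, kappa=1.0 / 3.0)
# pdp is approximately [0.4484, 0.3213, 0.2302]
```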

5.1 Convergence of the proposed schemes

Figures 1, 2, and 3 present the convergence performance of the proposed EM-based scheme of Section 3 (marked as scheme 1) and the proposed EM-based scheme of Section 4 (marked as scheme 2) at signal-to-noise ratios (SNRs) of 10, 20, and 30 dB. It can be seen that both the mean-square error (MSE) and the bit error rate (BER) improve significantly in the first few iterations and converge to stable values within eight iterations. Channel estimation with full training and data detection with perfect CSI are shown for comparison. Furthermore, the Cramer-Rao bound, computed according to [38], is also shown for comparison. It can be seen from Figure 1 that after convergence, the channel estimation performance of both schemes is far better than that of the initial estimate (marked as iteration = 0), which indicates the ability of the proposed algorithm to cancel, through iterations, the interference that the unknown data cause to channel estimation. The channel estimation performance of scheme 1 is very close to the Cramer-Rao bound and to the full training case. Scheme 2 suffers a minor degradation in channel estimation performance compared to scheme 1, which is the price paid for the reduced computational complexity. Similar results can be observed for data detection in Figures 2 and 3, which indicates that the updated channel estimate in turn greatly improves data detection through iterations. Similar convergence results are also observed for the 16 QAM case; the figures are omitted here due to space limitations.

5.2 Performance of the proposed schemes

Figures 4, 5, and 6 show the MSE and BER performance achieved by the proposed iterative algorithm versus SNR. It can be seen from Figure 4 that, after convergence, the proposed schemes 1 and 2 both perform much better than the initial estimate and come close to the Cramer-Rao bound and the full training case.

Similarly, it can be seen from Figure 5 that for N f_dT_s=0.075, the BER performance of the proposed iterative algorithm after convergence is very close to that of the ideal case, which assumes perfect CSI. For the severe case where N f_dT_s=0.15, Figure 6 shows that the proposed iterative algorithm can still cope with such a highly TF dispersive channel and performs well. Moreover, Figures 5 and 6 show that the proposed algorithm also performs well for signals with both amplitude and phase variations, such as 16 QAM.

Finally, we investigate how the proposed schemes are affected by different channel lengths. A severe case where the channel length equals the number of embedded pilots (marked as case 2) is shown in Figure 7. As can be seen from the figure, compared to the original case where the channel length is 3 (marked as case 1), there is an obvious performance degradation of the proposed schemes in the severe case 2. According to estimation theory [36], when the channel length increases, more parameters need to be estimated, which degrades performance. Conversely, if the channel length decreases, fewer parameters need to be estimated, which improves performance.

6 Conclusions

In this paper, two EM-based iterative schemes for data detection and channel and noise variance estimation in MIMO-OFDM systems operating over TF dispersive channels under unknown background noise have been proposed. The resulting schemes converge in a few iterations, effectively estimate TF dispersive channels, and obtain reliable data detection in unknown background noise environments. The first scheme iteratively detects data and estimates the channel and noise variance simultaneously for all antennas, and the updating expressions of these estimates are all derived in closed form. Simulation results showed that after convergence, the performance of the first scheme is very close to that of the optimal case, which assumes full training and perfect CSI. To reduce the computational complexity of the first scheme, another EM-based scheme has been proposed that detects data and estimates the channel for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last estimates; it is also derived in closed form. Simulation results showed that its performance degrades only slightly compared to the first scheme, while the computational complexity is significantly reduced.

Appendices

Appendix 1

Derivation of (17) and (18)

Using Bayes’s formula, the conditional pdf of β is given by:

f (β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) = {\frac{f (y | β, σ^{2}, s) f (β)}{f (y | σ^{2}, s)}|}_{σ^{2} = {\hat{σ}}_{k - 1}^{2}, s = {\hat{s}}_{k - 1}},

(41)

where the fact that β is independent of s and σ² has been used. The BEM coefficient vector β can be shown to be a complex Gaussian vector [33] with zero mean and covariance matrix R_β, that is:

f (β) = \frac{1}{π^{(Q + 1) N_{T} N_{R} L} | R_{β} |} exp (- β^{H} R_{β}^{- 1} β) .

(42)

Note that:

f (y | σ^{2}, s) = \int f (y | β, σ^{2}, s) f (β) d β .

(43)

With f(y|β,σ²,s) given by (13), putting (13) and (42) into (43), we have:

\begin{align} f (y | σ^{2}, s) = & \frac{| {\hat{Υ}}_{k} |}{{(π σ^{2})}^{N N_{R}} | R_{β} |} \\ \times exp (- \frac{1}{σ^{2}} (y^{H} y - σ^{2} {\hat{β}}_{k}^{H} {\hat{Υ}}_{k}^{- 1} {\hat{β}}_{k})) . \end{align}

(44)

Substituting (13), (42), and (44) into (41), after some manipulations we have:

\begin{align} f (β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1}) = & \frac{1}{π^{(Q + 1) N_{T} N_{R} L} | {\hat{Υ}}_{k} |} \\ \times exp (- {(β - {\hat{β}}_{k})}^{H} {\hat{Υ}}_{k}^{- 1} (β - {\hat{β}}_{k})), \end{align}

(45)

where

\begin{align} {\hat{β}}_{k} & = {(Ξ^{H} [{\hat{s}}_{k - 1}] Ξ [{\hat{s}}_{k - 1}] + {\hat{σ}}_{k - 1}^{2} R_{β}^{- 1})}^{- 1} Ξ^{H} [{\hat{s}}_{k - 1}] y, \end{align}

(46)

\begin{align} {\hat{Υ}}_{k} & = {\hat{σ}}_{k - 1}^{2} {(Ξ^{H} [{\hat{s}}_{k - 1}] Ξ [{\hat{s}}_{k - 1}] + {\hat{σ}}_{k - 1}^{2} R_{β}^{- 1})}^{- 1} . \end{align}

(47)

Thus, the pdf $f (β | y, {\hat{σ}}_{k - 1}^{2}, {\hat{s}}_{k - 1})$ is a Gaussian distribution, and ${\hat{β}}_{k}$ and ${\hat{Υ}}_{k}$ given in (46) and (47), respectively, are its conditional mean and covariance. To reflect the absence of prior information on β, we take the limit ||R_β||→+∞, i.e., we set $R_{β}^{- 1}$ to zero, which leads to (17) and (18). Setting $R_{β}^{- 1}$ to zero does incur a performance degradation relative to using the true covariance; this is a typical complexity versus performance trade-off. Moreover, as can be seen from the simulation results in Section 5, even with $R_{β}^{- 1}$ set to zero, the proposed algorithm performs well.
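The conditional mean (46) and covariance (47) can be checked numerically with a small sketch. The matrix `xi` below is a random stand-in for Ξ[ŝ_{k−1}], and the function name `gaussian_posterior` is hypothetical; the point illustrated is that with $R_{β}^{- 1}$ set to zero, the conditional mean reduces to the ordinary LS solution.

```python
import numpy as np


def gaussian_posterior(xi, y, noise_var, r_beta_inv):
    """Conditional mean (46) and covariance (47) of beta given y."""
    gram = xi.conj().T @ xi + noise_var * r_beta_inv
    beta_hat = np.linalg.solve(gram, xi.conj().T @ y)
    upsilon_hat = noise_var * np.linalg.inv(gram)
    return beta_hat, upsilon_hat


rng = np.random.default_rng(1)
xi = rng.standard_normal((20, 4)) + 1j * rng.standard_normal((20, 4))
y = rng.standard_normal(20) + 1j * rng.standard_normal(20)

# No-prior case: R_beta^{-1} = 0.
beta_np, ups_np = gaussian_posterior(xi, y, 0.1, np.zeros((4, 4)))
beta_ls = np.linalg.lstsq(xi, y, rcond=None)[0]  # matches beta_np
```

Note that the noise variance still scales the covariance in the no-prior case, even though it drops out of the mean.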

Appendix 2

Solving (36)

Comparing (34a) and (34b) with (10a) and (10b), referring to Section 3, we take ${\hat{ψ}}_{k}^{i}$ as the incomplete data, $β_{rc}^{i}$ as the unobservable or missing data, and ( $σ_{χ^{i}}^{2}$ , sⁱ) as parameters of interest. The two steps at the k th iteration are detailed as follows:

E-step: compute $ℚ (σ_{χ^{i}}^{2}, s^{i} | {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i}) = E \{log f ({\hat{ψ}}_{k}^{i}, β_{rc}^{i} | σ_{χ^{i}}^{2}, s^{i}) | {\hat{ψ}}_{k}^{i}, {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i}\}$ .

M-step: solve $({\hat{σ}}_{χ^{i}, k}^{2}, {\hat{s}}_{k}^{i}) = {arg max}_{σ_{χ^{i}}^{2}, s^{i}} ℚ (σ_{χ^{i}}^{2}, s^{i} | {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i})$ .

Since, conditioned upon ${\hat{ψ}}_{k}^{i}$ , the only unknown or random component in the complete data $({\hat{ψ}}_{k}^{i}, β_{rc}^{i})$ is $β_{rc}^{i}$ , the expectation is taken with respect to the conditional probability density function $f (β_{rc}^{i} | {\hat{ψ}}_{k}^{i}, {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i})$ , while $({\hat{σ}}_{χ^{i}, k}^{2}, {\hat{s}}_{k}^{i})$ are the estimates of $σ_{χ^{i}}^{2}$ and sⁱ at the k th iteration. More specifically, for the E-step, using Bayes’s rule, we obtain:

f ({\hat{ψ}}_{k}^{i}, β_{rc}^{i} | σ_{χ^{i}}^{2}, s^{i}) = f ({\hat{ψ}}_{k}^{i} | β_{rc}^{i}, σ_{χ^{i}}^{2}, s^{i}) f (β_{rc}^{i}) .

(48)

Using (48), the function $ℚ (σ_{χ^{i}}^{2}, s^{i} | {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i})$ can be expressed as:

\begin{array}{l} ℚ (σ_{χ^{i}}^{2}, s^{i} | {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i}) \\ = E \{log f ({\hat{ψ}}_{k}^{i} | β_{rc}^{i}, σ_{χ^{i}}^{2}, s^{i}) | {\hat{ψ}}_{k}^{i}, {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i}\} \\ + E \{log f (β_{rc}^{i}) | {\hat{ψ}}_{k}^{i}, {\hat{σ}}_{χ^{i}, k - 1}^{2}, {\hat{s}}_{k - 1}^{i}\}, \end{array}

(49)

where the second term can be ignored in the following derivations, since it is not a function of parameters of interest, i.e., not a function of ( $σ_{χ^{i}}^{2}$ , sⁱ). Using (34a), the likelihood function $f ({\hat{ψ}}_{k}^{i} | β_{rc}^{i}, σ_{χ^{i}}^{2}, s^{i})$ is obtained as:

\begin{array}{l} f ({\hat{ψ}}_{k}^{i} | β_{rc}^{i}, σ_{χ^{i}}^{2}, s^{i}) = \frac{1}{{(π σ_{χ^{i}}^{2})}^{N N_{R}}} \\ \times exp (- \frac{1}{σ_{χ^{i}}^{2}} {({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})}^{H} \\ \times ({\hat{ψ}}_{k}^{i} - Φ [s^{i}] (I_{N_{R} L} \otimes B) β_{rc}^{i})) . \end{array}

(50)

Substituting (50) into (49) and referring to (14), (15) and (16) and Appendix 1, the conditional mean and covariance matrix are obtained as:

\begin{align} {\hat{β}}_{rc, k}^{i} & = {({(I_{N_{R} L} \otimes B)}^{H} Φ^{H} [{\hat{s}}_{k - 1}^{i}] Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B))}^{- 1} \\ \times {(I_{N_{R} L} \otimes B)}^{H} Φ^{H} [{\hat{s}}_{k - 1}^{i}] {\hat{ψ}}_{k}^{i}, \end{align}

(51)

\begin{align} {\hat{Υ}}_{rc, k}^{i} & = {\hat{σ}}_{χ^{i}, k - 1}^{2} {({(I_{N_{R} L} \otimes B)}^{H} Φ^{H} [{\hat{s}}_{k - 1}^{i}] Φ [{\hat{s}}_{k - 1}^{i}] (I_{N_{R} L} \otimes B))}^{- 1} . \end{align}

(52)

It is noted that the matrix Φ[ sⁱ] is of dimension N_RN×N_RN L, and the matrix $(I_{N_{R} L} \otimes B)$ is of dimension N_RN L×N_R(Q+1)L; the N_R(Q+1)L×N_R(Q+1)L matrix inversion required in (51) and (52) is only $\frac{1}{N_{T}}$ of that needed in (17) and (18).

M-step: using the two equivalent expressions derived in (34a) and (34b) and similar to (19), (20), (21) and (22), the signal updating equation is obtained as:

\begin{align} {\tilde{s}}_{k}^{i} & = {\left(\sum_{m = 0}^{N_{R} (Q + 1) L - 1} λ_{rc, m, k}^{i} H^{H} [μ_{rc, m, k}^{i}] H [μ_{rc, m, k}^{i}] + H^{H} [{\hat{β}}_{rc, k}^{i}] H [{\hat{β}}_{rc, k}^{i}]\right)}^{- 1} \\ & \quad \times H^{H} [{\hat{β}}_{rc, k}^{i}] {\hat{ψ}}_{k}^{i}, \end{align}

(53)

where ${\hat{Υ}}_{rc, k}^{i} = \sum_{m = 0}^{N_{R} (Q + 1) L - 1} λ_{rc, m, k}^{i} μ_{rc, m, k}^{i} {(μ_{rc, m, k}^{i})}^{H}$ represents the eigendecomposition of ${\hat{Υ}}_{rc, k}^{i}$ . It is noted that compared to (21) where N_TN×N_TN matrix inversion is required, only N×N matrix inversion is needed in (53). The symbol detection can thus be obtained after OFDM demodulation as

{\hat{x}}_{k}^{i} = Qant \{F {\tilde{s}}_{k}^{i}\} .

(54)
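The eigendecomposition used in (53) expresses the Hermitian covariance as a sum of rank-one terms. The following sketch verifies this on a random matrix of the same form as (52); the matrix `a` is a stand-in for Φ[ŝ](I⊗B) and is an assumption made only for this check.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# Covariance of the same form as (52): variance times an inverse Gram matrix.
upsilon = 0.1 * np.linalg.inv(a.conj().T @ a)
upsilon = (upsilon + upsilon.conj().T) / 2  # remove round-off asymmetry

# eigh returns real eigenvalues and orthonormal eigenvectors (as columns).
eigvals, eigvecs = np.linalg.eigh(upsilon)

# Rebuild upsilon as sum_m lambda_m * mu_m * mu_m^H.
rebuilt = sum(lam * np.outer(mu, mu.conj()) for lam, mu in zip(eigvals, eigvecs.T))
```

Because the covariance is positive definite, all eigenvalues are positive, so every term in the sum of (53) is positive semidefinite.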

Substituting ${\hat{s}}_{k}^{i} = F^{H} {\hat{x}}_{k}^{i}$ and (50), (51) and (52) into (49) and referring to (23), the unknown noise variance for the disturbance term $χ_{k}^{i}$ can be obtained as:

\begin{array}{l} {\hat{σ}}_{χ^{i}, k}^{2} = \frac{1}{N N_{R}} ({({\hat{ψ}}_{k}^{i})}^{H} {\hat{ψ}}_{k}^{i} - 2 ℜ \{{({\hat{ψ}}_{k}^{i})}^{H} Φ [{\hat{s}}_{k}^{i}] (I_{N_{R} L} \otimes B) {\hat{β}}_{rc, k}^{i}\} \\ + Tr \{{(I_{N_{R} L} \otimes B)}^{H} Φ^{H} [{\hat{s}}_{k}^{i}] Φ [{\hat{s}}_{k}^{i}] (I_{N_{R} L} \otimes B) \\ \times ({\hat{Υ}}_{rc, k}^{i} + {\hat{β}}_{rc, k}^{i} {({\hat{β}}_{rc, k}^{i})}^{H})\}) . \end{array}

(55)

In summary, (51), (52), (53), (54) and (55) solve the minimization problem in (36).
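The variance update (55) can be sketched compactly by writing A for the product Φ[ŝ](I⊗B). The function below is an illustrative transcription of (55) under that shorthand; the names `noise_variance_update`, `a`, and `psi` are not from the paper.

```python
import numpy as np


def noise_variance_update(psi, a, beta_hat, upsilon_hat):
    """Evaluate (55) with a = Phi[s_hat] (I kron B) and psi of length N * N_R."""
    m = psi.shape[0]
    gram = a.conj().T @ a
    second_moment = upsilon_hat + np.outer(beta_hat, beta_hat.conj())
    value = (
        np.vdot(psi, psi)                            # psi^H psi
        - 2.0 * np.real(np.vdot(psi, a @ beta_hat))  # 2 Re{psi^H A beta_hat}
        + np.trace(gram @ second_moment)             # Tr{A^H A (Upsilon + beta beta^H)}
    )
    return float(np.real(value)) / m
```

Expanding the trace shows that (55) equals the residual energy ||ψ̂ − A β̂||² plus Tr{A^H A Υ̂}, divided by N N_R; the second term accounts for the remaining uncertainty in the channel estimate.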

Notice that the computational complexity can be further reduced by exploiting the block-diagonal structure of both Φ[ sⁱ] and $(I_{N_{R} L} \otimes B)$ in (24a). Therefore, (51) and (52) can be further split into N_R sub-problems, each of which is expressed as:

{\hat{β}}_{rc, k}^{z, i} = {({(B^{z})}^{H} G^{H} [{\hat{s}}_{k - 1}^{i}] G [{\hat{s}}_{k - 1}^{i}] B^{z})}^{- 1} {(B^{z})}^{H} G^{H} [{\hat{s}}_{k - 1}^{i}] {\hat{ψ}}_{k}^{z, i},

(56)

Υ_{rc, k}^{z, i} = {\hat{σ}}_{χ^{i}, k - 1}^{2} {({(B^{z})}^{H} G^{H} [{\hat{s}}_{k - 1}^{i}] G [{\hat{s}}_{k - 1}^{i}] B^{z})}^{- 1},

(57)

where ${\hat{ψ}}_{k}^{i} = {[{({\hat{ψ}}_{k}^{0, i})}^{T}, {({\hat{ψ}}_{k}^{1, i})}^{T}, ..., {({\hat{ψ}}_{k}^{N_{R} - 1, i})}^{T}]}^{T}$ . Then, ${\hat{β}}_{rc, k}^{i}$ is obtained as ${\hat{β}}_{rc, k}^{i} = {[{({\hat{β}}_{rc, k}^{0, i})}^{T}, {({\hat{β}}_{rc, k}^{1, i})}^{T}, ..., {({\hat{β}}_{rc, k}^{N_{R} - 1, i})}^{T}]}^{T}$ , and ${\hat{Υ}}_{rc, k}^{i}$ can be obtained as ${\hat{Υ}}_{rc, k}^{i} = Blkdiag {Υ_{rc, k}^{0, i}, Υ_{rc, k}^{1, i}, ..., Υ_{rc, k}^{N_{R} - 1, i}}$ . It is noted that the matrix G[ sⁱ] is of dimension N×N L, and the matrix $B^{z} ≜ I_{L} \otimes B$ is of dimension N L×(Q+1)L; the (Q+1)L×(Q+1)L matrix inversion required in (56) and (57) is $\frac{1}{N_{R}}$ of that needed in (51) and (52) and therefore only $\frac{1}{N_{R} N_{T}}$ of that needed in (17) and (18).
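The per-receive-antenna split can be illustrated numerically. The sketch assumes that the product Φ[sⁱ](I⊗B) has the block-diagonal form I_{N_R} ⊗ (G[sⁱ] B^z), which is inferred from the dimensions stated above rather than quoted from the paper; `g_b` is a random stand-in for G[sⁱ] B^z.

```python
import numpy as np

rng = np.random.default_rng(3)
n_r, rows, cols = 2, 8, 3

# Random stand-in for the per-antenna block G[s] B^z (assumed structure).
g_b = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
psi = rng.standard_normal(n_r * rows) + 1j * rng.standard_normal(n_r * rows)

# Joint solve, as in (51): one system of size N_R * cols.
full = np.kron(np.eye(n_r), g_b)
beta_joint = np.linalg.lstsq(full, psi, rcond=None)[0]

# Split solve, as in (56): N_R independent systems of size cols.
beta_blocks = np.concatenate([
    np.linalg.lstsq(g_b, psi[z * rows:(z + 1) * rows], rcond=None)[0]
    for z in range(n_r)
])
```

Both routes give the same estimate, but the split version only ever inverts matrices of the smaller size.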

References

1. Bölcskei H, Paulraj AJ: Multiple-Input Multiple-Output (MIMO) Wireless Systems. Cambridge: Cambridge Univ. Press; 2003.
2. Nee RV, Prasad R: OFDM for Wireless Multimedia Communications. Norwood: Artech House Publishers; 2000.
3. Hanzo L, Akhtman J, Jiang M, Wang L: MIMO-OFDM for LTE, WiFi and WiMAX: Coherent versus Non-coherent and Cooperative Turbo Transceivers. Hoboken: Wiley; 2010.
4. Goldsmith A: Wireless Communications. Cambridge: Cambridge Univ. Press; 2005.
5. Viterbi AJ, Omura JK: Principles of Digital Communication and Coding. New York: Dover Press; 2009.
6. Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statistical Soc., Ser. B (Methodological) 1977, 39(1):1-38.
7. Moon T: The expectation-maximization algorithm. IEEE Signal Process. Mag 1996, 13(6):47-60. 10.1109/79.543975
8. Assra A, Hamouda W, Youssef A: EM-based joint channel estimation and data detection for MIMO-CDMA systems. IEEE Trans. Veh. Technol 2010, 59(3):1205-1216.
9. Choi J: An EM based joint data detection and channel estimation incorporating with initial channel estimate. IEEE Commun. Lett 2008, 12(9):654-656.
10. Cozzo C, Hughes BL: Joint channel estimation and data detection in space-time communications. IEEE Trans. Commun 2003, 51(8):1266-1270. 10.1109/TCOMM.2003.815062
11. Zhang XY, Wang DG, Wei JB: Joint symbol detection and channel estimation for MIMO-OFDM systems via the variational Bayesian EM algorithm. In IEEE Wireless Communications and Networking Conference. Las Vegas; 31 Mar–3 Apr 2008:13-17.
12. So DKC, Chen RS: Iterative EM receiver for space-time coded systems in MIMO frequency-selective fading channels with channel gain and order estimation. IEEE Trans. Wirel. Commun 2004, 3(6):1928-1935. 10.1109/TWC.2004.837293
13. Lu B, Wang X, Li Y: Iterative receivers for space-time block coded OFDM systems in dispersive fading channels. IEEE Trans. Wirel. Commun 2002, 1(2):213-225. 10.1109/7693.994815
14. Zia A, Reilly JPR, Manton J, Shiran S: An information geometry approach to ML estimation with incomplete data: application to semiblind MIMO channel identification. IEEE Trans. Signal Process 2007, 55(8):3975-3985.
15. Khalighi MA, Boutros JJ: Semi-blind channel estimation using EM algorithm in iterative MIMO APP detectors. IEEE Trans. Wirel. Commun 2006, 5(11):3165-3173.
16. Aldana CH, de Carvalho E, Cioffi J: Channel estimation for multicarrier multiple input single output systems using the EM algorithm. IEEE Trans. Signal Process 2003, 51(12):3280-3292. 10.1109/TSP.2003.819082
17. Zhang J, Hanzo L, Mu X: Joint decision-directed channel and noise-variance estimation for MIMO OFDM/SDMA systems based on expectation-conditional maximization. IEEE Trans. Veh. Technol 2011, 60(5):2139-2151.
18. Choi J: An EM-based iterative receiver for MIMO-OFDM under interference-limited environments. IEEE Trans. Wirel. Commun 2007, 6(11):3994-4003.
19. Wautelet X, Herzet C, Dejonghe A, Louveaux J, Vandendorpe L: Comparison of EM-based algorithms for MIMO channel estimation. IEEE Trans. Commun 2007, 55(1):216-226.
20. Nevat I, Peters GW, Yuan J: Detection of Gaussian constellations in MIMO systems under imperfect CSI. IEEE Trans. Commun 2010, 58(4):1151-1160.
21. Georghiades C, Han J: Sequence estimation in the presence of random parameters via the EM algorithm. IEEE Trans. Commun 1997, 45(3):300-308. 10.1109/26.558691
22. Chan F, Choi J: Neighborhood exploring detector: an EM-based signal detector for multiple antenna systems. IEEE Trans. Signal Process 2007, 55(5):1875-1885.
23. Kashima T, Fukawa K, Suzuki H: Adaptive MAP receiver via the EM algorithm and message passings for MIMO-OFDM mobile communications. IEEE J. Sel. Areas Commun 2006, 24(3):437-447.
24. Ueng Y-L, Chen Y-M, Lin J-Y: A MIMO-BICM scheme using a convolutional interleaver for delay-sensitive applications. IEEE Trans. Veh. Technol 2010, 59(5):2380-2393.
25. Khalighi M, Bourennane S: Semiblind single-carrier MIMO channel estimation using overlay pilots. IEEE Trans. Veh. Technol 2008, 57(3):951-1956.
26. Choi J: MIMO-BICM iterative receiver with the EM based channel estimation and simplified MMSE combining with soft cancellation. IEEE Trans. Signal Process 2006, 54(8):3247-3251.
27. Zheng J, Rao B: LDPC-coded MIMO systems with unknown block fading channels: soft MIMO detector design, channel estimation, and code optimization. IEEE Trans. Signal Process 2006, 54(4):1504-1518.
28. Khalighi MA, Boutros J, Hélard J-F: Data-aided channel estimation for turbo-PIC MIMO detectors. IEEE Commun. Lett 2006, 10(5):350-352. 10.1109/LCOMM.2006.1633319
29. Pham T-H, Liang Y-C, Nallanathan A: A joint channel estimation and data detection receiver for multiuser MIMO IFDMA systems. IEEE Trans. Commun 2009, 57(6):1857-1865.
30. Gao J, Li H: Low-complexity MAP channel estimation for mobile MIMO-OFDM systems. IEEE Trans. Wirel. Commun 2008, 7(3):774-780.
31. Souza RD, Garcia-Frias J, Haimovich AM: Semiblind EM based iterative receivers for space-time-coded modulation and quasi-static frequency-selective fading channels. IEEE Trans. Veh. Technol 2006, 55(4):1259-1268. 10.1109/TVT.2006.877461
32. Xie YZ, Georghiades CN: Two EM-type channel estimation algorithms for OFDM with transmitter diversity. IEEE Trans. Commun 2003, 51(1):106-115. 10.1109/TCOMM.2002.807617
33. Ma X, Giannakis GB, Ohno S: Optimal training for block transmissions over doubly selective wireless fading channels. IEEE Trans. Signal Process 2003, 51(5):1351-1366. 10.1109/TSP.2003.810304
34. Tang Z, Leus G, Cannizzaro RC, Banelli P: Pilot-assisted time-varying channel estimation for OFDM systems. IEEE Trans. Signal Process 2007, 55(5):2226-2238.
35. Stark H, Woods JW: Probability and Random Processes with Applications to Signal Processing. Upper Saddle River: Prentice-Hall; 2002.
36. Kay SM: Fundamentals of Statistical Signal Processing: Estimation Theory. Upper Saddle River: Prentice-Hall; 1993.
37. Kannu A, Schniter P: Design and analysis of MMSE pilot-aided cyclic-prefixed block transmissions for doubly selective channels. IEEE Trans. Signal Process 2008, 56(3):1148-1160.
38. Van Trees HL, Bell KL: Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking. New York: Wiley-IEEE Press; 2007.

Acknowledgements

This work was supported in part by the National Science Foundation of China under grants 61032002, 60902026, and 60972029, the Chinese Important National Science & Technology Specific Projects under grant 2011ZX03001-007-01, and the Program for New Century Excellent Talents in University under grant NCET-11-0058.

Author information

Authors and Affiliations

National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu, 611731, People’s Republic of China
Ke Zhong, Xia Lei & Shaoqian Li

Corresponding author

Correspondence to Ke Zhong.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Zhong, K., Lei, X. & Li, S. Iterative channel estimation and data detection for MIMO-OFDM systems operating in time-frequency dispersive channels under unknown background noise. J Wireless Com Network 2013, 182 (2013). https://doi.org/10.1186/1687-1499-2013-182

Received: 10 December 2012
Accepted: 15 June 2013
Published: 06 July 2013
DOI: https://doi.org/10.1186/1687-1499-2013-182

Keywords

Multiple-input multiple-output (MIMO); Orthogonal frequency division multiplexing (OFDM); Time-frequency (TF) dispersive channels; Unknown noise variance; Expectation-maximization (EM)
