- Research
- Open Access

# Iterative channel estimation and data detection for MIMO-OFDM systems operating in time-frequency dispersive channels under unknown background noise

*EURASIP Journal on Wireless Communications and Networking*
**volume 2013**, Article number: 182 (2013)

## Abstract

In this paper, the challenging problem of joint channel estimation and data detection for multiple-input multiple-output orthogonal frequency division multiplexing systems operating in time-frequency dispersive channels under unknown background noise is investigated. Based on two different but equivalent signal models, two expectation-maximization algorithm-based iterative schemes for joint data detection and channel and noise variance estimation are proposed. The first scheme jointly detects data and estimates the channel and noise variance, but its computational complexity is high, owing to the simultaneous detection and estimation for all antennas. To reduce the computational complexity, a second scheme is proposed that detects data and estimates the channel for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last values; its performance degrades only slightly compared to the first scheme. Moreover, both schemes are derived as closed-form expressions and are therefore free of exhaustive search. Simulation results demonstrate quick convergence of the proposed algorithm, and after convergence, its performance is close to that of the optimal channel estimation and data detection case, which assumes full training and perfect channel state information.

## 1 Introduction

Multiple-input multiple-output (MIMO) communication [1] can significantly increase the throughput without increasing the transmit power and additional bandwidth. Orthogonal frequency division multiplexing (OFDM) [2] can provide high data rate transmission capability and is robust against multipath (time-dispersive) fading channels. MIMO combined with OFDM (MIMO-OFDM) [3] has been adopted in various international standards such as 3GPP-LTE, WiMAX, and IMT-Advanced.

Meanwhile, high-speed vehicles, such as cars, subways, and trains that can exceed 350 km/h, play an increasingly important role in people's lives.

Consequently, mobility support is widely regarded as one of the key features in current and future wireless communication systems. High mobility causes the transmission channel to change rapidly in time, which results in frequency dispersion of the channel. For coherent detection in MIMO-OFDM systems, channel state information (CSI) is indispensable [3].

CSI acquisition is particularly challenging in time-frequency (TF) dispersive channels because channel responses vary sample by sample, and therefore, the number of unknown channel parameters in an OFDM symbol period increases significantly (much greater than in frequency-nondispersive channels). Furthermore, in practical communication scenarios, the knowledge of the power of background noise is required to perform many signal processing algorithms, such as channel estimation [4] and decoding [5] in MIMO-OFDM systems.

In this paper, joint data detection and channel and noise variance estimation for MIMO-OFDM systems operating in TF dispersive channels under unknown background noise are investigated. We employ the expectation-maximization (EM) algorithm [6, 7], which is an iterative numerical method employed to compute the maximum likelihood (ML) estimates, to develop an iterative algorithm to solve this challenging problem.

For MIMO systems, the literature along these lines can be categorized as follows:

*EM for channel estimation and data detection assuming the noise variance is known*: EM-based joint channel estimation and data detection algorithms are proposed in [8–10] for time-nondispersive and frequency-nondispersive channels (TnDFnD channels) and in [11–13] for time-dispersive and frequency-nondispersive channels (TDFnD channels). However, the maximization step (M-step) for data detection proposed in these papers is not obtained in closed form, and therefore, a brute-force search over all possibilities is required.

*EM for channel and noise variance estimation*: In TnDFnD channels, EM-based joint channel and noise variance estimation algorithms are proposed in [14–16]. However, data detection is performed by an extra ML estimator in [14] and by a detector that maximizes the *a posteriori* probabilities (APP) in [15].

In [16], a full training sequence is adopted to perform the proposed EM algorithm, and therefore, no data detection is addressed. In TDFnD channels, EM-based joint channel and noise variance estimation algorithms are proposed in [17–19]. However, data detection is not addressed in these papers.

*EM for data detection and noise variance estimation*: In TnDFnD channels, an EM-based joint data detection and noise variance estimation algorithm is proposed in [20]. However, the channel estimate is only obtained by pilot symbols and is not included in the EM updating process.

*EM only for data detection assuming the noise variance is known*: In TnDFnD channels, EM-based data detection algorithms are proposed in [21, 22]. However, channel estimation is not addressed in [21], and the channel is assumed to be perfectly known at the receiver in [22]. In TF channels, an EM-based data detection algorithm is proposed in [23] to solve a maximum a posteriori probability (MAP) detection problem. However, the data estimate is not given in closed form, and therefore, an exhaustive search is required.

*EM only for channel estimation assuming the noise variance is known*: In TnDFnD channels, EM-based channel estimation algorithms are proposed in [24–28]. However, the data estimates are obtained by extra MAP estimators in [24–26] and by APP estimators in [27, 28]. In TDFnD channels, EM-based channel estimation algorithms are proposed in [29–32]. However, the data estimates are obtained by an extra BI-GDFE detector in [29], a minimum mean-squared error (MMSE) detector in [30], and a trellis-based approach in [31], and data detection is not addressed in [32].

In this paper, based on two different but equivalent signal models, two EM algorithm-based iterative schemes which integrate data detection and channel and noise variance estimation are proposed in a consistent way so as to iteratively improve the system performance.

The first scheme jointly detects data and estimates the channel and noise variance, but its computational complexity is high, owing to the simultaneous detection and estimation for all antennas. To reduce the computational complexity of the first scheme, a second scheme is proposed that performs data detection and channel estimation for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last values; its performance degrades only slightly compared to the first scheme. Furthermore, the estimates of data, channel, and noise variance are all obtained as closed-form results, and therefore, the proposed schemes are free of exhaustive search. Simulation results demonstrate quick convergence of the proposed algorithm, and after convergence, the performance of the proposed iterative algorithm is close to that of the optimal channel estimation and data detection case, which assumes full training and perfect CSI.

The remainder of this paper is organized as follows. The system model for MIMO-OFDM systems operating in TF dispersive channels under unknown background noise is introduced in Section 2.

In Section 3, an EM-based scheme for joint data detection and channel and noise variance estimation is proposed. In Section 4, a reduced complexity EM-based scheme is proposed. Section 5 gives some simulation results that demonstrate the effectiveness of the proposed schemes. Finally, conclusions are drawn in Section 6.

*Notation*: Matrices and vectors are represented by boldface uppercase and lowercase letters, respectively.

A hat over a variable (e.g., $\widehat{\mathbf{x}}$) indicates an estimate of the variable. $\mathbb{E}\{\cdot\}$ denotes the expectation. Superscripts [·]^{T}, [·]^{−1}, and [·]^{H} denote the transpose, the matrix inversion, and the Hermitian operations, respectively. **I**_{N} is an identity matrix with dimension *N*. diag{**x**} and Blkdiag{·} stand for the diagonal matrix with vector **x** on its diagonal and the block diagonal concatenation of input arguments, respectively. The symbol ⊛ denotes convolution, and ⊗ stands for the Kronecker product. Tr{**X**} and |**X**| are the trace and the determinant of a square matrix **X**, respectively. ℜ{·} is the real part of the element in the bracket. <·>_{K} denotes the modulo *K* operation. The matrix **F** is the normalized fast Fourier transform (FFT) matrix with $[\mathbf{F}]_{m,n}=\frac{1}{\sqrt{N}}e^{-j2\pi mn/N}$.

## 2 System model

### 2.1 Transmitted MIMO-OFDM systems with scattered pilots

We consider a MIMO-OFDM system with *N*_{T} transmit and *N*_{R} receive antennas. For the *i* th transmit antenna, the time domain signal **s**^{i}= [*s*^{i}(0),*s*^{i}(1),...,*s*^{i}(*N*−1)]^{T} is generated by taking the *N*-point inverse FFT of the source signal in the frequency domain **x**^{i}= [*x*^{i}(0),*x*^{i}(1),...,*x*^{i}(*N*−1)]^{T} as **s**^{i}=**F**^{H}**x**^{i}.
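The modulation step **s**^{i}=**F**^{H}**x**^{i} can be illustrated with a minimal sketch. It uses only the Python standard library; the helper names (`dft_matrix`, `hermitian`, `matvec`) and the four QPSK symbols are illustrative assumptions, not part of the paper.

```python
import cmath
import math

def dft_matrix(N):
    """Normalized N-point DFT matrix: F[m][n] = exp(-j*2*pi*m*n/N) / sqrt(N)."""
    scale = 1.0 / math.sqrt(N)
    return [[scale * cmath.exp(-2j * math.pi * m * n / N) for n in range(N)]
            for m in range(N)]

def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))]
            for c in range(len(A[0]))]

def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

# Hypothetical frequency-domain symbols (QPSK) on N = 4 subcarriers.
x = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
F = dft_matrix(len(x))
s = matvec(hermitian(F), x)   # OFDM modulation: s = F^H x
x_back = matvec(F, s)         # demodulation: x = F s, because F is unitary
```

Because **F** is normalized, it is unitary, so the time-domain signal carries the same energy as the frequency-domain symbols and demodulation recovers them exactly (up to rounding).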

In general, the elements of **x**^{i} can be categorized into:

where $I_{d}^{i}$ is the index set of subcarriers allocated for data symbols (with *N*_{d} elements), and $I_{p}^{i}$ is the index set of subcarriers allocated for pilot symbols (with *N*_{p} elements), respectively. Notice that *N*=*N*_{d}+*N*_{p}. From (1), we have $\mathbf{x}^{i}=\mathbf{E}_{d}^{i}\mathbf{x}_{d}^{i}+\mathbf{E}_{p}^{i}\mathbf{x}_{p}^{i}$, where $\mathbf{E}_{d}^{i}$ and $\mathbf{E}_{p}^{i}$ denote the matrices collecting columns of **I**_{N} corresponding to $I_{d}^{i}$ and $I_{p}^{i}$, respectively, and $\mathbf{x}_{d}^{i}=[x_{d}^{i}(0),x_{d}^{i}(1),\ldots,x_{d}^{i}(N_{d}-1)]^{T}$ and $\mathbf{x}_{p}^{i}=[x_{p}^{i}(0),x_{p}^{i}(1),\ldots,x_{p}^{i}(N_{p}-1)]^{T}$ denote the data and pilot vectors, respectively.
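The pilot and data mapping described above can be sketched as follows. The pilot positions, symbol values, and the helper name `selection_matrix` are assumptions chosen for illustration; the paper's actual pilot pattern is not reproduced here.

```python
def selection_matrix(N, index_set):
    """Columns of the N x N identity matrix picked out by index_set, in order."""
    return [[1 if row == col else 0 for col in index_set] for row in range(N)]

def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

N = 8
pilot_idx = [0, 4]                                   # hypothetical pilot subcarriers
data_idx = [n for n in range(N) if n not in pilot_idx]

x_p = [1 + 0j, -1 + 0j]                              # hypothetical pilot symbols
x_d = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j]

E_p = selection_matrix(N, pilot_idx)
E_d = selection_matrix(N, data_idx)

# x = E_d x_d + E_p x_p places every symbol on its own subcarrier.
x = [d + p for d, p in zip(matvec(E_d, x_d), matvec(E_p, x_p))]
```

Since the two index sets are disjoint and together cover all *N* subcarriers, each entry of the sum receives exactly one symbol.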

A cyclic prefix (CP) with length *N*_{cp} larger than that of the longest channel response is inserted at the beginning of each OFDM symbol to prevent intersymbol interference.
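As a small illustration of the CP operation, the sketch below prepends the last *N*_{cp} samples of an OFDM symbol and removes them again at the receiver. The function names and the toy symbol are hypothetical.

```python
def add_cyclic_prefix(s, n_cp):
    """Prepend the last n_cp samples of the OFDM symbol s."""
    if not 0 <= n_cp <= len(s):
        raise ValueError("n_cp must lie between 0 and len(s)")
    return s[len(s) - n_cp:] + s

def remove_cyclic_prefix(r, n_cp):
    """Drop the first n_cp samples of a received block."""
    return r[n_cp:]

symbol = [1, 2, 3, 4]                     # toy time-domain samples
with_cp = add_cyclic_prefix(symbol, 2)    # [3, 4, 1, 2, 3, 4]
recovered = remove_cyclic_prefix(with_cp, 2)
```

With the prefix in place, a channel whose memory is shorter than the prefix acts on the retained *N* samples as a circular operation, which is what the later model with cyclic shifts relies on.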

### 2.2 TF dispersive channels under unknown background noise model

At the receive antenna *j*, assuming perfect timing and frequency synchronization are achieved, the *n* th sample of the received signal is given by:

$$y^{j}(n)=\sum_{i=0}^{N_{T}-1}\sum_{l=0}^{L-1}h^{ji}(n,l)\,s^{i}(n-l)+w^{j}(n),\tag{2}$$

where *h*^{ji}(*n*,*l*) is the TF dispersive channel of the *l* th path with length *L* at time *n*, associated with the *i* th transmit antenna and the *j* th receive antenna, and *w*^{j}(*n*) denotes the unknown background noise and is assumed to obey complex Gaussian distribution with zero mean and unknown variance *σ*^{2}, which is assumed to be the same across all receive antennas.
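A minimal sketch of this received-signal model for one receive antenna is given below. It assumes the CP has already been removed, so the delayed sample index wraps modulo *N*; the function name, data layout, and numerical values are illustrative assumptions.

```python
def received_samples(h, s, noise=None):
    """y(n) = sum_i sum_l h[i][n][l] * s[i][(n - l) % N] + w(n).

    h[i][n][l]: tap l at time n of the channel from transmit antenna i.
    s[i][n]:    time-domain sample n sent by transmit antenna i.
    noise[n]:   optional background-noise sample (omitted means noise-free).
    """
    N = len(s[0])
    y = []
    for n in range(N):
        acc = 0j
        for h_i, s_i in zip(h, s):
            for l, tap in enumerate(h_i[n]):
                acc += tap * s_i[(n - l) % N]
        y.append(acc + (noise[n] if noise is not None else 0))
    return y

# Two transmit antennas, N = 4 samples, L = 2 taps (toy values).
h = [
    [[1.0, 0.5 * n] for n in range(4)],   # second tap grows with time n
    [[1j, 0.0] for _ in range(4)],        # static single-tap channel
]
s = [[1, 2, 3, 4], [1, 1, 1, 1]]
y = received_samples(h, s)
```

The first channel changes from sample to sample, which is the frequency-dispersive behavior that makes the number of unknowns grow with *N*.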

After discarding the CP and stacking all *N* samples, the received signal for a whole OFDM symbol at the receive antenna *j* can be expressed in a vector form as:

$$\mathbf{y}^{j}=\sum_{i=0}^{N_{T}-1}\mathbf{H}^{ji}\mathbf{s}^{i}+\mathbf{w}^{j},\tag{3}$$

where **y**^{j}= [ *y*^{j}(0),*y*^{j}(1),...,*y*^{j}(*N*−1)]^{T} and **w**^{j}= [ *w*^{j}(0),*w*^{j}(1),...,*w*^{j}(*N*−1)]^{T} denote the received signal at the receive antenna *j* and the corresponding noise, respectively.

**H**^{ji} represents the corresponding TF dispersive channel matrix and is expressed as:

It is observed from (4) that the number of unknowns in **H**^{ji} is *NL*, which is much larger than the number of received samples. Therefore, direct estimation of **H**^{ji} is almost impossible (i.e., this will give rise to the identifiability problem).

To overcome this problem, in this paper, a parsimonious (low-dimensional) representation of *h*^{ji}(*n*,*l*) using the basis expansion model (BEM) [33, 34] is adopted, i.e., each path *l* of *h*^{ji}(*n*,*l*) is expanded with respect to time *n* into a basis $\{b_{n,q}\}_{q=0}^{Q}$ as:

$$h^{ji}(n,l)=\sum_{q=0}^{Q}b_{n,q}\,\beta_{q,l}^{ji},\tag{5}$$

where $\beta_{q,l}^{ji}$ is the *q* th BEM coefficient of the *l* th path associated with the channel between the *i* th transmit antenna and the *j* th receive antenna; *b*_{n,q} is the basis function that captures channel time variations, and *Q*+1 is the number of basis functions. BEM is motivated by the observation that the temporal (*n*) variation of *h*(*n*,*l*) is usually rather smooth due to the channel’s limited Doppler spread and therefore $\{b_{n,q}\}_{q=0}^{Q}$ can be chosen as a small set (i.e., *Q*≪*N*) of smooth functions.
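The parameter saving offered by the BEM can be seen in a short sketch. The polynomial basis used here is only an assumption for illustration (the text above does not fix a particular basis), and the coefficient values are made up.

```python
def polynomial_basis(N, Q):
    """Hypothetical basis b[n][q] = (n / N) ** q; the paper's choice may differ."""
    return [[(n / N) ** q for q in range(Q + 1)] for n in range(N)]

def bem_channel(b, beta):
    """h[n][l] = sum_q b[n][q] * beta[q][l]."""
    L = len(beta[0])
    return [[sum(b_nq * beta[q][l] for q, b_nq in enumerate(row))
             for l in range(L)]
            for row in b]

N, Q, L = 8, 1, 2
b = polynomial_basis(N, Q)
beta = [[1.0, 0.5],     # q = 0: static part of each tap
        [0.8, -0.4]]    # q = 1: linear drift of each tap over the symbol
h = bem_channel(b, beta)

unknowns_direct = N * L        # one unknown per sample and tap
unknowns_bem = (Q + 1) * L     # one unknown per basis function and tap
```

In this toy setting, the 16 sample-by-sample channel values of one antenna pair are described by only 4 BEM coefficients.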

Below, two equivalent expressions for the received signal will be derived, from which closed-form solution for data detection and channel estimation can be obtained, as will be shown in the following sections.

Notice that (3) can be rewritten as:

$$\mathbf{y}^{j}=\sum_{i=0}^{N_{T}-1}\mathbf{G}[\mathbf{s}^{i}]\mathbf{h}^{ji}+\mathbf{w}^{j},\tag{6}$$

where $\mathbf{G}[\mathbf{s}^{i}]=[\text{diag}\{\mathbf{s}_{cs,0}^{i}\},\text{diag}\{\mathbf{s}_{cs,1}^{i}\},\ldots,\text{diag}\{\mathbf{s}_{cs,L-1}^{i}\}]$ with $\mathbf{s}_{cs,l}^{i}$ denoting **s**^{i} cyclically shifted (cs) by *l* positions and $\mathbf{h}^{ji}=[(\mathbf{h}_{0}^{ji})^{T},(\mathbf{h}_{1}^{ji})^{T},\ldots,(\mathbf{h}_{L-1}^{ji})^{T}]^{T}$ with $\mathbf{h}_{l}^{ji}=[h^{ji}(0,l),h^{ji}(1,l),\ldots,h^{ji}(N-1,l)]^{T}$. Equation (6) can be put into a more compact form as:

$$\mathbf{y}^{j}=\mathbf{G}[\mathbf{s}]\mathbf{h}^{j}+\mathbf{w}^{j},\tag{7}$$

where $\mathbf{G}[\mathbf{s}]=[\mathbf{G}[\mathbf{s}^{0}],\mathbf{G}[\mathbf{s}^{1}],\ldots,\mathbf{G}[\mathbf{s}^{N_{T}-1}]]$ and $\mathbf{h}^{j}=[(\mathbf{h}^{j0})^{T},(\mathbf{h}^{j1})^{T},\ldots,(\mathbf{h}^{j(N_{T}-1)})^{T}]^{T}$. Using (5), $\mathbf{h}_{l}^{ji}$ can be expressed in a vector form as

$$\mathbf{h}_{l}^{ji}=\mathbf{B}\boldsymbol{\beta}_{l}^{ji},\tag{8}$$

where **B**= [**b**_{0},**b**_{1},...,**b**_{Q}] with **b**_{q}= [*b*_{0,q},*b*_{1,q},...,*b*_{N−1,q}]^{T} and $\boldsymbol{\beta}_{l}^{ji}=[\beta_{0,l}^{ji},\beta_{1,l}^{ji},\ldots,\beta_{Q,l}^{ji}]^{T}$. Substituting (8) into (7), we obtain:

$$\mathbf{y}^{j}=\mathbf{G}[\mathbf{s}](\mathbf{I}_{N_{T}L}\otimes\mathbf{B})\boldsymbol{\beta}^{j}+\mathbf{w}^{j},\tag{9}$$

where

$$\boldsymbol{\beta}^{j}=[(\boldsymbol{\beta}^{j0})^{T},(\boldsymbol{\beta}^{j1})^{T},\ldots,(\boldsymbol{\beta}^{j(N_{T}-1)})^{T}]^{T}$$

with $\boldsymbol{\beta}^{ji}=[(\boldsymbol{\beta}_{0}^{ji})^{T},(\boldsymbol{\beta}_{1}^{ji})^{T},\ldots,(\boldsymbol{\beta}_{L-1}^{ji})^{T}]^{T}$. By stacking the received signals from all *N*_{R} receive antennas into a single vector using (9) and (3), two equivalent expressions of the received signal which explicitly show the dependence on the unknown BEM coefficients and the unknown signal can be obtained, respectively, as:

$$\mathbf{y}=\boldsymbol{\Xi}[\mathbf{s}]\boldsymbol{\beta}+\mathbf{w},\tag{10a}$$

$$\mathbf{y}=\boldsymbol{\Theta}[\boldsymbol{\beta}]\mathbf{s}+\mathbf{w},\tag{10b}$$

where $\boldsymbol{\Xi}[\mathbf{s}]=\mathbf{I}_{N_{R}}\otimes(\mathbf{G}[\mathbf{s}](\mathbf{I}_{N_{T}L}\otimes\mathbf{B}))$, $\boldsymbol{\beta}=[(\boldsymbol{\beta}^{0})^{T},(\boldsymbol{\beta}^{1})^{T},\dots,(\boldsymbol{\beta}^{N_{R}-1})^{T}]^{T}$, $\boldsymbol{\Theta}[\boldsymbol{\beta}]=[\mathbf{H}^{0},\mathbf{H}^{1},\dots,\mathbf{H}^{N_{T}-1}]$ with $\mathbf{H}^{i}=[(\mathbf{H}^{0i})^{H},(\mathbf{H}^{1i})^{H},\dots,(\mathbf{H}^{(N_{R}-1)i})^{H}]^{H}$, $\mathbf{y}=[(\mathbf{y}^{0})^{T},(\mathbf{y}^{1})^{T},\dots,(\mathbf{y}^{N_{R}-1})^{T}]^{T}$, $\mathbf{s}=[(\mathbf{s}^{0})^{T},(\mathbf{s}^{1})^{T},\dots,(\mathbf{s}^{N_{T}-1})^{T}]^{T}$ and $\mathbf{w}=[(\mathbf{w}^{0})^{T},(\mathbf{w}^{1})^{T},\dots,(\mathbf{w}^{N_{R}-1})^{T}]^{T}$. Notice that **Ξ**[**s**] represents a function of **s** and can be reconstructed by **s** through (6), (7), (8) and (9). Similarly, **Θ**[**β**] represents a function of **β** and can be reconstructed by **β** through (3), (4) and (5).
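The equivalence of the two signal models rests on a per-link identity: applying the channel matrix to the signal gives the same result as applying the matrix of cyclically shifted signals to the stacked channel taps. The sketch below checks this numerically for a single transmit/receive pair in the noise-free case. The function names and values are hypothetical, and the channel matrix is built under the assumption that entry (*n*, <*n*−*l*>_{N}) holds *h*(*n*,*l*).

```python
def channel_matrix(h):
    """N x N matrix H with H[n][(n - l) % N] = h[n][l] (assumes L <= N)."""
    N, L = len(h), len(h[0])
    H = [[0j] * N for _ in range(N)]
    for n in range(N):
        for l in range(L):
            H[n][(n - l) % N] += h[n][l]
    return H

def matvec(A, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def G_times_h(s, h):
    """Apply G[s] = [diag{s_cs,0}, ..., diag{s_cs,L-1}] to the stacked taps.

    Row n of the product is sum_l s[(n - l) % N] * h[n][l].
    """
    N, L = len(s), len(h[0])
    return [sum(s[(n - l) % N] * h[n][l] for l in range(L)) for n in range(N)]

s = [1, 2, 3, 4]                               # toy time-domain signal
h = [[1.0, 0.5 * n] for n in range(len(s))]    # time-varying two-tap channel

y_signal_form = matvec(channel_matrix(h), s)   # "H s" view
y_channel_form = G_times_h(s, h)               # "G[s] h" view
```

Both views produce the same samples; the first is linear in the signal for a fixed channel, while the second is linear in the channel for a fixed signal.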

## 3 Iterative data detection and channel and noise variance estimation

The ML solution of all unknown quantities in (??), i.e., **s**, **β**, and *σ*^{2} of **w**, involves multidimensional searches that pose prohibitively high computational complexity. In this and the next sections, the EM algorithm is employed to iteratively compute the ML estimates, with different accuracy versus complexity trade-offs. As will be seen, our proposed schemes provide not only computationally affordable but also closed-form solutions that are free of exhaustive search.

Using the EM terminology, we take **y** as the incomplete data, **β** as the unobservable or missing data, and (*σ*^{2}, **s**) as parameters of interest. The iterative algorithm includes two steps (the E-step and the M-step) at each iteration. In the E-step, an expectation is taken with respect to **β** conditional on the observed data **y** and the previous estimates of (*σ*^{2}, **s**), and an objective function depending only on (*σ*^{2}, **s**) is obtained. In the M-step, through maximizing the function obtained in the E-step, the effect of channel can be compensated, and the current updated estimates of (*σ*^{2}, **s**) can be obtained.

The two steps at the *k* th iteration are detailed as follows:

*E-step*: compute $\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})=\mathbb{E}\{\log f(\mathbf{y},\boldsymbol{\beta}|\sigma^{2},\mathbf{s})|\mathbf{y},\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1}\}$.

*M-step*: solve $(\widehat{\sigma}_{k}^{2},\widehat{\mathbf{s}}_{k})=\arg\max_{\sigma^{2},\mathbf{s}}\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$.

Note that conditioned upon **y**, the only unknown or random component in the complete data (**y**, **β**) is **β**; the expectation is taken with respect to the conditional probability density function $f(\boldsymbol{\beta}|\mathbf{y},\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$, while $(\widehat{\sigma}_{k}^{2},\widehat{\mathbf{s}}_{k})$ are the estimates of *σ*^{2} and **s** at the *k* th iteration.

More specifically, for the *E-step*: using Bayes’s rule, we have:

where the fact that **β** is independent of **s** and *σ*^{2} has been used. From (11), the function $\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$ in the E-step can be expressed as:

where the second term can be ignored in the following derivations, since it is not a function of parameters of interest, i.e., not a function of (*σ*^{2}, **s**) and therefore will not affect the following M-step. Using (10a), the likelihood function *f*(**y**|**β**, *σ*^{2}, **s**) is obtained as:

Substituting (13) into (12), we have:

Notice that the following equation holds true for any matrix **A** and vector **a** with compatible dimension:

Define the conditional mean of **β** in (14) as $\widehat{\boldsymbol{\beta}}_{k}=\mathbb{E}\{\boldsymbol{\beta}|\mathbf{y},\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1}\}$, and using (15), we obtain:

where $\widehat{\boldsymbol{\Upsilon}}_{k}=\mathbb{E}\{(\boldsymbol{\beta}-\widehat{\boldsymbol{\beta}}_{k})(\boldsymbol{\beta}-\widehat{\boldsymbol{\beta}}_{k})^{H}|\mathbf{y},\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1}\}$ represents the corresponding conditional covariance matrix of **β**. It is shown in Appendix 1 that the conditional mean and covariance matrix are approximately given by:

*M-step*: in this step, we aim to maximize $\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$ with respect to *σ*^{2} and **s**. Differentiating (16) with respect to **s** and setting the result to zero, neglecting those irrelevant terms we have:

It is noted that since (19) depends on **s** in an implicit way, direct maximization of (19) with respect to **s** is difficult since multidimensional search is required. In what follows, an alternative expression for $\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$ will be derived from which a closed-form solution for the maximizing value of **s** can be obtained. Since $\widehat{\boldsymbol{\Upsilon}}_{k}$ is an *N*_{T}*N*_{R}(*Q*+1)*L*×*N*_{T}*N*_{R}(*Q*+1)*L* Hermitian matrix, based on eigen-decomposition, we have $\widehat{\boldsymbol{\Upsilon}}_{k}=\sum_{m=0}^{N_{T}N_{R}(Q+1)L-1}\lambda_{m,k}\boldsymbol{\mu}_{m,k}\boldsymbol{\mu}_{m,k}^{H}$, where *λ*_{m,k} is the *m* th eigenvalue of $\widehat{\boldsymbol{\Upsilon}}_{k}$, and **μ**_{m,k} is the *m* th eigenvector, associated with *λ*_{m,k}. Substituting the eigendecomposition of $\widehat{\boldsymbol{\Upsilon}}_{k}$ into (19) and using the two equivalent equations derived in (10a) and (10b), we have:

Notice that **Ξ**[·] and **Θ**[·] defined in (10a) and (10b) are not only applicable to **s** and **β** but also applicable to any vectors with compatible dimension.

Since (20) is a quadratic form of **s**, by setting the first derivative of (20) with respect to **s** to zero, the *k* th signal estimate is then given by:

Note that $\tilde{\mathbf{s}}_{k}=[(\tilde{\mathbf{s}}_{k}^{0})^{T},(\tilde{\mathbf{s}}_{k}^{1})^{T},\ldots,(\tilde{\mathbf{s}}_{k}^{N_{T}-1})^{T}]^{T}$. After OFDM demodulation, the symbol from the *i* th transmit antenna can be obtained as:

Since the transmitted symbols are discrete, each belonging to the symbol constellation, the soft estimate $\tilde{\mathbf{x}}_{k}^{i}$ must be quantized to its nearest constellation point in each iteration. Consequently, constellation mapping is carried out to obtain the discrete symbol estimate as $\widehat{\mathbf{x}}_{k}^{i}=\text{Qant}\{\tilde{\mathbf{x}}_{k}^{i}\}$, where the Qant{·} operation denotes quantization of each element in the bracket. The data symbol estimate is thus obtained by collecting the elements of $\widehat{\mathbf{x}}_{k}^{i}$ corresponding to $I_{d}^{i}$.
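The quantization step can be sketched as a nearest-neighbor search over the constellation. The unnormalized QPSK alphabet and the soft values below are assumptions for illustration; the function name `quantize` is hypothetical.

```python
def quantize(symbols, constellation):
    """Map each soft estimate to the closest constellation point (Euclidean distance)."""
    return [min(constellation, key=lambda c: abs(z - c)) for z in symbols]

QPSK = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]      # hypothetical alphabet
soft = [0.9 + 1.2j, -0.2 - 0.7j, 1.4 - 0.1j]   # toy soft estimates
hard = quantize(soft, QPSK)
```

The search cost grows only with the constellation size per subcarrier, in contrast to an exhaustive search over all joint symbol combinations.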

Finally, putting $\widehat{\mathbf{s}}_{k}^{i}=\mathbf{F}^{H}\widehat{\mathbf{x}}_{k}^{i}$, $i=0,1,\ldots,N_{T}-1$, into (16) and setting the first derivative of $\mathbb{Q}(\sigma^{2},\mathbf{s}|\widehat{\sigma}_{k-1}^{2},\widehat{\mathbf{s}}_{k-1})$ with respect to *σ*^{2} to zero, the *k* th estimate of the unknown noise variance can be obtained as

In summary, starting from a suitable initial value, the proposed iterative EM-based scheme alternates among the explicitly closed-form results (17), (18), (21), (22), and (23) until convergence, i.e., until no significant changes are observed in the updates.
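The stopping rule described above (iterate until the updates no longer change significantly) can be sketched generically. The `step` callback is a hypothetical stand-in for one pass through the closed-form updates; the toy contraction used here is not the paper's algorithm and only exercises the loop.

```python
def iterate_until_converged(step, initial, tol=1e-6, max_iter=100):
    """Repeat estimate = step(estimate) until the change falls below tol.

    Returns the final estimate and the number of iterations performed.
    """
    current = initial
    for k in range(max_iter):
        updated = step(current)
        if abs(updated - current) < tol:
            return updated, k + 1
        current = updated
    return current, max_iter

# Toy update with fixed point 2.0, standing in for one EM iteration.
estimate, iterations = iterate_until_converged(lambda v: 0.5 * v + 1.0, 0.0)
```

In the actual scheme, the scalar change test would be replaced by a norm on the changes of the signal, channel, and noise variance estimates; that choice is left open here.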

## 4 A reduced computational complexity scheme

The computational complexity of the EM-based iterative scheme proposed in Section 3 is summarized in Table 1. Notice that the computational burden mainly comes from performing detection and estimation jointly for all transmit antennas. If, in each iteration, detection and estimation are instead carried out for one antenna at a time, the computational burden will be significantly reduced.

Recalling (6) and (3), two alternative but equivalent expressions for **y** can be derived as:

where $\boldsymbol{\Phi}[\mathbf{s}^{i}]=\mathbf{I}_{N_{R}}\otimes\mathbf{G}[\mathbf{s}^{i}]$, $\mathbf{H}[\boldsymbol{\beta}_{\text{rc}}^{i}]=[(\mathbf{H}^{0i})^{H},(\mathbf{H}^{1i})^{H},\ldots,(\mathbf{H}^{(N_{R}-1)i})^{H}]^{H}$, and the BEM coefficients associated with the channel from the transmit antenna *i* to the *N*_{R} receive antennas are represented by $\boldsymbol{\beta}_{\text{rc}}^{i}=[(\boldsymbol{\beta}^{0i})^{T},(\boldsymbol{\beta}^{1i})^{T},\ldots,(\boldsymbol{\beta}^{(N_{R}-1)i})^{T}]^{T}$ with $\boldsymbol{\beta}^{ji}=[(\boldsymbol{\beta}_{0}^{ji})^{T},(\boldsymbol{\beta}_{1}^{ji})^{T},\ldots,(\boldsymbol{\beta}_{L-1}^{ji})^{T}]^{T}$. The subscript ‘rc’ is short for ‘reduced complexity’ to distinguish it from the **β**^{j} defined in Section 3, which represents the BEM coefficients associated with the channel from the *N*_{T} transmit antennas to the receive antenna *j*.

From (24a) and (24b), it is observed that by applying the mathematical framework of EM, an alternative way to choose the complete data, defined as **ψ** in this scheme, is by decomposing the observed data **y** into its signal components. The complete data **ψ** is obtained as:

where $\boldsymbol{\psi}^{i}=\boldsymbol{\Phi}[\mathbf{s}^{i}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{i}+\mathbf{w}^{i}=\mathbf{H}[\boldsymbol{\beta}_{\text{rc}}^{i}]\mathbf{s}^{i}+\mathbf{w}^{i}$, $i=0,1,\ldots,N_{T}-1$, and **w**^{i}, *i*=0,1,...,*N*_{T}−1 are circularly symmetric and statistically independent Gaussian vectors satisfying $\mathbf{w}=\sum_{i=0}^{N_{T}-1}\mathbf{w}^{i}$.

Similar to the E-step in Section 3, for the *k* th iteration, we need to compute the conditional expectation of the log-likelihood function for the complete data **ψ**. More specifically, for the *E-step*: using (25a), the likelihood function can be expressed as:

where $\boldsymbol{\beta}_{\text{rc}}=[(\boldsymbol{\beta}_{\text{rc}}^{0})^{T},(\boldsymbol{\beta}_{\text{rc}}^{1})^{T},\ldots,(\boldsymbol{\beta}_{\text{rc}}^{N_{T}-1})^{T}]^{T}$, $\boldsymbol{\Upsilon}_{\boldsymbol{\psi}}=\text{Blkdiag}\{\varsigma^{0}\sigma^{2}\mathbf{I}_{NN_{R}},\varsigma^{1}\sigma^{2}\mathbf{I}_{NN_{R}},\ldots,\varsigma^{N_{T}-1}\sigma^{2}\mathbf{I}_{NN_{R}}\}$ with $\mathbb{E}\{\mathbf{w}^{i}(\mathbf{w}^{i})^{H}\}=\varsigma^{i}\sigma^{2}\mathbf{I}_{NN_{R}}$ and the *ς*^{i}’s being arbitrary non-negative and real-valued scalars satisfying $\sum_{i=0}^{N_{T}-1}\varsigma^{i}=1$, and $\boldsymbol{\varrho}=[(\boldsymbol{\Phi}[\mathbf{s}^{0}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{0})^{T},(\boldsymbol{\Phi}[\mathbf{s}^{1}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{1})^{T},\ldots,(\boldsymbol{\Phi}[\mathbf{s}^{N_{T}-1}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{N_{T}-1})^{T}]^{T}$. Notice that in this scheme, we take (*β*_{rc}, **s**) as parameters of interest. Using (26) and neglecting those irrelevant terms, $\mathbb{E}\{\log f(\boldsymbol{\psi}|\boldsymbol{\beta}_{\text{rc}},\mathbf{s})|\mathbf{y},\widehat{\boldsymbol{\beta}}_{\text{rc},k-1},\widehat{\mathbf{s}}_{k-1}\}$ can be expressed as:

where $\widehat{\boldsymbol{\psi}}_{k}$, the conditional mean of **ψ**, can be derived as:

where $\boldsymbol{\mho}=[\mathbf{I}_{NN_{R}},\mathbf{I}_{NN_{R}},\ldots,\mathbf{I}_{NN_{R}}]$ is a matrix with dimension *N**N*_{R}×*N**N*_{R}*N*_{T} that connects **y** and **ψ** as $\mathbf{y}=\sum_{i=0}^{N_{T}-1}\boldsymbol{\psi}^{i}=\boldsymbol{\mho}\boldsymbol{\psi}$. Substituting the corresponding components into the right-hand side of (28), after some manipulations we obtain:

Substituting (29) into (28), finally we obtain:

where

It is noted from (30) that in the following M-step, the maximization of $\mathbb{E}\{\log f(\boldsymbol{\psi}|\boldsymbol{\beta}_{\text{rc}},\mathbf{s})|\mathbf{y},\widehat{\boldsymbol{\beta}}_{\text{rc},k-1},\widehat{\mathbf{s}}_{k-1}\}$ with respect to *β*_{rc} and **s** is equivalent to the minimization of each of the single terms in (30), i.e., minimization of $(\widehat{\boldsymbol{\psi}}_{k}^{i}-\boldsymbol{\Phi}[\mathbf{s}^{i}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{i})^{H}(\widehat{\boldsymbol{\psi}}_{k}^{i}-\boldsymbol{\Phi}[\mathbf{s}^{i}](\mathbf{I}_{N_{R}L}\otimes\mathbf{B})\boldsymbol{\beta}_{\text{rc}}^{i})$ with respect to $\boldsymbol{\beta}_{\text{rc}}^{i}$ and **s**^{i} for each *i*, separately.

Notice that the multidimensional minimization for each of the terms in (30) still remains a formidable task. To solve this problem, substituting (24a) and (24b) into (31), we obtain:

where

Setting *ς*^{i}=1 for the *i* th transmit antenna, we have:

Recalling that $\sum_{i=0}^{N_{T}-1}\varsigma^{i}=1$, and by using (31), for the transmit antennas $\{g\}_{g=0,g\ne i}^{N_{T}-1}$ we have:

Using (??) and (35), (30) can be decomposed into *N*_{T} terms, each of which can be solved as follows:

*M-step*: for the *i* th transmit antenna, we have:

and for the transmit antennas $\{g\}_{g=0,g\ne i}^{N_{T}-1}$, we have:

Therefore, the proposed iterative scheme runs over *k*=0,1,2,... and during the *k* th iteration, *i* is set as $i=<k>_{N_{T}}$. It can be seen that we have split the estimation and detection problem for the MIMO case of Section 3 into estimation and detection problems for *N*_{T} single-input and multiple-output (SIMO) cases, where, during each iteration, parameters and data from only one transmit antenna are estimated and detected. Note that $\boldsymbol{\chi}_{k}^{i}$ given in (33) is a disturbance term that accounts for the background noise and residual interference after the *k* th iteration, where the interference is linearly related to the signals of all transmit antennas. Then, assuming the interference is i.i.d. with zero mean, from the central limit theorem [35], it can be seen that the entries of $\boldsymbol{\chi}_{k}^{i}$ are nearly Gaussian distributed with zero mean and some variance $\sigma_{\chi^{i},k}^{2}$. Under the above assumption, it turns out that the minimization problem in (36) is equivalent to the ML estimation of $\boldsymbol{\beta}_{\text{rc}}^{i}$, **s**^{i} and the unknown variance $\sigma_{\chi^{i}}^{2}$ starting from the observation $\widehat{\boldsymbol{\psi}}^{i}$. Comparing (34a) and (34b) with (10a) and (10b), it is easy to see that the same EM procedure proposed in Section 3 can be directly adopted to solve the optimization problem of (36), with details shown in Appendix 2.
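The round-robin structure of the reduced-complexity scheme can be sketched as follows. Only the antenna selected by the modulo rule is refreshed in a given iteration, while all other estimates keep their last values. The `update_antenna` callback is a hypothetical placeholder for the per-antenna E- and M-steps, and the toy callback merely counts refreshes.

```python
def run_reduced_complexity(num_iterations, n_tx, initial, update_antenna):
    """Round-robin loop: iteration k refreshes antenna i = k mod n_tx only.

    update_antenna(i, estimates) returns the new estimate for antenna i;
    estimates of all other antennas are held at their last values.
    """
    estimates = list(initial)
    order = []
    for k in range(num_iterations):
        i = k % n_tx
        order.append(i)
        estimates[i] = update_antenna(i, estimates)
    return estimates, order

# Toy callback: count how often each antenna has been refreshed.
counts, order = run_reduced_complexity(5, 2, [0, 0], lambda i, est: est[i] + 1)
```

With two transmit antennas and five iterations, antenna 0 is refreshed three times and antenna 1 twice, so a full sweep over all antennas takes *N*_{T} iterations.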

The computationally feasible EM scheme is summarized as follows:

The computational complexity of the proposed iterative EM-based scheme with reduced complexity is summarized in Table 2.

Note that compared to Table 1, the computational complexity of the proposed reduced-complexity iterative EM-based scheme is significantly lower than that of the EM-based scheme proposed in Section 3. However, this reduction comes at a price. As will be shown in Section 5, there is a minor performance degradation compared to the EM-based scheme proposed in Section 3, mainly for two reasons. First, the disturbance term in (34a) and (34b) contains the background noise as well as the residual interference from other transmit antennas, whereas in (10a) and (10b), only the background noise is contained. Second, separate estimation and detection for each antenna is suboptimal compared to joint estimation and detection for all antennas, which is optimal in the sense of estimation and detection theory [36].

### 4.1 Initialization

The EM algorithm is only guaranteed to converge to a local maximum [6, 7], so a good initial value is important.

A least-squares (LS) algorithm based on pilot symbols is used to provide the initial estimate; its quality is demonstrated in the simulations of Section 5. Recalling (1), (10a), and (10b), we have:

where {\mathit{\Omega}}_{p}=\text{Blkdiag}\{{\mathbf{F}}^{H}{\mathbf{E}}_{p}^{0},{\mathbf{F}}^{H}{\mathbf{E}}_{p}^{1},\mathrm{...},{\mathbf{F}}^{H}{\mathbf{E}}_{p}^{{N}_{T}-1}\}, {\mathit{\Omega}}_{d}=\text{Blkdiag}\{{\mathbf{F}}^{H}{\mathbf{E}}_{d}^{0},{\mathbf{F}}^{H}{\mathbf{E}}_{d}^{1},\mathrm{...},{\mathbf{F}}^{H}{\mathbf{E}}_{d}^{{N}_{T}-1}\}, {\mathbf{x}}_{p}={[{({\mathbf{x}}_{p}^{0})}^{T},{({\mathbf{x}}_{p}^{1})}^{T},\mathrm{...},{({\mathbf{x}}_{p}^{{N}_{T}-1})}^{T}]}^{T}, and {\mathbf{x}}_{d}={[{({\mathbf{x}}_{d}^{0})}^{T},{({\mathbf{x}}_{d}^{1})}^{T},\mathrm{...},{({\mathbf{x}}_{d}^{{N}_{T}-1})}^{T}]}^{T}. By treating the term containing **x**_{d} as interference, the LS estimate of **β** is obtained as:

Substituting (39) into (10b), the initial signal detection is obtained as:

Finally, the initial variances {\widehat{\sigma}}_{0}^{2} and {\left\{{\widehat{\sigma}}_{{\chi}^{i},0}^{2}\right\}}_{i=0}^{{N}_{T}-1} are all set to 0.
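The pilot-based LS step of this initialization can be sketched generically in Python with NumPy. Because the explicit model matrices are not reproduced here, `a_p` below is a hypothetical stand-in for the pilot-dependent matrix that multiplies **β**, and the data-dependent term is modelled as a small random disturbance. The sketch illustrates least squares under interference in general and is not the exact expression used in the paper.

```python
import numpy as np


def ls_estimate(a_p: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares solution of y ~= a_p @ beta, ignoring the interference term."""
    beta_hat, *_ = np.linalg.lstsq(a_p, y, rcond=None)
    return beta_hat


def crandn(rng: np.random.Generator, *shape: int) -> np.ndarray:
    """Complex Gaussian samples with independent real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


rng = np.random.default_rng(0)
a_p = crandn(rng, 32, 6)            # hypothetical pilot-dependent model matrix
beta = crandn(rng, 6)               # hypothetical true coefficient vector
interference = 0.05 * crandn(rng, 32)  # stand-in for the unknown-data term plus noise

y = a_p @ beta + interference
beta_hat = ls_estimate(a_p, y)
rel_err = np.linalg.norm(beta_hat - beta) / np.linalg.norm(beta)
print(f"relative error of the initial LS estimate: {rel_err:.3e}")
```

The residual error of such an initial estimate is driven by the term treated as interference, which is what the subsequent EM iterations are meant to reduce.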

## 5 Simulation results and discussions

In this section, the performance of the proposed algorithm is demonstrated by Monte Carlo simulations. In the simulations, the numbers of transmit and receive antennas are set to *N*_{T}=*N*_{R}=2, and each OFDM symbol has 64 subcarriers (*N*=64) and occupies a bandwidth of 20 MHz. The sampling interval *T*_{s} is thus 50 ns. The length of the CP is *N*_{cp}=8.

The normalized maximum Doppler shift *N* *f*_{d}*T*_{s} is set to 0.075 and 0.15, where *f*_{d} represents the maximum Doppler frequency.
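As a quick worked check of these settings, the following Python sketch derives the timing quantities and the absolute Doppler frequencies implied by the normalized values. It assumes critical sampling, *T*_{s}=1/bandwidth, which is consistent with the stated 50 ns; the variable names are illustrative only.

```python
BANDWIDTH_HZ = 20e6
N_SUBCARRIERS = 64
N_CP = 8

t_s = 1.0 / BANDWIDTH_HZ                    # sampling interval T_s
symbol_duration = N_SUBCARRIERS * t_s       # useful OFDM symbol duration N * T_s
cp_duration = N_CP * t_s                    # cyclic prefix duration
subcarrier_spacing = 1.0 / symbol_duration  # 1 / (N * T_s)


def max_doppler_hz(normalized_doppler: float) -> float:
    """Solve N * f_d * T_s = normalized_doppler for f_d."""
    return normalized_doppler / symbol_duration


for nd in (0.075, 0.15):
    print(f"N*f_d*T_s = {nd}: f_d = {max_doppler_hz(nd):.1f} Hz")
```

Under these assumptions the subcarrier spacing is 312.5 kHz, so the two settings correspond to maximum Doppler frequencies of about 23.4 kHz and 46.9 kHz, that is, 7.5% and 15% of the subcarrier spacing.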

The channel has three taps (*L*=3) with an exponential power delay profile, namely {\sigma}_{l}^{2}=\exp(-\kappa l)\left((1-\exp(-\kappa))/(1-\exp(-\kappa L))\right), l=0,1,\mathrm{...},L-1 with *κ*=1/3. In typical communication scenarios, only a few significant paths dominate the effect of the wireless channel [4]. Therefore, *L*=3 is a reasonable setting. Each tap coefficient follows a complex Gaussian distribution. The data are modulated by quadrature phase shift keying (QPSK) and 16-ary quadrature amplitude modulation (16 QAM), respectively, with unit power. The pilot cluster follows the structure in [37]; more specifically, seven pilot clusters are used for each transmit antenna. The clusters are equally spaced among the subcarriers, and in each cluster, one nonzero pilot is guarded by one zero pilot on each side. The nonzero pilots are generated as zero-mean complex Gaussian random variables with power three times that of the data symbols. Furthermore, the generalized complex exponential BEM (GCE-BEM) [34] is adopted.
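The exponential power delay profile above is normalized so that the tap powers sum to one, which the following Python sketch checks numerically. The function name `exponential_pdp` and the zero-mean circularly symmetric tap draw at the end are illustrative assumptions; the text only states that each tap is complex Gaussian.

```python
import numpy as np


def exponential_pdp(num_taps: int, kappa: float) -> np.ndarray:
    """Tap powers sigma_l^2 = exp(-kappa*l) * (1 - exp(-kappa)) / (1 - exp(-kappa*L))."""
    l = np.arange(num_taps)
    scale = (1.0 - np.exp(-kappa)) / (1.0 - np.exp(-kappa * num_taps))
    return np.exp(-kappa * l) * scale


pdp = exponential_pdp(num_taps=3, kappa=1.0 / 3.0)
print("tap powers:", pdp, "sum:", pdp.sum())

# One illustrative channel realization: zero-mean complex Gaussian taps
# whose variances follow the profile (an assumption for this sketch).
rng = np.random.default_rng(1)
taps = np.sqrt(pdp / 2.0) * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
```

The normalization follows from the geometric series: the sum of exp(-κl) over l=0,...,L-1 equals (1-exp(-κL))/(1-exp(-κ)), which cancels the scaling factor.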

### 5.1 Convergence of the proposed schemes

Figures 1, 2, and 3 present the convergence performance of the proposed EM-based scheme in Section 3 (marked as scheme 1) and the proposed EM-based scheme in Section 4 (marked as scheme 2) with signal-to-noise ratio (SNR) equal to 10, 20, and 30 dB. It can be seen that both the mean-square error (MSE) and bit error rate (BER) improve significantly in the first few iterations and converge to stable values within eight iterations. Channel estimation with full training and data detection with perfect CSI are shown for comparison. The Cramer-Rao bound, computed according to [38], is also shown for comparison. It can be seen from Figure 1 that after convergence, the channel estimation performance of both schemes greatly improves on that of the initial estimate (marked as iteration = 0), which indicates the ability of the proposed algorithm to cancel the interference from unknown data to channel estimation through iterations. The channel estimation performance of scheme 1 is very close to the Cramer-Rao bound and the full training case. The channel estimation performance of scheme 2 suffers a minor degradation compared to that of scheme 1, which is the price paid for the reduced computational complexity. Similar results can be observed for the performance of data detection in Figures 2 and 3, which indicates that the updated channel estimate can in turn greatly improve the data detection through iterations. Similar convergence results are also observed for the 16 QAM case; the figures are not presented here due to space limitations.

### 5.2 Performance of the proposed schemes

Figures 4, 5, and 6 show the MSE and BER performance achieved by the proposed iterative algorithm versus SNR. It can be seen from Figure 4 that, after convergence, the proposed schemes 1 and 2 both perform much better than the initial estimate and come close to the Cramer-Rao bound and the full training case.

Similarly, it can be seen from Figure 5 that for the case where *N* *f*_{d}*T*_{s}=0.075, the BER performance of the proposed iterative algorithm after convergence is very close to that of the ideal case, which assumes perfect CSI. For the severe case where *N* *f*_{d}*T*_{s}=0.15, it can be seen from Figure 6 that the proposed iterative algorithm can still cope with such a highly TF dispersive channel and performs well. Moreover, Figures 5 and 6 show that the proposed algorithm also performs well for signals with both amplitude and phase variations, such as 16 QAM.

Finally, we investigate how the proposed schemes are affected by different channel lengths. A severe case where the channel length is equal to the number of embedded pilots (marked as case 2) is shown in Figure 7. As can be seen from the figure, compared to the original case where the channel length is 3 (marked as case 1), there is an obvious performance degradation of the proposed schemes for the severe case 2. This can be explained by estimation theory [36]: when the channel length increases, more parameters need to be estimated from the same observations, which degrades performance. Conversely, if the channel length decreases, fewer parameters need to be estimated, which improves performance.

## 6 Conclusions

In this paper, two EM-based iterative data detection and channel and noise variance estimation schemes for MIMO-OFDM systems operating over TF dispersive channels under unknown background noise have been proposed. The resulting schemes converge in a few iterations and can effectively estimate TF dispersive channels and obtain reliable data detection under unknown background noise. The first scheme iteratively detects data and estimates the channel and noise variance simultaneously for all antennas, and the updating expressions of these estimates are all derived in closed form. Simulation results showed that after convergence, the performance of the first scheme is very close to that of the optimal case, which assumes full training and perfect CSI. To reduce the computational complexity of the first scheme, another EM-based scheme has been proposed that detects data and estimates the channel for only one antenna during each iteration while holding the unknown quantities of the other antennas at their last estimates; it is also derived in closed form. Simulation results showed that its performance degrades only slightly compared to the first scheme, while the computational complexity is significantly reduced.

## Appendices

### Appendix 1

#### Derivation of (17) and (18)

Using Bayes’s formula, the conditional pdf of **β** is given by:

where the fact that **β** is independent of **s** and *σ*^{2} has been used. The BEM coefficient **β** can be shown to be a complex Gaussian variable [33] with zero mean and covariance matrix **R**_{β}, that is:

Note that:

With *f*(**y**|**β**, *σ*^{2}, **s**) given by (13), putting (13) and (42) into (43), we have:

Substituting (13), (42), and (44) into (41), after some manipulations we have:

where

Thus, the pdf f(\mathit{\beta}|\mathbf{y},{\widehat{\sigma}}_{k-1}^{2},{\widehat{\mathbf{s}}}_{k-1}) is a Gaussian distribution. In addition, {\widehat{\mathit{\beta}}}_{k} and {\widehat{\mathit{{\rm Y}}}}_{k} given in (46) and (47), respectively, are in fact its conditional mean and covariance. To reflect that we have no prior information on **β**, we take the limit ||**R**_{β}||→+*∞*, that is, we set {\mathbf{R}}_{\beta}^{-1} to zero, which leads to (17) and (18). Setting {\mathbf{R}}_{\beta}^{-1} to zero does cause some performance degradation; however, this is a typical complexity versus performance trade-off. Moreover, as can be seen from the simulation results in Section 5, even when {\mathbf{R}}_{\beta}^{-1} is set to zero, the proposed algorithm performs well, and its performance is acceptable.
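The effect of dropping the prior can be illustrated on a generic linear Gaussian model. The Python sketch below uses the standard posterior for y = A**β** + n with n ~ CN(0, *σ*^{2}I) and **β** ~ CN(0, **R**_{β}); the matrix `a` is a hypothetical stand-in for the model matrix in (13), which is not reproduced here, so this is an illustration of the limiting argument rather than a restatement of (46) and (47).

```python
import numpy as np


def gaussian_posterior(a, y, noise_var, r_beta_inv):
    """Posterior mean and covariance of beta for y = a @ beta + n.

    cov  = (a^H a / noise_var + R_beta^{-1})^{-1}
    mean = cov @ a^H y / noise_var
    """
    precision = a.conj().T @ a / noise_var + r_beta_inv
    cov = np.linalg.inv(precision)
    mean = cov @ (a.conj().T @ y) / noise_var
    return mean, cov


rng = np.random.default_rng(2)
a = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
beta = rng.standard_normal(4) + 1j * rng.standard_normal(4)
noise_var = 0.1
noise = np.sqrt(noise_var / 2) * (rng.standard_normal(16) + 1j * rng.standard_normal(16))
y = a @ beta + noise

# No prior information: R_beta^{-1} = 0, so the mean reduces to the LS solution.
mean_flat, cov_flat = gaussian_posterior(a, y, noise_var, np.zeros((4, 4)))
ls_solution, *_ = np.linalg.lstsq(a, y, rcond=None)

# Informative prior (identity covariance, chosen only for illustration).
mean_prior, _ = gaussian_posterior(a, y, noise_var, np.eye(4))
```

With a zero inverse prior covariance, the posterior mean coincides with the ordinary least-squares solution, which is why no knowledge of **R**_{β} is needed in that case.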

### Appendix 2

#### Solving (36)

Comparing (34a) and (34b) with (10a) and (10b), referring to Section 3, we take {\widehat{\mathit{\psi}}}_{k}^{i} as the incomplete data, {\mathit{\beta}}_{\text{rc}}^{i} as the unobservable or missing data, and ({\sigma}_{{\chi}^{i}}^{2}, **s**^{i}) as parameters of interest. The two steps at the *k* th iteration are detailed as follows:

*E-step*: compute \mathbb{Q}({\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i}|{\widehat{\sigma}}_{{\chi}^{i},k-1}^{2},{\widehat{\mathbf{s}}}_{k-1}^{i})=\mathbb{E}\{\text{log}f({\widehat{\mathit{\psi}}}_{k}^{i}, {\mathit{\beta}}_{\text{rc}}^{i}|{\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i})|{\widehat{\mathit{\psi}}}_{k}^{i},{\widehat{\sigma}}_{{\chi}^{i},k-1}^{2},{\widehat{\mathbf{s}}}_{k-1}^{i}\}.

*M-step*: solve ({\widehat{\sigma}}_{{\chi}^{i},k}^{2},{\widehat{\mathbf{s}}}_{k}^{i})={\text{arg max}}_{{\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i}}\mathbb{Q}({\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i}|{\widehat{\sigma}}_{{\chi}^{i},k-1}^{2}, {\widehat{\mathbf{s}}}_{k-1}^{i}).

Note that, conditioned upon {\widehat{\mathit{\psi}}}_{k}^{i}, the only unknown or random component in the complete data ({\widehat{\mathit{\psi}}}_{k}^{i},{\mathit{\beta}}_{\text{rc}}^{i}) is {\mathit{\beta}}_{\text{rc}}^{i}. The expectation is therefore taken with respect to the conditional probability density function f({\mathit{\beta}}_{\text{rc}}^{i}|{\widehat{\mathit{\psi}}}_{k}^{i},{\widehat{\sigma}}_{{\chi}^{i},k-1}^{2},{\widehat{\mathbf{s}}}_{k-1}^{i}), while ({\widehat{\sigma}}_{{\chi}^{i},k}^{2},{\widehat{\mathbf{s}}}_{k}^{i}) are the estimates of {\sigma}_{{\chi}^{i}}^{2} and **s**^{i} at the *k*th iteration. More specifically, for the *E-step*, using Bayes’s rule, we obtain:

Using (48), the function \mathbb{Q}({\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i}|{\widehat{\sigma}}_{{\chi}^{i},k-1}^{2},{\widehat{\mathbf{s}}}_{k-1}^{i}) can be expressed as:

where the second term can be ignored in the following derivations, since it is not a function of parameters of interest, i.e., not a function of ({\sigma}_{{\chi}^{i}}^{2}, **s**^{i}). Using (34a), the likelihood function f({\widehat{\mathit{\psi}}}_{k}^{i}|{\mathit{\beta}}_{\text{rc}}^{i},{\sigma}_{{\chi}^{i}}^{2},{\mathbf{s}}^{i}) is obtained as:

Substituting (50) into (49) and referring to (14), (15) and (16) and Appendix 1, the conditional mean and covariance matrix are obtained as:

It is noted that the matrix **Φ**[ **s**^{i}] is of dimension *N*_{R}*N*×*N*_{R}*N* *L*, and the matrix ({\mathbf{I}}_{{N}_{R}L}\otimes \mathbf{B}) is of dimension *N*_{R}*N* *L*×*N*_{R}(*Q*+1)*L*; the *N*_{R}(*Q*+1)*L*×*N*_{R}(*Q*+1)*L* matrix inversion required in (51) and (52) is only \frac{1}{{N}_{T}} of that needed in (17) and (18).

*M-step*: using the two equivalent expressions derived in (34a) and (34b) and similar to (19), (20), (21) and (22), the signal updating equation is obtained as:

where {\widehat{\mathit{{\rm Y}}}}_{\text{rc},k}^{i}=\sum _{m=0}^{{N}_{R}(Q+1)L-1}{\lambda}_{\text{rc},m,k}^{i}{\mathit{\mu}}_{\text{rc},m,k}^{i}{({\mathit{\mu}}_{\text{rc},m,k}^{i})}^{H} represents the eigendecomposition of {\widehat{\mathit{{\rm Y}}}}_{\text{rc},k}^{i}. It is noted that compared to (21), where an *N*_{T}*N*×*N*_{T}*N* matrix inversion is required, only an *N*×*N* matrix inversion is needed in (53). The symbol detection can thus be obtained after OFDM demodulation as

Substituting {\widehat{\mathbf{s}}}_{k}^{i}={\mathbf{F}}^{H}{\widehat{\mathbf{x}}}_{k}^{i} and (50), (51) and (52) into (49) and referring to (23), the unknown noise variance for the disturbance term {\mathit{\chi}}_{k}^{i} can be obtained as:

In summary, (51), (52), (53), (54) and (55) solve the minimization problem in (36).
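The eigendecomposition used in the signal update, a sum of rank-one terms weighted by the eigenvalues, can be checked numerically for any Hermitian covariance matrix. The Python sketch below uses a randomly generated positive semidefinite matrix as a hypothetical stand-in for the conditional covariance; it only illustrates the decomposition itself, not the update (53).

```python
import numpy as np

rng = np.random.default_rng(3)
m = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
cov = m @ m.conj().T  # Hermitian positive semidefinite stand-in

# eigh is intended for Hermitian matrices; it returns real eigenvalues and
# orthonormal eigenvectors stored as the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(cov)

# Rebuild the matrix as sum_m lambda_m * mu_m * mu_m^H.
rebuilt = sum(lam * np.outer(mu, mu.conj())
              for lam, mu in zip(eigvals, eigvecs.T))

print("max reconstruction error:", np.abs(rebuilt - cov).max())
```

Using `numpy.linalg.eigh` rather than a general eigensolver exploits the Hermitian structure of a covariance matrix and keeps the eigenvalues real.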

Notice that the computational complexity can be further reduced by observing the diagonal structure of both **Φ**[ **s**^{i}] and ({\mathbf{I}}_{{N}_{R}L}\otimes \mathbf{B}) in (24a). Therefore, (51) and (52) can be further split into *N*_{R} sub-matrices, each of which is expressed as:

where {\widehat{\mathit{\psi}}}_{k}^{i}={[{({\widehat{\mathit{\psi}}}_{k}^{0,i})}^{T},{({\widehat{\mathit{\psi}}}_{k}^{1,i})}^{T},\mathrm{...},{({\widehat{\mathit{\psi}}}_{k}^{{N}_{R}-1,i})}^{T}]}^{T}. Then, {\widehat{\mathit{\beta}}}_{\text{rc},k}^{i} is obtained as {\widehat{\mathit{\beta}}}_{\text{rc},k}^{i}={[{({\widehat{\mathit{\beta}}}_{\text{rc},k}^{0,i})}^{T},{({\widehat{\mathit{\beta}}}_{\text{rc},k}^{1,i})}^{T},\mathrm{...},{({\widehat{\mathit{\beta}}}_{\text{rc},k}^{{N}_{R}-1,i})}^{T}]}^{T}, and {\widehat{\mathit{{\rm Y}}}}_{\text{rc},k}^{i} can be obtained as {\widehat{\mathit{{\rm Y}}}}_{\text{rc},k}^{i}=\text{Blkdiag}\{{\mathit{{\rm Y}}}_{\text{rc},k}^{0,i},{\mathit{{\rm Y}}}_{\text{rc},k}^{1,i},\mathrm{...},{\mathit{{\rm Y}}}_{\text{rc},k}^{{N}_{R}-1,i}\}. It is noted that the matrix **G**[ **s**^{i}] is of dimension *N*×*N* *L*, and the matrix {\mathbf{B}}^{z}\triangleq {\mathbf{I}}_{L}\otimes \mathbf{B} is of dimension *N* *L*×(*Q*+1)*L*; the (*Q*+1)*L*×(*Q*+1)*L* matrix inversion required in (56) and (57) is \frac{1}{{N}_{R}} of that needed in (51) and (52) and therefore only \frac{1}{{N}_{R}{N}_{T}} of that needed in (17) and (18).
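The matrix sizes quoted in this appendix can be tabulated with a short Python sketch. The joint size *N*_{T}*N*_{R}(*Q*+1)*L* for (17) and (18) is inferred from the stated 1/*N*_{T} ratio rather than read from those equations, and the BEM order `q=2` used in the example call is a hypothetical value, since *Q* is not specified in this part of the text.

```python
def inversion_sizes(n_t: int, n_r: int, num_taps: int, q: int) -> dict:
    """Side lengths of the square matrices inverted at each level of splitting."""
    per_pair = (q + 1) * num_taps        # (56)-(57): one transmit/receive pair
    return {
        "joint": n_t * n_r * per_pair,   # (17)-(18): all antennas jointly (inferred)
        "per_tx": n_r * per_pair,        # (51)-(52): one transmit antenna
        "per_tx_rx": per_pair,
    }


sizes = inversion_sizes(n_t=2, n_r=2, num_taps=3, q=2)
print(sizes)
```

Since dense matrix inversion is commonly costed as cubic in the side length, the saving in operations per inversion would be larger than the ratio of the side lengths alone; this rule of thumb is an assumption of the sketch, not a figure taken from Tables 1 and 2.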

## References

1. Bocskei H, Paulraj AJ: *Multiple-Input Multiple-Output (MIMO) Wireless Systems*. Cambridge: Cambridge Univ. Press; 2003.
2. Nee RV, Prasad R: *OFDM for Wireless Multimedia Communications*. Norwood: Artech House Publishers; 2000.
3. Hanzo L, Akhtman J, Jiang M, Wang L: *MIMO-OFDM for LTE, WiFi and WiMAX: Coherent versus Non-coherent and Cooperative Turbo Transceivers*. Hoboken: Wiley; 2010.
4. Goldsmith A: *Wireless Communications*. Cambridge: Cambridge Univ. Press; 2005.
5. Viterbi AJ, Omura JK: *Principles of Digital Communication and Coding*. New York: Dover Press; 2009.
6. Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. *J. Royal Statistical Soc., Ser. B (Methodological)* 1977, 39(1):1-38.
7. Moon T: The expectation-maximization algorithm. *IEEE Signal Process. Mag* 1996, 13(6):47-60. doi:10.1109/79.543975
8. Assra A, Hamouda W, Youssef A: EM-based joint channel estimation and data detection for MIMO-CDMA systems. *IEEE Trans. Veh. Technol* 2010, 59(3):1205-1216.
9. Choi J: An EM based joint data detection and channel estimation incorporating with initial channel estimate. *IEEE Commun. Lett* 2008, 12(9):654-656.
10. Cozzo C, Hughes BL: Joint channel estimation and data detection in space-time communications. *IEEE Trans. Commun* 2003, 51(8):1266-1270. doi:10.1109/TCOMM.2003.815062
11. Zhang XY, Wang DG, Wei JB: Joint symbol detection and channel estimation for MIMO-OFDM systems via the variational Bayesian EM algorithm. In *IEEE Wireless Communications and Networking Conference*. Las Vegas; 31 Mar–3 Apr 2008:13-17.
12. So DKC, Chen RS: Iterative EM receiver for space-time coded systems in MIMO frequency-selective fading channels with channel gain and order estimation. *IEEE Trans. Wirel. Commun* 2004, 3(6):1928-1935. doi:10.1109/TWC.2004.837293
13. Lu B, Wang X, Li Y: Iterative receivers for space-time block coded OFDM systems in dispersive fading channels. *IEEE Trans. Wirel. Commun* 2002, 1(2):213-225. doi:10.1109/7693.994815
14. Zia A, Reilly JPR, Manton J, Shiran S: An information geometry approach to ML estimation with incomplete data: application to semiblind MIMO channel identification. *IEEE Trans. Signal Process* 2007, 55(8):3975-3985.
15. Khalighi MA, Boutros JJ: Semi-blind channel estimation using EM algorithm in iterative MIMO APP detectors. *IEEE Trans. Wirel. Commun* 2006, 5(11):3165-3173.
16. Aldana CH, de Cardevalho E, Ciof J: Channel estimation for multicarrier multiple input single output systems using the EM algorithm. *IEEE Trans. Signal Process* 2003, 51(12):3280-3292. doi:10.1109/TSP.2003.819082
17. Zhang J, Hanzo L, Mu X: Joint decision-directed channel and noise-variance estimation for MIMO OFDM/SDMA systems based on expectation-conditional maximization. *IEEE Trans. Veh. Technol* 2011, 60(5):2139-2151.
18. Choi J: An EM-based iterative receiver for MIMO-OFDM under interference-limited environments. *IEEE Trans. Wirel. Commun* 2007, 6(11):3994-4003.
19. Wautelet X, Herzet C, Dejonghe A, Louveaux J, Vandendorpe L: Comparison of EM-based algorithms for MIMO channel estimation. *IEEE Trans. Commun* 2007, 55(1):216-226.
20. Nevat I, Peters GW, Yuan J: Detection of Gaussian constellations in MIMO systems under imperfect CSI. *IEEE Trans. Commun* 2010, 58(4):1151-1160.
21. Georghiades C, Han J: Sequence estimation in the presence of random parameters via the EM algorithm. *IEEE Trans. Commun* 1997, 45(3):300-308. doi:10.1109/26.558691
22. Chan F, Choi J: Neighborhood exploring detector: an EM-based signal detector for multiple antenna systems. *IEEE Trans. Signal Process* 2007, 55(5):1875-1885.
23. Kashima T, Fukawa K, Suzuki H: Adaptive MAP receiver via the EM algorithm and message passings for MIMO-OFDM mobile communications. *IEEE J. Sel. Areas Commun* 2006, 24(3):437-447.
24. Ueng Y-L, Chen Y-M, Lin J-Y: A MIMO-BICM scheme using a convolutional interleaver for delay-sensitive applications. *IEEE Trans. Veh. Technol* 2010, 59(5):2380-2393.
25. Khalighi M, Bourennane S: Semiblind single-carrier MIMO channel estimation using overlay pilots. *IEEE Trans. Veh. Technol* 2008, 57(3):951-1956.
26. Choi J: MIMO-BICM iterative receiver with the EM based channel estimation and simplified MMSE combining with soft cancellation. *IEEE Trans. Signal Process* 2006, 54(8):3247-3251.
27. Zheng J, Rao B: LDPC-coded MIMO systems with unknown block fading channels: soft MIMO detector design, channel estimation, and code optimization. *IEEE Trans. Signal Process* 2006, 54(4):1504-1518.
28. Khalighi MA, Boutros J, Hélard J-F: Data-aided channel estimation for turbo-PIC MIMO detectors. *IEEE Commun. Lett* 2006, 10(5):350-352. doi:10.1109/LCOMM.2006.1633319
29. Pham T-H, Liang Y-C, Nallanathan A: A joint channel estimation and data detection receiver for multiuser MIMO IFDMA systems. *IEEE Trans. Commun* 2009, 57(6):1857-1865.
30. Gao J, Li H: Low-complexity MAP channel estimation for mobile MIMO-OFDM systems. *IEEE Trans. Wirel. Commun* 2008, 7(3):774-780.
31. Souza RD, Garcia-Frias J, Haimovich AM: Semiblind EM based iterative receivers for space-time-coded modulation and quasi-static frequency-selective fading channels. *IEEE Trans. Veh. Technol* 2006, 55(4):1259-1268. doi:10.1109/TVT.2006.877461
32. Xie YZ, Georghiades CN: Two EM-type channel estimation algorithms for OFDM with transmitter diversity. *IEEE Trans. Commun* 2003, 51(1):106-115. doi:10.1109/TCOMM.2002.807617
33. Ma X, Giannakis GB, Ohno S: Optimal training for block transmissions over doubly selective wireless fading channels. *IEEE Trans. Signal Process* 2003, 51(5):1351-1366. doi:10.1109/TSP.2003.810304
34. Tang Z, Leus G, Cannizzaro RC, Banelli P: Pilot-assisted time-varying channel estimation for OFDM systems. *IEEE Trans. Signal Process* 2007, 55(5):2226-2238.
35. Stark H, Woods JW: *Probability and Random Processes with Applications to Signal Processing*. Upper Saddle River: Prentice-Hall; 2002.
36. Kay SM: *Fundamentals of Statistical Signal Processing: Estimation Theory*. Upper Saddle River: Prentice-Hall; 1993.
37. Kannu A, Schniter P: Design and analysis of MMSE pilot-aided cyclic-prefixed block transmissions for doubly selective channels. *IEEE Trans. Signal Process* 2008, 56(3):1148-1160.
38. Tree H, Bell K: *Bayesian Bounds for Parameter Estimation and Nonlinear Filtering/Tracking*. New York: Wiley-IEEE Press; 2007.

## Acknowledgements

This work was supported in part by the National Science Foundation of China under grant numbers 61032002, 60902026, and 60972029, the Chinese Important National Science & Technology Specific Projects under grant 2011ZX03001-007-01, and the Program for New Century Excellent Talents in University, NCET-11-0058.

## Additional information

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Zhong, K., Lei, X. & Li, S. Iterative channel estimation and data detection for MIMO-OFDM systems operating in time-frequency dispersive channels under unknown background noise. *J Wireless Com Network* **2013**, 182 (2013). https://doi.org/10.1186/1687-1499-2013-182

### Keywords

- Multiple-input multiple-output (MIMO); Orthogonal frequency division multiplexing (OFDM); Time-frequency (TF) dispersive channels; Unknown noise variance; Expectation-maximization (EM)