
On the optimality of training signals for MMSE channel estimation in MIMO-OFDM systems

Abstract

In this paper, we investigate the optimality of training signals for linear minimum mean square error (LMMSE) channel estimation in multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) with frequency-selective fading channels. This is a very challenging problem due to its mathematical intractability and has not been analytically solved in the literature. Using the Lagrange multiplier method, we derive the optimality conditions for training signal design. Important findings revealed on optimal training signals are twofold: (i) the energies of the training signals on each subcarrier are equal, and (ii) on each subcarrier, the training signals transmitted from the different antennas are orthogonal and of equal energy. We verify that our results are in line with the design principles that have been derived in single-carrier MIMO systems. Two types of optimal training signal examples that satisfy the optimality conditions are presented for practical implementations in MIMO-OFDM systems. Simulation results show that the training signals based on the optimality conditions outperform other non-optimal training signals in terms of channel estimation performance.

1 Introduction

Recently, increasing interest has been concentrated on multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) for broadband wireless communication. The combination of OFDM with MIMO exploits the benefits from both techniques, i.e., the robustness to combat multipath delay spread and an increase in system capacity [1-5]. For a practical implementation of MIMO-OFDM systems, channel estimation becomes very important for the system performance. Imperfect channel estimation typically leads to the increase of error rates and reduces transmission efficiency. In this study, we focus on the problem of designing MIMO-OFDM training signals for channel estimation, a critical component in many modern wireless communication systems.

Several approaches on optimal training signal design have been proposed in the literature. In the absence of prior statistical information about the channel, simple least-squares channel estimation is used. In [6], the optimal placement of training signals is studied for a single-input single-output (SISO) OFDM case. The optimal training design is extended to the MIMO-OFDM case in [7]. In [8], more general training structures for MIMO-OFDM are presented that utilize frequency division, time division, and code division multiplexing.

In the presence of prior statistical information about the channel, a more efficient channel estimation technique can be used. Linear minimum mean square error (LMMSE) estimators that incorporate prior knowledge to improve channel estimation are known to be optimal for this case. In [9,10], optimal conditions for training signals are studied for SISO frequency-selective fading channels. In [11], the results are extended to MIMO frequency-selective fading channels, and it is revealed that training signals across transmit antennas should be orthogonal and training signals should be equi-powered. The results are very simple but effective and thus have been widely used as design guidelines for recent wireless systems. One can intuitively generalize the main principles to multi-carrier systems while the optimality of the principles has remained unproved for the multi-carrier systems, i.e., OFDM. For example, downlink reference signals in commercial LTE systems are designed with the same principles. In [12], an optimal training design for both least-squares estimators and LMMSE estimators is studied assuming cyclic delay diversity OFDM systems.

Meanwhile, research interest has shifted to more practical issues in MIMO channel estimation. In [13,14], a training signal design in the presence of in-phase and quadrature imbalances is considered for SISO-OFDM and MIMO-OFDM systems. A robust training signal design for the LMMSE channel estimator under imperfect knowledge of the second-order characteristics of the channel is studied in [15-17]. Efficient algorithms that exploit the spatial correlation of MIMO channels are proposed for training signal design in [18,19].

In this paper, we aim to complete the puzzle with the missing piece. We directly tackle a multi-carrier system model and derive the optimality conditions of training signals for the LMMSE estimator in MIMO-OFDM systems. This is quite challenging because simultaneously considering all MIMO dimensions, channel statistics, multiple subcarriers, and multiple symbols leads to an extremely complex model and a mathematically intractable problem. To the best of our knowledge, this has never been solved in the literature. We analytically derive the optimality conditions using the Lagrange multiplier method, which is the principal contribution of this study.

The remainder of the paper is organized as follows. Section 2 describes the system model of our work. After Section 3 analytically derives the optimality conditions, the optimal training signal design is discussed in Section 4. Section 5 presents two types of optimal training signal examples for practical MIMO-OFDM systems. Simulation results are provided in Section 6. Finally, Section 7 draws conclusions.

Notations: Uppercase and lowercase boldface letters are used for matrices and vectors, respectively. The superscript ‘∗’ denotes the conjugate transpose, superscript ‘T’ denotes the transpose, and superscript ‘−1’ denotes the matrix inverse. We will use \(\mathbb {E} \left [ \cdot \right ] \) for expectation, \(\mathbf{vec}(\cdot)\) for matrix vectorization, \(\mathbf{tr}[\cdot]\) for the matrix trace, ⊗ for the Kronecker product, and \(\mathbf{I}_{N}\) for the \(N \times N\) identity matrix.

2 System model

We consider a MIMO-OFDM system with \(Q\) transmit and \(P\) receive antennas. For analytical tractability, we work directly in the frequency domain to search for the properties of an optimal training signal. The number of OFDM subcarriers is \(N\), and we consider a block of \(M\) OFDM symbols transmitted across the channel. We use the notation \(x_{n,q,m}\) to denote the training signal transmitted on the \(q\)th transmit antenna in the \(m\)th symbol and on the \(n\)th subcarrier. For the \(n\)th subcarrier, we may, therefore, consider that we transmit a matrix of training signals, \(\mathbf{X}_{n}\), where \(x_{n,q,m}\) is the element in the \(q\)th row and \(m\)th column. The following notations are used in this paper:

  • \(\mathbf{X}_{n}\): Training signal matrix for the \(n\)th subcarrier (\(Q \times M\)),

  • \(\mathbf{Y}_{n}\): Received signal matrix for the \(n\)th subcarrier (\(P \times M\)),

  • \(\mathbf{H}_{n}\): MIMO channel matrix for the \(n\)th subcarrier (\(P \times Q\)),

  • \(\mathbf{W}_{n}\): Received noise matrix for the \(n\)th subcarrier (\(P \times M\)).

Considering frequency-selective fading, a signal model is given as

$$ \mathbf{Y}_{n} = \mathbf{H}_{n} \mathbf{X}_{n} + \mathbf{W}_{n}, $$
((1))

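where all elements of \(\mathbf{H}_{n}\) are uncorrelated, and all noise variables are independent with variance \(\sigma_{\mathbf{w}}^{2}\), i.e., \(\mathbb {E} \left [ \mathbf{vec} \left( \mathbf {W}_{n} \right) \mathbf{vec} \left( \mathbf {W}_{n} \right)^{*} \right ] = \sigma_{\mathbf{w}}^{2} \mathbf {I}\). Aggregating the matrices for all subcarriers gives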

$$\begin{array}{*{20}l} {} \mathbf{x} &= \left[ \begin{array}{c} \mathbf{vec} \left(\mathbf{X}_{1}\right) \\ \mathbf{vec} \left(\mathbf{X}_{2}\right) \\ \vdots \\ \mathbf{vec} \left(\mathbf{X}_{N}\right) \end{array} \right] = \left[ \begin{array}{c} \mathbf{x}_{1} \\ \mathbf{x}_{2} \\ \vdots \\ \mathbf{x}_{N} \end{array} \right], \ \mathbf{y} = \left[ \begin{array}{c} \mathbf{vec} \left(\mathbf{Y}_{1}\right) \\ \mathbf{vec} \left(\mathbf{Y}_{2}\right) \\ \vdots \\ \mathbf{vec} \left(\mathbf{Y}_{N}\right) \end{array} \right] = \left[ \begin{array}{c} \mathbf{y}_{1} \\ \mathbf{y}_{2} \\ \vdots \\ \mathbf{y}_{N} \end{array} \right], \\ {}\mathbf{h} &= \left[ \begin{array}{c} \mathbf{vec} \left(\mathbf{H}_{1}\right) \\ \mathbf{vec} \left(\mathbf{H}_{2}\right) \\ \vdots \\ \mathbf{vec} \left(\mathbf{H}_{N}\right) \end{array} \right] = \left[ \begin{array}{c} \mathbf{h}_{1} \\ \mathbf{h}_{2} \\ \vdots \\ \mathbf{h}_{N} \end{array} \right], \ \mathbf{w} = \left[ \begin{array}{c} \mathbf{vec} \left(\mathbf{W}_{1}\right) \\ \mathbf{vec} \left(\mathbf{W}_{2}\right) \\ \vdots \\ \mathbf{vec} \left(\mathbf{W}_{N}\right) \end{array} \right] = \left[ \begin{array}{c} \mathbf{w}_{1} \\ \mathbf{w}_{2} \\ \vdots \\ \mathbf{w}_{N} \end{array} \right], \end{array} $$
((2))

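where \(\mathbf{x}\), \(\mathbf{y}\), \(\mathbf{h}\), and \(\mathbf{w}\) are \(MNQ \times 1\), \(MNP \times 1\), \(NPQ \times 1\), and \(MNP \times 1\) vectors, respectively. By using a well-known result regarding the \(\mathbf{vec}\) operation, i.e., \(\mathbf{vec}(\mathbf{A} \mathbf{X} \mathbf{B}) = (\mathbf{B}^{T} \otimes \mathbf{A}) \mathbf{vec}(\mathbf{X})\), (1) becomes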

$$\begin{array}{*{20}l} \mathbf{vec} \left(\mathbf{Y}_{n} \right) & = \mathbf{vec} \left(\mathbf{H}_{n} \mathbf{X}_{n} \right) + \mathbf{vec} \left(\mathbf{W}_{n} \right) \\ & = \mathbf{vec} \left(\mathbf{I}_{P} \mathbf{H}_{n} \mathbf{X}_{n} \right) + \mathbf{vec} \left(\mathbf{W}_{n} \right) \\ & = \left(\mathbf{X}_{n}^{T} \otimes \mathbf{I}_{P} \right) \mathbf{vec} \left(\mathbf{H}_{n} \right) + \mathbf{vec} \left(\mathbf{W}_{n} \right), \end{array} $$

or can be rewritten as

$$\begin{array}{*{20}l} \mathbf{y}_{n} = \left(\mathbf{X}_{n}^{T} \otimes \mathbf{I}_{P} \right) \mathbf{h}_{n} + \mathbf{w}_{n}. \end{array} $$
((3))
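As a quick numerical sanity check of the vectorization step, the following sketch compares (1) and (3) on random data. It assumes NumPy, a column-stacking \(\mathbf{vec}\) operation, and small illustrative dimensions; the helper names `crandn` and `vec` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q, M = 2, 3, 4  # receive antennas, transmit antennas, OFDM symbols (illustrative)

def crandn(*shape):
    """Complex Gaussian samples (hypothetical helper for this sketch)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def vec(A):
    """Stack the columns of A into one vector."""
    return A.reshape(-1, order="F")

H_n = crandn(P, Q)  # channel matrix on subcarrier n
X_n = crandn(Q, M)  # training matrix on subcarrier n
W_n = crandn(P, M)  # noise matrix on subcarrier n

Y_n = H_n @ X_n + W_n                                  # matrix model, Eq. (1)
y_n = np.kron(X_n.T, np.eye(P)) @ vec(H_n) + vec(W_n)  # vectorized model, Eq. (3)

print(np.allclose(vec(Y_n), y_n))
```

Note that the plain transpose `X_n.T` is used, not the conjugate transpose, in agreement with (3).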

Combining all the N equations in (3) yields

$$\begin{array}{*{20}l} {}\mathbf{y} = \left[ \begin{array}{cccc} \left(\mathbf{X}_{1}^{T} \otimes \mathbf{I}_{P} \right) & & & \\ & \left(\mathbf{X}_{2}^{T} \otimes \mathbf{I}_{P} \right) & & \\ & & \ddots & \\ & & & \left(\mathbf{X}_{N}^{T} \otimes \mathbf{I}_{P} \right) \\\ \end{array} \right] \left[ \begin{array}{c} \mathbf{h}_{1} \\ \mathbf{h}_{2} \\ \vdots \\ \mathbf{h}_{N} \end{array} \right] \!+ \mathbf{w}, \end{array} $$

or shortly denoted as

$$\begin{array}{*{20}l} \mathbf{y} = \mathbf{X} \mathbf{h}+ \mathbf{w}, \end{array} $$
((4))

where \(\mathbf{X}\) (\(MNP \times NPQ\)) is defined as

$$\begin{array}{*{20}l} \mathbf{X} = \left[ \begin{array}{cccc} \left(\mathbf{X}_{1}^{T} \otimes \mathbf{I}_{P} \right) & & & \\ & \left(\mathbf{X}_{2}^{T} \otimes \mathbf{I}_{P} \right) & & \\ & & \ddots & \\ & & & \left(\mathbf{X}_{N}^{T} \otimes \mathbf{I}_{P} \right) \\ \end{array} \right]. \end{array} $$
((5))
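The stacked model (4) with the block-diagonal matrix in (5) can be checked numerically in the same way. The sketch below assumes NumPy and small illustrative dimensions; `block_diag`, `crandn`, and `vec` are hypothetical helpers written for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, Q, M = 3, 2, 2, 2  # subcarriers, RX antennas, TX antennas, symbols (illustrative)

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def vec(A):
    return A.reshape(-1, order="F")  # column stacking

def block_diag(blocks):
    """Place the given 2-D blocks on the diagonal of a zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

Hs = [crandn(P, Q) for _ in range(N)]
Xs = [crandn(Q, M) for _ in range(N)]
Ws = [crandn(P, M) for _ in range(N)]

X = block_diag([np.kron(Xn.T, np.eye(P)) for Xn in Xs])  # Eq. (5)
h = np.concatenate([vec(Hn) for Hn in Hs])               # stacked channel, Eq. (2)
w = np.concatenate([vec(Wn) for Wn in Ws])               # stacked noise, Eq. (2)
y = np.concatenate([vec(Hn @ Xn + Wn) for Hn, Xn, Wn in zip(Hs, Xs, Ws)])

print(X.shape, np.allclose(y, X @ h + w))                # Eq. (4)
```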

3 Analysis on optimality conditions

In this section, the problem formulation for minimizing the LMMSE channel estimation errors in MIMO-OFDM systems is presented first. Then, optimal conditions for the training signals are derived for an optimal training signal design.

3.1 Problem formulation

The standard LMMSE estimate of h, based upon the observation of y, becomes

$$\begin{array}{*{20}l} \hat{\mathbf{h}} = \mathbb{E} \left[ \mathbf{h} \mathbf{y}^{*} \right] \left(\mathbb{E} \left[ \mathbf{y} \mathbf{y}^{*} \right] \right)^{-1} \mathbf{y} \end{array} $$
((6))

By using (4), this becomes

$$\begin{array}{*{20}l} \hat{\mathbf{h}} & = \mathbb{E} \left[ \mathbf{h} \mathbf{h}^{*} \right] \mathbf{X}^{*} \left(\mathbf{X} \mathbb{E} \left[ \mathbf{h} \mathbf{h}^{*} \right] \mathbf{X}^{*} + \sigma_{\mathbf{w}}^{2} \mathbf{I} \right)^{-1} \mathbf{y} \\ & = \mathbf{R}_{\mathbf{hh}} \mathbf{X}^{*} \left(\mathbf{X} \mathbf{R}_{\mathbf{hh}} \mathbf{X}^{*} + \sigma_{\mathbf{w}}^{2} \mathbf{I} \right)^{-1} \mathbf{y}. \end{array} $$
((7))

The covariance matrix of the error can be expressed as

$$\begin{array}{*{20}l} {}\mathbb{E} \left[ \left(\mathbf{h} - \hat{\mathbf{h}} \right) \left(\mathbf{h} - \hat{\mathbf{h}} \right)^{*} \right] & = \mathbb{E} \left[ \left(\mathbf{h} - \hat{\mathbf{h}} \right) \mathbf{h}^{*} \right] \\ & = \mathbf{R}_{\mathbf{hh}} \!- \mathbf{R}_{\mathbf{hh}} \mathbf{X}^{*} \left(\mathbf{X} \mathbf{R}_{\mathbf{hh}} \mathbf{X}^{*} + \sigma_{\mathbf{w}}^{2} \mathbf{I} \right)^{-1} \mathbf{X} \mathbf{R}_{\mathbf{hh}}. \end{array} $$
((8))

Using the well-known matrix identity,

$$\begin{array}{*{20}l}{} \left(\mathbf{A} + \mathbf{B} \mathbf{C} \mathbf{D} \right)^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1} \mathbf{B} \left(\mathbf{C}^{-1} + \mathbf{D} \mathbf{A}^{-1} \mathbf{B} \right)^{-1} \mathbf{D} \mathbf{A}^{-1}, \end{array} $$

(8) can be rewritten as

$$\begin{array}{*{20}l} \mathbb{E} \left[ \left(\mathbf{h} - \hat{\mathbf{h}} \right) \left(\mathbf{h} - \hat{\mathbf{h}} \right)^{*} \right] & = \left(\mathbf{R}_{\mathbf{hh}}^{-1} + \frac{\mathbf{X}^{*} \mathbf{X}}{\sigma_{\mathbf{w}}^{2}} \right)^{-1} \\ & = \sigma_{\mathbf{w}}^{2} \left(\sigma_{\mathbf{w}}^{2} \mathbf{R}_{\mathbf{hh}}^{-1} + \mathbf{X}^{*} \mathbf{X} \right)^{-1}. \end{array} $$
((9))

The MMSE becomes [20,21]

$$\begin{array}{*{20}l} \text{MMSE} = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \mathbf{R}_{\mathbf{hh}}^{-1} + \mathbf{X}^{*} \mathbf{X} \right)^{-1} \right]. \end{array} $$
((10))
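The equivalence of the two error covariance expressions (8) and (9) can be illustrated numerically. The following sketch assumes NumPy, a generic random matrix \(\mathbf{X}\) (not yet restricted to the structure in (5)), and a randomly generated Hermitian positive definite \(\mathbf{R}_{\mathbf{hh}}\); the dimensions and the noise variance are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(2)
n_h, n_y = 4, 6   # lengths of h and y (illustrative)
sigma2 = 0.5      # noise variance sigma_w^2 (illustrative)

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

X = crandn(n_y, n_h)
B = crandn(n_h, n_h)
R_hh = B @ B.conj().T + np.eye(n_h)  # Hermitian positive definite stand-in for R_hh

# LMMSE filter taken from Eq. (7): h_hat = G @ y
G = R_hh @ X.conj().T @ np.linalg.inv(X @ R_hh @ X.conj().T + sigma2 * np.eye(n_y))

cov_8 = R_hh - G @ X @ R_hh                                                    # Eq. (8)
cov_9 = sigma2 * np.linalg.inv(sigma2 * np.linalg.inv(R_hh) + X.conj().T @ X)  # Eq. (9)
mmse = np.trace(cov_9).real                                                    # Eq. (10)

print(np.allclose(cov_8, cov_9), mmse > 0)
```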

Our goal is to find the matrix X in the form of (5) that minimizes (10), subject to the total transmit energy constraint.

Based on the assumptions of the channel, it can be observed that

$$\begin{array}{*{20}l} \mathbf{R}_{\mathbf{hh}} = \mathbf{R} \otimes \mathbf{I}_{Q} \otimes \mathbf{I}_{P}, \end{array} $$
((11))

where \(\mathbf{R}\) (\(N \times N\)) is a Toeplitz Hermitian matrix of the form

$$\begin{array}{*{20}l} \mathbf{R} = \left[ \begin{array}{cccc} r_{0} & r_{1} & \cdots & r_{N-1} \\ r_{N-1} & r_{0} & \cdots & r_{N-2} \\ \vdots & & \ddots & \vdots \\ r_{1} & r_{2} & \cdots & r_{0} \end{array} \right]. \end{array} $$
((12))

We also have

$$\begin{array}{*{20}l} \mathbf{X} & = \left[ \begin{array}{cccc} \left(\mathbf{X}_{1}^{T} \otimes \mathbf{I}_{P} \right) & & & \\ & \left(\mathbf{X}_{2}^{T} \otimes \mathbf{I}_{P} \right) & & \\ & & \ddots & \\ & & & \left(\mathbf{X}_{N}^{T} \otimes \mathbf{I}_{P} \right) \\ \end{array} \right] \\ & = \left[ \begin{array}{cccc} \mathbf{X}_{1}^{T} & & & \\ & \mathbf{X}_{2}^{T} & & \\ & & \ddots & \\ & & & \mathbf{X}_{N}^{T} \\ \end{array} \right] \otimes \mathbf{I}_{P} \\ & = \mathbf{Z} \otimes \mathbf{I}_{P}, \end{array} $$
((13))

where \(\mathbf{Z}\) (\(MN \times NQ\)) is defined as

$$\begin{array}{*{20}l} \mathbf{Z} = \left[ \begin{array}{cccc} \mathbf{X}_{1}^{T} & & & \\ & \mathbf{X}_{2}^{T} & & \\ & & \ddots & \\ & & & \mathbf{X}_{N}^{T} \\ \end{array} \right]. \end{array} $$
((14))

Furthermore,

$$\begin{array}{*{20}l} \mathbf{X}^{*} \mathbf{X} & = \left(\mathbf{Z} \otimes \mathbf{I}_{P} \right)^{*} \left(\mathbf{Z} \otimes \mathbf{I}_{P} \right) \\ & = \left(\mathbf{Z}^{*} \otimes \mathbf{I}_{P} \right) \left(\mathbf{Z} \otimes \mathbf{I}_{P} \right) \\ & = \mathbf{Z}^{*} \mathbf{Z} \otimes \mathbf{I}_{P}. \end{array} $$
((15))

Using some basic properties of the ⊗ operation and the expressions in (11)–(15), the MMSE in (10) can be rewritten as

$$\begin{array}{*{20}l} {}\text{MMSE} & = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \mathbf{R}_{\mathbf{hh}}^{-1} + \mathbf{X}^{*} \mathbf{X} \right)^{-1} \right] \\ & = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{R} \otimes \mathbf{I}_{Q} \otimes \mathbf{I}_{P} \right)^{-1} + \mathbf{Z}^{*} \mathbf{Z} \otimes \mathbf{I}_{P} \right)^{-1} \right] \\ & = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{R}^{-1} \otimes \mathbf{I}_{Q} \otimes \mathbf{I}_{P} \right) + \mathbf{Z}^{*} \mathbf{Z} \otimes \mathbf{I}_{P} \right)^{-1} \right] \\ & = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{R}^{-1} \otimes \mathbf{I}_{Q} \right) + \mathbf{Z}^{*} \mathbf{Z} \right) \otimes \mathbf{I}_{P} \right)^{-1} \right] \\ & = \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{R}^{-1} \otimes \mathbf{I}_{Q} \right) + \mathbf{Z}^{*} \mathbf{Z} \right)^{-1} \otimes \mathbf{I}_{P} \right] \\ & = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{R} \otimes \mathbf{I}_{Q} \right)^{-1} + \mathbf{Z}^{*} \mathbf{Z} \right)^{-1} \right]. \end{array} $$
((16))
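The reduction from (10) to (16) can be verified numerically with the sketch below. It assumes NumPy and small illustrative dimensions, and it uses a random Hermitian positive definite matrix as a stand-in for \(\mathbf{R}\), because the Kronecker manipulations do not depend on the particular structure in (12). The helpers `crandn` and `block_diag` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(3)
N, P, Q, M, sigma2 = 3, 2, 2, 2, 0.5  # illustrative sizes and noise variance

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def block_diag(blocks):
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

B = crandn(N, N)
R = B @ B.conj().T + np.eye(N)                       # stand-in for R in (12)
Z = block_diag([crandn(Q, M).T for _ in range(N)])   # blocks X_n^T, Eq. (14)
X = np.kron(Z, np.eye(P))                            # Eq. (13)
R_hh = np.kron(np.kron(R, np.eye(Q)), np.eye(P))     # Eq. (11)

# Full-size expression, Eq. (10)
mmse_10 = sigma2 * np.trace(
    np.linalg.inv(sigma2 * np.linalg.inv(R_hh) + X.conj().T @ X)).real
# Reduced expression, last line of Eq. (16)
mmse_16 = P * sigma2 * np.trace(
    np.linalg.inv(sigma2 * np.linalg.inv(np.kron(R, np.eye(Q))) + Z.conj().T @ Z)).real

print(np.isclose(mmse_10, mmse_16))
```

The reduced form inverts an \(NQ \times NQ\) matrix instead of an \(NPQ \times NPQ\) one, which is why the factor \(P\) appears in front of the trace.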

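In addition, it is known that, for any square matrices \(\mathbf{A}\) and \(\mathbf{B}\), there exists a permutation matrix \(\Pi\) such that \(\mathbf{A} \otimes \mathbf{B} = \Pi \left(\mathbf{B} \otimes \mathbf{A} \right) \Pi^{T}\) [22]. Let \(\Pi\) be a permutation matrix such that \(\mathbf{R} \otimes \mathbf{I}_{Q} = \Pi \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right) \Pi^{T}\). Using this relation in (16) yields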

$$\begin{array}{*{20}l} {}\text{MMSE} & = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \Pi \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} \Pi^{T} + \mathbf{Z}^{*} \mathbf{Z} \right)^{-1} \right] \\ & = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \Pi \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} \Pi^{T} + \Pi \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi \Pi^{T} \right)^{-1} \right] \\ & = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\Pi \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} + \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi \right) \Pi^{T} \right)^{-1} \right]. \end{array} $$
((17))

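Note that \(\left(\Pi \mathbf{M} \Pi^{T} \right)^{-1} = \Pi \mathbf{M}^{-1} \Pi^{T}\) for any invertible matrix \(\mathbf{M}\) because \(\Pi^{-1} = \Pi^{T}\), and that applying the same permutation \(\Pi\) to the rows and to the columns does not change the trace of a matrix because the diagonal elements may be reordered but remain on the diagonal. Thus, we have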

$$\begin{array}{*{20}l} \text{MMSE} = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} + \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi \right)^{-1} \right]. \end{array} $$
((18))
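The permutation argument can be made concrete with a short sketch. The index mapping used to build \(\Pi\) below is one explicit construction consistent with NumPy's `kron` ordering and is an assumption of this example; the function name `commutation_permutation` is hypothetical, and a random Hermitian positive definite matrix stands in for \(\mathbf{Z}^{*} \mathbf{Z}\).

```python
import numpy as np

rng = np.random.default_rng(4)
N, Q, sigma2 = 4, 3, 0.5  # illustrative sizes and noise variance

def hermitian_pd(n):
    B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return B @ B.conj().T + np.eye(n)

def commutation_permutation(N, Q):
    """Permutation Pi with kron(R, I_Q) = Pi @ kron(I_Q, R) @ Pi.T for an N x N matrix R."""
    Pi = np.zeros((N * Q, N * Q))
    for i in range(N):
        for a in range(Q):
            Pi[i * Q + a, a * N + i] = 1.0
    return Pi

R = hermitian_pd(N)
ZhZ = hermitian_pd(N * Q)  # stand-in for Z^* Z
Pi = commutation_permutation(N, Q)
I_Q = np.eye(Q)

swap_ok = np.allclose(np.kron(R, I_Q), Pi @ np.kron(I_Q, R) @ Pi.T)
# Trace in the last line of (16) versus the trace in (18)
tr_16 = np.trace(np.linalg.inv(sigma2 * np.linalg.inv(np.kron(R, I_Q)) + ZhZ)).real
tr_18 = np.trace(np.linalg.inv(sigma2 * np.linalg.inv(np.kron(I_Q, R)) + Pi.T @ ZhZ @ Pi)).real

print(swap_ok, np.isclose(tr_16, tr_18))
```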

Finally, we formulate the constrained optimization problem as

$$\begin{array}{@{}rcl@{}} \text{Minimize} & & P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} + \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi \right)^{-1} \right] \\ \text{Subject to} & & \mathbf{tr} \left[ \mathbf{Z}^{*} \mathbf{Z} \right] = E_{\text{total}}, \end{array} $$
((19))

where \(E_{\text{total}}\) is the total transmit energy.

3.2 Optimality conditions

For simplicity of notation, we define \(\mathbf{D}\) (\(NQ \times NQ\)) and \(\mathbf{A}\) (\(NQ \times NQ\)) as

$$\begin{array}{*{20}l} \mathbf{D} \equiv \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi, \end{array} $$
((20))

and

$$\begin{array}{*{20}l} \mathbf{A} \equiv \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} + \mathbf{D} \right), \end{array} $$
((21))

where \(\mathbf{Z}^{*} \mathbf{Z}\) is Hermitian and block diagonal. We solve the constrained optimization problem using the Lagrange multiplier method. By letting \(\mu\) be the Lagrange multiplier, the Lagrangian is expressed as

$$\begin{array}{*{20}l} J \left(\mathbf{D}, \mu \right) = P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \mathbf{A}^{-1} \right] + \mu \left(\mathbf{tr} \left[ \mathbf{D} \right] - E_{\text{total}} \right). \end{array} $$
((22))

To obtain the optimal solution, we set the derivatives of the Lagrangian function \(J \left(\mathbf{D}, \mu \right)\) to zero as

$$\begin{array}{*{20}l} \frac{\partial}{\partial x} J \left(\mathbf{D}, \mu \right) = 0, \end{array} $$
((23))

where x is an arbitrary parameter in D. We split our approaches into two cases: diagonal parameters and off-diagonal parameters in matrix D. To find the required derivatives, we will use the following property. For any matrix A depending on a parameter x [22],

$$\begin{array}{*{20}l} \frac{\partial}{\partial x} \mathbf{A}^{-1} = - \mathbf{A}^{-1} \left(\frac{\partial}{\partial x} \mathbf{A} \right) \mathbf{A}^{-1}, \end{array} $$
((24))

which directly follows from differentiating both sides of \(\mathbf{A} \mathbf{A}^{-1} = \mathbf{I}\) with respect to \(x\).

We first consider the diagonal parameters x in D and derive the following lemma.

Lemma 1.

By denoting the \(k\)th column of \(\mathbf{A}^{-1}\) as \(\mathbf{a}_{k}\), the following condition should be satisfied in order to achieve \(\frac {\partial }{\partial x} J \left (\mathbf {D}, \mu \right) = 0\).

$$\begin{array}{*{20}l} \textbf{C1:} \quad \left\| \mathbf{a}_{k} \right\|^{2} = c, \quad \text{for all } k, \end{array} $$
((25))

where c is a constant.

Proof.

Let x be the kth diagonal parameter in D. We define the derivative of A with respect to x as

$$\begin{array}{*{20}l} \frac{\partial \mathbf{A}}{\partial x} = \mathbf{E} \equiv \text{diag} \left\{ 0, \ldots, 0,1,0, \ldots,0 \right\}, \end{array} $$
((26))

where \(\mathbf{E}\) is a diagonal matrix in which only the \(k\)th diagonal element has a value of one. Applying (24) to (22), we have

$$\begin{array}{*{20}l} \frac{\partial}{\partial x} J \left(\mathbf{D}, \mu \right) & = - P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \mathbf{A}^{-1} \mathbf{E} \mathbf{A}^{-1} \right] + \mu \\ & = - P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \mathbf{A}^{-1} \mathbf{E} \left(\mathbf{A}^{-1} \right)^{*} \right] +\mu \\ & = - P \sigma_{\mathbf{w}}^{2} \left\| \mathbf{a}_{k} \right\|^{2} + \mu \end{array} $$
((27))

where we use the fact that \(\mathbf{A}^{-1}\) is a Hermitian matrix in the second line. Applying \(\frac {\partial }{\partial x} J \left (\mathbf {D}, \mu \right) = 0\) finally gives

$$\begin{array}{*{20}l} \left\| \mathbf{a}_{k} \right\|^{2} = \frac{\mu}{P \sigma_{\mathbf{w}}^{2}}, \end{array} $$
((28))

which holds for all diagonal elements in D without loss of generality. This completes the proof.
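The derivative used in the proof of Lemma 1 can be checked with a central finite difference. The sketch below assumes NumPy and uses a random Hermitian positive definite matrix as a stand-in for \(\mathbf{A}\) in (21); the index `k` and step `eps` are arbitrary illustrative choices. It only checks the trace term of (27), i.e., \(\frac{\partial}{\partial x} \mathbf{tr}[\mathbf{A}^{-1}] = -\|\mathbf{a}_{k}\|^{2}\).

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, eps = 5, 2, 1e-6  # matrix size, perturbed diagonal index, step (illustrative)

B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B @ B.conj().T + np.eye(n)  # Hermitian positive definite stand-in for A

def trace_inv(M):
    return np.trace(np.linalg.inv(M)).real

E = np.zeros((n, n))
E[k, k] = 1.0                   # dA/dx for the k-th diagonal parameter, Eq. (26)

numeric = (trace_inv(A + eps * E) - trace_inv(A - eps * E)) / (2 * eps)
a_k = np.linalg.inv(A)[:, k]    # k-th column of A^{-1}
analytic = -np.linalg.norm(a_k) ** 2

print(np.isclose(numeric, analytic, rtol=1e-5))
```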

We then consider the off-diagonal parameters \(\chi\) in \(\mathbf{D}\) and derive the following lemma. Note that off-diagonal parameters are complex numbers. We split each parameter into two real parameters as \(\chi = x + iy\).

Lemma 2.

By denoting the \(k\)th column of \(\mathbf{A}^{-1}\) as \(\mathbf{a}_{k}\), the following condition should be satisfied in order to achieve \(\frac {\partial }{\partial x} J \left (\mathbf {D}, \mu \right) = 0\).

$$\begin{array}{*{20}l} \textbf{C2:} \quad \mathbf{a}_{k}^{*} \mathbf{a}_{l} = 0, \quad \text{for all } k,l \ \left(k \ne l \right). \end{array} $$
((29))

Proof.

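Let \(\chi\) appear in the \(k\)th column and the \(l\)th row of \(\mathbf{D}\). We first focus on the real part \(x\). We define the derivative of \(\mathbf{A}\) with respect to \(x\) as \(\frac {\partial \mathbf {A}}{\partial x} = \mathbf {E}\), where \(\mathbf{E}\) is a sparse matrix in which only the element in the \(k\)th column and \(l\)th row and the element in the \(l\)th column and \(k\)th row have a value of one (because \(\mathbf{D}\) is Hermitian, \(x\) appears in both positions). Applying (24) to (22), and noting that \(\mathbf{tr} \left[ \mathbf{D} \right]\) does not depend on off-diagonal parameters, we have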

$$\begin{array}{*{20}l} \frac{\partial}{\partial x} J \left(\mathbf{D}, \mu \right) & = - P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \mathbf{A}^{-1} \mathbf{E} \mathbf{A}^{-1} \right] \\ & = - P \sigma_{\mathbf{w}}^{2} \mathbf{tr} \left[ \mathbf{A}^{-1} \mathbf{E} \left(\mathbf{A}^{-1} \right)^{*} \right] \\ & = - P \sigma_{\mathbf{w}}^{2} \left(\mathbf{a}_{k}^{*} \mathbf{a}_{l} + \mathbf{a}_{l}^{*} \mathbf{a}_{k} \right) \end{array} $$
((30))

where we use the fact that \(\mathbf{A}^{-1}\) is a Hermitian matrix in the second line. Applying \(\frac {\partial }{\partial x} J \left (\mathbf {D}, \mu \right) = 0\) finally gives

$$\begin{array}{*{20}l} \text{Re} \left[ \mathbf{a}_{k}^{*} \mathbf{a}_{l} \right]= 0, \end{array} $$
((31))

where Re[·] denotes the real part of a complex number. Then, similar derivations with respect to the imaginary part y yield

$$\begin{array}{*{20}l} \text{Im} \left[ \mathbf{a}_{k}^{*} \mathbf{a}_{l} \right]= 0, \end{array} $$
((32))

where Im[·] denotes the imaginary part of a complex number. Combining both results provides

$$\begin{array}{*{20}l} \mathbf{a}_{k}^{*} \mathbf{a}_{l}= 0, \end{array} $$
((33))

which holds for all off-diagonal elements in D without loss of generality. This completes the proof.
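The off-diagonal derivatives in the proof of Lemma 2 can be checked in the same finite-difference manner. The sketch assumes NumPy, a random Hermitian positive definite stand-in for \(\mathbf{A}\), and arbitrary indices `k` and `l`. The imaginary-part perturbation `E_y` is our own reading of the "similar derivations" mentioned in the proof: \(y\) enters the \((l,k)\) element with \(+i\) and the mirrored \((k,l)\) element with \(-i\).

```python
import numpy as np

rng = np.random.default_rng(6)
n, k, l, eps = 5, 1, 3, 1e-6  # size, column index, row index, step (illustrative)

B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = B @ B.conj().T + np.eye(n)  # Hermitian positive definite stand-in for A

def trace_inv(M):
    return np.trace(np.linalg.inv(M)).real

def central_diff(E):
    return (trace_inv(A + eps * E) - trace_inv(A - eps * E)) / (2 * eps)

E_x = np.zeros((n, n), dtype=complex)
E_x[l, k] = E_x[k, l] = 1.0     # real part x enters both mirrored positions
E_y = np.zeros((n, n), dtype=complex)
E_y[l, k], E_y[k, l] = 1j, -1j  # imaginary part y enters with opposite signs

A_inv = np.linalg.inv(A)
inner = np.vdot(A_inv[:, k], A_inv[:, l])  # a_k^* a_l (vdot conjugates its first argument)

# Eq. (30): d/dx tr[A^{-1}] = -(a_k^* a_l + a_l^* a_k) = -2 Re[a_k^* a_l]
ok_x = np.isclose(central_diff(E_x), -2 * inner.real, rtol=1e-5)
# Analogous result for y under the stated assumption: 2 Im[a_k^* a_l]
ok_y = np.isclose(central_diff(E_y), 2 * inner.imag, rtol=1e-5)
print(ok_x, ok_y)
```

Both derivatives vanish exactly when the real and imaginary parts of \(\mathbf{a}_{k}^{*} \mathbf{a}_{l}\) vanish, which matches (31) and (32).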

4 Optimal training signal design

In this section, we provide an optimal training signal satisfying the optimality conditions derived in the previous section.

Suppose that the matrix \(\mathbf{D}\) is a multiple of the identity, i.e., \(\mathbf{D} = c \mathbf{I}\), where \(c\) is a real constant. Because \(\mathbf{R}\) is a Toeplitz Hermitian matrix, it can be written as

$$\begin{array}{*{20}l} \mathbf{R} = \mathbf{F} \mathbf{\Lambda} \mathbf{F}^{-1}, \end{array} $$
((34))

where F and Λ are a unitary matrix and a diagonal matrix, respectively. In addition, we may write

$$\begin{array}{*{20}l} \mathbf{I}_{Q} \otimes \mathbf{R} = \left[ \begin{array}{ccc} \mathbf{F} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F} \end{array} \right] \left[ \begin{array}{ccc} \mathbf{\Lambda} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{\Lambda} \end{array} \right] \left[ \begin{array}{ccc} \mathbf{F}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F}^{-1} \end{array} \right], \end{array} $$
((35))
$$\begin{array}{*{20}l} {}\left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} \,=\, \left[ \begin{array}{ccc} \mathbf{F} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F} \end{array} \right] \left[\! \begin{array}{ccc} \mathbf{\Lambda}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{\Lambda}^{-1} \end{array} \!\right] \left[ \begin{array}{ccc} \mathbf{F}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F}^{-1} \end{array} \right]\!. \end{array} $$
((36))

For shorter notations,

$$\begin{array}{*{20}l} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} = \bar{\mathbf{F}} \bar{\mathbf{\Lambda}}^{-1} \bar{\mathbf{F}}^{-1}, \end{array} $$
((37))

where \(\bar {\mathbf {F}}\) and \(\bar {\mathbf {\Lambda }}^{-1}\) are respectively defined as

$$\begin{array}{*{20}l} \bar{\mathbf{F}} = \left[ \begin{array}{ccc} \mathbf{F} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F} \end{array} \right], \ \bar{\mathbf{\Lambda}}^{-1} = \left[ \begin{array}{ccc} \mathbf{\Lambda}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{\Lambda}^{-1} \end{array} \right] \end{array} $$
((38))

Substituting (37) into (21) gives

$${} {\fontsize{7.6pt}{9.6pt}\selectfont{\begin{aligned} \mathbf{A}^{-1} & \,=\, \left(\sigma_{\mathbf{w}}^{2} \left(\mathbf{I}_{Q} \otimes \mathbf{R} \right)^{-1} + \mathbf{D} \right)^{-1} \\ & \,=\, \left(\sigma_{\mathbf{w}}^{2} \bar{\mathbf{F}} \bar{\mathbf{\Lambda}}^{-1} \bar{\mathbf{F}}^{-1} + c \mathbf{I} \right)^{-1} \\ & \,=\, \left(\sigma_{\mathbf{w}}^{2} \bar{\mathbf{F}} \bar{\mathbf{\Lambda}}^{-1} \bar{\mathbf{F}}^{-1} + c \bar{\mathbf{F}} \bar{\mathbf{F}}^{-1} \right)^{-1} \\ & \,=\, \bar{\mathbf{F}} \left(\sigma_{\mathbf{w}}^{2} \bar{\mathbf{\Lambda}}^{-1} + c \mathbf{I} \right)^{-1} \bar{\mathbf{F}}^{-1} \\ & \,=\, \left[\! \begin{array}{ccc} \mathbf{F} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F} \end{array}\! \right]\! \left[\!\! \begin{array}{ccc} \left(\sigma_{\mathbf{w}}^{2} \!\mathbf{\Lambda}^{\!-1}\! +\! c \mathbf{I} \right)^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \left(\! \sigma_{\mathbf{w}}^{2} \mathbf{\Lambda}^{\!-1} \!+ \!c \mathbf{I} \right)^{-1} \end{array}\! \!\right]\! \left[ \!\begin{array}{ccc} \mathbf{F}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F}^{-1} \end{array}\! \right] \\ & \,=\, \left[ \begin{array}{ccc} \mathbf{F} \left(\sigma_{\mathbf{w}}^{2} \mathbf{\Lambda}^{-1} + c \mathbf{I} \right)^{-1} \mathbf{F}^{-1} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{F} \left(\sigma_{\mathbf{w}}^{2} \mathbf{\Lambda}^{-1} + c \mathbf{I} \right)^{-1} \mathbf{F}^{-1} \end{array} \right] \\ & \,=\, \left[\! \begin{array}{ccc} \mathbf{C} & \cdots & \mathbf{0} \\ \vdots & \ddots & \vdots \\ \mathbf{0} & \cdots & \mathbf{C} \end{array} \right], \end{aligned}}} $$
((39))

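where \(\mathbf{C}\) is a Toeplitz Hermitian matrix. Thus, it is clear that \(\mathbf{D} = c \mathbf{I}\) satisfies both the optimality conditions C1 in (25) and C2 in (29). Solving \(\mathbf{D} = \Pi^{T} \mathbf{Z}^{*} \mathbf{Z} \Pi = c \mathbf{I}\) for \(\mathbf{Z}^{*} \mathbf{Z}\) yields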

$$\begin{array}{*{20}l} \mathbf{Z}^{*} \mathbf{Z} = \Pi \left(c \mathbf{I} \right) \Pi^{T} = c \mathbf{I}. \end{array} $$
((40))
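The block structure in (39) can be illustrated numerically. The sketch below assumes NumPy and builds \(\mathbf{R}\) as a circulant Hermitian matrix, consistent with the form displayed in (12), where each row is a cyclic shift of the previous one; in that case \(\mathbf{F}\) can be taken as the unitary DFT matrix. The eigenvalues `lam`, the constant `c`, and the noise variance are arbitrary illustrative values. The sketch checks the block form of \(\mathbf{A}^{-1}\) and the equal column norms of condition C1.

```python
import numpy as np

N, Q, sigma2, c = 8, 2, 0.5, 1.5        # illustrative values

F = np.fft.fft(np.eye(N)) / np.sqrt(N)  # unitary DFT matrix (assumed choice of F)
lam = 1.0 + np.arange(N)                # positive eigenvalues (illustrative)
R = F @ np.diag(lam) @ F.conj().T       # circulant Hermitian R = F Lambda F^{-1}, Eq. (34)

# Eq. (21) with D = c I
A = sigma2 * np.linalg.inv(np.kron(np.eye(Q), R)) + c * np.eye(N * Q)
A_inv = np.linalg.inv(A)

# Diagonal block C of Eq. (39)
C = F @ np.diag(1.0 / (sigma2 / lam + c)) @ F.conj().T
col_norms = np.linalg.norm(A_inv, axis=0)

print(np.allclose(A_inv, np.kron(np.eye(Q), C)),  # block-diagonal form of (39)
      np.allclose(col_norms, col_norms[0]))       # equal column norms, condition C1
```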

From (14) and (40), we finally reach the following theorem on the optimal training signal.

Theorem (Optimal training signal).

In MIMO-OFDM systems where a training signal is transmitted through \(Q\) transmit antennas, \(N\) subcarriers, and \(M\) OFDM symbols with a total transmit energy of \(E_{\text{total}}\), and LMMSE channel estimation is performed at the receiver upon the reception of the \(M\) OFDM symbols from \(P\) receive antennas, the training signal is ‘optimal’ in terms of minimizing the channel estimation error if it satisfies the following conditions:

  • The energy of the training signal in each subcarrier is equal, i.e.,

    $$\begin{array}{*{20}l} \mathbf{tr} \left[ \mathbf{X}_{n} \mathbf{X}_{n}^{*} \right] = \frac{E_{\text{total}}}{N}, \quad \text{for} \quad n = 1,2, \ldots, N. \end{array} $$
    ((41))
  • On each subcarrier, the training signals transmitted from the different antennas are orthogonal and of equal energy, i.e.,

    $$\begin{array}{*{20}l} \mathbf{X}_{n} \mathbf{X}_{n}^{*} = \frac{E_{\text{total}}}{NQ} \mathbf{I}, \quad \text{for} \quad n = 1,2, \ldots, N. \end{array} $$
    ((42))

5 Examples of optimal training signals

In this section, we use the theorem presented in the previous section as a design guideline and present two examples of optimal training signal implementations. These designs are practical owing to their simple structures. Note that optimal designs of training signals are not limited to the following cases.

5.1 Sequential transmission on antennas

Assume that the number of OFDM symbols transmitted is equal to the number of transmit antennas, M=Q, and let

$$\begin{array}{*{20}l} \mathbf{X}_{n} = \left[ \begin{array}{cccc} z_{1}^{(n)} & 0 & \cdots & 0 \\ 0 & z_{2}^{(n)} & & \\ \vdots & & \ddots & \vdots \\ 0 & & \cdots & z_{Q}^{(n)} \\ \end{array} \right], \quad \text{for} \quad n=1,2, \ldots, N, \end{array} $$
((43))

where \(z_{i}^{(n)}\) are arbitrary complex numbers satisfying \(\left | z_{i}^{(n)} \right |^{2} = \frac {E_{\text {total}}}{NQ}\). This implementation implies that each successive OFDM symbol is transmitted on a different antenna in a round-robin fashion. Figure 1 illustrates an example of this type of optimal training signal implementation.

Figure 1

An example of optimal training signals with sequential transmission on antennas (Q=4,M=4,N=8).
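A minimal sketch of the sequential design in (43) is given below, using the dimensions of Figure 1. It assumes NumPy; the function name `sequential_training`, the random phases of \(z_{i}^{(n)}\), and the value of \(E_{\text{total}}\) are illustrative choices, since (43) only fixes the magnitudes.

```python
import numpy as np

def sequential_training(N, Q, E_total, rng):
    """Return [X_1, ..., X_N] with diagonal X_n as in (43); assumes M = Q."""
    amp = np.sqrt(E_total / (N * Q))
    phases = rng.uniform(0, 2 * np.pi, size=(N, Q))  # arbitrary phases (illustrative)
    return [np.diag(amp * np.exp(1j * phases[n])) for n in range(N)]

N, Q, E_total = 8, 4, 32.0
X = sequential_training(N, Q, E_total, np.random.default_rng(0))

target = E_total / (N * Q) * np.eye(Q)
cond_42 = all(np.allclose(Xn @ Xn.conj().T, target) for Xn in X)                  # Eq. (42)
cond_41 = all(np.isclose(np.trace(Xn @ Xn.conj().T).real, E_total / N) for Xn in X)  # Eq. (41)
total = sum(np.linalg.norm(Xn) ** 2 for Xn in X)                                  # total energy

print(cond_41, cond_42, np.isclose(total, E_total))
```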

5.2 Interlaced transmission on antennas

Assume that the number of OFDM symbols transmitted is equal to the number of transmit antennas, M=Q, and N is a multiple of Q. In this implementation, we transmit simultaneously from every antenna in each symbol interval, but each antenna uses only every Qth subcarrier. Let matrix Φ represent a cyclic shift operation, which causes a cyclic shift by one element in the upward direction. For example, if Q=3, we have

$$\begin{array}{*{20}l} \mathbf{\Phi} = \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \\ \end{array} \right], \end{array} $$
((44))

and it operates as

$$\begin{array}{*{20}l} \mathbf{\Phi} \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right] = \left[ \begin{array}{c} x_{2} \\ x_{3} \\ x_{1} \end{array} \right]. \end{array} $$
((45))

We define the training signal matrix by

$$\begin{array}{*{20}l} \mathbf{X}_{n} = \mathbf{\Phi}^{(n-1)} \mathbf{\Psi}, \quad \text{for} \quad n=1,2, \ldots,N, \end{array} $$
((46))

where \(\mathbf{\Phi}^{(n-1)}\) is the \((n-1)\)th power of \(\mathbf{\Phi}\),

$$\begin{array}{*{20}l} \mathbf{\Psi} = \left[ \begin{array}{cccc} z_{1} & 0 & \cdots & 0 \\ 0 & z_{2} & & \\ \vdots & & \ddots & \vdots \\ 0 & & \cdots & z_{Q} \\ \end{array} \right], \end{array} $$
((47))

and \(z_{i}\) are arbitrary complex numbers satisfying \(\left | z_{i} \right |^{2} = \frac {E_{\text {total}}}{NQ}\). This implementation implies that each antenna transmits in every OFDM symbol but uses only every \(Q\)th subcarrier. During the first symbol, for example, Antenna 1 uses subcarriers \(1, Q+1, 2Q+1, \ldots\), Antenna \(Q\) uses subcarriers \(2, Q+2, 2Q+2, \ldots\), etc. During the second symbol, Antenna 1 uses subcarriers \(2, Q+2, 2Q+2, \ldots\), Antenna \(Q\) uses subcarriers \(3, Q+3, 2Q+3, \ldots\), and so on. Figure 2 illustrates an example of this type of optimal training signal implementation.

Figure 2

An example of optimal training signals with interlaced transmission on antennas (Q=4,M=4,N=8).

6 Simulation results

In this section, we verify the optimality of the training signals through computer simulations. We consider a MIMO-OFDM system in which the number of TX antennas is Q=4, the number of RX antennas is P=4, the number of OFDM subcarriers is N=128, and the length of the cyclic prefix is 32. The number of OFDM symbols in a training signal is set to M=4. A wide-sense stationary uncorrelated scattering (WSSUS) model is used for the multipath channel [23]. The multipath intensity profile is exponential, with L=128 delay taps and exponentially decaying factor α. The Doppler frequency is assumed to be zero. The optimal training signal shown in Figure 1 is used in the simulations. Five non-optimal training signals are also generated for performance comparison. Unlike the optimal training signal, which satisfies the conditions in (41) and (42), the non-optimal training signals are generated at random, but they all satisfy the total transmit power constraint \(E_{\text{total}}\).
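The ingredients of this setup can be sketched as below. This is an illustration under stated assumptions, not the simulation code behind the figures: the function names are hypothetical, the exponential profile is assumed to have tap powers proportional to exp(−αl/L) (the exact parameterization is not given in this section), the taps are assumed to be independent zero-mean complex Gaussian and static (zero Doppler), and a random training signal is simply scaled so that its total energy equals \(E_{\text{total}}\).

```python
import numpy as np

def exponential_profile(L, alpha):
    """Normalized exponentially decaying multipath intensity profile."""
    # ASSUMPTION: tap power proportional to exp(-alpha * l / L), l = 0..L-1.
    p = np.exp(-alpha * np.arange(L) / L)
    return p / p.sum()

def wssus_taps(L, alpha, P, Q, rng):
    """Independent zero-mean complex Gaussian taps, shape (L, P, Q)."""
    # Uncorrelated scattering: taps at different delays are independent.
    p = exponential_profile(L, alpha)
    g = rng.standard_normal((L, P, Q)) + 1j * rng.standard_normal((L, P, Q))
    return np.sqrt(p / 2)[:, None, None] * g

def random_training(N, Q, M, E_total, rng):
    """Random (non-optimal) training signal scaled to total energy E_total."""
    X = rng.standard_normal((N, Q, M)) + 1j * rng.standard_normal((N, Q, M))
    return X * np.sqrt(E_total / np.sum(np.abs(X) ** 2))

rng = np.random.default_rng(0)
P, Q, M, N, L, alpha = 4, 4, 4, 128, 128, 10.0

h = wssus_taps(L, alpha, P, Q, rng)           # time-domain channel taps
H = np.fft.fft(h, n=N, axis=0)                # per-subcarrier channel matrices
X_rand = random_training(N, Q, M, 1.0, rng)   # one non-optimal training signal

print(H.shape, X_rand.shape, np.sum(np.abs(X_rand) ** 2))
```

A larger α concentrates the profile on the first few taps, which corresponds to a less frequency-selective channel.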

Figure 3 compares the LMMSE channel estimation performance of the different training signals. As the SNR increases, the channel estimation error decreases. The optimal training signal shows a considerable performance gap over the non-optimal training signals; for example, the SNR gain of the optimal training signal is more than 5 dB in most cases. Figure 4 shows how the LMMSE channel estimation performance varies with the multipath intensity profile. With a small exponentially decaying factor, a large number of multipath components are dominant, resulting in high frequency selectivity and relatively poor LMMSE channel estimation performance. As the exponentially decaying factor increases, the number of dominant multipath components decreases, resulting in low frequency selectivity and relatively good LMMSE channel estimation performance. Again, the optimal training signal provides a large performance gap over the non-optimal training signals in all cases. Figure 5 shows the LMMSE channel estimation performance with respect to the antenna dimension. As the number of antennas increases, the MMSE of the optimal training signal does not increase, whereas those of the non-optimal training signals increase continuously. Accordingly, the performance gap between the optimal and non-optimal training signals also increases. This is because the optimal training signal is designed such that the training signals transmitted from different antennas are orthogonal. With a non-optimal training signal, the training signals transmitted simultaneously from different antennas collide and interfere with each other, which degrades the LMMSE channel estimation performance.
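The role of orthogonality can be illustrated with a deliberately simplified model. The sketch below is not the simulation used for Figures 3–5: it assumes a single subcarrier, one receive antenna, and a spatially white channel vector h ~ CN(0, I), for which the LMMSE error is the trace of (I + X Xᴴ/σ²)⁻¹ in the standard linear Gaussian model (see, e.g., [20]). The function names and the SNR normalization are hypothetical choices made for the example.

```python
import numpy as np

def lmmse_mse(X, noise_var):
    """LMMSE estimation error for h ~ CN(0, I_Q) with training X of shape (Q, M)."""
    Q = X.shape[0]
    gram = X @ X.conj().T
    err_cov = np.linalg.inv(np.eye(Q) + gram / noise_var)
    return float(np.trace(err_cov).real)

def orthogonal_training(Q, E):
    """Orthogonal, equal-energy training with total energy E (M = Q symbols)."""
    return np.sqrt(E / Q) * np.eye(Q, dtype=complex)

def random_training_block(Q, M, E, rng):
    """Random training block scaled to the same total energy E."""
    X = rng.standard_normal((Q, M)) + 1j * rng.standard_normal((Q, M))
    return X * np.sqrt(E / np.sum(np.abs(X) ** 2))

rng = np.random.default_rng(1)
Q, E = 4, 4.0
for snr_db in (0, 10, 20, 30):
    # ASSUMPTION: SNR is defined as per-antenna training energy over noise variance.
    noise_var = (E / Q) * 10 ** (-snr_db / 10)
    mse_opt = lmmse_mse(orthogonal_training(Q, E), noise_var)
    mse_rnd = lmmse_mse(random_training_block(Q, Q, E, rng), noise_var)
    print(f"SNR {snr_db:2d} dB: orthogonal {mse_opt:.4f}, random {mse_rnd:.4f}")
```

For a fixed total energy, the error depends only on the eigenvalues of X Xᴴ and is smallest when they are all equal, so in this toy model the random block can never do better than the orthogonal one.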

Figure 3. MMSE versus SNR, where P=4, Q=4, M=4, N=128, L=128, and α=10.

Figure 4. MMSE versus exponentially decaying factor, where P=4, Q=4, M=4, N=128, L=128, and SNR=30 dB.

Figure 5. MMSE versus number of antennas, where P=Q=M, N=128, L=128, α=10, and SNR=30 dB.

7 Conclusions

In this paper, we analytically derive optimality conditions and provide design guidelines for optimal training signals for LMMSE channel estimation in MIMO-OFDM systems. The analysis shows that a training signal satisfying the following conditions is optimal: (i) the energy of the training signal on each subcarrier is equal, and (ii) on each subcarrier, the training signals transmitted from the different antennas are orthogonal and of equal energy. Interestingly, the optimality conditions of training signals for the LMMSE estimator in MIMO-OFDM systems are in line with the design principles known from single-carrier MIMO systems: training signals across transmit antennas should be orthogonal, and training signals should be equi-powered. We mathematically prove that the simple generalization of these design principles to an additional dimension, i.e., multiple carriers, remains optimal. Future research may extend the results to more practical channel statistics, e.g., correlated channels and time-varying channels.

References

  1. GJ Foschini, Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas. Bell Labs Technical J. 1(2), 41–59 (1996).

  2. GJ Foschini, MJ Gans, On limits of wireless communications in a fading environment when using multi-element antennas. Wireless Personal Commun. 6(3), 311–335 (1998).

  3. IE Telatar, Capacity of multi-antenna Gaussian channels. Eur. Trans. Telecommunications. 10(6), 585–595 (1999).

  4. SB Weinstein, PM Ebert, Data transmission by frequency division multiplexing using the discrete Fourier transform. IEEE Trans. Commun. 19(5), 628–634 (1971).

  5. I Sohn, JY Ahn, Joint processing of ZF detection and MAP decoding for MIMO-OFDM system. ETRI J. 26(5), 384–390 (2004).

  6. R Negi, J Cioffi, Pilot tone selection for channel estimation in a mobile OFDM system. IEEE Trans. Consumer Electron. 44(3), 1122–1128 (1998).

  7. I Barhumi, G Leus, M Moonen, Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 51(6), 1615–1624 (2003).

  8. H Minn, N Al-Dhahir, Optimal training signals for MIMO OFDM channel estimation. IEEE Trans. Wireless Commun. 5(5), 1158–1168 (2006).

  9. Y-S Choi, PJ Voltz, FA Cassara, On channel estimation and detection for multicarrier signals in fast and selective Rayleigh fading channels. IEEE Trans. Commun. 49(8), 1375–1387 (2001).

  10. S Ohno, GB Giannakis, Capacity maximizing MMSE-optimal pilots for wireless OFDM over frequency-selective block Rayleigh-fading channels. IEEE Trans. Inf. Theory. 50(9), 2138–2145 (2004).

  11. X Ma, L Yang, GB Giannakis, Optimal training for MIMO frequency-selective fading channels. IEEE Trans. Wireless Commun. 4(2), 453–466 (2005).

  12. W-C Huang, C-P Li, H-J Li, Optimal pilot sequence design for channel estimation in CDD-OFDM systems. IEEE Trans. Wireless Commun. 11(11), 4006–4016 (2012).

  13. H Minn, D Munoz, in Proc. IEEE Wireless Communications and Networking Conference (WCNC). Pilot designs for channel estimation of OFDM systems with frequency-dependent I/Q imbalances (Budapest, 2009), pp. 1–6.

  14. H Minn, D Munoz, Pilot designs for channel estimation of MIMO OFDM systems with frequency-dependent I/Q imbalances. IEEE Trans. Commun. 58(8), 2252–2264 (2010).

  15. N Shariati, J Wang, M Bengtsson, in Proc. IEEE Asilomar Conference on Signals, Systems, and Computers (ACSSC). On robust training sequence design for correlated MIMO channel estimation (Pacific Grove, 2012), pp. 504–507.

  16. N Shariati, J Wang, M Bengtsson, in Proc. IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP). A robust MISO training sequence design (Vancouver, 2013), pp. 4564–4568.

  17. N Shariati, J Wang, M Bengtsson, Robust training sequence design for correlated MIMO channel estimation. IEEE Trans. Signal Process. 62(1), 107–120 (2014).

  18. C-C Cheng, Y-C Chen, YT Su, in Proc. IEEE International Conference on Communications (ICC). Modelling and estimation of correlated MIMO-OFDM fading channels (Kyoto, 2011), pp. 1–5.

  19. C-T Chiang, CC Fung, Robust training sequence design for spatially correlated MIMO channel estimation. IEEE Trans. Vehicular Technol. 60(7), 2882–2894 (2011).

  20. SM Kay, Fundamentals of Statistical Signal Processing: Estimation Theory (Prentice-Hall, Englewood Cliffs, NJ, USA, 1993).

  21. M Biguesh, AB Gershman, Downlink channel estimation in cellular systems with antenna arrays at base stations using channel probing with feedback. EURASIP J. Appl. Signal Process. 9, 1330–1339 (2004).

  22. H Lutkepohl, Handbook of matrices (John Wiley & Sons, NY, USA, 1996).

  23. PA Bello, Characterization of randomly time-variant linear channels. IEEE Trans. Commun. Syst. 11, 360–393 (1963).

Author information

Corresponding author

Correspondence to Illsoo Sohn.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Jo, J., Sohn, I. On the optimality of training signals for MMSE channel estimation in MIMO-OFDM systems. J Wireless Com Network 2015, 105 (2015). https://doi.org/10.1186/s13638-015-0345-y

  • DOI: https://doi.org/10.1186/s13638-015-0345-y

Keywords