# Joint source and relay precoding for generally correlated MIMO with full and partial CSIT

Nguyen A. Vinh^{1}, Nguyen N. Tran^{1} and Nguyen H. Phuong^{1}

**2017**:43

https://doi.org/10.1186/s13638-017-0826-2

© The Author(s) 2017

**Received: **2 February 2016

**Accepted: **16 February 2017

**Published: **1 March 2017

## Abstract

In this paper, we jointly design linear source and relay precoders for two-hop MIMO relaying. The involved channels encounter spatially correlated fading, the source data symbols are mutually correlated, and the noises are colored not only at the destination but also at the relay. Two different scenarios of channel state information (CSI) are assumed to be available at the transmitters: full CSI of both hops (the full CSIT) and full CSI of the source-relay hop only plus the covariance information of the relay-destination hop (the partial CSIT). First, with the full CSIT, we derive optimal precoders by maximizing the instantaneous mutual information (MI). Secondly, with the partial CSIT knowledge we derive suboptimal precoders by maximizing the average MI. For both the CSIT cases, we propose an iterative algorithm to perform power allocation iteratively and alternatively between the source antennas and the relay antennas. Its simplified version in which the power allocation is performed separately between the source antennas and the relay antennas is also developed. Simulation results show that our proposed precoding schemes with the full CSIT provide significantly higher capacity than the existing schemes. Besides, the proposed schemes with the partial CSIT also perform well especially when the channels are spatially correlated at the transmit sides and at medium-to-high signal-to-noise ratios (SNRs), while they require much lower computational complexity and less feedback overhead.

## 1 Introduction

Relaying, in which signal transmission from the source to the destination is aided by one or multiple intermediate relays, has received much attention due to its ability to enhance transmission reliability and extend coverage for wireless communication systems [1–6]. Relaying strategies are categorized by how the relays process the signals received from the source, and typically include amplify-and-forward (AF) and decode-and-forward (DF). The AF scheme is also known as non-regenerative relaying, and the DF scheme as regenerative relaying [5, 6]. In regenerative relaying, a relay decodes the received signal and forwards a re-encoded version to the destination. In non-regenerative relaying, a relay simply amplifies the received signal and forwards it to the destination. In general, non-regenerative relaying introduces a shorter delay and is less complex than regenerative relaying. Besides, multiple-input multiple-output (MIMO) techniques are well known to provide spatial diversity and multiplexing gains to wireless links [7]. Accordingly, there are many studies on non-regenerative MIMO relay systems, e.g., [1–6].

When the channel state information (CSI) is available at the transmit nodes (CSIT), precoding can be applied to non-regenerative MIMO relay systems to further improve system performance. In such a precoding scheme, the covariance matrices of the transmitted signals, or equivalently the source and relay precoders, are designed to optimize performance metrics such as the mutual information (MI) between the source and the destination or the mean-squared error (MSE) of the detected symbols. Relay precoder designs maximizing the capacity of two-hop relay systems were derived in [1, 2] with the instantaneous CSI of both links known at the relay. These designs were extended to joint source and relay precoding schemes in [3, 4], where the source also has the instantaneous CSI of both links. In [4], the optimal structure of the source and relay precoders that decouples the compound relaying channel into independent sub-channels was found, and an iterative algorithm to properly allocate power over these sub-channels was developed. The result of [4] was successfully extended to the multicarrier case in [8].

To provide the transmit nodes with the instantaneous CSI of all links required by the precoder designs of [1–6, 8], the relay and the destination need to feed back the instantaneous CSI of the source-relay and relay-destination links to the source and the relay through feedback channels. This requires a large amount of signaling overhead. Furthermore, it is infeasible to obtain exact CSI of the relay-destination link at the transmitters when the destination moves rapidly. Besides, in practical communication systems, the rate of the feedback channels is commonly limited. Therefore, the assumption that only partial information, such as the mean and covariance of the relay-destination channel, is available at the transmitters might be more reasonable. Such partial CSI is considered in [5, 6, 9–18]. With full CSI of the source-relay link and covariance information of the relay-destination link at the relay, relay precoders are designed to maximize the MI [5, 6] and to minimize the MSE [9, 10] of two-hop relay systems whose relay-destination channel is spatially correlated at the transmit side. To improve the system performance, joint source and relay precoders are proposed in [11] for the case where the source also has full CSI of the source-relay link and partial information of the relay-destination link. In [12], an asymptotic MI is derived for large multi-hop relay systems with a large number of antennas. The equi-powered source and relay precoder structures that maximize this asymptotic MI are also obtained with covariance information. Robust joint designs of linear relay precoders and destination equalizers for MIMO relay systems in the presence of noisy or outdated CSI (i.e., imperfect CSI) can be found in [13–18].

In practice, communication systems often encounter interference. Such interference turns the white noise at the receiver, which is dominated by thermal noise, into colored noise [19–25] and degrades the communication performance. Co-channel interference (CCI), which comes from nearby interferers using the same frequency as the receiver, is a common interference type. In cellular mobile networks, the CCI comes from frequency reuse; for example, a receiver at the edge of a cell may encounter undesired signals from transmitters in neighbouring cells using the same frequency band. Another example is a receiver in the macro network that lies in the coverage range of a femtocell and may therefore be affected by the CCI from undesired transmitters in this femtocell [26]. The aforementioned works on precoder design [1–6, 8–18] assumed that the receiver noise is *white* and the source signals are *independent*. The case of colored noise has been taken into account in the training signal designs for MIMO point-to-point systems in [21, 22, 27] and MIMO relay systems in [23–25]. Instead of colored noise, the relay precoder that maximizes the average capacity of a two-hop relay system where the destination lies close to some interferers was designed based on covariance information of the interferers-destination channels and the relay-destination channel in [26]. The papers [28–31] considered general MIMO relay systems having spatially correlated channels, colored noises, and mutually correlated source signals. Note that mutually correlated source signals arise from encoding operations on the bit stream, including channel coding, modulation, and space-time coding at a transmitter [7, 32]. In [28, 29], the optimal structure of the relay precoder that maximizes the MI of generally correlated two-hop MIMO relay systems was obtained by using full CSI of the two links and the covariance matrices of the correlated source signals and the colored noises known at the relay. The papers [30, 31] are devoted to the case of generally correlated multi-hop MIMO relay systems. With the full CSI of all hops and the signal and noise covariance matrices at the transmitters (the full CSIT), the source and relay precoders were designed asymptotically by either maximizing the individual MI of each hop or minimizing the individual soft MSE of the estimated signals of each hop.

In this paper, we investigate generally correlated two-hop MIMO relay systems with mutually *correlated* source symbols, spatially *correlated* channels and *colored* noises. For the system capacity maximization, we propose joint designs of source and relay precoders in two cases of the full CSIT (like [28–31]) and the partial CSIT. The partial CSIT denotes full CSI of the source-relay link and covariance information of the relay-destination link and the source signal and noise covariance matrices known to the transmitters. First, with the full CSIT, the optimal structure of the source and relay precoders is derived by maximizing the instantaneous MI between the source and the destination. By the obtained source and relay precoders and the destination equalizer, the compound relaying channel is shown to be decomposed into parallel sub-channels. We design an iterative algorithm to perform power allocation iteratively and alternatively between the source antennas and the relay antennas. To reduce the computational complexity of the iterative algorithm, we develop its simplified version in which power allocation is carried out separately between the source antennas and the relay antennas. Next, with the partial CSIT, the optimal structure of the source and relay precoders is also derived by maximizing an upper bound of the average MI between the source and the destination. Again, an iterative algorithm and a simplified algorithm for source and relay power allocation are developed as well.

1. This paper extends the relay precoding with the full CSIT in [28, 29] to the joint source and relay precoding with the full and partial CSIT.
2. This paper develops the simplified precoding strategy based on the full CSIT in [31] into iterative and simplified strategies based on the full CSIT as well as the partial CSIT.
3. This paper is a generalization of [4] from the joint design of source and relay precoding with the *full* CSIT for the system case of *white* i.i.d. channels, *white* noises, *white* source symbols to those with the full and partial CSIT for the system case of spatially *correlated* channels, *colored* noises, *correlated* source symbols.
4. The proposed joint precoding schemes in this paper include the relay precoding scheme with the partial CSIT for the system case of transmit-sided spatially correlated channel, white noises, independent source symbols in [5, 6] as a special case.
5. The proposed joint precoding schemes in this paper provide higher capacity than the existing schemes in [4] and [28, 29, 31], as shown by numerical simulations.

The rest of the paper is organized as follows. The system model and the precoding design problem formulation are introduced in Section 2. The derivations of the joint designs of source and relay precoders with the full CSIT and with the partial CSIT are presented in Section 3 and Section 4, respectively. The performance of the proposed joint precoder designs is demonstrated by numerical simulations in Section 5. Finally, some conclusions are drawn in Section 6.

*Notation:* A boldface upper-case letter denotes a matrix, and a boldface lower-case letter a vector. An *N*×*N* identity matrix is denoted by **I**_{N}; we sometimes omit the index *N* when the size of the identity matrix is clear. We use (.)^{H}, (.)^{−1}, |.|, and tr(.) for the conjugate transpose, the pseudo-inverse, the determinant, and the trace of a matrix, respectively. For a matrix **A**, the operator vec(**A**) vectorizes **A** by stacking the columns of **A** into a column vector. The notations **A**≽0 and **B**≻0 imply that the matrices **A** and **B** are positive semi-definite and positive definite, respectively. For a scalar *z*, [*z*]^{+} is a short form of max(*z*,0). \( \mathbf {H} \sim \mathcal {CN}(\mathbf {Z},\mathbf {\Theta } \otimes \mathbf {\Omega })\) denotes a matrix-variate complex Gaussian distribution with mean E(**H**)=**Z** and covariance E(vec(**H**−**Z**)^{T}vec(**H**−**Z**)^{TH})=**Θ**⊗**Ω** [33].

## 2 System model and design problem formulation

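We consider a two-hop MIMO relay system in which the source, the relay, and the destination are equipped with *M*, *K*, and *N* antennas, respectively. The system is assumed to operate in half-duplex mode. Each signal transmission from the source to the destination takes two time slots to complete.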

In the first time slot, the source node multiplies the signal vector \( \mathbf {x} \in \mathbb {C}^{M\times 1} \) by a source precoding matrix \( \mathbf {B} \in \mathbb {C}^{M \times M}.\) Here, the signal **x** contains mutually correlated data symbols with covariance matrix \( \mathrm {E}(\mathbf {x} \mathbf {x}^{H}) = \mathbf {R}_{x} = \mathbf {\Psi }_{x} \), which is known to the three terminals since **x** arises from encoding operations on the baseband signals [7]. The matrix \( \mathbf {\Psi }_{x} \succeq 0 \) denotes a correlation matrix with unit elements on its diagonal. The source precoding matrix **B** satisfies the power constraint \( \text {tr}(\mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H}) \leq p_{1} \), where *p*_{1} is the maximum allowed transmit power at the source. The resulting signal is then transmitted to the relay node through the source-relay channel \( \mathbf {H}_{1} \in \mathbb {C}^{M \times K} \). The received signal at the relay \( \mathbf {y}_{1} \in \mathbb {C}^{K \times 1} \) is given by \( \mathbf {y}_{1} = \mathbf {H}_{1} \mathbf {B} \mathbf {x} + \mathbf {n}_{1} \), where \( \mathbf {n}_{1} \in \mathbb {C}^{K \times 1} \) is the colored Gaussian noise vector at the relay with zero mean and covariance matrix \(\mathrm {E}\left (\mathbf {n}_{1} \mathbf {n}_{1}^{H}\right) =\mathbf {R}_{n_{1}} = \sigma _{1}^{2} \mathbf {\Psi }_{n_{1}} \), and \(\mathbf {\Psi }_{n_{1}} \succ 0 \) is a correlation matrix with \(\text {tr}(\mathbf {\Psi }_{n_{1}}) = K\).

In the second time slot, **y**_{1} is multiplied by a relay precoding matrix \( \mathbf {F} \in \mathbb {C}^{K \times K} \) satisfying the power constraint \( \text {tr}\Big (\mathbf {F} \left (\mathbf {H}_{1} \mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H} \mathbf {H}_{1}^{H} + \mathbf {R}_{n_{1}}\right) \mathbf {F}^{H}\Big) \leq p_{2}\), where *p*_{2} is the maximum allowed transmit power at the relay. After that, the resulting signal \( \mathbf {F} \mathbf {y}_{1} \) is forwarded to the destination through the relay-destination channel \({\mathbf {H}_{2} \in \mathbb {C}^{K \times N} }\). Therefore, the received signal at the destination \( \mathbf {y} \in \mathbb {C}^{N \times 1} \) is

\[ \mathbf {y} = \mathbf {H}_{2} \mathbf {F} \mathbf {y}_{1} + \mathbf {n}_{2} = \mathbf {H}_{2} \mathbf {F} \mathbf {H}_{1} \mathbf {B} \mathbf {x} + \mathbf {H}_{2} \mathbf {F} \mathbf {n}_{1} + \mathbf {n}_{2}, \]

where \( \mathbf {n}_{2} \in \mathbb {C}^{N \times 1} \) is the colored Gaussian noise vector at the destination with zero mean and covariance matrix \(\mathrm {E}(\mathbf {n}_{2} \mathbf {n}_{2}^{H}) =\mathbf {R}_{n_{2}} = \sigma _{2}^{2} \mathbf {\Psi }_{n_{2}} \), and \( \mathbf {\Psi }_{n_{2}} \succ 0 \) is a correlation matrix with \( \text {tr}(\mathbf {\Psi }_{n_{2}}) = N. \) The channel matrices **H**_{1}, **H**_{2} are generated based on the Kronecker model [34] as \( \mathbf {H}_{i}=\mathbf {\Omega }_{i}^{1/2} \mathbf {H}_{w,i}\mathbf {\Theta }_{i}^{1/2},i = 1,2, \) where the elements of **H**_{w,i} are i.i.d. zero-mean, unit-variance circularly symmetric complex Gaussian random variables, and **Ω**_{i} and **Θ**_{i} are the positive definite transmit and receive covariance matrices of **H**_{i}, so that \( \mathbf {H}_{i} \sim \mathcal {CN}(\mathbf {0},\mathbf {\Theta }_{i} \otimes \mathbf {\Omega }_{i}).\)
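The Kronecker model can be illustrated numerically. The following is a minimal sketch, not the authors' simulation code: the helper names `psd_sqrt`, `exp_corr`, and `kronecker_channel`, the exponential correlation profile, and the matrix sizes are all assumptions made for illustration.

```python
import numpy as np

def psd_sqrt(a):
    """Hermitian square root of a positive semi-definite matrix."""
    w, u = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (u * np.sqrt(w)) @ u.conj().T

def exp_corr(n, rho):
    """Hypothetical exponential correlation matrix with entries rho^|i-j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def kronecker_channel(omega, theta, rng):
    """Draw H = omega^{1/2} H_w theta^{1/2} with i.i.d. CN(0, 1) entries in H_w."""
    rows, cols = omega.shape[0], theta.shape[0]
    hw = (rng.standard_normal((rows, cols))
          + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)
    return psd_sqrt(omega) @ hw @ psd_sqrt(theta)

rng = np.random.default_rng(0)
omega = exp_corr(4, 0.7)               # assumed covariance on the left factor
theta = exp_corr(3, 0.3)               # assumed covariance on the right factor
H = kronecker_channel(omega, theta, rng)
```

Larger values of `rho` correspond to stronger spatial correlation; setting `rho = 0` recovers an i.i.d. channel.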

*L* independent substreams of source symbols are active in each transmission. Like [21–25, 27–31], we assume that the three terminals know \(\mathbf {R}_{n_{1}} \) and \( \mathbf {R}_{n_{2}}.\) An interesting example of how these noise covariance matrices can be obtained is given in [26], which addresses the design of the relay precoder in a three-node relay system in the presence of some interferers near the destination by using the covariance information of the interferers-destination channels. Specifically, the received signal at the destination is \(\mathbf {y} = \mathbf {H}_{2} \mathbf {F} \mathbf {H}_{1} \mathbf {x} + \mathbf {H}_{2} \mathbf {F} \mathbf {n}_{w,1} + \sum _{j = 1}^{J}{\mathbf {H}_{I_{j}} \mathbf {x}_{I_{j}}} + \mathbf {n}_{w,2},\) where **n**_{w,1} and **n**_{w,2} are the relay and destination white Gaussian noise vectors with covariance matrices \( \mathbf {R}_{n_{w,1}}= \sigma _{1}^{2} \mathbf {I}_{K},\mathbf {R}_{n_{w,2}}= \sigma _{2}^{2} \mathbf {I}_{N}. \) It is assumed that the destination knows the covariance matrices \(\mathbf {\Theta }_{I_{j}}\) of the interferers-destination channels \( \mathbf {H}_{I_{j}}= \mathbf {H}_{w,j}\mathbf {\Theta }_{I_{j}}^{1/2}\) through training signals cooperatively shared by the interferers *I*_{j}, and that the relay then also knows \( \mathbf {\Theta }_{I_{j}}\) by feedback from the destination. It is valid to equivalently view \( \mathbf {n}_{2} \triangleq \sum _{j = 1}^{J}{\mathbf {H}_{I_{j}} \mathbf {x}_{I_{j}}} + \mathbf {n}_{w,2} \) as the colored noise vector at the destination. The destination can easily compute the colored noise covariance matrix \(\mathbf {R}_{n_{2}} = \mathrm {E}\left (\mathbf {n}_{2} \mathbf {n}_{2}^{H}\right)\) from the covariance matrices \( \mathbf {\Theta }_{I_{j}} \) and \(\mathbf {R}_{n_{w,2}}.\) The relay obtains \( \mathbf {R}_{n_{2}} \) by feedback from the destination, and the source obtains it by feedback from the relay. In a similar situation where the relay also lies near some interferers such that \(\mathbf {y} = \mathbf {H}_{2} \mathbf {F} \mathbf {H}_{1} \mathbf {x} + \mathbf {H}_{2} \mathbf {F} \left (\sum _{j = 1}^{J}{\mathbf {H}_{I_{j}}^{'} \mathbf {x}_{I_{j}}^{'}} + \mathbf {n}_{w,1}\right) + \sum _{j = 1}^{J}{\mathbf {H}_{I_{j}} \mathbf {x}_{I_{j}}} + \mathbf {n}_{w,2},\) the relay can also obtain the covariance matrix \(\mathbf {R}_{n_{1}} = \mathrm {E}\left (\mathbf {n}_{1} \mathbf {n}_{1}^{H}\right)\) of the relay colored noise \( \mathbf {n}_{1} \triangleq \sum _{j = 1}^{J}{\mathbf {H}_{I_{j}}^{'} \mathbf {x}_{I_{j}}^{'}} + \mathbf {n}_{w,1} \) from the covariance matrices of the interferers-relay channels and \( \mathbf {R}_{n_{w,1}}.\) Then, \(\mathbf {R}_{n_{1}} \) is fed back to the source and fed forward to the destination by the relay. In other words, the three nodes all know \(\mathbf {R}_{n_{1}} \) besides \(\mathbf {R}_{n_{2}}.\) Besides \( \mathbf {R}_{x}, \mathbf {R}_{n_{1}}\), and \( \mathbf {R}_{n_{2}}\), the destination is also assumed to have \( \mathbf {H}_{2} \mathbf {F} \mathbf {H}_{1} \mathbf {B} \) through a channel estimation method (e.g., [35–37]). The instantaneous MI between the source and the destination [31] is given by

where a factor 1/2 is due to the transmission duration of two time slots.
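For intuition, the MI of the two-hop link can be evaluated numerically. The sketch below assumes Gaussian signaling and uses the standard log-determinant form implied by the signal model (signal covariance seen through \( \mathbf{H}_{2}\mathbf{F}\mathbf{H}_{1}\mathbf{B} \), effective noise \( \mathbf{H}_{2}\mathbf{F}\mathbf{n}_{1} + \mathbf{n}_{2} \)); it is not necessarily written in the same form as the expression used in the paper. The function name `mutual_information` is hypothetical, and the arrays are assumed to be shaped so that the matrix products are conformable.

```python
import numpy as np

def mutual_information(H1, H2, B, F, Rx, Rn1, Rn2):
    """0.5 * log2 det(I + S N^{-1}) for y = H2 F H1 B x + H2 F n1 + n2.

    Assumes Gaussian signaling; S is the received signal covariance and
    N the effective (colored) noise covariance at the destination.
    """
    G = H2 @ F                                     # relay precoder followed by second hop
    E = G @ H1 @ B                                 # end-to-end channel seen by x
    S = E @ Rx @ E.conj().T                        # signal covariance at the destination
    N = G @ Rn1 @ G.conj().T + Rn2                 # forwarded relay noise plus destination noise
    n = N.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + S @ np.linalg.inv(N))
    return 0.5 * logdet / np.log(2.0)              # factor 1/2: two time slots

# Toy check: identity channels, precoders and covariances on a 2x2x2 system
I2 = np.eye(2)
mi_identity = mutual_information(I2, I2, I2, I2, I2, I2, I2)
```

In the toy case the signal covariance is **I** and the effective noise covariance is 2**I**, so each of the two streams contributes \( \frac{1}{2}\log_{2}(1.5) \).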

In the following sections, we address the problem of jointly designing the source precoder **B** and the relay precoder **F** for the three-node relay system in the absence of the direct link in Fig. 1. In practice, this case commonly occurs when the source is so far from the destination that they cannot exchange information directly, so the direct link is negligible. This case was also considered in [1–6, 8–11]. It is noteworthy that user cooperation among all involved nodes is well known to yield valuable spatial diversity, and thus to enhance capacity and mitigate detection errors. However, these benefits materialize only when the distance between the source and the destination is short enough for them to communicate with each other through the direct link. An efficient way to tackle serious loss on the direct link is the additional use of multiple cascaded relays to form a multi-hop relay system. Enhancing the capacity of such systems with precoding schemes that exploit the available CSIT was reported in [12, 30, 31]. We propose iterative and simplified methods of jointly optimizing **B** and **F** under the MI maximization criterion in different CSIT scenarios in Section 3 and Section 4. We also propose to extend the simplified design with the full CSIT to the case of multi-hop systems in Section 3.

## 3 Joint source and relay precoding with the full CSIT

### 3.1 Optimal structures for source and relay precoders

In this section, the source and the relay are assumed to know **H**_{1}, **H**_{2}, **R**_{x}, \(\mathbf {R}_{n_{1}}\), and \(\mathbf {R}_{n_{2}}\) (the full CSIT). In practice, the relay can estimate **H**_{1} from the training signals sent by the source, and the destination can estimate **H**_{2} from the training signals sent by the relay. The relay obtains **H**_{2} by feedback from the destination, and the source obtains **H**_{1} and **H**_{2} by feedback from the relay. Note that feeding back **H**_{2} from the destination to the source is unusual due to the poor condition of the source-destination direct link [39], which is also assumed in this paper. With the full CSIT, we jointly design **B** and **F** to maximize \(\mathcal {I}(\mathbf {B},\mathbf {F})\) under the source and relay power constraints. This design problem can be formulated as:

**U**_{1}, **U**_{2}, **V**_{1}, and **V**_{2} are unitary matrices, and **Λ**_{1} and **Λ**_{2} are diagonal matrices of non-negative eigenvalues in descending order. In order to attain the maximum MI in Problem (6), **B** and **F** should be structured as:

where **Λ**_{b} and **Λ**_{f} are *M*×*M* and *K*×*K* diagonal matrices of non-negative entries with up to *L* positive elements.

### *Proof*

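Applying the matrix identities \( (\mathbf {A} + \mathbf {B} \mathbf {C} \mathbf {D})^{-1} = \mathbf {A}^{-1} - \mathbf {A}^{-1} \mathbf {B} (\mathbf {D} \mathbf {A}^{-1} \mathbf {B} + \mathbf {C}^{-1})^{-1} \mathbf {D} \mathbf {A}^{-1} \) and \( \mathbf {B}^{H} (\mathbf {B} \mathbf {C} \mathbf {B}^{H} + \mathbf {I})^{-1} \mathbf {B} = \mathbf {C}^{-1} - (\mathbf {C} \mathbf {B}^{H} \mathbf {B} \mathbf {C} + \mathbf {C})^{-1} \) to (5) leads to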

**M** in (11) is rewritten more compactly as

**X**_{1} is an *M*×*M* unitary matrix with \( \mathbf {X}_{1} \mathbf {X}_{1}^{H} = \mathbf {I}_{M} \), and **X**_{2} is an *M*×*K* unitary matrix with \( \mathbf {X}_{2} \mathbf {X}_{2}^{H} = \mathbf {I}_{M} \). It is easy to see that they do not affect the source and relay power constraints. After substituting (12) and (13) into (14) and performing some manipulations, **M** becomes

Next, we consider the following properties:

For any Hermitian matrix **A** with main diagonal vector *d*(**A**) and eigenvalue vector *λ*(**A**), we have *d*(**A**)≺*λ*(**A**) [41].

For *m* complex *N*×*N* matrices **A**_{1}, **A**_{2}, …, **A**_{m} with product **B**=**A**_{1}**A**_{2}…**A**_{m} and with singular values arranged in the same order, the vector of singular values of **B** is weakly majorized by the Schur (element-wise) product of the vectors of singular values of these matrices, that is, \( \sigma (\mathbf {B}) \prec _{w} \sigma (\mathbf {A}_{1}) \odot \sigma (\mathbf {A}_{2}) \odot \ldots \odot \sigma (\mathbf {A}_{m}) \) [41].
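Both majorization properties can be checked numerically on random matrices. The following sketch is only an illustration on assumed test data; the helper names `is_weakly_majorized` and `is_majorized` are hypothetical.

```python
import numpy as np

def is_weakly_majorized(x, y, tol=1e-9):
    """True if every partial sum of sorted x is bounded by that of sorted y."""
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

def is_majorized(x, y, tol=1e-9):
    """True if x is majorized by y: weak majorization plus equal totals."""
    return is_weakly_majorized(x, y, tol) and abs(np.sum(x) - np.sum(y)) <= tol

rng = np.random.default_rng(1)

# Property 1: the diagonal of a Hermitian matrix is majorized by its eigenvalues
z = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = z + z.conj().T
d = np.real(np.diag(A))
lam = np.linalg.eigvalsh(A)

# Property 2 (two factors): singular values of a product are weakly majorized
# by the element-wise product of the factors' singular values
P = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
s_prod = np.linalg.svd(P @ Q, compute_uv=False)
s_bound = np.linalg.svd(P, compute_uv=False) * np.linalg.svd(Q, compute_uv=False)
```

`np.linalg.svd` returns singular values in descending order, so the element-wise product pairs the largest singular values with each other, as the property requires.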

**Γ**, we have

Since \( -\log _{2}\big (d(\mathbf {I}_{M} - \mathbf {\Gamma })\big) \) is Schur-convex and increases with *d*(**Γ**), we get \( -\log _{2}\big (d(\mathbf {I}_{M} - \mathbf {\Gamma })\big) \leq -\log _{2}\big (d(\mathbf {I}_{M} - \tilde {\mathbf {\Gamma }})\big). \) This leads to

A function *f* satisfies \( \mathbf {x} \prec _{w} \mathbf {y} \Rightarrow f(\mathbf {x}) \leq f(\mathbf {y}) \) if *f* is a Schur-convex and increasing function [41], and the maximum of the MI is attained when **X**_{1} and **X**_{2} are chosen as \( \mathbf {X}_{1} = \mathbf {I}_{M} \) and \( \mathbf {X}_{2} = \mathbf {U}_{\tilde {H}_{1}}^{H}.\) Obviously, the MI is invariant in **X**_{1} and **X**_{2}.

where the inequality holds because, for any two *N*×*N* positive semidefinite matrices **A** and **B** with eigenvalues *λ*_{i}(**A**) and *λ*_{i}(**B**) arranged in descending order, it follows that \( \text {tr}(\mathbf {A} \mathbf {B}) \geq \sum _{i=1}^{N}{\lambda _{i}(\mathbf {A})\lambda _{N+1-i}(\mathbf {B})} \).
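The trace inequality, and the alignment that achieves equality, can be illustrated as follows. This is a sketch on assumed random test matrices; `random_psd` is a hypothetical helper.

```python
import numpy as np

def random_psd(n, rng):
    """Random positive semi-definite matrix Z Z^H (illustrative test data)."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return z @ z.conj().T

rng = np.random.default_rng(2)
A = random_psd(4, rng)
B = random_psd(4, rng)

w_a, U = np.linalg.eigh(A)            # eigenvalues of A in ascending order
w_b = np.linalg.eigvalsh(B)           # eigenvalues of B in ascending order

# Lower bound: largest eigenvalue of A paired with smallest of B, and so on
lower = float(np.sum(w_a[::-1] * w_b))
trace_ab = float(np.real(np.trace(A @ B)))

# Equality case: B shares the eigenvectors of A, with the eigenvalue order reversed
B_aligned = (U * w_b[::-1]) @ U.conj().T
trace_aligned = float(np.real(np.trace(A @ B_aligned)))
```

The equality case mirrors the role of the condition on the eigenvector matrices in the proof: the transmit power is minimized when the precoder is aligned with the eigenvectors of the channel.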

**X**_{1}, and the minimum of the source power is achieved when \( \mathbf {U}_{\tilde {H}_{1}}^{H} = \mathbf {U}_{1}.\) Besides, because we also have \( \mathbf {X}_{1} = \mathbf {I}_{M} \) as proved above, **B** can be obtained as

**B** as shown in (9).

**X**_{2} does not affect the relay transmit power. Similar to (19), the equality in (20) holds when \( \mathbf {U}_{\tilde {H}_{2}}^{H} = \mathbf {U}_{2}.\) Besides, since we also have \( \mathbf {X}_{2} = \mathbf {U}_{\tilde {H}_{1}}^{H} \) as proved above, **F** can be calculated as

**F** as shown in (10). □

The optimal structure of **B** and **F** comes as a generalization of that for relay systems with i.i.d. channels, independent source signals, and white noise in [4] to relay systems with spatially correlated channels, correlated input signals, and colored noise. It is a development of the relay-only precoding (ROP) of [28, 29], which uses the same **F** and **G** as our design but \( \mathbf {B} = \sqrt {p_{1}/M} \mathbf {I}_{M} \). Like the existing designs with the full CSIT (e.g., [1–6, 8]), these optimal structures of **B** and **F** are independent of the type of channel fading.

**B** whitens the source signal streams, then loads the source power and beamforms the obtained parallel signal streams across the eigenvectors **V**_{1}, while **F** whitens the relay colored noise, then loads the relay power across the eigenvectors \( \mathbf {U}_{1}^{H} \) and **V**_{2}. In this way, the equivalent end-to-end MIMO channel in the presence of **B**, **F**, and **G** is separated into at most *L* independent subchannels (or eigenmodes), as illustrated in Fig. 2. This implies that there is no longer interference among the signal streams, which enhances the system capacity. This channel separation is mathematically expressed by

where \( \mathbf {\Delta } \triangleq \mathbf {\Lambda }_{b}^{\frac {1}{2}} \mathbf {\Lambda }_{1}^{\frac {1}{2}} \mathbf {\Lambda }_{f}^{\frac {1}{2}} \mathbf {\Lambda }_{2}^{\frac {1}{2}} \left (\mathbf {\Lambda }_{2}^{\frac {1}{2}} \mathbf {\Lambda }_{f}^{\frac {1}{2}} \mathbf {\Lambda }_{1}^{\frac {1}{2}} \mathbf {\Lambda }_{b} \mathbf {\Lambda }_{1}^{\frac {1}{2}} \mathbf {\Lambda }_{f}^{\frac {1}{2}} \mathbf {\Lambda }_{2}^{\frac {1}{2}} + \mathbf {\Lambda }_{2}^{\frac {1}{2}} \mathbf {\Lambda }_{f} \mathbf {\Lambda }_{2}^{\frac {1}{2}} + \mathbf {I}_{N} \right)^{-1} \) is a diagonal matrix with non-negative diagonal elements, at most *L* of which, *δ*_{1},…,*δ*_{L}, are positive, \(\mathbf {G} = \mathbf {R}_{x}^{\frac {1}{2}} \mathbf {\Delta },\) and \( \bar {\mathbf {n}}_{1} \triangleq \mathbf {U}_{1}^{H} \mathbf {R}_{n_{1}}^{-\frac {1}{2}}\mathbf {n}_{1}, \bar {\mathbf {n}}_{2} \triangleq \mathbf {U}_{2}^{H} \mathbf {R}_{n_{2}}^{-\frac {1}{2}}\mathbf {n}_{2} \) and \( \bar {\mathbf {x}} \triangleq \mathbf {R}_{x}^{-\frac {1}{2}}\mathbf {x} \) are white due to \( \mathrm {E}\left (\bar {\mathbf {n}}_{1} \bar {\mathbf {n}}_{1}^{H}\right) = \mathbf {I}_{K}, \mathrm {E}\left (\bar {\mathbf {n}}_{2}\bar {\mathbf {n}}_{2}^{H}\right) = \mathbf {I}_{N} \) and \(\mathrm {E}\left (\bar {\mathbf {x}} \bar {\mathbf {x}}^{H}\right) = \mathbf {I}_{M}. \)

**Λ**_{b} and **Λ**_{f}:

Here, we define \( \mathbf {b} \triangleq (b_{1}, \ldots, b_{L})^{T} \) and \( \mathbf {f} \triangleq (f_{1}, \ldots, f_{L})^{T} \), and *λ*_{1,l}, *λ*_{2,l}, *b*_{l}, and *f*_{l} are the *l*-th main diagonal elements of **Λ**_{1}, **Λ**_{2}, **Λ**_{b}, and **Λ**_{f}, respectively.

Define \( v_{l} = (\lambda _{1,l} b_{l} + 1) f_{l} \). Problem (22) can be rewritten as:

Once *v*_{l} is found, *f*_{l} can be easily computed as \( f_{l} = v_{l}/(\lambda _{1,l} b_{l} + 1) \). A globally optimal solution to Problem (23) is hard to obtain, since this problem is still non-concave in **b** and **v**. However, in Sections 3.2 and 3.3, we design an iterative algorithm and a simplified algorithm to find **b** and **v**.
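The change of variables from *f*_{l} to *v*_{l} can be illustrated numerically. The sketch below assumes that the per-eigenmode SNR is \( b_{l}\lambda_{1,l} f_{l}\lambda_{2,l} / (1 + f_{l}\lambda_{2,l}) \), which is inferred here from the diagonal structure of **Δ** and is not quoted from Problems (22) and (23); the function names and the numerical values are hypothetical.

```python
import numpy as np

def mi_from_b_f(b, f, lam1, lam2):
    """Sum MI over eigenmodes in terms of the source and relay loadings b, f."""
    snr = b * lam1 * f * lam2 / (1.0 + f * lam2)   # assumed per-mode SNR
    return 0.5 * np.sum(np.log2(1.0 + snr))

def mi_from_b_v(b, v, lam1, lam2):
    """Same MI after substituting v = (lam1 * b + 1) * f."""
    x, y = lam1 * b, lam2 * v
    return 0.5 * np.sum(np.log2((1 + x) * (1 + y) / (1 + x + y)))

lam1 = np.array([3.0, 1.5, 0.5])   # illustrative source-relay eigenmode gains
lam2 = np.array([2.0, 1.0, 0.2])   # illustrative relay-destination eigenmode gains
b = np.array([1.0, 2.0, 0.5])
f = np.array([0.8, 0.3, 1.2])

v = (lam1 * b + 1.0) * f           # change of variables
f_back = v / (lam1 * b + 1.0)      # recover f once v is known
relay_power = np.sum(v)            # relay power in the new variables
```

Under this assumption the objective depends on *b*_{l} and *v*_{l} only through the products \( \lambda_{1,l} b_{l} \) and \( \lambda_{2,l} v_{l} \), and is unchanged when the two products are swapped, which makes the symmetry between **b** and **v** explicit.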

### 3.2 Iterative power allocation algorithm

**b** and **v**. It is important to note that **b** and **v** behave symmetrically in (23). Hence, if either **b** or **v** is kept unchanged, Problem (23) turns into a standard concave optimization problem. Specifically, when **b** is fixed, it collapses to the problem of optimizing **v** given by

For fixed **v**, it reduces to the problem of optimizing **b** given by

**v** from Problems (24)–(25). Let us consider the function

*v*_{l}≤*p*_{2}. It follows that the objective function \( \mathcal {I}(\mathbf {v}) = \frac {1}{2} \sum _{l=1}^{L}{\log _{2}\big (f(v_{l}) \big)} \) is also concave on the same range of *v*_{l}. In addition, the constraint functions are clearly convex. Hence, Problem (24)–(25) is a standard concave optimization problem [42], and its optimum solution **v** can be found via the Lagrange method as follows.

**v** as

**b** can be inferred as:

*μ*_{b} satisfies

**F** loads more power to the weaker eigenmodes of the relay-destination link *λ*_{2,l} than the modified eigenmodes of the source-relay link \( b_{l} \lambda _{1,l} \) and less power to the stronger eigenmodes of the relay-destination link *λ*_{2,l} than the modified eigenmodes of the source-relay link \( b_{l} \lambda _{1,l} \), while **B** loads more power to the weaker eigenmodes of the source-relay link *λ*_{1,l} than the modified eigenmodes of the relay-destination link \( v_{l} \lambda _{2,l} \) and less power to the stronger eigenmodes of the source-relay link *λ*_{1,l} than the modified eigenmodes of the relay-destination link \( v_{l} \lambda _{2,l} \). Here, the goal is to get as many optimal patterns of pairing \( b_{l} \lambda _{1,l} \) with \( v_{l} \lambda _{2,l} \) as possible to further enhance system capacity. This coordination of **B** and **F** is repeated until the desired system capacity is achieved. This iterative procedure is summarized in Table 1. The computational complexity of the iterative design with the full CSIT comprises performing the two SVDs in (7) and (8) with \( 2 \times \mathcal {O}(L^{3})\) operations, where we take *L*=*N*=*M*=*K* for simplicity, finding the roots **v** and **b** in (31) and (33) with \( 2 \times \mathcal {O}(L)\) operations, and searching for the optimal patterns of pairing \( b_{l} \lambda _{1,l} \) with \( v_{l} \lambda _{2,l} \) with \( 2 \times \mathcal {O}(L!)\) operations. Hence, there are a total of \( 2 \times \mathcal {O}(L^{3}) + \mathcal {N} \times (2 \times \mathcal {O}(L)+ 2 \times \mathcal {O}(L!))\) operations, where \( \mathcal {N} \) represents the number of iterations required to complete this iterative design.

**Table 1** An iterative algorithm to derive **b** and **v**

| Step | Operation |
|---|---|
| Initialize | \( \mathbf {b} = \frac {p_{1}}{M}\mathbf {I}_{M} \) satisfying (27) |
| Repeat | |
| 1) | Compute \( \mathcal {I}(\mathbf {b},\mathbf {v})^{(old)}. \) |
| 2) | Compute \( \mathcal {I}(\mathbf {b},\mathbf {v})^{(new)}. \) |
| Until | \( \mathcal {I}(\mathbf {b},\mathbf {v})^{(new)} - \mathcal {I}(\mathbf {b}, \mathbf {v})^{(old)} \leq \epsilon. \) |
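The alternating procedure can be sketched as follows. This is an illustration under stated assumptions rather than the paper's closed-form updates (31) and (33): the objective is taken to be the symmetric per-eigenmode form \( \frac{1}{2}\sum_{l}\log_{2}\frac{(1+\lambda_{1,l}b_{l})(1+\lambda_{2,l}v_{l})}{1+\lambda_{1,l}b_{l}+\lambda_{2,l}v_{l}} \), all eigenmode gains are assumed positive, each concave subproblem is solved from its KKT condition with bisection on the Lagrange multiplier, and the search over eigenmode pairings is omitted. The names `mi`, `allocate`, and `alternate` and the numerical values are hypothetical.

```python
import numpy as np

def mi(b, v, lam1, lam2):
    """Assumed per-eigenmode MI sum, symmetric in x = lam1*b and y = lam2*v."""
    x, y = lam1 * b, lam2 * v
    return 0.5 * np.sum(np.log2((1 + x) * (1 + y) / (1 + x + y)))

def allocate(a, lam, p, n_bisect=100):
    """Maximize sum_l log((1+a_l)(1+lam_l*w_l)/(1+a_l+lam_l*w_l))
    subject to sum(w) <= p and w >= 0, for fixed a >= 0 and lam > 0."""
    hi = np.max(lam * a / (1.0 + a))   # multiplier at which every w_l is zero
    if hi <= 0.0:                      # degenerate case: objective is constant
        return np.full(len(a), p / len(a))

    def w_of(mu):
        # Positive root of (1+t)(1+a+t) = lam*a/mu with t = lam*w (KKT condition)
        t = 0.5 * (np.sqrt(a**2 + 4.0 * lam * a / mu) - (2.0 + a))
        return np.maximum(t, 0.0) / lam

    lo = hi
    while np.sum(w_of(lo)) < p:        # find a multiplier that overspends p
        lo *= 0.5
    for _ in range(n_bisect):          # bisection on the Lagrange multiplier
        mid = 0.5 * (lo + hi)
        if np.sum(w_of(mid)) > p:
            lo = mid
        else:
            hi = mid
    return w_of(hi)                    # never exceeds the power budget

def alternate(lam1, lam2, p1, p2, eps=1e-6, max_iter=100):
    L = len(lam1)
    b = np.full(L, p1 / L)             # equal-power initialization of b
    v = allocate(lam1 * b, lam2, p2)   # relay update for the initial b
    old = new = mi(b, v, lam1, lam2)
    for _ in range(max_iter):
        b = allocate(lam2 * v, lam1, p1)   # source update for fixed v
        v = allocate(lam1 * b, lam2, p2)   # relay update for fixed b
        new = mi(b, v, lam1, lam2)
        if new - old <= eps:               # stopping rule of Table 1
            break
        old = new
    f = v / (lam1 * b + 1.0)           # recover the relay loading f_l
    return b, v, f, new

lam1 = np.array([3.0, 1.5, 0.5])
lam2 = np.array([2.0, 1.0, 0.2])
b, v, f, mi_opt = alternate(lam1, lam2, p1=10.0, p2=10.0)
mi_uniform = mi(np.full(3, 10.0 / 3), np.full(3, 10.0 / 3), lam1, lam2)
```

Because each update solves its subproblem with the other variable held fixed, the objective cannot decrease from one pass to the next (up to the bisection tolerance), so the result is at least as good as the equal-power allocation used for comparison.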

In the design process, besides the covariance matrices \(\mathbf {R}_{x}, \mathbf {R}_{n_{1}}\), and \( \mathbf {R}_{n_{2}} \), the relay needs the estimated CSI **H**_{1} and the CSI **H**_{2} fed back from the destination, and the source needs the CSI **H**_{1} and **H**_{2} fed back from the relay. With each set of such full CSIT, for fixed **b**, the relay computes **v** and feeds it back to the source. The source updates **b** with the received **v**, and then feeds the updated **b** forward to the relay. To obtain an output of **v** and **b**, this updating is repeated alternately between the relay and the source until \( \mathcal {I}(\mathbf {b},\mathbf {v}) \) converges to a desired value. Due to the computational burden of this iterative procedure, its output **v** and **b**, and thus the precoders **B** and **F**, may be outdated with respect to the current propagation condition. In practical wireless communication systems, codebook and limited feedback schemes are often utilized to efficiently mitigate the overhead and the design complexity. The idea behind these techniques is that the receiver first quantizes the estimated CSI and feeds back the resulting index to the transmitter. The transmitter then picks the desired precoder from a codebook, which is a set of precoders designed offline beforehand by using various CSIT sets [39].

### 3.3 Simplified power allocation algorithm

*x* and *y* are two non-negative scalars. Applying inequality (35) to the objective function of Problem (23) leads to its upper bound that is equal to

**v** and **b** to Problems (37) and (38) are, respectively, given by [20]:

It is intuitive from (39) and (40) that **F** loads more power to the weaker eigenmodes of the relay-destination link *λ*_{2,l} and less power to the stronger *λ*_{2,l}, while **B** loads more power to the weaker eigenmodes of the source-relay link *λ*_{1,l} and less power to the stronger *λ*_{1,l}. In the design process, the relay needs \( \mathbf {R}_{x}, \mathbf {R}_{n_{1}}, \mathbf {R}_{n_{2}}, \mathbf {H}_{1} \), and **H**_{2}, while the source needs \( \mathbf {R}_{x}, \mathbf {R}_{n_{1}} \), and **H**_{1}. With each such set of full CSIT, the relay computes **v** and feeds it back to the source, and then the source calculates **b** with the received **v**. Because there is no need to feed \( \mathbf {R}_{n_{2}} \) and **H**_{2} back to the source, the simplified design with the full CSIT saves a large signaling overhead compared to the iterative counterpart. In terms of computational complexity, this simplified design requires \( 2 \times (\mathcal {O}(L^{3}) + \mathcal {O}(L)+ \mathcal {O}(L!)) \) operations, so it is much simpler than the iterative counterpart. Because **v** and **b** are calculated separately, the simplified scheme may give lower capacity than the iterative scheme. Nevertheless, we can expect its capacity performance to be comparable to that of the iterative counterpart, especially at medium-to-high SNRs. This is mainly due to the fact that in inequality (35), when *x*,*y*→*∞*, 1+*x*+*y*≪(1+*x*)(1+*y*), or equivalently, when **b**,**v**→*∞* (i.e., \( p_{1}/\sigma _{1}^{2}, p_{2}/\sigma _{2}^{2} \rightarrow \infty \)), \( \mathcal {I}(\mathbf {b},\mathbf {v}) \) approaches its upper bound. Interestingly, this simplified design can also be extended to the case of multi-hop systems, as presented below.

*Z*≥2 non-negative scalars *x*_{1},…,*x*_{Z}. The obtained inequality is

*Z*-hop system can be derived as

*b*_{l} of the diagonal matrix **Λ**_{b} of the source precoding matrix

*f*_{i,l} of the diagonal matrix **Λ**_{f,i} of the relay precoding matrices

**V**_{i}, **U**_{i} come from the SVDs of \( \mathbf {R}_{n_{i}}^{-\frac {1}{2}} \mathbf {H}_{i} = \mathbf {U}_{i} \mathbf {\Lambda }_{i}^{\frac {1}{2}} \mathbf {V}_{i}^{H} \). Solving Problems (43) and (44) yields [20]

Again, it is easy to see that with the coordination of the precoders **B**, **F**_{i} and a respective linear MMSE equalizer at the destination, the *Z*-hop MIMO relay channel is also decoupled. Notably, this extended design with the full CSIT has the same solution as the MMI asymptotic precoder design of [30, 31], where the same problem was studied. This shows the flexibility of our proposed design methods.

Theorem 1 below concludes the main results on the joint design of source and relay precoders with the full CSIT.

### **Theorem 1**

The instantaneous mutual information \( \mathcal {I}(\mathbf {B},\mathbf {F}) \) attains its maximum under the power constraints \( \text {tr}(\mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H}) \leq p_{1} \) and \( \text {tr}\Big (\mathbf {F} \left (\mathbf {H}_{1} \mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H} \mathbf {H}_{1}^{H} + \mathbf {R}_{n_{1}} \right) \mathbf {F}^{H}\Big) \leq p_{2} \) when **B** and **F** have the optimal structures \( \mathbf {B} = \mathbf {V}_{1} \mathbf {\Lambda }_{b}^{\frac {1}{2}} \mathbf {R}_{x}^{-\frac {1}{2}} \) and \( \mathbf {F} = \mathbf {V}_{2} \mathbf {\Lambda }_{f}^{\frac {1}{2}} \mathbf {U}_{1}^{H} \mathbf {R}_{n_{1}}^{-\frac {1}{2}} \). Here, **V**_{1}, **U**_{1} and **V**_{2} are unitary matrices from the SVDs \( \mathbf {R}_{n_{1}}^{-\frac {1}{2}} \mathbf {H}_{1} = \mathbf {U}_{1} \mathbf {\Lambda }_{1}^{\frac {1}{2}} \mathbf {V}_{1}^{H} \) and \( \mathbf {R}_{n_{2}}^{-\frac {1}{2}} \mathbf {H}_{2} = \mathbf {U}_{2} \mathbf {\Lambda }_{2}^{\frac {1}{2}} \mathbf {V}_{2}^{H} \), and **Λ**_{b} and **Λ**_{f} are diagonal matrices of non-negative entries, which can be determined alternately by the iterative algorithm (Section 3.2) or separately by the simplified algorithm (Section 3.3).
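The precoder structures in Theorem 1 can be assembled numerically as follows. This is a minimal NumPy sketch, not the authors' implementation: it assumes square channels (*M*=*K*=*N*), takes the power loadings `b` and `f` as given rather than computing them, and uses randomly drawn test matrices. The function names `inv_sqrt_psd`, `build_precoders` and `random_pd` are hypothetical. The sketch also evaluates the two power constraints, which under these structures reduce to the sum of the entries of `b` and to the sum of \( f_{l}(\lambda _{1,l} b_{l} + 1) \), respectively.

```python
import numpy as np

def inv_sqrt_psd(R):
    """Inverse Hermitian square root of a positive definite matrix."""
    w, Q = np.linalg.eigh(R)
    return (Q * w ** -0.5) @ Q.conj().T

def build_precoders(H1, H2, Rx, Rn1, Rn2, b, f):
    """Assemble B and F with the Theorem 1 structures for given loadings b, f."""
    Rn1_is = inv_sqrt_psd(Rn1)
    # SVDs of the noise-whitened channels: R_n^{-1/2} H = U Lambda^{1/2} V^H
    U1, s1, V1h = np.linalg.svd(Rn1_is @ H1)
    _, _, V2h = np.linalg.svd(inv_sqrt_psd(Rn2) @ H2)
    B = V1h.conj().T @ np.diag(np.sqrt(b)) @ inv_sqrt_psd(Rx)
    F = V2h.conj().T @ np.diag(np.sqrt(f)) @ U1.conj().T @ Rn1_is
    return B, F, s1 ** 2  # s1**2 are the eigenmode gains lambda_{1,l}

def random_pd(n, rng):
    """Random Hermitian positive definite matrix (test data only)."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T + n * np.eye(n)

rng = np.random.default_rng(0)
n = 4  # assumption: M = K = N = 4
H1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Rx, Rn1, Rn2 = (random_pd(n, rng) for _ in range(3))
b = np.array([2.0, 1.5, 1.0, 0.5])  # hypothetical source loadings
f = np.array([0.4, 0.3, 0.2, 0.1])  # hypothetical relay loadings

B, F, lam1 = build_precoders(H1, H2, Rx, Rn1, Rn2, b, f)
source_power = np.trace(B @ Rx @ B.conj().T).real
relay_power = np.trace(
    F @ (H1 @ B @ Rx @ B.conj().T @ H1.conj().T + Rn1) @ F.conj().T
).real
```

With these structures the constraints depend only on the diagonal loadings, which is why the remaining design problem is a power allocation over `b` and `f`.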

## 4 Joint source and relay precoding with partial CSIT

### 4.1 Suboptimal structures for source and relay precoders

In Section 3, we obtained the precoder designs with the full CSIT. However, it is hard for the relay and the source to obtain **H**_{2} when the destination moves rapidly. This is basically because a large amount of signalling overhead is needed to feed back **H**_{2}, while the feedback channels in practical wireless systems are commonly rate-limited. For these reasons, in this section, we assume that the source and the relay have **R**_{x}, \( \mathbf {R}_{n_{1}} \), \( \mathbf {R}_{n_{2}} \), **H**_{1}, and only the covariance information **Θ**_{2} and **Ω**_{2} of **H**_{2} (the partial CSIT). With the partial CSIT, we jointly design **B** and **F** to maximize \( \mathrm {E_{H_{2}}}\big (\mathcal {I}(\mathbf {B},\mathbf {F})\big) \) under the source and relay transmit power constraints. However, it is intractable to compute \( \mathrm {E_{H_{2}}}\big (\mathcal {I}(\mathbf {B},\mathbf {F})\big) \) exactly, because the expectation over **H**_{2} must be taken of an expression that involves the unknowns **B** and **F**. As an alternative, we use an upper bound on it, which is derived below.

where \( \mathbf {W} \triangleq \left (\mathbf {R}_{n_{1}}^{-\frac {1}{2}}\mathbf {H}_{1} \mathbf {B} \mathbf {R}_{x}^{\frac {1}{2}}\right)\left (\mathbf {R}_{n_{1}}^{-\frac {1}{2}}\mathbf {H}_{1} \mathbf {B} \mathbf {R}_{x}^{\frac {1}{2}}\right)^{H}+\mathbf {I}_{K}.\)

*f*(**X**)=**X**^{−1} is convex in **X** [43], **M** in (45) is convex in \( \mathbf {H}_{2}^{H} \mathbf {R}_{n_{2}}^{-1} \mathbf {H}_{2} \) for a given **H**_{1}. By Jensen’s inequality [44] and the property which states that for any matrix **H** with the distribution \( \mathbf {H} \sim \mathcal {CN}(\mathbf {0},\mathbf {\Theta } \otimes \mathbf {\Omega }) \), \( E_{H}(\mathbf {H} \mathbf {A} \mathbf {H}^{H}) = \text {tr}(\mathbf {A} \mathbf {\Theta }^{T}) \mathbf {\Omega } \) and \( E_{H}(\mathbf {H}^{H} \mathbf {A} \mathbf {H}) = \text {tr}(\mathbf {\Omega } \mathbf {A}) \mathbf {\Theta }^{T} \) [33], we have

**U**_{1}, **U**_{θ}, **V**_{1}, and **V**_{θ} are unitary matrices, and **Λ**_{1} and **Λ**_{θ} are diagonal matrices of non-negative eigenvalues in descending order. In order to achieve a maximum of \( \mathcal {\dot {I}}_{erg}(\mathbf {B},\mathbf {F}) \) in Problem (47), **B** and **F** should take the forms:

where \( \mathbf {\Lambda }_{b}^{(p)} \) and \( \mathbf {\Lambda }_{f}^{(p)} \) are *M*×*M* and *K*×*K* diagonal matrices of non-negative entries with up to *L* positive elements.

### *Proof*

**M**_{L} in (46) is simplified to

Clearly, **M**_{L} has the same form as **M** in (14). Therefore, the proof of the optimal source and relay precoder structures with the full CSIT in (9) and (10), presented in Section 3, can be used to derive those with the partial CSIT in (50) and (51). □

**R**_{x}=**I**_{M}, \( \mathbf {R}_{n_{1}} =\sigma _{n_{1}}^{2} \mathbf {I}_{K} \), \( \mathbf {R}_{n_{2}} =\sigma _{n_{2}}^{2} \mathbf {I}_{N} \), **Ω**_{2}=**I**_{N} into (50) and (51), the source and relay precoding reduce to the ROP in [5, 6]. Since the relay eigen-beamer directions **V**_{θ} do not match the relay-destination subchannel directions **V**_{2}, the obtained partial-CSIT precoders are clearly suboptimal compared to the full-CSIT precoders. Therefore, the system capacity enhancement relies heavily on how power is allocated across the source and relay antennas. This task is equivalent to solving a problem of optimizing \( \mathbf {\Lambda }_{b}^{(p)} \) and \( \mathbf {\Lambda }_{f}^{(p)} \) given by:

where \(\gamma \triangleq \text {tr}\left (\mathbf {\Omega }_{2}\mathbf {R}_{n_{2}}^{-1}\right)^{\frac {1}{2}}, \mathbf {b}^{(p)} \triangleq \left (b_{1}^{(p)}, \ldots, b_{L}^{(p)}\right)^{T} \) is the diagonal vector of \( \mathbf {\Lambda }_{b}^{(p)},\) and \( \mathbf {f}^{(p)} \triangleq \left (f_{1}^{(p)}, \ldots, f_{L}^{(p)}\right)^{T} \) is the diagonal vector of \( \mathbf {\Lambda }_{f}^{(p)},\)

Once \( v_{l}^{(p)} \) is found, \( f_{l}^{(p)} \) is straightforward to calculate as \( {f}_{l}^{(p)} = v_{l}^{(p)}/(\lambda _{1,l} {b}_{l}^{(p)} + 1) \). Problem (56) cannot be solved directly because it is not concave in **b**^{(p)} and **v**^{(p)} jointly. Therefore, we propose to deal with this problem by an iterative algorithm in Section 4.2 and by a simplified algorithm in Section 4.3.

### 4.2 Iterative power allocation algorithm

**b**^{(p)} and **v**^{(p)} are symmetric to each other in (56). Therefore, if either **b**^{(p)} or **v**^{(p)} is kept fixed, Problem (56) becomes a standard concave optimization problem. Specifically, for a given **b**^{(p)}, it collapses to the problem of optimizing **v**^{(p)} given by

**v**^{(p)}, it reduces to the problem of optimizing **b**^{(p)} given by

**b**^{(p)} and **v**^{(p)} alternately is also developed, as shown in Table 2.

**Table 2** An iterative algorithm to derive **b**^{(p)} and **v**^{(p)}

Initialize \( \mathbf {b}^{(p)} = \frac {p_{1}}{M}\mathbf {I}_{M} \) satisfying (60).

Repeat

1) Find

\( v_{l}^{(p)} = \left [\sqrt {\left (\frac {\lambda _{1,l}}{2 \gamma \lambda _{\theta,l}}b_{l}^{(p)}\right)^{2} + \frac {\lambda _{1,l}}{\gamma \lambda _{\theta,l}} b_{l}^{(p)} \mu _{v}} - \frac {\lambda _{1,l}}{2 \gamma \lambda _{\theta,l}}b_{l}^{(p)} - \frac {1}{\gamma \lambda _{\theta,l}} \right ]^{+}, \)

where \( \mu _{v} \) satisfies

\( \sum _{l=1}^{L} {\left [\sqrt {\left (\frac {\lambda _{1,l}}{2 \gamma \lambda _{\theta,l}}b_{l}^{(p)}\right)^{2} + \frac {\lambda _{1,l}}{\gamma \lambda _{\theta,l}} b_{l}^{(p)} \mu _{v}} - \frac {\lambda _{1,l}}{2 \gamma \lambda _{\theta,l}}b_{l}^{(p)} - \frac {1}{\gamma \lambda _{\theta,l}} \right ]^{+}} = p_{2}. \)

Compute \( \mathcal {\dot {I}}_{erg}(\mathbf {b}^{(p)},\mathbf {v}^{(p)})^{(old)}. \)

2) Find

\( b_{l}^{(p)} = \left [\sqrt {\left (\frac {\gamma \lambda _{\theta,l}}{2 \lambda _{1,l}}v_{l}^{(p)}\right)^{2} + \frac {\gamma \lambda _{\theta,l}} { \lambda _{1,l}} v_{l}^{(p)} \mu _{b}} - \frac {\gamma \lambda _{\theta,l}}{2 \lambda _{1,l}}v_{l}^{(p)} - \frac {1}{\lambda _{1,l}} \right ]^{+}, \)

where \( \mu _{b} \) satisfies

\( \sum _{l=1}^{L} {\left [\sqrt {\left (\frac {\gamma \lambda _{\theta,l}}{2 \lambda _{1,l}}v_{l}^{(p)}\right)^{2} + \frac {\gamma \lambda _{\theta,l}} { \lambda _{1,l}} v_{l}^{(p)} \mu _{b}} - \frac {\gamma \lambda _{\theta,l}}{2 \lambda _{1,l}}v_{l}^{(p)} - \frac {1}{\lambda _{1,l}} \right ]^{+}} = p_{1}. \)

Compute \( \mathcal {\dot {I}}_{erg}(\mathbf {b}^{(p)},\mathbf {v}^{(p)})^{(new)}. \)

Until \( \mathcal {\dot {I}}_{erg}(\mathbf {b}^{(p)},\mathbf {v}^{(p)})^{(new)} - \mathcal {\dot {I}}_{erg}(\mathbf {b}^{(p)},\mathbf {v}^{(p)})^{(old)} \leq \epsilon. \) Here, \( \epsilon \) is a desired accuracy.
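The alternating updates of the iterative algorithm can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code. The per-eigenmode objective in `bound` is an assumed form, chosen because its stationarity conditions reproduce the closed-form updates above. The multipliers \( \mu _{v} \) and \( \mu _{b} \) are found here by bisection, which is our choice and is not specified in the text. The equal-power start uses *L* entries, and the gains `lam1` and `g` (standing for \( \lambda _{1,l} \) and \( \gamma \lambda _{\theta,l} \)) are hypothetical values.

```python
import numpy as np

def bound(b, v, lam1, g):
    """Sum rate over eigenmodes; assumed form of the objective."""
    x, y = lam1 * b, g * v
    return np.sum(np.log2((1 + x) * (1 + y) / (1 + x + y)))

def update(gain, other_load, budget):
    """Closed-form loading for one side, with the multiplier mu found
    by bisection so that the loadings sum to `budget`."""
    c = other_load / gain

    def alloc(mu):
        return np.maximum(
            np.sqrt((c / 2) ** 2 + c * mu) - c / 2 - 1 / gain, 0.0
        )

    lo, hi = 0.0, 1.0
    for _ in range(200):  # grow the bracket until it covers the budget
        if alloc(hi).sum() >= budget:
            break
        hi *= 2.0
    for _ in range(200):  # bisection on mu
        mid = 0.5 * (lo + hi)
        if alloc(mid).sum() < budget:
            lo = mid
        else:
            hi = mid
    return alloc(hi)

def iterative_allocation(lam1, g, p1, p2, eps=1e-6, max_iter=100):
    """Alternate between the v-update and the b-update until the gain
    in the objective over one pass is at most eps."""
    b = np.full(len(lam1), p1 / len(lam1))  # equal-power start
    v = np.zeros_like(b)
    for _ in range(max_iter):
        v = update(g, lam1 * b, p2)   # step 1: v for fixed b
        old = bound(b, v, lam1, g)
        b = update(lam1, g * v, p1)   # step 2: b for fixed v
        new = bound(b, v, lam1, g)
        if new - old <= eps:
            break
    return b, v

lam1 = np.array([2.0, 1.0, 0.5, 0.1])   # hypothetical lambda_{1,l}
g = np.array([3.0, 1.0, 0.3, 0.05])     # hypothetical gamma * lambda_{theta,l}
b_opt, v_opt = iterative_allocation(lam1, g, p1=10.0, p2=10.0)
```

Each update solves a concave subproblem exactly (up to the bisection tolerance), so the objective cannot decrease from one step to the next, which is what makes the stopping rule meaningful.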

In each iteration of this iterative algorithm, **F** and **B** coordinate to search for as many optimal patterns of pairing \( \lambda _{1,l} {b}_{l}^{(p)} \) with \( \gamma \lambda _{\theta,l} {v}_{l}^{(p)} \) as possible. In this way, the interference due to the mismatch between the relay beamformer **V**_{θ} and the eigenvectors of the relay-destination link **V**_{2} decreases, and the system capacity increases after each iteration. As a result, the overall system capacity increases after the algorithm terminates. The complexity of the iterative precoder design with the partial CSIT is nearly the same as that of the iterative precoder design with the full CSIT, with a total of \( 2 \times \mathcal {O}(L^{3}) + \mathcal {N} \times (2 \times \mathcal {O}(L)+ 2 \times \mathcal {O}(L!))\) operations. In the design process, besides **R**_{x}, \( \mathbf {R}_{n_{1}} \), \( \mathbf {R}_{n_{2}} \), and **H**_{1}, the source and the relay only need the covariance information **Θ**_{2} and **Ω**_{2} of **H**_{2}. Since these covariance matrices change much more slowly than the channel realization **H**_{2}, a large amount of signaling overhead and design complexity is saved compared to the iterative design with the full CSIT. These benefits broaden the applicability of the iterative design with the partial CSIT in practical communication systems, especially when it is realized by the codebook and limited feedback techniques [39] discussed in Section 3.2.

### 4.3 Simplified power allocation algorithm

**v**^{(p)} and **b**^{(p)} to the respective problems (62) and (63) are given by [20]:

It is obvious from (64) and (65) that **F** loads more power to the weaker eigenmodes of the transmit covariance matrix of the relay-destination link *γλ*_{θ,l} and less power to the stronger *γλ*_{θ,l}, while **B** loads more power to the weaker eigenmodes of the source-relay link *λ*_{1,l} and less power to the stronger *λ*_{1,l}. The relay needs **R**_{x}, \( \mathbf {R}_{n_{1}} \), \( \mathbf {R}_{n_{2}} \), **H**_{1}, **Θ**_{2} and **Ω**_{2}, while the source only needs **R**_{x}, \( \mathbf {R}_{n_{1}} \), and **H**_{1}. With each such set of partial CSIT, the relay computes **v**^{(p)} and feeds it back to the source, and then the source calculates **b**^{(p)} with the received **v**^{(p)}. Obviously, the simplified design with the partial CSIT requires less signaling overhead than the iterative design with the partial CSIT and the simplified design with the full CSIT. Besides, its computational complexity is the same as that of the simplified design with the full CSIT, and much less than that of the iterative design with the partial CSIT. Despite its simplicity, the simplified design with the partial CSIT works well, especially at high SNRs. This is mainly due to the fact that in inequality (35), when *x*,*y*→*∞*, 1+*x*+*y*≪(1+*x*)(1+*y*), or equivalently, when **b**^{(p)},**v**^{(p)}→*∞* (i.e., \( p_{1}/\sigma _{1}^{2}, p_{2}/\sigma _{2}^{2} \rightarrow \infty \)), \( \mathcal {\dot {I}}_{erg}(\mathbf {b}^{(p)},\mathbf {v}^{(p)}) \) approaches its upper bound.

Theorem 2 below summarizes the main results on jointly designing source and relay precoders with the partial CSIT.

### **Theorem 2**

The average mutual information \( \mathrm {E_{H_{2}}}\big (\mathcal {I}(\mathbf {B},\mathbf {F})\big) \) achieves its maximum under the power constraints \( \text {tr}(\mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H}) \leq p_{1} \) and \( \text {tr}\left (\mathbf {F} (\mathbf {H}_{1} \mathbf {B} \mathbf {R}_{x} \mathbf {B}^{H} \mathbf {H}_{1}^{H} + \mathbf {R}_{n_{1}}) \mathbf {F}^{H}\right) \leq p_{2} \) when **B** and **F** are suboptimally constructed as \( \mathbf {B} = \mathbf {V}_{1} \left [\mathbf {\Lambda }_{b}^{(p)}\right ]^{\frac {1}{2}} \mathbf {R}_{x}^{-\frac {1}{2}} \) and \( \mathbf {F} = \mathbf {V}_{\theta } \left [\mathbf {\Lambda }_{f}^{(p)}\right ]^{\frac {1}{2}} \mathbf {U}_{1}^{H} \mathbf {R}_{n_{1}}^{-\frac {1}{2}} \). Here, **U**_{1}, **U**_{θ}, **V**_{1} and **V**_{θ} are unitary matrices of \( \mathbf {R}_{n_{1}}^{-\frac {1}{2}} \mathbf {H}_{1} = \mathbf {U}_{1} \mathbf {\Lambda }_{1}^{\frac {1}{2}} \mathbf {V}_{1}^{H} \) and \(\gamma ^{\frac {1}{2}}\mathbf {\Theta }_{2}^{\frac {1}{2}}= \mathbf {U}_{\theta } \mathbf {\Lambda }_{\theta }^{\frac {1}{2}} \mathbf {V}_{\theta }^{H}\), and \( \mathbf {\Lambda }_{b}^{(p)} \) and \( \mathbf {\Lambda }_{f}^{(p)} \) are diagonal matrices of non-negative entries which can be determined alternately by the iterative algorithm (Section 4.2) or separately by the simplified algorithm (Section 4.3).

## 5 Simulation results

**Θ**_{i} and **Ω**_{i}, *i*∈{1,2} in the corresponding Toeplitz forms [6, 12] as

*r*_{t}, *r*_{r} meet *r*_{t}, *r*_{r}∈(0,1]. Like [2, 5, 6], the source power *p*_{1} includes the source-relay path-loss, and the relay power *p*_{2} includes the relay-destination path-loss. \( \mathrm {SNR_{1}} \triangleq p_{1}/\sigma _{1}^{2} \) is defined as the first-hop SNR, and \( \mathrm {SNR_{2}} \triangleq p_{2}/\sigma _{2}^{2} \) as the second-hop SNR. The source, relay, and destination nodes all have four antennas (*M*=*K*=*N*=4). The correlation matrices of source signals, relay and destination noises are chosen as

The noise correlation matrices are chosen randomly, provided that the noises are colored enough but do not totally interfere with the other channel factors. The first condition makes sure that the noises have a “colored” effect on the system, while the second one maintains the practical meaning of the wireless model. The matrix \( \mathbf {\Psi }_{n_{1}} \) has the vector of eigenvalues [1.5000, 1.2000, 0.8000, 0.5000], \( \mathbf {\Psi }_{n_{2}} \) has the vector of eigenvalues [2.6000, 0.7000, 0.5000, 0.2000], and **Ψ**_{x} has the vector of eigenvalues [2.9000, 0.5000, 0.4000, 0.2000]. The Matlab command used to generate \( \mathbf {\Psi }_{n_{1}} \) randomly is gallery('randcorr', [1.5000, 1.2000, 0.8000, 0.5000])^{1}. Like previous references [28–31, 45–47], the colored noise vector \( \mathbf {n}_{1} \sim \mathcal {CN}(\mathbf {0},\sigma ^{2}_{1}\mathbf {\Psi }_{n_{1}})\) is generated as \( \mathbf {n}_{1} = \mathbf {\Psi }_{n_{1}}^{\frac {1}{2}} \mathbf {n}_{w,1},\) where \( \mathbf {n}_{w,1} \sim \mathcal {CN}(\mathbf {0},\sigma ^{2}_{1}\mathbf {I}_{K})\) is the white noise vector. The matrix \( \mathbf {\Psi }_{n_{1}}^{\frac {1}{2}} \) plays the role of a digital filter, and it exists uniquely since \( \mathbf {\Psi }_{n_{1}} \) is positive semidefinite [19, 40]. **Ψ**_{x}, **x** and \( \mathbf {\Psi }_{n_{2}}, \mathbf {n}_{2} \) are generated in the same way as \( \mathbf {\Psi }_{n_{1}} \) and **n**_{1}. Note that in [21–25], the covariance matrix of the relay colored noise is generated via a first-order autoregressive filter as \(\mathbf {R}_{n_{1}}(i,j) = \alpha _{1} r_{1} \eta _{1}^{|i-j|},\) where *r*_{1} is a normalization factor to keep \(\text {tr}(\mathbf {R}_{n_{1}}) = \alpha _{1} K, \) and *α*_{1} denotes the interference power from the neighbouring interferers. In comparison with our colored noise simulation method, *α*_{1} plays the role of \( \sigma ^{2}_{1} \), while \( r_{1} \eta _{1}^{|i-j|} \) plays the role of \( \mathbf {\Psi }_{n_{1}}(i,j).\)
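The colored-noise generation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the simulation code behind the reported results: instead of a matrix drawn by the Matlab `gallery` command, the correlation matrix below uses the first-order autoregressive form \( \eta ^{|i-j|} \) with a hypothetical \( \eta = 0.5 \), and the noise power, sample count and function names are likewise our assumptions.

```python
import numpy as np

def ar1_correlation(n, eta):
    """n x n correlation matrix with entries eta**|i-j| (unit diagonal)."""
    idx = np.arange(n)
    return eta ** np.abs(idx[:, None] - idx[None, :])

def psd_sqrt(Psi):
    """Hermitian square root of a positive semidefinite matrix."""
    w, Q = np.linalg.eigh(Psi)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.conj().T

def colored_noise(Psi, sigma2, num, rng):
    """Draw `num` vectors n = Psi^{1/2} n_w with n_w ~ CN(0, sigma2 * I)."""
    k = Psi.shape[0]
    n_w = np.sqrt(sigma2 / 2) * (
        rng.standard_normal((k, num)) + 1j * rng.standard_normal((k, num))
    )
    return psd_sqrt(Psi) @ n_w

rng = np.random.default_rng(1)
Psi = ar1_correlation(4, 0.5)  # hypothetical eta = 0.5
noise = colored_noise(Psi, sigma2=2.0, num=200_000, rng=rng)
# Sample covariance; it should be close to sigma2 * Psi for many samples.
sample_cov = noise @ noise.conj().T / noise.shape[1]
```

Filtering white noise through the Hermitian square root is one convenient choice; any factor \( \mathbf {G} \) with \( \mathbf {G}\mathbf {G}^{H} = \mathbf {\Psi }_{n_{1}} \), such as a Cholesky factor for a positive definite matrix, gives the same covariance.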

*r*_{t}=*r*_{r}=0.5 are chosen for all the involved channels.

All the examined precoding schemes provide substantial capacity gains over the NAF scheme. Although the ROP in [28, 29] uses the relay precoder alone, it performs better than the iterative precoding in [4], which employs both the source and relay precoders. This is mainly because knowledge of the signal, relay noise, and destination noise correlation matrices is not taken into account in designing the iterative precoding in [4]. The proposed iterative precoding delivers the largest gain, while the proposed simplified precoding stays close to it over nearly the whole SNR_{1} range. There are clear performance gaps between our proposed designs and the other two precoding schemes. To clarify these gaps, we also use the equal power precoding with the full CSIT, which has the same optimal structures as the proposed precoding designs with the full CSIT but allocates power equally across the source antennas and across the relay antennas. Results reveal that it offers higher capacity than the ROP in [28, 29] and the iterative precoding in [4] at nearly all SNRs shown (specifically, at SNR_{1} values of 5 dB and above for Fig. 3 and at SNR values of 10 dB and above for Fig. 4), and even provides identical capacity to the proposed simplified precoding at high SNRs. This implies that the optimality of the precoder structures alone contributes a significant portion of the capacity enhancement. A well-designed water-filling power allocation adds further capacity, especially in the low SNR regime.

SNR_{1}, SNR_{2} and the spatial correlations. This statement is shown in Figs. 5, 6, 7, and 8. These figures provide a number of performance comparisons among all the proposed precoding designs with the full CSIT as well as with the partial CSIT. As mentioned in Section 4.1, the ROP schemes in [5, 6], designed for two-hop relay systems with transmit-side spatially correlated channels, white noises, and independent symbols, are special cases of the proposed partial-CSIT precoding designs. Using them as performance references would therefore add no information, so their behaviours are not shown in Figs. 5, 6, 7 and 8. Figure 5 shows the capacity of systems having SNR_{1}=SNR_{2}=SNR and *r*_{t}=*r*_{r}=0.3, and Fig. 6 shows the capacity of systems having SNR_{1}=SNR_{2}=SNR and *r*_{t}=0.95, *r*_{r}=0.3. The correlation coefficient *r*_{t}=0.95 represents a strong correlation among the transmit antennas, since the corresponding correlation matrix has the eigenvalues [3.7568, 0.1627, 0.0506, 0.0300] with a very large condition number of about 125. When *r*_{t} increases from 0.3 to 0.95, the capacity gains of the proposed precoding designs over the NAF scheme increase, and the capacity gaps between the full-CSIT-based designs and the partial-CSIT-based designs decrease. The partial-CSIT-based designs provide higher capacity than the equal power precoding at low to medium SNRs. These higher-capacity regions are enlarged further as the transmit correlation increases.
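The eigenvalues quoted for *r*_{t}=0.95 can be checked with a short NumPy computation. The exponential Toeplitz form \( r^{|i-j|} \) used below is an assumption on our part, since the defining equation is not reproduced here; it is chosen because it matches the quoted eigenvalues. The function name `exp_correlation` is hypothetical.

```python
import numpy as np

def exp_correlation(n, r):
    """n x n Toeplitz correlation matrix with entries r**|i-j| (assumed form)."""
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

# Eigenvalues in descending order for four antennas and r_t = 0.95.
eigs = np.sort(np.linalg.eigvalsh(exp_correlation(4, 0.95)))[::-1]
# Ratio of the largest to the smallest eigenvalue.
cond = eigs[0] / eigs[-1]
```

Under this assumption the computed eigenvalues agree with the quoted ones to about three decimal places, and almost all of the energy sits in a single eigenmode, which is the sense in which the transmit antennas are strongly correlated.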

How the schemes based on the partial CSIT behave for *r*_{t}=0.95, *r*_{r}=0.3 is revealed more clearly in Figs. 7 and 8. Figure 7 plots the capacity as a function of SNR_{1} with SNR_{2}=12 dB, and Fig. 8 shows the capacity as a function of SNR_{2} with SNR_{1}=12 dB. These figures show clearer capacity gains of the precoding techniques over the NAF technique. Among the considered techniques, the iterative precoding technique based on the full CSIT still provides the best capacity gain. Besides, the capacity gaps among all the precoding schemes also increase, especially for the ones based on the full CSIT. Figure 7 indicates that when SNR_{2}=12 dB both of our designs with the partial CSIT outperform the equal power precoding and, more interestingly, the iterative design with the partial CSIT yields higher capacity than the simplified design with the full CSIT over the low-to-medium SNR_{1} range, up to 15 dB. Figure 8 indicates that when SNR_{1}=12 dB, the iterative design with the partial CSIT gives the same performance as the simplified design with the full CSIT at low-to-medium SNR_{2} levels, but gives higher performance than this scheme at higher SNR_{2} levels. Meanwhile, the simplified design with the partial CSIT not only extends the capacity gap over the equal power precoding but also yields a capacity curve closer to that of the simplified design with the full CSIT, especially at low and high SNR_{2} levels.

These observations verify the effectiveness of the water-filling-type power allocation strategies of the partial-CSIT-based designs, especially when the involved channels are in medium SNR environments and are strongly affected by spatially correlated fading at the transmit sides, regardless of the mismatch between the relay eigen-beamer directions (**V**_{θ}) and the relay-destination subchannel directions (**V**_{2}) mentioned in Section 4.1.

## 6 Conclusions

In summary, we developed iterative and simplified methods for jointly designing source and relay precoders with the full CSIT and with the partial CSIT for generally correlated dual-hop MIMO relay systems without the direct link under the MMI criterion. These general systems have spatially correlated channels, mutually correlated source signals and colored noises. We showed that the optimal source and relay precoders obtained with the full CSIT, together with the destination equalizer, decouple the equivalent end-to-end MIMO channel into orthogonal SISO subchannels. We also extended the simplified precoder design with the full CSIT to the multi-hop relay system case. Simulation results showed that the proposed joint precoder designs with the full CSIT provide higher capacity than the existing designs. Also, the proposed joint precoder designs with the partial CSIT work well, especially when the channels are strongly correlated at the transmit sides and at medium-to-high SNRs, while they require much lower computational complexity and less feedback overhead. In future work, we will consider the relay system case where the distance between the source and the destination is short enough for them to communicate with each other directly, so that the direct link is taken into account. We will propose joint source and relay precoding schemes that exploit the CSI of the compound relaying link and the direct link for system capacity maximization.

## 7 Endnote

^{1} With a given *K*×1 vector **x** of non-negative elements summing to *K*, we always create a *K*×*K* real symmetric positive semidefinite matrix, i.e., correlation matrix **A** as **A**=**U**^{H}diag(**x**)**U**, where **U** is a unitary matrix. Here, the correlation matrix **A** has unit elements on its main diagonal and eigenvalues given by **x** [41].

## Declarations

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- X Tang, Y Hua, Optimal design of non-regenerative MIMO wireless relays. IEEE Trans. Wireless Commun.
**6**(4), 1398–1407 (2007).View ArticleGoogle Scholar - O Munoz, J Vidal, A Agustin, Linear transceiver design in nonregenerative relays with channel sate information. IEEE Trans. Signal Process.
**55**(6), 2593–2604 (2007).MathSciNetView ArticleGoogle Scholar - R Mo, YH Chew, Precoder design for non-regenerative MIMO relay systems. IEEE Trans. Wireless Commun.
**8**(10), 5041–5049 (2009).View ArticleGoogle Scholar - Z Fang, Y Hua, JC Koshy, in Fourth IEEE Workshop on Sensor Array Multi-channel Signal Processing. Joint source and relay optimization for a non-regenerative MIMO relay (Waltham, 2006), pp. 239–243.Google Scholar
- C Jeong, H-M Kim, Precoder design of non-regenerative relays with covariance feedback. IEEE Commun. Lett.
**13**(12), 920–922 (2009).View ArticleGoogle Scholar - C Jeong, B Seo, SR Lee, H-M Kim, I-M Kim, Relay precoding for non-regenerative MIMO relay systems with partial CSI feedback. IEEE Trans. Wireless Commun.
**11**(5), 1698–1711 (2012).View ArticleGoogle Scholar - D Gesbert, M Shafi, A N, From theory to practice: an overview of MIMO space-time coded wireless systems. IEEE J. Selected Areas Commun.
**3**(21), 281–302 (2003).View ArticleGoogle Scholar - Y Rong, X Tang, Y Hua, A unified framework for optimizing linear nonregenerative multicarrier MIMO relay communication systems. IEEE Trans. Signal Process.
**57**(12), 4837–4851 (2009).MathSciNetView ArticleGoogle Scholar - D-H Kim, HM Kim, in
*21st Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications*. MMSE precoder design for a non-regenerative MIMO relay with covariance feedback, (2010), pp. 461–464. doi:10.1109/PIMRC.2010.5671893. - L Gopal, Y Rong, Z Zang, in
*The 17th Asia Pacific Conference on Communications*. Joint MMSE transceiver design in non-regenerative MIMO relay systems with covariance feedback, (2011), pp. 290–294. doi:10.1109/APCC.2011.6152821. - L Gopal, Y Rong, Z Zang, in
*2013 IEEE 77th Vehicular Technology Conference (VTC Spring)*. MMSE based transceiver design for MIMO relay systems with mean and covariance feedback, (2013), pp. 1–5. doi:10.1109/VTCSpring.2013.6692635. - N Fawaz, K Zarifi, M Debbah, D Gesbert, Asymptotic capacity and optimal precoding in MIMO multi-hop relay networks. IEEE Trans. Inform. Theory.
**57**(4), 2050–2069 (2011).MathSciNetView ArticleGoogle Scholar - C Xing, S Ma, Y-C Wu, Robust joint design of linear relay precoder and destination equalizer for dual-hop amplify-and-forward MIMO relay systems. IEEE Trans. Signal Process.
**58**(4), 2273–2283 (2010).MathSciNetView ArticleGoogle Scholar - B Zhang, Z He, K Niu, L Zhang, Robust linear beamforming for MIMO relay broadcast channel with limited feedback. IEEE Signal Process. Lett.
**17**(2), 209–212 (2010).View ArticleGoogle Scholar - W Xu, X Dong, W-S Lu, MIMO relaying broadcast channels with linear precoding and quantized channel state information feedback. IEEE Trans. Signal Process.
**58**(10), 5233–5245 (2010).MathSciNetView ArticleGoogle Scholar - Y Rong, Robust design for linear non-regenerative MIMO relays with imperfect channel state information. IEEE Trans. Signal Process.
**59**(5), 2455–2460 (2011). doi:10.1109/TSP.2011.2113376.MathSciNetView ArticleGoogle Scholar - Z Wang, W Chen, J Li, Efficient beamforming for MIMO relaying broadcast channel with imperfect channel estimation. IEEE Trans. Veh. Technol.
**61**(1), 419–426 (2012).View ArticleGoogle Scholar - Y Cai, RC de Lamare, L-L Yang, M Zhao, Robust mmse precoding based on switched relaying and side information for multiuser MIMO relay systems. IEEE Trans. Veh. Technol.
**64**(12), 45677–5687 (2015).View ArticleGoogle Scholar - C Chien, Digital Radio Systems on A Chip: A System Approach (Kluwer Academic, 2001).Google Scholar
- A Scaglione, GB Giannkis, S Barbarossa, Redundant filterbank precoders and equalizers. Part I: unification and optimal designs. IEEE Trans. Signal Process.
**47**(7), 1988–2006 (1999).View ArticleGoogle Scholar - M Biguesh, S Gazor, MH Shariat, Optimal training sequence for MIMO wireless systems in colored environments. IEEE Trans. Sig. Process.
**57**(8), 3144–3153 (2009).MathSciNetView ArticleGoogle Scholar - Y Liu, TF Wong, WW Hager, Training signal design for estimation of correlated MIMO channels with colored interference. IEEE Trans. Signal Process.
**55**(4), 1486–1497 (2007).MathSciNetView ArticleGoogle Scholar - R Wang, M Tao, H Mehrpouyan, Y Hua, Channel estimation and optimal training design for correlated MIMO two-way relay systems in colored environment (2014). http://arxiv.org/abs/1407.5161.Google Scholar
- R Wang, M Tao, H Mehrpouyan, Y Hua, Channel estimation and optimal training design for correlated MIMO two-way relay systems in colored environment. IEEE Trans. Wire. Commun.
**14**(5), 2684–2699 (2015).View ArticleGoogle Scholar - R Wang, H Mehrpouyan, M Tao, Y Hua, in IEEE Glob. Commun. Conf. Optimal training design and individual channel estimation for MIMO two-way relay systems in colored environment (Austin, 2014), pp. 3561–3566.Google Scholar
- C Jeong, HM Kim, HK Song, IM Kim, Relay precoding for non-regenerative MIMO relay systems with partial CSI in the presence of interferers. IEEE Trans. Wireless Commun. **11**(4), 1521–1531 (2012). doi:10.1109/TWC.2012.020812.111246.
- NN Tran, HD Tuan, HH Nguyen, Training signal and precoder designs for OFDM under colored noise. IEEE Trans. Veh. Technol. **57**(6), 3911–3917 (2008).
- NA Vinh, NN Tran, NH Phuong, Optimal precoding design for non-regenerative dual-hop correlated relaying MIMO. Electron. Lett. **51**(20), 1613–1615 (2015).
- NA Vinh, NN Tran, NH Phuong, DL Khoa, in 2015 NAFOSTED Conference on Information and Computer Science. Optimally non-regenerative relaying for general dual-hop correlated MIMO channels (Hochiminh, 2015), pp. 300–304.
- NN Tran, S Ci, in 2010 IEEE Global Telecommunications Conference (GLOBECOM 2010). Asymptotic capacity and precoding design for correlated multi-hop MIMO channels (Miami, 2010), pp. 1–5.
- NN Tran, S Ci, HX Nguyen, CMI analysis and precoding designs for correlated multi-hop MIMO channels. EURASIP Journal on Wireless Communications and Networking (127) (2015).
- M Vu, A Paulraj, MIMO wireless linear precoding using CSIT to improve link performance. IEEE Signal Process. Mag. **87**, 86–105 (2007).
- AK Gupta, DK Nagar, Matrix Variate Distributions (Chapman and Hall/CRC, USA, 1999).
- D-S Shiu, GJ Foschini, MJ Gans, JM Kahn, Fading correlation and its effect on the capacity of multielement antenna systems. IEEE Trans. Commun. **48**(3), 502–513 (2000).
- NN Tran, HH Nguyen, HD Tuan, DE Dodds, Training designs for amplify-and-forward relaying with spatially correlated antennas. IEEE Trans. Veh. Technol. **61**, 2864–2870 (2012).
- S Sun, Y Jing, Channel training design in amplify-and-forward MIMO relay networks. IEEE Trans. Wireless Commun. **10**(10), 920–922 (2011).
- J-S Sheu, J-K Lain, W-H Wang, On channel estimation of orthogonal frequency-division multiplexing amplify-and-forward cooperative relaying systems. IET Commun. **7**(4), 325–334 (2013).
- SM Kay, Fundamentals of Statistical Signal Processing, Volume 1: Estimation Theory (Prentice Hall, New Jersey, 1993).
- DJ Love, RW Heath, VKN Lau, D Gesbert, BD Rao, M Andrews, An overview of limited feedback in wireless communication systems. IEEE J. Selected Areas Commun. **26**(8), 1341–1365 (2008). doi:10.1109/JSAC.2008.081002.
- KB Petersen, MS Pedersen, The Matrix Cookbook (Technical University of Denmark, 2012).
- AW Marshall, I Olkin, BC Arnold, *Inequalities: Theory of Majorization and Its Applications - Second Edition* (Springer, New York, 2011).
- S Boyd, L Vandenberghe, *Convex Optimization* (Cambridge University Press, New York, 2004).
- E Jorswieck, H Boche, Majorization and Matrix-Monotone Functions in Wireless Communications (Hanover, MA: Now Publishers, 2007).
- RA Horn, CR Johnson, *Matrix Analysis* (Cambridge University Press, New York, 1985).
- NN Tran, HD Tuan, HH Nguyen, Training signal and precoder designs for OFDM under colored noise. IEEE Trans. Veh. Technol. **57**(6), 3911–3917 (2008).
- NN Tran, HX Nguyen, Optimal SP training for spatially correlated MIMO channels under coloured noises. Electron. Lett. **51**, 247–249 (2015).
- G Panci, S Colonnese, P Campisi, G Scarano, Blind equalization for correlated input symbols: a Bussgang approach. IEEE Trans. Signal Process. **53**(5), 1860–1869 (2005).