
Lattice reduction aided with block diagonalization for multiuser MIMO systems

EURASIP Journal on Wireless Communications and Networking 2015, 2015:254

Received: 9 August 2015

Accepted: 2 November 2015

Published: 3 December 2015


The block diagonalization (BD) precoding technique is a well-known linear transmit strategy for multiuser multi-input multi-output (MU-MIMO) systems. With BD, the MU-MIMO broadcast channel is decomposed into multiple independent parallel single-user MIMO (SU-MIMO) channels, and the maximum diversity order is achieved at high data rates. Lattice reduction-aided decoding (LRAD) offers reduced decoding complexity in MIMO communications. The Lenstra-Lenstra-Lovász (LLL) algorithm has been extensively used to obtain better bases of the channel matrix, while complex lattice reduction (CLR) aims at improving the orthogonality of the basis vectors and shortening them. In the proposed scheme, the orthogonalization and size reduction are left to the CLR algorithm, which modifies the channel matrix and results in better precoding and detection performance. We also derive bounds for lattice decoding. Simulation results show that the bit error rate (BER) performance of the proposed algorithm is better than that of conventional ones and that its complexity is lower than that of LLL algorithm-based schemes.


Keywords: Complex lattice reduction; Block diagonalization; Multiuser MIMO; Detection algorithms; Proximity factors; Low complexity

1 Introduction

Multiple-input multiple-output (MIMO) systems have been proposed for next-generation wireless communication systems to increase the transmission capacity; therefore, a high-performance and low-complexity MIMO detector has become an important issue. The maximum likelihood detector (MLD) is known to be optimal; however, it is impractical to implement owing to its high computational complexity. Signal processing is performed on a per-cell basis in conventional wireless systems. Zero-forcing (ZF) and minimum mean-square error (MMSE) precoding are well-known linear precoding schemes. Although linear precoding techniques have considerably low computational complexity, they show relatively poor performance because of noise amplification, particularly when the channel matrix is ill-conditioned. Block diagonalization (BD) is one of the key processing techniques for multiuser MIMO (MU-MIMO) systems. With BD, which was first proposed in [1], the MU-MIMO downlink channel can be decomposed into multiple parallel single-user MIMO (SU-MIMO) channels. Because there is no interference between the users after BD, the MU-MIMO channel can be transformed into equivalent SU-MIMO channels [2], and SU-MIMO techniques can then be applied. The complete (full) BD algorithm reported in [1, 3] requires two singular value decomposition (SVD) operations. The first SVD forces the multiuser interference (MUI) to zero, and the second SVD produces orthogonal parallel SU-MIMO channels. By replacing the first SVD operation with a less complex solution to mitigate the MUI, a QR decomposition-based BD precoding scheme is presented in [4] for MU-MIMO systems. QR-BD applies a QR decomposition to the MU-MIMO channel to obtain the null space of the MUI. Therefore, the complexity of the SVD operation in BD precoding is reduced by the QR operation in QR-BD precoding.
A generalized ZF channel inversion (GZI) precoding method is developed in [4], where a pseudo-inversion and a QR decomposition are applied to the MU-MIMO channel to mitigate the MUI. Furthermore, a generalized MMSE channel inversion precoding scheme, denoted GMI, is proposed in [4] to balance the MUI and the noise for each user effectively.

Lattice reduction (LR) is another preprocessing and detection technique that has recently attracted significant research effort. Yao and Wornell used the LR algorithm in conjunction with MIMO detection techniques [5]. LR is a powerful concept for solving diverse problems involving point lattices. It has been successfully used in signal processing applications including the global positioning system (GPS), frequency estimation, and particularly data detection and precoding in wireless communication systems. Besides linear detection schemes based on the ZF or the MMSE criterion, successive interference cancelation (SIC) is a popular way to detect the transmitted signals at the receiver side [6]. LR has been proposed in order to transform the system model into an equivalent one with a better-conditioned channel matrix prior to low-complexity linear or SIC detection [6]. With LR-aided detection schemes, the symbol error rate (SER) curves can parallel those of the MLD, which has attracted a great deal of interest in the application of LR to MIMO systems. The LR-aided detection schemes have been extended to the MMSE criterion by Wuebben et al. [6]. In [7], both LR-aided SU-MIMO detection and LR-aided SU-MIMO precoding have been investigated. LR-aided MIMO precoding for decentralized receivers was discussed in [8–12]. The aim of the complex LR (CLR) algorithm is to find a new basis that is shorter and more nearly orthogonal than the original matrix [12]. Therefore, if the second precoding filters for the equivalent SU-MIMO channels after the first SVD are designed based on the lattice-reduced channel matrix, a better bit error rate (BER) performance can be achieved. A CLR-aided regularized BD (RBD) precoding algorithm has been proposed accordingly, which not only has a lower complexity but also achieves a better BER performance than the RBD or QR/SVD RBD [12, 13].

Among the LR algorithms, the Lenstra-Lenstra-Lovász (LLL) algorithm, first proposed by Lenstra et al. in [14], is the most commonly used. However, it processes only real-valued matrices, which may lead to high complexity when the channel has large dimensions. The complex LLL (CLLL) algorithm was proposed in order to reduce the computational complexity [15]. The overall complexity of the CLLL algorithm is nearly half that of the LLL algorithm without any performance degradation [15]. The essence of the LR algorithm is to orthogonalize the columns of the channel matrix and reduce their size as well [12]. The Gram-Schmidt orthogonalization (GSO) procedure and size reduction are the two core components of the LR algorithm. The main contributions of our paper are summarized below:
  • We propose complex lattice reduction aided with block diagonalization for MU-MIMO systems.

  • A BD-based precoding algorithm is able to separate several SU-MIMO channels from the MU-MIMO downlink channel as well as achieve the maximum diversity order at high data rates and reduce the interference.

  • To reduce the complexity of precoding scheme, we employ the CLR to replace the SVD of conventional BD-based precoding algorithm by introducing a combined channel inversion to eliminate the MUI.

  • The LLL algorithm has been used to obtain better bases of the channel matrix, while the CLR is aimed at improving orthogonality of basis vectors and shortening them. We also derive the bounds for lattice decoding.

  • The simulation results show that the BER performance of our proposed algorithm is better than that of conventional algorithms and the complexity is reduced compared with the LLL algorithm-based schemes.

This paper is organized as follows. A system model is introduced in Section 2. In Section 3, we present precoding techniques in detail. In Section 4, we describe complex LR-aided block diagonalization. In Section 5, MIMO detection algorithms are presented. In Section 6, we introduce performance bounds for lattice decoding, and complexity analysis is described in Section 7. Simulation results are presented in Section 8, and conclusions are drawn in Section 9.

2 System model

The MU-MIMO broadcast model is shown in Fig. 1, where K users are each equipped with \( {N}_i \) receiving antennas, and the data streams are processed at the base station by a precoder with \( {N}_T \) antennas and sent to the corresponding receiving antennas. The total number of receiving antennas is \( {N}_R={\displaystyle \sum_{i=1}^K{N}_i} \). We assume that the total number of transmitted data streams is \( r\le \min \left({N}_R,{N}_T\right) \).

Fig. 1

Structure of CLR-aided BD system

The received signal vector y can be expressed as
$$ \boldsymbol{y}=\mathbf{D}\left(\mathbf{H}\mathbf{W}\mathbf{s}+\mathbf{n}\right), $$

where \( \mathbf{D}\in {\mathbb{C}}^{r\times {N}_R} \) is the detection matrix, \( \mathbf{H}\in {\mathbb{C}}^{N_R\times {N}_T} \) is the complex Gaussian channel matrix whose entries have zero mean and unit variance, \( \mathbf{W}\in {\mathbb{C}}^{N_T\times r} \) is the precoding matrix, \( \mathbf{s}\in {\mathbb{C}}^{r\times 1} \) is the data vector, and \( \mathbf{n}\in {\mathbb{C}}^{N_R\times 1} \) is the Gaussian noise vector with independent and identically distributed (i.i.d.) entries of zero mean and variance \( {N}_0 \).

3 Precoding technique

In this section, we discuss the conventional BD and CLLL reduction algorithms. Channel inversion-based linear precoding requires additional transmit power, and this drawback becomes more serious when the channel is highly correlated. One solution to this problem is BD, which was first proposed in [3].

3.1 Block diagonalization

The MUI constraint forces all interference terms to be zero, which is known as ZF precoding. The precoding matrix W is designed to satisfy the transmit power constraint. Channel inversion is applied in both the ZF and MMSE precoding approaches. Additional power is needed to force two closely spaced antennas of a single user to receive different signals, which is an even more serious disadvantage when the channel is highly correlated. BD is well known as one of the solutions to this problem. The precoding matrix is defined as
$$ \mathbf{W}=\left[{\mathbf{W}}_1,{\mathbf{W}}_2,\cdots, {\mathbf{W}}_K\right], $$
where \( {\mathbf{W}}_i\in {\mathbb{C}}^{N_T\times {r}_i} \) is the i-th user’s precoding matrix, which lies in the null space of the other users’ channel matrices. Without loss of generality, the matrix \( {\tilde{\mathbf{H}}}_i \), which collects all channel matrices except that of the i-th user, is defined as
$$ {\tilde{\mathbf{H}}}_i={\left[{\mathbf{H}}_1^T,\cdots, {\mathbf{H}}_{i-1}^T,{\mathbf{H}}_{i+1}^T,\cdots, {\mathbf{H}}_K^T\right]}^T. $$
From the SVD of \( {\tilde{\mathbf{H}}}_i \), we obtain
$$ {\tilde{\mathbf{H}}}_i={\tilde{\mathbf{U}}}_i{\tilde{\boldsymbol{\Lambda}}}_i{\left[{\tilde{\mathbf{V}}}_i^{(1)}\kern0.5em {\tilde{\mathbf{V}}}_i^{(0)}\right]}^H, $$
where \( {\tilde{\mathbf{U}}}_i \) and \( {\tilde{\boldsymbol{\Lambda}}}_i \) denote the left singular vector matrix and the matrix of ordered singular values of \( {\tilde{\mathbf{H}}}_i \), respectively. The matrices \( {\tilde{\mathbf{V}}}_i^{(1)} \) and \( {\tilde{\mathbf{V}}}_i^{(0)} \) consist of the right singular vectors corresponding to the non-zero and zero singular values, respectively. \( {\tilde{\mathbf{V}}}_i^{(0)} \) forms an orthogonal basis for the null space of \( {\tilde{\mathbf{H}}}_i \); that is, we choose \( {\mathbf{W}}_i={\tilde{\mathbf{V}}}_i^{(0)} \) to force the MUI to be zero. After removing the effect of the interfering users’ streams, BD maximizes the data throughput by the well-known water-filling (WF) algorithm, and the highest sum rate is achieved. The SVD of the i-th user’s effective channel \( {\mathbf{H}}_i{\tilde{\mathbf{V}}}_i^{(0)} \) is
$$ {\mathbf{H}}_i{\tilde{\mathbf{V}}}_i^{(0)}={\mathbf{U}}_i{\boldsymbol{\Lambda}}_i{\left[{\mathbf{V}}_i^{(1)}\kern0.5em {\mathbf{V}}_i^{(0)}\right]}^H. $$

The product of \( {\tilde{\mathbf{V}}}_i^{(0)} \) and \( {\mathbf{V}}_i^{(1)} \) yields an equivalent SU-MIMO channel with orthogonal bases. Orthogonality can be measured by the coefficients \( {\mu}_{i,j}=\frac{\left\langle {h}_i,{h}_j\right\rangle }{{\left\Vert {h}_j\right\Vert}^2} \), where \( {h}_i \) and \( {h}_j \) are columns of the equivalent channel \( {\mathbf{H}}_i{\tilde{\mathbf{V}}}_i^{(0)}{\mathbf{V}}_i^{(1)} \).
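
The null-space step of BD described above can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation: the helper name `bd_precoders`, the rank tolerance, the random seed, and the i.i.d. complex Gaussian channels are all assumptions made for the example, and the (2, 2, 2) × 6 antenna configuration is chosen to match the case used in the complexity tables.

```python
import numpy as np

def bd_precoders(H_users, tol=1e-10):
    """Return one null-space precoder per user (first SVD of BD).

    H_users is a list of per-user channel matrices H_i (N_i x N_T).
    The name and the tolerance are illustrative assumptions.
    """
    precoders = []
    for i in range(len(H_users)):
        # Stack every channel except user i's: this plays the role of H-tilde_i.
        H_tilde = np.vstack([H for j, H in enumerate(H_users) if j != i])
        # Full SVD; the rows of Vh beyond the rank span the null space.
        _, s, Vh = np.linalg.svd(H_tilde, full_matrices=True)
        rank = int(np.sum(s > tol * s[0]))
        V0 = Vh[rank:].conj().T          # V-tilde_i^(0), N_T x (N_T - rank)
        precoders.append(V0)
    return precoders

rng = np.random.default_rng(0)
K, N_i, N_T = 3, 2, 6                    # the (2, 2, 2) x 6 configuration
H_users = [
    (rng.standard_normal((N_i, N_T)) + 1j * rng.standard_normal((N_i, N_T)))
    / np.sqrt(2)
    for _ in range(K)
]
W = bd_precoders(H_users)

# Largest residual multiuser interference ||H_j W_i|| over all j != i.
leak = max(
    np.linalg.norm(H_users[j] @ W[i])
    for i in range(K) for j in range(K) if j != i
)
print(leak)
```

With these dimensions each stacked matrix is 4 × 6, so each user is left with a two-dimensional null space and the printed leakage should be at the level of floating-point rounding.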

3.2 CLLL reduction algorithm

A complex lattice is a set of points [9],
$$ \mathcal{L}\left(\mathbf{H}\right)=\left\{\mathbf{H}\mathbf{x}\mid {h}_i\in {\mathbb{C}}^{N_i\times {N}_T},\ {\mathbf{x}}_i\in \mathbb{Z}+j\mathbb{Z}\right\}, $$
where \( \mathbf{H}=\left[{h}_1,{h}_2,\cdots, {h}_{N_T}\right] \) contains the basis vectors of the lattice \( \mathcal{L}\left(\mathbf{H}\right) \). It is well known that \( {\mathbf{H}}^H\mathbf{H} \) is diagonal when the columns of the channel matrix H in Eq. (1) are orthogonal, and the decision region of the linear detectors required to find the nearest lattice point is then the same as that of the ML detector. Any matrix \( {\mathbf{H}}_{LR} \) generates the same lattice if and only if \( {\mathbf{H}}_{LR}={\mathbf{H}}_{eff}\mathbf{T} \) with a unimodular matrix T. Since the LR scheme is adopted, the complex-valued system model given in Eq. (1) is transformed into the equivalent real-valued system as
$$ \mathbf{H}=\left[\begin{array}{cc}\mathcal{R}\left(\mathbf{H}\right)& -\mathcal{J}\left(\mathbf{H}\right)\\ {}\mathcal{J}\left(\mathbf{H}\right)& \mathcal{R}\left(\mathbf{H}\right)\end{array}\right], $$
$$ \mathbf{y}=\left[\begin{array}{c}\mathcal{R}\left(\mathbf{y}\right)\\ {}\mathcal{J}\left(\mathbf{y}\right)\end{array}\right],\kern1em \mathbf{s}=\left[\begin{array}{c}\mathcal{R}\left(\mathbf{s}\right)\\ {}\mathcal{J}\left(\mathbf{s}\right)\end{array}\right],\kern1em \mathbf{n}=\left[\begin{array}{c}\mathcal{R}\left(\mathbf{n}\right)\\ {}\mathcal{J}\left(\mathbf{n}\right)\end{array}\right], $$

where \( \mathcal{R}\left(\cdot \right) \) and \( \mathcal{J}\left(\cdot \right) \) denote the real and imaginary parts, respectively.
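
The real-valued stacking above can be checked numerically. The sketch below is only an illustration: the helper names `real_channel` and `real_vector`, the matrix size, and the random data are assumptions, and the detection matrix D is taken as the identity so that the model reduces to y = Hs + n.

```python
import numpy as np

def real_channel(H):
    """Stack a complex matrix as [[Re, -Im], [Im, Re]]."""
    return np.block([[H.real, -H.imag], [H.imag, H.real]])

def real_vector(v):
    """Stack a complex vector as [Re; Im]."""
    return np.concatenate([v.real, v.imag])

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
s = rng.integers(-1, 2, size=4) + 1j * rng.integers(-1, 2, size=4)
n = 0.1 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))

y = H @ s + n                                   # complex-valued model
y_real = real_channel(H) @ real_vector(s) + real_vector(n)

# The stacked real model reproduces the complex one entry by entry.
print(np.allclose(y_real, real_vector(y)))
```

The real model doubles every dimension, which is the source of the extra complexity that the complex-valued LLL variant is meant to avoid.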

The LR algorithm aims to find a new basis \( {\mathbf{H}}_{LR} \) for a given lattice \( \mathcal{L}\left(\mathbf{H}\right) \) that is shorter and more nearly orthogonal than the original matrix H. Let the orthogonality coefficient be \( {\mu}_{i,j}=\frac{\left\langle {h}_i,{h}_j^{*}\right\rangle }{{\left\Vert {h}_j^{*}\right\Vert}^2} \), where \( {h}_j^{*} \) denotes the vectors generated by the GSO procedure.

Definition (δ-LLL-reduced basis): A basis \( {\mathbf{H}}_{LR} \) with QR decomposition \( {\mathbf{H}}_{LR}=\tilde{\mathbf{Q}}\tilde{\mathbf{R}} \) is regarded as a δ-LLL-reduced basis, where 1/4 < δ < 1, if
$$ \left|{\mu}_{i,j}\right|\le 1/2,\kern1em 1\le j<i\le {N}_T, $$
$$ {\left\Vert {h}_k^{*}\right\Vert}^2+{\left|{\mu}_{k,k-1}\right|}^2{\left\Vert {h}_{k-1}^{*}\right\Vert}^2\ge \delta {\left\Vert {h}_{k-1}^{*}\right\Vert}^2,\kern1em 1<k\le {N}_T, $$

where \( \delta \in \left(1/2,1\right) \) is a factor chosen to achieve a good performance with lower complexity. If only Eq. (9) is satisfied, the basis is said to be size-reduced. The parameter δ influences the quality of the reduced basis. Throughout this paper, δ = 3/4 as in [14].
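
To make the two conditions of the definition concrete, the sketch below implements a plain textbook LLL reduction on a real-valued basis and then checks the size-reduction and Lovász conditions with δ = 3/4. It is an assumption-laden illustration rather than the CLLL algorithm of [15]: the function names (`gso_coeffs`, `lll_reduce`, `is_lll_reduced`) are hypothetical, the Gram-Schmidt coefficients are recomputed from scratch at every step for clarity rather than speed, and a random 4 × 4 real matrix stands in for the channel.

```python
import numpy as np

def gso_coeffs(B):
    """Gram-Schmidt on the columns of B; returns (B_star, mu)."""
    n = B.shape[1]
    B_star = np.zeros_like(B, dtype=float)
    mu = np.zeros((n, n))
    for i in range(n):
        B_star[:, i] = B[:, i]
        for j in range(i):
            mu[i, j] = B[:, i] @ B_star[:, j] / (B_star[:, j] @ B_star[:, j])
            B_star[:, i] -= mu[i, j] * B_star[:, j]
    return B_star, mu

def lll_reduce(H, delta=0.75):
    """Return (H_LR, T) with H_LR = H @ T and T integer unimodular."""
    B = np.array(H, dtype=float)
    n = B.shape[1]
    T = np.eye(n, dtype=int)
    k = 1
    while k < n:
        # Size-reduce column k against columns k-1, ..., 0.
        for j in range(k - 1, -1, -1):
            _, mu = gso_coeffs(B)
            q = int(np.rint(mu[k, j]))
            if q != 0:
                B[:, k] -= q * B[:, j]
                T[:, k] -= q * T[:, j]
        B_star, mu = gso_coeffs(B)
        norms = np.sum(B_star**2, axis=0)
        # Lovasz condition: advance if satisfied, otherwise swap and step back.
        if norms[k] + mu[k, k - 1]**2 * norms[k - 1] >= delta * norms[k - 1]:
            k += 1
        else:
            B[:, [k - 1, k]] = B[:, [k, k - 1]]
            T[:, [k - 1, k]] = T[:, [k, k - 1]]
            k = max(k - 1, 1)
    return B, T

def is_lll_reduced(B, delta=0.75, eps=1e-9):
    """Check both conditions of the definition, up to a small tolerance."""
    B_star, mu = gso_coeffs(B)
    norms = np.sum(B_star**2, axis=0)
    n = B.shape[1]
    size_ok = all(abs(mu[i, j]) <= 0.5 + eps
                  for i in range(n) for j in range(i))
    lovasz_ok = all(
        norms[k] + mu[k, k - 1]**2 * norms[k - 1] >= (delta - eps) * norms[k - 1]
        for k in range(1, n)
    )
    return size_ok and lovasz_ok

rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4))
H_lr, T = lll_reduce(H)
print(is_lll_reduced(H_lr), np.allclose(H @ T, H_lr))
```

Because T is built only from integer column additions and column swaps, its determinant is ±1, so the reduced basis generates the same lattice as the original one.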

4 Proposed complex LR-aided BD

In this section, we combine the BD and CLR techniques. To cancel the MUI, we adopt a design concept similar to that of BD, so that the MU-MIMO channel can be transformed into equivalent SU-MIMO channels. We assume that the channel information is perfectly known at both the transmit side and the receiving side. We remark that a performance study of the proposed scheme with imperfect channel information and limited feedback could also be considered.

We employ a similar strategy derived from the BD scheme in order to eliminate the interference between users. We successfully transform the MU-MIMO channel into equivalent SU-MIMO channels after the precoding. Each equivalent SU-MIMO channel has the same properties as a conventional SU-MIMO channel, and when increasing the number of transmit antennas of the MU-MIMO system by one, the number of spatial channels of each user is also increased by one. The equivalent SU-MIMO channel is given by
$$ {\mathbf{H}}_{eff}=\mathbf{H}\mathbf{W} $$
The received signal at the receiving side is
$$ \mathbf{y}={\mathbf{H}}_{eff}\mathbf{s}+\mathbf{n} $$
By using the CLLL algorithm, we can make the columns of \( {\mathbf{H}}_{eff} \) shorter and more nearly orthogonal, that is
$$ {\mathbf{H}}_{LR}={\mathbf{H}}_{eff}\mathbf{T}. $$
We can rewrite Eq. (12) as
$$ \mathbf{y}={\mathbf{H}}_{eff}\mathbf{T}{\mathbf{T}}^{-1}\mathbf{s}+\mathbf{n}={\mathbf{H}}_{LR}\mathbf{z}+\mathbf{n}, $$

where \( \mathbf{z}={\mathbf{T}}^{-1}\mathbf{s} \) and \( {\mathbf{H}}_{LR} \) has a better channel quality than \( {\mathbf{H}}_{eff} \). The detector can therefore be designed based on \( {\mathbf{H}}_{LR} \), which achieves a better detection performance because \( {\mathbf{H}}_{LR} \) causes less noise enhancement. The basic idea behind approximate lattice decoding (LD) is to use LR in conjunction with traditional low-complexity decoders. With LR, the basis B is transformed into a new basis consisting of roughly orthogonal vectors. The complexity is also reduced compared with the SVD technique.

5 MIMO detection algorithms

5.1 ZF and MMSE detection algorithms

Given the received signal y in Eq. (14), the MLD problem consists of determining the vector z with the highest likelihood, that is, solving the following integer least squares problem [7]:
$$ {\tilde{\mathbf{z}}}_{ML}= \arg \underset{z\in {\mathbb{Z}}^r}{ \min }{\left\Vert \mathbf{y}-{\mathbf{H}}_{LR}\mathbf{z}\right\Vert}^2. $$
However, the MLD is usually impractical because its complexity grows with the number of constellation points and exponentially with the number of transmitted streams r. In a ZF detector, the interference is completely suppressed by multiplying the received signal vector y with the pseudo-inverse of the unreduced channel matrix, \( {\mathbf{H}}_{eff}^{\dagger }={\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}\right)}^{-1}{\mathbf{H}}_{eff}^T \). The decision step consists of mapping each element of the filter output vector
$$ {\tilde{\mathbf{s}}}_{ZF}={\mathbf{H}}_{eff}^{\dagger}\mathbf{y}=\mathbf{s}+{\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}\right)}^{-1}{\mathbf{H}}_{eff}^T\mathbf{n} $$
onto an element of the symbol alphabet by a minimum distance quantization, which in the case of M-QAM corresponds to a simple rounding operation to the allowed range of values. For an orthogonal channel matrix, ZF is identical to ML. The MMSE detector takes the noise term into account and thereby leads to an improved performance:
$$ {\tilde{\mathbf{s}}}_{MMSE}={\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}+{\sigma}_n^2\mathbf{I}\right)}^{-1}{\mathbf{H}}_{eff}^T\mathbf{y}. $$
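
The linear ZF and MMSE filters, followed by the rounding decision step, can be sketched as below. This is a generic illustration under several assumptions: the real-valued system model is used, the symbol alphabet is taken to be a range of consecutive integers (as obtained after shifting and scaling M-QAM), a tall random 6 × 4 matrix stands in for the channel, and the function names `zf_detect`, `mmse_detect`, and `quantize` are hypothetical.

```python
import numpy as np

def zf_detect(H, y):
    """ZF filter output: pseudo-inverse of H applied to y."""
    return np.linalg.pinv(H) @ y

def mmse_detect(H, y, noise_var):
    """MMSE filter output (H^T H + sigma^2 I)^{-1} H^T y."""
    n = H.shape[1]
    G = np.linalg.solve(H.T @ H + noise_var * np.eye(n), H.T)
    return G @ y

def quantize(x, lo, hi):
    """Minimum-distance decision on an integer alphabet {lo, ..., hi}."""
    return np.clip(np.rint(x), lo, hi).astype(int)

rng = np.random.default_rng(2)
H = rng.standard_normal((6, 4))
s = rng.integers(-2, 2, size=4)          # symbols drawn from {-2, -1, 0, 1}
noise_var = 1e-4
y = H @ s + np.sqrt(noise_var) * rng.standard_normal(6)

s_zf = quantize(zf_detect(H, y), -2, 1)
s_mmse = quantize(mmse_detect(H, y, noise_var), -2, 1)
print(s, s_zf, s_mmse)
```

Setting `noise_var` to zero makes the MMSE filter coincide with the ZF filter, which is a convenient sanity check on an implementation.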

5.2 Lattice-reduction-aided linear detection

Linear detection is optimal for an orthogonal channel matrix. For \( \mathbf{s}\in {\mathbb{Z}}^m \), we also have \( \mathbf{z}={\mathbf{T}}^{-1}\mathbf{s}\in {\mathbb{Z}}^m \), so s and z stem from the same set. The idea behind LR-aided linear detection is to consider the equivalent system model in Eq. (14) and perform the nonlinear quantization on z instead of s. For LR-aided ZF, this means that first
$$ {\tilde{\mathbf{z}}}_{LR-ZF}={\mathbf{T}}^{-1}{\tilde{\mathbf{s}}}_{ZF}={\mathbf{H}}_{LR}^{\dagger}\mathbf{y}=\mathbf{z}+{\mathbf{H}}_{LR}^{\dagger}\mathbf{n} $$
is calculated, where the multiplication with \( {\mathbf{H}}_{LR}^{\dagger } \) usually causes less noise amplification than the multiplication with the pseudo-inverse of the unreduced channel \( {\mathbf{H}}_{eff} \) because the columns of \( {\mathbf{H}}_{LR} \) are roughly orthogonal. Therefore, a hard decision based on \( {\tilde{\mathbf{z}}}_{LR-ZF} \) is in general more reliable than one on \( {\tilde{\mathbf{s}}}_{ZF} \). An MMSE filter may be applied instead of the ZF solution in order to obtain an improved estimate of z. One obvious way is given by the MMSE solution of the lattice-reduced system (Eq. (14))
$$ {\tilde{\mathbf{z}}}_{LR-MMSE}={\left({\mathbf{H}}_{LR}^T{\mathbf{H}}_{LR}+{\sigma}_n^2{\mathbf{T}}^T\mathbf{T}\right)}^{-1}{\mathbf{H}}_{LR}^T\mathbf{y}={\mathbf{T}}^{-1}{\tilde{\mathbf{s}}}_{MMSE} $$
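
The relation between the LR-aided MMSE estimate and \( {\mathbf{T}}^{-1}{\tilde{\mathbf{s}}}_{MMSE} \) can be verified in a few lines. The check below assumes that \( {\tilde{\mathbf{s}}}_{MMSE} \) is the MMSE estimate computed on the unreduced channel \( {\mathbf{H}}_{eff} \), that \( {\mathbf{H}}_{LR}={\mathbf{H}}_{eff}\mathbf{T} \) with an invertible T, and that the regularization term is \( {\sigma}_n^2{\mathbf{T}}^T\mathbf{T} \); under these assumptions,

$$ \begin{array}{rl}{\left({\mathbf{H}}_{LR}^T{\mathbf{H}}_{LR}+{\sigma}_n^2{\mathbf{T}}^T\mathbf{T}\right)}^{-1}{\mathbf{H}}_{LR}^T\mathbf{y}&={\left({\mathbf{T}}^T\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}+{\sigma}_n^2\mathbf{I}\right)\mathbf{T}\right)}^{-1}{\mathbf{T}}^T{\mathbf{H}}_{eff}^T\mathbf{y}\\ {}&={\mathbf{T}}^{-1}{\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}+{\sigma}_n^2\mathbf{I}\right)}^{-1}{\mathbf{T}}^{-T}{\mathbf{T}}^T{\mathbf{H}}_{eff}^T\mathbf{y}\\ {}&={\mathbf{T}}^{-1}{\left({\mathbf{H}}_{eff}^T{\mathbf{H}}_{eff}+{\sigma}_n^2\mathbf{I}\right)}^{-1}{\mathbf{H}}_{eff}^T\mathbf{y}.\end{array} $$

The last line is \( {\mathbf{T}}^{-1} \) applied to the conventional MMSE filter output, so the equality holds exactly when the noise term is weighted by \( {\mathbf{T}}^T\mathbf{T} \).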

5.3 Lattice-reduction aided SIC

As shown in several publications, e.g., [16, 17], SIC can be well described in terms of the QR decomposition of the channel matrix. Applying this strategy to the system model from Eq. (14), we get
$$ {\tilde{\mathbf{z}}}_{LR-ZF-SIC}={\tilde{\mathbf{Q}}}^T\mathbf{y}=\tilde{\mathbf{R}}\mathbf{z}+{\tilde{\mathbf{Q}}}^T\mathbf{n}, $$
where \( \tilde{\mathbf{Q}} \) and \( \tilde{\mathbf{R}} \) have already been calculated by the LLL algorithm. Similar to linear detection, we can consider the lattice-reduced version of the extended system model with the equivalent channel matrix \( {\mathbf{H}}_{LR}=\tilde{\mathbf{Q}}\tilde{\mathbf{R}} \). This leads to LR-aided MMSE-SIC with decision variables given by
$$ {\tilde{\mathbf{z}}}_{LR-MMSE-SIC}={\tilde{\mathbf{Q}}}^T\mathbf{y}=\tilde{\mathbf{R}}\mathbf{z}+\boldsymbol{\upeta}, $$

where the newly defined noise term η also incorporates residual interference. The detection procedure equals that of LR-aided ZF-SIC.
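
The QR-based SIC procedure can be sketched as a back-substitution with a rounding decision at each layer. The code below is an illustration under stated assumptions: it works on a generic real-valued matrix standing in for the reduced channel, it detects the layers in their given order (no sorting or optimum ordering), it rounds to the nearest integer without clipping, and the function name `sic_detect` is hypothetical.

```python
import numpy as np

def sic_detect(H, y):
    """ZF-SIC on y = H z + n via the QR decomposition H = Q R."""
    Q, R = np.linalg.qr(H)               # reduced QR, R is upper triangular
    v = Q.T @ y                          # equals R z + Q^T n
    n = H.shape[1]
    z_hat = np.zeros(n, dtype=int)
    for k in range(n - 1, -1, -1):       # last layer is interference-free
        # Cancel the contribution of the layers already decided.
        interference = R[k, k + 1:] @ z_hat[k + 1:]
        z_hat[k] = int(np.rint((v[k] - interference) / R[k, k]))
    return z_hat

rng = np.random.default_rng(4)
H = rng.standard_normal((6, 4))
z = rng.integers(-3, 4, size=4)
y = H @ z + 0.01 * rng.standard_normal(6)
print(z, sic_detect(H, y))
```

In an LR-aided receiver the decisions are made on z, and the symbol estimate is then recovered by multiplying with T; that final mapping is omitted here to keep the sketch short.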

6 Performance bounds for lattice decoding

In this section, we introduce an analytic tool for approximate LD. Existing results on the diversity order of LR-aided decoding [19, 20] do not directly translate into how close approximate LD is to LD in terms of the minimum distance, which is more useful in digital communications [18].

Consider a fixed but arbitrary n-dimensional complex lattice Λ. The decision regions of ZF and SIC have 2n faces; owing to symmetry, only n distances have to be studied. The i-th distance of ZF is \( {d}_{i,ZF}=\left(1/2\right)\left\Vert {h}_i\right\Vert \sin {\theta}_i \) for i = 1, …, n, where \( {\theta}_i \) denotes the acute angle between \( {h}_i \) and the linear space spanned by the other n − 1 basis vectors \( {h}_1,\dots, {h}_{i-1},{h}_{i+1},\dots, {h}_n \). For the SIC detector, the i-th distance is given by \( {d}_{i, SIC}=\left(1/2\right)\left\Vert {h}_i^{*}\right\Vert \).

The minimum distance of the lattice decoder is \( {d}_{LD}=\left(1/2\right)\lambda \left(\varLambda \right) \), where λ(Λ) is the length of the shortest nonzero vector of the lattice Λ. We are motivated to define the proximity factors measuring the proximity between the performances of LD and approximate LD as follows:
$$ {\rho}_{i,ZF}\triangleq \sup \frac{d_{LD}^2}{d_{i,ZF}^2}= \sup \frac{\lambda^2\left(\varLambda \right)}{{\left\Vert {h}_i\right\Vert}^2{ \sin}^2{\theta}_i} $$
$$ {\rho}_{i, SIC}\triangleq \sup \frac{d_{LD}^2}{d_{i, SIC}^2}= \sup \frac{\lambda^2\left(\varLambda \right)}{{\left\Vert {h}_i^{*}\right\Vert}^2} $$
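
The ZF and SIC distances that enter the proximity factors can be evaluated for a given basis. The sketch below is an illustration only: it uses a random real-valued 5 × 3 basis, the function names `zf_distances` and `sic_distances` are hypothetical, and it relies on the fact that the distance from a column to the span of the other columns equals the reciprocal of the norm of the matching row of the pseudo-inverse. The shortest lattice vector, and hence the proximity factors themselves, are not computed.

```python
import numpy as np

def zf_distances(H):
    """d_{i,ZF} = (1/2) ||h_i|| sin(theta_i) for every column of H."""
    G = np.linalg.pinv(H)                # row i is orthogonal to all h_j, j != i
    return 0.5 / np.linalg.norm(G, axis=1)

def sic_distances(H):
    """d_{i,SIC} = (1/2) ||h_i^*||, with h_i^* from Gram-Schmidt."""
    R = np.linalg.qr(H, mode="r")        # |R_ii| equals ||h_i^*||
    return 0.5 * np.abs(np.diag(R))

rng = np.random.default_rng(3)
H = rng.standard_normal((5, 3))
d_zf = zf_distances(H)
d_sic = sic_distances(H)
print(d_zf, d_sic)
```

For every index the SIC distance is at least as large as the ZF distance, because the Gram-Schmidt vector is orthogonal only to the preceding basis vectors; for the last index the two coincide.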
For each decoder, an error occurs when the noise falls outside the decision region R. Accordingly, given the basis B, the error probability for the transmitted vector x is given by
$$ {P}_e(B)=P\left(\hat{\mathrm{x}}\ne 0\mid \mathrm{x}=0\right)=P\left(n\notin R\right) $$
To keep the results general, we write \( SNR=c/{\sigma}^2 \), where c is a constant depending on the problem. By the symmetry of the Voronoi cell, we have the following lower bound on the conditional decoding error probability of LD:
$$ {P}_{e,LD}\left(SNR,B\right)\ge 2Q\left(\frac{d_{LD}}{\sigma}\right)=2Q\left(\sqrt{\frac{d_{LD}^2\cdot SNR}{c}}\right). $$
Meanwhile, the union bound on the conditional error probability of ZF reads
$$ {P}_{e,ZF}\left(SNR,B\right)\le 2{\displaystyle \sum_{i=1}^nQ}\left(\frac{d_{i,ZF}}{\sigma}\right) $$
where the factor 2 is due to symmetry. The union bound for SIC admits a form similar to Eq. (26). Given the same basis matrix B, the conditional error probability of LR-aided ZF can be bounded above as
$$ {P}_{e,ZF}\left(SNR,B\right)\le 2{\displaystyle \sum_{i=1}^nQ}\left(\frac{d_{LD}}{\sqrt{\rho_{i,ZF}}\,\sigma}\right)=2{\displaystyle \sum_{i=1}^nQ}\left(\sqrt{\frac{d_{LD}^2\cdot SNR}{c\cdot {\rho}_{i,ZF}}}\right), $$
since \( {d}_{i,ZF}^2\ge {d}_{LD}^2/{\rho}_{i,ZF} \) by definition (Eqs. (22)–(23)) and since Q(·) is a decreasing function. It is worth pointing out that while the distance \( {d}_{LD} \) is a function of B, \( {\rho}_{i,ZF} \) is not. Now, combining Eqs. (25) and (27), we have
$$ {P}_{e,ZF}\left(SNR,B\right)\le {\displaystyle \sum_{i=1}^n{P}_{e,LD}\left(\frac{SNR}{\rho_{i,ZF}},B\right)}. $$
Since Eq. (28) holds for any B, averaging out B, we obtain
$$ {P}_{e,ZF}(SNR)\le {\displaystyle \sum_{i=1}^n{P}_{e,LD}\left(\frac{SNR}{\rho_{i,ZF}}\right)} $$
for arbitrary SNR. In particular, with \( {\rho}_{ZF}\triangleq \underset{1\le i\le n}{ \max }{\rho}_{i,ZF} \),
$$ {P}_{e,ZF}(SNR)\le n{P}_{e,LD}\left(\frac{SNR}{\rho_{ZF}}\right). $$

The relations Eq. (29) and Eq. (30) hold irrespective of fading statistics, and similar relations exist for SIC. They reveal, in a quantitative manner, that approximate LD performs within a constant bound from LD. The mere effect on the error rate curve is a shift from that of LD, up to a multiplicative factor n, which obviously does not change the diversity order. In other words, the diversity order is the same as that of LD [18]. Therefore, existing results on the diversity order of LD can be extended to approximate LD. Moreover, since LD achieves full receive diversity in the uncoded V-BLAST system [19], approximate LD also achieves full diversity. This provides an alternative way of showing the diversity order of LR-aided decoding given in [19, 20].
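
The bounds of this section can be evaluated numerically with the Gaussian Q-function. The sketch below is an illustration with made-up numbers: the distances `d_ld` and `d_zf`, the noise standard deviation, and the function names are assumptions, and the ratios \( {d}_{LD}^2/{d}_{i,ZF}^2 \) computed for this single basis are only lower bounds on the proximity factors, which are defined as suprema.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ld_lower_bound(d_ld, sigma):
    """Lower bound 2 Q(d_LD / sigma) on the error probability of LD."""
    return 2.0 * q_func(d_ld / sigma)

def zf_union_bound(d_zf, sigma):
    """Union bound 2 * sum_i Q(d_{i,ZF} / sigma) for ZF."""
    return 2.0 * sum(q_func(d / sigma) for d in d_zf)

# Hypothetical distances for one basis (not taken from the paper).
d_ld, d_zf, sigma = 0.5, [0.4, 0.25], 0.1

# Per-basis ratios d_LD^2 / d_{i,ZF}^2.
ratios = [(d_ld / d) ** 2 for d in d_zf]

# Rewriting each ZF term through d_LD and the ratio leaves the bound unchanged.
rewritten = 2.0 * sum(q_func(d_ld / (math.sqrt(r) * sigma)) for r in ratios)

# Replacing the ratios by larger constants can only loosen the bound.
loosened = 2.0 * sum(q_func(d_ld / (math.sqrt(r) * sigma)) for r in [2.0, 5.0])

print(ld_lower_bound(d_ld, sigma), zf_union_bound(d_zf, sigma), loosened)
```

The loosened value plays the role of the right-hand side obtained with the proximity factors: it depends on the basis only through \( {d}_{LD} \), which is what allows the averaging over B.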

7 Complexity analysis

The LLL algorithm leads to a significant reduction of the computational complexity. The complexity of the LLL reduction algorithm depends on the random basis matrix H. We use the total number of flops to measure the computational complexity of the existing algorithms [12, 13, 21, 22]. We summarize the total flops needed for the matrix operations below:
  • Multiplication of \( m\times n \) and \( n\times p \) complex matrices: \( 8mnp \)

  • QR decomposition of an \( m\times n \) (\( m\le n \)) complex matrix: \( 16\left({n}^2m-n{m}^2+\frac{1}{3}{m}^3\right) \)

  • SVD of an \( m\times n \) (\( m\le n \)) complex matrix where only Σ and V are obtained: \( 32\left(n{m}^2+2{m}^2\right) \)

  • SVD of an \( m\times n \) (\( m\le n \)) complex matrix where U, Σ, and V are obtained: \( 8\left(4{n}^2m+8n{m}^2+9{m}^3\right) \)

  • Inversion of an \( m\times m \) real matrix: \( 2{m}^3-2{m}^2+m \)
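
The flop counts listed above translate directly into small helper functions, which makes it easy to tabulate costs for a given antenna configuration. The sketch below simply transcribes the listed expressions as printed; the function names are hypothetical, and the example call uses m = 2 and n = 6, i.e., one user of the (2, 2, 2) × 6 case.

```python
def flops_matmul(m, n, p):
    """Product of m x n and n x p complex matrices."""
    return 8 * m * n * p

def flops_qr(m, n):
    """QR decomposition of an m x n (m <= n) complex matrix."""
    return 16 * (n**2 * m - n * m**2 + m**3 / 3)

def flops_svd_sigma_v(m, n):
    """SVD returning only Sigma and V, transcribed as listed above."""
    return 32 * (n * m**2 + 2 * m**2)

def flops_svd_full(m, n):
    """SVD returning U, Sigma, and V."""
    return 8 * (4 * n**2 * m + 8 * n * m**2 + 9 * m**3)

def flops_inv_real(m):
    """Inversion of an m x m real matrix."""
    return 2 * m**3 - 2 * m**2 + m

m, n = 2, 6
print(flops_matmul(m, n, n), flops_qr(m, n), flops_svd_full(m, n), flops_inv_real(m))
```

Expanding the full-SVD count gives \( 64\left(\frac{9}{8}{m}^3+n{m}^2+\frac{1}{2}{n}^2m\right) \), which is the form in which it appears in the complexity tables.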

For the case shown in Tables 1 and 2, the complexity of the LR-ZF is about 46.1 % of that of BD and 70.3 % of that of QR/SVD-BD, while the complexity of the LR-MMSE is about 55.8 % of that of BD and 85.1 % of that of QR/SVD-BD [12]. Clearly, the proposed algorithm requires the lowest complexity.
Table 1

Complexity of LR algorithm

Step | Operation | Flops | Case (2, 2, 2) × 6
 | | \( 16K\left({N}_T^2{N}_i+{N}_T{N}_i^2+1/3{N}_i^3\right) \) |
 | | \( 8{N}_R{N}_T^2 \) |
3 ZF | \( CLR{\left({H}_{LR}^T\right)}^T \) | \( 25.6K\left({N}_T^2{N}_i-{N}_T{N}_i^2+1/3{N}_i^3\right) \) |
4 ZF | \( {H}_{LR}^T{\left({H}_{LR}{H}_{LR}^T\right)}^{-1} \) | \( K\left(2{N}_i^3-2{N}_i^2+{N}_i+16{N}_T{N}_i^2\right) \) |
 | \( {H}_{LR}^T{\left({H}_{LR}{H}_{LR}^T\right)}^{-1} \) | \( K\left(18{N}_i^3-2{N}_i^2+{N}_i+16{N}_T{N}_i^2\right) \) |


Table 2

Computational complexity of QR/SVD-BD [22]

Operation | Flops
\( \mathbf{H}=\mathbf{Q}\mathbf{R} \) | \( 16K\left({N}_T^2{N}_i+{N}_T{N}_i^2+1/3{N}_i^3\right) \)
\( {\mathbf{H}}_{eff}=\mathbf{H}\mathbf{W} \) | \( 8{N}_R{N}_T^2 \)
\( {\mathbf{H}}_{i, eff}={\mathbf{U}}_i{\boldsymbol{\Lambda}}_i{\mathbf{V}}_i^H \) | \( 64\left(9/8{N}_i^3+{N}_T{N}_i^2+1/2{N}_T^2{N}_i\right) \)


We focus on the computational complexity reduction of the alternative BD methods. The complexities of the alternative methods are compared by the number of floating point operations (flops). A flop is defined as a real floating-point operation, e.g., a real addition, multiplication, or division. Based on this analysis, we summarize the computational costs of the alternative BD methods, where QR-BD denotes the BD method that is similar to SVD-BD but replaces the SVD operation with the fast Givens QR operation.

The calculated flops of the alternative methods are given in Figs. 2 and 3. In Fig. 2, we consider the case \( {N}_T=K{N}_k \), set \( {N}_k=2 \), and express the computational cost as a function of \( {N}_T \). In Fig. 3, we consider the case \( K{N}_k<{N}_T \) and express the computational cost as a function of \( {N}_k \).
Fig. 2

The required flops versus the number of transmit antennas, \( {N}_T \)

Fig. 3

The required flops versus the number of receiving antennas per MS, \( {N}_k \)

8 Simulation results

In this section, we evaluate the BER performance of the LR-aided linear precoding. We use both linear ZF and MMSE precoding schemes with the conventional LLL algorithm. As Fig. 4 shows, linear precoding applied jointly with the LLL algorithm clearly outperforms plain linear precoding. At a target BER of \( {10}^{-3} \), the gain in transmission power is 7.5 dB.
Fig. 4

BER performances of the LR linear precoding schemes

The performances of the successive detection schemes with optimum ordering are provided in Fig. 5. Note that this improvement comes at almost no cost because the complexity of SIC is comparable to that of linear detection. Again, detection with respect to the LR system significantly reduces the BER. The LR-MMSE-SIC scheme achieves almost ML performance, while the main computational effort is required only once per transmitted frame.
Fig. 5

Bit error rate of a system with \( {N}_T={N}_R=4 \) antennas and 4-QAM symbols for ZF and MMSE detection with optimally ordered SIC

The analysis of probability of error is compared to the BER results of simulations.

We compare the BER performance as a function of the bit SNR, i.e., \( {E}_b/{N}_0 \), in Fig. 6. The 4 × 4 MIMO precoding and detection techniques are compared with the proposed schemes, and 16-QAM modulation is used. ML detection performs best among all techniques, while the LR-MMSE outperforms the LR-ZF. The performance of BD precoding with LR is almost the same as that of the LR-MMSE detection in Fig. 6.
Fig. 6

The comparison results for the BER performances versus \( {E}_b/{N}_0 \)

9 Conclusions

In this paper, several detection schemes for multiple antenna systems that make use of the LR algorithm proposed in [14] are investigated. It is shown that the performance of the proposed algorithm is better than that of conventional methods and that the complexity is reduced compared with the LLL-based schemes. The performance of BD precoding with LR is almost the same as that of the LR-MMSE detection. Aside from the improved performance, it is suggested that the MMSE-based LR has a significantly smaller complexity than the ZF-based LR. Simulation results show that the proposed algorithms have substantial performance gains compared with the existing MU-MIMO linear precoding and BD detection.




This work was supported by the MEST 2015R1A2A1A 05000977, NRF, Korea.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

Division of Electronics and Information Engineering, Chonbuk National University


  1. Q Spencer, M Haardt, Capacity and downlink transmission algorithms for a multi-user MIMO channel, in Proc. 36th Asilomar Conf. on Signals, Systems, and Computers (IEEE Computer Society Press, Pacific Grove, 2002)
  2. LU Choi, RD Murch, A transmit preprocessing technique for multiuser MIMO systems using a decomposition approach. IEEE Transactions on Wireless Communications 3(1), 20–24 (2004)
  3. QH Spencer, AL Swindlehurst, M Haardt, Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels. IEEE Transactions on Signal Processing 52(2), 461–471 (2004)
  4. H Sung, S Lee, I Lee, Generalized channel inversion methods for multiuser MIMO systems. IEEE Trans. Commun. 57(11), 3489–3499 (2009)
  5. H Yao, G Wornell, Lattice-reduction-aided detectors for MIMO communication system, in IEEE Proc. Globecom (IEEE, Taipei, 2002)
  6. D Wübben, R Böhnke, V Kühn, K-D Kammeyer, MMSE based lattice reduction for near ML detection of MIMO systems. ITG workshop on Smart Antennas, 2004, pp. 106–113
  7. D Wübben, R Böhnke, V Kühn, K-D Kammeyer, Near maximum-likelihood detection of MIMO systems using MMSE-based lattice-reduction, in Proc. 2004 Int. Conf. Commun. (ICC’04) (IEEE, Paris, 2004), pp. 798–802
  8. C Windpassinger, R Fischer, Low-complexity near maximum likelihood detection and precoding for MIMO systems using lattice reduction, in Proc. IEEE Inf. Theory Workshop (IEEE, Paris, 2003), pp. 345–348
  9. K Zu, RC de Lamare, Lattice Reduction-Aided Preprocessing and Detection Techniques for MU-MIMO in Broadcast Channel. 11th European Wireless Conference 2011 (VDE, Vienna, 2011)
  10. C Windpassinger, R Fischer, JB Huber, Lattice-reduction-aided broadcast precoding. IEEE Trans. on Communications 52, 2057–2060 (2004)
  11. RFH Fischer, CA Windpassinger, Improved MIMO precoding for decentralized receivers resembling concepts from lattice reduction, in Proc. of IEEE Global Telecommunications Conf (IEEE, San Francisco, 2003), pp. 1852–1856
  12. K Zu, RC de Lamare, Low-complexity lattice reduction-aided regularized block diagonalization for MU-MIMO systems. IEEE Commun. Lett. 16, 6 (2012)
  13. K Zu, RC de Lamare, M Haardt, Generalized design of low-complexity block diagonalization type precoding algorithms for multiuser MIMO systems. IEEE Trans. on Communications 61, 10 (2013)
  14. AK Lenstra, HW Lenstra, L Lovász, Factoring polynomials with rational coefficients. Math. Ann 261, 515–534 (1982)
  15. YH Gan, C Ling, WH Mow, Complex lattice reduction algorithm for low-complexity full-diversity MIMO detection. IEEE Transactions on Signal Processing 57(7), 2701–2710 (2009)
  16. D Wübben, R Böhnke, V Kühn, K-D Kammeyer, MMSE extension of V-BLAST based on sorted QR decomposition, in IEEE Proc. VTC-Fall (IEEE, Orlando, 2003)
  17. R Böhnke, D Wübben, V Kühn, K-D Kammeyer, Reduced Complexity MMSE Detection for BLAST Architectures, in IEEE Proc. Globecom (IEEE, San Francisco, 2003)
  18. C Ling, On the proximity factors of lattice reduction aided decoding. IEEE Trans. on Signal Processing 59, 6 (2011)
  19. J Jaldén, P Elia, LR-aided MMSE lattice decoding is DMT optimal for all approximately universal codes, in Proc. Int. Symp. Inform. Theory (ISIT’09) (IEEE, Seoul, 2009)
  20. M Taherzadeh, A Mobasher, AK Khandani, LLL reduction achieves the receive diversity in MIMO decoding. IEEE Trans. Inform. Theory 53, 4801–4805 (2007)
  21. GH Golub, CF Van Loan, Matrix Computations, 3rd edition (The Johns Hopkins University Press, Baltimore and London, 1996)
  22. H Wang, L Li, L Song, X Gao, A linear precoding scheme for downlink multiuser MIMO precoding systems. IEEE Communications Letters 15(6), 635–655 (2011)


© Khan et al. 2015