# A simple block diagonal precoding for multi-user MIMO broadcast channels

## Abstract

Block diagonalization (BD) is a linear precoding technique for multi-user multi-input multi-output (MIMO) broadcast channels that completely eliminates the multi-user interference (MUI), but it is not computationally efficient. In this paper, we propose the block diagonal Jacket matrix decomposition, which not only extends the conventional block diagonal channel decomposition but also achieves the MIMO broadcast channel capacity. We also prove that the QR algorithm achieves the same sum rate as the conventional BD scheme. The complexity analysis shows that our proposal requires fewer computations than the conventional BD method.

## 1 Introduction

Recently, the capacity region of multi-user multi-input multi-output (MIMO) broadcast channels (BC) has attracted considerable research interest. It is well known that any algorithm requiring the eigenvalue decomposition (EVD) suffers from high computational cost. In mobile wireless communication systems that use MIMO techniques, the channel characteristics may vary faster than a precoding/decoding algorithm based on the EVD of the instantaneous channel matrix can be computed.

Earlier work proposed MIMO channel precoding/decoding based on the Jacket matrix decomposition, for which we believe that the computational complexity required to obtain diagonal-similar matrices is smaller than that required by the conventional EVD.

Definition 1 Let $J_N = [a_{i,j}]$ be an N × N matrix; then, it is called a Jacket matrix when $J_N^{-1} = \frac{1}{N}\left[a_{i,j}^{-1}\right]^{T}$, that is, the inverse of the Jacket matrix can be determined by its element-wise inverse [2, 3].
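Definition 1 can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `is_jacket` and the two test matrices (a 4 × 4 Hadamard matrix and a 3 × 3 DFT-type matrix) are illustrative choices, not taken from the paper.

```python
import numpy as np

def is_jacket(J, tol=1e-10):
    """Check Definition 1: inv(J) == (element-wise inverse of J)^T / N."""
    J = np.asarray(J, dtype=complex)
    n = J.shape[0]
    # A Jacket matrix must be square and have no zero entries.
    if J.shape != (n, n) or np.any(J == 0):
        return False
    candidate_inverse = (1.0 / J).T / n
    return np.allclose(J @ candidate_inverse, np.eye(n), atol=tol)

# Illustrative examples (assumptions, not from the paper).
hadamard4 = np.array([[1, 1, 1, 1],
                      [1, -1, 1, -1],
                      [1, 1, -1, -1],
                      [1, -1, -1, 1]])
omega = np.exp(2j * np.pi / 3)
dft3 = np.array([[1, 1, 1],
                 [1, omega, omega**2],
                 [1, omega**2, omega]])

print(is_jacket(hadamard4), is_jacket(dft3))
```

The practical appeal is that the inverse costs only element-wise reciprocals and a transpose, with no general matrix inversion.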

Definition 2 Let A be an n × n matrix. If there exists a Jacket matrix J such that $A = J\Sigma J^{-1}$, where Σ is a diagonal matrix, then we say that A is Jacket similar to the diagonal matrix Σ. Moreover, we say that A is Jacket diagonalizable.

Theorem 1 A 4 × 4 matrix $J$ is a Jacket matrix similar to the diagonal matrix if and only if $J$ has the following form:

$J_4 = \begin{bmatrix} A_2 & B_2 \\ C_2 & A_2 \end{bmatrix}$
(1)

i.e., the blocks on the main diagonal of the matrix are equal.

Proof Refer to  for the proof.

Multi-user diversity can significantly improve the performance of multiple antenna systems. The simplest ways to achieve the diversity gain in MIMO downlink communications are the zero forcing (ZF)-based linear precoding approaches. In [5, 6], it was shown that the maximum sum rate in the multi-user MIMO broadcast channels can be achieved by dirty paper coding (DPC). However, the high computational complexity of the DPC makes it difficult to implement in practical systems. A suboptimal DPC strategy, the Tomlinson-Harashima precoding (THP) algorithm, which is based on nonlinear modulo operations, is still impractical due to its high complexity.

In linear processing systems, several practical precoding techniques have been proposed, typically the channel inversion method [8, 9] and the block diagonalization (BD) method. The ZF channel inversion scheme can suppress co-channel interference (CCI) completely when all users employ a single antenna. However, its performance is degraded by noise enhancement. Although the minimum mean-squared error (MMSE) channel inversion method overcomes this drawback of the ZF, it is still confined to the single-receive-antenna case. For the scenario where multiple antennas are located at both the mobile terminal and the base station for each user, low-complexity BD methods have been proposed [8, 11–13]. The BD attempts to completely eliminate the multi-user interference (MUI) irrespective of the noise. Therefore, the BD scheme performs poorly in the low-SNR regime, while it preserves its good performance at high SNR. BD precoding has also been proposed to improve the sum rate or reduce the transmitted power, and a regularized BD (RBD) scheme has been proposed to improve the performance of the BD. Subsequent work has focused on implementing the BD precoding algorithms with less computational complexity and without performance degradation. A low-complexity generalized ZF channel inversion (GZI) method has been proposed to equivalently implement the first singular value decomposition (SVD) operation of the original BD precoding, and a generalized MMSE channel inversion (GMI) method has been developed for the original RBD precoding. The QR/SVD techniques require only low complexity to equivalently implement the BD precoding algorithms. As an improvement of the BD precoding algorithms, a low-complexity lattice reduction-aided RBD (LC-RBD-LR)-type precoding algorithm based on the QR decomposition has been proposed in [11, 12]. However, the complexity of the RBD is still too high for practical implementation, and, owing to the SVD in the algorithm, the BD is not computationally efficient.

In this paper, we propose QR-based BD and Jacket matrix methods. We consider the channel matrix decomposition based on QR and Jacket matrices for the case where each user has multiple antennas. By using the QR decomposition to find the orthogonal complement, the complexity of the SVD-BD can be reduced, so the QR approach offers a significant improvement in computational complexity over the conventional BD scheme. In addition, we prove that the proposed QR algorithm achieves the same sum rate as the conventional BD scheme. We also discuss the block diagonal Jacket matrix decomposition: because the inverse of a Jacket matrix is obtained from its element-wise inverse, its complexity can be calculated easily.

The rest of this paper is organized as follows. In Section 2, we describe the system model. In Section 3, we discuss the BD method. In Section 4, we analyze the block diagonal Jacket decomposition of an equivalent channel matrix. In Section 5, we perform the complexity analysis. Finally, we draw meaningful conclusions in Section 6.

## 2 System model

We consider the downlink MIMO broadcast channel from a base station (BS) to K mobile users, as shown in Figure 1. The MIMO channel of each user is assumed to be flat fading with distribution $\mathcal{CN}(0, I)$, where the BS has $N_T$ transmit antennas, and each user has $N_R$ receive antennas. In this linear precoding scheme, the precoded signal vector for the k-th user can be written as

$x_k = T_k s_k$
(2)

The received signal for the k-th user can be represented as

$y_k = H_k \sum_{j=1}^{K} T_j s_j + n_k = H_k T_k s_k + \sum_{j=1, j \neq k}^{K} H_k T_j s_j + n_k, \quad k = 1, \cdots, K,$
(3)

where k and j are user indices, $T_k \in \mathbb{C}^{N_T \times N_k}$ is the precoding matrix for user k, $s_k \in \mathbb{C}^{N_k \times 1}$ represents the data symbol vector, $x_k \in \mathbb{C}^{N_T \times 1}$ is the transmit signal, $H_k \in \mathbb{C}^{N_k \times N_T}$ is the MIMO channel matrix, and $n_k$ is Gaussian noise with zero mean and variance $\sigma^2$. It is also assumed that all signals are detectable and $\sum_{k=1}^{K} N_k \le N_T$.

Note that the precoding matrices are normalized to unity, i.e., $\|T_k\|^2 = 1$ for k = 1, …, K. Furthermore, the power constraints are defined as $\operatorname{tr}(T_k T_k^H) \le P_k$, where $P_k$ is the total transmission power. This power constraint is imposed at the BS on the transmission to the k-th user. Therefore, a sum rate maximization problem with power constraints can be expressed as

$\max \sum_{k} \log \left| I + H_k T_k T_k^H H_k^H \right| \quad \text{s.t.} \quad \operatorname{tr}\left(T_k T_k^H\right) \le P_k, \ k = 1, \ldots, K, \quad \tilde{H}_k T_k = 0, \ k = 1, \ldots, K$
(4)

The aforementioned problem is categorized as a convex optimization problem. Thus, it can be solved optimally and efficiently by using the water filling algorithm, which is proposed for the multi-user transmit optimization for broadcast channels.
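For illustration, the water filling step can be sketched as follows. This is a generic water filling over parallel channels under a single power budget, assuming NumPy; the function name `water_filling` and the normalization of the gains (eigenvalue divided by noise variance) are assumptions made for this sketch, not the exact procedure of the cited work.

```python
import numpy as np

def water_filling(gains, total_power):
    """Maximize sum(log2(1 + gains[i] * p[i])) subject to sum(p) = total_power, p >= 0.

    gains: positive channel gains (e.g. lambda_i / sigma^2).
    """
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest channel first
    sorted_gains = gains[order]
    powers = np.zeros_like(gains)
    # Drop the weakest channels until every remaining allocation is non-negative.
    for n_active in range(len(gains), 0, -1):
        inverse_gains = 1.0 / sorted_gains[:n_active]
        water_level = (total_power + inverse_gains.sum()) / n_active
        allocation = water_level - inverse_gains
        if allocation[-1] >= 0:
            powers[order[:n_active]] = allocation
            break
    return powers

p = water_filling([2.0, 1.0, 0.1], total_power=1.0)
print(p, p.sum())
```

In this example the weakest channel receives no power, which is the characteristic behaviour of water filling at low total power.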

## 3 Block diagonalization method

In this section, we present a novel BD method for multi-user MIMO systems. The BD algorithm is an extension of the ZF method to multi-user MIMO systems where each user has multiple antennas. Each user's linear precoder and receiver filter can be obtained by two SVD operations.

### 3.1 Block diagonalization

The key idea of the BD algorithm is to employ the precoding matrix $T_k$ to suppress the MUI completely. To eliminate all MUI, the following constraint is imposed.

$\tilde{H}_k T_k = 0, \quad k = 1, \cdots, K$
(5)

$\tilde{H}_k$ is defined as the channel matrix for all users other than user k.

$\tilde{H}_k = \begin{bmatrix} H_1^T & \cdots & H_{k-1}^T & H_{k+1}^T & \cdots & H_K^T \end{bmatrix}^T$
(6)

By applying the SVD to $\tilde{H}_k$, we obtain

$\tilde{H}_k = \tilde{U}_k \tilde{\Sigma}_k \begin{bmatrix} \tilde{V}_k^{(1)} & \tilde{V}_k^{(0)} \end{bmatrix}^H,$
(7)

where $\tilde{\Sigma}_k$ is the diagonal matrix whose diagonal elements are the non-negative singular values of $\tilde{H}_k$ and whose dimension equals the rank of $\tilde{H}_k$. $\tilde{V}_k^{(0)}$ contains the singular vectors corresponding to the zero singular values, and $\tilde{V}_k^{(1)}$ consists of the singular vectors corresponding to the nonzero singular values. Thus, $\tilde{V}_k^{(0)}$ is an orthogonal basis for the null space of $\tilde{H}_k$. In order to maximize the achievable sum rate of the BD, the water filling algorithm can be additionally incorporated. Define the SVD of the effective channel $H_k \tilde{V}_k^{(0)}$ as

$H_k \tilde{V}_k^{(0)} = U_k \Sigma_k \begin{bmatrix} V_k^{(1)} & V_k^{(0)} \end{bmatrix}^H.$
(8)

Thus, we define the total precoding matrix as

$T_{BD} = \begin{bmatrix} \tilde{V}_1^{(0)} V_1^{(1)} & \tilde{V}_2^{(0)} V_2^{(1)} & \cdots & \tilde{V}_K^{(0)} V_K^{(1)} \end{bmatrix} \Lambda^{1/2},$
(9)

where Λ is a diagonal matrix whose elements $\lambda_k$ scale the power transmitted into each of the columns of $T_{BD}$. With $T_{BD}$ chosen as in Equation 9, the power allocation matrix Λ that maximizes the sum rate under a total power constraint at the BS is the solution to the following optimization, and the capacity of the BD [10, 15] is

$C_{BD} = \max_{\Lambda} \log_2 \left| I + \frac{\Sigma^2 \Lambda}{\sigma^2} \right|,$
(10)

where

$\Sigma = \begin{bmatrix} \Sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \Sigma_K \end{bmatrix}.$
(11)
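The two SVD steps of the conventional BD can be sketched numerically as follows. This is a minimal sketch assuming NumPy, random Gaussian channels, and no power loading (Λ = I); the function name `bd_precoders` and the rank tolerance are assumptions made for illustration.

```python
import numpy as np

def bd_precoders(channels, tol=1e-10):
    """Return per-user BD precoders (without power loading) and effective singular values."""
    precoders, singular_values = [], []
    for k, H_k in enumerate(channels):
        # Stack the channels of all users other than user k.
        H_tilde = np.vstack([H for j, H in enumerate(channels) if j != k])
        # First SVD: the rows of Vh beyond the rank span the null space of H_tilde.
        _, s, Vh = np.linalg.svd(H_tilde)
        rank = int(np.sum(s > tol))
        V0_tilde = Vh[rank:].conj().T
        # Second SVD: diagonalize the interference-free effective channel.
        _, s_eff, Vh_eff = np.linalg.svd(H_k @ V0_tilde)
        r_eff = int(np.sum(s_eff > tol))
        V1 = Vh_eff[:r_eff].conj().T
        precoders.append(V0_tilde @ V1)
        singular_values.append(s_eff[:r_eff])
    return precoders, singular_values

rng = np.random.default_rng(0)
N_T, N_R, K = 4, 2, 2   # illustrative sizes
channels = [(rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
            for _ in range(K)]
precoders, singular_values = bd_precoders(channels)

# Residual multi-user interference: H_j T_k for j != k should vanish.
leak = max(np.linalg.norm(channels[j] @ precoders[k])
           for j in range(K) for k in range(K) if j != k)
print(leak)
```

The leakage value printed at the end is a direct check of the zero-MUI constraint in Equation 5.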

The optimal power-loading coefficients of Λ are determined by using the water filling on the diagonal elements of Σ, assuming that $P_k$ is a total power constraint. A summary of the BD algorithm is given in Algorithm 1.

### 3.2 Proposed QR-based BD method

In this subsection, we propose an alternative method to find vectors orthonormal to $\tilde{H}_k$ based on the QR decomposition. In order to compute the null space of $\tilde{H}_k$, we define a QR decomposition of $\tilde{H}_k$ as

$\tilde{H}_k = \begin{bmatrix} Q_k & \bar{Q}_k \end{bmatrix} \begin{bmatrix} R_k \\ 0 \end{bmatrix} = Q_k R_k,$
(12)

where $Q_k$ is an $N_T \times N_T$ unitary matrix, so $Q_k^H Q_k = I$; $R_k \in \mathbb{C}^{N_T \times N_R}$ is an $N_T \times N_R$ upper triangular matrix, and $\bar{Q}_k$ is an $N_T \times (N_R - N_T)$ matrix. $\bar{Q}_k^H = \begin{bmatrix} Q_k^1 & Q_k^2 \end{bmatrix}$, where $Q_k^1$ is an $N_k$ column unitary matrix.

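The pseudo inverse of the aggregate channel matrix $H = \begin{bmatrix} H_1^T & H_2^T & \cdots & H_K^T \end{bmatrix}^T$ is $\bar{H} = H^H \left( H H^H \right)^{-1} = \begin{bmatrix} \bar{H}_1 & \bar{H}_2 & \cdots & \bar{H}_K \end{bmatrix}$. Then, we can show that

$H \bar{H} = \begin{bmatrix} H_1 \\ \vdots \\ H_K \end{bmatrix} \begin{bmatrix} \bar{H}_1 & \cdots & \bar{H}_K \end{bmatrix} = \begin{bmatrix} H_1 \bar{H}_1 & \cdots & H_1 \bar{H}_K \\ \vdots & \ddots & \vdots \\ H_K \bar{H}_1 & \cdots & H_K \bar{H}_K \end{bmatrix} = \begin{bmatrix} I_{N_{R,1}} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & I_{N_{R,K}} \end{bmatrix}.$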
(13)

Clearly, $H_j \bar{H}_k = 0$ when j ≠ k, which is called the zero inter-user interference (IUI) constraint since it forces the IUI to zero. By defining $\tilde{H}_j$ as $\tilde{H}_j = \begin{bmatrix} H_1^T & \cdots & H_{j-1}^T & H_{j+1}^T & \cdots & H_K^T \end{bmatrix}^T$, it follows that the zero IUI constraint is satisfied as $\tilde{H}_j \bar{H}_j = 0$. The QR decomposition of $\bar{H}_j$ is

$\bar{H}_j = Q_j R_j \quad \text{for } j = 1, \cdots, K.$
(14)

From the zero IUI constraint, we have $\tilde{H}_j Q_j R_j = 0$. Since $R_j$ is invertible, it follows that $\tilde{H}_j Q_j = 0$.
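The argument above can be verified numerically. The sketch below, assuming NumPy and random Gaussian channels, forms the right pseudo inverse of the stacked channel, takes the reduced QR decomposition of each block $\bar{H}_j$, and checks that the resulting orthonormal columns $Q_j$ are annihilated by $\tilde{H}_j$. The helper name `qr_null_basis` and the example sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N_T, N_R, K = 4, 2, 2   # illustrative sizes
channels = [(rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
            for _ in range(K)]

H = np.vstack(channels)                                # stacked channel, (K*N_R) x N_T
H_bar = H.conj().T @ np.linalg.inv(H @ H.conj().T)     # right pseudo inverse, N_T x (K*N_R)

def qr_null_basis(j):
    """Orthonormal basis Q_j obtained from the QR decomposition of the j-th block of H_bar."""
    H_bar_j = H_bar[:, j * N_R:(j + 1) * N_R]
    Q_j, R_j = np.linalg.qr(H_bar_j)                   # reduced QR: Q_j is N_T x N_R
    return Q_j

leaks = []
for j in range(K):
    H_tilde_j = np.vstack([channels[i] for i in range(K) if i != j])
    Q_j = qr_null_basis(j)
    leaks.append(np.linalg.norm(H_tilde_j @ Q_j))      # should be numerically zero
print(leaks)
```

Only one matrix inversion and K thin QR decompositions are needed here, in place of the K SVDs of $\tilde{H}_j$ used by the conventional BD.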

Let $G_k = H_k Q_k^1$ and apply the EVD of $G_k$ as

$G_k = \hat{U}_k \hat{\Sigma}_k \hat{U}_k^H,$
(15)

where $\hat{U}_k$ is a unitary matrix, and $\hat{\Sigma}_k$ is a diagonal matrix. Thus, we get the precoding matrix as

$T_{QR} = \begin{bmatrix} Q_1^1 \hat{U}_1 & Q_2^1 \hat{U}_2 & \cdots & Q_K^1 \hat{U}_K \end{bmatrix} \Psi^{1/2},$
(16)

where Ψ is a diagonal matrix whose elements scale the power transmitted into each of the columns of $T_{QR}$. The capacity of the QR-EVD is

$C_{QR-EVD} = \max_{\Psi} \log \left| I + \frac{\hat{\Sigma}^2 \Psi}{\sigma^2} \right|,$
(17)

where

$\hat{\Sigma} = \begin{bmatrix} \hat{\Sigma}_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \hat{\Sigma}_K \end{bmatrix}.$
(18)

The optimal power-loading coefficients of Ψ are determined by using the water filling on the diagonal elements of $\hat{\Sigma}$, assuming that $P_k$ is a total power constraint. Equations 10 and 17 give the same value; that is, the conventional BD and the QR-EVD decomposition (Algorithm 2) achieve the same channel capacity.
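The equality of the two sum rates can be illustrated numerically. The sketch below assumes NumPy, $N_T = K N_R$ (four transmit antennas, two users with two receive antennas each), an SVD of the effective channel in place of the EVD, and equal power per stream instead of water filling; these simplifications and the helper names are assumptions made for this sketch. Under them, the SVD-based and QR-based bases span the same null space, so the effective channels share the same singular values.

```python
import numpy as np

rng = np.random.default_rng(2)
N_T, N_R, K = 4, 2, 2
channels = [(rng.standard_normal((N_R, N_T)) + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
            for _ in range(K)]
H_bar = np.linalg.pinv(np.vstack(channels))   # pseudo inverse of the stacked channel

def effective_singular_values(k, method):
    """Singular values of H_k times an orthonormal null-space basis of the other users."""
    H_tilde = np.vstack([channels[j] for j in range(K) if j != k])
    if method == "svd":
        _, s, Vh = np.linalg.svd(H_tilde)
        basis = Vh[int(np.sum(s > 1e-10)):].conj().T
    else:  # "qr"
        basis, _ = np.linalg.qr(H_bar[:, k * N_R:(k + 1) * N_R])
    return np.linalg.svd(channels[k] @ basis, compute_uv=False)

def sum_rate(method, snr=10.0):
    """Sum rate in bits/s/Hz with equal power on every stream (no water filling)."""
    rate = 0.0
    for k in range(K):
        s = effective_singular_values(k, method)
        rate += np.sum(np.log2(1.0 + snr * s**2 / (K * N_R)))
    return rate

print(sum_rate("svd"), sum_rate("qr"))
```

Because the singular values agree, any power loading computed from them, including water filling, yields the same rate for both methods.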

Figure 2 shows that the BD method has the same sum rate as the QR-EVD method and An's method under the condition that the MIMO broadcasting system consists of one base station and two users, where the base station has four transmit antennas and each user has two receive antennas.

## 4 Block diagonal Jacket decomposition of an equivalent channel matrix

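In this section, we introduce the block diagonal Jacket decomposition of an equivalent channel matrix. Assume that $H_k$ is an $N_R \times N_T$ block diagonal matrix given by

$H_k = \begin{bmatrix} L\Sigma L^{-1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & L\Sigma L^{-1} \end{bmatrix},$
(19)

and its inverse is

$H_k^{-1} = \begin{bmatrix} \left(L\Sigma L^{-1}\right)^{-1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \left(L\Sigma L^{-1}\right)^{-1} \end{bmatrix}.$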
(20)

The channel matrix is decomposed into parallel single-input single-output subchannels. A special k × k Jacket matrix called a diagonal Jacket matrix can be defined as follows:

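$J_k = \begin{bmatrix} J_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_{k,k} \end{bmatrix},$
(21)

and its inverse matrix is

$J_k^{-1} = \begin{bmatrix} 1/J_{1,1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1/J_{k,k} \end{bmatrix}.$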
(22)

Obviously, the unitary matrices can be considered as the Jacket matrices.

Let us denote $B_2$ as a 2 × 2 block matrix in the main diagonal of $H_k$ [1, 17]. Then, Equation 19 can be written as

$H_k = I_{k/2} \otimes B_2,$
(23)

where

$B_2 = L\Sigma L^{-1},$
(24)

$I_{k/2}$ is an identity matrix, and ⊗ is the Kronecker product. It is worthwhile to note that each block in the diagonal of the matrix in Equation 19 is a 2 × 2 matrix that satisfies the condition specified in Theorem 1, and hence, we say that $B_2$ can be decomposed by the EVD using Jacket matrices. In other words, $B_2$ can be represented by

$B_2 = J_2 \Sigma_2 J_2^{-1}.$
(25)

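In addition, $H_k$ can be decomposed into the diagonal form

$H_k = I_{k/2} \otimes B_2 = I_{k/2} \otimes \left( J_2 \Sigma_2 J_2^{-1} \right) = \left( I_{k/2} \otimes J_2 \right) \operatorname{diag}\left( \lambda_1, \lambda_2, \cdots, \lambda_k \right) \left( I_{k/2} \otimes J_2^{-1} \right) = J \Sigma J^{-1}.$
(26)

Thus, we can write

$H_k = J \Sigma J^{-1},$
(27)

where

$J = I_{k/2} \otimes J_2 = \begin{bmatrix} J_2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_2 \end{bmatrix}_{k \times k},$
(28)
$\Sigma = I_{k/2} \otimes \Sigma_2 = \begin{bmatrix} \Sigma_2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \Sigma_2 \end{bmatrix}_{k \times k}, \quad \text{and}$
(29)
$J^{-1} = I_{k/2} \otimes J_2^{-1} = \begin{bmatrix} J_2^{-1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & J_2^{-1} \end{bmatrix}_{k \times k}.$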

Note that the size of each block element in the diagonal matrices (28), (29), and (30) is 2 × 2.
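A small numerical check of Equations 23 to 30 is sketched below, assuming NumPy. The specific $J_2$ (a 2 × 2 Hadamard matrix) and the eigenvalues 3 and 1 are illustrative assumptions; they give a $B_2$ with equal diagonal entries.

```python
import numpy as np

# Illustrative 2 x 2 Jacket matrix and eigenvalues (assumptions).
J2 = np.array([[1.0, 1.0],
               [1.0, -1.0]])
Sigma2 = np.diag([3.0, 1.0])
J2_inv = (1.0 / J2).T / 2                 # element-wise inverse, as in Definition 1
B2 = J2 @ Sigma2 @ J2_inv                 # equals [[2, 1], [1, 2]]

k = 4                                     # size of the block diagonal channel
I_half = np.eye(k // 2)
H_k = np.kron(I_half, B2)                 # Equation 23
J = np.kron(I_half, J2)                   # Equation 28
Sigma = np.kron(I_half, Sigma2)           # Equation 29
J_inv = np.kron(I_half, J2_inv)           # Equation 30

print(np.allclose(H_k, J @ Sigma @ J_inv))
```

The factorization of the full matrix is assembled from the 2 × 2 factors alone, so no decomposition of the k × k matrix itself is required.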

### 4.1 Eigenvalue decomposition of matrix of order 3

In this subsection, we introduce a class of matrices of order 3 that can be factorized into EVD forms through Jacket matrices [1, 17]. A 3 × 3 matrix A is a Jacket matrix similar to a diagonal matrix Λ if and only if such a matrix can be factorized into the form of an EVD such as A = J Λ J−1. Consider a special matrix, A, of which the elements in the first row are arbitrary, whereas the elements in the other rows are generated by cyclically shifting the previous row. One of its examples is given as follows.

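$A = \begin{bmatrix} a & b & c \\ c & a & b \\ b & c & a \end{bmatrix}.$
(31)

The abovementioned matrix, A, can be decomposed as follows:

$\begin{bmatrix} a & b & c \\ c & a & b \\ b & c & a \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{bmatrix} \begin{bmatrix} a + b + c & 0 & 0 \\ 0 & a + b\omega + c\omega^2 & 0 \\ 0 & 0 & a + b\omega^2 + c\omega \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{bmatrix}^{-1},$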
(32)

where $\omega = e^{j2\pi/n}$ (n is the matrix order, here n = 3). Note that $\omega^3 = 1$ and $\omega \neq 1$.
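Equation 32 can be confirmed numerically. The sketch below assumes NumPy; the values of a, b, and c are arbitrary illustrative choices.

```python
import numpy as np

a, b, c = 2.0, -1.0, 0.5                  # arbitrary first-row entries (assumption)
A = np.array([[a, b, c],
              [c, a, b],
              [b, c, a]])

w = np.exp(2j * np.pi / 3)                # primitive cube root of unity
J3 = np.array([[1, 1, 1],
               [1, w, w**2],
               [1, w**2, w]])
Lambda3 = np.diag([a + b + c,
                   a + b * w + c * w**2,
                   a + b * w**2 + c * w])
J3_inv = (1.0 / J3).T / 3                 # Jacket inverse: element-wise inverse, transposed, over N

reconstructed = J3 @ Lambda3 @ J3_inv
print(np.allclose(reconstructed, A))
```

The eigenvalues are read directly from the first row of A, and the inverse of $J_3$ comes from element-wise reciprocals, which is the source of the claimed savings over a general EVD.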

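Consider a matrix $A_6$ that can be decomposed via Jacket matrices as

$A_6 = A_2 \otimes A_3 = \left( J_2 \otimes J_3 \right) \left( \Lambda_2 \otimes \Lambda_3 \right) \left( J_2 \otimes J_3 \right)^{-1},$
(33)

where ⊗ is the Kronecker product. Then, the matrices in the EVD of Equation 33 are given as

$A_6 = \begin{bmatrix} a & b \\ b & a \end{bmatrix} \otimes \begin{bmatrix} a & b & c \\ c & a & b \\ b & c & a \end{bmatrix}, \quad J_2 \otimes J_3 = \begin{bmatrix} a & a \\ a & -a \end{bmatrix} \otimes \begin{bmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{bmatrix}$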
(34)

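In general, a matrix of order n ($n = 2^k \times 3^l$) can be decomposed via the Jacket transform as follows:

$A_n = A_{2^k \times 3^l} = A_{2^k} \otimes A_{3^l} = \left( J_{2^k} \Lambda_{2^k} J_{2^k}^{-1} \right) \otimes \left( J_{3^l} \Lambda_{3^l} J_{3^l}^{-1} \right) = \left( J_{2^k} \otimes J_{3^l} \right) \left( \Lambda_{2^k} \otimes \Lambda_{3^l} \right) \left( J_{2^k} \otimes J_{3^l} \right)^{-1}$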
(35)
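The Kronecker construction can be checked for the order-6 case. The sketch below assumes NumPy and arbitrary values for a, b, and c; it uses $J_2$ with unit entries, which differs from a scaled choice only by a constant factor that cancels against the inverse.

```python
import numpy as np

a, b, c = 2.0, -1.0, 0.5                  # arbitrary entries (assumption)
w = np.exp(2j * np.pi / 3)

A2 = np.array([[a, b], [b, a]])
A3 = np.array([[a, b, c], [c, a, b], [b, c, a]])

J2 = np.array([[1, 1], [1, -1]], dtype=complex)
J3 = np.array([[1, 1, 1], [1, w, w**2], [1, w**2, w]])
Lambda2 = np.diag([a + b, a - b])
Lambda3 = np.diag([a + b + c, a + b * w + c * w**2, a + b * w**2 + c * w])

# Kronecker products of the order-2 and order-3 factors give the order-6 factors.
J6 = np.kron(J2, J3)
Lambda6 = np.kron(Lambda2, Lambda3)
J6_inv = (1.0 / J6).T / 6                 # the Kronecker product is again a Jacket matrix

A6 = np.kron(A2, A3)
print(np.allclose(A6, J6 @ Lambda6 @ J6_inv))
```

The same pattern extends to orders $2^k \times 3^l$ by repeated Kronecker products of the small factors.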

The diagonal mobile communication channel matrix is given by Equation 23, where

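$B_2 = \begin{bmatrix} \cos 45^\circ & -i \sin 45^\circ \\ \sin 45^\circ & i \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}$
$= \begin{bmatrix} 0.8881 & -0.3251 + 0.3251i \\ 0.3251 + 0.3251i & 0.8881 \end{bmatrix} \begin{bmatrix} 0.9659 - 0.2588i & 0 \\ 0 & -0.2588 + 0.9659i \end{bmatrix} \begin{bmatrix} 0.8881 & 0.3251 - 0.3251i \\ -0.3251 - 0.3251i & 0.8881 \end{bmatrix} = Q \Lambda Q^H.$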
(36)

A 4 × 4 block wise Jacket matrix is

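$H_4 = \begin{bmatrix} B_2 & 0 \\ 0 & B_2 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -i & 0 & 0 \\ 1 & i & 0 & 0 \\ 0 & 0 & 1 & -i \\ 0 & 0 & 1 & i \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \otimes \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix} = I_2 \otimes B_2.$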
(37)
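The numerical values in Equation 36 and the block structure in Equation 37 can be reproduced with a short sketch, assuming NumPy. The eigenvector matrix returned by `np.linalg.eig` may differ from the Q printed above by a phase or column order, so the check below compares the eigenvalues and the reconstruction rather than Q itself.

```python
import numpy as np

B2 = np.array([[1, -1j],
               [1, 1j]]) / np.sqrt(2)

# B2 is unitary (hence normal), so its unit-norm eigenvectors are orthogonal.
eigenvalues, Q = np.linalg.eig(B2)
reconstructed_B2 = Q @ np.diag(eigenvalues) @ Q.conj().T

# Block diagonal 4 x 4 channel and its factorization via Kronecker products.
H4 = np.kron(np.eye(2), B2)
Q4 = np.kron(np.eye(2), Q)
Lambda4 = np.kron(np.eye(2), np.diag(eigenvalues))
reconstructed_H4 = Q4 @ Lambda4 @ Q4.conj().T

print(np.round(eigenvalues, 4))
print(np.allclose(reconstructed_B2, B2), np.allclose(reconstructed_H4, H4))
```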

Then, the capacity of a MIMO wireless communication system is given by

$C = \log_2 \det \left( I_{N_R} + \frac{\mathrm{SNR}}{N_T} H_k H_k^H \right) \ \text{bits/s/Hz}$
(38)

The channel matrix $H_k$ can also be decomposed by the EVD

$H_k = Q \Lambda Q^H.$
(39)

Then, the EVD is obtained as

$H_k H_k^H = Q \Sigma \Sigma^H Q^H = Q \Lambda Q^H,$
(40)

where $Q Q^H = Q^H Q = I_N$, and $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_K)$ with its diagonal elements given as

$\lambda_k = \begin{cases} \sigma_k^2, & \text{if } k = 1, 2, \cdots, K_{\min} \\ 0, & \text{if } k = K_{\min} + 1, \cdots, K. \end{cases}$
(41)

It is shown that the MIMO system capacity can be written as

$C = \sum_{k=1}^{K} \log_2 \left( 1 + \frac{\mathrm{SNR}}{N_T} \lambda_k \right) \ \text{bits/s/Hz}.$
(42)

Therefore, the EVD can be also applied to block diagonal Jacket matrices.
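The equivalence of Equations 38 and 42 can be illustrated as follows. This is a minimal sketch assuming NumPy; the function names and the random test channel are assumptions made for illustration.

```python
import numpy as np

def capacity_logdet(H, snr):
    """Equation 38: log2 det(I + (SNR / N_T) H H^H), in bits/s/Hz."""
    n_r, n_t = H.shape
    M = np.eye(n_r) + (snr / n_t) * (H @ H.conj().T)
    _, log_abs_det = np.linalg.slogdet(M)   # natural log of |det M|
    return log_abs_det / np.log(2)

def capacity_eigen(H, snr):
    """Equation 42: sum of log2(1 + (SNR / N_T) lambda_k) over the eigenvalues of H H^H."""
    n_t = H.shape[1]
    lam = np.linalg.eigvalsh(H @ H.conj().T)
    lam = np.clip(lam, 0.0, None)           # guard against tiny negative round-off
    return float(np.sum(np.log2(1.0 + snr / n_t * lam)))

rng = np.random.default_rng(3)
H_random = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
B2 = np.array([[1, -1j], [1, 1j]]) / np.sqrt(2)   # the unitary block from Equation 36

print(capacity_logdet(H_random, 10.0), capacity_eigen(H_random, 10.0))
print(capacity_logdet(B2, 2.0), capacity_eigen(B2, 2.0))
```

For the unitary block $B_2$, all eigenvalues of $H_k H_k^H$ equal one, so the capacity depends only on the SNR and the antenna count.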

## 5 Complexity analysis

In this section, we quantify the complexity of the QR-EVD decomposition algorithm and compare it with the conventional SVD-BD schemes. The complexities of alternative methods are usually compared by the number of floating point operations (flops). A flop is defined as a real floating point operation, i.e., a real addition, multiplication, division, and so on. One complex addition and one complex multiplication require two and six flops, respectively.

### 5.1 Complexity of matrix operations

We use the total number of flops to measure the computational complexity of the existing algorithms [11, 13, 18, 19], considering an m × n complex-valued matrix $E_{m \times n}$ and an n × p complex-valued matrix $D_{n \times p}$. We summarize the total flops needed for the matrix operations as below:

• Multiplication of m × n and n × p complex matrices is 8mnp flops.

• When D = E, the complexity is reduced to 4 nm (m + 1) flops, where D is a diagonal or block diagonal matrix.

• The flop count for the SVD of real-valued m × n (m ≤ n) matrices is $4m^2n + 8mn^2 + 9n^3$. For complex-valued m × n (m ≤ n) matrices, we approximate the flop count as $24mn^2 + 48m^2n + 54m^3$ by treating every operation as a complex multiplication.

• The QR decomposition of E using the Gram-Schmidt orthogonalization (GSO) method takes $6 \times 2m^2n$ flops.

• The water filling operation over m eigenvalues takes $2m^2 + 6m$ flops.
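The flop counts listed above can be collected into small helper functions for quick comparisons. The sketch below is plain Python; the function names are hypothetical, and the QR count follows the reading $6 \times 2m^2n$ of the formula above, which is an assumption about the intended exponents.

```python
def flops_matmul(m, n, p):
    """Complex (m x n) times (n x p) matrix multiplication."""
    return 8 * m * n * p

def flops_svd_complex(m, n):
    """Approximate SVD cost of a complex m x n matrix (m <= n), as listed above."""
    return 24 * m * n**2 + 48 * m**2 * n + 54 * m**3

def flops_qr_gso(m, n):
    """QR decomposition via Gram-Schmidt, read as 6 * 2 * m^2 * n (assumed exponents)."""
    return 6 * 2 * m**2 * n

def flops_water_filling(m):
    """Water filling over m eigenvalues."""
    return 2 * m**2 + 6 * m

# Example: a single 2 x 4 decomposition (illustrative sizes only).
m, n = 2, 4
print(flops_svd_complex(m, n), flops_qr_gso(m, n))
```

These helpers only encode the per-operation counts; they do not reproduce the full per-algorithm totals of Table 1.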

### 5.2 Complexity analysis for BD methods

For the conventional SVD-BD method, obtaining the orthogonal complementary basis $\tilde{V}_k^{(0)}$ requires K SVD operations. Hence, we consider GSO or QR decomposition methods. Calculating all the effective channels $H_k \tilde{V}_k^{(0)}$ requires K matrix multiplications, while obtaining the singular vectors $V_k^{(1)}$ and the singular values $\lambda_k$ requires another K SVD operations. The water filling is needed to find $P_k$. The square root of the real-valued diagonal matrix, $P_k^{1/2}$, needs to be calculated and multiplied by $\tilde{V}_k^{(0)}$ and $V_k^{(1)}$, respectively. Those operations repeat K times as well.

Based on the above analysis, the flop counts of the SVD-BD and the QR decomposition are shown in Figures 3 and 4. Figure 3 shows the required number of flops as a function of the number of transmit antennas, $N_T$, where n = 2 and k = 2. Figure 4 shows the required number of flops as a function of the number of users, K, where m = 24 and n = 2. From Figures 3 and 4, it is clear that the QR decomposition significantly reduces the number of flops compared with the BD algorithm, and the reduction grows as $N_T$ and K increase.

The channel in Equation 27 can be decomposed by Jacket matrices into a diagonal form, where J is a unitary matrix. Equations 8 and 15 therefore have the same form as Equation 27, because U and V are unitary matrices and thus belong to the family of Jacket matrices, as discussed in the previous sections. Thus, the complexity analysis of Jacket matrices is the same as that of the QR-EVD decomposition, as shown in Table 1. The complexity of both the conventional EVD method and the Jacket-based EVD method increases with the matrix size, as shown in Figure 5, which compares the two methods. For the classes of matrices that can be decomposed by the EVD based on the Jacket transform, the computational complexity is significantly reduced compared with the conventional EVD method.

## 6 Conclusion

In this paper, we proposed the QR method to obtain the precoding matrix for MIMO broadcast downlink systems and showed that the QR scheme achieves the same sum capacity as the SVD-BD scheme. The complexity analysis shows that the new method has lower complexity than the conventional BD method, and the efficiency improvement becomes significant when the base station or the users have a large number of antennas. By using the QR decomposition to find the orthogonal complement, the complexity of the SVD-BD can be significantly reduced. The complexity analysis of Jacket matrices is the same as that of the QR-EVD decomposition, and we believe that the amount of computation required to obtain diagonal-similar matrices is much smaller than that required in the conventional EVD. In addition, we showed that the EVD can be extended to high-order matrices. These properties may allow Jacket matrices to be applied to signal processing, coding theory, and orthogonal code design, and the EVD can be used in the information-theoretic analysis of MIMO channels.

## References

1. Lee MH, Matalgah MM, Song W: Fast method for precoding and decoding of distributive multi-input multi-output channels in relay based decode-and-forward cooperative wireless networks. IET Commun 2010, 4(2):144-153. 10.1049/iet-com.2008.0712

2. Lee MH: A new reverse Jacket transform and its fast algorithm. IEEE Trans Circuits Syst II 2000, 47(1):39-47. 10.1109/82.818893

3. Chen Z, Lee MH, Zeng G: Fast cocyclic Jacket transform. IEEE Trans Signal Process 2008, 56: 2143-2148.

4. Lee MH, Manev NL, Zhang XD: Jacket transform eigenvalue decomposition. Appl Math Comput 2008, 198: 854-864.

5. Weingarten H, Steinberg Y, Shamai S: The capacity region of the Gaussian MIMO broadcast channel. IEEE Trans Inform Theory 2006, 52(9):3936-3964.

6. Costa MHM: Writing on dirty paper. IEEE Trans Inform Theory 1983, 29: 439-441. 10.1109/TIT.1983.1056659

7. Windpassinger C, Fischer RFH, Vencel T, Huber JB: Precoding in multiantenna and multiuser communications. IEEE Trans Wireless Commun 2004, 3(4):1305-1315. 10.1109/TWC.2004.830852

8. Peel CB, Hochwald BM, Swindlehurst AL: A vector perturbation technique for near-capacity multiantenna multiuser communication - part I: channel inversion and regularization. IEEE Trans Commun 2005, 53: 195-202. 10.1109/TCOMM.2004.840638

9. Sung H, Lee S, Lee I: Generalized channel inversion methods for multiuser MIMO systems. IEEE Trans Commun 2009, 57(11):3489-3499.

10. Spencer QH, Swindlehurst AL, Haardt M: Zero-forcing method for downlink spatial multiplexing in multiuser MIMO channels. IEEE Trans Signal Process 2004, 52(2):461-471. 10.1109/TSP.2003.821107

11. Zu K, de Lamare RC: Low-complexity lattice reduction-aided regularized block diagonalization for MU-MIMO systems. IEEE Commun Lett 2012, 16(6):925-928.

12. Zu K, de Lamare RC, Haardt M: Generalized design of low-complexity block diagonalization type precoding algorithms for multiuser MIMO systems. IEEE Trans Commun 2013, 61(10):4232-4242.

13. Shen Z, Chen R, Andrews JG, Heath RW Jr, Evans BL: Low complexity user selection algorithms for multiuser MIMO systems with block diagonalization. IEEE Trans Signal Process 2006, 54(9):3658-3663.

14. Stankovic V, Haardt M: Generalized design of multiuser MIMO precoding matrices. IEEE Trans Wireless Commun 2008, 7: 953-961.

15. An J, Liu YA, Liu F: An Efficient Block Diagonalization Method for Multiuser MIMO Downlink. In 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet). China; 2012.

16. Wang H, Li L, Song L, Gao X: A linear precoding scheme for downlink multiuser MIMO precoding systems. IEEE Commun Lett 2011, 15(6):653-655.

17. Lee MH: Jacket Matrices - Construction and Its Application for Fast Cooperative Wireless Signal Processing. LAP LAMBERT Academic Publishing, Germany; 2012.

18. Golub GH, Van Loan CF: Matrix Computations. 3rd edition. The Johns Hopkins University Press, Baltimore and London; 1996.

19. Li W, Latva-aho M: An efficient channel block diagonalization method for generalized zero forcing assisted MIMO broadcasting systems. IEEE Trans Wireless Commun 2011, 10(3):739-744.

## Acknowledgements

This work was supported by the MEST 2012–002521 and Brain Korea 21 (BK21) Plus Project in 2014, National Research Foundation (NRF), Republic of Korea.

## Author information

### Corresponding author

Correspondence to Moon Ho Lee.

### Competing interests

The authors declare that they have no competing interests.
