Fast matrix inversion methods based on Chebyshev and Newton iterations for zero forcing precoding in massive MIMO systems
EURASIP Journal on Wireless Communications and Networking volume 2020, Article number: 34 (2020)
Abstract
In massive MIMO (mMIMO) systems, large matrix inversion is a challenging problem due to the huge number of users and antennas. Neumann series (NS) and successive over-relaxation (SOR) are two typical methods that solve such a problem in linear precoding. NS expands the inverse of a matrix into a series of matrix-vector multiplications, while SOR treats the same problem as a system of linear equations and solves it iteratively. However, the required complexity of both methods is still high. In this paper, four new joint methods are presented to achieve faster convergence and lower complexity in the matrix inversion used to determine linear precoding weights for mMIMO systems, where Chebyshev iteration (ChebI) and Newton iteration (NI) are investigated separately to speed up the convergence of NS and SOR. Firstly, a joint Chebyshev and NS method (ChebINS) is proposed not only to accelerate the convergence of NS but also to achieve more accurate inversion. Secondly, a new SOR-based approximate matrix inversion (SORAMI) is proposed to achieve a direct, simplified matrix inversion with convergence characteristics similar to the conventional SOR. Finally, two improved SORAMI methods, NISORAMI and ChebISORAMI, are investigated for further convergence acceleration, where the NI and ChebI approaches are combined with SORAMI, respectively. These four proposed inversion methods provide near-optimal bit error rate (BER) performance for the zero-forcing (ZF) case under uncorrelated and correlated mMIMO channel conditions. Simulation results verify that the proposed ChebINS has the highest convergence rate compared to the conventional NS with similar complexity. Similarly, ChebISORAMI and NISORAMI achieve faster convergence than the conventional SOR method. Ordered by convergence speed, the proposed methods rank as ChebISORAMI, NISORAMI, SORAMI, and then ChebINS; ChebINS converges slowest because NS converges more slowly than SOR.
Although ChebISORAMI has the fastest convergence rate, NISORAMI is preferable to ChebISORAMI due to its lower complexity and close inversion result.
Introduction
Massive MIMO (mMIMO) is one of the most promising technologies for the 5th generation (5G) communication systems [1]. Recent mMIMO applications include machine-type communications, drone communications, control circuits in nuclear reactors, and nuclear physics applications. Its channel hardening property mitigates the effect of noise and interference as the number of antennas increases [2]. Hence, linear precoding methods can approximately achieve optimal performance in mMIMO systems [3]. However, there are challenging problems in the practical implementation of mMIMO systems, such as the large matrix inversion resulting from the large number of users and antennas.
Large matrix inversion is an important practical issue that affects precoder design and performance. A good precoder depends on matrix inversion approximation characteristics such as low complexity and good approximation accuracy. Generally, precoding methods are divided into linear and nonlinear ones. Nonlinear precoding methods such as constant envelope (CE), dirty paper coding (DPC) [4], vector perturbation (VP), lattice-aided, and Tomlinson-Harashima precoding (THP) are unfriendly to hardware implementation due to their high complexity [5]. Hence, linear precoders such as matched filter (MF), zero forcing (ZF), regularized zero forcing (RZF), phased ZF (PZF), and minimum mean square error (MMSE) are favorable, although they need the inversion of the channel matrix containing all users [6]. Direct, iterative, and expansion methods are the three main categories for calculating the large matrix inverse in linear precoding. Direct methods suffer from high complexity as they depend mainly on factorizing the matrix to be inverted into a product of simpler matrices, as in QR and Cholesky decompositions [7]. Iterative methods belong to the family of linear equation solvers, such as the Richardson method [8], conjugate gradient (CG) method [9], successive over-relaxation (SOR) [10], symmetric successive over-relaxation (SSOR) [11], and Gauss-Seidel (GS) method [12]. They have acceptable performance in mMIMO systems. However, these approaches provide an indirect matrix inversion approximation, as they calculate a product containing the matrix inverse and the quadrature amplitude modulation (QAM) symbol vector. In addition, the matrix inverse is required separately for specific calculations such as sum rate computations and rapid matrix modifications [13, 21]: the matrix inverse can be directly updated (column added or column deleted) to save matrix inversion time and complexity. Hence, these methods incur extra complexity for such calculations because the symbol vector is entangled in the iteration.
Chebyshev iteration (ChebI) and Newton iteration (NI) provide fast convergence, while their complexity depends on the number of iterations involved [14, 15]. However, both iterative methods require a complex calculation of the initial input to ensure convergence. The third category, expansion methods, transfers the inverse of a matrix into a series of matrix-vector products, like the Neumann series (NS) [16]. Although NS has a slow convergence rate, it not only can approximate the matrix inverse separately but also owns a simple hardware implementation property [17, 18]. In [19], the authors utilize NI to achieve faster convergence than ordinary NS. This inspired us to replace the quadratic-order NI with the cubic-order ChebI to achieve not only more accurate inversion results but also faster convergence. NS and SOR are two recent research directions to reduce the complexity of matrix inversion in linear precoding. However, their convergence speed should be improved. This motivates us to speed up their convergence using the cubic-order ChebI.
Although a large precoding gain can be obtained by making use of a large number of antennas, interference becomes the dominant factor rather than additive noise. Hence, under these circumstances, the ZF precoder is a reasonable choice compared to other precoding methods, so our focus in this paper is on reducing the complexity of the ZF precoding technique for mMIMO systems. In this paper, four new joint methods are proposed to achieve faster convergence with reasonable complexity in the matrix inversion used to determine linear precoding weights for mMIMO systems, where the first iteration result of the Chebyshev iteration (ChebI) and Newton iteration (NI) approaches is employed to reconstruct both the NS and SOR methods. A high probability of convergence is achieved, which offers useful guidelines for practical mMIMO systems. The main contributions of this paper are fivefold.
– Firstly, we propose a new joint Chebyshev iteration and Neumann series (ChebINS) method that not only achieves faster convergence but also provides a more accurate matrix inversion approximation than previous NS methods.
– Secondly, we propose a new SOR-based approximate matrix inversion (SORAMI) method that directly approximates the matrix inverse by separating the QAM symbol vector from the whole iteration process. The new method, which is very useful for further calculations, achieves the same convergence rate as the SOR method with lower complexity.
– Thirdly, to further improve the convergence characteristics of SORAMI, we propose the joint NI and SORAMI method (NISORAMI), where we adopt one NI iteration to obtain an efficient searching direction for the following SORAMI iterations to achieve a fast convergence rate.
– Fourthly, another way to accelerate the convergence of SORAMI is to make use of the cubic-order ChebI instead of the quadratic-order NI. Hence, the joint ChebI and SORAMI method (ChebISORAMI) is the fourth proposed technique, achieving an even faster convergence rate.
– Finally, the above four proposed methods are compared with existing methods in order to prove their faster convergence with reasonable near-ideal ZF ^{Footnote 1} performance in the downlink (DL) of mMIMO systems. Based on these results, we discuss the effectiveness of the proposed approaches.
The rest of this paper is organized as follows. Section 2 discusses the system model, mMIMO channel model, and preliminaries about NS expansion, SOR, proposed SORAMI method, NI, and Chebyshev iterations. Section 3 describes the four new proposed joint matrix inversion methods. Section 4 presents computational complexity analysis. Simulation results are introduced in Section 5. Finally, Section 6 concludes the work.
Notations: Uppercase and lowercase boldface letters denote matrices and vectors, respectively. (.)^{T}, (.)^{H}, (.)^{−1}, (.)^{(n)}, and (.)^{†} denote transpose, conjugate transpose (Hermitian), inversion, the n^{th} iteration number, and pseudo inverse, respectively. C∼N(μ,σ^{2}I_{K}) denotes the circularly symmetric complex Gaussian distribution with mean μ and covariance matrix σ^{2}I_{K}, where I_{K} is the identity matrix of size K. ∥·∥_{1} and ∥·∥_{2} denote the 1-norm and 2-norm, respectively.
System model and preliminaries
In this section, we will carefully describe our system model followed by mMIMO channel model. Also, related matrix inversion approaches such as NS, SOR, NI, and ChebI are briefly introduced.
System model
Figure 1 shows a DL centralized mMIMO system with N antennas equipped at the base station (BS) serving K≪N single-antenna users [1]. If the DL transmitted signal vector after precoding is x∈C^{N×1}, the received signal vector y∈C^{K×1} for the K users can be expressed as:
where n∈C^{K×1} is the additive white Gaussian noise vector with zero mean and unit variance. ω is a normalization factor that determines the signal-to-noise power ratio (SNR), i.e., \(\mathrm {SNR}= \frac {\omega }{\sigma ^{2}} = \omega \) since σ^{2}=1, where σ^{2} denotes the additive noise variance. H∈C^{K×N} is the DL channel matrix^{Footnote 2}. Furthermore, H=[h_{1},h_{2},....,h_{K}]^{T}, where h_{k}∈C^{1×N} is the channel vector between the BS and the k^{th} user, modeled as an independent and identically distributed (i.i.d.) random vector. x is precoded using the ZF precoder and defined as:
where s∈C^{K×1} is the symbol vector of 64-QAM symbols from the K users [1], P∈C^{N×K} is the ZF precoding matrix, W∈C^{K×K} is the Gram matrix defined as HH^{H}, and β is a normalization parameter defined as \(\sqrt {\frac {K}{tr(\mathbf {W}^{-1})}}\) [22], where tr(W^{−1}) denotes the trace of W^{−1}. In this paper, we assume perfect channel state information (CSI) at the BS by utilizing time-domain training pilots [23]. In time division duplex (TDD) mMIMO systems, the BS uses the user pilots to estimate the uplink channel; the DL CSI is then obtained via the channel reciprocity property of TDD systems.
It is obvious from (2) that the main complexity for ZF precoding is the inversion of K×K matrix W. The Gram matrix W is Hermitian positive definite as in Eq. (3).
where u is an arbitrary K×1 nonzero vector.
The rows of the channel matrix H (the user channel vectors h_{k}) are asymptotically orthogonal, and thus H is a full-rank matrix [1]. u^{H}H equals the zero vector only when u is the zero vector. Hence, we have u^{H}H(u^{H}H)^{H}>0 for all nonzero vectors u, indicating that W is a positive definite matrix.
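The ZF precoding chain of Eqs. (1)-(2) can be sketched numerically as below, using exact inversion as the benchmark. This is a minimal illustration: the ±1 symbol vector stands in for the 64-QAM constellation, and the dimensions are demo assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 16, 128  # single-antenna users and BS antennas, K << N (demo sizes)

# i.i.d. Rayleigh channel: unit-variance complex Gaussian entries
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)

W = H @ H.conj().T                        # K x K Gram matrix, Hermitian positive definite
W_inv = np.linalg.inv(W)                  # exact inverse (the benchmark)
beta = np.sqrt(K / np.trace(W_inv).real)  # normalization parameter of Eq. (2)
P = beta * H.conj().T @ W_inv             # N x K ZF precoding matrix

s = (2 * rng.integers(0, 2, K) - 1).astype(complex)  # +/-1 stand-in for 64-QAM symbols
x = P @ s                                 # precoded transmit vector
y = H @ x                                 # noiseless receive side: beta * s
```

Since HP = βHH^{H}W^{−1} = βI_{K}, each user receives a scaled copy of its own symbol, which is the interference-free property that motivates ZF here.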
mMIMO channel model
This paper considers not only the uncorrelated Rayleigh channel but also spatially correlated ones. The elements of the uncorrelated channel, H_{un}∈C^{K×N}, are independent and identically distributed (i.i.d.) complex Gaussian random variables (RVs) with zero mean and unit variance. On the other hand, for the spatially correlated MIMO channel H_{co} [24], the Kronecker channel model [25], in which H_{co}∈C^{K×N}, can be written as:
where R_{rx}∈C^{K×K} and R_{tx}∈C^{N×N} are the correlation matrices for the receive and transmit antennas, respectively. Since we assume single-antenna users, R_{rx}=I [26]. Note that if R_{tx} also equals the identity matrix, the left-hand side of Eq. (4) reduces to the uncorrelated channel H_{un}. The (p, q) element of the exponentially correlated transmit correlation matrix R_{tx} is given as [25]:
where 0≤ζ≤1 denotes the correlation magnitude between adjacent transmit antennas and Ψ is the phase.
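As a sketch of Eqs. (4)-(5), the snippet below builds an exponential transmit correlation matrix and a Kronecker-model correlated channel with R_{rx}=I. The element formula (ζe^{jΨ})^{q−p} is one common reading of Eq. (5), assumed here since the equation body is not reproduced; the dimensions are demo assumptions.

```python
import numpy as np

def exp_corr_matrix(n, zeta, psi):
    """Exponential correlation: element (p, q) = (zeta * e^{j*psi})^(q-p) for q >= p,
    Hermitian-symmetric below the diagonal (one common reading of Eq. (5))."""
    a = zeta * np.exp(1j * psi)
    R = np.empty((n, n), dtype=complex)
    for p in range(n):
        for q in range(n):
            R[p, q] = a ** (q - p) if q >= p else np.conj(a ** (p - q))
    return R

N, K = 8, 4                                  # demo sizes
R_tx = exp_corr_matrix(N, zeta=0.1, psi=np.deg2rad(60))

# Kronecker model with R_rx = I (single-antenna users): H_co = H_un @ R_tx^(1/2)
rng = np.random.default_rng(1)
H_un = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
eigval, eigvec = np.linalg.eigh(R_tx)        # R_tx is Hermitian PSD
R_sqrt = eigvec @ np.diag(np.sqrt(np.clip(eigval, 0, None))) @ eigvec.conj().T
H_co = H_un @ R_sqrt
```

The eigendecomposition route to the matrix square root keeps the sketch numpy-only; ζ=0.1 and Ψ=60° match the simulation settings used later in the paper.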
Neumann series (NS) method
According to the Neumann series expansion [17], the required Gram matrix to be inverted, W∈C^{K×K}, is approximated as a sum of matrix polynomials.
where ϕ∈C^{K×K} is a preconditioning matrix and I_{K} is the K×K identity matrix. Assumptions on ϕ and the proposed approach to determine it are given in Section 3.1.
The main condition for Eq. (6) to hold is
where 0_{K} is a zero matrix of size K×K. For practical use, the inverse W^{−1} is approximated according to the value of L, the maximum number of iterations^{Footnote 3}
where L is the iteration number and \(\hat {\mathbf {W}}^{-1}\) is the approximated inverse.
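A minimal sketch of the truncated series in Eqs. (6)-(8), assuming the standard form \(\sum _{n=0}^{L}(\mathbf {I}_{K}-\boldsymbol {\phi }\mathbf {W})^{n}\boldsymbol {\phi }\) with ϕ=D^{−1} (the popular choice discussed in Section 3.1); the demo matrix sizes are assumptions.

```python
import numpy as np

def neumann_inverse(W, phi, L):
    """Truncated Neumann series sum_{n=0}^{L} (I - phi W)^n phi  (cf. Eqs. (6)-(8))."""
    A = np.eye(W.shape[0]) - phi @ W
    term = phi.copy()
    approx = phi.copy()
    for _ in range(L):
        term = A @ term          # next power of (I - phi W) applied to phi
        approx = approx + term
    return approx

# demo: well-conditioned Gram matrix (N >> K) with the usual phi = D^{-1}
rng = np.random.default_rng(2)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
phi = np.diag(1.0 / np.diag(W).real)   # D^{-1} of the Gram matrix
W_hat = neumann_inverse(W, phi, L=4)
```

Each extra term multiplies the residual by (I−ϕW), so accuracy improves geometrically whenever the spectral-radius condition of Eq. (7) holds.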
SOR method
The SOR method aims to iteratively solve the Gram matrix inversion problem as a linear equation Wg=s, where g is an unknown solution vector of size K×1. The matrix W is decomposed into
where D, L, and U=L^{H} are the diagonal, lower triangular, and upper triangular components of the Hermitian positive definite matrix W, respectively.
If Wg=s, i.e., g=W^{−1}s, the n^{th} estimation of W^{−1}s is obtained by substituting Eq. (9) into the SOR method equation as follows [10]:
where n defines the number of iterations, g^{(n)} is the n^{th} iterate of g, which also equals the n^{th} SOR estimate of W^{−1}s, and α is the relaxation parameter. The optimal relaxation parameter utilized in this paper, according to [10], equals
Note that SOR method computes a product that contains the matrix inverse, i.e., g^{(n)} is the n^{th} estimation of W^{−1}s.
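The elementwise SOR sweep of Eq. (10) can be sketched as below. The optimal relaxation parameter of Eq. (11) from [10] is not reproduced here; α=1.0 (the Gauss-Seidel special case) is used as a placeholder, and the demo sizes are assumptions.

```python
import numpy as np

def sor_solve(W, s, alpha, n_iter):
    """SOR sweeps for W g = s: g^{(n)} is the n-th estimate of W^{-1} s (cf. Eq. (10))."""
    K = W.shape[0]
    g = np.zeros(K, dtype=complex)
    for _ in range(n_iter):
        for i in range(K):
            # entries 0..i-1 of g were already updated in this forward sweep
            sigma = W[i, :] @ g - W[i, i] * g[i]
            g[i] = (1 - alpha) * g[i] + alpha * (s[i] - sigma) / W[i, i]
    return g

# demo on a well-conditioned Gram matrix (N >> K)
rng = np.random.default_rng(4)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
s = rng.standard_normal(K) + 1j * rng.standard_normal(K)
g = sor_solve(W, s, alpha=1.0, n_iter=30)
```

Note how the output is the vector g ≈ W^{−1}s, not W^{−1} itself: this is exactly the indirect-inversion property that SORAMI in Section 3.2 removes.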
Newton iteration (NI)
The Newton iteration method can be employed to calculate W^{−1} in an iterative way [14]. Assume that Z^{(0)} is the initial estimate of W^{−1} and
Hence, the (n+1)^{th} iterative estimate of W^{−1} using NI is obtained by substituting f(Z)=Z^{−1}−W into the NI function \(\mathbf {Z}^{({n}+1)}= \mathbf {Z}^{({n})} - \frac {f(\mathbf {Z}^{(n)})}{f^{\prime }(\mathbf {Z}^{({n})})}\), where f^{′}(·) is the first derivative of a function whose argument is a matrix, as defined in [14, 15]. The final NI formula that calculates the (n+1)^{th} estimate of W^{−1} is expressed as [14, 15]:
where n denotes the number of iterations. If n is large, Eq. (13) is converged to the Gram matrix inverse, i.e., W^{−1}.
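Assuming Eq. (13) takes the standard Newton-Schulz matrix form Z^{(n+1)} = Z^{(n)}(2I_{K} − WZ^{(n)}), which follows from the substitution above, a minimal sketch is:

```python
import numpy as np

def newton_inverse(W, Z0, n_iter):
    """Newton (Newton-Schulz) update Z <- Z (2I - W Z): the residual I - Z W is
    squared at every step, so convergence to W^{-1} is quadratic when ||I - Z0 W|| < 1."""
    I = np.eye(W.shape[0])
    Z = Z0.copy()
    for _ in range(n_iter):
        Z = Z @ (2 * I - W @ Z)
    return Z

# demo: initialize with D^{-1}, the usual choice for diagonally dominant Gram matrices
rng = np.random.default_rng(5)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
Z = newton_inverse(W, np.diag(1.0 / np.diag(W).real), n_iter=8)
```

Each iteration costs two matrix multiplications and one addition, matching the complexity count used in Section 4.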
Chebyshev iteration (ChebI)
Chebyshev iteration is a third-order convergence algorithm [15]. Similar to NI, substituting the function f(Z)=Z^{−1}−W into the three-term Chebyshev function \(\mathbf {Z}^{({n}+1)}=\mathbf {Z}^{({n})} - \frac {f(\mathbf {Z}^{(n)})}{f^{\prime }(\mathbf {Z}^{({n})})} - \frac {f^{\prime \prime }(\mathbf {Z}^{({n})})}{2 f^{\prime }(\mathbf {Z}^{({n})})} \left (\frac {f(\mathbf {Z}^{({n})})}{f^{\prime }(\mathbf {Z}^{({n})})}\right)^{2}\) yields the matrix inversion using ChebI, where f^{′′}(·) is the second derivative. Note that the third term of the Chebyshev update Z^{(n+1)} helps ChebI provide more accurate results than NI. The (n+1)^{th} Chebyshev estimate of W^{−1} is expressed as [15]:
If the number of iterations is sufficient, Eq. (14) converges to the matrix inverse W^{−1}.
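Assuming Eq. (14) takes the standard cubic matrix form Z^{(n+1)} = Z^{(n)}(3I − WZ^{(n)}(3I − WZ^{(n)})), which follows from the three-term function above, a sketch is:

```python
import numpy as np

def chebyshev_inverse(W, Z0, n_iter):
    """Cubic (third-order) Chebyshev-type update Z <- Z (3I - W Z (3I - W Z)):
    the residual I - Z W is cubed at every step (cf. Eq. (14))."""
    I = np.eye(W.shape[0])
    Z = Z0.copy()
    for _ in range(n_iter):
        T = W @ Z
        Z = Z @ (3 * I - T @ (3 * I - T))
    return Z

# demo with the same D^{-1} initialization used for NI
rng = np.random.default_rng(6)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
Z0 = np.diag(1.0 / np.diag(W).real)
Z = chebyshev_inverse(W, Z0, n_iter=5)
```

One step of this form satisfies I − Z^{(1)}W = (I − Z^{(0)}W)^{3}, the identity exploited in the Lemma 1 proof; the price over NI is one extra matrix multiplication and addition per iteration.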
Proposed methods
In this section, we discuss four proposed methods to speed up large matrix inversion. We start with the first proposal (i.e., ChebINS) and then move to the second proposal (i.e., SORAMI). To achieve further improvement, we also propose improved SORAMI methods as the third and fourth proposals (i.e., NISORAMI and ChebISORAMI).
Joint Chebyshev iteration and Neumann series method (ChebINS)
The initial NS value, ϕ in Eq. (6), greatly affects the convergence, so the method of selecting ϕ plays an important role in NS acceleration. There are three common choices of ϕ: two depend on special properties of the matrices, while the third obtains the initial value from other iterations such as NI and ChebI. The popular choice of ϕ is the inverse of the K×K diagonal matrix D whose entries are the main diagonal elements of the Gram matrix W, i.e., D^{−1} [17]. The matrix D^{−1} can be calculated as follows [18]:
where w_{k,k} is the k^{th} diagonal element of Gram matrix W.
The second choice of ϕ is \(\left (\frac {\mathbf {I}_{K}}{{N}+{K}}\right)\), which also represents a diagonal matrix [1]. This is because the largest and smallest eigenvalues of the Gram matrix W depend on N and K; as N and K grow, the eigenvalues of the Gram matrix converge to a fixed distribution [17]. The third choice is to utilize the first iteration output of NI, Z^{(1)} in Eq. (13) with n=0, to initialize NS [19]. The advantages of ChebI over NI, such as faster convergence and more accurate approximation, motivated us to initialize NS with the output of the first ChebI iteration instead: this not only provides a more accurate inversion approximation but also speeds up NS convergence.
In this paper, ChebI is applied first to provide a suitable ϕ that speeds up the convergence of NS. The main steps of the joint ChebINS approach to estimate W^{−1} are:
Step 1 Obtain the inverse of the diagonal matrix of the Gram matrix W, i.e., D^{−1}, as in Eq. (15).
Step 2 Apply one Chebyshev iteration (i.e., n=0 in Eq. (14)) with initial input Z^{(0)}=D^{−1} as follows:
Step 3 Apply the obtained first ChebI result, Z^{(1)}, as the initial value of the Neumann series as follows:
An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for a finite number of iterations.
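Steps 1-3 can be sketched end to end as below, under the same assumed matrix forms of Eqs. (14) and (17) used earlier (one cubic ChebI step from D^{−1}, then a truncated Neumann series seeded with Z^{(1)}); sizes are demo assumptions.

```python
import numpy as np

def chebi_ns_inverse(W, L):
    """ChebINS sketch: one cubic ChebI step from D^{-1} supplies phi = Z^{(1)},
    which then seeds a truncated Neumann series (Steps 1-3)."""
    K = W.shape[0]
    I = np.eye(K)
    D_inv = np.diag(1.0 / np.diag(W).real)   # Step 1, Eq. (15)
    T = W @ D_inv
    Z1 = D_inv @ (3 * I - T @ (3 * I - T))   # Step 2: one ChebI step (assumed form of Eq. (16))
    A = I - Z1 @ W                           # Step 3: Neumann series with phi = Z1 (Eq. (17))
    term, approx = Z1.copy(), Z1.copy()
    for _ in range(L):
        term = A @ term
        approx = approx + term
    return approx

rng = np.random.default_rng(7)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
W_hat = chebi_ns_inverse(W, L=3)
```

Because the ChebI step cubes the initial residual, each Neumann term now contracts by roughly the cube of the plain D^{−1} rate, which is the acceleration claimed for ChebINS.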
Lemma 1
For DL mMIMO systems, the Neumann series with initial value from the Chebyshev iteration, ϕ=Z^{(1)}, has a high probability of convergence when [16]
Proof
See Appendix A. □
Equation (18) has practical applications in mMIMO systems, as it relates the suitable number of BS antennas to the number of single-antenna users. For example, η=8 and η=16 produce two typical downlink mMIMO configurations, N×K=256×32 and 256×16 [12]. According to [16, 19], these values ensure a high probability of convergence of 0.999 due to the large values of η.
SORbased approximate matrix inversion method (SORAMI)
Our main idea is provided in the following lemma.
Lemma 2
W^{−1} can be approximated to R^{(n)} when n→∞ using iterative SOR method as follows:
where R^{(0)} is the initial input, chosen to be the inverse of the diagonal component, i.e., D^{−1}, and R^{(n)} is the n^{th} direct estimate of W^{−1}. An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for a finite number of iterations.
Proof
See Appendix B. □
The main steps of SORAMI, to directly estimate W^{−1}, are as follows:
Step 1 Calculate the initial input R^{(0)}=D^{−1} from Eq. (15).
Step 2 Apply the obtained R^{(0)} in the SORAMI update as in Eq. (19).
Since W is Hermitian positive definite, the SOR method is convergent [10]. Hence, as W^{−1}s is approximated by g^{(n)} (n→∞), W^{−1} can be approximated by R^{(n)} (n→∞). The SORAMI method based on Eq. (19) can thus be utilized to directly calculate W^{−1}. According to Eq. (26), it has the same convergence as the iterative SOR method.
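Under the standard matrix form of the SOR split (cf. Eq. (25) in Appendix B), SORAMI amounts to running the SOR update with the symbol vector s replaced by the identity I_{K}. A sketch, again with the placeholder α=1.0 rather than the optimal value of [10]:

```python
import numpy as np

def sorami_inverse(W, alpha, n_iter):
    """SORAMI sketch: SOR update applied with s replaced by I_K, so that
    R^{(n)} converges to W^{-1} directly (cf. Eqs. (19), (25))."""
    K = W.shape[0]
    D = np.diag(np.diag(W))
    M = D + alpha * np.tril(W, -1)                   # D + alpha*L
    Nmat = (1 - alpha) * D - alpha * np.triu(W, 1)   # (1-alpha)*D - alpha*U
    R = np.diag(1.0 / np.diag(W).real)               # R^{(0)} = D^{-1}
    I = np.eye(K)
    for _ in range(n_iter):
        R = np.linalg.solve(M, Nmat @ R + alpha * I)
    return R

# demo
rng = np.random.default_rng(8)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
R = sorami_inverse(W, alpha=1.0, n_iter=30)
```

Solving against the triangular factor M is cheap, and unlike the SOR sketch earlier the output R is the inverse itself, ready for sum-rate computations or fast inverse updates.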
Improved SORAMI methods
The convergence of SORAMI is accelerated by making use of the fast convergence that NI and ChebI exhibit at the beginning of the iteration process. Hence, ChebISORAMI and NISORAMI are discussed below, respectively.
Joint Chebyshev iteration and SORAMI method (ChebISORAMI)
The main procedures of the joint algorithm, to directly estimate W^{−1} using ChebISORAMI, are as follows:
Step 1 Apply one Chebyshev iteration with initial input Z^{(0)}=D^{−1} as in Eq. (16).
Step 2 Use the obtained first ChebI result, Z^{(1)}, in the SORAMI method as follows:
Since W is Hermitian positive definite, the ChebISORAMI method is convergent, as SORAMI has the same convergence as the traditional SOR method. Equation (20) calculates W^{−1} after initializing the SORAMI method with one iteration of ChebI, i.e., Z^{(n)}≈W^{−1}. As the number of iterations approaches infinity, i.e., n→∞, Eq. (20) converges to the exact matrix inverse W^{−1}. An approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained with a finite number of iterations.
Joint Newton iteration and SORAMI (NISORAMI)
Similar to the ChebISORAMI method, NISORAMI depends on applying one NI result as the initial input to SORAMI. The main steps, to estimate W^{−1} directly using NISORAMI, are as follows:
Step 1 Apply one Newton iteration with initial input Z^{(0)}=D^{−1} as follows:
Step 2 Apply the first NI result, Z^{(1)}, obtained from Step 1 to the SORAMI method, as in Eq. (20).
Similar to ChebISORAMI, NISORAMI is convergent, and W^{−1} can be approximated by Z^{(n)} (n→∞) resulting from Step 2. As before, an approximated solution, \(\hat {\mathbf {W}}^{-1}\), is obtained for a finite number of iterations. The main advantage of this method is its reduced complexity compared with the ChebISORAMI method. The next section discusses the complexity analysis of the proposed methods.
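Steps 1-2 can be sketched as below, assuming the Newton-Schulz form for the single NI step of Eq. (21) and the same SOR split as before (α=1.0 as a placeholder); ChebISORAMI is identical except that the cubic ChebI step is used instead.

```python
import numpy as np

def ni_sorami_inverse(W, alpha, n_iter):
    """NISORAMI sketch: one Newton-Schulz step from D^{-1} (assumed form of Eq. (21))
    seeds the SORAMI recursion of Eq. (20)."""
    K = W.shape[0]
    I = np.eye(K)
    D_inv = np.diag(1.0 / np.diag(W).real)
    R = D_inv @ (2 * I - W @ D_inv)          # Step 1: Z^{(1)}, one Newton iteration
    D = np.diag(np.diag(W))
    M = D + alpha * np.tril(W, -1)
    Nmat = (1 - alpha) * D - alpha * np.triu(W, 1)
    for _ in range(n_iter):                  # Step 2: SORAMI iterations from Z^{(1)}
        R = np.linalg.solve(M, Nmat @ R + alpha * I)
    return R

# demo
rng = np.random.default_rng(9)
K, N = 8, 128
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T
Z = ni_sorami_inverse(W, alpha=1.0, n_iter=20)
```

The single Newton step squares the initial residual before the linear SORAMI contraction takes over, which is the "efficient searching direction" described in the contributions.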
Complexity analysis
In this paper, we evaluate the computational complexity of the proposed methods in terms of the required number of complex multiplications, since these dominate the cost. The channel coherence interval T_{c}, defined as the product of the coherence time and coherence bandwidth, is taken into account for a fair complexity comparison. Two types of approaches can solve (2). The first type, which includes our four proposed methods, directly computes W^{−1} once per channel coherence interval. Thus, W^{−1} can be calculated regardless of T_{c}, although the auxiliary processing it requires increases the total complexity as T_{c} increases. The other type calculates the precoding weight recursively as a product W^{−1}s, as in the SSOR method; its overall complexity grows with T_{c} (i.e., with the number of symbols per T_{c}).
The complexity of ZF precoding within T_{c} is O(K^{3}+T_{c}NK). The NS complexity for different initial ϕ values and more than two iterations (i.e., n>2) is O(K^{3}), comparable to exact matrix inversion. NS involves only matrix multiplications and additions, which are favorable in hardware as no divisions are required [1, 18]. From Eq. (14), one ChebI requires two matrix additions and three matrix multiplications, whereas one NI, according to Eq. (13), requires one matrix addition and one matrix multiplication fewer. When ϕ=D^{−1} or \(\boldsymbol {\phi }=\frac {\mathbf {I}_{K}}{{N}+{K}}\), the complexity of Eq. (8) is O(K^{2}) for the first iteration (i.e., n=1) and O(K^{3}) for further iterations (i.e., n≥2). Note that for ChebINS, the complexity increases at i=2 due to the extra multiplications incurred by applying one ChebI.
For the SORAMI method, the complexity is O(K^{2}+T_{c}NK), as only two matrix multiplications are required. This means that SORAMI converges faster than NS and also has lower complexity, especially at large iteration numbers. The computational complexity of NISORAMI is slightly lower than that of ChebISORAMI, by one matrix multiplication and one matrix addition.
The overall complexities of the proposed methods, in addition to NS, NINS [19], and SSOR [11], are shown in Table 1, considering the channel coherence interval T_{c}. Small T_{c} values do not greatly affect the complexity of our proposed methods, because it is defined as complexity per number of symbols during T_{c}, while small T_{c} greatly reduces the SSOR method complexity. For large T_{c} values, if the product T_{c}NK is larger than K^{3}, the complexity increases with T_{c}; otherwise, the effect is small compared to the K term. SORAMI complexity is lower than that of SSOR, and at the same time it directly computes W^{−1}. The ChebINS complexity of O(K^{3}) is close to that of traditional NS methods. Note that the SSOR method estimates the matrix inversion using two SOR iterations in the forward and reverse orders. Also, SSOR calculates the matrix inverse indirectly, i.e., as W^{−1}s, while the other methods compute it directly, which is highly recommended for fast matrix inverse updates [21]. The proposed SORAMI method has the lowest complexity compared to the other methods. Initializing SORAMI with either NI or ChebI slightly increases its complexity but provides faster convergence, achieving close inversion results at low iteration counts.
Results and discussion
To evaluate the effect of the proposed methods, we conducted computer simulations. The system model is the same as in Section 2.1. In the three proposed methods that use them (i.e., ChebINS, ChebISORAMI, and NISORAMI), the initial value of the Newton and Chebyshev iterations is the inverse of the diagonal component of W. The Frobenius norm error and bit error rate (BER) are used as performance metrics. An uncoded system is assumed, and the average BER over all users is calculated during simulation. The MSE is defined as follows.
where W^{−1} and \(\hat {\mathbf {W}}^{-1}\) are the ideal inverse of the Gram matrix and the approximated solution by the three above-mentioned proposed methods, respectively. ZF precoding with exact matrix inversion of W is added to our results as the benchmark. Two configurations, N×K=256×32 and N×K=128×16, are considered. The utilized modulation scheme is 64-QAM. The parameters of the correlated channel model are set to ζ=0.1 and a Ψ=60^{∘} phase shift.
Figure 2 shows the Monte Carlo simulation results for the Frobenius norm error between the exact Gram matrix inverse and its approximation against the number of BS antennas, N, for the NINS, ChebINS, SORAMI, NISORAMI, and ChebISORAMI methods under uncorrelated channel conditions after 10,000 MC trials, for the second, third, and fourth iterations, respectively. The MSE is plotted against N to measure the inversion error of each proposed scheme. Our error calculations neglect the modulation effect, as our main focus is the error resulting from the precoding matrix inversion approximation. At the second iteration, the error of the SORAMI method is the largest, followed by the NINS method, which has the largest 2-norm error at the following two iterations. According to Lemma 1, when N is known, the number of users K can be easily calculated so that there is a high probability of convergence of the inversion. At small N values, the error decreases because of the small matrix inversion dimensions. The figure illustrates the merits of initializing NS and SORAMI with ChebI. The three-term Chebyshev iteration is more accurate than NI, in spite of a computational complexity increased by one matrix addition and one multiplication. Also, because SORAMI convergence is faster than NS convergence, the MSE of the three SORAMI-based methods is lower than that of the ChebINS and NINS methods.
For ease of illustration and discussion, the next three figures divide the results into three parts. Figure 3 compares the proposed ChebINS with other NS-based methods. Figure 4 does the same for SORAMI, ChebISORAMI, and NISORAMI against SOR-based methods. Finally, Fig. 5 compares all four proposed methods with each other. Figure 3a shows the BER against SNR of NINS, diagonal-based NS, the new ChebINS, and NS with initial \(\frac {I_{K}}{{N+K}}\) under uncorrelated channel conditions with N×K=128×16 at the second iteration. Figure 3b and c are for the third and fourth iterations, respectively. From the three subfigures, the new ChebINS algorithm has superior performance close to ZF, followed by the NINS method, at the third and fourth iterations, and performs close to NINS at the second iteration. Figure 3d, e, and f show the same analysis under correlated channel conditions with ζ=0.1 and a 60^{∘} phase shift for the second, third, and fourth iterations, respectively. It is worth noting that at the second iteration, i.e., Fig. 3a, NINS [19] converges slightly faster than ChebINS, but the reverse occurs under correlated channel conditions, i.e., Fig. 3d. Their performance is still not close to the optimal ZF; hence, their matrix inversion results lack accuracy due to the low number of iterations. At the third and fourth iterations, i.e., Fig. 3b, c, e, and f, ChebINS is the closest method to optimal performance. Because NS has a slow convergence rate, it requires more than two iterations for a more accurate matrix inversion. Under these conditions, the new ChebINS method therefore gains superior performance over the other NS approaches, confirming its fastest convergence.
Figure 4 shows, for the second and fourth iterations, the BER against SNR of the newly proposed SORAMI, NISORAMI, ChebISORAMI, and SSOR with N×K=256×32 under the uncorrelated channel (Fig. 4a and b) and the correlated channel (Fig. 4c and d), respectively. From the figure, the new ChebISORAMI algorithm has superior performance close to ZF, followed by the NISORAMI method. The results at the second iteration, i.e., Fig. 4a and c, indicate that both ChebISORAMI and NISORAMI converge quickly to the optimal performance; at the fourth iteration, all methods perform close to ZF.
Figure 5 presents a performance comparison among the newly proposed methods under uncorrelated channels (Fig. 5a and b) and correlated channels (Fig. 5c and d) for the second and fourth iterations, with N=128 and K=16, respectively. In the correlated channel results, the BER error floors of the proposed methods show a good convergent trend, similar to the uncorrelated channels. ChebISORAMI has much better performance than the other methods. SORAMI convergence is faster than NS; hence, SORAMI-based methods provide accurate results at lower iteration counts. Also, at low iteration counts, we recommend utilizing NISORAMI rather than ChebISORAMI, as it gives close results with reduced complexity. ChebINS has lower performance due to the slower convergence of NS compared to the SOR method. However, ChebINS remains attractive for designers due to the ease of NS hardware implementation. Also, our proposed methods are more robust to channel correlation than the existing NS and SOR methods.
Conclusion
In this paper, we have investigated the slow convergence of both the NS and SOR methods in linear precoding. For this purpose, we have proposed four joint methods to calculate ZF linear precoding weights for mMIMO systems, i.e., ChebINS, SORAMI, NISORAMI, and ChebISORAMI. ChebINS has been proposed to speed up the convergence of NS and also to give a more accurate approximation. Unlike the traditional SOR method, the SORAMI method directly calculates the matrix inverse without multiplying by the symbol vector. The joint ChebISORAMI and NISORAMI methods are based on applying the fast-converging Chebyshev/Newton iteration as the initial step of the SORAMI method. Simulation results illustrate that the proposed methods give not only accurate results but also fast convergence under both uncorrelated and correlated channel conditions. The ChebINS method is the fastest among NS-based methods. NISORAMI is preferable to ChebISORAMI, as it achieves performance close to ChebISORAMI with lower complexity. Although NS convergence is slower than SORAMI, NS is preferable for hardware implementation; hence, ChebINS is important for accelerating the convergence of such a prominent method. Further investigation of the proposed methods with other linear precoders, such as MMSE and RZF, is left for future work.
Appendix A Proof of Lemma 1
First, we will prove the convergence of ChebINS, i.e., of \(\sum _{{n}=0}^{\infty }{\left (\mathbf {I}_{K} - \mathbf {Z}^{(1)} \mathbf {W}\right)^{n} \mathbf {Z}^{(1)}}\); the lemma then follows from this convergence.
According to Eq. (6), the condition for convergence of ChebINS is \({\lim }_{n\to \infty } (\mathbf {I}_{K} - \mathbf {Z}^{(1)} \mathbf {W})^{n} =\mathbf {0}_{K} \).
Let matrix A=I_{K}−Z^{(1)}W.
This holds if and only if ρ(A)<1, where ρ(A) is the spectral radius of A, i.e., the largest absolute value of the eigenvalues of A, and λ_{i}(A) denotes the i^{th} eigenvalue of A, 1≤i≤K. Equivalently, the series converges when |λ_{i}(A)|<1 for all i.
Substituting the value of Z^{(1)} from Eq. (16) into the matrix A yields
Let B=I_{K}−D^{−1}W; then \(\sum _{n=0}^{\infty }{(\mathbf {I}_{K}-\mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) converges if |λ_{i}(B)|<1, which holds with high probability, as in Lemma 1 of [16].
Since A=B^{3} and λ_{i}(A)=λ_{i}(B)^{3}, the series \(\sum _{{n}=0}^{\infty }{(\mathbf {I}_{K}-\mathbf {Z}^{(1)} \mathbf {W})^{n} \mathbf {Z}^{(1)}}\) converges if and only if \(\sum _{n=0}^{\infty }{(\mathbf {I}_{K}-\mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) converges.
This proves the convergence of ChebINS.
From Eqs. (11)–(17) in [16], a high-probability convergence condition for \(\sum _{{n}=0}^{\infty }{(\mathbf {I}_{K}-\mathbf {D}^{-1} \mathbf {W})^{n} \mathbf {D}^{-1}}\) is \(\eta > \frac {1}{(\sqrt {2}-1)^{2}}\), i.e., η>5.83. Thus, since ChebINS converges whenever this series converges, the same high-probability convergence condition η>5.83 applies to ChebINS [16]. This completes the proof.
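The argument above can be checked numerically. The sketch below constructs \(\mathbf{Z}^{(1)}\) by solving \(\mathbf{I}_{K}-\mathbf{Z}^{(1)}\mathbf{W}=\mathbf{B}^{3}\) for \(\mathbf{Z}^{(1)}\), which gives \(\mathbf{Z}^{(1)}=(3\mathbf{I}_{K}-3\mathbf{D}^{-1}\mathbf{W}+(\mathbf{D}^{-1}\mathbf{W})^{2})\mathbf{D}^{-1}\); we take this as the value given by Eq. (16), an assumption since Eq. (16) itself is not reproduced in this appendix. With η = N/K = 8 > 5.83, it confirms ρ(A) = ρ(B)³ < 1 and compares the truncated series around Z^(1) with plain NS using the same number of terms:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 128, 16                        # antennas, users: eta = N/K = 8 > 5.83
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T                    # K x K Gram matrix
Winv = np.linalg.inv(W)               # exact inverse, for error measurement only

I = np.eye(K)
D_inv = np.diag(1.0 / np.diag(W))     # inverse of the diagonal part of W
B = I - D_inv @ W                     # iteration matrix of plain NS
X = D_inv @ W
Z1 = (3 * I - 3 * X + X @ X) @ D_inv  # chosen so that I - Z1 @ W = (I - D_inv @ W)^3
A = I - Z1 @ W

rho_B = max(abs(np.linalg.eigvals(B)))
rho_A = max(abs(np.linalg.eigvals(A)))  # equals rho_B**3, hence < 1 whenever rho_B < 1

# Truncated series sum_{n=0}^{L-1} A^n Z1 versus plain NS sum_{n=0}^{L-1} B^n D^{-1}
L_terms = 4
term, approx = Z1.copy(), np.zeros_like(Z1)
term_p, approx_p = D_inv.copy(), np.zeros_like(Z1)
for _ in range(L_terms):
    approx, term = approx + term, A @ term
    approx_p, term_p = approx_p + term_p, B @ term_p
err = np.linalg.norm(approx - Winv) / np.linalg.norm(Winv)
err_plain = np.linalg.norm(approx_p - Winv) / np.linalg.norm(Winv)
```

Since every eigenvalue of B is cubed in A, the ChebINS truncation error decays roughly three times as fast per term as plain NS, which is the acceleration the lemma quantifies.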
Appendix B Proof of Lemma 2
Substituting R^{(0)}=D^{−1} into g^{(0)}=D^{−1}s yields the following:
Hence, for the k-th iteration, g^{(k)}=R^{(k)}s. Substituting this result into the SOR method, i.e., Eq. (10), R^{(k+1)} is obtained as
Hence, based on the mathematical induction, we can obtain the following:
Equation (26) ends the proof.
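Lemma 2 states that the SOR recursion can be run directly on a matrix, with the identity I_K in place of the symbol vector s, so that R^(k) itself converges to W^{−1}. A minimal numerical sketch, assuming the standard SOR splitting W = D + L + U (the paper's Eq. (10) is not reproduced in this appendix) and an illustrative relaxation factor ω = 1.2:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 128, 16
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T                    # Hermitian positive definite Gram matrix
Winv = np.linalg.inv(W)               # reference only

# Standard SOR splitting W = D + L + U (strictly lower / strictly upper parts)
D = np.diag(np.diag(W))
L_mat = np.tril(W, -1)
U_mat = np.triu(W, 1)
omega = 1.2                           # illustrative relaxation factor, 0 < omega < 2
I = np.eye(K)
M_sor = D + omega * L_mat
N_sor = (1 - omega) * D - omega * U_mat

# SORAMI: run the SOR recursion on a matrix, with I_K in place of the
# symbol vector s, starting from R^(0) = D^{-1}; then R^(k) -> W^{-1}
R = np.diag(1.0 / np.diag(W))
errs = []
for _ in range(4):
    R = np.linalg.solve(M_sor, omega * I + N_sor @ R)
    errs.append(np.linalg.norm(R - Winv) / np.linalg.norm(Winv))
```

At the fixed point, (M_sor − N_sor)R = ωI_K reduces to ωWR = ωI_K, i.e., R = W^{−1}, which is exactly the limit the mathematical induction in this appendix establishes.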
Notes
 1.
In ideal ZF, the exact Gram matrix inverse is used. Thus, this result corresponds to the case where the Gram matrix inverse is obtained with sufficient accuracy.
 2.
Two channel models are considered in the next subsection: (i) an uncorrelated Rayleigh channel, denoted H_{un}∈C^{K×N}, with Gaussian entries of zero mean and unit variance, and (ii) a spatially correlated channel, denoted H_{co}∈C^{K×N}, as in [23].
 3.
Mainly, L is the number of expanded terms. However, in this paper, L equals the number of iterations (i), so as to compare NS convergence with the other methods. The maximum value of L is 4, i.e., the 4th iteration, as it provides a good tradeoff between complexity and performance [20].
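As an illustration of Note 3, the following sketch compares the accuracy of the L-term NS approximation of W^{−1} for L = 1, …, 4, over an uncorrelated Rayleigh channel with N = 128 and K = 16 as in the simulations:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 128, 16                        # as in the simulation setup
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = H @ H.conj().T                    # Gram matrix
Winv = np.linalg.inv(W)               # exact inverse, for error measurement only

D_inv = np.diag(1.0 / np.diag(W))
B = np.eye(K) - D_inv @ W

# L-term Neumann series: W^{-1} ~ sum_{n=0}^{L-1} B^n D^{-1}
errors = {}
term, approx = D_inv.copy(), np.zeros_like(D_inv)
for L_terms in range(1, 5):
    approx = approx + term
    term = B @ term
    errors[L_terms] = np.linalg.norm(approx - Winv) / np.linalg.norm(Winv)
```

The relative error shrinks with each added term, and L = 4 already gives a usable approximation, consistent with the complexity/performance tradeoff cited from [20].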
Abbreviations
5G: 5th generation
BER: Bit error rate
BS: Base station
CE: Constant envelope
CG: Conjugate gradient
ChebI: Chebyshev iteration
ChebINS: Joint Chebyshev and NS method
ChebISORAMI: Joint Chebyshev and SOR-based approximate matrix inversion
CSI: Channel state information
DL: Downlink
DPC: Dirty paper coding
GS: Gauss-Seidel
i.i.d.: Independent and identically distributed
MF: Matched filter
mMIMO: Massive MIMO
MMSE: Minimum mean square error
MSE: Mean square error
NI: Newton iteration
NINS: Joint Newton iteration and NS method
NISORAMI: Joint Newton iteration and SOR-based approximate matrix inversion
NS: Neumann series
PZF: Phased zero forcing
QAM: Quadrature amplitude modulation
RV: Random variable
RZF: Regularized zero forcing
SNR: Signal-to-noise ratio
SOR: Successive over relaxation
SORAMI: SOR-based approximate matrix inversion
SSOR: Symmetric successive over relaxation
TDD: Time division duplex
THP: Tomlinson-Harashima precoding
VP: Vector perturbation
ZF: Zero forcing
References
1. F. Rusek, et al., Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Signal Process. Mag. 30(1), 40–60 (2013). https://doi.org/10.1109/MSP.2011.2178495
2. T. Marzetta, E. Larsson, H. Yang, H. Ngo, Fundamentals of Massive MIMO (Cambridge University Press, 2016). https://doi.org/10.1017/CBO9781316799895
3. E. Björnson, J. Hoydis, L. Sanguinetti, Massive MIMO networks: spectral, energy, and hardware efficiency. Found. Trends Signal Process. (2018). https://doi.org/10.1561/2000000093
4. M. Costa, Writing on dirty paper (Corresp.). IEEE Trans. Inf. Theory 29(3), 439–441 (1983)
5. M. Mazrouei-Sebdani, W. A. Krzymień, J. Melzer, Massive MIMO with nonlinear precoding: large-system analysis. IEEE Trans. Veh. Technol. 65(4), 2815–2820 (2016). https://doi.org/10.1109/TVT.2015.2425884
6. N. Fatema, G. Hua, Y. Xiang, D. Peng, I. Natgunanathan, Massive MIMO linear precoding: a survey. IEEE Syst. J., 1–12 (2017). https://doi.org/10.1109/JSYST.2017.2776401
7. Å. Björck, Numerical Methods in Matrix Computations (Springer, 2015). https://doi.org/10.1007/978-3-319-05089-8
8. X. Gao, L. Dai, Y. Ma, Z. Wang, Low-complexity near-optimal signal detection for uplink large-scale MIMO systems. Electron. Lett. 50(18), 1326–1328 (2014). https://doi.org/10.1049/el.2014.0713
9. B. Yin, M. Wu, J. R. Cavallaro, C. Studer, Conjugate gradient-based soft-output detection and precoding in massive MIMO systems, in IEEE Global Communications Conference (GLOBECOM), 3696–3701 (2014). https://doi.org/10.1109/GLOCOM.2014.7037382
10. T. Xie, Q. Han, H. Xu, Z. Qi, W. Shen, A low-complexity linear precoding scheme based on SOR method for massive MIMO systems, in 81st IEEE Vehicular Technology Conference (VTC Spring), 1–5 (2015). https://doi.org/10.1109/VTCSpring.2015.7145618
11. T. Xie, L. Dai, X. Gao, X. Dai, Y. Zhao, Low-complexity SSOR-based precoding for massive MIMO systems. IEEE Commun. Lett. 20(4), 744–747 (2016). https://doi.org/10.1109/LCOMM.2016.2525807
12. X. Gao, L. Dai, J. Zhang, S. Han, C.-L. I, Capacity-approaching linear precoding with low-complexity for large-scale MIMO systems, in IEEE International Conference on Communications (ICC), 1577–1582 (2015). https://doi.org/10.1109/ICC.2015.7248549
13. L. Shao, Y. Zu, Approaches of approximating matrix inversion for zero-forcing precoding in downlink massive MIMO systems. Wirel. Netw. (2017). https://doi.org/10.1007/s11276-017-1496-z
14. C. Tang, C. Liu, L. Yuan, Z. Xing, High precision low complexity matrix inversion based on Newton iteration for data detection in the massive MIMO. IEEE Commun. Lett. 20(3), 490–493 (2016). https://doi.org/10.1109/LCOMM.2015.2514281
15. C. Zhang, Z. Li, L. Shen, F. Yan, M. Wu, X. Wang, A low-complexity massive MIMO precoding algorithm based on Chebyshev iteration. IEEE Access 5, 22545–22551 (2017). https://doi.org/10.1109/ACCESS.2017.2760881
16. D. Zhu, B. Li, P. Liang, On the matrix inversion approximation based on Neumann series in massive MIMO systems, in IEEE International Conference on Communications (ICC), 1763–1769 (2015). https://doi.org/10.1109/ICC.2015.7248580
17. H. Prabhu, J. Rodrigues, O. Edfors, F. Rusek, Approximative matrix inverse computations for very-large MIMO and applications to linear precoding systems, in IEEE Wireless Communications and Networking Conference (WCNC), 2710–2715 (2013). https://doi.org/10.1109/WCNC.2013.6554990
18. A. Thanos, Algorithms and Hardware Architectures for Matrix Inversion in Massive MIMO Uplink Data Detection, M.Sc. thesis (University of Patras, 2017)
19. L. Shao, Y. Zu, Joint Newton iteration and Neumann series method of convergence accelerating matrix inversion approximation in linear precoding for massive MIMO systems. Math. Probl. Eng. (2016). https://doi.org/10.1155/2016/1745808
20. M. Wu, B. Yin, G. Wang, C. Dick, J. R. Cavallaro, C. Studer, Large-scale MIMO detection for 3GPP LTE: algorithms and FPGA implementations. IEEE J. Sel. Top. Signal Process. 8(5), 916–929 (2014). https://doi.org/10.1109/JSTSP.2014.2313021
21. F. Rosário, F. A. Monteiro, A. Rodrigues, Fast matrix inversion updates for massive MIMO detection and precoding. IEEE Signal Process. Lett. 23(1), 75–79 (2016). https://doi.org/10.1109/LSP.2015.2500682
22. L. Lu, G. Y. Li, A. L. Swindlehurst, A. Ashikhmin, R. Zhang, An overview of massive MIMO: benefits and challenges. IEEE J. Sel. Top. Signal Process. 8(5), 742–758 (2014). https://doi.org/10.1109/JSTSP.2014.2317671
23. T. L. Marzetta, Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Trans. Wirel. Commun. 9(11), 3590–3600 (2010). https://doi.org/10.1109/TWC.2010.092810.091092
24. L. Dai, X. Gao, X. Su, S. Han, C.-L. I, Z. Wang, Low-complexity soft-output signal detection based on Gauss-Seidel method for uplink multiuser large-scale MIMO systems. IEEE Trans. Veh. Technol. 64(10), 4839–4845 (2015). https://doi.org/10.1109/TVT.2014.2370106
25. E. Godana, T. Ekman, Parametrization based limited feedback design for correlated MIMO channels using new statistical models. IEEE Trans. Wirel. Commun. 12(10), 5172–5184 (2013). https://doi.org/10.1109/TWC.2013.092013.130045
26. C. He, R. D. Gitlin, Limiting performance of massive MIMO downlink cellular systems, in Information Theory and Applications Workshop (ITA), 1–6 (2016). https://doi.org/10.1109/ITA.2016.7888139
Acknowledgements
Our sincere thanks to MOHE and Center for JapanEgypt Cooperation in Science and Technology, Kyushu University, Japan, for their guidance, support, and encouragement.
Funding
This research was supported in part by the JSPS KAKENHI (JP17K06427) and in part by the Egyptian Ministry of Higher Education (MoHE).
Author information
Affiliations
Contributions
Authors’ contributions
Both authors contributed to the design and analysis of the research, to the simulation results, and to the writing of the manuscript. They also discussed the results and contributed to the final manuscript. Both authors read and approved the final manuscript.
Authors’ information
Sherief Hashima was born in El-Santa, Gharbiya, Egypt, in 1983. He received his B.Sc. and M.Sc. degrees in Electronics and Communication Engineering (ECE), with class honors, in 2004 and 2010 from Tanta University and Menoufiya University, respectively. He obtained his Ph.D. degree from Egypt-Japan University of Science and Technology (E-JUST), Alexandria, Egypt, in 2014. Since 2014, he has been an assistant professor at the Engineering and Scientific Equipment Department, Nuclear Research Center (NRC), Egyptian Atomic Energy Authority (EAEA), Egypt. From January to June 2018, he was a visiting researcher at the Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University. Since July 2019, he has been a postdoctoral researcher in the Computational Learning Theory Team, RIKEN AIP, Japan. His research interests include wireless communications, machine learning, online learning, 5G systems, image processing, nuclear instrument design, and the Internet of Things.
Osamu Muta received a B.E. degree from Ehime University, Ehime, Japan, in 1996, and M.E. and Ph.D. degrees from Kyushu University, Fukuoka, Japan, in 1998 and 2001, respectively. In 2001, he joined the Graduate School of Information Science and Electrical Engineering, Kyushu University, as an assistant professor. Since 2010, he has been an associate professor at the Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University. His research interests include signal transmission processing techniques for high-speed wireless communications and power-line communications, and nonlinear distortion compensation techniques for high-power amplifiers. He received the 2005 Active Research Award for excellent presentation from the IEICE Radio Communication Systems technical committee. Dr. Muta is a senior member of IEICE and a member of IEEE.
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
†Equal contributors
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Hashima, S., Muta, O. Fast matrix inversion methods based on Chebyshev and Newton iterations for zero forcing precoding in massive MIMO systems. J Wireless Com Network 2020, 34 (2020). https://doi.org/10.1186/s13638-019-1631-x
Received:
Accepted:
Published:
Keywords
 Massive MIMO
 Matrix inversion
 Neumann series
 Successive over relaxation
 Chebyshev iteration
 Newton iteration