Massive MIMO with multicell MMSE processing: exploiting all pilots for interference suppression
 Xueru Li^{1},
 Emil Björnson^{2},
 Erik G. Larsson^{2},
 Shidong Zhou^{1} and
 Jing Wang^{1}
https://doi.org/10.1186/s1363801708792
© The Author(s) 2017
Received: 11 August 2016
Accepted: 28 March 2017
Published: 26 June 2017
Abstract
A new state-of-the-art multicell minimum mean square error (MMMSE) scheme is proposed for massive multiple-input multiple-output (MIMO) networks, which includes an uplink MMSE detector and a downlink MMSE precoder. Contrary to conventional single-cell schemes that suppress interference using only channel estimates for intracell users, our scheme shows the optimal way to suppress both intracell and intercell interference instantaneously by fully utilizing the available pilot resources. Specifically, let K and B denote the number of users per cell and the number of orthogonal pilot sequences in the network, respectively, where β=B/K is the pilot reuse factor. Our scheme utilizes all B channel directions that can be estimated locally at each base station, to actively suppress both intracell and intercell interference. Our scheme is practical and general, since power control, imperfect channel estimation, and arbitrary pilot allocation are all accounted for. Simulations show that significant spectral efficiency (SE) gains are obtained over the conventional single-cell MMSE scheme and the multicell zero-forcing (ZF) scheme. Furthermore, large-scale approximations of the uplink and downlink signal-to-interference-and-noise ratios (SINRs) are derived, which are tight in the large-system limit. These approximations are easy to compute and very accurate even for small system dimensions. Using these SINR approximations, a low-complexity power control algorithm is further proposed to maximize the sum SE.
1 Introduction
Multiuser multiple-input multiple-output (MU-MIMO) communication has drawn considerable interest in recent years. By scheduling multiple users on the same time-frequency resource, the spatial degrees of freedom offered by multiple antennas can be exploited to focus signals on intended receivers, reduce interference, and thereby increase the system throughput [1–6]. These features have motivated the incorporation of MU-MIMO technology into recent and evolving wireless standards such as 4G LTE-Advanced [7].
Massive MU-MIMO is an emerging 5G technology that scales up MU-MIMO by orders of magnitude [8, 9]. The idea is to employ an array comprising a hundred, or more, antennas at the base station (BS) and serve tens of users simultaneously per cell. Compared to contemporary cellular systems, the system throughput can be drastically increased without consuming extra bandwidth [7–9]. The uplink and downlink transmit power can also be reduced by an order of magnitude since the phase-coherent processing provides a comparable array gain [10]. In the limit of an infinite number of antennas, intracell interference and uncorrelated noise can be averaged out by using simple coherent precoders and detectors, and the main performance limitations are pilot contamination and the distortions from hardware impairments [8, 11].
In the uplink reception and downlink transmission, the most common processing schemes are matched filtering (MF), zero-forcing (ZF), and minimum mean square error (MMSE) processing, where the latter is referred to as single-cell MMSE (SMMSE) in this work.^{1} A key characteristic of these schemes is that the BS only utilizes the instantaneous realizations of the channels to its own intracell users when creating the precoders/detectors, while users in other cells are either neglected or only considered based on their long-term statistics [12]. This is why we refer to MF, ZF, and SMMSE as single-cell schemes. In the coordinated multipoint (CoMP) literature, there also exist multicell schemes that exploit the instantaneous channel realizations of the users in all cells; see [13] for an overview. However, there is no scalable solution for estimating all these channel realizations in a large system.
Massive MIMO addresses the channel estimation issue by operating in time-division duplex (TDD) mode and requiring only uplink pilots for channel estimation. Hence, the pilot overhead scales linearly with the number of users, instead of the number of BS antennas, which allows for adding antennas without affecting the pilot overhead [14]. The BS first listens to the uplink pilot signaling from its own cell, estimates the K intracell channels, and then constructs its precoders/detectors based on these channel estimates to mitigate the intracell interference [12, 15–17]. In principle, the BS can also estimate and utilize the channels from users in neighboring cells, but the channel estimates can be very unreliable due to pilot contamination. As shown in [17], the gains are marginal in the baseline scenario with uncorrelated Rayleigh fading channels and every pilot being reused in every cell, and a similar conclusion is drawn in [18].
In this work, we explore multicell scenarios where the pilot signals are not reused in every cell. Let B denote the number of orthogonal pilot sequences and K denote the number of users in each cell. The pilot reuse factor is β=B/K≥1, which implies that 1/β of the cells use a particular pilot sequence. In this case, a BS can estimate the channels to intercell users more reliably by utilizing the B−K pilots that are not used in its own cell. In our previous work [19], we used these estimates to propose a multicell ZF detector (referred to as full-pilot ZF detector in [19]) to cancel interference from neighboring cells. Unfortunately, the gains over the single-cell schemes were marginal, partly due to the loss in array gain of B in multicell ZF, instead of K as with single-cell ZF. Therefore, in this work, we derive and analyze the uplink multicell MMSE (MMMSE) detector and downlink MMMSE precoder instead, under arbitrary pilot reuse and pilot allocation. This is a generalization of the MMMSE schemes considered in [17] and [20] for the special case of B=K and in [21] for the idealized case of perfect channel state information (CSI).
The main contributions of this work are as follows:

A new state-of-the-art MMMSE scheme is proposed, which includes an uplink detector and a downlink precoder. The novelty is that all B pilots are exploited at each BS to actively suppress both intracell and intercell interference. It brings significant SE gains over conventional single-cell schemes, which dominate the MIMO literature. Moreover, we prove that the computational complexity of the scheme is scalable since the KL channels, in an L-cell setup, are fully represented by only B channel direction estimates. The proposed scheme is general since it accounts for imperfect channel estimation, power control, and arbitrary pilot allocation.

Large-scale approximations of the uplink and downlink signal-to-interference-and-noise ratios (SINRs) for the proposed MMMSE scheme are derived, which are asymptotically tight in the large-system limit. The approximations are very accurate even for small system dimensions and are easy to compute, which enables performance analysis and optimization without the need for heavy Monte Carlo simulations.

By utilizing the SINR approximations, a low-complexity power control algorithm for sum SE maximization is proposed. Since the SINR approximations depend only on long-term statistics, the computational complexity can be spread over time. Compared to equal power allocation, the proposed algorithm significantly improves the sum SE and provides good user fairness.
The paper is organized as follows: In Section 2, we describe the system model and the construction of the MMMSE scheme. Largescale approximations of the uplink and downlink SINRs are derived in Section 3. A power control algorithm is proposed in Section 4. Simulation results are provided in Section 5 before we conclude the paper in Section 6.
Notation: Boldface lower and upper case symbols represent vectors and matrices, respectively. The trace, transpose, conjugate, Hermitian transpose, and matrix inverse operators are denoted by tr(·), (·)^{ T }, (·)^{∗}, (·)^{ H }, and (·)^{−1}, respectively. The function diag(·) constructs a diagonal matrix by selecting the diagonal elements of a matrix.
2 System model and transceiver design
The vector \(\mathbf {z}_{lk} \in \mathbb {R}^{2}\) is the geographical position of user k in cell l, and d _{ j }(z) is an arbitrary function that accounts for the channel attenuation (e.g., path loss and shadowing) between BS j and any user position z. Since the user position changes relatively slowly, d _{ j }(z _{ lk }) is assumed to be known at BS j for all l and all k.
We consider a TDD protocol in this paper, where the downlink channels are estimated by uplink pilot signaling by exploiting channel reciprocity.^{2} In TDD mode, each transmission block is divided into two phases: 1) uplink channel estimation phase, where each BS acquires CSI from uplink pilot signaling which occupies B out of S symbols in each block, and 2) uplink and downlink payload data transmission phase, where each BS processes the received uplink signal and the to-be-transmitted downlink signals using the estimated CSI. Let ζ ^{ul} and ζ ^{dl} denote the fixed fractions allocated for uplink and downlink payload data transmission, respectively. These fractions can be selected arbitrarily under the conditions that ζ ^{ul}+ζ ^{dl}=1 and that ζ ^{ul}(S−B) and ζ ^{dl}(S−B) are positive integers. The uplink channel estimation is first discussed to lay a foundation for the transceiver design.
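The two conditions on the payload fractions can be checked mechanically. The following is a minimal sketch, not part of the original scheme description: the function name `split_block` and the value B=40 are illustrative assumptions, while S=1000 and equal fractions match the simulation settings used later in the paper.

```python
from fractions import Fraction

def split_block(S, B, zeta_ul, zeta_dl):
    """Split the S - B payload symbols of a block into uplink and downlink parts.

    Enforces the two conditions stated in the text: the fractions sum to one,
    and both zeta_ul*(S-B) and zeta_dl*(S-B) are positive integers.
    """
    zeta_ul, zeta_dl = Fraction(zeta_ul), Fraction(zeta_dl)
    if zeta_ul + zeta_dl != 1:
        raise ValueError("zeta_ul + zeta_dl must equal 1")
    payload = S - B  # symbols left after the B pilot symbols
    n_ul = zeta_ul * payload
    n_dl = zeta_dl * payload
    for n in (n_ul, n_dl):
        if n <= 0 or n.denominator != 1:
            raise ValueError("payload fractions must give positive integers")
    return int(n_ul), int(n_dl)

# S = 1000 as in the simulations; B = 40 is an assumed example value.
n_ul, n_dl = split_block(S=1000, B=40, zeta_ul=Fraction(1, 2), zeta_dl=Fraction(1, 2))
print(n_ul, n_dl)
```

Exact rational arithmetic is used here so that the integer check is not affected by floating-point rounding.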
2.1 Uplink channel estimation
The B pilot symbols in a coherence block are used for transmitting B-length pilot sequences. We consider a set of B orthogonal sequences with unit-modulus entries, denoted as \({{\mathbf {v}_{1}},\ldots,{\mathbf {v}_{B}}} \in \mathbb {C}^{B}\). These sequences could, for instance, be selected as the columns of a discrete Fourier transform (DFT) matrix. By gathering the sequences in a matrix \(\mathbf {V} = \left [{{\mathbf {v}_{1}},\, \ldots, \, {\mathbf {v}_{B}}}\right ] \in \mathbb {C}^{B \times B}\), our orthogonality and scaling assumptions lead to V ^{ H } V=B I _{ B }.
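As an illustration of the DFT choice mentioned above, the following sketch builds a B×B DFT matrix and numerically checks both the unit-modulus property and V^H V = B I_B. The helper names and the value B=8 are assumptions made for this example only.

```python
import cmath

def dft_pilot_matrix(B):
    """Return the B x B DFT matrix as a list of rows; column i is the pilot v_i."""
    return [[cmath.exp(-2j * cmath.pi * r * c / B) for c in range(B)]
            for r in range(B)]

def gram(V):
    """Compute V^H V for a square matrix stored as a list of rows."""
    B = len(V)
    return [[sum(V[r][a].conjugate() * V[r][b] for r in range(B))
             for b in range(B)] for a in range(B)]

B = 8  # assumed number of orthogonal pilot sequences
V = dft_pilot_matrix(B)
G = gram(V)

# Largest deviation of V^H V from B * I_B.
max_dev = max(abs(G[a][b] - (B if a == b else 0))
              for a in range(B) for b in range(B))
# Every entry of V should have modulus one.
unit_modulus = all(abs(abs(x) - 1.0) < 1e-12 for row in V for x in row)
print(max_dev < 1e-9, unit_modulus)
```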
Arbitrary pilot allocation is considered in this work, with the only requirement of B≥K. The parameter β=B/K≥1 is called the pilot reuse factor. If the pilots are allocated wisely in the network, a larger β brings a lower level of interference during the pilot transmission, also known as pilot contamination. Let i _{ lk }∈{1,…,B} denote the index of the pilot sequence used by user k in cell l, which implies that the user sends the pilot sequence \(\mathbf {v}_{i_{lk}}\) (i.e., the i _{ lk }th column of V).
where h _{ jlk } is the channel response defined in (1), p _{ lk }≥0 is the transmit power for the pilot of user k in cell l, and the additive white Gaussian noise (AWGN) term \({\mathbf {N}_{j}}\in \mathbb {C}^{M \times B}\) contains independent and identically distributed (iid) elements that are distributed as \({\mathcal {CN}}(0,\sigma ^{2})\).
where e _{ i } denotes the ith column of the identity matrix I _{ B }. The property that users with the same pilot have parallel estimated channels is utilized to derive and analyze new SE expressions in the sequel.
Finally, notice that also \(\hat { \mathbf {h}}_{{\mathcal {V}},ji}\) is a zeromean complex Gaussian vector and its covariance matrix is \(\mathbb {E}\left \{ \hat { \mathbf {h}}_{{\mathcal {V}},ji}\hat { \mathbf {h}}^{H}_{{\mathcal {V}},ji}\right \} = \tilde { \phi }_{ji} \mathbf {I}_{M}\).
2.2 Uplink MMMSE detector
Recall that C _{ jjk } and C _{ jlk } are estimation error covariance matrices, defined in (8). Note that \(R_{jk}^{\text {ul}}\) is a lower bound on the uplink ergodic capacity.
where \(\tilde { \phi }_{ji_{lk}}\) is defined in (6). As the name suggests, \(\tau _{jk}\mathbf {g}_{jk}^{\mathrm {MMMSE}}\) also minimizes the mean square error (MSE) in estimating x _{ jk } [22], \(\mathbb {E} \left \{\left | {\hat x}_{jk} - x_{jk} \right |^{2} \, \big | \, \hat { \mathbf {h}}_{(j)}\right \}\).
Remark 1
of the intracell estimation errors plus the intercell interference. When Z _{ j } in (16) is used, \(\tau _{jk} \mathbf {g}_{jk}^{\mathrm {SMMSE}}\) minimizes the MSE \(\mathbb {E} \{ | {\hat x}_{jk} - x_{jk} |^{2} \, \big | \, \hat {\mathbf {h}}_{jj1}, \ldots, \hat {\mathbf {h}}_{jjK} \}\) under the assumption that only estimates of the intracell channels are available. We stress that this is a limiting assumption in multicell scenarios, since the intercell channels can also be estimated without any additional pilot overhead. As we will show numerically in Section 5, the benefit of the MMMSE detector over SMMSE grows with β, since the estimates of the intercell channels then improve, and this allows for more efficient interference suppression.
Remark 2
by exploiting the fact that (C _{1} C _{2}+I)^{−1} C _{1}=C _{1}(C _{2} C _{1}+I)^{−1} for any matrices C _{1},C _{2} of compatible dimensions. Hence, only a B-dimensional matrix needs to be inverted, and only once per cell rather than once per user. The computation of the MMMSE detectors in a cell requires approximately \(\frac {3}{2} B^{2} M\) complex multiplications. This is greater than with the SMMSE detector, which after similar matrix algebra requires the inversion of a K-dimensional matrix, and thus, \(\frac {3}{2} K^{2} M\) complex multiplications are required.^{4} In summary, the increased complexity compared to SMMSE is about \(\frac {3}{2} \left (\beta ^{2}-1\right)K^{2} M\) complex multiplications. Since in massive MIMO systems M≫K is often assumed, the complexity increase is not a big issue when K is small or moderate, particularly since the computational efficiency of digital hardware grows rapidly and is not expected to be a bottleneck in the future. One way to reduce the complexity is to check which of the diagonal elements of Λ _{ j } are below a certain threshold and set these values to zero, which effectively reduces the size of the matrix to be inverted in the MMMSE expression. This approximation can significantly reduce the complexity if only a few of the B−K pilots that belong exclusively to other cells are used by users that cause strong interference. Note that the MMMSE scheme can be seen as a CoMP coordinated beamforming scheme, but since there is no signaling between the BSs (BS j estimates \(\hat { \mathbf {H}}_{{\mathcal {V}},j}\) from pilots), the MMMSE scheme is fully scalable.
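The multiplication counts in this remark are easy to tabulate. The sketch below simply evaluates the approximate counts quoted in the text and confirms that their difference equals the stated closed form; the function names and the example values M=300, K=10, and β=4 are illustrative assumptions.

```python
def detector_multiplications(dim, M):
    """Approximate complex multiplications, (3/2) * dim^2 * M, as quoted in the text."""
    return 1.5 * dim ** 2 * M

def mmmse_extra_multiplications(K, beta, M):
    """Extra cost of MMMSE (dimension B = beta*K) over SMMSE (dimension K)."""
    B = beta * K
    return detector_multiplications(B, M) - detector_multiplications(K, M)

# Illustrative values only.
M, K, beta = 300, 10, 4
extra = mmmse_extra_multiplications(K, beta, M)
closed_form = 1.5 * (beta ** 2 - 1) * K ** 2 * M  # (3/2)(beta^2 - 1) K^2 M
print(extra, closed_form)
```

With β=1 the two detectors invert matrices of the same dimension, so the extra cost in this count is zero.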
2.3 Downlink MMMSE precoder
where \(\mathbf {w}_{lm}\in \mathbb {C}^{M \times 1}\) is the precoder used by BS l for user m in its cell, \(s_{lm} \sim {\mathcal {CN}} (0,1)\) is the payload data symbol for user m in cell l, ϱ _{ lm } is the corresponding downlink transmit power, and \(n_{jk}\sim {\mathcal {CN}}\left (0,\sigma ^{2} \right)\) is AWGN.
where \(\gamma _{jk} = \mathbb {E}\left \{\left \| \mathbf {g}_{jk}^{\mathrm {MMMSE}} \right \|^{2} \right \}\) normalizes the average transmit power for user k in cell j to \(\mathbb {E}\left \{\left \| \sqrt {\varrho _{jk}}\mathbf {w}_{jk}^{\mathrm {MMMSE}}s_{jk}\right \|^{2} \right \}=\varrho _{jk}\).
This downlink SINR holds for any linear precoding scheme, and we omit the superscript “MMMSE” of w _{ jk } for brevity. By treating \(\sqrt {\varrho _{jk}} \mathbb {E}_{\mathbf {\{h\}}}\left \{\mathbf {h}_{jjk}^{H} \mathbf {w}_{jk}\right \}\) as the true channel, and the last three terms in (20) as uncorrelated Gaussian noise, the user applies semi-coherent symbol detection and achieves the effective SINR in (22).^{5} Thus, \(R_{jk}^{\text {dl}}\) is a lower bound on the downlink ergodic capacity.
By utilizing all the available estimated directions, the MMMSE precoder can suppress intracell interference and also reduce the interference caused to other cells. Thus, a higher SINR is expected with our precoder than with conventional single-cell precoders, at least if an appropriate power control is applied [19]. In [20], the authors also proposed an MMMSE precoder, but it does not account for arbitrary or optimized pilot allocation. Moreover, no closed-form performance expression is provided in [20], which makes it cumbersome to analyze the performance and optimize the power control.
3 Asymptotic analysis
In this section, performance analysis is conducted for the proposed MMMSE scheme. Since the uplink SINR in (12) depends on the stochastic channel estimates in each block, the uplink SE in (11) cannot be computed in closed form. Therefore, a deterministic equivalent expression for the SINR is computed instead, which is tight in the large-system limit. A large-scale approximation of the downlink SINR is also provided. The large-system limit is considered, where M and K go to infinity while keeping K/M finite and nonzero. In what follows, the notation M→∞ refers to K, M→∞ such that K/M→c∈(0,∞). Hence, B/M→β c. The results should be understood in the way that, for each set of system dimension parameters M, K, and B, we provide large-scale approximate expressions for the uplink SINR and downlink SINR, and the expressions are tight as M, K, and B grow large. The main feature is that they are deterministic and can be computed efficiently without the need for time-consuming Monte Carlo simulations. Almost sure convergence of a stochastic sequence is denoted by \(\xrightarrow [M \to \infty ]{a.s.}\), and \(\xrightarrow [M \to \infty ]{}\) denotes convergence of a deterministic sequence.
Before we continue with our performance analysis, a useful theorem from large random matrix theory is first recalled.
Theorem 1
Based on Theorem 1, we obtain the following Theorem 2 which is useful in our analysis.
Theorem 2
and m _{ o }(ρ) is defined in Theorem 1.
3.1 Large-scale approximations of the SINRs with the MMMSE scheme
Theorem 3

1 δ _{ j }=m _{ o }(ω) is given by Theorem 1 for \(\omega = \frac {\sigma ^{2} + \varphi _{j}}{M}\) and T=Φ _{ j } Λ _{ j }, with the diagonal matrix \(\boldsymbol {\Phi }_{j}={\text {diag}}\left \{ \tilde {\phi }_{j1},\ldots,\tilde {\phi }_{jB}\right \}\).

2 𝜗 _{ j }=m _{ o }′(ω) is given by Theorem 2 for \(\omega = \frac {\sigma ^{2} +\varphi _{j}}{M}\), T=Φ _{ j } Λ _{ j } with \(\boldsymbol {\Phi }_{j}={\text {diag}}\left \{\tilde {\phi }_{j1},\ldots, \tilde {\phi }_{jB}\right \}\).

3 \(\mu _{jlm}=1 - p_{lm} d_{j}(\mathbf {z}_{lm})\tilde {\phi }_{ji_{lm}} \frac {\left (1+\lambda _{ji_{lm}} \tilde {\phi }_{ji_{lm}}\delta _{j} \right)^{2}-1}{\left (1+\lambda _{ji_{lm}} \tilde {\phi }_{ji_{lm}}\delta _{j} \right)^{2}}\).
Proof
: See Appendix 1. □
The \({\bar \eta }_{jk}^{\text {ul}}\) above not only provides a tight SINR approximation but also shows how the signal, the interference, and the noise change as M grows large. The first term of the denominator represents the interference from the pilot-sharing users, i.e., those users with i _{ lm }=i _{ jk }. This term is of the same order of magnitude as the signal power (notice the \(\delta _{jk}^{2}\) in both terms), since the estimated channels of these users are parallel with that of the target user. The second term of the denominator is the interference from the non-pilot-sharing users, i.e., those users with i _{ lm }≠i _{ jk }. Since their estimated channels are independent of the channel of the target user, their interference decreases and goes to zero as M→∞. So does the third term, which represents the noise. Thus, only the signal and the interference from the pilot-sharing users remain as M grows, which is referred to as the pilot contamination effect [7–10].
Next, we provide the large-scale SINR approximation for the downlink MMMSE precoder.
Theorem 4
where δ _{ l }, μ _{ ljk }, and 𝜗 _{ l } are given in Theorem 3.
Proof
: See Appendix 2. □
By utilizing Theorems 3 and 4, the ergodic SEs \(R_{jk}^{\text {ul}}\) in (11) and \(R_{jk}^{\text {dl}}\) in (21), after dropping the pre-log factor \(\left (1-\frac {B}{S}\right)\), converge to \({\bar R}_{jk}^{\text {ul}}=\log _{2} \left (1+{\bar \eta }_{jk}^{\text {ul}}\right)\) and \({\bar R}_{jk}^{\text {dl}}=\log _{2} \left (1+{\bar \eta }_{jk}^{\text {dl}}\right)\) in the large-system limit, respectively. Therefore, a large-scale approximation of the joint ergodic SE in (23) is provided by \(\left (1-\frac {B}{S}\right)\left (\zeta ^{\text {ul}} {\bar R}_{jk}^{\text {ul}}+\zeta ^{\text {dl}}{\bar R}_{jk}^{\text {dl}}\right)\). This approximation is easy to compute and only depends on the long-term parameters: large-scale fading, power control, and pilot allocation. As shown in Section 5, this approximation is very accurate even for small system dimensions.
3.2 Uplink-downlink duality
It is pointed out in [19] that when the precoder is a scaled version of the detector, the same per-user SEs as in the uplink can be achieved in the downlink by properly selecting the downlink payload power. We establish this uplink-downlink duality also for our MMMSE scheme, using the large-scale SINR approximations given by Theorem 3 and Theorem 4.
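Once the approximate SINRs are available, the joint SE approximation is a one-line computation. The sketch below evaluates the expression above for given SINR values; the function name `joint_se` and the example inputs (SINR of 7 in both directions, B=40) are assumptions for illustration, and the SINR values would in practice come from Theorems 3 and 4.

```python
import math

def joint_se(sinr_ul, sinr_dl, B, S, zeta_ul=0.5, zeta_dl=0.5):
    """Evaluate (1 - B/S) * (zeta_ul * log2(1 + sinr_ul) + zeta_dl * log2(1 + sinr_dl))."""
    if abs(zeta_ul + zeta_dl - 1.0) > 1e-12:
        raise ValueError("zeta_ul + zeta_dl must equal 1")
    prelog = 1.0 - B / S                 # fraction of the block left for payload
    r_ul = math.log2(1.0 + sinr_ul)      # uplink SE without the pre-log factor
    r_dl = math.log2(1.0 + sinr_dl)      # downlink SE without the pre-log factor
    return prelog * (zeta_ul * r_ul + zeta_dl * r_dl)

# Hypothetical inputs: linear-scale SINRs of 7, B = 40 pilots, S = 1000 symbols.
se = joint_se(sinr_ul=7.0, sinr_dl=7.0, B=40, S=1000)
print(round(se, 4))
```

The pre-log factor makes the trade-off explicit: a larger B improves the channel estimates behind the SINRs but leaves fewer symbols for payload data.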
Theorem 5
where u=k+(j−1)K, v=m+(l−1)K. The symbol [ ·]_{ i,j } represents the element in the ith row and the jth column of the corresponding matrix.
Proof
: The proof follows the same lines as the duality proof in [19] and is thus omitted. □
Remark 3
By utilizing the large-scale SINR approximations, Theorem 5 provides a powerful tool to obtain a judicious downlink power allocation whenever the same SEs are desired in both uplink and downlink. However, a certain level of BS coordination is required for this downlink power control policy. Specifically, LK elements in E, LK elements in τ, and 2KL ^{2} elements in F need to be exchanged (F can be represented by 2K L ^{2} elements from its definition). Therefore, the exchange overhead is 2K L(L+1) elements. Fortunately, this overhead is acceptable since the exchanged elements are long-term statistical parameters.
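The overhead count in Remark 3 can be verified by summing its three parts. The sketch below does this for assumed example values K=10 and L=16, neither of which is prescribed by the remark itself.

```python
def exchange_overhead(K, L):
    """Number of long-term parameters exchanged for the downlink power control."""
    e_elements = L * K             # elements of E
    tau_elements = L * K           # elements of tau
    f_elements = 2 * K * L ** 2    # elements needed to represent F
    return e_elements + tau_elements + f_elements

K, L = 10, 16  # assumed example: 10 users per cell, 16 cells
total = exchange_overhead(K, L)
print(total, 2 * K * L * (L + 1))  # the sum matches the closed form 2KL(L+1)
```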
4 Iterative power control
Power control for sum SE maximization has been widely studied in cellular networks [13,24–30]. However, power control with the MMMSE scheme is complicated since the detector and precoder depend on the power control parameters and since the SINRs cannot be computed in closed form. In this section, we provide a key application of the results from Theorem 3: joint uplink payload power control for sum SE maximization in a multicell network. Since the downlink payload power can be obtained according to Theorems 4 and 5, the optimized uplink SEs can also be achieved in the downlink.
where t is the iteration index in the fixed point algorithm, for t=0,1,…. It is proved in [33] that starting from the initial point τ _{ l }(0)=P _{ max } for all l, the above algorithm converges at a geometric rate to the optimal solution of \({\mathcal {P}}_{1}\) (for fixed F and D).
In our case, however, F and D are not fixed since δ _{ j } and 𝜗 _{ j } will change as τ _{ l } changes. Hence, \({\mathcal {P}}_{1}\) in our work is not a pure GP. Therefore, Algorithm 1 is proposed to iterate between solving \({\mathcal {P}}_{1}\) for fixed F and D and updating F and D using the current τ.
A rigorous proof of convergence of R(t) is intractable, since D and F depend in a very complicated way on the powers τ _{ lm } of all users, and we update D and F after each iteration. However, numerical results indicate fast convergence: about five iterations are enough. Therefore, our algorithm can converge to a locally optimal solution of \({\mathcal {P}}_{1}\), and the involved information exchange overhead is acceptable. Moreover, since only long-term parameters need to be exchanged, the exchange overhead can be spread over time.
5 Simulation results
The user locations are generated independently and uniformly at random in the cells, but the distance between each user and its serving BS is at least 0.14r. For each user location \(\mathbf {z} \in \mathbb {R}^{2}\), a classic path-loss model is considered, where the variance of the channel attenuation is \(d_{j}(\mathbf {z}) = \frac {C_{(\mathbf {z})}}{\left \| \mathbf {z} - \mathbf {b}_{j}\right \|^{\kappa }}\). The vector \(\mathbf {b}_{j} \in \mathbb {R}^{2}\) is the location of the BS in cell j, κ is the path-loss exponent, and ∥·∥ denotes the Euclidean norm. C _{(z)}>0 is independent shadow fading for some user location z with \(10\log _{10}\left (C_{(\mathbf {z})}\right)\sim {\mathcal {N}}\left (0,\sigma ^{2}_{sf}\right)\). In the simulation, we assume κ=3.7, \(\sigma _{sf}^{2} = 5\) and the coherence block length S=1000.^{6}
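The path-loss model with log-normal shadowing can be sketched in a few lines. The function below is an illustrative reading of the model, not the authors' simulation code: the function name, the positions, and the seed are assumptions, and the shadowing variance in dB is taken literally from the stated value of 5.

```python
import math
import random

def channel_attenuation(z, b, kappa=3.7, sigma_sf_sq=5.0, rng=random):
    """Draw d_j(z) = C / ||z - b||^kappa with 10*log10(C) ~ N(0, sigma_sf_sq)."""
    # rng.gauss takes a standard deviation, so the variance is square-rooted.
    shadow_db = rng.gauss(0.0, math.sqrt(sigma_sf_sq))
    C = 10.0 ** (shadow_db / 10.0)              # shadow fading on a linear scale
    dist = math.hypot(z[0] - b[0], z[1] - b[1])  # Euclidean distance to the BS
    return C / dist ** kappa

rng = random.Random(0)                 # fixed seed, for reproducibility only
bs_position = (0.0, 0.0)               # hypothetical BS location b_j
user_position = (120.0, 50.0)          # hypothetical user location z
d_example = channel_attenuation(user_position, bs_position, rng=rng)
print(d_example > 0)
```

Setting `sigma_sf_sq=0.0` removes the shadowing and leaves the pure distance-dependent term, which is convenient for sanity checks.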
5.1 Benefits of the proposed MMMSE scheme
In this subsection, we show the benefits of our MMMSE scheme over the conventional alternatives. Statistical channel inversion power control is applied to both pilot and uplink payload data, i.e., \(p_{lk} =\tau _{lk}= \frac {\rho }{d_{l}(\mathbf {z}_{lk})}\) [19]. Thus, during the uplink phase, the average effective channel gain between users and their serving BSs is constant: \(\mathbb {E}\left \{p_{lk}\left \| \mathbf {h}_{llk}\right \|^{2}\right \}=\mathbb {E}\left \{\tau _{lk}\left \| \mathbf {h}_{llk}\right \|^{2}\right \} = M\rho \). Then, the average uplink SNR per antenna and user at its serving BS is ρ/σ ^{2}. This is a simple but effective policy to avoid near-far blockage and, to some extent, guarantee a uniform user performance in the uplink. For downlink payload data transmission, the transmit power ϱ _{ lk } is selected according to Theorem 5 to achieve the same downlink SE at each user as in the uplink. In our simulation, ρ/σ ^{2} is set to 0 dB to allow for decent channel estimation accuracy, and the time proportions for the uplink and downlink are set to \(\zeta ^{\text {ul}}=\zeta ^{\text {dl}}=\frac {1}{2}\).
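The effect of statistical channel inversion can be illustrated numerically. The sketch below assumes that the average channel gain satisfies E{‖h_llk‖²} = M·d_l(z_lk), which is consistent with the constant M·ρ stated above; the attenuation values, M=100, and the unit noise variance are hypothetical.

```python
import math

def channel_inversion_power(rho, d):
    """Statistical channel inversion: transmit power rho / d for attenuation d."""
    if d <= 0:
        raise ValueError("channel attenuation must be positive")
    return rho / d

def db_to_linear(value_db):
    """Convert a ratio in dB to linear scale."""
    return 10.0 ** (value_db / 10.0)

M = 100                                  # number of BS antennas (illustrative)
sigma2 = 1.0                             # noise variance (assumed normalization)
rho = db_to_linear(0.0) * sigma2         # rho / sigma^2 = 0 dB, as in the text
attenuations = [1e-7, 3e-9, 5e-11]       # hypothetical d_l(z_lk) values
powers = [channel_inversion_power(rho, d) for d in attenuations]

# Assumption: E{||h_llk||^2} = M * d_l(z_lk), so the average gain is p * M * d.
gains = [p * M * d for p, d in zip(powers, attenuations)]
print(all(math.isclose(g, M * rho) for g in gains))
```

Users with weaker channels transmit with proportionally higher power, so all users are received with the same average strength at their serving BS.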
5.2 Effectiveness of joint power control
In this subsection, the effectiveness of the proposed power control scheme is evaluated. Statistical power control \(p_{lk} = \frac {\rho }{d_{l}(\mathbf {z}_{lk})}\) is still applied for pilots, while the uplink payload power τ _{ jk } is optimized. ρ/σ ^{2} is still set to 0 dB, and the maximal transmit power P _{ max } in \(\mathcal {P}\) is selected as in Section 5.1. Results for equal maximum power allocation (i.e., τ _{ lk }=P _{ max }) are provided as a baseline. We also apply Algorithm 1 to the instantaneous SINR in (12) for comparison. The following results are obtained for M=300 and K=10. After generating user locations and shadow fading realizations, the 9 users with the worst channel conditions in the whole network are dropped to provide 95% coverage.
6 Conclusions
In this paper, a new state-of-the-art MMMSE scheme is proposed, which includes an uplink MMMSE detector and a downlink MMMSE precoder. It brings very promising sum SE gains over SMMSE and other single-cell schemes by actively suppressing both intracell and intercell interference. Since imperfect CSI is accounted for in our scheme, the gains obtained by our scheme are likely to be achievable in practical systems. Furthermore, large-scale approximations of the uplink and downlink SINRs are derived for the proposed MMMSE scheme. The approximations are very accurate even for small system dimensions and are easy to compute since they only depend on long-term statistics. Hence, the expressions can be utilized for efficient performance analysis, without the need for Monte Carlo simulations. The SINR approximations can further be used for power control design, and a low-complexity power control algorithm for sum SE maximization is proposed. The proposed algorithm brings a notable sum SE gain and also provides good user fairness compared to the equal power allocation policy. Since the SINR approximations depend only on long-term statistics, the complexity of the algorithm can be spread over a long time period.
7 Endnotes
^{1} These schemes have several names in the literature: MF is also known as maximum ratio combining/transmission; ZF is also known as channel inversion; and regularized ZF (RZF) is a simple variation on SMMSE.
^{2} In practice, only the propagation channels are reciprocal, while the hardware used for uplink and downlink communication is not. This requires reciprocity calibration of the hardware, but there are many algorithms for this, and the variations are slow so the calibration overhead is negligible [34].
^{3} Notice that \(\frac {1}{\sqrt {B}} \mathbf {V}=\frac {1}{\sqrt {B}}[\mathbf {v}_{1},\ldots,\mathbf {v}_{B}]\in \mathbb {C}^{B \times B}\) is an orthonormal basis for a B-dimensional space. Therefore, a singular value decomposition of Ψ _{ j } is \(\boldsymbol {\Psi }_{j} = \frac {1}{B} \mathbf {V}\mathbf {A}_{j}\mathbf {V}^{H}\), where A _{ j } is a diagonal matrix with its bth element as \(a_{jb}= B/\tilde { \phi }_{jb}\). Then, (7) is obtained.
^{4} Only multiplications are counted in the complexity comparison, since additions and subtractions have a negligible complexity in comparison.
^{5} This method works well in massive MIMO systems due to channel hardening, i.e., the effective channel is relatively close to its mean, whereas the performance loss would be large in a small-scale MIMO system.
^{6} This coherence block can, for example, have the dimensions of T _{ c }=10 ms and W _{ c }=100 kHz.
^{7} One should notice that K and β cannot be increased indefinitely due to the pre-log loss in the achievable SE.
8 Appendix 1
8.1 Proof of Theorem 3
 1. \({\hat { \mathbf {H}}_{{\mathcal {V}},{jlk}}} = \left [\hat {\mathbf {h}}_{{\mathcal {V}},j1},\ldots,\hat { \mathbf {h}}_{{\mathcal {V}},j\left (i_{lk} - 1 \right)},\hat {\mathbf {h}}_{{\mathcal {V}},j\left (i_{lk} + 1 \right)},\ldots, \hat {\mathbf {h}}_{{\mathcal {V}},jB} \right ]\),
 2. \(\boldsymbol {\Lambda }_{jlk}={\text {diag}}\left \{ \lambda _{j1},\ldots,\lambda _{j\left (i_{lk} - 1 \right)},\lambda _{j\left (i_{lk} + 1 \right)},\ldots,\lambda _{jB}\right \}\),
 3. \(\boldsymbol {\Phi }_{j} = {\text {diag}}\{\tilde { \phi }_{j1},\ldots, \tilde { \phi }_{jB}\}\),
 4. \(\boldsymbol {\Sigma }_{jjk}= \left ({\hat { \mathbf {H}}_{{\mathcal {V}},{jjk}}}{\boldsymbol {\Lambda }_{jjk}}{\hat {\mathbf {H}}_{{\mathcal {V}},{jjk}}}^{H} + \left (\sigma ^{2}+ \varphi _{j} \right)\mathbf {I}_{M}\right)^{-1}\),
 5. \(\boldsymbol {\Sigma }_{j}^{'} = M\boldsymbol {\Sigma }_{j}\) and \(\boldsymbol {\Sigma }_{jjk}^{'} = M \boldsymbol {\Sigma }_{jjk}\),
then we have the following lemma.
Lemma 1
Proof
where (a) follows from Lemma 1 in [12] and \(\hat {\mathbf {h}}_{jjk} = \sqrt {p_{jk}} d_{j}(\mathbf {z}_{jk}) \hat { \mathbf {h}}_{{\mathcal {V}},{ji}_{jk}}\) and (b) follows from Lemma 12 in [36], which can be applied since \(\boldsymbol {\Sigma }_{jjk}^{'}\) has uniformly bounded spectral norm with respect to M, because φ _{ j } scales as K and \(\frac {K}{M} > 0\) by assumption; thus, \(\frac {\varphi _{j}}{M} > 0\) for all M. (c) follows from Lemma 14.3 in [37]. In step (d), we define \(\delta _{j} = m_{o}\left (\frac {\sigma ^{2} + \varphi _{j}}{M}\right)\), which is obtained by Theorem 1 for T=Φ _{ j } Λ _{ j } and \(\rho = \frac {\sigma ^{2}+\varphi _{j}}{M}\).
where (a) and (b) follow from Lemma 1 in [12] and Lemma 12 in [36], respectively. □
We use Lemma 1 in the following to determine the asymptotic behavior of each term in (12).
8.2 Signal power
8.3 Channel uncertainty
8.4 Interference power
The computation depends on which pilots are used.
i _{ lm }=i _{ jk }=i _{0}
i _{ lm }≠i _{ jk }
where \(\vartheta _{j} = m_{o}^{\prime }\left (\frac {\sigma ^{2} + \varphi _{j}}{M}\right)\) is given by Theorem 2 for \(\rho = \frac {{{\sigma ^{2}} + {\varphi _{j}}}}{M}\) and T=Φ _{ j } Λ _{ j }.
8.5 Noise power
Finally, by the continuous mapping theorem, we arrive at the expression in (29).
9 Appendix 2
9.1 Proof of Theorem 4
Except for the channel variance \({\text {var}} \left \{\mathbf {h}_{jjk}^{H} \mathbf {w}_{jk}\right \} =\mathbb {E}\left \{\left | \mathbf {h}_{jjk}^{H} \mathbf {w}_{jk} - \mathbb {E}\left \{\mathbf {h}_{jjk}^{H} \mathbf {w}_{jk} \right \} \right |^{2} \right \}\), large-scale approximations of the signal power and the interference in (22) can be calculated by following similar procedures as in Appendix 1. Thus, only the channel variance is considered here.
where the last step is due to the fact that \(\hat { \mathbf {h}}_{jjk}\) is independent of \(\tilde { \mathbf {h}}_{jjk}\) and that \(\mathbb {E} \{b\}=0\).
Declarations
Funding
This work was supported by the National Basic Research Program (2012CB316000), the National Natural Science Foundation of China (61201192), the National High Technology Research Development Program of China (2014AA01A703), the National S&T Major Project (2014ZX03003003002), the Tsinghua-Qualcomm Joint Research Program, Keysight Technologies, Inc., ELLIIT, the CENIIT project 15.01, and the FP7-MAMMOET project.
Authors’ contributions
All authors contributed to this work, and the authors are listed in the order of their contribution. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
1. G Caire, N Jindal, M Kobayashi, N Ravindran, Multiuser MIMO achievable rates with downlink training and channel state feedback. IEEE Trans. Inf. Theory. 56(6), 2845–2866 (2010).
2. G Caire, S Shamai, On the achievable throughput of a multiantenna Gaussian broadcast channel. IEEE Trans. Inf. Theory. 49(7), 1691–1706 (2003).
3. Y Wei, JM Cioffi, Sum capacity of Gaussian vector broadcast channels. IEEE Trans. Inf. Theory. 50(9), 1875–1892 (2004).
4. D Gesbert, M Kountouris, RW Heath, CB Chae, T Salzer, From single user to multiuser communications: shifting the MIMO paradigm. IEEE Signal Process. Mag. 24(5), 36–46 (2007).
5. V Stankovic, M Haardt, Generalized design of multiuser MIMO precoding matrices. IEEE Trans. Wireless Commun. 7(3), 953–961 (2008).
6. T Yoo, A Goldsmith, On the optimality of multiantenna broadcast scheduling using zero-forcing beamforming. IEEE J. Sel. Areas Commun. 24(3), 1912–1921 (2006).
7. EG Larsson, O Edfors, F Tufvesson, TL Marzetta, Massive MIMO for next generation wireless systems. IEEE Commun. Mag. 52(2), 186–195 (2014).
8. TL Marzetta, Noncooperative cellular wireless with unlimited numbers of base station antennas. IEEE Trans. Wireless Commun. 9(1), 3590–3600 (2010).
9. F Rusek, D Persson, KL Buon, EG Larsson, TL Marzetta, O Edfors, F Tufvesson, Scaling up MIMO: opportunities and challenges with very large arrays. IEEE Trans. Signal Process. 30(1), 40–60 (2013).
10. HQ Ngo, EG Larsson, TL Marzetta, Energy and spectral efficiency of very large multiuser MIMO systems. IEEE Trans. Commun. 61(4), 1436–1449 (2013).
11. E Björnson, J Hoydis, M Kountouris, M Debbah, Massive MIMO systems with non-ideal hardware: energy efficiency, estimation, and capacity limits. IEEE Trans. Inf. Theory. 60(11), 7112–7139 (2014).
12. J Hoydis, S ten Brink, M Debbah, Massive MIMO in the UL/DL of cellular networks: how many antennas do we need? IEEE J. Sel. Areas Commun. 31(2), 160–171 (2013).
13. E Björnson, E Jorswieck, Optimal resource allocation in coordinated multicell systems. Found. Trends Commun. Inf. Theory. 9(2–3), 113–381 (2013).
14. TL Marzetta, in Proc. IEEE Asilomar Conference on Signals, Systems and Computers. How much training is required for multiuser MIMO? (IEEE, Pacific Grove, 2006), pp. 359–363. 29 Oct.–1 Nov. 2006.
15. KF Guo, Y Guo, G Fodor, G Ascheid, in Proc. IEEE International Conference on Communications (ICC). Uplink power control with MMSE receiver in multicell MU-massive-MIMO systems (IEEE, Sydney, 2014), pp. 5184–5190.
16. N Krishnan, RD Yates, NB Mandayam, Uplink linear receivers for multicell multiuser MIMO with pilot contamination: large system analysis. IEEE Trans. Wireless Commun. 13(8), 4360–4373 (2014).
17. HQ Ngo, M Matthaiou, EG Larsson, in 2012 Swedish Communication Technologies Workshop (SweCTW). Performance analysis of large scale MU-MIMO with optimal linear receivers (IEEE, Lund, 2012), pp. 59–64.
18. J Hoydis, S ten Brink, M Debbah, in Proc. of 49th Allerton. Massive MIMO: How many antennas do we need? (IEEE, Monticello, 2011), pp. 545–550.
19. E Björnson, EG Larsson, M Debbah, Massive MIMO for maximal spectral efficiency: How many users and pilots should be allocated? IEEE Trans. Wireless Commun. 15(2), 1293–1308 (2016).
20. J Jose, A Ashikhmin, TL Marzetta, S Vishwanath, Pilot contamination and precoding in multicell TDD systems. IEEE Trans. Wireless Commun. 10(8), 2640–2651 (2011).
21. KF Guo, G Ascheid, in Proc. IEEE Wireless Communications and Networking Conference (WCNC). Performance analysis of multicell MMSE based receivers in MU-MIMO systems with very large antenna arrays (IEEE, Shanghai, 2013), pp. 3175–3179.
22. D Tse, P Viswanath, Fundamentals of Wireless Communication (Cambridge University Press, New York, 2005).
23. JW Silverstein, ZD Bai, On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J. Multivariate Anal. 54(2), 175–192 (1995).
24. M Chiang, P Hande, T Lan, CW Tan, Power control in wireless cellular networks. Found. Trends Netw. 2(4), 381–533 (2008).
25. D Gesbert, SG Kiani, A Gjendemsjo, GE Oien, Adaptation, coordination, and distributed resource allocation in interference-limited wireless networks. Proc. IEEE. 95(12), 2393–2409 (2007).
26. ZQ Luo, W Yu, An introduction to convex optimization for communications and signal processing. IEEE J. Sel. Areas Commun. 24(8), 1426–1438 (2006).
27. M Chiang, Balancing transport and physical layers in wireless multihop networks: jointly optimal congestion control and power control. IEEE J. Sel. Areas Commun. 23(1), 104–116 (2005).
28. IC Paschalidis, W Lai, D Starobinski, Asymptotically optimal transmission policies for large-scale low-power wireless sensor networks. IEEE/ACM Trans. Netw. 15(1), 105–118 (2007).
29. K Kumaran, L Qian, Uplink scheduling in CDMA packet-data systems. Wireless Netw. 12(1), 33–43 (2006).
30. M Charafeddine, A Sezgin, A Paulraj, in Proc. of 45th Allerton. Rate region frontiers for n-user interference channel with interference as noise (Allerton House, UIUC-Illinois, 2007). September 26–28, 2007.
31. ZQ Luo, SZ Zhang, Dynamic spectrum management: complexity and duality. IEEE J. Sel. Topics Signal Process. 2(1), 57–73 (2008).
32. M Chiang, CW Tan, DP Palomar, D O’Neill, D Julian, Power control by geometric programming. IEEE Trans. Wireless Commun. 6(7), 2640–2651 (2007).
33. CW Tan, M Chiang, R Srikant, in Proc. IEEE INFOCOM. Fast algorithms and performance bounds for sum rate maximization in wireless networks (IEEE, Rio de Janeiro, 2009), pp. 1350–1358.
34. J Vieira, S Malkowsky, K Nieman, Z Miers, N Kundargi, L Liu, IC Wong, V Öwall, O Edfors, F Tufvesson, in Proc. IEEE GLOBECOM Workshop. A flexible 100-antenna testbed for massive MIMO (2014), pp. 287–293.
35. H Yang, TL Marzetta, in Proc. IEEE Vehicular Technology Conference (VTC Fall). A macro cellular wireless network with uniformly high user throughputs (IEEE, Vancouver, 2014), pp. 1–5.
36. J Hoydis, Random matrix methods for advanced communication system. Ph.D. dissertation, Supélec, Gif-sur-Yvette, France (2012).
37. R Couillet, M Debbah, Random Matrix Methods for Wireless Communications (Cambridge University Press, New York, 2011).
38. AW van der Vaart, Asymptotic Statistics (Cambridge Series in Statistical and Probabilistic Mathematics) (Cambridge University Press, New York, 2000).
39. P Billingsley, Probability and Measure, 3rd edn (John Wiley & Sons, Inc., New York, 1995).