Interference Alignment for Clustered Multicell Joint Decoding

Multicell joint processing has been proven to be very efficient in overcoming the interference-limited nature of the cellular paradigm. However, for reasons of practical implementation global multicell joint decoding is not feasible and thus clusters of cooperating Base Stations have to be considered. In this context, intercluster interference has to be mitigated in order to harvest the full potential of multicell joint processing. In this paper, four scenarios of intercluster interference are investigated, namely a) global multicell joint processing, b) interference alignment, c) resource division multiple access and d) cochannel interference allowance. Each scenario is modelled and analyzed using the per-cell ergodic sum-rate capacity as a figure of merit. In this process, a number of theorems are derived for analytically expressing the asymptotic eigenvalue distributions of the channel covariance matrices. The analysis is based on principles from Free Probability theory and especially properties in the R and Stieltjes transform domain.


Introduction
Currently, cellular networks carry the bulk of wireless traffic and, as a result, they risk being saturated by the ever-increasing traffic imposed by internet data services. In this context, the academic community, in collaboration with industry and standardization bodies, has been investigating innovative network architectures and communication techniques which can overcome the interference-limited nature of cellular systems. The paradigm of multicell joint processing has risen as a promising way of overcoming those limitations and has since gained increasing momentum, which has led from theoretical research to testbed implementations [1]. Furthermore, the recent inclusion of CoMP (Coordinated Multi-Point) techniques in LTE-Advanced [2] reinforces this statement.
Multicell joint processing is based on the idea that signal processing does not take place at individual base stations (BSs), but at a central processor which can jointly serve the user terminals (UTs) of multiple cells through the spatially distributed BSs. It should be noted that the main concept of multicell joint processing is closely connected to the rationale behind Network MIMO and Distributed Antenna Systems (DAS), and those three terms are often used interchangeably in the literature. In global multicell joint processing, all the BSs of a large cellular system are assumed to be interconnected to a single central processor through an extended backhaul. However, the computational requirements of such a processor and the large investment needed for backhaul links have hindered its realization. On the other hand, clustered multicell joint processing utilizes multiple signal processors in order to form BS clusters of limited size, but this localized cooperation introduces intercluster interference into the system, which has to be mitigated in order to harvest the full potential of multicell joint processing. In this direction, reuse of time or frequency channel resources (resource division multiple access) could provide the necessary spatial separation amongst clusters, an approach which basically mimics the principles of the traditional cellular paradigm on a cluster scale. Another alternative would be to simply tolerate intercluster signals as cochannel interference, but this scheme becomes problematic in highly dense systems. Taking all this into account, the current paper considers the uplink of a clustered multicell joint decoding (MJD) system and proposes a new communication strategy for mitigating intercluster interference using interference alignment (IA). More specifically, the main contributions herein are: 1. the channel modelling of a clustered MJD system with IA as intercluster interference mitigation technique, 2. the analytical derivation of the ergodic throughput based on free probabilistic arguments in the R-transform domain, 3. the analytical comparison with the upper bound of global MJD, the Resource Division Multiple Access (RDMA) scheme and the lower bound of clustered MJD with Cochannel Interference allowance (CI), 4. the comparison of the derived closed-form expressions with Monte Carlo simulations and the performance evaluation using numerical results.
The remainder of this paper is structured as follows: Section 2 reviews in detail prior work in the areas of clustered MJD and IA. Section 3 describes the channel modelling, free probability derivations and throughput results for the following cases: (a) global MJD, (b) IA, (c) RDMA and (d) CI. Section 4 demonstrates the accuracy of the analysis by comparing with Monte Carlo simulations and evaluates the effect of various system parameters on the throughput performance of clustered MJD. Section 5 concludes the paper.

Notation
Throughout the formulations of this paper, $\mathrm{E}[\cdot]$ denotes expectation, $(\cdot)^H$ denotes the conjugate matrix transpose, $(\cdot)^T$ denotes the matrix transpose, $\odot$ denotes the Hadamard product and $\otimes$ denotes the Kronecker product. The Frobenius norm of a matrix or vector is denoted by $\|\cdot\|$ and the delta function by $\delta(\cdot)$. $I_n$ denotes an n × n identity matrix, $I_{n \times m}$ an n × m matrix of ones, 0 a zero matrix and $G_{n \times m} \sim \mathcal{CN}(0, I_n)$ denotes an n × m Gaussian matrix with entries drawn from a $\mathcal{CN}(0, 1)$ distribution. The figure of merit analyzed and compared throughout this paper is the ergodic per-cell sum-rate throughput (note a).

Related work

Multicell joint decoding
This section reviews the literature on MJD systems by describing the evolution of global MJD models and subsequently focusing on clustered MJD approaches.

Global MJD
It was almost three decades ago when the paradigm of global MJD was initially proposed in two seminal papers [3,4], promising large capacity enhancements. The main idea behind global MJD is the existence of a central processor (a.k.a. "hyper-receiver") which is interconnected to all the BSs through a backhaul of wideband, delayless and error-free links. The central processor is assumed to have perfect channel state information (CSI) about all the wireless links of the system. The optimal communication strategy is superposition coding at the UTs and successive interference cancellation at the central processor. As a result, the central processor is able to jointly decode all the UTs of the system, rendering the concept of intercell interference void.
Since then, the initial results were extended and modified by the research community for more practical propagation environments, transmission techniques and backhaul infrastructures in an attempt to quantify the performance gain more accurately. More specifically, it was demonstrated in [5] that Rayleigh fading promotes multiuser diversity, which is beneficial for the ergodic capacity performance. Subsequently, realistic path-loss models and user distributions were investigated in [6,7], providing closed-form ergodic capacity expressions based on the cell size, path loss exponent and geographical distribution of UTs. The beneficial effect of MIMO links was established in [8,9], where a linear scaling of the ergodic per-cell sum-rate capacity with the number of BS antennas was shown. However, correlation between multiple antennas has an adverse effect as shown in [10], especially when correlation affects the BS side. Imperfect backhaul connectivity also has a negative effect on the capacity performance, as quantified in [11]. MJD has also been considered in combination with DS-CDMA [12], where chips act as multiple dimensions. Finally, linear MMSE filtering [13,14] followed by single-user decoding has been considered as an alternative to the optimal multiuser decoder, which requires computationally complex successive interference cancellation.

Clustered MJD
Clustered MJD is based on forming groups of M adjacent BSs (clusters) interconnected to a cluster processor. As a result, it can be seen as an intermediate state between traditional cellular systems (M = 1) and global MJD (M = ∞). The advantage of clustered MJD lies in the fact that both the size of the backhaul network and the number of UTs to be jointly processed decrease. The benefit is twofold: first, the extent of the backhaul network is reduced and second, the computational requirements of MJD (which depend on the number of UTs) are lower. The disadvantage is that the sum-rate capacity performance is degraded by intercluster interference, especially affecting the individual rates of cluster-edge UTs. This impairment can be tackled using a number of techniques as described here. The simplest approach is to just treat it as cochannel interference and evaluate its effect on the system capacity as in [15]. An alternative would be to use RDMA, namely to split the time or frequency resources into orthogonal parts dedicated to cluster-edge cells [16]. This approach eliminates intercluster interference but at the same time limits the available degrees of freedom. In DS-CDMA MJD systems, knowledge of the interfering codebooks has also been used to mitigate intercluster interference [12]. Finally, antenna selection schemes were investigated as a simple way of reducing the number of intercluster interferers [17].

Interference alignment
This section reviews the basic principles of IA and subsequently describes existing applications of IA on cellular networks.

IA preliminaries
IA has been shown to achieve the degrees of freedom (dofs) for a range of interference channels [18][19][20]. Its principle is based on aligning the interference on a signal subspace with respect to the non-intended receiver, so that it can be easily filtered out by sacrificing some signal dimensions. The advantage is that this alignment does not affect the randomness of the signals and the available dimensions with respect to the intended receiver. The disadvantage is that the filtering at the non-intended receiver removes the signal energy in the interference subspace and reduces the achievable rate. The fundamental assumptions which render IA feasible are that there are multiple available dimensions (space, frequency, time or code) and that the transmitter is aware of the CSI towards the nonintended receiver. The exact number of needed dimensions and the precoding vectors to achieve IA are rather cumbersome to compute, but a number of approaches have been presented in the literature towards this end [21][22][23].

IA and cellular networks
IA has been also investigated in the context of cellular networks, showing that it can effectively suppress cochannel interference [23,24]. More specifically, the downlink of an OFDMA cellular network with clustered BS cooperation is considered in [25], where IA is employed to suppress intracluster interference while intercluster interference has to be tolerated as noise. Using simulations, it is shown therein that even with unit multiplexing gain the throughput performance is increased compared to a frequency reuse scheme, especially for the cluster-centre UTs. In a similar setting, the authors in [26] propose an IA-based resource allocation scheme which jointly optimizes the frequency-domain precoding, subcarrier user selection, and power allocation on the downlink of coordinated multicell OFDMA systems. In addition, authors in [24] consider the uplink of a limited-size cellular system without BS cooperation, showing that the interference-free dofs can be achieved as the number of UTs grows. Employing IA with unit multiplexing gain towards the non-intended BSs, they study the effect of multi-path channels and single-path channels with propagation delay. Furthermore, the concept of decomposable channel is employed to enable a modified scheme called subspace IA, which is able to simultaneously align interference towards multiple non-intended receivers over a multidimensional space. Finally, the effect of limited feedback on cellular IA schemes has been investigated and quantified in [25,27].

Channel model and throughput analysis
In this paper, the considered system comprises a modified version of Wyner's linear cellular array [4,12,28], which has been used extensively as a tractable model for studying MJD scenarios [29]. In the modified model studied herein, MJD is possible for clusters of M adjacent BSs while the focus is on the uplink. Unlike [23,24], IA is employed herein to mitigate intercluster interference between cluster-edge cells. Let us assume that K UTs are positioned between each pair of neighboring BSs with path loss coefficients 1 and α, respectively (Figure 1). All BSs and UTs are equipped with n = K + 1 antennas (note b) to enable IA over the multiple spatial dimensions for the clustered UTs. In this setting, four scenarios of intercluster interference are considered, namely global MJD, IA, RDMA and CI. It should be noted that only cluster-edge UTs employ interference mitigation techniques, while UTs in the interior of the cluster use the optimal wideband transmission scheme with superposition coding as in [5]. Successive interference cancellation is employed in each cluster processor in order to recover the UT signals. Furthermore, each cluster processor has full CSI for all the wireless links in its coverage area. The following subsections explain the mode of operation for each approach and describe the analytical derivation of the per-cell sum-rate throughput.

Global multicell joint decoding
In global MJD, a central processor is able to jointly decode the signals received by neighboring clusters and, therefore, no intercluster interference takes place. In other words, the entire cellular system can be regarded as a single extensive cluster. As can be seen, this case serves as an upper bound to the IA case. The received n × 1 symbol vector $y_i$ at any BS can be expressed as follows:

$$y_i(t) = G_{i,i}(t)\,x_i(t) + \alpha\,G_{i,i+1}(t)\,x_{i+1}(t) + z_i(t), \quad (1)$$

where the n × 1 vector $z_i$ denotes AWGN and the n × Kn channel matrix $G_{i,i}$ includes the flat fading coefficients of the ith UT group towards the ith BS, modelled as independent identically distributed (i.i.d.) complex circularly symmetric (c.c.s.) random variables. Similarly, the term $\alpha G_{i,i+1}(t) x_{i+1}(t)$ represents the received signal at the ith BS originating from the UTs of the neighboring cell indexed i + 1. The scaling factor α < 1 models the amount of received intercell interference, which depends on the path loss model and the density of the cellular system (note c). Another intuitive description of the α factor is that it models the power imbalance between intra-cell and inter-cell signals.
Assuming a memoryless channel, the system channel model can be written in vector form as follows:

$$y = Hx + z, \quad (2)$$

where the aggregate channel matrix H has dimensions Mn × (M + 1)Kn and can be modelled as:

$$H = \Sigma \odot G, \quad (3)$$

with $\Sigma = \tilde{\Sigma} \otimes I_{n \times Kn}$ being a block-Toeplitz matrix and $G \sim \mathcal{CN}(0, I_{Mn})$. In addition, $\tilde{\Sigma}$ is an M × (M + 1) Toeplitz matrix structured as follows:

$$\tilde{\Sigma} = \begin{bmatrix} 1 & \alpha & 0 & \cdots & 0 \\ 0 & 1 & \alpha & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & \alpha \end{bmatrix}. \quad (4)$$

Assuming no CSI at the UTs, the per-cell capacity is given by the MIMO multiple access (MAC) channel capacity:

$$C_{\mathrm{MJD}} = \frac{1}{M}\,\mathrm{E}\left[\log\det\left(I_{Mn} + \gamma H H^H\right)\right]. \quad (5)$$

Theorem 3.1. In the global MJD case, the per-cell capacity for asymptotically large n converges almost surely (a.s.) to the Marčenko-Pastur (MP) law with appropriate scaling [6,10]:

Proof. For the sake of completeness and to facilitate later derivations, an outline of the proof in [6,10] is provided here. The derivation of this expression is based on an asymptotic analysis in the number of antennas n → ∞: where $\lambda_i(X)$ and $f^\infty_X$ denote the eigenvalues and the asymptotic eigenvalue probability distribution function (a.e.p.d.f.) of matrix X respectively, and $V_X(x) = \mathrm{E}[\log(1 + xX)]$ denotes the Shannon transform of X with scalar parameter x. It should be noted that $\bar{\gamma} = n\gamma$ denotes the total UT transmit power normalized by the receiver noise power (note d). The last step of the derivation is based on unit rank matrices decomposition and analysis in the R-transform domain, as presented in [6,10]. The scaling factor is the squared Frobenius norm of the Σ matrix, $\mathrm{tr}\{\Sigma^H \Sigma\}$, normalized by the matrix dimensions, and step (a) follows from [10, Eq. (34)]. □
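The channel construction above lends itself to a short numerical sketch. The Python fragment below is a minimal illustration, assuming NumPy, natural logarithms (nats) and unit-variance noise; the function names `variance_profile`, `complex_gaussian` and `per_cell_capacity` are hypothetical helpers introduced here, not part of the original analysis. It builds Σ from the Toeplitz matrix of Eq. (4), draws H = Σ ⊙ G and averages the log-determinant of Eq. (5) over fading realizations.

```python
import numpy as np

def variance_profile(M, K, n, alpha):
    """Sigma = Sigma_tilde (Kronecker) ones(n, K*n), Sigma_tilde as in Eq. (4)."""
    sigma_tilde = np.zeros((M, M + 1))
    idx = np.arange(M)
    sigma_tilde[idx, idx] = 1.0        # intra-cell path loss coefficient
    sigma_tilde[idx, idx + 1] = alpha  # neighboring-cell path loss coefficient
    return np.kron(sigma_tilde, np.ones((n, K * n)))

def complex_gaussian(rows, cols, rng):
    """Matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def per_cell_capacity(M, K, alpha, gamma, trials, rng):
    """Monte Carlo estimate of (1/M) E[log det(I + gamma H H^H)] in nats."""
    n = K + 1  # antennas per BS and per UT, as assumed in the system model
    sigma = variance_profile(M, K, n, alpha)
    total = 0.0
    for _ in range(trials):
        H = sigma * complex_gaussian(sigma.shape[0], sigma.shape[1], rng)
        _, logdet = np.linalg.slogdet(np.eye(M * n) + gamma * (H @ H.conj().T))
        total += logdet
    return total / (trials * M)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(per_cell_capacity(M=4, K=2, alpha=0.5, gamma=1.0, trials=200, rng=rng))
```

Dividing the returned value by log 2 converts it to bits. The parameter values in the usage line are arbitrary and chosen only to keep the run short.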

Interference alignment
In order to evaluate the effect of IA as an intercluster interference mitigation technique, a simple precoding scheme is assumed for the cluster-edge UT groups, inspired by [24]. Let us assume an n × 1 reference vector $\mathbf{v}$ normalized so that $\|\mathbf{v}\|^2 = n$ and

$$y_1 = G_{1,1}\,x_1 + \alpha\,G_{1,2}\,x_2 + z_1,$$

$$y_M = G_{M,M}\,x_M + \alpha\,G_{M,M+1}\,x_{M+1} + z_M,$$

where $y_1$ and $y_M$ represent the received signal vectors at the first and last BS of the cluster, respectively. The first UT group has to align its input $x_1$ towards the non-intended BS of the cluster on the left (see Figure 1), while the Mth BS has to filter out the aligned interference coming from the (M + 1)th UT group, which belongs to the cluster on the right. These two strategies are described in detail in the following subsections:

Aligned interference filtering
The objective is to suppress the term $\alpha G_{M,M+1} x_{M+1}$, which represents intercluster interference. It should be noted that the UTs of the (M + 1)th cell are assumed to have perfect CSI about the channel coefficients $G_{M,M+1}$. Let us also assume that $\mathbf{x}^j_i$ and $G^j_{\tilde{i},i}$ represent the transmit vector and channel matrix of the jth UT in the ith group towards the $\tilde{i}$th BS. In this context, the following precoding scheme is employed to align interference:

$$\mathbf{x}^j_i = \left(G^j_{\tilde{i},i}\right)^{-1} \mathbf{v}_j\, x^j_i,$$

where $x^j_i$ on the right-hand side is the scalar symbol of the UT and $\mathbf{v}_j = \mathbf{v}\,v_j$ is a scaled version of $\mathbf{v}$ which satisfies the input power constraint. This precoding results in unit multiplexing gain and is by no means the optimal IA scheme (note e), but it serves as a tractable way of evaluating the IA performance [23,24]. Following this approach, the intercluster interference can be expressed as:

$$\alpha\,G_{M,M+1}\,x_{M+1} = \alpha \sum_{j=1}^{K} G^j_{M,M+1}\left(G^j_{M,M+1}\right)^{-1}\mathbf{v}_j\,x^j_{M+1} = \alpha\,\mathbf{v} \sum_{j=1}^{K} v_j\,x^j_{M+1}.$$

It can be easily seen that interference has been aligned across the reference vector and it can be removed using a K × n zero-forcing filter Q designed so that Q is a truncated unitary matrix [19] and $Q\mathbf{v} = 0$. After filtering, the received signal at the Mth BS can be expressed as:

$$\tilde{y}_M = Q\,y_M = Q\,G_{M,M}\,x_M + \tilde{z}_M.$$

Assuming that the system operates in the high-SNR regime and is therefore interference limited, the effect of the AWGN noise colouring $\tilde{z}_M = Q z_M$ can be ignored.

Proof (of Lemma 3.1). Using the property det(I + γAB) = det(I + γBA), it can be written that:

$$\det\left(I_K + \gamma\,Q\,G_{M,M}\,G_{M,M}^H\,Q^H\right) = \det\left(I_{Kn} + \gamma\,G_{M,M}^H\,Q^H Q\,G_{M,M}\right). \quad (15)$$

The K × n truncated unitary matrix Q has K unit singular values and therefore the matrix product $Q^H Q$ has K unit eigenvalues and a zero eigenvalue. Applying eigenvalue decomposition on $Q^H Q$, the left and right eigenvectors can be absorbed by the isotropic Gaussian matrices $G^H_{M,M}$ and $G_{M,M}$ respectively, while the zero eigenvalue removes one of the n dimensions. Using the definition of the Shannon transform [30], Eq. (15) yields the result. □

Based on this lemma and for the purposes of the analysis, $Q G_{M,M}$ is replaced by $G_{K \times K}$ in the equivalent channel matrix.
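The alignment and filtering steps described above can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: it uses NumPy, an all-ones reference vector (so that its squared norm equals n), unit per-UT scalings in place of the power-constraint scaling, and hypothetical helper names (`complex_gaussian`, `zero_forcing_filter`, `aligned_interference`). Each interfering UT inverts its own n × n channel towards the non-intended BS, so all K interfering signals arrive along the reference vector, and a K × n filter with orthonormal rows orthogonal to that vector removes them.

```python
import numpy as np

def complex_gaussian(rows, cols, rng):
    """Matrix with i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def zero_forcing_filter(v):
    """K x n matrix whose orthonormal rows span the orthogonal complement of v."""
    U, _, _ = np.linalg.svd(v.reshape(-1, 1), full_matrices=True)
    # The first left singular vector is parallel to v; the rest are orthogonal to it.
    return U[:, 1:].conj().T

def aligned_interference(K, alpha, rng):
    """Sum of the K precoded interfering signals as seen by the non-intended BS."""
    n = K + 1
    v = np.ones(n, dtype=complex)      # reference vector with ||v||^2 = n (assumed choice)
    scalings = np.ones(K)              # placeholder for the scalars v_j (assumption)
    symbols = complex_gaussian(K, 1, rng).ravel()
    total = np.zeros(n, dtype=complex)
    for j in range(K):
        G_j = complex_gaussian(n, n, rng)                       # channel of UT j to the non-intended BS
        x_j = np.linalg.solve(G_j, v) * scalings[j] * symbols[j]  # precoded transmit vector
        total += alpha * (G_j @ x_j)                            # arrives as a multiple of v
    return v, total

rng = np.random.default_rng(1)
v, interference = aligned_interference(K=3, alpha=0.5, rng=rng)
Q = zero_forcing_filter(v)
residual = np.linalg.norm(Q @ interference)  # numerically zero after filtering
```

Building Q from a singular value decomposition is only one convenient way of obtaining a truncated unitary matrix with Qv = 0; any orthonormal basis of the orthogonal complement of the reference vector would serve equally well.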

Interference alignment
The Mth BS has filtered out incoming interference from the cluster on the right (Figure 1), but outgoing intercluster interference should also be aligned to complete the analysis. This affects the first UT group, which should align its interference towards the Mth BS of the cluster on the left (Figure 1). Following the same precoding scheme and using Eq. (10), the signal of the first UT group at its intended BS can be expressed as:

$$G_{1,1}\,x_1 = \sum_{j=1}^{K} G^j_{1,1}\left(G^j_{0,1}\right)^{-1}\mathbf{v}\,v_j\,x^j_1,$$

where $G^j_{0,1}$ represents the fading coefficients of the jth UT of the first group towards the Mth BS of the neighboring cluster on the left. Since the exact eigenvalue distribution of the matrix product $G^j_{1,1}\left(G^j_{0,1}\right)^{-1}\mathbf{v}\,v_j$ is not straightforward to derive, for the purposes of rate analysis it is approximated by a Gaussian vector with unit variance. This approximation implies that IA precoding does not affect the statistics of the equivalent channel towards the intended BS.

Equivalent channel matrix
To summarize, IA has the following effects on the channel matrix H used for the case of global MJD. The intercluster interference originating from the (M + 1)th UT group is filtered out and thus Kn vertical dimensions are lost. During this process, one horizontal dimension of the Mth BS is also filtered out, since it contains the aligned interference from the (M + 1)th UT group. Finally, the first UT group has to precode in order to align its interference towards the Mth BS of the neighboring cluster and as a result only K out of Kn dimensions are preserved. The resulting channel matrix can be described as follows:

$$H_{IA} = \Sigma_{IA} \odot G_{IA},$$

where $G_{IA} \sim \mathcal{CN}(0, I_{Mn-1})$ and the variance profile $\Sigma_{IA}$ is assembled from submatrices that include $\Sigma_1$ (note f) and $\Sigma_3$ (note g). Since all intercluster interference has been filtered out and the effect of filter Q has already been incorporated in the structure of $H_{IA}$, the per-cell throughput in the IA case is still given by the MIMO MAC expression:

$$C_{IA} = \frac{1}{M}\,\mathrm{E}\left[\log\det\left(I_{Mn-1} + \gamma\,H_{IA} H_{IA}^H\right)\right].$$

Theorem 3.2. In the IA case, the per-cell throughput can be derived from the R-transform of the a.e.p.d.f. of matrix $\frac{1}{n} H_{IA}^H H_{IA}$.

Proof. Following an asymptotic analysis where n → ∞: The a.e.p.d.f. of $\frac{1}{n} H_{IA}^H H_{IA}$ is obtained from its Stieltjes transform, considering that the Stieltjes transform is derived from the R-transform [31] as follows:

$$R(z) = S^{-1}(-z) - \frac{1}{z}, \quad (23)$$

with k, b, q parameters given by:

Resource division multiple access

The channel modelling is identical with the one in the global MJD case (Eq. (1)), although in this case the throughput is analyzed separately for each orthogonal part and subsequently averaged. Assuming no CSI at the UTs, the per-cell throughput in the RDMA case is given by: where $C_{RD1}$ and $C_{RD2}$ denote the capacities for the first and second orthogonal part respectively. For the first part, the cluster processor receives signals from (M − 1)K UTs through all M BSs and the resulting Mn × (M − 1)Kn channel matrix is structured as follows:

Theorem 3.4. For the first part of the RDMA case, the per-cell throughput $C_{RD1}$ can be derived from the R-transform of the a.e.p.d.f. of matrix $\frac{1}{n} H_{RD1}^H H_{RD1}$.

Proof. Following an asymptotic analysis where n → ∞: Using the matrix decomposition of Eq. (27) and free additive convolution [30], the result follows from Theorem A.1. □

For the second part, the cluster processor receives signals from MK UTs through M − 1 BSs and the resulting (M − 1)n × MKn channel matrix is structured as follows: where the factor 2 is due to the doubling of the transmitted power.

Theorem 3.5. For the second part of the RDMA case, the per-cell throughput $C_{RD2}$ can be derived from the R-transform of the a.e.p.d.f. of matrix $\frac{1}{n} H_{RD2}^H H_{RD2}$, where:

Proof. Following an asymptotic analysis where n → ∞: The rest of this proof follows the steps of Theorem 3.4. □

Cochannel interference allowance
CI is considered as a worst-case scenario where no signal processing is performed in order to mitigate intercluster interference and thus interference is treated as additional noise [15]. As can be seen, this case serves as a lower bound to the IA case. The channel modelling is identical with the one in the global MJD case (Eq. (1)), although in this case the cluster-edge UT group contribution $\alpha G_{M,M+1}(t)\,x_{M+1}(t)$ is considered as interference. As a result, the interference channel matrix can be expressed as: Assuming no CSI at the UTs, the per-cell throughput in the CI case is given by [15,32-34]: where $C_I$ denotes the throughput of the interfering UT group normalized by the cluster size:

Theorem 3.6. In the CI case, the per-cell throughput converges almost surely (a.s.) to a difference of two scaled versions of the MP law:

Proof. Following an asymptotic analysis in the number of antennas n → ∞: Eq. (37) follows from Eq. (35), (38) and Theorem 3.1. □

Degrees of freedom
This section compares the degrees of freedom for each of the considered cases. The degrees of freedom determine the number of independent signal dimensions in the high-SNR regime [35] and are also known as prelog or multiplexing gain in the literature. They are a useful metric in cases where interference is the main impairment and AWGN can be considered unimportant.
Theorem 3.7. The degrees of freedom per BS antenna for the global MJD, IA, RDMA and CI cases are given by: (40)

Remark 3.1. It can be observed that $d_{IA} = d_{RD}$ only for a single UT per cell equipped with two antennas (K = 1, n = 2). For all other cases, $d_{IA} > d_{RD}$. Furthermore, it is worth noting that when the number of UTs K and antennas n grows to infinity, $\lim_{K,n \to \infty} d_{IA} = d_{MJD}$, which entails a multiuser gain. However, in practice the number of served UTs is limited by the number of antennas (n = K + 1) which can be supported at the BS side and, more importantly, at the UT side due to size limitations.
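The closed forms of Eq. (40) are not reproduced in the text above, so the following Python sketch should be read as one plausible reconstruction rather than as the theorem itself. It counts receive dimensions per BS antenna from the channel matrix sizes given in Section 3: Mn for global MJD, Mn − 1 for IA (one dimension is sacrificed by the filter Q), and the average of Mn and (M − 1)n for the two RDMA parts, assuming an equal resource split and enough transmit dimensions in every case. The function name `dof_per_bs_antenna` is hypothetical, and the CI case is left out because its expression cannot be recovered from the surrounding text.

```python
from fractions import Fraction

def dof_per_bs_antenna(M, K):
    """Receive-dimension count per BS antenna (assumed reading of Eq. (40))."""
    n = K + 1                 # antennas per BS, as in the system model
    total = M * n             # BS antennas in one cluster
    return {
        "MJD": Fraction(total, total),                     # all receive dimensions usable
        "IA": Fraction(total - 1, total),                  # one dimension removed by filter Q
        "RDMA": Fraction(total + (M - 1) * n, 2 * total),  # average of the two orthogonal parts
    }

if __name__ == "__main__":
    for K in (1, 2, 4):
        print(K, dof_per_bs_antenna(M=3, K=K))
```

Under these assumptions the counts reduce to 1 − 1/(Mn) for IA and 1 − 1/(2M) for RDMA, which coincide only for n = 2 and approach the global MJD value as n grows, in line with Remark 3.1.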

Complexity considerations
This paragraph discusses the complexity of each scheme in terms of decoding processing and required CSI. In general, the complexity of MJD is exponential with the number of users [36] and full CSI is required at the central processor for all users which are to be decoded. This implies that global MJD is highly complex since all system users have to be processed at a single point. On the other hand, clustering approaches reduce the number of jointly-processed users and as a result complexity. Furthermore, CI is the least complex since no action is taken to mitigate intercluster interference. RDMA has an equivalent receiver complexity with CI, but in addition it requires coordination between adjacent clusters in terms of splitting the resources. For example, time division would require inter-cluster synchronization, while frequency division could be even static. Finally, IA is the most complex since CSI towards the nonintended BS is also needed at the transmitter in order to align the interference. Subsequently, additional processing is needed at the receiver side to filter out the aligned interference.

Numerical results
This section presents a number of numerical results in order to illustrate the accuracy of the derived analytical expressions for finite dimensions and evaluate the performance of the aforementioned interference mitigation schemes. In the following figures, points represent values calculated through Monte Carlo simulations, while lines refer to curves evaluated based on the analytical expressions of Section 3. More specifically, the simulations are performed by generating $10^3$ instances of random Gaussian matrices, each one representing a single fading realization of the system. In addition, the variance profile matrices are constructed deterministically based on the considered α factors and used to shape the variance of the i.i.d. c.c.s. elements. Subsequently, the per-cell capacities are evaluated by averaging over the system realizations using: (a) Eq. (5) for global MJD, (b) Eq. (10)-(14) and (20) for IA, (c) Eq. (26) for RDMA, (d) Eq. (35), (36) for CI. In parallel, the analytical curves are evaluated based on: (a) Theorem 3.1 for global MJD, (b) Theorems 3.2 and 3.3 for IA, (c) Theorems 3.4 and 3.5 for RDMA, (d) Theorems 3.1 and 3.6 for CI. Table 1 presents an overview of the parameter values and ranges used for producing the numerical results of the figures.
Firstly, Figure 2 depicts the per-cell throughput versus the cluster size M for medium α factors. It should be noted that the α factor combines the effects of cell size and path loss exponent as explained in [37]. As expected, the performance of global MJD does not depend on the cluster size, since it is supposed to be infinite. For all interference mitigation techniques, it can be seen that the penalty due to clustering diminishes as the cluster size increases. Similar conclusions can be derived by plotting the degrees of freedom versus the cluster size M (Figure 3). In addition, it can be observed that the IA dofs approach the global MJD dofs as the number of UTs and antennas increases. Subsequently, Figure 4 depicts the per-cell throughput versus the α factor. For high α factors, RDMA performance converges to IA, whereas for low α factors RDMA performance degrades. It should also be noted that while the performance of global MJD and RDMA increases monotonically with α, the performance of IA and cochannel interference degrades for medium α factors. Finally, Figure 5 depicts the per-cell throughput versus the number of UTs per cell K. It should be noted that the number of antennas per UT n scales jointly with K. Based on this observation, a superlinear scaling of the performance can be observed, resulting primarily from the increase of spatial dimensions (more antennas) and secondarily from the increase of the system power (more UTs). As can be seen, the slope of the scaling is affected by the selected interference mitigation technique.
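The comparison between Monte Carlo points and analytical curves used in this section can be illustrated on the simplest possible case. The Python sketch below is not the paper's scaled expression from Theorem 3.1: it assumes a plain N × K matrix with i.i.d. CN(0, 1/N) entries, for which the closed-form Shannon transform of the Marčenko-Pastur law is a standard random matrix result, and compares it with a simulated average in nats. The function names `mp_shannon_transform` and `monte_carlo_rate` are hypothetical.

```python
import numpy as np

def mp_shannon_transform(gamma, beta):
    """Limit of (1/N) log det(I + gamma H H^H) in nats, H is N x K, beta = K/N."""
    f = (np.sqrt(gamma * (1 + np.sqrt(beta)) ** 2 + 1)
         - np.sqrt(gamma * (1 - np.sqrt(beta)) ** 2 + 1)) ** 2
    return (beta * np.log(1 + gamma - f / 4)
            + np.log(1 + gamma * beta - f / 4)
            - f / (4 * gamma))

def monte_carlo_rate(gamma, N, K, trials, rng):
    """Average of (1/N) log det(I + gamma H H^H) over random fading realizations."""
    total = 0.0
    for _ in range(trials):
        H = (rng.standard_normal((N, K))
             + 1j * rng.standard_normal((N, K))) / np.sqrt(2 * N)  # CN(0, 1/N) entries
        _, logdet = np.linalg.slogdet(np.eye(N) + gamma * (H @ H.conj().T))
        total += logdet
    return total / (trials * N)

rng = np.random.default_rng(0)
simulated = monte_carlo_rate(gamma=10.0, N=50, K=100, trials=100, rng=rng)
analytical = mp_shannon_transform(gamma=10.0, beta=2.0)
```

Even at these moderate dimensions the two values are expected to agree closely, which is the same behaviour the figures report for the finite-dimension accuracy of the asymptotic analysis.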

Conclusion
In this paper, various techniques for mitigating intercluster interference in clustered MJD were investigated. The case of global MJD was initially considered as an upper bound, serving to evaluate the degradation due to intercluster interference. Subsequently, the IA scheme was analyzed by deriving the asymptotic eigenvalue distribution of the channel covariance matrix using free-probabilistic arguments. In addition, the RDMA scheme was studied as a low-complexity method for mitigating intercluster interference. Finally, CI was considered as a worst-case scenario where no interference mitigation technique is employed. Based on these investigations, it was established that for dense cellular systems the RDMA scheme should be used as the best compromise between complexity and performance. For average to sparse cellular systems, which is the usual regime in macrocell deployments, IA should be employed when the additional complexity and availability of CSI at the transmitter side can be afforded. Alternatively, CI could be preferred, especially for highly sparse cellular systems.

A Proof of theorem
Theorem A.1. Let A = [0 B 0] be the concatenation of the variance-profiled Gaussian matrix $B = C \odot G$ and a number of zero columns. Let also k be the ratio of nonzero to total columns of A, b the ratio of horizontal to vertical dimensions of B and q the squared Frobenius norm of C normalized by the matrix dimensions. The R-transform of $A^H A$ is given by:

Proof. Let $B = C \odot G_{n \times m}$ be a variance-profiled Gaussian matrix with b = m/n and $q = \|C\|^2/(nm)$. According to [10], the R-transform of $\frac{1}{n} B^H B$ is given by: Using Eq. (23), the Stieltjes transform of $\frac{1}{n} B^H B$ can be expressed as: Matrix $\frac{1}{n} A^H A$ has identical eigenvalues to $\frac{1}{n} B^H B$ plus a number of zero eigenvalues, with 0 < k < 1 defined as the ratio of non-zero eigenvalues over the total number of eigenvalues. As a result, the a.e.p.d.f. of $\frac{1}{n} A^H A$ can be written as:

$$f^\infty_{\frac{1}{n} A^H A}(x) = k\,f^\infty_{\frac{1}{n} B^H B}(x) + (1 - k)\,\delta(x).$$

Using the definition of the Stieltjes transform [30] and employing Eq. (23), the proof is complete. □
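The eigenvalue argument in the proof above can be verified on a small example. The Python sketch below assumes NumPy, a hypothetical variance profile C drawn uniformly at random and a tall matrix B (n ≥ m), so that the Gram matrix of B has no zero eigenvalues of its own; the function name `padded_gram_eigenvalues` is introduced only for this illustration. It confirms that padding B with zero columns leaves the nonzero spectrum unchanged and only adds zero eigenvalues, with k equal to the fraction of nonzero columns.

```python
import numpy as np

def padded_gram_eigenvalues(n, m, zeros_left, zeros_right, rng):
    """Eigenvalues of (1/n) A^H A and (1/n) B^H B for A = [0 B 0], plus the ratio k."""
    C = rng.uniform(0.5, 1.5, size=(n, m))   # hypothetical variance profile
    G = (rng.standard_normal((n, m))
         + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    B = C * G                                 # Hadamard product C ⊙ G
    A = np.hstack([np.zeros((n, zeros_left)), B, np.zeros((n, zeros_right))])
    eig_B = np.linalg.eigvalsh(B.conj().T @ B / n)   # ascending order
    eig_A = np.linalg.eigvalsh(A.conj().T @ A / n)   # ascending order
    k = m / A.shape[1]                        # nonzero columns over total columns
    return eig_A, eig_B, k

rng = np.random.default_rng(2)
eig_A, eig_B, k = padded_gram_eigenvalues(n=8, m=4, zeros_left=2, zeros_right=2, rng=rng)
```

In this example half of the columns of A are zero, so half of the eigenvalues of its Gram matrix vanish and the remaining ones reproduce those of B, which is the finite-dimensional counterpart of the mixture density stated in the proof.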

Competing interests
The authors declare that they have no competing interests.
Notes
a The term throughput is used instead of capacity since the described techniques are suboptimal in the information-theoretic sense and lead to achievable sum-rates, except for MJD which leads to MIMO MAC capacity.
b The multiple antennas are assumed to be uncorrelated, although the analytical results can be extended to the correlated case based on the principles described in [10].
c For more details on the modelling of the α parameter, the reader is referred to [37].
d For the purposes of the analysis, the variable $\bar{\gamma}$ is kept finite as the number of antennas Mn grows large, so that the system power does not grow to infinity.
e Depending on the signal dimensions and the channel coefficients, more than one degree of freedom per user could be achieved. The feasibility of higher multiplexing gain has been studied in [21,22]. More specifically, the authors in [21] provide an algorithm which determines the achievable multiplexing gain by minimizing the interference leakage, while the authors in [22] provide conditions for classifying a scenario as proper or improper, a property which is shown to be connected to the feasibility of IA.
f The structure of the first block of $\Sigma_1$ originates in the Gaussian approximation of $\frac{1}{\alpha} G^j_{1,1}\left(G^j_{0,1}\right)^{-1}\mathbf{v}\,v_j$.
g The structure of the last block of $\Sigma_3$ is based on Lemma 3.1.