Analysis of Multiuser MIMO Downlink Networks Using Linear Transmitter and Receivers

In contrast to dirty-paper coding (DPC), which is largely information theoretic, this paper proposes a linear codec that can spatially multiplex the multiuser signals to realize the rich capacity of multiple-input multiple-output (MIMO) downlink broadcast (point-to-multipoint) channels when channel state information (CSI) is available at the transmitter. Assuming single-stream (or single-mode) communication for each user, we develop an iterative algorithm, which is stepwise optimal, to obtain the multiuser antenna weights accomplishing orthogonal space-division multiplexing (OSDM). The steady-state solution has a straightforward interpretation and requires only maximal-ratio combiners (MRC) at the mobile stations to capture the optimized spatial modes. Our main contribution is that the proposed scheme can greatly reduce the processing complexity (by a factor of at least the number of base station antennas) while maintaining the same error performance as a recently published OSDM method. Intensive computer simulations show that the proposed scheme provides multiuser diversity in addition to user separation in the spatial domain, so that both diversity and multiplexing can be obtained at the same time in the multiuser scenario.


INTRODUCTION
Recently, multiple-input multiple-output (MIMO) antenna coding/processing has received considerable attention because of its extraordinary capacity advantage over systems with a single antenna at both the transmitter and receiver ends. Independent studies by Telatar [1] and Foschini and Gans [2] have shown that the capacity of a MIMO channel grows at least linearly with the number of antennas at both ends without bandwidth expansion or an increase in transmit power. This exciting finding has prompted numerous subsequent studies on more advanced MIMO antenna systems (e.g., [3,4,5,6,7,8,9]). Performance enhancement utilizing MIMO antennas for single-user (point-to-point) wireless communications is by now well developed. The presence of other cochannel users in a MIMO system is, nonetheless, much less understood.
In general, a base station is allowed to have more antennas and is able to afford more sophisticated technologies. Therefore, it is always the responsibility of the base station to design techniques that can manage or control cochannel signals effectively. In the uplink (from many mobile stations (MSs) to one base station), space-division multiple-access (SDMA) can be accomplished through linear array processing [10,11] or multiuser detection by sphere decoding [12]. However, since a mobile station has to be inexpensive and compact, it rarely can afford the required complexity of performing multiuser detection or have a large number of receiving antennas. Support of multiple users sharing the same radio channel is thus much more challenging in downlink (from one base station to many mobile stations).
Promoting spectral reuse in downlink broadcast channels traces back several decades and the method is based on so-called "dirty-paper coding" (DPC) [13]. By means of known preinterference cancellation at the transmitter, DPC encodes the data in a way that the codes align themselves as much as possible with each other so as to maximize the sum capacity of a broadcast channel [14,15,16]. However, dirty-paper techniques are largely information theoretic and, worst of all, the encoding process to achieve the sum capacity is data dependent. This makes it inconsistent with existing communication architectures. For this reason, conventional downlink space-division multiplexing approaches tend to control the multiuser signals based on their signal-to-interference-plus-noise ratio (SINR) using linear transmitter and receivers [17,18,19].
In [17,18], the objective is to maintain for every user a preset SINR for acceptable signal reception. A joint power control and beamforming approach is presented, but a solution is not guaranteed to exist. Subsequently in [19], a closed-form solution that optimizes the base station antenna array by maximizing a lower bound of the product of the multiuser SINRs is proposed. The problem, however, is that in all of these works, the cochannel users are not truly uncoupled, and the residual cochannel interference (CCI) will not only degrade users' performance but also, more importantly, destroy the independence needed for managing multiuser signals (since the power of cochannel users must be carefully adjusted jointly). Since it is advantageous to handle users in an orthogonal manner (i.e., zero forcing (ZF)) in the spatial domain, recent attempts focus on the new paradigm of orthogonal space-division multiplexing (OSDM) in the downlink [20,21,22,23,24,25,26,27].
In [20,21], support of multiple users using a so-called joint transmission method is introduced in the context of code-division multiple-access (CDMA) systems. Because single-element mobile terminals are considered, these methods solve the problem only for the multiuser multiple-input single-output (MISO) scenario. OSDM techniques for multiuser MIMO systems have recently been proposed by several authors (e.g., [22,23,24,25,26,27]). In [22,23,24], by placing nulls at the antennas of all the unintended users, the downlink channel matrix is made block diagonal to eliminate the CCI. However, these methods fail to obtain the rich diversity of the channels and require an unnecessarily large number of transmit antennas at the base station when the mobile stations have multiple antennas. More recently in [25,26,27], iterative solutions that are able to optimize the receive antenna combining are presented. Among them, the iterative null-space-directed singular value decomposition (iterative Nu-SVD) proposed in [27] emerges as the most general method that is able to trade off between diversity and multiplexing [28] and requires the least possible number of transmit and receive antennas. The drawback, however, is that its complexity grows roughly with the number of base station antennas to the fourth-to-fifth power (see Section 3.2 for details). This greatly limits the scalability of the system when many users are to be served simultaneously.
In this paper, our aim is to devise a reduced-complexity linear codec for OSDM in broadcast MIMO channels and study the diversity and multiplexing behavior of the proposed system. It is assumed (as in [22,23,24,25,26,27]) that the channel state information (CSI) is known to both the transmitter and the receivers. By considering only single-stream (or single-mode) communication for each user, we derive a stepwise optimal iterative solution to obtain downlink OSDM. Surprisingly, we will show that the steady-state solution has a straightforward interpretation, in which every user ends up with a maximal-ratio combiner (MRC) under the ZF constraint. This intuition is then used to derive a method that requires much less overall computational complexity. Simulation results demonstrate that the overall complexity of the proposed method is smaller than that of the iterative Nu-SVD by a factor of at least the number of base station antennas, while achieving the same error probability performance.
The proposed scheme is analyzed by intensive computer simulations. In summary, results will reveal that the proposed scheme promises to provide multiuser diversity in addition to user separation in the spatial domain (i.e., both diversity and multiplexing can be obtained at the same time; consistent with single-user MIMO antenna systems [28]). The diversity does not diminish with the number of users if the number of base station antennas is kept at least the same as the number of users. In addition, the system performance improves with the number of receive antennas at the mobile stations (unlike [22,23,24]), showing the importance of collapsing the receive antennas to release the degrees of freedom available at the transmitter. Furthermore, the performance degradation is mild even in the presence of spatial correlation as high as 0.4, easily achievable with current antenna design technologies.
The remainder of the paper is organized as follows. In Section 2, we introduce the system model of a multiuser MIMO antenna system in downlink. Section 3 presents the optimality conditions for single-mode OSDM and proposes the iterative method that leads to the solution. Simulation results will be provided in Section 4. Finally, we conclude the paper in Section 5.
Throughout this paper, we use italic letters to denote scalars, boldface capital letters to denote matrices, and boldface lowercase letters to denote vectors. For any matrix $\mathbf{A}$, $\mathbf{A}^\dagger$ denotes the conjugate transpose of $\mathbf{A}$, $\mathbf{A}^T$ denotes the transpose of $\mathbf{A}$, and $a_{n,m}$ or $[\mathbf{A}]_{n,m}$ refers to the $(n, m)$th entry of $\mathbf{A}$. In addition, $\mathbf{I}$ denotes the identity matrix, $\mathbf{0}$ denotes the zero matrix, $\|\cdot\|$ denotes the Frobenius norm, and $\mathcal{N}(0, \sigma^2)$ is the complex Gaussian distribution function with zero mean and variance $\sigma^2$.

Linear signal processing at transmitter and receiver
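The system configuration of a multiuser MIMO system in downlink is shown in Figure 1. Before being transmitted from the $n_T$ base station antennas, the data symbol $z_m$ of the $m$th mobile user is postmultiplied by a complex antenna weight vector:
$$\mathbf{t}_m = \left[t_1^{(m)} \;\; t_2^{(m)} \;\; \cdots \;\; t_{n_T}^{(m)}\right]^T, \tag{1}$$
where $t_k^{(m)}$ represents the transmit antenna weight of the symbol $z_m$ at the $k$th base station antenna. The weighted symbols of all users at the $k$th antenna are then summed up to produce a signal $x_k$, which is finally transmitted from the antenna. Defining the transmitted signal vector as $\mathbf{x} \triangleq [x_1 \;\; x_2 \;\; \cdots \;\; x_{n_T}]^T$ and the multiuser transmit weight matrix as $\mathbf{T} \triangleq [\mathbf{t}_1 \;\; \mathbf{t}_2 \;\; \cdots \;\; \mathbf{t}_M]$, the transmitted signal vector can be expressed as
$$\mathbf{x} = \mathbf{T}\mathbf{z}, \tag{2}$$
where $\mathbf{z} \triangleq [z_1 \;\; z_2 \;\; \cdots \;\; z_M]^T$ is defined as the multiuser symbol vector. Note that single signal-stream (or single-mode) communication has been assumed for each user.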
Given a flat fading channel, at the $m$th mobile receiver, the signal at each receive antenna is a noisy superposition of the $n_T$ transmitted signals perturbed by fading. As a result, we have
$$\mathbf{y}_m = \mathbf{H}_m \mathbf{x} + \mathbf{n}_m, \tag{3}$$
where $\mathbf{y}_m = [y_1^{(m)} \;\; y_2^{(m)} \;\; \cdots \;\; y_{n_{R_m}}^{(m)}]^T$ is the received signal vector with element $y_\ell^{(m)}$ denoting the received signal at the $\ell$th antenna of the $m$th mobile station, $\mathbf{n}_m$ is the noise vector with elements assumed to have distribution $\mathcal{N}(0, N_0)$, and $\mathbf{H}_m$ denotes the channel matrix from the base station to the $m$th mobile station, given by
$$\mathbf{H}_m = \begin{bmatrix} h_{1,1}^{(m)} & \cdots & h_{1,n_T}^{(m)} \\ \vdots & \ddots & \vdots \\ h_{n_{R_m},1}^{(m)} & \cdots & h_{n_{R_m},n_T}^{(m)} \end{bmatrix}, \tag{4}$$
where $h_{\ell,k}^{(m)}$ denotes the fading coefficient from the base station antenna $k$ to the receive antenna $\ell$ of the $m$th mobile station. We model the $h_{\ell,k}^{(m)}$'s statistically by spatially correlated zero-mean complex Gaussian random variables with unit variance (i.e., $E[|h_{\ell,k}^{(m)}|^2] = 1$), so the amplitudes are Rayleigh distributed and their phases are uniformly distributed from 0 to $2\pi$. A detailed description of the spatially correlated multiuser MIMO channel model will be presented in the next subsection.
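An estimate of the transmitted symbol, $\hat{z}_m$, can be obtained by combining the received signal vector at the $m$th mobile station. This is done by
$$\hat{z}_m = \mathbf{r}_m^\dagger \mathbf{y}_m, \tag{5}$$
where $\mathbf{r}_m = [r_1^{(m)} \;\; r_2^{(m)} \;\; \cdots \;\; r_{n_{R_m}}^{(m)}]^T$ is the receive antenna weight vector of the $m$th mobile station. Consequently, we can write the multiuser MIMO antenna system as [19,25]
$$\hat{z}_m = \mathbf{r}_m^\dagger \left(\mathbf{H}_m \mathbf{T} \mathbf{z} + \mathbf{n}_m\right), \quad m = 1, 2, \ldots, M. \tag{6}$$
If we further define $\hat{\mathbf{z}} \triangleq [\hat{z}_1 \;\; \cdots \;\; \hat{z}_M]^T$, $\mathbf{H} \triangleq [\mathbf{H}_1^T \;\; \cdots \;\; \mathbf{H}_M^T]^T$, $\mathbf{n} \triangleq [\mathbf{n}_1^T \;\; \cdots \;\; \mathbf{n}_M^T]^T$, and the block-diagonal receive weight matrix $\mathbf{R} \triangleq \mathrm{diag}(\mathbf{r}_1, \ldots, \mathbf{r}_M)$, the entire system can be written as
$$\hat{\mathbf{z}} = \mathbf{R}^\dagger \left(\mathbf{H} \mathbf{T} \mathbf{z} + \mathbf{n}\right). \tag{7}$$
The definition of (7) will become useful when we introduce the spatial correlation model next.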

Spatially correlated multiuser MIMO channel model
Provided the channels are spatially uncorrelated, then
$$E\!\left[h_{\ell,k}^{(m)} \, h_{\ell',k'}^{(m')*}\right] = \begin{cases} 1 & \text{if } m = m',\ \ell = \ell',\ k = k', \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$
To model the spatial correlation among the antenna elements at the transmitter and receivers, we use the separable correlation model [29], which assumes that the correlations among receiver and transmitter array elements are independent of one another. An intuitive justification is that in most situations, only the immediate surroundings of an antenna array impose the correlation between its array elements, and they have no impact on the correlations observed between the elements of the array at the other end of the link. With this assumption, spatial correlation can be introduced by postmultiplying by the transmitter correlation matrix, $\boldsymbol{\Gamma}_T^{1/2}$, and premultiplying by the receiver correlation matrix, $\boldsymbol{\Gamma}_R^{1/2}$, so that
$$\mathbf{H} = \boldsymbol{\Gamma}_R^{1/2} \, \tilde{\mathbf{H}} \, \boldsymbol{\Gamma}_T^{1/2}, \tag{10}$$
where $\tilde{\mathbf{H}}$ is an independent and identically distributed (i.i.d.) channel matrix satisfying (9). Furthermore, as the distance between different mobile stations is generally large enough, it is reasonable to assume that the correlation between antennas of different mobile stations is zero. Following this, a matrix of the receiver correlation coefficients can be constructed as
$$\boldsymbol{\Gamma}_R = \begin{bmatrix} \boldsymbol{\Gamma}_{R_1} & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \boldsymbol{\Gamma}_{R_M} \end{bmatrix}, \tag{11}$$
where $\boldsymbol{\Gamma}_{R_m}$ is the receive correlation matrix of the $m$th mobile station. The values of the correlation coefficients may vary according to different communication environments and are usually determined empirically. In order to make our analysis tractable, the single-parameter correlation model proposed in [30] is used to determine $\boldsymbol{\Gamma}_T$ and $\boldsymbol{\Gamma}_R$ as functions of only the parameters $\gamma_T$ and $\gamma_{R_m}$, respectively.
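The separable model can be illustrated with a short simulation sketch. The code below is only an illustration under stated assumptions: the correlation matrix with entries $\gamma^{|i-j|}$ is a hypothetical single-parameter form chosen for this example and is not taken from [30]; a Cholesky factor is used in place of the symmetric square root $\boldsymbol{\Gamma}^{1/2}$ (it yields the same second-order statistics); and all function names are invented here. One such channel would be drawn per mobile station, with its own $\gamma_{R_m}$.

```python
import math
import random

def corr_matrix(n, gamma):
    """Hypothetical single-parameter correlation matrix (assumption, not [30]):
    entry (i, j) equals gamma**|i - j|, so gamma = 0 gives the identity."""
    return [[float(gamma) ** abs(i - j) for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a real symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def iid_channel(n_r, n_t, rng):
    """i.i.d. zero-mean complex Gaussian entries with unit variance."""
    s = math.sqrt(0.5)
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n_t)]
            for _ in range(n_r)]

def correlated_channel(n_r, n_t, gamma_r, gamma_t, rng):
    """One user's channel H_m = L_R * H_iid * L_T^T (separable correlation)."""
    l_r = cholesky(corr_matrix(n_r, gamma_r))
    l_t = cholesky(corr_matrix(n_t, gamma_t))
    return matmul(matmul(l_r, iid_channel(n_r, n_t, rng)), transpose(l_t))

# Example: a user with 2 receive antennas, 4 base station antennas.
h_1 = correlated_channel(2, 4, gamma_r=0.4, gamma_t=0.4, rng=random.Random(1))
```

Because the receivers of different mobile stations are assumed uncorrelated, drawing each $\mathbf{H}_m$ separately in this way corresponds to the block-diagonal receiver correlation structure described above.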

Optimization of the linear processors
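In this section, our objective is to determine the transmit and receive antenna weights, $(\mathbf{T}, \mathbf{R})$, that can project the multiuser signals onto orthogonal subspaces (see (14) defined later) and at the same time maximize the sum-gain metric (or the sum of the squared resultant channel responses of the spatial modes). Mathematically, this can be written as
$$(\mathbf{T}, \mathbf{R})_{\mathrm{opt}} = \arg\max_{\mathbf{T}, \mathbf{R}} \sum_{m=1}^{M} |\beta_m|^2 \tag{13}$$
subject to
$$\mathbf{r}_m^\dagger \mathbf{H}_m \mathbf{t}_n = \begin{cases} \beta_m & \text{if } n = m, \\ 0 & \text{if } n \neq m, \end{cases} \tag{14}$$
where $\beta_m$ is considered as the resultant channel response for user $m$. Without loss of optimality, hereafter, we will assume that $\|\mathbf{t}_m\| = \|\mathbf{r}_m\| = 1$. According to (13) and (14), it is clear that the optimal solutions of $\mathbf{T}$ and $\mathbf{R}$ depend on each other. To solve this optimization, we begin by assuming that all the receive vectors are already fixed and known, and later consider the optimization over all possible receive vectors. By doing so, the overall system can be reduced to a multiuser MISO system with an equivalent multiuser channel matrix, $\mathbf{H}_e$, as
$$\mathbf{H}_e = \begin{bmatrix} \mathbf{r}_1^\dagger \mathbf{H}_1 \\ \vdots \\ \mathbf{r}_M^\dagger \mathbf{H}_M \end{bmatrix}. \tag{15}$$
Following (13) and (14), we are thus required to find the optimal transmit antenna weight vectors $\mathbf{t}_m$'s so that
$$\mathbf{T}_{\mathrm{opt}} = \arg\max_{\mathbf{T}} \sum_{m=1}^{M} |\beta_m|^2 \tag{16}$$
subject to
$$\mathbf{H}_e \mathbf{T} = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_M). \tag{17}$$
Now, we define another set of weight vectors
$$\mathbf{g}_m \triangleq \frac{\mathbf{t}_m}{\beta_m}. \tag{18}$$
Since $\|\mathbf{t}_m\| = 1$ implies $|\beta_m| = 1/\|\mathbf{g}_m\|$, the optimization problem (16) and (17) can be rewritten as
$$\mathbf{g}_{m,\mathrm{opt}} = \arg\min_{\mathbf{g}_m} \|\mathbf{g}_m\|^2 \quad \forall m, \tag{19}$$
$$\mathbf{H}_e \mathbf{g}_m = \mathbf{e}_m \quad \forall m, \tag{20}$$
respectively, where $\mathbf{e}_m$ denotes the $m$th column of $\mathbf{I}$. Further, by defining a matrix $\mathbf{G} \triangleq [\mathbf{g}_1 \;\; \mathbf{g}_2 \;\; \cdots \;\; \mathbf{g}_M]$, (20) can be concisely expressed as
$$\mathbf{H}_e \mathbf{G} = \mathbf{I}. \tag{21}$$
In order for a solution of (21) to exist, we must have $\mathrm{rank}(\mathbf{H}_e), \mathrm{rank}(\mathbf{G}) \ge \mathrm{rank}(\mathbf{I}) = M$. As a result, OSDM is possible only when $n_T \ge M$, and this constitutes one necessary condition for OSDM in multiuser MISO/MIMO channels [25,27].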
When $n_T = M$, the optimal solution for the weights, $\mathbf{G}$, is simply
$$\mathbf{G} = \mathbf{H}_e^{-1}, \tag{22}$$
where the superscript $-1$ denotes inversion of a matrix. Note that this is the one and only solution of (21).
When $n_T > M$, there are generally infinitely many possible solutions for $\mathbf{G}$. Among these possible solutions, we need to select the one that performs the minimization of (19), and hence the maximization of (16). This problem can be recognized as a typical least squares problem for an underdetermined linear system [31], and it can be solved as follows.
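Decomposing the equivalent channel matrix as $\mathbf{H}_e = \mathbf{U} \boldsymbol{\Lambda} \mathbf{V}^\dagger$, where $\mathbf{U} = [\mathbf{u}_1 \;\; \cdots \;\; \mathbf{u}_M]$ is the left unitary matrix, $\mathbf{V} = [\mathbf{v}_1 \;\; \cdots \;\; \mathbf{v}_{n_T}]$ is the right unitary matrix, and $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots) \in \mathbb{R}^{M \times n_T}$ contains the singular values of $\mathbf{H}_e$, the optimal solution for $\mathbf{g}_m$ (in the sense of (19) and (20) jointly) is then given by [31]
$$\mathbf{g}_m = \sum_{i=1}^{M} \frac{\mathbf{u}_i^\dagger \mathbf{e}_m}{\lambda_i} \, \mathbf{v}_i, \tag{23}$$
where $\mathbf{e}_m$ is the $m$th column of $\mathbf{I}$. More importantly, it can be shown that the solution (23) can be rewritten in an easier-to-compute form, as the pseudoinverse of $\mathbf{H}_e$, that is,
$$\mathbf{G} = \mathbf{H}_e^{+}, \tag{24}$$
where the superscript $+$ denotes the Moore-Penrose pseudoinverse of a matrix [31]. Accordingly, we can find the optimal transmit antenna weights by
$$\mathbf{t}_m = \frac{\mathbf{g}_m}{\|\mathbf{g}_m\|}. \tag{25}$$
Thus far, we have maximized the resultant channel gain based on fixed-value receive vectors. Now, we will further optimize it over all possible receive vectors. Given the set of the "optimal" transmit vectors, the problem remains to solve for the receive weight vector that best balances the CCI and noise at each mobile station (relaxing the ZF constraint for the moment). The minimum mean square error (MMSE) solution (26) gives the optimum. Equations (25) and (26) jointly compose the optimality conditions for our problem.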
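For fixed receive vectors, the transmit-side step (24) and (25) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes $\mathbf{H}_e$ has full row rank so that the pseudoinverse can be written as $\mathbf{H}_e^\dagger (\mathbf{H}_e \mathbf{H}_e^\dagger)^{-1}$, it uses a small pure-Python Gauss-Jordan inverse in place of a linear algebra library, and the function names are hypothetical.

```python
import math
import random

def herm(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small square matrices only)."""
    n = len(a)
    aug = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def zf_transmit_weights(h_e):
    """Return T (n_T x M): unit-norm columns of G = H_e^H (H_e H_e^H)^{-1}."""
    h_h = herm(h_e)
    g = matmul(h_h, inverse(matmul(h_e, h_h)))   # right pseudoinverse of H_e
    m, n_t = len(h_e), len(h_e[0])
    norms = [math.sqrt(sum(abs(g[k][j]) ** 2 for k in range(n_t))) for j in range(m)]
    return [[g[k][j] / norms[j] for j in range(m)] for k in range(n_t)]

# Example: M = 2 users, n_T = 4 base station antennas, random equivalent channel.
rng = random.Random(0)
h_e = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(2)]
t = zf_transmit_weights(h_e)
prod = matmul(h_e, t)   # expected to be diagonal: diag(beta_1, beta_2)
```

In this sketch the off-diagonal entries of `prod` vanish up to rounding, and the diagonal entries equal $1/\|\mathbf{g}_m\|$, which is the resultant channel gain $\beta_m$ of each user under the unit-norm constraint.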
To find the antenna weights that satisfy the conditions, an iterative updating process is necessary to tune the transmit and receive vectors because when using (26) for a given (generally not optimal) T, the orthogonality between different mobiles may be lost due to the mismatch.The details of the algorithm are given as follows.
(5) Compute $\epsilon_i$. If $|\epsilon_i|$ satisfies a certain condition (described next), convergence is said to be achieved. Otherwise, go back to step (2).
We refer to this method as iterative pseudoinverse MMSE (iterative Pinv-MMSE). By changing the rule for convergence, the iterative algorithm can be used to achieve either OSDM (i.e., ZF) or SINR balancing. For example, if we require that $|\epsilon_i| \le \epsilon_0$ for all $i$, where $\epsilon_0$ is a preset value (typically less than $10^{-6}$), the algorithm ends up with ZF. Alternatively, we can impose an SINR-based condition, where $p_n$ denotes the transmit power for the $n$th mobile station and $\gamma_0$ is the preset SINR for ensuring certain link reliability. This criterion leads to SINR balancing. As stated before, the SINR balancing method involves joint tuning of the power distribution, the $p_n$'s, and the weight vectors, so it suffers from high complexity and sometimes may not converge. Therefore, we concentrate on the ZF method only. According to (24) and (26), the optimal solution of $\mathbf{T}$ can be expressed as a function of the noise level $N_0$. However, it can be proved (see the appendix) that with the ZF constraint, the optimum MMSE receiver (26) can be simplified as
$$\mathbf{r}_m = \frac{\mathbf{H}_m \mathbf{t}_m}{\|\mathbf{H}_m \mathbf{t}_m\|}, \tag{30}$$
which is essentially an MRC receiver. This reveals that the optimal solution is in fact independent of $N_0$. What is important here is that the MMSE solution (26) in step (4) can be replaced by the MRC solution (30) to greatly reduce the computational complexity of the iterative algorithm (to be discussed in Section 3.2). We refer to the method using (30) as iterative Pinv-MRC.
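One way to see why the MMSE receiver collapses to MRC under the ZF constraint is sketched below. This is an illustrative argument and not necessarily the one used in the appendix; it assumes the standard linear MMSE normal equation with unit transmit power per user, and the shorthand $\mathbf{a}_n \triangleq \mathbf{H}_m \mathbf{t}_n$ and the scalar $c_m$ are introduced here only for this sketch.

```latex
% Assumption: standard linear MMSE receiver at user m, unit power per user,
% with a_n := H_m t_n the effective signature of user n seen at user m.
\begin{align*}
\Big(\sum_{n=1}^{M} \mathbf{a}_n \mathbf{a}_n^{\dagger} + N_0 \mathbf{I}\Big)\mathbf{r}_m
  &= c_m\,\mathbf{a}_m
  && \text{(MMSE normal equation, $c_m$ a scalar)} \\
\mathbf{a}_n^{\dagger}\mathbf{r}_m &= 0, \quad n \neq m
  && \text{(ZF constraint } \mathbf{r}_m^{\dagger}\mathbf{H}_m\mathbf{t}_n = 0\text{)} \\
\Rightarrow\quad \mathbf{a}_m\big(\mathbf{a}_m^{\dagger}\mathbf{r}_m\big) + N_0\,\mathbf{r}_m
  &= c_m\,\mathbf{a}_m \\
\Rightarrow\quad \mathbf{r}_m
  &= \frac{c_m - \mathbf{a}_m^{\dagger}\mathbf{r}_m}{N_0}\,\mathbf{a}_m
  \;\propto\; \mathbf{H}_m \mathbf{t}_m .
\end{align*}
```

After normalizing to $\|\mathbf{r}_m\| = 1$, the noise level $N_0$ drops out (up to a unit-modulus phase), which is consistent with the MRC form in (30).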
Here, it is worth pointing out two facts. First of all, although iterative Pinv-MRC and iterative Pinv-MMSE converge to the same point, the MRC and MMSE receivers give different updates at each iteration. As a matter of fact, the two methods may have different convergence properties. Figure 2 shows the number of iterations for convergence versus the preset threshold $\epsilon_0$, for a system with 4 transmit antennas communicating to 2 mobile stations each with 2 receive antennas, at a signal-to-noise ratio (SNR) of 20 dB. As can be seen, the number of required iterations for iterative Pinv-MMSE is much larger than that for iterative Pinv-MRC.
Secondly, although the iterative process described before involves the computation of receive vectors, they are only temporary variables in the process of optimizing the transmit vectors. In other words, the optimal transmit vectors can be computed solely at the transmitter without the need for coordination with the receivers. This can be made apparent by combining the optimality conditions (24) and (30), to yield
$$\mathbf{T} = \begin{bmatrix} \mathbf{t}_1^\dagger \mathbf{H}_1^\dagger \mathbf{H}_1 \\ \vdots \\ \mathbf{t}_M^\dagger \mathbf{H}_M^\dagger \mathbf{H}_M \end{bmatrix}^{+} \mathrm{diag}(\mu_1, \mu_2, \ldots, \mu_M), \tag{31}$$
where the $\mu_m$'s are real constants to ensure $\|\mathbf{t}_m\| = 1$ for all $m$. Accordingly, we have the following fixed-point iteration:
$$\mathbf{T}^{(\nu+1)} = f\!\left(\mathbf{T}^{(\nu)}\right), \tag{32}$$
where the superscript $\nu$ denotes the $\nu$th iterate, and $f$ indicates the updating procedure stated in (31). The updating equation alone will solve the optimization at the transmitter. As for each mobile receiver, (30) can be used to capture the optimized spatial mode.
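The alternation between the pseudoinverse step and the MRC update can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the receive vectors are initialized randomly (the initialization is an assumption of this sketch), the stopping quantity is taken to be the largest residual cross-talk $|\mathbf{r}_m^\dagger \mathbf{H}_m \mathbf{t}_n|$, $n \neq m$ (also an assumption), at least two users are required, and all function names are hypothetical.

```python
import math
import random

def herm(a):
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small square matrices only)."""
    n = len(a)
    aug = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def unit(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v]

def pinv_mrc(channels, eps0=1e-6, max_iter=1000, seed=0):
    """channels[m] is the n_Rm x n_T matrix H_m (M >= 2 users).
    Returns (list of t_m, list of r_m, residual cross-talk, iterations used)."""
    rng = random.Random(seed)
    users = range(len(channels))
    # Assumed initialization: random unit-norm receive vectors.
    r = [unit([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in h]) for h in channels]
    for it in range(1, max_iter + 1):
        # Equivalent MISO channel: row m is r_m^H H_m.
        h_e = [matmul(herm([[x] for x in r[m]]), channels[m])[0] for m in users]
        h_h = herm(h_e)
        g = matmul(h_h, inverse(matmul(h_e, h_h)))        # pseudoinverse of H_e
        t = [unit([row[j] for row in g]) for j in users]  # unit-norm columns t_m
        # a[m][n] = H_m t_n; the MRC update is r_m = H_m t_m / ||H_m t_m||.
        a = [[[sum(hrow[k] * t[n][k] for k in range(len(hrow))) for hrow in channels[m]]
              for n in users] for m in users]
        r = [unit(a[m][m]) for m in users]
        leak = max(abs(sum(x.conjugate() * y for x, y in zip(r[m], a[m][n])))
                   for m in users for n in users if n != m)
        if leak <= eps0:
            break
    return t, r, leak, it

rng = random.Random(1)
def rand_h(n_r, n_t):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_t)]
            for _ in range(n_r)]

channels = [rand_h(2, 4), rand_h(2, 4)]      # a {4, [2, 2]} system
t, r, leak, iters = pinv_mrc(channels)
print(iters, leak)
```

Only the channel matrices enter the loop, so the whole procedure can run at the base station; the receive vectors appear as intermediate quantities. With single-antenna mobiles the first pass already satisfies the ZF condition, since each receive weight is then just a phase.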

Complexity analysis
Iterative Pinv-MRC offers a linear codec for OSDM at an affordable complexity compared to existing schemes. To highlight this, the complexity requirements per iteration in terms of the number of floating point operations (flops) for the proposed method and the iterative Nu-SVD method in [27] are listed in Table 1, where $n_{R_m} = n_R$ for all $m$ has been assumed. Further, it is assumed that recursive SVD [31] is used for computing the SVD and null-space while matrix inversion is performed using Gaussian elimination. Note that in most cases, n_T ≥ M n_R. The dominant factors which determine the computational complexity are $M$ and $n_T$. It follows that the iterative Nu-SVD algorithm needs roughly $O(11 n_T^3 M + 2 n_T^2 M^2)$ flops per iteration, while the proposed method requires only $O(4 n_T M^2)$ flops per iteration. Therefore, for each iteration, complexity reduction by a factor of at least $n_T$ can be achieved. On the other hand, the complexity is also determined by the number of iterations required for convergence, and it will be shown that iterative Pinv-MRC in general requires a similar or, in some cases, a slightly greater number of iterations than iterative Nu-SVD. A more detailed discussion will be provided in Section 4.2 where examples are considered.
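The two leading-order expressions can be compared numerically. The sketch below evaluates only the dominant terms quoted above, so it is a rough indication and not a reproduction of the exact counts in Table 1; the function names are hypothetical.

```python
def flops_nu_svd(n_t, m):
    """Leading-order per-iteration flops quoted for iterative Nu-SVD."""
    return 11 * n_t**3 * m + 2 * n_t**2 * m**2

def flops_pinv_mrc(n_t, m):
    """Leading-order per-iteration flops quoted for iterative Pinv-MRC."""
    return 4 * n_t * m**2

def reduction_factor(n_t, m):
    """Ratio of the two leading-order estimates."""
    return flops_nu_svd(n_t, m) / flops_pinv_mrc(n_t, m)

for n_t, m in [(2, 2), (4, 2), (4, 4), (8, 4)]:
    print(n_t, m, reduction_factor(n_t, m))
```

Algebraically the ratio is $n_T (11 n_T + 2M) / (4M)$, which is at least $n_T$ whenever $n_T \ge M$, in line with the per-iteration reduction stated in the text.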

SIMULATION RESULTS AND DISCUSSION
Monte Carlo simulations have been carried out to assess the system performance of the proposed multiuser MIMO antenna system. Results on average bit error rate (BER) for various SNRs are presented. In order to assess how effectively the transmit power is transformed into received power, the SNR used here is the average transmit energy per branch-to-branch relative to the noise power. Perfect CSI is assumed to be available at the base station and all mobile stations.
The channel model is assumed to be quasistatic flat Rayleigh fading so that the channel is fixed during one frame and changes independently between frames. The fading coefficients among transmit and receive antenna pairs are spatially correlated and modelled by (10). The frame length is set to be 128 symbols and 4- and 16-QAM (quadrature amplitude modulation) will be used. More than 100 000 independent channel realizations are used to obtain the numerical results for each simulation. For convenience, we will use the notation $\{n_T, [n_{R_1}, \ldots, n_{R_M}]\}$ to denote a multiuser MIMO antenna system, which has $n_T$ transmit antennas at the base station and $M$ mobile users each with $n_{R_m}$ receive antennas.

Comparison with previous OSDM schemes [22, 23, 24, 25, 26, 27]
In Figure 3, we provide the average BER results for the proposed iterative Pinv-MRC and the approach in [22,23,24] (referred to as preprocessing-SVD) for various SNRs assuming no spatial correlation (i.e., $\gamma_T, \gamma_R = 0$). The system configurations we consider are: (a) {4, [2,2]} and (b) {4, [3,3]}. As can be seen in this figure, the performance of iterative Pinv-MRC is significantly better than that of [22,23,24]. Specifically, more than an order of magnitude reduction in BER is possible for {4, [2,2]} systems and even more improvement is achieved for {4, [3,3]} systems. Most importantly, for the method in [22,23,24], the performance gets worse if the number of mobile station antennas increases, since more degrees of freedom need to be consumed for nullification of signals at the receive antennas. However, this is not true for our proposed method, whose performance is shown to improve with an increasing number of receive antennas at the mobile station. This can be explained by the fact that for iterative Pinv-MRC, only one degree of freedom is needed at the transmitter for CCI suppression per user, while the method in [22,23,24] requires $n_R$ (= 2 or 3) degrees of freedom. The remaining degrees of freedom left at the base station can be utilized for diversity enhancement. In Figure 4, the average BER results for the proposed iterative Pinv-MRC, the iterative Nu-SVD [27], and the Jacobi-like approach in [25] are plotted against the average SNR for the configuration {2, [3,3]}. Results indicate that the three OSDM approaches perform nearly the same. This is further confirmed by other results (which are not included in this paper because of limited space) showing that the three methods have nearly the same performance, with negligible difference, for the scenarios in which all of them obtain downlink OSDM. However, it is worth emphasizing that the method in [25] requires for every mobile station one additional antenna for the interference space, while the iterative Nu-SVD requires a much higher computational complexity than the proposed iterative Pinv-MRC (see results in Section 4.2).

BER results versus the number of receive antennas at the mobile station
In Figure 5, we investigate the impact on the performance of one user (say, user 1) by varying the number of antennas at another mobile receiver (say, user 2). The single-user systems {2, [1]} and {2, [2]} and a 2-user system {4, [1,1]} are also included for comparison. When $n_{R_2}$ increases, the BER performances of user 1 for all three configurations reduce and eventually settle to certain error rates. Intriguingly, for {2, [1, $n_{R_2}$]}, if $n_{R_2}$ is large, its performance becomes that of a single-user system {2, [1]}. Similarly, {2, [2, $n_{R_2}$]} and {4, [1, $n_{R_2}$, 1]} converge to, respectively, {2, [2]} and {4, [1,1]} systems when $n_{R_2}$ is large. In other words, by increasing the number of antennas at mobile station 2, user 2 will appear to be invisible to user 1. The reason is that with a sufficiently large number of antennas at mobile station 2, little needs to be done at the base station for suppressing the CCI to mobile station 2. Consequently, the optimization will be performed as if mobile station 2 does not exist.
[Figure 4: iterative Pinv-MRC method compared with the iterative Nu-SVD [27] and the Jacobi-like method in [25]; curves for {2, [3,3]}.]

BER results versus the number of users
In Figure 6, we study the impact of the number of mobile users in the iterative Pinv-MRC system. In this study, transmissions are 4-QAM with 8 dB of average SNR. To make OSDM possible, the number of transmit antennas $n_T$ must be equal to or greater than the number of mobile users $M$ (i.e., $n_T \ge M$) [27]. In this figure, we set $n_T = M$ to see if BER performance depends on the number of users in the system. Results are plotted for various $n_R$ (from 1 to 4). When $n_R = 1$, the BER performance remains unchanged as $M$ increases. This can be explained by the fact that for multiuser MISO antenna systems, the system performance of each mobile station is the same as that of a single-user MISO system with $n_T - M + 1 = 1$ transmit antenna. When $n_R > 1$, the BER performance improves significantly as the number of receive antennas increases, and more diversity can be achieved for a system with more users. The reason is that with more users in the system, more base station antennas need to be employed for user separation. The increase in the degrees of freedom contributes partly to maintaining the orthogonalization and partly to obtaining diversity. Therefore, if the number of transmit antennas keeps matching the number of users, supporting more users in the system is beneficial rather than detrimental. Hence, both diversity and multiplexing can be achieved at the same time not only for single-user [28] but also for multiuser MIMO antenna systems.

BER performances versus number of iterations
Compared to some existing closed-form solutions for multiuser MIMO systems [22,23,24], the drawback of our method is the need for an iterative process, which sometimes may induce unpredictable computational complexity. The investigation of the number of iterations needed for convergence will be presented in the next subsection. Here we show that, in most cases, after a few iterations, the system performance will be very close to the steady-state solution. Figure 7 gives the average BER performance versus the iteration number under four different system configurations. In this figure, the average SNR is fixed to 8 dB and 4-QAM is used; the dashed lines with filled symbols are the steady-state performance of the corresponding configurations. It is worth mentioning that the BER performances at 0 iterations are actually the performances of the scheme proposed in [23]. With respect to this point, we can see that our scheme can have significant performance improvement compared to [23] with just a few iterations. Specifically, for {2, [2,2]} and {3, [2,2]}, results illustrate that the performance with 1 iteration makes a very significant improvement and converges to the steady-state result after only 3 iterations. In addition, results also indicate that the iteration process is not very sensitive to the number of transmit antennas. However, when we increase the number of users $M$ or the number of receive antennas $n_R$ per user, the number of iterations required to get close to the best performance will increase. For instance, for systems {4, [2, 2, 2, 2]} and {2, [3,3]}, more than 5 iterations would be required to have performance comparable to the steady-state result.

Complexity results
Tables 2 and 3 demonstrate the complexity of the iterative Nu-SVD [27] and the proposed method. Four receive antennas at every mobile station (i.e., $n_{R_m} = n_R = 4$ for all $m$) are assumed. Results for the average number of iterations for convergence and the number of flops for each iteration are given, respectively, in Tables 2 and 3.
A close observation of Table 2 reveals that the average number of iterations required grows almost linearly with the number of users, $M$, for both methods. Note, however, that for any fixed $M$, the average number of iterations required decreases slightly with the number of antennas at the base station, $n_T$, for iterative Nu-SVD. This does not occur for the proposed iterative Pinv-MRC system, where the average number of iterations required increases with the number of base station antennas. Notice also that, in general, the proposed system requires a higher number of iterations than iterative Nu-SVD, but the difference becomes smaller as the number of users increases. In addition, when $n_T = M$, both systems require more or less the same number of iterations for convergence.
From Table 3, it is apparent that iterative Nu-SVD requires a much larger number of flops for each iteration compared with iterative Pinv-MRC. Though the number of flops per iteration for both systems increases with the number of users and the number of base station antennas, the complexity of iterative Nu-SVD is much more sensitive to the increase in the number of base station antennas. In particular, an increase by about a factor of two is observed for the addition of one base station antenna. Results in Table 3 also demonstrate that a reduction by at least a factor of $n_T$ in the number of flops for each iteration can be obtained using the proposed iterative Pinv-MRC. More reduction can be achieved for large $M$ or $n_T$. For example, in the case of $M = 4$ and $n_T = 8$, reduction by a factor of more than 32 is achieved.
Comparisons of the overall complexity of the two methods are given by the examples in Table 4. As can be seen, reduction by more than an order of magnitude is always realized when $n_T > M$. Specifically, for the {5, [2,2]} system, iterative Pinv-MRC can reduce the overall complexity by a factor of about 18 as compared to iterative Nu-SVD. Note also that for the examples under investigation, more reduction can be obtained if the difference $n_T - M$ is larger. To summarize, for any values of $n_T$, $M$, $n_R$, iterative Pinv-MRC can significantly reduce the complexity of performing OSDM when compared to iterative Nu-SVD, a recently published OSDM system [27], while maintaining the error probability performance, as has been demonstrated in Section 4.1.

Impact of spatial correlation
In this subsection, we investigate the relationship between the number of iterations for convergence and the spatial correlation of the channels. A {4, [4,4]} system using iterative Pinv-MRC is studied and the results are provided in Figure 8. We can observe that when $\gamma_R$ is fixed to zero, increasing $\gamma_T$ has almost no effect on the number of iterations. This is not the case when $\gamma_T$ is fixed to zero; as $\gamma_R$ increases, the number of iterations decreases. This can be explained as follows. The role of the receive vector is to combine the channel matrix $\mathbf{H}_m$ and form the "effective" channel vector $\mathbf{r}_m^\dagger \mathbf{H}_m$. Based on the ZF criterion, iteration is required only when the change of receive antenna weights destroys the orthogonality provided by the transmit weights. The iterative process is thus largely dependent on the receive spatial correlation. When the receive spatial correlation is low, even a small adjustment of the receive weights will result in a dramatic change of the channel vector, leading to a large number of iterations irrespective of the transmit spatial correlation. On the contrary, when the receive spatial correlation is high, any updating of the receive antenna weights results in only a small change of the effective channel vector, and the number of iterations required will be small. In the extreme case that the receive antennas are entirely correlated (i.e., $\gamma_R = 1$), the multiuser MIMO system will degenerate to a multiuser MISO system, which has a closed-form solution, and no iteration is needed.
Results in Figure 9 illustrate the sensitivity of the BER performance to the spatial correlation of the channel. In this figure, the SNR is set to 16 dB and 4-QAM is assumed. The analysis is done by varying one spatial correlation coefficient γ T (γ R ) while the other, γ R (γ T ), is fixed. As expected, the results show that the BER worsens for higher spatial correlation (either γ T or γ R ). Intriguingly, the performance degradation is more severe for the transmit correlation factor than for the receive correlation factor. It is worth noting that this is contrary to the known results for the single-user MIMO system, where the transmit and receive correlation factors have the same effect on the system performance. In particular, when γ T approaches 0.99 (almost perfectly correlated in space), the BER becomes 0.5, indicating that the multiuser system actually breaks down. When it is γ R that approaches this value instead, the BER performance degrades considerably, but the system is still able to give a BER of 10 −3 . The reason is that the orthogonality of the system is largely provided by the difference (or rank) of the channels seen by the transmit antenna array. Therefore, when γ T increases, the channels of the users quickly become indistinguishable, while the effect of increasing γ R is only a loss of receive diversity at the users. Overall, the system performance does not degrade much when the spatial correlation is as high as 0.4.
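The cost of user channels becoming indistinguishable can be made concrete with a small worked example. The sketch below is a simplified two-user illustration under our own assumptions, not the channel model used for Figure 9: the two effective channel vectors are taken to be real, unit-norm, and to have inner product rho, where rho is a hypothetical parameter that is not the same quantity as γ T . It inverts the stacked effective channel to obtain ZF weights, normalizes each weight vector to unit power, and reports the resulting per-user gain in dB.

```python
import math

# Toy 2-user example: power penalty of zero-forcing when the users'
# effective channel vectors become nearly parallel. "rho" is a hypothetical
# inner product between unit-norm effective channels; it is NOT the
# correlation coefficient gamma_T of the paper's channel model.

def zf_gain_db(rho):
    """Per-user gain (dB) after ZF with unit-power transmit weight vectors."""
    s = math.sqrt(1.0 - rho * rho)
    a, b, c, d = 1.0, 0.0, rho, s          # rows of He: [1, 0] and [rho, s]
    det = a * d - b * c
    T = [[d / det, -b / det], [-c / det, a / det]]   # He times T is the identity
    gains = []
    for m in range(2):
        col_norm = math.hypot(T[0][m], T[1][m])
        # Scaling column m to unit power scales that user's gain by 1/col_norm.
        gains.append(20.0 * math.log10(1.0 / col_norm))
    return gains

for rho in (0.0, 0.4, 0.9, 0.99):
    print(rho, [round(g, 2) for g in zf_gain_db(rho)])
```

In this toy setting the gain equals 10 log10(1 − rho²) dB for both users, so the penalty is below 1 dB at rho = 0.4 but about 17 dB at rho = 0.99. This is in line with the qualitative behaviour discussed above, although the mapping between rho and γ T is not established here.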

CONCLUSIONS
This paper has revisited the OSDM problem in multiuser MIMO downlink channels. A linear codec called iterative Pinv-MMSE, which is stepwise optimal, is proposed to obtain the multiuser antenna weights satisfying the optimality conditions. We have shown analytically that, at the optimal point at convergence, we can use iterative Pinv-MRC, which is computationally simpler yet achieves the same solution. Remarkably, the proposed scheme has been shown by simulation results to yield the same performance as a recently published method [27] with much lower processing complexity. Further, our simulation results have revealed several important findings: (1) performance improves as the number of receive antennas at the mobile station increases (unlike the systems in [22,23,24]), (2) more diversity gain can be achieved for a system with more users if the number of base station antennas keeps matching the number of users (so both diversity and multiplexing can be obtained at the same time), (3) fewer iterations are required for channels with higher receive spatial correlation, and (4) system performance does not degrade much when the spatial correlation is as high as 0.4, which is achievable with current antenna design technologies.

APPENDIX EQUIVALENCE OF MMSE RECEIVER AND MRC RECEIVER AT THE OPTIMUM POINT
Multiplying the receive vector by a scalar does not affect the final SNR. Therefore, we ignore the normalization factor (i.e., the denominator) in (26) and (30) in this proof. We will show that, under the ZF condition (r † m H m Tm = 0), the MMSE receiver has the same form as the MRC receiver.
Before we proceed to the proof, a result on the computation of a matrix inverse will be useful. For a matrix of the form A −1 + BB † , the inverse can be computed as A − AB(I + B † AB) −1 B † A. This can be verified by direct multiplication, as in (A.1). Given r † m H m Tm = 0, r m = (1/N 0 )H m t m will always be satisfied, because in this case the second term of (A.2) is zero: (H m Tm ) † H m t m = N 0 (H m Tm ) † r m = 0.
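Both the inversion formula and the resulting equivalence can be checked numerically on a small hand-built example. The following Python sketch is only a sanity check under assumed values: the channel H, the weight vector t, the interfering weight matrix T_other, and the noise level N0 are arbitrary illustrative numbers chosen so that the ZF condition holds, and the helper functions are our own. It compares the direct inverse of A −1 + BB † with the formula above, and then compares the unnormalized MMSE receiver with (1/N 0 )H m t m .

```python
# Numerical check of the appendix on a hand-built example (illustrative values).

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def herm(X):
    """Conjugate transpose."""
    return [[complex(v).conjugate() for v in col] for col in zip(*X)]

def lincomb(X, Y, c=1.0):
    """Elementwise X + c*Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def eye(n, s=1.0):
    return [[s if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv(X):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(X)
    M = [[complex(v) for v in row] + [1.0 + 0j if i == j else 0j for j in range(n)]
         for i, row in enumerate(X)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

N0 = 0.5                      # assumed noise power
H = [[1, 2], [3, 4]]          # assumed channel of user m
t = [[1], [0]]                # assumed transmit weights of user m
T_other = [[-7], [5]]         # assumed weights of the other user

A = eye(2, 1.0 / N0)          # A = I / N0
B = matmul(H, T_other)        # B = H_m T_m
Ht = matmul(H, t)             # H_m t_m

# ZF condition with r_m proportional to H_m t_m: (H_m t_m)^dagger H_m T_m = 0.
zf_residual = matmul(herm(Ht), B)[0][0]

# Direct inverse versus the formula A - AB (I + B^dagger A B)^{-1} B^dagger A.
direct = inv(lincomb(inv(A), matmul(B, herm(B))))
AB = matmul(A, B)
core = inv(lincomb(eye(1), matmul(herm(B), AB)))
lemma = lincomb(A, matmul(matmul(AB, core), matmul(herm(B), A)), -1.0)

r_mmse = matmul(direct, Ht)                 # unnormalized MMSE receiver
r_mrc = [[v / N0] for (v,) in Ht]           # (1/N0) H_m t_m

lemma_err = max_abs_diff(direct, lemma)
receiver_err = max_abs_diff(r_mmse, r_mrc)
print(abs(zf_residual), lemma_err, receiver_err)
```

All three printed values should be zero up to rounding error, which is consistent with the MMSE and MRC receivers coinciding, up to a scalar, once the ZF condition is met.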

Figure 2: Number of iterations versus the preset threshold 0 .

Figure 4: Performance comparison of the proposed iterative Pinv-MRC method with the iterative Nu-SVD [27] and the method in [25].

Figure 5: Average BER performance of user 1 with increasing number of antennas of user 2 at SNR = 12 dB.

Figure 6: Average BER performance of the proposed iterative Pinv-MRC method with various numbers of users, n T = M, and at SNR = 8 dB.

Figure 7: Average BER performance of the proposed iterative Pinv-MRC method for various numbers of iterations at SNR = 8 dB and 4-QAM.

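$$
\begin{aligned}
\left(A^{-1} + BB^{\dagger}\right)\left(A - AB\left(I + B^{\dagger}AB\right)^{-1}B^{\dagger}A\right)
&= I + BB^{\dagger}A - \left(B + BB^{\dagger}AB\right)\left(I + B^{\dagger}AB\right)^{-1}B^{\dagger}A \\
&= I + BB^{\dagger}A - B\left(I + B^{\dagger}AB\right)\left(I + B^{\dagger}AB\right)^{-1}B^{\dagger}A \\
&= I + BB^{\dagger}A - BB^{\dagger}A = I.
\end{aligned}
\tag{A.1}
$$

With the above result, and by considering $A = I/N_0$ and $B = H_m T_m$, we can compute the MMSE receiver as

$$
\begin{aligned}
r_m &= \left(BB^{\dagger} + A^{-1}\right)^{-1} H_m t_m \\
&= \left(A - AB\left(I + B^{\dagger}AB\right)^{-1}B^{\dagger}A\right) H_m t_m \\
&= A H_m t_m - AB\left(I + B^{\dagger}AB\right)^{-1}B^{\dagger}A H_m t_m \\
&= \frac{1}{N_0} H_m t_m - \frac{1}{N_0} AB\left(I + B^{\dagger}AB\right)^{-1}\left(H_m T_m\right)^{\dagger} H_m t_m.
\end{aligned}
\tag{A.2}
$$

signals are transmitted from one base station to M mobile stations, n T antennas are located at the base station; and n Rm antennas are located at the mth mobile station. The data symbol, z m , of the mth mobile user, before being transmitted from all of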

Table 3: Number of flops required for each iteration of the iterative Nu-SVD [27]/the proposed iterative Pinv-MRC method when n R = 4.

Table 4: Comparisons of the computational complexity and the required number of iterations.