Orthogonal Beamforming Using Gram-Schmidt Orthogonalization for Multi-User MIMO Downlink System

Simultaneous transmission to multiple users using orthogonal beamforming, known as space-division multiple-access (SDMA), is capable of achieving very high throughput in the multiple-input multiple-output (MIMO) broadcast channel. In this paper, we propose a new orthogonal beamforming algorithm to achieve high capacity in the MIMO broadcast channel. In the proposed algorithm, the base station generates a unitary beamforming vector set using Gram-Schmidt orthogonalization. For a fair comparison with the proposed algorithm, we also extend LF-OSDMA (opportunistic SDMA with limited feedback) so that the system never loses multiplexing gain, by informing users of the unallocated beams. Finally, we show that the proposed method achieves a significantly higher sum capacity than LF-OSDMA and the extended LF-OSDMA without a large increase in the amount of feedback bits and latency.


I. INTRODUCTION
In multiple-input multiple-output (MIMO) broadcast (downlink) systems, simultaneous transmission to multiple users, known as space-division multiple-access (SDMA), is capable of achieving very high capacity. In general, the capacity of SDMA can be considerably improved in comparison with time-division multiple-access (TDMA) because of multiuser diversity gain [1]. The optimal SDMA performance can be achieved by dirty paper coding (DPC) [2]; however, implementing DPC is infeasible since it requires complete channel state information (CSI) and has high computational complexity.
Various algorithms for limited feedback SDMA schemes have been proposed recently. When the number of users exceeds the number of antennas at the base station, a user scheduling algorithm should be designed jointly with limited feedback multiuser precoding. In the opportunistic SDMA (OSDMA) algorithm proposed in [4], the feedback of each user is reduced to a few bits by constraining the choice of beamforming vector to a set of orthonormal vectors. In OSDMA, the base station sends orthogonal beams, and each user reports its best beam and the corresponding signal-to-interference-plus-noise ratio (SINR) to the base station. The base station then schedules transmissions to some users based on the received SINRs. For a large number of users, OSDMA ensures that the sum capacity increases with the number of users. However, the sum capacity of OSDMA is limited if there is not a sufficient number of users.
An alternative SDMA algorithm with orthogonal beamforming and limited feedback, called OSDMA with limited feedback (LF-OSDMA), is proposed in [5]. LF-OSDMA results from the joint design of limited feedback, beamforming, and scheduling under the orthogonal beamforming constraint. In LF-OSDMA, each user selects its preferred beamforming vector based on its normalized channel vector, called the channel shape, using a codebook made up of multiple orthonormal vector sets. Then, each user sends back the index of the preferred beam vector as well as its SINR to the base station. Using the multiuser feedback and a maximum-capacity criterion, the base station schedules a set of simultaneous users together with their beamforming vectors. More details of the LF-OSDMA algorithm are given in Section III.
The simulation in [5] shows that LF-OSDMA can achieve significant gains in sum capacity with respect to OSDMA. However, because each user selects its own beamforming vector, LF-OSDMA does not guarantee the existence of $N_t$ simultaneously served users (where $N_t$ is the number of transmit antennas) whose beam vectors belong to the same orthonormal vector set. This can result in a loss of multiplexing gain, and hence the sum capacity of LF-OSDMA decreases as the number of subcodebooks increases.
In this paper, we propose a new orthogonal beamforming algorithm using Gram-Schmidt orthogonalization to achieve high capacity in the MIMO broadcast channel. In this algorithm, the base station initially selects one or more users and lets them feed back their full CSI. Among these users, the base station selects the one with the highest channel gain. Using the full CSI, the base station generates a beamforming vector for the selected user and then, using Gram-Schmidt orthogonalization, generates a unitary orthogonal vector set. More details of the proposed method are given in Section V. Because the base station generates the beamforming vector for the selected user and schedules that user, the proposed algorithm is expected to achieve a high sum capacity, although the number of feedback bits and the latency increase. For a fair comparison in terms of latency, we extend the LF-OSDMA algorithm to guarantee that the system never loses multiplexing gain. In Section VI, we compare the number of feedback bits, the latency, and the sum capacity of the proposed beamforming algorithm, LF-OSDMA, and the extended LF-OSDMA. The results show that the proposed method can achieve a significantly higher sum capacity than LF-OSDMA and the extended LF-OSDMA without a large increase in the amount of feedback bits and latency.

II. SYSTEM MODEL
We consider a downlink multiuser multiple-antenna communication system consisting of a base station and $K$ active users. The base station is equipped with $N_t$ transmit antennas, and each user terminal is equipped with a single receive antenna. The base station separates the multiuser data streams by beamforming, assigning a weight vector to each of $N_t$ active users. The weight vectors $\{w_n\}_{n=1}^{N_t}$ are unitary orthogonal vectors. We assume equal power allocation over the scheduled users. The received signal of user $k$ is the superposition of the beamformed symbols of all scheduled users, as seen through the channel of user $k$, plus noise. Here $h_k \in \mathbb{C}^{N_t \times 1}$ is the channel gain vector of user $k$ with i.i.d. complex Gaussian entries $\sim \mathcal{CN}(0, 1)$, $B$ is the index set of scheduled users, $x_b$ is the transmitted symbol, and $n_k$ is the complex Gaussian noise at user $k$ with zero mean and unit variance. It is assumed that user $k$ has perfect CSI $h_k$.
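One form of the received signal that is consistent with the definitions above can be sketched as follows. The symbol $y_k$ is introduced here for illustration, and the absence of an explicit power-scaling factor is an assumption (the transmit power is taken to be absorbed into $x_b$).

```latex
y_k = \sum_{b \in B} h_k^{T} w_b \, x_b + n_k
```

Under this form, the term with $b = k$ carries the symbol intended for user $k$, and the remaining terms act as inter-user interference.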

III. CONVENTIONAL ORTHOGONAL BEAMFORMING
An orthogonal beamforming algorithm with limited feedback, called LF-OSDMA, was proposed in [5]. It results from the joint design of limited feedback, beamforming, and scheduling under the orthogonal beamforming constraint.
The CSI $h_k$ can be decomposed into two components, gain and shape: $h_k = g_k s_k$, where $g_k = \|h_k\|$ is the gain and $s_k = h_k / \|h_k\|$ is the shape. The channel shape is used for choosing the weight vector, and the channel gain is used for computing the SINR value. User $k$ quantizes and sends back to the base station two quantities: the index of the selected weight vector and the quantized SINR. We assume that the codebook is created using the method in IEEE 802.20 [3] and can be expressed as $F = \{F_1, \ldots, F_M\}$, where each subcodebook $F_i$ is a unitary matrix and $M$ is the number of subcodebooks. Expressing each unitary matrix as $F_i = \{f_{i,1}, \ldots, f_{i,N_t}\}$, the preferred beam $q_k$ selected by user $k$ is a function of the channel shape $s_k$: it is the codebook vector most closely aligned with $s_k$. Here $(\cdot)^T$ denotes transposition and $\|\cdot\|$ denotes the Frobenius norm.
To compute the SINR, we define the quantization error $\delta_k$, which measures the mismatch between the channel shape $s_k$ and the preferred beam $q_k$. It is clear that the quantization error is zero if $s_k = q_k$. The SINR for user $k$, denoted $\mathrm{SINR}_k$, is a function of the channel power $\rho_k = \|h_k\|^2$, the quantization error $\delta_k$, and the input SNR $\gamma$. Each user feeds back its SINR along with the index of the preferred beam. Only the index of $q_k$ needs to be sent back, because the quantization codebook $F$ can be known a priori to both the base station and the users. We assume that $\mathrm{SINR}_k$ is perfectly known to the base station through the feedback. The same assumption is used in [4] and [5]. Let the number of bits for quantizing the SINR be $Q_{SINR}$; the total amount of feedback per user then becomes $\log_2(N_t M) + Q_{SINR}$ bits. Among the feedback users, the base station schedules a subset of users using the criterion of maximizing the sum capacity. Using the algorithm discussed in [5] and [6], we group the feedback users according to their quantized channel shapes, that is, according to the beam vectors they selected.
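The per-user processing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the quantization error is taken as $\delta_k = 1 - |q_k^H s_k|^2$ and the SINR as $\gamma \rho_k (1 - \delta_k) / (1 + \gamma \rho_k \delta_k)$, the forms commonly used for LF-OSDMA, and the alignment is measured with a Hermitian inner product even though the text writes transposition. The function names `select_beam` and `sinr` are hypothetical.

```python
import math


def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def select_beam(h, codebook):
    """Pick the codebook vector best aligned with the channel shape.

    codebook is a list of M subcodebooks, each a list of N_t unit-norm vectors.
    Returns (m, n, delta): subcodebook index, beam index, quantization error.
    ASSUMPTION: delta = 1 - |q^H s|^2 (not confirmed by the text).
    """
    g = norm(h)
    s = [x / g for x in h]  # channel shape
    best_m, best_n, best_corr = 0, 0, -1.0
    for m, sub in enumerate(codebook):
        for n, f in enumerate(sub):
            corr = abs(inner(f, s)) ** 2
            if corr > best_corr:
                best_m, best_n, best_corr = m, n, corr
    delta = max(0.0, 1.0 - best_corr)  # clamp tiny negative rounding errors
    return best_m, best_n, delta


def sinr(h, delta, gamma):
    """ASSUMED SINR form: gamma*rho*(1 - delta) / (1 + gamma*rho*delta)."""
    rho = norm(h) ** 2  # channel power
    return gamma * rho * (1.0 - delta) / (1.0 + gamma * rho * delta)
```

With a single identity subcodebook and $h = [3, 4]^T$, the second beam is selected with $\delta = 0.36$; a user whose channel shape coincides with a codebook vector gets $\delta = 0$ and therefore sees no interference term in this model.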
Among these subgroups, the one having the maximum sum capacity is scheduled, and the base station selects the subcodebook having the maximum sum capacity for transmission. The resultant sum capacity is the sum of the rates of the users scheduled on the beams of the selected subcodebook. If there is a large number of active users, LF-OSDMA can achieve high capacity. However, if there is a small number of active users, its capacity is limited because LF-OSDMA does not guarantee the existence of $N_t$ simultaneous users whose beam vectors belong to the same orthogonal vector set; in other words, there may be an unallocated beam vector in the selected subcodebook. This can result in a loss of multiplexing gain, and hence the sum capacity of LF-OSDMA decreases as the number of subcodebooks increases when there is a small number of active users.
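The grouping and scheduling rule above can be sketched as follows. This is a hedged illustration: it assumes that, within each subcodebook, the user with the largest SINR is kept on each beam, and that the rate on a beam is $\log_2(1 + \mathrm{SINR})$. The function name `schedule` and the tuple layout of the feedback are hypothetical.

```python
import math


def schedule(feedback, num_subcodebooks, nt):
    """Select the subcodebook with the largest sum capacity.

    feedback: list of (user, m, n, sinr) tuples, where (m, n) is the index of
    the preferred beam (subcodebook m, beam n).
    Returns (m, assignment, capacity, unallocated), where assignment maps a
    beam index to (user, sinr) and unallocated lists beams nobody selected.
    ASSUMPTION: per-beam rate is log2(1 + SINR), best user per beam is kept.
    """
    best = None
    for m in range(num_subcodebooks):
        assignment = {}
        for user, fm, fn, s in feedback:
            # Group users by selected beam and keep the best one per beam.
            if fm == m and (fn not in assignment or s > assignment[fn][1]):
                assignment[fn] = (user, s)
        capacity = sum(math.log2(1.0 + s) for _, s in assignment.values())
        if best is None or capacity > best[2]:
            unallocated = [n for n in range(nt) if n not in assignment]
            best = (m, assignment, capacity, unallocated)
    return best
```

The returned `unallocated` list makes the weakness discussed above visible: with few users, the winning subcodebook often has beams that no user selected, so fewer than $N_t$ streams are transmitted.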

IV. EXTENDED CONVENTIONAL ORTHOGONAL BEAMFORMING
In this section, we extend the conventional orthogonal beamforming algorithm to guarantee that there is no unallocated beam in the selected subcodebook, for a fair comparison with the proposed method. The scheduling algorithm of the extended LF-OSDMA is described in Steps 1 to 6 as follows.
Step 1 The base station sends pilot signals to let the users estimate their CSI. In this paper, we assume that all users have perfect CSI $h_k$. We denote by $\delta_{BC}$ the latency until the pilot signals are received by all users in the cell.
Step 2 Using the CSI, each user chooses its preferred beam vector from the codebook and calculates the receive SINR. Then, each user feeds back the index of the preferred beam vector and the quantized $\mathrm{SINR}_k$. This processing is explained in more detail in Section III. We denote by $\delta_{all}$ the latency until the feedback information of all users is received by the base station.
Step 3 Among the feedback users, the base station schedules a subset of users and selects the subcodebook having the maximum sum capacity. Steps 1 to 3 are the same as in LF-OSDMA; the extension consists of Steps 4 to 6.
Step 4 If the selected subcodebook has an unallocated beam vector, the base station informs all users of the indices of the selected subcodebook and of the unallocated beam vector. We denote by $\delta_{ad}$ the latency until this information is received by all users in the cell. The number of informed bits is $\log_2(M) + N_t$.
Step 5 Using the information from the base station about the unallocated beam vector, each user can generate the unallocated beam vector set $F_m = \{f_{m,n}, \ldots\}$, $n \in \{1, 2, \ldots, N_t\}$, and selects its preferred beam $q_k$ from this set. The quantization error and the SINR for user $k$ are then defined with respect to this beam. Each user feeds back the quantized $\mathrm{SINR}_k$ and the index of the preferred beam vector. In this step, the latency is the same as that of Step 2, and the number of feedback bits is $\log_2(N_t) + Q_{SINR}$.
Step 6 Among the feedback users, the base station assigns a user to the unallocated beam vector of the selected subcodebook using the criterion of maximizing the sum capacity.
The extended algorithm guarantees the existence of $N_t$ simultaneous users, so the extended LF-OSDMA can achieve high capacity even if there is a small number of users. However, the extended LF-OSDMA leads to a large increase in the number of feedback bits and worsens the system latency. We compare the number of feedback bits and the system latency in Section VI.
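Steps 4 to 6 can be sketched as a second assignment round over the beams left empty after Step 3. This is an illustrative sketch only: the function name `fill_unallocated` is hypothetical, and skipping users that are already scheduled is an assumption that the text does not state.

```python
def fill_unallocated(assignment, unallocated, second_round):
    """Assign users to beams left empty after the first scheduling round.

    assignment: dict beam -> (user, sinr) from the first round.
    unallocated: beam indices of the selected subcodebook with no user.
    second_round: list of (user, beam, sinr) reported in Step 5.
    ASSUMPTION: already scheduled users are not considered again.
    """
    filled = dict(assignment)
    scheduled = {user for user, _ in assignment.values()}
    for user, n, s in second_round:
        if user in scheduled or n not in unallocated:
            continue
        # Keep the user with the largest reported SINR on each empty beam.
        if n not in filled or s > filled[n][1]:
            filled[n] = (user, s)
    return filled
```

After this round every beam of the selected subcodebook carries a user whenever at least one unscheduled user reports on it, which is what restores the multiplexing gain at the cost of a second feedback phase.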

V. PROPOSED ORTHOGONAL BEAMFORMING ALGORITHM
In this section, we propose a new orthogonal beamforming algorithm using Gram-Schmidt orthogonalization. The proposed algorithm is described from Step I to Step VI as follows.
Step I The base station initially selects $S$ users and sends pilot signals to let all users estimate their CSI. In this paper, we assume that all users have perfect CSI. The latency is $\delta_{BC}$, the same as that of Step 1 in Section IV.
Step II The users initially selected by the base station feed back their full CSI. We denote by $\delta_{select}$ the latency until the feedback information of the selected users is received by the base station. The number of feedback bits is $S Q_{CSI}$, where $Q_{CSI}$ is the number of feedback bits of the full CSI.
Step III Among the feedback users, the base station picks the one with the highest channel gain, denoted user $u$, with CSI $h_u$. Using the full CSI of user $u$, the base station generates a unitary orthogonal vector set $W = [w_1, w_2, \ldots, w_{N_t}]$ by Gram-Schmidt orthogonalization. Here $(\cdot)^H$ denotes Hermitian transposition, and $X$ is the $(N_t \times N_t)$ identity matrix, which is used for generating the orthogonal weight vectors. The vector $w_1$ is the beamforming vector for user $u$, and the vector set $[w_1, w_2, \ldots, w_{N_t}]$ represents the generated orthogonal beamforming vectors.
Step IV The base station informs all users of $w_1$. We denote by $\delta_{ad}$ the latency until the information of $w_1$ is received by all users in the cell. The number of information bits is $Q_{CSI}$, the number of feedback bits of the full CSI.
Step V Using the information from the base station about $w_1$, each user can generate the same unitary orthogonal vector set as the base station using (11) and (12). We assume that the algorithm for obtaining the unitary vector set is known a priori to both the base station and the users. Then, each user selects its preferred beam $q_k$ from this set. The quantization error and the SINR for user $k$ are then defined with respect to this beam. Each user feeds back the quantized SINR and the index of the preferred beam vector. In this step, the latency is the same as that of Step 2, and the number of feedback bits is $\log_2(N_t) + Q_{SINR}$.
Step VI Among the feedback users, the base station schedules users using the criterion of maximizing the sum capacity. The beam $w_1$ is always assigned to user $u$, because this beam is the beamforming vector generated for user $u$.
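The construction in Step III can be sketched in pure Python as follows. This is a minimal sketch under stated assumptions rather than the paper's exact equations: $w_1$ is taken as given (how it is derived from $h_u$ is not specified here), the columns of the identity matrix $X$ are orthogonalized against the vectors found so far, and a column that turns out to be linearly dependent is skipped. The function name `gram_schmidt_beams` and the tolerance `tol` are hypothetical.

```python
import math


def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def gram_schmidt_beams(w1, tol=1e-9):
    """Complete w1 to an orthonormal set of N_t vectors.

    ASSUMPTION: candidate vectors are the columns of the N_t x N_t identity
    matrix, processed in order; dependent columns (residual norm below tol)
    are skipped.
    """
    nt = len(w1)
    n1 = math.sqrt(sum(abs(x) ** 2 for x in w1))
    basis = [[x / n1 for x in w1]]  # normalized beam for the selected user
    for i in range(nt):
        if len(basis) == nt:
            break
        v = [1.0 + 0j if r == i else 0j for r in range(nt)]  # i-th column of X
        for w in basis:
            c = inner(w, v)  # projection coefficient w^H v
            v = [vr - c * wr for vr, wr in zip(v, w)]
        nv = math.sqrt(sum(abs(x) ** 2 for x in v))
        if nv > tol:
            basis.append([x / nv for x in v])
    return basis
```

Because the procedure is deterministic given $w_1$, a user who receives $w_1$ in Step IV and runs the same routine obtains the same vector set as the base station, which is what allows Step V to use only a beam index as feedback.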

A. Encoding of the proposed algorithm
In this subsection, we evaluate the capacity of the proposed algorithm when the CSI is quantized with a random vector quantization codebook, because feeding back the full CSI requires a large number of bits. The size of the codebook is $2^{Q_{CSI}}$, where $Q_{CSI}$ is the number of feedback bits of the CSI. Fig. 1 shows the sum capacity of the proposed algorithm for different codebook sizes, $Q_{CSI} \in \{5, 10, 15, 20\}$ and analog CSI, as the number of users increases. The number of transmit antennas is $N_t = 4$, the SNR is 5 dB, and the number of initially selected users is $S = 1$. The results are obtained by Monte Carlo simulation.
As the codebook size becomes larger, the sum capacity of the proposed algorithm increases, because the quantization error of the CSI becomes smaller. As observed from Fig. 1, 15 bits of CSI feedback cause only a marginal loss in sum capacity with respect to analog CSI feedback, and the loss is negligible for 20-bit feedback. Therefore, feedback with the codebook of $Q_{CSI} = 20$ from the initially selected users is as good as the analog CSI case. Thus, in this paper, we assume that the number of feedback bits of the full CSI is $Q_{CSI} = 20$ when we evaluate the feedback bits. In practice, however, a codebook with $Q_{CSI} = 20$ is not preferable because of the large complexity at the mobile terminal.
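Random vector quantization of the CSI can be sketched as follows. This is an illustration under assumptions: only the channel shape is quantized, the codewords are isotropic random unit vectors drawn from a seed shared by both ends, and the codeword with the largest correlation is chosen. The function names are hypothetical, and a small number of bits is used in practice for such a brute-force search, since a 20-bit codebook has about one million entries.

```python
import math
import random


def inner(a, b):
    """Hermitian inner product a^H b of two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def rvq_codebook(nt, bits, seed=0):
    """Draw 2**bits random unit vectors in C^nt (ASSUMED codebook design)."""
    rng = random.Random(seed)  # shared seed so both ends build the same book
    book = []
    for _ in range(2 ** bits):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(nt)]
        n = math.sqrt(sum(abs(x) ** 2 for x in v))
        book.append([x / n for x in v])
    return book


def rvq_quantize(h, codebook):
    """Return the index of the codeword best aligned with the channel shape."""
    g = math.sqrt(sum(abs(x) ** 2 for x in h))
    s = [x / g for x in h]
    return max(range(len(codebook)),
               key=lambda i: abs(inner(codebook[i], s)) ** 2)
```

Each additional feedback bit doubles the codebook, so the best available codeword tends to lie closer to the true channel shape, consistent with the trend reported for Fig. 1; the exhaustive search is also why large codebooks are costly at the terminal.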

B. Feedback Comparison
In this subsection, we compare the number of feedback bits for the different schemes. We calculate the number of feedback bits based on the analytic formulas and summarize them in TABLE I. We assume that the number of transmit antennas is $N_t = 4$, the number of feedback bits of the full channel information is $Q_{CSI} = 20$ bits, and the number of bits for quantizing the SINR is $Q_{SINR} = 3$ bits [5]. Fig. 2 shows that the proposed method needs fewer feedback bits than the extended LF-OSDMA and almost the same number of feedback bits as LF-OSDMA. We can also observe from Fig. 2 that the difference in the number of feedback bits between the proposed method and LF-OSDMA with $M = 1$ is constant; this difference is the number of feedback bits of the full CSI from the initially selected users. If there is a large number of users, e.g., $K = 100$, the proposed method needs far fewer feedback bits than the extended LF-OSDMA and LF-OSDMA with $M = 8$. Therefore, the increase in the number of feedback bits of the proposed method relative to LF-OSDMA with $M = 1$ is small compared with that of LF-OSDMA with $M = 8$ and the extended LF-OSDMA.
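The per-step bit counts stated in Sections III to V can be combined into totals as in the sketch below. It is an illustration under assumptions: only uplink feedback bits are counted, every user feeds back in each round, and the extended scheme is evaluated in the case where the second round takes place. Whether TABLE I counts exactly these terms is not confirmed by the text, and the function names are hypothetical.

```python
import math


def lf_osdma_bits(k, nt, m, q_sinr):
    """Each of k users sends log2(nt*m) index bits plus q_sinr SINR bits."""
    return k * (math.ceil(math.log2(nt * m)) + q_sinr)


def extended_lf_osdma_bits(k, nt, m, q_sinr):
    """ASSUMPTION: second round always occurs; adds log2(nt) + q_sinr per user."""
    second_round = k * (math.ceil(math.log2(nt)) + q_sinr)
    return lf_osdma_bits(k, nt, m, q_sinr) + second_round


def proposed_bits(k, nt, s, q_csi, q_sinr):
    """s selected users send q_csi bits; then all k users send index + SINR."""
    return s * q_csi + k * (math.ceil(math.log2(nt)) + q_sinr)


if __name__ == "__main__":
    # Parameters from the text: N_t = 4, Q_CSI = 20, Q_SINR = 3, K = 100.
    print(lf_osdma_bits(100, 4, 1, 3))
    print(lf_osdma_bits(100, 4, 8, 3))
    print(extended_lf_osdma_bits(100, 4, 8, 3))
    print(proposed_bits(100, 4, 1, 20, 3))
```

Under these assumptions, the gap between the proposed method and LF-OSDMA with $M = 1$ is exactly $S Q_{CSI}$ bits regardless of the number of users, which matches the constant difference described above.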

C. Latency Comparison
In this subsection, we compare the latency of the proposed method, LF-OSDMA, and the extended LF-OSDMA. TABLE II lists the comparison of system latency. Here $\delta_{BC}$ is the time until the pilot signals are received by all users in the cell, $\delta_{all}$ is the time until the feedback information of all users is received by the base station, $\delta_{ad}$ is the time until the information about the unallocated beam vector is received by all users in the cell, and $\delta_{select}$ is the time until the feedback information of the initially selected users is received by the base station.
TABLE II shows that the extended LF-OSDMA and the proposed method have to tolerate higher latency than LF-OSDMA. In practical systems, $\delta_{BC}$ and $\delta_{ad}$ are much lower than $\delta_{all}$ or $\delta_{select}$, because $\delta_{BC}$ and $\delta_{ad}$ use a downlink broadcast channel. In addition, if there is a large number of users in the cell, $\delta_{select}$ is much lower than $\delta_{all}$. Therefore, the increase in latency of the proposed method relative to LF-OSDMA is not large.
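Summing the per-step latencies stated in Sections IV and V gives the totals below. The symbols $T_{LF}$, $T_{ext}$, and $T_{prop}$ are introduced here for illustration only, and processing time at the base station is neglected, which is an assumption.

```latex
\begin{align}
T_{LF}   &= \delta_{BC} + \delta_{all} \\
T_{ext}  &= \delta_{BC} + \delta_{all} + \delta_{ad} + \delta_{all} \\
T_{prop} &= \delta_{BC} + \delta_{select} + \delta_{ad} + \delta_{all}
\end{align}
```

In this form, the proposed method exceeds LF-OSDMA by $\delta_{select} + \delta_{ad}$, while the extended LF-OSDMA exceeds it by $\delta_{ad} + \delta_{all}$, so the proposed method has the smaller overhead whenever $\delta_{select} < \delta_{all}$.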

D. Capacity Comparison
In this subsection, we show the capacity results of the proposed beamforming algorithm. Fig. 3 compares the sum capacity of the proposed method with that of LF-OSDMA and the extended LF-OSDMA as the number of users increases. The number of transmit antennas is $N_t = 4$ and the SNR is 5 dB. Moreover, the number of subcodebooks is $M \in \{1, 8\}$, and the number of initially selected users $S$ is fixed. First, the proposed method achieves a significantly higher sum capacity than LF-OSDMA and the extended LF-OSDMA for any number of users. The sum capacity of LF-OSDMA decreases as the number of subcodebooks increases when there is a small number of active users. On the other hand, the extended LF-OSDMA improves on the sum capacity of LF-OSDMA for a small number of users. However, for a large number of users, there is little difference in sum capacity between LF-OSDMA and the extended LF-OSDMA. At $K = 20$ users, the capacity gain of the proposed method is 2 bps/Hz with respect to LF-OSDMA with $M = 1$ and 1 bps/Hz with respect to the extended LF-OSDMA with $M = 8$. At $K = 100$ users, the proposed method also improves on the sum capacity of LF-OSDMA and the extended LF-OSDMA by 0.5 bps/Hz. As a result, the proposed method can achieve a significantly higher sum capacity than LF-OSDMA and the extended LF-OSDMA without a large increase in the amount of feedback bits and latency.

VII. CONCLUSION
In this paper, we proposed a new orthogonal beamforming algorithm for the MIMO broadcast channel, aiming to achieve high capacity for any number of users. In this algorithm, the base station generates a unitary beamforming vector set by applying Gram-Schmidt orthogonalization to the beamforming vector of an initially selected user. The proposed method increases the number of feedback bits and the latency. For a fair comparison in terms of latency, we extended the LF-OSDMA algorithm to guarantee that the system never loses multiplexing gain.
Finally, we compared the number of feedback bits, the latency, and the sum capacity of the proposed beamforming algorithm with those of LF-OSDMA and the extended LF-OSDMA. The results showed that the proposed method can achieve a significantly higher sum capacity than LF-OSDMA and the extended LF-OSDMA without a large increase in the amount of feedback bits and latency.