The interference-reduced energy loading for multi-code HSDPA systems

A successive interference cancelation (SIC) method is developed in this article to improve the downlink transmission throughput of the current high speed downlink packet access (HSDPA) system. The multi-code code division multiplexing spreading sequences are orthogonal at the HSDPA downlink transmitter. However, the spreading sequences lose their orthogonality following transmission through frequency selective multipath channels. The SIC method uses a minimum-mean-square-error (MMSE) equalizer at the receiver to despread the multi-code signals and restore the orthogonality of the receiver signature sequences. The SIC scheme is also used as part of the resource allocation schemes at the transmitter and for inter-code and inter-symbol interference cancelation at the receiver. The article proposes a novel system value based optimization criterion to provide a computationally efficient energy allocation method at the transmitter when the SIC and MMSE equalizer methods are used at the receiver. The performance of the proposed MMSE equalizer based SIC receiver is significantly better than that of the existing schemes tested and is very close to the theoretical upper bound which may be achieved under laboratory conditions.


Introduction
The third generation mobile radio system uses a code division multiple access (CDMA) transmission scheme and has been extensively adopted worldwide. 3GPP has developed the high speed downlink packet access (HSDPA) system as a multi-code wide-band code division multiple access (WCDMA) system in the Release 5 specification [1,2] of the universal mobile telecommunications system (UMTS). The success of third generation wireless cellular systems is based largely on the efficient resource allocation scheme used by the HSDPA system to improve the downlink throughput.
With the recent availability of enabling technologies such as adaptive modulation and coding and hybrid automatic repeat request, it has been possible to introduce internet enabled smart phones for internet-centric applications. The trend for the HSDPA system is to improve the downlink throughput for smart phones with high-data-rate applications. The throughput of the HSDPA downlink has been extensively evaluated in [3,4]. A recent investigation conducted in [5] shows that the data throughput achievable in practice is significantly lower than the theoretical upper-bound when using the multiple-input multiple-output (MIMO) HSDPA system. This article aims to optimize the downlink throughput close to the upper-bound without too much complexity.
The downlink throughput optimization for the HSDPA multi-code CDMA system is considered to be a two part problem in [6]. The first involves the scheduling of users for transmissions such as [7,8] and the second is the link throughput optimization for a given resource allocation, which is the focus of this article. The link throughput can be optimized through signature sequence design, receiver design and power allocation.
Optimal signature sequence design ensures that the received spreading codes are orthogonal to each other at the expense of extensive channel state information (CSI) feedback [9,10]. Therefore, 3GPP has standardized the use of a fixed set of signature sequences known as the orthogonal variable spreading factor (OVSF) codes to minimize the CSI feedback required. For the MIMO system, which requires a larger signature sequence set, 3GPP standardized the use of a given OVSF set multiplied with the pre-coding weights, followed by concatenation of the weighted set of spreading sequences. This ensures that each symbol is spread by a unique precoded spreading sequence, while making sure that the concatenated spreading sequence is orthogonal to the remaining set of spreading sequences at the transmitter.
Although the signature sequences generated by OVSF codes with pre-coding weights are orthogonal to each other at the transmitter, their orthogonality is lost at the receiver after transmission over the frequency selective multipath channels. This is known as the inter-code interference. Similarly, the transmitted symbols overlap with the neighboring symbol period, creating inter-symbol interference (ISI). These interferences are part of self interference (SI). The presence of SI produces a difference between practical system throughput and the theoretical upper-bound shown in [5].
Linear minimum mean square error (MMSE) equalizers are used to reduce part of the SI in [11-13]. The linear MMSE equalizers in [11,12] restore orthogonality between the received codes. The scheme in [13] reduces the overall SI by using a symbol level MMSE equalizer followed by a symbol level successive interference cancelation (SIC) scheme, with the aim of bringing the practical system throughput closer to the theoretical upper-bound. In references [12-15] the use of a SIC receiver in collaboration with either a chip level or a symbol level MMSE equalizer has been examined for the HSDPA downlink throughput optimization.
Link throughput is also examined in terms of the joint optimization of the transmitter and the receiver in [6], where power allocation is incorporated with a two-stage SIC for a multi-code MIMO system. In each SIC iteration, the equalizer coefficient and the power allocation calculations require the inversion of a large dimension covariance matrix, which makes the system computationally expensive. Simplifications for the inversion of large matrices are examined in [16] to make the implementation of the linear MMSE equalizers followed by the symbol level SIC practically feasible. There is a need for a method which eliminates the requirement for iterative covariance matrix inversions when dealing with the inter-code interference and the ISI. A method has not yet been developed to jointly optimize the linear symbol level MMSE equalizer and the SIC detector and then to allocate the transmission powers when maximizing the total transmission rate.
The objective of this article is to propose a novel receiver with a symbol level linear MMSE equalizer followed by a single level SIC detector. The objective is also to jointly optimize the transmission power and the receiver for a single-user multi-code downlink transmission system. The receiver proposed in this article suppresses the inter-code interference and the ISI iteratively without the need to invert a large covariance matrix in each iteration when transmitting over frequency selective channels. The article also describes a novel iterative transmission power/energy adaptation scheme to maximize the sum capacity of the downlink for a single user when using discrete transmission rates and a constrained total transmission power.
When transmitting data streams at discrete rates, an optimization criterion is usually used to deliver a given constrained signal to interference plus noise ratio (SINR) at the output of each receiver. In this article a novel energy adaptation criterion known as the system value optimization criterion is used to maximize the total rate. The system value approach is a modified version of the total mean square error (MSE) minimization criterion [17,18] used in the open literature. The related study is reviewed for the system value criterion in Section 2.
The remainder of this article is organized as follows: in Section 3 the system model used in this article is given. The optimization criterion adopted here is described in Section 4 before introducing the SIC receiver model in Section 5. Section 6 presents the proposed SIC-based power and rate allocation scheme to optimize the total rate. Its performance and results are discussed in Section 7 before the conclusion is presented in Section 8.

Related study on optimization criteria
Various optimization criteria are used when allocating powers for the multi-code downlink throughput optimization. References [11,19-21] focus on the transceiver design optimization criteria and references [22-24] concentrate on criteria for the joint rate and power allocation. These joint rate and power adaptation methods are generalized in reference [22] under three headings as follows.
1. The first criterion includes systems which optimize the transmission power to maximize the rate for a given realization of channel gains, such as [19-21,24,25]. The aim is to maximize the total rate by iteratively adjusting the transmission powers and satisfying a target SINR or MSE.
2. The second method, such as [26], aims to maintain the received power at a target level, whilst maximizing the total rate by jointly optimizing the transmission power, rate and signature sequences and also the linear MMSE equalizers at the receiver.
3. The third method, examples of which are [22,23], uses the average system performance as an evaluation criterion, which requires the distribution of the received and the interference signal powers.
The focus of this article is to optimize the transmission power through iterative power adjustments to maximize the rate, which corresponds to the first optimization criterion. It is assumed that the rate and power adaptation is much faster than the changes in the link gains due to the users being mobile. The first optimization criterion can be further divided into two categories: margin adaptive and rate adaptive optimization. Margin adaptive optimization minimizes the total transmission energy at a given rate for a target link performance, such as a target SNR at the output of each receiver [19] or the minimization of the per stream MSE [27]. Rate adaptive optimization maximizes the total rate over multi-code parallel channels by optimizing the transmission power, such as in [24,25], and is explored in terms of minimizing the weighted MSE [20,21] within a power constraint.
In the current HSDPA system specifications [1,2,28], an equal energy allocation scheme is used to load each channel with either a single data rate or two discrete bit rates. Therefore, this article aims to optimize the total rate through rate adaptive loading by using two discrete rates.
The article maximizes the total transmission rate by optimizing the power allocated to each channel using the linear MMSE equalizer and the novel SIC receiver. In the literature, the parameters of the MMSE receivers are usually optimized using either the max-min weighted SINR [29] criterion or the total MSE minimization [17,18] criterion. This article uses the system value optimization criterion, which is a derivative of the MSE minimization criterion. The system value upper bound is used to compare the performance of the proposed SIC-based energy adaptation method with the theoretical upper bound. Recently, an iterative power adaptation method known as the two-group resource allocation scheme has been developed in [30,31] to load two distinct discrete bit rates over the multi-code downlink channels subject to a constrained total transmission power. The two-group resource allocation scheme [30,31] is integrated into the system value based power allocation method with the SIC scheme to improve the total downlink bit rate for a single user. In the following section a system model is given for the constrained optimization formulation when maximizing the total rate for multi-code downlink transmissions.

System model
As the article concentrates on the SIC and the iterative power allocation concepts, it is sufficient to use the downlink transmission model for a single-input-single-output multi-code CDMA system operating over a frequency selective multipath channel. However, the methods reported here are also applicable to the MIMO based systems.
The system model in this section describes the process of transmitting parallel strings of data bits $u_1$ to $u_K$, which are first mapped to symbols according to the desired modulation scheme. Through processing, the transmit vector $z(\rho)$ for each symbol period $\rho$ is obtained at the transmit antenna. These vectors are transmitted over the frequency selective multipath channel before reaching the receiver. At the receiver, the antenna collects the received signal vector $r(\rho)$ for each symbol period $\rho$, and these vectors are further processed to obtain the parallel data bit streams $u_1$ to $u_K$.
Consider a multi-code CDMA downlink with $K$ code channels, each of which is realizable with a bit rate of $b_{p_k}$ bits per symbol from a set of bit rates $\{b_{p_k}\}_{p_k=1}^{P}$, for a given total energy $E_T$ and $p = 1, 2, \ldots, P$. The data for each intended channel is placed in an $(N_U \times 1)$-dimensional vector $u_k$ for $k = 1, \ldots, K$. Each of these data packets is then channel encoded to produce a $(B \times 1)$-dimensional vector $d_k$ and mapped to symbols using a quadrature amplitude modulation (QAM) scheme with $M$ constellation points to transmit data at a rate of $b = \log_2 M$ bits per symbol. The channel encoder rate is $r_{\text{code}} = N_U / B$ and the realizable discrete rates are given by $b_p = r_{\text{code}} \log_2 M$.
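As a quick numerical illustration of the rate relation $b_p = r_{\text{code}} \log_2 M$, the following minimal Python sketch evaluates the realizable rate for a few QAM orders. The function name and the packet sizes ($N_U = 480$, $B = 960$) are illustrative assumptions and are not values taken from the article.

```python
import math

def realizable_rate(n_u, b, m):
    """Return b_p = r_code * log2(M) in bits per symbol.

    n_u: number of information bits per packet (N_U)
    b:   number of coded bits per packet (B)
    m:   number of QAM constellation points (M)
    """
    r_code = n_u / b              # channel encoder rate N_U / B
    return r_code * math.log2(m)

# Hypothetical rate-1/2 code combined with 4-, 16- and 64-QAM.
rates = [realizable_rate(480, 960, m) for m in (4, 16, 64)]
print(rates)
```

With a rate-1/2 code, the three modulation orders give one, two and three bits per symbol, which is the kind of small discrete rate set the loading schemes in this article select from.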
Data is transmitted in packets at a transmission-time-interval (TTI) and the number of symbols transmitted per packet is denoted as $N^{(x)}$, where $N^{(x)} = \mathrm{TTI}/(N T_c)$, $N$ is the spreading sequence length, $T_c$ is the chip period, and $N T_c$ is the symbol period. Transmission symbols are used to produce an $(N^{(x)} \times 1)$-dimensional symbol vector $x_k = [x_k(1), \ldots, x_k(\rho), \ldots, x_k(N^{(x)})]^T$ for each channel $k$. The entire block of transmission can be represented as an $(N^{(x)} \times K)$-dimensional transmit symbol matrix $X = [x_1, \ldots, x_k, \ldots, x_K]$. The transmitted vector $y(\rho) = [y_1(\rho), \ldots, y_k(\rho), \ldots, y_K(\rho)]^T$ contains the symbols over the symbol period $\rho = 1, \ldots, N^{(x)}$, with the unit average energy $E(y_k(\rho) y_k^*(\rho)) = 1$ for $k = 1, \ldots, K$. Before transmission, the symbols are weighted with an amplitude matrix $A = \mathrm{diag}(\sqrt{E_1}, \ldots, \sqrt{E_K})$ and spread with the $N \times K$ signature sequence matrix $S = [s_1, \ldots, s_K]$. This results in the size-$N$ transmission column vector $z(\rho) = [z_1(\rho), \ldots, z_n(\rho), \ldots, z_N(\rho)]^T = S A y(\rho)$. Each element $z_n(\rho)$ of the transmission vector $z(\rho)$, for $n = 1, \ldots, N$, is then filtered using a pulse shaping function at regular intervals of the chip period $T_c$ before being modulated with an up converter to transmit the data at the desired frequency.
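The spreading operation $z(\rho) = S A y(\rho)$ can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: Walsh-Hadamard rows stand in for the OVSF codes, the signature sequences are scaled to unit norm, and the energies and QPSK symbols are arbitrary. The helper names `hadamard`, `spread` and `despread` are hypothetical.

```python
import math

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def spread(codes, energies, symbols):
    """Return z = S A y, with column k of S equal to codes[k] scaled to unit norm."""
    n = len(codes[0])
    z = [0j] * n
    for code, e_k, y_k in zip(codes, energies, symbols):
        amp = math.sqrt(e_k)                  # diagonal entry of A
        for i, chip in enumerate(code):
            z[i] += chip / math.sqrt(n) * amp * y_k
    return z

def despread(code, z):
    """Correlate z with one unit-norm signature sequence (ideal channel only)."""
    n = len(code)
    return sum(chip / math.sqrt(n) * z_i for chip, z_i in zip(code, z))

N, K = 16, 15
codes = hadamard(N)[1:K + 1]                  # K orthogonal length-N codes
energies = [0.5 + 0.1 * k for k in range(K)]  # arbitrary per-channel energies
symbols = [((1 if k % 2 else -1) + 1j) / math.sqrt(2) for k in range(K)]

z = spread(codes, energies, symbols)
recovered = [despread(c, z) for c in codes]
```

Over an ideal channel each despreader output equals $\sqrt{E_k}\, y_k(\rho)$ because the codes are orthogonal at the transmitter; the multipath channel described next removes this property.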
For the duration of packet transmission, the link between the transmitter and receiver antennas is modeled using the multipath radio channel impulse response vector $h$ with $L$ resolvable paths. In the presence of more than one resolvable path ($L > 1$), the despreading signature sequences at the receiver antenna are longer than the spreading signature sequences at the transmit antenna. The channel impulse response $h$ convolves with the transmission signature sequence matrix $S$ to produce the $(N + L - 1) \times K$-dimensional receiver matched filter signature sequence matrix $Q = H S = [q_1, \ldots, q_k, \ldots, q_K]$, where $H$ is the $(N + L - 1) \times N$ channel convolution matrix formed from $h$, and $q_k = H s_k$ is an $(N + L - 1)$-dimensional matched filter receiver signature sequence which is a function of the $(N \times 1)$-dimensional signature sequence $s_k$.
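The loss of orthogonality caused by the channel can be seen with a small numerical sketch of $q_k = H s_k$, computed here as a direct linear convolution. The two length-4 codes and the two-path channel are illustrative assumptions chosen only to keep the arithmetic visible.

```python
def convolve(h, s):
    """Return q = H s, the full linear convolution of length N + L - 1."""
    q = [0j] * (len(s) + len(h) - 1)
    for i, h_i in enumerate(h):
        for n, s_n in enumerate(s):
            q[i + n] += h_i * s_n
    return q

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

# Two orthogonal unit-norm spreading sequences of length N = 4.
s1 = [0.5, 0.5, 0.5, 0.5]
s2 = [0.5, -0.5, -0.5, 0.5]
h = [1.0, 0.5]                      # hypothetical two-path channel (L = 2)

q1, q2 = convolve(h, s1), convolve(h, s2)
tx_cross = inner(s1, s2)            # zero: orthogonal at the transmitter
rx_cross = inner(q1, q2)            # non-zero: orthogonality lost at the receiver
print(abs(tx_cross), abs(rx_cross))
```

The received sequences have length $N + L - 1 = 5$, and their non-zero cross-correlation is the inter-code interference that the MMSE equalizer and the SIC receiver are designed to suppress.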
At the receiver, it is assumed that the receiver carrier and clocks are fully synchronized with the transmitter carrier and clocks. The received signal at the receiver antenna is first down converted to the baseband which is passed through the receiver chip matched filter (CMF) and the filtered signal is sampled at the chip period intervals T c .
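Following [32], the ISI is handled using an $(N + L - 1) \times (N + L - 1)$-dimensional shift matrix $J_{N+L-1}$. For simplicity the subscript will be dropped from the $J$ matrix notation. When the matrix $J^N$ ($(J^T)^N$) operates on a column vector, it downshifts (upshifts) the column by $N$ chips while filling the top (bottom) of the column with $N$ zeros. The ISI interference signature sequence matrices are $Q_1 = (J^T)^N Q = [q_{1,1}, \ldots, q_{K,1}]$ and $Q_2 = J^N Q = [q_{1,2}, \ldots, q_{K,2}]$. Both $q_{k,1}$ and $q_{k,2}$ are the receiver signature sequences corresponding to the previous and the next symbol periods, respectively, and are used to handle the ISI. The $(N + L - 1)$-dimensional received signal vector is given in terms of the transmitted vector $y(\rho)$ as

$$r(\rho) = [Q, Q_1, Q_2]\,(I_3 \otimes A)\,[y^T(\rho), y^T(\rho - 1), y^T(\rho + 1)]^T + n(\rho),$$

where $\otimes$ is the Kronecker product and the $(N + L - 1)$-dimensional noise vector $n(\rho)$ has the noise covariance matrix $E(n(\rho) n^H(\rho)) = 2\sigma^2 I_{N+L-1}$ with the noise variance $\sigma^2$. The received signal vector $r(\rho)$ is used to produce the size-$K$ column vector $\hat{y}(\rho) = [\hat{y}_1(\rho), \ldots, \hat{y}_k(\rho), \ldots, \hat{y}_K(\rho)]^T$ as an estimate of the transmitted symbol vector $y(\rho)$ as follows:

$$\hat{y}(\rho) = W^H r(\rho).$$

The $(N + L - 1) \times K$-dimensional matrix $W = [w_1, \ldots, w_k, \ldots, w_K]$ has the MMSE linear equalizer despreading filter coefficients $w_k$ for $k = 1, \ldots, K$. To ensure that $w_k^H q_k = 1$ while minimizing the cross-correlations $w_k^H q_j$ for $j \neq k$, a normalized MMSE despreading filter coefficient vector [30],

$$w_k = \frac{C^{-1} q_k}{q_k^H C^{-1} q_k}, \tag{9}$$

is used, where

$$C = Q A^2 Q^H + Q_1 A^2 Q_1^H + Q_2 A^2 Q_2^H + 2\sigma^2 I_{N+L-1} \tag{10}$$

is the $(N + L - 1) \times (N + L - 1)$-dimensional covariance matrix $C = E(r(\rho) r^H(\rho))$ of the received signal vector $r(\rho)$. The covariance matrix $C$, given in (10), can be iteratively calculated using

$$C_k = C_{k-1} + E_k q_k q_k^H + E_k q_{k,1} q_{k,1}^H + E_k q_{k,2} q_{k,2}^H \tag{11}$$

$$\phantom{C_k} = D_k + E_k q_k q_k^H \tag{12}$$

for $k = 1, \ldots, K$ when using $C_0 = 2\sigma^2 I_{N+L-1}$ and $C = C_K$. $D_k$ is a covariance matrix which excludes $E_k q_k q_k^H$ for the current channel $k$, as shown below:

$$D_k = C_{k-1} + E_k q_{k,1} q_{k,1}^H + E_k q_{k,2} q_{k,2}^H. \tag{13}$$

At the output of each receiver, the mean-square-error $\varepsilon_k = E(|\hat{y}_k(\rho) - y_k(\rho)|^2)$ between the transmitted signal $y_k(\rho)$ and the estimated signal $\hat{y}_k(\rho)$ is given by [30]

$$\varepsilon_k = \frac{1}{1 + \gamma_k} = 1 - E_k q_k^H C^{-1} q_k \tag{15}$$

for $k = 1, \ldots, K$, where

$$\gamma_k = \frac{E_k q_k^H C^{-1} q_k}{1 - E_k q_k^H C^{-1} q_k} \tag{16}$$

is the SNR at the output of each receiver.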
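The effect of the shift matrix on a received signature sequence can be illustrated without forming any matrix explicitly. The sketch below assumes that the $N$-chip upshift yields the ISI sequence for the previous symbol period and the $N$-chip downshift yields the one for the next symbol period; this assignment, the function names and the toy sequence are assumptions made for illustration.

```python
def downshift(q, n):
    """Shift a column down by n chips, filling the top with n zeros."""
    return [0] * n + list(q[:len(q) - n])

def upshift(q, n):
    """Shift a column up by n chips, filling the bottom with n zeros."""
    return list(q[n:]) + [0] * n

N, L = 4, 3
q = [1, 2, 3, 4, 5, 6]        # toy received sequence of length N + L - 1 = 6

# Tail of the previous symbol that spills into the start of the current window.
q_prev = upshift(q, N)
# Head of the next symbol that overlaps the end of the current window.
q_next = downshift(q, N)
print(q_prev, q_next)
```

Only $L - 1$ chips of each shifted sequence are non-zero, which reflects the fact that the ISI comes solely from the multipath spread beyond the $N$-chip symbol period.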
One of our main objectives is to minimize the total MSE $\varepsilon_T = \sum_{k=1}^{K} \varepsilon_k$, based on [17,18], in order to maximize the total rate $b_T = \sum_{k=1}^{K} b_{p_k}$, where $b_{p_k}$ is the number of discrete bits allocated to each spreading sequence symbol, subject to the energy constraint $\sum_{k=1}^{K} E_k \leq E_T$. This can be written in terms of a Lagrangian dual objective function, with $\lambda$ as the Lagrangian multiplier, both for minimizing $\varepsilon_T$ in (17) and for maximizing the total rate $b_T$ in (18), where the $b_{p_k}$ are discrete values.
Rearranging (15), the system value $\lambda_k$ can be written as follows:

$$\lambda_k = 1 - \varepsilon_k = E_k q_k^H C^{-1} q_k = \frac{\gamma_k}{1 + \gamma_k}.$$

Then, (17) and (18) are also equivalent to optimizing the total system value $\lambda_T = \sum_{k=1}^{K} \lambda_k$. The following section will introduce the system value optimization in (21) for sum capacity maximization.
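The link between the system value, the output SINR and the MSE can be made concrete with a few one-line conversions. The sketch assumes the relations $\lambda_k = 1 - \varepsilon_k = \gamma_k / (1 + \gamma_k)$, which is how the system value is read here; the function names are hypothetical.

```python
def system_value_from_sinr(gamma):
    """lambda_k = gamma_k / (1 + gamma_k), assumed relation."""
    return gamma / (1.0 + gamma)

def sinr_from_system_value(lam):
    """gamma_k = lambda_k / (1 - lambda_k), inverse of the relation above."""
    return lam / (1.0 - lam)

def mse_from_system_value(lam):
    """epsilon_k = 1 - lambda_k, assumed relation."""
    return 1.0 - lam

gamma = 3.0                                  # linear SINR (about 4.8 dB)
lam = system_value_from_sinr(gamma)          # system value of this channel
eps = mse_from_system_value(lam)             # corresponding MSE
print(lam, eps, sinr_from_system_value(lam))
```

Under these assumed relations each system value lies between 0 and 1, so the total system value over $K$ channels cannot exceed $K$.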

The system value optimization for sum capacity maximization
This section first describes the system upper-bound using the system value optimization when energies are allocated equally in all channels. As the aim is to optimize the total rate in (18) when allocating the same rate, the section then describes the use of the system value to optimize the total rate for equal rate allocation with varying energy.
With the relations between $\gamma_k$ and $\lambda_k$ given in (19), Shannon's system capacity equation for a practical system can be written in terms of $\gamma_k$ and $\lambda_k$ as

$$C_T = \sum_{k=1}^{K} \log_2\!\left(1 + \frac{\gamma_k}{\Gamma}\right) = \sum_{k=1}^{K} \log_2\!\left(1 + \frac{\lambda_k}{\Gamma (1 - \lambda_k)}\right), \tag{23}$$

where $\Gamma$ is the gap value. When the available energy is equally distributed such that $E_k = E_T / K$, the total system value can be defined as

$$\lambda_T = \sum_{k=1}^{K} \lambda_k = \frac{E_T}{K} \sum_{k=1}^{K} q_k^H C^{-1} q_k, \tag{24}$$

which leads to a very close approximation, (25), to the system capacity in (23). However, this upper-bound is only valid for equal energy allocation $E_k = E_T / K$ with variable $b_{p_k}$, which requires a large discrete set of data rates. To make the system more practical, our interest is to maximize the total rate by allocating the same discrete rate $b_{p_k} = b_p$ to all channels. The target system value $\lambda_k^*(b_{p_k})$ is related to the discrete rate through (26), and the energy required to reach it is

$$E_k = \frac{\lambda_k^*(b_{p_k})}{q_k^H C^{-1} q_k}. \tag{27}$$

The use of equal rate allocation to maximize the total rate in terms of the system value is reformulated in (28). When optimizing the total rate in (28), $E_k$ and the covariance matrix $C$ are functions of each other. Hence, the energy for each channel needs to be iteratively updated using (27). With the energies initially allocated equally in all channels, the iterative optimization starts by calculating the energy $E_k$ using (27) for a given system value $\lambda_k^*(b_{p_k})$ for the corresponding (or target) discrete rates $b_{p_k} = b_p$. The inverse matrix $C^{-1}$ is recalculated according to the energies $E_k$ for $k = 1, \ldots, K$ at each energy iteration. The receiver coefficient $w_k$ in (9) also depends on the continuously updated $C^{-1}$. This iterative process, with a covariance matrix inversion in each iteration, is repeated until all the energies converge to fixed values.
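The role of the gap value can be illustrated by evaluating a gap-approximated sum capacity for a given set of system values. The sketch assumes the standard gap form $\sum_k \log_2(1 + \gamma_k / \Gamma)$ with $\gamma_k = \lambda_k / (1 - \lambda_k)$; the function names and the example system values are illustrative assumptions, not results from the article.

```python
import math

def db_to_linear(x_db):
    """Convert a value in dB to a linear ratio."""
    return 10.0 ** (x_db / 10.0)

def sum_capacity(system_values, gap_db=0.0):
    """Gap-approximated sum capacity in bits per symbol (assumed standard form)."""
    gap = db_to_linear(gap_db)
    return sum(math.log2(1.0 + lam / (gap * (1.0 - lam)))
               for lam in system_values)

K = 15
system_values = [0.75] * K                   # hypothetical equal system values

upper_bound = sum_capacity(system_values, gap_db=0.0)
practical = sum_capacity(system_values, gap_db=0.75)
print(upper_bound, practical)
```

With a gap of 0 dB every channel carries $\log_2(4) = 2$ bits per symbol in this example; a non-zero gap lowers the achievable rate for the same system values.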
These iterative energy calculations are repeated for different rate combinations $b_p \in \{b_1, \ldots, b_P\}$ until a given rate combination maximizes the total rate while satisfying the energy constraint $\sum_{k=1}^{K} E_k \leq E_T$. Our optimization objective is to have a feasible practical implementation by keeping the total number of energy iterations to a minimum and eliminating the need to invert the covariance matrix in every energy iteration, whilst approaching the capacity upper bound for the transmission channel. In the following section, the practical implementation of this discrete rate maximization method is made feasible by modifying the system values under the assumption that a SIC based receiver is used.

System value simplifications using the SIC concept
To maximize the total rate, the energies in each channel are iteratively adjusted to achieve the target system value $\lambda_k^*$. The previous section showed the recursive relation between $E_k$ and $C^{-1}$, which makes the iterative energy calculation computationally expensive. The SIC proposed in this article removes the dependence on $C^{-1}$ when calculating $E_k$ by using the recursive covariance matrix $C_k$ in (11).
With this SIC formulation, each channel has its own corresponding recursive covariance matrix $C_k$ for $k = 1, \ldots, K$. This means that $E_k$ can be iteratively updated without the need to invert the matrix $C$ in the process. The corresponding $C_k^{-1}$ is only calculated when the final allocated energy of that channel is found. By forming $C_k^{-1}$ in terms of the stored $C_{k-1}^{-1}$ from the previous channel and the final iteration of $E_k$, the total number of matrix inversions for the whole iterative energy update over all channels reduces to one. The corresponding MMSE linear equalizer coefficient $w_k$ given in (9) is expressed in terms of $C_k^{-1}$ as

$$w_k = \frac{C_k^{-1} q_k}{q_k^H C_k^{-1} q_k} \tag{31}$$

for $k = 1, \ldots, K$. The modified version of the system values given in (20) becomes

$$\lambda_k = E_k q_k^H C_k^{-1} q_k, \tag{32}$$

while the SINR at the receiver output in (16) is modified to be calculated in terms of $D_k$ in (13) as follows:

$$\gamma_k = E_k q_k^H D_k^{-1} q_k. \tag{33}$$

Through the use of the recursive covariance matrix formulation, the proposed SIC decreases the number of matrix inversions to one, which dramatically reduces the computational complexity. Our SIC formulation also improves the total data rate by removing the inter-code interference and the ISI caused by the transmitted symbol $x_k(\rho)$ from the received vector $r(\rho)$. The improvement can be further increased by channel ordering, where channels are ordered starting from those with the smallest system values $\lambda_k$ for $k = 1, \ldots, K$. The SIC-based receiver model will be described in the following section.

The successive interference cancelation and the receiver structure
In contrast to the receiver model described in Section 3, where the signal processing is done in parallel, the SIC receiver shown in Figure 1 processes the signal channel by channel from $k = K, \ldots, 1$. Starting from the received matrix $R_K$ for the $K$th channel, the signal is despread using the MMSE coefficients calculated using (31) and then decoded.
The decoded bit vector is then re-coded and re-modulated to regenerate the transmitted symbol vector $\hat{x}_K$. This process is done by using the coded parity packet (CPP) scheme in [33]. The regenerated symbol vector is multiplied with the received signature sequence and scaled by the amplitude $\sqrt{E_K}$ before it is removed from the current received matrix $R_K$ to form the new matched filter matrix $R_{K-1}$ for the $(K-1)$th channel. This iterative despreading, decision, signal regeneration and signal canceling process is repeated for every channel from $k = K$ to $k = 1$. The signal cancelation that forms the new matched filter matrix for the $(k-1)$th channel is done after estimating the signal for the $k$th channel, for $k = K, \ldots, 1$, by subtracting the regenerated contribution of channel $k$, including its ISI components, from $R_k$. The ISI components use the symbols received in the previous and the next symbol periods, while $q_{k,1}$ and $q_{k,2}$ are the ISI interference signature sequence matrix components defined in Section 3.

Figure 1 System block diagram. The system block diagram for the successive interference cancelation receiver.
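The despread, decide, regenerate and cancel loop can be sketched for a single symbol period as follows. This is a deliberately simplified illustration and not the receiver of this article: it assumes BPSK symbols, a noise-free received vector, plain matched-filter despreading instead of the MMSE coefficients, and no ISI terms. All names and numbers are hypothetical.

```python
import math

def sign(v):
    """Hard BPSK decision."""
    return 1 if v >= 0 else -1

def sic_detect(r, signatures, energies):
    """Detect channels from k = K down to 1, cancelling each one after detection."""
    residual = list(r)
    decisions = [0] * len(signatures)
    for k in reversed(range(len(signatures))):
        q = signatures[k]
        stat = sum(q_i * r_i for q_i, r_i in zip(q, residual))   # despread
        decisions[k] = sign(stat)                                # decide
        amp = math.sqrt(energies[k])
        # Regenerate the contribution of channel k and remove it.
        residual = [r_i - amp * q_i * decisions[k]
                    for q_i, r_i in zip(q, residual)]
    return decisions, residual

# Two non-orthogonal received signatures; the last channel is the strongest.
signatures = [[0.6, 0.8], [1.0, 0.0]]
energies = [1.0, 4.0]
sent = [-1, 1]
r = [sum(math.sqrt(e) * q[i] * x for q, e, x in zip(signatures, energies, sent))
     for i in range(2)]

decisions, residual = sic_detect(r, signatures, energies)
no_sic_first = sign(sum(q_i * r_i for q_i, r_i in zip(signatures[0], r)))
print(decisions, no_sic_first)
```

In this toy case the weaker channel is detected correctly only after the stronger one has been cancelled; a direct matched-filter decision on it, without cancelation, gives the wrong sign.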
The following section will introduce the SIC-based energy calculation method and the calculation of the recursive covariance matrix inverse.

The SIC-based energy calculation method
The SIC-based energy calculation simplifies the iterative energy calculations and the covariance matrix inversion, as introduced in the previous sections. This section describes the formulation of the recursive covariance matrix inverse $C_k^{-1}$ and the calculation of $E_k$ based on $C_{k-1}^{-1}$. The recursive covariance matrix inverse $C_k^{-1}$ is expressed as a linear combination of weighted vectors, the covariance matrix inverse of the previous channel $C_{k-1}^{-1}$ (or the weighted identity matrix inverse $C_0^{-1} = \frac{1}{2\sigma^2} I_{N+L-1}$ for the first channel) and the allocated energy for the current channel $E_k$.
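With $C_k$ expressed in terms of $D_k$ in (13), the inverse $D_k^{-1}$ can be written in terms of $C_{k-1}^{-1}$ and $E_k$. Using the matrix inversion lemma on (12), as shown in Appendix 1, the matrix inverse $C_k^{-1}$ becomes

$$C_k^{-1} = D_k^{-1} - \frac{E_k D_k^{-1} q_k q_k^H D_k^{-1}}{1 + E_k q_k^H D_k^{-1} q_k},$$

which only depends on the stored $C_{k-1}^{-1}$ and the variable $E_k$. By defining the distance vectors $d$, $d_1$, $d_2$ and the weights $\xi$, $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$ and $\zeta$, $\zeta_1$, $\zeta_2$ in terms of $C_{k-1}^{-1}$ and the signature sequences $q_k$, $q_{k,1}$ and $q_{k,2}$, the inverse of the recursive covariance matrix $C_k^{-1}$ can be simplified into the linear combination of weighted vectors given in (41), which is proven in Appendix 2.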
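The matrix inversion lemma step can be illustrated with a rank-one inverse update in plain Python. This sketch shows the generic Sherman-Morrison identity for a Hermitian matrix rather than the closed-form weighted-vector expression of the article; the function name and the 2x2 example are assumptions made for illustration.

```python
def rank_one_inverse_update(inv, q, energy):
    """Return (A + energy * q q^H)^{-1} given inv = A^{-1}, for Hermitian A."""
    n = len(q)
    # d = A^{-1} q
    d = [sum(inv[i][j] * q[j] for j in range(n)) for i in range(n)]
    # xi = q^H A^{-1} q is real for Hermitian A
    xi = sum(complex(q[i]).conjugate() * d[i] for i in range(n)).real
    scale = energy / (1.0 + energy * xi)
    # A^{-1} q q^H A^{-1} equals d d^H because A^{-1} is Hermitian.
    return [[inv[i][j] - scale * d[i] * complex(d[j]).conjugate()
             for j in range(n)] for i in range(n)]

two_sigma_sq = 0.5
c0_inv = [[1.0 / two_sigma_sq, 0.0], [0.0, 1.0 / two_sigma_sq]]  # C_0^{-1}
q = [1.0, 1.0]
c1_inv = rank_one_inverse_update(c0_inv, q, energy=1.0)
print(c1_inv)
```

The update needs only matrix-vector products with the stored inverse, so no explicit matrix inversion is required once $C_0^{-1}$, a scaled identity, is known. In the example, the result is the inverse of $0.5 I + q q^H$.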
With the relationship between the SINR $\gamma_k^*(b_{p_k})$ and $D_k$ in (33), the iterative energy can be re-expressed as

$$E_k = \frac{\gamma_k^*(b_{p_k})}{q_k^H D_k^{-1} q_k},$$

and with (36), the iterative energy calculation for the $k$th channel can be simplified to (43), where $i$ is the energy iteration index. From (43), the energy update $E_{k,i}$ in the SIC formulation only requires the variable $E_{k,i-1}$ and the stored $C_{k-1}^{-1}$. The iterative energy calculation using SIC to obtain the target SINR $\gamma_k^*$ for all channels can be summarized as follows:

1. Initialize the target SINR $\gamma_k^* = 2^{b_{p_k}} - 1$ and $C_0^{-1} = \frac{1}{2\sigma^2} I_{N+L-1}$.
2. Starting from $k = 1$, calculate the corresponding vectors $d$, $d_1$, $d_2$ and weights $\xi$, $\xi_1$, $\xi_2$, $\xi_3$, $\xi_4$ and $\zeta$, $\zeta_1$, $\zeta_2$.
3. Perform the energy calculation $E_{k,i}$ from $i = 1$ to $I_{\max}$ using (43).
4. Calculate $C_k^{-1}$ using $E_{k,I_{\max}}$ with (41) and the MMSE coefficient $w_k$ with (31).
5. Repeat steps 2-4 for all channels until $k = K$.
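The channel-by-channel energy iteration summarized above can be sketched as follows. The sketch is an assumption-laden illustration: instead of the closed-form update (43) and the weighted-vector form (41), it rebuilds $D_k^{-1}$ from the stored $C_{k-1}^{-1}$ with generic rank-one updates and applies $E_k = \gamma_k^* / (q_k^H D_k^{-1} q_k)$ as a fixed-point iteration. All function names and the toy signatures are hypothetical.

```python
def rank_one_inverse_update(inv, q, energy):
    """Return (A + energy * q q^H)^{-1} given inv = A^{-1}, for Hermitian A."""
    n = len(q)
    d = [sum(inv[i][j] * q[j] for j in range(n)) for i in range(n)]
    xi = sum(complex(q[i]).conjugate() * d[i] for i in range(n)).real
    scale = energy / (1.0 + energy * xi)
    return [[inv[i][j] - scale * d[i] * complex(d[j]).conjugate()
             for j in range(n)] for i in range(n)]

def quad_form(inv, q):
    """Return the real quadratic form q^H inv q."""
    n = len(q)
    return sum(complex(q[i]).conjugate() * inv[i][j] * q[j]
               for i in range(n) for j in range(n)).real

def d_inverse(c_prev_inv, q1, q2, e_k):
    """D_k^{-1} for D_k = C_{k-1} + E_k q1 q1^H + E_k q2 q2^H."""
    return rank_one_inverse_update(
        rank_one_inverse_update(c_prev_inv, q1, e_k), q2, e_k)

def sic_energy_allocation(signatures, target_sinrs, noise_var, i_max=10):
    """signatures: list of (q_k, q_k1, q_k2); returns (energies, C_K^{-1})."""
    dim = len(signatures[0][0])
    c_inv = [[1.0 / (2.0 * noise_var) if i == j else 0.0
              for j in range(dim)] for i in range(dim)]      # C_0^{-1}
    energies = []
    for (q, q1, q2), gamma in zip(signatures, target_sinrs):
        e_k = 0.0
        for _ in range(i_max):                               # energy iterations
            e_k = gamma / quad_form(d_inverse(c_inv, q1, q2, e_k), q)
        # Update the stored inverse once, using the final energy of channel k.
        c_inv = rank_one_inverse_update(d_inverse(c_inv, q1, q2, e_k), q, e_k)
        energies.append(e_k)
    return energies, c_inv

zero = [0.0, 0.0]
no_isi = [([1.0, 0.0], zero, zero), ([2 ** -0.5, 2 ** -0.5], zero, zero)]
energies_no_isi, _ = sic_energy_allocation(no_isi, [3.0, 3.0], noise_var=0.25)

with_isi = [([1.0, 0.0], [0.0, 0.1], [0.1, 0.0])]
energies_isi, _ = sic_energy_allocation(with_isi, [3.0], noise_var=0.25)
print(energies_no_isi, energies_isi)
```

In the example without ISI the energies follow directly from the stored inverse, while the example with ISI shows the fixed-point iteration settling on a slightly larger energy because the channel's own ISI scales with its allocated energy.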
The next part will describe the selection of the optimum $b_{p_k}$ values using the two-group allocation to optimize the total rate.

The SIC-based two-group loading scheme
When allocating the same rate $b_{p_k} = b_p$ for $k = 1, \ldots, K$ channels, the total rate is given by $R_T = K b_p$. As $b_p$ is selected from a discrete set, the total energy $E_T$ may not be fully used, as shown in [31]. The use of the two-group allocation was suggested to increase the total rate to $R_T = (K - m) b_p + m b_{p+1}$.
To search for the optimum $b_p$ and $m$ values, the total number of matrix inversions required in [31] is $(P + K - 1) I_{\max}$, where $P I_{\max}$ iterations are required to determine $b_p$, while $(K - 1) I_{\max}$ iterations are required to determine $m$. The optimum $b_p$ is found as follows:

1. For each $b_p \in \{b_1, \ldots, b_P\}$, set $b_{p_k} = b_p$ and its corresponding target SINR $\gamma_k^*$ for $k = 1, \ldots, K$.
2. Run the SIC-based energy calculation to find $E_k(b_p)$ for $k = 1, \ldots, K$.
3. Stop the iteration when $b_p$ satisfies $\sum_{k=1}^{K} E_k(b_p) \leq E_T < \sum_{k=1}^{K} E_k(b_{p+1})$.

This ensures that the maximum $b_p$ is found without violating the energy constraint $E_T$. If $p = P$, the total rate is maximized for the given discrete set of bit rates. Otherwise, the total rate is further optimized by using the two-group allocation. The optimum number of channels, $m$, to be loaded with rate $b_{p+1}$ is found as follows:

1. For each $m = 1, \ldots, K - 1$, set $b_{p_k} = b_p$ for $k = 1, \ldots, K - m$ and set $b_{p_k} = b_{p+1}$ for $k = K - m + 1, \ldots, K$. Find the corresponding target SINR $\gamma_k^*$.
2. Run the SIC-based energy calculation to find $E_k(b_{p_k})$ for $k = 1, \ldots, K$.
3. Stop the iteration when $E_T < \sum_{k=1}^{K} E_k(b_{p_k})$ and set $m = m - 1$.
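The two search stages can be sketched as a small routine that takes the energy calculation as a pluggable function. This is a minimal sketch under stated assumptions: `energy_for_rates` is a hypothetical stand-in for the SIC-based energy calculation, and the toy energy model below ignores interference entirely so that the result can be checked by hand.

```python
def two_group_loading(rates, num_channels, total_energy, energy_for_rates):
    """Return (b_p, m, R_T) for rates sorted in ascending order, or None."""
    # Stage 1: largest equal rate b_p that satisfies the energy constraint.
    p_best = None
    for p, b_p in enumerate(rates):
        if sum(energy_for_rates([b_p] * num_channels)) > total_energy:
            break
        p_best = p
    if p_best is None:
        return None                      # even the lowest rate is infeasible
    b_low = rates[p_best]
    if p_best + 1 == len(rates):
        return b_low, 0, num_channels * b_low
    # Stage 2: largest number m of channels that can carry b_{p+1}.
    b_high = rates[p_best + 1]
    m_best = 0
    for m in range(1, num_channels):
        loading = [b_low] * (num_channels - m) + [b_high] * m
        if sum(energy_for_rates(loading)) > total_energy:
            break
        m_best = m
    total_rate = (num_channels - m_best) * b_low + m_best * b_high
    return b_low, m_best, total_rate

def toy_energy_model(loading, noise_term=0.5):
    """Interference-free toy model: E_k = (2^b - 1) * noise_term (assumption)."""
    return [(2 ** b - 1) * noise_term for b in loading]

result = two_group_loading([1, 2, 3], 4, 8.5, toy_energy_model)
print(result)
```

In the toy example, four channels at 2 bits per symbol need 6 energy units, and raising one channel to 3 bits per symbol needs 8 units, which still fits the budget of 8.5; raising a second channel does not, so the search stops at $m = 1$.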
The following section will evaluate the performance of the two-group allocation with SIC.

Numerical results
The proposed SIC-based two-group resource allocation scheme has been tested using the following parameters: the chip rate is $1/T_c = 3.84$ Mchips/s, the number of channels is $K = 15$, the spreading factor is $N = 16$, the additive white noise variance is $\sigma^2 = 0.02$ and the number of delayed propagation paths is $L = 4$. The channel impulse responses considered, including the Pedestrian B response $h_{\text{ped B}}$ and the Vehicular A response $h_{\text{vec A}}$, were used to produce the power delay profiles for the transmission system. Using a fading generator, each coefficient of the channel impulse response was randomly faded and complex coefficients for the transmission channels were generated. Each channel impulse response was used to generate a set of 100 impulse responses. Results were produced for the total system throughput, the total system values, the number of matrix inversions and the total energy margin between the total available and used energies. The throughputs for the different schemes are referred to as the two group constrained optimization (TG), the margin adaptive constrained optimization (MA), the successive interference cancelation constrained optimization (SIC) and the system throughput upper bound (UB). The throughput results were plotted in Figure 2 as a function of the total input SNR, $|h|^2 E_T / (2\sigma^2)$. The system upper bound for the MMSE based receivers was obtained using (25) by setting the gap value $\Gamma = 0$ dB for the UB throughput curve. The remaining throughput curves for the SIC, TG and MA cases were produced using the gap value $\Gamma = 0.75$ dB.
The objective of the results presented in Figure 2 is to compare the throughput performances of the TG, MA, and SIC cases against the theoretical upper bound by averaging over 100 different channels. The throughput results are measured in terms of the total number of bits per symbol period. The TG results were generated using the despreader coefficients given in (9) and the covariance matrix given in (10). The iterative energy calculations were used to find the energies using (27) for a given set of discrete rates $b_{p_k}$, which are related to the target system values $\lambda_k^*$ as given in (26). Each iterative energy calculation requires a covariance matrix inversion. The main objective of the tests is to determine how close the constrained optimization throughputs can get to the UB capacity when using different ways of controlling the number of matrix inversions in the energy allocation process. The first control parameter used was the maximum number of iterations $I_{\max}$, which was set to 100 for the TG and MA cases. The second control parameter was the error between two consecutive energies during the iterative energy calculations. This error measurement was $\Delta E = |E_{k,i} - E_{k,(i-1)}|$, where $i$ is the iteration number taking values between 1 and $I_{\max}$. The residual energy error was set to one of two values, $\Delta E = 0$ or $\Delta E = 0.001 E_T$.
Using a constrained rate adaptive optimization method, the total bit rate $R_{T,TG} = (K - m) b_p + m b_{p+1}$ for the two group optimization was maximized for the allocated energy constraint $\sum_{k=1}^{K} E_k \leq E_T$. The constrained energy allocation objective was to find the rate $b_p$ and the number of channels $m$ in the second group when maximizing the total rate $R_T$. For the margin adaptive optimization case, the same iterative energy calculation was used by considering the target system values in terms of the same bit rates as the ones used in the TG constrained optimization. However, when maximizing the total rate, the SNR at the output of each MMSE equalizer is kept the same, so that the maximum total rate that may be carried is equal to $R_{T,MA} = K b_p$.
The MA constrained energy optimization objective is to find the discrete rate value $b_p$ for a given energy allocation constraint $\sum_{k=1}^{K} E_k \leq E_T$ and the total receiver SNR $|h|^2 E_T / (2\sigma^2)$. The successive interference cancelation receiver considered uses the despreading coefficient calculations based on (31). The system value and the energy relationship for the SIC constrained optimization receiver are based on (32). The two group SIC constrained optimization objective is described in Section 6.1. The rate maximization criterion is based on the iterative energy allocation scheme given in (43), with the maximum number of iterations $I_{\max} = 10$. The covariance matrix is inverted using the iterative relationship given in (41) for the allocated energies. The objective is to find two parameters: the rate $b_p$ and the number of channels $m$. However, as these values are available when running the simulations for the TG case, it was sufficient to calculate the total energies allocated to each channel using the algorithm given in Section 6.1. This was done for a given combination of the rate $b_p$ and the number $m$ obtained from the TG case. Using the allocated energies, the received total SNR is calculated to produce the SNR versus throughput results.
In Figure 2, the throughput results obtained using the Matlab simulation package are presented for the Pedestrian B channel $h_{ped\,B}$ after averaging a total of 100 sets of measurements. There is a 1.5 to 2.0 dB difference between the UB and TG results. Part of the shift is due to the gap value $\Gamma = 0.75$ dB used during the simulations. The difference between the MA and the UB results is approximately 4 to 6 dB. However, the SIC based receiver throughput performance is closer to the theoretical system upper bound (UB) capacity results. In Figure 3, results corresponding to the Vehicular A channel $h_{vec\,A}$ are presented to show the same characteristics observed in Figure 2. When the two-group TG and SIC resource allocation schemes and the margin adaptive MA loading scheme are compared to each other in Figures 2 and 3, it is observed that the SIC scheme has the highest system throughput. Therefore, the SIC scheme is preferable for practical systems over the TG and the MA schemes.

Figure: The total system throughputs for Vehicular A channel. The total system throughputs for the two-group resource allocation (TG), the margin adaptive (MA) and the SIC schemes are compared with the upper bound system throughput.
Figure: The total system throughputs for Pedestrian A channel. The total system throughputs for the two-group resource allocation (TG), the margin adaptive (MA) and the SIC schemes are compared with the upper bound system throughput.

The primary aim for each loading scheme under consideration is to increase the total system value, which is upper bounded by $K = 15$. As the system value increases, the realizable bit rate $b_p$ increases, hence improving the total bit rate. The calculated total system value for each scheme and each total input SNR is plotted in Figure 4 for the UB, TG, SIC, and MA schemes by averaging results corresponding to 100 channels generated from the channel response $h_{ped\,B}$. The objective of the experiment, which produced the results given in Figure 4, was to demonstrate that the total system value upper bound can be achieved when using the SIC based constrained optimization. The total system value $\lambda_T$ upper bound for the UB case is calculated using (24) when allocating equal energy $E_k = E_T / K$ to each channel $k = 1, \ldots, K$. The total system values for the TG, SIC and MA schemes were calculated by adding the target system values corresponding to the allocated discrete rates $b_{p_k}$ for $k = 1, \ldots, K$. The total system values are plotted against the received total SNR $\frac{|h|^2 E_T}{2\sigma^2}$ for the UB, TG, SIC, and MA cases. The SNR for the SIC scheme is calculated by replacing the $E_T$ value with the total allocated energy $\sum_{k=1}^{K} E_k$ in the total SNR equation. Results in Figure 4 show that the TG total system value is very close to the total UB system value. The SNR required for the total system value for the MA scheme is approximately 2 dB higher than the UB case at low SNR values. This difference comes down to 1 dB at higher SNR values. The SNR for the total system value for the SIC scheme is slightly lower than the UB case. This is due to the impact of the interference suppression introduced by the SIC scheme. The total system value for the SIC scheme, as expected, is observed to be the highest of all the schemes. A higher system value on each channel will result in a higher SNR, which is desirable to improve the total bit rate as well as the detection process at the receiver end.
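The link between allocated discrete rates, target SNRs and the total system value can be sketched as follows. This sketch assumes that the per-channel system value and the SNR are related by $\lambda = \gamma / (1 + \gamma)$ and that the target SNR is $\gamma^* = \Gamma (2^{b} - 1)$; both relations are assumptions made for illustration, since the defining equations lie outside this section, and the function names are hypothetical.

```python
def target_snr(bits, gap_db=0.0):
    """Target SNR (linear) for a given number of bits per symbol.

    Assumes gamma* = Gamma * (2**bits - 1), with the gap Gamma given in dB.
    """
    gap = 10 ** (gap_db / 10)
    return gap * (2 ** bits - 1)


def target_system_value(bits, gap_db=0.0):
    """Per-channel target system value, assuming lambda = gamma / (1 + gamma)."""
    gamma = target_snr(bits, gap_db)
    return gamma / (1 + gamma)


def total_system_value(bit_rates, gap_db=0.0):
    """Sum of the per-channel target system values over all channels."""
    return sum(target_system_value(b, gap_db) for b in bit_rates)


# Hypothetical allocation: 15 channels, each carrying 1 bit per symbol.
allocation = [1] * 15
lambda_total = total_system_value(allocation)
print(lambda_total)
```

Under these assumptions each per-channel value stays below one, so the total can approach but never exceed the number of channels, which is consistent with the bound $K = 15$ quoted above.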
In order to compare the SIC scheme with the TG and MA schemes, the number of energy calculation iterations and the number of matrix inversions are taken as the measures of the computational complexity of each scheme. The main objective of using the SIC based MMSE receiver and the two-group resource allocation is to reduce the number of matrix inversions required to run the two resource allocation algorithms for multi-code downlink transmission channels. As the SIC scheme does not require a matrix inversion, Figure 5 shows the number of matrix inversions required by the TG and MA loading schemes only, for $\Delta E = 0$ and $\Delta E = 0.001 E_T$. The MA scheme requires a maximum of $P I_{max}$ iterations to determine the energy $E_k$ required for each channel to realize $R_T = K b_p$ bits per symbol. The TG scheme requires a maximum of $(P + K - 1) I_{max}$ iterations to determine the energy $E_k$ to realize $R_T = (K - m) b_p + m b_{p+1}$. It is clear that the TG scheme requires considerably more matrix inversions, although it has much better system throughput and total system value results than the MA scheme. When the error value is increased to $\Delta E = 0.001 E_T$, there is a significant reduction in the total number of matrix inversions for both the TG and MA schemes. However, as the SIC scheme is free from matrix inversions and provides better system throughput and total system values than the TG scheme, the SIC scheme would be the preferred option for downlink throughput optimization from the point of view of the number of matrix inversions.
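The two worst-case iteration counts quoted above can be compared with a short sketch. The meaning of $P$ is assumed here to be the number of candidate discrete rates, and the numerical values are hypothetical; only the two formulas come from the text.

```python
def max_iterations_ma(P, I_max):
    """Worst-case energy calculation iterations for the MA scheme: P * I_max."""
    return P * I_max


def max_iterations_tg(P, K, I_max):
    """Worst-case energy calculation iterations for the TG scheme: (P + K - 1) * I_max."""
    return (P + K - 1) * I_max


# Hypothetical values (assumptions for illustration only).
P = 12        # assumed number of candidate discrete rates
K = 15        # number of channels
I_max = 100   # iteration cap per energy calculation

ma_bound = max_iterations_ma(P, I_max)
tg_bound = max_iterations_tg(P, K, I_max)
print(ma_bound, tg_bound, tg_bound - ma_bound)
```

The difference between the two bounds is $(K - 1) I_{max}$, so the extra cost of the TG search grows with the number of channels rather than with the number of rates.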
When the SIC-based energy calculation method is in place, the maximum number of iterations is observed to be reduced from approximately $I_{max} = 100$ for the case without SIC to approximately $I_{max} = 10$ for the case with SIC. The main reason behind this reduction is the simplified SIC-based energy calculation method, which requires no matrix inversions. This energy calculation method requires only several constants and vectors, and the energy updated or calculated at every iteration is the energy of the current channel. Therefore, by implementing the SIC-based energy calculation method with the two-group resource allocation scheme to determine $b_p$ and $m$, the number of energy calculation iterations is reduced significantly. This approach is recommended for practical systems such as femtocells.
Apart from the throughput and matrix inversion advantages of the proposed SIC scheme, there is an improved utilization of the transmission energy by the SIC loading. When providing the same throughput, the energy utilization efficiency of the data rate loading algorithm can be measured in terms of the total energy margin, defined as the difference between the total available energy $E_T$ and the total allocated energy $\sum_{k=1}^{K} E_k$. Using the constrained optimization schemes ensures that the margin is non-negative. When comparing two systems that transmit the same number of bits per symbol period, the system with a positive margin is better. However, if a system provides a positive margin at the expense of reducing the total rate, it is not as energy efficient as a system which uses the available energy to provide an improved total rate. In Figure 6 the energy margins are plotted for the SIC, MA and TG schemes using the Pedestrian B channel. The energy margin for the MA scheme is the highest. This is because the MA scheme tends to allocate the energy such that the SNR at the output of each MMSE despreader is equal in each channel. As a result, the sum of the unequal energies allocated to the channels may be lower than the total constrained energy $E_T$, yielding a relatively significant amount of residual energy, which is not utilized. The unused energy, which is a function of the total available energy, tends to increase since the energy is not fully utilized on each channel. The increased energy margin is due to the reduced number of bits transmitted by the MA scheme. Therefore, the MA scheme is not as energy efficient as the SIC and TG schemes. When comparing the energy margins of the SIC and TG schemes, it is clear that the SIC scheme has a higher energy margin than the TG scheme.
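A minimal sketch of the energy margin calculation is given below. It assumes the margin is the plain difference between the total available energy and the sum of the allocated energies; any normalization used in the original definition is not recoverable here, and the function name and the numerical values are hypothetical.

```python
def energy_margin(E_T, energies, tol=1e-12):
    """Return E_T minus the sum of the per-channel energies.

    Raises ValueError if the allocation violates the constraint sum(E_k) <= E_T.
    """
    allocated = sum(energies)
    margin = E_T - allocated
    if margin < -tol:
        raise ValueError("allocated energy exceeds the total energy constraint")
    return margin


# Hypothetical allocations for two schemes carrying the same number of bits.
E_T = 1.0
scheme_a = [0.25, 0.25, 0.25]    # leaves a quarter of the energy unused
scheme_b = [0.25, 0.25, 0.5]     # uses all of the available energy

margin_a = energy_margin(E_T, scheme_a)
margin_b = energy_margin(E_T, scheme_b)
print(margin_a, margin_b)
```

As the surrounding discussion notes, a larger margin is only an advantage when both allocations carry the same total rate; a margin obtained by transmitting fewer bits indicates unused energy rather than efficiency.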
The results for the throughput, the total system value, the number of matrix inversions and the energy utilization margin are better for the SIC scheme than for the TG and MA schemes. The SIC scheme provides a performance close to the theoretical upper throughput bound that can be achieved using the MMSE linear receiver for the downlink system throughput optimization.

Conclusions
A novel successive-interference-cancelation based two-group resource allocation scheme has been proposed in this article for energy minimization and bit rate maximization with a relatively low computational complexity. The need to undertake matrix inversions when calculating the energy to be loaded to each spread sequence channel has been removed with a simple energy calculation method. This computationally efficient resource allocation design is also equipped with a coded packet transmission providing regenerated signals, which are removed during the successive interference cancelation process. A system model for the HSDPA SISO system is proposed, and this model is integrated with the SIC based scheme to allocate energies iteratively whilst maximizing the averaged total system capacity. The scheme uses the iterative energy and covariance matrix inversion method to produce system values and an upper bound for the system capacity. Matlab based system simulations have been run using power delay profiles corresponding to Pedestrian A, Pedestrian B and Vehicular A channels. Simulations show that the proposed iterative energy calculation and rate allocation method provides sum capacities very close to the system upper bound.
The system capacity for the equal energy loading case is lower than that for the iterative energy loading case. The number of matrix inversions is examined for the equal energy and iterative energy loading cases. The two-group algorithm without the SIC scheme has the highest number of matrix inversions. The equal energy loading case has fewer matrix inversions than the iterative energy loading case. However, the proposed SIC based iterative matrix inversion method has the least number of operations when allocating energies.

Figure 6 Energy margin comparisons. The energy margins for the two group scheme, the margin adaptive loading scheme and the SIC schemes are compared to identify how efficiently the available total energy is allocated to different channels.
The energy margin between the total available energy and the total of the allocated energies has been examined for the equal and iterative energy loading schemes. The energy margin is the highest for the equal energy loading case because, at certain receiver SNR values, it does not increase the transmission rate, as there is not sufficient energy available to increase the data rate over each channel.
The results presented in this article confirm that the proposed iterative energy and covariance matrix inversion scheme provides a significant performance improvement for the multi-code downlink transmission, which could be useful for increasing the capacity of high speed downlink transmission systems if adopted for standardization.

Appendix 1
The inverse of the covariance matrix $C_k$ given in (11) and (13) needs to be expressed in terms of $D_k$, to which it is related. The inverse of the covariance matrix $C_k$ can then be expressed in terms of the inverse of the matrix $D_k$. The inverse of the matrix $D_k$ in turn needs to be expressed in terms of the inverse of the covariance matrix $C_{k-1}$ to obtain the iterative energy calculations. The covariance matrix $D_k$ may be rewritten as follows:

$$D_k = D_{1,k} + E_k q_{k,2} q_{k,2}^H, \qquad (49)$$

where $D_{1,k} = C_{k-1} + E_k q_{k,1} q_{k,1}^H$. The inverse of the matrix $D_k$ in (49) can be expressed using the matrix inversion lemma (47) to give (50), where $D_{1,k}^{-1}$ can also be solved using the matrix inversion lemma to yield (51). With (51), and under the assumption that the approximations $|q_{k,2}^H C_{k-1}^{-1} q_{k,1}|^2 \approx 0$ and $|q_{k,1}^H C_{k-1}^{-1} q_{k,2}|^2 \approx 0$ hold for the low cross correlation cases, the inverse matrix $D_k^{-1}$ in (50) can be written in the simplified form (52).
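For reference, the two rank-one updates described above can be written out with the standard rank-one form of the matrix inversion lemma (the Sherman-Morrison identity). This is a sketch under the assumption that $C_{k-1}$ and $D_{1,k}$ are invertible; it is the textbook identity applied to the definitions of $D_{1,k}$ and $D_k$ given in this appendix, and is not claimed to reproduce the exact typeset form of the article's numbered equations.

```latex
% Rank-one update from C_{k-1} to D_{1,k} = C_{k-1} + E_k q_{k,1} q_{k,1}^H
D_{1,k}^{-1} = C_{k-1}^{-1}
  - \frac{E_k \, C_{k-1}^{-1} q_{k,1} q_{k,1}^{H} C_{k-1}^{-1}}
         {1 + E_k \, q_{k,1}^{H} C_{k-1}^{-1} q_{k,1}}

% Rank-one update from D_{1,k} to D_k = D_{1,k} + E_k q_{k,2} q_{k,2}^H
D_{k}^{-1} = D_{1,k}^{-1}
  - \frac{E_k \, D_{1,k}^{-1} q_{k,2} q_{k,2}^{H} D_{1,k}^{-1}}
         {1 + E_k \, q_{k,2}^{H} D_{1,k}^{-1} q_{k,2}}
```

Each update costs only matrix-vector products and one scalar division, which is why chaining such updates avoids a full matrix inversion at every energy iteration.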

Appendix 2
By inserting $\zeta = \frac{E_k}{1 + (2^{b_{p_k}} - 1)}$ into (46), the inverse matrix $C_k^{-1}$ is further expressed as (53), since the SNR is set to the target SNR, $\gamma_k = \gamma_k^* = 2^{b_{p_k}} - 1$, in the energy calculation process.
Using the definitions of $d_1 = C_{k-1}^{-1} q_{k,1}$, $d_2 = C_{k-1}^{-1} q_{k,2}$, $\zeta_1 = \frac{E_k}{1 + E_k \xi_1}$ and , the inverse matrix $D_k^{-1}$, which has been expressed in (52), is rewritten and then inserted into (53). Solving the right-hand side of the resulting equation leads to the following equation,