 Research
 Open Access
Design and measurement-based evaluations of coherent JT CoMP: a study of precoding, user grouping and resource allocation using predicted CSI
EURASIP Journal on Wireless Communications and Networking volume 2014, Article number: 100 (2014)
Abstract
Coordinated multipoint (CoMP) transmission provides high theoretical gains in spectral efficiency with coherent joint transmission (JT) to multiple users. However, this requires accurate channel state information at the transmitter (CSIT) and also user groups with spatially compatible users. The aim of this paper is to use measured channels to investigate whether significant CoMP gains can still be obtained with channel estimation errors. This turns out to be the case, but it requires the combination of several techniques. We here focus on coherent downlink JT CoMP to multiple users within a cluster of cooperating base stations. The use of Kalman predictors is investigated to estimate the complex channel gains at the moment of transmission. It is shown that this can provide sufficient CSIT quality for JT CoMP even for long (>20 ms) system delays at 2.66 GHz at pedestrian velocities or, for lower delays, at 500 MHz at vehicular velocities. A user grouping and resource allocation scheme that provides appropriate groups for CoMP is also suggested. It provides performance close to that obtained by exhaustive search at very low complexity, low feedback cost and very low backhaul cost. Finally, a robust linear precoder that takes channel uncertainties into account when designing the precoding matrix is considered. We show that, in challenging scenarios, this provides large gains compared with zero-forcing precoding. Evaluations of these design elements are based on measured channels with realistic noise and inter-cluster interference assumptions. These show that high JT CoMP gains can be expected, on average over large sets of user positions, when the above techniques are combined, especially in severely intra-cluster interference-limited scenarios.
1 Introduction
Shadowed areas and interference at cell borders pose challenges for future wireless broadband systems. A potentially powerful remedy would be coordinated multipoint (CoMP) transmission, using remote radio heads or coordination between cellular base station sites. It can overcome interference limitations in cellular radio networks and also provide coverage gains. The first steps towards support for CoMP have recently been added to the 3GPP LTE standard in Release 11 [1].
CoMP techniques for downlink transmission are often categorized into two groups [2, 3]. With joint transmission (JT), sometimes referred to as joint processing, user data is transmitted via several access points. The second group uses coordination for interference avoidance without sharing user data, using, e.g. joint scheduling (JS) and/or joint beamforming (JB) (see, e.g. [4]). The latter techniques are often considered to require less backhaul capacity and to be more robust to inaccurate channel state information at transmitters (CSIT). Joint transmission can provide higher potential gains in spectral efficiency at full load (see, e.g. [3, 5]), by converting harmful interference power into useful signal power. For example, coherent JT CoMP was in [6] found to have the theoretical potential to multiply the spectral efficiency at 10% outage by a factor of 5 for terminals and base stations with single antennas. These gains are especially important for users at cell edges [7].
However, much less spectacular results are provided by recent system level simulations. Evaluations of coherent JT CoMP within 3GPP have resulted in gains in average spectral efficiency of below 27% for homogeneous deployments using 4×2 MIMO transmission [8].
These large discrepancies raise questions that have motivated our research: What reduces the large potential gains of JT CoMP? Can large improvements be obtained for most users, or only for a small subset of users, e.g. those close to cell edges? What combinations of scheduling strategies and beamforming algorithms are efficient for realistic coordination topologies, propagation conditions and CSIT quality?
Answering such questions requires a joint study of multiple aspects of the problem and their interactions, in particular the assumed propagation environment, the cooperation architecture, the CSIT quality, physical layer techniques, scheduling and the grouping of users who participate in cooperation. We here investigate an important subset of these issues for downlinks of orthogonal frequency-division multiplexing (OFDM) systems, mainly considering frequency-division duplexing (FDD). One focus is the effect of imperfect CSIT due to mobility. To obtain results for realistic propagation conditions, we mainly use measured channels from channel sounding signals in an urban environment for 20-MHz OFDM downlinks. The measurements use simultaneous transmissions from three single-antenna sites to a moving terminal. Large numbers of combinations of user positions are investigated and CSIT is obtained by Kalman channel predictors. These provide the best attainable quality of imperfect channel estimates.
Preliminary results obtained under these conditions were reported in [9]. A robust linear precoder performed joint coherent transmission from the three single antenna base stations to three single antenna terminals. These moved along randomly selected segments along the measured route at pedestrian velocities. The performance was here improved greatly for a minority of user sets by using JT CoMP, as compared to using conventional cellular transmission. However, the average spectral efficiency over all investigated sets of user positions was reduced. Such rather pessimistic results (obtained with imperfect CSIT) would be consistent with those recently reported in [8] that assumed perfect CSIT.
New results presented here are significantly more positive for the potential of JT CoMP: Large gains are obtained for a large majority of investigated user positions.
1.1 Contributions
We investigate and develop a transmit strategy for coherent JT CoMP by a step-by-step evaluation of its various components and interactions, leading to the following main conclusions and results.
First, one issue with CoMP is that significant coordination delays over backhaul links might eliminate the potential for CoMP gains. We show that channel prediction enables large average performance gains when using linear coherent joint transmission at pedestrian velocities for total delays of over 20 ms at 2.66 GHz. For lower delays, the same conclusion holds for higher-mobility users. CoMP would, e.g. remain possible at 500 MHz carrier frequencies for velocities up to 120 km/h, if the total delays are 5 ms.
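The two operating points quoted above can be compared by expressing the terminal displacement during the total delay in carrier wavelengths, which is what governs how far the fading pattern has to be extrapolated. The following is a minimal sketch; the pedestrian speed of 5 km/h is an assumption made here for illustration and is not stated in the text.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def displacement_in_wavelengths(speed_kmh, carrier_hz, delay_s):
    """Distance travelled during the delay, expressed in carrier wavelengths."""
    speed_ms = speed_kmh / 3.6
    wavelength_m = C_LIGHT / carrier_hz
    return speed_ms * delay_s / wavelength_m


# Pedestrian case: 5 km/h is an assumed value, 2.66 GHz and 20 ms are from the text.
pedestrian = displacement_in_wavelengths(5.0, 2.66e9, 20e-3)
# Vehicular case: 120 km/h, 500 MHz and 5 ms are from the text.
vehicular = displacement_in_wavelengths(120.0, 500e6, 5e-3)

print(f"pedestrian, 2.66 GHz, 20 ms: {pedestrian:.3f} wavelengths")
print(f"vehicular,  500 MHz,  5 ms: {vehicular:.3f} wavelengths")
```

Under these assumptions both cases correspond to a displacement of roughly a quarter of a wavelength, which suggests why the two scenarios place comparable demands on the predictor.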
Second, two parts of a JT CoMP design that are crucial for the average performance gains are the means for resource allocation over frequency-selective OFDM downlinks and the user grouping, i.e. the formation of groups of users who will share a particular time-frequency resource block.
We here introduce and evaluate a user grouping scheme with very low complexity, ‘User groups provided by cellular scheduling’. This user grouping strategy is based on local scheduling in the base stations, and it can (but does not have to) utilize already existing scheduling algorithms. In many papers with 2 to 3 base stations and single-carrier transmission, the authors have intuitively used a user grouping scheme similar to this, often with all users placed at the same distance to their nearest base station site. However, to the best of our knowledge, this has never been compared with other schemes, nor is it usually motivated by the authors using it. At much lower complexity than, e.g. greedy user selection, this strategy provides spatially good (although not optimal) user groups that improve the sum rate performance when using linear precoding. It preserves multiuser diversity gains and also requires less feedback and less backhaul capacity than alternative strategies proposed previously. For systems with many users, the backhaul demand for transmission control can even be significantly lower than that for JS/JB CoMP. Using this scheme, JT CoMP can improve the sum capacity for essentially all investigated combinations of user positions. On average over random sets of user positions, it is increased by up to 54% as compared to cellular transmission, with imperfect CSIT at full system load.
Third, a main mechanism behind the sometimes disappointing performance of JT CoMP is highlighted: The different distances involved from sets of transmitters to the different receivers will often generate hard-to-invert joint channel matrices. This results in precoders with large differences in the scaling of their elements. A joint linear precoding design under a per-antenna power constraint is then forced to reduce the transmit powers of the closest base station to a user far below the allowed power to obtain a balanced solution. This effect reduces the total transmit power for a cluster of transmitters that participate in joint transmission, often with the result that out-of-cluster interference and noise reduce performance below that of single-cell transmission. The proposed user grouping strategy alleviates this problem.
Finally, since the CSIT is uncertain, robust techniques for joint precoder design are of interest. The robust linear precoder (RLP) design, introduced in [9], is here investigated further and is developed into a versatile tool for design of linear joint precoders. Robust design is most easily performed for mean square error (MSE) criteria. The RLP is here designed to optimize more general criteria by using a low-dimensional iteration over weighting matrices in a closed-form robust precoder design. We here provide sufficient conditions for the closed-form robust design to minimize a weighted sum of intra-cluster interference and transmit powers under imperfect CSIT accuracy for known second-order moments of the statistical uncertainties. We also show that imperfect CSIT due to quantization is straightforwardly included into the design. We investigate under what conditions a robust JT design provides benefits by comparing to a simple zero-forcing (ZF) design. Also, we observe that the interplay between channel prediction errors, opportunistic scheduling and precoder design increases the multiuser scheduling gain when using CoMP, relative to single-cell transmission.
These results, taken together, in our opinion indicate that large performance gains are indeed possible by using linear JT CoMP techniques that can be designed with reasonable computational complexity.
1.2 Assumptions, design choices and related work
The potential for coherent JT CoMP was shown in [10] to be highest for low-mobility users, as compared to joint scheduling and to the use of non-coherent JT CoMP. We therefore here focus on coherent JT CoMP, also referred to as network multiple-input multiple-output (MIMO) or multi-cell MIMO (see, e.g. [5, 6, 11, 12]), for low-mobility users.
Although the largest gains are achieved with nonlinear precoding techniques such as dirty paper coding [6], complexity currently makes nonlinear precoding infeasible for most realistic systems. We here focus on a low-complexity linear precoding solution. Zero-forcing linear precoders [13] are here a frequently studied alternative.
Coordination over a very wide area would provide the highest performance, but would be unrealistic due to computational complexity, delay constraints and capacity constraints in the fixed network. Therefore, we consider the use of CoMP within limited coordinated sets (clusters) of N transmitters distributed over N_{ B } cells. In cellular transmission, the transmitters belonging to each cell are coordinated, but they are uncoordinated to the transmission in other cells. In CoMP that uses clustered joint transmission, the aim is to suppress the intra-cluster interference when jointly transmitting to M_{ g } users. With perfect CSIT, the intra-cluster interference can then be eliminated by phase cancellation when N≥M_{ g }.
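The statement that intra-cluster interference can be eliminated when N≥M_{ g } can be illustrated with a zero-forcing precoder. The following is a minimal sketch under perfect CSIT; the random channel, the seed and the variable names are assumptions for illustration and are not taken from the measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
M_g, N = 3, 3  # users in the group, transmitters in the cluster

# Hypothetical i.i.d. complex Gaussian channel: rows are users, columns are transmitters.
H = (rng.standard_normal((M_g, N)) + 1j * rng.standard_normal((M_g, N))) / np.sqrt(2)

# Zero-forcing precoder: right pseudo-inverse of H (needs N >= M_g and full row rank).
R = H.conj().T @ np.linalg.inv(H @ H.conj().T)

# Effective channel seen by the message symbols; off-diagonal entries are the
# intra-cluster interference terms.
effective = H @ R
print(np.round(np.abs(effective), 6))
```

With perfect CSIT the effective channel is the identity matrix, so each user receives only its own stream. No power constraint is applied in this sketch.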
The cluster size, i.e. the number of cooperating cells per cluster, involves a tradeoff. A larger size ideally provides larger gains relative to cellular transmission, since a lower fraction of users are then located at cluster edges, but introduces a higher computational burden. Investigations in [11, 14] show that a cluster size above 7 to 9 cells will not provide large additional gains for systems with MIMO links. In [15], for few base station antennas, a cluster that used transmitters at three separate sites was adequate to attain most of the achievable CoMP gains (see also [16]). Our evaluations in Sections 6 and 7 focus on a cluster size of three sites, partially motivated by the results of [15] and partially due to the limitations of our measurements.
An important aspect is to limit the remaining inter-cluster interference. An interesting scheme proposed in [14] and further evaluated in [17] uses cluster-specific antenna tilting and power control for this purpose. We have in our investigations adjusted the interference statistics to approximate the one that would be generated by the scheme of [14].
Near-accurate CSIT is important for multiuser MIMO [18] and for coherent JT CoMP [19]. We here evaluate schemes under the imperfect CSIT that would be due to the main unavoidable causes: noisy estimates and outdated CSIT due to signaling delays. Users are assumed to move at pedestrian velocities at 2.66 GHz. This setting results in large channel estimation errors due to outdating when channel prediction is not used. It has previously not been clear if the use of channel prediction helps CoMP performance in a significant way. Promising results based on simulations were reported in [19], using adaptive recursive least squares prediction. A preliminary simulation study in [20] investigated a two-user, two-cell scenario. The recent paper [21] investigated this question theoretically, in the limit of large numbers of antennas per base station, but did not use a per-base-station transmit power constraint, so it is hard to draw conclusions from these results.
Channel predictors are here assumed to be located in the user terminals. They report the predictions to their strongest base station. The base stations then transmit the reports over a backhaul link to a central control unit (CU) for the cluster which jointly designs the beamformers.
Kalman prediction of MIMO OFDM channels, outlined in Section 3 and Appendix 1, has been investigated in, e.g. [22, 23]. We here investigate its use in a CoMP setting, focusing on two requirements that are peculiar to this setting: (1) Transmit antennas located at different sites will be at different distances while their channels, with differing signal-to-interference-and-noise ratio (SINR), have to be estimated jointly. The weakest signals will in general be estimated with the lowest accuracy. The effects of this on the choice of pilots, the resulting precoder matrices and capacity performance need to be understood. (2) Channels may need to be predicted over long prediction horizons, due to the coordination delays.
Since significant model errors will be present, the precoder (the set of joint beamformers) should furthermore be designed to be robust with respect to (w.r.t.) the expected errors. Implementation without unrealistic computational complexity is here in focus, so we will restrict attention to linear precoders. We mainly use a versatile scheme with reasonable design complexity, the iteratively adjusted RLP introduced in [9] and further developed in Section 5 and in Appendix 2. This averaged robust design is used since it is less conservative than the minimax schemes in, e.g. [24, 25]. A useful property of the RLP is that the channel uncertainty in the form of covariance matrices that are provided by Kalman predictors can be directly used in its adjustment.
In the optimization of a criterion such as the weighted sum capacity for the involved terminals, the RLP design utilizes the analytical solution to an MSE-optimal linear robust precoder and iteratively optimizes over criterion weights used by this design. This MSE-optimal analytical solution constitutes a special case of robust feedforward control filters for dynamic (frequency-selective) systems, previously developed in [26–28]. Robust linear precoders that minimize MSE by averaging over CSIT uncertainty have more recently been highlighted for multiple-input single-output (MISO) transmit schemes by [29, 30] and for multiuser and MIMO downlinks in [24, 31]. Very few solutions have been proposed for robust linear precoder design for more general performance criteria.
Many proposals, e.g. [32, 33], form user groups for CoMP by first forming the user group and then allocating it to a transmission resource. This can provide groups with spatially compatible users, but may sacrifice some of the potential multiuser scheduling gain, since the frequency-domain variability of channels to users is not taken into account. Another approach is to use a greedy algorithm as in, e.g. [34–36] that assigns one user at a time to a given resource, forming a near-optimal solution both in terms of spatially compatible users and exploiting multiuser diversity. This, however, requires repeated pre-evaluation of beamformers, resulting in a high complexity. Greedy user grouping will in Section 7 be compared to the user grouping scheme we propose, but due to high complexity, we use a block-fading model rather than the whole measured channel statistics for this particular comparison.
Notations
In the following, $\bar{E}[\cdot]$ averages over the distribution of channel model errors, E[·] averages over the statistics of noise and message symbols, ∥·∥ denotes the 2-norm of a vector, tr(·) is the trace of a matrix, and Re(·), (·)^{T} and (·)^{∗} denote the real part, the transpose and the Hermitian transpose of a matrix, respectively. The unit matrix is denoted I. For simplicity, we shall enumerate the users such that users {1,…,M_{ g }} are in the selected user group for the subcarriers considered. The Kronecker delta function is denoted δ_{ i j }. Unless otherwise explicitly stated, (·)_{ j n } denotes element (j,n) and (·)_{ j } denotes column j of a matrix or the j th element of a vector. The indices i and m are user indices, j and n are base station indices, t and τ are time indices and k and q are subcarrier indices. We shall denote the base station that, on average over all subcarriers and over the small-scale fading, has the strongest channel gain to a user as that user’s master base station.
2 Channel model
We assume an OFDM downlink with K subcarriers, over which M single-antenna users are served by a coordinated cluster of N transmitters controlled by N_{ B } base stations, where each base station may control several transmit antennas. If M_{ g }≤M users are selected to be served jointly on the k th subcarrier at OFDM symbol τ, then their received signals $y^{k}(\tau)\in\mathbb{C}^{M_g\times 1}$, after OFDM receiver processing, are
$$y^{k}(\tau) = H^{k}(\tau)\,u^{k}(\tau) + n^{k}(\tau). \qquad (1)$$
Here, $n^{k}(\tau)\in\mathbb{C}^{M_g\times 1}$ is the sum of noise and out-of-cluster interference (we will henceforth call it noise), modeled as independent and identically distributed (i.i.d.) white noise with zero mean and known variance, $u^{k}(\tau)\in\mathbb{C}^{N\times 1}$ is the vector of transmitted signals and $H^{k}(\tau)\in\mathbb{C}^{M_g\times N}$ is the channel matrix, where $H_{ij}^{k}(\tau)$ is the complex channel gain from transmitter j to user i. The assumption that n^{k}(τ) can be modeled as i.i.d. white noise with known variance is a simplification. It is relatively reasonable in the downlink considered here, since the inter-cluster interference consists of contributions from many base stations that each transmit to many users. The resulting averaging of contributions would tend to stabilize the variance of n^{k}(τ) and to make it predictable. (The assumption of a knowable noise variance would be more problematic in the uplinks, where inter-cluster interference could be dominated by bursty transmission from a few user terminals.) There exist methods for noise floor estimation [37].
Time and frequency synchronization with respect to all N transmitters is assumed to be adequate, in the sense that any inter-symbol and inter-carrier interference can be modeled as parts of the noise n^{k}(τ). It is also assumed that any frequency errors, causing rotation of elements of H^{k}(τ) over time, can be handled by the tracking ability of the (Kalman) channel estimation.
The true channel is a sum of the reported predicted channel matrix $\hat{H}^{k}(\tau)\in\mathbb{C}^{M_g\times N}$, the prediction error $\Delta H^{k}(\tau)\in\mathbb{C}^{M_g\times N}$ and the quantization error $\Delta H_{\text{quant}}^{k}(\tau)$ of $\hat{H}^{k}(\tau)$:
$$H^{k}(\tau) = \hat{H}^{k}(\tau) + \Delta H^{k}(\tau) + \Delta H_{\text{quant}}^{k}(\tau). \qquad (2)$$
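To make the role of the two error terms concrete, the decomposition of the true channel can be substituted into the received-signal model. The following is a sketch, assuming the linear model $y = Hu + n$ of this section and a linear precoder $u = Rs$; the message symbol vector $s$ is a symbol introduced here for illustration only.

```latex
\begin{align}
y^{k}(\tau)
 &= \bigl(\hat{H}^{k}(\tau) + \Delta H^{k}(\tau) + \Delta H_{\text{quant}}^{k}(\tau)\bigr) R\, s + n^{k}(\tau) \\
 &= \underbrace{\hat{H}^{k}(\tau) R\, s}_{\text{response designed from the reported CSIT}}
  + \underbrace{\bigl(\Delta H^{k}(\tau) + \Delta H_{\text{quant}}^{k}(\tau)\bigr) R\, s}_{\text{residual caused by imperfect CSIT}}
  + n^{k}(\tau).
\end{align}
```

A precoder designed only from the reported channel therefore leaves a residual term whose power grows with the precoder gains, which is one way to see why the error statistics are of interest in the precoder design.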
3 Channel predictions
For mobile users, the delays created by link adaptation and CoMP processing will cause the CSIT to be outdated. This can partially be compensated by using channel predictions. To investigate the effectiveness of the channel prediction in a CoMP setting, we utilize Kalman predictors, which provide minimum mean square error (MMSE)-optimal predictions if the channel fading statistics are known. Therefore, $\bar{E}[\Delta H^{k}(\tau)]=0$ and $\bar{E}[\hat{H}^{k}(\tau)(\Delta H^{k}(\tau))^{\ast}]=0$ [38]. Kalman prediction can be performed either in the time domain (for channel impulse response components) or in the frequency domain for the complex channel gains $H_{ij}^{k}(\tau)$. These provide comparable accuracy [22] and we have chosen the frequency domain approach.
We consider FDD system downlinks, so predictions are based on downlink measurements of known antenna-specific reference symbols (RS), or pilots. We will assume that the RS have regular time and frequency spacing, Δ τ and Δ f. The predictors are here assumed to be localized in the user terminals. For every RS-bearing subcarrier, the i th terminal predicts its channels from several base stations within the cluster. Depending on the choice of user grouping strategy, described in Section 4, all M users that might potentially use a resource then report either the full CSIT and/or some Channel Quality Indicator (CQI), such as SINR, to their master base station.
3.1 Short-term fading models
The Kalman predictor requires statistical models of the correlation properties of the channels over time and frequency to adjust the channel estimate according to the short-term fading. For this, we use autoregressive (AR) models of order n_{ a }. The AR models at w RS-bearing subcarriers of the channels from the N transmitters to the M users can then be realized in state space form. The dynamics of each complex channel gain is then modeled by using n_{ a } state variables. At user i,
Here, the integer t represents time steps spaced by Δ τ, $x(t)\in\mathbb{C}^{(w\cdot n_a\cdot N)\times 1}$ is the vector of state variables, $e(t)\in\mathbb{C}^{(w\cdot N)\times 1}$ is the zero mean process noise with covariance matrix Q, and
for Kalman predictor number $q=0,\dots,\lfloor\frac{K_{\text{CRS}}-1}{w}\rfloor$, where K_{CRS} is the number of RS-bearing subcarriers. Note that the superscript index qw, qw + 1, … in (4) represents a frequency spacing of Δ f, while k in (1) represents a frequency spacing of Δ f/n_{CRS}, where n_{CRS} is the RS spacing in number of subcarriers. The prediction accuracy can be improved by increasing the number w of subcarriers that are predicted jointly, by averaging the noise. However, this comes at a cost of higher computational complexity, which grows as $\mathcal{O}(w^{3})$ [22].
The matrices A, B, C and the covariance matrix Q can be updated based on past channel estimates at an interval that is related to the time constant of the shadow fading (see [23] and chapter 4 of [22]).
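How AR models of order n_{ a } can be realized in state space form may be sketched as follows. This is an illustration only: it assumes the common convention x(t+1) = A x(t) + B e(t), h(t) = C x(t), a companion-form realization, and the same hypothetical AR coefficients for every channel gain. In the paper each channel has its own model and correlation between subcarriers would enter through Q.

```python
import numpy as np


def ar_state_space(a):
    """Companion-form (A, B, C) for h(t+1) = -a[0] h(t) - ... - a[-1] h(t-n_a+1) + e(t)."""
    n_a = len(a)
    A = np.zeros((n_a, n_a), dtype=complex)
    A[0, :] = -np.asarray(a)            # AR coefficients in the first row
    A[1:, :-1] = np.eye(n_a - 1)        # shift register for past values
    B = np.zeros((n_a, 1), dtype=complex)
    B[0, 0] = 1.0                       # process noise drives the newest value
    C = np.zeros((1, n_a), dtype=complex)
    C[0, 0] = 1.0                       # the channel gain is the first state
    return A, B, C


def stack_models(a, w, N):
    """Block-diagonal model for w subcarriers and N transmitters (w * n_a * N states)."""
    A1, B1, C1 = ar_state_space(a)
    eye = np.eye(w * N)
    return np.kron(eye, A1), np.kron(eye, B1), np.kron(eye, C1)


a = [-1.6, 0.8]  # hypothetical stable AR(2) coefficients
A, B, C = stack_models(a, w=2, N=3)
print(A.shape, B.shape, C.shape)
```

The dimensions agree with the text: the state vector has w·n_{ a }·N elements and the process noise has w·N elements.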
3.2 Kalman predictor
Based on the AR fading models (3), each user is assumed to have a set of Kalman filters that provide filter estimates $\hat{x}(t|t)$ of the state vector in (3) and also covariance matrices
Please see Appendix 1 for further aspects on the filter design.
MMSE-optimal predictions of the states x(t) and channel component vector (4) can then be calculated from the filter estimates. The required prediction horizon is ϑ Δ t, where $\vartheta\in\mathbb{N}$. It corresponds to the delay of the entire transmission control loop, including channel predictions, feedback, scheduling, joint precoding and any additional delays. The vector of channel predictions for a time horizon of ϑ RS intervals ahead, $\hat{h}(t+\vartheta)$, at the i th user is obtained from the filter estimate $\hat{x}(t|t)$ by extrapolation in time. Equation (3) is iterated ϑ steps and future noise terms e(t + 1),…,e(t + ϑ − 1) are set to their average values of zero. This gives
The state prediction error covariance matrix is computed recursively starting with the covariance matrix P(t|t) of the filter estimate:
Covariances of the prediction error Δ h(t) of the channels to one user can be described by the matrix
As mentioned above there is a tradeoff in the choice of the number w of subcarriers estimated by each Kalman filter. We here keep this parameter low and, in a second step, reduce prediction errors further by Wiener smoothing over estimates for all subcarriers. The true prediction error covariances then differ from those of (7) due to two effects. First, the AR models (3) are imperfect which increases the errors. Second, Wiener smoothing over frequency decreases the errors. In our studies, these two effects leave the variance of the prediction error slightly less than that given by (7). The use of the accurate covariance instead of (7) would cause only minor noticeable difference in precoder performance and only for systems with very low noise power. We shall therefore use (7) in the precoder design in Section 5.
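The extrapolation and covariance recursion described in this subsection can be sketched as below. The sketch assumes the state-space convention x(t+1) = A x(t) + B e(t), h(t) = C x(t); the function name and the scalar AR(1) example are hypothetical and chosen so that the result is easy to check by hand.

```python
import numpy as np


def predict(A, B, C, Q, x_filt, P_filt, theta):
    """Extrapolate a filter estimate theta steps ahead with future noise set to zero.

    Returns the predicted channel vector and the covariance of its prediction error.
    """
    x_pred, P_pred = x_filt, P_filt
    for _ in range(theta):
        x_pred = A @ x_pred                                     # noise-free extrapolation
        P_pred = A @ P_pred @ A.conj().T + B @ Q @ B.conj().T   # uncertainty grows each step
    return C @ x_pred, C @ P_pred @ C.conj().T


# Hypothetical scalar AR(1) channel with pole 0.9 and unit stationary variance.
A = np.array([[0.9]])
B = np.array([[1.0]])
C = np.array([[1.0]])
Q = np.array([[0.19]])      # 0.19 / (1 - 0.9**2) = 1
x_filt = np.array([1.0])    # filter estimate, assumed exact for illustration
P_filt = np.array([[0.0]])

for theta in (1, 4, 16):
    h_pred, err_cov = predict(A, B, C, Q, x_filt, P_filt, theta)
    print(theta, float(h_pred[0]), float(err_cov[0, 0]))
```

In this example the predicted gain decays towards zero and the error variance approaches the stationary channel variance as the horizon grows, which is the behaviour that limits the usable prediction horizon.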
4 UE allocation and scheduling
Appropriate user grouping is important if CoMP is to improve the rates for all participating users. Out of M users, M_{ g }≤N users will be selected for JT within a resource block. In [9] a preliminary investigation was performed where groups of three users were formed by random placement along a route for which measured channels from three sites were available. Figure 1 illustrates the received powers from the three sites along the measurement route. It then became evident that single-cell (SC) transmission in many situations outperformed coherent JT CoMP since JT might help some users but not all within the group simultaneously.
A subsequent analysis showed that for most of the CoMP groups that led to SC transmission outperforming CoMP, all three users had poor channels to the same base station. This led to a poorly conditioned channel matrix H, which forced the precoder design to reduce the total transmit power to fulfill a per-base-station power constraint. This reduced the SNR as compared to SC transmission.
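The power reduction mechanism described above can be illustrated numerically. The sketch below assumes a zero-forcing precoder with one common scaling factor chosen so that the most heavily loaded antenna meets its power limit; this is a simplification and not the robust precoder of the paper. Both example channel matrices are hypothetical and real-valued for readability.

```python
import numpy as np


def zf_power_usage(H, p_max=1.0):
    """Return (fraction of the cluster power budget used, received signal power per user)."""
    R = np.linalg.pinv(H)                          # unnormalized ZF precoder, H @ R = I
    per_antenna = np.sum(np.abs(R) ** 2, axis=1)   # power per antenna for unit-power symbols
    scale = p_max / per_antenna.max()              # most loaded antenna transmits at p_max
    used_fraction = scale * per_antenna.sum() / (H.shape[1] * p_max)
    return used_fraction, scale                    # each user receives signal power `scale`


# Diagonally dominant: each user has a strong channel to a different base station.
H_good = np.array([[1.0, 0.2, 0.1],
                   [0.2, 1.0, 0.1],
                   [0.1, 0.2, 1.0]])
# All three users have poor channels to the third base station.
H_bad = np.array([[1.0, 0.5, 0.05],
                  [0.6, 1.0, 0.03],
                  [0.8, 0.7, 0.06]])

frac_good, gain_good = zf_power_usage(H_good)
frac_bad, gain_bad = zf_power_usage(H_bad)
print(f"good group: {frac_good:.2f} of power budget, signal power {gain_good:.4f}")
print(f"bad group:  {frac_bad:.2f} of power budget, signal power {gain_bad:.4f}")
```

In the diagonally dominant case close to the full power budget is used, whereas in the second case only about a third is used and the received signal power drops by several orders of magnitude, since the two base stations with good channels must back off to keep the solution balanced.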
To solve this problem, we here propose to perform scheduling decisions locally at each base station and will show that this automatically creates good (although not optimal) CoMP groups. This scheme has the benefits that it has very low complexity and would be easy to implement in existing systems. It can furthermore utilize already existing scheduling algorithms. It generates no extra control signaling backhaul load since all decisions can be made locally at every base station. The proposed solution will in Section 7 be compared to the use of random user groups, to a Greedy user grouping (GUG) algorithm described below and to the optimal solution.
4.1 User groups provided by cellular scheduling (CUG)
This is our main proposed strategy to create diagonally dominant channel matrices that then become relatively easy to invert in the CoMP precoder design. We first present this scheme, denoted as cellular user grouping (CUG), for single-antenna base stations. All users with the same master base station are then locally scheduled on orthogonal subcarriers by a scheduler connected to their master base station, as shown in the example in Figure 2. This scheduling is based on a CQI metric. For the schedulers explored in this paper, the CQI for user i at resource block b, CQI_{b,i}, is given by the average estimated channel gains from all antennas at that user’s master base station.
On each resource block, the scheduled M_{ g }≤N users within the cooperation cluster (with equality if each base station is the master base station of at least one user) will then form a CoMP group. These users, which all belong to different cells, are to be served jointly by all base stations in the cluster, including base stations that are not the master base station of any of these users. The full CSIT used in the precoder design then only needs to be fed back and transmitted over backhaul by the users that have been scheduled and only for a scheduled resource. Two-step feedback approaches such as this have been investigated in [39] for multiuser MIMO and in [40] for CoMP.
The score-based (SB) scheduler proposed in [41] will be used in evaluations. It represents a fair scheduler in the sense that all users belonging to the same master base station are given approximately the same amount of resources. For each user, a score is computed for each resource block that indicates the ranking of its CQI relative to those of other resource blocks. Assuming scheduling over b = 1,…,B resource blocks, block l will for user i have a score of
Here > denotes a logical comparison resulting in 1 if true and 0 otherwise. The user with the highest score will be allocated to the resource block l. The use of score-based scheduling to create the user grouping will be denoted SBCUG.
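A score-based allocation of this kind can be sketched as follows for the users of one master base station. The exact score expression is assumed here: the score of block l for user i is taken as the number of blocks b for which CQI_{l,i} > CQI_{b,i}, which matches the description of a ranking built from logical comparisons but may differ in detail from the paper's definition. Ties are broken in favour of the lowest user index, also an assumption.

```python
def score_based_allocation(cqi):
    """cqi[i][b] is the CQI of user i on resource block b.

    All users are assumed to share one master base station.
    Returns a list with the index of the user allocated to each block.
    """
    n_users, n_blocks = len(cqi), len(cqi[0])
    allocation = []
    for l in range(n_blocks):
        # Score: how many blocks are worse than block l for this user.
        scores = [
            sum(cqi[i][l] > cqi[i][b] for b in range(n_blocks))
            for i in range(n_users)
        ]
        allocation.append(max(range(n_users), key=lambda i: scores[i]))
    return allocation


cqi = [
    [10.0, 12.0, 8.0, 11.0],  # user 0: strong on every block
    [1.0, 0.5, 2.0, 0.8],     # user 1: weak on every block
]
print(score_based_allocation(cqi))
```

In this example each user obtains two of the four blocks, namely those on which its own channel is relatively best, even though user 0 has the higher CQI everywhere. This illustrates the fairness property mentioned above.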
A second scheduler to be used is a close to optimal sum rate scheduler that always chooses the user with the highest estimated rate for every frequency resource. It is here based on the rate a user would have in a cellular system in which no other user within the cluster is served on the same resource
with $\mathcal{P}_{j_{\text{mast}:i},\max}$ being the power constraint for the antennas of the master base station of user i. It is denoted best rate CUG (BRCUG). The use of this metric to compare attainable rates presupposes that a well-functioning CoMP scheme will suppress intra-cluster interference.
For multi-antenna base stations with N_{ A } antennas, cellular scheduling proceeds similarly but may allocate up to N_{ A } users per frequency resource and base station, using cell-specific beamforming.
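The formation of CoMP groups by best-rate cellular scheduling can be sketched as below for single-antenna base stations. The rate expression log2(1 + P·gain/noise) is an assumed interference-free form used for illustration; the function names, the noise variance and the example gains are hypothetical.

```python
import math


def cellular_rate(gain, p_max, noise_var):
    """Assumed single-cell rate with no intra-cluster interference."""
    return math.log2(1.0 + p_max * gain / noise_var)


def best_rate_cug(gains, master, p_max=1.0, noise_var=0.1):
    """gains[i][b]: estimated power gain from user i's master base station on block b.
    master[i]: index of the master base station of user i.
    Returns, for each block, the sorted list of users forming the CoMP group.
    """
    n_blocks = len(gains[0])
    groups = []
    for b in range(n_blocks):
        best = {}  # base station index -> (user, rate)
        for i, j in enumerate(master):
            rate = cellular_rate(gains[i][b], p_max, noise_var)
            if j not in best or rate > best[j][1]:
                best[j] = (i, rate)          # local decision at base station j
        groups.append(sorted(user for user, _ in best.values()))
    return groups


master = [0, 0, 1, 2]          # users 0 and 1 share base station 0
gains = [
    [1.0, 0.2],
    [0.5, 0.9],
    [0.3, 0.4],
    [0.7, 0.1],
]
groups = best_rate_cug(gains, master)
print(groups)
```

Each base station decides locally, so the group on a block always contains at most one user per master base station, which is what tends to make the joint channel matrix diagonally dominant.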
4.2 Greedy user grouping (GUG)
Here, for every frequency resource the CU uses, an algorithm first searches for the user that, given a specific criterion, has most to gain from entering the group. Then, it searches amongst the remaining users for the user that would provide the largest increase of the criterion value and adds that user to the group. It continues until none of the remaining users can increase the criterion value or until M_{ g } = N. We here use the specific criterion function
Here, P_{S,i}, P_{I,i} and P_{N,i} are the powers of the signals, the interference and the scalar noise powers at the receiver antenna i = 1,…,M_{ g }, respectively. Calculation of the expected values of the powers based on the prediction error statistics is discussed in Appendix 3. If α_{ i } = 1 for all i, the sum rate is maximized. We shall denote this GUG with best rate (GUGBR). If instead $\alpha_{i}=1/\bar{r}_{i}$, with $\bar{r}_{i}$ being the average throughput of user i over already scheduled resources, we get a proportional fair scheduler [42], which will be denoted GUG with proportional fair scheduling (GUGPF).
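The greedy procedure can be sketched as follows. Several simplifications are assumed here and are not taken from the paper: the criterion is taken to be the weighted sum rate Σ α_i log2(1 + P_{S,i}/(P_{I,i} + P_{N,i})), the predicted channels are treated as exact rather than using expected powers, and a zero-forcing precoder with a common per-antenna power scaling replaces the robust precoder. All names and numbers are hypothetical.

```python
import numpy as np


def weighted_sum_rate(H, alpha, p_max=1.0, noise_var=0.1):
    """Criterion value for a candidate group with channel H (rows are users)."""
    R = np.linalg.pinv(H)                                   # ZF precoder (assumption)
    R = R * np.sqrt(p_max / np.max(np.sum(np.abs(R) ** 2, axis=1)))
    G = np.abs(H @ R) ** 2                                  # G[i, m]: power at user i from stream m
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return float(np.sum(alpha * np.log2(1.0 + signal / (interference + noise_var))))


def greedy_user_grouping(H_all, alpha, p_max=1.0, noise_var=0.1):
    """Add one user at a time while the criterion increases, up to N users."""
    M, N = H_all.shape
    group, best_value = [], 0.0
    while len(group) < min(M, N):
        candidates = [i for i in range(M) if i not in group]
        # Each candidate requires a new precoder design and evaluation.
        values = [
            weighted_sum_rate(H_all[group + [i]], alpha[group + [i]], p_max, noise_var)
            for i in candidates
        ]
        k = int(np.argmax(values))
        if values[k] <= best_value:
            break                                           # no remaining user helps
        group.append(candidates[k])
        best_value = values[k]
    return group, best_value


rng = np.random.default_rng(1)
M, N = 6, 3
H_all = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
alpha = np.ones(M)                                          # alpha_i = 1: sum rate criterion
group, value = greedy_user_grouping(H_all, alpha)
print(group, round(value, 3))
```

The inner list comprehension makes the complexity issue visible: every candidate user in every step requires a precoder to be designed and evaluated.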
GUG should provide better system performance than CUG which generates its user grouping without explicitly taking the resulting performance into account. However, this comes at several costs.

1. Higher feedback requirements. For CUG, local scheduling can be carried out using only a local CQI, e.g. the estimated channel gains to users from antennas at their master base station. Scheduled users then only need to complement this with the full CSIT for the resources they are allocated. With GUG, full CSIT is needed for all M users considered, over all resources.
2. Higher backhaul demand. CUG only requires the M_{ g }·N complex channel gains to be transmitted over the backhaul links for the M_{ g } users that are actually scheduled on a resource. With GUG, the CU needs knowledge of the channels of all M users; hence, M·N complex channel gains per scheduled resource slot must be transmitted over backhaul.
3. Higher computational complexity, since greedy user grouping requires repeated design and evaluation of a joint precoder. With the simplified CQI and performance metrics suggested above, this is not necessary when using the CUG strategy.
5 Precoding
A CU for the cluster is assumed to have full information of the reported predicted channels and of the covariances of the prediction and quantization errors of the scheduled users. It designs precoding matrices $R\in {\mathbb{C}}^{N\times {M}_{g}}$ for all utilized time-frequency resource blocks. The blocks consist of adjacent OFDM symbols and subcarriers, with at least one resource slot dedicated to a reference symbol. All transmitted symbols within such a resource block will normally be exposed to channels that are close to identical to the channel at the RS position and can therefore use the same precoder. In the following, time and subcarrier indices within a block are omitted: ${H}_{ij}\triangleq {H}_{ij}^{k}\left(t\right)$, ${\hat{H}}_{ij}\triangleq {\hat{H}}_{ij}^{k}(t+\vartheta )$, $n\triangleq {n}^{k}\left(t\right)$, $u\triangleq {u}^{k}\left(t\right)$ and $y\triangleq {y}^{k}\left(t\right)$.
On each subcarrier and for each OFDM symbol within the resource block, the transmitted signal vector, $u\in {\mathbb{C}}^{N\times 1}$, is generated by a linear precoder
$$u=\frac{1}{c}Rs, \qquad (10)$$
where c is a scalar scaling factor and $s\in {\mathbb{C}}^{{M}_{g}\times 1}$ is the message symbol vector, assumed to be white, to have zero mean and covariance matrix I, and to be uncorrelated with the noise n. We assume that per-antenna transmit power constraints, P_{j,max}, apply to each subcarrier individually. The scaling factor c in (10) is selected to ensure that the transmit powers at the N antennas satisfy
where u_{ j } is the jth element of the transmit vector u. (A reasonable modification would be to have a sum power constraint over all subcarriers. With a sum rate criterion, this would lead to a water-filling power allocation as described in [17], which slightly increases the sum rate performance.)
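The selection of the scaling factor can be sketched as follows. The sketch assumes the precoder form u = (1/c)Rs, which is implied by the definitions of the target vector z and the error ϵ used in Section 5.2, and a white unit-variance symbol vector s, so that the average power at antenna j equals the squared norm of row j of R divided by c². The function name `power_scaling` and the example matrix are hypothetical.

```python
import numpy as np

def power_scaling(R, p_max):
    """Smallest c such that u = (1/c) R s gives E|u_j|^2 <= p_max[j] for every antenna j."""
    row_power = np.sum(np.abs(R) ** 2, axis=1)    # E|(R s)_j|^2 for white s with covariance I
    return float(np.sqrt(np.max(row_power / p_max)))

# Illustrative N = 3 antennas, M_g = 2 users.
R = np.array([[1.0, 2.0], [0.5j, 0.5], [3.0, 0.0]])
p_max = np.ones(3)
c = power_scaling(R, p_max)
antenna_power = np.sum(np.abs(R / c) ** 2, axis=1)
```

With this choice, only the most heavily loaded antenna transmits at its limit, while the others use less than their available power. This is why precoder matrices with a few large elements lead to a low total transmit power.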
5.1 Target system
The system model used for precoder design is shown in Figure 3. Here, $u\in {\mathbb{C}}^{N\times 1}$ is the transmit signal vector, and $z=\frac{1}{c}Ds\in {\mathbb{C}}^{{M}_{g}\times 1}$ is the desired received vector. Its desired properties are modeled by a target matrix D, which is diagonal, representing the ideal of complete interference suppression. In a generalization to multiple receiver antennas, D would be block-diagonal. The distances between terminals and transmitters will differ substantially in a CoMP setting. It would therefore be unrealistic to demand equal received power at all users by setting D = I. Instead, the targeted received signal magnitudes (the diagonal elements of D) should be set to realistically attainable levels. This can be done in different ways. We here adjust the targeted received signal magnitudes to the amplitude of the strongest channel for each user
This is a very simple way of choosing D. For channel matrices with a dominant diagonal, which often appear, e.g. if all users in a CoMP group have different master base stations, (12) provides a sum rate close to the sum rate that is obtained if D is optimized.
Alternatively, in [43] all users are given the same fraction of the transmit power in combination with zero-forcing precoding. This corresponds to an alternative strategy for adjusting the diagonal elements of D. We have investigated both that alternative and numerical optimization of D with respect to the sum rate. We then found little difference in the end result as compared with the use of (12). (However, the use of D = I, which is common in zero-forcing precoders for single-cell multi-user MIMO problems, would cause a large loss in system performance in CoMP settings.)
5.2 Robust linear precoder (RLP)
The RLP scheme uses the closed-form solution to a robust linear quadratic (LQ) optimal feedforward control problem presented in [26, 27] as its basic element. It minimizes general robust performance criteria by iterating over elements in penalty matrices of the robust LQ design. The robust LQ design generates a precoder matrix R that minimizes a scalar criterion J. In our case, the criterion includes a weighted difference between the target and the noise-free received signals, $\epsilon =\frac{1}{c}\left(HR-D\right)s$ (describing the remaining intra-cluster interference), and a weighted transmit power term. These terms are averaged over all uncertainties and transmit symbol statistics
Here, V is a diagonal positive definite matrix and S is a positive semidefinite matrix, both real-valued. The use of these weighting matrices in the design is discussed in Sections 5.2.1 to 5.2.3 below.
Theorem 1
For a transmission system (1), model (2) and linear precoder (10), assume that $\bar{E}\left[\Delta H\right]=\bar{E}\left[\Delta {H}_{\text{quant}}\right]=0$, that $\bar{E}\left[\Delta {H}^{\ast}{V}^{\ast}V\Delta {H}_{\text{quant}}\right]=0$, that $S\in {\mathbb{R}}^{N\times N}$ has full rank and that s in (10) is white. Then, the precoding matrix R minimizing J by (13) exists and is given uniquely by
For a proof, see Appendix 4.
After obtaining the precoder matrix R_{RLP} by (14), the scale factor c is adjusted to fulfill the transmit power constraint (11). This scales the criterion (13) but does not affect the minimizing precoder matrix.
The third and fourth terms in the inverse in (14) can be evaluated from the channel error statistics,
Here, Δ H_{ n } refers to column n of either the prediction error Δ H (for the third term) or the quantization error Δ H_{quant} (for the fourth term). For prediction errors, $\bar{E}\left[\Delta {H}_{n}\Delta {H}_{j}^{\ast}\right]$ is obtained using the covariance matrices C P(t + ϑ|t)C^{∗} for each of the M_{ g } users, provided by their Kalman predictors. Since the terminals are assumed to predict the channels independently, $\bar{E}\left[\Delta {H}_{ij}\Delta {H}_{mn}^{\ast}\right]=0$ when i≠m. Therefore, the matrix $\bar{E}\left[\Delta {H}_{n}\Delta {H}_{j}^{\ast}\right]$ is diagonal, where element (i,i) is given by the ith user's
Here, (·)^{k} denotes the submatrix of C P(t + ϑ|t)C^{∗} from (3), (6) and (7) for the relevant subcarrier k.
The matrix element (j,n) of the fourth term, describing the quantization error covariance of reported predictions, is by (15) determined by $\bar{E}\left[\Delta {H}_{\text{quant},n}\Delta {H}_{\text{quant},j}^{\ast}\right]$. This matrix will be diagonal if all channel components are quantized independently. The design works for any specified CSI quantization and feedback scheme, as long as the errors introduced can be modeled and quantified. For example, assuming individual linear quantization with a properly set maximum power, the diagonal elements of this matrix are given by ${\delta}_{\text{step}}^{2}/12$, where δ_{step} is the step size of the quantizer, which may be adjusted individually for each channel component. If the quantization granularity (step size) is individually controlled by the standard deviation of the prediction error, then the quantization error term in (2) can be kept small relative to the prediction error term in an efficient way. The quantization errors would then have negligible impact on the performance metric.
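The step-size expression for the quantization error variance can be checked numerically. The following is a minimal sketch, assuming a uniform mid-tread quantizer applied to one real-valued channel component that stays well inside the quantizer range; the step size and sample count are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)
delta_step = 0.05                                   # hypothetical quantizer step size
x = rng.uniform(-1.0, 1.0, size=200_000)            # one real-valued channel component
x_q = delta_step * np.round(x / delta_step)         # uniform (linear) quantization
err_var = np.var(x - x_q)                           # empirical quantization error variance
predicted = delta_step ** 2 / 12                    # variance of a uniform error of width delta_step
```

The empirical variance agrees closely with the predicted value. For a complex channel gain whose real and imaginary parts are quantized separately with the same step size, the two error variances add.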
As a comparison to the RLP, we have also investigated the zero-forcing (ZF) precoder with gain control. When M_{ g }≤N, the minimum norm pseudoinverse generates the ZF precoder matrix
to be used in (10). (When M_{ g }<N, other generalized inverses exist that provide better performance under per-antenna power constraints than (17); see [44].) The ZF solution is commonly used and is simple to compute, but model errors are not taken into account. Furthermore, ill-conditioned matrices $\hat{H}$ generate precoders R_{ZF} with large elements. This results in the use of a large scaling factor c in (10) to fulfill the power constraint (11). The resulting reduction of transmit power decreases the SNR. This is referred to as the power normalization loss problem.
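The power normalization loss can be illustrated with a small numerical sketch. It assumes that the ZF precoder is the minimum-norm right pseudoinverse of the predicted channel matrix multiplied by the target matrix D, that D holds the strongest channel amplitude of each user, and that the transmit vector is (1/c)Rs. The two 3×3 channel matrices are invented for illustration and are not taken from the measurements.

```python
import numpy as np

def target_matrix(H_hat):
    """Diagonal target D: strongest channel amplitude of each user (one row per user)."""
    return np.diag(np.abs(H_hat).max(axis=1))

def zf_precoder(H_hat, D):
    """Minimum-norm right pseudoinverse times D; requires H_hat to have full row rank."""
    Hh = H_hat.conj().T
    return Hh @ np.linalg.solve(H_hat @ Hh, D)

def power_scaling(R, p_max=1.0):
    """Smallest c for which u = (1/c) R s respects a common per-antenna power limit."""
    return float(np.sqrt(np.max(np.sum(np.abs(R) ** 2, axis=1)) / p_max))

# Well-separated users (dominant diagonal) versus two users with nearly parallel channels.
H_good = np.array([[1.0, 0.1, 0.1], [0.1, 1.0, 0.1], [0.1, 0.1, 1.0]], dtype=complex)
H_bad = np.array([[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]], dtype=complex)

R_good = zf_precoder(H_good, target_matrix(H_good))
R_bad = zf_precoder(H_bad, target_matrix(H_bad))
c_good, c_bad = power_scaling(R_good), power_scaling(R_bad)
```

Both precoders remove the interference completely for the assumed channels, but the ill-conditioned case requires a much larger c, so the received signal power, which scales as 1/c², drops accordingly.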
Three ways of using the weighting matrices V and S in (13) are outlined below.
5.2.1 Minimizing intracluster interference
Consider V = I and S = ϵ I in (13), using a very small real-valued regularization term S^{∗}S = ϵ^{2}I in (14), with ϵ≠0 to preserve full rank in the inverse. Then, the transmit powers are hardly penalized and the errors at all receivers are considered equally important. This setup minimizes the sum of intra-cluster interference powers. It is related to ZF, but takes the channel uncertainty into account. Note that when M_{ g } = N, ${\hat{H}}^{-1}$ exists, V = I, ϵ→0 and Δ H = Δ H_{quant} = 0, then (14) and (17) reduce to the same solution, $R={\hat{H}}^{-1}D$.
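The relation between this setup and ZF can be sketched numerically. The sketch assumes that the robust solution has the regularized least-squares structure suggested by the text, with an inverse containing the terms Ĥ^{∗}V^{∗}VĤ, S^{∗}S and an uncertainty term, and it makes the simplifying assumption that prediction errors are independent over both users and antennas, so that the uncertainty term is diagonal. The function `rlp_precoder`, the channel matrix and the error variances are hypothetical and are not the authors' implementation.

```python
import numpy as np

def rlp_precoder(H_hat, D, err_var, V=None, eps=1e-6):
    """Sketch: R = (H^* W H + eps^2 I + E[dH^* W dH])^{-1} H^* W D, with W = V^* V."""
    Mg, N = H_hat.shape
    V = np.eye(Mg) if V is None else V
    W = V.conj().T @ V
    Hh = H_hat.conj().T
    # Assumption: independent errors, so element (j, j) is sum_i W_ii * err_var[i, j].
    uncertainty = np.diag(np.diag(W).real @ err_var)
    A = Hh @ W @ H_hat + eps ** 2 * np.eye(N) + uncertainty
    return np.linalg.solve(A, Hh @ W @ D)

H_hat = np.array([[1.0, 0.3, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.8]], dtype=complex)
D = np.diag(np.abs(H_hat).max(axis=1))

# No channel uncertainty and a vanishing power penalty: the ZF solution is recovered.
R_exact = rlp_precoder(H_hat, D, err_var=np.zeros((3, 3)), eps=1e-8)
# Uncertain channels (error variance 0.1 per element): the precoder is regularized.
R_robust = rlp_precoder(H_hat, D, err_var=np.full((3, 3), 0.1))
```

In the first case the result coincides with the inverse of the predicted channel matrix times D. In the second case the uncertainty term acts as a regularizer and reduces the norm of R, which in turn lowers the scaling factor needed to meet the power constraint.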
5.2.2 Optimization w.r.t. an arbitrary criterion
The robust MSE solution of Theorem 1 can be used as a tool for adjusting the precoder matrix R w.r.t. a general criterion
Here, P_{S,i}, P_{I,i} and P_{N,i} are the signal power, the interference power and the scalar noise power at receiver antenna i = 1,…,M_{ g }. The calculation of the expected values of these powers based on the prediction error statistics is discussed in Appendix 3.
Diagonal penalty matrices V and S in (13) provide significant flexibility, and optimization of their elements w.r.t. (18) provides a flexible tool for adjusting the precoder matrix by a low-dimensional numerical search. Here, the elements of V mainly affect the weighting and fairness between users, while the elements of S affect the power balance between transmit antennas.
One particular case is when (18) is set to approximate an unweighted sum rate criterion. Then, the use of a fixed V = I is appropriate. The use of S = ϵ I, with ϵ being a very small scalar, would then approximately minimize the intra-cluster interference, but would not maximize the sum rate. This is because the noise in (1) is not taken into account in (13) and its impact might be enhanced by the scaling to meet the power constraint through (10). The performance w.r.t. (18) is then in most cases improved significantly by iteratively adjusting a few real-valued diagonal elements of the transmit power penalty matrix S, to rebalance the received powers, interference and noise. This procedure is outlined in Appendix 2.
The solution will be suboptimal but, in a comparative study in [17], we showed that the precoder of (14) performed close to a near-optimal linear precoder [45] found through a high-dimensional search over all the complex elements of R.
In the evaluations, the RLP will be designed iteratively to maximize
an approximation of the sum rate for a given precoder R. This iterative scheme has been found to perform well compared to investigated alternatives.
5.2.3 Addressing user fairness by utilizing the penalty matrix V
User fairness can be incorporated in (18), e.g. by using a weighted sum rate. In a low-complexity optimization that iteratively uses (13), the weighting matrix V can then be used to place a high weight on the interference at some users. These users will then be allocated a larger fraction of the transmit power and experience a higher SIR, which directly affects the per-user performance. However, user fairness is also affected by the choice of scheduling criterion as well as the scaling of the target matrix D. The balancing of user fairness by these tools is an interesting topic but is outside the scope of the present work.
6 Evaluations based on measured channels
6.1 Channel measurements
All simulations in this section are based on channel sounding measurements carried out by Ericsson Research. Three omnidirectional single-antenna base stations, located at different sites 350 to 600 m apart, were used to transmit orthogonal channel sounding RS to a measurement van in an outdoor urban environment in central Kista, Stockholm. The measurement parameters are presented in Table 1, and the received signal powers from the base stations are plotted in Figure 1. The measurements are of high quality and can hence be assumed to represent the true complex channel gains in space. For a detailed description of the measurements and channels, see [46, 47].
6.2 Simulation method and assumptions
To simulate velocities of pedestrian users, and to make the model more 3GPP-LTE-like, the data has been upsampled 25 times in time, resulting in the parameters presented in the right-hand column of Table 1. The upsampling is done using the fast Fourier transform to ensure that no extra frequency components are added.
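FFT-based upsampling of a complex channel time series can be sketched as below. This is a generic zero-padding interpolator written for illustration, not the processing chain used on the measured data. The function name `fft_upsample` is hypothetical, and the test signal is a single complex exponential with no energy at the Nyquist bin.

```python
import numpy as np

def fft_upsample(x, factor):
    """Interpolate a complex sequence by zero-padding its centered spectrum."""
    n = len(x)
    m = n * factor
    X = np.fft.fftshift(np.fft.fft(x))            # zero frequency moved to index n // 2
    X_pad = np.zeros(m, dtype=complex)
    start = m // 2 - n // 2                       # keeps zero frequency at index m // 2
    X_pad[start:start + n] = X
    # The factor compensates for the 1/m normalization of the longer inverse FFT.
    return np.fft.ifft(np.fft.ifftshift(X_pad)) * factor

n, factor = 16, 25
t = np.arange(n)
x = np.exp(2j * np.pi * 3 * t / n)                # band-limited test signal
y = fft_upsample(x, factor)
```

The original samples are reproduced at every 25th output sample, and the spectrum of the upsampled sequence occupies the same frequency bins as the original one, so no new frequency components appear.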
In the present investigation, we have focused only on the prediction error part in the error model (2).
6.2.1 Prediction assumptions
The downlink channels from the N_{ B } = 3 single-antenna base stations are predicted for the entire measurement route in Figure 1. For this, the fading statistics in time and frequency, represented by fourth-order AR models, are estimated periodically every 1 s. The use of an AR order higher than 4 would not significantly improve the prediction performance for this data set. The AR models are based on noise-free channel data, i.e. on perfect CSIT, from the past 1 s. From studying the measured data, we have found that this time interval is appropriate with respect to the long-term fading. It is short enough to ensure that the statistics of the Doppler spectrum stay fairly constant within the interval. It is also long enough to provide appropriate prediction performance statistics and CoMP performance statistics for each interval. For high-mobility users, the interval might need to be shorter.
Signal measurements with an appropriate range of SNRs are created by using (21) in Appendix 1 with a transmit power of $\mathcal{P}=1$ and additive white Gaussian noise of three different power levels, σ^{2} (see Figure 1). On average over all three noise levels, the median SNR is 24 dB at the investigated positions. The SNR CDF is similar to that obtained when applying the inter-cluster interference mitigation framework of [14, 17, 48]. That proposal forms overlapping static clusters that use different time-frequency allocations and further controls interference by using different antenna downtilts and transmit powers to the outside and to the inside of each cluster. The noise is i.i.d. over subcarriers for all users.
The channel correlation over frequency determines the covariance matrix Q = E[e e^{∗}] for each user in (3). It is estimated as the sample mean of h^{k}(h^{k + κ})^{∗} for k = 1,…,K_{CRS} − w, κ = 1,…,w − 1 and i = 1,…,M. Computational complexity increases with w, so we use a low value of w = 4. The channels are predicted for 144 RS-bearing subcarriers using prediction horizons of ϑ = 0, 4, 8, 12 and 18 RS. These correspond to distances d_{ λ } = 0, 0.06, 0.13, 0.19 and 0.28 wavelengths or time horizons of 0, 5, 10, 15 and 23 ms for the system defined in Table 1. The results for prediction distances d_{ λ } are scalable and could be interpreted as predictions for time horizons of d_{ λ }·λ_{ c }/v at a carrier wavelength of λ_{ c } and a user moving at velocity v. For these simulations, the Kalman filters are updated in each RS-bearing symbol with Δ t = 1.3 ms. However, after approximately ten iterations (i.e. after 13 ms), they converge to a constant value for each AR model. This could be utilized in a commercial system to keep complexity low.
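The scaling between prediction distance, carrier frequency, velocity and time horizon is simple arithmetic and can be written out as a short sketch. The function names are hypothetical; the example values (0.28 wavelengths, 2.66 GHz with 23 ms, and 500 MHz with 10 ms) are taken from the text.

```python
C_LIGHT = 299_792_458.0   # speed of light in m/s

def horizon_seconds(d_lambda, carrier_hz, velocity_mps):
    """Time horizon d_lambda * lambda_c / v for a prediction distance in wavelengths."""
    return d_lambda * (C_LIGHT / carrier_hz) / velocity_mps

def velocity_kmh(d_lambda, carrier_hz, horizon_s):
    """Velocity (km/h) at which a given time horizon corresponds to d_lambda wavelengths."""
    return 3.6 * d_lambda * (C_LIGHT / carrier_hz) / horizon_s

v_ped = velocity_kmh(0.28, 2.66e9, 23e-3)   # 0.28 wavelengths in 23 ms at 2.66 GHz
v_veh = velocity_kmh(0.28, 500e6, 10e-3)    # same prediction distance in 10 ms at 500 MHz
```

The first case corresponds to a pedestrian velocity of about 5 km/h, and the second to a vehicular velocity of about 60 km/h. The same prediction quality therefore applies to both situations under the stated scaling.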
Orthogonal RS are used in all results below. The noise powers at the RS-bearing resources might in general differ from those on the payload-bearing resources. In evaluations, we will here use the same power for both cases.
The prediction performance will be evaluated using the normalized mean squared error (NMSE) for the channel from the jth transmitter to the ith user
where T is an appropriate averaging interval. The NMSE (20) is averaged in decibels over each 1 s interval for every subcarrier separately.
6.2.2 Scheduling and precoding assumptions
It is assumed that the active users within a cluster have data to receive. The scheduling and precoding methods are evaluated at full system load for two cases: first with M = N = 3 users and second with M = 9 users. The single-antenna users are randomly scattered over the measurement route. At every time slot of length 1.3 ms, the users are grouped and scheduled over the resource blocks, represented by the 144 subcarriers, based on the predicted CSIT. Precoding is then carried out at each time slot as the users move along the route for 0.5 s. A one-dimensional search in the penalty matrix S by (23) in Appendix 2 is used by the RLP scheme to optimize the approximated sum rate (19). The obtained sum rate $\sum \log \left(1+\text{SINR}\right)$ is then averaged over the 0.5 s for each subcarrier. This is repeated for 1,000 different sets of user starting positions along the measurement route. The same noise power levels as those for the predictions are used. The power constraint is ${\mathcal{P}}_{\max}=1$ for each transmitter and for each subcarrier.
User grouping results are compared to a random user grouping with round robin scheduling denoted RUGRR. In that scheme, all M users are randomly subdivided into user groups of size M_{ g }≤N, with equality (M_{ g } = N = 3) in these simulations. Groups are scheduled in a round robin (RR) fashion over frequency, so all M users are served within a time slot.
Precoding results are compared to SC transmission with frequency reuse one. Then, each of the three base stations serves its own users on orthogonal resources, transmitting at full power with no base station cooperation. When SC transmission is compared to RUGRR, users within a cell are scheduled with RR and when it is compared to SBCUG, SB scheduling is used.
6.3 Prediction performance
The average NMSE of the predictions obtained by the experiments outlined above is presented in Table 2. For comparison, the NMSE achieved if the outdated estimate is used as a predictor is presented in the last (fifth) column. As the prediction horizon increases, so does the benefit of using predicted CSIT as opposed to outdated CSIT. Due to high transmission delays (>5 ms), current systems would need ϑ>4 for JT CoMP under the assumptions of Table 1. Therefore, the use of predictions instead of outdated estimates is very important.
For JT CoMP, assume that an interfering scalar complex-valued channel is given by $g=\hat{g}+\Delta g$, with $\hat{g}$ known, $\bar{E}\left[\Delta g\right]=0$, $\bar{E}\left[\hat{g}\Delta {g}^{\ast}\right]=0$ and an NMSE $\bar{E}\left[{\left|\Delta g\right|}^{2}\right]/\bar{E}\left[{\left|g\right|}^{2}\right]$. If this interference is to be canceled by receiving another channel component h, from another base station, then the resulting interference power $\bar{E}\left[{\left|g+h\right|}^{2}\right]$ is minimized by setting $h=-\hat{g}$, resulting in $\bar{E}\left[{\left|g+h\right|}^{2}\right]=\bar{E}\left[{\left|\Delta g\right|}^{2}\right]$. Therefore, the maximum attainable relative dampening factor would become $\bar{E}\left[{\left|g\right|}^{2}\right]/\bar{E}\left[{\left|\Delta g\right|}^{2}\right]$. Hence, a channel error with an NMSE of −x dB indicates that we can reduce the interference from that base station by at most x dB. For example, at a prediction horizon of ϑ = 18, the interference from the weakest base station at a given user can on average only be suppressed by 3 to 5 dB. The prediction performance of the weakest base station is far below that of the average performance over all base stations. These poor predictions might become ‘bad apples’ that infect the quality of the total precoding solutions.
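The bound on interference suppression can be verified by a short Monte Carlo sketch. It assumes circular symmetric complex Gaussian variables for the known prediction and for the error, uncorrelated with each other, and an illustrative NMSE of −10 dB. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
nmse_db = -10.0
err_var = 10 ** (nmse_db / 10)                # E|dg|^2 when E|g|^2 is normalized to 1

def cgauss(var, size):
    """Zero-mean circular symmetric complex Gaussian samples with the given variance."""
    return np.sqrt(var / 2) * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

g_hat = cgauss(1.0 - err_var, n)              # known prediction
dg = cgauss(err_var, n)                       # unknown error, uncorrelated with g_hat
g = g_hat + dg                                # true interfering channel
h = -g_hat                                    # best cancelling component given g_hat only
suppression_db = 10 * np.log10(np.mean(np.abs(g) ** 2) / np.mean(np.abs(g + h) ** 2))
```

The resulting suppression is close to 10 dB, i.e. the residual interference power equals the prediction error power, however much transmit power is spent on the cancellation.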
A closer study of the effect of using different noise floors and RS SNRs is shown in Figures 4 and 5. As expected, a low noise floor increases the prediction performance. The impact of the RS SNR is largest at short prediction horizons. This is because at long prediction horizons the fading statistics, rather than the noise, is the main limiting factor of the prediction performance, as also discussed in [22].
6.4 Precoding performance
In Table 3, the per-cell sum rates are presented for the precoding schemes when M = 3 and when the channels for 1,000 sets of user starting positions are predicted with a prediction horizon of ϑ = 8. When using random user grouping and round robin scheduling (RUGRR), we see that the two JT CoMP schemes, RLP and ZF, provide small gains as compared to SC transmission. In fact, ZF transmission performs much worse than SC transmission for the most difficult user groups (the 5% percentiles). Comparing ZF with RLP for these user groups, which can be regarded as the toughest CoMP groups, RLP outperforms ZF by almost a factor of 3. There are two reasons for this, the first being that RLP considers the CSIT inaccuracy in the design process and the second being that RLP performs power adjustments through the iterative process described in Section 5.2.2. As discussed in [9], both are important, but the most significant factor is that the RLP takes the CSIT inaccuracy into account. RLP will avoid transmitting power over poorly predicted channels, which usually coincide with the weak channels. Therefore, RLP will require a lower scaling constant c than ZF, even without using the iterative power adjustment.
With RUGRR, SC transmission outperforms RLP for 34% of the groups. For 17% of the groups, the per-cell sum rate is more than 1 bps/Hz/cell higher for SC transmission. With cellular user grouping combined with score-based scheduling (SBCUG), these numbers decrease to 7% and 0.6%, respectively. The improvement is due to better conditioned 3×3 channel matrices H, resulting in the need for on average smaller power scaling factors c in (10). These results indicate that even with few users to choose from in the system, local scheduling will provide good user groups for CoMP. This phenomenon will be further validated in Section 7.
A clear benefit of using local scheduling algorithms such as score-based scheduling is that we can get the benefits of multiuser diversity at low complexity. This is evident when we in Tables 3 and 4 compare the average sum rates when M = 3 with those for M = 9. The results for RUGRR remain almost unchanged, as expected. However, the SBCUG provides a multiuser diversity gain in the range of 30% for the CoMP schemes and 15% for SC transmission. For SBCUG with M = 9, the fraction of situations where SC outperforms CoMP with RLP is only 1%. The advantage of SC in sum rate is more than 1 bps/Hz in less than 0.01% of the situations. Interestingly, both of these observations indicate that the multiuser diversity gain is higher for JT CoMP than for SC transmission when using SBCUG. This is because the score-based scheduler selects users when they have their best channel quality, so their prediction errors will also be the lowest. This increases the accuracy of the CoMP precoder.
With SBCUG for M = 9 users, CoMP improves the average sum rate by 54% as compared to SC transmission. For the worst combinations of positions of scheduled users (the 5% percentile), the sum rate improves by 47%.
It is seen from Figure 6 that the highest sum rate gains from using CoMP are achieved when the noise floor is low. The system is then intra-cluster interference limited. The performance for ZF with perfect CSIT has been added for comparison. As the noise floor decreases, the gap between ZF with perfect CSIT and ZF with predicted CSIT increases. For low noise floors, RLP does not outperform ZF since RLP can only compensate for inaccurate CSIT by allocating transmit power over the more reliable channels, but it cannot compensate for the actual phase errors in the CSIT. As the noise floor decreases, and the channels become more accurate as a result (see Table 2), it therefore cannot perform better than ZF, even for the tough user groups.
We now in Figure 6 compare ZF, RLP and ZF with perfect CSIT in the case with a noise floor of −110 dBm using RUGRR. ZF with perfect CSIT then performs worse than RLP with predicted channels, which may seem surprising. However, as mentioned, the regularizing third term in the inverse in (14) affects the power allocation such that more power is transmitted over accurate channels than over very inaccurate channels. Since generally the most accurate channels are also the strongest channels, the power allocation is automatically better than that of the ZF solution, even when ZF uses perfect CSIT.
Table 5 shows the results as the prediction horizon increases to ϑ = 18 (23 ms at 2.66 GHz). The decrease in CSIT quality decreases the performance for CoMP, as coherent transmission is sensitive to phase errors. Interestingly, with SBCUG, there is still a clear gain from using CoMP as compared to using SC transmission. This is not the case with RUGRR. The CoMP schemes in combination with SBCUG are hence more robust to channel prediction errors than in combination with RUGRR. Even for these fairly long delays of 23 ms, we still obtain significant CoMP gains: a 38% increase in average sum rate for users at pedestrian velocities in the 2.66 GHz band. Moreover, if the system could guarantee delays of at most 10 or 5 ms, we could equivalently obtain significant CoMP gains for users at vehicular velocities of about 60 and 120 km/h, respectively, at a carrier frequency of 500 MHz.
All investigated scenarios above suggest that using SBCUG instead of RUGRR is especially important for ZF precoding. User grouping based on cellular scheduling increases the average sum rate performance of ZF precoding so that it becomes equal to that of RLP. The 5% percentile sum rate is increased by up to a factor of 6.7. This is because SBCUG generates well-conditioned matrices. The channel errors from the weak base stations will then have less effect on the final solution. This is most evident in the lowest percentiles, since these include the user groups with the largest channel errors.
It is noticeable, from Table 3 and Figure 6, that with SBCUG, ZF sometimes outperforms RLP. In our studies, we have seen that this is due to the approximations made when calculating $\bar{E}\left[\Delta {H}^{\ast}{V}^{\ast}V\Delta H\right]$ in (14) by using (7), (15) and (16). This overestimates the variance of the prediction error, as discussed in Section 3.2. RLP then becomes overly cautious, yielding a slightly worse solution. However, these effects are small and only noticeable at the lowest noise floor.
In all the above, we have assumed that the quantization error is small compared to the prediction error and therefore negligible. As the prediction errors are mostly in the regions of over 20 dB, a feedback cost of 8 to 10 bits per complex-valued scalar channel would ensure this. With an adaptive quantization scheme, the poor channels might only need 4 to 6 bits per complex-valued scalar channel for the quantization error to be negligible compared with the prediction error, so the feedback cost can then be lowered. The overhead required to notify the base station of how many bits each channel requires is low, as this relates to the shadow fading and only needs to be fed back on a slowly varying time scale.
An idea of how a non-negligible adaptive quantization error would affect the results can be gained by studying the performance differences between different noise floors. The higher noise floors lead to less accurate predictions, and quantization errors would amplify this effect. However, with a fixed quantization granularity, the size of the quantization error would be independent of the channel prediction quality. Then, in the presence of non-negligible quantization errors, other effects might occur, which are not present in the results presented here. This is a topic of importance, which will be left for future studies.
7 Investigation of user grouping strategies
Due to the high computational complexity of some of the user grouping schemes, not all of them have been evaluated on the extensive channel data of Section 6; they are instead evaluated in a simulation environment. Three cells supported by N = 3 omnidirectional single-antenna base stations at a distance R = 500 m serve M = 3,6,…,27 single-antenna users, with independently block-fading channels. The simulations use 140 block-fading resource blocks. The channel gains H_{ij} for each pair of user i and base station j are modeled as zero-mean circular symmetric complex Gaussian variables. Their variance ${\sigma}_{{h}_{ij}}^{2}$ is given by the path loss model 128.1 + 37.6 log10(d) and log-normal shadow fading with 8 dB standard deviation. The channels are generated in two steps. First, channel prediction error variances ${\sigma}_{\Delta {h}_{ij}}^{2}$ are calculated through (6), assuming that w = 4 flat fading subcarriers are predicted jointly and that the fading statistics for all channels H_{ij} are perfectly represented by a known fourth-order AR model with poles in 0.96e^{±0.09i} and 0.91e^{±0.04i}, yielding a flat Doppler spectrum. Such a spectrum generally causes channels that are harder to predict than those in the measurements, where there is a mixture of different Doppler spectra. Second, to ensure that the prediction and the prediction error are uncorrelated, each H_{ij} is calculated through (2) with Δ H_{quant} = 0 and with Δ H_{ij} and ${\hat{H}}_{ij}$ modeled as uncorrelated circular symmetric complex Gaussian variables with variances ${\sigma}_{\Delta {h}_{ij}}^{2}$ and ${\sigma}_{{h}_{ij}}^{2}-{\sigma}_{\Delta {h}_{ij}}^{2}$, respectively. The parameters in the right-hand column of Table 1 and a prediction horizon of ϑ = 8 are assumed.
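The two-step channel generation can be sketched as follows. Several details are assumptions made only for illustration: the distance d in the path loss model is taken to be in kilometres (as is customary for this model), the user-to-site distances are drawn from an arbitrary range, and the prediction error variance is set through a fixed placeholder NMSE instead of being computed from the Kalman equations in (6).

```python
import numpy as np

rng = np.random.default_rng(3)

def channel_variance(d_km, shadow_std_db=8.0):
    """Linear power gain from the path loss model plus log-normal shadow fading."""
    loss_db = 128.1 + 37.6 * np.log10(d_km) + shadow_std_db * rng.standard_normal(d_km.shape)
    return 10 ** (-loss_db / 10)

def cgauss(var):
    """Zero-mean circular symmetric complex Gaussian samples, one per variance entry."""
    shape = var.shape
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

M, N = 9, 3
d_km = rng.uniform(0.05, 0.5, size=(M, N))     # hypothetical user-to-site distances
var_h = channel_variance(d_km)                 # total channel variances
nmse = 0.05                                    # placeholder for the value obtained from (6)
var_dh = nmse * var_h                          # prediction error variances
H_hat = cgauss(var_h - var_dh)                 # predicted channels
dH = cgauss(var_dh)                            # prediction errors, drawn independently of H_hat
H = H_hat + dH                                 # true channels, with total variance var_h
```

Drawing the prediction and the error independently, and adding them to form the true channel, guarantees that the two are uncorrelated and that the total channel variance equals the path loss and shadowing value.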
Users are dropped randomly with equal probability within a circle of 360 m radius from the cluster center. This area corresponds well to the area in which a user would be allocated for overlapping network-centric cooperation clusters that are formed as described in [14, 17, 48].
Performance is evaluated in terms of sum rate and individual user rate using ZF JT CoMP over 1,000 sets of user positions. The results from an exhaustive search of which user groups give the best sum rate on each resource have been added. This is denoted as optimal best rate (opt. BR).
7.1 Results
Comparisons between all the user grouping and scheduling schemes described in Section 4, as well as RUGRR, are presented in terms of sum rate (Figure 7) and average user rate (Figure 8) for M = 9 users. Note that the CUG scheme performs close to the much more complex GUG algorithm, both for the near sum rate optimal groups (comparing GUGBR with BRCUG) and for the ‘fair’ user groups (comparing GUGPF with SBCUG). Both GUGBR and BRCUG also perform close to the sum rate optimal user grouping obtained by exhaustive search. In terms of the lowest percentiles of the average user rates for the fair algorithms, GUGPF is fairer than SBCUG. This can be explained by the SBCUG being restricted to allocating resources fairly amongst users in the same cell. Therefore, when the users are unevenly distributed, e.g. when 80% of the users belong to the same master base station, these users will be allocated fewer resources than the other 20% of the users. The low percentiles of SBCUG are still much better than those obtained with RUGRR and with the sum rate optimal user grouping algorithms. In Figure 9, we see that the multiuser scheduling gain for the BRCUG algorithm is on par with that of the sum rate optimal algorithm. For the fairer SBCUG, the gain in terms of sum rate is smaller.
8 Discussions and conclusions
The paper has investigated the sum rate performance gains by coordinated joint linear transmission (JT CoMP) from several sites, relative to conventional cellular transmission with frequency reuse 1.
We have taken several types of constraints into account to obtain a reasonably realistic setting. Measured channel sounding data were used to obtain fading channels from multiple transmitter sites for a large set of terminal positions. We focused on cooperation between three single-antenna (macro) sites, to model a scenario with reasonable demands on feedback and on backhaul in a small cooperation cluster. All users furthermore had pedestrian velocities and we predicted their channels by Kalman algorithms. This setting produced significant CSIT errors and allowed us to investigate the limits of performance due to channel outdating. To obtain reasonable computational complexity, we furthermore restricted focus to linear precoders that were designed jointly for the whole cluster based on the inaccurate CSIT.
Our results take delays over the backhaul links into account, via the required prediction horizon, but backhaul capacity within clusters is not constrained. Such constraints would reduce performance markedly [43]. Furthermore, quantization errors of the channel prediction feedback over uplinks in FDD systems have been assumed small relative to the prediction errors. This assumption would be fulfilled by, e.g., using 10-bit quantization of complex channel components. (For the considered case of three base station antennas per cooperating cluster, the resulting feedback load over the uplink would then be 30 bits per scheduled user for each scheduled block. This assumes feedback of predictions only by the scheduled users and only for scheduled resource blocks, as proposed in Section 4. Methods that further reduce the feedback overhead are under current investigation.)
The first main conclusion that stands out from these results is the crucial importance of a good user grouping. Joint transmission to a group of users with a badly conditioned channel matrix would lead to scaling problems in a linear precoder that is designed under per-antenna power constraints. With random user positions, such problems occur frequently, with the result that the advantages of CoMP relative to cellular transmission are lost.
A second main conclusion is that with reasonably good user grouping, JT CoMP combined with fair opportunistic scheduling provides significant performance gains for practically all of the sets of investigated user positions. This holds also at quite large CSIT error levels, e.g. at an NMSE of −9 dB on average over all positions at 0.28 wavelengths or 23 ms prediction horizons (Tables 2 and 5). However, for still longer prediction distances in space, the performance starts to deteriorate and the gains of using coherent joint transmission vanish [10]. A specialized ‘predictor antenna’ system for vehicles has recently been proposed to obtain accurate CSI also at very high velocities [49].
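The relation between a prediction horizon in time and in wavelengths can be checked with a short calculation. The sketch below assumes a pedestrian speed of 5 km/h, which is not stated in this section; the 23 ms horizon is from the text and the 2.66 GHz carrier from the abstract.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def horizon_in_wavelengths(speed_mps, delay_s, carrier_hz):
    """Distance travelled during the prediction delay, in carrier wavelengths."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return speed_mps * delay_s / wavelength


# Assumed pedestrian speed of 5 km/h at 2.66 GHz with a 23 ms horizon.
print(round(horizon_in_wavelengths(5 / 3.6, 0.023, 2.66e9), 2))  # prints 0.28
```

Under these assumptions the two ways of stating the horizon agree, and the same function shows how quickly the horizon in wavelengths grows at vehicular speeds.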
A third highlight is that these gains can be obtained by using a simple user grouping scheme that we have proposed and evaluated here. Its essence is ‘Perform multiuser scheduling with respect to frequency locally for each cell. Then, for each frequency resource block, design joint transmission precoders for the terminals that have thereby been allocated to use that resource block.’ The first step can be executed locally in the base stations as opposed to in the central control unit, which puts less strain on the backhaul links. Multiuser scheduling gains over frequency-selective channels are then preserved and even amplified (comparing Tables 3 and 4) by using JT CoMP relative to single-cell transmission that uses the same schedulers. By enabling the use of a two-stage feedback approach, the proposed user grouping scheme also reduces the CSI feedback overhead in FDD systems drastically.
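The first step of this scheme, local scheduling per cell followed by grouping of the co-scheduled users, could be sketched as below. The metric table and the pick-the-maximum rule are hypothetical stand-ins for whichever local scheduler is used (a score-based scheduler would rank users differently), and the precoder design of the second step is left out.

```python
def cellular_user_grouping(metric):
    """Form one user group per resource block from per-cell scheduling decisions.

    metric[cell][user][rb] is a hypothetical local scheduling metric
    (for example a predicted rate) known to that cell's base station.
    Returns groups[rb] = list of (cell, user) pairs scheduled on block rb.
    """
    n_rb = len(metric[0][0])
    groups = [[] for _ in range(n_rb)]
    for cell, users in enumerate(metric):
        for rb in range(n_rb):
            # Local step: each cell picks its own best user for this block.
            best_user = max(range(len(users)), key=lambda u: users[u][rb])
            groups[rb].append((cell, best_user))
    return groups


# Two cells, two users per cell, three resource blocks (made-up numbers).
metric = [
    [[1.0, 0.2, 0.5], [0.3, 0.9, 0.4]],
    [[0.6, 0.1, 0.8], [0.2, 0.7, 0.9]],
]
groups = cellular_user_grouping(metric)
```

Each cell needs only its own users' metrics to make its choice, so the central unit only has to learn which users were selected on each block before designing the joint precoder for that group.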
The simulations provided in Section 7 have shown that this extremely simple algorithm performs very close to the much more complex, feedback and backhaul demanding greedy user grouping algorithm. It also performs close to rate-optimal. Its effectiveness in avoiding bad user groups is illustrated most strikingly by the resulting increase of the 5% percentile sum rate performance, relative to random user grouping, when using zero-forcing precoding (Tables 3, 4 and 5). This user grouping scheme could be improved further, by introducing a second scheduling round that eliminates the few remaining cases with channel matrices with large singular value spread. That would however increase both the delay and the computational complexity.
A similar user grouping scheme can also be used with multi-antenna base stations, where we in a first step design (multiuser) MIMO beamformers for each cell. Joint CoMP precoders (beamformers for the whole cluster) are then designed in a second step and are added to the signal chains before the cellular beamformers [17, 50].
Robust precoding that takes the channel inaccuracy into account is an important safeguard against remaining cases with problematic channel matrices. We have studied the use of the iterative RLP design of linear precoders for this purpose. When provided with a ‘tough’ user group with a badly conditioned channel matrix, the robust precoder designed by the RLP scheme outperforms standard zero-forcing by a factor of 3 in terms of 5% percentile sum rate (Tables 3, 4 and 5, for RUGRR). However, when user groups are chosen that mostly ensure diagonally dominant channel matrices, RLP does not have a great advantage over ZF.
We have furthermore found interesting interactions between channel estimation and the properties of RLP precoders. A question posed in the introduction was on the effects of large differences in estimation accuracy for strong and weak channels. Would the larger inaccuracy of estimates in weak channels spoil the precoder design? When the RLP design is used, the opposite happens. Large inaccuracies of weak channels lead to these transmitters being less used by the precoder. This leads to less need for rescaling of the solution to satisfy the transmit power constraint.
With good precoder design and user grouping schemes, the limits of performance for linear downlink JT CoMP will mainly be due to the CSIT quality and the out-of-cluster interference and noise level (see Figure 6, SBCUG). Cooperation cluster design is therefore crucial for improving the attainable performance. This includes reducing intercluster interference for pilots and for payload data by semi-static transmission resource planning, power control and antenna beamforming [48] and schemes that increase the probability that users will find all of their strongest transmitters within one of the clusters [17].
Appendix 1
Kalman filter
Each Kalman filter at user i is assumed to use measurements $\Upsilon(t)=[y_{i}^{qw}(t),\dots,y_{i}^{(q+1)w-1}(t)]^{T}$ at groups of w RS-bearing subcarriers. From (1) and (4), we get
$$\Upsilon(t)=\Phi\,h(t)+\aleph(t). \qquad (21)$$
Here, the measurement noise $\aleph(t)=[n_{i}^{qw}(t),\dots,n_{i}^{(q+1)w-1}(t)]^{T}$ is assumed zero mean with known covariance matrix R^{ℵ} and the matrix $\Phi \in \mathbb{C}^{w\times wN}$ contains only known reference symbols and zeros. Reference symbols may be transmitted at orthogonal time-frequency resources by different transmitters. Alternatively, to reduce the RS overhead or to increase the RS pattern density in time and/or frequency, we may use quasi-orthogonal RS. Such ‘overlapping’ or ‘code orthogonal’ pilots have, e.g., been proposed in [23] and evaluated in [9, 17]. One benefit of using the latter technique is that the addition of transmit antennas would not necessarily cause an increase in overhead. However, whenever jointly estimated subchannels are not perfectly flat fading, code orthogonality is lost in the receiver. In [17], it was shown that this leads to a large degradation of prediction quality for the weakest channels when these are much weaker (e.g. by more than 10 dB) than the strongest channels. The energy leaking from the RS transmitted over strong channels will then be large in comparison to the energy of the received RS transmitted over weak channels. This leaked energy can be regarded as an extra noise term in the measurement, thus causing a noticeably lower experienced SNR for the weak channels. Since channels from antennas located at different base stations will generally have large gain differences while those located at the same base station will not, we here assume the use of orthogonal RS for antennas located at different base stations, while quasi-orthogonal RS may be used for those located at the same base station.
As an example, with two transmitters that use overlapping RS with BPSK symbols {1, −1}, two users and four jointly estimated subcarriers, w = 4, the matrices Φ in (21) could be given by Φ = [ I, I ] for user i = 1 and Φ = [ diag{1,1,1,1}, diag{1,1,1,1} ] for user i = 2, while for K_{CRS} = K, $h\left(t\right)={\left[{H}_{i1}^{0},{H}_{i1}^{1},{H}_{i1}^{2},{H}_{i1}^{3},{H}_{i2}^{0},{H}_{i2}^{1},{H}_{i2}^{2},{H}_{i2}^{3}\right]}^{T}$, i = 1,2.
The Kalman filter equations for updating the estimated state vector of user i are given by
where $\hat{x}(t_{1}|t_{2})$ is an estimate of the state space vector in (3) at time t_{1} based on measurements up to time t_{2}, $P(t_{1}|t_{2})=E[(x(t_{1})-\hat{x}(t_{1}|t_{2}))(x(t_{1})-\hat{x}(t_{1}|t_{2}))^{\ast}]$ and $\mathcal{K}(t)$ is the Kalman filter gain. These recursively computed estimates are based on a set of past measurements Υ(t), Υ(t−1), … that grows in time, without requiring an increasing memory size to store the measurements.
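As an illustration of such a recursion, the following sketch implements the textbook Kalman measurement update and time update for a scalar complex state. It is a deliberate simplification and not the filter used in the paper: a first-order AR state replaces the fourth-order model, a single pilot replaces the w jointly estimated subcarriers, and the names a, q, phi and r as well as all numbers are hypothetical.

```python
def measurement_update(x_pred, p_pred, y, phi, r):
    """Correct the one-step prediction with the pilot measurement y = phi*x + n."""
    gain = p_pred * phi.conjugate() / (abs(phi) ** 2 * p_pred + r)  # Kalman gain
    x_filt = x_pred + gain * (y - phi * x_pred)
    p_filt = (1.0 - (gain * phi).real) * p_pred
    return x_filt, p_filt


def predict(x_filt, p_filt, a, q, steps=1):
    """Propagate the estimate and its error variance `steps` samples ahead."""
    x, p = x_filt, p_filt
    for _ in range(steps):
        x = a * x                # state model x(t+1) = a*x(t) + e(t)
        p = abs(a) ** 2 * p + q  # error variance grows with the horizon
    return x, p


a, q, r = 0.9, 0.19, 0.1  # hypothetical model; stationary variance q/(1-a^2) = 1
x_pred, p_pred = 0j, 1.0
for y in [0.8 + 0.1j, 0.7 + 0.2j, 0.9 - 0.1j]:  # made-up measurements
    x_filt, p_filt = measurement_update(x_pred, p_pred, y, 1.0, r)
    x_pred, p_pred = predict(x_filt, p_filt, a, q)
x_ahead, p_ahead = predict(x_filt, p_filt, a, q, steps=8)  # e.g. a horizon of 8 samples
```

Only the latest estimate and its error variance are carried between iterations, which is the sense in which the memory requirement does not grow with the number of measurements.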
Appendix 2
Iterative adjustment of the penalty matrix S
The criterion (18) can be optimized by adjusting the transmit powers with a step-by-step greedy algorithm to reduce power normalization loss problems due to the scaling of (1/c) in (10). We outline this procedure below for single-antenna base stations.
First, calculate the optimal precoder from (14) with V = I and S = ϵ I. Here, ϵ≪1 is a small real-valued number ensuring that S is positive definite. The resulting precoder minimizes the intracluster interference which might not be optimal w.r.t. (18). This precoder is then used as the initial value for a sequence of iterative, one-dimensional searches where we sequentially adjust the penalties on the transmit powers used by each base station.
Now, calculate u using (10) under the per-base-station power constraint (11) and set
Here, ${\mathbf{1}}_{{j}_{max}}$ denotes a vector with a 1 if the corresponding base station has the highest transmit power and zeros otherwise. For example, assume a system with N = 3, M = 2 and
Then, the transmit powers at base stations 1, 2 and 3 are [ 1 13 1.25 ], so j_{max} = 2 and $\mathbf{1}_{j_{max}}=[\,0\;\;1\;\;0\,]$. The parameter ρ_{1} is iteratively optimized w.r.t. (18) over an interval ]0,ρ_{1,max}[ where ρ_{1,max} is the smallest value that will cause j_{max} to change. This procedure can be repeated for the second strongest base station, denoted j_{max2}, with
Similarly, as for ρ_{1}, the parameter ρ_{2} is now optimized over ]ϵ,ρ_{2,max}[, while ρ_{1} is held fixed, where ρ_{2,max} is the smallest value that will cause the value of j_{max} or j_{max2} to change. In the above example, j_{max2} = 3 and $\mathbf{1}_{j_{max2}}=[\,0\;\;0\;\;1\,]$. Again, the procedure can be repeated for all remaining base stations in the order of decreasing transmit power until the final matrix is
For clusters with few base stations, it is often sufficient to adjust only one scalar parameter in S related to the strongest base station as for (23). For clusters with many base stations, further improvements are obtained by adjusting additional diagonal elements in S starting with that associated with the second strongest base station.
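The ordering step used in the example above can be sketched in a few lines. The precoder matrix below is hypothetical, chosen only so that its row powers reproduce the values 1, 13 and 1.25 quoted in the example; the scaling (1/c) is ignored, and the one-dimensional searches over the penalty parameters are not shown.

```python
def base_station_powers(precoder):
    """Transmit power per base station: squared row norms of the N x M precoder."""
    return [sum(abs(entry) ** 2 for entry in row) for row in precoder]


def indicator_vectors(powers):
    """Indicator vectors for base stations in order of decreasing transmit power."""
    order = sorted(range(len(powers)), key=lambda j: powers[j], reverse=True)
    return [[1 if j == k else 0 for j in range(len(powers))] for k in order]


# Hypothetical N = 3, M = 2 precoder reproducing the powers [1, 13, 1.25].
precoder = [[1.0, 0.0], [2.0, 3.0], [1.0, 0.5]]
powers = base_station_powers(precoder)
ones_jmax, ones_jmax2, _ = indicator_vectors(powers)
```

Here `ones_jmax` selects base station 2 and `ones_jmax2` selects base station 3, matching the order in which the penalties are adjusted in the example.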
For multi-antenna base stations, all the co-located transmit antennas of one cell have average channel gains of the same order of magnitude. They should therefore be penalized using the same order of magnitude. Then, one penalty parameter value ρ_{ j } can be adjusted simultaneously for all antennas at one base station j at a time as for the single-antenna base station example above.
Appendix 3
Assuming no quantization errors, $E[s_{i}s_{j}^{\ast}]=\delta_{ij}$, $\bar{E}[\Delta H]=0$ and E[n n^{∗}] = σ^{2}I in (1), (2) and (10), the expected values of the power for the received message P_{S,i}, the intracluster interference P_{I,i} and the noise P_{N,i} at the ith user are given by
where
Assuming that $\bar{E}[\hat{H}\Delta H^{\ast}]=0$,
where $\bar{E}[\Delta H_{ij}\Delta H_{in}^{\ast}]$ is element (j,n) of the covariance submatrix (C P(t + ϑ|t)C^{∗})^{k} of user i.
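A numerical sketch of such expected powers is given below. It is derived only from the stated assumptions (white unit-variance symbols, zero-mean prediction errors uncorrelated with the predictions) together with the assumption that the transmitted vector is u = (1/c) R s; the function and argument names are hypothetical and the expressions are not copied from the paper.

```python
def expected_powers(h_hat, err_cov, precoder, c, noise_var):
    """Expected signal, interference and noise power per user.

    h_hat[i][j]      predicted gain from transmitter j to user i (M x N)
    err_cov[i][j][n] E[dH_ij * conj(dH_in)] for user i (N x N per user)
    precoder[j][k]   weight at transmitter j for the stream of user k (N x M)
    c                real scaling applied to the precoder output
    """
    n_users, n_tx = len(h_hat), len(precoder)

    def stream_power(i, k):
        # Expected power received by user i from the stream intended for user k.
        col = [precoder[j][k] for j in range(n_tx)]
        nominal = abs(sum(h_hat[i][j] * col[j] for j in range(n_tx))) ** 2
        spread = sum(
            err_cov[i][j][n] * col[j] * col[n].conjugate()
            for j in range(n_tx)
            for n in range(n_tx)
        )
        return (nominal + spread.real) / c ** 2

    signal = [stream_power(i, i) for i in range(n_users)]
    interference = [
        sum(stream_power(i, k) for k in range(n_users) if k != i)
        for i in range(n_users)
    ]
    noise = [noise_var] * n_users
    return signal, interference, noise


# Made-up 2 x 2 case: perfect nominal separation, error variance 0.1 per link.
h_hat = [[1.0, 0.0], [0.0, 1.0]]
err_cov = [[[0.1, 0.0], [0.0, 0.1]], [[0.1, 0.0], [0.0, 0.1]]]
precoder = [[1.0, 0.0], [0.0, 1.0]]
p_s, p_i, p_n = expected_powers(h_hat, err_cov, precoder, c=1.0, noise_var=0.5)
```

In this toy case the nominal interference is zero, so the whole interference term comes from the prediction error covariance, which illustrates why CSIT quality limits the attainable SINR.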
Appendix 4
Proof of Theorem 1
We will here prove that (14) minimizes the cost function (13). If S has full rank, then the inverse in (14) exists, so R_{RLP} by (14) exists and is unique. Assume that R_{RLP} by (14) does not minimize (13). Since the signal vector s in (10) is assumed white, any alternative potentially superior linear precoder can then be expressed as a linear function of s at time t only. Then there must exist a matrix R = R_{RLP} + ϵ T
with a complex matrix T and a real number ϵ with which we can decrease the value of J. We can then write the error signal ϵ and the vector of transmitted signals u as
Using these notations and that ∥v∥^{2} = tr(v v^{∗}) for a vector v, we can rewrite (13) as
where
First, note that J_{0} is not affected by the choice of T. Second, we note that J_{2}≥0, so the only way to decrease J is by choosing T such that J_{1}<0. Using E[s s^{∗}] = I we can expand J_{1} into
Through the trace rotation rule, tr(A B) = tr(B A), we get
Assuming that $\bar{E}[\Delta H]=\bar{E}[\Delta H_{\text{quant}}]=0$ and $\bar{E}[\Delta H^{\ast}V^{\ast}V\Delta H_{\text{quant}}]=0$ and inserting (2) and (14) into (31), we get J_{1} = 0 for all T. Hence, we cannot choose a matrix T that will decrease the cost function, J = J_{0} + ϵ^{2}J_{2}. The minimum J = J_{0} is attained only by setting ϵ = 0, so R = R_{RLP} minimizes the cost function.
References
1. Lee J, Kim Y, Lee H, Ng BL, Mazzarese D, Liu J, Xiao W, Zhou Y: Coordinated multipoint transmission and reception in LTE-advanced systems. IEEE Com. Mag 2012, 50: 44–50.
2. Tao X, Xu X, Cui Q: An overview of cooperative communications. IEEE Wireless Com. Mag 2012, 8: 65–71.
3. Lee D, Seo H, Clerckx B, Hardouin E, Mazzarese D, Nagata S, Sayana K: Coordinated multipoint transmission and reception in LTE-advanced deployment: scenarios and operational challenges. IEEE Wireless Com. Mag 2012, 50: 148–155.
4. Björnson E, Jaldén N, Bengtsson M, Ottersten B: Optimality properties, distributed strategies, and measurement-based evaluation of coordinated multicell OFDMA transmission. IEEE Trans. Signal Process 2011, 59: 6086–6101.
5. Lee W, Lee I, Kwak JS, Ihm B, Han S: Multi-BS MIMO cooperation: challenges and practical solutions in 4G systems. IEEE Wireless Com 2012, 19: 89–96.
6. Karakayali MK, Foschini GJ, Valenzuela RA: Network coordination for spectrally efficient communications in cellular systems. IEEE Wireless Com 2006, 13: 56–61.
7. Zhang H, Dai H: Cochannel interference mitigation and cooperative processing in downlink multicell multiuser MIMO networks. EURASIP J. Wireless Commun. Netw 2004. doi:10.1155/S1687147204406148
8. 3GPP TR 36.819 v11.0.0: 3rd Generation Partnership Project, Technical specification group radio access network; Coordinated multipoint operation for LTE physical layer aspects, (Release 11). 2011. http://www.3gpp.org/DynaReport/36819.htm. Accessed 14 June 2014
9. Apelfröjd R, Sternad M, Aronsson D: Measurement-based evaluation of robust linear precoding for downlink CoMP. In Proc. of IEEE ICC 2012. Ottawa; 10–15 June 2012.
10. Li J, Papadogiannis A, Apelfröjd R, Svensson T, Sternad M: Performance analysis of coordinated multipoint transmission schemes with imperfect CSI. In Proc. of IEEE PIMRC 2012. Sydney; 9–12 Sept 2012.
11. Zhang J, Chen R, Andrews JG, Ghosh A, Heath RW: Network MIMO with clustered linear precoding. IEEE Trans. Wireless Com 2009, 8: 1910–1921.
12. Gesbert D, Hanly S, Huang H, Shamai Shitz S, Simeone O, Yu W: Multicell MIMO cooperative networks: a new look at interference. IEEE J. Select. Areas Com 2010, 28: 1380–1408.
13. Spencer QH, Swindlehurst AL, Haardt M: Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels. IEEE Trans. Signal Process 2004, 52: 461–471. 10.1109/TSP.2003.821107
14. Mennerich W, Zirwas W: Reporting effort for cooperative systems applying interference floor shaping. In Proc. of IEEE PIMRC 2011. Toronto; 11–14 Sept 2011.
15. Huh H, Caire G, Papadopoulos HC, Ramprashad SA: Achieving “massive MIMO” spectral efficiency with a not-so-large number of antennas. IEEE Trans. Wireless Com 2012, 11: 3226–3239.
16. Lozano A, Heath RW, Andrews JG: Fundamental limits of cooperation. IEEE Trans. Information Theory 2013, 59: 5213–5226.
17. ARTIST4G D1.4: Interference avoidance techniques and system design, Artist4G technical deliverable. https://ictartist4g.eu/projet/deliverables. Accessed 7 Nov 2013
18. Ravindran N, Jindal N: Multiuser diversity vs. accurate channel state information in MIMO downlink channels. IEEE Trans. Wireless Com 2012, 11: 3037–3046.
19. Thiele L, Olbrich M, Kurras M, Matthiesen B: Channel aging effects in CoMP transmission: gains from linear channel prediction. In 45th Asilomar Conf. on Signals, Systems and Computers. Pacific Grove; 6–9 Nov 2011.
20. Wild T: Comparing downlink coordinated multipoint schemes with imperfect channel knowledge. In Proc. of IEEE VTC Fall 2011. San Francisco; Sept 2011.
21. Su L, Yang C, Han S: The value of channel prediction in CoMP systems with large backhaul latency. In Proc. of IEEE WCNC 2012. Paris; 1–4 April 2012.
22. Aronsson D: Channel estimation and prediction for MIMO OFDM systems - key design and performance aspects of Kalman-based algorithms, Dissertation, Uppsala University, 2011. http://www.signal.uu.se/Publications/ptheses.html. Accessed 7 Nov 2013
23. Aronsson D, Sternad M: Kalman predictor design for frequency-adaptive scheduling of FDD OFDM uplinks. In Proc. of IEEE PIMRC 2007. Athens; 3–7 Sept 2007.
24. Shenouda MB, Davidson TN: On the design of linear transceivers for multiuser systems with channel uncertainty. IEEE J. Selected Areas Com 2008, 26: 1015–1024.
25. Vucic N, Boche H, Shi S: Robust transceiver optimization in downlink multiuser MIMO systems with channel uncertainty. IEEE Trans. Signal Process 2009, 57: 3576–3587.
26. Öhrn K, Ahlén A, Sternad M: A probabilistic approach to multivariable robust filtering and open-loop control. IEEE Trans. Automatic Control 1995, 40: 405–418. 10.1109/9.376052
27. Sternad M, Ahlén A: Polynomial Methods in Optimal Control and Filtering. Edited by: Hunt K. Control Engineering Series, Peter Peregrinus, London; 1993: Chapter 3.
28. Bränmark LJ: Robust sound field control for audio reproduction - a polynomial approach to discrete-time acoustic modeling and filter design, Dissertation, Uppsala University, 2011.
29. Hunger R, Dietrich FA, Joham M, Utschick W: Robust transmit zero-forcing filter. In Proc. of ITG Workshop on Smart Antennas. Munich; 18–19 March 2004.
30. Castro PM, Joham M, Castedo L, Utschick W: Robust MMSE linear precoding for multiuser MISO systems with limited feedback and channel prediction. In Proc. of IEEE PIMRC. Cannes; 15–18 Sept 2008.
31. Zhang X, Palomar DP, Ottersten B: Statistically robust design of linear MIMO transceivers. IEEE Trans. Signal Process 2008, 56: 3678–3689.
32. Marsch P, Fettweis G: On multicell cooperative transmission in backhaul-constrained cellular systems. Annals Telecommun 2008, 63(5): 253–269. doi:10.1007/s1224300800283
33. Chang RY, Tao Z, Zhang J, Kuo CCJ: Multicell OFDMA downlink resource allocation using a graphic framework. IEEE Trans. Vehicular Technol 2009, 58: 3494–3507.
34. Shen Z, Chen R, Andrews JG, Heath RW Jr, Evans BL: Low complexity user selection algorithms for multiuser MIMO systems with block diagonalization. IEEE Trans. Signal Process 2006, 54: 3658–3663.
35. Yoo T, Jindal N, Goldsmith A: Multi-antenna downlink channels with limited feedback and user selection. IEEE J. Selected Areas Com 2007, 25: 1478–1491.
36. Diehm F, Fettweis G: Centralized scheduling for joint decoding cooperative networks subject to signaling delays. In Proc. of IEEE VTC Fall 2011. San Francisco; 5–8 Sept 2011.
37. Wigren T: Recursive noise floor estimation in WCDMA. IEEE Trans. Vehicular Technol 2010, 59: 2615–2620.
38. Kailath T, Sayed AH, Hassibi B: Linear Estimation. Prentice Hall, Upper Saddle River; 2000.
39. Zakhour R, Gesbert D: A two-stage approach to feedback design in multiuser MIMO channels with limited channel state information. In Proc. of IEEE PIMRC 2007. Athens; 3–7 Sept 2007.
40. Seifi N, Viberg M, Heath RW Jr, Zhang J, Coldrey M: Multimode transmission in network MIMO downlink with incomplete CSI. EURASIP J. Adv. Signal Process 2011. doi:10.1155/2011/743916
41. Bonald T: A score-based opportunistic scheduler for fading radio channels. In Proc. of European Wireless Conf (EWC). Barcelona; 24–27 Feb 2004.
42. Viswanath P, Tse DNC, Laroia RL: Opportunistic beamforming using dumb antennas. IEEE Trans. Inf. Theory 2002, 48: 1277–1294. 10.1109/TIT.2002.1003822
43. Papadogiannis A, Bang HJ, Gesbert D, Hardouin E: Efficient selective feedback design for multicell cooperative networks. IEEE Trans. Vehicular Technol 2011, 60: 196–205.
44. Wiesel A, Eldar YC, Shamai S: Zero-forcing precoding and generalized inverses. IEEE Trans. Signal Process 2008, 56: 4409–4418.
45. Lakshmana TR, Botella C, Svensson T: Partial joint processing with efficient backhauling using particle swarm optimization. EURASIP J. Wireless Com. Netw 2012. doi:10.1186/168714992012182
46. Medbo J, Siomina I, Kangas A, Furuskog J: Propagation channel impact on LTE positioning accuracy - a study based on real measurements of observed time difference of arrival. In IEEE PIMRC 2009. Tokyo; Sept 2009.
47. Sjanic Z, Gunnarsson F, Fritsche C, Gustafsson F: Cellular network non-line-of-sight reflector localisation based on synthetic aperture radar methods. IEEE Trans. Antenn. Propag 2014, 62: 2284–2287.
48. Zirwas W, Mennerich W: The importance of interference floor shaping for CoMP systems. In Proc. of International OFDM Workshop. Hamburg; 31 Aug 2011.
49. Sternad M, Grieger M, Apelfröjd R, Svensson T, Aronsson D, Martinez AB: Using “predictor antennas” for long-range prediction of fast fading for moving relays. In IEEE WCNC 2012. Paris; April 2012.
50. Zirwas W, Mennerich W, Khan A: Main enablers for advanced interference mitigation. Wiley Online Library Trans. Emerging Tel. Tech., Special Issue LTE-A 2013, 24: 18–31.
Acknowledgements
The research leading up to these results has received funding from the European Commission’s Seventh Framework Program FP7-ICT-2009 under grant agreement no. 247223, also referred to as ARTIST4G. We thankfully acknowledge the contributions and insights of our colleagues within the project, in particular Wolfgang Zirwas from NSN, Tommy Svensson, Jingya Li and Tilak Lakshmana from Chalmers, Michael Grieger, Fabian Diehm and Richard Fritzsche from TU Dresden, Valeria D’Amico from Telecom Italia and Daniel Aronsson, now at Mathworks, Sweden. The research was also partially funded by the Swedish Research Council via the framework program Dynamic Multipoint Wireless Transmission. We thank Ericsson Research for providing the channel measurements.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Apelfröjd, R., Sternad, M. Design and measurementbased evaluations of coherent JT CoMP: a study of precoding, user grouping and resource allocation using predicted CSI. J Wireless Com Network 2014, 100 (2014). https://doi.org/10.1186/168714992014100
Keywords
 Coordinated multipoint
 Channel predictions
 User grouping
 Resource allocation
 Robust precoding