MSE minimized joint transmission in coordinated multipoint systems with sparse feedback and constrained backhaul requirements


In a joint transmission coordinated multipoint (JT-CoMP) system, a shared spectrum is utilized by all neighboring cells. In the downlink, a group of base stations (BSs) coordinately transmit the users’ data to avoid serious interference at the users on the cell boundaries, thus substantially improving area fairness. However, this comes at the cost of high feedback and backhaul loads. In a frequency division duplex system, all users at the cell boundaries have to collect and feed back the downlink channel state information (CSI). In centralized JT-CoMP, although perfect coordination is possible, a central coordination node has to send the computed precoding weights and the corresponding data to all cells, which can overwhelm the backhaul resources. In this paper, we design a JT-CoMP scheme by which the sum of the mean square error (MSE) at the boundary users is minimized, while the feedback and backhaul loads are constrained and the load is balanced between BSs. Our design is based on the singular value decomposition of the CSI matrix and the optimization of a binary link selection matrix to provide sparse feedback and a constrained backhaul link. For comparison, we adopt previously presented schemes for feedback and backhaul reduction in the physical layer. Extensive numerical evaluations show that the proposed scheme reduces the MSE by at least \(25\%\) compared to the adopted and existing schemes.

1 Introduction

Mobile communication systems are becoming an essential part of social networks, interactive media (e.g. augmented and virtual reality), and the internet of things (IoT), as well as a facilitator for the digital economy. This drives the fifth-generation and beyond (B5G) mobile communication systems to scale mainly in three dimensions: (a) The peak data rate is scaled to \(10\ {\mathrm{Gbps}}\), which is ten times the peak data rate of the fourth generation (4G) long term evolution (LTE), release 10, and the end-to-end latency decreases to \(1\ {\mathrm{ms}}\), which is one-fifth of the latency of 4G. (b) A massive scale in the number of connected IoT devices is needed. (c) Higher system reliability and shorter round trip times are required in transportation systems and industrial process control. To fulfill these requirements, a combination of various new techniques is needed, such as massive multiple input multiple output (MMIMO), dense small cells, cooperative communications such as device-to-device (D2D) and coordinated multipoint (CoMP), advanced air interfaces, additional spectrum at higher frequencies (mm-wave), and integrated access and backhaul (IAB) [1,2,3,4,5].

One of the most promising concepts to cover all the above requirements is the ultra-dense network (UDN) with a frequency reuse factor of (close to) one, where more small base stations (BSs) are deployed within the service area. However, using the same frequency for all BSs exposes the users to severe inter-cell interference (ICI), especially at the cell edges. Recently, to reap the benefits of UDN, the cloud radio access network (CRAN) architecture has been proposed, where ICI can be effectively mitigated by employing the CoMP technique [6, 7]. Initially, CoMP was introduced for LTE-A by the third generation partnership project (3GPP) to mitigate ICI for cell-edge users [8]. Qualcomm has implemented a fifth-generation (5G) CoMP testbed and showed a fourfold increase in system capacity [9]. Currently, CoMP is considered one of the potential technologies for 5G cellular networks. Accordingly, 5G has enhanced some aspects of CoMP such as control signaling and channel feedback [10, 11].

There are three main categories of downlink CoMP: joint transmission (JT), dynamic point selection (DPS), and coordinated scheduling/beam-forming (CS/CB). In JT-CoMP, the data related to a user is available at all serving BSs and is transmitted simultaneously by each BS. This transmission can be coherent or non-coherent. In coherent transmission (also known as a multiple input multiple output (MIMO) network), the signal strength is enhanced by precoding the data to exploit the phase and amplitude information of each channel. We consider coherent JT-CoMP, since it generally outperforms the other schemes in system performance [12].

In JT-CoMP, a group of BSs forms a virtual antenna array distributed across multiple cells. In the downlink, two or more geographically separated BSs cooperate to jointly and coordinately transmit to cell-edge users, where the improvement is most needed, and exploit the interference as a useful signal. The signals are superimposed at the user position so as to maximize the desired signal (constructively) and at the same time minimize the ICI (destructively). This requires accurate channel state information (CSI) at the transmitter side. In time division duplex (TDD) transmission, CSI is acquired from the reciprocity of the uplink and downlink channels, while in frequency division duplex (FDD) transmission, users need to feed back the CSI estimated for all serving BSs to the BS with the strongest link. In the centralized approach of FDD JT-CoMP, the CSI of all users is aggregated in the central coordination node (CCN) to calculate the precoding weights for the subsequent downlink transmission. Through the backhaul, the precoding weights and the users’ data are sent from the CCN to the corresponding BSs. Finally, each BS transmits a weighted combination of all users’ data to the users.

The throughput of downlink JT-CoMP heavily relies on the quality of the CSI at the transmitters. Feedback latency and the limited reliability of the feedback channel degrade the system performance [13, 14]. Other impairments, such as imperfect carrier and sampling frequencies among BSs, cause a mismatch between the precoder and the actual channel, which limits the potential gains of JT-CoMP [15, 16]. In practical implementations of JT-CoMP, feedback and backhaul loads are two main challenges that need to be addressed properly. As system performance depends on the CSI quality, CSI has to be fed back at very low latency to prevent it from becoming outdated before it is used for precoding. Therefore, a large amount of CSI is required, which poses a considerable feedback load. Moreover, the users’ data needs to be available at all serving BSs, and the precoding weights have to be sent to the corresponding BSs, which places a heavy burden on the backhaul. Also, increasing the number of cooperating BSs to improve the spectral efficiency increases the backhaul load [17,18,19].

Although CoMP is one of the main solutions to mitigate ICI and has been considered from 4G to B5G, feedback and backhaul loads have been identified as two of the key challenges for its practical implementation and have prevented CoMP from being widely deployed. It is expected that, in the 5G era, a powerful fiber backhaul will be available at least for macro-cells in the CRAN architecture. Such a deployment can inherently provide the low-latency and high-capacity backhaul needed for JT-CoMP, while the connections of small cells might range from fiber to various relaying and IAB approaches, which have lower capacity but might be more cost-effective [3, 5].

It is highly desirable to reduce the feedback and backhaul loads by routing users’ data to a limited number of BSs. This reduction can be done in the medium access control (MAC) layer [20,21,22,23,24] or in the physical (PHY) layer [7, 25,26,27]. Furthermore, caching at the BSs, which operates at the upper layers of the network, has recently been considered to reduce the transmission latency and the total backhaul bandwidth consumption [28]. In this paper, we aim to reduce the feedback and backhaul loads simultaneously in JT-CoMP using PHY layer schemes.

In the PHY layer based schemes, limited feedback and backhaul precoders are designed with respect to sum-rate maximization  [25,26,27] or maximizing the number of users admitted to the network  [7]. In [25], absolute and relative thresholding was proposed for feedback load reduction in the FDD downlink. In absolute thresholding, only CSI with a corresponding signal to noise ratio (SNR) exceeding a predefined threshold is sent back, whereas in relative thresholding, the threshold is set based on the strongest channel. It has been shown that the latter technique provides a good trade-off between sum-rate performance and feedback overhead. An absolute threshold is used for selective feedback in  [26], where for limiting the user data exchange through the backhaul, two schemes including scheduling and precoding techniques are employed. The proposed framework has a good sum-rate performance with limited system overhead. The idea of relative thresholding is followed by  [27], where the precoder is designed using a successive second order cone programming (SSOCP) to maximize the sum-rate for all cell-edge users.
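The two thresholding rules can be illustrated with a minimal sketch. It assumes that the relative threshold is expressed as a margin in dB below the strongest link; the exact rule of [25] is not reproduced here, and the function names and SNR values are hypothetical.

```python
def absolute_threshold(snr_db, threshold_db):
    """Indices of links whose SNR exceeds a fixed threshold (in dB)."""
    return [i for i, s in enumerate(snr_db) if s > threshold_db]


def relative_threshold(snr_db, margin_db):
    """Indices of links within `margin_db` of the strongest link (assumed rule)."""
    strongest = max(snr_db)
    return [i for i, s in enumerate(snr_db) if s >= strongest - margin_db]


# Per-BS SNRs of one user in dB (made-up values for illustration).
snr_db = [18.0, 11.5, 3.0]
fed_back_abs = absolute_threshold(snr_db, 10.0)  # links 0 and 1 exceed 10 dB
fed_back_rel = relative_threshold(snr_db, 5.0)   # only link 0 is within 5 dB of the best
```

With an absolute threshold the number of reported links varies with the user's overall channel quality, whereas a relative threshold always reports at least the strongest link.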

Recently, the 3GPP initiated a standardization activity to employ codebook-based precoding at BSs with an aim to decrease CSI feedback overhead to satisfy the spectral efficiency requirement of future cellular systems. In 3GPP LTE, codebook type I was introduced  [29] and for more accurate CSI feedback to better support the transmission in new radio (NR), codebook type II was introduced in 3GPP Release 15 [30]. This throughput gain comes at the expense of a significant increase in feedback overhead. To this end, Release 16 introduces enhanced CSI feedback by compressing the CSI report in the frequency domain and extending the codebook type II to support MIMO channels with rank larger than two  [31]. These enhancements increase throughput and reduce CSI feedback overhead [32].

In this paper, we design a novel JT-CoMP transmission scheme based on singular value decomposition (SVD) to minimize the sum mean square error (MSE) at boundary users. Since continuous rate adaptation cannot be implemented in practical networks, especially mobile systems, and the rate is selected from a limited discrete set [33,34,35], we adopt the MSE criterion for system performance evaluation. We optimize a binary link selection matrix, in which each element corresponds to the link between an antenna of a BS and an antenna of a user. If an element is one, the corresponding BS antenna serves the user antenna; otherwise, it is not involved in the transmission. We propose a two-layer recursive optimization method. In the inner layer, the SVD of the CSI matrix is utilized to design a precoder fulfilling a sum-power constraint. In the outer layer, the link selection matrix is designed, providing the required feedback and backhaul load reductions and load balancing between BSs. To obtain a further reduction of the feedback load, we consider a CSI codebook based limited feedback strategy, where each user selects a codeword from a CSI codebook and feeds back its index to the serving BS. The CCN collects all the codeword indexes and calculates the precoding matrix. Random vector quantization and uniform quantization are employed for quantizing the channel direction information and the phase ambiguity, respectively.
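The codebook-based feedback of the channel direction can be sketched as follows. This is a generic random vector quantization example, not the authors' implementation: the codebook construction from a shared seed, the function names, and the vector length are assumptions, and the uniform quantization of the phase ambiguity is omitted.

```python
import cmath
import math
import random


def random_unit_vector(n, rng):
    """Draw a random complex vector of length n and normalize it to unit norm."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def make_codebook(n, bits, seed):
    """Codebook of 2**bits random unit vectors (seed assumed known to user and CCN)."""
    rng = random.Random(seed)
    return [random_unit_vector(n, rng) for _ in range(2 ** bits)]


def quantize_direction(h, codebook):
    """Return the index of the codeword with the largest |c^H h|."""
    def corr(c):
        return abs(sum(ci.conjugate() * hi for ci, hi in zip(c, h)))
    return max(range(len(codebook)), key=lambda k: corr(codebook[k]))


codebook = make_codebook(n=6, bits=6, seed=1)
# Channel vector that is a scaled, phase-rotated copy of codeword 5.
h = [3.0 * cmath.exp(0.7j) * c for c in codebook[5]]
index = quantize_direction(h, codebook)  # only this 6-bit index is fed back
```

Because the selection depends only on the magnitude of the correlation, the common phase and the norm of the channel vector are not conveyed by the index, which is why a separate phase quantization step is needed.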

We compare our scheme with two recent works in this area [26, 27], whose design goals are close to ours. Moreover, we adopt two existing precoders, zero forcing (ZF) and Wiener, in our two-layer optimization scheme to impose sparsity constraints on feedback and backhaul. As shown, in terms of MSE our scheme outperforms [26, 27] by at least \(30\%\) and the adopted ZF and Wiener precoders by \(25\%\). The key contributions of this paper are summarized as follows:

  • In the previously presented approaches, just the average backhaul or feedback loads are controlled, while the hardware must be available for the worst-case scenario. In our scheme, the feedback and backhaul loads are strictly constrained.

  • As load balancing has a key role in radio resource optimization, we consider a constraint on the number of users that are served by a specific BS. This association constraint between BSs and users reduces the maximum load of each BS.

  • The proposed scheme is not sensitive to the type of receiver and has the same performance for receivers with different receive filters, while the performance of the adopted ZF and Wiener precoders depends on the type of receive filter. Therefore, an advantage of the proposed scheme is that it performs well with simplified receivers, such as receivers using no receive filter.

  • We employ a CSI codebook based feedback strategy to further reduce the feedback load. In this regard, random vector quantization and uniform quantization are used for quantizing the channel direction and phase information, respectively. It is shown that, with only 6 bits for quantization, a performance close to that of full CSI feedback is attainable. In this technique, users employ different CSI codebooks to independently quantize their CSI.

  • The proposed scheme has good convergence properties. It converges after transmitting 5 subframes, in the worst case.

The remainder of the paper is organized as follows. In Sect. 2, preliminaries including notation and the system model are presented. Section 3 is devoted to the design of the proposed JT-CoMP transmission scheme. In Sect. 4, benchmark schemes are explained, and Sect. 5 numerically evaluates the proposed scheme and its efficiency in comparison with the benchmarks. Conclusions are drawn in Sect. 6.

2 Preliminaries

2.1 Notation

In this paper, scalar variables are denoted by small italic letters, e.g. x, and vector variables by small italic bold letters, e.g. \({\varvec{x}}\). Sets are denoted by calligraphic letters, e.g. \({\mathcal{W}}\). The absolute value of a scalar, or the number of elements of a set or matrix, is denoted by \(\left| {.}\right|\), and the largest integer not exceeding x is denoted by \(\left\lfloor x\right\rfloor\). The Euclidean and Frobenius norms of \({\varvec{x}}\) are denoted by \(\left\| {\varvec{x}}\right\| _{2}\) and \(\left\| {\varvec{x}}\right\| _{F}\), respectively. The transpose, conjugate, and conjugate transpose (Hermitian) of matrix \({{\varvec{H}}}\) are denoted by \({{\varvec{H}}}^{T}\), \({{\varvec{H}}}^{*}\) and \({{\varvec{H}}}^{H}\), respectively. Moreover, \({{\varvec{H}}}(i,:)\) and \({{\varvec{H}}}(:,j)\) denote the i-th row vector and the j-th column vector of \({{\varvec{H}}}\), respectively, while \({{\varvec{H}}}(i,j)\) is the element in the i-th row and the j-th column of \({{\varvec{H}}}\). The vectors \({\varvec{d}}({\varvec{A}})\) and \({\varvec{\lambda}}({\varvec{A}})\) contain the diagonal elements and the eigenvalues of the square matrix \({{\varvec{A}}}\), respectively. The operators \(Tr({\varvec{A}})=\sum _{i}{\varvec{A}}(i,i)\) and \(Tr({\varvec{d}})=\sum _{i}{\varvec{d}}(i)\) denote the trace of a matrix and the sum of the elements of a vector. The maximum eigenvalue of a Hermitian matrix \({{\varvec{A}}}\) is represented by \(\lambda _{\max }({{\varvec{A}}})\). The number of nonzero elements of matrix \({\varvec{A}}\) is denoted by \(nnze({\varvec{A}})\), while the numbers of nonzero rows and columns are denoted by \(nnzr({\varvec{A}})\) and \(nnzc({\varvec{A}})\), respectively. The element-wise (Hadamard) product of \({\varvec{A}}\in {\mathbb{C}}^{M\times N}\) and \({\varvec{B}}\in {\mathbb{C}}^{M\times N}\) is denoted by \(\varvec{C}={\varvec{A}}\cdot {\varvec{B}}\), where \(\varvec{C}(i,j)={\varvec{A}}(i,j){\varvec{B}}(i,j), 1\le i\le M,1\le j\le N\). The identity matrix of size \(N\times N\) is represented by \({\mathbf{I}}_{N}\) and the all-ones matrix of size \(M\times N\) by \({\mathbf{1}}_{M\times N}\). The main variables of the paper are listed in Table 1.

Table 1 List of main variables

2.2 System model and network structure

Figure 1 shows a schematic of the considered network, which includes 3 neighboring cells and 3 users at the common boundary of the cells. In the middle of the cells there is a cluster area, shown in gray in Fig. 1, in which interference would be high without medium division or coordination between cells.

Fig. 1 A simple schematic model for the considered network

In downlink transmission of the JT-CoMP scheme, BSs are coordinated and jointly transmit as a single virtual multi-antenna transmitter with distributed antennas. To this end, data for the users in the cluster center is sent to all BSs via a backhaul link. Each BS transmits a linear combination of users’ data with a proper precoding weight. Precoding weights are calculated in the CCN, based on the CSI of all links between BSs and cluster centered users.

In TDD transmission, CSI can be implicitly estimated at the BSs based on channel reciprocity. In FDD, however, CSI is estimated by the users and sent back to the BSs. All CSI is then sent from the BSs to the CCN to calculate the precoding weights.

Figure 2 shows the downlink of the considered system. In general, there are \({\mathrm{N}}_{b}\) BSs, each with \({\mathrm{N}}_{t}\) antennas, serving \({\mathrm{N}}_{u}\) users, each with \({\mathrm{N}}_{r}\) antennas. Thus, there are in total \({\mathrm{N}}_{B}={\mathrm{N}}_{b}{\mathrm{N}}_{t}\) transmit and \({\mathrm{N}}_{U}={\mathrm{N}}_{u}{\mathrm{N}}_{r}\) receive antennas. The channel of the u-th user, \(1\le u\le {\mathrm{N}}_{u}\), is denoted by \({\varvec{H}}_{u}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times {\mathrm{N}}_{B}}\), in which \({\varvec{H}}_{u}(i,j),\,1\le i\le {\mathrm{N}}_{r},\,1\le j\le {\mathrm{N}}_{B}\) is the channel gain between the i-th antenna of the u-th user and the j-th transmit antenna. This transmit antenna belongs to the b-th BS, where \(b=\left\lfloor (j-1)/{\mathrm{N}}_{t}\right\rfloor +1\).

Fig. 2 Downlink transmission setup in a JT-CoMP system

The aggregated data symbols are denoted by \({\varvec{x}}=\left[ {\varvec{x}}_{1}^{T}\,\cdots \,{\varvec{x}}_{{\mathrm{N}}_{u}}^{T}\right] ^{T}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times 1}\) in which \({\varvec{x}}_{u}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times 1}\) is the data symbol for the u-th user, where \(E\left\{ {\varvec{x}}{\varvec{x}}^{H}\right\} =\sigma _{x}^{2}{\mathbf{I}}_{{\mathrm{N}}_{U}}\). Each BS transmits a linear combination of all precoded data symbols. The precoding matrix corresponding to the u-th user, which belongs to different BSs, is denoted by \({\varvec{W}}_{u}\in {\mathbb{C}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{r}},1\le u\le {\mathrm{N}}_{u}\) and the aggregated precoding matrix is denoted by \({\varvec{W}}=\left[ {\varvec{W}}_{\text {1}}\,\cdots {\varvec{W}}_{{\mathrm{N}}_{u}}\right] \in {\mathbb{C}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}\). Thus, the transmitted signal by all BSs is \({\varvec{W}}_{1}{\varvec{x}}_{1}+\cdots +{\varvec{W}}_{{\mathrm{N}}_{u}}{\varvec{x}}_{{\mathrm{N}}_{u}}\) and the received signal at the u-th user is

$$\begin{aligned} {\varvec{y}}_{u}={\varvec{H}}_{u}({\varvec{W}}_{1}{\varvec{x}}_{1}+\cdots +{\varvec{W}}_{{\mathrm{N}}_{u}}{\varvec{x}}_{{\mathrm{N}}_{u}})+{\varvec{n}}_{u}, \end{aligned}$$

where \({\varvec{n}}_{u}\) is the noise vector, whose elements are zero-mean complex Gaussian random variables with variance \(\sigma _{n}^{2}\). The signal to interference and noise ratio (SINR) at the u-th user is computed as

$$\begin{aligned} {SINR_{u}=\frac{\left\| {\varvec{H}}_{u}{\varvec{W}}_{u}\right\| _{2}^{2}}{\sum _{i=1,i\ne u}^{{\mathrm{N}}_{u}}\left\| {\varvec{H}}_{u}{\varvec{W}}_{i}\right\| _{2}^{2}+\sigma _{n}^{2}}.} \end{aligned}$$

By considering the aggregated channel gain as \({\varvec{H}}=\left[ {\varvec{H}}_{\text {1}}^{T}\,\cdots {\varvec{H}}_{{\mathrm{N}}_{u}}^{T}\right] ^{T}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{B}}\), the aggregated received signal \({\varvec{y}}=\left[ {\varvec{y}}_{1}^{T}\,\cdots \,{\varvec{y}}_{{\mathrm{N}}_{u}}^{T}\right] ^{T}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times 1}\) is computed as

$$\begin{aligned} {\varvec{y}}={\varvec{H}}{\varvec{W}}{\varvec{x}}+{\varvec{n}}, \end{aligned}$$

where \({\varvec{n}}=\left[ {\varvec{n}}_{1}^{T}\,\cdots \,{\varvec{n}}_{{\mathrm{N}}_{u}}^{T}\right] ^{T}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times 1}\) is the noise vector. The receive filter at the u-th user is denoted by \({\varvec{g}}_{u}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times {\mathrm{N}}_{r}}\) and the detected signal at the u-th receiver is \(\widetilde{{\varvec{x}}}_{u}={\varvec{g}}_{u}{\varvec{y}}_{u}\). If the receive filters are aggregated as \(\varvec{G}=diag\left( {\varvec{g}}_{1}\,\cdots \,{\varvec{g}}_{{\mathrm{N}}_{u}}\right) \in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\) , the detected symbols are computed as

$$\begin{aligned} \widetilde{{\varvec{x}}}=\varvec{G}{\varvec{y}}. \end{aligned}$$

This paper aims to minimize the weighted sum MSE at all users which can be calculated as

$$\begin{aligned} \hbox {MSE}=E\left\{ \left\| {\varvec{a}}\left( {\varvec{x}}-\varvec{\alpha }\widetilde{{\varvec{x}}}\right) \right\| _{\text {2}}^{\text {2}}\right\} , \end{aligned}$$

where \({\varvec{a}}=diag(a_{1}{\mathbf{I}}_{{\mathrm{N}}_{r}}\,\cdots \,a_{{\mathrm{N}}_{u}}{\mathbf{I}}_{{\mathrm{N}}_{r}})\in \mathbb {R}_{+}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\), and \(a_{u}\) is the non-negative user weight. To make the MSE calculation meaningful, \(\varvec{\alpha }=diag(\alpha _{1}{\mathbf{I}}_{{\mathrm{N}}_{r}}\,\cdots \,\alpha _{{\mathrm{N}}_{u}}{\mathbf{I}}_{{\mathrm{N}}_{r}})\in \mathbb {R}_{+}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\) is introduced, where \(\alpha _{u}\) is a scalar factor which can be regarded as an automatic gain control (AGC) gain in the receiver [36,37,38,39]. In Sect. 3.4, the traditional receive filters and the AGC scalar factor are stated for the proposed system.

3 Design of a novel JT-CoMP transmission scheme with sparse feedback and constrained backhaul

This section aims to design a novel scheme for downlink transmission in a centralized JT-CoMP system. The goal of the design is to minimize the MSE in (5). As stated before, the drawback of JT-CoMP is its feedback and backhaul loads. It is therefore desirable to design a transmission scheme in which the CSI requirement and the backhaul load are constrained. To this end, we define a binary link selection matrix \({\varvec{S}}\in {\mathcal{B}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}\), where \({\varvec{S}}(i,j), 1\le i\le {\mathrm{N}}_{B},1\le j\le {\mathrm{N}}_{U}\) is 1 when the link between the i-th transmit antenna and the j-th receive antenna is active and 0 when the link is idle. The row-wise sub-matrix of the link selection matrix related to the b-th BS is defined as \({\varvec{S}}^{b}={\varvec{S}}\left( (b-1){\mathrm{N}}_{t}+1:b{\mathrm{N}}_{t},\,:\right) \in {\mathcal{B}}^{{\mathrm{N}}_{t}\times {\mathrm{N}}_{U}}\), and the column-wise sub-matrix related to the u-th user is denoted as \({\varvec{S}}_{u}={\varvec{S}}\left( :, (u-1){\mathrm{N}}_{r}+1:u{\mathrm{N}}_{r}\right) \in {\mathcal{B}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{r}}\). The sparse precoding matrix is defined as

$$\begin{aligned} \hat{{\varvec{W}}}={\varvec{S}}\cdot {\varvec{W}}, \end{aligned}$$

where the backhaul load reduction is proportional to the cardinality of the set \({\mathcal{S}}_{BH}=\left\{ \hat{{\varvec{W}}}(i,j)=0,\,1\le i\le {\mathrm{N}}_{B},1\le j\le {\mathrm{N}}_{U}\right\}\), and is defined as

$$\begin{aligned} r_{bl}\triangleq 1-\frac{nnze({\varvec{S}})}{\left| {\varvec{S}}\right| }. \end{aligned}$$

Similarly, the feedback load reduction is proportional to the number of zeros in the sparse aggregated channel matrix \(\hat{{\varvec{H}}}\), i.e. the cardinality of \({\mathcal{S}}_{FB}=\left\{ \hat{{\varvec{H}}}(i,j)=0,\,1\le i\le {\mathrm{N}}_{U},1\le j\le {\mathrm{N}}_{B}\right\}\). We assume the equivalent feedback of \(\hat{{\varvec{W}}}\) as \(\hat{{\varvec{H}}}={\varvec{H}}\cdot {\varvec{S}}^{T}\) and this results in a feedback load reduction as

$$\begin{aligned} r_{fl}\triangleq 1-\frac{nnze({\varvec{S}}^{T})}{\left| {\varvec{S}}^{T}\right| }. \end{aligned}$$

Note that in our proposed scheme, the selection matrices for feedback and precoding are transposes of each other, so the feedback and backhaul load reduction ratios are the same, i.e. \(r_{fl}=r_{bl}\). Moreover, \(nnzc({\varvec{S}}^{b})\) is the number of users served by the b-th BS. The load of the BSs can be balanced by imposing the following constraint [40]

$$\begin{aligned} \max _{1\le b\le {\mathrm{N}}_{b}}\left\{ nnzc({\varvec{S}}^{b})\right\} -\min _{1\le b\le {\mathrm{N}}_{b}}\left\{ nnzc({\varvec{S}}^{b})\right\} \le 1. \end{aligned}$$
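The load reduction ratio and the balancing constraint can be checked numerically for a given link selection matrix. The following is a minimal sketch in which the helper names mirror the notation \(nnze\) and \(nnzc\), the matrix values are made up, and single-antenna users (\({\mathrm{N}}_{r}=1\)) are assumed so that each column of \({\varvec{S}}\) corresponds to one user.

```python
def nnze(S):
    """Number of nonzero elements of S."""
    return sum(1 for row in S for v in row if v != 0)


def nnzc(M):
    """Number of columns of M that contain at least one nonzero element."""
    return sum(1 for col in zip(*M) if any(v != 0 for v in col))


def load_reduction(S):
    """r_bl = r_fl = 1 - nnze(S) / |S|."""
    return 1 - nnze(S) / (len(S) * len(S[0]))


def bs_block(S, b, n_t):
    """Rows of S that belong to the b-th BS (b counted from 1)."""
    return S[(b - 1) * n_t: b * n_t]


def is_balanced(S, n_b, n_t):
    """Check that the per-BS user counts differ by at most one."""
    counts = [nnzc(bs_block(S, b, n_t)) for b in range(1, n_b + 1)]
    return max(counts) - min(counts) <= 1


# 3 BSs with 2 antennas each (rows), 3 single-antenna users (columns).
S = [[1, 1, 0],
     [1, 0, 0],
     [0, 1, 1],
     [0, 1, 0],
     [1, 0, 1],
     [0, 0, 1]]
ratio = load_reduction(S)               # 9 of 18 links are idle
balanced = is_balanced(S, n_b=3, n_t=2)  # every BS serves two users
```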

In the following, a general optimization problem to find \({\varvec{W}}\) and \({\varvec{S}}\) is set up and solved. The transmission scheme is briefly shown in Fig. 3. At the beginning of the transmission, \({\varvec{S}}={\mathbf{1}}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}\), i.e. we start with the non-sparse case. Substituting (3) and (4) in (5) and considering \(\sigma _{x}^{2}=1\) and \(E\left\{ {\varvec{n}}\right\} =0\), the MSE in (5) is computed as

$$\begin{aligned} MSE= & {} E\left\{ \left\| {\varvec{a}}\left( {\varvec{x}}-\varvec{\alpha }\varvec{G}\left( {\varvec{H}}\hat{\varvec{W}}{\varvec{x}}+{\varvec{n}}\right) \right) \right\| _{2}^{2}\right\} =E\left\{ \left\| {\varvec{a}}\left( \varvec{I}-\varvec{\alpha }\varvec{G}{\varvec{H}}\hat{\varvec{W}}\right) {\varvec{x}}-{\varvec{a}}\varvec{\alpha }\varvec{G}{\varvec{n}}\right\| _{2}^{2}\right\} \nonumber \\= & {} \left\| {\varvec{a}}\left( \varvec{\alpha }\varvec{G}{\varvec{H}}\hat{\varvec{W}}-\varvec{I}\right) \right\| _{F}^{2}+\sigma _{n}^{2}\left\| {\varvec{a}}\varvec{\alpha }\varvec{G}\right\| _{F}^{2}, \end{aligned}$$

where \(\sigma _{n}^{2}\) is the noise variance. The detailed steps of the MSE computation are described in “Appendix 1”. The goal is to minimize the MSE provided that the total transmission power is constrained to \(P_{t}\), the feedback and backhaul loads are constrained, and the load is balanced between BSs. Thus, the optimization problem is set up as follows

$$\begin{aligned} \min _{{\varvec{S}},\,{\varvec{W}}}\underbrace{\left\| {\varvec{a}}\left( \varvec{\alpha }\varvec{G}{\varvec{H}}({\varvec{S}}\cdot {\varvec{W}})-{\mathbf{I}}\right) \right\| _{F}^{2}}_{M_{1}}+\underbrace{\sigma _{n}^{2}\left\| {\varvec{a}}\varvec{\alpha }\varvec{G}\right\| _{F}^{2}}_{M_{2}} \end{aligned}$$

subject to:

$$\begin{aligned}&C(1)\,:\,Tr\left( \left( {\varvec{S}}\cdot {\varvec{W}}\right) ^{H}\left( {\varvec{S}}\cdot {\varvec{W}}\right) \right) \le P_{t} \end{aligned}$$
$$\begin{aligned}&C(2)\,:\,nnze({{\varvec{S}}})\le {\mathrm{Q}}\in \left\{ 1,2,\ldots ,{\mathrm{N}}_{B}{\mathrm{N}}_{U}\right\} \end{aligned}$$
$$\begin{aligned}&C(3)\,:\,\max _{1\le b\le {\mathrm{N}}_{b}}\left\{ nnzc({\varvec{S}}^{b})\right\} -\min _{1\le b\le {\mathrm{N}}_{b}}\left\{ nnzc({\varvec{S}}^{b})\right\} \le 1 \end{aligned}$$
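The objective of problem (11) and the left-hand side of constraint C(1) can be evaluated directly for given matrices. The sketch below uses plain nested lists and hypothetical helper names; the 2×2 identity example is chosen only because its result is easy to verify by hand.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def hadamard(A, B):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def frob2(A):
    """Squared Frobenius norm."""
    return sum(abs(v) ** 2 for row in A for v in row)


def objective(a, alpha, G, H, S, W, sigma_n2):
    """M1 + M2 of problem (11) for a given S and W."""
    W_hat = hadamard(S, W)
    aaG = matmul(matmul(a, alpha), G)        # a * alpha * G
    E = matmul(aaG, matmul(H, W_hat))        # a * alpha * G * H * W_hat
    M1 = frob2([[e - x for e, x in zip(re, ra)] for re, ra in zip(E, a)])
    M2 = sigma_n2 * frob2(aaG)
    return M1 + M2


def tx_power(S, W):
    """Left-hand side of C(1): Tr(W_hat^H W_hat) = ||S . W||_F^2."""
    return frob2(hadamard(S, W))


I2 = [[1.0, 0.0], [0.0, 1.0]]
full = [[1, 1], [1, 1]]
sparse = [[1, 0], [0, 0]]
# With H = W = G = alpha = a = I, only the noise term remains when all links are active.
mse_full = objective(I2, I2, I2, I2, full, I2, 0.1)
# Switching off the second user's links adds an error of 1 for that user.
mse_sparse = objective(I2, I2, I2, I2, sparse, I2, 0.1)
```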

Fig. 3 Representation of data flow in the proposed JT-CoMP system with sparse feedback and constrained backhaul

Remark 1

Note that \({\varvec{S}}\) and \({\varvec{W}}\) are calculated at the CCN, which is not necessarily aware of the receiver and its filter, thus the CCN can only manage to minimize the part \(M_{1}\) in (11).

Remark 2

Users estimate the channel \({\varvec{H}}\), obtain \({\varvec{S}}\) from the received data, and feed back a composite and sparse version of the CSI, which also contains the information about the receive filter. That is, \({\varvec{H}}_{f}=\left( \varvec{\alpha }\varvec{G}{\varvec{H}}\right) \cdot {\varvec{S}}^{T}\) is sent back and aggregated in the CCN. The CCN receives the sparse composite CSI and estimates the full composite CSI as

$$\begin{aligned} \tilde{{\varvec{H}}}={\varvec{H}}_{f}+(\varvec{1}_{{\mathrm{N}}_{U}\times {\mathrm{N}}_{B}}-{\varvec{S}}^{T})\cdot \varvec{{\mathcal{H}}}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{B}}, \end{aligned}$$

where \(\varvec{{\mathcal{H}}}\) may be an old version of \({\varvec{H}}_{f}\) or long term channel statistics (e.g. received signal strength indicator (RSSI)).
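The sparsification at the users and the reconstruction at the CCN described in Remark 2 can be sketched as follows. The matrix values and function names are made up for illustration, and the stale matrix stands for whichever older or long-term information the CCN holds.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def sparsify(H_comp, S):
    """H_f = H_comp . S^T: keep only the entries of active links."""
    S_T = transpose(S)
    return [[h * s for h, s in zip(rh, rs)] for rh, rs in zip(H_comp, S_T)]


def reconstruct(H_f, S, H_stale):
    """H_tilde = H_f + (1 - S^T) . H_stale: fill idle links with stale values."""
    S_T = transpose(S)
    return [[H_f[i][j] + (1 - S_T[i][j]) * H_stale[i][j]
             for j in range(len(H_f[0]))] for i in range(len(H_f))]


# N_B = 3 transmit antennas (rows of S), N_U = 2 receive antennas (columns of S).
S = [[1, 0],
     [1, 1],
     [0, 1]]
H_comp = [[1 + 1j, 2, 3],          # current composite channel (N_U x N_B)
          [4, 5 - 1j, 6]]
H_stale = [[0.9 + 1j, 2.1, 2.5],   # older knowledge available at the CCN
           [3.5, 5, 6.2]]

H_f = sparsify(H_comp, S)                  # what the users feed back
H_tilde = reconstruct(H_f, S, H_stale)     # what the CCN uses for precoding
```

Entries of active links are taken from the fresh feedback, while entries of idle links fall back to the stale values, so the quality of \(\tilde{{\varvec{H}}}\) on idle links depends on how quickly the channel changes.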

Remark 3

Problem (11) is too complicated to be solved directly, particularly because of its binary variables. Thus, we seek a sub-optimal solution by converting it into a two-layer optimization procedure. In the inner layer, \({\varvec{W}}\) is calculated considering a fixed \({\varvec{S}}\) and subject to constraint C(1). In the outer layer, \({\varvec{S}}\) is found subject to constraints C(2) and C(3), i.e. the problem is converted to

$$\begin{aligned} \min _{{\varvec{S}}}\left[ \min _{{\varvec{W}}}\left\| {\varvec{a}}\left( \varvec{\alpha }\varvec{G}{\varvec{H}}({\varvec{S}}\cdot {\varvec{W}})-{\mathbf{I}}\right) \right\| _{F}^{2}+\sigma _{n}^{2}\left\| {\varvec{a}}\varvec{\alpha }\varvec{G}\right\| _{F}^{2}\ \hbox {s.t.}\ C(1)\right] \ \hbox {s.t.}\ C(2)\ \hbox {and}\ C(3). \end{aligned}$$
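The nested structure of this problem can be illustrated with a generic skeleton. It is not the outer-layer algorithm of this paper: it enumerates every feasible \({\varvec{S}}\) exhaustively, which is only practical for very small systems, and the inner layer is passed in as a callable. The toy inner cost and the gain values below are hypothetical placeholders.

```python
from itertools import combinations


def users_per_bs(S, n_b, n_t):
    """nnzc(S^b) for every BS: columns with an active link in that BS's rows."""
    counts = []
    for b in range(n_b):
        block = S[b * n_t:(b + 1) * n_t]
        counts.append(sum(1 for col in zip(*block) if any(col)))
    return counts


def outer_layer(n_b, n_t, n_U, Q, inner_cost):
    """Search S under C(2) and C(3); inner_cost(S) stands in for the inner layer."""
    n_B = n_b * n_t
    links = [(i, j) for i in range(n_B) for j in range(n_U)]
    best_S, best_cost = None, float("inf")
    for q in range(1, Q + 1):                    # C(2): at most Q active links
        for active in combinations(links, q):
            S = [[0] * n_U for _ in range(n_B)]
            for i, j in active:
                S[i][j] = 1
            counts = users_per_bs(S, n_b, n_t)
            if max(counts) - min(counts) > 1:    # C(3): balanced load
                continue
            cost = inner_cost(S)
            if cost < best_cost:
                best_S, best_cost = S, cost
    return best_S, best_cost


# Placeholder inner layer: total gain of the links that are switched off.
gain = [[4.0, 1.0], [0.5, 3.0]]  # made-up |H(j, i)|^2 for user j and antenna i


def toy_inner_cost(S):
    return sum(gain[j][i] for i in range(2) for j in range(2) if S[i][j] == 0)


S_best, cost_best = outer_layer(n_b=2, n_t=1, n_U=2, Q=2, inner_cost=toy_inner_cost)
```

In this toy case the search keeps the strongest link of each BS, because activating both links of one BS would violate the balancing constraint.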

In the following, we first explain the two-layer optimization scheme and then discuss considerations regarding traditional receive filters.

3.1 Inner layer optimization: precoder design

In this section, we aim to design a precoder that minimizes the part \(M_{1}\) in (11), is robust to different types of receive filter, and also performs well with a receiver that has no filter. We design the precoder weights based on the composite channel gains, \(\tilde{{\varvec{H}}}\). If \({\mathcal{W}}=\left\{ \hat{{\varvec{W}}}|Tr\left( \hat{{\varvec{W}}}^{H}\hat{{\varvec{W}}}\right) \le P_{t}\right\}\) is the set of all weights satisfying the total power constraint C(1), then, for fixed \({\varvec{S}}\) and \({\varvec{a}}={\mathbf{I}}\), problem (11) can be written as

$$\begin{aligned} \min _{\hat{{\varvec{W}}}\in {\mathcal{W}}}\left\| \tilde{{\varvec{H}}}\hat{\varvec{W}}-{\mathbf{I}}\right\| _{F}^{2}. \end{aligned}$$

By substituting \(\hat{{\varvec{W}}}={\varvec{S}}\cdot {\varvec{W}}={\varvec{W}}-\left( \varvec{1}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}-{\varvec{S}}\right) \cdot {\varvec{W}}\) in (17) and applying the triangle inequality, an upper bound for the objective function is acquired as

$$\begin{aligned} \left\| \tilde{{\varvec{H}}}\hat{\varvec{W}}-{\mathbf{I}}\right\| _{F}^{2}= & {} \left\| \tilde{{\varvec{H}}}{\varvec{W}}-{\mathbf{I}}-\tilde{{\varvec{H}}}\left( \left( \varvec{1}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}-{\varvec{S}}\right) \cdot {\varvec{W}}\right) \right\| _{F}^{2}\nonumber \\\le & {} \underbrace{\left\| \tilde{{\varvec{H}}}{\varvec{W}}-{\mathbf{I}}\right\| _{F}^{2}}_{F1}+\underbrace{\left\| \tilde{{\varvec{H}}}\left( \left( \varvec{1}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}-{\varvec{S}}\right) \cdot {\varvec{W}}\right) \right\| _{F}^{2}}_{F2}. \end{aligned}$$

Now, instead of optimizing the objective function in (17), we optimize its upper bound. First, we consider the term F1 of (18) and adapt Theorem 2 of [41] to minimize it as follows

$$\begin{aligned} \min _{{\varvec{S}}\cdot {\varvec{W}}\in {\mathcal{W}}}\left\| \tilde{{\varvec{H}}}{\varvec{W}}-{\mathbf{I}}\right\| _{F}^{2}. \end{aligned}$$

Theorem 1

Let \({\mathcal{W}}\) denote a nonempty convex set. Then \({{\varvec{U}}}_{W}={\varvec{V}}_{\tilde{H}}\) and \({{\varvec{V}}}_{W}={{\varvec{U}}}_{\tilde{H}}\) are optimal for the problem (19), where \({\varvec{U}}_{\tilde{H}}\), \({\varvec{V}}_{\tilde{H}}\) are unitary matrices from the SVD of \(\tilde{{\varvec{H}}}\) and \({\varvec{U}}_{W}\), \({\varvec{V}}_{W}\) are obtained from the SVD of \({\varvec{W}}\).

The proof of Theorem 1 is similar to the proof of Theorem 2 in  [41]. However, in  [41], the scalar factor in MSE computation is omitted and the Theorem is proved for perfect and statistical CSI.

Consider the SVD of the composite channel as \(\tilde{{\varvec{H}}}={{\varvec{U}}}_{\tilde{H}}{\varvec{\Sigma }}_{\tilde{H}}{{\varvec{V}}}_{\tilde{H}}^{H}\) with \({\varvec{\Sigma }}_{\tilde{H}}=\left[ {\varvec{\Lambda }}_{\tilde{H}}\,{\mathbf{0}}\right] \in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{B}}\) and \({\varvec{\Lambda }}_{\tilde{H}}=diag(\lambda _{\tilde{H}}(1),\ldots ,\lambda _{\tilde{H}}({\mathrm{N}}_{U}))\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\) containing the singular values of the composite channel in decreasing order. Unitary matrices \({\varvec{U}}_{\tilde{H}}\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\) and \({\varvec{V}}_{\tilde{H}}\in {\mathbb{C}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{B}}\) are scaling and rotation matrices such that \({\varvec{U}}_{\tilde{H}}{\varvec{U}}_{\tilde{H}}^{H}= {{\mathbf{I}}_{{\mathrm{N}}_{U}}}\) and \({\varvec{V}}_{\tilde{H}}{\varvec{V}}_{\tilde{H}}^{H}= {{\mathbf{I}}_{{\mathrm{N}}_{B}}}\). Denote the SVD of \({\varvec{W}}={{\varvec{U}}}_{W}{\varvec{\Sigma }}_{W}{{\varvec{V}}}_{W}^{H}\) with \({\varvec{\Sigma }}_{W}=\left[ {\varvec{\Lambda }}_{W}\,{\mathbf{0}}\right] ^{T}\in {\mathbb{C}}^{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}\) and \({\varvec{\Lambda }}_{W}=diag(\lambda _{W}(1),\ldots ,\lambda _{W}({\mathrm{N}}_{U}))\in {\mathbb{C}}^{{\mathrm{N}}_{U}\times {\mathrm{N}}_{U}}\) containing the singular values of weights. Based on Theorem 1, the left and right singular vectors for the optimal precoder are equal to the right and left singular vectors of the composite channel which simplifies the norm in problem (19) as

$$\begin{aligned} \left\| \tilde{{\varvec{H}}}{\varvec{W}}-{\mathbf{I}}\right\| _{F}^{2}=\left\| {\varvec{U}}_{\tilde{H}}{\varvec{\Lambda }}_{\tilde{H}}{\varvec{\Lambda }}_{W}{\varvec{U}}_{\tilde{H}}^{H}-{\mathbf{I}}\right\| _{F}^{2}. \end{aligned}$$

Since the Frobenius norm is invariant with respect to unitary transformation [42], it is equivalent to

$$\begin{aligned} \left\| {\varvec{U}}_{\tilde{H}}{\varvec{\Lambda }}_{\tilde{H}}{\varvec{\Lambda }}_{W}{\varvec{U}}_{\tilde{H}}^{H}-{\mathbf{I}}\right\| _{F}^{2}=\left\| {\varvec{\Lambda }}_{\tilde{H}}{\varvec{\Lambda }}_{W}-{\mathbf{I}}\right\| _{F}^{2}=\left\| {\varvec{\lambda}}_{\tilde{H}}\cdot {\varvec{\lambda}}_{W}-{\mathbf{1}}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}, \end{aligned}$$

where \({\varvec{\lambda}}_{\tilde{H}}=\left[ \lambda _{\tilde{H}}(1),\ldots ,\lambda _{\tilde{H}}({\mathrm{N}}_{U})\right] \in {\mathbb{C}}^{{\mathrm{N}}_{U}\times 1}\) and \({{\varvec{\lambda}}}_{W}=\left[ \lambda _{W}(1),\ldots ,\lambda _{W}({\mathrm{N}}_{U})\right] \in {\mathbb{C}}^{{\mathrm{N}}_{U}\times 1}\). Therefore, the section F1 is simplified to

$$\begin{aligned} \min _{{\varvec{\lambda}}_{W}}\left\| {\varvec{\lambda}}_{\tilde{H}}\cdot {\varvec{\lambda}}_{W}-{\mathbf{1}}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}. \end{aligned}$$
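As a numerical sanity check of this simplification, the following NumPy sketch builds a precoder whose singular vectors are aligned with those of a random channel and compares the Frobenius-norm objective with its vector form. The matrix sizes, the random channel, and all variable names are illustrative assumptions and are not taken from the simulation setup of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_b = 3, 4  # assumed sizes: N_U receive and N_B transmit antennas

# Random stand-in for the composite channel (N_U x N_B).
H = rng.standard_normal((n_u, n_b)) + 1j * rng.standard_normal((n_u, n_b))
U_h, lam_h, Vh_h = np.linalg.svd(H)   # H = U_h @ [diag(lam_h) 0] @ Vh_h
V_h = Vh_h.conj().T

# Arbitrary non-negative singular values for the precoder.
lam_w = rng.uniform(0.1, 1.0, size=n_u)

# Aligned precoder: U_W = V_h and V_W = U_h, so W = V_h [diag(lam_w); 0] U_h^H.
Sigma_w = np.zeros((n_b, n_u))
Sigma_w[:n_u, :n_u] = np.diag(lam_w)
W = V_h @ Sigma_w @ U_h.conj().T

lhs = np.linalg.norm(H @ W - np.eye(n_u), "fro") ** 2
rhs = np.sum((lam_h * lam_w - 1.0) ** 2)
print(lhs, rhs)  # the two values agree up to rounding
```

The check only covers the F1 term; the power constraint and the idle-link term are not modeled in this sketch.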

Second, we try to minimize the section F2 of (18). In this regard, an auxiliary vector \(\varvec{\Phi }_{i,j}\), which is constructed by stacking the columns of \({\varvec{V}}_{\tilde{H}}.{\varvec{U}}_{\tilde{H}}^{*}\) one below the other, is defined, and the elements of \({\varvec{W}}\) which are corresponding to idle links (zero elements of \({\varvec{S}}\) ) are denoted as \(\varvec{Z}=\left( \varvec{1}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}-{\varvec{S}}\right) .{\varvec{W}}\), where

$$\begin{aligned} \varvec{Z}(i,j)= & {} {\left\{ \begin{array}{ll} \varvec{\Phi }_{i,j}{\varvec{\lambda}}_{W} &{} {\varvec{S}}(i,j)=0,\ 1\le i\le {\mathrm{N}}_{B}, 1\le j\le {\mathrm{N}}_{U}\\ 0 &{} \hbox {otherwise} \end{array}\right. }, \end{aligned}$$
$$\begin{aligned} \varvec{\Phi }_{i,j}= & {} \left[ V_{\tilde{H}}(i,1),\ldots ,V_{\tilde{H}}(i,{\mathrm{N}}_{U})\right] .\left[ U_{\tilde{H}}^{*}(j,1),\ldots ,U_{\tilde{H}}^{*}(j,{\mathrm{N}}_{U})\right] \in {\mathbb{C}}^{1\times {\mathrm{N}}_{U}}. \end{aligned}$$
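The role of \(\varvec{\Phi }_{i,j}\) can be illustrated numerically: once the singular vectors of the precoder are aligned with those of the channel, every entry of \({\varvec{W}}\) is a linear function of \({\varvec{\lambda}}_{W}\). The NumPy sketch below checks this for a random channel. The sizes, the choice of the idle link, and the helper name `phi` are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_u, n_b = 3, 4  # assumed sizes for illustration

H = rng.standard_normal((n_u, n_b)) + 1j * rng.standard_normal((n_u, n_b))
U_h, _, Vh_h = np.linalg.svd(H)
V_h = Vh_h.conj().T
lam_w = rng.uniform(0.1, 1.0, size=n_u)

# Precoder with aligned singular vectors: W = V_h[:, :N_U] diag(lam_w) U_h^H.
W = V_h[:, :n_u] @ np.diag(lam_w) @ U_h.conj().T


def phi(i: int, j: int) -> np.ndarray:
    """Row vector Phi_{i,j}: element-wise product of row i of V_h and row j of conj(U_h)."""
    return V_h[i, :n_u] * U_h[j, :].conj()


# Every entry of W equals Phi_{i,j} @ lam_w.
W_from_phi = np.array([[phi(i, j) @ lam_w for j in range(n_u)] for i in range(n_b)])

# Entries on idle links (S(i, j) = 0) form Z; one link is set idle as an example.
S = np.ones((n_b, n_u))
S[0, 1] = 0
Z = (1 - S) * W
```

Because the columns of \({\varvec{V}}_{\tilde{H}}\) and \({\varvec{U}}_{\tilde{H}}\) are orthonormal, the same construction also gives \(\left\| {\varvec{W}}\right\| _{F}^{2}=\sum _{u}\lambda _{W}^{2}(u)\), which is the first sum in the power constraint.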

Using these definitions, section F2 of (18) is rewritten as

$$\begin{aligned} \min _{{\varvec{\lambda}}_{W}}\left\| \tilde{{\varvec{H}}}\varvec{Z}\right\| _{F}^{2}. \end{aligned}$$

To express the power constraint, \(Tr\left( \hat{{\varvec{W}}}^{H}\hat{{\varvec{W}}}\right)\) is computed as

$$\begin{aligned} Tr\left( \hat{{\varvec{W}}}^{H}\hat{{\varvec{W}}}\right) =\left\| \hat{{\varvec{W}}}\right\| _{F}^{2}=\left\| {\varvec{W}}-\varvec{Z}\right\| _{F}^{2}\le P_{t}. \end{aligned}$$

As \(\hat{{\varvec{W}}}\) and \(\varvec{Z}\) occupy disjoint elements of \({\varvec{W}}\), the total transmission power \(\left\| \hat{{\varvec{W}}}\right\| _{F}^{2}\) is obtained from

$$\begin{aligned} \left\| {\varvec{W}}\right\| _{F}^{2}= & {} \left\| {\varvec{S}}.{\varvec{W}}+\left( \varvec{1}-{\varvec{S}}\right) .{\varvec{W}}\right\| _{F}^{2}=\left\| \hat{{\varvec{W}}}+\varvec{Z}\right\| _{F}^{2}=\left\| \hat{{\varvec{W}}}\right\| _{F}^{2}+\left\| \varvec{Z}\right\| _{F}^{2}\Rightarrow \nonumber \\ \left\| \hat{{\varvec{W}}}\right\| _{F}^{2}= & {} \left\| {\varvec{W}}\right\| _{F}^{2}-\left\| \varvec{Z}\right\| _{F}^{2}. \end{aligned}$$

By considering (23) and (26), the total power constraint C(1) can be rewritten as

$$\begin{aligned} Tr\left( \hat{{\varvec{W}}}^{H}\hat{{\varvec{W}}}\right) =\left\| \hat{{\varvec{W}}}\right\| _{F}^{2}=\left\| {\varvec{W}}\right\| _{F}^{2}-\left\| \varvec{Z}\right\| _{F}^{2}=\sum _{u=1}^{{\mathrm{N}}_{U}}\lambda _{W}^{2}(u)-\sum _{i,j|{\varvec{S}}(i,j)=0}\left| \varvec{\Phi }_{i,j}{\varvec{\lambda}}_{W}\right| ^{2}\le P_{t}. \end{aligned}$$

Finally, by adding (25) to problem (22), the sparse precoding design problem is summarized as

$$\begin{aligned} \min _{{\varvec{\lambda}}_{W}}\left\| {\varvec{\lambda}}_{\tilde{H}}\cdot{\varvec{\lambda}}_{W}-{\mathbf{1}}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}+\left\| \tilde{{\varvec{H}}}\varvec{Z}\right\| _{F}^{2} \end{aligned}$$

subject to:

$$\begin{aligned} \sum _{u=1}^{{\mathrm{N}}_{U}}\lambda _{W}^{2}(u)-\sum _{i,j|{\varvec{S}}(i,j)=0}\left| \varvec{\Phi }_{i,j}{\varvec{\lambda}}_{W}\right| ^{2}\le P_{t}. \end{aligned}$$

In order to solve the above problem, the sequential quadratic programming (SQP) optimization method [43, 44] can be used. In this method, the search direction denoted as \({\varvec{d}}\) is obtained by solving the following sub-problem

$$\begin{aligned} \min _{{\varvec{d}}}\nabla f\left( {\varvec{\lambda}}_{W}\right) ^{T}{\varvec{d}}+\frac{1}{2}{\varvec{d}}^{T}\varvec{\phi }{\varvec{d}} \end{aligned}$$

subject to:

$$\begin{aligned} g\left( {\varvec{\lambda}}_{W}\right) +\nabla g\left( {\varvec{\lambda}}_{W}\right) ^{T}{\varvec{d}}\le 0, \end{aligned}$$

where \(f\left( {\varvec{\lambda}}_{W}\right)\) and \(g\left( {\varvec{\lambda}}_{W}\right)\) are the objective and constraint functions and \(\varvec{\phi }\) is a symmetric positive definite matrix. Starting from a point \({\varvec{\lambda}}_{W}\), the new point is generated as \(\overline{{\varvec{\lambda}}_{W}}={\varvec{\lambda}}_{W}+\varepsilon {\varvec{d}}\), where \(\varepsilon\) is a non-negative scalar step size.

3.2 Outer layer optimization: selection matrix design

Finding the optimal solution of the outer layer of the problem (11) might be computationally cumbersome, as it involves Boolean constraints. One naive way to find \({\varvec{S}}\) is to search exhaustively among all possible combinations of the selection matrix for the one that gives the best MSE. Although the exhaustive search might be the only mechanism for a truly optimum selection of users to be served by each BS under the load balancing constraint, its computational complexity grows quickly and it becomes impractical.

An alternative is to use convex optimization techniques. Since the elements of \({\varvec{S}}\) are binary variables, the optimization problem is NP-hard. To solve it, one can use the concept of linear programming relaxation  [45], where the constraint that each element must be a binary variable is relaxed to the weaker constraint that each element is a real number in the interval \(\left[ 0,1\right]\). The performance of convex optimization with relaxation is very close to the optimal one based on exhaustive search. Although its complexity is not as prohibitive as that of the exhaustive search, it is still high, being approximately of the order \({\mathcal{O}}\left( n^{3}\right)\), where \({\mathcal{O}}\) denotes the complexity order and \(n={\mathrm{N}}_{B}{\mathrm{N}}_{U}\) [46].

To deal with this issue, a Greedy algorithm that attempts to approximate the optimal solution can be implemented. Greedy algorithms have been widely applied in wireless communication, particularly in scheduling for CoMP systems, where the objective is to select the set of users that maximizes a certain metric function [47]. The Greedy algorithm, which is guaranteed to converge, has two main advantages. First, it allows a considerable reduction in complexity, requiring roughly \({\mathcal{O}}\left( n^{2}\right)\) operations. Second, it can be applied to a wide range of metrics of interest  [48]. We use the following Greedy algorithm to find a local optimum for our problem

$$\begin{aligned} \min _{{\varvec{S}}}f(\tilde{{\varvec{H}}},{\varvec{S}}) \end{aligned}$$

subject to C(2) and C(3), where the metric function \(f(\tilde{{\varvec{H}}},{\varvec{S}})\) is defined as

$$\begin{aligned} f(\tilde{{\varvec{H}}},{\varvec{S}})=\min _{{\varvec{\lambda}}_{W}}\left\| {\varvec{\lambda}}_{\tilde{H}}\cdot {\varvec{\lambda}}_{W}-{\mathbf{1}}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}+\left\| \tilde{{\varvec{H}}}\varvec{Z}\right\| _{F}^{2},\ \hbox {subject to}\ (30). \end{aligned}$$

The principle of the proposed Greedy algorithm is as follows. We start with an initial selection matrix obtained by randomly selecting \(Q=\left\lfloor {\mathrm{N}}_{B}{\mathrm{N}}_{U}(1-r_{fl})\right\rfloor\) elements of \({\varvec{S}}\) to be one and the remaining to be zero, provided that the conditions C(2) and C(3) are fulfilled. Then, we select the first element among the zero-value elements and search for a one-value element whose exchange with the selected zero-value element reduces the metric function. When such elements exist, \({\varvec{S}}\) is updated by exchanging the zero-value element with the one-value element that yields the largest reduction in the metric function. This process is repeated for the other zero-value elements, and therefore the selection matrix is designed in a way that the MSE is minimized. The algorithm is summarized in Algorithm 1.

figure a
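The swap procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not a transcription of Algorithm 1: the callables `metric` (standing in for \(f(\tilde{{\varvec{H}}},{\varvec{S}})\)) and `feasible` (standing in for the constraints C(2) and C(3)) are hypothetical placeholders supplied by the caller, and the cost table in the usage part is invented for the example.

```python
import math
import random
from typing import Callable, List, Tuple

Matrix = List[List[int]]


def greedy_selection(
    n_b: int,
    n_u: int,
    r_fl: float,
    metric: Callable[[Matrix], float],
    feasible: Callable[[Matrix], bool],
    seed: int = 0,
    max_tries: int = 1000,
) -> Tuple[Matrix, float]:
    """One pass of zero/one swaps that never increases the metric."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n_b) for j in range(n_u)]
    q = math.floor(n_b * n_u * (1 - r_fl))  # number of active links

    # Random feasible starting point with exactly q ones.
    for _ in range(max_tries):
        active = set(rng.sample(cells, q))
        s = [[int((i, j) in active) for j in range(n_u)] for i in range(n_b)]
        if feasible(s):
            break
    else:
        raise ValueError("no feasible starting matrix found")

    best = metric(s)
    zeros = [(i, j) for i, j in cells if s[i][j] == 0]
    for zi, zj in zeros:
        best_swap = None
        ones = [(i, j) for i, j in cells if s[i][j] == 1]
        for oi, oj in ones:
            s[zi][zj], s[oi][oj] = 1, 0  # trial swap
            if feasible(s):
                value = metric(s)
                if value < best:
                    best, best_swap = value, (oi, oj)
            s[zi][zj], s[oi][oj] = 0, 1  # undo the trial
        if best_swap is not None:  # keep the swap with the largest reduction
            oi, oj = best_swap
            s[zi][zj], s[oi][oj] = 1, 0
    return s, best


# Toy usage: the metric is the total (made-up) cost of the active links, and
# feasibility requires every user (column) to keep at least one active link.
cost = [[1.0, 5.0, 9.0], [4.0, 2.0, 8.0], [7.0, 6.0, 3.0]]
toy_metric = lambda s: sum(cost[i][j] * s[i][j] for i in range(3) for j in range(3))
toy_feasible = lambda s: all(any(s[i][j] for i in range(3)) for j in range(3))
S, value = greedy_selection(3, 3, 0.33, toy_metric, toy_feasible)
```

In the actual scheme each call to the metric would require solving the inner problem, which is why the number of metric evaluations dominates the cost of the outer layer.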

The computational complexity of the pseudo code in Algorithm 1 is computed in “Appendix 2”. The overall complexity of Algorithm 1 is \({\mathcal{O}}\left( C{\mathrm{N}}_{B}^{2}{\mathrm{N}}_{U}^{3}\right)\), where C is a constant. Thus, the complexity is a function of the number of coordinated BSs, the number of users, and the number of transmit and receive antennas. It increases with the square of the number of BSs and transmit antennas and with the third power of the number of users and receive antennas.

3.3 Limited feedback using CSI codebook

For a further reduction of the feedback load, we consider a CSI codebook based limited feedback strategy, where each user selects a codeword from a pre-designed CSI codebook and feeds back its index to the serving BS [7, 49]. The CCN collects all the codeword indexes sent from different BSs and calculates the precoding matrix. Different CSI codebooks are employed by the users, so they can independently quantize their per-BS channel direction information (CDI). The per-BS CDI for the u-th user can be expressed as \(\tilde{{\varvec{h}}}_{u,b}={\varvec{h}}_{u,b}/\left\| {\varvec{h}}_{u,b}\right\|\), where \({\varvec{h}}_{u,b}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times {\mathrm{N}}_{t}}\) is the CSI of the links spanning from the b-th BS to the u-th user, i.e. \({\varvec{h}}_{u,b}(i,j)={\varvec{H}}_{u}(i,j),\,1\le i\le {\mathrm{N}}_{r},\,(b-1){\mathrm{N}}_{t}+1\le j\le b{\mathrm{N}}_{t}\). Random vector quantization (RVQ) is considered for quantizing the per-BS CDIs, where the quantized version of the CDI is given by [49] as

$$\begin{aligned} {\hat{\varvec{h}}}_{u,b}=\arg \max _{\varvec{c}_{n}\in {\mathcal{C}}_{u,b}}\left| \tilde{{\varvec{h}}}_{u,b}\varvec{c}_{n}^{H}\right| , \end{aligned}$$

where \({\mathcal{C}}_{u,b}\) is the CSI codebook used by the u-th user to quantize the CDI of the b-th BS, which consists of \(2^{{\mathrm{B}}_{{\mathrm{CDI}}}}\) codewords. The codeword \(\varvec{c}_{n}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times {\mathrm{N}}_{t}}\), is a random vector with unit norm, and \({\mathrm{B}}_{{\mathrm{CDI}}}\) denotes the number of bits used for quantizing the CDI.
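The codeword selection rule can be illustrated with a short pure-Python sketch. It assumes a single receive antenna, so that the per-BS CDI is a unit-norm row vector of length \({\mathrm{N}}_{t}\), and it draws the codebook from normalized complex Gaussian vectors. The function names, the seed, and the chosen sizes are hypothetical and serve only as an example.

```python
import math
import random
from typing import List, Tuple

Vector = List[complex]


def random_unit_vector(n: int, rng: random.Random) -> Vector:
    """Isotropic random direction: normalized complex Gaussian vector."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def inner_gain(h_dir: Vector, c: Vector) -> float:
    """Magnitude |h c^H| of the projection of h onto codeword c."""
    return abs(sum(a * b.conjugate() for a, b in zip(h_dir, c)))


def rvq_quantize(h_dir: Vector, codebook: List[Vector]) -> Tuple[int, float]:
    """Index of the codeword with the largest gain, and that gain."""
    idx = max(range(len(codebook)), key=lambda n: inner_gain(h_dir, codebook[n]))
    return idx, inner_gain(h_dir, codebook[idx])


rng = random.Random(0)
n_t, b_cdi = 2, 4  # assumed: 2 transmit antennas, 4 feedback bits
codebook = [random_unit_vector(n_t, rng) for _ in range(2 ** b_cdi)]
h_dir = random_unit_vector(n_t, rng)  # stand-in for the normalized channel
index, gain = rvq_quantize(h_dir, codebook)
```

Only `index` would be fed back; the returned gain corresponds to the magnitude of the projection of the CDI onto the selected codeword and is at most one.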

We assume that the per-BS channel norm \(\left\| {\varvec{h}}_{u,b}\right\|\), which is not included in the feedback information, is perfectly known at the CCN. The knowledge of these norms can be obtained at each BS by averaging the per-BS channels [50]. After aggregating all the CDI indexes at the CCN, the CSI is reconstructed as

$$\begin{aligned} {\hat{\varvec{H}}}_{u}=\left[ \left\| {\varvec{h}}_{u,1}\right\| {\hat{\varvec{h}}}_{u,1},\ldots ,\left\| {\varvec{h}}_{u,{\mathrm{N}}_{b}}\right\| {\hat{\varvec{h}}}_{u,{\mathrm{N}}_{b}}\right] . \end{aligned}$$

The global CDI quantization error is computed as

$$\begin{aligned} \varepsilon _{u}=1-\left| \tilde{{\varvec{H}}}_{u}{\hat{\varvec{H}}}_{u}^{H}\right| ^{2}=1-\left| \sum _{b=1}^{{\mathrm{N}}_{b}}\gamma _{u,b}^{2}\mu _{u,b}e^{j\varphi _{u,b}}\right| ^{2}, \end{aligned}$$

where \(\tilde{{\varvec{H}}}_{u}=\left[ \tilde{{\varvec{h}}}_{u,1},\ldots ,\tilde{{\varvec{h}}}_{u,{\mathrm{N}}_{b}}\right]\) is the global CDI vector, \(\gamma _{u,b}=\frac{\left\| {\varvec{h}}_{u,b}\right\| }{\sqrt{\sum _{b'=1}^{{\mathrm{N}}_{b}}\left\| {\varvec{h}}_{u,b'}\right\| ^{2}}}\) is the normalized per-BS channel norm and \(\mu _{u,b}=\left| \tilde{{\varvec{h}}}_{u,b}{\hat{\varvec{h}}}_{u,b}^{H}\right|\) is the normalized per-BS quantization gain. The angle between the per-BS CDI and its codeword is denoted as \(\varphi _{u,b}\), i.e. \(e^{j\varphi _{u,b}}=\tilde{{\varvec{h}}}_{u,b}{\hat{\varvec{h}}}_{u,b}^{H}/\left\| \tilde{{\varvec{h}}}_{u,b}{\hat{\varvec{h}}}_{u,b}^{H}\right\|\), which is referred to as the phase ambiguity (PA). As expected, in the perfectly quantized CDI condition, \(\varepsilon _{u}=0\).

In contrast to a single point transmission system, where the PA does not affect the CDI quantization performance, in coherent transmission, the PA affects the co-phasing of the system and degrades the feedback scheme performance. This is owing to the fact that the codeword selection in (35) only maximizes the magnitude and ignores its phase. The performance degradation due to the PA is more severe for cell-edge users [51].

The PA is uniformly distributed in \(\left[ -\pi ,\pi \right]\) and employing a uniform quantizer for PA quantization is optimal  [7]. In this regard, the PA can be fed back with the aid of a few bits by using a scalar uniform quantizer. By considering \({\mathrm{B}}_{\mathrm {PA}}\) bits to quantize the PA, the quantized PA is given by

$$\begin{aligned} \hat{\varphi }_{u,b}=\arg \min _{\phi _{n}}\left| \varphi _{u,b}-\phi _{n}\right| ,\ \phi _{n}=n\frac{2\pi }{2^{\mathrm {B_{PA}}}}-\pi ,\ n=0,\ldots ,2^{\mathrm {B_{PA}}}-1. \end{aligned}$$
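A minimal Python sketch of this scalar uniform quantizer is given below. It follows the formula literally, i.e. it picks the nearest level on the real line without wrapping the phase around \(\pm \pi\); the function name `quantize_pa` is a hypothetical choice for the example.

```python
import math


def quantize_pa(phi: float, b_pa: int) -> float:
    """Nearest level phi_n = n * 2*pi / 2**b_pa - pi, n = 0, ..., 2**b_pa - 1."""
    levels = 2 ** b_pa
    step = 2 * math.pi / levels
    candidates = [n * step - math.pi for n in range(levels)]
    return min(candidates, key=lambda level: abs(phi - level))


# With 2 bits the levels are -pi, -pi/2, 0 and pi/2.
phi_hat_small = quantize_pa(0.1, b_pa=2)   # closest level is 0
phi_hat_large = quantize_pa(3.0, b_pa=2)   # closest level on the real line is pi/2
```

The second call shows a consequence of the non-wrapped distance: a phase close to \(\pi\) is mapped to \(\pi /2\) even though \(-\pi\) is closer on the circle. An implementation that uses the circular distance instead would be a design variation, not what the formula above states.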

3.4 Receiver considerations

In this section, traditional receive filters are adopted for the transmission scheme. Receivers use stream specific pilots to estimate the effective channel, which includes the precoding weights and feedback (similar to implicit channel in the 3GPP standard). The effective channel for the u-th user is defined as

$$\begin{aligned} \bar{{\varvec{H}}}_{u}={\varvec{H}}_{u}\hat{{\varvec{W}}}_{u}\in {\mathbb{C}}^{{\mathrm{N}}_{r}\times {\mathrm{N}}_{r}}. \end{aligned}$$

Common linear filters such as the matched filter (MF) or the minimum mean square error (MMSE) filter may be used in receivers to combat channel distortion and noise. In addition, motivated by the aim of a low-complexity receiver, we consider a receiver that uses no filter at all, i.e. the receive filter at the u-th user is \({\varvec{g}}_{u}^{No}=\varvec{I}_{{\mathrm{N}}_{r}}\). In MF, the filter is designed to maximize the signal portion of the received signal, and as the signal to interference ratio is not minimized, it is useful in low-noise conditions  [36]. In our sparse system, the MF receive filter for the u-th user is

$$\begin{aligned} {\varvec{g}}_{u}^{MF}=\bar{{\varvec{H}}}_{u}^{H}. \end{aligned}$$

The MMSE receive filter is designed to minimize the MSE and finds a good tradeoff between the signal portion and the interference  [52]. To compute the MMSE filter in the sparse system, by setting \(\alpha _{u}=1,u=1,\ldots ,{\mathrm{N}}_{u}\) in (5), the MSE for the u-th user can be expressed as

$$\begin{aligned} \hbox {MSE}_{u}=E\left\{ \left\| \varvec{E}_{u}\right\| _{2}^{2}\right\} =E\left\{ tr\left( \varvec{E}_{u}^{H}\varvec{E}_{u}\right) \right\} =tr\left( E\left\{ \left( {\varvec{x}}_{u}-{\varvec{g}}_{u}{\varvec{y}}_{u}\right) ^{H}\left( {\varvec{x}}_{u}-{\varvec{g}}_{u}{\varvec{y}}_{u}\right) \right\} \right) . \end{aligned}$$

To minimize the MSE, we take the complex conjugate derivative [53] w.r.t. \({\varvec{g}}_{u}^{H}\) and set it to zero as

$$\begin{aligned} \frac{\partial \hbox {MSE}_{u}}{\partial {\varvec{g}}_{u}^{H}}={\varvec{g}}_{u}E\left\{ {\varvec{y}}_{u}{\varvec{y}}_{u}^{H}\right\} -E\left\{ {\varvec{x}}_{u}{\varvec{y}}_{u}^{H}\right\} =0, \end{aligned}$$

where the expectation in the first term is the variance of the received signal by the u-th user and the second term is the cross-correlation of the data symbol with the received signal. Noting (1), these expectations are computed as follows

$$\begin{aligned} {\varvec{R}}_{y_{u}}&= E\left\{ {\varvec{y}}_{u}{\varvec{y}}_{u}^{H}\right\} =\underbrace{\bar{{\varvec{H}}}_{u}{\varvec{R}}_{x_{u}}\bar{{\varvec{H}}}_{u}^{H}}_{\mathrm{desired}}+\underbrace{{\varvec{H}}_{u}\sum _{i\ne u}\left( \hat{{\varvec{W}}}_{i}{\varvec{R}}_{x_{i}}\hat{{\varvec{W}}}_{i}^{H}\right) {\varvec{H}}_{u}^{H}}_{\mathrm{interference}}+\underbrace{{\varvec{R}}_{n_{u}}}_{\mathrm{noise}}, \end{aligned}$$
$$\begin{aligned} E\left\{ {\varvec{x}}_{u}{\varvec{y}}_{u}^{H}\right\}= & {} {\varvec{R}}_{x_{u}}\bar{{\varvec{H}}}_{u}^{H}, \end{aligned}$$

where \({\varvec{R}}_{x_{u}}=\sigma _{x}^{2}{\mathbf{I}}_{{\mathrm{N}}_{r}}\) and \({\varvec{R}}_{n_{u}}=\sigma _{n}^{2}{\mathbf{I}}_{{\mathrm{N}}_{r}}\) are the variance of the data symbols and noise for the u-th user. Setting the derivative (41) to zero gives the MMSE receive filter in the sparse system as

$$\begin{aligned} {\varvec{g}}_{u}^{\rm MMSE}={{\varvec{R}}}_{x_{u}}\bar{{\varvec{H}}}_{u}^{H}{{\varvec{R}}}_{y_{u}}^{-1}. \end{aligned}$$
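The MMSE filter can be evaluated numerically as in the following NumPy sketch, which assembles the received-signal covariance from its desired, interference, and noise parts and then applies the closed form. The sizes, the random channel and precoders, and the function name `mmse_receive_filter` are assumptions made for illustration.

```python
import numpy as np


def mmse_receive_filter(H_u, W_list, u, sigma_x2, sigma_n2):
    """Return (g_u, R_y, H_bar) for user u, assuming R_x = sigma_x2*I and R_n = sigma_n2*I."""
    n_r = H_u.shape[0]
    H_bar = H_u @ W_list[u]                       # effective channel of user u
    R_y = sigma_n2 * np.eye(n_r, dtype=complex)   # noise part
    for W_i in W_list:                            # i = u: desired, i != u: interference
        HW = H_u @ W_i
        R_y += sigma_x2 * (HW @ HW.conj().T)
    g_u = sigma_x2 * H_bar.conj().T @ np.linalg.inv(R_y)
    return g_u, R_y, H_bar


rng = np.random.default_rng(2)
n_users, n_r, n_b = 3, 2, 6  # assumed sizes for illustration
H_u = rng.standard_normal((n_r, n_b)) + 1j * rng.standard_normal((n_r, n_b))
W_list = [
    0.3 * (rng.standard_normal((n_b, n_r)) + 1j * rng.standard_normal((n_b, n_r)))
    for _ in range(n_users)
]
g_u, R_y, H_bar = mmse_receive_filter(H_u, W_list, u=0, sigma_x2=1.0, sigma_n2=0.1)
```

The returned filter satisfies the stationarity condition \({\varvec{g}}_{u}{\varvec{R}}_{y_{u}}={\varvec{R}}_{x_{u}}\bar{{\varvec{H}}}_{u}^{H}\), which is a convenient check when the covariance is estimated from received samples instead of being assembled from known channels.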

Note that the received signal variance according to (42) has three parts, including desired signal, interference and noise. The desired part can be computed from the effective channel directly, while the interference needs explicit channel estimation and some information about the precoding weights. However, it is possible to compute the received signal variance directly from the two-dimensional (frequency-time) received signal vector (similar to reference signals in LTE Release 14 [54, Section 6.10]).

Finally, scalar factor \(\alpha\) in (5) is considered as the AGC gain and is used to adapt the input signal with the dynamic range of the analog to digital converter (ADC)  [36]. Similar to MMSE receive filter computation, the optimum value of the scalar factor can be calculated with minimization of MSE as

$$\begin{aligned} \alpha _{u}=\frac{tr\left( {\varvec{R}}_{x_{u}}\varvec{\bar{H}}_{u}^{H}{\varvec{g}}_{u}^{H}\right) }{tr\left( {\varvec{g}}_{u}{\varvec{R}}_{y_{u}}{\varvec{g}}_{u}^{H}\right) }. \end{aligned}$$

4 Schemes for comparison

4.1 Two-layer optimization with inner ZF precoder

In this section, we adopt the conventional ZF precoder as the inner precoder for the proposed two-layer optimization scheme. The ZF precoder is designed to remove the interference completely and has good performance in high SNRs [36]. Using the composite channel in (15), the ZF precoder can be designed as

$$\begin{aligned} {\varvec{W}}^{\rm ZF}=\beta _{ZF}\tilde{{\varvec{H}}}^{H}\left( \tilde{{\varvec{H}}}\tilde{{\varvec{H}}}^{H}\right) ^{-1}, \end{aligned}$$

where \(\beta _{ZF}\) is used for power control, and in the sum-power constraint of \(P_{t}\) it is computed as

$$\begin{aligned} Tr\left( \left( {\varvec{W}}^{\rm ZF}\right) ^{H}{\varvec{W}}^{\rm ZF}\right) \le P_{t}\Rightarrow \beta _{ZF}=\sqrt{\frac{P_{t}}{tr\left( \left( \tilde{{\varvec{H}}}\tilde{{\varvec{H}}}^{H}\right) ^{-1}{{\varvec{R}}}_{x}\right) }}. \end{aligned}$$
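The power normalization can be checked with the NumPy sketch below. It assumes a wide composite channel of size \({\mathrm{N}}_{U}\times {\mathrm{N}}_{B}\) with full row rank and uses the right pseudo-inverse \(\tilde{{\varvec{H}}}^{H}\left( \tilde{{\varvec{H}}}\tilde{{\varvec{H}}}^{H}\right) ^{-1}\) so that the dimensions match when \({\mathrm{N}}_{B}>{\mathrm{N}}_{U}\); it further assumes \({\varvec{R}}_{x}=\sigma _{x}^{2}{\mathbf{I}}\). The function name and the test values are hypothetical.

```python
import numpy as np


def zf_precoder(H, p_t, sigma_x2=1.0):
    """Power-normalized ZF precoder for a wide channel H (N_U x N_B, N_B >= N_U)."""
    gram_inv = np.linalg.inv(H @ H.conj().T)
    beta = np.sqrt(p_t / (sigma_x2 * np.trace(gram_inv).real))
    return beta * H.conj().T @ gram_inv, beta


rng = np.random.default_rng(3)
H = rng.standard_normal((3, 6)) + 1j * rng.standard_normal((3, 6))
W_zf, beta = zf_precoder(H, p_t=0.19)

# The effective channel is a scaled identity and the power budget is met.
effective = H @ W_zf
tx_power = np.trace(W_zf.conj().T @ W_zf).real
```

With unit symbol variance the transmit power equals \(P_{t}\) exactly, and the effective channel equals \(\beta _{ZF}{\mathbf{I}}\), i.e. all inter-user interference is removed before any link is switched off by the selection matrix.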

In the outer layer of the proposed two-layer optimization, we compute the metric function for the Greedy algorithm based on the inner ZF precoder and design the selection matrix in a way that the metric function is minimized. By substituting the closed form precoding matrix from (46) in (11), a new metric function for the Greedy algorithm is computed as

$$\begin{aligned} f(\tilde{{\varvec{H}}},{\varvec{S}})=\left\| \beta _{ZF}\tilde{{\varvec{H}}}\left[ {\varvec{S}}\cdot \left( \tilde{{\varvec{H}}}^{H}\left( \tilde{{\varvec{H}}}\tilde{{\varvec{H}}}^{H}\right) ^{-1}\right) \right] -\varvec{I}\right\| _{F}^{2}. \end{aligned}$$

The remaining steps of the Greedy algorithm are the same as in Algorithm 1, except that the metric function is replaced with (48) and steps 2 and 3 are removed.

4.2 Two-layer optimization with inner Wiener precoder

Similar to the previous section, we adopt the conventional Wiener precoder as inner precoder for the proposed two-layer optimization scheme. The Wiener precoder minimizes the interference and maximizes the signal to interference and noise ratio (SINR). This precoder has better performance in comparison with ZF, especially in low SNRs  [36]. The Wiener precoder is derived as

$$\begin{aligned} {\varvec{W}}^{WF}= & {} \beta _{\rm WF}{\varvec{F}}^{-1}\tilde{{\varvec{H}}}^{H}, \end{aligned}$$
$$\begin{aligned} \varvec{F}= & {} \tilde{{\varvec{H}}}^{H}\tilde{{\varvec{H}}}+\frac{tr\left( \varvec{G}{\varvec{R}}_{n}\varvec{G}^{H}\right) }{P_{t}}{\mathbf{I}}_{N}, \end{aligned}$$

where \(\beta_{\mathrm{WF}}\) controls the transmitter power, and for the sum-power of \(P_{t}\) it is computed as

$$\begin{aligned} \beta_{\rm WF}=\sqrt{\frac{P_{t}}{tr\left( {\varvec{F}}^{-2}\tilde{{\varvec{H}}}^{H}{\varvec{R}}_{x}\tilde{{\varvec{H}}}\right) }}. \end{aligned}$$

The metric function for the Greedy algorithm in the outer layer is computed by substituting the closed form precoding matrix (50) in (11) as

$$\begin{aligned} f(\tilde{{\varvec{H}}},{\varvec{S}})=\left\| \beta _{\rm WF}\tilde{{\varvec{H}}}\left[ {\varvec{S}}\cdot \left( {\varvec{F}}^{-1}\tilde{{\varvec{H}}}^{H}\right) \right] -\varvec{I}\right\| _{F}^{2}. \end{aligned}$$

In Algorithm 1, by substitution of the metric function with (52) and removing steps 2 and 3, the Greedy algorithm for the outer layer is obtained.

Note that for computing the Wiener precoding matrix in (50), it is required that the receive filter is known in the CCN, which is contradictory to designing the precoder using only the composite channel. Therefore, the Wiener precoder cannot be used directly as the inner precoder in the proposed transmission system, and here we consider it only for comparison purpose.

4.3 Selective feedback precoder

In the selective feedback technique  [26], users with weak links are prevented from feeding back their CSI to the CCN and each user feeds back at least its strongest CSI. By exploiting a binary feedback index matrix, the coefficients of the channel matrix whose CSI is below a specified threshold are replaced with zeros. This technique can be categorized as an absolute thresholding approach for feedback load reduction.

To overcome the backhaul overhead related to routing users’ data to several BSs, two schemes are proposed: one based on MAC layer scheduling and the other based on physical layer precoding. In this paper, we consider the latter, where the precoder is designed using the ZF precoding approach after vectorizing the channel matrix and eliminating its zero elements.

4.4 SSOCP based relative thresholding precoder

In relative thresholding, users feed back only the CSI of links with channel value within a threshold relative to the strongest BS. In  [27], an SSOCP based precoder for maximizing the weighted sum-rate is proposed in which the long term channel statistics are used to model the statistical interference for the unknown CSI. The precoder design problem with per antenna power constraint is considered as

$$\begin{aligned} \max _{{\varvec{W}}}\prod _{u}\left( 1+\gamma _{u}\right) \end{aligned}$$

subject to:

$$\begin{aligned} \sum _{j=1}^{{\mathrm{N}}_{U}}\left| W(i,j)\right| ^{2}\le P_{a}, (b-1){\mathrm{N}}_{t}+1\le i\le b{\mathrm{N}}_{t}, 1\le b\le {\mathrm{N}}_{b}, \end{aligned}$$

where \(\gamma _{u}\) is the SINR for the u-th user and \(P_{a}\) is a per antenna power constraint. For comparison with the sum-power constraint \(P_{t}\) in (12), we consider \(P_{t}={\mathrm{N}}_{B}P_{a}\).

5 Results and discussion

The numerical evaluation program is developed in MATLAB based on the 3GPP time-coherent channel model [55]. The performance of the proposed scheme is evaluated by Monte-Carlo simulation. The inner optimization is performed using the SQP method and the outer layer is based on the Greedy algorithm. The simulation parameters are summarized in Table 2.

Table 2 Simulation parameters

5.1 Channel model

Consider a JT-CoMP scenario where a set of \(\mathrm {{{N}}}_{u}=3\) single antenna users at the cluster center is served by \({\mathrm{N}}_{b}=3\) cooperating BSs, each with \({\mathrm{N}}_{t}=1\) or 2 antennas. The cell radius is \(R=500\) m and the cell-edge SNR is variable. As illustrated by the example in Fig. 4, users are uniformly dropped at the cluster center, along an ellipse with semi-major and semi-minor axes of length \(\frac{R}{16}\) and \(\frac{h/2}{16}\), respectively, where \(h=\frac{\sqrt{3}}{2}R\) is the height of the hexagon of the cluster area. We consider the 3GPP channel model [55, 56]. The fading channel model includes the path-loss component \(\gamma _{PL}=128.1+37.6\log _{10}(R)\) (R in \(\mathrm {km}\)), the shadow fading component \(\gamma _{SF}={\mathcal{N}}(0,\,8\ \mathrm {dB})\), and a Rayleigh fast fading component \(\Gamma\), which is simulated as a circularly symmetric complex Gaussian random variable \({\mathcal{CN}}(0,\,1)\). The i.i.d channel between the BSs and the users is calculated as

$$\begin{aligned} {\mathbf{H}}_{iid}=\Gamma {{\mathbf{C}}^{\frac{1}{2}}}\sqrt{\mathrm {G}\,\gamma _{PL}\,\gamma _{SF}}, \end{aligned}$$

where \(\mathrm {G}=1\) is the gain of the antennas at the BSs and \(\mathrm {{\mathbf{C}}\in \mathbb {R}}^{{\mathrm{N}}_{T}\times {\mathrm{N}}_{T}}\) is the correlation matrix of the antennas at the BSs, with the correlation between the antennas being \(\rho =0.5\) for all antenna pairs. We consider a time coherent channel model, where the CSI is varied only due to the effect of user movement, and the channel coefficient of the new CSI is based on Clarke’s model  [57]. The channel evolves in time as

$$\begin{aligned} {\mathbf{H}}(t+\Delta t)=\sqrt{\rho }{\varvec{H}}_{iid}+\sqrt{1-\rho }{\varvec{H}}(t), \end{aligned}$$

where \(\rho =J_{0}(2\pi f_{d}\triangle t)\) is the channel correlation coefficient. Here, \(J_{0}(.)\) is the zero-order Bessel function of the first kind, the Doppler frequency is \(f_{d}=\frac{vf_{c}}{c}\) with the velocity of the user being v, the carrier frequency is \(f_{c}=2\ \mathrm {GHz}\), c is the velocity of propagation, and \(\triangle t\) is the elapsed time. The value of \(\triangle t\) is set to \(1\ {\mathrm{ms}}\), the FDD uplink/downlink frame duration.
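To give a feeling for the magnitude of the correlation coefficient, the sketch below evaluates \(\rho\) for a pedestrian user. The Bessel function is computed from its power series, which is adequate for the small arguments that occur here but is not meant as a general-purpose implementation; the helper names and the chosen speed of \(5\ {\mathrm{km/h}}\) are assumptions for the example.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def bessel_j0(x: float, terms: int = 30) -> float:
    """Zero-order Bessel function of the first kind via its power series."""
    q = (x / 2.0) ** 2
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -q / ((k + 1) ** 2)
    return total


def channel_correlation(v_kmh: float, f_c: float, dt: float) -> float:
    """rho = J0(2*pi*f_d*dt) with Doppler frequency f_d = v*f_c/c."""
    f_d = (v_kmh / 3.6) * f_c / C_LIGHT  # Doppler frequency in Hz
    return bessel_j0(2 * math.pi * f_d * dt)


rho = channel_correlation(v_kmh=5.0, f_c=2e9, dt=1e-3)
```

For these values the Doppler frequency is roughly \(9\ \mathrm {Hz}\) and \(\rho\) stays above 0.999, so the Bessel argument is small and consecutive frames at pedestrian speed differ only slightly in this model.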

Fig. 4
figure 4

The hexagon in the middle of the cells denotes the cluster area under consideration where the users are located at the cluster center

The receiver noise power is \({\mathrm{N}}_{0}=\mathrm {k}_{{\mathrm{B}}}\mathrm {T}_{0}{\mathrm{B}}_{n}\) Watts, where \(\mathrm {{{k}}}_{{\mathrm{B}}}\) is the Boltzmann constant \(1.38\times 10^{-23}\ \mathrm {Joules/Kelvin}\), \(\mathrm {T}_{0}=290\ \mathrm {Kelvin}\) is the operating temperature, and \({\mathrm{B}}_{n}=10\ \mathrm {MHz}\) is the system bandwidth. The number of channel realizations is \(10^{3}\), and the maximum BS power for a cell-edge \(SNR=10\ \mathrm {dB}\) is \(22.8\ \mathrm {dBm}\) or \(0.19\ \mathrm {W}\).
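These figures can be cross-checked with a short link-budget calculation. The sketch below assumes a base-10 logarithm in the path-loss formula, a cell-edge distance equal to the cell radius of \(0.5\ \mathrm {km}\), unit antenna gain, and no shadowing or fast fading; the helper names are hypothetical.

```python
import math

K_B = 1.38e-23      # Boltzmann constant in J/K, as given in the text
T_0 = 290.0         # operating temperature in K
B_N = 10e6          # system bandwidth in Hz


def watt_to_dbm(p_w: float) -> float:
    return 10 * math.log10(p_w / 1e-3)


def dbm_to_watt(p_dbm: float) -> float:
    return 1e-3 * 10 ** (p_dbm / 10)


noise_w = K_B * T_0 * B_N
noise_dbm = watt_to_dbm(noise_w)

# Path loss at the cell edge, assuming a base-10 logarithm and R = 0.5 km.
path_loss_db = 128.1 + 37.6 * math.log10(0.5)

tx_dbm = 22.8
edge_snr_db = tx_dbm - path_loss_db - noise_dbm
```

Under these assumptions the noise power is about \(-104\ \mathrm {dBm}\), the path loss is about \(116.8\ \mathrm {dB}\), and the resulting cell-edge SNR is close to \(10\ \mathrm {dB}\), consistent with the stated transmit power of \(22.8\ \mathrm {dBm}\) (approximately \(0.19\ \mathrm {W}\)).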

5.2 Simulation results

This section is devoted to the numerical evaluation of the performance of the designed JT-CoMP scheme. The general form of the network structure is depicted in Fig. 1. Time invariant and variant channel models are adapted from (54) and (55). To comprehensively evaluate the proposed scheme, we consider three stages. In the first stage, the proposed precoder is compared with the adopted ZF and Wiener precoders in Figs. 5 and 6, and the comparison with the selective feedback  [26] and SSOCP based relative thresholding [27] precoders is performed in Fig. 9. In the second stage, the performance of the proposed scheme is analyzed extensively in Figs. 10, 11, and 12, where the effect of load reduction, the probability distribution of the MSE, and the time convergence of the algorithm are investigated. In the third stage, the limited feedback effect on the proposed scheme is analyzed in Figs. 14 and 15.

5.2.1 Comparison to other schemes

In Fig. 5, the performance of the adopted ZF and Wiener precoders and the proposed scheme is compared over a wide range of edge SNRs. Note that, throughout the simulations, the SNR is defined before any receive filter. The system is in the full feedback (FFB) - full backhaul (FBH) configuration and the channel is time-invariant. We consider three types of receivers: a receiver without receive filter, an MF receiver, and an MMSE receiver. For all receivers, the proposed scheme performs better than the adopted ZF and Wiener precoders. The performance of the proposed scheme is not sensitive to the type of receiver, while the performance of the adopted ZF and Wiener precoders depends on the receive filter. Therefore, an advantage of the proposed scheme is that it performs well with simplified receivers, such as receivers using no receive filter.

Fig. 5
figure 5

Performance of the adopted ZF, Wiener and proposed precoders in terms of MSE in a receiver a without receive filter, b with MF receive filter, and c with MMSE receive filter in FFB-FBH configuration with \({\mathrm{N}}_{t}=1\). The channel is time-invariant

In Fig. 6, the performance of the proposed scheme is compared with the adopted ZF and Wiener precoders in the sparse feedback (SFB) - constrained backhaul (CBH) configuration. We consider a load reduction of \(r_{fl}=r_{bl}=0.33\) and a receiver without receive filter. The channel is time-variant with \(\triangle t=1\ {\mathrm{ms}}\) and \(v=5\ {\mathrm{km/h}}\). The achievable MSE is depicted for a wide range of edge SNRs. From the results shown in this figure, the proposed precoder outperforms the adopted ZF and Wiener precoders by at least \(25\%\) in terms of MSE. The superior performance of the proposed scheme also holds for the MF and MMSE receive filters, but due to space limitations, the simulation results for these filters are not shown.

Fig. 6
figure 6

Performance of the adopted ZF, adopted Wiener and the proposed precoders in terms of MSE in SFB-CBH configuration with \(r_{fl}=r_{bl}=0.33,\ {\mathrm{N}}_{t}=1\). The channel is time-variant with \(\triangle t=1\ {\mathrm{ms}}\) and \(v=5\ {\mathrm{km/h}}\)

By substituting \(\hat{{\varvec{W}}}_{u}\) for \({\varvec{W}}_{u}\) in (2), the SINR for the SFB-CBH configuration is computed. In Fig. 7, the cumulative distribution function (CDF) of the SINR of the proposed scheme is compared with those of the adopted ZF and Wiener precoders at edge SNRs of \(5\ \mathrm {dB}\) and \(10\ \mathrm {dB}\). The load reduction is \(r_{fl}=r_{bl}=0.11\) and the channel parameters are the same as in Fig. 6. From the results shown in this figure, at an edge SNR of \(5\ \mathrm {dB}\), the proposed precoder outperforms the adopted ZF and Wiener precoders by \(6.49\ \mathrm {dB}\) and \(3.85\ \mathrm {dB}\) at the \(80\%\) point, respectively. At an edge SNR of \(10\ \mathrm {dB}\), the gains of the proposed precoder over the others are \(6.36\ \mathrm {dB}\) and \(4.73\ \mathrm {dB}\), respectively.

To evaluate the performance of the individual users in the proposed scheme, we define the MSE difference as \(\triangle \hbox {MSE}=\hbox {MSE}_{m}-\hbox {MSE}_{\acute{m}}\), where user m experiences the best MSE and user \(\acute{m}\) experiences the worst one in a given channel realization. The computation of the MSE at each user is described in “Appendix 1”. The CDF of the MSE difference is shown in Fig. 8. Based on this result, the proposed scheme has less variance compared to the ZF precoder in the SFB-CBH configuration. In this configuration, however, the CDF of the MSE difference of the proposed precoder is only slightly better than that of the Wiener precoder. As expected, in the FFB-FBH configuration, the ZF precoder distributes an equal MSE to the users, hence the difference becomes zero.

Fig. 7
figure 7

CDF comparison of the proposed precoder with adopted ZF and Wiener precoders in terms of the users’ SINR. The channel is time-variant and \(r_{fl}=r_{bl}=0.11\), \({\mathrm{N}}_{t}=1,{\mathrm{N}}_{r}=1\), \(\Delta t=1\ {\mathrm{ms}}\) and \(v=5\ {\mathrm{km/h}}\)

Fig. 8
figure 8

CDF comparison of the proposed precoder with adopted ZF and Wiener precoders in terms of the users’ MSE difference in SFB-CBH (\(r_{fl}=r_{bl}=0.33\)) and FFB-FBH configurations and edge SNR of \(10\ \mathrm {dB}\). The channel is time-variant and \({\mathrm{N}}_{t}=1,{\mathrm{N}}_{r}=1\), \(\Delta t=1\ {\mathrm{ms}}\) and \(v=5\ {\mathrm{km/h}}\)

In Fig. 9, the MSE of the proposed scheme is compared with that of the selective feedback [26] and SSOCP based relative thresholding [27] precoders in a time-variant channel with an edge SNR of \(10\ \mathrm {dB}\), \({\mathrm{N}}_{t}=1\) and the SFB-CBH configuration. In the selective feedback precoder, changing \(r_{fl}\) from 0 to \(60\%\) requires changing the absolute threshold level from \(-100\) to \(-120\ \mathrm {dB}\), while in the SSOCP precoder, the relative threshold level must be changed from 0 to \(11\ \mathrm {dB}\). Note that in these precoders only the average loads can be controlled by adjusting the threshold value, whereas in the proposed precoder the loads can be controlled strictly. As seen, the proposed scheme outperforms the selective feedback and SSOCP precoders by at least \(30\%\) in terms of MSE.
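The difference between average and strict load control can be illustrated with a toy example. The sketch below is not the greedy link selection of the proposed scheme; it only contrasts a threshold rule, whose number of kept links varies per realization, with a rule that drops a fixed number of links. The function names, the gain values and the 9-link setup are assumptions made for illustration.

```python
def threshold_selection(gains_db, threshold_db):
    """Keep every link whose gain is at or above an absolute threshold."""
    return [g >= threshold_db for g in gains_db]


def fixed_count_selection(gains_db, load_reduction):
    """Drop exactly round(load_reduction * N) of the weakest links."""
    n_drop = round(load_reduction * len(gains_db))
    weakest_first = sorted(range(len(gains_db)), key=lambda k: gains_db[k])
    dropped = set(weakest_first[:n_drop])
    return [k not in dropped for k in range(len(gains_db))]


# Two hypothetical realizations of 9 link gains in dB (e.g. 3 BSs x 3 users).
realizations = [
    [-95, -101, -118, -99, -97, -125, -110, -96, -104],
    [-92, -94, -99, -93, -98, -97, -91, -96, -95],
]
kept_by_threshold = [sum(threshold_selection(r, -100.0)) for r in realizations]
kept_by_count = [sum(fixed_count_selection(r, 0.33)) for r in realizations]
```

With the threshold rule the number of reported links changes from one realization to the next, so only its average can be tuned, while the fixed-count rule keeps the same number of links every time.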

Fig. 9
Performance comparison of the proposed scheme with selective feedback and SSOCP based relative thresholding precoders in edge SNR of \(10\ \mathrm {dB}\) in time-variant channel

Based on numerical evaluations, we can conclude that the proposed scheme has better MSE performance than the 3GPP Release 15 codebook type II precoder. These numerical comparisons are omitted here due to limited space. Note, however, that more advanced and effective CSI reporting is possible in Releases 16 and 17.

5.2.2 Performance analysis of the proposed scheme

In Fig. 10, the performance of the proposed scheme is compared across various configurations with respect to feedback and backhaul load reductions. The receiver has no receive filter and the channel is time-variant. As expected, the proposed precoder performs best in the FFB-FBH configuration, and the system performance decreases as \(r_{fl}\) and \(r_{bl}\) increase. It is worth noting that, for \(r_{fl}=r_{bl}=0.11\), the MSE increases by 40%. To evaluate the effect of backhaul load reduction alone, a sparse feedback-full backhaul (SFB-FBH) configuration is considered, where \(r_{fl}=0.11\) and \(r_{bl}=0\). As expected, the proposed precoder performs better in this configuration than in SFB-CBH with equal \(r_{fl}\), and slightly worse than in the FFB-FBH configuration.

Fig. 10
MSE performance of the proposed scheme as function of edge SNR in terms of feedback load ratios in time-variant channel with \({\mathrm{N}}_{t}=1\)

In Fig. 11, the CDF of the MSE in the proposed scheme is shown for different feedback and backhaul load reduction values in the SFB-CBH configuration. The edge SNR is \(10\ \mathrm {dB}\) and the \(\mu\) values in the legend show the average value. As seen, the average MSE increases as the feedback and backhaul load reductions increase.

Fig. 11
Comparison of the CDF of the MSE for the SFB-CBH system in a time-variant channel. The cell-edge SNR is \(10\ \mathrm {dB}\). The \(\mu\) values in the legend show the average value

Figure 12 depicts the convergence behavior of the proposed scheme. An SFB-CBH configuration with \(r_{fl}=0.11\) and \(r_{bl}=0.11\) is assumed and the MSE is shown for different edge SNRs. As seen, the scheme converges after a small number of precoded transmissions: in the worst case, the MSE converges after 5 subframes.

Fig. 12
MSE over time of SFB-CBH with \(r_{fl}=r_{bl}=0.11\) in various cell-edge SNRs and time-variant channel

Based on numerical evaluations, the SINR performance of the proposed scheme in the SFB-CBH configuration decreases slightly with increasing \(\triangle t\), which can be regarded as the CSI reporting period. These numerical comparisons are omitted here due to limited space.

5.2.3 Performance of the proposed scheme in the CRAN network

To evaluate the performance of the proposed scheme in 5G and B5G systems, a scenario in the ultra-dense CRAN is considered, where both users and BSs are uniformly distributed in a square area of \(400\,\mathrm {m}\times 400\,\mathrm {m}\). To provide seamless coverage, the density of BSs is anticipated to reach \(40{-}50\ \mathrm {BSs/km^{2}}\) [7, 58]; therefore, the numbers of BSs and users are set to 8 and 14, giving densities of \(50\ \mathrm {BSs/km^{2}}\) and approximately \(87\ \mathrm {Users/km^{2}}\). Because each backhaul link can support only a limited number of users, each user is assumed to be served by its nearest 3 BSs [7]. Each BS is assumed to have \({\mathrm{N}}_{t}=8\), 16 or 32 transmit antennas and the users are equipped with \({\mathrm{N}}_{r}=2\) receive antennas. Figure 13 compares the performance of the proposed scheme in the SFB-CBH (\(r_{fl}=r_{bl}=0.3\)) and FFB-FBH configurations. As expected, the proposed precoder performs best when \({\mathrm{N}}_{t}=32\), and the MSE increases as the number of transmit antennas decreases. It is worth noting that, when the number of transmit antennas is high, the performance degradation arising from feedback load reduction is negligible.
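The deployment described above can be reproduced with a short sketch: nodes are dropped uniformly in the square and every user is attached to its three nearest BSs. The helper names, the random seed and the use of Euclidean distance are assumptions for illustration and are not taken from the simulation code of the paper.

```python
import math
import random


def density_per_km2(count, side_m):
    """Nodes per square kilometre for `count` nodes in a side_m x side_m square."""
    return count / (side_m / 1000.0) ** 2


def drop_uniform(count, side_m, rng):
    """Draw `count` positions uniformly at random inside the square."""
    return [(rng.uniform(0.0, side_m), rng.uniform(0.0, side_m)) for _ in range(count)]


def nearest_bs(user, bs_positions, k=3):
    """Indices of the k BSs closest to the user (Euclidean distance)."""
    order = sorted(range(len(bs_positions)),
                   key=lambda b: math.dist(user, bs_positions[b]))
    return order[:k]


rng = random.Random(0)  # fixed seed so the drop is repeatable
bs_positions = drop_uniform(8, 400.0, rng)
user_positions = drop_uniform(14, 400.0, rng)
serving = [nearest_bs(u, bs_positions) for u in user_positions]

bs_density = density_per_km2(8, 400.0)      # about 50 BSs/km^2
user_density = density_per_km2(14, 400.0)   # about 87.5 users/km^2
```

The 0.16 km² area is what turns 8 BSs and 14 users into the quoted densities.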

Fig. 13
MSE performance of the proposed precoder as function of edge SNR in various number of transmit antennas in SFB-CBH (\(r_{fl}=r_{bl}=0.3\)) and FFB-FBH configurations. The channel is time-variant with \(\Delta t=1\ {\mathrm{ms}}\), \(v=5\ {\mathrm{km/h}}\) and \({\mathrm{N}}_{r}=2\)

5.2.4 Feedback quantization effect

In Fig. 14, the performance of the proposed quantization scheme is depicted for the SFB-CBH configuration with \(r_{fl}=0.167\), \({\mathrm{N}}_{t}=2\), \({\mathrm{N}}_{r}=1\), perfect PA, and a varying number of bits for CDI quantization. The achievable MSE is plotted over a wide range of edge SNRs for a receiver without a receive filter. A performance gap is visible between perfect CDI and \({\mathrm{B}}_{{\mathrm{CDI}}}\)-bit quantization. However, with only a few bits for CDI quantization, the performance of the CSI codebook based feedback improves significantly, and with \({\mathrm{B}}_{{\mathrm{CDI}}}=8\) bits the performance loss is negligible.

Fig. 14
MSE performance of the CSI codebook based quantization in the proposed SFB-CBH scheme with \(r_{fl}=0.167\), \({\mathrm{N}}_{t}=2\), perfect PA and a varying number of bits for CDI quantization

Figure 15 shows the performance of the proposed quantization scheme in the same system configuration as in Fig. 14, with \({\mathrm{B}}_{{\mathrm{CDI}}}=8\) bits for CDI quantization and a varying number of bits for PA quantization. When \({\mathrm{B}}_{\mathrm {PA}}\) increases from 1 to 4, the MSE decreases considerably, and with \({\mathrm{B}}_{\mathrm {PA}}=4\) bits the performance loss relative to perfect PA is negligible.

Fig. 15
Impact of the number of bits for PA quantization on the MSE performance of the proposed scheme with \(r_{fl}=0.167\), \({\mathrm{N}}_{t}=2\) and \({\mathrm{B}}_{{\mathrm{CDI}}}=8\) bits

The impact of the number of CDI and PA quantization bits on system performance implies that only a small number of bits is necessary to benefit from a CSI codebook based quantization scheme in the proposed precoder. In particular, by considering the total number of bits for quantization of each link spanning from a BS to a user as \({\mathrm{B}}=\left( \frac{{\mathrm{B}}_{{\mathrm{CDI}}}+{\mathrm{B}}_{\mathrm {PA}}}{{\mathrm{N}}_{t}}\right)\), it is clear that with \({\mathrm{B}}=6\) bits of quantization, a performance close to that of full CSI feedback is attainable.
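As a quick check of this figure, and assuming that the values reported for Figs. 14 and 15 are the ones intended here (\({\mathrm{B}}_{{\mathrm{CDI}}}=8\), \({\mathrm{B}}_{\mathrm {PA}}=4\) and \({\mathrm{N}}_{t}=2\)), the per-link bit count evaluates to

$$\begin{aligned} {\mathrm{B}}=\frac{{\mathrm{B}}_{{\mathrm{CDI}}}+{\mathrm{B}}_{\mathrm {PA}}}{{\mathrm{N}}_{t}}=\frac{8+4}{2}=6\ \text {bits}. \end{aligned}$$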

6 Conclusion

For a centralized JT-CoMP FDD downlink system, we designed a novel sparse feedback and constrained backhaul transmission scheme and investigated its performance. To design the precoder matrix while providing feedback and backhaul load reductions, under a total power constraint and load balancing between BSs, a sub-optimum two-layer optimization method was proposed. In the inner layer, the SVD of the CSI matrix was utilized to design the precoder matrix subject to a sum-power constraint and pre-known idle links. In the outer layer, the Greedy algorithm was exploited to design the link selection matrix, providing the required feedback and backhaul load reductions and load balancing between BSs. In addition, sparse feedback and constrained backhaul schemes were introduced by adopting the ZF and Wiener precoders. To further reduce the feedback load, a CSI codebook based limited feedback strategy was considered. Numerical evaluations show that the proposed scheme achieves a performance gain in terms of MSE when compared to the adopted ZF and Wiener precoders, selective feedback, and SSOCP based relative thresholding.

Availability of data and materials

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.



Abbreviations

JT-CoMP: Joint transmission coordinated multipoint
BS: Base station
CSI: Channel state information
MSE: Mean square error
SVD: Singular value decomposition
SFB: Sparse feedback
CBH: Constrained backhaul
ZF: Zero forcing
FDD: Frequency division duplex
CCN: Central coordination node
SNR: Signal to noise ratio
TDD: Time division duplex
AGC: Automatic gain control
RSSI: Received signal strength indicator
MF: Matched filter
MMSE: Minimum mean square error
ADC: Analog to digital converter
WF: Wiener filter
SSOCP: Successive second order cone programming
SINR: Signal to noise and interference ratio
FFB: Full feedback
FBH: Full backhaul
CDF: Cumulative distribution function

References

1. K. Manolakis, V. Jungnickel, C. Oberli, T. Wild, V. Braun, N. Vucic, M. Castaneda, Cooperative cellular networks: overcoming the effects of real-world impairments. IEEE Veh. Technol. Mag. 10(3), 30–40 (2015)
2. V. Jungnickel, K. Manolakis, W. Zirwas, B. Panzner, V. Braun, M. Lossow, M. Sternad, R. Apelfröjd, T. Svensson, The role of small cells, coordinated multipoint, and massive MIMO in 5G. IEEE Commun. Mag. 52(5), 44–51 (2014)
3. A. Osseiran, J. Monserrat, P. Marsch, O. Queseth, H. Tullberg, M. Fallgren, K. Kusume, A. Höglund, H. Droste, I. Silva, P. Rost, M. Boldi, J. Sachs, P. Popovski, D. Gozalvez-Serrano, P. Fertl, Z. Li, F. Moya, G. Fodor, J. Lianghai, 5G Mobile and Wireless Communications Technology, vol. 06 (Cambridge University Press, Cambridge, 2016)
4. S.M.R. Islam, N. Avazov, O.A. Dobre, K. Kwak, Power-domain non-orthogonal multiple access (NOMA) in 5G systems: potentials and challenges. IEEE Commun. Surv. Tutor. 19(2), 721–742 (2017)
5. M. Hashemi, M. Coldrey, M. Johansson, S. Petersson, Integrated access and backhaul in fixed wireless access systems, in 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), pp. 1–5 (2017)
6. A. Checko, H.L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M.S. Berger, L. Dittmann, Cloud RAN for mobile networks: a technology overview. IEEE Commun. Surv. Tutor. 17(1), 405–426 (2015)
7. C. Pan, H. Ren, M. Elkashlan, A. Nallanathan, L. Hanzo, Robust beamforming design for ultra-dense user-centric C-RAN in the face of realistic pilot contamination and limited feedback. IEEE Trans. Wirel. Commun. 18(2), 780–795 (2019)
8. 3GPP, Coordinated multi-point operation for LTE; 3GPP TR 36.819 v11.0.0, 3GPP TSG RAN WG1, Tech. Rep., Sept. (2011)
9. Qualcomm, How can CoMP extend 5G NR to high capacity and ultra-reliable communications? Tech. Rep. (2018)
10. 3GPP, Technical specification group services and system aspects; release 15 description; 3GPP TR 21.915, v15.0.0, Tech. Rep., 2019-09
11. 3GPP, Technical specification group radio access network; study on further enhancements to coordinated multi-point (CoMP) operation for LTE; 3GPP TR 36.741, v14.0.0, Tech. Rep., 2017-03
12. D. Lee, H. Seo, B. Clerckx, E. Hardouin, D. Mazzarese, S. Nagata, K. Sayana, Coordinated multipoint transmission and reception in LTE-Advanced: deployment scenarios and operational challenges. IEEE Commun. Mag. 50(2), 148–155 (2012)
13. J. Li, A. Papadogiannis, R. Apelfröjd, T. Svensson, M. Sternad, Performance evaluation of coordinated multi-point transmission schemes with predicted CSI, in IEEE 23rd International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) 2012, pp. 1055–1060 (2012)
14. Z. Mayer, J. Li, A. Papadogiannis, T. Svensson, On the impact of control channel reliability on coordinated multi-point transmission. EURASIP J. Wirel. Commun. Netw. 2014(1), 28 (2014)
15. K. Manolakis, V. Jungnickel, C. Oberli, T. Wild, V. Braun, Impairments in cooperative mobile networks: models, impact on performance and mitigation, in European Wireless 2014; 20th European Wireless Conference, pp. 1–8 (2014)
16. K. Manolakis, C. Oberli, V. Jungnickel, F. Rosas, Analysis of synchronization impairments for cooperative base stations using OFDM. Int. J. Antennas Propag. 2015, 1–14 (2015)
17. B. Makki, J. Li, T. Eriksson, T. Svensson, Throughput analysis for multi-point joint transmission with quantized CSI feedback, in 76th IEEE Vehicular Technology Conference, VTC Fall 2012; Quebec City, QC; Canada; 3 September 2012 through 6 September 2012, pp. 1–5 (2012)
18. L. Shi, Z. Hu, T. Zhang, Z. Zeng, Performance analysis of delayed limited feedback based on per-cell codebook in CoMP systems, in 2015 IEEE Wireless Communications and Networking Conference, WCNC 2015, New Orleans, LA, USA, March 9–12, 2015, pp. 363–368 (2015)
19. S. Zhou, J. Gong, Z. Niu, Distributed adaptation of quantized feedback for downlink network MIMO systems. IEEE Trans. Wirel. Commun. 10(1), 61–67 (2011)
20. B. Makki, J. Li, T. Eriksson, T. Svensson, Throughput analysis for multi-point joint transmission with quantized CSI feedback. IEEE Veh. Technol. Conf. (VTC Fall) 2012, 1–5 (2012)
21. B. Makki, J. Li, T. Eriksson, T. Svensson, Coordinated multi-point joint transmission with partial channel information feedback, in European Wireless 2013; 19th European Wireless Conference, pp. 1–5 (2013)
22. M. Lossow, S. Jaeckel, V. Jungnickel, V. Braun, Efficient MAC protocol for JT CoMP in small cells. IEEE Int. Conf. Commun. Workshops ICC 2013, 1166–1171 (2013)
23. T.R. Lakshmana, J. Li, C. Botella, A. Papadogiannis, T. Svensson, Scheduling for backhaul load reduction in CoMP. IEEE Wirel. Commun. Netw. Conf. WCNC 2013, 227–232 (2013)
24. D. Marabissi, G. Bartoli, R. Fantacci, M. Pucci, An optimized CoMP transmission for a heterogeneous network using EICIC approach. IEEE Trans. Veh. Technol. 65(10), 8230–8239 (2016)
25. A. Papadogiannis, H.J. Bang, D. Gesbert, E. Hardouin, Downlink overhead reduction for multi-cell cooperative processing enabled wireless networks, in IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications, Sept 2008, pp. 1–5 (2008)
26. A. Papadogiannis, H.J. Bang, D. Gesbert, E. Hardouin, Efficient selective feedback design for multicell cooperative networks. IEEE Trans. Veh. Technol. 60(1), 196–205 (2011)
27. T.R. Lakshmana, A. Tölli, R. Devassy, T. Svensson, Precoder design with incomplete feedback for joint transmission. IEEE Trans. Wirel. Commun. 15(3), 1923–1936 (2016)
28. Y. Yu, T. Hsieh, A. Pang, Millimeter-wave backhaul traffic minimization for CoMP over 5G cellular networks. IEEE Trans. Veh. Technol. 68(4), 4003–4015 (2019)
29. 3GPP, Technical specification group radio access network; physical channels and modulation, release 10, TS 36.211 v12.5.0, Tech. Rep., March 2015
30. The 5G evolution: 3GPP releases 16–17, Tech. Rep., January 2020
31. 3GPP, NR; physical layer procedures for data, TS 38.214 v15.4.0, Tech. Rep., December 2018
32. R. Ahmed, F. Tosato, M. Maso, Overhead reduction of NR type II CSI for NR release 16, in WSA 2019; 23rd International ITG Workshop on Smart Antennas, pp. 1–5 (2019)
33. M. Taki, M.B. Nezafati, Delay constrained throughput optimisation with imperfect CSI using discrete adaptive power. Int. J. Electron. 105(12), 2033–2051 (2018)
34. M. Taki, M.B. Nezafati, Integrated scheduling and link adaptation for heterogeneous networks: design and performance analysis. Int. J. Electron. (2019)
35. M. Taki, T. Svensson, M.B. Nezafati, Delay constrained throughput optimization in multi-hop AF relay networks, using limited quantized CSI. EURASIP J. Wirel. Commun. Netw. 2019(1), 102 (2019)
36. M. Joham, W. Utschick, J.A. Nossek, Linear transmit processing in MIMO communications systems. IEEE Trans. Signal Process. 53(8), 2700–2712 (2005)
37. P. Xiao, M. Sellathurai, Improved linear transmit processing for single-user and multi-user MIMO communications systems. IEEE Trans. Signal Process. 58(3), 1768–1779 (2010)
38. S. Wahls, H. Boche, Linear IIR-MMSE precoding for frequency selective MIMO channels, in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3264–3267 (2011)
39. D. Sacristán-Murga, M. Payaró, A. Pascual-Iserte, Robust linear precoding for MSE minimization in MIMO broadcast systems with channel gram matrix feedback, in 2011 IEEE 12th International Workshop on Signal Processing Advances in Wireless Communications, pp. 341–345 (2011)
40. L. You, L. Lei, D. Yuan, Load balancing via joint transmission in heterogeneous LTE: modeling and computation, in 2015 IEEE 26th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pp. 1173–1177 (2015)
41. J. Wang, D.P. Palomar, Robust MMSE precoding in MIMO channels with pre-fixed receivers. IEEE Trans. Signal Process. 58(11), 5802–5818 (2010)
42. F. Zhang, Matrix Theory Basic Results and Techniques (Springer, New York, 2011)
43. P. Boggs, J. Tolle, Sequential quadratic programming. Acta Numer. 4, 1–51 (1995)
44. Z. Zhu, An efficient sequential quadratic programming algorithm for nonlinear programming. J. Comput. Appl. Math. 175(2), 447–464 (2005)
45. A. Schrijver, Theory of Linear and Integer Programming (Wiley, New York, NY, 1998)
46. S. Joshi, S. Boyd, Sensor selection via convex optimization. IEEE Trans. Signal Process. 57(2), 451–462 (2009)
47. S. Sigdel, W.A. Krzymien, M. Al-Shalash, Greedy and progressive user scheduling for CoMP wireless networks, pp. 4218–4223 (2012)
48. K. Elkhalil, A. Kammoun, T.Y. Al-Naffouri, M.S. Alouini, A blind antenna selection scheme for single-cell uplink massive MIMO, in 2016 IEEE Globecom Workshops (GC Wkshps), pp. 1–6 (2016)
49. F. Yuan, C. Yang, Bit allocation between per-cell codebook and phase ambiguity quantization for limited feedback coordinated multi-point transmission systems. IEEE Trans. Commun. 60(9), 2546–2559 (2012)
50. R. Bhagavatula, R.W. Heath, Adaptive limited feedback for sum-rate maximizing beamforming in cooperative multicell systems. IEEE Trans. Signal Process. 59(2), 800–811 (2011)
51. D. Su, X. Hou, C. Yang, Quantization based on per-cell codebook in cooperative multi-cell systems. IEEE Wirel. Commun. Netw. Conf. 2011, 1753–1758 (2011)
52. E. Biglieri, R. Calderbank, A. Constantinides, A. Goldsmith, A. Paulraj, H. Poor, MIMO Wireless Communications (Cambridge University Press, Cambridge, 2007)
53. K.B. Petersen, M.S. Pedersen, J. Larsen, K. Strimmer, L. Christiansen, K. Hansen, L. He, L. Thibaut, M. Barão, S. Hattinger, V. Sima, W. The, The matrix cookbook, Tech. Rep. (2006)
54. 3GPP, Evolved universal terrestrial radio access (E-UTRA); physical channels and modulation, TS 36.211 V14.4.0, 2017-09
55. 3GPP, Evolved universal terrestrial radio access, radio frequency system scenarios, TR 36.942-a20 Release 10 (2015)
56. J. Anderson, A. Svensson, Coded Modulation Systems, ser. Information Technology Series (Springer US, 2003)
57. T. Rappaport, Wireless Communications: Principles and Practice, ser. Prentice Hall communications engineering and emerging technologies series (Prentice Hall PTR, 2002)
58. M. Yu, A. Tang, X. Wang, C. Han, Joint scheduling and power allocation for 6G terahertz mesh networks, in 2020 International Conference on Computing, Networking and Communications (ICNC), pp. 631–635 (2020)


Funding

The authors declare that this work received no funding.

Author information

Authors and Affiliations



All authors contributed equally to the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mehrdad Taki.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix 1

The MSE at the u-th user is computed as

$$\begin{aligned} {\hbox {MSE}_{u}}= & {} E\left\{ \left\| {\varvec{x}}_{u}-\alpha _{u}\widetilde{{\varvec{x}}}_{u}\right\| _{2}^{2}\right\} =E\left\{ \left( {\varvec{x}}_{u}-\alpha _{u}\widetilde{{\varvec{x}}}_{u}\right) \left( {\varvec{x}}_{u}-\alpha _{u}\widetilde{{\varvec{x}}}_{u}\right) ^{H}\right\} \nonumber \\= & {} E\left\{ {\varvec{x}}_{u}{\varvec{x}}_{u}^{H}\right\} -\alpha _{u}^{H}E\left\{ {\varvec{x}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\} -\alpha _{u}E\left\{ \widetilde{{\varvec{x}}}_{u}{\varvec{x}}_{u}^{H}\right\} +\alpha _{u}\alpha _{u}^{H}E\left\{ \widetilde{{\varvec{x}}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\} . \end{aligned}$$

As the users’ data symbols have unit variance, \(E\left\{ {\varvec{x}}_{u}{\varvec{x}}_{u}^{H}\right\} ={\mathbf{I}}_{{\mathrm{N}}_{r}}\). By writing the detected symbol at the u-th user as \(\widetilde{{\varvec{x}}}_{u}={\varvec{g}}_{u}{\varvec{y}}_{u}\) and using (1), \(E\left\{ {\varvec{x}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\}\) and \(E\left\{ \widetilde{{\varvec{x}}}_{u}{\varvec{x}}_{u}^{H}\right\}\) are computed as

$$\begin{aligned}&E\left\{ {\varvec{x}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\} =E\left\{ {\varvec{x}}_{u}\left( {\varvec{g}}_{u}{\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}+{\varvec{g}}_{u}{\varvec{n}}_{u}\right) ^{H}\right\} \nonumber \\&\quad =E\left\{ \left( {\varvec{x}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{i}^{H}{\varvec{W}}_{i}^{H}\right) {\varvec{H}}_{u}^{H}{\varvec{g}}_{u}^{H}+{\varvec{x}}_{u}{\varvec{n}}_{u}^{H}{\varvec{g}}_{u}^{H}\right\} \nonumber \\&\quad =E\left\{ {\varvec{x}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{i}^{H}{\varvec{W}}_{i}^{H}\right\} {\varvec{H}}_{u}^{H}{\varvec{g}}_{u}^{H}+{\varvec{x}}_{u}E\left\{ {\varvec{n}}_{u}^{H}\right\} {\varvec{g}}_{u}^{H}={\varvec{W}}_{u}^{H}{\varvec{H}}_{u}^{H}{\varvec{g}}_{u}^{H}, \end{aligned}$$
$$\begin{aligned}&E\left\{ \widetilde{{\varvec{x}}}_{u}{\varvec{x}}_{u}^{H}\right\} =E\left\{ \left( {\varvec{g}}_{u}{\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}+{\varvec{g}}_{u}{\varvec{n}}_{u}\right) {\varvec{x}}_{u}^{H}\right\} \nonumber \\&\quad =E\left\{ \left( {\varvec{g}}_{u}{\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}{\varvec{x}}_{u}^{H}\right) +{\varvec{g}}_{u}{\varvec{n}}_{u}{\varvec{x}}_{u}^{H}\right\} \nonumber \\&\quad ={\varvec{g}}_{u}{\varvec{H}}_{u}E\left\{ \sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}{\varvec{x}}_{u}^{H}\right\} +{\varvec{g}}_{u}E\left\{ {\varvec{n}}_{u}\right\} {\varvec{x}}_{u}^{H}={\varvec{g}}_{u}{\varvec{H}}_{u}{\varvec{W}}_{u}, \end{aligned}$$

where the user data symbols are independent and zero mean, and the noise is zero mean and independent of the data. The term \(E\left\{ \widetilde{{\varvec{x}}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\}\) in (56) is the variance of the received signal, which can be evaluated as

$$\begin{aligned}&E\left\{ \widetilde{{\varvec{x}}}_{u}\widetilde{{\varvec{x}}}_{u}^{H}\right\} =E\left\{ {\varvec{g}}_{u}\left[ \left( {\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}+{\varvec{n}}_{u}\right) \left( \sum _{j=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{j}^{H}{\varvec{W}}_{j}^{H}{\varvec{H}}_{u}^{H}+{\varvec{n}}_{u}^{H}\right) \right] {\varvec{g}}_{u}^{H}\right\} \nonumber \\&\quad =E\left\{ {\varvec{g}}_{u}\left[ {\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}\sum _{j=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{j}^{H}{\varvec{W}}_{j}^{H}{\varvec{H}}_{u}^{H}+{\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}{\varvec{n}}_{u}^{H}+{\varvec{n}}_{u}\sum _{j=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{j}^{H}{\varvec{W}}_{j}^{H}{\varvec{H}}_{u}^{H}+{\varvec{n}}_{u}{\varvec{n}}_{u}^{H}\right] {\varvec{g}}_{u}^{H}\right\} \nonumber \\&\quad ={\varvec{g}}_{u}E\left\{ {\varvec{H}}_{u}\sum _{i=1}^{{\mathrm{N}}_{u}}{\varvec{W}}_{i}{\varvec{x}}_{i}\sum _{j=1}^{{\mathrm{N}}_{u}}{\varvec{x}}_{j}^{H}{\varvec{W}}_{j}^{H}{\varvec{H}}_{u}^{H}\right\} {\varvec{g}}_{u}^{H}+{\varvec{g}}_{u}E\left\{ {\varvec{n}}_{u}{\varvec{n}}_{u}^{H}\right\} {\varvec{g}}_{u}^{H}\nonumber \\&\quad ={\varvec{g}}_{u}\left( \sum _{i=1}^{{\mathrm{N}}_{u}}\left( {\varvec{H}}_{u}{\varvec{W}}_{i}\right) \left( {\varvec{H}}_{u}{\varvec{W}}_{i}\right) ^{H}\right) {\varvec{g}}_{u}^{H}+\sigma _{n}^{2}{\varvec{g}}_{u}{\varvec{g}}_{u}^{H}. \end{aligned}$$

Finally, the MSE at the u-th user is obtained as

$$\begin{aligned} \hbox {MSE}_{u}={\mathbf{I}}_{{\mathrm{N}}_{r}}-\alpha _{u}^{H}\left( {\varvec{g}}_{u}{\varvec{H}}_{u}{\varvec{W}}_{u}\right) ^{H}-\alpha _{u}{\varvec{g}}_{u}{\varvec{H}}_{u}{\varvec{W}}_{u}+\alpha _{u}\alpha _{u}^{H}\left[ {\varvec{g}}_{u}\left( \sum _{i=1}^{{\mathrm{N}}_{u}}\left( {\varvec{H}}_{u}{\varvec{W}}_{i}\right) \left( {\varvec{H}}_{u}{\varvec{W}}_{i}\right) ^{H}\right) {\varvec{g}}_{u}^{H}+\sigma _{n}^{2}{\varvec{g}}_{u}{\varvec{g}}_{u}^{H}\right] . \end{aligned}$$

Similarly, the sum MSE at all users is computed as

$$\begin{aligned} \hbox {MSE}= & {} E\left\{ \left\| {\varvec{x}}-\varvec{\alpha }\widetilde{{\varvec{x}}}\right\| _{2}^{2}\right\} =E\left\{ \left\| {\varvec{x}}-\varvec{\alpha }\varvec{G}\left( {\varvec{H}}{\varvec{W}}{\varvec{x}}+{\varvec{n}}\right) \right\| _{2}^{2}\right\} \nonumber \\= & {} E\left\{ \left\| \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) {\varvec{x}}-\varvec{\alpha }\varvec{G}{\varvec{n}}\right\| _{2}^{2}\right\} =E\left\{ \left( \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) {\varvec{x}}-\varvec{\alpha }\varvec{G}{\varvec{n}}\right) \left( \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) {\varvec{x}}-\varvec{\alpha }\varvec{G}{\varvec{n}}\right) ^{H}\right\} \nonumber \\= & {} E\left\{ \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) {\varvec{x}}{\varvec{x}}^{H}\left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) ^{H}\right\} -E\left\{ \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) {\varvec{x}}\left( \varvec{\alpha }\varvec{G}{\varvec{n}}\right) ^{H}\right\} \nonumber \\&-E\left\{ \left( \varvec{\alpha }\varvec{G}{\varvec{n}}\right) {\varvec{x}}^{H}\left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) ^{H}\right\} +E\left\{ \left( \varvec{\alpha }\varvec{G}{\varvec{n}}\right) \left( \varvec{\alpha }\varvec{G}{\varvec{n}}\right) ^{H}\right\} \nonumber \\= & {} \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) E\left\{ {\varvec{x}}{\varvec{x}}^{H}\right\} \left( {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right) ^{H}+\left( \varvec{\alpha G}\right) E\left\{ {\varvec{n}}{\varvec{n}}^{H}\right\} \left( \varvec{\alpha G}\right) ^{H}\nonumber \\= & {} \left\| {\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\right\| _{F}^{2}+\sigma _{n}^{2}\left\| \varvec{\alpha }\varvec{G}\right\| _{F}^{2}, 
\end{aligned}$$

where \(E\left\{ {\varvec{n}}\right\} =\varvec{0}_{{\mathrm{N}}_{U}},\) \(E\left\{ {\varvec{n}}{\varvec{n}}^{H}\right\} =\sigma _{n}^{2}{\mathbf{I}}_{{\mathrm{N}}_{U}}\) and \(E\left\{ {\varvec{x}}{\varvec{x}}^{H}\right\} ={\mathbf{I}}_{{\mathrm{N}}_{U}}\).
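The two closed forms above can be cross-checked numerically. The sketch below is restricted to single-antenna users (\({\mathrm{N}}_{r}=1\)), so that \({\varvec{g}}_{u}\) and \(\alpha _{u}\) are scalars and \(\varvec{G}\) and \(\varvec{\alpha }\) are diagonal. The function names and the channel and precoder values are hypothetical and serve only to verify that the per-user MSEs add up to the sum MSE.

```python
def effective_gains(H, W):
    """gains[u][i] = H_u W_i: gain from the stream of user i to user u (N_r = 1)."""
    return [[sum(h * w for h, w in zip(H_u, W_i)) for W_i in W] for H_u in H]


def per_user_mse(u, gains, g, alpha, noise_var):
    """Per-user closed form for scalar g_u and alpha_u."""
    scale = alpha[u] * g[u]
    rx_power = sum(abs(e) ** 2 for e in gains[u]) + noise_var
    return 1.0 - 2.0 * (scale * gains[u][u]).real + abs(scale) ** 2 * rx_power


def sum_mse(gains, g, alpha, noise_var):
    """||I - alpha G H W||_F^2 + sigma_n^2 ||alpha G||_F^2 for diagonal alpha and G."""
    n = len(gains)
    total = 0.0
    for u in range(n):
        scale = alpha[u] * g[u]
        for i in range(n):
            residual = (1.0 if i == u else 0.0) - scale * gains[u][i]
            total += abs(residual) ** 2
        total += noise_var * abs(scale) ** 2
    return total


# Hypothetical 2-user, 2-antenna example (values chosen only for illustration).
H = [[1.0 + 0.0j, 0.2 - 0.1j], [0.1 + 0.3j, 0.9 + 0.0j]]    # H[u][b]: antenna b to user u
W = [[0.8 + 0.0j, 0.0 + 0.1j], [-0.1 + 0.0j, 0.9 + 0.0j]]   # W[i][b]: weight of antenna b for user i
g = [1.0 + 0.0j, 1.0 + 0.0j]
alpha = [1.1 + 0.0j, 1.05 + 0.0j]
noise_var = 0.1

gains = effective_gains(H, W)
per_user = [per_user_mse(u, gains, g, alpha, noise_var) for u in range(len(H))]
total = sum_mse(gains, g, alpha, noise_var)
```

Each row of \({\mathbf{I}}-\varvec{\alpha }\varvec{G}{\varvec{H}}{\varvec{W}}\) contributes exactly one per-user MSE, which is why the sum of `per_user` agrees with `total` up to rounding.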

Appendix 2

To compute the complexity of the SQP based inner loop optimization, the subproblem (31, 32) is considered, where the gradient of the objective function is computed as

$$\begin{aligned} \nabla f\left( {\varvec{\lambda}}_{W}\right)= & {} \nabla \left[ \left\| {\varvec{\lambda}}_{\widetilde{H}}.{\varvec{\lambda}}_{W}-\varvec{1}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}+\left\| \widetilde{{\varvec{H}}}\varvec{Z}\right\| _{F}^{2}\right] =\nabla \left\| {\varvec{\lambda}}_{\widetilde{H}}.{\varvec{\lambda}}_{W}-\varvec{1}_{{\mathrm{N}}_{U}\times 1}\right\| _{2}^{2}+\nabla \left\| \widetilde{{\varvec{H}}}\varvec{Z}\right\| _{F}^{2}\nonumber \\= & {} \nabla f_{1}\left( {\varvec{\lambda}}_{W}\right) +\nabla f_{2}\left( {\varvec{\lambda}}_{W}\right) \end{aligned}$$

By considering \(\nabla f\left( {\varvec{\lambda}}_{W}\right) =\left[ \frac{\partial f\left( {\varvec{\lambda}}_{W}\right) }{\partial \lambda _{1}},\ldots ,\frac{\partial f\left( {\varvec{\lambda}}_{W}\right) }{\partial \lambda _{{\mathrm{N}}_{U}}}\right]\), the \(\nabla f_{1}\left( {\varvec{\lambda}}_{W}\right)\) is computed as

$$\begin{aligned}&\frac{\partial f_{1}\left( {\varvec{\lambda}}_{W}\right) }{\partial \lambda _{u}}=\frac{\partial }{\partial \lambda _{u}}\sum _{i=1}^{{\mathrm{N}}_{U}}\left( \lambda _{\widetilde{H}}\left( i\right) \lambda _{W}\left( i\right) -1\right) ^{2}=2\lambda _{\widetilde{H}}\left( u\right) \left( \lambda _{\widetilde{H}}\left( u\right) \lambda _{W}\left( u\right) -1\right) \end{aligned}$$
$$\begin{aligned}&\nabla f_{1}\left( {\varvec{\lambda}}_{W}\right) =2{\varvec{\lambda}}_{\widetilde{H}}\cdot \left( {\varvec{\lambda}}_{\widetilde{H}}.{\varvec{\lambda}}_{W}-{\mathbf{1}}_{{\mathrm{N}}_{U}\times 1}\right) \end{aligned}$$
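The expression for \(\nabla f_{1}\) can be verified against a central finite difference. The sketch below assumes real-valued entries for \({\varvec{\lambda}}_{\widetilde{H}}\) and \({\varvec{\lambda}}_{W}\); the function names and the numbers are hypothetical.

```python
def f1(lam_h, lam_w):
    """f1 = sum_i (lam_h[i] * lam_w[i] - 1)^2."""
    return sum((h * w - 1.0) ** 2 for h, w in zip(lam_h, lam_w))


def grad_f1(lam_h, lam_w):
    """Analytic gradient: 2 * lam_h[u] * (lam_h[u] * lam_w[u] - 1) per entry."""
    return [2.0 * h * (h * w - 1.0) for h, w in zip(lam_h, lam_w)]


def numerical_grad(func, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function of a list."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += eps
        down[k] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    return grad


lam_h = [2.0, 0.5, 1.5]   # hypothetical singular values of the channel
lam_w = [0.4, 1.0, 0.9]   # hypothetical precoder singular values
analytic = grad_f1(lam_h, lam_w)
numeric = numerical_grad(lambda w: f1(lam_h, w), lam_w)
```

Because each entry of the gradient depends only on its own pair of singular values, evaluating it costs one pass over the \({\mathrm{N}}_{U}\) entries.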

By applying the chain rule and using the gradient of the squared norm, \(\nabla \left\| {\varvec{A}}{\varvec{x}}\right\| ^{2}=2{\varvec{A}}^{T}{\varvec{A}}{\varvec{x}}\), the \(\nabla f_{2}\left( {\varvec{\lambda}}_{W}\right)\) is computed as

$$\begin{aligned} \nabla f_{2}\left( {\varvec{\lambda}}_{W}\right) =2\widetilde{{\varvec{H}}}^{H}\widetilde{{\varvec{H}}}\varvec{Z}\left( \nabla \varvec{Z}\right) =2\widetilde{{\varvec{H}}}^{H}\widetilde{{\varvec{H}}}\varvec{Z}\left[ \left( \varvec{1}_{{\mathrm{N}}_{B}\times {\mathrm{N}}_{U}}-{\varvec{S}}\right) .\left( {\varvec{V}}_{\widetilde{H}}{\varvec{U}}_{\widetilde{H}}^{H}\right) \right] \end{aligned}$$

Similarly, the gradient of the constraint is computed as

$$\begin{aligned} \nabla g\left( {\varvec{\lambda}}_{W}\right)= & {} \nabla \left[ \sum _{u=1}^{{\mathrm{N}}_{U}}\lambda _{W}\left( u\right) ^{2}-\sum _{i,j|{\varvec{S}}(i,j)=0}\left\| \varvec{\Phi }_{i,j}{\varvec{\lambda}}_{W}\right\| ^{2}-P_{t}\right] \nonumber \\= & {} 2{\varvec{\lambda}}_{W}-\sum _{i,j|{\varvec{S}}(i,j)=0}2\varvec{\Phi }_{i,j}^{H}\varvec{\Phi }_{i,j}{\varvec{\lambda}}_{W} \end{aligned}$$
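The constraint gradient can be checked in the same way. In the sketch below the matrices \(\varvec{\Phi }_{i,j}\) are replaced by small real-valued stand-ins, since their actual entries depend on definitions given earlier in the paper; the names and values are therefore assumptions for illustration only.

```python
def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def constraint(lam_w, phis, p_t):
    """g = sum_u lam_w[u]^2 - sum over idle links of ||Phi lam_w||^2 - P_t."""
    total = sum(v * v for v in lam_w)
    idle = sum(sum(y * y for y in mat_vec(phi, lam_w)) for phi in phis)
    return total - idle - p_t


def grad_constraint(lam_w, phis):
    """Analytic gradient: 2 lam_w - sum over idle links of 2 Phi^T Phi lam_w."""
    grad = [2.0 * v for v in lam_w]
    for phi in phis:
        y = mat_vec(phi, lam_w)
        for k in range(len(lam_w)):
            # Entry k of Phi^T y is column k of Phi dotted with y.
            grad[k] -= 2.0 * sum(phi[r][k] * y[r] for r in range(len(phi)))
    return grad


def numerical_grad(func, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function of a list."""
    grad = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += eps
        down[k] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    return grad


lam_w = [0.4, 1.0, 0.9]
# One stand-in matrix per idle link, i.e. per zero entry of S (Q = 2 here).
phis = [[[1.0, 0.0, 0.5]], [[0.0, 0.2, 0.0], [0.3, 0.0, 0.0]]]
analytic = grad_constraint(lam_w, phis)
numeric = numerical_grad(lambda w: constraint(w, phis, 1.0), lam_w)
```

The loop over `phis` runs once per idle link, so the cost of the gradient grows with the number Q of zero entries in \({\varvec{S}}\).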

Let C denote the number of iterations of the SQP sub-problem and Q the number of zero elements in the matrix \({\varvec{S}}\). The computational complexity of \(\nabla f_{1}\left( {\varvec{\lambda}}_{W}\right)\) and \(\nabla f_{2}\left( {\varvec{\lambda}}_{W}\right)\) is \({\mathcal{O}}\left( {\mathrm{N}}_{U}\right)\) and \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\), respectively; therefore, the complexity of \(\nabla f\left( {\varvec{\lambda}}_{W}\right)\) is \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\). The complexity of \(g\left( {\varvec{\lambda}}_{W}\right)\) and \(\nabla g\left( {\varvec{\lambda}}_{W}\right)\) is \({\mathcal{O}}\left( C{\mathrm{N}}_{U}\right)\) and \({\mathcal{O}}\left( Q{\mathrm{N}}_{U}^{2}\right)\), respectively. In the simulations, the SQP sub-problem converges within \(C<100\) iterations, with no further improvement afterwards. Assuming \(Q\ll {\mathrm{N}}_{B}{\mathrm{N}}_{U}\), the overall computational complexity of solving problem (29) can be simplified to \({\mathcal{O}}\left( C{\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\).

Using the complexity computed for the inner-loop optimization, the complexity of the pseudo code presented in Algorithm 1 is

$$\begin{aligned} {\mathcal{O}}\left( B_{1}+MAXRETRIES\left( B_{2}+Q\left( B_{3}+\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}-Q\right) B_{4}\right) \right) \right) \end{aligned}$$

where MAXRETRIES is the number of outer iterations and the terms \(B_{1}\), \(B_{2}\), \(B_{3}\) and \(B_{4}\) represent the blocks of the pseudo code in steps \(1-4\), \(6-7,22\), \(9,17-20\) and \(11-15\), respectively. Based on the simulations, the Greedy algorithm of the outer loop converges within \(MAXRETRIES\le 3\) iterations.

The initialization steps of Algorithm 1 in block \(B_{1}\) include the computation of \({\varvec{S}}\) with complexity \({\mathcal{O}}\left( Q{\mathrm{N}}_{B}{\mathrm{N}}_{U}\right)\), the computation of the SVD with complexity \({\mathcal{O}}\left( Q{\mathrm{N}}_{B}{\mathrm{N}}_{U}\min \left( {\mathrm{N}}_{B},{\mathrm{N}}_{U}\right) \right)\), solving problem (29) with complexity \({\mathcal{O}}\left( C{\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\), and evaluating the MSE with complexity \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\). Considering the worst case, the complexity of \(B_{1}\) is \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\). In blocks \(B_{2}\) and \(B_{3}\), the time and computational complexity can grow with \({\mathrm{N}}_{B}{\mathrm{N}}_{U}\); therefore, the complexity of these blocks is \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}\right)\). The complexity of block \(B_{4}\) mainly depends on solving \(f\left( \widetilde{{\varvec{H}}},q\right)\) in step 13, which is \({\mathcal{O}}\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\). Finally, by ignoring the lower-order terms, the overall computational complexity of Algorithm 1 can be simplified to \({\mathcal{O}}\left( MAXRETRIES\times Q\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}-Q\right) C{\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\right)\). By assuming \(Q\ll {\mathrm{N}}_{B}{\mathrm{N}}_{U}\) and ignoring the small constants, the overall complexity is simplified to \({\mathcal{O}}\left( C{\mathrm{N}}_{B}{\mathrm{N}}_{U}^{3}\right)\).
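To give a feel for how the dominant term \(MAXRETRIES\times Q\left( {\mathrm{N}}_{B}{\mathrm{N}}_{U}-Q\right) C{\mathrm{N}}_{B}{\mathrm{N}}_{U}^{2}\) grows, the short sketch below evaluates it for two problem sizes. The numerical values (\({\mathrm{N}}_{B}=3\), \({\mathrm{N}}_{U}\in \{6,12\}\), \(Q=2\), \(C=100\), \(MAXRETRIES=3\)) and the function names are hypothetical choices for illustration only; the result is an order-of-growth indicator, not a measured operation count.

```python
def inner_cost(c, n_b, n_u):
    """Dominant term of solving problem (29): C * N_B * N_U^2."""
    return c * n_b * n_u ** 2


def algorithm1_cost(max_retries, q, c, n_b, n_u):
    """Dominant term of Algorithm 1:
    MAXRETRIES * Q * (N_B * N_U - Q) * C * N_B * N_U^2.
    """
    return max_retries * q * (n_b * n_u - q) * inner_cost(c, n_b, n_u)


# Hypothetical settings: only N_U changes between the two cases.
small = algorithm1_cost(max_retries=3, q=2, c=100, n_b=3, n_u=6)
large = algorithm1_cost(max_retries=3, q=2, c=100, n_b=3, n_u=12)
growth = large / small

print(f"N_U = 6 : {small}")
print(f"N_U = 12: {large}")
print(f"growth factor when N_U doubles: {growth:.2f}")
```

With \(Q\) and \({\mathrm{N}}_{B}\) held fixed, doubling \({\mathrm{N}}_{U}\) multiplies the dominant term by 8.5 in this example. The factor approaches \(2^{3}=8\) as \({\mathrm{N}}_{B}{\mathrm{N}}_{U}\) grows relative to \(Q\), in line with the cubic dependence on \({\mathrm{N}}_{U}\) stated above.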


The purpose of this study was to design a downlink in a centralized JT-CoMP system with sparse feedback and constrained backhaul links. The system consists of neighboring cells and users at the common boundary, or cluster area, in the middle of the cells. The channels are assumed to be time-variant, following the 3GPP channel model. The throughput of the system in terms of MSE was optimized using a two-layer method consisting of an inner SVD precoder design and an outer Greedy link selection approach. Furthermore, sparse feedback and constrained backhaul schemes based on ZF and Wiener precoders were defined and used as benchmarks for the proposed scheme.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


Cite this article

Nezafati, M.B., Taki, M. & Svensson, T. MSE minimized joint transmission in coordinated multipoint systems with sparse feedback and constrained backhaul requirements. J Wireless Com Network 2021, 103 (2021).
