A QoS-based joint user association and resource allocation scheme in ultra-dense networks

Ultra-dense networks (UDNs) have become an important architecture for fifth generation (5G) networks. A large number of small base stations (SBSs) are deployed to provide high-speed and seamless connections for users in the network. However, the capacity gain brought by the dense distribution of SBSs comes at the cost of severe inter-cell interference. Although the user-centric virtual cell method has been proposed to solve the interference problem, some challenges have been encountered in practical applications. For example, inter-cell interference still exists to a certain extent, and the cell load may be imbalanced. Hence, under the virtual cell architecture, we propose a quality of service (QoS)-based joint user association and resource allocation scheme in UDNs. In order to mitigate the interference, balance the cell load and improve the system throughput, a non-convex NP-hard problem is formulated. To effectively solve this problem, we decouple it into three sub-problems: user association, physical resource block (PRB) allocation and power allocation. First, we consider the QoS requirements of user equipment (UE) and perform user association based on the PRB estimation method. Then, based on the overlapped virtual cells constructed, we propose a graph-based PRB allocation scheme for reducing virtual inter-cell interference. Moreover, we solve the power allocation sub-problem by using the difference of concave (DC) programming method. The simulation results show that our proposed scheme is superior to other schemes in terms of user rates, cell load and system throughput.

the purpose of utilizing joint transmission technology [6] to reduce interference and enhance useful signals can be achieved, thereby greatly increasing the user rates. However, under the virtual cell architecture, inter-cell interference still exists within a certain range. Due to the irregular network topology and randomly distributed UEs, the cell load may be imbalanced, which degrades the system performance. Therefore, it is necessary to seek a joint user association and resource allocation scheme under the virtual cell architecture. It is worth noting that, in this paper, the process of user association is equivalent to the process of virtual cell clustering.
There have been quite a few studies on coordinated multipoint (CoMP) transmission, which can effectively suppress inter-cell interference, improve system capacity and meet the QoS requirements of edge users. In [7], the authors propose a two-step joint clustering and scheduling scheme. In the first step, a load-aware clustering scheme is designed and solved by game theory. In the second step, based on the clustering results, a fair graph-coloring-based inter-cell resource scheduling scheme is employed to maximize resource utilization. [8,9] present load-aware user-centric CoMP clustering algorithms that consider the trade-off between load balance and spectrum efficiency. In addition, some other studies focus on user-centric virtual cells, where each UE has some base stations associated with it to avoid a low signal to interference plus noise ratio (SINR). [10,11] propose load-aware virtual cell schemes that can meet the user QoS requirements. First, the authors find the optimal radius of the virtual cell. Then they select the appropriate activation radius based on the cell load. In [12], the authors formulate a beamforming problem to maximize the sum-rate in a virtual cell network. [13] formulates an energy efficiency maximization problem in user-centric virtual cell networks by optimizing the beamforming vectors and the access point clusters while considering both rate requirements and power budgets.
In recent years, significant efforts have been devoted to reducing the computational complexity of user association and resource allocation. In [14], the authors propose a cluster-based energy-saving resource allocation method in UDNs. First, a modified K-means method is proposed in the clustering stage, which can reduce complexity and inter-cluster interference. Then, in the resource allocation stage, a two-step PRB allocation algorithm is performed and an iterative power allocation scheme is designed and solved by a non-cooperative game. [15] presents a joint optimization scheme of capacity maximization and power minimization for user association and resource allocation. To reduce the complexity, a clustering scheme is adopted in the solution of the optimization problem. As an effective method to reduce the computational complexity, graph theory has been widely used in resource allocation [16][17][18][19]. [20] formulates a graph-based joint user-centric overlapped clustering and resource allocation problem, in which the traffic load and the limited number of PRBs are considered. In order to improve system throughput and reduce computational complexity, [21][22][23][24][25] utilize the DC programming technique, which can be used to convert a non-convex function into the difference of two convex functions and to obtain an approximately optimal solution through an iterative method. However, a joint user association and resource allocation scheme under the virtual cell architecture that uses multiple PRBs for each user to reduce interference, balance the cell load, guarantee QoS requirements and improve system throughput has not been studied in previous work.
In this paper, under the virtual cell architecture, a QoS-based joint user association and resource allocation scheme is proposed to mitigate the interference, balance the cell load and improve the system throughput. It is worth mentioning that the joint problem is a non-convex NP-hard problem. To solve this problem, we propose a three-step scheme. In the first step, we design a user association scheme in which QoS requirements and cell load are considered. In the second step, we use a modified graph-based method for PRB allocation to reduce inter-cell interference. In the third step, we apply the DC programming technique to power allocation and perform an iterative update algorithm that converges to an approximately optimal solution. Finally, simulations indicate that our scheme outperforms the other schemes.
The main contributions of this paper can be summarized as follows:
1. A novel framework is proposed for jointly designing QoS-based user association and resource allocation under the virtual cell architecture. This is the first attempt to consider the use of multiple PRBs for each user to reduce interference, balance cell load, guarantee QoS requirements of UE and improve system throughput.
2. To solve the non-convex NP-hard joint user association and resource allocation problem, three independent sub-problems are formulated.
a. A new user-centric overlapped virtual cell clustering scheme is proposed to solve the user association problem under load imbalance while considering the QoS requirements.
b. A low-complexity PRB allocation scheme is constructed by using a modified graph-based method to further mitigate virtual inter-cell interference.
c. The DC programming method is applied to power allocation, and an iterative update algorithm is performed that converges to an approximately optimal solution.
3. Simulations verify that the performance of our proposed scheme is better than that of other schemes in terms of user rates, cell load and system throughput.
The rest of the paper is organized as follows. We give the system model and formulate the joint optimization problem in Sect. 2. Section 3 describes our three-step joint scheme and discusses its implementation process. We analyze the complexity of our scheme in Sect. 4. Section 5 evaluates the proposed methods and compares their performance with that of other methods. Finally, conclusions are drawn in the last section.

System model and problem formulation
The two-tier heterogeneous UDN scenario [8] is one of the main scenarios of 5G. We consider a downlink two-tier heterogeneous UDN that consists of a macro base station (MBS), N SBSs and K UEs, as shown in Fig. 1. In order to harvest densification gains and avoid cross-tier interference, the control plane (C-plane) and user plane (U-plane) split architecture in [26] is adopted in our system, where a network controller unit (NCU) is installed in the MBS. The NCU takes charge of virtual cell formation, radio resource allocation and mobility management. SBSs only undertake the task of data transmission. The SBS index set is $\mathcal{N} = \{1, \ldots, N\}$ and the UE index set is $\mathcal{K} = \{1, \ldots, K\}$. We make three assumptions: (1) SBSs and UEs are randomly distributed following a Poisson point process (PPP) within the coverage area of the MBS; (2) the backhaul is ideal and channel state information (CSI) is perfectly known; (3) with zero-forcing (ZF) precoding, intra-cluster interference is completely eliminated.
The matrix $X$ is defined as the user association matrix, with elements $x_{k,n} \in \{0, 1\}$, $\forall k \in \mathcal{K}$, $\forall n \in \mathcal{N}$. $x_{k,n} = 0$ means that SBS $n$ is not accessed by UE $k$, and $x_{k,n} = 1$ means that SBS $n$ is accessed by UE $k$. $L$ represents the total number of PRBs per cell, and the PRB index set is $\mathcal{L} = \{1, \ldots, L\}$. The matrix $Y$ is defined as the PRB allocation matrix, with elements $y_{k,l} \in \{0, 1\}$, $\forall k \in \mathcal{K}$, $\forall l \in \mathcal{L}$. $y_{k,l} = 0$ indicates that PRB $l$ is not allocated to UE $k$, and $y_{k,l} = 1$ indicates that PRB $l$ is allocated to UE $k$. $p_{n,l}$ represents the power allocated by SBS $n$ to PRB $l$, and $h_{k,n,l}$ represents the channel gain between SBS $n$ and UE $k$ on PRB $l$. $\sigma^2$ indicates the power of the additive white Gaussian noise (AWGN). Each PRB is assumed to have a bandwidth of $B$. The SINR of UE $k$ on PRB $l$ can be written as

$$\mathrm{SINR}_{k,l} = \frac{\sum_{n \in C_k} x_{k,n}\, p_{n,l}\, h_{k,n,l}}{\sum_{k' \neq k} \sum_{n \notin C_k} x_{k',n}\, p_{n,l}\, h_{k,n,l} + \sigma^2} \qquad (1)$$

where $C_k$ represents the serving virtual cell cluster of UE $k$. According to the Shannon formula, the rate of UE $k$ on PRB $l$ can be expressed as

$$R_{k,l} = B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \qquad (2)$$

Fig. 1 The user-centric virtual cells system model of the ultra-dense network
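The rate of UE $k$ can be written as

$$R_k = \sum_{l=1}^{L} y_{k,l}\, B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \qquad (3)$$

The total system throughput is

$$R = \sum_{k=1}^{K} R_k \qquad (4)$$

The joint optimization problem is formulated as

$$\begin{aligned}
\max_{X,\,Y,\,p}\quad & \sum_{k=1}^{K} \sum_{l=1}^{L} y_{k,l}\, B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \\
\text{s.t.}\quad & C1:\ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{k,n}\, y_{k,l}\, p_{n,l} \le p_n^{max},\ \forall n \\
& C2:\ \sum_{l=1}^{L} y_{k,l} \ge 1,\ \forall k \\
& C3:\ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{k,n}\, y_{k,l} \le L,\ \forall n \\
& C4:\ \sum_{n=1}^{N} x_{k,n} \le N_{max},\ \forall k \\
& C5:\ \sum_{l=1}^{L} y_{k,l}\, B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \ge R_{k,min},\ \forall k
\end{aligned} \qquad (5)$$

where $p_n^{max}$ is the maximum transmission power of SBS $n$, $N_{max}$ is the maximum number of SBSs in the virtual cell cluster, and $R_{k,min}$ is the minimum required rate of UE $k$. C1 indicates that the total power allocated to UEs cannot exceed the maximum transmission power of each SBS. C2 implies that at least one PRB is allocated to each UE. C3 means that the number of PRBs used by users cannot exceed the maximum number of PRBs in each small cell (SC). C4 expresses that the number of SBSs accessed by each UE cannot exceed the maximum number of SBSs in the virtual cell cluster. C5 requires that the UE rate be no less than the minimum required rate.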
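The rate and throughput definitions above can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function name `system_throughput` and the array layout are hypothetical, and the interference term is simplified by assuming that every SBS outside the cluster of UE k interferes on PRB l with its power on that PRB.

```python
import numpy as np

def system_throughput(X, Y, p, h, noise, B):
    """Per-UE rates and total throughput.

    X: (K, N) user association, Y: (K, L) PRB allocation,
    p: (N, L) power per SBS and PRB, h: (K, N, L) channel gains.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Received power at UE k from SBS n on PRB l.
    rx = np.asarray(p, dtype=float)[None, :, :] * np.asarray(h, dtype=float)
    # Useful signal: joint transmission from the SBSs in the cluster of UE k.
    signal = np.einsum("kn,knl->kl", X, rx)
    # Assumption: all SBSs outside the cluster interfere on this PRB.
    interference = np.einsum("kn,knl->kl", 1.0 - X, rx)
    sinr = signal / (interference + noise)
    rate_per_prb = B * np.log2(1.0 + sinr)
    # Only PRBs allocated to the UE contribute to its rate.
    user_rate = (Y * rate_per_prb).sum(axis=1)
    return user_rate, user_rate.sum()
```

Constraint C5 can then be checked by comparing the returned per-UE rates against the minimum required rates.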

QoS-based joint user association and resource allocation scheme in UDNs
It can be observed that problem (5) is non-convex and NP-hard. Since user association and resource allocation are coupled, solving the joint optimization problem (5) directly results in extremely high computational complexity, especially in UDN scenarios. Therefore, a novel framework is proposed for jointly designing QoS-based user association and resource allocation under the virtual cell architecture. The flowchart of the proposed user-centric overlapped clusters framework is shown in Fig. 2.
We divide problem (5) into three sub-problems: user association, PRB allocation and power allocation. The user association sub-problem is to determine the serving virtual cell cluster for each UE based on PRB estimation. A two-step graph-based approach is applied to the PRB allocation sub-problem. The DC programming method is applied to the power allocation sub-problem. Both the PRB allocation and the power allocation sub-problems aim to reduce virtual inter-cell interference and improve the system throughput. In this section, we describe the three parts of the proposed scheme and discuss its implementation process in practical systems.

User association
Our proposed user association scheme is divided into two stages: the estimation stage and the allocation stage. In the estimation stage, in order to estimate the number of PRBs required by each UE and the cell load of each SC, we make two assumptions: (1) the transmit power of each SBS is equally allocated to all PRBs; (2) for each UE, the interference on each PRB comes from all SBSs except the cooperative SBSs. The SINR of UE $k$ can be written as

$$\mathrm{SINR}_k = \frac{\sum_{n \in C_k} \frac{p_n^{max}}{L}\, h_{k,n}}{\sum_{n \notin C_k} \frac{p_n^{max}}{L}\, h_{k,n} + \sigma^2} \qquad (6)$$

where $h_{k,n}$ is the channel gain between SBS $n$ and UE $k$. The number of PRBs required by UE $k$ can be estimated by

$$N_k^{req} = \left\lceil \frac{R_{k,min}}{R_k} \right\rceil \qquad (7)$$

where $\lceil \cdot \rceil$ denotes the ceiling function that rounds up to the nearest integer. Here $R_k$ is the maximum achievable throughput from one PRB, which can be estimated as

$$R_k = B \log_2\left(1 + \mathrm{SINR}_k\right) \qquad (8)$$

The estimated load of SC $n$ is

$$\sum_{k \in \mathcal{N}_n} N_k^{req} \qquad (9)$$

where $\mathcal{N}_n$ is the set of UEs associated to SC $n$. We define the Max Reference Signal Receiving Power (RSRP) rule: each UE takes the SBSs with the highest RSRP values as its virtual cell cluster.
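The estimation stage can be sketched in a few lines. This is an illustrative sketch under the two assumptions stated above, not the paper's Algorithm 1: the function names `max_rsrp_clusters` and `estimate_prbs` are hypothetical, RSRP is approximated as transmit power times channel gain, and a UE with zero estimated SINR is not handled.

```python
import math
import numpy as np

def max_rsrp_clusters(h, p_max, n_max):
    """Max RSRP rule: each UE keeps the n_max SBSs with the strongest RSRP."""
    rsrp = np.asarray(h, dtype=float) * np.asarray(p_max, dtype=float)[None, :]
    return [set(np.argsort(-row, kind="stable")[:n_max].tolist()) for row in rsrp]

def estimate_prbs(h, clusters, p_max, L, noise, B, r_min):
    """Estimate the PRBs required by each UE and the resulting load of each SC."""
    h = np.asarray(h, dtype=float)
    K, N = h.shape
    # Assumption (1): equal power on every PRB of an SBS.
    rx = h * (np.asarray(p_max, dtype=float) / L)[None, :]
    n_prb = []
    load = np.zeros(N, dtype=int)
    for k in range(K):
        members = sorted(clusters[k])
        mask = np.zeros(N, dtype=bool)
        mask[members] = True
        # Assumption (2): every SBS outside the cluster interferes.
        sinr = rx[k, mask].sum() / (rx[k, ~mask].sum() + noise)
        rate_per_prb = B * math.log2(1.0 + sinr)
        need = math.ceil(r_min[k] / rate_per_prb)
        n_prb.append(need)
        load[members] += need  # each serving SBS reserves the PRBs
    return n_prb, load
```

An SC whose estimated load exceeds L would be treated as overloaded in the allocation stage.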
Here we introduce our proposed user association scheme in detail from the perspectives of the UE and the SBS. In the initial state, the candidate SBS list of each UE consists of all SBSs. In the estimation stage, from the perspective of the UE, each UE selects $N_{max}$ SBSs as its virtual cell cluster based on the Max RSRP rule. From the perspective of the SBS, we estimate the load and obtain the UE list of each SC. In the allocation stage, from the perspective of the SBS, the SCs are sorted in descending order of SC load. If the first SC is overloaded, we sort the UEs of this SC in descending order of SINR. The UE with the largest SINR is accepted and allocated PRBs first. For the other UEs, we check in turn whether the remaining PRBs in this SC are sufficient. If they are, the user association request is accepted; otherwise the request is rejected and the UE is removed from the UE list of this SC. From the perspective of the UE, the SBS of the first overloaded SC is removed from the candidate SBS list of the UEs that are not associated to this SBS. Those UEs reselect their cooperative SBSs according to the Max RSRP rule and associate in turn until no SC is overloaded. The specific algorithm is given in Algorithm 1.
As an example in Fig. 3, we assume that the maximum number of SBSs in the virtual cell cluster is 2, the number of PRBs of each SBS is 16, and the number of PRBs required by each UE is 4. As shown in Fig. 3a, UE1, …, UE6 each select their cooperative SBSs based on the Max RSRP rule. In Fig. 3b
Now we can obtain the user association matrix X based on Algorithm 1. The clustering result of virtual cells is {C 1 , C 2 , ..., C K }.

PRB allocation
After the UEs are associated to the SBSs, the next step is to solve the user-centric PRB allocation problem. Let $F = XX^{T} - N_{max} I_K$; then the matrix $F$ is the overlap indicator matrix of the virtual cell clusters $C_i$ and $C_j$ (where $i, j \in \{1, \ldots, K\}$). The element $f_{i,j}$ represents the number of SBSs shared by the virtual cell clusters $C_i$ and $C_j$.
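As a small worked example of the overlap matrix, the sketch below computes F for three UEs and four SBSs with a cluster size of 2. The function name `overlap_matrix` and the example association matrix are hypothetical. Note that the diagonal of F is zero only when every cluster contains exactly the maximum number of SBSs.

```python
import numpy as np

def overlap_matrix(X, n_max):
    """F = X X^T - N_max * I: off-diagonal entries count the shared SBSs."""
    X = np.asarray(X, dtype=int)
    return X @ X.T - n_max * np.eye(X.shape[0], dtype=int)

# Hypothetical association: UE0 -> {SBS0, SBS1}, UE1 -> {SBS1, SBS2}, UE2 -> {SBS2, SBS3}.
X_example = np.array([[1, 1, 0, 0],
                      [0, 1, 1, 0],
                      [0, 0, 1, 1]])
F_example = overlap_matrix(X_example, n_max=2)
```

Here UE0 and UE1 share one SBS, UE1 and UE2 share one SBS, and UE0 and UE2 share none.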
The graph coloring method has been widely used in resource allocation for decades to reduce the computational complexity. In the graph, each UE represents a vertex, and an edge between two vertices indicates interference between the two UEs. The underlying PRB allocation sub-problem can therefore be converted into a graph coloring problem, but this problem cannot be solved directly by existing graph-based methods. In [20], each UE is allocated only one PRB, whereas in our proposed scheme each UE can be allocated multiple PRBs. Therefore, we extend the graph coloring scheme in [20] for PRB allocation. The specific process is as follows.

Graph construction
The graph is constructed as $G = (V, E)$, where $V$ is the set of vertices $\{C_1, C_2, \ldots, C_K\}$ corresponding to the virtual cell clusters, and $E$ is the set of edges connecting pairs of vertices. $d(C_k)$ represents the degree of the vertex $C_k$, which is equal to the number of edges incident to the vertex $C_k$. We construct edges based on matrix $F$: if $f_{i,j} > 1$, an edge between the cluster $C_i$ and the cluster $C_j$ is formed. From Algorithm 1, we obtain the number of PRBs used by each SC, $N_n^{used}$, $\forall n \in \mathcal{N}$. We assume that the total transmit power of each SBS is allocated equally to the PRBs used by each SC, i.e., each used PRB receives $p_n^{max} / N_n^{used}$, $\forall n \in \mathcal{N}$. The interference of UE k on PRB l can be written as (10)
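The edge construction rule can be sketched as follows. The function name `build_graph` and the `threshold` parameter are hypothetical; the default threshold follows the condition as written above, and the vertices are the UE indices 0, ..., K-1.

```python
import numpy as np

def build_graph(F, threshold=1):
    """Adjacency sets and vertex degrees from the overlap matrix F.

    An edge (i, j) is formed when F[i, j] > threshold.
    """
    F = np.asarray(F)
    K = F.shape[0]
    adj = {i: set() for i in range(K)}
    for i in range(K):
        for j in range(i + 1, K):
            if F[i, j] > threshold:
                adj[i].add(j)
                adj[j].add(i)
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    return adj, degree
```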

Graph coloring
The vertex with the maximum degree should be colored first. The reasons are as follows: (1) the higher the degree of a vertex, the more vertices are adjacent to it, which means that more clusters share SBSs with it; (2) orthogonal PRBs need to be allocated among the clusters with the same serving SBS; (3) since the number of PRBs is insufficient, PRBs need to be reused. If the vertex with the maximum degree is colored first, more PRB choices remain for the other vertices with lower degrees. To solve this problem conveniently, our method is divided into two stages: the sorting stage and the coloring stage.
In the sorting stage, we first search for the vertex $C_1^*$ with the highest degree in $V$. We define the set $D_{C_k}$ as the set of the vertices adjacent to the vertex $C_k$, where $C_k \in V$, $\forall k \in \mathcal{K}$. Then, we find the set $D_{C_1^*}$ based on matrix $F$, and the degree of every vertex in $D_{C_1^*}$ is reduced by one. This operation is repeated on the remaining vertices in $V$ to obtain the vertices $C_2^*, C_3^*, \ldots$, until all vertices have been placed sequentially into the initially empty set $S$.
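The sorting stage can be sketched as below. The function name `sort_vertices` is hypothetical, and ties between vertices of equal degree are broken by the smallest vertex index, which is an assumption since the text does not specify a tie-breaking rule.

```python
def sort_vertices(adj):
    """Order vertices by repeatedly removing the one with the highest remaining degree.

    adj maps each vertex to the set of its adjacent vertices.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    order = []
    while remaining:
        # max() keeps the first maximum, so sorting gives lowest-index tie-breaking.
        v = max(sorted(remaining), key=lambda u: degree[u])
        order.append(v)
        remaining.remove(v)
        # Removing v lowers the degree of each neighbour that is still unsorted.
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return order
```

For a path graph 0-1-2-3-4, vertex 1 is picked first, then vertex 3, and the remaining isolated vertices follow in index order.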
In the coloring stage, we sequentially label the PRBs required by each vertex in the set $S$ and obtain the set $L_S$. The corresponding natural number set is $L_S = \{1, \ldots, |L_S|\}$. We assign colors to the elements of the set $L_S$ in turn. For the first $L$ elements of the set $L_S$, we randomly assign $L$ different colors to them and update matrix $Y$. For each of the other elements of the set $L_S$, we first find the vertex $C_k^*$ corresponding to the element. Second, we find the set $D_{C_k^*}$ based on matrix $F$. Third, we choose the PRB with the minimum interference that is not used by the vertices of the set $D_{C_k^*}$ and assign this color to the vertex $C_k^*$. Finally, we update the interference on this PRB and matrix $Y$.
Now we can obtain the PRB allocation matrix Y based on Algorithm 2.
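The coloring stage can be illustrated with a simplified sketch. It is not the paper's Algorithm 2: the function name `color_prbs` is hypothetical, the interference on a PRB is replaced by a proxy (the number of vertices already using that PRB), ties are broken by the lowest PRB index instead of randomly, and when no orthogonal PRB is left the sketch falls back to reusing the least-used PRB, which is an assumption.

```python
def color_prbs(order, adj, demand, num_prbs):
    """Assign demand[v] PRBs to each vertex, avoiding PRBs used by its neighbours."""
    usage = [0] * num_prbs  # proxy for the interference on each PRB
    alloc = {v: set() for v in order}
    for v in order:
        # PRBs already taken by adjacent (overlapping) clusters.
        blocked = set().union(*(alloc[u] for u in adj[v]))
        for _ in range(demand[v]):
            candidates = [l for l in range(num_prbs)
                          if l not in blocked and l not in alloc[v]]
            if not candidates:
                # Assumed fallback: reuse a PRB when none is orthogonal.
                candidates = [l for l in range(num_prbs) if l not in alloc[v]]
            if not candidates:
                break  # the vertex already holds every PRB
            best = min(candidates, key=lambda l: (usage[l], l))
            alloc[v].add(best)
            usage[best] += 1
    return alloc
```

With enough PRBs, adjacent vertices end up with disjoint PRB sets while non-adjacent vertices reuse PRBs.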

Power allocation
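After forming the user-centric overlapped virtual cell clusters and allocating PRBs to each UE, we solve the user-centric power allocation problem. Given the user association matrix $X$ and the PRB allocation matrix $Y$, problem (5) is converted into the problem

$$\begin{aligned}
\max_{p}\quad & \sum_{k=1}^{K} \sum_{l=1}^{L} y_{k,l}\, B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \\
\text{s.t.}\quad & C1:\ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{k,n}\, y_{k,l}\, p_{n,l} \le p_n^{max},\ \forall n \\
& C5:\ \sum_{l=1}^{L} y_{k,l}\, B \log_2\left(1 + \mathrm{SINR}_{k,l}\right) \ge R_{k,min},\ \forall k
\end{aligned} \qquad (13)$$

where

$$\mathrm{SINR}_{k,l} = \frac{\sum_{n \in C_k} x_{k,n}\, p_{n,l}\, h_{k,n,l}}{\sum_{k' \neq k} \sum_{n \notin C_k} x_{k',n}\, p_{n,l}\, h_{k,n,l} + \sigma^2}$$

We can observe that the objective function of (13) is not concave [23]. However, it has a special structure that we can exploit as follows.

We define $f(p) = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{k,l}(p)$ and $g(p) = \sum_{k=1}^{K} \sum_{l=1}^{L} g_{k,l}(p)$, where

$$f_{k,l}(p) = y_{k,l}\, B \log_2\left( \sum_{n \in C_k} x_{k,n}\, p_{n,l}\, h_{k,n,l} + \sum_{k' \neq k} \sum_{n \notin C_k} x_{k',n}\, p_{n,l}\, h_{k,n,l} + \sigma^2 \right)$$

$$g_{k,l}(p) = y_{k,l}\, B \log_2\left( \sum_{k' \neq k} \sum_{n \notin C_k} x_{k',n}\, p_{n,l}\, h_{k,n,l} + \sigma^2 \right)$$

and $p \in \mathcal{P}$, where $\mathcal{P}$ denotes the feasible set spanned by constraints C1 and C5. Both $f(p)$ and $g(p)$ are concave, since each is a sum of logarithms of affine functions of $p$. Thus, the DC programming method [21][22][23][24][25] can be applied by writing the objective function of (13) as $f(p) - g(p)$. In a similar manner, C5 can be written as

$$\sum_{l=1}^{L} \left( f_{k,l}(p) - g_{k,l}(p) \right) \ge R_{k,min},\ \forall k$$

In DC programming, we start from a feasible initial point and solve the optimization problem iteratively. Let $\tau$ denote the iteration number. At the $\tau$-th iteration, we employ the first-order Taylor approximation for $g(p)$ and $g_{k,l}(p)$:

$$g(p) \approx g(p^{(\tau)}) + \nabla g(p^{(\tau)})^{T} \left(p - p^{(\tau)}\right)$$

$$g_{k,l}(p) \approx g_{k,l}(p^{(\tau)}) + \nabla g_{k,l}(p^{(\tau)})^{T} \left(p - p^{(\tau)}\right)$$

where $p^{(\tau)}$ is the solution of the problem at the $\tau$-th iteration, $\nabla$ denotes the gradient operation, and $\nabla g(p^{(\tau)})$ is a column vector with $NL$ elements. Each element of $\nabla g(p^{(\tau)})$ can be computed as

$$\frac{\partial g(p^{(\tau)})}{\partial p_{n,l}} = \sum_{k:\, n \notin C_k} \frac{y_{k,l}\, B}{\ln 2} \cdot \frac{\sum_{k' \neq k} x_{k',n}\, h_{k,n,l}}{\sum_{k' \neq k} \sum_{m \notin C_k} x_{k',m}\, p_{m,l}^{(\tau)}\, h_{k,m,l} + \sigma^2}$$

Hence, by substituting $g(p^{(\tau)})$ and $\nabla g(p^{(\tau)})$ into the optimization problem (13), the problem (13) can be written as

$$\begin{aligned}
\max_{p}\quad & f(p) - g(p^{(\tau)}) - \nabla g(p^{(\tau)})^{T} \left(p - p^{(\tau)}\right) \\
\text{s.t.}\quad & C1:\ \sum_{k=1}^{K} \sum_{l=1}^{L} x_{k,n}\, y_{k,l}\, p_{n,l} \le p_n^{max},\ \forall n \\
& C5:\ \sum_{l=1}^{L} \left( f_{k,l}(p) - g_{k,l}(p^{(\tau)}) - \nabla g_{k,l}(p^{(\tau)})^{T} \left(p - p^{(\tau)}\right) \right) \ge R_{k,min},\ \forall k
\end{aligned} \qquad (19)$$

In order to use the DC programming method to solve the power allocation sub-problem, we need to prove the following three propositions.

Proposition 1 The approximation of (19) gives a tight lower bound for the objective function of (13).
Proof Since $g(p)$ is a concave function, the first-order condition of concave functions [27] gives

$$g(p) \le g(p^{(\tau)}) + \nabla g(p^{(\tau)})^{T} \left(p - p^{(\tau)}\right)$$

so that $f(p) - g(p) \ge f(p) - g(p^{(\tau)}) - \nabla g(p^{(\tau)})^{T} (p - p^{(\tau)})$. When $p = p^{(\tau)}$, the equality holds, which shows the tightness of the lower bound.

Proposition 2 The approximation of (19) results in a sequence of improved solutions for the problem of (13).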
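The iterative DC procedure can be illustrated on a toy problem. The sketch below is not the paper's power allocation algorithm: it assumes one transmitter per UE on a single PRB, drops the QoS constraint C5, uses a per-link power cap, and replaces the convex solver of each iteration by an exhaustive search over a power grid. The function name `dc_power_allocation` and all parameters are hypothetical. Because the surrogate is a tight lower bound, the true objective does not decrease from one iteration to the next.

```python
import itertools
import numpy as np

def dc_power_allocation(G, noise, p_max, iters=10, grid=21):
    """Toy DC iteration for sum-rate maximisation.

    G[k, j] is the gain from transmitter j to receiver k; link k is served by transmitter k.
    """
    G = np.asarray(G, dtype=float)
    K = G.shape[0]
    G_off = G - np.diag(np.diag(G))  # interference gains only

    def f(p):  # concave: log of signal + interference + noise
        return np.log2(G @ p + noise).sum()

    def g(p):  # concave: log of interference + noise
        return np.log2(G_off @ p + noise).sum()

    def grad_g(p):
        return G_off.T @ (1.0 / (G_off @ p + noise)) / np.log(2.0)

    levels = np.linspace(0.0, p_max, grid)
    points = np.array(list(itertools.product(levels, repeat=K)))
    p = np.full(K, float(p_max))  # feasible starting point that lies on the grid
    history = [f(p) - g(p)]
    for _ in range(iters):
        g_tau, grad_tau = g(p), grad_g(p)
        # Surrogate: f(q) minus the first-order expansion of g around p.
        surrogate = [f(q) - g_tau - grad_tau @ (q - p) for q in points]
        p = points[int(np.argmax(surrogate))]
        history.append(f(p) - g(p))
    return p, history
```

Calling `dc_power_allocation(np.array([[1.0, 0.5], [0.5, 1.0]]), noise=0.1, p_max=1.0)` returns the final power vector and the sum-rate after each iteration.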

Computational complexity
Here, we evaluate the complexity of our proposed scheme. The complexity of Algo- , where the complexity of the graph construction algorithm is $O(NKL^2N_{max}^2)$ and the complexity of the PRB allocation algorithm is $O(L^2)$. As a comparison, the complexity of the exhaustive search algorithm for the optimal solution is $O(K^L)$, and the complexity of the SA algorithm or the RA algorithm is $O(L)$. The complexity of the optimal exhaustive search increases exponentially as the number of UEs increases. Hence, compared with the optimal exhaustive search solution, our proposed solution efficiently reduces the complexity in UDNs.

System simulation
In this section, we characterize the performance of the proposed framework under different conditions via numerical simulations. We consider one MBS whose coverage area is a circle of 300 m radius. SBSs and UEs are randomly distributed following PPP distributions with densities M and K, respectively. By default, we set M = 50 SBS/km², K = 350 UE/km² and N max = 3. We consider three different rate requirements of UEs: 256 kbps, 512 kbps and 1024 kbps, where the rate requirement of each user is randomly selected. The other simulation parameters are listed in Table 1. We use Rayleigh fading to model the channels and assume that the path loss model of each SBS is given by [29].
It can be seen that as the cluster size increases, the load of the SC increases under the two different user association schemes. The reasons are as follows: as the size of the SBS cluster increases, from the perspective of the UE, more SBSs become cooperative SBSs of the UE; from the perspective of the SBS, more UEs are associated to the SBS. Figure 4 shows the user association results based on the Max RSRP rule. A larger cluster size may improve RSRP but may also reduce total throughput. When N max is not less than 4, the SCs are overloaded because of insufficient PRBs. Moreover, due to interference issues, the maximum cluster size of SBSs or CoMP is restricted to 3 in most cellular systems. Thus, we set N max = 3 by default. Figure 5 gives the results of the user association rule based on Algorithm 1. Compared with the Max RSRP rule, Algorithm 1 eliminates the overloaded SCs within a certain range.
Fig. 11 The system throughput versus the density of UEs
Fig. 12 The system throughput versus the density of SBSs
SC load versus the different density of UEs under user association rule based on Max RSRP and Algorithm 1 respectively. It is clearly observed that the SC load is balanced by Algorithm 1, and SBSs are reselected as the cooperative SBS cluster for UEs that cannot associate to the overloaded SBSs.
Figure 10 shows that the system throughput of the proposed user association based on Algorithm 1 is higher than that of the user association based on the Max RSRP rule. This is because the SC load is considered by Algorithm 1. The UEs of an overloaded SC are transferred to non-overloaded SCs, which effectively guarantees user QoS requirements. From Figs. 4, 5 and 10, we conclude that the optimal value of N max is 3 considering the trade-off between load balance and system throughput.
In Figs. 11 and 12, we compare Algorithm 2 with other algorithms, namely random allocation (RA) [30], sequential allocation (SA) [30] and uniform frequency reuse (UFR) [31], on the user-centric clusters formed by Algorithm 1. Figure 11 depicts the system throughput versus the density of UEs under 50 SBS/km². We observe in Fig. 11 that the system throughput increases for all the solutions when the UEs become denser. However, our proposed Algorithm 2 outperforms the other algorithms in mitigating the virtual inter-cell interference and improving the system throughput.
Next, Fig. 12 describes the system throughput versus the density of SBSs under 350 UE/km². We can observe that the system throughput increases while the density of SBSs is less than 50 SBS/km² and then becomes stable. This is because when the density of the SBSs is small, only a few UEs can be served by the SBSs due to the limited number of PRBs. As SBSs become denser, the UEs can be served by more cooperative SBSs and the system throughput also increases. From Figs. 11 and 12, we can conclude that the performance of our proposed Algorithm 2 is superior to that of the other algorithms.
Fig. 13 The system throughput evolution in different density of UEs
We validate the convergence of our power allocation scheme by examining the evolution of the system throughput over iterations. Figure 13 shows the system throughput evolution in different density of UEs. We can see that the throughput reaches a stable state in fewer than 12 iterations. In Figs. 11 and 12, because the densities and the locations of UEs (or SBSs) differ, the final converged values of the system throughput for different densities of UEs (or SBSs) are unequal. This shows that the power allocation scheme using the DC programming method performs better than the original fixed power allocation scheme in terms of system throughput.

Conclusion
In this paper, we propose a novel QoS-based joint user association and resource allocation scheme under the virtual cell architecture in a downlink two-tier heterogeneous UDN. To mitigate interference, balance cell load, guarantee QoS requirements of UE and improve system throughput, a non-convex NP-hard problem is formulated, and this joint problem is decoupled into three independent sub-problems. To effectively solve these sub-problems, we propose three schemes: a new user-centric overlapped virtual cell clustering scheme, a low-complexity PRB allocation scheme and a power allocation scheme using the DC programming method. Simulation results confirm that our proposed schemes are better than existing schemes in terms of user rates, cell load and system throughput.