Autonomous Algorithms for Centralized and Distributed Interference Coordination: A Virtual Layer Based Approach

Interference mitigation techniques are essential for improving the performance of interference limited wireless networks. In this paper, we introduce novel interference mitigation schemes for wireless cellular networks with space division multiple access (SDMA). The schemes are based on a virtual layer that captures and simplifies the complicated interference situation in the network and that is used for power control. We show how optimization in this virtual layer generates gradually adapting power control settings that lead to autonomous interference minimization. Thereby, the granularity of control ranges from controlling frequency sub-band power via controlling the power on a per-beam basis, to a granularity of only enforcing average power constraints per beam. In conjunction with suitable short-term scheduling, our algorithms gradually steer the network towards a higher utility. We use extensive system-level simulations to compare three distributed algorithms and evaluate their applicability for different user mobility assumptions. In particular, it turns out that larger gains can be achieved by imposing average power constraints and allowing opportunistic scheduling instantaneously, rather than controlling the power in a strict way. Furthermore, we introduce a centralized algorithm, which directly solves the underlying optimization and shows fast convergence, as a performance benchmark for the distributed solutions. Moreover, we investigate the deviation from global optimality by comparing to a branch-and-bound-based solution.


I. INTRODUCTION
Increasing bandwidth requirements, not least due to the fast growing popularity of handheld devices with high data rate consumption, bring cellular networks to the brink of their capacity. Until recently, an end to the growth of this demand is not yet in sight. In order to optimally exploit the available bandwidth, current cellular networks experienced a paradigm shift towards frequency reuse-1. Consequently, this leads to an increased susceptibility to interference such that current and future cellular networks are usually interference limited. This situation is aggravated by a trend towards ever smaller cell sizes. Especially users at the cell-edge are affected by high inter-cell interference (ICI). In theory, fully coordinated networks, where neighboring base stations act as a large distributed antenna array, promise a vast boost in performance [3], [4]. However, this makes great demands on synchronization and backhaul bandwidth. In fact, the promised gains from such schemes turn out to be hard to implement in practice [5]. As a consequence, distributed schemes for interference mitigation, incorporating joint scheduling and adaptive power allocation, are of utmost interest. However, due to mobile users and varying channel conditions, such algorithms have to be dynamic and able to operate autonomously. 1 Moreover, future cellular systems, including pico and femto cells, must be selforganizing to maintain flexibility and scalability. Therefore, self-optimizing interference coordination schemes are needed.
In this paper, we introduce such schemes with special focus on cellular space division multiple access (SDMA) networks. All proposed algorithms can be seen as applications of a general radio resource management framework that configures and optimizes network operation autonomously and that allows us to incorporate service requirements and performance targets. The framework combines the following three essential ingredients: (i) Network utility maximization (NUM). By formulating a network utility maximization problem, we steer the system to a desired operating point. This includes fairness goals (like proportional fair and max-min).
(ii) Virtual model. On top of regular scheduling and resource allocation, we maintain a virtual model of the network that captures and simplifies the complicated interference situation in the real world. It comprises a static 'cooled-down' version of the network based on long-term gains, thus suppressing the influence of fast-fading. We use this model to obtain granular power control decisions. Thus, it can be seen as a new layer for long-term resource allocation decisions in a cooperative way (including corresponding message exchange). (iii) Suitable short-term scheduling. Instantaneously, we employ a popular gradient scheduler which is known to asymptotically converge to the solution of the underlying utility maximization problem. The scheduler has to take the power constraints into account (which can be strict or average constraints) that are obtained in the virtual layer to manage interference.
To implement this approach, we equip each base station with an additional sector controller that requires additional (infrequent) long-term feedback from the mobiles. Note, that this is a practical assumption since currently, standardized schemes (such as long-term evolution (LTE)-advanced) also consider advanced feedback concepts that are not limited to the serving base station [6]. Moreover, we permit a limited message exchange between the sector controllers. Figure 1 depicts the general approach. Based upon the long-term feedback, the sector controllers create and update a 'virtual model' of the network. The sector controller is thereby just the entity in each base station that takes care of all (virtual) resource allocation and scheduling issues. The main idea of this virtual model is to optimize over 'virtual resources', based on long-term averaged versions of true variables, in order to distributively control a rapidly changing complex network. Thus, the virtual model captures the complicated interference interdependencies in the 'real' network and is used for the control and adaption of power allocations. The virtual model allows each sector to efficiently compute estimates of the gradient of the system utility function with respect to transmit powers of particular resources in the sector, thus allowing local maximization of the overall utility. More precisely these estimates are eventually generated by a 'virtual scheduling' process. The virtual model and this virtual scheduling process can be considered as an additional virtual layer for resource allocation. The idea is that if user rates improve in the virtual layer, they also improve in the real network. Instead of exchanging channel state information (CSI) with all relevant base stations, only the (gradient) information (called sensitivities) that is obtained in the virtual layer needs to be exchanged. The virtual model, being based on average quantities, can be further justified since the goal of the algorithm is to adapt the transmit power levels to average interference levels and not to track fast fading. As mentioned before, we focus on fixed codebook-based schemes in SDMA networks. In particular, we assume that each base station maintains a fixed codebook of a certain size comprising precoding vectors, called 'beams'. These beams can be used to support multiple users on the same time-frequency resource. Using fixed codebooks is a practical assumption and allows us to compare our algorithms with practical schemes used in current cellular systems.
The optimization within the virtual layer can be organized either in a centralized or in a distributed manner. Clearly, a distributed implementation is favorable but in order to accurately quantify the tradeoffs involved, we also investigate a centralized solution. This does not only provide a valuable benchmark for the distributed algorithms, but may also be a feasible option for small networks, where a central controller is indeed possible. Our centralized baseline algorithm is based on an alternating optimization approach, solving scheduling and power optimization in the virtual control plane separately. Since user rates are strongly coupled via transmit powers, we employ a successive convex approximation technique to tackle the inherent non-convexity. Due to the non-convex nature of the underlying optimization problem, global optimality cannot be guaranteed. Therefore, we additionally assess the deviation from global optimality by comparing our approach to an optimal solution based on branch-and-bound (BNB) in a simplified setting.

A. Related Work
There is a significant amount of research on interference mitigation in cellular networks [7], which is often treated as a specific aspect of self-organization in cellular networks. A comprehensive survey on this is provided in [8]. A straightforward approach to avoid interference is to use a frequency reuse factor greater than one or some fractional frequency reuse scheme [9]. Another line of research targets reuse-1 networks and interference mitigation by power control and resource allocation. There have been a variety of suggestions for joint multicell power control and scheduling in cellular networks such as [10], [11] (see also [12] for an overview). Multicell coordination via joint scheduling, beamforming, and power adaptation is considered in [13]. Thereby, fairness requirements (leading to concave utility functions) are fundamental for current and future cellular standards. The work [14] considers joint power allocation and user assignment to cells in the NUM context, taking into account a mixture of concave and non-concave utilities. In [15], a gradient algorithm-based scheme for self-organizing resource allocation in LTE systems is proposed. However, a multitude of information has to be exchanged between coordinating sectors.
Although many of the aforementioned references consider distributed schemes, none treats resource allocation and interference management in multiuser MIMO systems. By contrast, our framework explicitly aims to exploit the freedom in terms of resource and power allocation offered by SDMA. Thereby, our framework builds upon and extends the framework introduced in [16], [17] for single-antenna (SISO) networks.
It is commonly accepted that the underlying optimization problems, which are non-convex in general, can be solved optimally only for a limited set of problems and utilities in reasonable time. Existing solutions in cellular networks often rely on uplink downlink duality [18]. Since they attempt to solve such non-convex problems directly, successive approximation techniques become an increasingly popular tool to treat this nonconvexity, used for example for power control in DSL [19] and multihop networks [20]. Global optimality is often achieved using branch-and-bound-based approaches [21], [22] or monotonic optimization [23], however at a high computational complexity. Other directions include interference pricing [24], [25] or game theoretic approaches [26]. Only recently, distributed coordination schemes in cellular networks have gained increasing popularity [27], [16], [11] due to the high complexity of centralized approaches. For example, [28] considers the derivation of transmit beamformers, also based on interference prices, for dense small cell networks.

B. Organization
The paper is organized as follows. In Section II, we introduce the considered system model, introduce the notation, and describe the optimization problem that we address. In Section III, we present a virtual control layer for solving this problem and introduce three distributed algorithms that are based on different realizations of this virtual control plane. In Section IV, we propose an alternative centralized scheme, while in Section V, we present an optimal solution based on branch-and-bound. In Section VI, we present systemlevel simulation results that evaluate the performance of the distributed algorithms and moreover investigate simpler scenarios to compare these to the centralized and the optimal baselines. Eventually, in Section VII, we state the most important conclusions.

II. SYSTEM MODEL AND NOTATION
We consider the downlink of a cellular OFDMA network, where each cell is sub-divided into three sectors. In total, we have M sectors m ∈ {1, . . . , M }. Each sector is served by a base station having n T transmit antennas with a corresponding sector controller responsible for user selection and resource allocation. There are I users randomly distributed in the system, each equipped with n R receive antennas. In the following, we assume n R = 1. We assume that each user has pending data at all times. Let I m be the number of users associated to sector m.
We assume slotted time with time slots t = 1, 2, 3, ..., called transmission time interval (TTI). In reference to LTE specifications, orthogonal frequency-division multiplexing (OFDM) sub-carriers are grouped into J sub-bands, called physical resource blocks (PRBs) 2 . For the duration of a time slot, the system is in a fixed fading state from a finite set F . Note, that the assumption of finite fading states is made to foster the analysis. In our simulations, however, we evaluate the performance using established 3GPP channel models (cf. Section VI). Nevertheless, the assumption of a finite set of fading states is often used in the literature [29] and can be further justified by the observation that, first, the measurement accuracy of the user devices is limited and, second, there is only a finite number of modulation and coding schemes that the base station can choose from. Fading state l, in turn, induces a finite set of possible scheduling decisions k ∈ K (l). We denote by π (l) the probability of fading state l (with l π (l) = 1). The sector controllers perform linear precoding, where precoding vectors for beamforming are taken (as for example in LTE) from a fixed Nelement codebook C N := {u 1 , ..., u N } which is publicly known. In the following, we identify a beamforming vector (u b ∈ C nT ) by its index b ∈ {1, ..., N }. From the sector controllers' perspective, this makes a beam b on a specific PRB j a possible resource for user selection and power allocation. This relationship is depicted schematically in Figure 2. Let P m jb be the power assigned to beam b on PRB j by base station m. This value is determined differently in each of the presented approaches under comparison. In summary, the (non-trivial) task of each sector controller is to find a scheduling decision (being an assignment of available resources on PRBs-to users) and a suitable power allocation, such that a global network utility function is maximized. Eventually, we have the following additional notational conventions. Let h m ij (t) ∈ C nT be the vector of instantaneous complex channel gains from base station m to user i on PRB j (we assume frequency-flat channels within a PRB). Accordingly, h m ij (t) , u denotes the scalar product of (beamforming) vector u and channel h m ij (t). The noise power at the mobile terminal is denoted as σ 2 . Throughout the paper, we label vectors and matrices with bold-faced letters. For the ease of reference, the most important notation used throughout this paper is summarized in Table I.

A. Problem Statement
The overall goal is to devise autonomous network control schemes which maximize an overall increasing concave utility function U . The utility function is defined as the sum of sector utility functions U m , which are in turn defined over average user ratesX m . Consequently, the problem to solve for each sector m ∈ {1, . . . , M } is given by Thereby, µ l jb (k) = µ l ijb (k) i∈{1,...,Im} is a vector comprising elements µ l ijb (k), which represent the rate that user i is assigned on PRB j and beam b when the system is in fading state l and scheduling decision  k is chosen. They can be zero if the particular resource is not assigned to user i by decision k. φ lm jk denotes the fraction of time that scheduling decision k is chosen on PRB j, provided the system is in fading state l.
In a nutshell, we are not interested in a specific 'snapshot' of the system, but only in ergodic rates. Therefore, a possible control algorithm should not adapt to a specific system state but should be able to optimize the system performance over time.
Problem (1 to 4) is solved by applying a gradient scheduler [30] at each time instance. The gradient scheduler chooses the best scheduling decision k * according to The gradient scheduler tracks average user ratesX m (t) and updates them after each time slot according toX where the fixed parameter β > 0 determines the size of the averaging window. The gradient scheduler is known to asymptotically solve the problem (for β → 0) without knowing the fading distribution. In case of logarithmic utilities, the gradient scheduler becomes the well-known proportional fair scheduler 3 .

III. AUTONOMOUS DISTRIBUTED POWER CONTROL ALGORITHMS FOR INTERFERENCE MITIGATION
We now turn to the distributed power control schemes for autonomous interference management in cellular networks, which are designed to enable network entities to locally pursue optimization of the global network 3 We use i log(X m i ) as sector utility in this paper, leading to a proportional fair operating point.
utility. We introduce three basic approaches. It is important to note that all algorithms are special cases of the well-known gradient algorithm [30], [31], whose convergence behavior has been thoroughly analyzed. Therefore, we refrain from reproducing this theoretical analysis. However, in Section VI, for illustration purposes, we present numerical results indicating a fast convergence behavior. The main difference between the algorithms that we propose is the granularity of power control. The first algorithm uses an opportunistic scheduler which only adapts the power per frequency sub-band, which is then distributed equally among activated beams. We call this opportunistic algorithm (OA). It leaves full choice to the actual scheduler as to which beam to activate at what time. The scheduler can therefore decide opportunistically (⇒ opportunistic algorithm). However, the power budget per PRB which is distributed (equally) among activated beams is determined by an associated control scheme.
The second algorithm is the virtual sub-band algorithm (VSA), which enforces strict power constraints on each beam by requiring all beams to be switched on all the time (with power values given by the associated control). This has the advantage of making the interference predictable (assuming known power values). However, it leaves only limited freedom for the actual scheduler, whose task is reduced to user selection for each beam. Since a beam is always turned on, it can be treated as an independent resource for scheduling, just like a 'virtual' sub-band (⇒ virtual sub-band algorithm).
The third is a hybrid approach, which permits opportunistic scheduling at each time instance but, in addition, enforces average power constraints per beam. We call it cost-based algorithm (CBA). It leaves more freedom for opportunistic scheduling than the virtual sub-band algorithm. In contrast to requiring all beams to be used at all times with strict power values, we only require the target beam power values to be kept on average. Thus, instantaneously, the scheduler is free to make opportunistic decisions based on the current system state. In order to assure that the average power constraints are kept, we introduce an additional cost term into the utility maximization and the gradient scheduler (⇒ cost-based algorithm).
We focus on the applicability in different fading environments, comparing the overall performance with respect to a network-wide utility function as well as the performance of cell-edge users. Thereby, we show that although the algorithms behave differently in different user mobility scenarios, in general, it is more beneficial to impose average rather than strict power constraints.
As we demonstrate later, the three algorithms perform differently with different mobility assumptions on the users. A problem that arises with increased mobility is that the (virtual) model lacks behind the actual network state. Especially when the controllers are restricted to a gradual power adaption process on a per-beam granularity, they might not always be able to fully exploit multiuser diversity. Since the proposed algorithms put different emphasis on opportunistic scheduling in power adaption and resource allocation decisions, they perform differently when facing user mobility and fast fading.
Let us now turn to the control plane, the virtual layer, of the considered algorithms. They all have the following general procedure in common. The goal of the control plane is to obtain estimates of the partial derivatives of the network utility with respect to the power allocation of particular resources and to adapt the power control policy accordingly. These estimates can be seen as estimates of the sensitivity of the network utility to changes of the allocation strategies. Thereby, the allocation can be, as in the OA of Section III-A, the power allocation of a PRB which is then divided equally among activated beams. Or it can even be, as in the VSA of Section III-B, the power allocation of an individual beam. Or it can also be, as in the CBA of Section III-C, simply an average power constraint of an individual beam, which does not have to be kept at every single time instance.
A further similarity between all algorithms is that in addition to ordinary short-term CSI, the mobiles (not necessarily often) report long-term feedback to their base stations, which is then used to calculate (virtual) user rates and, accordingly, (virtual) average rates. These average rates are a good representation of the interference coupling throughout the network and are used to calculate the 'sensitivities' to power changes on particular resources in other sectors (and in the own sector). This sensitivity information is compiled in messages which are exchanged between the sectors 4 . Upon reception of the message vectors, the sector controllers are now able to calculate the desired estimate of the system utility's sensitivity to power changes on particular resources and adjust powers (or power constraints) accordingly. It is important to always distinguish between actual rates and virtual rates. The actual average rates are the ones tracked by the ordinary (proportional fair) scheduler and determine the sector's utility. The virtual average rates are based on long-term feedback of averaged channel gains and do not have an immediate physical meaning in the 'real world'. They are created by a 'virtual' scheduler based on 'virtual' scheduling decisions. The set of all these rates forms a 'virtual' model of the system which is used to derive power adaption decisions.
The question remains how the virtual average rates and accordingly the sensitivities to power changes are calculated. Besides the granularity of power control, this virtual model, which can be also seen as a virtual layer for interference mitigation above the actual short-term scheduling, is the main difference between the investigated algorithms. In the following, we will discuss this in detail.

A. Opportunistic Algorithm (OA)
OA can be seen as a straightforward extension of the multi-sector gradient (MGR) algorithm in [16] to multi-antenna networks. Although it is designed for multi-antenna networks, it does not perform power control on a per-beam basis (as opposed to the other two algorithms) but gives complete freedom to the sector controllers with respect to the number and choice of beams that are active at every given time instance. The only value that is controlled is the power budget per PRB. Nevertheless, SDMA is applied where multiple users can be scheduled on the same PRB, however on different beams. Consequently, the long-term feedback of users comprises a codebook index (being the maximizing index b * in (7)) as well as a corresponding gain whereh m ij is the channel from user i to its sector controller on PRB j, averaged to eliminate the influence of fast-fading.
The task of finding scheduling decision k now amounts to determining the best subset of users to be scheduled on a PRB, subject to the constraint that each user can only be scheduled exclusively on its reported beam. Let us define (virtual) user rates (time index omitted), assuming a user is scheduled on PRB j (otherwise R m ij (k) = 0), given by In this paper, we use ρ(x) = log(1 + x). Thereby,P m j (k) is the current power value of PRB j divided by the number of users scheduled, thus depending on decision k. Moreover,P m ′ jG m ′ ij represents a long-term estimate of the interference of sector m ′ on PRB j (which can be measured by the mobiles).
Having defined the virtual user rates R, virtual average user rates X and sensitivities are derived by virtual scheduling (based on a similar procedure in [16]) as follows. We run the following steps n v times per TTI in each sector m and for any PRB j. Thereby, the parameter n v determines how long the virtual scheduler runs before accepting the sensitivities. Consequently, a larger value means more overhead by the virtual layer but better results.
• We determine the virtual scheduling decision k * using a gradient scheduler according to • We update virtual average user rates according to • We update sensitivities according to 9 Thereby, β 1 and β 2 are small averaging parameters. Using (8), the derivates in (9) are given by Starting with equal power, the adaption of the PRB powers can be summarized as follows. From time to time, the sensitivities are exchanged and summed up by each sector controller for each beam and PRB. Since each D (m,m) j is an estimation of the sensitivity of sector m's utility to a power change in sectorm, the summation gives an estimate of the network utility's sensitivity. Then, the power is increased on the PRB with the largest positive sum and decreased on the PRB with the largest negative sum.

B. Virtual Subband Algorithm (VSA)
VSA requires long-term feedback that comprises average link gains per beam and sector from each mobile. We define to be the average link gain (the bar denotes empirical averaging over time) of mobile terminal i on beam b and PRB j to sector controller m. Based on the long-term CSI, the control plane of VSA responsible for determining the power allocation per beam works as follows. Given average gains in (10), each sector controller calculates corresponding virtual rates according to Note that since all beams are activated at all times, we have an additional intra-sector interference term (as opposed to OA), since this interference can no longer be eliminated, e.g., by switching off beams. Let X m i be the virtual average rate of user i in sector m (not to be confused with actual average ratesX in (6)), defined as Here,φ m ijb represent optimal time fractions of resource usage for sector m. They are determined as a solution to the following optimization problem (for fixed virtual user rates): We rely on an explicit solution to (13) since we cannot apply the virtual scheduling from [16]. This is because the resources for power control (which are now individual beams) are no longer orthogonal but cause interference to each other [1]. Having the virtual user rates, the sector controllers calculate sensitivities to power changes on beams for all sectors (including self) and beams, given by Note that the small coefficient ε > 0 stems from an application of Theorem 1 (which can be found at the end of Section III-C) in order to ensure the differentiability of problem (13). The such generated sensitivities are exchanged from time to time between all sector controllers. Thereby, every sector k receives J · N sensitivity values from all other (M − 1) sectors, in addition to the J · N values from its own sector. Thus, we have summing up the sensitivities of all sectors (including itself) to a power change of beam b on PRB j in sector m and which can be either positive or negative. Note that sector indices m andm in the RHS of (15) are interchanged compared with (14), since in (14), we are interested in how the beam in sectorm interferes with sector m, while in (15), it is of interest how the beam in sector m interferes with (all) sector(s)m.
Since D m,m jb represent estimates of the sector utilities to a power change on jb in sector m, D m jb clearly is an estimate of the sensitivity of the system's utility to a power change on the respective beam. Depending on the D m jb , we can now make a power adjustment which steers the system operating point towards a greater utility in the virtual model.
The power adjustment is carried out in steps of ∆ > 0, which is small and fixed. LetP m (t) denote the current total allocated power in sector m and P max the upper bound on the total sector powers. Then, the following procedure is applied.  (2) IfP m (t) < P max , pick (jb) * (if there is one) such that D (m) (jb) * (t) is the largest among those jb with D (m) jb (t) > 0. Set P m (jb) * (t + 1) = P m (jb) * (t) + min ∆, P max −P m (t) .
(3) IfP m (t) = P max and max jb D By intuition, the algorithm reallocates power to the beams with large positive utility-sensitivity. Note that for numerical reasons, it may be necessary to specify a certain minimum power per beam P min b instead of allowing beam powers to be reduced to zero. In this case, the changes to the algorithmic notation above are straight forward, so we do not explicitly state them here.

C. Cost-Based Algorithm (CBA)
Since CBA enforces average power constraints per beam, the following additional constraint to the NUM problem (1 to 4) is introduced.
It includes cost term c jb k,P m j which represents the power cost or power consumption of beam b (on PRB j) given scheduling decision k andP m j , which is the total power budget of PRB j. We assume that each beam that is activated gets an equal share of the available total PRB powerP m j . Thus, if n j (k) is the number of beams that are activated on PRB j if decision k is chosen, the 'cost' of activating beam b on PRB j becomes To solve the modified problem (1 to 4 and 16), we also have to modify the gradient scheduler (5 to 6). The modified gradient scheduler now chooses the scheduling decision k * according to Dual parameters λ jb (t), measuring the deviation of powers over time from the target power values on a particular beam, are updated according to the following rule: Average user ratesX m (t) are maintained and updated as in (6). The above algorithm can be seen as an application of the greedy primal dual (GPD) algorithm presented in [31].
The virtual control plane differs from VSA in the following. In contrast to the VSA, where every beam is switched on all the time, the situation is different here. To enable the calculation of derivatives of the rates with respect to beam powers (needed in (22)), we introduce scaling factors α m jb (k), which scale target beam powersP m jb to powers 'costs' 5 C m jb (k) that are instantaneously used by the virtual scheduler. Thus, α m jb (k)P m jb = C m jb (k). Given average gains (10) as in VSA, the sector controllers calculate (virtual) user rates given by with jb ′ , and I inter = m =m b Gm ijbPm jb . Virtual average user rates are calculated by CBA as follows: Again,φ m jk are optimal time fractions of resource usage for sector m; however, in contrast to VSA where those time fractions were calculated explicitly, CBA uses the approach of OA to determine the virtual average rates (and implicitly the time fractions) through virtual scheduling. As before, to distinguish real and virtual scheduler, we use capital letters for virtual scheduler quantities whenever possible. In each TTI, the virtual scheduler performs n v scheduling runs. In each run, the following steps are carried out on each PRB j: • We determine the virtual scheduling decision k * similar to (17). • We update virtual average rates similar to (6). • We update the virtual average power costs for each beam b similar to (18). • We update sensitivities for each beam b and sectorm (β 2 > 0 small) according to Using (19) to (20), the derivatives in (22) are given by The power adaption is then carried out similar to the other algorithms. For each resource, each sector sums up values D (m,m) jb from all sectors and increases the power level on the beam with largest positive sum while decreasing the power level on the beam with largest negative sum. However, CBA adapts only power constraints per beam, not actually used beam powers. Figure 3 gives a sketch of power trajectories using the simulation environment described below, in Section VI-A. To find out whether CBA really holds the average power constraints, we exponentially average instantaneously used powers per beam with the same time constant used for scheduling and compare the result to the target power values determined by the virtual model. The left side of Figure 3 shows the averaged powers, as actually used by the 'real' scheduler, while the right side shows the target power values determined by the virtual scheduling procedure. Note that since scheduling is opportunistic, instantaneously, the power levels fluctuate highly and beam powers can differ from the target values (or a beam can be completely turned off). This is illustrated by the 'zoomed-in image' in Figure 3 (left), where actual powers without averaging are shown. It turns out that on average, the power constraints are kept remarkably well.
Apart from the intuitive benefits of instantaneously allowing opportunistic scheduling, we observe that from a practical point of view, average power constraints are further justified since hybrid automatic repeat request (HARQ) coding is essentially performed over multiple successive transmissions.
In all three presented algorithms, naturally, questions arise regarding differentiability. Note that the problem to be solved by each sector controller is given in (13). Due to the maximum operator, it might not be differentiable everywhere even with the utility function being differentiable. Therefore, let us define a slightly modified version of problem (13), given by One can show the following: Theorem 1: Let 0 < ε < 1 be finite and U m be an increasing concave utility function, defined in (0, ∞). Then, the family of functions f ε (R) (defined by (23)) with R ijb ≥ c > 0 (∀i, j, b) is differentiable everywhere and converges for any sequence ε n → 0 to f in (13) (which is continuous) in the uniform sense.
Proof: The proof can be found in Appendix 1. By Theorem 1, we can replace our utility function with a smooth, uniformly convergent approximation, which can be locally maximized in the power control loop. Note that this replacement is already incorporated in the calculation of sensitivities (Equation 14).

IV. AN ALTERNATING OPTIMIZATION BASED APPROACH
Given the distributed nature of the above presented algorithms, the question arises: Can a centralized controller do better? Therefore, in the following, we want to compare the algorithms of Section III with a centralized solution.
The optimization problem which the virtual controller has to solve is given by s.t. ∀b, j, m : with R m ijb (P ) = log 1 + F m ijb (P ) and F m ijb defined in (11). As described in Section III, the power allocation problem is so far solved using a distributed gradient ascent procedure. However, when we allow a centralized controller for the network, we can instead solve (24 to 26) directly each time a power update is desired and use the resulting power allocation directly for actual resource allocation.
Obviously, problem (24 to 26) is highly non-convex. In the following, we try to solve the problem by alternating the optimization in P (holding Φ constant) and Φ (holding P constant). We therefore have a scheduling sub-problem and a power allocation (PA) sub-problem. The overall procedure is summarized in Algorithm 1. 1: initialize counter τ = 0, select feasible Φ(0) and P (0) arbitrarily 2: repeat 3: τ ← τ + 1 4: solve scheduling sub-problem using fixed P (τ − 1) and obtain Φ(τ ) 5: solve SCA-based power allocation sub-problem using fixed Φ(τ ) and obtain P (τ ) (Algorithm 2) 6: until converged 7: use P (τ ) to update power allocation in network Since the scheduling sub-problem is convex (cf. Lemma 2), we only have to care about the power allocation sub-problem. We try to tackle this problem by a successive convex approximation (SCA) approach similar to [19]. The sub-problem in P is still highly non-convex. However, using Lemma 3, we obtain a convexified version of the power allocation sub-problem.

Algorithm 1 Alternating optimization-based virtual layer
Lemma 2: With constant P , optimization problem (24-26) is a convex optimization problem in Φ. Proof: The proof follows since non-negative weighted addition and scalar composition preserve concavity [32]. is a convex optimization problem. Proof: The proof can be found in Appendix 2.

Lemma 3: Using a concave lower bound (assuming appropriately chosen constants) to the user rates, given byR
The SCA procedure is summarized in Algorithm 2.

5:
τ ⇐ τ + 1 6: until converged The algorithm is initialized as described in the first step of Algorithm 2, which is equivalent to a high signalto-interference-plus-noise ratio (SINR) approximation (which can be seen when applying this initialization in (27)). The high-SINR approximation assumes that for large SINR, log(1 + SINR) ≈ log(SINR). This is a common initialization for this kind of algorithm [33], [19]. Each iteration of the algorithm comprises two steps, a maximize step and a tighten-step. In the maximize step, we find a solution to the current convexified version (28 to 29) of the power control problem. This solution is then used in the tighten step to update the convex approximation parameters α and β for each link according to (30). The algorithm converges when the tighten step (30) does not produce any (significant) changes. Being based on an inner approximation framework by Marks and Wright [34], it can be shown that Algorithm 2 converges at least to a KKT point of the PA sub-problem.

V. APPROACHING GLOBAL OPTIMALITY: COMPARISON WITH BRANCH-AND-BOUND
It is known that the underlying optimization problem is non-convex; thus, the gradient ascent-based algorithms presented in Section III as well as the alternating optimization-based algorithm of Section IV will most likely converge to a local maximum. Thus, although simulation results (cf. Section VI) show already high gains in utility, the question remains how good the solution found actually is, that is how much of the achievable performance gains is actually realized?
To simplify the analysis in this section, we restrict ourselves to the single-antenna case (which reduces all algorithms to MGR in [16]). The main difficulty of the analysis is that even in this simple setting, the underlying problem ∀m : j P m j ≤ P max (33) with R m ij = log 1 + , is highly non-convex (even in the two-sector, two-user case).
Moreover, since this is essentially a joint optimization in time fractions Φ and power values P , the complexity is significantly higher than with power control only. To gain insight into the deviation from the optimal solution of (31 to 33), we compare our algorithm's performance in the simplified setting with a (near) optimal solution based on BNB. Although computationally very expensive, it gives us a good impression on how much of the achievable performance is actually reached. We assume, without loss of generality, that the maximum sum power available in each cell in (33) is normalized to one. Branch-and-bound is a standard algorithm for global optimization, which creates a search tree where at each node, an upper and a lower bound to the problem are evaluated. Details can be found for example in [35]. It is based on constantly sub-dividing the feasible parameter region and for each node of the resulting tree calculating the upper and lower bounds to the objective function. For ease of notation, we will combine all parameters to our objective function in one vectorP = (p 1 , . . . , p n ). Let P (0) denote our initial parameter region. This region forms a convex n-dimensional polytope with V extreme points (or vertices)P v . We collect the corner points of polytope P (0) in set V (0) . Consequently, the parameter region associated with node l is denoted as P (l) with associated vertices V (l) .
Due to the structure of the constraints, we have n 2 parameter pairs with independent constraints. Each of the parameter/constraint pairs form a (two-dimensional) simplex. When branching, we split the parameter region of node l, P (l) , in sub-regions P (l+1) and P (l+2) by splitting the longest edge among the edges of all ( n 2 ) simplexes together. Doing so, we divide the parameter region into two sub-regions of equal size. Figure 4 visualizes the above-stated procedure by showing the simplex created by two interdependent power values together with an independent third dimension. The dashed line indicates the splitting into two sub-polytopes. Below, we explain how the upper and lower bounds to the sub-problems are found. It is well known that the Shannon rate can be written as a difference of concave functions f m ij Using this, our non-concave objective function (31) can be rewritten aŝ Unfortunately, φ m ij f m ij (P ) (and thus also φ m ij g m ij (P )) is neither convex nor concave (note that f (a, b) = a · log(1 + b) is not concave). However, we can define the concave function(s): , and in the same wayg m ij (P ). Using this, (34) can be upper bounded by Equation 35 is still not concave since it includes a sum of a concave function (f m ij (P )) and a convex function (−g m ij (P )). To get a concave objective function, we replaceg m ij (P ) by it's convex envelope (cf. [35]), defined as follows: Let P 1 , . . . , P V be the vertices of a polytope P. The convex envelope γ(P ) of a concave function g(P ) over P can be expressed as Using this definition, the (convex) optimization problem we have to solve in order to obtain our upper bound (at node l) is given by with λ = (λ 1 , . . . , λ V ),P ∈ P (l) ,P v ∈ V (l) , and V being the number of vertices collected in V (l) . We use three different approaches to obtain a lower bound at node l. The first lower bound is obtained by taking the maximum of the objective function values at each of the corner points of the respective parameter region. Second, we evaluate the optimal parameter vector from calculating the upper bound. Third, we use a standard solver to compute a (local) optimum of the original non-convex problem. If one of the three methods leads to a higher value, the current global lower bound is replaced.

VI. NUMERICAL EVALUATION
In this section, we present numerical results to evaluate the distributed algorithms of Section III as well as the centralized and BNB-based solutions of Section IV and Section V, respectively.

A. Distributed Algorithms (System-Level Simulations)
In order to compare the performance of the three algorithms in a setting close to practice, we conduct system-level simulations based on LTE.
1) Simulation Setup: The general simulation setup can be found in Table II. We employ a grid of seven hexagonal cells, each comprising three sectors (to ensure equal interference conditions for each sector, a wrap-around model is used at the system borders). Users are distributed randomly over the whole area; thus although we have, say, 210 user in total, leading to an average of ten users per sector, some sectors have a higher load than others. A detailed description of the precoding codebook can be found in [36]. As channel model, we use the WINNER model [37], Scenario C2. All simulations are performed with realistic link adaptation based on mutual information effective SINR mapping (MIESM); moreover, we use explicit modeling of HARQ (hybrid automatic repeat request) using chase combining. For efficiency reasons, we simulate a limited number of PRBs (J = 8) compared to a real-world system; however, the results are expected to scale to a higher number of PRBs. 2) Baseline Algorithm: As baseline, we use a non-coordinative scheduling algorithm called greedy beam distance (GBD) algorithm, with a codebook size of 3 bits (eight beams). GBD requires feedback from each user, comprising a CDI and a channel quality information (CQI) value. Note that the CQI is only an estimate of the users' SINR values in case the user alone is scheduled on the PRB (with full power), since the scheduling decisions cannot be known in advance. Thus, it does not contain intra-sector interference. Moreover, it contains only an estimate of the inter-sector interference. On each PRB, the users are greedily scheduled on their best beams (using their proportionally fair weighted CQI feedback as utility). Thereby, a minimum beam distance has to be kept, in order to minimize the interference between users scheduled on the same PRB. This distance is based on a geometrical interpretation of beamforming. It means that given a user is scheduled on a certain beam, adjacent beams (up to a certain 'distance') are blocked and users that reported one of those beams are excluded from the list of scheduling candidates for the respective PRB. We use a minimum distance of 3, that is, with the used eight-beam codebook at most three users can be scheduled on the same PRB. Of course, no adaptive power allocation is performed; the power is distributed equally among the PRBs (and further among the thereon scheduled users).
3) Simulation Results: We use two essential performance metrics for our comparison. First, since we have a system-wide proportional fair utility, we compare the geometric mean of average user rates (GAT) as a measure of the increase of the overall utility. The equivalence of maximizing GAT and sum utility can be seen when observing that the geometric mean of some set of values is equal to the exponential average of the logarithm of the values. Therefore, the sum of logarithms is maximized precisely if the geometric mean is maximized. Second, the performance of cell-edge users, which we measure by the 5% quantile of average user throughputs, is a natural benchmark for each distributed base station coordination algorithm. Figure 5 shows cell-edge user throughput over GAT for the evaluated algorithms with user mobility (and therefore with fast fading), while Figure 6 depicts the simulation results in a setting without user mobility. We compare the three approaches of Section III to the uncoordinated baseline of Section VI-A2. Thereby, for each algorithm, the performance is evaluated for different average number of users per sector. More precisely, from left to right, the markers in the figure represent an average number of 5, 8, 10, 12, and 15 users per sector (corresponding to total numbers of 105, 168, 210, 252, and 315 users in the network, respectively). Obviously, increasing values on the ordinate corresponds to increased fairness (with respect to our 5% quantile metric), while higher values on the abscissa correspond to a higher sum utility (with respect to our GAT metric). In both figures, we observe that a higher number of users lead to increased sum utility (an effect called multiuser diversity) while the cell-edge users (represented by the 5% of users with lowest throughput) suffer from a reduced performance (due to an increased competition for resources). In case of user mobility ( Figure 5), we observe that CBA clearly outperforms the other algorithms. In fact, for every simulated user density, either the 5% fairness (low user densities) or both metrics are improved (higher user densities). For example, in the case of 210 users in the network (corresponding to the third marker on the curves in Figure 5), the CBA algorithm improves the performance with respect to GAT by about 10% and in cell-edge user throughput by more than 35%. It can also be observed that the VSA algorithm works best at high user densities, while in case of only a few users in the cell, the OA algorithm (which requires the least overhead with respect to additional feedback and signaling) and even the no-power control baseline show a better performance, at least with respect to the cell-edge performance. The decreased GAT shown by VSA in the fading case is basically the cost of leaving no freedom to the schedulers for opportunistic scheduling, but to strictly specify the powers to be used for all beams and time instances. Figure 6 depicts the simulation results in a setting without user mobility. We can see that OA offers already quite high performance gains, both in GAT and in cell-edge user throughput, although being the least complex algorithm. However, the performance is even further improved by the other two algorithms. VSA can improve especially the gains of cell-edge users, for example, in the case of 210 users in the network to more than 200% compared with no-power control. However, CBA again shows the best performance, significantly improving the global utility compared with VSA (and even more compared with no-power control.) Again, this gain increases with increasing number of users in the network.
Besides the performance, the speed of convergence is of large practical interest. To illustrate this, Figure 7 shows the GAT metric for all algorithms over time for the case of stationary users (with 210 users in total). It can be seen that the fastest convergence is achieved by the OA and VSA algorithms, while the CBA algorithm, which in the end achieves the highest GAT, needs about 3,000 TTIs to converge. This is expectable, since the scheduler has to follow the power values obtained from the virtual layer only on average; thus, it always 'lacks behind' the adaptation of the virtual model. Regardless of which algorithm performs best, it is of general importance that distributed power control algorithms do not only work in a specific environment but can adapt to changing network conditions (which, of course, is also related to the notion of convergence). Figure 8 demonstrates this autonomous adaptation ability by showing the cell-edge user throughput over time (exemplary for CBA in the static case) for multiple consecutive drops, where, in each drop, a new user location and assignment pattern are generated. It can be observed that the the performance is improving very fast compared to the baseline.

B. Performance of centralized solution based on AO
Subsequently, we investigate the performance of the centralized scheme based on alternating optimization (AO), as introduced in Section IV. For this, we compare the performance with that of the distributed solution. More precisely, we use the VSA algorithm of Section III-B. Note that the only difference between the subsequently investigated algorithms is the power adaptation procedure. While in the distributed solution (in subsequent figures denoted as VSA), the power is adapted gradually based on the exchange of sensitivities, in the centralized solution (in subsequent figures denoted as AO), the solution to the optimization problem in the virtual layer is directly used to globally update the power allocation. We are predominantly interested  in the convergence speed of the geometric mean of average user rates in the different approaches and the relative performances in different signal-to-noise ratio (SNR) regimes. Moreover, we investigate both a static setting without mobility and a setting where a user moves at a velocity of 3 km/h.
In Figure 9, we compare the behavior of the AO and the distributed approach for constant channels. In general, it can be observed that the convergence speed of the average rates using AO is slightly faster. Often, both algorithms converge to the same solution; however, this is not necessarily always the case (as shown in Figure 9). Figure 10 compares the performance of AO-based and gradient ascent-based solution in a setting with moderate user mobility. It can be observed that the convergence speed of the centralized scheme is still slightly higher than in the distributed case, and in most cases, the centralized solution outperforms the gradient-based approach (however, at a much higher complexity). An interesting effect that can be observed is that the influence of the initial state on the outcome of the distributed solution is significant. This is also depicted in Figure 10. Note that the only difference between the solid and the dashed red line in Figure 10 lies in the initial power values. In the first case, we start at a minimal power assigned to all sub-carriers (note that for technical reasons, it is not possible to assign exactly zero power to the resources) and in the latter case, we start at an allocation where power is distributed evenly among the resources.
In Figure 11, it can be observed that the gains from the centralized solution are higher in high-SNR regimes. While at low SNRs, when the system is essentially noise-limited, both approaches converge to the same solution; in high-SNR regimes, when the system is essentially interference-limited, the direct AO-based optimization outperforms the gradual power adaptation. In summary, the centralized solution has some clear advantages. First, it is invariant to the initialization of the system and is thus able to adapt very fast to changing environmental conditions. Second, especially when the channels are varying fast and the SNR is high, the centralized scheme offers a higher performance in comparison to the distributed scheme. However, one should note that at realistic SNRs, the distributed solution still achieves almost the same performance and at the same time scales smoothly with the network size, while a centralized solution can only be implemented in very restricted scenarios.

C. Global Optimality
In the following, we compare the performance of the distributed algorithm to an (nearly) optimal solution based on BNB. To simplify the analysis in this section and due to the high complexity of the BNB algorithm, we restrict ourselves to the simple case of two sectors with two users (no mobility) and two PRBs and a single antenna.
Given a certain tolerance ε, BNB converges to an optimal solution of (31). However, it has exponential complexity and is therefore only feasible in very small settings. We compare our distributed coordination algorithm with an equal power non-cooperative scheduler and the BNB algorithm described in Section V.
In Figures 12 and 13, the blue line represents the GAT obtained with the coordination algorithm, while the red line represents the outcome of a non-cooperative equal-power scheduler. The solid black line indicates the outcome of BNB. Finally, the dotted black line depicts the configured ε tolerance. Thus, the true optimum is guaranteed to lie in between the solid and dotted black lines. To get an overview of the performance in different scenarios, we investigate three different settings, namely, a setting with weak interference, a setting with approximately equal interference, and a setting with high interference. In Figure 12, the left marker corresponds to the weak interference case, the middle marker to the case of equal interference, while the right marker corresponds to the case of strong interference. The weak interference is characterized by G m ij > G m ′ ij (m = m ′ ) for all i, j, m, m ′ ; thus, for all users, the link gains to the serving base station are higher than the gains to the interfering base station. Note that this is, from a practical point of view, the more interesting case since due to handover algorithms, most of the time, the gains to the own base station are stronger than to the interferers. It can be observed that the distributed power control performs quite well since we see a significant improvement over equal power scheduling. Moreover, the distributed power control appears to realize already a large portion of the available gains.  In the 'equal' interference case, the gains to the interfering base stations are roughly the same as the gains to the own base stations; thus, G m ij ≈ G m ′ ij (m = m ′ ) for all i, j, m, m ′ . It can be observed that the power control algorithm leaves a larger portion of the available performance gains unused, compared to the weak interference case, although it shows significant improvements over non-cooperative scheduling.
The strong interference case is characterized by G m ij < G m ′ ij (m = m ′ ), for all i, j, m, m ′ ; thus, for all users, the link gains to the interfering base stations are higher than the gains to the 'own' base stations. Here, the gap in the non-cooperative algorithm's performance, compared to the BNB result, is significantly worse than in the other cases. By contrast, the power control algorithm shows roughly the same performance gap than in the case of equal gains.
Again, to illustrate the convergence, Figure 13 gives an example of the performance over time (here, for the weak interference case). Obviously, the power control algorithm converges to a local maximum soon.

VII. CONCLUSIONS
We proposed and compared three distributed algorithms for autonomous interference coordination in cellular SDMA networks. The algorithms are based on a virtual layer that models the interference interdependencies in the network and gradually adapts power control levels. The proposed algorithms differ in granularity of power control, required feedback, signaling overhead, and the virtual model itself. System-level simulations indicate high gains both in overall utility and in cell-edge user throughput for all three algorithms in static environments without user mobility. While VSA offers a very fine granularity of power control on a per-beam level, it suffers from the lack of freedom to instantaneously perform power allocation in an opportunistic manner. In fact, in an environment with significant user mobility, only CBA achieves significant gains in both metrics. This demonstrates the superiority of CBA's approach to enforce average power constraints but instantaneously allowing opportunistic scheduling. Comparisons to a centralized benchmark scheme reveal that although the convergence of the centralized scheme is much faster than in the distributed case, the performance in overall utility is comparable. APPENDICES APPENDIX 1: PROOF OF THEOREM 1 First, we show that f ε (R) is everywhere differentiable. For this purpose, we use the following Lemma 4 and set H(x, y) := U (R, Φ).
Lemma 4: Consider a function H (x, y), x ∈ [a 1 , a 2 ], y ∈ [a 3 , a 4 ], with some finite a 1 < a 2 and a 3 < a 4 . Assume that H (x, y) and its partial derivative on x are continuous, and H (x, y) is concave in x (for each y) and strictly concave in y (for each x). Then, the function is continuously differentiable in [a 1 , a 2 ]. (This can be generalized to the multi-dimensional case. Namely, the domains [a 1 , a 2 ] and [a 3 , a 4 ] can be replaced by arbitrary convex compact sets in finite-dimensional vector spaces, and derivatives replaced by gradients.) Note, that Lemma 4 cannot be applied directly to (13) since strict concavity in y in this case is not given. Clearly, R and Φ are compact since φ ijb ∈ [0, 1] and R ijb ∈ [c, B], B < ∞, c > 0. Moreover, the concavity of U follows by definition. Now, we have to show that f ε (R) converges for any zero sequence ε n → 0 to f . First, fix a zero sequence ε * n . Moreover, we define a sequence of functions , with (n ∈ N). We can use Dini's Theorem to show that {f n (R)} n∈N converges uniformly to f . This theorem states that if {f n } n∈N (f n : K → R, n ∈ N, being a sequence of continuous functions and K being a compact metric space) converges pointwise to f (f : K → R being a continuous function) and if f n (x) ≥ f n+1 (x) (∀x ∈ K and ∀n ∈ N), then {f n } n∈N converges uniformly to f .

APPENDIX 2: PROOF OF LEMMA 3
Following the approach in [19], the convexity of optimization problem (28 to 29) can be easily shown by noting thatR m ijb in (28) can be written as where log(F m ijb ) can be further decomposed into Due to the convexity of log-sum-exp [32], the term is convex, and thus,R m ijb is concave. Noting that non-negative weighted addition and scalar composition preserve concavity concludes the proof.