Traffic demand-aware topology control for enhanced energy-efficiency of cellular networks

The service provided by current mobile networks is not adapted to spatio-temporal fluctuations in traffic demand, but such fluctuations offer opportunities for energy savings. In particular, significant gains in energy efficiency are realizable by disengaging temporarily redundant hardware components of base stations. We therefore propose a novel optimization framework that considers both the load-dependent energy radiated by the antennas and the remaining forms of energy needed for operating the base stations. The objective is to reduce the energy consumption of mobile networks, while ensuring that the data rate requirements of the users are met throughout the coverage area. Building upon sparse optimization techniques, we develop a majorization-minimization algorithm with the ability to identify energy-efficient network configurations. The iterative algorithm is load-aware, has low computational complexity, and can be implemented in an online fashion to exploit load fluctuations on a short time scale. Simulations show that the algorithm can find network configurations with the energy consumption similar to that obtained with global optimization tools, which cannot be applied to real large networks. Although we consider only one currently deployed cellular technology, the optimization framework is general, potentially applicable to a large class of access technologies.


I. INTRODUCTION
Mobile networks are constantly growing, with the number of mobile subscribers worldwide exceeding 7.1 billion in 2014 and the current forecast of 9.5 billion subscribers in 2020 [2].
During this time, an even more drastic increase is predicted for mobile data traffic, which is expected to grow by a factor of 8. To keep pace with this trend, mobile network operators attempt to boost the network capacity by deploying additional antennas and base stations. This will lead to a dramatic increase in energy consumption, thereby causing additional operational expenses (OPEX) and higher carbon dioxide (CO$_2$) emissions. Therefore, enhancing the energy efficiency of mobile communications systems has a twofold positive effect: lower greenhouse gas emissions and reduced OPEX.
Current mobile networks have been designed to provide the best possible service to the users at all times. Global network parameters and the network topology are largely static, although, as pointed out in many studies (see for instance [3], [4], [5], [6]), the traffic load fluctuates significantly over time and space. Such spatio-temporal fluctuations create large capacity surpluses at times of low traffic demand, which in turn offer opportunities for energy savings through adaptation of the service supply to the actual demand. However, in order to utilize the capacity surpluses for significant energy savings, it is essential to reduce the energy consumed by hardware and auxiliary equipment (e.g., coolers), which is a dominant form of energy consumption in current mobile networks. In fact, for a typical network with today's technology, base stations consume over 50% of the total network energy budget [7]. Most of the energy consumed by base stations is spent on powering hardware and auxiliary equipment. From this, we conclude that significant energy savings can be achieved only by temporarily disengaging redundant hardware components of a base station. Indeed, as pointed out by [5], reducing the number of active base stations in periods of low traffic load offers a huge potential for energy savings. Switching off base stations and introducing cooperation between different operators in an urban scenario is expected to lead to a reduction in energy consumption of up to 29% [5]. This effect will become even more pronounced in the future since, as mentioned above, the number of base stations is expected to increase over the next years to meet the growing demand for wireless access [8].

A. Our contribution
This paper deals with the problem of minimizing the overall energy consumption in the downlink channel of mobile (cellular) networks. By taking into account the energy consumed by hardware and auxiliary equipment, we address key shortcomings of most existing approaches to the challenge of boosting energy efficiency of cellular networks. The underlying problem is of combinatorial nature because it essentially amounts to selecting a subset of network elements corresponding to the most energy-efficient network configuration, while providing the desired network coverage. More precisely, motivated by [9], we formulate a combinatorial optimization problem to find a network configuration that consumes the least amount of energy, while satisfying traffic demands expressed in terms of minimum data rate requirements. In doing so, we balance different forms of energy consumption in an optimal manner by taking into account both the load-dependent energy used for transmission and the static energy consumed by hardware regardless of the actual load. Similar to [9], the technology-specific constraints are defined to capture the QoS requirements of the users. Although our optimization framework is generic in the sense that it can be applied to multi radio access technology (multi-RAT) systems by incorporating different RAT-specific constraints, owing to the lack of space, our focus is on a single RAT according to the long term evolution (LTE) standard.
The underlying combinatorial problem is in general hard to solve, even for networks of moderate size, so we follow a widely known relaxation approach (see for instance [10]) and, as an intermediate step, replace the problem by a related problem that can be solved efficiently. More precisely, we apply convex relaxation techniques to the constraints and approximate the objective function by a concave function. To minimize the concave function over the convex set, we make use of a majorization-minimization (MM) technique [11]. Our algorithm exhibits major advantages over other existing schemes. In particular, the proposed algorithm is able to find good solutions in a relatively short time so that it can even be used to utilize fluctuations in traffic demand on a relatively short time scale. It can further cope with a variety of network elements such as cells, sectors or even antennas. As a result, we can take into account different sources of energy consumption to arrive at an energy-efficient network configuration. In addition, the algorithm is able to balance the minimization of the static energy consumption, which requires the deactivation of as many network elements as possible, against the minimization of the load-dependent energy consumption, which calls for an appropriate load balancing to avoid highly loaded base stations.
In order to ensure a broad range of applicability, the load-dependent energy consumption can be modeled as any concave or convex function of the load.
We also discuss how our algorithm can be extended along two directions. First, we show how to incorporate techniques motivated by coordinated multi-point (CoMP) transmission [12], which actually makes some relaxations superfluous and leads to further energy savings. Second, we use the load-based model of [13] for interference-coupled cellular networks together with the framework of standard interference functions [14] to show that larger energy savings are possible at the cost of slightly increased complexity and overhead.

B. Related Work
Over recent years, some research effort has been devoted to exploiting temporal and spatial redundancies in wireless systems for energy savings. For instance, References [15], [16], [17] address the problem of finding an optimal number of base stations and cell site placements so as to minimize the overall energy consumption subject to some quality of service (QoS) requirements of the users. Assuming a wireless network based on time division multiple access (TDMA), the objective of the study in [15] is to minimize the overall expected energy consumption by optimizing the number of base stations and their locations. The authors formulate the problem as a mixed integer programming problem and suggest using a simplex method together with the branch and bound algorithm. The drawback of this approach is that, due to the TDMA assumption, the analysis does not carry over to systems with inter-cell interference, which is one of the major challenges faced by designers of modern wireless communication systems [18], [19]. Furthermore, branch and bound methods may be slow [20], which excludes an application of these methods to real-time scenarios, even if the underlying problem is of moderate size.
References [16], [9] propose centralized and decentralized algorithms for wireless communication networks to address the problem of base station selection in the presence of traffic load fluctuations. Although the proposed approach seems to provide good solutions in reasonable time, it does not allow incorporation of different sources of energy consumption, which is of utmost importance in modern networks consisting of hierarchical structures. In addition, the authors focus on numerical evaluations to justify the approach. No analytical justification for the performance of the proposed algorithms is given.
The authors of [17] argue in favor of sleep mode techniques coupled with various network planning schemes. A genetic algorithm is used to find energy-efficient network deployments, so the authors have developed a purely heuristic approach to put selected base stations into a sleep mode for energy-efficient network operation. In addition to the lack of any mathematical justification, the main shortcoming of this work is that the proposed approach cannot incorporate radio technologies other than the UMTS terrestrial radio access network (UMTS: universal mobile telecommunications system). In contrast, as mentioned before, our optimization framework is general enough to be applied to multi-RAT scenarios, including the second, third and fourth generations of cellular networks [21].

C. Notation and Paper Organization
For a vector $x \in \mathbb{R}^N$, its $i$th component is $x_i \in \mathbb{R}$. Similarly, for a matrix $X \in \mathbb{R}^{M \times N}$, its $(i,j)$-th component is $x_{i,j}$. Inequalities involving vectors, such as $x_1 \ge x_2$, are to be understood as component-wise inequalities. The set $\mathbb{R}_+$ denotes the set of non-negative real numbers, while $\mathbb{R}_{++}$ denotes the set of positive real numbers. Given a matrix $X \in \mathbb{R}^{M \times N}$, we use $x := \mathrm{vec}(X) \in \mathbb{R}^{MN}$ to denote the vector obtained by stacking the columns of $X$. Note that the entries of $x$ may be confined to take values on $[0, 1]$.
Definition 1 ($l_0$-norm): For any vector $x \in \mathbb{R}^N$ and matrix $X \in \mathbb{R}^{M \times N}$, their $l_0$-norms $|x|_0$ and $|X|_0$ are equal to the number of nonzero elements of $x$ and $X$, respectively. For a scalar $x \in \mathbb{R}$, $|x|_0 := 1$ if $x \neq 0$ and $|x|_0 := 0$ otherwise.
The remainder of this paper is organized as follows. Section II introduces the underlying system model, and in Section III we outline the general problem to solve. In Section IV, the proposed algorithm to find solutions for our optimization problem is derived based on a worst-case inter-cell interference assumption. Section V presents how to explicitly take into account a more realistic inter-cell interference model. We present empirical evaluations of the proposed algorithm in Section VI.

II. SYSTEM MODEL
We consider the downlink channel of a multi-cell LTE network with an established network topology and a central network controller. The central network controller is responsible for collecting measurements, executing the proposed algorithm, and propagating updated network configuration parameters throughout the network. We assume that the network consists of $L$ base stations. Each base station has multiple sectors (called cells in the following), and we denote the set of cells belonging to base station $l$ by $\mathcal{S}_l$. The set of all base stations is denoted by $\mathcal{L}$, and we use $\mathcal{M} := \cup_{l \in \mathcal{L}} \mathcal{S}_l$ to denote the set of all $M$ cells in the network. The cell deployment is assumed to be dense enough so that coverage areas of different cells overlap. This implies that users can be served by different neighboring cells.

A. Ensuring coverage via test points
In order to ensure the desired coverage anytime and everywhere in the considered area, we impose coverage constraints by adopting the concept of test points [22].
Definition 2 (Test point): A test point (TP) is a centroid of a pre-defined subarea that represents an aggregated QoS requirement resulting from individual QoS demands of all potential users in this subarea.
Without loss of generality, we assume $N$ TPs with the set of all TPs denoted by $\mathcal{N} := \{1, 2, \dots, N\}$.
A consequence of this definition is that small-scale fluctuations in QoS demand at the user level are averaged out at the TPs.These small-scale fluctuations must be compensated by the lower layers of the protocol stack (e.g. through adaptive modulation or coding).
Assumption 1: The QoS requirement for a TP corresponds to the aggregated expected traffic over the respective area per unit time. This traffic requirement is expressed in terms of the minimum required data rate per TP.
Assumption 2: If the minimum rate requirement of TP $j$ is met, so are the requirements of the users in the associated subarea.
For services with no explicit data rate requirements (e.g., voice calls), we assume that they can be supported if a minimum data rate per service request is ensured. By Assumption 1, each TP $j \in \mathcal{N}$ is assigned a rate requirement $r_j$, and we collect the rate requirements of all TPs in the vector $r = [r_1, r_2, \dots, r_N] \in \mathbb{R}^N_{++}$. In general, a TP can be assigned to any cell, and an assignment should be understood as follows. If TP $j \in \mathcal{N}$ is assigned to cell $i \in \mathcal{M}$, then all users in the subarea associated with TP $j$ are served by cell $i$. The assignment of the TPs to the cells is subject to optimization in this paper. We use $X = [x_{i,j}] \in \{0, 1\}^{M \times N}$ to denote the assignment matrix, where $x_{i,j} = 1$ if TP $j$ is assigned to cell $i$ and $x_{i,j} = 0$ otherwise.
Assumption 3: While each TP is assigned to exactly one cell, each cell can serve multiple TPs, and the set of TPs served by cell $i$ under assignment $X$ is denoted by $\mathcal{N}_i(X) \subset \mathcal{N}$.
We point out that this assumption has been widely used in previous studies [22], [9], [16], and it is valid throughout the paper except for Section IV-C, where it is shown how to include scenarios in which each TP can be served by multiple cells. Note that if $\mathcal{N}_i(X) = \emptyset$ for some $i \in \mathcal{M}$, then cell $i$ can be deactivated for energy savings because no TP is assigned to cell $i$. In contrast, if $\mathcal{N}_i(X) \neq \emptyset$, then cell $i$ is active, and each TP connected to it induces some amount of cell load.
Definition 3 (Cell Load): Given the assignment $x := \mathrm{vec}(X)$, the load of cell $i$, denoted by $\rho_i(x) \in [0, 1]$ or simply $\rho_i$ for notational simplicity, is defined to be the ratio of the number of resource blocks requested by TPs served by cell $i \in \mathcal{M}$ to the total number of resource blocks $B_i$ available at this cell.
We use $\rho := [\rho_1, \dots, \rho_M]^T \in [0, 1]^M$ to denote the vector of all cell loads. From the definition of cell load, we have the following:
Fact 1: The load at cell $i$ satisfies $\rho_i > 0$ if and only if (iff) cell $i$ serves at least one TP.

B. Spectral efficiency and resource usage
The optimal assignment of TPs to cells is strongly influenced by the spectral efficiency of the corresponding links. For the analysis in this paper, we adopt an OFDMA-based model for the spectral efficiency that is widely used in the literature [18], [23], [24]. The spectral efficiency also depends on radio propagation properties. Therefore, we associate with each TP a path-loss vector and write the path-loss vectors of all TPs as columns of the path-loss matrix $G = [g_{i,j}] \in \mathbb{R}^{M \times N}_{++}$, where $g_{i,j}$ captures the long-term path loss and shadowing effects for a radio link from cell $i$ to TP $j$.

Assumption 4 (Reliable path-loss estimates):
A reliable estimate of G is available at the central network controller.

Remark 1:
The problem of reliable estimation and tracking of the path-loss matrix is out of the scope of the paper. However, the matrix captures only long-term fading effects, so reliable estimates of $G$ can be obtained and tracked in practice. Promising algorithmic solutions to this estimation problem are for instance presented in [25]. Moreover, in network planning problems, knowledge of $G$ is a very common assumption in the literature [13], [18], [22].
Now we are in a position to define the signal-to-interference-noise-ratio (SINR) $\gamma_{i,j} : \mathbb{R}^M_+ \to \mathbb{R}_+$ between cell $i \in \mathcal{M}$ and TP $j \in \mathcal{N}$ by [13], [18]:
$$\gamma_{i,j}(\rho) := \frac{P_i\, g_{i,j}}{\sum_{k \in \mathcal{M} \setminus \{i\}} P_k\, g_{k,j}\, \rho_k + \sigma^2}, \tag{1}$$
where $P_i > 0$ is the transmit power per resource block of cell $i$ and $\sigma^2 > 0$ is the noise power per resource block. Accordingly, the link spectral efficiency $\omega_{i,j} : \mathbb{R}^M_+ \to \mathbb{R}_+$ (in bits per resource block) for the link from cell $i$ to TP $j$ is given by [23]
$$\omega_{i,j}(\rho) := \eta^{\mathrm{BW}}_{i,j} \log_2\!\left(1 + \frac{\gamma_{i,j}(\rho)}{\eta^{\mathrm{SINR}}_{i,j}}\right), \tag{2}$$
where $\eta^{\mathrm{BW}}_{i,j} \in \mathbb{R}_{++}$ and $\eta^{\mathrm{SINR}}_{i,j} \in \mathbb{R}_{++}$ are suitably chosen constants, referred to as bandwidth and SINR efficiency, respectively. These constants depend on the overall system design, which includes the choice of scheduling protocols and multi-antenna techniques. The choice of these constants has no impact on our results, so they are assumed to be arbitrary and fixed throughout the paper. For realistic values of these constants, we refer the interested reader to [18], [23].
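The SINR and link spectral efficiency can be evaluated numerically with a few lines of code. The following is a minimal sketch, assuming the load-coupled SINR form of [13], [18] (interference from cell k scaled by its load) and the logarithmic efficiency model of [23]; the function names and the toy values for `P`, `G`, and `sigma2` are hypothetical and chosen only for illustration.

```python
import math

def sinr(i, j, rho, P, G, sigma2):
    """SINR of the link from cell i to TP j for a given load vector rho.

    Assumed form: P_i g_ij / (sum_{k != i} P_k g_kj rho_k + sigma^2).
    """
    interference = sum(P[k] * G[k][j] * rho[k] for k in range(len(P)) if k != i)
    return P[i] * G[i][j] / (interference + sigma2)

def spectral_efficiency(i, j, rho, P, G, sigma2, eta_bw=1.0, eta_sinr=1.0):
    """Link spectral efficiency: eta_bw * log2(1 + SINR / eta_sinr)."""
    return eta_bw * math.log2(1.0 + sinr(i, j, rho, P, G, sigma2) / eta_sinr)

# Toy example (hypothetical numbers): 2 cells, 2 TPs.
P = [1.0, 1.0]                   # transmit power per resource block
G = [[1.0, 0.1], [0.1, 1.0]]     # path-loss matrix, G[i][j] for cell i -> TP j
sigma2 = 0.1                     # noise power per resource block

worst_case = spectral_efficiency(0, 0, [1.0, 1.0], P, G, sigma2)  # all cells fully loaded
light_load = spectral_efficiency(0, 0, [1.0, 0.2], P, G, sigma2)  # interferer lightly loaded
```

In this sketch, lowering the load of the interfering cell raises the efficiency of the link, which is the coupling between loads and efficiencies that the rest of the model has to deal with.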
From (2), we can easily see that the necessary number of resource blocks $b_{i,j}$ at cell $i$ to serve TP $j$ with data rate $r_j$ is equal to $b_{i,j} = r_j / \omega_{i,j}(\rho) > 0$. In addition, following Definition 3, the load at cells can be computed by the following system of non-linear equations:
$$\rho_i = \frac{1}{B_i} \sum_{j \in \mathcal{N}} b_{i,j}\, x_{i,j} = \sum_{j \in \mathcal{N}} \frac{r_j}{B_i\, \omega_{i,j}(\rho)}\, x_{i,j}, \qquad i \in \mathcal{M}. \tag{3}$$
Remark 2: In practice, cells need to reserve some fraction of their resource blocks for signaling. If cell $i$ has $B_i^*$ resource blocks in total, and it needs to reserve $a_i > 0$ of its resource blocks for signaling, then the resource blocks at cell $i$ available for allocation to TPs are $B_i = B_i^* - a_i$.
For a fixed assignment $X$, the cell load $\rho$ in (3) can be efficiently computed by means of fixed-point algorithms (cf. Sec. V). However, the assignment of TPs to cells is the main subject of our optimization problem, and thus we cannot evaluate (3) easily. In order to keep the complexity of the optimization problem tractable, we lower bound the spectral efficiency.
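When the link spectral efficiencies are held fixed (for instance at a lower bound), the load of each cell becomes a linear function of the assignment. The sketch below illustrates this special case; the helper names `resource_blocks` and `cell_loads` and all numbers are hypothetical.

```python
def resource_blocks(r_j, omega_ij):
    """Resource blocks needed to deliver rate r_j over a link with efficiency omega_ij."""
    return r_j / omega_ij

def cell_loads(X, r, omega, B):
    """Load of every cell for assignment X and FIXED link efficiencies omega.

    X[i][j] is 1 if TP j is assigned to cell i, r[j] is the rate requirement of TP j,
    omega[i][j] is the (fixed) efficiency of link i -> j, B[i] is the number of
    resource blocks available at cell i.
    """
    M, N = len(X), len(r)
    return [
        sum(resource_blocks(r[j], omega[i][j]) * X[i][j] for j in range(N)) / B[i]
        for i in range(M)
    ]

# Toy example (hypothetical numbers): 2 cells, 3 TPs.
r = [10.0, 20.0, 30.0]
omega = [[2.0, 1.0, 1.0], [1.0, 2.0, 2.0]]
B = [50.0, 50.0]
X = [[1, 0, 0], [0, 1, 1]]
loads = cell_loads(X, r, omega, B)

# An assignment is admissible only if no cell exceeds full load.
feasible = all(load <= 1.0 for load in loads)
```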

Assumption 5 (Worst-Case Interference):
We have the worst-case interference scenario if all cells are fully loaded, i.e., $\rho = \mathbf{1}$.
Unless otherwise stated, we assume the worst-case interference, which results in a lower bound on the link spectral efficiency $\omega_{i,j}(\rho) \ge \bar{\omega}_{i,j} := \omega_{i,j}(\mathbf{1})$ for every $\rho \in [0, 1]^M$. In general, this bound diminishes gains in energy savings when taking into account the energy consumption of hardware, and we show in Section V how to incorporate the actual link spectral efficiency to improve the energy savings. Nevertheless, having fully loaded cells as in Assumption 5 is desirable because it has been proven in [26] that full load (i.e., $\rho = \mathbf{1}$) is optimal with respect to the transmit energy consumption (see also [27]).
Remark 3: The worst-case interference assumption cannot exploit the full potential for energy savings, but the assumption is of high practical relevance because it is an effective way to avoid coverage holes as a result of deactivating cells based, for instance, on imperfect information.

C. Energy consumption model
In contrast to most work in the literature, we consider a model for the energy consumption of a base station and its cells that takes into account not only the cell load-dependent transmit energy radiated by antennas, but also the remaining sources of energy consumption that are independent of the cell load as long as the cell/base station is active.
Definition 4 (Active base station/cell): Consider a particular base station $l \in \mathcal{L}$ and its cells $i \in \mathcal{S}_l$. Let $\rho_i \in [0, 1]$ be the load of cell $i$. We say that a cell $i$ is active iff $\rho_i > 0$ and that base station $l$ is active iff at least one of its cells is active, i.e., $\sum_{i \in \mathcal{S}_l} \rho_i > 0$. If a cell or base station is not active, it is said to be inactive.
With Definition 4 we are in the position to define the energy consumption of a base station.

Definition 5 (Energy consumption):
Given a TP assignment $X$ inducing a cell load $\rho$, the energy consumption $E_l(\rho) \ge 0$ of base station $l$ is defined to be the energy that the base station consumes per unit of time, where $E_l(\rho) = 0$ iff base station $l$ is inactive.
The function $E_l(\rho)$ depends on the hardware setup of the base station, but in general it can be split into three parts: (i) the static energy consumption $c_l > 0$ of the base station (due to hardware shared between sectors, e.g., cooling, power supply, etc.), (ii) the static energy consumption $e_i > 0$ ($i \in \mathcal{S}_l$) of its active cells (e.g., due to power amplifiers, signal processing units, etc.), and (iii) the load-dependent dynamic energy consumption $f_i(\rho_i)$ ($i \in \mathcal{S}_l$) of its active cells. By these definitions and Fact 1, $E_l(\rho)$ is a discontinuous function of the cell load, and we have
$$E_l(\rho) = \begin{cases} c_l + \sum_{i \in \mathcal{S}_{l,\mathrm{active}}} \big( e_i + f_i(\rho_i) \big), & \text{if base station } l \text{ is active}, \\ 0, & \text{otherwise}, \end{cases}$$
where $\mathcal{S}_{l,\mathrm{active}} \subset \mathcal{S}_l$ is the set of active cells of base station $l$. Therefore, the total energy consumption in a network, which is the accumulated energy consumption of all active base stations, yields
$$\sum_{l \in \mathcal{L}} E_l(\rho). \tag{4}$$
For concreteness, we make the following assumption throughout the paper (see also Remark 4).
Assumption 6 (Concave dynamic energy consumption): For every $i \in \mathcal{M}$, the function $f_i$ is concave and continuously differentiable.
In particular, this assumption is satisfied by a linear dependency of the base station energy consumption and the cell load reported in current studies such as [28].
Remark 4: In fact, the load-dependent dynamic energy consumption can also be assumed to be a convex function of the load.Moreover, we could even assume that it is a sum of convex and concave functions.The optimization framework presented in this paper can be straightforwardly extended to cover these cases.
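The three-part energy model can be made concrete with a short sketch. It assumes a linear dynamic term (which is both concave and convex, in line with the linear dependency mentioned above); the function names and all numeric values are hypothetical.

```python
def base_station_energy(cells, rho, c_l, e, f):
    """Energy per unit time of one base station.

    cells: indices of the cells of this base station
    rho:   load vector of all cells
    c_l:   static consumption of the base station (shared hardware)
    e[i]:  static consumption of cell i when active
    f[i]:  callable giving the load-dependent consumption of cell i
    """
    active = [i for i in cells if rho[i] > 0]
    if not active:
        return 0.0  # inactive base station consumes nothing in this model
    return c_l + sum(e[i] + f[i](rho[i]) for i in active)

def network_energy(S, rho, c, e, f):
    """Total consumption: sum over all base stations (inactive ones contribute 0)."""
    return sum(base_station_energy(S[l], rho, c[l], e, f) for l in range(len(S)))

# Toy example (hypothetical numbers): base station 0 has cells 0 and 1, base station 1 has cell 2.
S = [[0, 1], [2]]
c = [100.0, 100.0]
e = [50.0, 50.0, 50.0]
f = [lambda load: 20.0 * load] * 3   # linear dynamic consumption, an assumption of this sketch

only_one_cell = network_energy(S, [0.5, 0.0, 0.0], c, e, f)    # 100 + 50 + 10
two_stations = network_energy(S, [0.5, 0.0, 0.25], c, e, f)    # adds 100 + 50 + 5
```

The jump between the two values shows why the static terms dominate: activating a lightly loaded cell on a second base station costs far more than the extra dynamic energy of serving the same traffic from an already active cell.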

III. PROBLEM STATEMENT
Spatio-temporal redundancies in coverage and capacity resulting from day-time fluctuations in traffic demand present great opportunities for energy savings by deactivating redundant cells at times of relatively low traffic demand. Indeed, if the traffic demand decreases, some or all entries of the rate requirement vector $r \in \mathbb{R}^N_{++}$ become relatively small, which can be utilized to reduce the total energy consumption by minimizing the cost function in (4) subject to different constraints that follow from the system model and (3). Formally, the problem under consideration is stated as problem (5) (the complete set of equations is referred to as (5)), where the optimization variables are $x_{i,j}$ and $\rho_i$ ($i \in \mathcal{M}$, $j \in \mathcal{N}$). In particular, Assumption 3 is captured by (5c) together with (5e). Constraints (5b) and (5d), in contrast, ensure that the cell load is in accordance with Definition 3.
To ensure feasibility of the above problem and to show the effectiveness of our approach, we consider scenarios where the rate requirements of TPs are sufficiently low for a reasonable amount of redundancies that allow for deactivation of cells. Moreover, if the traffic requirements in the system are sufficiently low or the number of cells is sufficiently large, the optimal load vector $\rho^\star$ is expected to be sparse, with zero entries specifying cells that can be deactivated.
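For reference, one way to write problem (5) that is consistent with the description above is sketched below. It is a reconstruction under assumptions, not a quotation: it assumes $f_i(0) = 0$ (so that inactive cells contribute no dynamic energy), it uses the worst-case efficiencies $\bar{\omega}_{i,j}$ of Assumption 5 in the load constraint, and the labels (5a)-(5e) are matched to the roles that the text assigns to them.

```latex
\begin{align}
\min_{X,\,\rho}\quad
  & \sum_{l\in\mathcal{L}} c_l \,\Big|\sum_{i\in\mathcal{S}_l}\rho_i\Big|_0
    + \sum_{i\in\mathcal{M}} \big( e_i\,|\rho_i|_0 + f_i(\rho_i) \big) \tag{5a}\\
\text{s.t.}\quad
  & \rho_i = \sum_{j\in\mathcal{N}} \frac{r_j}{B_i\,\bar{\omega}_{i,j}}\, x_{i,j},
    && \forall i\in\mathcal{M}, \tag{5b}\\
  & \sum_{i\in\mathcal{M}} x_{i,j} = 1,
    && \forall j\in\mathcal{N}, \tag{5c}\\
  & \rho \in [0,1]^M, \tag{5d}\\
  & X \in \{0,1\}^{M\times N}. \tag{5e}
\end{align}
```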

IV. ENERGY-EFFICIENCY OPTIMIZATION
The difficulty of problem (5) lies in its combinatorial nature. In fact, it can be shown that the problem is related to the classical bin-packing problem, which is known to be NP-hard [29].
Consequently, the complexity is expected to grow exponentially with the number of cells. On the positive side, problem (5) has a special structure that can be exploited by majorization-minimization techniques [11], which have been widely used in recent years to tackle various problems in compressed sensing [30] and machine learning [31].
Instead of finding a global solution to (5), we will pursue a less ambitious goal. We apply the majorization-minimization techniques mentioned above to develop a low-complexity anytime algorithm that has a strong analytical justification. This algorithm is expected to provide good results (in terms of low energy consumption) with low complexity. To this end, we reformulate problem (5) to pose it in a more tractable form. First, we observe that each load $\rho_i$ is, in fact, a function of $X$ (cf. Definition 3 and (5b)). We can therefore modify the problem to have only $X$ as an optimization variable. Recall that, if at least one TP is served by cell $i$ (i.e., $\sum_{j \in \mathcal{N}} x_{i,j} \ge 1$), then it follows from Fact 1 that the cell load at cell $i$ is non-zero and $|\rho_i|_0 = 1$. Hence, the objective function in (5a) can be equivalently written in the form (6), where $s_i := \mathrm{vec}(S_i)$ with $S_i \in \{0, 1\}^{M \times N}$ being a matrix of zeros, except for its $i$th row, which is a row of ones. Similarly, $t_l := \mathrm{vec}(T_l)$, where $T_l \in \{0, 1\}^{M \times N}$ is a matrix of zeros, except for its rows $i \in \mathcal{S}_l$, which are rows of ones.
Definition 6: Given the assignment $x$ and the load-dependent energy consumption $f_i(\rho_i(x))$ with $\rho_i(x) = \sum_{j \in \mathcal{N}} \frac{r_j}{B_i \bar{\omega}_{i,j}} x_{i,j}$ (cf. (5b)), we define the function $\tilde{f}_i(x) := f_i(\rho_i(x))$.
Considering Definition 6 and using $\rho_i \le 1$ (see Definition 3) in (5b), we arrive at an equivalent problem, referred to as (7), where the assignment variables $x_{i,j}$ ($i \in \mathcal{M}$, $j \in \mathcal{N}$) are the only optimization variables.
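A possible explicit form of the rewritten objective (6) and of problem (7) is sketched below. It is reconstructed from the definitions of $s_i$, $t_l$, and $\tilde{f}_i$ given above and should be read as an assumption rather than as the original display; in particular, it relies on $s_i^T x = \sum_{j} x_{i,j}$ and $t_l^T x = \sum_{i \in \mathcal{S}_l} \sum_{j} x_{i,j}$, and the constraint labels follow the way (7a) and (7d) are referenced in the text.

```latex
% Objective (5a) expressed in the assignment vector x only:
\begin{equation}
\sum_{l\in\mathcal{L}} c_l\,\big|t_l^T x\big|_0
  + \sum_{i\in\mathcal{M}} \Big( e_i\,\big|s_i^T x\big|_0 + f_i(\rho_i(x)) \Big) \tag{6}
\end{equation}

% Equivalent problem with x as the only optimization variable:
\begin{align}
\min_{x}\quad
  & \sum_{l\in\mathcal{L}} c_l\,\big|t_l^T x\big|_0
    + \sum_{i\in\mathcal{M}} \Big( e_i\,\big|s_i^T x\big|_0 + \tilde{f}_i(x) \Big) \tag{7a}\\
\text{s.t.}\quad
  & \sum_{j\in\mathcal{N}} \frac{r_j}{B_i\,\bar{\omega}_{i,j}}\, x_{i,j} \le 1,
    && \forall i\in\mathcal{M}, \tag{7b}\\
  & \sum_{i\in\mathcal{M}} x_{i,j} = 1,
    && \forall j\in\mathcal{N}, \tag{7c}\\
  & x_{i,j} \in \{0,1\},
    && \forall i\in\mathcal{M},\ \forall j\in\mathcal{N}. \tag{7d}
\end{align}
```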

A. Problem relaxation
To obtain an optimization problem that is computationally tractable, we first relax the binary constraint (7d) to
$$x_{i,j} \in [0, 1], \quad \forall i \in \mathcal{M},\ \forall j \in \mathcal{N}. \tag{8}$$
The above makes all constraints convex, so now the only problem is the objective function, which is not continuous due to the $l_0$-norm. We also note that by Assumption 6 and Definition 6, the load-dependent term $\tilde{f}_i(x)$ in the objective function (7a) is concave and continuously differentiable for $x \in [0, 1]^{NM}$ since these properties are preserved under a composition with a linear function [32], [33]. To address the non-continuity of the $l_0$-norm, we consider the following relation [30]:
$$|x|_0 = \lim_{\epsilon \to 0} \sum_{i=1}^{N} \frac{\log(1 + |x_i|\,\epsilon^{-1})}{\log(1 + \epsilon^{-1})}, \qquad x \in \mathbb{R}^N. \tag{9}$$
By using (9) and the non-negativity of $s_i$, $t_l$, and $x$, the cost function in (7a) can be equivalently written in the form (10). We can therefore obtain an approximation to problem (5) by replacing the objective function by the right-hand side of (10) for a sufficiently small but fixed $\epsilon > 0$. More precisely, for some $\epsilon > 0$, the objective is to find a matrix $X \in [0, 1]^{M \times N}$ or, equivalently, a vector $x = \mathrm{vec}(X)$ that solves problem (11).
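The quality of the logarithmic surrogate for the $l_0$-norm is easy to inspect numerically. The sketch below evaluates the ratio $\log(1 + z/\epsilon) / \log(1 + 1/\epsilon)$ for a non-negative scalar $z$; the function name `l0_surrogate` is hypothetical.

```python
import math

def l0_surrogate(z, eps):
    """Smooth, concave surrogate of |z|_0 for a scalar z >= 0 and a fixed eps > 0."""
    return math.log(1.0 + z / eps) / math.log(1.0 + 1.0 / eps)

# The surrogate is exact at z = 0 and z = 1 for every eps, and for 0 < z < 1
# it approaches 1 as eps shrinks.
for eps in (1e-1, 1e-3, 1e-6):
    values = [l0_surrogate(z, eps) for z in (0.0, 0.05, 0.5, 1.0)]
    print(f"eps={eps:g}: " + ", ".join(f"{v:.3f}" for v in values))
```

A smaller $\epsilon$ gives a tighter approximation of the indicator, but it also makes the surrogate steeper near zero, which is one reason to keep $\epsilon$ small but fixed rather than letting it vanish.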
Solving problem (11) is not straightforward because we need to minimize a non-convex function over a convex set. Fortunately, Reference [30] presents an optimization framework based on the majorization-minimization (MM) algorithm [11] to handle problems of this type. The framework can be used to decrease the value of the objective function in a computationally efficient way.
For completeness, the reader can find some details of the MM algorithm in the appendix.

B. Majorization-minimization (MM) algorithm
For notational convenience, we define $\hat{c}_l := c_l / \log(1 + \epsilon^{-1})$ and $\hat{e}_i := e_i / \log(1 + \epsilon^{-1})$, and we use these definitions in (11a) to simplify the objective function (ignoring unnecessary constants) to the function $h$ in (12), where $\mathcal{X} \subset \mathbb{R}^{MN}$ is the closed convex set of points satisfying the constraints (11b)-(11d) and we have used the fact that $\mathcal{M} = \cup_{l \in \mathcal{L}} \mathcal{S}_l$. Since $\tilde{f}_i$ is concave and continuously differentiable by Assumption 6, so is the function in (12) for any $\epsilon > 0$. Therefore, according to the explanations in the appendix, we can use the function in (13) as a majorizing function of (12), where the gradient can be easily calculated. Thus, updates of the MM algorithm take the form (14) (see the appendix) for some feasible starting point⁷ $x^{(0)} \in \mathcal{X}$. In words, the MM algorithm iteratively solves a sequence of convex optimization problems. For the chosen majorizing function, the problem to be solved in every iteration is a linear programming problem (LP), which can typically be solved efficiently with standard optimization tools.
As discussed in the appendix, the sequence $\{x^{(n)}\}_{n \in \mathbb{N}} \subset \mathcal{X}$ generated by (14) for some $x^{(0)} \in \mathcal{X}$ produces a non-increasing sequence $\{h(x^{(n)})\}_{n \in \mathbb{N}}$ of objective values. Therefore, as $n \to \infty$, we expect the corresponding sequence of assignment matrices $\{X^{(n)}\}_{n \in \mathbb{N}}$ (note that $x^{(n)} =: \mathrm{vec}(X^{(n)})$) to evolve towards network configurations with low energy consumption.
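The structure of the MM iteration can be illustrated with a deliberately simplified sketch. It assumes a linear dynamic term $f_i(\rho_i) = \text{slope}_i \cdot \rho_i$, a log-surrogate objective of the form $h(x) = \sum_l \hat{c}_l \log(\epsilon + t_l^T x) + \sum_i \hat{e}_i \log(\epsilon + s_i^T x) + \sum_i \tilde{f}_i(x)$, and, most importantly, that the capacity constraints are inactive, so that each LP decouples into one independent choice per TP and can be solved in closed form. With active capacity constraints, a general LP solver would be needed instead. All names (`mm_assignment`, `a`, `slope`) and numbers are hypothetical.

```python
import math

def mm_assignment(a, S, c, e, slope, eps=1e-3, tol=1e-9, max_iter=50):
    """MM iterations on the log-surrogate objective (capacity constraints assumed inactive).

    a[i][j] = r_j / (B_i * omega_bar_ij): load that TP j induces on cell i.
    S[l]: list of cells of base station l; c, e: static costs; slope[i]: f_i(rho) = slope[i] * rho.
    """
    M, N = len(a), len(a[0])
    bs_of = {i: l for l, cells in enumerate(S) for i in cells}
    scale = math.log(1.0 + 1.0 / eps)
    c_hat = [cl / scale for cl in c]
    e_hat = [ei / scale for ei in e]

    def h(x):
        row = [sum(x[i]) for i in range(M)]  # s_i^T x
        val = sum(c_hat[l] * math.log(eps + sum(row[i] for i in S[l])) for l in range(len(S)))
        val += sum(e_hat[i] * math.log(eps + row[i]) for i in range(M))
        val += sum(slope[i] * a[i][j] * x[i][j] for i in range(M) for j in range(N))
        return val

    # Starting point: each TP goes to the cell on which it induces the least load
    # (a stand-in for "strongest received signal").
    x = [[0.0] * N for _ in range(M)]
    for j in range(N):
        x[min(range(M), key=lambda i: a[i][j])][j] = 1.0

    history = [h(x)]
    for _ in range(max_iter):
        row = [sum(x[i]) for i in range(M)]
        bs = [sum(row[i] for i in S[l]) for l in range(len(S))]
        # Gradient of h at the current iterate = weights of the linearized (majorizing) objective.
        w = [[c_hat[bs_of[i]] / (eps + bs[bs_of[i]]) + e_hat[i] / (eps + row[i]) + slope[i] * a[i][j]
              for j in range(N)] for i in range(M)]
        # Minimizing a linear function over "one unit per TP" picks the cheapest cell per TP.
        x = [[0.0] * N for _ in range(M)]
        for j in range(N):
            x[min(range(M), key=lambda i: w[i][j])][j] = 1.0
        history.append(h(x))
        if history[-2] - history[-1] < tol:
            break
    return x, history

# Toy example (hypothetical numbers): cells 0 and 1 belong to base station 0, cell 2 to base station 1.
a = [[0.05, 0.05, 0.10, 0.10],
     [0.10, 0.10, 0.05, 0.06],
     [0.12, 0.12, 0.06, 0.05]]
S = [[0, 1], [2]]
x_mm, history = mm_assignment(a, S, c=[100.0, 100.0], e=[50.0, 50.0, 50.0], slope=[20.0, 20.0, 20.0])
final_loads = [sum(a[i][j] * x_mm[i][j] for j in range(4)) for i in range(3)]
```

In this toy instance the reweighting makes already populated cells progressively cheaper, so the iterates concentrate the TPs on few cells. The final loads should still be checked against the capacity limit, since the sketch does not enforce it.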
We stop the algorithm if the improvements in the objective value are small enough, i.e., if they fall below some sufficiently small $\epsilon^\star > 0$. Upon termination, the resulting assignment matrix $X^{(n)} \in [0, 1]^{M \times N}$ needs to be mapped to a matrix $X^\star \in \{0, 1\}^{M \times N}$ in order to obtain a feasible point to the problem in (5). For this purpose, we use the heuristic described in Alg. 1. The main idea is as follows. We start by rounding the entries $x^{(n)}_{i,j}$ to the closest integer, and then we check whether the obtained assignment matrix is part of the set $\mathcal{X}$. If it is not, we activate additional cells and connect TPs to them. In our simulations, which use the standard LP solver of CPLEX, most entries of the matrix $X^{(n)} \in [0, 1]^{M \times N}$ are typically either zero or one, so the rounding operation rarely results in a violation of a constraint (but we emphasize that this is not guaranteed to be true in general).
For convenience, we summarize the complete approach in Alg. 2.
⁷ In our experience, a good starting point is derived from a feasible assignment matrix obtained by connecting each TP to the cell providing the strongest received signal strength.
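The rounding step can be sketched as follows. Alg. 1 itself is not reproduced here, so the repair rule below is a hypothetical stand-in: each TP is first assigned to the cell with the largest fractional value (which coincides with rounding when the entries are close to 0 or 1 and always yields exactly one cell per TP), and TPs are then moved away from overloaded cells to cells with spare capacity.

```python
def round_and_repair(x_frac, a):
    """Map a fractional assignment to a binary one that respects the capacity limit.

    x_frac[i][j] in [0, 1]: relaxed assignment of TP j to cell i.
    a[i][j]: load that TP j induces on cell i (hypothetical input of this sketch).
    """
    M, N = len(x_frac), len(x_frac[0])
    owner = [max(range(M), key=lambda i: x_frac[i][j]) for j in range(N)]
    load = [0.0] * M
    for j, i in enumerate(owner):
        load[i] += a[i][j]

    for i in range(M):
        while load[i] > 1.0:
            # Move the TP that is most expensive for the overloaded cell ...
            j = max((t for t in range(N) if owner[t] == i), key=lambda t: a[i][t])
            # ... to the cheapest other cell that can still accommodate it.
            candidates = [k for k in range(M) if k != i and load[k] + a[k][j] <= 1.0]
            if not candidates:
                raise ValueError("no feasible repair found")
            k = min(candidates, key=lambda k: a[k][j])
            owner[j] = k
            load[i] -= a[i][j]
            load[k] += a[k][j]

    return [[1 if owner[j] == i else 0 for j in range(N)] for i in range(M)]

# Toy example (hypothetical numbers): rounding alone would overload cell 0.
x_frac = [[0.9, 0.8, 0.6], [0.1, 0.2, 0.4]]
a = [[0.5, 0.4, 0.3], [0.6, 0.5, 0.3]]
X_star = round_and_repair(x_frac, a)
```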

C. Serving a test point with multiple cells
By Assumption 3, each TP is restricted to be served by exactly one cell. This strict limitation introduces the non-convex constraint (5e) to the optimization problem in (5), which motivates the relaxation (8) and the heuristic mapping introduced in Alg. 1. To avoid these heuristic approaches, for which we are not guaranteed to find solutions, we assume in this section that each TP can be served by multiple cells. This assumption is implemented by using (8) directly instead of (5e). As a result, there is no need for any relaxations of the constraints or the use of heuristic mappings such as that in Alg. 1. We only need to approximate the cost function as done in (11a) and apply the MM algorithm to the resulting optimization problem, and we note that these operations have a strong analytical justification.
The assumption of multiple cells serving one TP has a practical interpretation when considering Definition 2. It means that cells can serve only a fraction of the traffic generated in the area corresponding to some TP. In other words, we do not use an all-or-nothing approach, where cells should serve either all users or no users in the area corresponding to a TP.

Algorithm 2 Network reconfiguration for improved energy-efficient operation
Input: set of TPs, set of cells, constraints
Output: optimized network configuration according to $X^\star$.
1: initialize $X^{(0)}$ with a feasible point.
7: connect the TPs to cells according to $X^\star$.
8: deactivate all cells to which no TP is connected.

V. LOAD-AWARE ENERGY-EFFICIENCY OPTIMIZATION
The model presented in Sec.II assumes the worst-case interference in a fully loaded system, which leads to a lower bound on the link spectral efficiency (c.f.Assumption 5).As pointed out in Remark 3, the main rationale behind this approach is the need for avoiding coverage holes when network elements are deactivated.The price is a sub-optimal performance in terms of energy efficiency because the interference is overestimated, and therefore users may use more resource blocks than required to keep their minimum data rate requirements.An immediate consequence of this is that more cells are activated than are necessary for meeting the minimum rate requirements at the TPs.In this section, we extend the optimization problem in (11) to incorporate more precise estimates of the load induced by a given user-cell assignment, which is not a trivial task because it involves load computation (with fixed assignments) that requires the solution of a system of nonlinear equations [18], [13], [34] (note that we can easily estimate the link spectral efficiency from the load by using ( 2)).
In what follows, we propose an approach that typically yields good approximations of the true link spectral efficiencies. The idea is to use a two-step alternating iterative scheme:
Step 1 Compute the link spectral efficiency $\omega_{i,j}(\rho)$ defined in (2) for all $i \in \mathcal{M}$, $j \in \mathcal{N}$ and for the load value obtained in the previous iteration of Step 2 of the algorithm (in the first iteration of the algorithm, we can use the worst-case spectral efficiency), and solve Problem (11) with these (fixed) link spectral efficiencies to obtain a TP-cell assignment $X$.
Step 2 For the TP-cell assignment obtained in Step 1, compute the load induced by this assignment.
Regarding the load computation in Step 2, we use the fact that the load $\rho$ induced by a given assignment $X$ is a fixed point of the following standard interference mapping (see [34], [35] and the references therein for further details): where $\Gamma$ is a large constant and λ i,j := . Since $J$ is a standard interference mapping and $I_i(\rho)$ is bounded above, we conclude that the fixed point always exists and is unique [14], [36]. Moreover, efficient iterative methods are known to approach the fixed point with an arbitrary precision [14], [36]. We summarize the heuristic proposed in this section in Alg. 3.
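The load computation in Step 2 can be illustrated with a plain fixed-point iteration. The following is a minimal sketch, not the mapping $J$ of this paper: the load formula, the SINR expression, and all names (`make_load_mapping`, `fixed_point`, `gain`, `noise`, the toy numbers) are assumptions made for illustration, based on a commonly used load-coupling model in which the spectral efficiency is $\eta_{BW} \log_2(1 + \mathrm{SINR}/\eta_{SINR})$.

```python
import math


def make_load_mapping(assign, demand, gain, power, bandwidth, noise,
                      eta_bw=0.83, eta_sinr=1.0):
    """Return a hypothetical load mapping rho -> J(rho).

    assign[i][j] is the fraction of TP j served by cell i, gain[i][j] the
    linear path gain from cell i to TP j. The formula is an assumed
    load-coupling model, not the exact mapping of the paper.
    """
    n_cells, n_tps = len(gain), len(demand)

    def mapping(rho):
        new_rho = []
        for i in range(n_cells):
            load = 0.0
            for j in range(n_tps):
                if assign[i][j] == 0.0:
                    continue
                # Interference from other cells scales with their load.
                interference = sum(power[k] * gain[k][j] * rho[k]
                                   for k in range(n_cells) if k != i)
                sinr = power[i] * gain[i][j] / (interference + noise)
                omega = eta_bw * math.log2(1.0 + sinr / eta_sinr)
                load += assign[i][j] * demand[j] / (bandwidth * omega)
            new_rho.append(load)
        return new_rho

    return mapping


def fixed_point(mapping, rho0, tol=1e-9, max_iter=1000):
    """Iterate rho <- mapping(rho) until successive iterates differ by < tol."""
    rho = list(rho0)
    for _ in range(max_iter):
        nxt = mapping(rho)
        if max(abs(a - b) for a, b in zip(nxt, rho)) < tol:
            return nxt
        rho = nxt
    return rho


# Toy example (assumed values, normalized units): two cells, two TPs.
J = make_load_mapping(assign=[[1.0, 0.0], [0.0, 1.0]],
                      demand=[0.5, 0.5],
                      gain=[[1.0, 0.1], [0.1, 1.0]],
                      power=[1.0, 1.0], bandwidth=1.0, noise=0.1)
worst_case = J([1.0, 1.0])             # load under full-load interference
rho_star = fixed_point(J, [1.0, 1.0])  # load consistent with its own interference
```

In this toy setting the fixed point lies below the worst-case load, which is the effect that the alternating scheme exploits.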

VI. NUMERICAL EVALUATION
In the following, we present a numerical evaluation of the performance of the proposed algorithm in different networks. We start by outlining the basic simulation scenario followed by a comparison with two reference schemes with respect to the energy savings and computational time. Next, we present the ability of the proposed algorithm to incorporate a variety of different base station energy consumption models. Finally, we show the performance gains achieved by applying Alg. 3 from Sec. V.

3: Use Alg. 2 to obtain $X^{(n)}$ and remove deactivated cells from the set of cells to be considered in subsequent iterations.
4: Compute the new link spectral efficiency $\omega^{(n)}$ for the assignment $X^{(n)}$ by computing the fixed point of the standard interference mapping $J$.
5: end for
6: Return the network configuration resulting from $X^{(Z)}$.

A. Basic Simulation Scenario
The simulated network is located in a square-shaped area of size 2 km × 2 km, where $L$ base stations are placed at locations chosen uniformly at random. Unless stated otherwise, each base station has three cells directed at 0°, 120° and 240°, respectively. Traffic generated by users is represented by $N$ TPs on an irregular grid. Hence, each TP represents the traffic requirements of an area of different size. To obtain spatially varying traffic requirements, we use the following traffic model in each run of the simulations. We define three circular hot-spot areas with centers chosen uniformly at random within the area. There are two types of TPs: "hot-spot TPs (HTP)" and "standard TPs (STP)". Each TP in the simulation has probability 0.3 of being an HTP and probability 0.7 of being an STP. While the position of an STP is chosen uniformly at random within the whole area, an HTP is assigned uniformly at random to one of the three hot-spot areas. Its final position is determined in polar coordinates by sampling the distance from the hot-spot center from a normal distribution and the angle from a uniform distribution. We use a wrap-around model to avoid boundary effects and to determine the location of TPs that would otherwise be placed outside the square-shaped area. The data rate requirements of TPs are drawn from a normal distribution with mean $\mu_d = 128$ kbps and variance $\sigma_d^2 = 32$ kbps$^2$, with a lower bound of 1 kbps. The signal attenuation for links between cells and TPs follows the ITU propagation model for urban macro cell environments with a horizontal antenna pattern for 3-sector cell sites with fixed antenna patterns [37].
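The traffic model described above can be sketched in a few lines. This is an illustrative sketch only: the function name `sample_tps`, the standard deviation `radius_std` of the hot-spot distance (not stated in the text), the use of a zero-mean normal distance taken in absolute value, and the modulo-based wrap-around are all assumptions.

```python
import math
import random


def sample_tps(n_tps, side=2000.0, n_hotspots=3, p_hot=0.3,
               radius_std=100.0, mu_d=128.0, var_d=32.0, min_rate=1.0,
               seed=0):
    """Draw TP positions (in meters) and rate requirements (in kbps).

    radius_std is a hypothetical parameter; the text only states that the
    distance to the hot-spot center is normally distributed.
    """
    rng = random.Random(seed)
    centers = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
               for _ in range(n_hotspots)]
    tps = []
    for _ in range(n_tps):
        if rng.random() < p_hot:
            # Hot-spot TP: polar offset around a randomly chosen center.
            cx, cy = rng.choice(centers)
            dist = abs(rng.gauss(0.0, radius_std))
            angle = rng.uniform(0.0, 2.0 * math.pi)
            # Wrap-around: points leaving the square re-enter on the other side.
            x = (cx + dist * math.cos(angle)) % side
            y = (cy + dist * math.sin(angle)) % side
            kind = "HTP"
        else:
            # Standard TP: uniform over the whole area.
            x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
            kind = "STP"
        # Normal rate requirement with a lower bound of min_rate.
        rate = max(min_rate, rng.gauss(mu_d, math.sqrt(var_d)))
        tps.append((x, y, rate, kind))
    return tps


tps = sample_tps(1000)
hot_fraction = sum(1 for tp in tps if tp[3] == "HTP") / len(tps)
```

With a fixed seed the draw is reproducible, which is convenient when the same scenario realization has to be evaluated by several algorithms.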
Unless otherwise stated, we use the following simulation parameters: $\epsilon^\star = 10^{-3}$, $\epsilon = 10^{-3}$, $B_i = 20$ MHz, $P_i = 40$ dB, $\eta_{SINR} = 1$, $\eta_{BW} = 0.83$, $c_i = 500$ W and $e_i = 280$ W. The values of the last six parameters have been chosen to mimic the behavior of commercial LTE systems. Furthermore, we use $f_i(\rho_i) = 564\,\rho_i$ to model the load-dependent energy consumption, which is a value similar to the dynamic energy consumption of current macro cells with 6 transmit antennas [28].
The proposed algorithms are compared with a solution of the original problem in (5) and, where possible, with the centralized cell zooming approach from [16]. The solution to the problem in (5) is obtained by using Matlab 2013a in combination with IBM's CPLEX on an Intel Core i7 PC with four cores. As shown later in this section, the computational time to solve (5) grows fast with the problem size. Therefore, to solve the problem in (5) in a reasonable time for comparison purposes, we confine our attention to small networks with $M = 102$ cells ($L = 34$ base stations) and $N = 100$ TPs, unless otherwise stated. We obtained the 95% confidence intervals depicted in the figures by applying the bias-corrected and accelerated bootstrap method [38] to the outcome of 100 independent runs of the simulations. Results related to the overall network energy consumption are normalized to the energy consumption of the network when all cells are active and fully loaded.

Definition 7 (Normalized network energy consumption): Given a TP assignment $X$ inducing cell load $\rho$ and given the resulting network energy consumption $E(\rho)$, the normalized network energy consumption is defined to be $E(\rho)/E(1)$, where the term in the denominator is the energy consumption for a fully loaded system ($\rho = 1$).
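As a worked illustration of Definition 7, the following sketch evaluates the normalized network energy consumption for a small example. The energy model used here is an assumption: a static part `c` is charged once per base station with at least one active cell, each active cell adds a static part `e` and a load-dependent part `slope * rho`, and a cell with zero load is treated as deactivated. The function names and the example deployment are hypothetical.

```python
def network_energy(loads, site_of, c=500.0, e=280.0, slope=564.0):
    """Energy (in W) of a configuration under an assumed consumption model.

    loads[i] is the load of cell i and site_of[i] the index of its base
    station. Cells with zero load are assumed to be switched off.
    """
    active_sites = {site_of[i] for i, rho in enumerate(loads) if rho > 0.0}
    static_sites = c * len(active_sites)
    cells = sum(e + slope * rho for rho in loads if rho > 0.0)
    return static_sites + cells


def normalized_energy(loads, site_of, **model):
    """Energy relative to the case in which all cells are active and fully loaded."""
    full = network_energy([1.0] * len(loads), site_of, **model)
    return network_energy(loads, site_of, **model) / full


# Two base stations with three cells each; only one cell stays active.
site_of = [0, 0, 0, 1, 1, 1]
example = normalized_energy([0.5, 0.0, 0.0, 0.0, 0.0, 0.0], site_of)
# Numerator: 500 + 280 + 564 * 0.5 = 1062 W.
# Denominator: 2 * 500 + 6 * (280 + 564) = 6064 W.
```

Under these assumptions the example configuration consumes 1062 W out of 6064 W, that is, roughly 17.5% of the fully loaded network.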
We refer to the sparsity supporting majorization-minimization algorithm as "sMM" and to any algorithm that solves (5) directly as "MIP" algorithm (MIP: mixed-integer programming).
We refer to solutions obtained by the centralized cell zooming algorithm in [16] as "cCZ". The alternating approach proposed in Sec. V is referred to as the "alternating sMM" algorithm.

B. Computational performance comparison between sMM, cCZ and MIP
The cCZ has limited capability to incorporate different energy consumption models and base stations with several sectors, so we confine ourselves to a simple base station model. We assume a homogeneous network model under which all base stations have only one omni-directional

To show trends, we start with the standard setup described above, and we gradually increase the number of TPs in the system. Fig. 1 shows the normalized network energy consumption. As expected, the normalized network energy consumption for all three algorithms increases as the number of TPs increases. This is intuitive because additional TPs add extra rate requirements that increase the total system load, which in turn reduces the redundancy in the network to be exploited for energy savings. The proposed sMM algorithm as well as the MIP algorithm provide network configurations that exhibit much smaller normalized network energy consumption than the network configurations obtained with the cCZ algorithm. The smallest energy consumption is achieved with the MIP algorithm, which outperforms the proposed sMM algorithm. For the scenario with 200 TPs, the sMM algorithm results in a normalized network energy consumption of 12% on average. For the same number of TPs, the average normalized energy consumption under the cCZ and MIP algorithms is 49% and 7%, respectively. Similarly, for 1000 TPs, the resulting average normalized network energy consumption of 31% for the sMM algorithm is still larger than the 21% normalized energy consumption corresponding to the MIP solutions. However, it is still much smaller than that of cCZ, which has 88% normalized energy consumption.
These results emphasize that the sMM algorithm is a suboptimal heuristic that is nevertheless able to find network configurations consuming low energy. Even though the resulting network energy consumption is not globally optimal, it shows much larger energy savings than the comparison scheme cCZ.
The main advantage of the proposed sMM algorithm is its fairly low computational complexity, which directly affects the time required to obtain an optimization result. Fig. 2 depicts the normalized time needed to obtain the results of Fig. 1. This time is normalized with respect to the computation time of the MIP algorithm with 100 cells and 100 TPs. The sMM algorithm always provides results in a substantially shorter time than the MIP algorithm. Even for a relatively small scenario of 100 cells and 300 TPs, the computation time is already about 200 times larger for the MIP algorithm compared to the proposed sMM algorithm. For larger setups with 1000 TPs, the normalized time to solve the MIP was ≈ 237 compared to ≈ 0.49 for the sMM algorithm, which is an approximately 488-fold reduction in the computation time. We emphasize that the simulated scenarios are small and the computation of the MIP solution becomes infeasible in practical scenarios. Already for a network with 200 cells and 10,000 TPs, the sMM algorithm provided a solution in about 13 s, whereas the MIP algorithm could not find a solution within one hour. Compared with the cCZ algorithm, the proposed sMM algorithm takes longer because the cCZ algorithm uses a lower-complexity heuristic. For a scenario of 300 TPs, the average computation time is about 22 times larger for the sMM algorithm, and with 1000 TPs it is about 43 times larger. However, with typical values of less than 1 s, the computation time is still reasonably small to allow for an online implementation. Considering the advantages in energy savings, as seen from Fig. 1, the proposed sMM algorithm presents a good trade-off between computation time and energy savings.

C. Cells with different sources of energy consumption
In contrast to other approaches to the problem of energy-efficient network topology control, our optimization framework can easily deal with heterogeneous networks in which cells have different static and load-dependent energy consumptions in (4). In other words, the proposed sMM algorithm can cope with different energy consumption models of cells. It can select those network configurations that exhibit as low overall energy consumption as possible. To illustrate the impact of different energy consumption models on the optimization result, we start by varying the static energy consumption of all cells, while keeping the load-dependent energy consumption

To study the impact of the static energy consumption of cells $e_i$, in the following simulations we use single-cell omnidirectional base stations, and we set the load-dependent part for all cells and the common static part at base stations to zero, i.e., $f(\rho_i) = 0$ and $c_l = 0$. The static energy consumption of half of the cells is varied, while the static energy consumption of the other half remains unchanged. We refer to the cells with standard fixed energy consumption as type 1, while type 2 is used to refer to cells with a varying energy consumption. The energy consumption of type 2 cells is specified relative to that of type 1 cells. More precisely, an energy consumption relation of $\beta = 0.5$ means that if $e_i = 780$ W for type 1 cells, then $e_i = 390$ W for type 2 cells. The results for a scenario consisting of 100 cells and 100 TPs are shown in Fig. 3.
The simulation confirms the ability of our optimization framework to incorporate different static energy consumptions. When all cells consume the same amount of energy ($\beta = 1$), the algorithm does not distinguish between type 1 and type 2 cells. The energy consumption of type 1 and type 2 cells is roughly the same, indicating that equally many type 1 and type 2 cells are active in the obtained solution. In contrast, if type 2 cells consume less energy than type 1 cells ($\beta < 1$), then the algorithm prefers to deactivate type 1 cells, while attempting to keep type 2 cells active. If $\beta > 1$, the situation is reversed in the sense that, if possible, type 2 cells are preferably selected for deactivation.
The differentiation becomes even more evident for cell deployments where type 1 and type 2 cells are co-located. In such a case, two cells of different type are located at the same site and are "exchangeable" with respect to the service provided to the TPs (recall that we use omnidirectional cells in these simulations). In other words, if a TP is assigned to a location with two co-located cells, then it does not matter which cell is used to provide the service to the TP. This implies that the decision whether to deactivate a cell or not should depend only on the energy consumption of this cell in relation to its co-located cell. The simulations with such a deployment are shown in Fig. 4, where we see that, for $\beta < 1$, there is no active cell of type 1, while, for $\beta > 1$, type 2 cells consume more energy and the simulations confirm that the algorithms clearly prefer to activate type 1 cells.
To obtain insight into the impact of the load-dependent energy consumption, we fix the static energy consumption of a single-cell omnidirectional base station to be $e_i = 780$ W and $c_l = 0$ W, and we vary the load-dependent energy consumption $f_i(\rho) = 564\, c' \rho_i$ by letting $c'$ take values in $\{0, 1, 10\}$. For an increasing number of TPs, Fig. 5 shows the fraction of active cells, while the normalized network energy consumption is shown in Fig. 6. First, we observe that the network energy consumption always increases with an increasing number of TPs, which is no surprise. Moreover, the fraction of active cells increases with $c'$ for both the sMM

In other words, instead of deactivating as many cells as possible to minimize the static energy consumption, the algorithm deactivates the cells to find the best possible balance between the static and load-dependent energy consumption. This can be observed in Fig. 5, where we can see that the higher the load-dependent energy consumption (which is reflected by $c' \ge 0$), the more cells are activated under both the sMM algorithm and the MIP algorithm. In particular, if

D. Alternating sMM algorithm
We now study the performance of the alternating sMM algorithm presented in Sec. V. The standard simulation parameters are used with a total number of $Z = 10$ iterations. To show the effect of different TP requirements, we performed simulations under our standard simulation setup for different mean data rates $\mu_d$ at TPs and, for each mean data rate, we used 100 different realizations of the simulation scenario. The initial link spectral efficiency is computed based on the worst-case interference according to Eq. (2). Our goal is to show the huge potential for energy savings when the actual load is estimated as in Alg. 3, instead of assuming the worst-case interference scenario, which corresponds to the fully loaded system (see Definition 5). The outcome of the simulation is depicted in Fig. 7, which includes the 95% confidence intervals and shows the normalized network energy consumption with respect to the energy consumption when all cells are active. We can see that the application of the techniques from Sec. V leads to a significant reduction of the normalized network energy consumption for both the sMM algorithm and the optimal MIP solution. Furthermore, the largest reduction was always observed after the first iteration, which shows that the worst-case interference assumption is very conservative and the load estimation may lead to considerable performance gains.

VII. CONCLUDING REMARKS
We have introduced an optimization framework for enhancing the energy efficiency of cellular networks. In wireless systems, problems of this type are hard to solve because of their combinatorial nature and the nontrivial interference coupling among cells. Indeed, even with a simplifying assumption of the worst-case interference, the energy saving problem is a mixed integer programming problem that is strongly related to the bin-packing problem, which in turn is known to be NP-hard. Because we therefore cannot expect to find optimal solutions quickly, we focused in this study on fast sub-optimal heuristics. Unlike many existing approaches in the literature, the proposed methods can naturally consider both the dynamic and static energy consumption of base stations with multiple cells in heterogeneous networks.
In the first proposed heuristic, we relaxed the mixed integer programming problem to a form suitable for the application of majorization-minimization techniques. The resulting algorithm requires the solution of a series of linear programming problems that can be efficiently solved with standard mathematical solvers. Therefore, it can be applied to large-scale problems, and it is also suitable for online operation. One limitation of this first method is that it uses the worst-case interference scenario, so it can be too conservative in terms of energy savings. To address this limitation, we also proposed a two-step alternating approach that obtains accurate values of the spectral efficiency of links by using the framework of standard interference functions.
Simulations show that the proposed fast heuristics are able to obtain network configurations that are competitive in terms of energy consumption against optimal algorithms.

ACKNOWLEDGMENTS
This work has been partly supported by the framework of the research project ComGreen under the grant number 01ME11010, which is funded by the German Federal Ministry of Economics and Technology (BMWi). Part of this work has been performed in the framework of the FP7 project ICT-317669 METIS, which is partly funded by the European Union. The authors would like to acknowledge the contributions of their colleagues in METIS, although the views expressed are those of the authors and do not necessarily represent the project.

APPENDIX
Here we briefly summarize the majorization-minimization (MM) algorithm [11], which can be seen as a generalization of the well-known expectation-maximization (EM) algorithm. The presentation that follows is heavily based on that in the study in [39] (see also [1], [34]).
Suppose that the objective is to minimize a function $h : \mathcal{X} \to \mathbb{R}$, where $\mathcal{X} \subset \mathbb{R}^N$. Assume that there exists a solution to this optimization problem, and let $x^\star \in \mathcal{X}$ be a global minimizer of $h$; i.e., $h(x^\star) \le h(x)$ for every $x \in \mathcal{X}$. Unless $h$ has a special structure that can be exploited (e.g., convexity), finding $x^\star$ is computationally intractable in general [40]. Hence, we typically have to content ourselves with generating a sequence of vectors with non-increasing objective value. To this end, we can use the majorization-minimization (MM) technique, which drives $h$ downhill with the help of a majorizing function $g : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$. In more detail, we say that $g$ is a majorizing function for $h$ if it satisfies the following properties:
C.1 $g$ majorizes $h$ at every point in $\mathcal{X}$, i.e., $h(x) \le g(x, y), \ \forall x, y \in \mathcal{X}$, (16)
C.2 $g$ and $h$ coincide at $(x, x)$ so that $h(x) = g(x, x), \ \forall x \in \mathcal{X}$. (17)
By starting from a feasible point $x^{(0)} \in \mathcal{X}$, the MM algorithm generates a sequence $(x^{(n)})_{n \in \mathbb{N}} \subset \mathcal{X}$ with monotonically non-increasing function values $h(x^{(n)})$ according to (we assume that the optimization problems have a solution)
$$x^{(n+1)} \in \arg\min_{x \in \mathcal{X}} g(x, x^{(n)}). \qquad (18)$$
Irrespective of the choice of $g$, we can easily verify monotonicity of the objective value with the help of (16), (17) and (18):
$$h(x^{(n)}) = g(x^{(n)}, x^{(n)}) \ge g(x^{(n+1)}, x^{(n)}) \ge g(x^{(n+1)}, x^{(n+1)}) = h(x^{(n+1)}).$$
Therefore, since the function $h$ is bounded below when restricted to $\mathcal{X}$ by assumption, we can conclude that $h(x^{(n)}) \to c \in \mathbb{R}$ for some $c \ge h(x^\star)$ as $n \to \infty$. However, we emphasize that this in general does not imply the convergence of the sequence $(x^{(n)})$.
The choice of the function $g$ is problem dependent, but it should be sufficiently structured in order to make the optimization problem in (18) tractable. In particular, in our study we deal with concave and continuously differentiable functions $h$. In such cases, a natural choice for $g$ satisfying (16) and (17) is
$$g(x, y) = h(y) + \nabla h(y)^T (x - y).$$
This particular choice is common in, for example, sparse signal recovery [30].
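To make the linearization concrete, the following sketch applies the MM iteration (18) with the first-order majorizer to a toy problem. Everything in it is an assumption made for illustration and is not the optimization problem of this paper: the concave surrogate $h(x) = \sum_i \log(\epsilon + x_i)$, the feasible set $\{x : \sum_i x_i = d,\ 0 \le x_i \le u_i\}$, and the names `mm`, `minimize_linear`, `caps` and `demand`. For this feasible set, each linearized subproblem is a linear program that a greedy fill solves exactly.

```python
import math


def h(x, eps=1e-3):
    """Concave sparsity-promoting surrogate: sum_i log(eps + x_i)."""
    return sum(math.log(eps + xi) for xi in x)


def grad_h(y, eps=1e-3):
    """Gradient of h at y; these are the weights of the linearized problem."""
    return [1.0 / (eps + yi) for yi in y]


def minimize_linear(w, caps, demand):
    """Solve min_x w.x subject to sum(x) = demand and 0 <= x_i <= caps_i.

    With positive weights, filling the cheapest coordinates first is
    optimal. Requires demand <= sum(caps).
    """
    x = [0.0] * len(w)
    remaining = demand
    for i in sorted(range(len(w)), key=lambda k: w[k]):
        x[i] = min(caps[i], remaining)
        remaining -= x[i]
        if remaining <= 0.0:
            break
    return x


def mm(caps, demand, iters=10, eps=1e-3):
    """Run the MM iteration with g(x, y) = h(y) + grad_h(y).(x - y)."""
    total = sum(caps)
    x = [demand * c / total for c in caps]  # feasible starting point
    history = [h(x, eps)]
    for _ in range(iters):
        # Minimizing g(., x) over the feasible set only involves grad_h(x).x.
        x = minimize_linear(grad_h(x, eps), caps, demand)
        history.append(h(x, eps))
    return x, history


x_final, history = mm(caps=[1.0, 0.8, 0.6, 0.4], demand=1.5)
```

In this example the iterates concentrate the demand on a few coordinates, and the recorded objective values never increase, in line with the monotonicity argument above.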
Remark 5: We note that, instead of solving the optimization problem in (18) exactly, it is sufficient for the monotonicity of the sequence $\{h(x^{(n)})\}$ that $g(x^{(n+1)}, x^{(n)}) \le g(x^{(n)}, x^{(n)})$ for every $n \in \mathbb{N}$. This observation is relevant if the right-hand side of (18) can only be solved asymptotically, in which case the iteration can be truncated whenever the above inequality is satisfied.
a given continuous function relating the energy consumption to the corresponding cell load.

Fig. 1: Comparison of normalized network energy consumption obtained with the sMM algorithm, the cCZ algorithm and the solution of the MIP problem for increasing number of TPs. Normalization with respect to the network energy consumption for a fully loaded system ($\rho = 1$) when all cells are active.

Fig. 2: Comparison of normalized computation time to obtain results with the sMM algorithm, the cCZ algorithm and the direct solution of the MIP problem. Normalization with respect to the empirical average of the MIP's computation time for 100 cells and 100 TPs over 100 realizations.

Fig. 3: Fraction of active type 1 and type 2 cells in the final solution obtained with sMM and MIP. Deployment uniformly at random for type 1 and type 2 cells.

Fig. 4: Fraction of type 1 and type 2 cells in the final solution obtained with sMM and MIP. Type 1 cells are deployed uniformly at random, and type 2 cells are co-located with type 1 cells.

Fig. 6: Normalized network energy consumption for different dynamic energy consumption weights $c'$ with increasing number of TPs. Normalization with respect to the energy consumption when all cells are active.

Fig. 7: Normalized network energy consumption when applying Alg. 3 with the sMM algorithm, compared with using the MIP solution in each iteration. Normalization with respect to the energy consumption when all cells are active.