Self-optimized heterogeneous networks for energy efficiency

Fan, Shaoshuai; Tian, Hui; Sengul, Cigdem

doi:10.1186/s13638-015-0261-1

Research
Open access
Published: 04 February 2015

EURASIP Journal on Wireless Communications and Networking volume 2015, Article number: 21 (2015)

Abstract

Explosive increase in mobile data traffic driven by the demand for higher data rates and ever-increasing number of wireless users results in a significant increase in power consumption and operating cost of communication networks. Heterogeneous networks (HetNets) provide a variety of coverage and capacity options through the use of cells of different sizes. In these networks, an active/sleep scheduling strategy for base stations (BSs) becomes an effective way to match capacity to demand and also improve energy efficiency. At the same time, environmental awareness and self-organizing features are expected to play important roles in improving the network performance. In this paper, we propose a new active/sleep scheduling scheme based on the user activity sensing of small cell BSs. To this end, coverage probability, network capacity, and energy consumption of the proposed scheme in K-tier heterogeneous networks are analyzed using stochastic geometry, accounting for cell association uncertainties due to random positioning of users and BSs, channel conditions, and interference. Based on the analysis, we propose a sensing probability optimization (SPO) approach based on reinforcement learning to acquire the experience of optimizing the user activity sensing probability of each small cell tier. Simulation results show that SPO adapts well to user activity fluctuations and improves energy efficiency while maintaining network capacity and coverage probability guarantees.

1 Introduction

To satisfy the explosive increase in mobile data traffic demand, heterogeneity is expected to be a key feature of future wireless networks [1-4]. Heterogeneous networks (HetNets) consist of a conventional cellular network overlaid with a diverse set of lower power small cell base stations (BSs), such as microcells, picocells, and femtocells, to improve spatial frequency reuse and coverage. This allows the network to achieve higher data rates while retaining seamless connectivity and mobility. However, the deployment of these additional small cell base stations also considerably increases the overall energy consumption and operating cost of networks [5,6]. As a result, green wireless communication has attracted the attention of both researchers and network operators, and energy efficiency has become one of the key network management parameters [2,7]. Additionally, future heterogeneous networks are expected to operate in a self-organizing manner to reduce the operational expenditures (OPEX) incurred by deploying large numbers of BSs [8].

An effective way to adapt to the traffic demand while improving energy efficiency is to perform active/sleep scheduling that takes advantage of the fluctuations in traffic demand over time and space [9]. In [10], a sleep mode is shown to be effective for a single-tier network, especially when the cell size is small and the traffic is light. For heterogeneous networks, Soh et al. [2] applied tools from stochastic geometry to analyze the impact of a load-aware sleeping strategy on coverage probability, finding its performance to be at least as good as that without a sleep mode. Active/sleep scheduling can be controlled by the user equipment (UE), the small cell, or the core network [11]. If it is network-controlled as proposed in [12], information about the traffic load and user locations is needed to identify hotspots and make the active/sleep decisions. Therefore, it is attractive to deploy distributed sleep mode strategies that do not involve the UE, extra signaling overhead, or user location awareness. Wildemeersch et al. [5] investigated using small cells in a distributed way to offload traffic from the macrocell network, exploiting their cognitive capability of user activity sensing to improve energy efficiency through active/sleep scheduling. However, their analysis in a two-tier network environment only considered the network performance of traffic offloading and user detection. The quality of service (QoS) of users, such as coverage probability and throughput, which should be guaranteed as the baseline of energy saving, was ignored. Moreover, their user detection model did not consider the operation status of BSs, so active BSs would consume additional energy through unnecessary sensing. Also, a user's association with a small cell tier affects the detection of that user, because only macrocell users can be detected under their model. This issue makes the scheme inapplicable to the general multi-tier heterogeneous network scenario.

In this paper, we propose an active/sleep scheduling scheme for K-tier heterogeneous networks exploiting self-organizing capabilities. In our scheme, to guarantee coverage, macrocells are always active. However, when a small cell does not serve any active users, it goes into a sleep mode, during which it wakes up only to sense macrocell user activity. If the small cell detects an active user within its coverage during the sensing period, it becomes active to offload traffic from the macrocell. We analyze the coverage probability, network capacity, and energy consumption of the proposed scheme in a K-tier heterogeneous network using stochastic geometry, accounting for cell association uncertainties due to random positioning of users and BSs, channel conditions, and interference. To save as much energy as possible, user detection follows a sensing probability, which is self-optimized by the network. The sensing probability optimization (SPO) approach based on reinforcement learning is proposed to acquire the experience of optimizing the user activity sensing probability of each small cell tier, considering the user activity fluctuations and user QoS such as coverage and throughput.

The rest of the paper is organized as follows: In Section 2, we describe the system model and propose the user activity sensing-based active/sleep scheduling scheme. In Section 3, we describe the energy efficiency optimization problem and present the details of the proposed fuzzy Q-learning-based SPO approach. In Section 4, we present the simulation results. Finally, we draw the conclusions in Section 5.

2 User activity sensing-based active/sleep scheduling scheme

2.1 System model and assumptions

We consider a heterogeneous network that consists of K tiers of BSs, where the first tier of macrocell BSs is overlaid with K−1 tiers of denser and lower power small cell BSs. We consider that all tiers share the full spectrum and, hence, interference exists between tiers. All small cell BSs operate in open-access mode, such that they are accessible to all users. In order to improve energy efficiency, we propose an active/sleep scheduling scheme which makes use of monitoring user activity and self-organizing capabilities.

We model the user and BS activity using a time-slotted model as depicted in Figure 1. To guarantee coverage, macrocells are always active over the slot duration T. When a small cell does not serve any active user, it goes into a sleep mode to save energy but still senses macrocell user activity over a sensing time t_s. Active small cells do not sense user activity and only transmit during T−t_s to ensure that only macrocell users are detected during the sensing time. This is because the small cells in our model use energy detection (ED) to sense user activity due to its low complexity and low power consumption [13,14]. However, ED may still produce false positives, as it may also be affected by noise or by interference originating from macrocell users outside the small cell coverage [13]. Nevertheless, to keep the complexity low, in our model, if the detected energy is higher than the threshold, the small cell believes that there is an active macrocell user within its coverage range and becomes active during T−t_s by transmitting pilot signals. Subsequently, the user reports the presence of the small cell to the macrocell, and the user might be handed over to the small cell according to cell association policies (e.g., maximum received power-based cell association [15]).

The spatial distribution of macrocell BSs in the network is usually modeled by lattices or hexagonal cells since their deployment is considered well-planned. Nevertheless, it has been shown that modeling macrocell BSs by a homogeneous Poisson point process (PPP) is tractable and accurate [16,17]. Small cells such as femtocell access points are also extensively modeled by a PPP, mainly due to their uncoordinated and random deployment [15]. Therefore, for K-tier heterogeneous networks, we model the positions of BSs in the kth tier according to a homogeneous PPP Φ_k with density λ_k. Users are also located according to a homogeneous PPP Φ_u with density λ_u that is independent of Φ_k (k=1,2,⋯,K). The probability that there resides at least one active user within the coverage of a BS in the kth tier is [18]:

$$ p_{uk}=1-{e^{-{\lambda_{u}}p_{k}\pi{R^{2}}}}, $$

((1))

where λ_u p_k is the intensity of the users associated with the kth tier.
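As a minimal numerical sketch of Equation 1, the snippet below evaluates the void-probability expression directly. The function name, the interpretation of R as a coverage radius in meters, and the numeric values are illustrative assumptions; they are not taken from the paper's simulation parameters.

```python
import math


def prob_active_user(lambda_u: float, p_k: float, radius: float) -> float:
    """Equation 1: probability that at least one active user lies within a
    disc of the given radius, when the users associated with tier k form a
    PPP of intensity lambda_u * p_k."""
    mean_users = lambda_u * p_k * math.pi * radius ** 2  # expected users in the disc
    return 1.0 - math.exp(-mean_users)


# Hypothetical values chosen only for illustration.
p_uk = prob_active_user(lambda_u=1e-4, p_k=0.3, radius=50.0)
print(f"p_uk = {p_uk:.4f}")
```

The expression is one minus the probability that a Poisson count with the stated mean equals zero, so it grows toward 1 as the user density or the coverage area increases.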

To reduce user detection energy, BSs in the kth small cell tier perform sensing with a certain probability p_sk (k=2,3,⋯,K), which is self-optimized by the network using the sensing probability optimization approach described in Section 3.

2.2 Analysis of the active/sleep scheduling scheme

The probability of BS active/sleep modes is determined by the sensing probability vector p_s=(p_s2,p_s3,⋯,p_sK) followed by small cells at each tier. In our scheme, a small cell becomes active if there is at least one user of the previously active small cell that needs to be served, or the previously sleeping small cell performs user activity sensing and detects a macrocell user. Note that this detection may be a false positive. The state transition process of BS active/sleep modes determined by the sensing probability vector can be summarized as:

$$ \begin{aligned} p_{ak} (\,p_{s})&= p_{ak}(\,p_{s})p_{uk} + (1-p_{ak}(\,p_{s}))p_{sk}p_{u1}{p_{\mathrm{d}}} \\&\quad+ (1-p_{ak}(p_{s}))p_{sk}\left({1 - p_{u1}} \right){p_{\mathrm{f}}}. \end{aligned} $$

((2))

Here, p_ak(p_s) and 1−p_ak(p_s) are the probabilities that a BS in the kth tier is in active and sleep mode, respectively. p_a1=1 because macrocells are always active. p_d and p_f are the detection probability and false alarm probability, calculated as [19]:

$$ {p_{\mathrm{d}}} = Q\left({\left({\frac{\eta }{{\sigma^{2}}} - \gamma - 1} \right)\sqrt {\frac{{N}}{{2\gamma + 1}}} } \right), $$

((3))

$$ {p_{\mathrm{f}}} = Q\left({\left({\frac{\eta}{{\sigma^{2}}} - 1} \right)\sqrt {N} } \right), $$

((4))

where Q(·) is the complementary cumulative distribution function of the standard Gaussian, η is the detection threshold used by energy detection, σ² is the variance of the additive white Gaussian noise, γ is the signal-to-interference-plus-noise ratio (SINR), N=⌊τ_s f_s⌋ is the total sample size, and f_s is the sampling frequency. Note that the detection probability and false alarm probability can be adjusted to certain target values, $p_{\mathrm{d}}^{*}$ and $p_{\mathrm{f}}^{*}$, by setting the sensing threshold and sampling frequency to appropriate values $\eta^{*}$ and $f_{\mathrm{s}}^{*}$; how to choose these values is outside the scope of this paper.
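The following sketch shows one way Equations 2 to 4 could be evaluated numerically. Equation 2 is linear in p_ak once p_uk and p_u1 are treated as given constants, so it can be solved in closed form; that simplification is an assumption of this sketch, since p_uk itself depends on the association probability through Equation 1. All function names and test values are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Tail probability of the standard Gaussian, Q(x) = P[Z > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def detection_prob(eta: float, sigma2: float, gamma: float, n_samples: int) -> float:
    """Equation 3: detection probability of the energy detector."""
    arg = (eta / sigma2 - gamma - 1.0) * math.sqrt(n_samples / (2.0 * gamma + 1.0))
    return q_function(arg)


def false_alarm_prob(eta: float, sigma2: float, n_samples: int) -> float:
    """Equation 4: false alarm probability of the energy detector."""
    return q_function((eta / sigma2 - 1.0) * math.sqrt(n_samples))


def active_prob(p_sk: float, p_uk: float, p_u1: float, p_d: float, p_f: float) -> float:
    """Solve Equation 2 for p_ak, assuming p_uk and p_u1 are fixed inputs.

    Rearranging p = p*p_uk + (1-p)*w with w = p_sk*(p_u1*p_d + (1-p_u1)*p_f)
    gives p = w / (1 - p_uk + w).
    """
    wake = p_sk * (p_u1 * p_d + (1.0 - p_u1) * p_f)
    denom = 1.0 - p_uk + wake
    if denom == 0.0:
        raise ValueError("Equation 2 is degenerate for p_uk = 1 and zero wake-up probability")
    return wake / denom
```

With a sensing probability of zero the solution is p_ak = 0 whenever p_uk < 1, which matches the intuition that a tier that never senses eventually has no active small cells.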

Theorem 1.

The probability that a user associates with the kth-tier small cell using the maximum received power cell association policy is:

$$ {p_{k}} = \frac{{{\lambda_{k}}p_{ak}(p_{s})}}{{\sum\limits_{i = 1}^{K} {p_{ai}(p_{s}){\lambda_{i}}{{\left({{P_{i}}/{P_{k}}} \right)}^{2/\alpha }}} }}, $$

((5))

where P_k is the transmit power of BSs in the kth tier, and α is the path loss exponent.

Proof.

See Appendix 1.
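Equation 5 can be evaluated tier by tier as in the sketch below. The function name and argument layout (one list entry per tier, index 0 for the macrocell tier) are assumptions made for illustration.

```python
def association_probs(densities, active_probs, powers, alpha):
    """Equation 5: probability that a user associates with each tier under
    maximum received power association.

    densities[i], active_probs[i], powers[i] hold lambda_i, p_ai(p_s), P_i.
    """
    n_tiers = len(densities)
    probs = []
    for k in range(n_tiers):
        denom = sum(
            active_probs[i] * densities[i] * (powers[i] / powers[k]) ** (2.0 / alpha)
            for i in range(n_tiers)
        )
        probs.append(densities[k] * active_probs[k] / denom)
    return probs


# Hypothetical two-tier example: macro tier always active, half the small cells active.
print(association_probs([1e-5, 1e-4], [1.0, 0.5], [20.0, 0.1], alpha=4.0))
```

Multiplying numerator and denominator by P_k^{2/α} shows that the returned probabilities sum to one, which is a convenient sanity check on any parameter set.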

The coverage probability is defined as the probability that a user’s SINR from its associated BS is higher than the target SINR value τ.

Theorem 2.

The coverage probability of a user is:

$$ \begin{aligned} P_{\mathrm{c}} &= \sum\limits_{k=1}^{K} 2\pi p_{ak}(p_{s})\lambda_{k}P_{k}^{2/\alpha}\int_{0}^{\infty} r\exp\left\{ -\tau r^{\alpha}\sigma^{2} \right. \\ &\quad \left. -\pi\sum\limits_{i=1}^{K} r^{2}p_{ai}(p_{s})\lambda_{i}P_{i}^{2/\alpha}\left(1+\rho(\tau,\alpha)\right) \right\} dr, \end{aligned} $$

((6))

where $\rho(\tau,\alpha) = \tau^{2/\alpha}\int_{\tau^{-2/\alpha}}^{\infty} \frac{1}{1+x^{\alpha/2}}\,dx$.

Proof.

See Appendix 2.
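Because Equation 6 has no closed form in general, a numerical evaluation is needed in practice. The sketch below uses a plain midpoint rule from the standard library only; the substitution x = a/u used to map the infinite range of ρ onto (0, 1), the truncation of the outer integral, and all names and parameter values are choices of this sketch rather than of the paper. Convergence is slower when α is close to 2.

```python
import math


def rho(tau: float, alpha: float, n: int = 5000) -> float:
    """rho(tau, alpha) from Theorem 2, via the substitution x = a / u with
    a = tau**(-2/alpha), which maps the range [a, inf) onto (0, 1]."""
    a = tau ** (-2.0 / alpha)
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h  # midpoints avoid the endpoint u = 0
        total += a * u ** (alpha / 2.0 - 2.0) / (u ** (alpha / 2.0) + a ** (alpha / 2.0))
    return tau ** (2.0 / alpha) * total * h


def coverage_prob(densities, active_probs, powers, alpha, tau, sigma2, n=5000):
    """Equation 6, evaluated with a midpoint rule on a truncated range."""
    weights = [p * lam * pw ** (2.0 / alpha)
               for p, lam, pw in zip(active_probs, densities, powers)]
    c = math.pi * sum(weights) * (1.0 + rho(tau, alpha))
    r_max = math.sqrt(40.0 / c)  # beyond this the integrand is below exp(-40)
    h = r_max / n
    integral = 0.0
    for j in range(n):
        r = (j + 0.5) * h
        integral += r * math.exp(-tau * r ** alpha * sigma2 - c * r * r)
    # The integral is the same for every tier k, so the sum over k factors out.
    return 2.0 * math.pi * sum(weights) * integral * h
```

In the noise-free case (σ² = 0) the integral can be done by hand and Equation 6 reduces to 1/(1+ρ(τ,α)), independent of the tier densities and powers, which gives a quick check of the numerical routine.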

If we assume orthogonal transmissions where equal resources are allocated to each user in a round-robin scheduling manner, the ergodic capacity C of a typical user in the K-tier heterogeneous network is given as:

$$ C = {\frac{{{t_{\mathrm{s}}}}}{T}{p_{1}}{C_{0}} + \frac{{T - {t_{\mathrm{s}}}}}{T}\sum\limits_{k = 1}^{K} {{p_{k}}{C_{k}}} }. $$

((7))

In Equation 7, C₀ is the ergodic rate of a user associated with the first tier during the sensing time t_s when there is no interference from the other tiers, and C_k is the ergodic rate of a user associated with the kth tier during the time T−t_s.

Theorem 3.

The ergodic rate of a user associated with the first tier is:

$$ \begin{aligned} C_{0} &= \frac{\ln(1+\tau)\,2\pi\lambda_{1}^{2}P_{1}^{2/\alpha}}{p_{1}^{2}\lambda_{u}}\int_{0}^{\infty} r\exp\left\{ -\tau r^{\alpha}\sigma^{2} \right. \\ &\quad \left. -\pi\sum\limits_{k=1}^{K} r^{2}p_{ak}(p_{s})\lambda_{k}P_{k}^{2/\alpha} - \pi r^{2}\lambda_{1}P_{1}^{2/\alpha}\rho(\tau,\alpha) \right\} dr \\ &\quad + \frac{2\pi\lambda_{1}^{2}P_{1}^{2/\alpha}}{p_{1}^{2}\lambda_{u}}\int_{0}^{\infty}\int_{\ln(1+\tau)}^{\infty} r\exp\left\{ -\left(e^{t}-1\right)r^{\alpha}\sigma^{2} \right. \\ &\quad \left. -\pi\sum\limits_{k=1}^{K} r^{2}p_{ak}(p_{s})\lambda_{k}P_{k}^{2/\alpha} - \pi r^{2}\lambda_{1}P_{1}^{2/\alpha}\rho\left(e^{t}-1,\alpha\right) \right\} dt\,dr \end{aligned} $$

((8))

and the ergodic rate of a user associated with the kth tier is:

$$ \begin{aligned} C_{k} &= \frac{\ln(1+\tau)\,2\pi\left(p_{ak}(p_{s})\right)^{2}\lambda_{k}^{2}P_{k}^{2/\alpha}}{p_{k}^{2}\lambda_{u}}\int_{0}^{\infty} r\exp\left\{ -\tau r^{\alpha}\sigma^{2} \right. \\ &\quad \left. -\pi\sum\limits_{i=1}^{K} r^{2}p_{ai}(p_{s})\lambda_{i}P_{i}^{2/\alpha}\left(1+\rho(\tau,\alpha)\right) \right\} dr \\ &\quad + \frac{2\pi\left(p_{ak}(p_{s})\right)^{2}\lambda_{k}^{2}P_{k}^{2/\alpha}}{p_{k}^{2}\lambda_{u}}\int_{0}^{\infty}\int_{\ln(1+\tau)}^{\infty} r\exp\left\{ -\left(e^{t}-1\right)r^{\alpha}\sigma^{2} \right. \\ &\quad \left. -\pi\sum\limits_{i=1}^{K} r^{2}p_{ai}(p_{s})\lambda_{i}P_{i}^{2/\alpha}\left(1+\rho\left(e^{t}-1,\alpha\right)\right) \right\} dt\,dr. \end{aligned} $$

((9))

Proof.

See Appendix 3.

According to the active/sleep model, the main power consumption of BSs in the first (macrocell) tier consists of the constant power E_c1 and the processing power E_p1. Hence, the expected energy consumption of a macrocell is:

$$ {E_{1}} = E_{c1} + E_{p1}. $$

((10))

The main power consumption of BSs in the remaining tiers consists of the constant power E_ck, the sensing power E_sk, and the processing power E_pk during active mode (k=2,3,⋯,K) [5,20]. We consider that the energy consumption is proportional to time, and constant power is consumed over the entire time slot T. The expected energy consumption of a small cell in the kth tier (k=2,3,⋯,K) is:

$$ E_{k} = E_{ck} + \underbrace{\left(1-p_{ak}(p_{s})\right)p_{sk}E_{sk}\,t_{\mathrm{s}}/T}_{\text{sensing energy for sleep mode}} + \underbrace{p_{ak}(p_{s})E_{pk}\left(T-t_{\mathrm{s}}\right)/T}_{\text{processing energy for active mode}}. $$

((11))

Consequently, the total energy consumption of the heterogeneous network is:

$$ E = \sum\limits_{k = 1}^{K} {{\lambda_{k}E_{k}}}. $$

((12))
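Equations 10 to 12 translate directly into a few lines of code. The sketch below is illustrative only: the function names and the numeric values are assumptions, not the parameters of Table 1.

```python
def macrocell_energy(e_c1: float, e_p1: float) -> float:
    """Equation 10: a macrocell is always active."""
    return e_c1 + e_p1


def small_cell_energy(e_ck, e_sk, e_pk, p_ak, p_sk, t_s, slot):
    """Equation 11: constant power plus sensing while asleep plus
    processing while active, each weighted by its share of the slot."""
    sensing = (1.0 - p_ak) * p_sk * e_sk * t_s / slot
    processing = p_ak * e_pk * (slot - t_s) / slot
    return e_ck + sensing + processing


def network_energy(densities, tier_energies):
    """Equation 12: density-weighted sum over all tiers (energy per unit area)."""
    return sum(lam * e for lam, e in zip(densities, tier_energies))


# Hypothetical two-tier example.
e1 = macrocell_energy(e_c1=60.0, e_p1=40.0)
e2 = small_cell_energy(e_ck=2.0, e_sk=1.0, e_pk=4.0, p_ak=0.5, p_sk=0.4, t_s=0.1, slot=1.0)
print(network_energy([1e-5, 1e-4], [e1, e2]))
```

Lowering p_sk reduces the sensing term directly and, through Equation 2, also lowers p_ak and hence the processing term, which is the lever the optimization in Section 3 acts on.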

3 Self-optimization of user activity sensing based on fuzzy Q-learning

While an active/sleep scheduling scheme improves energy efficiency, the introduction of sleep mode for the BSs may lead to outage or lower capacity affecting quality of service. To guarantee at least basic network performance, while improving energy efficiency, we formulate the optimization problem of our active/sleep scheduling scheme as follows:

$$ \textbf{P}:\quad\mathop {\min } \limits_{{p_{s}}} \ \ E $$

((13))

$$ \quad\ \mathrm{s.t.}\quad {P_{\mathrm{c}}} \ge {\varepsilon_{\mathrm{p}}} $$

((14))

$$ \quad\quad \quad \quad {C} \ge {\varepsilon_{\mathrm{c}}}. $$

((15))

where ε_p and ε_c are, respectively, the threshold coverage probability and average capacity offered to a user. P_c and C are as defined in Equations 6 and 7, respectively.

To solve the problem P, we propose an SPO approach based on fuzzy Q-learning [21-23], which optimizes the key sensing probabilities of the proposed active/sleep scheduling scheme by interacting with the uncertain environment and learning from past experience. Our approach tunes the sensing probability of each of the K−1 small cell tiers in a self-optimized manner according to the active user density λ_u. Assuming that the active user density does not fluctuate fast, we avoid real-time tuning and tune the sensing probabilities periodically. Therefore, our approach adopts centralized operation: the new values of the sensing probabilities are computed by a centralized management entity and transmitted periodically to the BSs at each tier.

The tuning of the sensing probabilities is represented by an action vector a=(p_s2,p_s3,⋯,p_sK). To manage the continuous state space of λ_u and the continuous action vector space, a fuzzy inference system is used. First, the current state λ_u is fuzzified into a fuzzy set s. The degree of truth α_i(λ_u) with which the current state belongs to fuzzy set s_i is determined by membership functions. For example, as shown in Figure 2, triangular membership functions are used to determine which sets the state λ_u belongs to and the degree of truth obtained for each set.
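A minimal sketch of the fuzzification step follows. It assumes one common layout of triangular membership functions, in which each set peaks at a center and adjacent triangles overlap so that the degrees of truth sum to one; the actual shapes in Figure 2 may differ, and the centers below are hypothetical.

```python
def triangular_memberships(x, centers):
    """Return the degree of truth of x for each fuzzy set.

    `centers` are the peak positions of the triangular sets, sorted in
    ascending order. Values outside the range saturate at the outermost set.
    """
    n_sets = len(centers)
    if x <= centers[0]:
        return [1.0] + [0.0] * (n_sets - 1)
    if x >= centers[-1]:
        return [0.0] * (n_sets - 1) + [1.0]
    degrees = [0.0] * n_sets
    for i in range(n_sets - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            degrees[i + 1] = (x - lo) / (hi - lo)  # rising edge of the next set
            degrees[i] = 1.0 - degrees[i + 1]      # falling edge of the current set
            break
    return degrees


# A state halfway between the second and third set centers.
print(triangular_memberships(1.5, [0.0, 1.0, 2.0, 3.0]))
```

With this layout at most two rules are activated for any state, which keeps the inference and the later quality update cheap.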

Then, fuzzy inference rules are used to determine the tuning action a. The ith fuzzy inference rule for fuzzy set s_i can be described as:

$$ \begin{array}{l} \mathrm{IF\ current\ state\ is}\ s_{i}\\ \mathrm{THEN\ the\ action\ is}\ a_{i1} \ \text{with}\ q_{i1},\\ \quad \quad \quad \quad \quad \quad \quad \cdots \cdots \\ \ \quad \mathrm{or\quad \ the\ action\ is}\ a_{ij} \ \text{with}\ q_{ij},\\ \quad \quad \quad \quad \quad \quad \quad \cdots \cdots \\ \ \quad \mathrm{or\quad \ the\ action\ is}\ a_{iJ} \ \text{with}\ q_{iJ}. \end{array} $$

Here, a_ij is the discrete sensing probability tuning action vector of the jth inference result of the ith rule, and q_ij is its elementary quality: the higher the value of q_ij, the higher the trust in the corresponding sensing probability configuration.

The action of the ith rule is selected by an exploration/exploitation policy using the ε-greedy method as follows:

$$ c\left(i \right) = \left\{ {\begin{array}{*{20}{l}} {\mathop {{\text{random}}}\limits_{j = 1,2, \ldots,J} (j),\quad \quad\;\;\;{\text{with}}\;{\text{prob}}.\;\varepsilon }\\ {\arg \mathop {\max }\limits_{j = 1,2, \ldots,J} {q_{ij}},\quad \;{\text{with}}\;{\text{prob}}.\;1 - \varepsilon } \end{array}} \right.. $$

((16))

The inferred tuning action vector of sensing probabilities for state s is given as:

$$ a\left({{s}} \right) = \sum\limits_{i = 1}^{I} {{a_{ic\left(i \right)}}{\alpha_{i}}\left({{s}} \right)}. $$

((17))
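Equations 16 and 17 can be sketched as below. The data layout is an assumption of this sketch: `q_table[i][j]` holds q_ij, `actions[i][j]` holds the candidate vector a_ij as a tuple with one sensing probability per small cell tier, and `degrees[i]` holds α_i(s).

```python
import random


def select_actions(q_table, epsilon, rng=random):
    """Equation 16: epsilon-greedy choice c(i) of one action index per rule."""
    choices = []
    for q_row in q_table:
        if rng.random() < epsilon:
            choices.append(rng.randrange(len(q_row)))  # explore
        else:
            choices.append(max(range(len(q_row)), key=q_row.__getitem__))  # exploit
    return choices


def inferred_action(actions, choices, degrees):
    """Equation 17: blend the selected action vectors by their degrees of truth."""
    dim = len(actions[0][0])
    blended = [0.0] * dim
    for i, (c, alpha_i) in enumerate(zip(choices, degrees)):
        for d in range(dim):
            blended[d] += alpha_i * actions[i][c][d]
    return blended


# Hypothetical example: two rules, two candidate actions each, two small cell tiers.
q_table = [[0.1, 0.9], [0.5, 0.2]]
actions = [[(0.2, 0.2), (0.8, 0.6)], [(0.4, 0.4), (1.0, 1.0)]]
choices = select_actions(q_table, epsilon=0.0)
print(inferred_action(actions, choices, degrees=[0.25, 0.75]))
```

Because the degrees of truth are nonnegative and, under the usual triangular layout, sum to one, the blended vector stays within the range spanned by the candidate sensing probabilities.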

In addition, during the trial-and-error process of action policy exploration, to avoid bad actions that result in negative performance, a must be checked according to the constraints of coverage probability and capacity (see Equations 14 and 15). Although the coverage probability and the ergodic capacity are not given in a closed-form expression, the integrals are fairly easy to compute. If the coverage probability and capacity derived from the output sensing probabilities do not meet the constraints, the action for current state should be reselected according to Equation 16 excluding the faulty actions.

After applying the tuning action, the corresponding feedback reward is obtained from the environment. We define the reward value as the inverse of average energy consumption during a tuning action period, where the corresponding punishments are coverage outage and capacity shortage:

$$ r = \left\{ \begin{array}{l} 1/E,\quad \text{if}\;{P_{\mathrm{c}}} \ge {\varepsilon_{\mathrm{p}}}\;\text{and}\;C \ge {\varepsilon_{\mathrm{c}}}\\ - 1,\quad \quad \quad \quad \text{otherwise} \end{array} \right.. $$

((18))

With the feedback reward, the quality function is updated to maximize the expected reward. The quality function Q_π(s,a) is defined as the expected sum of discounted rewards from the initial state s₀ under the optimal action policy π as follows:

$$ {Q_{\pi} }(s,a) = {\mathbb{E}_{\pi} }\left[\sum\limits_{t = 0}^{\infty} {{\theta^{t}}r({s_{t}},{a_{t}})} \left| {{s_{0}} = s,{a_{0}} = a} \right.\right]. $$

((19))

Here, s_t and a_t denote the state and the action of the fuzzy inference rule at step t, and θ is the discount factor.

The Q-learning algorithm updates the quality function iteratively:

$$ {Q_{t + 1}}({s_{t}},{a_{t}}) = {Q_{t}}({s_{t}},{a_{t}}) + \Delta {Q_{t}}, $$

((20))

where ΔQ_t depends on the reward value, the quality function, and the value function. The quality function of the activated rules is calculated as:

$$ Q_{t}\left({{s_{t}},a_{t}\left({{s_{t}}} \right)} \right) = \sum\limits_{i } {{q_{ic\left(i \right)}}{\alpha_{i}}\left({{s_{t}}} \right)}. $$

((21))

And the value function of the new state after performing the applied action is calculated as:

$$ {V_{t}}({s_{t + 1}}) = \sum\limits_{i} {\mathop {\max }\limits_{j} q_{ij} \alpha_{i}\left({{s_{t+1}}} \right)}. $$

((22))

Using these three parameters obtained from Equations 18, 21, and 22, ΔQ_t is calculated as:

$$ \Delta {Q_{t}} = \xi ({r} + \theta {V_{t}}({s_{t + 1}}) - {Q_{t}}({s_{t}},{a_{t}})). $$

((23))

ξ is the learning rate for Q-learning.

Finally, the elementary quality $q_{ij}$, which determines the fuzzy inference rules, should be updated by:

$$ \Delta q_{ij} = \left\{ \begin{array}{l} \Delta Q \alpha_{i}\left({{s_{t+1}}} \right),\quad \text{if}\ j = c(i)\\ 0,\quad \quad \quad \quad \quad\quad\ \text{otherwise} \end{array} \right.. $$

((24))
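One learning step, combining the reward of Equation 18 with the update of Equations 21 to 24, might look like the sketch below. The function names and data layout are assumptions; Equation 24 is implemented as printed, weighting the increment by the degrees of truth of the new state.

```python
def reward_value(energy, p_c, capacity, eps_p, eps_c):
    """Equation 18: inverse energy if both QoS constraints hold, else a penalty."""
    return 1.0 / energy if p_c >= eps_p and capacity >= eps_c else -1.0


def fuzzy_q_update(q_table, choices, deg_now, deg_next, reward, xi, theta):
    """Apply one fuzzy Q-learning update in place and return Delta Q_t.

    q_table[i][j] holds q_ij, choices[i] holds c(i), deg_now and deg_next
    hold the degrees of truth of s_t and s_{t+1}.
    """
    # Equation 21: quality of the action inferred in state s_t.
    q_now = sum(q_table[i][c] * a for i, (c, a) in enumerate(zip(choices, deg_now)))
    # Equation 22: value of the new state s_{t+1}.
    v_next = sum(max(row) * a for row, a in zip(q_table, deg_next))
    # Equation 23: temporal-difference increment.
    delta_q = xi * (reward + theta * v_next - q_now)
    # Equation 24: only the selected action of each rule is updated.
    for i, c in enumerate(choices):
        q_table[i][c] += delta_q * deg_next[i]
    return delta_q


# Hypothetical single step starting from an all-zero quality table.
q_table = [[0.0, 0.0], [0.0, 0.0]]
r = reward_value(energy=1.0, p_c=0.9, capacity=2.0, eps_p=0.8, eps_c=1.0)
print(fuzzy_q_update(q_table, [0, 1], [1.0, 0.0], [0.5, 0.5], r, xi=0.1, theta=0.9), q_table)
```

In a full loop this step would follow each tuning period: fuzzify the new λ_u, compute the reward from the measured or analytically evaluated E, P_c, and C, update the table, and then select the next actions.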

4 Simulation results

In this section, we evaluate first the performance of the active/sleep scheduling scheme and then the benefits of reinforcement learning-based self-optimization approach. The simulation parameters are listed in Table 1 and are selected based on [2,5,9,24].

Table 1 Simulation parameters


4.1 Performance of user activity sensing-based active/sleep scheduling scheme

We first evaluate the coverage probability of a typical user in the network when active/sleep scheduling is used. Figure 3 shows that with decreasing sensing probabilities, the coverage probability also decreases. This is expected as the BSs will be more likely to be in the sleep state, and hence, they will be less likely to cover active users. In addition, if there are more users in the network (i.e., the active user density is higher), it will be easier for BSs to detect the active users, and consequently more BSs will be active. Therefore, the coverage probability improves as the density of active users increases.

Figures 4 and 5 show the capacity of a typical user and the network capacity per unit area, which is calculated by summing all user capacities. As the sensing probabilities increase, more BSs will be active. On the downside, the network capacity is affected due to higher interference. Also, more users will be offloaded to small cells, where the users are not able to transmit during the sensing time of the small cells. On the positive side, the spectrum utilization will increase and more active BSs will lead to fewer users per cell, and hence, higher resource allocation per user. So, to guarantee the user capacity, the sensing probabilities should not be configured too low to ensure that there are enough BSs in the network to detect user activity and go into active mode to provide enough capacity for users. In addition, if there are more users in the network, the network capacity will improve as more users trigger the activation of more BSs. However, if there are more users in the network competing for the network capacity, the capacity of a typical user which is almost inversely proportional to the number of users in the network will obviously reduce.

Finally, Figure 6 shows the energy consumption performance. As expected, with increasing sensing probabilities, energy consumption also increases. Energy consumption will also increase with more users, which will make more BSs active under the same sensing probabilities. Consequently, to minimize the energy consumption, we should configure the sensing probabilities as low as possible to make more BSs sleep while maintaining coverage and capacity guarantees. In the next section, we evaluate how the self-optimization approach tunes the sensing probabilities respecting the trade-off for energy efficiency and quality of service.

4.2 Performance of self-optimization approach

Based on the analysis in the previous section, it is necessary to configure the sensing probabilities to adapt to the fluctuations in active user density and to maintain both energy efficiency and quality of service. An example of how the active user density fluctuates within a day is shown in Figure 7. To understand the performance of our SPO approach under such fluctuations, we compare the following schemes:

Scheme 1: SPO. The sensing probabilities of K-tier heterogeneous networks are self-optimized periodically adapting to the user activity fluctuations using reinforcement learning.
Scheme 2: always sensing. All BSs in all small cell tiers always sense user activity during the sensing time. Hence, the sensing probability of BSs in every small cell tier is 1.
Scheme 3: always active. All BSs are always active, and they do not perform user activity sensing.
Scheme 4: only macrocell. All users are served by macrocells (i.e., there are no active small cells in the network).
Scheme 5: random sensing. Each small cell senses user activity with a certain probability (e.g., 0.3 in our evaluation).
Scheme 6: random sleep. Each small cell goes into the active/sleep mode with a certain probability (e.g., 0.3 in our evaluation) and does not do the user activity sensing.

Figure 7 User density fluctuations.

Figure 8 shows the comparison of the coverage probability of all the schemes. We see that the coverage probability cannot be guaranteed if all the small cells are turned off (only macrocell case). The coverage probabilities of schemes that use user activity sensing fluctuate with the active user density. This is expected as the user density affects the probability of BSs being in active or sleep state. The coverage probability of the random sensing scheme cannot be guaranteed when the active user density is low. This is because the sensing probabilities are not properly configured and there are not enough active BSs to guarantee the coverage. On the other hand, SPO is able to adapt to the user density fluctuations shown in Figure 7 and maintains the coverage probability around the threshold value (i.e., the target coverage performance) when the active user density is low. SPO fluctuates slightly around the threshold because the optimization is periodic rather than real-time. Nevertheless, SPO strikes the right balance by activating an appropriate number of BSs, managing to improve energy efficiency while guaranteeing target coverage.

Figure 9 shows the comparison in terms of user capacity. The user capacity of all schemes fluctuates with the active user density, mainly because of the fluctuations in the number of users per cell. Users will obtain high capacity when the user density is low under any of the schemes. The random sensing scheme cannot guarantee user capacity when the active user density is high because there are not enough active BSs to provide the necessary capacity. On the other hand, the SPO scheme puts as many BSs as possible to sleep and still guarantees the target user capacity. In addition, we can conclude from Figures 8 and 9 that SPO emulates the desired behavior by emphasizing the coverage probability when the active user density is low and the user capacity when the active user density is high.

Finally, we compare all schemes in terms of energy consumption in Figure 10. Compared to the always sensing, always active, and random sleep schemes, the energy consumption of SPO is reduced by 14.37%, 83.78%, and 22.33%, respectively. The energy consumption of SPO is similar to the energy consumption of random sensing, but SPO guarantees better QoS. On the other hand, the random sensing and random sleep schemes cannot guarantee QoS, and their energy consumption may increase further if the probabilities of sensing and being active are not properly configured. In addition, the only-macrocell scheme is the worst scheme because, although its energy consumption is low, its spectrum utilization is also significantly low, and therefore, its coverage and capacity performance is much worse than that of the other schemes. In summary, SPO provides an efficient way to decide the active/sleep states of BSs with minimized energy consumption and guaranteed QoS of users as it tracks user activity and makes use of self-organization. In this way, the heterogeneous networks operate more flexibly and do not turn on BSs blindly, especially when there is no traffic demand, which consequently improves the energy efficiency.

5 Conclusions

This paper proposed an active/sleep scheduling scheme for K-tier heterogeneous networks, which senses and adapts to user activity. Coverage probability, network capacity, and energy consumption of the proposed active/sleep scheduling scheme were analyzed using stochastic geometry, accounting for cell association uncertainties due to random positioning of users and BSs, the propagation channel, and network interference. A reinforcement learning-based SPO approach was proposed to optimize the user activity sensing probability of each small cell tier, considering user activity fluctuations and user QoS. Simulation results showed that SPO achieves low energy consumption with guaranteed network capacity and coverage probability. Possible future work includes the exploitation of more environmental awareness capabilities. It would also be of interest to extend the proposed scheme to the case where small cells use the frequency spectrum opportunistically, for higher spectrum utilization and energy efficiency.

6 Appendices

6.1 Appendix 1

6.1.1 Proof of Theorem 1

The received power of a typical user from the nearest BS in the $k$th tier is $P_{rk}=P_{k}R_{k}^{-\alpha}$, where $P_{k}$ is the transmit power of BSs in the $k$th tier, $\alpha$ is the path loss exponent, and $R_{k}$ is the distance to the nearest BS in the $k$th tier. Under the maximum received power-based cell association scheme, in which a user is associated with the BS that provides the highest received power, a typical user is associated with the $k$th tier when $P_{rk}>P_{ri}$ for all $i\in\{1,2,\cdots,K\}$, $i\neq k$. Therefore,

$$ \begin{aligned} {p_{k}} &= {\mathbb{E}_{{R_{k}}}}\left[ \mathbb{P}{\left[ {{P_{rk}}\left({{R_{k}}} \right) > \max \limits_{i \ne k} {P_{ri}}\left({{R_{i}}} \right)} \right]} \right]\\ \quad \; &= {\mathbb{E}_{{R_{k}}}}\left[ {\prod\limits_{i = 1,i \ne k}^{K} {\mathbb{P}\left[ {{P_{rk}}\left({{R_{k}}} \right) > {P_{ri}}\left({{R_{i}}} \right)} \right]} } \right]\\ \quad \; &= {\mathbb{E}_{{R_{k}}}}\left[ {\prod\limits_{i = 1,i \ne k}^{K} {\mathbb{P}\left[ {{R_{i}} > {{\left({{P_{i}}/{P_{k}}} \right)}^{1/\alpha }}{R_{k}}} \right]} } \right]\\ \quad \; &= \int_{0}^{\infty} {\prod\limits_{i = 1,i \ne k}^{K} {\mathbb{P}\left[ {{R_{i}} > {{\left({{P_{i}}/{P_{k}}} \right)}^{1/\alpha }}{r}} \right]} \;{f_{{R_{k}}}}\left(r \right)dr}, \end{aligned} $$

(25)

where

$$ \begin{aligned} &\mathbb{P}\left[ {{R_{i}} > {{\left({{P_{i}}/{P_{k}}} \right)}^{1/\alpha }}r} \right]\\ &= \mathbb{P}\left[ {\textup{No\ BS\ closer\ than\ }{{\left({{P_{i}}/{P_{k}}} \right)}^{1/\alpha }}r\ \textup{in\ the}\ i\textup{th\ tier}} \right]\\ &= {e^{- p_{ai}\left({{p_{s}}} \right){\lambda_{i}}\pi {{\left({{P_{i}}/{P_{k}}} \right)}^{2/\alpha }}{r^{2}}}}, \end{aligned} $$

(26)

and the probability density function (PDF) of $R_{k}$ is

$$ f_{R_{k}}(r) = \frac{d\left(1-\mathbb{P}\left[R_{k} > r\right]\right)}{dr} = 2p_{ak}(p_{s})\lambda_{k}\pi r e^{-p_{ak}(p_{s})\lambda_{k}\pi r^{2}}. $$

(27)

Plugging (26) and (27) into (25), we obtain

$$ \begin{aligned} p_{k} &= 2\pi p_{ak}(p_{s})\lambda_{k}\int_{0}^{\infty} r\exp\left\{ -\pi \sum\limits_{i=1}^{K} p_{ai}(p_{s})\lambda_{i}\left(P_{i}/P_{k}\right)^{2/\alpha}r^{2} \right\} dr\\ &= \frac{\lambda_{k}p_{ak}(p_{s})}{\sum\limits_{i=1}^{K} p_{ai}(p_{s})\lambda_{i}\left(P_{i}/P_{k}\right)^{2/\alpha}}. \end{aligned} $$

(28)
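
The closed form in (28) can be sanity-checked numerically. The following is a minimal sketch, not the authors' simulator: it draws the nearest-BS distance of each tier from the distribution in (26), applies maximum received power association, and compares the empirical tier shares with (28). The three-tier densities, powers, and path loss exponent are illustrative assumptions, and each entry of `DENSITIES` stands for the product $p_{ai}(p_{s})\lambda_{i}$.

```python
import math
import random


def association_probabilities(active_densities, powers, alpha):
    """Closed form (28); active_densities[i] stands for p_ai(p_s) * lambda_i."""
    probs = []
    for lam_k, pow_k in zip(active_densities, powers):
        denom = sum(
            lam_i * (pow_i / pow_k) ** (2.0 / alpha)
            for lam_i, pow_i in zip(active_densities, powers)
        )
        probs.append(lam_k / denom)
    return probs


def simulate_association(active_densities, powers, alpha, trials=100_000, seed=1):
    """Monte Carlo estimate of the tier association probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(powers)
    for _ in range(trials):
        best_tier, best_metric = 0, math.inf
        for k, (lam, power) in enumerate(zip(active_densities, powers)):
            # Nearest-BS distance in a PPP: P[R > r] = exp(-lam * pi * r^2), cf. (26).
            u = 1.0 - rng.random()  # uniform on (0, 1]
            r = math.sqrt(-math.log(u) / (lam * math.pi))
            # Maximising P * r^(-alpha) is equivalent to minimising r^alpha / P.
            metric = r ** alpha / power
            if metric < best_metric:
                best_tier, best_metric = k, metric
        counts[best_tier] += 1
    return [c / trials for c in counts]


# Illustrative (assumed) three-tier setup, not the paper's simulation parameters.
DENSITIES = [1.0, 5.0, 20.0]  # active BSs per unit area, i.e. p_ai * lambda_i
POWERS = [40.0, 1.0, 0.1]     # transmit powers P_i
ALPHA = 4.0                   # path loss exponent

analytic = association_probabilities(DENSITIES, POWERS, ALPHA)
simulated = simulate_association(DENSITIES, POWERS, ALPHA)
for tier, (a, s) in enumerate(zip(analytic, simulated), start=1):
    print(f"tier {tier}: closed form {a:.4f}, simulated {s:.4f}")
```

Because the tiers are independent PPPs, sampling only the nearest BS of each tier is sufficient for the association decision; no full point pattern needs to be generated.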

6.2 Appendix 2

6.2.1 Proof of Theorem 2

During the sensing time $t_{s}$, only macrocell users can transmit data, and they experience no interference from the small cell tiers. The SINR of a macrocell user during $t_{s}$ is therefore higher than its SINR during $T-t_{s}$, so the coverage probability of a typical user is determined by the SINR during $T-t_{s}$. The SINR of a typical user at a distance $r$ from its serving BS in the $k$th tier during $T-t_{s}$ is defined as

$$ {\text{SINR}_{k}}(r) = \frac{{{P_{k}}{h_{k}}{r^{- \alpha }}}}{{{\sum\nolimits}_{i = 1}^{K} {{\sum\nolimits}_{j \in {\Phi_{i}}} {{P_{i}}{h_{ij}}r_{ij}^{- \alpha }} } + {\sigma^{2}}}}, $$

(29)

where $h_{k}$ and $h_{ij}$ are the channel power gains due to small-scale fading from the serving BS and from the $j$th BS in the $i$th tier, respectively, with $h_{k}\sim\exp(1)$ and $h_{ij}\sim\exp(1)$, and $r_{ij}$ is the distance to the $j$th BS in the $i$th tier, excluding the serving BS. For a target SINR $\tau$, the coverage probability of a typical user is

$$ \begin{aligned} {P_{\mathrm{c}}} &= \sum\limits_{k = 1}^{K}{p_{k}\mathbb{E}_{r}}\left[ {\mathbb{P}\left[ {{\text{SINR}_{k}}\left(r \right) > \tau } \right]} \right]\\ \quad \ \ &= \sum\limits_{k = 1}^{K}p_{k}\int_{0}^{\infty} {\mathbb{P}\left[ {\text{SINR}{_{k}}\left(r \right) > \tau } \right]{f_{{{sR}_{k}}}}\left(r \right)dr}. \end{aligned} $$

(30)

The PDF of the distance from a user served in the kth tier to the serving BS is

$$ \begin{aligned} f_{sR_{k}}(r) &= \frac{d\left(1 - \mathbb{P}\left[R_{k} > r,\; P_{rk}\left(R_{k}\right) > \max\limits_{i \ne k} P_{ri}\left(R_{i}\right)\right]\right)}{p_{k}\,dr}\\ &= \frac{d\left(1 - \int_{r}^{\infty} \prod\limits_{i=1,i\ne k}^{K} \mathbb{P}\left[R_{i} > \left(P_{i}/P_{k}\right)^{1/\alpha}x\right] f_{R_{k}}(x)\,dx\right)}{p_{k}\,dr}\\ &\overset{(a)}{=} \frac{d\left(1 - 2\pi p_{ak}(p_{s})\lambda_{k}\int_{r}^{\infty} x\exp\left\{-\pi\sum\limits_{i=1}^{K} p_{ai}(p_{s})\lambda_{i}\left(P_{i}/P_{k}\right)^{2/\alpha}x^{2}\right\}dx\right)}{p_{k}\,dr}\\ &= \frac{2\pi p_{ak}(p_{s})\lambda_{k}r}{p_{k}}\exp\left\{-\pi\sum\limits_{i=1}^{K} p_{ai}(p_{s})\lambda_{i}\left(P_{i}/P_{k}\right)^{2/\alpha}r^{2}\right\}, \end{aligned} $$

(31)

where (a) follows from (26) and (27). The user SINR in (29) can be rewritten as $\text{SINR}_{k}(r) = \frac{h_{k}}{P_{k}^{-1}r^{\alpha}Q}$, where $Q = \sum\nolimits_{i=1}^{K} I_{i} + \sigma^{2}$ and $I_{i} = \sum\nolimits_{j\in\Phi_{i}} P_{i}h_{ij}r_{ij}^{-\alpha}$ is the aggregate interference from the $i$th tier. Therefore,

$$ \begin{aligned} &\mathbb{P}\left[ {\text{SINR}{_{k}}\left(r \right) > \tau } \right] \\ &= \mathbb{P}\left[ {{h_{k}} > {r^{\alpha} }P_{k}^{- 1}\tau Q} \right]\\ & = \int_{0}^{\infty} {\exp \left\{ - {r^{\alpha} }P_{k}^{- 1}\tau q\right\} {f_{Q}}\left(q \right)dq} \\ & = {\mathbb{E}_{Q}}\left[ {\exp \left\{ - {r^{\alpha} }P_{k}^{- 1}\tau q\right\} } \right]\\ & = \exp \left\{ - \frac{{\tau {\sigma^{2}}}}{{{r^{- \alpha }}{P_{k}}}}\right\} \prod\limits_{i = 1}^{K} {{\mathcal{L}_{{I_{i}}}}\left({{r^{\alpha} }P_{k}^{- 1}\tau } \right)}, \end{aligned} $$

(32)

where the Laplace transform of I _i is

$$ \begin{aligned} &\mathcal{L}_{I_{i}}\left(r^{\alpha}P_{k}^{-1}\tau\right)\\ &= \mathbb{E}_{I_{i}}\left[\exp\left\{-r^{\alpha}P_{k}^{-1}\tau I_{i}\right\}\right]\\ &= \mathbb{E}_{\Phi_{i}}\left[\exp\left\{-r^{\alpha}P_{i}P_{k}^{-1}\tau\sum\limits_{j\in\Phi_{i}} h_{ij}r_{ij}^{-\alpha}\right\}\right]\\ &= \exp\left\{-2\pi p_{ai}(p_{s})\lambda_{i}\int_{z_{i}}^{\infty}\left(1 - \mathcal{L}_{h_{ij}}\left(r^{\alpha}P_{i}P_{k}^{-1}\tau x^{-\alpha}\right)\right)x\,dx\right\}\\ &= \exp\left\{-2\pi p_{ai}(p_{s})\lambda_{i}\int_{z_{i}}^{\infty}\frac{1}{1+\left(r^{\alpha}P_{i}P_{k}^{-1}\tau\right)^{-1}x^{\alpha}}\,x\,dx\right\}\\ &= \exp\left\{-\pi p_{ai}(p_{s})\lambda_{i}\left(P_{i}/P_{k}\right)^{2/\alpha}\rho(\tau,\alpha)r^{2}\right\}, \end{aligned} $$

(33)

where $z_{i} = \left(P_{i}/P_{k}\right)^{1/\alpha}r$ is the minimum possible distance to an interfering BS in the $i$th tier, and $\rho(\tau,\alpha) = \tau^{2/\alpha}\int_{\tau^{-2/\alpha}}^{\infty}\frac{1}{1+x^{\alpha/2}}dx$. Plugging (31), (32), and (33) into (30), we obtain

$$ \begin{aligned} &{P_{\mathrm{c}}} = \sum\limits_{k = 1}^{K} {2\pi p_{ak}(\,{p_{s}}){\lambda_{k}}} \int_{0}^{\infty} r\exp\left\{\vphantom{\pi \sum\limits_{i = 1}^{K} {{r^{2}}p_{ai}(\,{p_{s}}){\lambda_{i}}{{\left({{P_{i}}/{P_{k}}} \right)}^{2/\alpha }}\left({1 + \rho \left({\tau,\alpha } \right)} \right)}} - \tau {r^{\alpha} }{\sigma^{2}}/{P_{k}}\right. \\ &\quad -\left. \pi \sum\limits_{i = 1}^{K} {{r^{2}}p_{ai}(\,{p_{s}}){\lambda_{i}}{{\left({{P_{i}}/{P_{k}}} \right)}^{2/\alpha }}\left({1 + \rho \left({\tau,\alpha } \right)} \right)} \right\} dr\\ & = \sum\limits_{k = 1}^{K} {2\pi p_{ak}(\,{p_{s}}){\lambda_{k}}P_{k}^{2/\alpha }} \int_{0}^{\infty} r\exp \left\{\vphantom{\pi \sum\limits_{i = 1}^{K} {{r^{2}}p_{ai}({p_{s}}){\lambda_{i}}P_{i}^{2/\alpha }\left({1 + \rho \left({\tau,\alpha } \right)} \right)}} - \tau {r^{\alpha} }{\sigma^{2}}\right. \\ &\quad -\left. \pi \sum\limits_{i = 1}^{K} {{r^{2}}p_{ai}({p_{s}}){\lambda_{i}}P_{i}^{2/\alpha }\left({1 + \rho \left({\tau,\alpha } \right)} \right)} \right\} dr. \end{aligned} $$

(34)
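
As an illustration, (34) can be evaluated with elementary numerical integration. The sketch below is one possible way to do this, under stated assumptions: it uses the substitution $v = \pi A_{k}r^{2}$, with $A_{k} = \sum_{i} p_{ai}(p_{s})\lambda_{i}(P_{i}/P_{k})^{2/\alpha}(1+\rho(\tau,\alpha))$, a plain midpoint rule, and illustrative parameter values that are not taken from the paper. Two special cases serve as checks: for $\alpha = 4$ the integral in $\rho$ reduces to $\rho(\tau,4) = \sqrt{\tau}\arctan\sqrt{\tau}$, and for $\sigma^{2} = 0$ the expression in (34) collapses to $1/(1+\rho(\tau,\alpha))$ by (28).

```python
import math


def rho(tau, alpha, steps=20_000):
    """rho(tau, alpha), using x = a / u to map [a, inf) onto (0, 1] and a midpoint rule."""
    a = tau ** (-2.0 / alpha)
    h = 1.0 / steps
    total = 0.0
    for n in range(steps):
        u = (n + 0.5) * h
        x = a / u
        # Integrand 1 / (1 + x^(alpha/2)) times the Jacobian a / u^2.
        total += a / (u * u * (1.0 + x ** (alpha / 2.0)))
    return tau ** (2.0 / alpha) * total * h


def coverage_probability(active_densities, powers, alpha, tau, noise,
                         steps=5_000, v_max=50.0):
    """First form of (34); active_densities[i] stands for p_ai(p_s) * lambda_i."""
    load = 1.0 + rho(tau, alpha)
    h = v_max / steps
    total = 0.0
    for lam_k, pow_k in zip(active_densities, powers):
        a_k = load * sum(
            lam_i * (pow_i / pow_k) ** (2.0 / alpha)
            for lam_i, pow_i in zip(active_densities, powers)
        )
        # Noise term after substituting v = pi * a_k * r^2.
        c_k = tau * noise / (pow_k * (math.pi * a_k) ** (alpha / 2.0))
        integral = 0.0
        for n in range(steps):
            v = (n + 0.5) * h
            integral += math.exp(-v - c_k * v ** (alpha / 2.0))
        # 2*pi*lam_k * (1 / (2*pi*a_k)) * integral over v
        total += lam_k / a_k * integral * h
    return total


# Illustrative (assumed) parameters, not the paper's simulation settings.
DENSITIES = [1.0, 5.0, 20.0]
POWERS = [40.0, 1.0, 0.1]
ALPHA, TAU = 4.0, 1.0

cov_no_noise = coverage_probability(DENSITIES, POWERS, ALPHA, TAU, noise=0.0)
cov_noise = coverage_probability(DENSITIES, POWERS, ALPHA, TAU, noise=1.0)
print(f"rho = {rho(TAU, ALPHA):.6f}")
print(f"coverage without noise = {cov_no_noise:.6f}")
print(f"coverage with noise    = {cov_noise:.6f}")
```

The upper limit `v_max` truncates an integrand that is bounded by $e^{-v}$, so the neglected tail is below $e^{-50}$; a library quadrature routine could replace the midpoint rule if one is available.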

6.3 Appendix 3

6.3.1 Proof of Theorem 3

The rate of the typical user is

$$ {c_{k}}\left(r \right) = \left\{ \begin{array}{l} \ln \left({1 + \text{SINR}{_{k}}\left(r \right)} \right),\quad \text{if}\;\text{SINR}{_{k}}\left(r \right) > \tau \\ 0,\quad \quad \quad \quad \quad \quad \quad \quad\quad\quad \text{otherwise} \end{array} \right.. $$

(35)

The ergodic rate of the typical user associated with the $k$th tier during the time $T-t_{s}$ is

$$ \begin{aligned} {C_{k}} &= \frac{1}{N_{k}}{\mathbb{E}_{r}}\left[ {\mathbb{E}_{\text{SINR}{_{k}}}}\left[ {c_{k}}(r) \right] \right]\\ \quad \ &= \frac{1}{N_{k}}\int_{0}^{\infty} {{\mathbb{E}_{\text{SINR}{_{k}}}}\left[ {c_{k}} (r)\right]{f_{s{R_{k}}}}\left(r \right)dr}, \end{aligned} $$

(36)

where the average number of users per cell in the $k$th tier is $N_{k} = p_{k}\lambda_{u}/\left(p_{ak}(p_{s})\lambda_{k}\right)$; under round-robin scheduling, the throughput of a user is inversely proportional to the number of users in its cell. Since $\mathbb{E}\left[X\right] = \int_{0}^{\infty}\mathbb{P}\left[X > x\right]dx$ for a nonnegative random variable $X$, we obtain

$$ \begin{aligned} &\mathbb{E}_{\text{SINR}_{k}}\left[c_{k}(r)\right]\\ &= \int_{0}^{\infty}\mathbb{P}\left[c_{k}(r) > t\right]dt\\ &= \int_{0}^{\ln(1+\tau)}\mathbb{P}\left[\text{SINR}_{k}(r) > \tau\right]dt\\ &\quad + \int_{\ln(1+\tau)}^{\infty}\mathbb{P}\left[\ln\left(1+\text{SINR}_{k}(r)\right) > t\right]dt\\ &= \ln(1+\tau)\,\mathbb{P}\left[\text{SINR}_{k}(r) > \tau\right]\\ &\quad + \int_{\ln(1+\tau)}^{\infty}\mathbb{P}\left[h_{k} > r^{\alpha}P_{k}^{-1}Q\left(e^{t}-1\right)\right]dt\\ &= \ln(1+\tau)\,\mathbb{P}\left[\text{SINR}_{k}(r) > \tau\right]\\ &\quad + \int_{\ln(1+\tau)}^{\infty}\exp\left\{-\frac{\left(e^{t}-1\right)\sigma^{2}}{r^{-\alpha}P_{k}}\right\}\prod\limits_{i=1}^{K}\mathcal{L}_{I_{i}}\left(r^{\alpha}P_{k}^{-1}\left(e^{t}-1\right)\right)dt. \end{aligned} $$

(37)

Plugging (31), (32), (33), and (37) into (36), we obtain the ergodic throughput of a user associated with the $k$th tier during the time $T-t_{s}$ in (9). During the sensing time $t_{s}$, a user associated with the first tier experiences no interference from the other tiers. Therefore, the user SINR is $\text{SINR}_{0}(r) = \frac{h_{1}}{P_{1}^{-1}r^{\alpha}Q_{0}}$, where $Q_{0} = I_{1}+\sigma^{2}$. The ergodic throughput of the first tier user during the time $t_{s}$ is

$$ \begin{aligned} {C_{0}} &= \frac{1}{N_{1}}{\mathbb{E}_{r}}\left[ {{\mathbb{E}_{\text{SINR}{_{0}}}}\left[ {c_{0}}(r) \right]} \right]\\ \quad \ \ &= \frac{1}{N_{1}}\int_{0}^{\infty} {{\mathbb{E}_{\text{SINR}{_{0}}}}\left[ {c_{0}} (r)\right]{f_{s{R_{1}}}}\left(r \right)dr}. \end{aligned} $$

(38)

The ergodic rate at distance r is

$$ \begin{aligned} &\mathbb{E}_{\text{SINR}_{0}}\left[c_{0}(r)\right]\\ &= \ln(1+\tau)\,\mathbb{P}\left[\text{SINR}_{0}(r) > \tau\right]\\ &\quad + \int_{\ln(1+\tau)}^{\infty}\mathbb{P}\left[h_{1} > r^{\alpha}P_{1}^{-1}Q_{0}\left(e^{t}-1\right)\right]dt\\ &= \ln(1+\tau)\,\mathbb{P}\left[\text{SINR}_{0}(r) > \tau\right]\\ &\quad + \int_{\ln(1+\tau)}^{\infty}\exp\left\{-\frac{\left(e^{t}-1\right)\sigma^{2}}{r^{-\alpha}P_{1}}\right\}\mathcal{L}_{I_{1}}\left(r^{\alpha}P_{1}^{-1}\left(e^{t}-1\right)\right)dt, \end{aligned} $$

(39)

where

$$ \begin{aligned} &\mathbb{P}\left[ {\text{SINR}{_{0}}\left(r \right) > \tau } \right] \\ &= \mathbb{P}\left[ {{h_{1}} > {r^{\alpha} }P_{1}^{- 1}\tau Q_{0}} \right]\\ &= \exp \left\{ - \frac{{\tau {\sigma^{2}}}}{{{r^{- \alpha }}{P_{1}}}}\right\} {{\mathcal{L}_{{I_{1}}}}\left({{r^{\alpha} }P_{1}^{- 1}\tau } \right)}. \end{aligned} $$

(40)

Plugging (31), (33), (39), and (40) into (38), we obtain the ergodic throughput of a user associated with the first tier during the time $t_{s}$ in (8).
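
The tail-integral step behind (37) and (39) can be illustrated in a deliberately simplified setting. The sketch below assumes a noise-limited link with no interference, so that $\text{SINR} = h/g$ with $h\sim\exp(1)$ and a fixed constant $g = r^{\alpha}\sigma^{2}/P$, which gives $\mathbb{P}[\text{SINR} > s] = e^{-gs}$. It compares $\ln(1+\tau)\,\mathbb{P}[\text{SINR}>\tau] + \int_{\ln(1+\tau)}^{\infty}\mathbb{P}[\text{SINR} > e^{t}-1]\,dt$ with a direct Monte Carlo average of the thresholded rate in (35). The values of `G` and `TAU` are illustrative assumptions, not parameters from the paper.

```python
import math
import random


def rate_tail_integral(g, tau, steps=20_000):
    """Mean thresholded rate via the tail-integral identity, with P[SINR > s] = exp(-g * s)."""
    t_low = math.log(1.0 + tau)
    # Beyond t_high the integrand is below exp(-50) and is neglected.
    t_high = max(t_low, math.log(1.0 + 50.0 / g))
    h = (t_high - t_low) / steps
    tail = 0.0
    for n in range(steps):
        t = t_low + (n + 0.5) * h
        tail += math.exp(-g * (math.exp(t) - 1.0))
    return t_low * math.exp(-g * tau) + tail * h


def rate_monte_carlo(g, tau, samples=200_000, seed=7):
    """Direct average of the rate in (35) over exponential fading samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        sinr = rng.expovariate(1.0) / g
        if sinr > tau:
            total += math.log(1.0 + sinr)
    return total / samples


# Illustrative (assumed) values: g = r^alpha * sigma^2 / P and target SINR tau.
G, TAU = 0.5, 1.0

by_integral = rate_tail_integral(G, TAU)
by_sampling = rate_monte_carlo(G, TAU)
print(f"tail integral: {by_integral:.4f} nats/s/Hz")
print(f"Monte Carlo:   {by_sampling:.4f} nats/s/Hz")
```

With interference present, the exponential tail $e^{-gs}$ is replaced by the product of the noise term and the Laplace transforms in (33), which is the form that appears in the last lines of (37) and (39).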

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61471060), the National Major Science and Technology Special Project of China (No. 2013ZX03003016), and the Funds for Creative Research Groups of China (No. 61421061).

Author information

Authors and Affiliations

State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, No. 10 Xitucheng Road, Beijing, 100876, Haidian District, China
Shaoshuai Fan & Hui Tian
Department of Computing and Communication Technologies, Oxford Brookes University, Wheatley Campus, Wheatley, Oxford OX33 1HX, UK
Cigdem Sengul

Corresponding author

Correspondence to Hui Tian.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fan, S., Tian, H. & Sengul, C. Self-optimized heterogeneous networks for energy efficiency. J Wireless Com Network 2015, 21 (2015). https://doi.org/10.1186/s13638-015-0261-1

Received: 07 August 2014
Accepted: 19 January 2015
Published: 04 February 2015
DOI: https://doi.org/10.1186/s13638-015-0261-1
