Mobility management in HetNets: a learning-based perspective
 Meryem Simsek†^{1},
 Mehdi Bennis†^{2} and
 Ismail Guvenc^{3}
https://doi.org/10.1186/s1363801502442
© Simsek et al.; licensee Springer. 2015
 Received: 11 August 2014
 Accepted: 5 January 2015
 Published: 14 February 2015
Abstract
Heterogeneous networks (HetNets) are expected to be a key feature of long-term evolution (LTE)-advanced networks and beyond and are essential for providing ubiquitous broadband user throughput. However, due to different coverage ranges of base stations (BSs) in HetNets, the handover performance of a user equipment (UE) may be significantly degraded, especially in scenarios where high-velocity UE traverse small cells. In this article, we propose a context-aware mobility management (MM) procedure for small cell networks, which uses reinforcement learning techniques and inter-cell coordination to improve the handover and throughput performance of UE. In particular, the BSs jointly learn their long-term traffic loads and optimal cell range expansion and schedule their UE based on their velocities and historical data rates that are exchanged among the tiers. The proposed approach is shown not only to outperform the classical MM in terms of throughput but also to enable better fairness. Using the proposed learning-based MM approaches, the UE throughput is shown to improve by 80% on average, while the handover failure probability is shown to be reduced by up to a factor of three.
Keywords
 Cell range expansion
 HetNets
 Load balancing
 Mobility management
 Reinforcement learning
 Context-aware scheduling
1 Introduction
To cope with the wireless traffic demand within the next decade, operators are underlaying their macrocellular networks with low-power base stations (BSs) [1]. Such networks are typically referred to as heterogeneous networks (HetNets), and their deployment entails a number of challenges in terms of capacity, coverage, mobility management (MM), and mobility load balancing across multiple network tiers [2]. Mobility management, in particular, is essential to ensure continuous connectivity to mobile user equipment (UE) while maintaining satisfactory quality of service (QoS). Poor mobility management may therefore lead to handover failures (HOFs), radio link failures, as well as unnecessary handovers, typically referred to as ping-pong (PP) events. Such deficiencies result in low resource utilization efficiency and poor user experience. In order to solve such problems, mobility parameters in each cell need to be dynamically optimized according to cell traffic loads, coverage areas of different cells, and velocities of the UE.
With the advent of HetNets, recent studies have demonstrated that HOFs and PPs can be serious problems due to small cell sizes [2,3]. MM mechanisms, which were included in the first release of the long-term evolution (LTE) standard (Rel-8), were originally developed for networks that only involve macrocells [4]. The defined MM procedures for the macrocell-only scenarios have been widely discussed in the literature, e.g., in [5-19]. It has been shown that MM for macrocell-only scenarios yields highly reliable handover execution, where HOFs and PPs can typically be avoided due to large cell sizes [20]. However, the deployment of a large number of small cells (e.g., femtocells, picocells, etc.) increases the complexity of MM in HetNets, since mobile UE may trigger frequent handovers when they traverse the coverage area of a small cell. This leads to less reliable handover execution in HetNets.
While MM carries critical importance for HetNets to minimize HOFs and PPs, mobility load balancing is also crucial for balancing the load among different network tiers. In HetNets, the load among tiers is unbalanced due to significant differences in transmit power levels. UE tend to connect to the macrocell even when the path loss conditions between the small cell and the UE are better, because the handover decision is based on the highest reference signal received power (RSRP) measured at a UE [21]. As a remedy to this, the 3rd Generation Partnership Project (3GPP) standardized the concept of cell range expansion to virtually increase a small cell's coverage area by adding a bias value to its RSRP, which leads to traffic offloading from the macrocell. To enhance the overall system performance, not only cell-specific handover parameter optimization, such as the range expansion bias (REB) value adaptation, but also scheduling and resource allocation must be performed in a coordinated manner across different tiers. A survey of these and various other existing mobility management and load balancing approaches considering such aspects for small cells in LTE-advanced networks is provided in [22].
In this article, a joint MM and UE scheduling approach is proposed using tools from reinforcement learning. The proposed MM approach utilizes parameter adaptation both in the long term and the short term. Hereby, macro- and picocells learn how to optimize their long-term traffic load, whereas in the short term, the UE association process is carried out based on history- and velocity-based scheduling. We propose multi-armed bandit (MAB) and satisfaction-based MM learning techniques as a long-term load balancing approach aiming at improving the overall system throughput while at the same time reducing the HOF and PP probabilities. A context-aware scheduler is proposed as a short-term UE scheduling approach considering fairness.
The rest of the article is organized as follows. Section 2 provides a brief review of recent mobility management works in the literature and summarizes our contribution in this paper. Section 3 describes our system model. In Section 4, the problem formulation for MM is presented. In Section 5, the context-aware scheduler is described. In Section 6, we introduce our learning-based MM approaches. Section 7 presents system-level simulation results, and finally, Section 8 concludes the article.
2 Related work and contribution
In this section, we first summarize some of the recent studies on mobility management in LTE-advanced HetNets and use these to highlight the challenges and open problems that should be further addressed in the research community. Subsequently, a typical HetNet scenario with macro- and picocells deployed in the same frequency band is considered to describe our contribution in this paper.
2.1 Literature review
The handover process requires a smooth transfer of a UE's active connection when moving from one cell to another, while still maintaining the guaranteed QoS. The objective is to have mobility procedures resulting in a low probability of experiencing radio link failures, HOFs, and PP events. Mobility solutions meeting those objectives are often said to be robust. Enhancements to handover robustness in HetNet LTE networks have been the subject of recent interest. In LTE Rel-11, mobility enhancements in HetNets have been investigated through a dedicated study item [2]. In this study item and cited work items therein, mobility performance enhancement solutions for co-channel HetNets are analyzed taking into account mobility robustness improvement. Proposed solutions are related to optimizing the handover procedure by dynamically adapting handover parameters for different cell sizes and UE velocities.
Mobility management techniques for HetNets have been recently investigated in the literature, e.g., [23-28]. In [23], the authors evaluate the effect of different combinations of various MM parameter settings for HetNets. The conclusions are aligned with the HetNet mobility performance evaluations in 3GPP [2], i.e., HetNet mobility performance strongly depends on the cell size and the UE speed. The simulation results in [23] consider that all UE have the same velocity in each simulation setup. Further results on the effects of MM parameters are presented in [24], where the authors propose a fuzzy-logic-based controller. This controller adaptively modifies handover parameters for handover optimization by considering the system load and UE speed in a macrocell-only network.
In [25], the authors evaluate the mobility performance of HetNets considering almost blank subframes in the presence of cell range expansion and propose a mobility-based inter-cell interference coordination scheme. Hereby, picocells configure coordinated resources by muting on certain subframes, so that macrocells can schedule their high-velocity UE in these resources, which are free of co-channel interference from the picocells. However, the proposed approach only considers three broad classes of UE velocities: low, medium, and high. Moreover, no adaptation of the REB has been taken into account. In [26], the authors propose a hybrid solution for HetNet MM, where MM in a macrocell is network controlled while it is UE-autonomous in a small cell. In the scenario in [26], macrocells and small cells operate on different component carriers, which allows the MM to be maintained on the macrocell layer while enabling small cells to enhance the user plane capacity. In addition to these approaches, a fairness-based MM solution is discussed in [27]. Here, the cell selection problem in HetNets is formulated as a network-wide proportional fairness optimization problem by jointly considering the long-term channel conditions and the distribution of user load among different tiers. While the proposed method enhances the cell-edge UE performance, no results related to mobility performance are presented.
In [28], the authors propose a MAB-based inter-cell interference coordination approach that aims at maximizing the throughput and handover performance by subband selection for transmission for a small-cell-only network. The proposed approach deals with the trade-off of increasing the subband size for throughput and handover success rate maximization and decreasing the subband size as far as possible to minimize interference. The only parameter which is optimized is the number of subbands based on some signal-to-interference-plus-noise-ratio (SINR) thresholds. While the HOF rate is decreased by the proposed approach, the PP probability is not analyzed. To the best of our knowledge, there is no previous work related to learning-based HetNet MM in the literature, which jointly considers handover performance, load balancing, and fairness.
2.2 Contribution

In the proposed MM approaches, we focus on both short-term and long-term solutions. In the long term, a traffic load balancing procedure in a HetNet scenario is proposed, while in the short term, the UE association process is solved.

To implement the long-term load balancing method, we propose two learning-based MM approaches by using reinforcement learning techniques: a MAB-based and a satisfaction-based MM approach.

The short-term UE association process is based on a proposed context-aware scheduler considering a UE's throughput history and velocity to enable fair scheduling and enhanced cell association.
3 System model
where \(p_{k}\) is the transmit power of BS k, \(\sigma^{2}\) is the noise variance, and \(g_{i(k),k}(t_{n})\) is the channel gain from cell k to UE i(k) associated to BS k. The bandwidth \(B_{i(k)}\) in Equation 2 is the bandwidth which is allocated to UE i(k) by BS k at time \(t_{n}\).
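As an illustration of the quantities defined above, the following Python sketch computes an SINR and a rate. It assumes the usual downlink form (serving-signal power over interference plus noise) and a Shannon-type rate over the allocated bandwidth; the function names and all numerical values are hypothetical, and the exact expressions of the system model may differ.

```python
import math

def sinr(tx_power_w, gains, serving, noise_w):
    """SINR of a UE served by BS `serving`; every other BS is an interferer."""
    signal = tx_power_w[serving] * gains[serving]
    interference = sum(
        p * g for k, (p, g) in enumerate(zip(tx_power_w, gains)) if k != serving
    )
    return signal / (interference + noise_w)

def rate_bps(bandwidth_hz, sinr_linear):
    """Shannon-type rate over the bandwidth allocated to the UE."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)

# Hypothetical example: macro BS (index 0) and pico BS (index 1).
tx_power_w = [40.0, 1.0]   # roughly 46 dBm and 30 dBm
gains = [1e-10, 2e-9]      # assumed linear channel gains toward the UE
noise_w = 1e-13            # assumed noise power
gamma = sinr(tx_power_w, gains, serving=1, noise_w=noise_w)
print(gamma, rate_bps(180e3, gamma))
```

In this example the UE is served by the picocell, and the macrocell signal dominates the interference term, which is the typical situation for a UE inside an expanded picocell region.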
3.1 Handover procedure
4 Problem formulation for throughput maximization
The condition in (12) implies that the total transmitted power over all RBs does not exceed the maximum transmission power of BS k. Since our optimization approach in (6) aims at maximizing the total rate, the last condition (13) dictates that the instantaneous rate is larger than a minimum rate for BS k. Due to the distributed nature of the optimization problem in (6), we will investigate two reinforcement learning techniques in this paper, as will be discussed in Sections 6.1 and 6.2.
5 Short-term solution: a context-aware scheduler
where \(\text {sort}_{\min (v_{i(k)})}\) sorts the candidate UE according to their velocity starting with the slowest UE. After the sorting operation, if more than one UE can be selected for RB r, the UE with minimum velocity is selected. The rationale behind introducing a sorting/ranking function for candidate UE according to their velocity is that high-velocity UE will not be favored over slow-moving UE. This has two advantages: 1) high-velocity UE might pass through the picocell quickly and should therefore not be favored, in order to avoid PPs, and 2) the channel conditions of low-velocity UE change slowly, which may result, especially for slow-moving cell-edge UE, in poor rates if they are not allocated many RBs.
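The velocity-based selection can be sketched as follows. This is a minimal illustration, assuming that the candidate UE for an RB are those whose proportional fair metric (instantaneous rate divided by average rate) ties for the maximum; the function name `select_ue`, the tolerance, and the example values are hypothetical.

```python
def select_ue(inst_rate, avg_rate, velocity, tol=1e-9):
    """Pick the UE for one RB: highest PF metric, slowest UE among ties."""
    # Proportional fair metric; a small floor avoids division by zero.
    metric = {ue: inst_rate[ue] / max(avg_rate[ue], 1e-9) for ue in inst_rate}
    best = max(metric.values())
    # Candidate UE: all UE whose metric equals the maximum (within tolerance).
    candidates = [ue for ue, m in metric.items() if best - m <= tol * best]
    # sort_min(v): order the candidates by velocity, slowest first.
    return sorted(candidates, key=lambda ue: velocity[ue])[0]

# Hypothetical example: UE "a" and "b" tie on the PF metric; "b" is slower.
inst = {"a": 2.0, "b": 4.0, "c": 1.0}
avg = {"a": 1.0, "b": 2.0, "c": 1.0}
vel_kmh = {"a": 30.0, "b": 3.0, "c": 3.0}
chosen = select_ue(inst, avg, vel_kmh)
print(chosen)
```

Sorting only among tied candidates keeps the proportional fair ordering intact and uses velocity purely as a tie-breaker.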
In (15), the moving average rate is carried over when a UE is handed over from a macrocell to a picocell, whereas in the classical MM approach, a UE's rate history is not considered, i.e., its average rate is reset to zero. In other words, in the classical proportional fair scheduler, the average rate \(\overline {\phi }_{i}(t_{n})\) in (14) is \(\overline {\phi }_{i}(t_{n}) = \overline {\phi }_{i(k)}(t_{n}) = 0\) when a UE is handed over to cell k, whereas we redefine it according to (15), i.e., \(\overline {\phi }_{i}(t_{n}) = \overline {\phi }_{i(p)}(t_{n}+T_{s})\). The proposed MM approach thus considers the historical rate obtained when UE i(m) was associated to the macrocell m in the past. The incorporation of a UE's history enables the scheduler to perform fair resource allocation even in the presence of a sequence of handovers. Since handovers occur more frequently in HetNets due to small cell sizes, such a history-based scheduler leads to fair frequency resource allocation among the UE of a cell. More specifically, UE recently joining a cell will not be preferred over other UE of the cell, since their historical average rate will be taken into account.
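The effect of carrying over the rate history can be illustrated with a small numerical sketch. It assumes a proportional fair metric of the form instantaneous rate over average rate; the helper names, the floor value, and the rates are hypothetical and only serve to contrast the two behaviors.

```python
def pf_metric(inst_rate, avg_rate, floor=1e-6):
    """Proportional fair metric; the floor stands in for a zero average rate."""
    return inst_rate / max(avg_rate, floor)

def avg_rate_after_handover(history_rate, use_history):
    # Classical scheduler: the average rate restarts at zero after a handover.
    # History-based scheduler: the average rate from the previous cell is kept.
    return history_rate if use_history else 0.0

# A UE already resident in the cell, with average rate 1.0.
resident = pf_metric(inst_rate=2.0, avg_rate=1.0)
# A UE that just arrived with a historical average rate of 4.0.
new_classical = pf_metric(2.0, avg_rate_after_handover(4.0, use_history=False))
new_history = pf_metric(2.0, avg_rate_after_handover(4.0, use_history=True))
print(resident, new_classical, new_history)
```

With the history reset, the newly arrived UE obtains a very large metric and is scheduled ahead of the resident UE; with the history retained, its metric reflects the rate it already received in the previous cell.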
6 Long-term solution: learning-based mobility management techniques
To solve the optimization problem defined in Section 4, we rely on the self-organizing capabilities of HetNets and propose an autonomous solution for load balancing by using tools from reinforcement learning [29]. Hereby, each cell develops its own MM strategy to perform optimal load balancing based on the proposed learning approaches presented in Sections 6.1 and 6.2 and schedules its UE according to the context-aware scheduler presented in Section 5. An overview of the interrelation of the short-term and long-term MM approaches is provided in Algorithm 2 (see Appendix). To realize this, we consider the game \(\mathcal {G}=\{\mathcal {K},\{\mathcal {A}_{k}\}_{k\in \mathcal {K}},\{u_{k}\}_{k\in \mathcal {K}}\}\), in which each cell autonomously learns its optimal MM strategy. Hereby, the set \(\mathcal {K}=\{\mathcal {M}\cup \mathcal {P}\}\) represents the set of players (i.e., BSs), and for all \(k\in \mathcal {K}\), the set \(\mathcal {A}_{k} = \{\beta _{k}\}\) represents the set of actions player k can adopt. For all \(k\in \mathcal {K}\), \(u_{k}\) is the utility function of player k. The action definition implies that the BSs learn at each time instant \(t_{n}\) to optimize their traffic load in the long term using the cell range expansion process. Each BS learns its action selection strategy based on the MAB or satisfaction-based learning approaches as presented in Sections 6.1 and 6.2, respectively.
6.1 Multi-armed bandit-based learning approach
The first learning-based MM approach is based on the MAB procedure, which aims to maximize the overall system performance. MAB is a machine learning technique based on an analogy with the traditional slot machine (one-armed bandit) [30]. When pulled at time \(t_{n}\), each machine/player provides a reward. The objective is to maximize the collected reward through iterative pulls, i.e., learning iterations. The crucial trade-off the player faces at each trial is between exploitation of the action that has the highest expected utility and exploration of new actions to get more information about the expected utility performance of the other actions. The player selects its actions based on a decision function reflecting this exploration-exploitation trade-off.

Actions: \(\mathcal {A}_{k}=\{\beta _{k}\}\), where \(\beta_{k}\) is the CRE bias, with \(\beta_{m}=[0,3,6]\) dB for macrocells and \(\beta_{p}=[0,3,6,9,12,15,18]\) dB for picocells. We consider higher bias values for picocells due to their low transmit power. The considered bias values are selected considering the results in [31] and also based on our extensive simulation studies.

Strategy:
 1. Every BS k learns its optimum CRE bias value on a long-term basis considering its load as defined in (5). This is interrelated with the handover triggering by defining the cell border of each cell. The problem formulation defined in Section 4 optimizes this load in the long term, i.e., over time \(t_{n}\).
 2. A UE is handed over to BS k if it fulfills the condition (4).
 3. RB-based scheduling is performed based on the context-aware scheduler defined in Section 5.

Utility Function: The utility function is a decision function in MAB learning and is composed of an exploitation term, represented by player k's mean reward for selecting an action until time \(t_{n}\), and an exploration term that considers the number of times an action has been selected so far. Player k selects its action \(a_{j(k)}(t_{n})\in \mathcal {A}_{k}\) at time \(t_{n}\) through maximizing a decision function \(d_{k,a_{j(k)}}(t_{n})\), which is defined as
$$ d_{k,a_{j(k)}}(t_{n}) = u_{k, a_{j(k)}}(t_{n}) + \sqrt{\frac{2\log\left(\sum_{i=1}^{|\mathcal{A}_{k}|} n_{k,a_{i(k)}}(t_{n})\right)}{n_{k,a_{j(k)}}(t_{n})}}, $$ (16)
where \(u_{k, a_{j(k)}}(t_{n})\) is the mean reward of player k at time \(t_{n}\) for action \(a_{j(k)}\), \(n_{k,a_{j(k)}}(t_{n})\) is the number of times action \(a_{j(k)}\) has been selected by player k until time \(t_{n}\), and \(|\cdot|\) represents the cardinality operation.
The decision function in the form of (16) was proposed by Auer et al. in [30]. The main advantage of Auer's solution is that the decision function does not rely on the regret, which is the loss due to the fact that the globally optimal policy (which is usually not known in practical wireless networks) is not followed in each learning iteration. It is clear that the regret will increase with time, because the globally optimal policy is usually not known and hence cannot be followed by the player. Thus, MAB-based strategies need to bound the increase of the regret, at least asymptotically. Auer et al. defined an upper confidence bound (UCB) within which the regret will be present. The UCB considers the player's average reward and the number of times an action has been selected until \(t_{n}\). Relying on these assumptions, we define our decision function in Equation 16.
During the first \(t_{n}=|\mathcal {A}_{k}|\) iterations, player k selects each action once in a random order to initialize the learning process by receiving a reward for each action. For the following iterations \(t_{n}>|\mathcal {A}_{k}|\), action selection is performed according to Algorithm 1. In each learning iteration, the action \(a^{*}_{j(k)}\) that maximizes the decision function in (16) is selected by player k. Then, the parameters \(s_{k,a_{j(k)}}(t_{n})\), \(n_{k,a_{j(k)}}(t_{n})\), and \(u_{k,a_{j(k)}}(t_{n})\) are updated, where \(s_{k,a_{j(k)}}(t_{n})\) is the cumulated reward of player k after playing action \(a_{j(k)}\), and the indicator \(\mathbb{1}_{i=j}\) used in these updates is equal to 1 if i=j and zero otherwise.
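The MAB-based bias selection can be sketched in Python as follows. This is a minimal illustration of a UCB-type learner using the decision function in (16), not a reproduction of Algorithm 1; the class name `UcbBiasLearner`, the reward values, and the number of iterations are hypothetical, and rewards are assumed to be normalized to [0, 1].

```python
import math
import random

class UcbBiasLearner:
    """UCB-type learner over a discrete set of CRE bias values (one per BS)."""

    def __init__(self, actions, rng=None):
        self.actions = list(actions)
        self.n = [0] * len(self.actions)    # times each action was selected
        self.s = [0.0] * len(self.actions)  # cumulated reward per action
        self.rng = rng or random.Random(0)
        # Initialization phase: every action is played once, in random order.
        self._init_order = list(range(len(self.actions)))
        self.rng.shuffle(self._init_order)

    def select(self):
        if self._init_order:
            return self._init_order.pop()
        total = sum(self.n)

        def decision(j):
            mean = self.s[j] / self.n[j]                        # exploitation
            bonus = math.sqrt(2.0 * math.log(total) / self.n[j])  # exploration
            return mean + bonus

        return max(range(len(self.actions)), key=decision)

    def update(self, j, reward):
        self.n[j] += 1
        self.s[j] += reward

# Hypothetical picocell example: the reward of each bias value is fixed.
bias_values_db = [0, 3, 6, 9, 12, 15, 18]
mean_reward = [0.1, 0.2, 0.3, 0.9, 0.3, 0.2, 0.1]  # assumed, in [0, 1]
learner = UcbBiasLearner(bias_values_db)
for _ in range(2000):
    j = learner.select()
    learner.update(j, mean_reward[j])
best = max(range(len(bias_values_db)), key=lambda j: learner.n[j])
print(bias_values_db[best], learner.n)
```

In a network setting the reward would be the utility observed by the BS after applying the selected bias; here it is replaced by a fixed table so that the exploration-exploitation behavior is easy to follow.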
6.2 Satisfaction-based learning approach
The idea of satisfaction equilibrium was introduced in [32], where players having partial or no knowledge about their environment and other players are solely interested in the satisfaction of some individual performance constraints instead of individual performance optimization. Here, we consider the player to be satisfied if its cell reaches a certain minimum level of total rate and if at least 90% of the UE in the cell obtain a certain average rate. The rationale behind considering these satisfaction conditions is to guarantee a minimum rate for each individual UE, while at the same time improving the total rate of the cell.
 1. In the first learning iteration \(t_{n}=1\), the probability of each action is equal and an action is selected randomly.
 2. In the following learning iterations \(t_{n}>1\), the player changes its action selection strategy only if the received utility does not satisfy the cell, i.e., if the satisfaction condition is not fulfilled.
 3. If the satisfaction condition is not fulfilled, the player k selects its action \(a_{j(k)}(t_{n})\) according to the probability distribution \(\pi_{k}(t_{n})\).
 4. Each player k receives a reward \(\phi_{k,\text{tot}}(t_{n})\) based on the selected actions.
 5. The probability \(\pi_{k,j}(t_{n})\) of action \(a_{j(k)}(t_{n})\) is updated according to the linear reward-inaction scheme (17), whereby the indicator term in the update equals one for the selected action and zero for the non-selected actions. Moreover, \(b_{k}(t_{n})\) is defined as follows:
$$ b_{k}(t_{n})=\frac{u_{k,\text{max}} + \phi_{k,\text{tot}}(t_{n}) - u_{k,\text{min}}}{2 u_{k,\text{max}}}, $$ (18)
where \(u_{k,\text{max}}\) is the maximum rate in the single-UE case and \({u_{k,\text {min}}= \frac {1}{2} u_{k,\text {max}}}\). Hereby, \(\lambda = \frac {1}{100 t_{n}+T_{s}}\) is the learning rate.
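The steps above can be sketched in Python as follows. The probability update uses the standard linear reward-inaction form, which is an assumption about the update rule, and \(b_{k}(t_{n})\) is computed assuming that the minimum utility is subtracted in (18). All function names, thresholds, and numerical values are hypothetical.

```python
def is_satisfied(total_rate, ue_rates, min_total, min_ue_rate, share=0.9):
    """Cell is satisfied if the total rate and at least 90% of UE rates suffice."""
    if not ue_rates:
        return False
    ok = sum(r >= min_ue_rate for r in ue_rates)
    return total_rate >= min_total and ok / len(ue_rates) >= share

def b_value(phi_tot, u_max):
    """Normalized reward b_k(t_n), assuming u_min = u_max / 2."""
    u_min = 0.5 * u_max
    return (u_max + phi_tot - u_min) / (2.0 * u_max)

def learning_rate(t_n, t_s):
    """Learning rate lambda = 1 / (100 t_n + T_s)."""
    return 1.0 / (100.0 * t_n + t_s)

def update_probabilities(pi, selected, b, lam):
    """Assumed linear reward-inaction step toward the selected action."""
    return [
        p + lam * b * ((1.0 if j == selected else 0.0) - p)
        for j, p in enumerate(pi)
    ]

# Hypothetical example with four actions and a uniform initial distribution.
pi = [0.25, 0.25, 0.25, 0.25]
b = b_value(phi_tot=50.0, u_max=100.0)
lam = learning_rate(t_n=1, t_s=1.0)
new_pi = update_probabilities(pi, selected=2, b=b, lam=lam)
print(b, lam, new_pi)
```

The update keeps the probabilities normalized, since the mass added to the selected action equals the mass removed from the others. In a full learner the update would only be applied while `is_satisfied` returns false, in line with step 2.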
7 Simulation results
Simulation parameters
Parameter  Value
Cellular layout  Hexagonal grid, three sectors per cell, reuse 1
Carrier frequency  2 GHz
System bandwidth  10 MHz
Subframe duration  1 ms
Number of RBs  50
Number of macrocells  1
Number of PBSs per macrocell P  {1,2,3}
Max. macro (pico) BS transmit power  \(P_{\text {max}}^{M} = 46\) dBm (\(P_{\text {max}}^{P} = 30\) dBm)
Macro path loss model  128.1+37.6 log10(R) dB (R in km)
Pico path loss model  140.7+36.7 log10(R) dB (R in km)
Traffic model  Full buffer
Scheduling algorithm  Proportional fair
Transmission mode  Transmit diversity
Min. dist. MBS-PBS  75 m
Min. dist. PBS-PBS  40 m
Min. dist. MBS-MUE  35 m
Min. dist. PBS-PUE  10 m
Hotspot radius  60 m
Thermal noise density  −174 dBm/Hz
Macro BS antenna gain  14 dBi
Pico BS antenna gain  5 dBi
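To make the role of the range expansion bias concrete, the following Python sketch evaluates the two path loss models and the transmit powers and antenna gains listed above for a single UE position. It is a simplified illustration: shadowing, fading, and antenna patterns are ignored, the total transmit power is used in place of a per-resource-element reference signal power, and the distances and function names are hypothetical.

```python
import math

def macro_path_loss_db(r_km):
    return 128.1 + 37.6 * math.log10(r_km)

def pico_path_loss_db(r_km):
    return 140.7 + 36.7 * math.log10(r_km)

def rx_power_dbm(tx_dbm, gain_dbi, path_loss_db):
    """Received power from transmit power, antenna gain, and path loss."""
    return tx_dbm + gain_dbi - path_loss_db

def serving_cell(d_macro_km, d_pico_km, bias_db):
    """Select the cell with the highest received power; the bias favors the pico."""
    macro = rx_power_dbm(46.0, 14.0, macro_path_loss_db(d_macro_km))
    pico = rx_power_dbm(30.0, 5.0, pico_path_loss_db(d_pico_km)) + bias_db
    return "pico" if pico > macro else "macro"

# Noise power over the 10 MHz system bandwidth from the thermal noise density.
noise_dbm = -174.0 + 10.0 * math.log10(10e6)

# Hypothetical UE position: 300 m from the macro BS and 50 m from the pico BS.
without_bias = serving_cell(0.3, 0.05, bias_db=0.0)
with_bias = serving_cell(0.3, 0.05, bias_db=12.0)
print(noise_dbm, without_bias, with_bias)
```

For this position the macrocell is received roughly 9.5 dB stronger than the picocell, so the UE associates with the macrocell unless a bias above that margin is applied.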
To compare our results with other approaches, we consider a baseline MM approach as defined in [2]. In the baseline MM approach, handover is performed based on layer 1 and layer 3 filtering as described in Section 3.1. For the baseline MM approach, we consider proportional fair-based scheduling, with no information exchange between macro and pico BSs. This baseline approach is referred to as the classical HO approach. In the following, we will compare our proposed MM approaches with this baseline MM approach, which is aligned with the handover procedure defined in 3GPP [2].
7.1 UE throughput and sum-rate
Confidence intervals of the sum-rates for a confidence level of 95%, by number of UE per macrocell
MM approach  10 UE  20 UE  30 UE  50 UE
Classical MM  [45.74, 56.33]  [58.77, 67.15]  [62.61, 71.94]  [67.51, 76.81]
MAB MM  [92.04, 113.92]  [116.79, 132.7]  [124.4, 142.33]  [124.4, 142.33]
Satisfaction MM  [92.37, 114.11]  [116.07, 132.84]  [124.04, 142.2]  [128.64, 145.77]
7.2 HOF and PP probabilities
7.3 Convergence behavior of MAB and satisfaction-based MM
8 Conclusions
We propose two learning-based MM approaches and a history-based context-aware scheduling method for HetNets. The first approach is based on MAB-based learning and aims at system performance maximization. The second method aims at satisfying each cell and each UE of a cell and is based on satisfaction-based learning. System-level simulations demonstrate the performance enhancement of the proposed approaches compared to the classical MM method. While gains of up to 80% are achieved on average for UE throughput, the HOF probability is reduced significantly by the proposed learning-based MM approaches.
9 Endnote
^{a} We consider lower REB values for macro BSs to avoid overloaded macrocells due to their large transmission power.
10 Appendix
Declarations
Acknowledgements
This research was supported in part by the SHARING project under the Finland grant 128010 and by the U.S. National Science Foundation under the grants CNS-1406968 and AST-1443999.
Authors’ Affiliations
References
 1. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2013-2018. Cisco Public Information (2014).
 2. 3GPP, Evolved Universal Terrestrial Radio Access (E-UTRA); Mobility Enhancements in Heterogeneous Networks. Technical Report 3GPP TR 36.839 V11.1.0 (Oct. 2012).
 3. Y Peng, YZW Yang, Y Zhu, in Proc. IEEE Symposium on Personal Indoor and Mobile Radio Communications (PIMRC). Mobility Performance Enhancements for LTE-Advanced Heterogeneous Networks (Sydney, Sept 2012).
 4. H Holma, A Toskala, LTE-Advanced: 3GPP Solution for IMT-Advanced (John Wiley & Sons, Ltd, UK, 2012).
 5. J AonsoRubio, in Proc. IEEE Network Operations and Management Symp. Self-optimization for handover oscillation control in LTE, (2010), pp. 950–953.
 6. G Hui, P Legg, in Proc. IEEE Vehic. Technol. Conf. (VTC). LTE handover optimisation using uplink ICIC, (2011), pp. 1–5.
 7. K Kitagawa, T Komine, in Proc. IEEE Int. Symp. Personal Indoor Mobile Radio Commun. (PIMRC). A handover optimization algorithm with mobility robustness for LTE systems, (2011), pp. 1647–1651.
 8. K Ghanem, H Alradwan, in Proc. IEEE Int. Symp. Commun. Syst., Networks and Digital Signal Proc. Reducing ping-pong handover effects in intra E-UTRA networks, (2012), pp. 1–5.
 9. H Zhang, X Wen, B Wang, W Zheng, Z Lu, in Proc. IEEE Int. Conf. on Research Challenges in Computer Science, 1. A novel self-optimizing handover mechanism for multi-service provisioning in LTE-Advanced, (2009), pp. 221–224.
 10. T Jansen, I Balan, J Turk, I Moerman, T Kurner, in Proc. IEEE Vehic. Technol. Conf. (VTC). Handover parameter optimization in LTE self-organizing networks, (2010), pp. 1–5.
 11. K Dimou, M Wang, Y Yang, M Kazmi, A Larmo, J Pettersson, W Muller, Y Timner, in Proc. IEEE Vehic. Technol. Conf. (VTC). Handover within 3GPP LTE: design principles and performance, (2009), pp. 1–5.
 12. W Li, X Duan, S Jia, L Zhang, Y Liu, J Lin, in Proc. IEEE Vehic. Technol. Conf. (VTC). A dynamic hysteresis-adjusting algorithm in LTE self-organization networks, (2012), pp. 1–5.
 13. TH Kim, Q Yang, JH Lee, SG Park, YS Shin, in Proc. IEEE Vehic. Technol. Conf. (VTC). A mobility management technique with simple handover prediction for 3G LTE systems, (2007), pp. 259–263.
 14. M Carvalho, P Vieira, in Proc. Int. Symp. on Wireless Personal Multimedia Commun. An enhanced handover oscillation control algorithm in LTE Self-Optimizing networks, (2011).
 15. W Luo, X Fang, M Cheng, X Zhou. Proc. IEEE Int. Workshop on Signal Design and Its Applications in Commun, (2011), pp. 193–196.
 16. I Balan, T Jansen, B Sas, in Proc. Future Network and Mobile Summit. Enhanced weighted performance based handover optimization in LTE, (2011), pp. 1–8.
 17. T Jansen, I Balan, S Stefanski, I Moerman, T Kurner, in Proc. IEEE Vehic. Technol. Conf. (VTC). Weighted performance based handover parameter optimization in LTE, (2011), pp. 1–5.
 18. HD Bae, B Ryu, NH Park, in Australasian Telecommunication Networks and Applications Conference (ATNAC). Analysis of handover failures in LTE femtocell systems, (2011), pp. 1–5.
 19. Y Lei, Y Zhang, in Proc. IEEE Int. Conf. Computer Theory and Applications. Enhanced mobility state detection based mobility optimization for femto cells in LTE and LTE-Advanced networks, (2011), pp. 341–345.
 20. M. P WileyGreen, T Svensson, in Proc. IEEE GLOBECOM. Throughput, Capacity, Handover and Latency Performance in a 3GPP LTE FDD Field Trial (FL, USA, Dec. 2010).
 21. 3GPP, Evolved Universal Terrestrial Radio Access (E-UTRA); Further advancements for E-UTRA Physical Layer Aspects, Technical Report 3GPP TR 36.814 V9.0.0, (Oct. 2010).
 22. D Xenakis, LMN Passas, C Verikoukis, Mobility Management for Femtocells in LTE-Advanced: Key Aspects and Survey of Handover Decision Algorithms. IEEE Comm. Surveys Tutorials. 16(1) (2014).
 23. S Barbera, SPH MMichaelsen, K Pedersen, in Proc. IEEE Wireless Comm. and Networking Conf. (WCNC). Mobility Performance of LTE Co-Channel Deployment of Macro and Pico Cells (France, Apr. 2012).
 24. RBP Munoz, I de la Bandera, On the potential of handover parameter optimization for self-organizing networks. IEEE Trans. Vehicular Technol. 62(5), 1895–1905 (2013).
 25. D LopezPerez, I Guvenc, X Chu, Mobility management challenges in 3GPP heterogeneous networks. IEEE Comm. Mag. 50(12) (2012).
 26. KI Pedersen, RPH CMichaelsen, S Barbera, Mobility enhancements for LTE-advanced multilayer networks with inter-site carrier aggregation. IEEE Comm. Mag. 51(5) (2013).
 27. J Wang, JPJ DWLiu, G Shen, in Proc. IEEE Vehic. Technol. Conf. (VTC). Optimized Fairness Cell Selection for 3GPP LTE-A Macro-Pico HetNets (CA, Sept. 2011).
 28. A Feki, V Capdevielle, L Roullet, AG Sanchez, in 11th Int. Symp. on Modeling & Optimization in Mobile, Ad Hoc & Wireless Networks (WiOpt). Handover Aware Interference Management in LTE Small Cells Networks, (2013).
 29. ME Harmon, SS Harmon, Reinforcement Learning: A Tutorial. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.2480&rep=rep1&type=pdf.
 30. NCBP Auer, P Fischer, Finite time analysis for the multiarmed bandit problem. Mach. Learn. 17, 235–256 (2012).
 31. 3GPP, Performance Study on ABS with Reduced Macro Power, Technical Report 3GPP R1-113806, (Nov. 2011).
 32. S Ross, B Chaibdraa, in Proc. 19th Canadian Conf. on Artificial Intelligence. Satisfaction equilibrium: achieving cooperation in incomplete information games, (2006).
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.