

  • Research
  • Open Access

A QoE-maximization-based vertical handover scheme for VLC heterogeneous networks

EURASIP Journal on Wireless Communications and Networking 2018, 2018:269

  • Received: 31 May 2018
  • Accepted: 1 November 2018
  • Published:


Visible light communication (VLC) not only provides indoor illumination but also offers broadband connectivity. It has the benefits of huge bandwidth, high security, low cost, and health safety. However, duplex communication, user mobility, and seamless coverage remain challenging in VLC networks. A VLC heterogeneous network (VLC-HetNet), which combines VLC with existing radio networks and in which a good vertical handover (VHO) scheme is critical to guarantee continuous transmission, has been proposed to solve these problems. In this paper, we propose a new VHO scheme to maximize the quality of experience (QoE) of a user. When a user makes a handover between VLC and the RF network, the influence on the user’s QoE may be positive (defined as the QoE profit) or negative (the reduction in QoE is defined as the handover cost). The handover decision is made so as to enhance the QoE and reduce the handover cost of the user, by formulating the problem as a Markov decision process (MDP). The simulation results show that the proposed VHO scheme adapts to user movement and achieves a relatively high average QoE, a low handover failure probability, and few handovers.


  • Quality of experience (QoE)
  • Visible light communication (VLC)
  • Markov decision process (MDP)
  • Vertical handover (VHO) scheme
  • Heterogeneous system

1 Introduction

Currently, data consumption in wireless networks is undergoing a drastic increase due to the upsurge in demand for mobile and multimedia services and their applications. However, this increase has pushed the radio frequency (RF)-based wireless technologies to their limits, and the spectrum is already allocated under license [1]. Visible light communication (VLC) utilizes low-cost light-emitting diodes (LEDs) as transmitters and p-intrinsic-n (PIN) photodiodes (PDs) as receivers. VLC is an option to overcome the crowded radio spectrum for wireless communication networks [2]. VLC has some benefits over RF technology: tremendous bandwidth, high data rate, non-interference nature, energy efficiency, and high security. However, one defect in VLC is the limited coverage area due to line-of-sight (LOS) transmission. In addition, LOS propagation cannot be assured in indoor environments due to the high probability of obstruction because of the haphazard movement of user equipment (UE) and other objects [3]. Consequently, VLC is limited in providing seamless data access.

In recent years, the VLC heterogeneous network (VLC-HetNet), which integrates VLC with conventional RF systems, has been proposed to combine the advantages of both technologies [4–6]. The main constraint of VLC is its dependence on LOS transmission. A vertical handover (VHO) from VLC to RF communication within a heterogeneous network can overcome this limitation. In a VLC-HetNet, the VLC LOS link is usually the better choice because of the advantages over RF mentioned earlier; it offers the UE a better quality of service (QoS) and a better quality of experience (QoE). When the VLC LOS link is blocked, the UE experiences a serious degradation of QoE because of the communication interruption. The connection then needs to fall back on an RF link to deliver uninterrupted data transmission, with a QoE that may be lower than that of the VLC network. When the VLC channel is restored, it is desirable to switch back to the VLC link because of its substantial benefits. Furthermore, a UE moving between regions where VLC hotspots are available and unavailable incurs additional signaling and delay costs. The immediate vertical handover (I-VHO) scheme is a traditional algorithm in which the UE hands over immediately when the optical channel is blocked. This may not always be the best method because it can cause ping-pong effects [3]. If the LOS link is regained quickly after a short interruption, dwelling avoids needless handovers that drastically decrease the QoE of the UE. Accordingly, the dwell vertical handover (D-VHO) scheme sets the dwell time to the period of a short interruption to avoid the ping-pong effect. However, neither scheme considers QoE, which describes the overall performance of a network from the user’s perspective.

In this paper, we investigate a QoE-maximization-based VHO (Q-VHO) scheme in a VLC-indoor environment. The handover decision is made by evaluating the QoE profit and delay cost. QoE profit is defined as the increase in a UE’s QoE when Q-VHO is performed. The delay cost is evaluated according to the additional delay caused by a handover decision. A Markov chain is proposed to model the blocking and non-blocking of the VLC link, and the arrival and departure rates are modeled as Poisson processes. Additionally, a Markov decision process (MDP) is formulated by maximizing the QoE value and reducing the handover delay.

The proposed Q-VHO scheme determines whether to perform a handover, given the UE’s current position and the transmission mode of UE: whether VLC or RF.

The main contributions of this paper are as follows:
  • We propose a Q-VHO scheme for VLC-HetNet that can maintain continuous data transmission when UEs move. The handover decisions are made by optimizing the QoE profit and delay cost; therefore, the UE’s QoE can be maintained.

  • We define the QoE profit and delay cost to evaluate the QoE reward and the additional handover delay caused by a handover decision. The scheme aims to simultaneously maximize UE QoE and minimize the delay cost when a handover is triggered.

2 Related works

Vertical handover is an essential technique to ensure continuous transmission in heterogeneous networks. The VHO techniques implemented in RF heterogeneous systems have recently been studied. Liu et al. [7] proposed a QoE-driven VHO algorithm based on IEEE 802.21-Media Independent Handover (MIH). The simulation results showed that the QoE-driven VHO algorithm could maintain better QoE of a multimedia service by considering video content and initiated VHO immediately when the QoE of the multimedia service became unacceptable. Singhrova et al. [8] proposed a neuro-fuzzy multi-parameter-based vertical handover decision algorithm (VHDA). Six parameters (radio signal strength, velocity of the user, available bandwidth, number of users, battery level, and coverage area) were considered in the proposed VHDA. In a simulated environment, the number of total vertical handovers decreased by 13.3% and 29.8% compared with those of the classic fuzzy method.

A novel analytical model to improve the modeling of vertical handover for a combined cellular/WLAN system was investigated by Kirsal et al. [9]. The model allowed users to perform a downward vertical handover to the WLAN and/or an upward vertical handover to the cellular network, where the WLAN system is inside the hotspot of a cellular network. Bin et al. [10] introduced a handover decision algorithm based on a multi-attribute utility function in a vehicle heterogeneous wireless network. Users make vertical handover decisions based on the access network, traffic, and number of users. An optimized vertical handover algorithm based on a Markov process is mainly used for the decision-making process. The algorithm affects the QoS of vehicle terminals. The mobile node receives the threshold of the received signal strength (RSS) from network access points or the base stations as the reference, and makes the handover decision when the RSS is subjected to certain conditions. Vegni et al. [11] investigated a soft/hard VHO technique modeled by means of a multi-dimensional Markov chain by assuming a probabilistic approach as the handover decision metric. A Markov decision process with the objective of maximizing the total expected reward per connection was presented in beyond 3rd generation (B3G) and 4th generation (4G) heterogeneous wireless networks [12].

VLC usually has a high signal-to-noise ratio (SNR) that is much larger than the traditional RF links and provides high throughput while maintaining indoor illumination. VLC can largely offload the data traffic from the indoor RF network and potentially solve the spectrum shortage crisis. However, handover in VLC is substantially different from that in RF networks because the directivity of the optical channel is higher than that in RF, which results in a limited coverage area of one VLC hotspot. To solve this problem, researchers have studied the handover schemes in VLC-HetNet. A hybrid VLC-LTE system was proposed in [13]. VHO algorithm through prediction (PVHO) was deployed to address the problem. A mobile terminal (MT) recorded the key parameters in real time and then processed them to offer guidance for proper handover decisions. Analysis of the numeric results indicates that the performance of PVHO is superior under various circumstances.

Wan et al. [14] studied the downlink resource allocation (RA) problem for a hybrid VLC and Wi-Fi system. A resource allocation algorithm combined with optical power dynamic allocation was proposed by maximizing the best effort (BE) service users’ aggregate throughput and users’ proportional fairness under the premise of guaranteeing the minimum rate requirement of delay-constrained (DC) service users. Basnayaka et al. [15] focused on improving the per user average and outage throughput in a VLC and RF heterogeneous network. The VLC system resources were assumed to be fixed, and the spectrum and power requirements for the RF system were quantified. The simulation results showed that the network can achieve better per user rate performance. In [16], the authors presented a novel hard-link switching scheme for VLC networks using pre-scanning based on received signal strength (RSS) prediction. The proposed system achieves both hard-link switching and soft-link switching reward without exchanging device hardware and the IEEE 802.15.7 medium access control (MAC) protocol. Because the hard handover may result in lost data connection, a soft handover scheme was presented in [17] by using orthogonal frequency division multiple access (OFDMA) under mobility. Wang et al. [4] investigated a Markov decision process problem and implemented a dynamic method to obtain a trade-off between the cost of switching and the delay requirement. They proposed a scheme to determine whether to perform VHO given the queue length and the condition of the optical channel. A novel VLC-HetNet protocol was proposed in [5, 6] that combined access, horizontal, and vertical handover mechanisms for a mobile terminal (MT) to resolve user mobility among different hotspots and an OFDMA system. The simulation results showed improvements in the capacity performance of the VLC-HetNet compared to that of the RF system.

3 System model

The VLC-HetNet is composed of an RF access point (AP), a number of overlapping VLC hotspots, UE, and a control center, all connected to an external network, as shown in Fig. 1. Some UE can receive both VLC and RF signals; the other UE have only RF transceivers and are called RF-UE. UE can access the VLC-HetNet by sending request packets via RF links only. The VLC hotspots and RF AP are linked to the control center in a bus topology. The RF link is used for the UE uplink connection, or for downlink transmission when the optical channel is blocked. We assume that the RF AP has a queue with a finite buffer for processing the uplink packets of the UE and a downlink queue for transmitting data packets. A UE can download data via a VLC hotspot or the RF AP, depending on the condition of the VLC link and the UE’s transmission mode. The RF links are assumed to cover the whole room. An overlap region is an interference area among two or more VLC hotspots; it is also termed the interference region. Interference arises when two waves of the same frequency add up to form an amplitude that is either larger or smaller than that of the individual waves, depending on whether or not their peaks and troughs match up [18]. This causes a serious degradation of the UE’s QoE when the user is in this region. When a UE moves around, a vertical handover that switches transmission from the VLC to the RF network, or from the RF to the VLC network, should be considered to ensure continuous data links. Handover can be triggered when the VLC link is blocked or unblocked. The UE measures its QoE from the APs of the VLC and the RF through the control center. When the VLC link is blocked, the UE’s QoE degrades drastically. To prevent a complete degradation, which could result in a breakdown in communication, a vertical handover is triggered. The control center handles the handover process by receiving requests, processing them, and executing the handover.
Handover requests are sent through the RF AP. Since a UE moves in a random manner, the duration of time spent in and out of the VLC hotspot is a random variable. The movements of UE can be described by their directions d (radians), velocities v (m/s), time durations t (s), and pause time pt (s).
Fig. 1 System model diagram

The QoE of a UE is defined according to three types of services: audio, video, and data transfer. Basically, the UE’s QoE can be obtained from the mean opinion score (MOS) of data traffic [19]. The MOS takes five values from 1 to 5 indicating the user’s degree of satisfaction: “Bad,” “Poor,” “Fair,” “Good,” and “Excellent,” respectively [20]. In video traffic, the MOS depends primarily on the loss of a slice of a frame from the video stream [21]. The MOS can be simplified as a function of the peak signal-to-noise ratio (PSNR) with some transformation [22]. The QoE function Qvideo is given as
$$ Q_{{video}} (P_{{snr}}) = 4.5 - \frac{3.5}{1+ {exp} (b_{1}(P_{{snr}}-b_{2}))} $$
where b1 and b2 are the parameters determining the shape of the function and Psnr is the experienced PSNR. Traffic such as file transfer and web browsing in non-real-time are called elastic traffic. The corresponding QoE is defined as an increasing function of throughput θ:
$$ Q_{elastic}(\theta) = b_{3} \log(b_{4}\theta) \tag{2} $$
where b3 and b4 are determined by the required maximal and minimal throughputs [23]. For audio traffic, the QoE function Q audio is defined by a nonlinear mapping of the R factor:
$$ \begin{aligned} Q_{{audio}}(R_{f}) = 1 + 0.035R_{f} + 7 \times 10^{-6} \times \\R_{f} (R_{f} -60)(100- R_{f}) \end{aligned} $$
where Rf is the R factor defined by ITU to reflect the audio quality impairment from different aspects [24, 25]. In order to translate end-to-end (E2E) QoE parameters, such as delay (D), data rate (R) and packet loss ratio (Pl), into QoE values, the following models can be applied according to service types:
$$ QoE_{audio} = 1 + 0.035R_{f} + 7 \times 10^{-6} \times R_{f}(R_{f}-60)(100-R_{f}) $$
$$ QoE_{{video}} = 4.5-\frac{3.5}{ 1 + {exp}(0.5 \times (PSNR-30))} $$
$$ QoE_{{data}} = 2.1 \times {log}_{10}[0.3 \times R \times (1-P_{l})] $$

Please refer to [21, 26] for more details about these expressions. In this work, we assume the elastic service type of Eq. (2), where b3 and b4 are determined by the required minimal and maximal throughputs, which are assumed to be 1 and 10 Mbps, respectively. The resulting parameters are set as b3=0.6021 and b4=31.228.
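As an illustration only, the three service-specific QoE models above can be sketched in Python. The function names are hypothetical, the audio mapping uses the 0.035 coefficient of the Qaudio expression, and the unit of the data rate R is not stated in the text, so the caller must supply it in whatever unit the model was fitted for.

```python
import math


def qoe_audio(r_factor: float) -> float:
    """Nonlinear mapping from the R factor to a MOS-like QoE value."""
    return 1 + 0.035 * r_factor + 7e-6 * r_factor * (r_factor - 60) * (100 - r_factor)


def qoe_video(psnr: float) -> float:
    """Logistic video model with shape parameters b1 = 0.5 and b2 = 30."""
    return 4.5 - 3.5 / (1 + math.exp(0.5 * (psnr - 30)))


def qoe_data(rate: float, packet_loss: float) -> float:
    """Data model 2.1 * log10(0.3 * R * (1 - Pl)); the argument must be positive."""
    return 2.1 * math.log10(0.3 * rate * (1 - packet_loss))
```

For example, `qoe_video(30)` sits at the midpoint of the logistic curve, halfway between its asymptotes of 1 and 4.5.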

We suppose that the arrival and departure processes of the packets in the uplink and downlink queue follow a Poisson distribution [4]. λ (packets/s), μV (packets/s), and μRF (packets/s) represent the arrival rate of the packets and the VLC and RF serving rates respectively. Intermittent interruption of the VLC link is described by an ON-OFF model, whereby ON means the VLC is not blocked and OFF means the VLC is blocked. The blockage of the VLC link follows an exponential random process. γ1 and γ2 are the mean duration of the VLC channel when not blocked and blocked, respectively. The rate of change of the VLC channel from non-blocked to blocked is denoted as 1/γ1 (s−1) while 1/γ2 (s−1) represents the rate of change of the VLC link from blocked to non-blocked. Using the above definitions, we formulate a Markov decision process to select the optimal decision whenever UE transitions from the VLC hotspot to the RF or vice versa.

The UE’s QoE varies according to the type of coverage region it is in, the duration of time in that coverage region, the transmission mode, and the handover delay. Vertical handover from VLC to RF or from RF to VLC affects the UE’s QoE. For instance, if the UE has a higher QoE in VLC but has to hand over to the RF network, which provides a lower QoE, there will be a negative effect on the UE’s QoE. The extent of this effect depends on how long the UE remains outside of VLC coverage. Additionally, if the UE moves outside of the VLC hotspot and does not hand over to the RF network, the QoE degradation may be worse. By contrast, if the UE moves into the VLC coverage area and hands over to VLC, there will be a significant improvement in QoE. The longer the UE remains in VLC, the greater the improvement. Sometimes a UE can move in and out of the VLC region frequently within a short period of time. If a handover is performed immediately for every transition, significant handover delay and signaling cost accumulate and may even result in the ping-pong effect [27], thereby reducing QoE. Therefore, the ping-pong effect can be regarded as a QoE penalty. When a handover decision is made, there can be a benefit, as measured by QoE, and also a penalty, as measured by QoE and handover delay. An optimum decision must therefore be made to obtain the maximum reward.

4 QoE-maximization-based vertical handover scheme

To optimize the vertical handover decision making, the problem is formulated as a Markov decision process (MDP). In order to apply MDP, we must first define the state spaces and derive the transition probabilities between states from the Markov chain. Furthermore, we need to define the action space and the decision epochs when an action should be chosen.

4.1 State and action spaces

The state space of a user equipment as it receives data is given as
$$ s= \{(v,l_{i},t_{x}),v\in V,l_{i}\in L_{i},t_{x}\in T_{x} \} $$

where V={ON,OFF} represents the condition of the VLC link, Li indicates the number of packets in the downlink of the ith user and Tx={0,1} stands for the current transmission mode. When V=ON, the VLC link is available to the user. If V=OFF, the VLC link is blocked or unavailable. The RF channel is assumed to always be available. For each UE, packets in the downlink can be transmitted over the RF or VLC channel. Tx=0 indicates transmission over the RF channel; Tx=1 represents transmission over the VLC channel. We define A={0,1} as the action space of the ith UE. Depending on the current state of the optical channel, the downlink queue length of a particular UE and its transmission mode, an action a(s) is taken to handover UE from one system to another or not. The condition of the VLC link can switch from ON to OFF and vice versa. In addition, the downlink queue length changes according to the arrival and serving rates and the condition of the VLC channel. The channel serving rate depends on the UE transmission mode and the channel gain. If a UE is downloading data through an RF AP when the VLC link suddenly becomes available, a decision must be made to handover to VLC or not. On the other hand, a UE may be transmitting on VLC when the link becomes blocked abruptly. A decision of whether to switch transmission mode to RF must be made when the VLC link is blocked.

4.2 Markov chain and transition probability

The transition probabilities between states can be derived from the two-dimensional Markov chain shown in Fig. 2 [4]. The transmission mode of the UE and the action are not included in the Markov chain because they are partly random and partly under the influence of the decision-making process. For V=ON, the transition probability is given by
$$ P^{a(s)}_{s\rightarrow s^{'}} = \left\{\begin{array}{ll} \dfrac{\lambda}{\lambda + \frac{\eta}{\gamma_{1}} + \mu_{ON}}, & \text{if } s^{'}=[ON, L_{i}+1, a(s)]\\ \dfrac{\frac{\eta}{\gamma_{1}}}{\lambda+\frac{\eta}{\gamma_{1}}+\mu_{ON}}, & \text{if } s^{'} = [OFF, L_{i}, a(s)]\\ \dfrac{\mu_{ON}}{\lambda+\frac{\eta}{\gamma_{1}}+ \mu_{ON}}, & \text{if } s^{'}= [ON, L_{i}-1, a(s)] \end{array}\right. $$
Fig. 2 Illustration of the transition rates for 0<li<Li

where μON=cμV+(1−c)μRF, η is the weighting factor with unit of packets, λ+(η/γ1)+μON is the sum of the weighted transition rates, and c=|tx−a(s)|. For V=OFF, the transition probability is given by
$$ P^{a(s)}_{s\rightarrow s^{'}} = \left\{\begin{array}{ll} \dfrac{\lambda}{\lambda+\frac{\eta}{\gamma_{2}}+\mu_{OFF}}, & \text{if } s^{'}=[OFF, L_{i}+1, a(s)]\\ \dfrac{\frac{\eta}{\gamma_{2}}}{\lambda+\frac{\eta}{\gamma_{2}}+\mu_{OFF}}, & \text{if } s^{'}=[ON, L_{i}, a(s)]\\ \dfrac{\mu_{OFF}}{\lambda+\frac{\eta}{\gamma_{2}}+\mu_{OFF}}, & \text{if } s^{'}=[OFF, L_{i}-1, a(s)] \end{array}\right. $$

where μOFF=(1−c)μRF, and λ+(η/γ2)+μOFF is the sum of the weighted transition rates. The solution of the MDP is the optimal action a(s) to be taken for a given system state s.
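A minimal sketch of these transition probabilities follows. It rests on two assumptions: c is read as |tx − a(s)|, so that c = 1 means the UE transmits over VLC after the action, and the departure term uses μON (or μOFF) in the numerator so that the three probabilities sum to one. The function and key names are hypothetical.

```python
def transition_probabilities(v, c, lam, eta, gamma1, gamma2, mu_v, mu_rf):
    """Return the probabilities of the three outgoing transitions of a state.

    v is "ON" or "OFF" (condition of the VLC link); c is 1 if the UE
    transmits over VLC after the action and 0 if it transmits over RF.
    """
    if v == "ON":
        mu = c * mu_v + (1 - c) * mu_rf  # mu_ON
        flip = eta / gamma1              # weighted ON -> OFF rate
    else:
        mu = (1 - c) * mu_rf             # mu_OFF: no VLC service while blocked
        flip = eta / gamma2              # weighted OFF -> ON rate
    total = lam + flip + mu              # sum of the weighted transition rates
    return {
        "arrival": lam / total,          # queue length increases by one
        "channel_flip": flip / total,    # VLC link changes condition
        "departure": mu / total,         # queue length decreases by one
    }
```

Note that for V=OFF with c = 1 the departure probability is zero, which reflects a UE that stays in VLC mode while the link is blocked and therefore receives no service.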

4.3 Reward model

When a VLC channel is blocked or recovered, a decision must be made to either change or maintain the current transmission mode. The selected action for a given system state has an associated expected benefit and penalty. In our case, the benefit is in the form of QoE profit, and the penalty is described as the handover cost. The net effect of the QoE profit and handover cost represents the expected reward. The goal of an MDP is to choose the action that maximizes the cumulative function of the random rewards. In our handover scheme, the goal is to maximize the QoE profit of the UE and minimize handover cost.

4.3.1 QoE profit-based benefit function

The QoE profit (Qp) of a UE is defined as
$$ Q_{p} = \frac{Q \cdot D}{T} $$
where Q is the value of the UE’s QoE, D denotes the duration of a certain QoE value, and T is the time duration of the downlink. Qp and T are inversely proportional. For a given state s={v,li,tx} and action a(s), the expected QoE profit for a UE is given as
$$ Q_{p}(s,a(s)) = \frac{Q_{V}\gamma_{1}c + \left[Q_{RF}\frac{l_{i}-\mu_{V}\gamma_{1}}{\mu_{RF}}\right](1-c)}{\gamma_{1}+\gamma_{2} + (l_{i}-\mu_{V}\gamma_{1})/\mu_{RF}} \tag{11} $$

where QV is the QoE value for VLC, QRF is the QoE value when the UE transmits data via the RF link, and c=|tx−a(s)|. QV=b3log(b4θ1) and QRF=b3log(b4θ2), where θ1 and θ2 are the expected throughputs for the VLC and RF channels, respectively. b3 and b4 are obtained according to Eq. (2). Equation (11) shows that the QoE profit of the UE in VLC depends largely on the mean non-blocking time γ1 and the QoE value QV. Furthermore, the QoE profit of the UE in the RF link is affected by the queue length li, the RF serving rate μRF, the VLC serving rate μV, the mean non-blocking time γ1, and the QoE value in RF, QRF. By definition, if the UE spends more time in VLC, it will spend less time in RF. Therefore, the more time the UE remains in VLC, the higher the QoE profit it derives from VLC and the lower the QoE profit from the RF links.
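The expected QoE profit can be transcribed directly into code. The sketch below is a literal reading of the expression above, with a hypothetical function name and c taken as 1 for VLC transmission and 0 for RF; it does not guard against a queue length smaller than μVγ1, a case the text does not discuss.

```python
def qoe_profit(c, l_i, q_v, q_rf, gamma1, gamma2, mu_v, mu_rf):
    """Expected QoE profit for transmission mode c (1 = VLC, 0 = RF)."""
    # Time needed on RF to serve the packets left after the VLC ON period.
    rf_time = (l_i - mu_v * gamma1) / mu_rf
    numerator = q_v * gamma1 * c + q_rf * rf_time * (1 - c)
    return numerator / (gamma1 + gamma2 + rf_time)
```

With l_i = 10, γ1 = 2, γ2 = 1, μV = 2 and μRF = 1, the residual RF time is 6, so the profit is QV·2/9 for VLC and QRF·6/9 for RF.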

4.3.2 Handover penalty function

When the UE leaves the VLC region or moves into the VLC coverage area, the decision of whether to handover will introduce some cost or penalty. The handover cost in our scheme is composed of two parts: an expected cost of change in QoE and a delay cost. For a given state s={v,li,tx} and action a(s), the expected handover cost, g(s,a(s)), is given by
$$ g(s,a(s)) =\max\{0,\Delta Q(a(s))\} + \tau(a(s)) $$

where max{0,ΔQ(a(s))} is the expected cost of change in QoE, τ(a(s)) is the delay cost, and ΔQ(a(s))=Q(before decision)−Q(after decision).

When the UE is associated with the RF link and then transfers from RF to VLC, the QoE before the decision to handover to VLC may be less than the QoE after the decision. This is consistent with the theory and practical experience where VLC often offers a faster transmission rate than that provided by an RF network. Under this condition (ΔQ(a(s))<0), the handover decision is beneficial to the UE; hence, the handover cost is determined only by the handover delay cost. The variation in QoE cost is denoted as
$$ \Delta Q(a(s)) = a(s)\gamma_{1}\Delta f_{RF\rightarrow VLC} $$
where \(\Delta f_{RF\rightarrow VLC}\) is the variation in the UE’s QoE from the RF channel to the VLC channel (QRF−QV). Therefore, the handover cost can be represented as
$$ g(s,a(s))=\tau(a(s)) $$
Next, we consider the case when the UE is connected to VLC and leaves the VLC coverage area. Clearly, the QoE value before the decision will be more than the QoE value after the decision, irrespective of the action taken. The variation in the QoE value after handover can be described as
$$ \Delta Q(a(s)) = a(s)\gamma_{2}\Delta f_{VLC\rightarrow RF} + (1-a(s))\gamma_{2}\Delta f_{VLC\rightarrow BLOCK} $$
where ΔQ(a(s))>0, \(\Delta f_{VLC\rightarrow RF}\) is the change in the UE’s QoE from the VLC channel to the RF channel, and \(\Delta f_{VLC\rightarrow BLOCK}\) is the change in QoE from the VLC channel to blocked. Clearly, \(\Delta f_{VLC\rightarrow RF} < \Delta f_{VLC\rightarrow BLOCK}\). When a handover from VLC to the RF link is executed (a(s)=1), the expected handover cost is
$$ g(s, a(s)) = \gamma_{2} \Delta f_{{{VLC}}\rightarrow {{RF}}} + \tau(a(s)) $$
In this instance, the handover cost is determined largely by the mean blocking time γ2 and handover delay. However, when handover is not executed (a(s)=0), the expected cost is represented as
$$ g(s, a(s)) = \gamma_{2}\Delta f_{{{VLC}}\rightarrow {{BLOCK}}} $$
The handover delay cost is determined by the handover delay, which is the waiting time in the uplink and downlink queue plus the packet processing time. For example, if the UE triggers a handover from the VLC to RF network, it first sends an access request via the RF uplink queue. After the uplink access request packets have been processed, the handover succeeds when the downlink data packets depart from the downlink queue successfully. If we let the maximum handover delay hmax correspond to the maximum delay cost and the minimum handover delay hmin correspond to the minimum delay cost, we obtain the following expression for the delay cost as a function of a(s):
$$ \tau (a(s)) = \frac{\beta h - \beta h_{{min}}}{h_{{max}}-h_{{min}}} $$

where h is the handover delay in seconds and β is a normalization constant. We set β = 2.5, hmax = 1 s, hmin = 0.1 s, and h ∈ {0.1, 0.2, …, 1}.
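The penalty function can be sketched as follows, using the parameter values just given. Charging the delay cost only when a handover is actually executed (a(s) = 1) is an assumption inferred from the no-handover case above, where the cost contains no delay term; the function names are hypothetical.

```python
def delay_cost(h, beta=2.5, h_min=0.1, h_max=1.0):
    """Normalized delay cost: 0 at h_min and beta at h_max."""
    return beta * (h - h_min) / (h_max - h_min)


def handover_cost(delta_q, h, action):
    """Expected handover cost g = max(0, delta_q) + tau.

    delta_q is Q(before decision) - Q(after decision); action is 1 when a
    handover is executed and 0 otherwise (assumed to incur no delay cost).
    """
    qoe_cost = max(0.0, delta_q)
    return qoe_cost + (delay_cost(h) if action == 1 else 0.0)
```

A beneficial RF-to-VLC handover (negative `delta_q`) is therefore charged the delay cost alone, while staying on a blocked VLC link is charged the QoE loss alone.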

4.4 Optimization problem and Q-VHO algorithm

The goal of our VHO scheme is to give the UE the maximum possible QoE profit with the minimum handover cost for any given state. In our VHO algorithm, we use a discounted model. That is, the reward of the current stage and the discounted reward of future stages are maximized. Using the Bellman equation [28], the average discounted sum of rewards can be expressed as
$$ V(s) = \max_{a(s)} \sum_{s^{'}\in S} P^{a(s)}_{s\rightarrow s^{'}} \left[ Q_{p}\left(s, s^{'}, a(s)\right) - g\left(s, s^{'}, a(s)\right) + \alpha V(s^{'}) \right] \tag{19} $$

where \(Q_{p}(s,s^{'},a(s))\) is the expected QoE profit obtained by moving from state s to state \(s^{'}\) under the action a(s), and \(g(s,s^{'},a(s))\) is the expected cost of handover for a given state s and action a(s) resulting in a new state \(s^{'}\). \(V(s^{'})\) is the optimal reward obtained by moving into state \(s^{'}\), and α is the discount factor.

The solution of the optimality equation corresponds to the maximum expected total reward V(s) and the MDP optimal policy a(s), which represents the decision of whether to hand over at a given state. Various algorithms can be used to solve the optimization problem given in Eq. (19). With the value iteration algorithm, we can obtain the VHO solution of Eq. (19). In the algorithm, Qk[s,a(s)] is the average reward for each state of iteration k under action a(s). \(V_{k}^{*}(s)\) is the optimal average reward for each state of iteration k. Using the knowledge of \(V_{k-1}^{*}(s)\), the optimal action \(a_{k}^{*}(s)\) is selected to maximize Qk[s,a(s)], and the corresponding optimal reward \(V_{k}^{*}(s)\) is obtained. The iteration continues until ||Vk − Vk−1|| ≤ ε. The MDP-based Q-VHO algorithm determines the expected total reward and the corresponding stationary deterministic optimal policy.
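The value iteration procedure described above can be sketched generically. This is not the authors' listing: the nested-list layout of `P` and `R`, the max-norm stopping test, and the function name are assumptions made for illustration, with `R[a][s][t]` standing for the QoE profit minus the handover cost of moving from state s to state t under action a.

```python
def value_iteration(P, R, alpha=0.9, eps=1e-6, max_iter=10_000):
    """Solve a finite discounted MDP by value iteration.

    P[a][s][t] is the probability of moving from s to t under action a,
    R[a][s][t] the associated reward, alpha the discount factor.
    Returns the value function V and a greedy deterministic policy.
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        V_new = [0.0] * n_states
        for s in range(n_states):
            # Q_k[s, a]: expected reward of action a given the previous values.
            q = [
                sum(P[a][s][t] * (R[a][s][t] + alpha * V[t]) for t in range(n_states))
                for a in range(n_actions)
            ]
            policy[s] = max(range(n_actions), key=q.__getitem__)
            V_new[s] = q[policy[s]]
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff <= eps:  # stop when ||V_k - V_{k-1}|| <= eps
            break
    return V, policy
```

In the handover setting, the states would enumerate the tuples (v, li, tx) and the two actions would be "stay" (0) and "hand over" (1).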

5 Simulation

Using MATLAB, simulation is carried out to compare the performance of the Q-VHO scheme with that of the benchmarks (I-VHO and D-VHO schemes). We use the average QoE, handover failure probability, and average number of vertical handovers as performance metrics.

The simulation scenario is set up in a room with 9 overlapping VLC hotspots and an RF AP. Each VLC hotspot has a coverage radius of 1.5 m. The overlap areas of VLC hotspots are regarded as out-of-VLC coverage due to the interference of existing optical signals. The locations of the 9 VLC APs are as follows: (1.5, 1.5, 5), (4, 1.5, 5), (6.5, 1.5, 5), (1.5, 4, 5), (4, 4, 5), (6.5, 4, 5), (1.5, 6.5, 5), (4, 6.5, 5), and (6.5, 6.5, 5). The RF AP can be accessed anywhere in the room. We assume that the uplink and downlink queues of the RF AP are M/M/1/K systems with maximum lengths of 10 packets [29]. Initially, a UE is connected to a VLC hotspot. The UE moves randomly, with a direction drawn uniformly between 0 and 2π radians. The velocity of the UE is defined as the speed of movement in a particular direction. The UE’s velocity ranges from 0.3 to 0.7 m/s, which is somewhere between a slow walk and a quick stroll [30]. The period of time for the UE to move to a new position is referred to as the movement time duration. The pause time is the period of time a UE remains at a new position. The random movement continues until the total simulation time (1 h) elapses. The random movement of the UE leads to the blocking and unblocking of the VLC link. When the UE moves out of or into VLC coverage, the mean durations of blocking and non-blocking of the VLC channel are updated. The Q-VHO algorithm utilizes this information for handover decision making.
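The coverage rule and one step of the random movement can be sketched as below. Several points are assumptions rather than details from the text: coverage is tested on horizontal distance only, the room is taken as the square [0, 8] × [0, 8] m with positions clamped at the walls, and movement and pause times are drawn uniformly from the ranges of Table 1. All names are hypothetical.

```python
import math
import random

# Horizontal positions of the 9 VLC APs, in the order listed in the text.
AP_XY = [(x, y) for y in (1.5, 4.0, 6.5) for x in (1.5, 4.0, 6.5)]
RADIUS = 1.5  # coverage radius of a hotspot (m)
ROOM = 8.0    # side length of the room (m)


def in_vlc_coverage(x, y):
    """True if exactly one hotspot covers (x, y); overlaps count as uncovered."""
    hits = sum(1 for ax, ay in AP_XY if math.hypot(x - ax, y - ay) <= RADIUS)
    return hits == 1


def random_step(rng, x, y):
    """Move once in a random direction; return the new position and elapsed time."""
    d = rng.uniform(0.0, 2.0 * math.pi)  # direction (rad)
    v = rng.uniform(0.3, 0.7)            # speed (m/s)
    t = rng.uniform(1.0, 10.0)           # movement time duration (s)
    pt = rng.uniform(2.0, 10.0)          # pause time (s)
    nx = min(max(x + v * t * math.cos(d), 0.0), ROOM)
    ny = min(max(y + v * t * math.sin(d), 0.0), ROOM)
    return nx, ny, t + pt
```

Calling `random_step(random.Random(0), 1.5, 1.5)` repeatedly and checking `in_vlc_coverage` after each step yields the ON and OFF periods from which γ1 and γ2 can be estimated.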

Our vertical handover scheme is compared with the immediate and dwell-based vertical handover schemes [4]. In I-VHO, the controller performs a VHO whenever the UE transitions from VLC coverage to RF coverage and vice versa. In D-VHO, however, the controller waits for a period of time t0 before making a VHO decision. We set t0=0.5 s and 1 s. When the dwell time expires, the controller switches the transmission mode to RF if the optical link is still blocked; otherwise, the transmission mode remains VLC. When the VLC link is recovered, the controller immediately switches from RF mode to VLC mode in both the I-VHO and D-VHO schemes. The simulation parameters are summarized in Table 1.
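The difference between the two benchmarks can be illustrated by counting handovers over a list of blockage durations. This is a simplified sketch with a hypothetical function name: it treats I-VHO as D-VHO with a zero dwell time and assumes that a blockage lasting exactly t0 triggers no handover.

```python
def count_vhos(blockage_durations, dwell_time=0.0):
    """Count vertical handovers for I-VHO (dwell_time = 0) or D-VHO.

    Each blockage that outlasts the dwell time causes two handovers:
    VLC -> RF when the timer expires and RF -> VLC on recovery.
    """
    handovers = 0
    for duration in blockage_durations:
        if duration > dwell_time:
            handovers += 2
    return handovers
```

For blockages of 0.2 s, 0.8 s and 1.5 s, this gives 6 handovers for I-VHO, 4 for t0 = 0.5 s and 2 for t0 = 1 s, which shows how dwelling suppresses handovers caused by short interruptions.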
Table 1 Simulation parameters

Parameter                                              Value
Room dimensions                                        8 m x 8 m x 5 m
Number of VLC APs                                      9
Radius of APs                                          1.5 m
Velocity of UE v                                       0.3–0.7 m/s
Movement time duration t                               1–10 s
Pause time pt                                          2–10 s
Direction d                                            0–2π radians
Throughput for RF θ2                                   1 Mbps
Throughput for VLC θ1                                  10 Mbps
Packet arrival rate λ                                  0.1–1 packets/s
Packet departure rate of VLC μV                        2 packets/s
Packet departure rate of RF μRF                        1.1 packets/s
Downlink queue length of ith UE Li                     20 packets
Maximum queue length of the uplink and downlink        10 packets
Number of RF-UE accessing the uplink                   1–10 UEs
Weighting factor η                                     1 packet
Dwell time t0                                          1, 0.5 s
Simulation time                                        3600 s
Number of iterations

To estimate the average QoE, we determine the transmission mode of UE, i.e., whether it is transmitting via VLC or RF. In addition, we obtain the time duration of being connected to VLC or RF. The average QoE is calculated by
$$ {A}_{{Q}_{e}}=\frac{{\sum\nolimits}_{r=1}^{{N}_{i}}{\sum\nolimits}_{c=1}^{{N}_{c}(r)}{\left[{Q}_{e}(c,r)\times{T}_{c}(r)\right]} -{d}_{c}}{{\sum\nolimits}_{r=1}^{{N}_{i}}{T}_{i}(r)} $$
where \(A_{Q_{e}}\) is the average QoE for a particular VHO scheme, Qe(c,r) is the QoE during the cth connection of iteration r, dc is the delay cost of a VHO to establish the cth connection, Tc(r) is the time duration of the cth connection in iteration r, Nc(r) is the number of connections in iteration r, Ti(r) is the total time duration of iteration r, and Ni is the number of iterations. The handover failure probability, which is the likelihood that a VHO request will not be processed, is calculated by
$$ F_{{VHO}}=\frac{{\sum\nolimits}_{r=1}^{N_{i}}\frac{p(r)^{B}-p(r)^{B+1}}{1-p(r)^{B+1}}}{N_{i}} $$
where FVHO is the handover failure probability for a particular VHO scheme, p(r) is the utilization of the RF uplink server for iteration r, and B is the maximum queue length of the RF uplink. The average number of VHOs is calculated by
$$ A_{{VHO}}=\frac{{\sum\nolimits}_{r=1}^{N_{i}}N_{{VHO}}(r)}{N_{i}} $$

where AVHO is the average number of VHOs for a particular VHO scheme and NVHO(r) is the number of VHOs for iteration r.
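The three metrics above can be computed from per-iteration logs as in the following sketch. It is a hedged illustration, not the authors' code: the function names and the data layout (a list of `(qoe, duration, delay_cost)` tuples per iteration) are hypothetical, and the delay cost dc is assumed to be subtracted once for each connection inside the double sum, which is one reading of the average-QoE formula. The special case for a utilization of exactly 1 is the standard limit of the M/M/1/K blocking formula and is added only to avoid a division by zero.

```python
def average_qoe(iterations):
    """Average QoE over all iterations.

    iterations: list of (connections, total_time) pairs, where connections
    is a list of (qoe, duration_s, delay_cost) tuples for one iteration.
    Assumption: the delay cost is charged once per connection.
    """
    weighted = 0.0
    total = 0.0
    for connections, total_time in iterations:
        for qoe, duration, delay_cost in connections:
            weighted += qoe * duration - delay_cost
        total += total_time
    return weighted / total


def mm1k_blocking(rho, B):
    """Blocking probability of an M/M/1/K queue with capacity B."""
    if rho == 1.0:
        return 1.0 / (B + 1)  # limit of the formula as rho -> 1
    return (rho ** B - rho ** (B + 1)) / (1.0 - rho ** (B + 1))


def handover_failure_probability(utilizations, B):
    """Mean blocking probability over the per-iteration uplink utilizations."""
    return sum(mm1k_blocking(rho, B) for rho in utilizations) / len(utilizations)


def average_vhos(vho_counts):
    """Mean number of vertical handovers per iteration."""
    return sum(vho_counts) / len(vho_counts)
```

For example, with an uplink utilization of 0.5 and B = 10 packets, `mm1k_blocking(0.5, 10)` evaluates to 1/2047, so a request is rarely rejected; the failure probability rises quickly as the utilization approaches 1.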

6 Results and discussion

Simulation results are presented and discussed in this section. The performance of the Q-VHO scheme is measured against that of I-VHO and D-VHO schemes in terms of average QoE, handover failure probability, and average number of vertical handovers (VHOs).

6.1 Average QoE comparison

We evaluate the impact of the uplink arrival rate λ (packets/s) with 10 RF-UE on the average QoE, as shown in Fig. 3a. When λ increases, the average QoE of all the schemes decreases because the UE can only send access and handover requests via the RF links: a higher RF-UE arrival rate fills the uplink queue of the RF AP and substantially increases the handover delay, which lowers the QoE. By contrast, in the proposed Q-VHO scheme, a handover is not executed if the request cannot yield the necessary QoE profit. In this situation, the UE stays on its preferred network, which results in a better QoE.

The impacts of the movement time duration t and the UE velocity v on the average QoE are shown in Figs. 3b and c, respectively. Initially, the UE connects to the VLC hotspot for a large QoE. As t or v increases, the UE starts to move in and out of the VLC coverage frequently, and the VLC link is occasionally blocked, leading to frequent handovers, which have a negative impact on the UE’s QoE. I-VHO has the worst QoE performance because a handover occurs immediately whenever the VLC link is blocked, potentially resulting in the ping-pong effect. The proposed Q-VHO, however, performs fewer handovers than the I-VHO and D-VHO schemes because it maximizes the sum of the handover rewards and refrains from a handover when the handover cost dominates the handover benefit. In that case, the UE tends to continue with the RF link to guarantee continuous service, despite the reduced QoE.

In addition, we analyze the impact of the pause time pt (the time for which the UE stops momentarily in a coverage area) on the average QoE of the UE, as shown in Fig. 3d. When pt increases from 2 to 10 s, the average QoE increases for all the schemes because the UE tends to remain in the preferred network and performs handovers less often. Therefore, the QoE profit tends to be larger, according to Eq. (10). As pt increases, there are fewer vertical handovers in our Q-VHO scheme, which helps to reduce the handover cost. Because the UE stays connected to better and more secure coverage for a longer time, its QoE increases.
Fig. 3

The impact of the a RF-UE arrival rate, b UE movement time duration, c UE velocity, and d UE pause time on the average QoE performance

6.2 Comparison of handover failure probability and average number of VHOs

We investigate the impact of pt (the time the UE momentarily stops in a given hotspot) on the handover failure probability, as shown in Fig. 4a. As pt increases from 2 to 10 s, the handover failure probability decreases from 0.05899 to 0.04513 for I-VHO, from 0.09786 to 0.08178 for D-VHO with t0=0.5 s, from 0.13136 to 0.12162 for D-VHO with t0=1 s, and from 0.04986 to 0.032131 for Q-VHO. As pt increases, the UE tends to stay in one coverage area for a longer period of time and remains connected to the same network. Accordingly, the number of vertical handover requests decreases, and so does the number of handover failures. Our scheme is adaptive to the UE’s movements and has the lowest probability of handover failure. I-VHO has the worst performance because it has the largest handover delay. The handover failure probability versus the number of RF-UE is shown in Fig. 4b. The handover delay h increases with an increasing number of RF-UE, and the probability of handover failure increases for all the schemes. A greater number of handovers leads to a higher RF arrival rate, which results in a higher uplink utilization. Unlike the I-VHO and D-VHO schemes, the proposed MDP-based Q-VHO scheme reduces the number of handovers as the handover delay cost increases. Consequently, the impact of the number of RF-UE on its failure probability is the smallest. The number of unsuccessful handovers tends to be large as more request packets arrive, causing a long delay at the uplink. This leads to an increase in the handover failure probability.
Fig. 4

The impact of the a pause time, b number of RF-UEs, c movement time duration, and d UE velocity on handover failure probability

The handover failure probability versus the movement time duration t is shown in Fig. 4c. For a given number of RF-UE, more handovers increase the activity of the uplink queue, which results in a longer handover delay and hence a higher likelihood of handover failure. In addition, the rate of blocking and recovery of VLC LOS links increases with increasing movement time duration. Since the mean blocking and non-blocking times affect the decision making in the Q-VHO scheme, there are fewer handovers when the movement duration is greater than 2 s. The number of VHOs increases between t=1 s and t=2 s since, in the Q-VHO scheme, the benefits of a handover dominate the handover cost in this range. Therefore, the handover failure probability for the Q-VHO scheme increases between t=1 s and t=2 s due to the increase in the utilization of the uplink queue. In all cases, however, the handover failure probability of the Q-VHO scheme is the smallest. Additionally, when t increases, the UE makes more transitions between VLC and RF in the I-VHO and D-VHO schemes. Consequently, the handover failure probability increases in the I-VHO and D-VHO schemes. However, for our Q-VHO scheme, the movement time duration is inversely proportional to the handover failure probability initially and remains stable when t>2 s because the Q-VHO reduces the number of handover requests when it is connected to a secured network, where the handover cost is greater than the handover benefit.

As the velocity increases from 0.3 to 0.7 m/s, the frequency of VLC channel fluctuation increases sharply. Hence, the I-VHO approach triggers the largest number of VHOs and results in the largest handover failure probability of the three VHO schemes, as shown in Fig. 4d. For D-VHO, the longer the waiting time is, the fewer the handovers. For Q-VHO, the number of handovers decreases as the velocity increases because the cost of a handover outweighs the benefits. Therefore, the handover failure probability decreases as the velocity increases. Our scheme outperforms the other schemes with respect to the failure probability.

An increase in the number of RF-UE may not have a significant effect on the number of vertical handovers unless the UE is making handover requests to the uplink queue. Therefore, the average number of VHOs depends strongly on the number of handover requests rather than on the number of UEs. As we increase the number of RF-UE, there is a proportional increase in the number of RF requests, which results in a higher handover delay h since the uplink queue length is larger. In contrast to the I-VHO and D-VHO schemes, the MDP-based Q-VHO scheme considers the handover delay cost before deciding to hand over. Therefore, the number of VHOs for Q-VHO is the smallest, as indicated in Fig. 5.
Fig. 5

The impact of the number of RF-UE on the average number of vertical handovers

7 Conclusion

In this article, we investigate a QoE-maximization-based VHO scheme for VLC-HetNet systems. The aim of our study is to find a solution that maximizes the QoE and reduces the handover cost in order to provide continuous transmission. By modeling the irregular blockage of the VLC link as an ON/OFF process, we formulate the VHO decision making as an MDP problem and propose a QoE optimization method. The simulation results and analysis show that the proposed scheme is adaptive to user movement, reduces the ping-pong effect, and achieves better performance than the D-VHO and I-VHO schemes in terms of average QoE, handover failure probability, and number of vertical handovers.




Abbreviations

4G: 4th generation
AP: Access point
B3G: Beyond 3rd generation
D-VHO: Dwell vertical handover
HetNet: Heterogeneous network
IEEE: Institute of electrical and electronics engineers
ITU: International telecommunication union
I-VHO: Immediate vertical handover
LTE: Long-term evolution
MAC: Medium access control
MDP: Markov decision process
MIH: Media independent handover
MOS: Mean opinion score
MT: Mobile terminal
OFDMA: Orthogonal frequency division multiple access
PSNR: Peak signal-to-noise ratio
QoE: Quality of experience
QoS: Quality of service
Q-VHO: QoE-maximization-based vertical handover
RF: Radio frequency
RSS: Received signal strength
SNR: Signal-to-noise ratio
UE: User equipment
VHDA: Vertical handover decision algorithm
VHO: Vertical handover
VLC: Visible light communication
WLAN: Wireless local area network



Not applicable.


This research was supported by the National Natural Science Foundation of China (Grant Nos. 61502210, 61772243, and 61701198) and the China Postdoctoral Science Foundation (Grant No. 2015M570484).

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article and its additional files.

Authors’ contributions

XB, WA, and AA developed the Q-VHO scheme, designed the system model, and carried out the simulation. All authors were involved in analyzing and interpreting results. XB and WA drafted the manuscript. AA, WZ, and JD revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

School of Computer Science and Communication Engineering, Jiangsu University, 301 Xuefu Road, Zhenjiang, 212013, China
Takoradi Technical University, Box 256, Takoradi, Ghana
School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, China


  1. National Telecommunications and Information Administration (NTIA) (2003). FCC frequency allocation chart. Accessed 1 May 2018.
  2. X. Bao, G. D. Yu, J. S. Dai, X. R. Zhu, Li-Fi light fidelity - a survey. Wirel. Netw. 21(6), 1879–1889 (2015).
  3. J. Hou, D. C. O’Brien, Vertical handover-decision-making algorithm using fuzzy logic for the integrated Radio-and-OW system. IEEE Trans. Wirel. Commun. 5, 176–185 (2006).
  4. F. Wang, Z. Wang, C. Qian, L. Dai, Z. Yang, Efficient vertical handover scheme for heterogeneous VLC-RF systems. J. Opt. Commun. Netw. 7, 1172–1180 (2015).
  5. X. Bao, X. R. Zhu, T. C. Song, Y. Q. Ou, Protocol design and capacity analysis in hybrid network of visible light communication and OFDMA systems. IEEE Trans. Veh. Technol. 63(4), 1770–1778 (2014).
  6. X. Bao, J. Dai, X. Zhu, Visible light communications heterogeneous network (VLC-HetNet): new model and protocols for mobile scenario. Wirel. Netw. 23(1), 299–309 (2017).
  7. L. Liu, L. Sun, E. Ifeachor, in Wireless and Mobile Computing, Networking and Communications (WiMob), 2015 IEEE 11th International Conference on. A QoE-driven vertical handover algorithm based on media independent handover framework (IEEE, 2015), pp. 51–58.
  8. A. Singhrova, N. Prakash, Vertical handoff decision algorithm for improved quality of service in heterogeneous wireless networks. IET Commun. 6, 211–223 (2012).
  9. Y. Kirsal, E. Ever, G. Mapp, O. Gemikonakli, in Advanced Information Networking and Applications (AINA), 2013 IEEE 27th International Conference on. Enhancing the modelling of vertical handover in integrated cellular/WLAN environments (IEEE, 2013), pp. 924–930.
  10. M. Bin, D. Hong, X. Xianzhong, L. Xiaofeng, An optimized vertical handoff algorithm based on Markov process in vehicle heterogeneous network. China Commun. 12, 106–116 (2015).
  11. A. M. Vegni, E. Natalizio, A hybrid (N/M) CHO soft/hard vertical handover technique for heterogeneous wireless networks. Ad Hoc Netw. 14, 51–70 (2014).
  12. E. Stevens-Navarro, Y. Lin, V. Wong, An MDP-based vertical handoff decision algorithm for heterogeneous wireless networks. IEEE Trans. Veh. Technol. 57(2), 1243–1254 (2008).
  13. S. Liang, H. Tian, B. Fan, R. Bai, in Vehicular Technology Conference (VTC Fall), 2015, IEEE 82nd. A novel vertical handover algorithm in a hybrid visible light communication and LTE system (IEEE, 2015), pp. 1–5.
  14. T. Wan, L. Luo-Kun, Z. Xia, J. Chun-Xiao, in Computational Intelligence and Communication Networks (CICN), 2016 8th International Conference on. A resource allocation algorithm combined with optical power dynamic allocation for indoor hybrid VLC and Wi-Fi network (IEEE, 2016), pp. 21–27.
  15. D. A. Basnayaka, H. Haas, in Vehicular Technology Conference (VTC Spring), 2015 IEEE 81st. Hybrid RF and VLC systems: improving user data rate performance of VLC systems (IEEE, 2015), pp. 1–5.
  16. T. Nguyen, Y. M. Jang, M. Z. Chowdhury, in Ubiquitous and Future Networks (ICUFN), 2013 Fifth International Conference on. A pre-scanning-based link switching scheme in visible light communication networks (IEEE, 2013), pp. 366–369.
  17. E. Dinc, O. Ergul, O. B. Akan, in Vehicular Technology Conference (VTC Fall), 2015 IEEE 82nd. Soft handover in OFDMA based visible light communication networks (IEEE, 2015), pp. 1–5.
  18. (Houghton Mifflin Harcourt Publishing Company, 2014). Retrieved 15 May 2018.
  19. R. C. Streijl, S. Winkler, D. S. Hands, Mean opinion score (MOS) revisited: methods and applications, limitations and alternatives. Multimedia Systems. 22, 213–227 (2016).
  20. A. Anandkumar, N. Michael, A. K. Tang, A. Swami, Distributed algorithms for learning and cognitive medium access with logarithmic regret. IEEE J. Sel. Areas Commun. 29, 731–745 (2011).
  21. A. B. Reis, J. Chakareski, A. Kassler, S. Sargento, in INFOCOM IEEE Conference on Computer Communications Workshops. Distortion optimized multi-service scheduling for next-generation wireless mesh networks (IEEE, 2010), pp. 1–6.
  22. T. V. Q. E. Group, Final report from the Video Quality Experts Group on the validation of objective models of video quality assessment, Phase II (FR-TV2), 32 (2003). Accessed 14 Nov 2018.
  23. C. A. Courcoubetis, A. Dimakis, M. I. Reiman, in INFOCOM 2001. Twentieth Joint Conference of the, IEEE Computer and Communications Societies. Proceedings, vol. 1. Providing bandwidth guarantees over a best-effort network: call-admission and pricing (IEEE, 2001), pp. 459–467.
  24. J. A. Bergstra, C. A. Middelburg, ITU-T Recommendation G.107 : The E-Model, a computational model for use in transmission planning. Fundamenta Informaticae. 61, 183–211 (2003).
  25. S. Sengupta, M. Chatterjee, S. Ganguly, Improving quality of VoIP streams over WiMax. IEEE Trans. Comput. 57, 145–156 (2008).
  26. R. Matos, N. Coutinho, C. Marques, S. Sargento, J. Chakareski, A. Kassler, in IEEE International Conference on Communications. Quality of experience-based routing in multi-service wireless mesh networks (IEEE, 2012), pp. 7060–7065.
  27. T. Inzerilli, A. M. Vegni, A. Neri, R. Cusani, in IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2008. WIMOB’08. A location-based vertical handover algorithm for limitation of the ping-pong effect (IEEE, 2008), pp. 385–389.
  28. I. Chadès, G. Chapron, M. J. Cros, F. Garcia, R. Sabbadin, MDPtoolbox: a multi-platform toolbox to solve stochastic dynamic programming problems. Ecography. 37, 916–920 (2014).
  29. J. Sztrik, Basic Queueing Theory (GlobeEdit, OmniScriptum GmbH & Co., Germany, 2016). Tian Lan.
  30. V. O. Li, Z. Lu, in Networking, Sensing and Control, 2004 IEEE International Conference on. Ad hoc network routing (IEEE, 2004), pp. 100–105.


© The Author(s) 2018