
A strategy for joint service offloading and scheduling in heterogeneous cloud radio access networks


Cloud radio access network is one of the most promising cellular networks for the next generation of mobile networks. The basic idea of cloud RAN (radio access network) is virtualizing and centralizing the intelligent part of the base station, the base band unit, and keeping remote radio heads on cell site enabling a centralized processing and management. Offloading data computation to edge cloud was proposed as a solution to deal with resource limitation while keeping a good quality of service. In this paper, we propose a strategy to jointly handle offloading decision and offloading request scheduling in cloud RAN. We aim to improve network quality of service while reducing the scheduling cost expressed in terms of overload, network delay, and migration cost. Numerical results show that the proposed approach is able to reduce the response time of the applications, mobile terminal energy consumption, and total execution cost.


With the success of smartphones, mobile developers are becoming more ingenious in creating sophisticated applications to attract users such as face recognition, interactive gaming, and augmented reality applications. However, these applications are resource-intensive, i.e., they require high processing and energy capabilities to be executed. Finding a tradeoff between limited resources and battery lifetime of mobile devices and resource-intensive applications is a big challenge for next-generation mobile platform development [1].

Mobile cloud computing was proposed as a solution to tackle this challenge [2]. Offloading all or part of an application workflow to a resource-rich cloud infrastructure helps to increase mobile device capabilities. However, computation offloading to a remote central cloud does not fully solve the problem, because mobile users may experience long latency for data exchange with the central cloud through the wide area network. Since it is very hard to reduce the latency in a wide area network, cloudlet-based mobile cloud computing was proposed as a solution [3]. The basic idea consists in leveraging physical proximity by offloading computation to nearby servers via a Wi-Fi access point. However, due to the limited coverage of Wi-Fi access points, services cannot be provided everywhere. Besides, cloudlets are based on servers with small or medium resources, which may not satisfy the QoS (quality of service) requirements of a large number of users.

In order to meet these challenges, mobile edge cloud computing was proposed as an innovative mobile cloud computing paradigm which complements the cloudlet concept [4]. The concept of mobile edge cloud computing is to provide cloud computing resources at the edge of radio access networks, close to mobile end users. This infrastructure reduces latency by deploying a fiber transport network between base stations and edge cloud data centers. Endowing base stations with additional computation and storage resources is expected to enhance mobile users’ QoS anywhere and at any time [5].

In a C-RAN (cloud radio access network) infrastructure, base band units are moved from the cell site to a central data center. With this approach, all the radio access network functionalities are centralized in the cloud. Centralized processing enables more advanced and efficient network coordination and management. In the cell site, the RRH (remote radio head) is still responsible for transmitting the radio signal, amplifying signal power, and analog to/from digital conversion, while all the base band signal processing parts, including the physical, MAC, and upper layers, which require higher processing resources, are relocated from the cell site to a centralized BBU (base band unit) in the operator cloud infrastructure. The interface between the numerous RRHs and the BBU is named CPRI (common public radio interface). This interface supports a bidirectional constant bit rate protocol that requires accurate synchronization and strict latency control. Other protocols have also been proposed for this interface, such as OBSAI (Open Base Station Architecture Initiative) and ORI (open radio equipment interface). The general C-RAN architecture is illustrated in Fig. 1.

Fig. 1 General C-RAN architecture

In this paper, we propose a novel strategy for data offloading in cloud RAN-based mobile edge cloud computing. We aim to fully leverage the potential of such infrastructure through joint offloading decision and offloading request scheduling in the edge cloud. One of the most critical issues impacting data offloading performance is task scheduling in cloud resources.

For example, when too many applications are offloaded to the same edge cloud and all of them are executed on the same container, that container becomes highly overloaded. In turn, this situation leads to high energy consumption and long application response time. Therefore, in order to efficiently benefit from data computation offloading, we need to address two key challenges:

  1. How to choose between local execution on the mobile device and offloading to the cloud?

  2. Once an application is offloaded to the cloud, how to schedule it among the available resources?

The rest of the paper is organized as follows: In the next section, different related works are discussed. In Section 3, we explain our strategy for joint offloading decision and offloading request scheduling. Then, in Section 4, we study the system performances and present simulation results. Finally, Section 5 concludes this paper.

Related work

Internet data traffic is increasing exponentially, especially the portion of traffic going through mobile networks. That is why mobile data computation offloading has become an important issue in cellular networks. As a result, various cloud offloading systems were proposed in the literature. MAUI (Mobile Assistance Using Infrastructure) [1] and ThinkAir [6] describe hardware components and propose to offload data in order to optimize energy consumption of mobile devices. However, they ignore other aspects of offloading. CloneCloud [7] proposes to improve application partitioning between the device and the cloud with the purpose of reducing energy consumption or execution time.

In addition to mobile cloud frameworks, many other research works have focused on the offloading decision-making issue in MCC (mobile cloud computing) based systems. However, most of them have focused on the issue of energy consumption without considering other parameters that can affect the offloading process. For example, the authors in [8] proposed an offloading decision mechanism based on a comparison of computation, communication, and compilation energies. The main objective of the proposed process is conserving energy on the mobile terminal. It consists in comparing the energy consumption values of different execution strategies and choosing the alternative which has the lowest energy cost. In [9], Liu et al. proposed an offloading decision algorithm based on application deadline and communication quality constraints. The proposed offloading decision process helped the cloud controller to choose tasks for offloading in order to minimize mobile handset energy consumption. In [10], the authors proposed CADA (context-aware offloading decision algorithm) in order to offload to the cloud servers. CADA uses a profiler containing the location of the mobile user and the time of day in order to make the mobile offloading decisions. However, this method generates a lot of overhead and requires a lot of memory in order to store users’ profiles. In [11], the authors proposed an energy-aware data offloading scheme for cloud RAN. The centralized BBU makes the offloading decision considering the mobile devices’ transmission rate and the energy consumption of both cellular and Wi-Fi networks. It benefits from the centralized characteristic of C-RAN in order to schedule mobile terminals’ computation offloading from RRH to Wi-Fi access point.
In [12], authors proposed to jointly optimize communication resources (downlink and uplink beamforming), computation resources (power and computation capability allocation), and offloading decision in cloud RAN to minimize the network energy consumption while satisfying applications’ delay requirements.

There is also a significant body of literature on task scheduling in the cloud, mostly focusing on traditional cloud systems. The authors in [13] proposed a task scheduling scheme to achieve cooperation between the local cloud and the Internet cloud. Applications are first classified according to their delay requirements before being served. However, with such an approach, most end users have to wait before being served and QoS decreases dramatically when the arrival rate increases. For this reason, the authors then designed a threshold-based policy to cooperatively schedule the local cloud and the Internet cloud. The objective is to maximize the probability that tasks can be executed within their delay requirements. In [14], a job scheduling scheme for cloud computing clusters that considers job resource requirements and completion time sensitivity is proposed. The problem is formulated as a maximization of the minimum utility achieved across all the jobs in the cluster, where the job utilities are functions of their completion times. In [15], the authors presented an online scheduling scheme that aims to minimize the average task queuing delay while accounting for task execution times. As a first step, upon the arrival of a new task, the scheduler tries to find an available server to which it can assign the newly submitted task. If the scheduler does not find any server for the task, the task is put in the queue. Then, the scheduler tries to find the task with the shortest execution time among the tasks that are already in the queue and fits it on the most recently released server. In another work [16], the authors proposed a task scheduling mechanism which takes deadline and cost into account. Based on the concept of a space-shared scheduling policy, this work presents a CDB (cost-deadline based) task scheduling algorithm that schedules tasks by taking into account task penalty and provider profit. Simulations show that if the number of virtual machines and datacenters decreases with the decrease of the number of cloudlets, the proposed algorithm misses deadlines. Cost-based scheduling using linear programming was also investigated in [17]. The authors proposed SAH-DB (a task scheduling algorithm based on a delay-bound constraint) in order to improve task execution concurrency: when a task is received, all the resources (CPU, memory, and network) are sorted in descending order of processing capacity, then the task is dispatched to the resources with the minimum execution time.

To summarize our state-of-the-art analysis, existing contributions in the area of mobile computation offloading optimization have mainly focused on energy consumption and latency, while contributions in the area of task scheduling have mainly focused on the jobs’ completion time. In this work, we propose a scheduling optimization mechanism that aims to reduce the cost of task scheduling. Unlike previous works, we model the cost of tasks as a function of overloading, network delay, and migration. The proposed resource management strategy mainly takes into account the available resources, resource requirements, deadlines, and load balancing in the Cloud-RRH. Offloading decision and task scheduling are usually addressed separately, and there are very few contributions addressing these two aspects in a holistic way. In this work, we propose to address the two problems at the same time through a joint optimization of offloading decision and application scheduling in the edge cloud, which from our viewpoint is necessary. In our previous contribution [18], we proposed a dynamic multi-parameter offloading decision scheme in order to adapt the offloading decision to the current network state. In this work, we significantly extend this contribution by proposing a global offloading strategy which combines offloading decision and task scheduling optimization. The originality of our algorithm is that its decision-making takes into consideration several parameters related to the network state, the MT mobility and capabilities, as well as the tasks to offload.

Computation offloading and task scheduling in H-CRAN

System model

Computation offloading system model

The scenario is depicted in Fig. 2. We consider an H-CRAN (heterogeneous cloud radio access network) composed of H-RRHs (high remote radio heads) that act as macro-cells and L-RRHs (low remote radio heads) that act as small cells. In our previous works, we proposed to add an edge cloud, the Cloud-RRH [19]. It represents additional cloud resources close to the mobile end user. While in the traditional cloud RAN architecture all RAN functionalities are centralized in the cloud, we proposed to flexibly split RAN functionalities. Besides, the Cloud-RRH contains additional computation and storage resources for data offloading. We also make the following assumptions: (i) mobile applications that are utilized by the mobile user are installed on the mobile device, on the cloud server, and also on the Cloud-RRH; (ii) even if the interface between the mobile terminal and the clouds may provide different rate and delay values, mobile broadband connectivity does not change during the application processing time. Note that the second assumption means that we consider applications with short processing times. This condition has also been assumed by most of the related works [5, 20,21,22,23].

Fig. 2 Proposed C-RAN architecture model

The system is composed of M mobile users that can be served by either a high or a low RRH. We first consider the uplink. The time \( {t}_{\mathrm{up}} \) needed to transfer \( {S}_{\mathrm{up}} \) bits over the UL (uplink) connection between the mobile device and the serving RRH depends only on the uplink data rate \( {r}_{\mathrm{up}} \) and the number of bits to be transmitted, i.e., \( {t}_{\mathrm{up}}={S}_{\mathrm{up}}/{r}_{\mathrm{up}} \). Similarly, for DL (downlink) transmission after remote processing, from the Cloud-RRH to the serving RRH, the time required is \( {t}_{\mathrm{dl}}={S}_{\mathrm{dl}}/{r}_{\mathrm{dl}} \), with \( {S}_{\mathrm{dl}} \) the number of bits to transmit and \( {r}_{\mathrm{dl}} \) the downlink data rate.

The measurements provided in [22] showed that the power consumed by the mobile device in UL increases with the uplink transmission power, \( {p}_{tx} \), while a baseline power is consumed just for having the transmission chain switched on, whereas the power consumed in DL increases with the downlink data rate, \( {r}_{\mathrm{dl}} \), and a baseline power is consumed just for having the reception chain switched on. Based on these results, we adopted the following models of power consumption at the mobile device in both UL and DL:

$$ {p}_{\mathrm{ul}}={k}_{\left( tx,1\right)}+{k}_{\left( tx,2\right)}{p}_{tx} \tag{1} $$
$$ {p}_{\mathrm{dl}}={k}_{\left( rx,1\right)}+{k}_{\left( rx,2\right)}{r}_{\mathrm{dl}} \tag{2} $$

where \( {k}_{\left( tx,1\right)} \), \( {k}_{\left( tx,2\right)} \), \( {k}_{\left( rx,1\right)} \), and \( {k}_{\left( rx,2\right)} \) are constants.

The maximum rate supported by the channel with M users depends on the quality of the channel and the transmission power. It is given using Shannon’s theorem by the following expressions in UL and DL:

$$ {r}_{\left(\mathrm{up},m\right)}=B\;{\log}_2\left(1+{G}_{\mathrm{up}}{p}_{\left( tx,m\right)}\right) \tag{3} $$
$$ {r}_{\left(\mathrm{dl},m\right)}=B\;{\log}_2\left(1+{G}_{\mathrm{dl}}{p}_{\left( tx,\mathrm{RRH}\right)}\right) \tag{4} $$

where \( {G}_{\mathrm{up}} \) and \( {G}_{\mathrm{dl}} \) are the channel gains normalized by the average power of the noise and interference over the bandwidth, in uplink and downlink respectively, \( {p}_{\left( tx,m\right)} \) and \( {p}_{\left( tx,\mathrm{RRH}\right)} \) represent the transmission power of the mobile user and of the RRH, respectively, and B is the channel bandwidth.
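The rate and transfer-time expressions can be illustrated with a minimal Python sketch. It assumes the logarithm is taken in base 2, which is what the expression for the transmission power derived from (3) implies. The function names and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def shannon_rate(bandwidth_hz, gain, tx_power):
    """Maximum rate r = B * log2(1 + G * p), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + gain * tx_power)


def transfer_time(num_bits, rate_bps):
    """Transfer time t = S / r, in seconds."""
    return num_bits / rate_bps


# Hypothetical values, not taken from the simulation parameters.
B = 1e6       # channel bandwidth (Hz)
G_up = 15.0   # normalized uplink channel gain (1/W)
p_tx = 1.0    # mobile transmission power (W)

r_up = shannon_rate(B, G_up, p_tx)   # log2(16) = 4, so 4e6 bit/s
t_up = transfer_time(8e6, r_up)      # 8 Mbit at 4 Mbit/s takes 2 s
```

With these values, doubling the bandwidth halves the uplink transfer time, whereas doubling the transmission power only adds a fraction of a bit per hertz.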

According to (1) and (2), the energy spent by the mobile device in UL and DL is given by the following equations:

$$ {E}_{\mathrm{up}}={k}_{\left( tx,1\right)}{t}_{\mathrm{up}}+{k}_{\left( tx,2\right)}{t}_{\mathrm{up}}{p}_{tx} \tag{5} $$
$$ {E}_{\mathrm{dl}}={k}_{\left( rx,1\right)}{t}_{\mathrm{dl}}+{k}_{\left( rx,2\right)}{t}_{\mathrm{dl}}{r}_{\mathrm{dl}} \tag{6} $$

Using (3), we can express \( {p}_{tx} \) as \( {p}_{tx}=\frac{2^{\frac{r_{\mathrm{up}}}{B}}-1}{G_{\mathrm{up}}} \). Therefore, the energy consumed by the mobile device for offloading is given by the following equation:

$$ {\displaystyle \begin{array}{l}{E}_{\mathrm{off}}={E}_{\mathrm{up}}+{E}_{\mathrm{dl}}\\ {}={k}_{\left( tx,1\right)}{t}_{\mathrm{up}}+{k}_{\left( tx,2\right)}{t}_{\mathrm{up}}\frac{2^{\frac{S_{\mathrm{up}}}{t_{\mathrm{up}}\cdot B}}-1}{G_{\mathrm{up}}}+{k}_{\left( rx,1\right)}{t}_{\mathrm{dl}}+{k}_{\left( rx,2\right)}{S}_{\mathrm{dl}}\end{array}} \tag{7} $$
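As an illustration, the offloading energy can be evaluated numerically as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the constants passed in the example are placeholders rather than the measured values from [22].

```python
def tx_power_for_rate(rate_bps, bandwidth_hz, gain):
    """Invert r = B * log2(1 + G * p) to get p = (2**(r / B) - 1) / G."""
    return (2.0 ** (rate_bps / bandwidth_hz) - 1.0) / gain


def offloading_energy(s_up, t_up, s_dl, t_dl, bandwidth_hz, gain_up,
                      k_tx1, k_tx2, k_rx1, k_rx2):
    """E_off = E_up + E_dl for s_up uplink bits and s_dl downlink bits."""
    # The uplink rate is s_up / t_up, which fixes the transmission power.
    p_tx = tx_power_for_rate(s_up / t_up, bandwidth_hz, gain_up)
    e_up = k_tx1 * t_up + k_tx2 * t_up * p_tx
    # In the downlink term, t_dl * r_dl reduces to s_dl.
    e_dl = k_rx1 * t_dl + k_rx2 * s_dl
    return e_up + e_dl


# Placeholder constants, for illustration only.
e_off = offloading_energy(s_up=8e6, t_up=2.0, s_dl=1e6, t_dl=0.5,
                          bandwidth_hz=1e6, gain_up=15.0,
                          k_tx1=0.4, k_tx2=18.0, k_rx1=0.4, k_rx2=2.0e-8)
# e_up = 0.4*2 + 18*2*1 = 36.8 and e_dl = 0.4*0.5 + 0.02 = 0.22
```

Because the transmission power grows exponentially with the required uplink rate, shortening the upload time for a fixed payload quickly increases the uplink energy term.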

The energy spent by the mobile device in the local processing is considered to be proportional to the number of processed bits. It is given by the following equation:

$$ {E}_{\mathrm{loc}}={\varepsilon}_0S \tag{8} $$

where \( {\varepsilon}_0 \) is a constant that accounts jointly for the Joules/cycle and cycles/bit at the mobile device processor and S is the total number of bits.

Concerning the latency, we consider \( {t}_0 \) as the time needed to process one bit at the mobile device and \( {t}_1 \) as the time needed to process one bit at the RRH. Therefore, the time needed for local execution is given by multiplying \( {t}_0 \) by the number of bits. If the execution is offloaded to the Cloud-RRH, the time required for processing is given by the sum of the time required to transfer the bits from the mobile device to the serving RRH through the UL transport network, the time for the remote cloud to execute the offloaded computation, and the time to transfer all the output bits through the DL. The latencies for local processing and offloading are expressed by the following equations:

$$ {L}_{\mathrm{loc}}={t}_0S \tag{9} $$
$$ {L}_{\mathrm{off}}={t}_{\mathrm{up}}+{t}_1S+{t}_{\mathrm{dl}} \tag{10} $$
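As a short worked step, the two latency expressions can be combined into a break-even condition. Substituting the transfer times defined earlier, offloading is faster than local execution when

$$ {L}_{\mathrm{off}}<{L}_{\mathrm{loc}}\kern1em \Longleftrightarrow \kern1em \frac{S_{\mathrm{up}}}{r_{\mathrm{up}}}+\frac{S_{\mathrm{dl}}}{r_{\mathrm{dl}}}<\left({t}_0-{t}_1\right)S $$

In other words, offloading can only reduce latency if the remote per-bit processing time is smaller than the local one, and if the total transfer time stays below the processing time saved.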

Task scheduling system model

We assume that N predefined containers are running on each Cloud-RRH and each container \( i\in N \) is characterized by its available resource capacities \( {\mathrm{CPU}}_i \), \( {\mathrm{RAM}}_i \), and \( {\mathrm{Net}}_i \). Each offloading request is composed of T tasks that have to be executed within a deadline D, and each task \( j\in T \) is characterized by its requirements \( {\mathrm{CPU}}_j \), \( {\mathrm{RAM}}_j \), and \( {\mathrm{Net}}_j \) and has an expected execution time \( {\mathrm{Tex}}_j \). We consider a binary variable \( {t}_{\left(i,j\right)} \) to indicate whether a task j is allocated to a container i or not:

$$ {t}_{\left(i,j\right)}=\left\{\begin{array}{c}1\kern4.50em \mathrm{if}\ \mathrm{task}\ j\ \mathrm{is}\ \mathrm{allocated}\ \mathrm{to}\ \mathrm{container}\ i\\ {}0\kern14.00em \mathrm{otherwise}\end{array}\right. $$

A cost C is associated with each container-task allocation. Its value depends on whether the container is overloaded after the task execution and on whether a task migration was necessary due to end user mobility. Energy consumption cost is not considered in this work. The considered costs are detailed in the following:

Overload cost

We denote by \( C\_{\mathrm{cap}}_i \) the computational capacity of container i at time t:

$$ C\_{\mathrm{cap}}_i=\left(\begin{array}{l}C\_{\mathrm{cap}}_i^{\mathrm{CPU}}\\ {}C\_{\mathrm{cap}}_i^{\mathrm{RAM}}\\ {}C\_{\mathrm{cap}}_i^{\mathrm{Net}}\end{array}\right) $$

\( C\_{\mathrm{ut}}_{j,i} \) is the average resource utilization of task j on container i:

$$ C\_{\mathrm{ut}}_{j,i}=\left(\begin{array}{l}C\_{\mathrm{ut}}_{j,i}^{\mathrm{CPU}}\\ {}C\_{\mathrm{ut}}_{j,i}^{\mathrm{RAM}}\\ {}C\_{\mathrm{ut}}_{j,i}^{\mathrm{Net}}\end{array}\right) $$

For each new task j to be executed in container i, we express the utilization rate \( {\mu}_i \) of container i corresponding to this system configuration as the ratio of the summed average resource utilization of the tasks allocated to container i to the computational capacity (CPU, RAM, network) of the container:

$$ {\mu}_i=\left(\begin{array}{l}{\mu}_i^{\mathrm{CPU}}=\frac{\sum \limits_{j=1}^T{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{CPU}}}{C\_{\mathrm{cap}}_i^{\mathrm{CPU}}}\\ {}{\mu}_i^{\mathrm{RAM}}=\frac{\sum \limits_{j=1}^T{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{RAM}}}{C\_{\mathrm{cap}}_i^{\mathrm{RAM}}}\\ {}{\mu}_i^{\mathrm{Net}}=\frac{\sum \limits_{j=1}^T{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{Net}}}{C\_{\mathrm{cap}}_i^{\mathrm{Net}}}\end{array}\right) $$

If \( \operatorname{Max}\;\left({\mu}_i^{\mathrm{CPU}},{\mu}_i^{\mathrm{RAM}},{\mu}_i^{\mathrm{Net}}\right)>1 \), the container is considered to be overloaded. A penalty is applied when a task j is allocated to an overloaded container. We assume it to be positively proportional to the level of overloading. The overload cost metric \( \mathrm{ov}\_{\mathrm{cost}}_i \) of a container i is defined as follows:

$$ \mathrm{ov}\_{\mathrm{cost}}_i=\left\{\begin{array}{l}{\left({\mu}_i-1\right)}^{\lambda}\kern1em \mathrm{if}\kern0.5em \operatorname{Max}\ \left({\mu}_i^{\mathrm{CPU}},{\mu}_i^{\mathrm{RAM}},{\mu}_i^{\mathrm{Net}}\right)>1\\ {}0\kern3.5em \mathrm{otherwise}\end{array}\right. $$

The exponent λ accentuates ov_cost when the container approaches its saturation state. Indeed, the closer the container gets to its maximum capacity, the more ov_cost increases; therefore, the algorithm chooses another, less loaded container to execute the tasks and avoids overloading.

The overall overload cost for the Cloud-RRH system to execute all the offloading request tasks can be calculated using this expression:

$$ \mathrm{ov}\_\mathrm{cost}=\sum \limits_{i=1}^N\mathrm{ov}\_{\mathrm{cost}}_i,N:\mathrm{set}\ \mathrm{of}\ \mathrm{containers} $$
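The overload cost can be sketched in a few lines of Python. Since the utilization rate is a vector while the penalty is a scalar, this sketch assumes that the penalty is computed from the most loaded resource, i.e., the maximum of the three utilization rates. That reading, the value of λ, and the container data are assumptions made for illustration.

```python
def utilization(assigned_demands, capacity):
    """Per-resource utilization (CPU, RAM, Net) of one container.

    assigned_demands: list of (cpu, ram, net) demands of the allocated tasks.
    capacity: (cpu, ram, net) capacity of the container.
    """
    return tuple(
        sum(demand[k] for demand in assigned_demands) / capacity[k]
        for k in range(3)
    )


def overload_cost(mu, lam=2.0):
    """(max(mu) - 1) ** lam if the container is overloaded, else 0."""
    peak = max(mu)  # assumption: the most loaded resource drives the penalty
    return (peak - 1.0) ** lam if peak > 1.0 else 0.0


# Hypothetical containers: capacities and the demands of their tasks.
containers = {
    "c1": {"capacity": (100.0, 200.0, 2.0),
           "tasks": [(60.0, 50.0, 0.5), (90.0, 40.0, 0.5)]},
    "c2": {"capacity": (50.0, 100.0, 1.0),
           "tasks": [(20.0, 30.0, 0.25)]},
}

# Overall overload cost: sum of the per-container costs.
ov_cost = sum(
    overload_cost(utilization(c["tasks"], c["capacity"]))
    for c in containers.values()
)
# c1 has a CPU utilization of 1.5, giving (1.5 - 1)**2 = 0.25; c2 is not overloaded.
```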

Network delay cost

The network delay is caused by processing, queuing, transmission, and propagation delays. It causes performance degradation to system users. We associate a per-unit-of-time delay cost \( {d}_{i,j} \) with the allocation of task j on container i. The overall network delay cost is:

$$ \mathrm{nd}\_\mathrm{cost}={\sum}_i{\sum}_j{t}_{\left(i,j\right)}{d}_{i,j} $$

Migration cost

A task can be migrated when the corresponding mobile user is moving from one cell to another. If a user task j is migrated from one container to another, a penalty \( {r}_j \) is applied in order to capture the service downtime incurred by this migration. The overall migration cost is defined as follows:

$$ \mathrm{mig}\_\mathrm{cost}=\sum \limits_i\sum \limits_j{t}_{\left(i,j\right)}{r}_j $$

Note that we only consider here a migration of the tasks in the same Cloud-RRH and that migration penalty only depends on the type of task.

Joint offloading decision and task scheduling mechanism

In this section, we will present our strategy for joint offloading and tasks scheduling in 5G cloud radio access networks. The global idea is represented in Fig. 3. When a task is received, we will run an offloading decision algorithm considering a multitude of parameters about the mobile device capabilities, user mobility, and network status. If the decision is offloading to the edge cloud, a task scheduling mechanism is launched in order to improve resource utilization while reducing the execution cost.

Fig. 3 Global approach

The offloading mechanism is represented in Fig. 4. After receiving a task, the mobile terminal (MT) starts by generating an offloading request packet and sends it to the serving RRH. The offloading request packet has the following structure:

- Offloading Req (service ID, \( {C}_{MT} \), C, \( {E}_{\mathrm{loc}} \), B), where \( {C}_{MT} \) is the capacity of the mobile terminal, C is the capacity required by the received application, \( {E}_{\mathrm{loc}} \) is the energy that will be spent for local execution, and B is the bandwidth between the serving RRH and the mobile device.

Fig. 4 Offloading mechanism interactions

The offloading request is then sent to the corresponding Cloud-RRH, where the container’s manager (CM) decides whether to offload using the offloading decision algorithm detailed below. If the application is offloaded to the Cloud-RRH, the CM sends a Resource Allocation Request packet to the serving container, in which it indicates the capacity required for processing. The Offloading Response is routed back to the mobile device after execution.

We propose a multi-parameter offloading decision algorithm which indicates where the application should be processed: locally on the mobile device, in the edge cloud (Cloud-RRH), or in central cloud (BBU pools). We aim to enhance the end user quality of experience (QoE) while improving the network and MT resource utilization. The decision algorithm workflow is represented in Fig. 5.

Fig. 5 Proposed offloading decision algorithm

When a task is received, we start by comparing the mobile device velocity to a velocity threshold (\( {V}_{th} \)): if the MT velocity is higher, the task is executed locally. The main motivation for taking the velocity into account in the offloading decision is to prevent mobile terminals moving at high speed from offloading tasks to the cloud, which may degrade the quality of service because of degraded communication (e.g., handover) and the risk that tasks migrate too often between the access Cloud-RRHs. If the velocity is below the threshold, we compare the latency due to local execution with the latency due to offloading using Eqs. (9) and (10). If the latency generated by offloading the task to the Cloud-RRH is greater than the latency generated by local processing, the task should be executed locally at the mobile terminal. However, since mobile terminal computation resources are limited, if the computational capacity required is greater than a predefined percentage of the total locally available capacity, the task is offloaded to the Cloud-RRH.

Otherwise, if the latency generated by offloading is lower, we compare the energy consumed by the mobile terminal in the case of offloading, using Eq. (7), to the energy consumed in the case of local execution, using Eq. (8). If \( {E}_{\mathrm{off}} \) is higher than \( {E}_{\mathrm{loc}} \), we compare the latency generated by local computation to the maximum latency authorized by the application. If \( {L}_{\mathrm{loc}}<{L}_{\mathrm{max}} \), the task is executed locally; if not, the task is offloaded to the Cloud-RRH.

Finally, if \( {E}_{\mathrm{off}}<{E}_{\mathrm{loc}} \), we test the channel conditions. We can use the Shannon theorem to calculate the channel capacity. We consider that the channel gain encompasses path loss, slow fading, and fast fading. Then, the channel coefficient is compared to the average channel coefficient calculated and updated over time. The channel is considered to be in a relatively “good” state if the current channel realization is above this average; in that case, the task is offloaded. Otherwise, the task is executed locally. This prevents the system from applying costly offloading when channel conditions are not favorable.
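The decision flow described above can be summarized by the following Python sketch. It only covers the choice between local execution and offloading to the Cloud-RRH, as in the steps described in the text. The function name, the parameter names, and the handling of ties (for example equal energies) are assumptions, since the text does not specify them.

```python
def offloading_decision(velocity, v_th, l_loc, l_off, l_max, e_loc, e_off,
                        required_capacity, local_capacity, capacity_fraction,
                        channel_gain, avg_channel_gain):
    """Return "local" or "offload" following the described decision steps."""
    # Step 1: fast-moving terminals execute locally.
    if velocity > v_th:
        return "local"
    # Step 2: offloading is slower than local execution.
    if l_off > l_loc:
        # Offload anyway if the task exceeds the usable share of local capacity.
        if required_capacity > capacity_fraction * local_capacity:
            return "offload"
        return "local"
    # Step 3: offloading is not slower, but costs more energy.
    if e_off > e_loc:
        return "local" if l_loc < l_max else "offload"
    # Step 4: offloading also saves energy, so check the channel state.
    return "offload" if channel_gain > avg_channel_gain else "local"


# Hypothetical inputs: a slow terminal, faster and cheaper offloading,
# and a channel that is better than its running average.
decision = offloading_decision(
    velocity=10.0, v_th=20.0, l_loc=3.0, l_off=1.0, l_max=2.0,
    e_loc=4.0, e_off=3.0, required_capacity=30.0, local_capacity=100.0,
    capacity_fraction=0.5, channel_gain=1.2, avg_channel_gain=1.0)
```

Each test returns as soon as a decision is reached, so the cheaper checks (velocity, latency) are evaluated before the channel state is needed.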

When a task is offloaded in the edge cloud, the container’s manager will decide in which container the application will be processed. A container is characterized by a triplet of allocated resources (CPU, RAM, and network bandwidth). Each offloading request is considered as a set of tasks to instantiate in the Cloud-RRH. Each task has a delay constraint and is characterized by its resource requirements in terms of CPU, RAM, and network bandwidth.

The task scheduler must be carefully designed, based on the available resources and the task requirements, in order to find the most suitable container for application task offloading: the one that minimizes the total cost while respecting load balancing between containers in the same Cloud-RRH. The total execution cost is expressed as follows:

$$ C=\alpha \cdot \mathrm{ov}\_\mathrm{cost}+\beta \cdot \mathrm{nd}\_\mathrm{cost}+\delta \cdot \mathrm{mig}\_\mathrm{cost} $$

α, β, and δ are weights that set the relative importance of each cost to optimize. If the weights are equal, there is no preference for one cost over the others; otherwise, the cost that is assigned the highest weight has the highest priority in the optimization process.
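The weighted sum can be illustrated with a small Python sketch. The allocation matrix, the delay costs, and the migration penalties below are hypothetical. The sketch also assumes that the migration penalty of a task is set to zero when the task is not migrated, and that the overload cost has been computed beforehand.

```python
def network_delay_cost(alloc, delay):
    """Sum of t[i][j] * d[i][j] over containers i and tasks j."""
    return sum(alloc[i][j] * delay[i][j]
               for i in range(len(alloc)) for j in range(len(alloc[i])))


def migration_cost(alloc, penalty):
    """Sum of t[i][j] * r[j] over containers i and tasks j."""
    return sum(alloc[i][j] * penalty[j]
               for i in range(len(alloc)) for j in range(len(alloc[i])))


def total_cost(ov, nd, mig, alpha=1.0, beta=1.0, delta=1.0):
    """C = alpha * ov_cost + beta * nd_cost + delta * mig_cost."""
    return alpha * ov + beta * nd + delta * mig


# Two containers (rows) and three tasks (columns), hypothetical values.
alloc = [[1, 0, 1],
         [0, 1, 0]]
delay = [[2.0, 3.0, 1.0],
         [4.0, 1.5, 2.0]]
penalty = [0.5, 0.0, 1.0]   # assumption: 0 for tasks that are not migrated

nd = network_delay_cost(alloc, delay)   # 2.0 + 1.0 + 1.5 = 4.5
mig = migration_cost(alloc, penalty)    # 0.5 + 1.0 + 0.0 = 1.5
c_equal = total_cost(0.25, nd, mig)     # equal weights: 6.25
c_overload_first = total_cost(0.25, nd, mig, alpha=2.0)   # 6.5
```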

Therefore, the objective is to minimize the total cost of overloading, network delay, and migration of the entire system when executing all the submitted offloading requests.

Objective function

$$ \operatorname{Minimize}\kern1em C $$

Subject to

$$ \sum \limits_j{t}_{\left(i,j\right)}{\mathrm{Tex}}_j\le D \tag{17} $$
$$ \left\{\begin{array}{l}\sum \limits_j{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{CPU}}\le {\mathrm{CPU}}_i\\ {}\sum \limits_j{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{RAM}}\le {\mathrm{RAM}}_i\\ {}\sum \limits_j{t}_{\left(i,j\right)}C\_{\mathrm{ut}}_{j,i}^{\mathrm{Net}}\le {\mathrm{Net}}_i\end{array}\right. \tag{18} $$
$$ \frac{\left|{\mu}_i-\frac{\sum \limits_{k=1}^N{\mu}_k}{N}\right|}{\mu_i}\le \varepsilon \tag{19} $$
$$ \sum \limits_i{t}_{\left(i,j\right)}=1\kern1em \forall j \tag{20} $$

The optimization is subject to the constraints given by (17) through (20). The deadline constraint is expressed by Eq. (17). It guarantees that each offloading request is executed before the application’s deadline. Equation (18) expresses the resource constraint. It enforces that the container resources are at least equal to the requirements of all the tasks allocated to it, in terms of CPU, amount of memory, and network bandwidth. Constraint (19) ensures load balancing between containers in the same Cloud-RRH, where ε denotes the maximum tolerance of load balancing. Finally, constraint (20) guarantees that each task is scheduled on only one container.

We first consider that there is no preference between the different cost terms, i.e., α = β = δ = 1. Tasks may be executed in parallel, but the deadline constraint D is set for the worst case, in which all tasks are executed serially.

This is a MIP (mixed-integer programming) problem, and we solve it as a linear program since the objective function is linear in all variables.
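To make the structure of the problem concrete, the following Python sketch solves a tiny instance by exhaustive search. It is not the solver used in this work: it simply enumerates every assignment, keeps those that satisfy the deadline, resource, and single-container constraints, and returns the one with the lowest network delay cost. The overload and migration terms and the load balancing constraint are left out for brevity, and all names and values are hypothetical.

```python
from itertools import product


def brute_force_schedule(demands, exec_times, capacities, delay, deadline):
    """Exhaustively search task-to-container assignments.

    demands[j]    : (cpu, ram, net) required by task j
    exec_times[j] : expected execution time of task j
    capacities[i] : (cpu, ram, net) available on container i
    delay[i][j]   : delay cost of running task j on container i
    Returns (best_cost, assignment), where assignment[j] is the container of
    task j, or (None, None) if no feasible assignment exists.
    """
    n, t = len(capacities), len(demands)
    best_cost, best = None, None
    # Each tuple places every task on exactly one container.
    for assignment in product(range(n), repeat=t):
        feasible = True
        for i in range(n):
            on_i = [j for j in range(t) if assignment[j] == i]
            # Deadline: worst case, tasks on a container run serially.
            if sum(exec_times[j] for j in on_i) > deadline:
                feasible = False
                break
            # Resources: summed demand must fit every resource of container i.
            if any(sum(demands[j][k] for j in on_i) > capacities[i][k]
                   for k in range(3)):
                feasible = False
                break
        if not feasible:
            continue
        cost = sum(delay[assignment[j]][j] for j in range(t))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, assignment
    return best_cost, best


capacities = [(100.0, 200.0, 2.0), (80.0, 100.0, 1.0)]
demands = [(60.0, 50.0, 0.5), (50.0, 40.0, 0.5), (20.0, 30.0, 0.25)]
exec_times = [2.0, 3.0, 1.0]
delay = [[1.0, 1.0, 3.0],
         [2.0, 4.0, 1.0]]

cost, assignment = brute_force_schedule(demands, exec_times, capacities,
                                        delay, deadline=5.0)
# The cheapest unconstrained choice (0, 0, 1) exceeds the CPU of container 0,
# so the search returns assignment (1, 0, 1) with cost 4.0.
```

The number of assignments grows as the number of containers raised to the number of tasks, so this enumeration is only meant for very small instances; it mainly shows how the constraints prune the search space.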

Performance evaluation

In order to test our resource management scheme, we considered an urban environment simulation scenario. Our heterogeneous C-RAN infrastructure is composed of seven H-RRHs and four L-RRHs per cell. H-RRHs have a coverage radius of 500 m and L-RRHs of 30 m [23]. We have aligned the values of \( {\varepsilon}_0 \) and \( {t}_0 \) with the measurements given in [24] for the energy and frequency characteristics of local computing in commercial mobile handsets, as well as the computation-to-data ratios of practical applications. As in an urban area, we have considered users with random velocities from 3 to 120 km/h.

In the Cloud-RRH, containers have a computing capacity from 25 to 100. The memory varies from 100 to 200 KB and the network bandwidth is set from 1 to 2 Kbps. The number of tasks T varies from 20 to 140, and the tasks have heterogeneous requirements. To evaluate our cost-based task scheduling scheme, we compare its performance to that of the SAH-DB mechanism. System simulation parameters are listed in Table 1.

Table 1 Simulation parameters

To evaluate the performance of our offloading decision algorithm, we compare it to the following algorithms:

  • No offloading: all tasks are executed locally on the mobile handset

  • Total offloading: all tasks are offloaded to the Cloud-RRH

  • SM-POD [25]: task offloading is based on a series of successive classifications considering mobile terminal capacities, task characteristics, or communication channel state. As a first step, tasks are classified as offloadable or not regarding their characteristics. Then, offloadable and non-offloadable tasks are divided into urgent and not urgent tasks based on their latency requirements. The third step of the decision algorithm concerns only offloadable urgent tasks. It consists in checking resources. If there is a lack of resources in the mobile terminal, the task should be offloaded. Therefore, tasks are divided into offloadable urgent tasks that should be offloaded (SOffUrg) in priority and offloadable urgent tasks that could be either offloaded or executed locally on the mobile terminal (COffUrg) depending on the available resources in the terminal. In the fourth step, energy spent for local computation and energy consumed for offloading are compared in order to determine COffUrg tasks that have to be offloaded. Finally, channel state is checked in order to determine if offloadable non-urgent tasks have to be offloaded or deferred.

To evaluate the impact of user speed on the quality of service, we tested our system performance under varying velocity values. Figure 6 shows the application response time and MT energy consumption as a function of mobile terminal speed (ranging from 5 to 30 km/h). We considered an application data size of 100 Mb. The QoS degrades when user velocity exceeds 15 km/h. Therefore, to prevent QoS degradation, the velocity threshold should be set between 15 and 20 km/h; beyond it, offloading generates substantial overhead, leading to longer response times and higher energy consumption.

Fig. 6 QoS with different user speeds

Figure 7 illustrates the variation of the response time with the data size (ranging from 1 to 100 Mb). The proposed offloading scheme improves the user experience by reducing the response time. For small data sizes, no offloading performs best because the mobile terminal capacity is sufficient to satisfy the application requirements. However, as the data size grows, the gap between the schemes becomes larger. Thanks to the introduction of the cloud edge and to network flexibility, the proposed scheme has the lowest response time, especially for large data sizes. Thus, the proposed offloading decision algorithm can be useful for applications with high resource demands.

Fig. 7 Application response time evaluation

Figure 8 shows the simulation results for the energy spent by the MT as a function of the data size (ranging from 1 to 100 Mb). The proposed offloading decision algorithm makes the mobile handset consume less energy, and the difference is more pronounced when the data size is large. Therefore, compared with local execution and total offloading, the proposed algorithm is able to extend the mobile handset battery lifetime while executing complex applications.

Fig. 8 Mobile terminal energy consumption evaluation

For tasks offloaded to the Cloud-RRH, we evaluated the scheduling efficiency in terms of execution cost under a varying number of associated tasks. Figure 9 shows the execution cost obtained with the proposed cost-based scheduling scheme and with the SAH-DB scheduling algorithm, with 25 to 100 cloud containers. The proposed scheduling algorithm reduces the total execution cost compared with the SAH-DB algorithm for every number of associated tasks considered. The total scheduling cost decreases as the number of resources increases, and it increases with the number of associated tasks.

Fig. 9 The execution cost with different resources
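
To make the notion of execution cost concrete, the sketch below scores task-to-container assignments by overload, network delay, and migration, and picks the cheapest one. It is a toy under stated assumptions, not the paper's formulation: the paper solves a linear programming problem, whereas this sketch substitutes an exhaustive search that is only practical for tiny instances, and the unweighted sum of the three terms, the per-container delay model, the unit migration cost, and all names are hypothetical.

```python
from itertools import product


def assignment_cost(assignment, demand, capacity, delay,
                    previous=None, migration_cost=1.0):
    """Cost of placing task i on container assignment[i].

    Assumed model: cost = overload + network delay + migration (plain sum).
    """
    load = [0.0] * len(capacity)
    for task, container in enumerate(assignment):
        load[container] += demand[task]
    # Overload: demand placed on a container beyond its capacity.
    overload = sum(max(0.0, l - cap) for l, cap in zip(load, capacity))
    # Network delay: one delay term per task, set by its container.
    net_delay = sum(delay[container] for container in assignment)
    # Migration: tasks that moved relative to a previous placement, if any.
    moved = 0
    if previous is not None:
        moved = sum(1 for old, new in zip(previous, assignment) if old != new)
    return overload + net_delay + migration_cost * moved


def best_assignment(demand, capacity, delay, previous=None):
    """Exhaustively search every assignment and return a cheapest one."""
    candidates = product(range(len(capacity)), repeat=len(demand))
    return min(
        candidates,
        key=lambda a: assignment_cost(a, demand, capacity, delay, previous),
    )


demand = [30.0, 40.0, 50.0]  # hypothetical task demands
capacity = [60.0, 100.0]     # hypothetical container capacities
delay = [1.0, 2.0]           # hypothetical per-container network delays
choice = best_assignment(demand, capacity, delay)
print(choice, assignment_cost(choice, demand, capacity, delay))
```

Even in this toy model, adding containers can only lower or preserve the minimum cost, while adding tasks adds delay terms and load, which is consistent with the trends reported for Fig. 9.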


Data computation offloading enables intelligent mobile devices to run resource-hungry applications. However, fully exploiting the benefits of computation offloading requires addressing two key challenges: the offloading decision, and the scheduling of offloading requests among the available cloud resources [18]. The main contribution of this paper is a complete offloading strategy for H-CRAN, composed of offloading decision and task scheduling, that aims to improve network performance and user QoE. To this end, we jointly handle the offloading decision and offloading request scheduling in the Cloud-RRH. First, we proposed a dynamic multi-parameter offloading decision scheme that adapts the offloading decision to the current network state and application characteristics. Then, we formulated the scheduling mechanism as a linear programming optimization problem that aims to reduce the total execution cost, expressed in terms of overload, network delay, and migration. Simulation results show that the proposed approach improves QoS by reducing application response time and mobile device energy consumption while decreasing the total execution cost in the Cloud-RRH.


  1. E Cuervo, A Balasubramanian, D Cho, A Wolman, S Saroiu, R Chandra, P Bahl, in Proceedings of the 8th International Conference on Mobile Systems, Applications, and Services, New York, NY, USA. MAUI: making smartphones last longer with code offload (2010), pp. 49–62

  2. N Chalaemwongwan, W Kurutach, in 2016 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON). Mobile cloud computing: a survey and propose solution framework (2016), pp. 1–4

  3. M Satyanarayanan, P Bahl, R Caceres, N Davies, The case for VM-based cloudlets in mobile computing. IEEE Pervasive Comput. 8(4), 14–23 (2009)

  4. “Mobile-Edge Computing–Introductory Technical White Paper,” Sep-2014. [Online]. Available: Accessed 16 Mar 2017

  5. S Barbarossa, S Sardellitti, PD Lorenzo, in 2013 IEEE 14th Workshop on Signal Processing Advances in Wireless Communications (SPAWC). Joint allocation of computation and communication resources in multiuser mobile cloud computing (2013), pp. 26–30

  6. S Kosta, A Aucinas, P Hui, R Mortier, X Zhang, in 2012 Proceedings IEEE INFOCOM. ThinkAir: dynamic resource allocation and parallel execution in the cloud for mobile code offloading (2012), pp. 945–953

  7. B-G Chun, S Ihm, P Maniatis, M Naik, A Patti, in Proceedings of the Sixth Conference on Computer Systems, New York, NY, USA. CloneCloud: elastic execution between mobile device and cloud (2011), pp. 301–314

  8. G Chen, BT Kang, M Kandemir, N Vijaykrishnan, MJ Irwin, R Chandramouli, Studying energy trade offs in offloading computation/compilation in Java-enabled mobile devices. IEEE Trans. Parallel Distrib. Syst. 15(9), 795–809 (2004)

  9. K Liu, J Peng, X Zhang, Z Huang, in 2016 IEEE Global Communications Conference (GLOBECOM). A combinatorial optimization for energy-efficient mobile cloud offloading over cellular networks (2016), pp. 1–6

  10. T-Y Lin, T-A Lin, C-H Hsu, C-T King, in 2013 IEEE Wireless Communications and Networking Conference Workshops (WCNCW). Context-aware decision engine for mobile cloud offloading (2013), pp. 111–116

  11. YS Chen, CS Hsu, TY Juang, HH Lin, in 2015 IEEE Wireless Communications and Networking Conference (WCNC). An energy-aware data offloading scheme in cloud radio access networks (2015), pp. 1984–1989

  12. J Cheng, Y Shi, B Bai, W Chen, in 2016 IEEE International Conference on Communications (ICC). Computation offloading in cloud-RAN based mobile cloud computing system (2016), pp. 1–6

  13. T Zhao, S Zhou, X Guo, Y Zhao, Z Niu, in 2015 IEEE Globecom Workshops (GC Wkshps). A cooperative scheduling scheme of local cloud and Internet cloud for delay-aware mobile cloud computing (2015), pp. 1–6

  14. Z Huang, B Balasubramanian, M Wang, T Lan, M Chiang, DHK Tsang, in 2015 IEEE Conference on Computer Communications (INFOCOM). Need for speed: CORA scheduler for optimizing completion-times in the cloud (2015), pp. 891–899

  15. M NoroozOliaee, B Hamdaoui, M Guizani, MB Ghorbel, in 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). Online multi-resource scheduling for minimum task completion time in cloud servers (2014), pp. 375–379

  16. Himani, HS Sidhu, in 2015 Second International Conference on Advances in Computing and Communication Engineering. Cost-deadline based task scheduling in cloud computing (2015), pp. 273–279

  17. M Yingchi, X Ziyang, P Ping, W Longbao, in 2015 IEEE Fifth International Conference on Big Data and Cloud Computing. Delay-aware associate tasks scheduling in the cloud computing (2015), pp. 104–109

  18. O Chabbouh, SB Rejeb, Z Choukair, N Agoulmine, in 2016 24th International Conference on Software, Telecommunications and Computer Networks (SoftCOM). Offloading decision algorithm for 5G/HetNets cloud RAN (2016), pp. 1–5

  19. O Chabbouh, SB Rejeb, N Agoulmine, Z Choukair, in 2017 31st International Conference on Advanced Information Networking and Applications Workshops (WAINA). Cloud RAN architecture model based upon flexible RAN functionalities split for 5G networks (2017), pp. 184–188

  20. S Barbarossa, S Sardellitti, PD Lorenzo, in 2013 Future Network Mobile Summit. Computation offloading for mobile cloud computing based on wide cross-layer optimization (2013), pp. 1–10

  21. D Huang, P Wang, D Niyato, A dynamic offloading algorithm for mobile computing. IEEE Trans. Wirel. Commun. 11(6), 1991–1995 (2012)

  22. O Muñoz, A Pascual-Iserte, J Vidal, in 2013 Future Network Mobile Summit. Joint allocation of radio and computational resources in wireless application offloading (2013), pp. 1–10

  23. J Oueis, EC Strinati, S Barbarossa, in 2014 IEEE Wireless Communications and Networking Conference (WCNC). Multi-parameter decision algorithm for mobile computation offloading (2014), pp. 3005–3010

  24. AR Jensen, M Lauridsen, P Mogensen, TB Sørensen, P Jensen, in 2012 IEEE Vehicular Technology Conference (VTC Fall). LTE UE power consumption model: for system level energy and performance optimization (2012), pp. 1–5

  25. AP Miettinen, JK Nurminen, in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, Berkeley, CA, USA. Energy efficiency of mobile clients in cloud computing (2010), pp. 4–4



The authors declare that no one contributes to this work except the authors.


Only the authors financed this work.

Author information




All authors contributed to the work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Olfa Chabbouh.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Chabbouh, O., Ben Rejeb, S., Choukair, Z. et al. A strategy for joint service offloading and scheduling in heterogeneous cloud radio access networks. J Wireless Com Network 2017, 196 (2017).



  • Cloud RAN
  • 5G networks
  • HetNets
  • Offloading
  • Scheduling
  • QoS