Joint computation offloading and resource allocation strategy for D2D-assisted and NOMA-empowered MEC systems

Multi-access edge computing (MEC) has emerged as a promising network paradigm that provides computation, storage and networking capabilities at the edge of the pervasive mobile radio access network. This paper jointly considers the computation offloading and resource allocation problem in device-to-device (D2D)-assisted and non-orthogonal multiple access (NOMA)-empowered MEC systems, where each mobile device (MD) is allowed to execute its task in one of three ways, i.e., local computing, MEC offloading or D2D offloading. We invoke orthogonal multiple access (OMA) and NOMA schemes for MDs that select D2D offloading mode, allowing them to offload tasks to their peers using either OMA or NOMA. The original problem is formulated as an overall energy consumption minimization problem, which proves to be NP-hard and is therefore intractable to solve optimally. We start from a simple case, the OMA case, and transform the original problem into two sub-problems, i.e., a resource allocation sub-problem and a computation offloading sub-problem, and propose two heuristic algorithms to obtain sub-optimal solutions of both sub-problems. Then, for the MDs selecting D2D offloading mode, we conduct user pairing and apply the NOMA scheme. Finally, simulation results demonstrate the efficiency of the proposed scheme when compared with related schemes.


Introduction
Technology scaling has enabled a range of promising applications, e.g., face recognition, virtual reality (VR), augmented reality (AR) and interactive gaming. These applications very often run on mobile devices (MDs); however, the limited computation and processing capability of MDs may degrade application performance and result in an undesirable quality of experience (QoE) [1]. Multi-access edge computing (MEC) has emerged as a potent tool to satisfy the growing demands for high task execution rate, low latency, and low energy consumption by bringing computation and storage resources to the edge of wireless networks, such as the base stations (BSs) of cellular networks. Leveraging the advanced processing capability of MEC servers, it is highly desirable to conduct computation offloading, i.e., to offload computationally intensive tasks from MDs to MEC servers and execute user tasks at the servers [2,3].
The problem of computation offloading has been considered extensively in recent research work [4][5][6][7][8][9]. The authors in [4] formulate the cooperative computation offloading problem (which maximizes the expected long-term reward in terms of service delay) as a Markov decision process (MDP) and propose two intelligent computation offloading algorithms based on soft actor critic (SAC), i.e., centralized SAC offloading and decentralized SAC offloading, to solve the problem. A device-centric and risk-based distributed approach is proposed in [5], where the authors exploit game theory to obtain the optimal computation offloading volume. The authors in [6] jointly optimize software caching and computation offloading to minimize the weighted sum energy consumption in a multi-user cache-assisted MEC system and propose an alternating direction method of multipliers (ADMM) and penalty convex-concave procedure (Penalty-CCP) to obtain sub-optimal solutions. In [7], the authors formulate an energy-efficient computation offloading problem as a mixed integer nonlinear programming problem in an MEC-enabled small cell network. To minimize the energy consumption of all user equipments, a sub-optimal algorithm consisting of a hierarchical genetic algorithm and a particle swarm optimization (PSO)-based computation algorithm is proposed. The authors in [8] study the computation offloading and caching problem, aiming at minimizing the execution latency of user tasks by utilizing a collaborative call graph approach. In [9], a collaborative computation offloading scheme is proposed for a centralized computing environment and a game-theoretic approach is proposed for a distributed computing environment so as to minimize the energy consumption of the system.
To alleviate the ever-growing resource contention and improve the communication and computational efficiency of stand-alone MEC servers, device-to-device (D2D) communication is leveraged, where the diversity among nearby devices can be exploited to share the computational burden [10,11]. D2D-aided computation offloading techniques have been considered in recent research work [12][13][14][15].
Taking into account the dynamic system status and random task arrival rate, the authors in [12] investigate the energy-efficient task offloading problem in socially-aware D2D-assisted MEC networks and maximize the long-term network utility by jointly optimizing D2D connection selection and task allocation. Aiming at minimizing the computation latency for the tasks in a D2D-enabled heterogeneous network, the authors in [13] formulate a user-assisted multi-task offloading problem under the constraints on latency and energy consumption. A distributed optimization scheme based on ADMM is presented to determine the task offloading strategy. The authors in [14] formulate the computation capacity maximization problem in a multi-user D2D-MEC system as a mixed integer nonlinear programming problem with constraints on both communication and computation resources. The original problem is decomposed into two subproblems, where the first sub-problem aims to minimize the required edge computation resources for a given D2D pair, and the second sub-problem aims to maximize the computation capacity of the D2D-MEC system. A task offloading framework is proposed in [15], where MDs could share their computation and communication resources among each other via the assistance of network operators. Lyapunov optimization tool is utilized to produce dynamic task offloading decisions, which minimizes the time-averaged energy consumption.
Some research work jointly considers computation offloading and resource allocation problems [16][17][18][19][20][21][22]. In [16], the authors jointly optimize task offloading, cache decision, transmission power and central processing unit (CPU) frequency allocation to minimize the weighted sum cost of the execution delay and energy consumption in a cloud-edge heterogeneous network. The authors in [17] consider a D2D-enabled MEC system and formulate user association, task offloading and resource allocation problem as a latency minimization problem. By solving the optimization problem, a joint optimal strategy is obtained. In order to maximize the long-term utility energy efficiency, the authors in [18] jointly optimize the transmit power of the D2D link, the cellular uplink transmit power and the local CPU speed in a wireless powered D2D-enabled MEC system. Lyapunov optimization method is employed to transform the original problem into a series of deterministic drift-plus-penalty sub-problems in each time slot. The authors in [19] propose a mobility-aware task scheduling approach in a D2D-enabled cooperative MEC framework, in which joint optimization of user mobility, computation capacities and task properties is performed to minimize task offloading latency. The authors in [20] jointly optimize computing resource, transmit power and channel allocation to minimize the weighted sum of delay and energy consumption of all users. Due to multivariate fractional summation nature, the original optimization problem is dissolved into subproblems, namely, power allocation sub-problem, and channel allocation sub-problem. To solve the power allocation sub-problem, the authors employ PSO algorithm, and for the channel allocation sub-problem, a one-to-one matching algorithm based on swapping operations and Pareto improvement is proposed. 
The authors in [21] formulate the computation offloading and resource allocation problem in a multi-user MEC system as a weighted delay and energy consumption minimization problem and solve the problem by exploiting branch and bound method. A two-stage heuristic optimization (THO) algorithm is proposed in [22], which minimizes the overall energy consumption of the MDs by jointly designing task offloading decisions, channel selection, power allocation and resource allocation strategy.
Having emerged as a key enabler for the fifth generation (5G) of wireless networks, non-orthogonal multiple access (NOMA) allows multiple users to share the same resource block (RB) simultaneously by enabling superposition coding (SC) at the transmitters and successive interference cancellation (SIC) at the receivers [23,24]. Aiming to achieve high spectral efficiency, improved quality of service (QoS) and lower latency, some recent research work has considered the combination of NOMA and MEC [25][26][27][28][29][30][31][32][33].
In order to minimize the overall system delay, the authors in [25] jointly optimize the offloaded computation-workloads and the transmission time in a NOMA-assisted MEC system. To solve the formulated joint optimization problem, two algorithms are proposed for single-user case and multi-user case, respectively. The authors in [26] jointly optimize the MEC server allocation, the transmit power allocation of all the MDs, the transmit power allocation of the MEC servers, the computational resource allocation of the MEC servers, the time allocation and the channel allocation variables to minimize the overall delay of all the tasks. The authors in [27] propose a hybrid NOMA-MEC offloading strategy and formulate a multi-objective optimization problem to minimize the user's energy consumption by finding low-complexity pareto-optimal resource allocation solution using exhaustive search method. Taking into account two different wireless channel scenarios, namely, static-channel scenario and dynamic-channel scenario, the authors in [28] propose an algorithm based on channel quality ranking (CQR) as a means to minimize the overall computation delay for a single-user multi-edge computing server. An optimal offloading solution is obtained by combining the golden section search method with the CQR algorithm for a static channel. For a dynamic channel, an algorithm based on deep reinforcement learning (DRL) is proposed. A hybrid NOMA-MEC offloading framework is proposed in [29], where the authors jointly optimize power, time and sub-channels to minimize the overall energy consumption. A switched hybrid NOMA scheme is proposed to allocate power and time, while the total reward exchange stable algorithm is used for channel allocation.
Exploiting the advantages of NOMA-based MEC communication in vehicular networks, the authors in [30] propose a NOMA-enabled vehicular edge computing (VEC) network, in which the joint optimization of offloading decisions, vehicular user equipment clustering, sub-channel allocation, computational resource allocation and transmit power control is implemented to minimize the overall system cost. A backscatter-assisted wireless-powered NOMA-MEC framework is presented in [31], in which the authors optimize energy harvesting (EH) time, backscatter communication time, uplink time, power reflection coefficient and transmit power, as well as computing frequencies in order to maximize the total amount of computation bits across all internet of things (IoT) devices. The authors in [32] jointly optimize the offloaded computation workloads of users, the offloading-duration as well as the computation resource allocation of the MEC servers to minimize the overall task execution latency. A computation efficiency maximization problem is formulated for NOMA-enabled MEC networks in [33], where the authors jointly optimize the transmit power of users and the CPU frequency of MEC servers. To analyze the impacts of delay and energy consumption on computation offloading and resource allocation, the authors in [34] formulate a joint latency and energy consumption minimization problem and provide analytical results for both NOMA uplink and downlink communication scenarios. Aiming at minimizing the overall system cost, the authors in [35] jointly optimize the computation-resource allocation at the MEC servers, the MDs computation offloading and the radio resource allocation for the data transmission in a NOMA-enabled multi-access MEC network and propose a three-layered algorithm to obtain the optimal solution. To minimize the weighted sum energy consumption of all the MDs, the authors in [36] jointly optimize task offloading, channel allocation and time allocation.
While the computation offloading problem in NOMA-enabled MEC systems has been investigated, an extensive study of the joint computation offloading and access mode selection problem is still missing. In particular, exploiting the advantages of both NOMA and orthogonal multiple access (OMA) technologies so as to enhance the task transmission performance remains an important issue worth studying. Furthermore, although numerous research activities investigate D2D-based computation offloading and NOMA-enabled MEC systems, previous studies fail to jointly consider the computation offloading mode, resource allocation and access scheme selection issues for OMA/NOMA-enabled cellular D2D systems. In this paper, we study the computation offloading problem in cellular D2D systems. Specifically, assuming that MDs may execute their tasks locally, or offload their tasks to the MEC servers or to their D2D peers by applying either OMA or NOMA schemes, and that the resources of BSs and MEC servers are shared among multiple MDs, we address the computation offloading mode, communication and computation resource sharing and access scheme selection problem. The overall energy consumption is examined, and the joint optimization problem is formulated and solved by dividing the original problem into two sub-problems and solving them separately. In a nutshell, the key contributions of the proposed design can be summarized as follows:
• To unlock the potential of the key multiple access techniques, i.e., OMA and NOMA, for enhancing the task transmission performance with the aim of minimizing the overall energy consumption, we consider a D2D-assisted and NOMA-empowered MEC framework, in which computation and communication resources are jointly optimized.
• To preserve the computation and communication efficiency, we jointly investigate the computation offloading mode, resource allocation and access scheme selection issues for OMA/NOMA-empowered cellular D2D systems. Each MD in the network can execute its task in one of three execution modes, i.e., local, MEC or D2D. For the MDs selecting D2D offloading mode, we further invoke OMA and NOMA modes, which allow the MDs to offload their tasks to their D2D peers by applying either OMA or NOMA.
• The joint computation offloading mode selection, resource allocation and access scheme selection problem is formulated as an energy consumption minimization problem. Since the formulated problem is NP-hard and thus very difficult to solve in polynomial time, we start from a simple case and consider only the OMA-based transmission scheme. We transform the original problem into two sub-problems, i.e., a computation offloading sub-problem and a resource allocation sub-problem, and propose two heuristics to solve them; then, for the MDs selecting D2D offloading mode, we conduct user pairing and apply the NOMA scheme. Extensive numerical results are provided to validate the performance of the proposed scheme.
The rest of this paper is organized as follows. Section 2 describes the system model. Section 3 further explores the system model, expresses the delay and energy consumption for the various computing modes and formulates the optimization problem. Sections 4 and 5 present the solution of the optimization problem and propose two heuristics to solve it. Extensive simulation results are presented in Sect. 6. A brief discussion on computational complexity and convergence analysis is provided in Sect. 7. Finally, the conclusion is presented in Sect. 8.

System model
In this section, we discuss the system model considered in this paper, including the network model and the communication resource sharing schemes. Table 1 summarizes the notations used in this work.

Network model
We consider a cellular D2D communication system which consists of $N$ BSs and a number of mobile devices (MDs), where each BS is equipped with one MEC server which is capable of offering computation offloading service to the MDs. The MDs in the network can be classified into two types, i.e., task offloading request users (RUs) and service providing users (PUs). Each RU has a computation-intensive task to execute, while each PU has relatively advanced computation performance and may offer computation offloading service to RUs via D2D links. Let BS$_n$ denote the $n$-th BS, $1 \le n \le N$. For convenience, we denote the MEC server attached to BS$_n$ as MEC$_n$. Let $M$ and $J$ denote, respectively, the number of RUs and PUs, and let RU$_m$ and PU$_j$ denote, respectively, the $m$-th RU and the $j$-th PU. We assume that the task of RU$_m$, denoted by $T_m$, can be described by a triple $\langle \xi_m, \eta_m, D^{\max}_m \rangle$, where $\xi_m$ is the input data size of $T_m$, $\eta_m$ is the computing capacity (in CPU cycles per bit) required to process $T_m$ and $D^{\max}_m$ is the maximum tolerable latency to execute $T_m$. In the considered cellular D2D system, one RU may execute its task in various manners, i.e., local computing, MEC offloading or D2D offloading. Specifically, in local computing mode, the RU executes its entire task locally. In MEC offloading mode, the RU offloads its task to one MEC server for task execution. In D2D offloading mode, the RU offloads its task to one neighboring PU for task execution. The considered system model is shown in Fig. 1.

Communication resources sharing schemes
To enable efficient task transmission in MEC offloading and D2D offloading mode, we assume that a number of orthogonal sub-channels have been allocated to cellular links and D2D links. For cellular link transmission, multiple RUs may access one BS using orthogonal sub-channels and one RU can only occupy one sub-channel for task transmission. Let $W^{\max}_n$ denote the maximal number of sub-channels that can be utilized for data transmission between RUs and BS$_n$, $1 \le n \le N$, and let $W_0$ denote the bandwidth of each sub-channel. Let $K$ denote the total number of sub-channels. Note that in this work, we make a relatively simple assumption on channel interference. That is, we assume that there is no interference between RUs. In the case that there exists interference between different links, we can apply power control or time-frequency resource allocation schemes to mitigate or reduce the interference.
For D2D offloading mode, we assume that one PU may assign at most two (adjacent) subchannels to neighboring RUs in order to enable their task transmission in D2D offloading mode. In the case that two RUs tend to offload their tasks to one PU, given the bandwidth resources of D2D links, the two RUs may access the PU by applying either OMA or NOMA scheme. For convenience, the corresponding task computation modes are referred to as OMA-based D2D offloading mode and NOMA-based D2D offloading mode, respectively. The data rate of the D2D links in both modes is analyzed below.

Data rate in OMA-based D2D offloading mode
To apply the OMA scheme to RUs in D2D offloading mode, we assume that one sub-channel is assigned to at most one RU. Suppose RU$_m$ offloads its task $T_m$ to PU$_j$ in OMA-based D2D offloading mode. Let $R^d_{m,j}$ denote the achievable data rate of the link between RU$_m$ and PU$_j$. $R^d_{m,j}$ can be formulated as
where $\mu^d_{m,j,k} \in \{0, 1\}$ is the sub-channel assignment variable in D2D offloading mode, i.e., $\mu^d_{m,j,k} = 1$ if the $k$-th sub-channel is assigned to RU$_m$ when offloading to PU$_j$, otherwise $\mu^d_{m,j,k} = 0$; $P_m$ is the transmit power of RU$_m$, $h^d_{m,j,k}$ is the channel gain of the link between RU$_m$ and PU$_j$ at the $k$-th sub-channel, and $\sigma^2$ is the power of channel noise.
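As an illustration of how such a rate is typically evaluated, the sketch below computes a Shannon-type rate from the quantities defined above. The functional form (bandwidth times the base-2 logarithm of one plus the signal-to-noise ratio, summed over assigned sub-channels) is an assumption, and the function name `oma_d2d_rate` and the list-based argument layout are hypothetical.

```python
import math

def oma_d2d_rate(mu, W0, P_m, h, sigma2):
    """Assumed Shannon-type OMA rate of one RU-PU link.

    mu     : list of 0/1 sub-channel assignment indicators (one per sub-channel)
    W0     : bandwidth of each sub-channel (Hz)
    P_m    : transmit power of the RU (W)
    h      : list of channel (power) gains, one per sub-channel
    sigma2 : noise power (W)
    """
    # Only sub-channels with mu_k = 1 contribute to the rate.
    return sum(m_k * W0 * math.log2(1.0 + P_m * h_k / sigma2)
               for m_k, h_k in zip(mu, h))

# Example: the second of three sub-channels is assigned and the SNR is 3,
# so the rate is W0 * log2(4) = 2 * W0.
rate = oma_d2d_rate([0, 1, 0], 1e6, 0.5, [1e-7, 6e-7, 2e-7], 1e-7)
```

Because the OMA constraint assigns at most one sub-channel per RU, at most one term of the sum is non-zero in practice.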

Data rate in NOMA-based D2D offloading mode
In NOMA-based D2D offloading mode, we assume that two RUs are allowed to offload their tasks to one PU simultaneously. Suppose RU$_m$ and RU$_{m_1}$ both offload their tasks to PU$_j$ using the NOMA scheme. Let $R^1_{m,m_1,j}$ and $R^2_{m,m_1,j}$ denote, respectively, the data rate of the link between RU$_m$ and PU$_j$, and that between RU$_{m_1}$ and PU$_j$. Let $h^n_{m,j,k}$ and $h^n_{m_1,j,k}$ be the channel gain of the link between RU$_m$ and PU$_j$, and that between RU$_{m_1}$ and PU$_j$, at the $k$-th sub-channel. Without loss of generality, we assume that $|h^n_{m,j,k}| < |h^n_{m_1,j,k}|$, $|h^n_{m,j,k}| \approx |h^n_{m,j,k+1}|$, and $|h^n_{m_1,j,k}| \approx |h^n_{m_1,j,k+1}|$ [37]. Suppose the SIC scheme is exploited at PU$_j$; $R^1_{m,m_1,j}$ and $R^2_{m,m_1,j}$ can then be computed, respectively, as
where $\mu^n_{m,m_1,j,k} \in \{0, 1\}$ is the sub-channel assignment variable in NOMA-based D2D offloading mode, i.e., if the $k$-th and the $(k+1)$-th sub-channels are allocated to RU$_m$ and RU$_{m_1}$ for transmitting tasks to PU$_j$ in NOMA-based D2D offloading mode, we set $\mu^n_{m,m_1,j,k} = 1$, otherwise $\mu^n_{m,m_1,j,k} = 0$.
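To make the role of SIC concrete, the following sketch evaluates a textbook two-user uplink NOMA rate pair. It rests on several assumptions that are not stated in the text: the receiver decodes the stronger user first while treating the weaker user's signal as interference, then removes it and decodes the weaker user interference-free; the pair shares the combined bandwidth of the two adjacent sub-channels (2·W0); and the gains passed in are power gains. The name `noma_sic_rates` is hypothetical.

```python
import math

def noma_sic_rates(W0, P_weak, g_weak, P_strong, g_strong, sigma2):
    """Assumed two-user uplink NOMA rates with SIC at the receiver.

    Returns (rate of the weaker user, rate of the stronger user).
    g_weak and g_strong are power gains with g_weak < g_strong.
    """
    bw = 2.0 * W0  # assumption: the pair occupies two adjacent sub-channels
    # The stronger user is decoded first and sees the weaker user as interference.
    r_strong = bw * math.log2(1.0 + P_strong * g_strong / (P_weak * g_weak + sigma2))
    # After cancellation, the weaker user is decoded without interference.
    r_weak = bw * math.log2(1.0 + P_weak * g_weak / sigma2)
    return r_weak, r_strong

r_weak, r_strong = noma_sic_rates(1e6, 0.2, 1e-7, 0.2, 8e-7, 1e-8)
```

Under this decoding order the two rates add up to the rate of a single user transmitting with the combined received power, which is a quick sanity check on any implementation.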

Task execution cost in various computation modes
In this section, the delay and energy consumption for task execution in different computation modes are analyzed.

Local computing mode
In the case that RU$_m$ executes its task locally, $1 \le m \le M$, the task execution delay can be characterized by
where $f^0_m$ denotes the computational capability of RU$_m$. The energy consumption of RU$_m$ due to task execution can be expressed as
where $\kappa_m$ is the energy consumption coefficient of RU$_m$, which depends on the attributes of the CPU of RU$_m$ [21].
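For orientation, a common way to write the local-computing cost from the quantities defined above is sketched below. The symbols $D^{0}_{m}$ and $E^{0}_{m}$, and the quadratic dependence of energy on CPU frequency, are assumptions taken from the widely used dynamic-power model rather than expressions confirmed by this text.

```latex
\begin{align*}
D^{0}_{m} &= \frac{\xi_m \eta_m}{f^{0}_{m}}, \\
E^{0}_{m} &= \kappa_m \, \xi_m \eta_m \, \bigl(f^{0}_{m}\bigr)^{2}.
\end{align*}
```

Here $\xi_m \eta_m$ is the total number of CPU cycles of the task (bits times cycles per bit), so the first expression is in seconds when $f^{0}_{m}$ is in cycles per second.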

MEC offloading mode
In MEC offloading mode, one RU sends its task to one of the MEC servers, which then conducts task execution for the RU. Hence, the delay required to complete task execution can be computed as the sum of the task transmission time from the RU to the MEC server and the task execution time at the MEC server. Suppose RU$_m$ offloads its task $T_m$ to MEC$_n$; the total time for completing task execution can be calculated as
where $D^{m,t}_{m,n}$ and $D^{m,e}_{m,n}$ denote, respectively, the transmission time and execution time of $T_m$. It should be noted that after executing $T_m$, MEC$_n$ needs to transmit the result back to RU$_m$. Since the data size of the task after execution is in general very small, the required transmission delay from MEC$_n$ to RU$_m$ is negligible [21].
$D^{m,t}_{m,n}$ in (6) can be formulated as
where $R^m_{m,n}$ is the transmission rate of the link between RU$_m$ and MEC$_n$, which can be expressed as
where $\mu^m_{m,n,k} \in \{0, 1\}$ is the sub-channel allocation variable in MEC offloading mode, i.e., $\mu^m_{m,n,k} = 1$ if the $k$-th sub-channel is allocated to RU$_m$ for offloading its task to MEC$_n$, otherwise $\mu^m_{m,n,k} = 0$; $h^m_{m,n,k}$ denotes the channel gain of the link between RU$_m$ and MEC$_n$ at the $k$-th sub-channel.
The task execution delay, denoted by $D^{m,e}_{m,n}$ in (6), can be characterized as
where $f^m_n$ denotes the computational capacity of MEC$_n$ for processing the task of one RU.
The energy consumption in MEC offloading mode results from task transmission and execution. When RU$_m$ offloads its task to MEC$_n$, we obtain the energy consumption as
where $E^{m,t}_{m,n}$ is the energy consumption of RU$_m$ when transmitting its task to MEC$_n$, which is given by
$E^{m,e}_{m,n}$ in (10) is the energy consumption of MEC$_n$ when executing $T_m$ for RU$_m$. $E^{m,e}_{m,n}$ can be computed as
where $\kappa^m_n$ denotes the energy consumption coefficient of MEC$_n$.
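The structure described above (delay and energy each being the sum of a transmission part and an execution part) can be sketched as follows. The individual terms are assumptions: transmission time is taken as data size over rate, transmission energy as transmit power times transmission time, and execution energy follows the common coefficient-times-cycles-times-frequency-squared model. The function name `mec_offload_cost` is hypothetical.

```python
import math

def mec_offload_cost(xi, eta, rate, P_m, f_mec, kappa_mec):
    """Assumed delay (s) and energy (J) of offloading one task to an MEC server.

    xi        : input data size (bits)
    eta       : required CPU cycles per bit
    rate      : uplink transmission rate (bit/s)
    P_m       : transmit power of the RU (W)
    f_mec     : server computational capacity for one RU (cycles/s)
    kappa_mec : energy consumption coefficient of the server
    """
    d_tx = xi / rate                              # transmission time
    d_exec = xi * eta / f_mec                     # execution time at the server
    e_tx = P_m * d_tx                             # energy spent by the RU on transmission
    e_exec = kappa_mec * xi * eta * f_mec ** 2    # energy spent by the server on execution
    return d_tx + d_exec, e_tx + e_exec

delay, energy = mec_offload_cost(1e6, 100, 2e6, 0.5, 1e9, 1e-27)
```

The same structure would apply to OMA-based D2D offloading, with the PU's capacity and coefficient in place of the server's.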

D2D offloading mode
In D2D offloading mode, one RU may transmit its task to a neighboring PU which then executes the tasks for the RU. In order to facilitate efficient spectrum utilization, we assume that both OMA-based D2D scheme and NOMA-based D2D scheme are allowed during the task transmission from the RUs to the PUs.
To apply OMA-based D2D offloading mode, we assume that one sub-channel is assigned to at most one RU for offloading its task to one PU. Suppose RU$_m$ offloads its task $T_m$ to PU$_j$; the corresponding task completion delay can be determined by
where $D^{d,t}_{m,j}$ and $D^{d,e}_{m,j}$ denote, respectively, the transmission time required when RU$_m$ offloads its task $T_m$ to PU$_j$ and the execution time of task $T_m$ at PU$_j$.
$D^{d,t}_{m,j}$ can be expressed as
$D^{d,e}_{m,j}$ in (13) can be characterized by
where $f^d_j$ is the computational capacity of PU$_j$ for processing the task of one RU. The energy consumption in D2D offloading mode is caused by task transmission and execution. When RU$_m$ offloads its task $T_m$ to PU$_j$, the energy consumption is given by
where $E^{d,t}_{m,j}$ and $E^{d,e}_{m,j}$ denote the energy consumption of RU$_m$ for task transmission and the energy consumption of PU$_j$ for task execution, respectively. $E^{d,t}_{m,j}$ can be expressed as
To apply NOMA-based D2D offloading mode, we assume that two RUs offload their tasks to one PU simultaneously. Suppose RU$_m$ and RU$_{m_1}$ both offload their tasks to PU$_j$ using two adjacent sub-channels; the task completion time can be expressed as
where $D^{n,t}_{m,m_1,j}$ and $D^{n,e}_{m,m_1,j}$ are, respectively, the task transmission time and execution time of RU$_m$ and RU$_{m_1}$ when offloading their tasks to PU$_j$. $D^{n,t}_{m,m_1,j}$ is given by
where $D^{n,t,1}_{m,m_1,j}$ and $D^{n,t,2}_{m,m_1,j}$ denote, respectively, the task transmission time of RU$_m$ and RU$_{m_1}$, and can be computed as
The task execution time of RU$_m$ and RU$_{m_1}$ at PU$_j$, denoted by $D^{n,e}_{m,m_1,j}$ in (19), can be calculated as
where $D^{n,e,1}_{m,m_1,j}$ and $D^{n,e,2}_{m,m_1,j}$ are, respectively, the task execution time of RU$_m$ and RU$_{m_1}$ at PU$_j$, which are given by
The energy consumed due to task transmission and execution when RU$_m$ and RU$_{m_1}$ offload their tasks to PU$_j$ in NOMA-based D2D mode can be expressed as
where $E^{n,t}_{m,m_1,j}$ and $E^{n,e}_{m,m_1,j}$ denote, respectively, the energy consumption during transmission and that during task execution. $E^{n,t}_{m,m_1,j}$ is given by
where $E^{n,t,1}_{m,m_1,j}$ and $E^{n,t,2}_{m,m_1,j}$ are, respectively, the transmission energy consumption of RU$_m$ and RU$_{m_1}$, and can be expressed as
$E^{n,e}_{m,m_1,j}$ can be expressed as
where $E^{n,e,1}_{m,m_1,j}$ and $E^{n,e,2}_{m,m_1,j}$ are, respectively, the energy consumption of PU$_j$ when executing the task of RU$_m$ and RU$_{m_1}$. $E^{n,e,1}_{m,m_1,j}$ and $E^{n,e,2}_{m,m_1,j}$ are given by

Delay and energy consumption function formulation
In this work, we design a centralized scheme, where instead of optimizing the performance of a particular RU, we aim to minimize the overall system energy consumption which consists of the local task execution energy consumption of the RUs, the task transmission energy consumption of the RUs and the task execution energy consumption of the PUs and the MEC servers. In this section, we first examine the total energy consumption, then formulate joint computation offloading and resource allocation strategy as an energy consumption minimization problem.
\[ \cdots + x^{n}_{m,m_1,j}\, E^{n}_{m,m_1,j}, \]

Optimization constraints
In order to jointly design the computation offloading and resource allocation strategy, we consider a number of optimization constraints.

Delay constraint
The tasks of RUs should be executed before the given maximum deadline, i.e.,
where $D_m$ is the time required for transmitting and executing the task of RU$_m$, and can be computed as

Computation offloading constraint
We assume that each RU can only execute its task in one of the three offloading modes, i.e., local computing, MEC offloading or D2D offloading, hence, the computing mode selection constraint is given as
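A minimal feasibility check for the two constraints stated so far (the deadline and the single-mode requirement) might look as follows. The flat list of 0/1 mode indicators and the function name `satisfies_basic_constraints` are hypothetical simplifications; the actual mode selection variables are indexed by server and peer.

```python
def satisfies_basic_constraints(mode_flags, delay, d_max):
    """Check the single-mode and deadline requirements for one RU.

    mode_flags : 0/1 indicators, one per candidate mode (local, each MEC
                 server, each PU), in a hypothetical flattened layout
    delay      : completion time of the task under the chosen mode (s)
    d_max      : maximum tolerable latency of the task (s)
    """
    binary = all(flag in (0, 1) for flag in mode_flags)
    exactly_one = sum(mode_flags) == 1   # exactly one mode is selected
    return binary and exactly_one and delay <= d_max

ok = satisfies_basic_constraints([0, 1, 0], 0.4, 0.5)
```

A candidate solution that selects two modes at once, or that misses the deadline, is rejected by the same check.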

Resource allocation constraints in MEC offloading mode
In MEC offloading mode, we assume that one sub-channel can only be assigned to one RU and vice versa, hence, the sub-channel allocation constraints can be expressed as The maximal number of sub-channels of BS n puts the constraint on the number of RUs accessing the BS, i.e.,

Resource allocation constraints in OMA-based D2D scheme
In OMA-based D2D scheme, we assume that one sub-channel can only be assigned to one RU and vice versa, hence, the sub-channel allocation constraints can be expressed as In OMA-based D2D scheme, at most two RUs may access one PU for computation offloading utilizing two sub-channels, we obtain the following constraint:

Resource allocation constraints in NOMA-based D2D scheme
In NOMA-based D2D scheme, each sub-channel can only be assigned to one NOMA pair, we obtain In NOMA-based D2D scheme, at most two sub-channels are assigned to two RUs, i.e., We assume that two adjacent sub-channels should be assigned to one NOMA pair, i.e., where ⊙ represents the inclusive OR operator.

Constraints on offloading mode selection and resource allocation
Apparently, there exists a direct relation between offloading mode selection and subchannel allocation decision in all the three offloading modes, we express the constraints as follows:

Optimization problem formulation
To minimize the energy consumption subject to a number of constraints, we formulate the optimization problem as follows:

Proposed algorithm: no NOMA scheme applied
Since the optimization problem formulated in (50) is NP-hard and thus difficult to solve in polynomial time, in this section we start from a relatively simple case, i.e., for D2D offloading mode only the OMA-based transmission scheme is considered, and propose a heuristic algorithm. By examining the energy consumption of RUs in different task offloading modes, we first determine the RUs that use local computing mode, then present a priority-based sub-channel allocation algorithm for conflicting RUs. In the next section, we consider the RUs choosing D2D offloading mode, and determine the task offloading mode and sub-channel allocation strategy.

Rewriting energy consumption in various offloading modes
To minimize the energy consumption in (50), we examine the energy consumption of individual RUs in different offloading modes at various sub-channels. Let $E^{loc}_m$, $E^{mec}_{m,n,k}$ and $E^{d2d}_{m,j,k}$ denote, respectively, the energy consumption of RU$_m$ in local computing mode, MEC offloading mode and OMA-based D2D offloading mode.
Suppose that only the OMA scheme is allowed in D2D offloading mode. Taking into account the constraints on mode selection variables and sub-channel allocation variables specified in C12 and C13, we may rewrite the energy consumption $E$ as follows:
The original optimization problem in (50) is reduced to
The above optimization problem involves computation mode selection and sub-channel allocation among various offloading modes, which is still difficult to tackle. In this subsection, we propose a heuristic algorithm which conducts the following steps successively: local computing mode selection, sub-channel allocation for non-conflicting RUs, and priority-based sub-channel allocation for conflicting RUs.

Local computing mode selection
For RU$_m$, we may calculate its energy consumption in different computing modes at different sub-channels. If an RU consumes no more energy when performing local computing than in any MEC offloading or D2D offloading option, the RU should execute its task locally. Therefore, we can first assign the local computing mode to such RUs by comparing their energy consumption in the various computing modes. That is, if RU$_m$ achieves the minimum energy consumption when executing its task locally, i.e., $E^{loc}_m \le E^{mec}_{m,n,k}$ and $E^{loc}_m \le E^{d2d}_{m,j,k}$, $\forall n, j, k$, we assign the local computing mode to RU$_m$, i.e., $x^{0,*}_m = 1$, $x^{m,*}_{m,n} = 0$, $x^{d,*}_{m,j} = 0$, where $x^{0,*}_m$, $x^{m,*}_{m,n}$ and $x^{d,*}_{m,j}$ represent the optimal computing and offloading strategy.
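The selection rule above can be sketched in a few lines. The data layout is an assumption made for illustration: for each RU, the MEC and D2D energies over all (server, sub-channel) and (peer, sub-channel) combinations are given as flat lists, and the function name `select_local_rus` is hypothetical.

```python
def select_local_rus(E_loc, E_mec, E_d2d):
    """Return the indices of RUs that should compute locally.

    E_loc : E_loc[m] is the local-computing energy of RU m
    E_mec : E_mec[m] lists the MEC-offloading energies of RU m over all (n, k)
    E_d2d : E_d2d[m] lists the D2D-offloading energies of RU m over all (j, k)
    """
    local = []
    for m, e_loc in enumerate(E_loc):
        offload_options = list(E_mec[m]) + list(E_d2d[m])
        # Local computing wins only if it is no worse than every offloading option.
        if all(e_loc <= e for e in offload_options):
            local.append(m)
    return local

# RU 0 is cheapest locally; RU 1 has a cheaper MEC option (2.0 < 3.0).
local_rus = select_local_rus([1.0, 3.0], [[2.0, 1.5], [2.0, 4.0]], [[1.0], [5.0]])
```

The RUs returned here are fixed to local computing and dropped from the subsequent sub-channel allocation.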

K-M algorithm-based sub-channel allocation for non-conflicting RUs
After removing the RUs which have been assigned local computing mode, we place the remaining RUs into a set, denoted by RU . We now solve the optimization problem in (52) for the RUs in RU .
Note that the formulated optimization problem resembles a matching problem in a bipartite graph; however, it is not a typical one-to-one matching problem, as the sub-channel allocation among the different offloading modes has to be taken into account. To tackle this problem, we first consider an ideal sub-channel allocation assumption for both the BSs and the PUs. More specifically, we assume that all the sub-channels are available to all the BSs and the PUs, and then determine the resource allocation and computation offloading strategy which minimizes the energy consumption. Equivalently, we virtualize the set of sub-channels into $N + J$ sets and assign each BS and each PU one set of sub-channels. For instance, BS $n$ is assigned the $((n-1)K+1)$-th to the $(nK)$-th sub-channels, and PU $j$ is assigned the $((N+j-1)K+1)$-th to the $((N+j)K)$-th sub-channels, $1 \le n \le N$, $1 \le j \le J$.
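The index arithmetic behind the virtualization can be sketched as follows. The function names are hypothetical, indices are 1-based as in the text, and the value of $K$ used in the example is an arbitrary assumption.

```python
def virtual_index_bs(n, k, K):
    """Virtual index of physical sub-channel k in the set assigned to BS n."""
    return (n - 1) * K + k


def virtual_index_pu(j, k, N, K):
    """Virtual index of physical sub-channel k in the set assigned to PU j."""
    return (N + j - 1) * K + k


def physical_index(k_virtual, K):
    """Map a virtual sub-channel index back to its physical sub-channel (1..K)."""
    return (k_virtual - 1) % K + 1


N, J, K = 4, 6, 3                          # K = 3 is an illustrative choice
total_virtual = (N + J) * K                # size of the virtualized sub-channel set
k_bs = virtual_index_bs(2, 1, K)           # first sub-channel of BS 2
k_pu = virtual_index_pu(1, 3, N, K)        # last sub-channel of PU 1
```

Two virtual sub-channels correspond to the same physical sub-channel exactly when `physical_index` returns the same value for both.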
The energy consumption $E$ can then be rewritten in terms of the virtualized sub-channels, where $\bar{E}^{\mathrm{mec}}_{m,n,k'}$ is the energy consumption of RU $m$ when offloading its task to MEC $n$ using the $k'$-th sub-channel after sub-channel virtualization, and $\bar{E}^{\mathrm{d2d}}_{m,j,k'}$ is the energy consumption of RU $m$ when offloading its task to PU $j$ using the $k'$-th sub-channel after sub-channel virtualization. These can be expressed as $\bar{E}^{\mathrm{mec}}_{m,n,k'} = E^{\mathrm{mec}}_{m,n,k}$ for $k' = (n-1)K + k$, $1 \le k \le K$, and $\bar{E}^{\mathrm{d2d}}_{m,j,k'} = E^{\mathrm{d2d}}_{m,j,k}$ for $k' = (N+j-1)K + k$, $1 \le k \le K$. Similarly, $\bar{\mu}^{m}_{m,n,k'} = \mu^{m}_{m,n,k}$ for $k' = (n-1)K + k$, and $\bar{\mu}^{d}_{m,j,k'} = \mu^{d}_{m,j,k}$ for $k' = (N+j-1)K + k$.

With these definitions, the optimization problem in (52) can be re-expressed over the virtualized sub-channels and regarded as a one-to-one matching problem in a bipartite graph, which can be solved by a standard algorithm such as the Kuhn-Munkres (K-M) algorithm [38]. Let $\hat{x}^{0}_m$, $\hat{\mu}^{m}_{m,n,k'}$ and $\hat{\mu}^{d}_{m,j,k'}$ denote, respectively, the local optimal strategy of $x^{0}_m$, $\bar{\mu}^{m}_{m,n,k'}$ and $\bar{\mu}^{d}_{m,j,k'}$ obtained from the K-M algorithm.

Based on the local optimal strategy of the RUs, we check whether there exist non-conflicting RUs, i.e., RUs whose selected sub-channel is not shared with other RUs. For non-conflicting RUs, we take the local optimal offloading and sub-channel allocation strategy as the global optimal one. As an example, suppose the local optimal strategy of RU $m_1$ is $\hat{x}^{0}_{m_1} = 0$, $\hat{\mu}^{m}_{m_1,n_1,k'_1} = 1$ and $\hat{\mu}^{d}_{m_1,j,k'} = 0$, and no other RU selects the same sub-channel, i.e., $\hat{\mu}^{m}_{m,n,k'} = 0$ and $\hat{\mu}^{d}_{m,j,k'} = 0$ for every other RU $m \neq m_1$ on the sub-channel selected by RU $m_1$. We then set the global optimal offloading and sub-channel allocation strategy of RU $m_1$ as $x^{0,*}_{m_1} = 0$, $x^{m,*}_{m_1,n_1} = 1$, $x^{d,*}_{m_1,j} = 0$, $\mu^{m,*}_{m_1,n_1,k_1} = 1$, and $\mu^{m,*}_{m_1,n,k} = 0$ and $\mu^{d,*}_{m_1,j,k} = 0$ for $n \neq n_1$. Once RUs have been assigned their global optimal strategy, they are removed from the remaining user set $\mathcal{RU}$ and their selected sub-channels are removed correspondingly.
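To make the matching step concrete, the following is a minimal pure-Python sketch of a minimum-cost one-to-one assignment using the Hungarian method with row and column potentials. It is a textbook formulation and not necessarily the variant described in [38]; the cost matrix (rows as RUs, columns as virtual sub-channels) and its values are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost assignment of each row to a distinct column (rows <= columns).

    Returns (assignment, total) where assignment[i] is the column given to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)        # row potentials
    v = [0.0] * (m + 1)        # column potentials
    p = [0] * (m + 1)          # p[j]: row currently matched to column j (1-based)
    way = [0] * (m + 1)        # predecessor columns on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:     # reached a free column: augment
                break
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Hypothetical energies: 3 RUs (rows) x 3 virtual sub-channels (columns).
energy = [[4, 1, 3],
          [2, 0, 5],
          [3, 2, 2]]
assignment, total = hungarian(energy)
```

In this toy instance, each RU receives a distinct virtual sub-channel and the summed energy is minimal over all one-to-one assignments; whether two assigned virtual sub-channels collide on a physical sub-channel still has to be checked afterwards.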

Priority-based sub-channel allocation for conflicting RUs
Note that, because sub-channel virtualization allows the BSs and PUs to share the same sub-channels, the obtained local optimal computation offloading and sub-channel allocation strategy $\hat{x}^{0}_m$, $\hat{\mu}^{m}_{m,n,k}$ and $\hat{\mu}^{d}_{m,j,k}$ may involve resource conflicts among RUs. More specifically, more than one RU may choose to occupy a common sub-channel for task offloading. For instance, if $\hat{\mu}^{m}_{m,n,k} = 1$, $\hat{\mu}^{m}_{m_1,n_1,k_1} = 1$, and $\mathrm{mod}(k, K) = \mathrm{mod}(k_1, K)$, then both RU $m$ and RU $m_1$ choose the $(k \bmod K)$-th sub-channel for task offloading. We refer to RU $m$ and RU $m_1$ as a pair of conflicting users.
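The conflict test can be sketched in a few lines of Python. The input format (a mapping from RU to its selected 1-based virtual sub-channel) and the example values are assumptions for illustration; two RUs conflict when their virtual indices are congruent modulo $K$.

```python
from collections import defaultdict


def find_conflicts(selection, K):
    """Group RUs by physical sub-channel and keep only the shared ones.

    selection: {ru: virtual sub-channel index (1-based)} (hypothetical layout).
    Returns {physical sub-channel: [rus]} for sub-channels chosen by more than one RU.
    """
    by_physical = defaultdict(list)
    for ru, k_virtual in selection.items():
        by_physical[(k_virtual - 1) % K + 1].append(ru)
    return {k: rus for k, rus in by_physical.items() if len(rus) > 1}


# With K = 3, virtual sub-channels 2 and 5 are the same physical sub-channel.
conflicts = find_conflicts({1: 2, 2: 5, 3: 7}, K=3)
```

RUs that do not appear in the returned mapping are non-conflicting and can keep their local optimal strategy.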
Since multiple RUs are not allowed to share one sub-channel in the MEC offloading mode and the OMA-based D2D offloading mode, we need to design a computation offloading and resource allocation strategy for the conflicting RUs. To this end, we propose a priority-based offloading mode selection and sub-channel allocation scheme. The steps of the proposed scheme can be summarized as follows:
(1) Assign priority to the conflicting RUs. We examine the energy consumption of the conflicting RUs and assign various priorities to these RUs. For each RU, we first evaluate the energy consumption in the various offloading modes, and take the lowest one as the energy consumption of the RU. Then, aiming to minimize the energy consumption of all the RUs, we order the conflicting RUs according to their energy consumption and assign the highest priority to the RU having the lowest energy consumption.
(2) Assign the global optimal strategy to the RU with the highest priority. For the RU with the highest priority, the local optimal strategy is set as its global optimal strategy. We remove this RU and the corresponding sub-channel from the RU set and the sub-channel set.
(3) Update the local optimal strategy of the remaining RUs. The local optimal strategy of the remaining RUs is updated by applying the K-M algorithm. We then check whether conflicting RUs exist; if so, we return to (1); otherwise, we set the local optimal strategy of the remaining RUs as the global optimal one, and the algorithm terminates.
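Steps (1) to (3) can be sketched as the loop below. This is a simplified illustration under stated assumptions: the matching step is replaced by an exhaustive search over permutations (a stand-in for the K-M algorithm that is only practical for tiny instances), indices are 0-based, local computing is ignored, and the energy table is hypothetical.

```python
from itertools import permutations


def best_matching(rus, channels, energy):
    """Exhaustive stand-in for the K-M step: min-energy one-to-one assignment."""
    if len(channels) < len(rus):
        raise ValueError("not enough sub-channels for the remaining RUs")
    best = min(permutations(channels, len(rus)),
               key=lambda perm: sum(energy[m][c] for m, c in zip(rus, perm)))
    return dict(zip(rus, best))


def priority_allocate(energy, K):
    """energy[m][c]: energy of RU m on virtual sub-channel c (0-based);
    virtual sub-channel c maps to physical sub-channel c % K."""
    rus = list(range(len(energy)))
    channels = list(range(len(energy[0])))
    final = {}
    while rus:
        match = best_matching(rus, channels, energy)
        users = {}
        for m, c in match.items():
            users.setdefault(c % K, []).append(m)
        conflicting = [m for group in users.values() if len(group) > 1 for m in group]
        if not conflicting:
            final.update(match)            # step (3): no conflict left, accept all
            break
        # Step (1): the conflicting RU with the lowest achievable energy wins.
        top = min(conflicting, key=lambda m: min(energy[m][c] for c in channels))
        # Step (2): fix its strategy and remove its physical sub-channel everywhere.
        final[top] = match[top]
        rus.remove(top)
        channels = [c for c in channels if c % K != match[top] % K]
    return final


# Two RUs, K = 2 physical sub-channels, two virtual sets (4 virtual sub-channels).
energy = [[1, 5, 9, 9],
          [9, 9, 2, 6]]
allocation = priority_allocate(energy, K=2)
```

In this toy instance both RUs initially pick physical sub-channel 0; RU 0 has the lower energy and keeps it, and RU 1 is re-matched to the remaining physical sub-channel.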

Greedy method-based task offloading and user pairing algorithm: NOMA scheme applied
In this subsection, we consider the RUs which need to offload their tasks to PUs using the D2D offloading mode. Since RUs may apply either the OMA-based D2D scheme or the NOMA-based D2D scheme, the optimal computation offloading selection, sub-channel allocation and NOMA pairing strategy is very difficult to obtain. For simplicity, we design a greedy-based computation offloading and NOMA pairing algorithm.

Task offloading strategy: one RU case
For an individual PU, different RUs may be assigned for D2D offloading, and accordingly, various computation offloading and NOMA pairing strategies can be obtained. One or two RUs may choose one PU and offload their tasks to the PU in OFDMA mode. Alternatively, two RUs may form a NOMA pair and send their tasks to a common PU. For a specific PU and the set of RUs choosing that PU to offload tasks, we can list all potential task offloading combinations and compute the corresponding energy consumption by exhaustive search. Based on the local optimal strategy of the RUs obtained from the K-M algorithm, we assign the task offloading strategy for the RUs choosing the D2D offloading mode. In the case that only one RU chooses a PU for task offloading, we assign the OMA-based D2D offloading mode to the RU. That is, if RU $m$ is the only user choosing PU $j$ to offload its task, i.e., $\hat{\mu}^{d}_{m,j,k'} = 1$ and no other RU selects PU $j$, RU $m$ offloads to PU $j$ in OMA-based D2D mode. If more than one RU chooses the same PU to offload their tasks, we need to select one or two RUs and determine the optimal task offloading strategy. To this end, we propose two schemes, i.e., a greedy method-based user pairing and task offloading algorithm and a low-complexity user pairing and task offloading algorithm.

Task offloading and user pairing algorithm
We assume that multiple RUs select PU $j$ for task offloading. Let $\mathcal{RU}$ denote the set of these RUs, i.e., if $\hat{\mu}^{d}_{m,j,k'} = 1$, then RU $m \in \mathcal{RU}$. Since at most two RUs are allowed to offload their tasks to one PU, we need to choose one or two RUs among all the RUs in $\mathcal{RU}$ and assign the task offloading mode and the corresponding sub-channel.

Local optimal strategy in OMA-based D2D offloading mode
We first examine the optimal performance obtained by using the OMA-based D2D offloading mode. Suppose the OMA-based D2D offloading mode is assigned to the RUs in $\mathcal{RU}$; we examine the task offloading performance of the RUs and select the one or two RUs achieving the optimal performance. Specifically, for every RU $m \in \mathcal{RU}$, we compute $E^{\mathrm{d2d}}_{m,j,k}$ and select the two RUs, with their corresponding sub-channels, that obtain the optimal task offloading performance, i.e., if RU $m_1$ on sub-channel $k_1$ and RU $m_2$ on sub-channel $k_2$ attain the lowest energy consumption $E^{\mathrm{d2d}}_{m,j,k}$ among the RUs in $\mathcal{RU}$, then RU $m_1$ and RU $m_2$ are selected as the local optimal users, and sub-channels $k_1$ and $k_2$ are allocated to the two users. Let $E^{o,*}_{m_1,m_2,j,k_1,k_2} = E^{\mathrm{d2d}}_{m_1,j,k_1} + E^{\mathrm{d2d}}_{m_2,j,k_2}$ denote the local optimal energy consumption of the RUs when offloading tasks to PU $j$ in OMA-based D2D mode.

Local optimal strategy in NOMA-based D2D offloading mode
We then evaluate the task offloading performance in the NOMA-based D2D offloading mode. In a similar manner to the OMA-based case, we choose two RUs in $\mathcal{RU}$ to form a NOMA pair, compute the task offloading performance on the various sub-channels, and select the pair achieving the optimal performance. Let $E^{n,*}_{m'_1,m'_2,j,k}$ denote the energy consumption of RU $m'_1$ and RU $m'_2$ when offloading tasks to PU $j$ in NOMA-based D2D mode via the $k$-th and $(k+1)$-th sub-channels; RU $m'_1$ and RU $m'_2$ are selected as the local optimal users in the NOMA-based D2D offloading mode.

Determine task offloading and user pairing strategy
Given the local optimal task offloading performance in the OMA-based and NOMA-based task offloading modes, we now compare the two and choose the one offering the better performance. In particular, if condition (57) is met, we assign the NOMA-based D2D offloading mode to RU $m'_1$ and RU $m'_2$ with PU $j$ being the offloading PU, and the $k$-th and the $(k+1)$-th sub-channels are allocated to the two RUs for task transmission. Similarly, if condition (58) is met, we assign the OMA-based D2D offloading mode to RU $m_1$ and RU $m_2$ with PU $j$ being the offloading PU, and the $k_1$-th and the $k_2$-th sub-channels are allocated to the two RUs for task transmission. The corresponding steps of the algorithm listing read:

    Find particular PU j = 1
    if (57) holds then
        RU m, RU m_1 will offload at PU j using NOMA-based D2D offloading mode
    else if (58) holds then
        RU m, RU m_1 will offload at PU j using OMA-based D2D offloading mode
    end if
    Remove the RUs and PU from the set of RUs and PUs; j = j + 1
    end for
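The comparison for a single PU can be sketched in Python as follows. Several points are assumptions and not taken from the paper: conditions (57) and (58) are read here as "the NOMA energy is strictly lower" and "otherwise", the two OMA users are required to occupy distinct sub-channels, and the energy tables and their values are hypothetical. Both tables are assumed to be non-empty.

```python
from itertools import combinations


def best_oma(e_d2d):
    """e_d2d: {(m, k): energy}. Cheapest pair of distinct RUs on distinct sub-channels."""
    best = None
    for (m1, k1), (m2, k2) in combinations(e_d2d, 2):
        if m1 != m2 and k1 != k2:
            total = e_d2d[(m1, k1)] + e_d2d[(m2, k2)]
            if best is None or total < best[0]:
                best = (total, (m1, k1), (m2, k2))
    return best


def best_noma(e_noma):
    """e_noma: {(m1, m2, k): energy of the NOMA pair on sub-channels k and k+1}."""
    key = min(e_noma, key=e_noma.get)
    return e_noma[key], key


def choose_mode(e_d2d, e_noma):
    """Pick the D2D mode with the lower local optimal energy (ties go to OMA)."""
    oma = best_oma(e_d2d)
    noma = best_noma(e_noma)
    if oma is None or noma[0] < oma[0]:
        return "NOMA", noma
    return "OMA", oma


# Hypothetical energies for three RUs that all selected the same PU.
e_d2d = {(1, 1): 0.6, (1, 2): 0.8, (2, 1): 0.5,
         (2, 2): 0.9, (3, 1): 0.7, (3, 2): 0.75}
e_noma = {(1, 2, 1): 1.1, (2, 3, 1): 1.3}
mode, choice = choose_mode(e_d2d, e_noma)
```

Here the best OMA pair costs 1.25 and the best NOMA pair costs 1.1, so the NOMA pair is selected; the selected RUs and the PU would then be removed before the next PU is processed.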

Simulation results
In this section, numerical results are presented to evaluate the performance of the proposed scheme. We run our simulations on a Matlab-based simulator. The considered system model is a cellular D2D communication system with 4 BSs, 6 PUs and 5-30 RUs uniformly distributed around the BSs. The overall simulation region is 1000 m × 1000 m. Unless explicitly mentioned otherwise, the simulation parameters are those reported in Table 2. Results are obtained by averaging over 2000 random trials.

Figure 2 plots the energy consumption versus the number of RUs when three different algorithms are applied. For comparison, in addition to our proposed algorithm, we also consider the variant in which only the OMA-based D2D offloading mode is applied and no NOMA-based scheme is utilized. Furthermore, the performance of the algorithm proposed in [22] is also evaluated. It can be observed from the figure that the energy consumption increases with the number of RUs. This is because, as a larger number of RUs offload their tasks, the energy consumption due to task transmission and execution increases accordingly. In addition, the energy consumption also increases with the noise power, since a higher noise power leads to a lower data transmission rate and a longer task transmission time, and hence to a higher energy consumption. Comparing the three algorithms, we can see that our proposed algorithm offers the lowest energy consumption, which benefits from the joint optimization of task offloading and resource allocation as well as from the performance gain of the NOMA-based scheme.
In Fig. 3, we compare the energy consumption versus the required CPU cycles for three different algorithms, i.e., the proposed greedy scheme, the THO scheme [22] and the scheme proposed in [3]. The number of processor clock cycles reflects the execution effort of a device, since lengthy instructions (or data) take more cycles to process than short ones. Therefore, there is a direct relation between the energy consumption and the number of CPU cycles required. An increase in the required CPU cycles indicates a higher complexity of processing the tasks, and hence a higher energy consumption. It can also be observed that our proposed scheme outperforms the two comparative schemes.

Figure 4 shows the energy consumption versus the capacity of the MEC servers for the proposed scheme and the schemes proposed in [22] and [3]. It can be seen from the figure that, for a fixed task data size, increasing the MEC server capacity lowers the energy consumption, because more clock cycles are available for data of fixed length. The algorithms are evaluated under two different sub-channel bandwidth settings, i.e., $W_0 = 2$ MHz and $W_0 = 1$ MHz; a larger bandwidth yields a higher information transmission rate and in turn a lower energy consumption. It can be observed from the figure that the proposed greedy scheme, which integrates both the OMA and NOMA schemes, outperforms the schemes proposed in [22] and [3], as more RUs prefer to offload their tasks to PUs rather than executing them locally or offloading them to distant MEC servers.

In Fig. 5, the energy consumption versus the task data size is evaluated under different noise variances and sub-channel bandwidth settings for the proposed scheme and the schemes proposed in [22] and [3].
We can see that the energy consumption increases with the noise power. This is because a higher noise level decreases the signal-to-noise ratio (SNR) and hence increases the energy consumption. From the figure, it can be noticed that the energy consumption tends to increase gradually as the task data size grows. It can also be observed that our proposed scheme outperforms the schemes under comparison.

The energy consumption versus the task data size for different D2D distance combinations and RU counts is illustrated in Fig. 6. According to the figure, when more RUs offload their tasks over large distances, the energy consumption is higher. This is because a long distance results in a low data transmission rate, which ultimately leads to a higher transmission energy consumption than a short distance. Moreover, the proposed NOMA and OMA integrated algorithm outperforms the proposed OMA-based D2D scheme, because the integrated algorithm yields better transmission performance and, in turn, lower energy consumption.

Discussion
In this section, we briefly discuss the computational complexity and the convergence of the proposed algorithm.

Computational complexity
In this subsection, the computational complexity of the proposed algorithm is analyzed.
As the formulated problem is tackled according to two different use cases, i.e., without the NOMA scheme and with the NOMA scheme, we examine the complexity of solving both sub-problems, i.e., the resource allocation sub-problem and the computation offloading sub-problem, for each use case.
For the case where no NOMA scheme is applied, we virtualize the set of sub-channels into $N + J$ sets. The complexity of the K-M algorithm is $O(G^3)$, with a maximum of $(11G^3 + 12G^2 + 31G)/6$ operations, where $G = N + 2J + 1$ [38].
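As a quick numerical illustration of this bound, the short Python sketch below evaluates the operation count for the simulated setting with 4 BSs and 6 PUs; the function name is hypothetical.

```python
def km_max_operations(N, J):
    """Maximum operation count (11G^3 + 12G^2 + 31G)/6 with G = N + 2J + 1."""
    G = N + 2 * J + 1
    return (11 * G**3 + 12 * G**2 + 31 * G) / 6


# N = 4 BSs and J = 6 PUs give G = 17.
ops = km_max_operations(4, 6)
```

For $G = 17$ the bound evaluates to 9673 operations, and it grows cubically as further BSs or PUs are added.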
For the case where the NOMA scheme is applied, let K denote the RU pairs in NOMA. The number of operations required by the exhaustive search method is M(J + 1)K, which gives a computational complexity of O(M(J + 1)K). Figure 7 plots the computational complexity of the proposed greedy scheme.

Convergence analysis
It should be mentioned that through the process of algorithm execution, we conduct various sub-algorithms successively. Specifically, the sub-algorithms include: Sub-algorithm 1: determining local computing mode, Sub-algorithm 2: K-M algorithm-based sub-channel allocation, Sub-algorithm 3: priority-based sub-channel allocation for conflicting RUs, Sub-algorithm 4: greedy method-based task offloading and user pairing algorithm.
Sub-algorithm 1 is a direct comparison that requires no iteration, so its convergence is immediate. Sub-algorithm 2 is conducted in a centralized and non-iterative mode; the strategy is obtained directly by running the algorithm, and its convergence is guaranteed. Sub-algorithm 3 is conducted iteratively. In each iteration, at least one RU with the highest priority is selected and removed from the set of conflicting RUs. The number of RUs holding the highest priority is in general much smaller than the number of conflicting RUs, so the set shrinks in every iteration; hence, the convergence of the algorithm is guaranteed, and the maximum number of iterations can simply be set to the number of conflicting RUs.
Sub-algorithm 4 is applied to the RUs choosing the D2D mode. Specifically, the sub-algorithm is conducted independently for the various D2D pairs. For an individual D2D pair, the energy consumption of the OFDMA-based scheme and of the NOMA-based scheme is evaluated, and the one offering the better performance is selected. Hence, the convergence of the algorithm is guaranteed.

Conclusion
This paper jointly considers the computation offloading and resource allocation problem in D2D-assisted and NOMA-empowered MEC systems. The original problem has been formulated as an energy consumption minimization problem that is NP-hard; therefore, we have decomposed it into two sub-problems, i.e., the resource allocation sub-problem and the computation offloading sub-problem, and proposed two heuristic algorithms to obtain appropriate strategies for resource allocation and computation offloading. Numerical results have validated the effectiveness of the proposed scheme when compared with the relevant schemes [3,22]. Future work might include extending the proposed scheme to an integrated network, e.g., the integration of satellites, unmanned aerial vehicles (UAVs) and cellular systems, which would utilize satellites and UAVs as MEC servers so as to increase the flexibility and efficiency of task offloading. In addition, the task offloading strategy under dynamic scenarios with randomly arriving tasks and dynamically changing channel models can also be investigated.