Computation offloading to edge cloud and dynamically resource-sharing collaborators in Internet of Things

With the diversity of communication technologies and the heterogeneity of computation resources at the network edge, both the edge cloud and peer devices (collaborators) can be scavenged to provide computation resources for resource-limited Internet-of-Things (IoT) devices. In this paper, a novel cooperative computing paradigm is proposed, in which the computation resources of the IoT device, opportunistically idle collaborators and the dedicated edge cloud are fully exploited. Collaborators provide computation assistance in idle states and offloading assistance in busy states. Considering the channel randomness and the opportunistic sharing of computation resources by collaborators, we study stochastic offloading control for an IoT device, that is, how much computation load is processed locally, offloaded to the edge cloud and offloaded to a collaborator. The problem is formulated as a finite-horizon Markov decision problem with the objective of minimizing the expected total energy consumption of the IoT device and the collaborator, subject to a hard computation deadline constraint. The optimal offloading policy is derived based on stochastic optimization theory and shows that the energy consumption can be reduced by a proportional factor through cooperative computing. More energy is saved with better wireless channel conditions or higher computation energy efficiency of the collaborators. Simulation results validate the optimality of the proposed policy and the efficiency of cooperative computing between end devices and the edge cloud, compared with several other offloading schemes.

Mobile edge computing has risen as a promising technology that provides cloud computing capabilities in close proximity to the data sources. Through one-hop transmissions, resource-constrained IoT devices can offload their application-related data to edge clouds deployed at the network edge, such as base stations or access points. Compared with mobile cloud computing, which requires IoT devices to access remote cloud servers through the Internet, mobile edge computation offloading (MECO) can reduce the transmission latency between IoT devices and the servers, which is especially important for supporting time-constrained applications. Considering massive device access in current wireless networks, many studies have investigated the offloading decision and resource allocation problem [2][3][4][5][6]. However, most of them do not consider the time-varying characteristics of channel conditions during computation offloading, which limits the applicability of their conclusions.
On the other hand, peer-to-peer computing is also regarded as an efficient approach for computation offloading [7]. Through proximity communication technologies, such as device-to-device (D2D) communication and Bluetooth, the idle resources on nearby IoT devices, i.e., collaborators, can be exploited to enhance the offloading efficiency. In this way, the cost of deploying additional edge servers at the network edge can be reduced. Meanwhile, the traffic burden and computation load on the edge base stations and servers are effectively alleviated. Most existing works on peer-to-peer computing [8][9][10][11] assume that collaborators offer constant idle computation resources until the offloaded computation loads are completed. However, because a collaborator has its own high-priority tasks, its central processing unit (CPU) is only opportunistically idle and available to the offloaded tasks [12,13], which differs from the dedicated resource provisioning of edge servers for IoT subscribers. Moreover, with a suitable incentive mechanism [14], collaborators can not only share idle computation resources but also use their network functions to assist computation offloading.
Considering the stochastic channel condition and the dynamic sharing of computation resources, in this paper we propose cooperative computing that integrates both the dedicated edge cloud and opportunistic collaborators to provide computation resources for IoT devices. In particular, collaborators provide two functions: computation assistance in idle states and offloading assistance in busy states. Given the computation load and the completion time constraint, an IoT device should adapt the local computation load and the offloaded computation load to the dynamics of the channel states and the collaborator states. Our contributions are summarized as follows:
(1) A novel cooperative computing paradigm is proposed, which supports parallel computing on local devices, the edge cloud and collaborators with dynamic computation resources. A collaborator either assists in computing or further offloads the computation to the edge cloud, depending on its stochastic CPU availability.
(2) A dynamic offloading decision problem, namely how much computation load is executed locally, offloaded to the edge cloud and offloaded to a collaborator, is modeled as a finite-horizon Markov decision problem. We aim to minimize the expected sum energy consumption of both the IoT device and the collaborator, subject to the computation completion time constraint.
(3) Based on stochastic optimization theory, the optimal offloading policy is derived using the Karush-Kuhn-Tucker (KKT) conditions and backward induction, which facilitates the design of a low-complexity dynamic programming algorithm and alleviates the well-known curse of dimensionality.
(4) The optimal offloading policy shows that the energy consumption can be reduced by a proportional factor through cooperative computing. More energy is saved with better wireless channel conditions or higher computation energy efficiency of the collaborators. Simulation results validate the optimality of the proposed policy and the efficiency of cooperative computing between end devices and the edge cloud.
The rest of the paper is organized as follows. Section 2 summarizes some related work. Section 3 describes the system model and problem formulation. Optimal offloading policy and the corresponding algorithm are presented in Sect. 4. In Sect. 5, the simulation experiments and performance evaluation of the proposed algorithm are provided. Finally, conclusions and future work are given in Sect. 6.

Related work
In this section, some related work on MECO and peer-to-peer computing is summarized. Considering computation offloading to dedicated edge servers, [2] investigated partial offloading for a mobile device with dynamic voltage scaling technology. The offloading ratio, computational speed and transmit power of the device are optimized to minimize its energy consumption or application execution latency. For multiple users that require offloading services, time-division multiple access (TDMA) and frequency-division multiple access (FDMA) are adopted in [3]. A convex optimization problem was formulated and solved optimally for the joint offloading ratio and time allocation, such that the total energy consumption of multiple devices is minimized. A heuristic algorithm was then proposed for the offloading ratio and frequency channel allocation. Decentralized algorithms for resource allocation and offloading decisions were studied in [4,5] using game theory and decomposition techniques, respectively. These works focused on the quasi-static scenario where the channel or link quality remains unchanged during offloading. Targeting time-varying channel conditions during computation offloading, [15] investigated the binary decision, i.e., offloading or local computing, under the Gilbert-Elliott channel model. The authors then considered a linear task topology and a random channel in [16], and the problem of optimal offloading control was formulated as a shortest path problem. A "one-climb" policy structure was proved, meaning that the tasks should migrate at most once between an end device and the cloud. Considering the intermittent connectivity between an end device and the edge cloud due to network congestion, an analytical framework was built in [17] and a closed-form expression of the task processing time under different network conditions was derived.
For peer-to-peer computing, D2D-enabled collaborative computation was studied in [8], where a local user offloads multiple independent tasks to its nearby devices through TDMA. Computation latency was minimized by optimizing the task assignment jointly with the time for task offloading and result downloading, subject to the individual energy constraints at the local user and the collaborators. In [9], we investigated the application partitioning and collaborator selection problem for multiple users when multiple idle collaborative devices are available. Both centralized and decentralized algorithms were proposed to minimize the energy consumption of all the users. Considering that a collaborator may provide idle computation resources only opportunistically because of its local computation, [12] studied how a user exploits non-causal information on the CPU state of the collaborator to control the offloaded data sizes. The CPU information was assumed to be predicted accurately offline. In contrast, the offloading policy in [13] was based on the statistical distribution of the dynamic CPU state of the collaborator and the random channel state.
Recent studies that integrate nearby collaborators with dedicated servers show the efficiency of this cooperative computing. In [10], peer-to-peer offloading was integrated with edge servers to maximize the number of devices that the system can support, where the parts of a task that cannot be completed in time by each local device are offloaded to an edge node and a nearby device. The optimal transmit power allocation and the task offloading strategy are obtained. The selection among application execution approaches for multiple users, including local computing, offloading to a collaborator and offloading to an edge cloud, was considered in [11]. The problem was formulated as a sequential game to minimize the weighted sum of the task execution delay and energy consumption. Apart from utilizing the computing resources of the collaborators, their network function was also considered in [18]. Through time allocation, one portion of the data is transmitted from a mobile device to a collaborator for computation in the first time block; then, the collaborator acts as a relay and assists the transmission of another portion of the data from the mobile device to an AP in the subsequent two time blocks. These existing works on cooperative computing considered static scenarios without channel variation and assumed that the collaborator offers constantly dedicated idle computation resources, similar to edge/cloud servers. However, due to signal interference and user mobility, as well as stochastic task arrivals, the channel dynamics and the randomness of collaborator states should be considered in the offloading policy design, which motivates our study in this work. We model the dynamic offloading control as a finite-horizon MDP problem. Different from existing work [19][20][21] that solved MDP problems numerically and sub-optimally using deep reinforcement learning, we aim to derive the optimal solution, which yields useful insight into the policy structure.

System model
We consider an edge computing system with an IoT device, a collaborator and an edge cloud server co-located at an access point (AP). The IoT device has computationally intensive applications to be completed within a hard deadline, and the application-related data are stored in its buffer. In this paper, we focus on data-partitioned applications, such as virus scanning, file/figure compression and text conversion [2,22,23], for which the application data can be partitioned continuously and processed in parallel. The edge cloud provides dedicated computation services for the resource-limited IoT device, and its resources are sufficient for completing the workload from the device. The collaborator can also share its computation capability to assist the computation of the IoT device. However, due to its own randomly arriving computation tasks, the idle computation resources that the collaborator can offer are dynamic. To model the dynamics of the wireless channel and the collaborator state, the system time is divided into slots of equal duration $\tau$. The channel and collaborator conditions remain constant within a time slot but may vary between time slots. Consider an application with data size $W$ that should be completed before deadline $D$, i.e., within $T = D/\tau$ time slots. In each time slot $t$, $a_t^L$ bits of data are processed locally by the IoT device. In parallel, through existing cellular connectivity and proximity communication technologies, such as LTE and D2D communication, the IoT device offloads $a_t^E$ and $a_t^C$ bits to the edge server and the collaborator, respectively. Writing $W_t$ for the data size remaining in the buffer of the IoT device at the start of time slot $t$, with $W_1 = W$, the remaining data size at the next time slot $t+1$ can be expressed as
$$W_{t+1} = W_t - a_t^L - a_t^E - a_t^C.$$

Local computing
According to [24], with dynamic voltage and frequency scaling (DVFS) techniques, the local CPU frequency can be adjusted in each time slot to minimize the computation energy consumption. Let $\gamma$ denote the number of CPU cycles required per data bit; then the local CPU-cycle frequency of the IoT device at time slot $t$ is $c_t^L = \gamma a_t^L / \tau$. The local computation power consumption can be modeled as $P_t^L = \kappa_L (c_t^L)^3$ [25], where $\kappa_L$ is the energy conversion coefficient of the device, which depends on the chip architecture. The larger $\kappa_L$ is, the lower the computation energy efficiency. The local computation energy consumed by the IoT device at time slot $t$ is then expressed as
$$E_t^L = P_t^L \tau = \beta_L (a_t^L)^3, \qquad \beta_L = \kappa_L \gamma^3 / \tau^2.$$
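The cubic DVFS model above can be checked numerically with a minimal sketch. The function name `local_energy` and the constants are illustrative; the parameter values are the defaults reported in the simulation section, and the cubic power law $P = \kappa_L c^3$ is assumed.

```python
import math

KAPPA_L = 1e-27   # energy conversion coefficient (simulation-section default)
GAMMA = 1000.0    # CPU cycles required per bit
TAU = 0.02        # slot duration in seconds


def local_energy(bits, kappa=KAPPA_L, gamma=GAMMA, tau=TAU):
    """Energy (J) to process `bits` locally within one slot, assuming P = kappa * f^3."""
    freq = gamma * bits / tau      # CPU frequency needed to finish within the slot
    power = kappa * freq ** 3      # cubic DVFS power model
    return power * tau             # energy = power * slot duration


# Equivalent closed form: beta_L * bits^3 with beta_L = kappa * gamma^3 / tau^2.
BETA_L = KAPPA_L * GAMMA ** 3 / TAU ** 2

one_slot = local_energy(20_000)        # all bits in a single slot
two_slots = 2 * local_energy(10_000)   # same bits spread over two slots
```

Because the energy is cubic in the per-slot load, spreading the same data over two slots costs a quarter of the single-slot energy, which is why a longer deadline lowers the local computation energy.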

Direct offloading to edge cloud
Due to the effects of shadowing, multipath interference and mobility of the IoT device, correlated fading channels are considered in this paper. Such channels can be modeled by the finite-state Markov chain (FSMC) model with $K$ states, indexed as $k \in \{0, 1, 2, \ldots, K\}$. The channel state $H_t^{le}$ between the IoT device and the edge cloud stays constant during one time slot $t$ but may change between time slots. The transition probability between two neighboring states $k-1$ and $k$ is denoted as $P_{k-1,k}$, $1 \le k \le K$. These probabilities can be calculated based on practical fading models [26]. The transition probability between two states that are not neighbors is 0. According to the empirical model in the literature, the transmission power of the IoT device to the edge cloud at time slot $t$ can be modeled as the monomial function $P_t^E = \eta (s_t)^m / H_t^{le}$, where $\eta$ is the energy coefficient incorporating the effects of bandwidth and noise power, $s_t = a_t^E/\tau$ is the transmission rate at time slot $t$, and $m$ is the monomial order determined by the modulation-and-coding scheme. It is shown in [27,28] that the power-rate relation can be well approximated by a monomial function. Following [13], the monomial order $m = 3$ is adopted, which corresponds to a coding scheme with a target error probability below $10^{-6}$. The offloading energy consumption from the IoT device to the edge cloud at time slot $t$ is then $E_t^E = P_t^E \tau$. We omit the computation energy consumption of the edge cloud since it is usually powered by a stable on-grid supply.
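A short sketch may help make the two ingredients of this subsection concrete: a neighbor-only FSMC transition matrix and the monomial transmission energy. The helper names `fsmc_matrix` and `tx_energy` are hypothetical, the energy is taken as power times slot duration, and the two-state example uses the transition probabilities and channel gains reported in the simulation section while ignoring path loss.

```python
import math


def fsmc_matrix(p_up, p_down):
    """Tridiagonal transition matrix of a birth-death FSMC.

    p_up[k] is P(k -> k+1) and p_down[k] is P(k+1 -> k); all transitions
    between non-neighboring states are zero.
    """
    n = len(p_up) + 1
    mat = [[0.0] * n for _ in range(n)]
    for k in range(n):
        if k + 1 < n:
            mat[k][k + 1] = p_up[k]
        if k > 0:
            mat[k][k - 1] = p_down[k - 1]
        mat[k][k] = 1.0 - sum(mat[k])   # remaining mass stays in state k
    return mat


def tx_energy(bits, gain, eta=1e-21, tau=0.02, m=3):
    """Slot energy (J) under the monomial power model P = eta * rate^m / gain."""
    rate = bits / tau
    return eta * rate ** m / gain * tau


# Two-state example: state 0 = "bad" (gain 0.01), state 1 = "good" (gain 1).
two_state = fsmc_matrix(p_up=[3 / 10], p_down=[3 / 7])
penalty = tx_energy(10_000, 0.01) / tx_energy(10_000, 1.0)   # bad vs. good channel
```

Under this model a channel gain that is 100 times lower costs 100 times more energy for the same load, which is the effect the offloading policy exploits by shifting data toward good-channel slots.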

Collaborator-assisted computing or offloading
Since the future computation resource availability of the collaborator is variable and unknown, statistical information based on historical data can be utilized. As shown in [29], the idle/busy intervals in quantized CPU utilization traces roughly follow the geometric distribution, while the idle/busy intervals in Markov chains are exactly geometrically distributed. Hence, the CPU availability process can be modeled as a two-state Markov chain [13,29]. Based on the idle/busy state distribution, the state transition probabilities can be obtained by theoretical derivation [17] or real-world experiments [29]. In each time slot $t$, the CPU state of the collaborator is $C_t \in \{0, 1\}$, where $C_t = 0$ and $C_t = 1$ denote the idle and busy states, respectively. The current CPU state $C_t$ depends only on the previous state $C_{t-1}$ and is independent of the past states $\{C_1, \ldots, C_{t-2}\}$. Let $P_{bb}$ and $P_{ii}$ denote the busy-to-busy and idle-to-idle transition probabilities. Then the busy-to-idle and idle-to-busy transition probabilities are $P_{bi} = 1 - P_{bb}$ and $P_{ib} = 1 - P_{ii}$, respectively. When $C_t = 0$, the CPU is idle and the computation offloaded from the IoT device can be processed by the collaborator; otherwise, the collaborator forwards the computation to the edge cloud. The system model is shown in Fig. 1.
We first consider assisted computing when $C_t = 0$. Recall that $a_t^C$ data bits are offloaded to the collaborator in time slot $t$. Following the transmission energy model of Sect. 3.2 and the computation energy model of Sect. 3.1, the total energy consumption of the IoT device and the collaborator at time slot $t$ is the sum of two terms: the transmission energy of the IoT device, evaluated with the channel state $H_t^{lc}$ between the IoT device and the collaborator, and the computation energy of the collaborator, $\beta_C (a_t^C)^3$, where $\beta_C = \kappa_C \gamma^3 / \tau^2$ and $\kappa_C$ is the energy conversion coefficient of the collaborator.
When $C_t = 1$, the computation offloaded in this time slot cannot be processed by the collaborator. In this case, the collaborator forwards the offloaded computation $a_t^C$ to the edge cloud for processing. Therefore, the total transmission energy consumption of the IoT device and the collaborator at time slot $t$ is the sum of the transmission energies over the two hops, evaluated with $H_t^{lc}$ for the first hop and with $H_t^{ce}$, the channel state between the collaborator and the edge cloud server, for the second hop.
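The two-state CPU availability chain can be explored with a minimal sketch. The function names are hypothetical and the default probabilities $P_{bb} = 0.7$ and $P_{ii} = 0.8$ are the values used later in the simulation section; the stationary formula is the standard result for a two-state Markov chain.

```python
import math
import random


def stationary_idle(p_bb=0.7, p_ii=0.8):
    """Long-run fraction of slots in which the collaborator CPU is idle."""
    p_bi, p_ib = 1.0 - p_bb, 1.0 - p_ii
    return p_bi / (p_bi + p_ib)


def simulate_cpu(steps, p_bb=0.7, p_ii=0.8, start=0, seed=1):
    """Sample a CPU-state trace (0 = idle, 1 = busy) from the two-state chain."""
    rng = random.Random(seed)
    state, trace = start, []
    for _ in range(steps):
        trace.append(state)
        stay = p_ii if state == 0 else p_bb
        if rng.random() >= stay:      # leave the current state
            state = 1 - state
    return trace


trace = simulate_cpu(100_000)
empirical_idle = trace.count(0) / len(trace)   # should approach stationary_idle()
```

With these values the collaborator is idle in roughly 60% of the slots in the long run, so computation assistance and relaying are both expected to occur regularly within a deadline of a few slots.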

Problem formulation
A Markov decision process (MDP) is a mathematical framework for modeling decision-making problems in a stochastic system with multiple states and statistical system information, which suits our problem well. Considering the hard completion time constraint of the application, we formulate the opportunistic cooperative computation offloading as a finite-horizon MDP problem. The expected energy consumption of both the IoT device and the collaborator is minimized, subject to the hard deadline constraint of the application.
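(1) State and action sets. The system state $s_t \in \mathcal{S}$ of our MDP formulation collects the remaining data size in the buffer, the CPU state $C_t$ of the collaborator and the channel states $H_t^{le}$, $H_t^{lc}$ and $H_t^{ce}$. The action in time slot $t$ is the triple $(a_t^L, a_t^E, a_t^C)$ of data sizes processed locally, offloaded to the edge cloud and offloaded to the collaborator. Here $I(l)$ is an indicator function, which is equal to 1 if the condition $l$ holds and 0 otherwise. The total energy consumption of the IoT device and the collaborator in each time slot $t$, including the transmission energy consumption and the computation energy consumption, is the sum of the local computation energy, the energy for direct offloading to the edge cloud and the collaborator-assisted energy for the current CPU state $C_t$. An offloading policy consists of $T$ decision rules for the $T$ decision epochs $t \in \{1, 2, \ldots, T\}$ and is defined below. A decision rule at time slot $t$ maps states to actions and is denoted by $a_t : \mathcal{S} \to \mathcal{A}$.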

Definition 1
An admissible offloading policy is a function that maps the buffer state, the CPU state of the collaborator, the channel states and the time slot information into an action in each decision period. We search over the space of all admissible policies. Our objective is to minimize the expected total energy consumption of the IoT device and the collaborator over the $T$ time slots, given an initial state $s_1 \in \mathcal{S}$, where the expectation is taken with regard to the stochastic system state $S_t$ for all $t$. In this paper, we consider deterministic Markov policies, which are shown to be optimal under the expected total reward criterion [30]. Solving for the optimal policy is nontrivial since the decision in each time slot cannot be taken independently. The action in each time slot affects the system transition probability, the future states and the energy consumption in subsequent time slots. The offloading policy can be computed numerically by a brute-force method or by reinforcement learning techniques. However, the complexity of the brute-force method increases exponentially with the number of state variables, and reinforcement learning requires a large amount of training time to learn a good strategy by trial and error. In the following section, we exploit the known model information to derive a closed-form policy, which provides insight for the strategy design and efficiently alleviates the curse of dimensionality in stochastic optimization.

Optimal computation offloading policy
In this section, we derive the optimal computation offloading policy in each time slot based on the principle of optimality and dynamic programming (DP) [31]. The optimal closed-form solution greatly reduces the complexity of the problem solving.
Define $V_t(s_t)$ as the cost-to-go function that represents the minimum expected sum energy consumption from time slot $t$ to $T$. Based on the principle of optimality, we have the following Bellman optimality equation:
$$V_t(s_t) = \min_{a_t} \Big\{ E_t(s_t, a_t) + \mathbb{E}\big[ V_{t+1}(s_{t+1}) \mid s_t, a_t \big] \Big\},$$
where $E_t(s_t, a_t)$ denotes the total energy consumed in time slot $t$. It can be seen that the optimal cost-to-go function in each time slot depends on the functions in subsequent time slots. We first consider the policy in the last time slot $T$ given the system state $s_T$. The problem for $V_T(s_T)$ can then be written as follows.
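Problem $P_1$: minimize, over the action $a_T = (a_T^L, a_T^E, a_T^C)$, the total energy consumed in time slot $T$ given the state $s_T$. The action $a_T$ should satisfy the following constraints: the three data sizes are nonnegative and sum to the data size remaining in the buffer, so that the application is completed by the deadline.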
Through solving the constrained optimization problem, the optimal offloading decision can be given as follows.
Lemma 1 At the last time slot T, the optimal amount of data for local processing, offloading to the edge cloud and offloading to the collaborator is, respectively, derived as

The minimum energy consumption $V_T(s_T)$ is
Proof Problem $P_1$ is a convex optimization problem. When $c_T = 0$, define the Lagrangian function with a nonnegative Lagrange multiplier $\delta$ associated with the data-size constraint. Applying the KKT conditions yields the optimal solution for this case. The optimal offloading solution when $c_T = 1$ can be derived similarly. Combining the results for $c_T = 0$ and $c_T = 1$, the optimal policy at time slot $T$ is obtained.
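The KKT step can be illustrated with a small sketch under one simplifying assumption: each destination $i$ (local device, edge cloud, collaborator) has a cubic per-slot cost $\theta_i a_i^3$. The symbols $\theta_i$ and the function `last_slot_split` are illustrative stand-ins, not the paper's notation. Stationarity gives $3\theta_i a_i^2 = \delta$ for every $i$, so $a_i \propto \theta_i^{-1/2}$; with the constraint $\sum_i a_i = W_T$ this yields $a_i = W_T\,\theta_i^{-1/2} / \sum_j \theta_j^{-1/2}$ and a minimum cost of $W_T^3 / (\sum_j \theta_j^{-1/2})^2$.

```python
import math


def last_slot_split(remaining_bits, thetas):
    """Split the remaining data across destinations with costs theta_i * a_i^3.

    Returns (allocation, minimum energy). Both follow from the KKT
    stationarity condition 3 * theta_i * a_i^2 = delta for all i.
    """
    weights = [1.0 / math.sqrt(th) for th in thetas]
    total = sum(weights)
    alloc = [remaining_bits * w / total for w in weights]
    energy = remaining_bits ** 3 / total ** 2
    return alloc, energy


# Illustrative coefficients (local, edge, collaborator); the values are assumptions.
THETAS = [2.5e-15, 1.0e-14, 2.5e-15]
alloc, energy = last_slot_split(50_000, THETAS)
local_only = THETAS[0] * 50_000 ** 3          # cost if everything is computed locally
saving_factor = energy / local_only           # proportional reduction from cooperation
```

In this sketch the ratio `saving_factor` depends only on the cost coefficients and not on the data size, which is one way to read the statement that cooperative computing reduces the energy consumption by a proportional factor.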
Some important observations follow from Lemma 1. Firstly, the data size offloaded to the edge cloud is closely related to two parameters: the local computation energy consumption per bit $\beta_L$ and the transmission energy consumption per bit $\eta/h_T^{le}$. The IoT device offloads more data to the edge cloud when local computation is energy consuming or when the channel between the device and the edge cloud is in a good state. Similarly, apart from $\beta_L$, the data size offloaded to the collaborator is related to the sum energy consumption of transmission (from the IoT device to the collaborator) and computation (at the collaborator), $\eta/h_T^{lc} + \beta_C$, when $c_T = 0$, and to the two-hop transmission energy consumption from the IoT device to the edge cloud, $\eta/h_T^{lc} + \eta/h_T^{ce}$, when $c_T = 1$. The portions of data for offloading and local processing are determined by these parameters. Besides, the energy consumption is reduced by a proportional factor through parallel computing. More energy is saved with better wireless channel conditions or higher computation energy efficiency of the collaborator.
Lemma 1 provides the optimal solution for different system states and the minimum network energy consumption at the last time slot. Based on this closed-form policy, the optimal computation offloading policy in each time slot can be derived through backward induction, as stated in Theorem 1.

Theorem 1
At time slot $t = 1, 2, \ldots, T$, the optimal policy determines the data $a_t^{L\star}$ for local processing, $a_t^{E\star}$ for offloading to the edge cloud and $a_t^{C\star}$ for offloading to the collaborator, which satisfies: where $\phi_t(s_t)$ is defined as Correspondingly, the minimum expected energy consumption of the network $V_1(s_1)$ is
Proof Firstly, we derive the optimal action and the expected energy consumption from time slot $T-1$ to $T$ for both $c_{T-1} = 0$ and $c_{T-1} = 1$. When $c_{T-1} = 0$, the optimization problem for $V_{T-1}(s_{T-1})$ minimizes the energy consumed in time slot $T-1$ plus the expected value of $V_T(s_T)$. Substituting $V_T(s_T)$ from Lemma 1, the problem (19) can be rewritten in terms of the action $a_{T-1}$ alone, subject to the constraint on $a_{T-1}$. Applying the KKT conditions, the optimal policy when $c_{T-1} = 0$ is obtained. The optimal policy when $c_{T-1} = 1$ can be derived similarly. Combining the results of both cases gives the optimal offloading policy and the minimum energy consumption at time slot $T-1$. Comparing the optimal solution (14) for $t = T$ with (27) for $t = T-1$, a similar structure can be found. Therefore, using backward induction and a similar derivation, the optimal computation offloading policy can be given as in Theorem 1.
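The backward induction can be sketched in code under explicit assumptions that are not taken from the paper: every per-slot cost is cubic, $\theta_i a_i^3$; the three channels and the CPU state evolve independently; all channels share one two-state transition matrix; and path loss is ignored. Under these assumptions the cost-to-go has the form $V_t = W_t^3\,\mathrm{coef}_t(s_t)$, the future term `phi[t][s]` is the expectation of $\mathrm{coef}_{t+1}$ over the next state, and deferring data to later slots behaves like a fourth destination with cost coefficient `phi[t][s]`. All names and the value of `ETA` are illustrative.

```python
import itertools
import math

GAINS = (1.0, 0.01)                            # channel index 0 = good, 1 = bad
P_CHAN = ((4 / 7, 3 / 7), (3 / 10, 7 / 10))    # rows: from good, from bad
P_CPU = ((0.8, 0.2), (0.3, 0.7))               # CPU index 0 = idle, 1 = busy
BETA_L, BETA_C, ETA = 2.5e-15, 0.75e-15, 1e-15  # ETA is an assumed effective value
STATES = list(itertools.product((0, 1), repeat=4))  # (cpu, h_le, h_lc, h_ce)


def thetas(state):
    """Cubic cost coefficients for (local, edge, collaborator) in this state."""
    cpu, le, lc, ce = state
    second = ETA / GAINS[ce] if cpu == 1 else BETA_C   # relay hop or computation
    return (BETA_L, ETA / GAINS[le], ETA / GAINS[lc] + second)


def inv_sqrt_sum(values):
    return sum(v ** -0.5 for v in values)


def trans_prob(s, n):
    """Joint transition probability, assuming independent CPU and channels."""
    return P_CPU[s[0]][n[0]] * math.prod(P_CHAN[s[i]][n[i]] for i in (1, 2, 3))


def backward_induction(horizon):
    """Return coef[t][s] (so that V_t = W^3 * coef[t][s]) and phi[t][s]."""
    coef = {horizon: {s: inv_sqrt_sum(thetas(s)) ** -2 for s in STATES}}
    phi = {}
    for t in range(horizon - 1, 0, -1):
        phi[t] = {s: sum(trans_prob(s, n) * coef[t + 1][n] for n in STATES)
                  for s in STATES}
        coef[t] = {s: (inv_sqrt_sum(thetas(s)) + phi[t][s] ** -0.5) ** -2
                   for s in STATES}
    return coef, phi


def optimal_action(remaining, state, t, horizon, phi):
    """Bits for (local, edge, collaborator); the rest is deferred to later slots."""
    costs = list(thetas(state))
    if t < horizon:
        costs.append(phi[t][state])            # deferral acts as a fourth destination
    weights = [c ** -0.5 for c in costs]
    total = sum(weights)
    return tuple(remaining * w / total for w in weights[:3])


coef, phi = backward_induction(7)
start = (0, 0, 0, 0)                           # idle CPU, all channels good
expected_energy = 400_000 ** 3 * coef[1][start]
first_action = optimal_action(400_000, start, 1, 7, phi)
```

In this sketch a larger `phi[t][s]` shrinks the weight of deferral, so more data is processed in the current slot, which matches the qualitative behavior discussed after Theorem 1. The table `coef` is computed once per horizon and then reused for every sample path.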
Theorem 1 demonstrates that in each time slot, the computation offloading policy depends not only on the current system state, but also on the future states through the term $\phi_t(s_t)$ defined in Theorem 1. In detail, a larger $\phi_t(s_t)$ means a higher expected energy consumption per bit of data completed in future time slots. Therefore, more data should be processed in the current time slot, whether by local computing or by offloading. Moreover, the same conclusion as in Lemma 1 holds. That is, the fractions of the optimal data size for local computing, offloading to the edge cloud and offloading to the collaborator in each time slot are determined by the local computation energy consumption per bit $\beta_L$ and the offloading energy consumption per bit $\eta/h_t^{le}$, $\eta/h_t^{lc} + \beta_C$ or $\eta/h_t^{lc} + \eta/h_t^{ce}$. Given the total data processed in one time slot, the destination (local device, edge cloud or collaborator) at which more data are processed is the one with the lower energy consumption per bit.
Based on Theorem 1, Algorithm 1 gives the optimal closed-form computation offloading policy. We now analyze the computational complexity of the algorithm. By Theorem 1, the optimal policy is determined by the known initial system state $s_1$ and the unknown $\phi_1(s_1)$, where the complexity of calculating $\phi_1(s_1)$ depends on the three channel state spaces, the dimension of the collaborator CPU state and the number of time slots. Denoting the number of channel states as $N$, the computational complexity of the algorithm is $O(2N^3T)$. For the direct DP approach that searches all actions in the policy space in each time slot, the dimensions of the state space and the action space in time slot $t$ are $O(2N^3)$ and $O(W^3)$, respectively, leading to a total computational complexity of $|\mathcal{S}|^2 |\mathcal{A}| T = O(4N^6W^3T)$, which is impractical for large data size $W$ and number of time slots $T$. Compared with the brute-force policy search, our closed-form policy reduces the computational complexity dramatically.
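To give a feel for the gap between the two complexity expressions, the following sketch simply evaluates them. The function names are hypothetical, and the choice of 400 discrete data levels (a 400-kilobit application at an assumed 1-kilobit granularity) is an illustration, not a setting from the paper.

```python
def closed_form_ops(n_channel_states, horizon):
    """Operation count of the closed-form policy, 2 * N^3 * T."""
    return 2 * n_channel_states ** 3 * horizon


def brute_force_ops(n_channel_states, data_levels, horizon):
    """Operation count of direct DP, 4 * N^6 * W^3 * T, with W as discrete levels."""
    return 4 * n_channel_states ** 6 * data_levels ** 3 * horizon


# Two channel states and seven slots, as in the default simulation setting.
fast = closed_form_ops(2, 7)
slow = brute_force_ops(2, 400, 7)   # 400 data levels is an assumed discretization
ratio = slow // fast
```

Even at this coarse granularity the direct search needs about nine orders of magnitude more operations, and the gap widens cubically as the data discretization becomes finer.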
Note that the entire process, including the offloading decision and the application execution, is controlled by the AP. Once an application is generated, the IoT device reports its data size $W$, application deadline $T$ and the related parameters of local computing via the feedback channel. Based on the historical statistical information on the channel state and the CPU state of the collaborator, the AP executes Algorithm 1 to obtain the amounts of data for local computing ($a_t^L$) and offloading ($a_t^E$ and $a_t^C$) in each time slot. The IoT device is informed of the policy through the bidirectional feedback channel, and the corresponding computation offloading process starts.
In this work, we consider the hard deadline requirement of an IoT application, which is usually an explicit parameter when the application is generated. This is common in most of the existing literature, see [10][11][12][13]. However, if the deadline parameter is hard to obtain in a fine-grained way, a statistical value of the deadline for a class of IoT applications should be obtained from historical information. In this case, there are two options:
(1) A guard time can be set to cope with this uncertainty. That is, the deadline is set as the minimum historical value less a guard time. Our scheme can still be adopted in this case.
(2) The application should be completed before a probabilistic deadline. This makes the original problem more complicated since the constraint takes a probabilistic form. This problem will be considered in our future work.

Results and discussion
In this section, we evaluate the performance of the proposed Algorithm 1 by simulation. Default parameters are set as follows unless stated otherwise. The time slot duration is 20 ms [32]. We consider an IoT device with an application of 400 kilobits under a deadline constraint of 140 ms, i.e., $T = 7$ time slots. The delay budget $T$ is the actual delay budget of the application less the time needed to run Algorithm 1. The energy conversion coefficients are $\kappa_L = 10^{-27}$ and $\kappa_C = 0.3 \times 10^{-27}$ [18], and the required CPU cycles for computing 1-bit data are $\gamma = 1000$ cycles/bit [33]. For large-scale fading, the distances for IoT device-edge cloud, IoT device-collaborator and collaborator-edge cloud are 300 m, 50 m and 260 m, respectively. The path-loss exponent is 2.4. Small-scale fading of the channel is modeled as a two-state Markov chain: if the measured channel gain is below a threshold, the channel is considered "bad"; otherwise, it is considered "good". The average channel power gains are 1 and 0.01 for the "good" and "bad" states, respectively [15]. The good-to-bad and bad-to-good transition probabilities are set as $P_{gb} = 3/7$ and $P_{bg} = 3/10$, respectively [28]. The CPU state of the collaborator follows another Markov chain with $P_{bb} = 0.7$ and $P_{ii} = 0.8$ [13], and the energy coefficient is $\eta = 10^{-21}$. To validate the optimality of the derived policy, optimal numerical results obtained by brute-force policy search are presented. Then three algorithms are considered for performance comparison.
(1) Local or edge server execution (LESE) [15]: The computation load is entirely processed locally by the IoT device or entirely offloaded to the edge cloud through the stochastic channel. The energy consumption of the IoT device is minimized.
(2) Local and dynamic collaborator execution (LDCE) [13]: Part of the computation load is processed locally by the IoT device, and the rest is concurrently offloaded through the stochastic channel to a collaborator with dynamic computation resources. The energy consumption of the IoT device is minimized.
(3) Equal allocation in each slot (EA): The computation load is processed locally by the IoT device, offloaded to the edge cloud and offloaded to a collaborator with dynamic computation resources in parallel. The input data $W$ are allocated to the $T$ time slots equally, regardless of the channel conditions and the collaborator CPU state. The offloading policy in each slot, $a_T = (a_T^L, a_T^E, a_T^C)$, is then obtained with a derivation procedure similar to that of the proposed algorithm. The energy consumption of both the IoT device and the collaborator is minimized.
Figure 2 shows the total expected minimum energy consumption versus data size for the derived closed-form policy and the brute-force policy search under different deadline constraints. The results of the two approaches are very close, which demonstrates the optimality of our computation offloading policy. Meanwhile, as the data size of the application increases, the expected energy consumption grows at an increasing rate. The reason is that more energy, both computation energy and transmission energy, is consumed to complete more computation within the given deadline. A more stringent deadline constraint also leads to higher expected energy consumption.
The total expected energy consumption versus the application data size for our approach and the three benchmark schemes is depicted in Fig. 3. As the application data size increases, the total expected energy consumption grows at an increasing rate in all four schemes, which is consistent with Fig. 2. It can also be observed that the total expected energy consumption of the proposed scheme is lower than that of the EA scheme. This is because the proposed scheme considers the channel and CPU states when offloading. More (less) data are transmitted when the channel state is good (bad), which saves transmission energy. The proposed scheme also outperforms LDCE in [13]. The reason is that the presence of the edge server increases the offloading opportunities for the IoT device to reduce its energy consumption. More wireless channels and computation resources allow the IoT device to transmit more application-related data on better channels, which achieves a diversity gain. Moreover, the collaborator and the edge server can process the data in parallel, which reduces the application completion time substantially. Similarly, the proposed scheme performs better than LESE in [15] due to the diversity gain.
Overall, this demonstrates the efficiency of the proposed cooperative computing.
Figure 4 depicts the total expected energy consumption versus the application deadline. As the deadline extends, the total expected energy consumption decreases at a decreasing rate in all four schemes. On the one hand, the local processing rate can be lowered when more time is available, which further reduces the local computation energy consumption. On the other hand, with more time slots the computation can be opportunistically offloaded when the channel condition is good or the collaborator CPU state is favorable, which reduces the transmission energy consumption. This also demonstrates that extending the deadline contributes more to the energy saving when the deadline constraint is stringent. When the deadline is relaxed, the constraint is inactive and has less impact on the expected energy consumption.
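The diminishing return of a longer deadline can be seen from local computation alone. The sketch below assumes the dynamic voltage and frequency scaling model that is common in the edge computing literature, in which each CPU cycle costs `kappa * f**2` at clock frequency `f`, and it assumes the device runs at the lowest constant frequency that meets the deadline. The resulting energy is `kappa * cycles**3 / deadline**2`. The parameter values are hypothetical and are not taken from the paper's simulation settings.

```python
def local_energy(bits, deadline_s, cycles_per_bit=1000.0, kappa=1e-27):
    """Energy (J) to process `bits` locally by `deadline_s` at constant frequency."""
    cycles = cycles_per_bit * bits
    frequency = cycles / deadline_s  # lowest frequency that meets the deadline
    return kappa * cycles * frequency ** 2


def marginal_savings(bits, deadlines):
    """Energy saved by each successive deadline extension."""
    energies = [local_energy(bits, d) for d in deadlines]
    return [earlier - later for earlier, later in zip(energies, energies[1:])]


if __name__ == "__main__":
    deadlines = [0.1, 0.2, 0.3, 0.4, 0.5]  # seconds, hypothetical
    print([local_energy(4e5, d) for d in deadlines])
    print(marginal_savings(4e5, deadlines))
```

Each additional 0.1 s saves less energy than the previous one, which matches the trend that extending the deadline helps most when the constraint is stringent.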
We then evaluate the effect of the collaborator energy conversion coefficient κ_C on the total expected energy consumption in Fig. 5. As κ_C increases, the energy consumption of the three collaborator-related schemes increases accordingly, since the per-bit computation energy consumption of the collaborator grows. Because the LDCE scheme can only use the computation resources of the local device and the collaborator, its energy consumption grows rapidly. The increase in the energy consumption of the other two schemes, i.e., the proposed scheme and the EA scheme, is alleviated because more computation can be offloaded to the edge cloud.
The total expected energy consumption versus the number of CPU cycles per bit γ is shown in Fig. 6. The energy consumption grows as γ increases in all the schemes, but at different rates. The slope changes the most for the LDCE scheme. This is because the computation is processed only at the local IoT device and the collaborator, and both computation energies depend on γ. For the LESE scheme, the local computation energy grows as γ increases, and more computation is offloaded to the edge cloud. As a result, the expected energy consumption grows at a decreasing rate. The growth rate of the proposed scheme is similar to that of the LESE scheme rather than the LDCE scheme, which shows that offloading to the edge cloud is preferred as γ increases. The performance of our scheme is still better than that of the LDCE and LESE schemes, since more computing resources can be utilized. Moreover, the advantage of the proposed approach is more obvious with larger γ. Thus, the performance gap between the proposed approach and the equal allocation scheme gets larger.
Figure 7 shows the total expected energy consumption versus the distance between the IoT device and the collaborator.
It can be noticed that the energy consumption grows as the collaborator moves farther away from the device, since more transmission energy is consumed. Besides, the energy consumption of the proposed scheme is close to that of the LDCE scheme when the distance is short, but the performance gap gets larger as the distance extends. This also demonstrates the effectiveness of proximate computing.
Table 1 shows the execution time of Algorithm 1, the brute-force policy search and the three comparison algorithms. These algorithms are run on a computer with a dual-core Intel CPU at 2.9 GHz. The time is averaged over 1000 runs with the data sizes uniformly distributed in [200, 500] kilobits. It can be noticed that the execution time of Algorithm 1 is much lower than that of the brute-force policy search, thanks to the derived closed-form optimal policy. The execution time of Algorithm 1 is larger than that of the three comparison algorithms, which reflects the trade-off between performance gain and computational complexity. It should be highlighted that the absolute time is only a reference, because it is highly dependent on the machine running the algorithms.
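A timing procedure of the kind described for Table 1 can be sketched as follows. This is a minimal illustration and not the authors' benchmarking code: `placeholder_policy` is a hypothetical stand-in for whichever algorithm is being timed, and the fixed seed is an assumption added for repeatability. Only the number of runs (1000) and the data size range ([200, 500] kilobits) come from the text.

```python
import random
import time


def average_runtime(policy_fn, runs=1000, low_kbits=200.0, high_kbits=500.0, seed=0):
    """Average wall-clock time of policy_fn over uniformly drawn data sizes."""
    rng = random.Random(seed)  # seeded for repeatable data sizes (assumption)
    total_seconds = 0.0
    for _ in range(runs):
        data_kbits = rng.uniform(low_kbits, high_kbits)
        start = time.perf_counter()
        policy_fn(data_kbits)
        total_seconds += time.perf_counter() - start
    return total_seconds / runs


def placeholder_policy(data_kbits):
    """Hypothetical stand-in for the algorithm under test."""
    return data_kbits / 3.0


if __name__ == "__main__":
    print(f"{average_runtime(placeholder_policy):.3e} s per run")
```

Only the call to the algorithm is placed between the two timer reads, so the random draw of the data size does not contribute to the measured time.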

Conclusions
In this work, we study the cooperative computing between IoT devices, collaborators with dynamic idle computation resources and a dedicated edge cloud. Specifically, an IoT device can compute locally and offload computation load to a collaborator and the edge cloud in parallel. The collaborator assists in computing at idle states and in further offloading to the edge cloud at busy states. The problem of how much computation load is executed locally, offloaded to the edge cloud and offloaded to a collaborator is modeled as a finite horizon Markov decision problem with the objective of minimizing the expected total energy consumption of the IoT device and the collaborator, subject to a hard completion time constraint. The optimal offloading policy is derived based on stochastic optimization theory, which alleviates the well-known curse of dimensionality and facilitates the design of a low-complexity dynamic programming algorithm. Simulation results validate the optimality of the proposed policy and show that more energy is saved with better wireless channel conditions or higher computation energy efficiency of collaborators.
In the future, we will focus on extending this work to more general scenarios in which a large number of IoT devices exist. Resource competition (for communication and computation resources) and collaborator selection need to be addressed efficiently. Moreover, based on the insights of this work, devising an online learning algorithm is also an interesting direction, which may require less model information but more time (for training) to obtain a satisfactory policy. Lastly, although we consider the energy consumption of the peer collaborators in this work, incentive mechanisms can still be designed to encourage nearby peer devices to share their idle communication and computation resources and achieve a win-win situation.