Research on energy-efficient routing algorithm based on SWIPT in multi-hop clustered WSN for 5G system

As one of the basic supporting technologies of 5G system, wireless sensor networks technology is facing a new challenge to improve its transmission energy efficiency. This paper considers combining simultaneous wireless information and power transfer (SWIPT) technique and routing technique, and applying them to multi-hop clustered wireless sensor networks (MCWSN), where each node can decode information and harvest energy from a received radio-frequency signal. And the relay nodes in MCWSN can utilize the harvest energy to forward data to their next hop nodes according to the routing scheme. First, we formulate an energy-efficient routing problem of MCWSN with SWIPT. Then, a heuristic energy efficient cooperative SWIPT routing algorithm (EECSR) is presented to find a transmission path with the maximum energy efficiency. Specifically, in EECSR, the resource allocation problem in each hop of the path is transformed to some equivalent convex optimization problems, which are resolved via dual decomposition. Moreover, a distributed routing protocol based on EECSR is proposed. As far as we know, this is the first solution that considers energy efficiency optimization based on routing and SWIPT in MCWSN. Simulation results show that our EECSR algorithm has high energy efficiency and good robustness. And our distributed routing protocol has better real-time performance than traditional protocols.

development of 5G system has brought new opportunities for WSN. However, new technologies and applications based on 5G come with new challenges, which require WSN to have higher transmission energy efficiency and longer life time [14].
Energy limitation has always been one of the challenges of WSN applications. It is well known that most of the energy in wireless networks is consumed in data transmission. Therefore, maximizing transmission energy efficiency is one of the concerns in the application and research of WSN. In recent years, a radio-frequency (RF) based energy harvesting technology, namely, simultaneous wireless information and power transmission (SWIPT), has been regarded as a promising technique to improve transmission energy efficiency of wireless networks [15,16]. For WSN, SWIPT can make nodes harvest energy and receive information via the same RF signal [17]. This feature of SWIPT can balance the energy distribution and provide a potential long-term operation of networks. By employing a power splitting (PS) or a time switching (TS) circuit, receivers in SWIPT can adjust the splitting ratio to change the throughput and energy consumption of communication links [18,19].
In recent years, SWIPT has been studied to improve the energy efficiency of different wireless systems related to 5G network, such as orthogonal frequency-division multiple (OFDM) systems [15,20,21], multiple-input multiple-output (MIMO) systems [7,18], femtocells [16] and mobile clustered WSN [22]. Moreover, due to the high energy efficiency and reliability, SWIPT cooperative networks have aroused more interest of researchers. For example, in [23], the authors have studied resource allocation policies in the cooperative wireless network with SWIPT to optimize energy efficiency. In [24,25], the authors have presented the energy efficiency optimization for SWIPT in MIMO two-way amplify-and-forward relay networks. Meanwhile, in order to improve energy and spectrum efficiencies of wireless networks, some studies have considered applying SWIPT to cooperative NOMA wireless networks [26,27]. In addition, the relay selection policy in cooperative networks has been studied based on the tradeoff between information transmission and energy harvest [28]. The authors of [29] have formulated a resource allocation problem in wirelessly powered sensor networks, and have presented an energy efficient cooperative transmission scheme based on SWIPT.
It can be observed that most studies on SWIPT technology have only focused on onehop or two-hop small networks. However, in the applications of WSN, multi-relay transmission can effectively reduce the transmit power and improve the efficiency of data collection [30,31]. Moreover, based on the energy supplement function of SWIPT, some studies have also applied SWIPT to multi-hop wireless networks to improve network performance [32][33][34][35][36]. In [32], the authors have optimized the energy harvest ratios of multi-hop amplify-and-forward (AF) and decode-and-forward (DF) WSN with SWIPT. In a multi-hop SWIPT DF sensor network, a problem of minimizing transmitting energy has been studied under an end-to-end throughput constraint [33]. Subject to transmit power constraints, the authors of [34,35] have focused on the relay beamforming design to maximize the achievable rate of a MIMO multi-hop sensor network with SWIPT. In [36], SWIPT has been applied to a multi-hop clustered WSN, where the transmission energy efficiency sum has been maximized. However, this paper has only focused on the communication between two adjacent cluster heads instead of the source nodes and the sink.
Most of the above works on SWIPT in multi-hop networks have only considered improving the performance of one given path. The authors of [37,38] have started to study routing algorithms based on SWIPT in multi-hop energy-constrained WSNs. And these two works have shown that routing algorithms based on SWIPT can find the transmission paths with the maximum throughput and the minimum consumption energy, respectively. As we know, routing can affect the energy efficiency of the transmission paths in WSN [39,40]. Therefore, in this paper, we consider to supplement the energy of the relays by implementing SWIPT in multi-hop clustered wireless sensor networks (MCWSN). Based on this scheme, we plan to introduce a routing algorithm to further exploit SWIPT. Our approach has advantages as follows: (1) SWIPT allows us to dynamically adjust the data rate and energy consumption of links to achieve higher transmission energy efficiency; (2) Benefiting from the energy supplement of SWIPT, relay selection becomes more flexible; (3) Relays replenished by SWIPT may form a routing scheme with higher transmission energy efficiency.
Although the combination of SWIPT and routing algorithm has the above advantages, it is not easy to combine these two techniques directly. The main challenge of our work is to concurrently consider the impacts of routing and resource allocation on the transmission energy efficiency of a path. Note that the resource mentioned here specifically refers to the transmit power of nodes in paths, the energy harvested at the receivers, and candidate relay nodes. The more challenging thing is that the routing policy and the resource allocation policy interact with each other. On one hand, different routing strategies will make each node in the path have a different next hop node, which will obviously result in different resource allocation policies. On the other hand, in each hop, resource allocation policies can determine the routing metrics of corresponding links and nodes. And different routing metrics may cause different routing results. Therefore, our aim is to find the most energy efficient transmission path for source CHs by jointly determining the optimal policies, including the optimal routing policy and the resource allocation policy for each hop.
For this purpose, we first formulate an energy-efficient routing problem for multihop clustered wireless sensor networks with PS-SWIPT. Then we design a heuristic energy efficient cooperative SWIPT routing algorithm (EECSR) to decompose our problem into an energy efficient routing subproblem and multiple resource allocation subproblems. In EECSR, during the process of finding the energy efficient paths, the resource allocation policies of communication links are solved. Specifically, the resource allocation subproblems for links are formulated as non-convex optimization problems with the object of maximum energy efficiency. These problems are transformed to equivalent convex problems and then optimally resolved by the dual decomposition method. Furthermore, we provide a table-driven distributed energyefficient cooperative SWIPT routing (TDEECSR) protocol for our routing problem. MCWSN is a type of sensor network often used in object monitoring applications. As far as we know, this is the first work considering the application of SWIPT-based routing algorithms to MCWSN's transmission energy efficiency optimization. Our simulation experiments demonstrate the convergence of proposed algorithms. Then, the impacts of parameters on the energy efficiency of our algorithms are analyzed. Moreover, under different transmission modes and network energy scenarios, a lot of simulation results prove the energy efficiency of our EECSR algorithm and the practicability of TDEECSR protocol.

Network model
In this paper, we consider a multi-hop clustered wireless sensor network (MCWSN) as shown in Fig. 1. The sink node is the data collection end. The nodes in the network are divided into multiple clusters. In a cluster, all member nodes are distributed within onehop communication range of the cluster head (CH), which has better performance than that of the members. CHs can form a multi-hop path to transmit data from a source CH to the sink, like the path 1 → 2 → 3 → sink in Fig. 1, where CH 1 is the source, CH 2 and CH 3 are relays. In addition, there are some cluster members which can communicate with more than one CH. We define this kind of member nodes as candidate relays for the cooperative transmission between two adjacent CHs, e.g., r 12 (1) , r 12 (2) and r 12 (3) in Fig. 1.
We choose power splitting (PS) mode instead of time splitting (TS) mode to support SWIPT. This is because TS mode requires that the transceiver nodes in communication links must be time synchronized. However, this requirement may not be guaranteed in some harsh application environments of WSN. We can observe in Fig. 2 that one node contains a receiving antenna, a PS unit, an energy harvester (EH) circuit, an information decoder (ID) circuit and a rechargeable battery. The node can split the received signal power into two parts according to ρ I and ρ E , where ρ I is the power splitting ratio of information decoding, and ρ E is the ratio of energy harvesting. The battery can be recharged by the harvest energy, and can support the information decoding and forwarding. Two kinds of links can work in SWIPT state. One is the link from a source CH to a member relay, like the link from CH 1 to r 1,2 (2) in Fig. 2. And the other kind is the link between CHs. In these SWIPT links, the receivers can dynamically adjust their power splitting ratios.

Transmission model
In the network, nodes are half-duplex and channels are symmetrical [15]. If we regard the sink as a special CH, then a multi-hop path is a chain of CH-to-CH cooperative transmission subnetworks, which we may refer as subnet in the following. In each subnet, we have one source CH i and one destination CH j , where i, j ∈ K , and K is the set of identity number of all CHs. In addition, there are a set of the candidate relays between the source and destination CHs, which is named as relay candidate set N ij . Each subnet contains links such as link ir , link rj and link ij , where r ∈ N ij . Corresponding to these links, we let the channel gains as h ir , h rj and h ij , and let the additive white Gaussian noise power be σ 2 ij , σ 2 ir and σ 2 rj , respectively. On the premise that the receiver can obtain the perfect channel state information (CSI), the normalized effective channel gains are defined as g ij = h ij 2 /σ 2 ij , g ij = h ir 2 /σ 2 ir , g ij = h rj 2 /σ 2 rj , where h ab = K ab /(L ab ) α , K ab characterizes the small-scale Rayleigh fading of the channel between the transmitter a and the receiver b, L ab is the distance between a and b, and α is the path loss coefficient.
In a determined routing scheme, the data transmission in one path is implemented by the subnet of each hop. So, we just describe the details of cooperative transmission in one subnet as follows. Before transmission, the source CH i in a subnet selects a cooperative relay with the highest priority in the N ij . The calculation of the priority of each candidate relay will be introduced in the "Resource allocation optimization for subnetworks" section. We assume that the relays in the network adopt the decode and forward (DF) method. Then a cooperative transmission includes the following two phases. PHASE 1 The direct transmission. CH i sends a message signal with power P i . Then, with the PS-SWIPT scheme, the signal to noise ratio (SNR) at the CH j is expressed as where the subscript 1 in y j,1 denotes the first phase. Note that signal processing noise at the receiver is n S j ∼ CN 0, σ 2 ij . And the antenna noise n A j in Fig. 2 is neglected as it is negligible in comparison with n S j [41]. In addition, the harvested energy at receiver CH j is given by where T is the total time for the cooperative transmission, and the first phase occupies a half of T. η denotes the energy harvesting coefficient with 0 < η < 1 . Similarly, The SNR at r can be expressed as The relay transmission. The relay r forwards the decoded message to CH j . We assume that all the energy harvested by the relay r in PHASE 1 is used for its forwarding in PHASE 2. So, the SNR at CH j in PHASE 2 can be expressed as At the end of PHASE 2, CH j uses maximum ratio combining (MRC) to combine the signals from CH i and r. Thus, the equivalent received SNR of CH j is given by It can be seen from (5) that the cooperative relay helps to increase the equivalent SNR of CH j . Therefore, the achievable data rate from CH i to CH j with selected relay r is given by where B is the bandwidth of the baseband. Since the entire transmission process is divided into two phases with equal time, the data rate is divided by 2. Observing (6), we can find that when γ r is equal to γ j , the maximum transmission rate can be achieved. So, we use the equation γ r =γ j as a condition to derive ρ I r of r. We let ρ E r = 1 − ρ I r . Then we obtain We assume that ρ E r = 1 − ρ I r , which means that ρ I r of relay is first determined by the ID unit in Fig. 2 to satisfy maximum energy efficiency, then all the remaining power flow is allocated to the EH unit to support relay transmission. Based on (6) and (7), the achievable data rate from CH i to CH j with relay r is given by where r ij and P r ij = P i are the equivalent channel gain and the transmit power, respectively. As shown in (8), we can regard a CH i -to-CH j subnet as a cooperative transmission link, which may be referred as CoLink ij , in the following.

Problem formulation
In this section, we formulate a routing problem with SWIPT in MCWSNs, and find the maximum energy efficiency path from one source CH to the sink.

Energy efficiency cost
We define the throughput of a subnet as the sum of data bits effectively transmitted to the receiver CH with each possible relay selection policy. The throughput of the CH i -to-CH j subnet (also named CoLink ij ) is given by where R r ij is defined by (8), P = P r ij ≥ 0, r ∈ N ij is the power allocation policy, , r ∈ N ij represents the power splitting policy, and S = s r ∈ {0, 1}, r ∈ N ij is the relay node selection policy. s r = 1 means that node r is selected as the relay, and s r = 0 means the opposite. α r is a non-negative weight. In addition, we define the power consumption as where P C denotes the total static power consumption of the circuits in the subnet; the first term indicates the transmit power consumption of the source cluster head CH i ; ε ≥ 1 is the conversion efficiency constant of the power amplifier in CH i . Note that the energy harvested by destination CH is not subtracted from U TP(ij) (P, ρ, S) as compensation energy. Because, from the perspective of the entire network, the harvested energy is only transferred from node to node to achieve an energy balance. Accordingly, as shown in (11), the energy efficiency of the subnet is defined as the ratio of transmission throughput in bits to per Joule energy consumption.
Then, we define the reciprocal of U eff (ij) in (11) as the energy efficiency cost (EEcost) of the subnet. We can regard EEcost as the price of energy required by sending unit data. And the EEcost of a CoLink is expected to be as small as possible. Furthermore, we define the equivalent energy-efficient cost of a path from the source CH sc to the sink as where l ij are links belong to the path(sc, sink). This definition is reasonable, because the maximum throughput of a path is determined by the link with the smallest throughput, and the energy consumption of a path should be the sum of the consumed energy of links. In addition, we define a binary variable t ij , with value 1 means that the link ij is , l ij ∈ path(sc, sink), i, j ∈ K selected in the path, and with value 0 means the opposite. So, the energy-efficient cost with routing policy is given as

Optimization problem formulation
In order to find the maximum efficiency path from a source CH to the sink, we set the minimum path energy efficiency cost as the optimization object. And the optimization variables are the power allocation policy P , the power splitting policy ρ , the relay node selection policy S , and the routing policy The SWIPT routing problem in our network can be formulated as subject to where C1 is the energy harvesting constraint; E r ij is defined by (2); E Req j is the energy required by the receiver CH j for data forwarding, which is defined as is the transmit energy of CH j for forwarding, and E C,j is the remaining battery energy of CH j . Obviously,if E Fwd j < E C,j , C1 becomes E r ij ≥ 0 so that the energy constraint is relaxed. C3 gives the transmit power constraint. The maximum transmit power P max i is determined by the battery capacity E B i . C4 is the transmission data rate constraint which requires that all subnets in the path solution should be greater than R min . C5 − C8 are routing constraints which indicate that any node in the path can only have one input CoLink and one output CoLink. In particular, the source node has no input, and the sink node has no output. These constraints can prevent loops and ensure flow conservation. C9 and C10 are cooperative relay (13) constraints which indicate that only one node in N ij can be selected. C11 gives the constant lower bound ρ E min and the upper bound ρ E max of the power splitting ratio, which are decided by the capability of the receiver. Similarly, ρ I max and ρ I min in C12 denote the bounds of the ratio for information decoding. Note that ρ E max + ρ I min = 1 and ρ I max + ρ E min = 1 . C13 comes from the power characteristic of the PS unit in Fig. 2, that is, the sum of the splitting power sub-items will never exceed the total received energy.
There are two key challenges in solving OPT-1. The first challenge is that the routing policy can't be exploited directly and the reason is as follows. As shown in constraint C1, the low-limit of the harvest energy of the receiver in a subnet is determined by its E C,j and E Fwd j . However, the constraint C2 shows that E Fwd j depends on the next hop selection policy t jk and the corresponding resource allocation policy in CoLink jk . It means that the change of routing policy T may cause the change of constraint C1 and make the optimization problem more complicated. The second challenge is that the integer constraints C5 and C9 make OPT-1 become a mixed integer programming problem, which is in general non-convex and NP-hard [22]. Therefore, to address OPT-1, we propose a heuristic routing algorithm to solve this problem.

Energy-efficiency cooperative SWIPT routing algorithm
We propose a heuristic energy efficient cooperative SWIPT routing algorithm (EECSR) to solve the problem OPT-1. This algorithm takes EEcost as the routing metric, and combines the resource allocation algorithm to calculate EEcosts of CoLinks and CHs. Based on the metrics, EECSR uses a greedy strategy to update the routing policy of each CH. And EECSR is designed based on the framework of the Dijkstra routing algorithm [42]. The algorithm is shown as Table 1.
In EECSR algorithm, Next i records the next hop cluster head of CH i . E Req i records the harvest energy required by CH i for forwarding. We assume that the nodes in set S have already found the path with the minimum EEcost, while the nodes in set Q haven't found yet. The main body of EECSR is a while loop in lines 5-17. This loop sequentially adds the node with the smallest EEcost in Q to S , and then update the costs of related neighbors, as shown in lines [8][9][10][11][12][13][14][15][16]. When all the nodes in Q are added to S , the loop ends. Concretely, in line 9, the first inequality shows that if CH i chooses CH j as its next hop node, EEcost i may become smaller. In the second inequality, for CH j in CoLink ij , the maximum harvest energy E max ij = T 2 η h ij 2 P max should be greater than the required forwarding energy. If both inequalities are true, EECSR calls a sub-algorithm to solve the resource allocation problem of CoLink ij , as shown in line 10. Specifically, the input of this sub-algorithm includes h ij ,σ ij ,h ir ,σ ir ,h rj ,σ rj , P max ,R min ,η, T , α r , ε and P C . The output of the sub-algorithm includes P r ij ,ρ , ρ E r ,ρ I r and s r . In addition, the sub-algorithm also obtains EEcost * (ij) , U * (ij) , U * TP(ij) of CoLink ij . Note that in the sub-algorithm, we introduce E Req j to characterize the impact of the receiver's routing policy. According to (12), we can calculate the EEcost ′ i , U ′ TP(i) , U ′ (i) in line 11 as In lines 12 and 13, if EEcost ′ i is lower than EEcost i , the values of is updated according to the definition of the E Req j in constraint C1 of OPT-1. Next i is updated to j. Obviously, once EECSR runs over, the routing policy T * for all CHs is determined.
In summary, EECSR decomposes OPT-1 into an outer routing subproblem and multiple inner resource allocation subproblems. Once node j joins S , its routing policy will never change, so the constraints C1 and C2 of OPT-1 for CoLink ij are determined. Since the forward routing policies and energy consumption requirements of the receivers are known, we can solve the resource allocation problems for subnets independently.

Output:
T * : optimal routing policy; P * , ρ * , S * : optimal resource allocation policy for each hop in the selected path.

Resource allocation optimization for subnetworks
In this subsection, the resource allocation subproblem in line 10 of EECSR is modeled first. Then, an iterative sub-algorithm is presented to solve these subproblems. And in each iteration of the sub-algorithm, an equivalent convex optimization problem is resolved via dual decomposition.

Resource allocation optimization for subnetworks
For one subnet, we formulate the inner resource allocation subproblem in EECSR as (15), where the optimization goal is the minimum EEcost, and the optimization variables P, ρ, S determine the power allocation, power splitting and relay selection policies, respectively. The subscripts i and j in OPT-2 are fixed, which represent the sender and receiver CHs of the given CoLink ij , respectively. Therefore, although the constraints C1, C2, C3, C4-C8 in OPT-2 have similar physical meanings as C1, C3, C4, C9-C13 in OPT-1, the dimensions of the variables are all reduced from (i × j × N ij ) to N ij in OPT-2. Then, we can observe from (15) that there are still some obstacles in solving OPT-2. First, the objective function of OPT-2 is not convex with respect to P, ρ and S . Second, optimization variables ρ and P are coupled in the objective function and C1, C3 in OPT-2. Third, the integer constraint for relay selection in C4 makes constraints can't span a convex set. In order to make OPT-2 tractable, we do some equivalent transformation and relaxation operations as described belows. subject to

Algorithm for resource allocation problem
For convenience, we first equivalently transform the optimization objective in (15) to the maximum U eff (ij) . Then we observe that the fractional form of the U eff (ij) still makes OPT-2 a nonlinear programming problem. Similar to [15,36], we use Dinkelbach method [43] to convert the objective function into an equivalent subtractive form as follows where q * is defined as the maximum energy efficiency of a CoLink and is given as F (P, ρ, S) = U (ij) (P, ρ, S) − q * U TP(ij) (P, ρ, S), The equivalence of the above transformation is supported by the following theorem [44].
Brief Proof: According to (9) and (10), U (ij) (P, ρ, S) and U TP(ij) (P, ρ, S) are positive. In addition, F (P, ρ, S) and q * are defined in (16). Then, following the similar method in [43], we can complete the rest of the proof.
Similar to the method in [15,36], we propose an energy-efficient subnet resource allocation algorithm (EESRA) to solve OPT-2, as shown in Table 2.
According to line 3 of EESRA, we solve the problem (17) in each iteration of the while loop.

9: end while
For solving the problem (17), we have the following results.

Brief Proof:
The outline of the method is as follows. First, we adopt similar methods in [43,44] to prove the following three propositions: (1) F (q) is strictly monotonically decreasing with respect to q, where F (·) is the maximum F (P, ρ, S) with a given q in (11); (2) For any value of q, F (q) is always greater than or equal to zero; (3) q increases in each iteration. By combining above three propositions, we can prove that after enough iterations, F (q) will converge to zero and satisfy the optimality condition in Theorem 1.
In the following steps, we transform OPT-3 into a convex optimization problem. First, we use the time-sharing relaxation technique [45] to handle the integer constraint and relax the s r to a real number from 0 to 1 in C4 of OPT-3. In fact, we will see in "Dual decomposition solution" section that although relaxation is done here, the optimal s r will still remain Boolean. Second, we define two auxiliary variables as P r ij = s r P r ij and ρ I(r) j = s r ρ I(r) j , which can be interpreted as the actual transmit power of the sender and the actual splitting ratio of the receiver, respectively. In addition, we replace R r ij in the objective function and C3 in (17)  . Third, we decouple P r ij and ρ E j (r) in the constraint C1 in (17) by dividing ρ E j (r) at both sides of the inequality. Then we can rewrite C1 as Based on the above transformation, we can prove that the transformed OPT-3 problem with constraints C1 ′ − C8 is convex with respect to the P r ij ,ρ I(r) j , ρ E j (r) and s r . Its specific proof is in Appendix A. The convexity reveals that the transformed OPT-3 problem satisfies Slater's constraint [46]. Therefore, as the optimal duality gap is zero, we design a primal-dual algorithm to solve the transformed OPT-3 problem.

Dual decomposition solution
In this subsection, we use Lagrange dual decomposition method [47] to solve the transformed OPT-3 problem. To achieve this goal, we first give the Lagrangian of the transformed OPT-3 by (18), where β = β 1 , β 2 , . . . , β N ij , is a vector of Lagrange multiplier corresponding to the energy harvesting constraint C1 ′ , and N ij is the cardinality of N ij . Similarly, we introduce Lagrange multipliers δ for constraint C2, ζ for constraint C3, and θ for constraint C4, respectively. And the Lagrange multiplier corresponding to the power splitting constraint C8 is the vector µ = µ 1 , µ 2 , . . . , µ N ij . In addition, when deriving the power splitting variables, the boundary constraints C6 and C7 can be captured by the Karush-Kuhn-Tucker (KKT) conditions.

Then, the dual problem for transformed OPT-3 problem is
We solve this dual problem iteratively in two levels. The lower level is the internal maximization which contains N ij subproblems with identical structure. And the upper level is the external minimization. In each iteration, N ij subproblems are solved for fixed Lagrange multipliers in the lower level. And the solutions of the subproblems are used to update the multipliers in the upper level.
Lower Level: Conventional convex optimization methods and KKT conditions are used to solve sub-problems. For a fixed q in EESRA, the solutions of P r ij and ρ I(r) j for CoLink ij are given by min(x, b)) . We can observe from (20) that the allocating power depends on the priority of relay, the constraints of data rate, the transmit power and the harvest energy requirement by using α r , δ , ζ and β r , respectively. Similarly, ρ I(r) * j in (21) depends on α r ζ and µ r . On the other hand, β r and E Req j in (22) (18) L(β, δ, ζ , θ, µ, P, ρ, δ) force the node j to increase the ρ E(r) j to meet the constraint C1 ′ . Moreover, (20) and (21) show that P r * ij and ρ I(r) * j are independent of s r so that we can solve s r separately. Relay r is selected when the following criterion is satisfied where is the marginal benefit brought to the system by selecting relay r in CoLink ij . Equation (23) means that the relay which can provide the maximum marginal benefit to the system will be selected [15]. Moreover, (23) shows that although we relax the constraint in C4, the solution of s r is still a Boolean value. And (24) shows that the relay selection policy is determined by α r , the rate constraint, and the CSI of CoLink ij .
Upper Level: The Lagrange multiplier is updated. We use the gradient descent method to derive the update formula as follows Here, operator [x] + is defined as [x] + = max(0, x) . t is the iteration index and �(t) is the diminishing iteration step size. Note that the multiplier θ in (22)-(24) does not affect any resource allocation variables, so there is no need to update θ . In next iteration, the Lagrange multipliers obtained by (25)-(28) will be used to update P, ρ, S in lower level. The convexity of the transformed OPT-3 guarantees that the two-level iteration in (19) can finally converge to the solution of OPT-3 in (17).
Next, we do a time complexity analysis of the proposed algorithm. As EECSR calls EESRA to solve OPT-2, we analyze EESRA first. The outer layer of EESRA implements the update of q, and its time complexity is O(I max ) . Because the transformed OPT-3 in the inner layer is proved to be convex, the complexity of solving OPT-3 for a given q is O(N ) , where N is the maximum cardinality of relay candidate sets in all Colinks. So, the time complexity of EESRA is O(I max × N ) . In EECSR, the time complexity of the initial phase in is O(|K|), the while loop runs |K| times and selecting node j from Q requires a time complexity of O(|K|) in each loop. In addition, for each input CoLink of node j, EECSR is called. Therefore, the time complexity of EECSR is O(|K | + |K |(|K | + e(I max × N ))) , where e is the maximum CoLink number of all CHs. Apparently, e ≤ |K | − 1 , thus, the time complexity of EECSR is O |K | 2 × I max × N .

Distributed routing protocol
In the previous sections, the execution node of the centralized EECSR algorithm needs to know the CSI of the entire network and the energy status of all nodes. However, this is not easy to achieve in practical networks. Therefore, inspired by the centralized EECSR, we design a table-driven distributed energy efficient cooperative SWIPT routing protocol (TDEECSR). To support TDEECSR, each CH keeps a routing table shown as Table 3. We name the first row of Table 3 as self-entry, which contains the own routing information of the node (assumed as CH i ). The other rows are named neighbor-entries, which stores the routing information of the neighbors.
In an execution cycle, the flow of the TDEECSR protocol is as follows. The sink node first broadcasts its self-entry. Then, the neighbor CHs are triggered to exchange routing information (RI) with the sink. If a node updates the neighbor-entries in its routing table according to the sink's self-entry, this node will run lines 9-15 of EECSR to calculate and determine whether to update its self-entry. In the following, if any item in the selfentry of a node is updated, we call this node is activated. And once a CH is activated, it will compete for the channel and broadcast its self-entry to trigger neighbor nodes to exchange RI and update their routing tables. Similar to the above process, the CHs in the network are activated one by one until no CH's self-entry is updated. Note that any source CH may be activated multiple times by its neighbors and then eventually obtain its path with the smallest EEcost.
Compared to the traditional distributed Bellman-Ford routing protocol (DBF) [48,49], TDEECSR has following advantages. First, DBF protocol requires all CHs to perform RI exchange according to a time schedule, while TDEECSR does not , which eliminates the overhead of time synchronization. Second, if the number of CHs in the network is K, all CHs in DBF need to perform K times of RI exchange with neighbors. In TDEECSR, CHs are passively triggered, and they may not definitely be activated every time they are triggered. So TDEECSR avoids frequent RI exchange and thus reduces the consumption of network bandwidth and energy. Third, CHs in TDEECSR use lines 9-15 of EECSR to calculate and update their own self-entries. And this realizes dynamic sensing of neighbors' energy and real-time updating of routing metrics. In summary, TDEECSR realizes the distributed application of EECSR algorithm, and has a significant improvement in real-time performance compared to DBF. Please refer to the simulation verification in next section.

Simulation setup
We consider a network which has 30 CHs located within a square area of 100 m 2 . In particular, the source CH is located at (0,0) and the sink node is located at (100,100). When laying out other nodes, we let the distance between adjacent CHs be within 20 m, and the distance from cluster members to its CH be within 10 m. Similar to [36], we assume that the small scale fading of links follows Rayleigh fading. So, we let α = 2 and K ab obey the standard Gaussian distribution. In the following experiments, unless otherwise stated, the default settings of the parameters are as follows: In addition, we uniformly set the parameters in subnetworks as follows: N r = N ij = 4 , i, j ∈ K , σ 2 ij , σ 2 ir and σ 2 rj are − 50 dBm , all the α r equal to 1. In the scenario of a single subnetwork, we let E Req j = 0.015 J . Generally, the value of energy efficiency is obtained by averaging 10,000 independent experimental results with different channel states.

Convergence performance
In this section, since the EESRA algorithm is the basis of our EECSR algorithm, we verify the convergence of the EECSR algorithm. In Fig. 3, we can observe that EESRA algorithm can converge within about 5 iterations no matter what σ 2 N and N r take. Note that the value on the horizontal axis in Fig. 3 represents the iterations of the main loop in EESRA, where the iterations of the inner dual decomposition method are not counted. Figure 4 exhibits an example for the convergence process of the dual decomposition method with a given q.

Impacts of parameters on energy efficiency of subnetwork transmission
In this subsection, we show the impacts of some parameters on our energy efficiency solution for subnetwork transmission. We can observe from Fig. 3 that the optimal transmission energy efficiency of a subnet increases when the noise power decreases and the number of candidate relays increases. This result is reasonable. The reduction in noise power increases the SNR and thereby increases the channel capacity. And the increase of candidate relay number can increase the probability of selecting a more energy-efficient cooperative transmission scheme in a subnetwork. The impact of N r on energy efficiency can be further seen in Fig. 5. Figure 5 shows the optimal energy efficiency versus the number of candidate relays under direct transmission (DT) and relay transmission (RT) modes. The RT mode mentioned here corresponds to our transmission model, whose energy efficiency is solved by EESRA. The DT mode only uses the direct transmission link to implement SWIPT. For DT mode, we maximize the energy efficiency without considering cooperative relays in subnets. An important observation in Fig. 5 is that the energy efficiency of RT mode is higher than that of DT mode at the same noise level. This result illustrates that the   cooperative transmission scheme based on SWIPT-powered member relays can effectively improve the transmission energy efficiency between CHs. Figure 6 shows the energy efficiency versus P max in different scenarios. As P max gradually increases from a small value, the energy efficiency in RT and DT modes increase rapidly until P max is larger than 20dBm. This is because that once the maximum efficiency is achieved, the increase in transmit power will result in a decrease in energy efficiency. In order to show this phenomenon, we change the optimization goal of OPT-2 to maximize the

Path energy efficiency comparison for different transmission modes
In this subsection, Fig. 7 depicts the path energy efficiency versus minimum data rate requirement R min under DT and RT modes. Obviously, the path energy efficiency obtained under RT mode is higher than that under DT mode. As R min gradually increases, the path energy efficiency first remains constant within a range, then slightly increases, and finally decreases gradually. The reason is that when R min reaches a certain value, the nodes in the path have to increase their transmit power or ρ I to meet the rate requirement, and thus improve the energy efficiency. But as the transmit power continues to increase, the energy efficiency in each subnetwork eventually decreases. From Fig. 7, we can draw another conclusion that on the premise of no decreasing energy efficiency, the achievable maximum R min of RT mode is greater than that of DT mode.

Performance verification of the SWIPT routing algorithm
In this subsection, we first give an example shown in Figs. 8 and 9 to explain the benefit of EECSR algorithm. In Figs. 8 and 9, a CoLink between two CHs is represented by a dotted line for simplicity. In Fig. 8, all CHs work in the information transmission mode (denoted as CH-IT). To find the transmission path in CH-IT mode, we use an algorithm which is similar to EECSR with the E Req j = 0 , ρ I(r) j = 1 , and ρ E(r) j = 0 . In the case that the energy of CH 8 is insufficient for forwarding, the energy efficiency of the solution path from the source CH 2 to the sink is 0.44 Mbits/J , and the hop number of the path is 13. By contrast, in Fig. 9, where all CHs support the SWIPT transmission mode (denoted as CH-SWIPT), the energy efficiency of the path obtained by EECSR is 0.57Mbits/J, and  10. This illustrates that the algorithm combining SWIPT and routing may find a more energy-efficient transmission path.
Next, we give Fig. 10 to show the average feasible energy efficiency versus the energydeficient node ratio (EDNR), For each fixed EDNR, we perform N experiments, and in each experiment, we randomly set the corresponding number of CHs as energy-deficient according to the EDNR. The average feasible energy efficiency is defined as    PUeff l /N , where M is the number of experiments that can successfully find a feasible path, and PUeff l is the energy efficiency of a feasible path. In Fig. 10, the blue solid line shows the change of AvUeff in CH-IT mode. In CH-SWIPT mode, the value of E Req of each energy-deficient CH obeys the normal distribution CN µ Er , σ 2 Er , where µ Er and σ 2 Er are the mean and variance, respectively. The three dotted lines in Fig. 10 are the corresponding AvUeff curves with different µ Er . Compared with CH-IT mode, the path obtained by EECRT based on CH-SWIPT mode is more energy efficient. In addition, as EDNR increases, the AvUeff in CH-IT mode decreases significantly. This is because M in AvUeff decreases rapidly with node failure. In comparison, the value of AvUeff in CH-SWIPT mode drops much slower, which reveals the robustness of our EECSR algorithm.

Comparison on time delay for routing protocols
In this section, we compare the time delay of routing protocols, which are the protocol based on the EECSR, TDEECSR and the protocol based on the DBF algorithm (abbreviated as DBFR). To ensure comparability, all simulations run on the same computer with the following configuration: intel core i7-8750H-2.20GHz processor, 16BG DDR4-2666MHz memory, and GeForce GTX 1060-6 GB graphics card. Figure 11 depicts the time delay caused by the route discovery process of protocol (in logarithmic scale) versus CH number. Note that TDEECSR-Av and DBFR-Av curves represent the average time spent on each CH by the two corresponding distributed routing protocols, respectively. It can be observed that the time consumption of the three protocols increases with the number of CH in the network. EECSR takes the least time and DBFR takes the most. TDEECSRP takes significantly less time than DBFRP, and takes 5-20% longer time than CEESRP. And the average time of TDEECSRP is significantly less than the concentrated time of EESCRP, which shows that TDEECSRP not only distributes the workload of the node like DBFRP, but also has the approximate total time level of EECSRP. Therefore, TDEECSRP is more suitable for larger real-time sensor networks than the other two protocols.

Conclusions
In this paper, we have considered applying SWIPT and routing selection to multi-hop clustered WSNs, where member relays between two CHs have used the harvested energy to achieve cooperative transmission, while CH relays have utilized SWIPT to supplement battery energy and forward information. We have aimed to find optimal policies, which includes routing policy for CHs, transmission power allocation, harvest energy splitting and member relays selection, and to find the maximum energy efficiency transmission path from any source CH to the sink. To achieve this goal, we have proposed a heuristic energy efficient cooperative SWIPT routing algorithm that consists of an outer energy efficient routing algorithm and multiple inner resource allocation algorithms to obtain the path with maximum energy efficiency and the resource allocation policies for each hop. Based on our routing algorithm, we have designed a table-driven distributed energy efficient cooperative SWIPT routing protocol to support distributed applications. Simulation results have shown that our SWIPT routing algorithm can achieve significant energy efficiency and good robustness. And our distributed routing protocol has been verified to have better real-time performance than traditional ones.