An energy efficient MCDS construction algorithm for wireless sensor networks

In wireless sensor networks, a connected dominating set (CDS) can be used as a virtual backbone for efficient routing. Constructing a minimal CDS (MCDS) benefits packet routing and energy efficiency, but is an NP-hard problem. In this article, an efficient approximation algorithm for MCDS construction, E-MCDS (energy efficient MCDS construction algorithm), is proposed, which explicitly takes energy consumption into account. E-MCDS contains two stages: the CDS construction stage and the pruning stage. The constructed CDS is approximately composed of two independent sets (IS). The performance ratio of E-MCDS is analysed in both unit disk graphs and disk graphs with bidirectional links, giving bounds of 9.33opt and 17.33n_k opt, respectively. The message complexity of E-MCDS is O(n). The simulation results show that E-MCDS performs well in terms of both the size of the constructed CDS and the energy efficiency.


Introduction
The design of wireless sensor networks (WSNs) [1] is a highly complicated task with substantial impact on the quality, cost, and efficiency of real-life sensor applications. The sensor nodes are small electronic devices with limited energy, memory, and transmit power capabilities, which in some sensor network applications are also limited in number because of their high cost [2]. A typical goal in these network designs is to form a long-lived WSN, such that the sensor nodes, using their sensing capabilities and wireless transceivers, effectively cover a region of interest and forward important information to a common collection point, usually referred to as the data sink.
Topology control optimizes the network's topology by removing redundant links and active nodes. It can improve bandwidth utilization and delivery ratio, extend network lifetime, and reduce interference [3] and packet retransmission [4]. There are two main types of topology control [5]: power control and hierarchical topology control. Power control adjusts the transmission power of nodes to construct a graph with better properties, such as minimal interference and minimal packet transmission cost, or to improve the robustness of the network topology. Hierarchical topology control aims to construct a connected dominating set (CDS) to improve the routing performance and save energy in the network. A CDS serves as a virtual backbone for a wireless network: all nodes in the network are dominated by the CDS, and packets are forwarded through the CDS from the source node to the destination node.
In order to improve the routing performance in wireless networks, for example by reducing interference, saving energy, and restraining flooding, the size of the CDS should be minimized. Finding a minimal CDS (MCDS) is an NP-hard problem [6]. Thus, an approximation to the MCDS is usually utilized in practice. This article proposes such an approximation algorithm, called E-MCDS.
In the research on MCDS construction, two types of network topology graphs are utilized: the unit disk graph (UDG) and the disk graph with bidirectional links (DGB). In a UDG, all nodes have the same transmission range, whereas in a DGB nodes have different transmission ranges. The UDG is a special case of the DGB, and the DGB models real environments more accurately. Real wireless networks are often heterogeneous, with nodes that differ in transmission range, among other features.
In this article, the proposed algorithm E-MCDS is distributed and designed for DGB. The key point of the proposed algorithm is that it explicitly takes the residual energy of nodes into consideration when constructing a CDS, which has received less attention in the literature. In E-MCDS, nodes have different transmission ranges, and in simulation the constructed CDS is smaller than those of the compared algorithms. As for energy performance, E-MCDS is more energy efficient than the LEACH algorithm [7] and improves the network's lifetime by at least 20%.
The remainder of the article is organized as follows. Section 2 discusses related study. Section 3 presents the proposed algorithm in terms of its basic concepts, its stages, and the corresponding algorithm for each stage. The numerical analysis of the proposed algorithm is detailed in Section 4, whereas Section 5 presents simulation results and their analysis. The article concludes in Section 6.

Related study
In this section, some approximation algorithms for MCDS construction are introduced. We first classify the algorithms for constructing a CDS into two kinds: maximal independent set (MIS)-based algorithms and non-MIS-based algorithms. Here, MIS denotes the set of independent nodes in the network. An MIS-based algorithm performs two steps: it first generates an MIS and then generates a CDS. A non-MIS-based algorithm does not need to construct an MIS, and the construction of the CDS is mainly based on one simple principle, such as Rule-1 in [8].
In [6], Blum et al. introduce some MCDS approximation algorithms in WSNs and Mobile Ad-hoc Networks. They point out that the two algorithms in [9] have motivated many designs of CDS construction. One of the algorithms in [9] is growing a spanning tree in a single phase, and the other is growing separate components in the first phase and connecting the components together in the second phase.
Alzoubi et al. proposed several distributed MCDS approximation algorithms in [10-12]. In [10], the distance between any pair of complementary subsets in the constructed MIS is exactly two hops. The algorithm has two phases: the first phase constructs an MIS, and the second phase constructs a CDS by selecting the connectors. This algorithm's performance ratio is 8opt + 1, the message complexity is O(n log n), and the time complexity is O(n). The algorithm in [11] also has two phases. In the first phase, the dominator selection is based on the states of the lower-level neighbours. In the second phase, the grey node with the maximum number of dominator neighbours is selected as the connector. The performance ratio is 8opt - 2, the time complexity is O(n), and the message complexity is O(n log n). In [12], the authors proposed a more complex algorithm containing two phases: the first phase constructs an MIS, and the second phase finds some connectors. Its performance ratio is 192opt + 48, its time complexity is O(n), and its message complexity is O(n). The algorithms proposed by Alzoubi et al. are classical examples of the MIS-based kind. Their performance ratios, message complexities, and time complexities are analysed, which motivated later researchers in designing MIS-based algorithms.
In [13-15], Thai et al. proposed some distributed CDS construction algorithms. In [13], a greedy algorithm called S-MIS is proposed. S-MIS has two phases. In the first phase, the MIS construction algorithm described in [11] is used. In the second phase, the black-blue component is introduced. A black node is a dominator of the first phase, and a blue node is a dominator of the second phase. In the second phase, a greedy algorithm is used to find grey nodes adjacent to at least two black nodes in different black-blue components. The performance ratio of S-MIS is (5.8 + ln 4)opt + 1.2. In [14], Thai et al. proposed two algorithms called TFA and TSA. TFA is an extension of S-MIS to DGB with the algorithm itself unchanged, which shows that algorithms implemented for UDG can be easily extended to DGB. The second algorithm, TSA, improves on TFA. TSA has two phases, and in the first phase an MIS is constructed by choosing the nodes with the largest radius as the dominators. The MIS constructed by TSA is smaller than that constructed by TFA. The second phases of TFA and TSA are the same. The performance ratio of TFA is (K + 2 + ln K)opt, and TSA's performance ratio is not given in [14]. In [15], a one-phase CDS construction algorithm is proposed. The algorithm is easier to maintain since there is no need to build a tree or select a leader. The dominator nodes form an MIS, and the performance ratio of the algorithm is 172opt + 43. The time complexity is O(Δ) and the message complexity is O(nΔ²), where Δ is the maximum degree. Thai's algorithms describe the CDS construction process in DGB and provide a method for analysing performance ratios in DGB. Many researchers have since built on the methods proposed by Thai.
Besides the algorithms of Alzoubi and Thai, there are many other MIS-based algorithms, such as [16-23], which also mainly have two stages and differ in their CDS construction criteria. In [16-18], the criterion is the node degree. For example, in [16], the node degree in the first stage is the number of unmarked neighbours, and in the second stage it is the number of unmarked nodes adjacent to the non-fragment nodes. The algorithm in [16] has the performance ratio 2H(Δ) + 1. The time and message complexities are O((n + |C|)Δ) and O(n|C| + m + n log n), respectively. In [19], the criterion for constructing a CDS is a timer, which is based on the transmission ranges of nodes. In [20], the criterion is the weight of a node, which is composed of the node degree and the battery power. In [21,22], the criterion is the node id. The node with the largest id becomes a dominator in [21], while the node with the smallest id becomes a dominator in [22]. In [23], Kim et al. proposed two approximation algorithms for MCDS construction called D-CDS-UBG and C-CDS-UBG. In [23], the unit ball graph (UBG) replaces the UDG, and the 3D MCDS problem is studied. Each of these algorithms also has two stages: the first stage constructs the MIS, and the second stage constructs a CDS. The performance ratio of CDS-UBG is 14.937, and the message complexity is O(n²). The algorithms introduced in this paragraph have good performance ratios, and the size of the CDS is very small. Every CDS constructed by these algorithms consists of an MIS and some connectors. However, the energy of the network is not considered, and most of the studies do not evaluate energy efficiency.
There are also some non-MIS-based algorithms in [8,24-28]. In [8,24], Wu et al. proposed some distributed algorithms in UDG and DGB. In [8], a very simple marking process is proposed, and all the marked nodes form a CDS. The CDS constructed in the first phase contains many redundant nodes, so Wu et al. proposed two pruning rules, Rule-1 and Rule-2, to delete some redundant marked nodes. The performance ratio is not given in [8], but in [11] the performance ratio of this algorithm is proved to be O(n), where n is the number of nodes. The message complexity is Θ(m), where m is the number of edges in the UDG, and the time complexity is O(Δ³), where Δ is the maximum degree. In [24], the authors extended the algorithm in [8]. The marking process is the same as that in [8], but the pruning process is improved and denoted by Rule-k, which reduces the size of the final constructed CDS. In [23], a hierarchical graph is constructed first, and then the essential nodes are determined in the graph according to Rule-1 in [23]. After the essential node determination phase ends, some redundant dominators remain, which are deleted according to Rule-2 in [23]. The performance ratio of the algorithm in [23] is O(log n). In [26], Misra proposed an algorithm for constructing an MCDS by using a Steiner tree; its performance ratio is (4.8 + ln 5)opt + 1.2, its message complexity is O(nΔ²), and its time complexity is O(n). In [27], Sakai et al. proposed time-based CDS construction algorithms named the SI (Single-Initiator) version and the MI (Multi-Initiator) version, where the SI version generates the smallest CDS with a single initiator and the MI version generates the smallest CDS with multiple initiators. Both can handle mobility. In [28], Ding et al. proposed a distributed algorithm named the α-2hop-CDS construction algorithm, where the node pair is used as the criterion to select the dominators.
The parameter α is adjustable and bounds the path length as a multiple of the length of the shortest path. The performance ratio is H(δ(δ-1)/2), where H is the harmonic function and δ is the maximum degree in the graph. The algorithms introduced in this paragraph are based on very simple principles, and their CDS construction process is simple and fast. In general, the distributed non-MIS-based algorithms have higher performance ratios than MIS-based algorithms, but if the pruning stage is designed well, the size of the CDS can be smaller. Most of the non-MIS-based algorithms are fully distributed, and most of the MIS-based algorithms are heuristic distributed algorithms. The algorithms introduced in this paragraph also do not consider energy, and none of these articles evaluates energy efficiency.
The objective of the algorithms described above is to minimize the size of the CDS. In some studies, such as [29-38], minimizing the size of the CDS is not the objective, or not the only one. For example, in [29], Kim et al. proposed two centralized algorithms with constant performance ratios for both the size and the diameter of the constructed CDS. The diameter of the CDS is a new measure considered for the constructed CDS. In [30], Thai et al. proposed a constant approximation algorithm for the strongly CDS (SCDS) problem. In [31], Thai et al. proposed a new algorithm, CDSA, to construct a 2-connected virtual backbone, which can tolerate the failure of one node. In [32], Wu et al. developed the notion of a directional network backbone by using the directional antenna model, and then formulated the problem of constructing a directional CDS (DCDS). Wu et al. also proposed a localized heuristic algorithm, as well as two extensions of it, for constructing a DCDS in [33].
In this article, we propose a distributed heuristic algorithm named E-MCDS, which belongs to the non-MIS type, although the constructed CDS approximately has the IS properties. In other words, the CDS constructed by E-MCDS is approximately composed of two IS, and the size of the CDS is very small. As for energy, we propose dominator node selection rules, which are introduced later. In the simulation, the size of the CDS is compared with that of some classical CDS construction algorithms, and the energy efficiency of E-MCDS is compared with that of LEACH.

Basic concepts
Sensor nodes are randomly distributed in the network field and have different transmission ranges. The link between any pair of nodes is bidirectional, and the neighbours of a node are defined in Definition 1.
Definition 1: Node v is a neighbour of u, and node u is a neighbour of v, if and only if d(u, v) < min{r_u, r_v}. Here d(u, v) denotes the distance between nodes u and v, and r_u and r_v denote the transmission ranges of nodes u and v, respectively.
A neighbour set N_v records the neighbours of node v as defined in Definition 1. In addition, node v has two sets Pa_v and Ch_v, which denote its parent node set and children node set, respectively.
Any node i has a tuple (E_i, RN_i^v), where E_i denotes the residual energy of node i and RN_i^v denotes the reduced neighbour set computed by node i for a neighbour v of i. The reduced neighbour set is illustrated in Figure 1.
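The neighbour relation of Definition 1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `neighbour_sets`, the coordinate dictionary `positions`, and the range dictionary `ranges` are hypothetical and do not come from the article, and Euclidean distance in the plane is assumed for d(u, v).

```python
import math

def neighbour_sets(positions, ranges):
    """Return {node: set of neighbours} under Definition 1.

    u and v are neighbours iff d(u, v) < min(r_u, r_v), so every link is
    bidirectional even when the transmission ranges differ.
    """
    nodes = list(positions)
    neighbours = {v: set() for v in nodes}
    for idx, u in enumerate(nodes):
        for v in nodes[idx + 1:]:
            d = math.dist(positions[u], positions[v])  # Euclidean distance (assumed)
            if d < min(ranges[u], ranges[v]):
                neighbours[u].add(v)
                neighbours[v].add(u)
    return neighbours

# Hypothetical three-node layout: node 3 has a long range, but node 1 cannot
# reach it, so no bidirectional link exists between nodes 1 and 3.
positions = {1: (0.0, 0.0), 2: (3.0, 0.0), 3: (0.0, 6.0)}
ranges = {1: 5.0, 2: 4.0, 3: 10.0}
nbrs = neighbour_sets(positions, ranges)
```

In this layout only nodes 1 and 2 are neighbours, since d(1, 3) = 6 exceeds r_1 = 5 even though it is within r_3.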

Algorithm description
In order to understand E-MCDS better, we first give the overview of the algorithm and then describe each part of E-MCDS in detail.

Overview of E-MCDS
E-MCDS has two main stages: the CDS construction stage and the pruning stage. Before E-MCDS is executed, all nodes are given an initial status and exchange neighbour information among themselves to get ready for the CDS construction. We call this process the initial or bootstrapping stage. Following the bootstrapping stage are the main operations for generating a near-optimal MCDS, which are composed of two steps: the CDS construction stage and the CDS pruning stage.
In the initial stage, every node first exchanges one message with its neighbours and puts its one-hop neighbours into its own neighbour set N_i. Each node then exchanges its neighbour information with its neighbours and computes the reduced neighbour set of every neighbour. For example, node i exchanges neighbour information with all its neighbours to obtain N_i. Once node i has N_i, it exchanges N_i with all its neighbours, computes the reduced neighbour set RN_i^v for each neighbour v, and stores RN_i^v for the following computation.
The CDS construction stage is divided into two sub-stages: the connected set (CS) construction stage and the dominating set (DS) construction stage. In the CS construction stage, an initial node is selected by the base station (BS) as the root node according to its residual energy. The initial node executes the algorithm of this stage to select some of its neighbours as backbone nodes. Each node selected in this way then executes the same algorithm and selects backbone nodes among its own neighbours. This backbone node selection process is repeated throughout the sub-stage, and only the newly selected backbone nodes execute it. When no new backbone node is selected, the sub-stage ends. The resulting set is a connected set. We then need to make sure that the nodes in this set cover all or almost all other nodes in the network, which triggers the next stage, i.e., DS construction. In the DS construction stage, each node not covered by the CS selects, according to some mechanism, one of its neighbours covered by the CS as a backbone node, and the process is repeated until no new backbone node is selected. At the end of the DS construction stage, the constructed CS has been expanded into a CDS.
In order to minimize the size of the CDS, a pruning stage is designed. If all of a backbone node's non-backbone neighbours are covered by at least two backbone nodes (including the backbone node itself), the backbone node is redundant and is deleted from the CDS.
In the following, the two main stages, namely the CDS construction stage and the pruning stage, are introduced in detail. To simplify the algorithm description, it is assumed in the following stages that each node receives the messages broadcast by its neighbours and updates its neighbour information accordingly.

CDS construction
CDS construction follows two steps: first a CS is constructed, and then this CS is extended into a dominating set, i.e., a CDS.
CS construction sub-stage: In this sub-stage, the node with the maximum residual energy is selected as the initial node, which is also called the root node. Once selected, the initial node marks itself black and selects black nodes from among its neighbours. The selection criterion is defined as Procedure-I, which is also used by the other selected black nodes. In order to execute Procedure-I, the initial node or a newly selected black node needs to define a temporary set denoted by N_i^new. Procedure-I is executed only once at each newly selected black node.
Procedure-I is shown as follows:
Procedure-I: The black node selection criterion of the newly selected black node i
Step1: Step4: Step5: N new i = , and selects white nodes {u 1 ,u 2,..., u n }(u k
In Procedure-I, E_v and E_u denote the energy of v and u, respectively. TE_i and TRN_i are temporary sets used for computing. B_i denotes the selected black node set. The selection criterion is illustrated in Figure 2.
In Figure 2, node 1 computes the black nodes. First, node 1 selects the nodes with the maximum residual energy, so nodes 2 and 4 are selected and TE_1 = {2,4}. Then node 1 selects the node with the maxi-  Figure 2d. Finally, the selected neighbours of node 1 cover all of node 1's neighbours.
The CS construction algorithm (CCA) is implemented according to Procedure-I. We first define the node set of the network as N. Some special cases may also exist, which are shown in Figure 3.
In Figure 3a, node 1 is the initial node selected by the BS. Node 1 computes the black nodes according to Procedure-I, and nodes 2 and 3 are selected as black nodes. Nodes 2 and 3 then compute their black children according to Procedure-I, and nodes 4 and 5 are selected. Node 6 then receives two black messages, from nodes 4 and 5, and therefore turns grey. As a result, nodes 7, 8, and 9 are isolated and not covered by the CS. According to the CCA, the isolated nodes in Figure 3b are nodes 4, 8, and 9.
To solve the isolated node problem, another construction algorithm is needed, which constructs a DS based on the CS constructed in the first sub-stage. The DS construction algorithm is executed in the second sub-stage, called the DS construction sub-stage.
DS construction sub-stage: When the timer of the first sub-stage expires, the DS construction sub-stage begins. In this sub-stage, if a white node i has grey neighbours, it selects a grey node as the black node according to Procedure-II, which is shown as follows:
Procedure-II: The black node selection criterion of the isolated white node i
Step1: Selects all grey neighbour {v 1 Step3: Selects nodes Step4:
Procedure-II is similar to Procedure-I: the isolated white node first selects the grey neighbours with the maximum residual energy, and then selects among them the grey neighbour with the minimum reduced neighbour set size. Thus, in general, a white node selects the node with more energy and closest to itself. An example is shown in Figure 4.
In Figure 4a, nodes 4, 8, and 9 are isolated, and nodes 4 and 9 have grey neighbours. According to Procedure-II, node 4 selects nodes 2 and 10 as the nodes with the maximum residual energy, so TE_4 = {2,10}, and node 9 selects node 7, so TE_9 = {7}. Node 4 then keeps nodes 2 and 10 as the nodes with the minimum reduced neighbour set size, because |RN_4^2| = 2 and |RN_4^10| = 2, while node 9 keeps node 7. According to Step4, node 4 selects node 10 as its black node, and node 9 selects node 7 as its black node, as shown in Figure 4b. When node 8 receives the grey message from node 4, it executes Procedure-II and selects its only grey neighbour, node 4, as a black node, as shown in Figure 4c.
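The two filtering steps described for Procedure-II (maximum residual energy, then minimum reduced neighbour set size) can be sketched in Python as follows. This is a minimal sketch, not the article's procedure: the function name `select_black_candidate` and the energy values are hypothetical, and the final tie-break of Step4 is not recoverable from the text, so choosing the largest id is an assumption that merely agrees with the Figure 4 example.

```python
def select_black_candidate(grey_neighbours, energy, rn_size):
    """Pick the grey neighbour that an isolated white node asks to turn black.

    grey_neighbours: iterable of node ids
    energy:          {node id: residual energy E_v}
    rn_size:         {node id: |RN_i^v| as computed by the white node i}
    """
    grey_neighbours = list(grey_neighbours)
    if not grey_neighbours:
        return None  # no grey neighbour yet, the white node keeps waiting

    # Keep the grey neighbours with maximum residual energy (the set TE_i).
    max_energy = max(energy[v] for v in grey_neighbours)
    te = [v for v in grey_neighbours if energy[v] == max_energy]

    # Among those, keep the ones with the smallest reduced neighbour set.
    min_rn = min(rn_size[v] for v in te)
    trn = [v for v in te if rn_size[v] == min_rn]

    # ASSUMPTION: ties are broken by the largest id (not stated in the text).
    return max(trn)

# Values loosely following the Figure 4 example; the energies are made up.
choice_node4 = select_black_candidate([2, 10], {2: 1.0, 10: 1.0}, {2: 2, 10: 2})
choice_node9 = select_black_candidate([7], {7: 0.8}, {7: 3})
```

With these inputs node 4 picks node 10 and node 9 picks node 7, matching the outcome described for Figure 4b.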
The DS construction algorithm is shown as follows:
DS construction algorithm (DCA)
1. Computes its grey neighbour set G i = {v 1
In the example in Figure 4, nodes 4 and 9 select nodes 10 and 7 as their black nodes, respectively. Nodes 4 and 9 then send messages to nodes 10 and 7, respectively, so nodes 10 and 7 mark themselves black and broadcast a black message, as shown in Figure 4b. At the same time, nodes 4 and 9 broadcast a grey message, so node 8 knows that node 4 has turned grey. Node 8 then selects node 4 as its black node according to Procedure-II, and finally node 8 marks itself grey and node 4 turns black, as shown in Figure 4c. The first stage thus contains two sub-stages: the first sub-stage constructs a CS and the second sub-stage constructs a DS, so that a CDS is finally constructed. In the following section, a pruning stage is introduced, which deletes redundant black nodes to minimize the size of the CDS.
The CDS constructed in the first stage is connected and dominates all the nodes in the network. To construct an approximately minimum CDS, however, the CDS should be as small as possible. Thus, deleting the redundant black nodes in the CDS is necessary.
In the CDS constructed in the first stage, redundant black nodes always exist, and Figure 5 gives some examples.
In Figure 5a, nodes 4 and 6 are isolated. After the DS construction sub-stage, Figure 5b is obtained, in which nodes 2 and 5 have turned black and nodes 4 and 6 have turned grey. In Figure 5b, every grey neighbour of node 5 is also covered by node 2 and can therefore send packets to node 2, so black node 5 is redundant and turns grey, as shown in Figure 5c. The algorithm of the pruning stage is described in detail as follows.

Pruning stage
We first define the number of nodes in the network as n, and id_i denotes the identifier of node i. G_i denotes the grey node set of node i.
The algorithm of the pruning stage is shown as follows:
Pruning redundant black nodes algorithm (PRA)
For any black node i
1. If Ch_i = Φ
2.   Sets a timer T_i = ptimer · (id_i/n)
3.   If T_i is up
4.     Requests N_v (∀v ∈ G_i)
5.     If ∀v ∈ G_i, ∃ black node u, u ∈ N_v, i ≠ u
6.       Marks itself grey, broadcasts a grey message
7.     End
8.   End
End
Stopping criterion 1: if the dtimer is up, the stage is completed.
In Figure 5b, nodes 2 and 5 are leaf nodes and set T_2 and T_5, respectively. Clearly, T_5 > T_2, so node 2 first requests N_6 and N_4; by computation, the condition ∀v ∈ G_i, ∃ black node u, v ∈ N_u is not met, and node 2 stays black. When T_5 is up, node 5 requests N_4; by computation, node 4 is covered by node 2, so node 5 turns grey and broadcasts a grey message. Thus, Figure 5c is the final topology.
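The pruning rule can be sketched centrally in Python to make the timer ordering and the redundancy test concrete. This is a simplified sketch under stated assumptions, not the distributed PRA itself: message exchange is replaced by direct lookups, timers are simulated by sorting, and the function names and the six-node topology are hypothetical (only loosely modelled on the description of Figure 5).

```python
def prune_timer(ptimer, node_id, n):
    """Timer of a leaf black node: T_i = ptimer * (id_i / n)."""
    return ptimer * (node_id / n)

def is_redundant(i, children, neighbours, black):
    """Leaf black node i is redundant if every grey neighbour of i
    has some other black neighbour u != i."""
    if children.get(i):
        return False  # only nodes with an empty children set are candidates
    grey = [v for v in neighbours[i] if v not in black]
    return all(
        any(u in black and u != i for u in neighbours[v])
        for v in grey
    )

def prune(neighbours, children, black, ptimer=1.0):
    """Process black nodes in order of timer expiry and return the pruned set."""
    n = len(neighbours)
    black = set(black)
    for i in sorted(black, key=lambda node: prune_timer(ptimer, node, n)):
        if is_redundant(i, children, neighbours, black):
            black.remove(i)  # node i marks itself grey
    return black

# Hypothetical topology: node 1 is the root, nodes 2 and 5 are leaf black nodes.
neighbours = {1: {2, 3, 5}, 2: {1, 4, 6}, 3: {1}, 4: {2, 5}, 5: {1, 4}, 6: {2}}
children = {1: {2, 5}, 2: set(), 5: set()}
remaining = prune(neighbours, children, black={1, 2, 5})
```

Node 2's timer expires first, and it stays black because node 6 has no other black neighbour; node 5 then finds that its only grey neighbour, node 4, is also covered by node 2 and turns grey.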

Numerical analysis of the proposed algorithm
In this section, the properties of E-MCDS are analysed under the assumption that all nodes have the same residual energy. First, the coverage of the CS constructed by the CCA is analysed. Then we introduce the k-Link network, in which the CS can approximately dominate all the nodes in the network. We then give some properties of the CS and analyse the performance ratio of the CS in the k-Link network. Finally, the message complexity is analysed.

Dominance of CS
In order to introduce the dominance of the CS, we first propose a notion called the k-Link network, where k stands for the network's connectivity.

k-Link network
In this network, any node i has a neighbour set N_i, and the k-Link network is defined as follows:
Definition 2: If every node i in the network satisfies the condition |path_i^v| ≥ k (∀v ∈ N_i, k ≥ 1), then the network is a k-Link network (k-LN).
In Definition 2, path_i^v is a collection that records the one-hop and two-hop paths between nodes i and v. In Figure 6a, the minimum |path_1^v| (2 ≤ v ≤ 6) is 2, so the network in this figure is a 2-Link network. Meanwhile, Figure 6b is a 1-Link network. In general, the larger k is, the stronger the connectivity of the network.
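Definition 2 can be checked on a small graph with the following Python sketch. It rests on one reading of the definition that should be treated as an assumption: for adjacent nodes i and v, the path collection is taken to contain the direct link plus one two-hop path per common neighbour. The function name `link_order` and the two example graphs are hypothetical.

```python
def link_order(neighbours):
    """Return the largest k for which the graph is a k-Link network.

    For each adjacent pair (i, v), the number of one-hop and two-hop paths
    is counted as 1 (direct link) + number of common neighbours (assumed
    reading of Definition 2). Returns None if the graph has no links.
    """
    counts = [
        1 + len(neighbours[i] & neighbours[v])
        for i in neighbours
        for v in neighbours[i]
    ]
    return min(counts) if counts else None

# A triangle has one extra two-hop path per pair; a chain has none.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
chain = {1: {2}, 2: {1, 3}, 3: {2}}
k_triangle = link_order(triangle)
k_chain = link_order(chain)
```

Under this reading the triangle is a 2-Link network and the chain is a 1-Link network, in line with the remark that a larger k indicates stronger connectivity.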

Relationship between k-link network and dominance of CS
The dominance of the CS is verified by the simulation in Section 5. The simulation shows that the CS can dominate almost all the nodes in the network. As k increases, the number of nodes not covered by the CS becomes smaller, which illustrates that the stronger the network's connectivity, the stronger the dominance of the CS. Thus, the CS is approximately a CDS.

Properties of E-MCDS
In this section, some key properties of the proposed E-MCDS are analysed. The properties and the way the analysis is conducted largely follow those proposed in [39]. However, the analysis is carried out in a new algorithmic context in which energy efficiency is considered.
Figure 6 An example of k-link network.
Lemma 1: The CS constructed by the CCA is a connected set.
Proof: Given any pair of black nodes (u, v) (u ∈ CS, v ∈ CS, u ≠ v), both u and v have parent nodes, and according to the CCA each has a path connecting it to the root node. If u and v were disconnected, their root nodes would be different, and the network would have at least two root nodes, which contradicts the CCA, where there is only one root node. Thus, u and v are connected, and the CS is a connected set.
Lemma 2: The DS constructed by the DCA is a dominating set.
Proof: Assume the DS is not a dominating set. Then at least one white node i exists after the timer is up, and i has no grey neighbour; otherwise it would execute Procedure-II and select a grey neighbour as a black node. If i also has no white neighbour, then i is isolated. Otherwise, there is a white node v with a grey neighbour, m hops away from node i; v will select a grey neighbour as a black node and change its state to grey, and after several rounds node i will turn grey, which contradicts the assumption. Thus the DS constructed by the DCA is a dominating set.
Lemma 3: Any black node belonging to the CS has only one parent black node.
Proof: Consider any black node i that belongs to the CS. If i had more than one parent black node, then before turning black it would have received two black messages. According to the CCA, i should then turn grey and broadcast a grey message, which contradicts the condition that i is a black node. Thus, any black node belonging to the CS has only one parent black node.
Lemma 4: In the CS, any pair of nodes that are not in a parent-child relationship are not adjacent.
Proof: Consider any pair of black nodes (u, v) (u ∈ CS, v ∈ CS, u ≠ v) that are not in a parent-child relationship. If u and v are adjacent, suppose u turns black before v. Then v receives a black message from u and also a black message from its parent, so v will turn grey and broadcast a grey message, which contradicts the condition that u and v are black nodes. Thus Lemma 4 is proven.
According to the CCA, the constructed CS is a tree, and the initial node selected by the BS is the root node. Assign every node in this tree a level: the level of the root node is 0, and the level of the root node's children is 1. As the CCA is executed, all the black nodes in the CS obtain a level.
Lemma 5: The CS is composed of two independent sets.
Proof: In the CS, each node has a level, and we use ES and OS to store the nodes with even level and odd level, respectively.
Consider any pair of nodes (u, v) (u ∈ ES, v ∈ ES, u ≠ v). If u and v are adjacent, suppose u turns black before v. Let v_parent denote the parent node of v; then the level of v_parent is odd. When all the nodes execute the CCA, v receives two black messages, from u and from v_parent, so v will turn grey, which contradicts the condition that u and v are black nodes. Thus no pair of nodes belonging to ES is adjacent. Similarly, no pair of nodes belonging to OS is adjacent. Hence ES and OS are independent sets, and Lemma 5 is proven. This lemma will be used in the performance ratio analysis below.
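The level-based split used in Lemma 5 can be illustrated with a small Python sketch. The helper names `split_by_level` and `is_independent`, the parent map, and the five-node tree are hypothetical; the example assumes, as in Lemma 4, that black nodes are adjacent only along parent-child links.

```python
def split_by_level(parent, root):
    """Return (ES, OS): the tree nodes at even and at odd levels."""
    level = {root: 0}

    def depth(v):
        # Level of v is one more than the level of its parent.
        if v not in level:
            level[v] = depth(parent[v]) + 1
        return level[v]

    for v in parent:
        depth(v)
    es = {v for v, lv in level.items() if lv % 2 == 0}
    os_ = {v for v, lv in level.items() if lv % 2 == 1}
    return es, os_

def is_independent(nodes, neighbours):
    """True if no two distinct nodes in the set are adjacent."""
    return all(v not in neighbours[u] for u in nodes for v in nodes if u != v)

# Hypothetical CS tree: root 1 with children 2 and 3; 4 under 2; 5 under 3.
parent = {2: 1, 3: 1, 4: 2, 5: 3}
neighbours = {1: {2, 3}, 2: {1, 4}, 3: {1, 5}, 4: {2}, 5: {3}}
es, os_ = split_by_level(parent, root=1)
```

Here ES = {1, 4, 5} and OS = {2, 3}, and both pass the independence check, which is the property the performance ratio analysis relies on.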

Performance ratio
In this section, the performance ratio is introduced, which is the ratio of the size of the constructed CDS to the minimum size of a CDS in the network. In general, the performance ratio is multiplied by the size of the MCDS and stated as an upper bound on the size of the constructed CDS.
Because this algorithm can be used in both UDG and DGB, E-MCDS has two bounds. In the next section, we first introduce the CDS bound in UDG and then the CDS bound in DGB. The performance ratio is an approximation ratio, because the analysis is based on the CS, which is composed of two IS.

Upper bound of E-MCDS in UDG
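According to [14], we have the following equation:
\[ K = \begin{cases} 5, & \text{if } k = 1 \\ 10 \left\lceil \dfrac{\ln k}{\ln(2\cos(\pi/5))} \right\rceil, & \text{otherwise} \end{cases} \tag{1} \]
where k = r_max/r_min, and K is an upper bound on the number of a node's independent neighbours. In UDG, the upper bound is 5, and r_max = r_min.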
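The bound K can be evaluated numerically with the short Python sketch below. It assumes the form K = 5 for k = 1 and K = 10⌈ln k / ln(2cos(π/5))⌉ otherwise, which is how equation (1) is read here; the function names `n_k` and `independent_neighbour_bound` are hypothetical.

```python
import math

def n_k(k):
    """ceil(ln k / ln(2 cos(pi/5))), the range-ratio term (assumed form)."""
    return math.ceil(math.log(k) / math.log(2 * math.cos(math.pi / 5)))

def independent_neighbour_bound(r_max, r_min):
    """Upper bound K on a node's independent neighbours, with k = r_max / r_min."""
    k = r_max / r_min
    if k == 1:
        return 5  # UDG case: all transmission ranges are equal
    return 10 * n_k(k)

# Example range ratios (made-up values).
k_udg = independent_neighbour_bound(1.0, 1.0)
k_mild = independent_neighbour_bound(1.5, 1.0)
k_wide = independent_neighbour_bound(2.0, 1.0)
```

Since 2cos(π/5) is about 1.618, the bound grows in steps of 10 each time k crosses a power of that value.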
According to [22], the black nodes in a CDS have two kinds of coverage types. A black node of the first type is not covered by another black node when it selects its child black nodes. A black node of the second type is covered by other black nodes when it selects its child black nodes. In E-MCDS, only the initial node belongs to the first type; all other nodes belong to the second type.
In UDG, a node of the first type dominates at most five independent neighbours. A node of the second type in general dominates at most three independent neighbours, as shown in Figure 7.
In Figure 7, x and u cover each other. When node x covers the smallest possible area of node u, the angle ∠yxz = 2π/3, and node u has at most a 4π/3 sector in which to select its independent black neighbours. The maximum number of independent neighbours is then 3.33 in general; in this article we define the maximum number of black neighbours of a second-type node as 3.
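One reading that reproduces the value 3.33 is a proportional scaling of the first-type bound: five independent neighbours fit around the full angle 2π, and only a 4π/3 sector remains available. This is an assumption about how the figure was obtained, not a derivation stated in the text:

$$5 \cdot \frac{4\pi/3}{2\pi} = 5 \cdot \frac{2}{3} = \frac{10}{3} \approx 3.33$$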
According to [22], the following formula gives the upper bound of the CDS in both UDG and DGB:
In (2), OPT is an optimal (minimum) solution of the CDS problem in UDG and DGB, opt is the size of OPT, that is, the minimum size of a CDS in UDG and DGB, n_k = ln k / ln(2 cos(π/5)), and |S| is the size of the MIS.
In this article, for a UDG network, (2) can be rewritten as follows:
According to [22], t/(opt − t) ≤ 5/1, from which we obtain the following formulas:
Because the CDS constructed in the first stage approximately dominates all the nodes in the network, and the CDS is composed of two independent sets ES and OS, the size of the CDS is bounded as follows:

Upper bound of E-MCDS in DGB
In DGB, a black node of the first type dominates at most 9n_k independent neighbours. A black node of the second type dominates at most ⌈(4/3)π/a⌉n_k independent neighbours, where a = π/5; thus the maximum number of dominated independent neighbours is 7n_k.
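The coefficient 7 follows directly from substituting a = π/5 into the ceiling expression:

$$\left\lceil \frac{(4/3)\pi}{a} \right\rceil = \left\lceil \frac{(4/3)\pi}{\pi/5} \right\rceil = \left\lceil \frac{20}{3} \right\rceil = 7$$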
Then we can get the upper bound of the MIS:
According to (4), the following formula is derived:
The size of the CDS constructed in the first stage has the upper bound:

The unified upper bound of E-MCDS in UDG and DGB
To bound the size of the CDS constructed by E-MCDS in both UDG and DGB, we find the critical condition as follows:
If k > 1.45, then |CDS| is bounded by 17⅓ n_k opt; otherwise |CDS| is bounded by 9⅓ opt.
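The unified bound can be evaluated numerically for a given range ratio k. The sketch below is illustrative only: it reads the two bounds as (17 + 1/3)·n_k·opt and (9 + 1/3)·opt, matching the values 17.33 and 9.33 quoted elsewhere in the article, takes n_k = ln k / ln(2cos(π/5)) as a plain ratio, and uses the threshold 1.45 exactly as stated. The function names are hypothetical.

```python
import math

# ln(2cos(pi/5)), the logarithm of the golden ratio, about 0.4812
LOG_BASE = math.log(2 * math.cos(math.pi / 5))

def n_k(k):
    """n_k = ln k / ln(2cos(pi/5)) for the range ratio k = r_max / r_min."""
    return math.log(k) / LOG_BASE

def unified_bound(k, opt):
    """Upper bound on |CDS| using the threshold k > 1.45 stated in the text."""
    if k > 1.45:
        return (17 + 1 / 3) * n_k(k) * opt
    return (9 + 1 / 3) * opt

# Example: a UDG (k = 1) and a DGB with r_max = 2 * r_min, both with opt = 3.
print(unified_bound(1.0, 3), unified_bound(2.0, 3))
```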

Message complexity
E-MCDS contains two stages, of which the first is the main stage. The CS constructed in the first stage dominates almost all of the nodes in the network.
In the first stage, the messages broadcast during CS construction account for almost all messages broadcast in that stage. According to the CCA, every white node broadcasts only one message, which is either a black message or a grey message. The message complexity of this stage is therefore O(n), where n is the number of nodes in the network.
In the second stage (the pruning stage), the redundant black nodes are deleted. The simulations show that the number of black nodes pruned in the second stage is small, so the messages broadcast in this stage can be ignored.
Thus the message complexity of E-MCDS is O(n).

Evaluation design
In this section, the performance of E-MCDS is evaluated. To compare the size of the CDS, we include two other algorithms: the second algorithm of Xiang (denoted by XSA) in [22] and the second algorithm of Thai (denoted by TSA) in [14]. We adopt the LEACH protocol as the baseline for the energy efficiency of E-MCDS. The domination of the CS constructed by the CCA is also evaluated, to illustrate that the CS dominates almost all the nodes in the network, namely that a good approximation can be achieved. In summary, the evaluation is composed of three parts: the CS domination evaluation, the CDS size evaluation and the energy efficiency evaluation.
Figure 7 The node belongs to the second type (node labels: x, u, y, z).
Before the simulation, some symbols are defined. M denotes the side length of the square network area, and N denotes the number of nodes in the network. Tr_min and Tr_max denote the minimum and maximum transmission ranges of nodes in the network. The transmission range of node i is computed as TR = Tr_min + Random(0,1)*(Tr_max − Tr_min), where Random(0,1) is a random number between 0 and 1.
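The simulation setup defined above can be sketched as follows. This is an illustrative sketch, not the simulator used in the article: the function names are hypothetical, nodes are assumed to be placed uniformly at random in the M × M square, and a bidirectional link is assumed to exist when the distance between two nodes does not exceed the smaller of their two transmission ranges.

```python
import math
import random

def deploy(n, side, tr_min, tr_max, rng):
    """Place n nodes uniformly in a side x side square with random ranges."""
    nodes = []
    for _ in range(n):
        x, y = rng.uniform(0, side), rng.uniform(0, side)
        # TR = Tr_min + Random(0,1) * (Tr_max - Tr_min)
        tr = tr_min + rng.random() * (tr_max - tr_min)
        nodes.append((x, y, tr))
    return nodes

def bidirectional_links(nodes):
    """Adjacency map: u and v are linked if each lies within the other's range."""
    adjacency = {i: set() for i in range(len(nodes))}
    for i, (xi, yi, ri) in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            xj, yj, rj = nodes[j]
            if math.hypot(xi - xj, yi - yj) <= min(ri, rj):
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

# Example: N = 50 nodes, M = 100, transmission range [20, 50], fixed seed.
nodes = deploy(50, 100.0, 20.0, 50.0, random.Random(1))
adjacency = bidirectional_links(nodes)
```

Passing an explicit random generator keeps each simulated topology reproducible across runs.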

Domination of CS
In this section, we set the transmission ranges to [20,50] and [20,60]. According to the k-Linked Network definition, k is set to 1, 2 and 3. The number of nodes in the network varies from 50 to 95 in steps of 5. All nodes are randomly distributed in the network.
To illustrate the domination of the CS, we evaluate the number of uncovered nodes after the first substage, the CCA. We set the network side length M to 100 m. Networks with the transmission ranges [20,50] and [20,60] are evaluated, and the following figures are obtained.
From Figure 8, we find that the number of nodes left uncovered by the CS is very small for both transmission ranges [20,50] and [20,60]; Figure 8a, b illustrates that the CS dominates almost all the nodes in the network. Figure 8 also shows that the number of uncovered nodes decreases as k increases: the stronger the connectivity of the network, the stronger the domination of the CS.
Because the CS dominates almost all the nodes in the network, the CS is approximately a CDS of the network. The performance ratio of the CDS is therefore analysed based on the CS, and it is an approximate performance ratio.
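The quantity measured in this evaluation, the number of nodes left uncovered by the CS, can be computed as in the following minimal sketch. The function name and the toy graph are hypothetical; a node counts as covered if it belongs to the CS or is adjacent to a CS node.

```python
def uncovered_nodes(adjacency, cs):
    """Return the nodes that are neither in `cs` nor adjacent to a node in `cs`."""
    covered = set(cs)
    for u in cs:
        covered |= adjacency[u]
    return set(adjacency) - covered

# Hypothetical toy input: node 3 is isolated, so no CS node can dominate it.
adjacency = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
print(uncovered_nodes(adjacency, {1}))
```

An empty result means the CS is a dominating set of the graph.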

CDS size
In this section, M = 100, and the number of nodes varies from 10 to 100 in steps of 10. To fully evaluate the CDS size, we define four scenarios with transmission ranges [20,40], [30,50], [40,60] and [50,70], respectively. The nodes are randomly distributed in the network, and every point in the figures is simulated 50 times. The CDS size of E-MCDS is shown in Figure 9.
From Figure 9, we find that the CDS size of E-MCDS is the smallest among all the compared algorithms for every number of nodes, and that the CDS size increases as the number of nodes grows. Comparing panels (a), (b), (c) and (d), we find that the CDS size becomes smaller as the transmission range increases: a larger transmission range covers a larger area while the network area is fixed, so fewer CDS nodes are needed. From Figure 9, we conclude that E-MCDS constructs a smaller CDS than the compared algorithms.

Energy efficiency
In this section, we compare the routing protocol LEACH [7] with E-MCDS, and compare E-MCDS with itself under different parameters. As is well known, LEACH is a cluster-based protocol in which cluster heads collect and fuse the received packets and then send the fused packets to the base station (BS). In this simulation, every node sends one packet to its cluster head, and every cluster head receives packets from all the nodes in its cluster in each round. Similarly, E-MCDS constructs one CDS through CCA, DCA and PRA; after the CDS is constructed, every node sends one packet to its parent node, and every parent node fuses the received packets and sends the fused packet to its own parent in each round. In the following simulation, we use the notion TD, which denotes that in each round every node sends TD packets to its parent node or its cluster head. That is, if TD = 1, a round is a single original round, and if TD = 2, two rounds are combined into a single round in which each node sends two packets. In the simulation, we let TD = 1, 2, 3, 4, and compare the network lifetime in these scenarios with LEACH.
In this article, the network lifetime is defined as the round in which the first node dies (the residual energy of the node is zero). The initial energy of each node is set to 0.5 J; the other parameters of the amplifier are the same as in [7]. The network area is 100 m * 100 m, and the number of nodes varies from 10 to 100 in steps of 10. All the nodes are randomly distributed in the network. To fully compare with LEACH, we select two transmission ranges, which are [60,8]
From Figure 10, we find that the network lifetime of E-MCDS is longer than that of LEACH for the different values of TD, by at least 20%.
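The lifetime definition above (the round in which the first node dies) can be illustrated with a simplified sketch. It assumes that each node spends a constant, positive amount of energy per round, which is a simplification and not the radio model of [7]; the function name and the cost values are hypothetical.

```python
import math

def first_node_death_round(initial_energy, cost_per_round):
    """Round in which the first node's residual energy reaches zero.

    `cost_per_round` holds the (assumed constant) energy each node spends
    in one round, in the same unit as `initial_energy`.
    """
    if any(cost <= 0 for cost in cost_per_round):
        raise ValueError("every per-round cost must be positive")
    return min(math.ceil(initial_energy / cost) for cost in cost_per_round)

# Hypothetical costs in joules per round for three nodes, 0.5 J initial energy.
print(first_node_death_round(0.5, [0.125, 0.25, 0.0625]))
```

The node with the highest per-round cost determines the lifetime, which is why a poor energy balance among nodes shortens it.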
As TD increases, the network lifetime decreases, because the cluster heads or the nodes in the CDS receive and forward too many packets in a round, which worsens the energy balance among nodes. So, as TD increases, the energy balance becomes worse and the network lifetime becomes shorter.
To compare E-MCDS with itself, we set TD to 1, 2, 3 and 4. The transmission ranges are [20,40], [30,50], [40,60] and [142,142]. The number of nodes and the network area are the same as in the comparison with LEACH. The simulation gives the following results.
From Figure 11, we find that networks based on E-MCDS with different TDs have different lifetimes. In general, the smaller the TD, the longer the lifetime, and the network lifetime increases as the transmission range increases.
According to the definition of TD, we use TDL(i) to denote TD*lifetime, which means TDL (1)
According to the definition of TDL(i), TDL(i) also denotes the number of packets sent by nodes. Thus, although TD increases, the number of packets sent by nodes is almost unchanged, which indicates that the energy of the network is mainly consumed by the data packets and that the message complexity of E-MCDS is low.

Conclusions
In this article, a distributed approximation algorithm for MCDS construction, E-MCDS, is proposed. E-MCDS has two stages: the CDS construction stage and the pruning stage. The constructed CDS is approximately composed of two IS. The performance ratio is approximately determined by the parameter k: if k > 1.45, the performance ratio is 17.33n_k opt; otherwise it is 9.33opt. The simulations show that the CDS constructed by E-MCDS is smaller than those of the compared algorithms, and that the energy efficiency of E-MCDS is better than that of LEACH.