Smart node relocation (SNR) and connectivity restoration mechanism for wireless sensor networks

Node failures are inevitable in wireless sensor networks (WSNs) because sensor nodes are miniature devices equipped with small and often irreplaceable batteries. Due to battery drainage, sensor nodes can fail at any instant. Moreover, WSNs operate in hostile environments, and environmental factors may also contribute to node failure. Node failures disrupt inter-node connectivity and may also lead to network partitioning. When nodes cannot communicate with each other and with the base station, the basic operation of the sensor network is compromised. Restoring connectivity therefore requires a robust recovery mechanism. The existing connectivity restoration mechanisms suffer from shortcomings because they do not focus on energy-efficient operation and coverage-aware mechanisms while performing connectivity restoration. As a result, most of these mechanisms

networks should have self-healing, self-organization, and fault tolerance capabilities for a successful operation [2].
Other than the environmental challenges such as natural disasters, one of the major research challenges for sensor networks originated from the design and characteristics of the sensor node. Sensor nodes are small and have limited resources in processing power and memory. Sensor nodes use a battery as a central power source [3]. One or more sensors can be attached to a sensor node for measuring different phenomena from the environment [2]. Sensor nodes detect desired phenomena in the environment and can transmit this information to other nodes via radio transceivers.
Sensor nodes are often deployed in harsh environments where human access is restricted; therefore, the replacement of node batteries is not possible. After deployment, the nodes communicate with each other to establish a network. This network coordinates the sensor nodes and transmits the detected information to the end user [3]. As a sensor network covers a large area, not all sensor nodes are within the transmission range of the end user. Nodes must rely on multi-hop communication to send information toward the end user [4][5][6]. The reliability of the collected information can be increased by increasing the node density in an area. Over the operating period of the network, the batteries of the sensor nodes are exhausted by message transmission and information processing.
When the battery of a node is fully depleted, the sensor node is considered a dead node. As nodes begin to die during the operation of a sensor network, connectivity holes begin to appear within the network. A connectivity hole is an area where the nodes are no longer connected. Connectivity holes lead to loss of inter-node connectivity, which makes it impossible to deliver sensed information to the end user. The first step in connectivity restoration is to detect the failure. After a failed node has been identified, the next step is to notify the nodes adjacent to it so that they can reposition themselves and reconnect the disconnected nodes [7].
Multiple failed nodes may disrupt network connectivity, and the network may split into multiple isolated segments or partitions. This interrupts the flow of information between the sensor nodes and the end-user terminal and may compromise the basic operation of the sensor network. To prevent this, network connectivity must be restored by self-organization among the nodes in the network. One possible solution for connectivity restoration is to deploy redundant nodes in place of dead nodes. However, this solution is often impracticable due to the absence of human intervention. Ideally, the connectivity restoration process should be performed by the existing alive nodes in the network. The recovery mechanism needs to be robust, and the overhead on the sensor nodes involved in the recovery process should be minimal because of the resource constraints.
Failure of nodes in a sensor network results in different types of connectivity problems. This work categorizes the connectivity problems into four different cases: (1) cut-vertex failure, (2) end node failure, (3) two cut-vertex node failures, and (4) multiple end node failures. Details of all these cases are provided in Sect. 3.
Dealing with the above cases is a challenging task. This paper proposes a novel connectivity restoration technique called the smart node relocation (SNR) and connectivity restoration mechanism to handle the issues associated with network connectivity. The major focus of our work is to address the connectivity restoration problem with a mechanism capable of restoring connectivity using the existing nodes in the network; therefore, there is no need to deploy new nodes for restoring connectivity. Moreover, unlike other connectivity restoration techniques, SNR does not substantially reduce the network coverage through the movement of nodes. Because energy efficiency is one of the major concerns in sensor networks, SNR does not rely on the exchange of large numbers of messages for its operation. SNR can also detect all four identified connectivity problems and effectively find a solution for each case.
The rest of the paper is organized as follows. In Sect. 2, we describe the most relevant related work. Section 3 consists of the research method used for our research work, and we also elaborate on the four different cases related to connectivity restoration. Section 4 presents the simulation results, and subsequently, Sect. 5 concludes this paper.

Related work
Connectivity restoration in wireless sensor networks is an area that has been thoroughly studied by researchers [8][9][10][11][12]. Some solutions are based on relocation on-demand, while other solutions rely on post-deployment relocation. The applications requiring sensor nodes to be deployed over large geographic areas use the aerial deployment of the nodes. Due to this, node density throughout is not uniform, and some areas may have a higher density of nodes than some other areas. To achieve a uniform distribution of nodes, relocation of sensor nodes is desired so that connectivity can be established between the sensor nodes and the end-user, and the coverage area can be maximized.
The connectivity aspect is thoroughly studied in the literature, and several approaches have been presented [8][9][10][11][12]. Some works focus on maximizing the coverage of nodes without affecting connectivity. In [13], the authors considered robot networks for connectivity restoration. They introduce the concept of a 2-connected network, meaning that at least two paths should exist between each pair of nodes in the network. To deal with a node's failure, the algorithm strives to re-establish 2-connectivity by moving a pair of sensor nodes, and in this way connectivity is restored. In [14], the authors proposed a technique called C²AP, in which post-deployment relocation of nodes is used to improve coverage and connectivity. In [15], the authors proposed a hierarchical architecture called COCOLA, where coverage is maximized by the incremental relocation of higher-tier nodes without forwarding the data path to 1-tier nodes. However, neither C²AP nor COCOLA can deal with the implications of failed nodes. In [16], a solution based on cascaded node movement is introduced for restoring connectivity after node failures. In this technique, a nearby node replaces the failed node, the nearby node is in turn replaced by another node, and the process continues until a redundant node is found. In [17], the authors proposed a method called DARA. DARA uses a probability-based scheme to detect cut vertices and to select an appropriate neighbor of the failed node for relocation. The appropriateness of a neighbor is decided based on its number of communication links.
Cascaded movement of nodes is used in most recent connectivity restoration techniques, but these techniques do not take the role of the sink node and load distribution into account. Therefore, a sink-oriented cascade model is introduced in the Memetic Algorithm for Topology Optimization against cascading failures (MA-TOSCA) [18]. It helps wireless sensor networks avoid cascading failures through topology optimization. A local search operator is applied on a new network paired metric. As a result, it enhances network robustness and takes less time than other techniques. However, this algorithm does not consider the impact of the signal-to-interference-plus-noise ratio (SINR), which relates to the communication links. It is therefore a channel-unaware algorithm, which affects its overall performance.
The Energy-Aware Connectivity Restoration Mechanism for Cyber-Physical Systems of Networked Sensors and Robots [19] consists of three algorithms: CoRFL, CoRFL2, and CoRFLN. It restores connectivity while taking into account network lifetime, energy level, and ecological conditions. Node movement is controlled by a distributed algorithm based on fuzzy logic, residual energy, node rank, and distance. CoRFL handles the case in which the most qualified node is a cut-vertex: that node does not move, and a node farther away is asked to move instead. CoRFL2 and CoRFLN are used to handle the movement of cut-vertex nodes. The supervisor gathers information about the partition nodes, and the best recovery node is chosen from the live nodes using the fuzzy logic system. The supervisor administers the cascaded movement in such a way that it becomes connected to the other partitions. A proper coordination mechanism among the nodes is used. In CoRFLN, the nearest node is considered the standby, whereas CoRFL2 attempts to find a substitute by considering distance and residual energy.
Most current research on cascading failures in WSNs deals with single-sink networks, and only a few works consider multi-sink networks. MA-MSP is a memetic algorithm that helps WSNs withstand cascading failures by optimizing multi-sink placement. In this technique, a local search operator is designed based on a new network balancing metric. The proposed cascade model adequately characterizes the cascading process in multi-sink wireless sensor networks [20].
Efficient Solution for Connectivity Restoration (ESCR) [21] is an energy-efficient technique for restoring the connectivity of wireless sensor networks. The technique restores the network with efficient consumption of residual energy and minimal node movement. Only nodes that are near the faulty node and have more residual energy can participate in the network restoration process. As a result, unnecessary cascaded movement of nodes during restoration is avoided. Moreover, ESCR ensures that a node participating in the restoration process has enough energy not to be exhausted during the process. ESCR consists of two algorithms: Assigning Backup Nodes (ABN) and Connectivity Restoration Process (CPR). The ABN algorithm assigns backup nodes to every node according to their residual energy. This process is carried out at network start-up and is repeated after any node failure. CPR is the second algorithm, whose fundamental task is to restore the network. When a node fails, its backup node moves forward to take part in this process. ESCR was compared with other well-known techniques and was found to perform better. It was evaluated in an environment where sensor nodes are stationary and only actor nodes can move. Its results may vary if the mobility of sensor nodes is considered as well.
Geometric Skeleton-based Reconnection (GSR) is proposed in [22]. GSR employs a geometrical skeleton-based approach to logically partition networks into different segments. A group of nodes having maximum connectivity becomes the geometrical skeleton backbone. Each segment keeps track of all skeletal backbone nodes because it plays an important part in network partitioning. In the case of network partitioning, each segment tries to join the geometrical skeleton backbone. This process leads to the restoration of connectivity. However, GSR assumes that each node knows the locations of all other nodes in the network. Second, it is also a prerequisite that all nodes in the network must be aware of all nodes present in the geometrical skeleton backbone. These assumptions are impractical, particularly for large networks, because keeping all this information in a network with mobile nodes can cause massive overhead. Another problem that may arise during the network's operation is that the skeleton backbone may exhaust its energy soon, causing a decrease in such nodes. After a while, the lack of presence of such nodes may result in compromising the recovery mechanism.
An energy-efficient technique, Intelligent On-Demand Connectivity Restoration for wireless sensor networks (IDCRWSN) [23], has been presented that efficiently uses the residual energy of the sensor nodes. IDCRWSN restores connectivity through redundant nodes, which are managed by Slave Keeper nodes. The Slave Keeper nodes are in turn managed and controlled by Master Keeper nodes. The Permanent Relocation Algorithm for Centralized Actor Recovery (PRACAR) and the Self-Route Recovery Algorithm (SRRA) [24] address the connectivity restoration of failed actor nodes. PRACAR restores the connectivity of failed actor nodes, and SRRA provides an optimal path to the relocated sensor nodes.
All the above-mentioned works do not consider connectivity, coverage, and energy efficiency collectively. Our proposed work can be distinguished from the abovementioned works because it addresses connectivity restoration, better coverage, and efficient utilization of energy in an integrated manner.

Research method
In this work, we present the different cases that may arise when one or more nodes in a network die. Our aim is to propose a solution capable of recognizing all of the identified cases and then taking the necessary actions to restore connectivity. For our proposed algorithms, we assume that all sensor nodes are randomly deployed in the deployment area. After deployment, all nodes discover their neighbors by exchanging HELLO beacon messages. For the initial relocation of nodes, the mechanism from [17] is used. Algorithm 1 presents the steps used for initial node relocation, and Figure 1 shows the initial relocation scenario. It is assumed that all network nodes are homogeneous and have the same processing and communication capabilities. For each node, it is assumed that the sensing range and communication range are the same. In this paper, we use Rc to represent the sensing and communication range.
Once the nodes are deployed in an area, each node broadcasts HELLO beacon messages with a transmission range of Rc/2 to provide its location information to other nodes in the network [17]. Each node shares information, including its ID and current position, with all neighboring nodes in an acknowledgment (ACK). The ACK is also transmitted with a range of Rc/2. In addition, each node periodically broadcasts a synchronization message called a SYN message, again with a transmission range of Rc/2. SYN messages are used to identify failed nodes. For example, consider the scenario presented in Fig. 1. If node S7 has failed, then nodes S5, S6, and S8 will not receive SYN messages from S7. The absence of a SYN message indicates the failure of a node. Upon detecting a failed node, nodes S5, S6, and S8 send a HELLO message with a transmission range of Rc in the direction of the failed node. Each node that receives the HELLO message replies with an ACK message. In this way, each node updates its list of neighbors, and the nodes initiate mobility to restore the network, as depicted in Fig. 2.
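The SYN-based failure detection described above can be sketched as a simple timeout check. This is a minimal illustration and not the authors' implementation: the class name NeighborTable, the SYN period, and the number of missed SYN messages tolerated before a neighbor is declared failed are all assumptions, since the text only states that the absence of a SYN message indicates a failure.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborTable:
    """Tracks when a SYN message was last heard from each neighbor."""
    syn_period: float                 # assumed SYN broadcast interval (seconds)
    missed_limit: int = 3             # assumed number of missed SYNs tolerated
    last_syn: dict = field(default_factory=dict)

    def on_syn(self, node_id: str, now: float) -> None:
        # Record the arrival time of a SYN message from a neighbor.
        self.last_syn[node_id] = now

    def failed_neighbors(self, now: float) -> list:
        # A neighbor is presumed failed once no SYN has arrived
        # for more than missed_limit consecutive periods.
        timeout = self.syn_period * self.missed_limit
        return sorted(n for n, t in self.last_syn.items() if now - t > timeout)


# Hypothetical view from node S5: S6 keeps sending SYNs, S7 stops.
table = NeighborTable(syn_period=1.0)
table.on_syn("S6", 0.0)
table.on_syn("S7", 0.0)
table.on_syn("S6", 3.5)
print(table.failed_neighbors(now=4.0))
```

Tolerating a few missed SYN messages before declaring a failure is a design choice made here to avoid reacting to a single lost packet; a stricter reading of the text would set missed_limit to 1.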
Upon detecting the failure, it is important to understand the impact of failure on the network topology. In this work, we have categorized four scenarios that can occur due to single or multiple node failures.

Scenario 1: single cut-vertex failure
The cut-vertex scenario is illustrated in Fig. 2. It occurs when a node's failure divides the connected network into multiple disjoint partitions [21]. In Fig. 2, the failure of node F divides the network into two partitions, resulting in a cut-vertex failure. The failure of node F is detected by nodes S5, S6, and S8 through the absence of SYN messages from node F. Algorithm 2 illustrates the single cut-vertex failure detection and restoration process. To restore connectivity with our proposed algorithm, nodes S5, S6, and S8 broadcast a HELLO (also known as Heartbeat) message with a communication range of Rc. If a node receives an ACK from another neighboring node, node mobility is used to restore network connectivity. Upon receiving the ACK from neighboring nodes, all nodes update their routing lists. The solutions presented in [17,18,22] rely on the cascaded relocation of neighboring nodes in this scenario. Cascaded relocation of nodes has been shown to increase energy consumption, which quickly drains the batteries of the sensor nodes [4]. Cascaded relocation also shrinks the network coverage. Our proposed algorithm improves energy efficiency by avoiding unnecessary mobility of neighboring nodes and improves network coverage. The algorithm also gives priority to coverage; therefore, the resulting coverage holes are filled by the neighbors, which measure the overlapped distance according to [19].
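Whether a failed node is a cut-vertex can be checked by removing it from the topology and testing whether the remaining nodes are still mutually reachable. The sketch below is a centralized illustration under that definition only; it is not Algorithm 2, which works from local SYN, HELLO, and ACK exchanges. The function name and the example topology are hypothetical, and the network is assumed to be connected before the failure.

```python
from collections import deque


def is_cut_vertex(adjacency: dict, failed: str) -> bool:
    """Return True if removing `failed` partitions the remaining nodes.

    adjacency maps each node ID to the set of its neighbor IDs
    (assumed symmetric, and connected before the failure).
    """
    remaining = [n for n in adjacency if n != failed]
    if len(remaining) < 2:
        return False
    # Breadth-first search over the topology with the failed node removed.
    seen = {remaining[0]}
    queue = deque([remaining[0]])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb != failed and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    # Some node is unreachable exactly when the failure caused a partition.
    return len(seen) < len(remaining)


# Hypothetical topology: F joins the pair {S5, S6} to the chain S8 - S9.
topology = {
    "S5": {"S6", "F"},
    "S6": {"S5", "F"},
    "F": {"S5", "S6", "S8"},
    "S8": {"F", "S9"},
    "S9": {"S8"},
}
print(is_cut_vertex(topology, "F"), is_cut_vertex(topology, "S9"))
```

In this example the end node S9 is not a cut-vertex, which matches the distinction the paper draws between cut-vertex failures and end node failures.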

Scenario 2: single end node failure
End nodes are also referred to as leaf nodes. These are the nodes normally present at the edge of the network. Failure of end nodes does not affect inter-node connectivity [25][26][27][28]. However, their failure affects the coverage area. Upon detecting an end node failure (due to the absence of HELLO messages), the neighboring nodes calculate their overlapped coverage area with the failed node using the mechanism presented in [19]. The neighboring node with the largest overlap with the failed node is the suitable candidate for moving toward the failed node. During the movement, the node continues to exchange HELLO and ACK messages with its neighboring nodes using a communication range of Rc. This ensures that the movement of the candidate node does not disconnect the network. To illustrate this process, consider Fig. 3 and assume that node S4 has failed. The failure of S4 is detected by its neighbors, S3 and S6, through the absence of SYN messages from S4. Both S3 and S6 compute their relative overlapped sensing area with the failed node S4. Node S3 has a larger overlapped area with the failed node than S6. Therefore, S3 is selected as the neighbor responsible for moving toward the failed node S4. According to [19], S3 moves a maximum distance of Rc/4 in the direction of S4. This process is illustrated in Fig. 4, and the steps are presented in Algorithm 3.
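The candidate selection for an end node failure can be sketched as follows. The overlap computation of [19] is not reproduced in this paper, so the standard intersection area of two equal-radius discs is used here as an assumed stand-in; the function names and coordinates are hypothetical. With equal sensing ranges, the neighbor with the largest overlap is simply the one closest to the failed node.

```python
import math


def overlap_area(p, q, rc):
    """Intersection area of two sensing discs of radius rc centered at p and q."""
    d = math.dist(p, q)
    if d >= 2 * rc:
        return 0.0
    return 2 * rc**2 * math.acos(d / (2 * rc)) - (d / 2) * math.sqrt(4 * rc**2 - d**2)


def select_and_move(failed_pos, neighbors, rc):
    """Pick the neighbor with the largest overlap and move it toward the failed node.

    neighbors maps node IDs to (x, y) positions. The move is capped at rc / 4,
    the maximum distance stated in the text.
    """
    best = max(neighbors, key=lambda n: overlap_area(neighbors[n], failed_pos, rc))
    x, y = neighbors[best]
    d = math.dist((x, y), failed_pos)
    if d == 0:
        return best, (x, y)
    step = min(rc / 4, d)
    fx, fy = failed_pos
    return best, (x + step * (fx - x) / d, y + step * (fy - y) / d)


# Hypothetical layout: S4 failed at the origin; S3 is closer to it than S6.
chosen, new_pos = select_and_move((0.0, 0.0), {"S3": (40.0, 0.0), "S6": (0.0, 80.0)}, rc=100.0)
print(chosen, new_pos)
```

The sketch omits the HELLO and ACK exchange that the moving node performs during relocation to confirm that it stays connected to its neighbors.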

Scenario 3: two cut-vertex node failure
Two cut-vertex node failure is shown in Fig. 5, where multiple nodes fail simultaneously, causing a network partition. The implications of two cut-vertex node failures are substantial. In this case, HELLO messages broadcast with a transmission range of Rc will not be received by the nodes in the direction of a failed node because the distance between the nodes is greater than Rc (as illustrated in Fig. 5). The absence of an ACK means that the failed node has no immediate neighbors within the Rc range. To deal with this problem, a new type of message called a coordination message is broadcast with a transmission range of Rc/2 among the neighbors as they start to move toward the failed node. The coordination message ensures that no moving node goes out of range of its neighbor, which would cause further disruption in the network. The moving nodes continue to transmit HELLO messages and wait for ACK messages. Receiving an ACK message means that there is a node in the vicinity of the failed node that is capable of restoring connectivity. This process is illustrated in Fig. 6, and the steps are presented in Algorithm 4. Figure 6 shows the original position of the relocated nodes as well as their position after the relocation.
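The movement rule for this scenario can be sketched as a stepwise advance that stops either when an ACK is received or when a further step would break the link to a neighbor. This is a simplified illustration and not Algorithm 4: it assumes a single stationary neighbor (anchor), treats a link as intact while the distance is at most Rc, and models the receipt of an ACK as a node from the other partition coming within Rc. The function name, the step size, and the coordinates are hypothetical.

```python
import math


def advance_until_ack(start, target, anchor, far_nodes, rc, step=1.0):
    """Move from start toward target in fixed steps.

    Returns (position, ack_received). The node stops when a node in
    far_nodes is within rc (modelled as receiving an ACK) or when the
    next step would put it farther than rc from anchor.
    """
    total = math.dist(start, target)
    n_steps = math.ceil(total / step) if total > 0 else 0
    pos = start
    for i in range(n_steps + 1):
        frac = min(1.0, i * step / total) if total > 0 else 0.0
        candidate = (start[0] + frac * (target[0] - start[0]),
                     start[1] + frac * (target[1] - start[1]))
        # Coordination check: never step out of range of the neighbor.
        if math.dist(candidate, anchor) > rc:
            return pos, False
        pos = candidate
        # HELLO/ACK check: a node across the gap is now reachable.
        if any(math.dist(pos, f) <= rc for f in far_nodes):
            return pos, True
    return pos, False


# Hypothetical geometry: the gap left by the failed nodes lies along the x-axis.
pos, ack = advance_until_ack((0.0, 0.0), (100.0, 0.0), anchor=(40.0, 0.0),
                             far_nodes=[(112.0, 0.0)], rc=50.0, step=10.0)
print(pos, ack)
```

In the paper several neighbors move together and coordinate over a range of Rc/2, so a fuller model would apply the range check between every pair of moving neighbors rather than against one fixed anchor.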

Scenario 4: multiple end node failure
Multiple end node failure is illustrated in Fig. 7. Because WSNs operate in harsh environments, multiple end node failure is a possibility. Multiple end node failure can cause a large coverage hole, which is undesirable for various applications. To deal with such a situation, the neighbors of the failed nodes start to move toward the failed nodes and exchange SYN messages to report their change in location to all neighbors. The maximum movement of the neighbor nodes toward the failed nodes is Rc/4 (as assumed in most baseline works such as [17][18][19]). Figure 8 illustrates the movement of nodes toward the failed nodes to cope with multiple node failures. The multiple end node failure detection and recovery process is presented in Algorithm 5.

Fig. 6 Recovery of two cut-vertex node

Algorithm analysis
Five algorithms are given above. Each of these algorithms takes a certain amount of time, known as its time complexity and denoted by T(n), where n is the input size. Table 1 gives the time cost of each step in terms of its input. For example, step 4 of Algorithm 1 is a while-statement inside the loop. Its cost is t1, with t1 ≤ si, which means that it executes t1 times depending on the condition, but at most si times. Σ represents the sum of the costs of all steps in the respective algorithm. MaxΣ denotes the maximum value of this sum, that is, the cost when each t takes its largest possible number of occurrences. It can easily be seen that all the algorithms are linear in their input size.

Results and discussion
For the simulations, we used NS2 (Network Simulator 2). In all simulations, at time T = 0, sensor nodes are randomly deployed in a field with dimensions of 950 × 950 m². The communication range is varied between 25 and 150 m. Node density is varied in the simulation area by varying the number of nodes between 20 and 250. Table 2 lists all the simulation parameters. Each point in the graphs is calculated by running simulations with random seeds ten times. The results for the proposed algorithm are compared with the baseline algorithms MA-TOSCA [18], MA-MSP [20], ESCR [21], GSR [22], and IDCRWSN [23]. The following sections explain the results obtained from extensive simulations. Figure 9 shows the effect of increasing the number of nodes on the total distance moved by nodes for connectivity restoration. It can be observed that our proposed protocol SNR performs well compared to all the baseline algorithms. The major reason is that SNR moves only the critical nodes near the failed nodes. In contrast, all the baseline algorithms also rely on the movement of non-critical nodes, resulting in cascaded relocation. Therefore, the average distance moved under all baseline protocols is much larger than under our proposed algorithm. Cascaded relocation increases both the average distance moved by the nodes during recovery and the average energy consumption. Because SNR reduces cascaded relocation compared to the other protocols, it proves to be more efficient.

Fig. 9 Nodes vs. Distance Moved during relocation

Number of nodes moved
The number of nodes moved by the considered protocols as the total number of nodes in the network increases is presented in Fig. 10. As the number of nodes increases, the number of nodes moved by all protocols increases. However, our proposed protocol SNR outperforms all the considered baseline protocols because its increase in the number of nodes moved is smaller. Cascaded relocation is the main reason why more nodes move on average under all the considered baseline algorithms, and cascaded relocation increases as the number of nodes in the network increases. It is evident from Fig. 10 that SNR is scalable and performs well as the number of nodes in the network increases.

Reduction in field coverage
The percentage reduction in field coverage for different communication ranges is shown in Fig. 11. Two factors contribute to the reduction in field coverage: first, the nodes that die due to complete drainage of their batteries; second, the node movement to restore connectivity. Our proposed protocol aims to achieve connectivity restoration by reducing the number of exchanged messages (for energy efficiency) and by moving only critical nodes (for restoring connectivity and improving field coverage in case of failed nodes). It is evident from the figure that, as the communication range of the nodes increases, the percentage reduction in field coverage decreases for all considered protocols. The percentage reduction in field coverage under our proposed protocol SNR is lower than under all the baseline protocols. Among all considered protocols, GSR yields the largest percentage reduction in field coverage. The major reason for this observation is its excessive use of cascaded relocation for connectivity restoration. Other protocols also move non-critical nodes for connectivity restoration, which increases the energy consumed by movement and leads to the failure of more nodes in the network. This ultimately decreases the coverage area. For SNR, the percentage reduction in field coverage remains below 2 percent for all the considered communication ranges.
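The percentage reduction in field coverage can be estimated by sampling the field on a grid and comparing the covered fraction before and after failures and relocation. The paper does not state how coverage was measured in the simulations, so the grid-sampling method, the function names, and the parameter values below are assumptions made for illustration.

```python
import math


def coverage_fraction(nodes, rc, side, grid=50):
    """Fraction of grid sample points within rc of at least one node.

    The square field [0, side] x [0, side] is sampled at the centers
    of a grid x grid lattice of cells.
    """
    covered = 0
    for i in range(grid):
        for j in range(grid):
            p = ((i + 0.5) * side / grid, (j + 0.5) * side / grid)
            if any(math.dist(p, n) <= rc for n in nodes):
                covered += 1
    return covered / grid**2


def coverage_reduction_pct(before, after, rc, side, grid=50):
    """Percentage drop in covered area between two sets of node positions."""
    base = coverage_fraction(before, rc, side, grid)
    if base == 0:
        return 0.0
    return 100.0 * (base - coverage_fraction(after, rc, side, grid)) / base


# Hypothetical example: one of two non-overlapping nodes fails and nothing moves.
before = [(25.0, 25.0), (75.0, 75.0)]
after = [(25.0, 25.0)]
print(coverage_reduction_pct(before, after, rc=10.0, side=100.0))
```

A finer grid gives a better area estimate at a higher cost; for a 950 × 950 m field the default of 50 samples per side would be too coarse for the smaller communication ranges.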

Number of exchanged packets
The average number of packets exchanged by all considered protocols is illustrated in Fig. 12. Each protocol transmits several different types of packets during its operation. For Fig. 12, the communication range is varied between 25 and 150 m. It can be observed from the figure that, as the communication range increases, the number of packets exchanged increases for all considered protocols. The operational details of each protocol affect the number of packets that need to be exchanged. Moreover, the decisions made regarding the movement of nodes also play a key role. Whenever a node is relocated, it needs to exchange different control packets with its neighboring nodes. The more nodes a protocol relocates, the more packets are exchanged. It can be seen from the figure that GSR exchanges the largest number of packets of all the protocols. Cascaded relocation is one of the major factors that increases the number of packets for all baseline protocols. Because our protocol moves only the critical nodes, it avoids unnecessary relocation of nodes and therefore exchanges the fewest packets. Exchanging fewer packets also makes our proposed protocol more energy efficient, because packet exchange requires energy. Therefore, SNR proves to be the most energy-efficient protocol among all considered protocols.
For the continuous operation of sensor networks, connectivity restoration is of immense importance, and a technique capable of restoring the connectivity is crucial for smooth operation. A connectivity restoration technique should be self-organizing, coverage-aware, and energy-efficient. By studying the literature, it was observed that most of the solutions for connectivity restoration focused on only one of the above features but not all of them collectively. This research aimed to design a novel connectivity restoration mechanism that effectively restores connectivity by moving fewer nodes than existing techniques. Another focus was to keep the connectivity restoration technique energy efficient by exchanging a minimal number of control messages. Last but not least, the technique should minimize the reduction in field coverage. Our proposed connectivity restoration mechanism achieves all the above objectives. Extensive simulations proved the effectiveness of our proposed protocol.

Conclusion
Node failures pose serious challenges for researchers because they may lead to connectivity disruption among the nodes. It is essential to have a connectivity restoration mechanism capable of efficiently restoring connectivity in case of single or multiple node failures. In this paper, a novel connectivity restoration technique called SNR is proposed. SNR is capable of detecting single and multiple node failures and efficiently restores connectivity by relying on the movement of a minimal number of nodes. It does not rely on the exchange of excessive control messages and also avoids the problem of cascaded relocation by relocating a minimal number of nodes for connectivity restoration. It also preserves field coverage better, yielding the smallest percentage reduction in field coverage among the compared approaches.
Future work can be done in two possible directions. The first direction is the development of a simple yet flexible analytical model for calculating the performance metrics under different mobility models. The second direction involves the real-world implementation of the proposed solution for extensive performance evaluations.