Energy-efficient hierarchical routing in wireless sensor networks based on fog computing

Wireless sensor networks (WSNs) have been recognized as one of the most essential technologies of the twenty-first century. Their applications are growing rapidly in almost every sector because they can be deployed in areas where cabling and power supply are difficult to provide. Different methods have been proposed in the literature to minimize the energy consumption of sensor nodes and so prolong the useful lifetime of WSNs. In this article, we propose an efficient routing protocol for data transmission in WSNs, called the energy-efficient hierarchical routing protocol for wireless sensor networks based on fog computing. Fog computing is integrated into the proposed scheme because of its capability to optimize the use of the limited power source of WSNs and its ability to scale up to the requirements of Internet of Things applications. In addition, we propose an improved ant colony optimization algorithm that can be used to construct an optimal path for efficient data transmission between sensor nodes. The performance of the proposed scheme is evaluated against the P-SEP, EDCF, and RABACO schemes. The simulation results show that the proposed approach reduces sensor nodes’ energy consumption and data packet losses and extends the network lifetime. We are aware that in WSNs, the certainty of the sensed data collected by a sensor node can vary for many reasons, such as environmental factors, drained energy, and hardware failures.

identification, sensing type, communication capabilities, mode of operation, sensing rate, frequency [1]. Heterogeneity of the network provides greater flexibility in deployment with low overhead. The sensor nodes in the proposed scheme consist of low-end nodes and high-end nodes. The low-end nodes are normal sensor nodes with limited computation and sensing abilities. They are responsible for gathering and transmitting sensed data.
The high-end nodes consist of sensors with high processing power, long communication range, and high throughput. A network with heterogeneous sensor nodes can achieve load balancing and prolong the network lifetime. WSNs are used ubiquitously in modern society, including in hydraulic power plants for environmental monitoring, in vehicle production in the automobile industry, for heat energy in greenhouses, and in forest monitoring [2,3]. In conventional WSNs, sensor nodes transmit their raw data to a base station for analysis and storage.
Recent improvements in IoT technologies have enabled WSNs to transmit raw data to the cloud for processing and storage. This process requires extremely high network bandwidth and drains the limited energy of sensor nodes [4]. Instead of transmitting the huge amount of sensed data generated by the sensor nodes over the network and then processing them on a cloud computing platform, some processing can be performed close to the sensor networks using a technique called fog computing. Fog computing is composed of networking devices such as routers, proxy servers, set-top boxes, and gateways [5]. These devices have higher processing capability and storage than sensor nodes. They can be placed between the sensor nodes and the cloud to receive sensed data from the nodes. This technique not only reduces sensor nodes' energy dissipation during transmission but also provides low latency, location awareness, and high bandwidth for WSNs. The term 'fog computing' was introduced by Cisco to overcome limitations in cloud computing [6,7]. Fog computing provides services closer to end-users. It consists of fog nodes (FNs), which provide resources at the edge of the network. They have high processing power, a power source, and more storage, and can be used as fog servers in the network. Fog computing offers several benefits to end-users; these include efficient network bandwidth usage, data security, fewer bottlenecks, lower latency on the network, increased reliability of transmitted sensed data, and a higher speed of analysis [8]. Fog computing is integrated into our architectural model to address some of the constraints of WSNs and to exploit the benefits mentioned above.
FNs can receive raw data from WSNs, aggregate and pre-process them, and store them temporarily, instead of sensor nodes forwarding all the raw data to the cloud, thus reducing network bandwidth usage and latency [9].
The integration of fog computing with the WSNs and IoT creates a new type of service called fog as a service (FaaS) [10]. However, for fog computing to fully deploy its service, each fog node should meet the following requirements: (1) concurrent data collection from low-end nodes, (2) high processing power and performance to support real-time data processing and analysis, (3) high service reliability, and (4) low power consumption, to achieve long-term utilization. Fog nodes in fog computing architecture function as middleware, operating between end-users and cloud computing. In the context of WSNs, they provide resources to the underlying sensor nodes. The integration of fog computing into WSNs can overcome numerous problems [11,12].
Routing by ant colony optimization (ACO) is used to find the optimal path from source nodes to destination nodes in WSNs, using the following metrics: transmission distance, node's energy level, quality of service (QoS), pheromone, and heuristic factor [13,14]. Ants in a routing algorithm are agents traveling from one node to another in the network. They deposit a chemical substance called pheromone on their path as they traverse the network. This method has been used to solve many optimization problems in WSNs [15][16][17].
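To make the roles of pheromone and heuristic factor concrete, the following is a minimal sketch of the textbook ACO next-hop rule. It is not the improved algorithm proposed in this article. The function names, the exponents `alpha` and `beta`, the evaporation rate `rho`, the deposit constant `q`, and the use of plain dictionaries keyed by edges are all illustrative assumptions.

```python
import random


def transition_probabilities(current, candidates, pheromone, heuristic,
                             alpha=1.0, beta=2.0):
    """Textbook rule: p(j) is proportional to pheromone^alpha * heuristic^beta."""
    weights = {
        j: (pheromone[(current, j)] ** alpha) * (heuristic[(current, j)] ** beta)
        for j in candidates
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next_hop(probabilities, rng=random):
    """Roulette-wheel selection of the next node."""
    nodes = list(probabilities)
    return rng.choices(nodes, weights=[probabilities[n] for n in nodes], k=1)[0]


def update_pheromone(pheromone, path, path_cost, rho=0.1, q=1.0):
    """Evaporate every trail, then deposit q / path_cost on one ant's path.

    Every edge of `path` is assumed to already be a key of `pheromone`.
    """
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)
    for edge in zip(path, path[1:]):
        pheromone[edge] += q / path_cost
    return pheromone
```

In a routing setting, the heuristic value of an edge could, for example, grow with the residual energy of the next node and shrink with the transmission distance; that choice is also an assumption rather than something specified here.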
This work is different from the work presented in [18] as follows.
We develop an algorithm that can uniformly distribute the advanced nodes in the network for uniform energy distribution among the nodes.
We ensure that only advanced nodes can be selected as cluster heads, unlike in [18], where the authors stated that either normal nodes or advanced nodes can be used as cluster heads. The latter can create an energy hole in the network and result in the loss of data packets.
We develop an improved ACO algorithm to construct an efficient and optimal path among the nodes.
Continuous technological advancements and innovations in the twenty-first century have increased the development and applications of WSNs in many areas of human endeavor. However, the sensor nodes that form WSNs have a limited power source. It is essential to preserve the lifetime of individual sensor nodes because of the sensitive data they transmit in most application areas. To overcome this challenge, fog computing is integrated into our scheme. Fog computing can address some of the shortcomings of WSNs and can meet the requirements of IoT applications. Moreover, WSNs generate a large amount of sensed data, which is challenging for the nodes to handle. Aggregating, filtering, and transmitting all sensed data requires devices with higher processing and memory capacity; fog computing can provide these functionalities.
Finally, this technology provides low-latency, storage-scalable, ubiquitous, layered, and federated network connectivity between WSNs and cloud computing.
The proposed method consists of a three-tier hierarchical network structure, as shown in Fig. 1. We discuss each of these tiers in Sect. 3.
The contributions of this paper are as follows:
1. We propose an energy-efficient hierarchical routing scheme in wireless sensor networks based on fog computing (EEHFC).
2. The proposed scheme consists of many nodes with different capabilities. We derive mathematical expressions for the specifications of each device in the network for seamless integration with one another.
3. We develop mathematical models for normal sensor nodes, advanced nodes, and fog nodes representing their physical components based on their service delivery for efficient data transmission in the networks.
4. We develop mathematical models for the proposed scheme. An algorithm is designed for uniform distribution of advanced nodes to achieve load balancing within the network. In addition, an algorithm is designed for reliable end-to-end transmission of data packets between the fog nodes and cloud computing.
5. An improved ACO algorithm is developed to adjust routes to minimize the total number of broadcast transmissions in the network.
6. We evaluate EEHFC using real data sets from the wireless environment. Results show that EEHFC performs better than the three related schemes.
The locations and number of sensors deployed in an area of interest determine the topology of the network, which in turn influences many of its fundamental properties, such as its lifetime, connectivity, cost, and coverage. Thus, the performance of a sensor network depends to a large extent on its deployment and the ecological conditions. Indeed, the information provided by sensors may not always be reliable, owing to environmental factors such as Gaussian white noise, hardware failures, security issues, or operational tolerance levels. Therefore, it is very important to take this uncertainty into account in the deployment process and in the results that are obtained.
The remainder of this article is structured as follows: Sect. 2 discusses related work, while the fog computing architecture is presented in Sect. 3. Section 4 focuses on performance evaluation and a discussion of the results before Sect. 5 concludes the article.

Related work
The unique characteristics and wide application of WSNs have motivated researchers to propose various approaches and algorithms to optimize the performance of sensor nodes in networks with respect to many metrics (e.g., latency, throughput, energy cost, scalability, average energy consumption, end-to-end delay). Most of the approaches in the literature are based on the existing protocol [19], and attempts have been made to improve on it. Some of these proposed protocols are evaluated using simulations, while others are implemented on testbeds. Some of the outstanding approaches proposed in the literature to minimize energy consumption in WSNs are presented below, and the strengths and weaknesses of each approach are highlighted. The review is organized under three subsections: hierarchical energy-aware routing protocols, heterogeneous routing protocols, and ACO routing protocols.
To conserve the limited energy available to sensor nodes and extend the network lifetime, sensor nodes are usually organized into clusters in layers to form a hierarchical clustering network [20]. In each cluster, at least one node is selected as a cluster head (CH), depending on the protocol used and the size of the network. The CHs are responsible for collecting sensed data from member nodes, aggregating them, and forwarding them to a base station via multi-hop communication. LEACH was the first protocol proposed for clustering sensor nodes in WSNs based on the hierarchical routing technique [21]. Most hierarchical routing protocols proposed thereafter have used the LEACH protocol as a benchmark.
However, this protocol has several shortcomings: (1) the sensor nodes selected as CHs in LEACH are not evenly distributed within the network; thus, nodes that are far from the CH transmit over long distances and consume more energy during transmission; and (2) the CH selection method is based on probability, which may increase the overhead of selecting new CHs and may increase energy consumption.
Wang et al. [22] proposed a novel energy-aware hierarchical cluster-based (NEAHC) routing protocol for WSNs. The goals of their work are to minimize the total energy consumption and to balance energy consumption between sensor nodes. They developed an algorithm for the proposed scheme, which is divided into two phases: the cluster setup phase and the steady-state phase.
They model relay node selection as a nonlinear programming problem and use the property of convex functions to determine the optimal solution. The proposed scheme was evaluated through simulations and compared with two other related protocols. The proposed scheme minimizes communication cost, and direct data transfer by nodes close to the base station prolongs the network lifetime compared with the selected related protocols.
However, the network area and the number of nodes used in their simulation are small. This approach may not perform as well in a large network, for instance, one with 500 or more nodes.
Xie et al. [23] proposed energy-efficient routing for mobile data collectors in WSNs with obstacles by dividing the network region into grid cells of the same size. This approach provides a convenient construction of the spanning graph, using the line sweep approach.
The graph is composed of cells, which include the shortest search route for the mobile data collector (MDC). This method represents a heuristic tour-planning algorithm based on a complete graph. Performance evaluation of this approach shows that it is able to successfully dispatch the MDC and extend the lifetime of the WSNs. However, only one MDC was used to evaluate this approach, when it should be evaluated using many collectors.
Ayoub et al. [24] proposed a cluster-based multi-hop advanced heterogeneity-aware energy efficient (MAHEE) protocol, which reduces the energy consumption of sensor nodes by choosing optimal CHs, thus allowing multi-hop inter-cluster communication using equal amounts of energy among all nodes. The algorithm is based on a clustering path planning algorithm for WSNs. In the proposed method, a node with higher energy is selected as a CH from among the nodes in the network, which achieves load balancing among the other nodes. MAHEE consists of two types of sensor nodes (normal and advanced), which are equipped with different initial energy values. This helps to select a suitable CH based on the residual energy and distance from a base station. Simulation results show that the proposed approach performs better than existing state-of-the-art heterogeneous routing protocols and improves network stability and lifetime.
However, there is a high possibility that the CHs which are selected are not randomly distributed within the network.Thus, sensor nodes that are far from the CHs will transmit across a long distance and dissipate more energy.
Qureshi et al. [25] developed the Balanced Energy-Efficient Network-Integrated Super Heterogeneous (BEENISH) protocol. The protocol is an extension of DEEC [26] in terms of choosing a CH based on the residual energy level of the nodes with respect to the average energy of the network. BEENISH consists of four types of sensor nodes: normal, advanced, super, and ultra-super nodes, each with a different energy level. The authors used the concept in [19] to divide the network into different clusters, in which ultra-super nodes have a higher probability of being selected as CHs than other types of nodes (because they are high-energy nodes). Simulation results show that this protocol performs better than previous clustering protocols in heterogeneous WSNs. However, the performance of the protocol is based on only two metrics (the number of alive nodes and data packets), which is not enough to determine how efficient and reliable the proposed protocol is compared to other related heterogeneous protocols.
An efficient CH selection was proposed to extend the network lifetime through an approach called Energy-Dependent Cluster Formation in heterogeneous WSNs (EDCF) [27]. The authors employ a clustering process to reduce energy consumption and extend the network lifetime. CH selection starts with the generation of a random number between zero and one. An individual sensor node that wants to become a CH generates that number and checks it against its threshold function. If the number generated is less than the threshold value, the node is selected as the CH for that round. Otherwise, the same process is repeated by the next node. The results of the experiment show that the proposed protocol is better at prolonging the network life span than related heterogeneous protocols.
However, the proposed scheme is similar to the approach in [19], which means it will have the same weaknesses.
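The random-draw election described above can be sketched as follows. Because the threshold function itself is not given here, the sketch assumes the familiar LEACH-style threshold T = p / (1 - p (r mod 1/p)); the function names, the parameter `p` (desired fraction of CHs), and the omission of any energy-dependent term are assumptions made for illustration only, not the method of [27].

```python
import random


def leach_threshold(p, round_number):
    """Assumed LEACH-style threshold for a node that is still eligible."""
    period = round(1 / p)  # number of rounds in one election cycle
    return p / (1 - p * (round_number % period))


def elects_itself(p, round_number, rng=random):
    """A node becomes CH for this round if its draw in [0, 1) is below the threshold."""
    return rng.random() < leach_threshold(p, round_number)
```

With `p = 0.1`, the threshold starts at 0.1 in the first round of a cycle and rises toward 1 by the last round, so every eligible node is eventually elected within the cycle.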
A fog-based energy-efficient routing protocol for WSNs (P-SEP) is proposed in [18]. The proposed scheme uses PEGASIS-based routing of fog nodes (FECR) and ACO-based routing of fog nodes (FEAR) algorithms for the heterogeneous WSN. The selection of cluster heads (CHs) is based on a probability function that considers the initial energy and current energy of sensor nodes in the network. The CHs transmit their data packets to the closest fog node, which further processes and forwards its data to the cloud using the FECR and FEAR algorithms. The results of the simulation show that the approach decreases energy consumption and increases network lifetime.
However, the authors claimed that either a normal node or an advanced node can be selected as a CH. If two or more normal nodes are consecutively selected as CHs while advanced nodes remain members of a cluster, an energy hole will be created, resulting in the loss of data packets being transmitted in the network.
Mohajerani and Gharavian [28] proposed a life-time-aware routing algorithm for WSNs (LTAWSN). A unique pheromone function was derived to integrate hops and energy consumption into the routing choice. The main aim of this approach is to optimize the energy consumption of the sensor nodes in WSNs. LTAWSN is better at minimizing energy consumption and extending the network lifetime than related routing algorithms for WSNs.
Sun et al. [13] proposed a unique routing algorithm called RABACO, based on the position information and search direction in ACO. This improves the network routing algorithm by considering the residual energy, heuristic function, sensor node communication distance, searching range, and transmission direction from the source node to the destination node. Simulation results show that the proposed algorithm minimizes the average energy consumption and prolongs the network lifetime.

Design of the study
The proposed scheme consists of three types of nodes, as shown in Fig. 1. The normal sensor nodes ($N_S$) and the advanced nodes ($N_A$) are distributed within the network, while the fog nodes ($N_F$) are placed at the edge of the network (middle layer). Advanced nodes are included in the sensor networks to eliminate the overheads that are usually involved in the selection of CHs and unnecessary energy wastage during network operations. This differs from previous approaches reported in the literature, which are based on cluster formation and the selection of CHs [29][30][31]. In addition, the inclusion of fog computing ensures that not all sensed data are transmitted to the cloud for further processing and storage. Fog nodes can pre-process the received data instead of sending all the raw data to the cloud for processing and storage.
Given the short transmission range of sensor nodes, we divided the network into clusters with, on average, equal numbers of nodes. Normal nodes that are geo-spatially close to an advanced node are grouped together to form a cluster. The relationship between the normal and advanced nodes is many-to-one, that is, many normal nodes belong to one advanced node. The number of clusters depends on the number of nodes in the network and on the application area, since sensor nodes are application dependent. To ensure load balancing between the $N_S$ and $N_A$, we developed an algorithm (Algorithm 1) for uniform distribution of $N_A$ within the network, to avoid placing $N_A$ on the same side of the network.
$N_A$ has an injective (one-to-one) mapping with $N_F$, which is placed at the edge of the network. The clusters comprising the $N_S$ and $N_A$ within that location are served by the edge nodes $N_F$. A normal node may leave or join any $N_A$ based on its geographical distance, connecting to or disconnecting from the corresponding $N_A$.
Cloud computing consists of data centers that host servers. It is responsible for the extensive analysis and storage of all sensed data received from $N_F$. This is unlike conventional cloud computing architecture, in which all sensed data are transmitted directly from sensor nodes to the cloud for the analysis and storage of a large volume of data. Thus, with the inclusion of fog computing in our scheme, a moderate amount of relevant data is transmitted to the cloud for processing, analysis, and storage, leading to improved utilization and efficient use of cloud platforms.
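The grouping of normal nodes around advanced nodes can be illustrated with a short sketch. Algorithm 1 is not reproduced in this section, so `grid_positions` is only a hypothetical stand-in that spreads advanced nodes over a grid; `assign_to_nearest_advanced` implements the stated rule that a normal node belongs to the geographically closest advanced node. All names, the 2-D coordinates, and the rectangular deployment area are assumptions.

```python
import math


def grid_positions(k, width, height):
    """Hypothetical even placement: centres of a near-square grid with at least k cells."""
    cols = math.ceil(math.sqrt(k))
    rows = math.ceil(k / cols)
    cells = [((c + 0.5) * width / cols, (r + 0.5) * height / rows)
             for r in range(rows) for c in range(cols)]
    return cells[:k]


def assign_to_nearest_advanced(normal_nodes, advanced_nodes):
    """Map each normal node ID to the ID of its closest advanced node (many-to-one)."""
    membership = {}
    for node_id, position in normal_nodes.items():
        membership[node_id] = min(
            advanced_nodes,
            key=lambda adv_id: math.dist(position, advanced_nodes[adv_id]),
        )
    return membership
```

Re-running `assign_to_nearest_advanced` after a node moves, dies, or is added reproduces the leave/join behavior described above.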

Assumptions for the proposed scheme
The integration of fog nodes into WSNs is still new and yet to be implemented on a large scale. Therefore, we assumed the following for this network.
• The lowest layer of the network consists of normal nodes and advanced nodes, which differ in terms of energy and capacity;
• Normal sensor nodes are only responsible for sensing environmental data and data transmission;
• Advanced nodes are responsible for receiving sensed data from member nodes;
• Fog nodes aggregate sensed data into a single packet;
• Fog nodes can process and store data in addition to routing data to cloud computing;
• Advanced nodes can transmit directly to the fog nodes; and
• Fog nodes' batteries can be recharged and easily replaced if they have run out.

Energy model for the proposed scheme
In this subsection, we develop an energy model for the proposed scheme, which consists of normal nodes and advanced nodes. In this model, the energy of advanced nodes is higher than that of normal nodes in the network. $A$ denotes the total number of sensor nodes in the network, the number of normal nodes is denoted by $P$, and the fraction of advanced nodes is denoted by $K$. If the initial energy of a normal node is denoted by $E_f$ and the energy of an advanced node is $h$ times more than the energy of a normal node, then the energy of an advanced node is $(1 + h) E_f$. The energy of the normal nodes is $P E_f$ and that of the advanced nodes is $K (1 + h) E_f$. Therefore, the total energy of the nodes in the network is $P E_f + K (1 + h) E_f$. This model shows that each advanced node has $h E_f$ more energy than it would in a network of normal nodes only, whose total energy is $A E_f$. We adopted the radio energy model in [19] for the communication distance between the sensor nodes.
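A small numerical sketch of this energy budget follows. It treats $K$ as a count of advanced nodes, which is how it enters the expression $K(1+h)E_f$, and assumes $A = P + K$; the function name and the example values (90 normal nodes, 10 advanced nodes, 0.5 J initial energy, $h = 1$) are illustrative assumptions, not parameters taken from the article.

```python
def total_initial_energy(num_normal, num_advanced, e_f, h):
    """Return P * E_f + K * (1 + h) * E_f, with K taken as a node count (assumption)."""
    return num_normal * e_f + num_advanced * (1 + h) * e_f


# Illustrative values only: 90 normal nodes, 10 advanced nodes, E_f = 0.5 J, h = 1.
heterogeneous = total_initial_energy(90, 10, 0.5, 1.0)
homogeneous = (90 + 10) * 0.5            # A * E_f for a network of normal nodes only
extra = heterogeneous - homogeneous      # equals K * h * E_f
```

Under these assumed values, the advanced nodes add $K h E_f$ joules on top of the homogeneous budget $A E_f$.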

Energy consumption of a sensor node
WSNs are usually deployed to collect environmental data in an area of interest. Every sensor node has a sensing rate and, at regular time intervals, transmits its sensed data to the nearest fog node. The source transmits the packets to the fog node by utilizing the services of intermediate relay nodes if the receiving node (fog node) is not within its transmission range, thus reducing the probability of packet loss and energy consumption [32]. We assume that the maximum transmission range of every node is denoted by the distance $R$. If the distance $d_{i,j}$ between two nodes $i$ and $j$ is less than $R$, then the energy $E^{T}_{i,j}$ dissipated by a sensor node to transmit 1 bit of a data packet from $i$ to $j$ can be expressed as:
$$E^{T}_{i,j} = E_{Tx} + \xi_{amp}\, r^{\beta}_{i,j}$$
where $E_{Tx}$ is the energy per data packet consumed by the transmitter electronics, $\xi_{amp}$ is the transmit amplifier dissipation per bit per square meter, and the term $r^{\beta}_{i,j}$ connotes the energy loss due to channel transmission. It is assumed that the receiving energy $E_{Rx}$ is constant.
If the sensing rate at node $i$ is denoted by $O_i$, the energy consumed, $E_{consumed}$, by a sensor node to transmit a data packet at time $t$ is expressed as:
Based on this model, sensor nodes that are one hop from the fog node(s) will receive higher data traffic than nodes that are multiple hops from the fog nodes. They will therefore consume more energy than sensor nodes that are far from the fog nodes.
Fog nodes are not power constrained, so they can receive and pre-process sensed data from the entire sensor network. Unlike conventional cloud architecture, in fog computing, not all data packets are forwarded to data centers in the cloud for processing. Instead, all real-time analysis and latency-sensitive data packets are processed at the fog layer. The fog nodes in this layer have limited semi-permanent storage that allows them to store the received data temporarily for analysis and then send the necessary feedback to the source devices.
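The per-bit transmission cost and the benefit of relaying can be sketched numerically. The sketch assumes the usual first-order radio form, electronics energy plus an amplifier term that grows with distance raised to a path-loss exponent, which matches the symbols $E_{Tx}$, $\xi_{amp}$, and $\beta$ named in the text. The default values (50 nJ/bit, 100 pJ/bit/m², $\beta = 2$), the function names, and the way receive energy is charged once per hop are illustrative assumptions.

```python
def transmit_energy_per_bit(distance, e_tx=50e-9, xi_amp=100e-12, beta=2.0):
    """Assumed first-order radio model: E_Tx + xi_amp * distance**beta (joules per bit)."""
    return e_tx + xi_amp * distance ** beta


def path_energy(hop_distances, bits, max_range, e_rx=50e-9, **radio):
    """Energy to move `bits` along a multi-hop path; each hop must be shorter than R."""
    if any(d >= max_range for d in hop_distances):
        raise ValueError("a hop is not within the transmission range R")
    tx = sum(transmit_energy_per_bit(d, **radio) for d in hop_distances)
    rx = e_rx * len(hop_distances)  # constant receive cost, charged once per hop
    return bits * (tx + rx)


# Two 50 m hops through a relay versus one direct 100 m hop, for a 1-bit payload.
relayed = path_energy([50.0, 50.0], bits=1, max_range=150.0)
direct = path_energy([100.0], bits=1, max_range=150.0)
```

With these assumed constants the relayed path is cheaper than the direct one, because the amplifier term grows with the square of the hop distance.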

Mathematical model
The mathematical model of our scheme is described in this section. The operations and all entities involved are defined. It is assumed that the number of nodes distributed within the network in Tier 1 will not change during network operation. In addition, the clusters cover all the nodes in their domains. The physical components of the proposed scheme are defined as follows:

Physical components of Tier 1
Normal nodes ($N_S$): A normal node, denoted by $N$, is expressed as a tuple, as follows.
Each parameter of Eq. (4) is defined as follows.
Definition 1 (Identification of $N_S$) $N_{id}$ is the unique identification (ID) assigned to each normal node in the network.
Definition 2 (Present condition of $N_S$) The present condition of $N$ is denoted by $N_{pc}$. It shows the current state of the node, whether it is in an active or an inactive state; 1 denotes an active state and 0 an inactive state. It is expressed as a Boolean, $N_{pc} = \{0, 1\}$.
Definition 3 (Event type of $N_S$) The type of environmental data (event) that $N_S$ senses is denoted by $N_{te}$. It is expressed as $N_{te} = \{N_{te_1}, N_{te_2}, N_{te_3}, \ldots, N_{te_k}\}$, where $N_{te}$ denotes the set of environmental data being monitored by $N_S$ and $k$ is the total number of events.
Definition 4 (Geo-spatial location of $N_S$) The geo-spatial location ($N_{ln}$) of $N$ is expressed in three-dimensional space in terms of $x$, $y$, and $z$, which represent the horizontal position, vertical position, and height of a node, respectively, together with the time $t$ at which $N$ sends its data to a neighboring node or advanced node.
Definition 5 (Component of $N_S$) The component of $N_S$ is denoted by $N_{ct}$, which includes hardware specifications, the mode of operation, the sampling rate, and frequency. It is expressed as a six tuple as follows.
$$N_{ct} = \langle N_{ps}, N_{mc}, N_{st}, N_{ch}, N_{dr}, N_{op} \rangle$$
where $N_{ps}$ denotes the power supply. In most cases, sensor nodes are powered by small batteries, and $N_{ps}$ describes, in detail, the type of batteries a node uses (e.g., alkaline, lithium, or zinc-air batteries) and their size (AA or AAA). The tuple $N_{mc}$ denotes the microcontroller; it contains one or more processor cores along with memory and programmable input/output peripherals. The details of the processor core include processor speed, cache memory, and bus specifications. The tuple $N_{st}$ represents the different types of sensors that are used as subunits of a sensor node. The tuple $N_{ch}$ denotes the sensor node's communication hardware used for wireless communication; this includes ZigBee, radio frequency identification (RFID), and Bluetooth. The tuple $N_{dr}$ denotes the data rate of a sensor node. The tuple $N_{op}$ denotes the operating system (OS) a sensor node uses, which could be TinyOS, Mantis, or Contiki OS [33]. The application and application instances of the sensor nodes are defined as tuples below.

Definition 6 (Application of $N_S$)
The application of a sensor node ($A_{pp}$) is expressed as a four tuple, as follows.
$$A_{pp} = \langle A_{id}, A_{tp}, A_{cp}, A(t) \rangle$$
where the tuple $A_{id}$ denotes the unique identification (ID) of an application. The tuple $A_{tp}$ denotes the type of domain in which the application is applied (e.g., smart city, healthcare, automobile, home security, or industrial applications). The tuple $A_{cp}$ denotes the minimum components the system needs to successfully run the application, including the OS, processor, RAM, and secondary storage. The tuple $A(t)$ denotes the time stamp (in seconds) at which the node senses the environmental data.
Definition 7 (Application instance) The application instance, $I_A$, is expressed as a five tuple, as follows.
where $I_{id}$ denotes the application instance ID, which is generated and assigned by the system. The tuples $N_{id}$ and $A_{id}$ were defined earlier. The tuple $I_{rq}$ denotes the resource requirement of $I_A$, which may be in terms of data processing and analysis ability for healthcare applications, network bandwidth for streaming applications, or processing and storage power for automobile applications. Many instances of an application can run concurrently on a sensor node, with the unique IDs distinguishing the instances from one another.
Tier 1 is divided into a finite number of clusters $C_K$, and each cluster contains, on average, the same number of randomly distributed nodes. The sensor network $A$ is expressed in terms of $p$, the average number of nodes in a cluster $C$, and $K$, the number of clusters, which is equivalent to the number of advanced nodes in the network. The value of $p$ changes as nodes die and new nodes are introduced into the cluster. The area covered by a cluster is mathematically expressed as a five tuple, as follows.
where $C_k$ is a cluster $k$, such that $k = 1, 2, 3, \ldots, K$, $N^{k}_{A}$ denotes advanced node $k$, the tuple $G$ denotes the area covered by a cluster, the tuple $p$ was defined earlier, and the mapping of a sensor node to its corresponding advanced node is denoted by $M_{id}$.
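The tuple definitions above map naturally onto record types. The following sketch is one possible encoding, assuming that a normal node carries the five elements introduced in Definitions 1 to 5; the class names, field names, field types, and the `join`/`leave` helpers are hypothetical and only meant to show how the definitions fit together.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Component:
    """Counterpart of N_ct; field names are illustrative."""
    power_supply: str          # N_ps, e.g. "lithium AA"
    microcontroller: str       # N_mc
    sensor_types: List[str]    # N_st
    comm_hardware: str         # N_ch, e.g. "ZigBee"
    data_rate_kbps: float      # N_dr
    operating_system: str      # N_op, e.g. "Contiki OS"


@dataclass
class NormalNode:
    node_id: int                           # N_id
    active: bool                           # N_pc (1 = active, 0 = inactive)
    event_types: List[str]                 # N_te
    location: Tuple[float, float, float]   # N_ln as (x, y, z)
    component: Component                   # N_ct


@dataclass
class Cluster:
    cluster_id: int                        # index k of C_k
    advanced_node_id: int                  # the advanced node serving the cluster
    member_ids: List[int] = field(default_factory=list)

    def join(self, node: NormalNode) -> None:
        if node.node_id not in self.member_ids:
            self.member_ids.append(node.node_id)

    def leave(self, node: NormalNode) -> None:
        if node.node_id in self.member_ids:
            self.member_ids.remove(node.node_id)
```

Keeping only node IDs in `member_ids` mirrors the idea of a mapping from sensor nodes to their advanced node, and lets the cluster size change as nodes die or are added.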

Property 1
The relationship from the set of $N_S$ to $N_A$ in the network is a many-to-one mapping (i.e., many nodes transmit to the corresponding advanced node).

Definition 8 (Advanced node device)
The advanced node is defined in terms of its types as a three tuple, as follows.
where $D_{id}$ denotes the ID of the advanced node, $D_{cp}$ denotes the component of the device node (i.e., hardware and related specifications), and $D_{tp}$ denotes the type of node.

Property 2 The relationship between the advanced nodes and fog nodes is based on injective mapping and is represented as follows.
Definition 9 (Fog instance) The fog nodes' (middle layer) services, denoted by $F$, are represented by a five tuple and expressed as follows.
where the tuple $F_{id}$ denotes the ID of a fog node, the tuple $F_U$ denotes the purpose, $F_{RI}$ represents the Required Interface, $F_{PI}$ denotes the Provided Interface, and the last tuple denotes the Set of Attributes. Now that all the physical components of the architecture of the fog-based WSN model have been defined, we show the relationship between the three tiers.
First, we show the relationship between the normal nodes ($N_S$) and advanced nodes ($N_A$) using the method of contradiction.
Proposition 1 (Surjective mapping) Given the set of normal nodes $N_S$ and advanced nodes $N_A$, the number of $N_S$ mapped to $C_K$, where $C_K$ denotes the total number of clusters in the network.
Proof We prove this by contradiction. Suppose there were such a surjection as f : We assume There exists at least a pair $(N_i, N_k)$, such that: There exists every normal node belonging to an advanced node in each cluster $C_k$, a contradiction to Proposition 1, which disproves our assumption. This concludes the proof. □
Proposition 2 (Injective mapping) Given the set of advanced nodes $N_A$ and the set of fog nodes $N_F$, the mapping of $N_A$ into $N_F$ is null.
Proof This is also proved by contradiction. Suppose there were such a function as f : We assume that there exist There exist $N_k$, $N_f$ such that disproves the injectivity of Property 2. Thus, the assumption is not valid. This concludes the proof. □
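The two mapping properties can also be checked mechanically on a concrete deployment. The sketch below is only an illustration of what "surjective" and "injective" mean for finite mappings stored as dictionaries; the function names and the dictionary representation are assumptions and are not part of the proofs above.

```python
def is_surjective(membership, advanced_ids):
    """True if every advanced node is the image of at least one normal node."""
    return set(membership.values()) == set(advanced_ids)


def is_injective(mapping):
    """True if no two keys (advanced nodes) share the same value (fog node)."""
    values = list(mapping.values())
    return len(values) == len(set(values))


# Hypothetical deployment: four normal nodes, two advanced nodes, two fog nodes.
normal_to_advanced = {"n1": "a1", "n2": "a1", "n3": "a2", "n4": "a2"}
advanced_to_fog = {"a1": "f1", "a2": "f2"}
```

Here the normal-to-advanced map is many-to-one and covers every advanced node, while the advanced-to-fog map never sends two advanced nodes to the same fog node.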

Data transmission between fog nodes and data center
Our network model contains fog nodes N_F deployed at the edge of the network, in the middle layer (Tier 2) of our model. The N_F at the edge of the network communicate directly with N_A in Tier 1. Each fog node receives data from N_A (aggregated data in each round) and transmits it to the next fog node in the chain.
Algorithm 2 is developed to transmit aggregated data between N_F and the cloud. An algorithm [21] is adopted to form a chain of fog nodes, which enables each node to communicate with its neighboring nodes in the chain. The fog nodes in the chain receive data from the advanced nodes and pre-process it; the data then move between N_F, with each fog node sending its data to the leader fog node. The leader fog node aggregates all data received from N_F and finally transmits the result to the cloud for further processing. The algorithm shows the steps of the data transmission from fog nodes to the cloud.
The fog node chain and its communications, shown in Fig. 1, are constructed by starting from the fog node furthest from the cloud, using Eqs. (15) and (16), to ensure that fog nodes further from the cloud have close neighbors. The neighboring distances increase progressively, since fog nodes already on the chain cannot be revisited. Equation (17) is used to connect nodes to their closest neighbors. The fog node with the shortest distance to the cloud in Tier 2 is selected as the leader, and its main duty is to transmit aggregated data to the cloud. Constructing a chain among N_F and selecting one node as the leader to receive and forward aggregated data, instead of having each fog node transmit its own data directly to the cloud, could minimize energy dissipation during data transmission in the network. Thus, to choose a fog node (FN) as the leader, both the distance of the FNs to the cloud and their energy need to be considered.
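The chain construction described above can be illustrated with a minimal sketch. It is not the paper's Algorithm 2: Eqs. (15) to (17) are not reproduced here, so plain Euclidean distance is assumed, residual energy is ignored in the leader choice, and the names `build_fog_chain`, `select_leader`, and the node coordinates are hypothetical.

```python
import math

def build_fog_chain(fog_nodes, cloud):
    """Greedy chain: start at the fog node furthest from the cloud, then
    repeatedly append the closest fog node that is not yet on the chain."""
    remaining = dict(fog_nodes)  # node id -> (x, y); copied so the input is untouched
    current = max(remaining, key=lambda n: math.dist(remaining[n], cloud))
    chain = [current]
    pos = remaining.pop(current)
    while remaining:
        # Nodes already on the chain were popped, so they cannot be revisited.
        nxt = min(remaining, key=lambda n: math.dist(remaining[n], pos))
        chain.append(nxt)
        pos = remaining.pop(nxt)
    return chain

def select_leader(fog_nodes, cloud):
    """Leader = fog node closest to the cloud (energy is ignored in this sketch)."""
    return min(fog_nodes, key=lambda n: math.dist(fog_nodes[n], cloud))

# Hypothetical coordinates for five fog nodes and a distant cloud gateway.
fog_nodes = {"F1": (10, 10), "F2": (40, 20), "F3": (70, 15),
             "F4": (90, 40), "F5": (60, 60)}
cloud = (150, 150)
chain = build_fog_chain(fog_nodes, cloud)
leader = select_leader(fog_nodes, cloud)
print(chain, leader)
```

A fuller version would weight the leader choice by residual energy as well as distance, as the text requires.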

Improved ACO algorithms
There are many routing paths between the fog nodes and the data center. It is, however, necessary to transmit the aggregated data from the leader fog node to the data center along the shortest, most efficient, and most reliable path. Algorithm 3 is developed to construct the optimal path for data transmission between N_F and the data center. An ACO routing technique is used to find the shortest path between the leader fog node and the data center.
ACO is a family of optimization algorithms modeled on the behavior of ants in a colony and is a subclass of computational intelligence (CI) paradigms that aid in determining optimal solutions to optimization problems [15]. The ACO model was first proposed by Dorigo [16,34]. Since then, the model has been widely studied and improved. The idea comes from observing ants' foraging behavior, that is, how they find the shortest path between the food source and their nest. When searching for food, ants first explore the area surrounding their nest randomly. While moving, they deposit a chemical substance called pheromone on the paths, forming pheromone trails. When other ants search for food, they can smell the pheromone deposited on the paths and tend to choose a path marked by a strong concentration of pheromone.
However, conventional ACO schemes do not differentiate between the various types of data packets transmitted over a network. All data packets are transmitted to a destination in the same way, as these schemes are mainly designed to minimize congestion and improve network performance. On the other hand, WSNs in an IoT environment have to support multimedia data transmission, which requires different levels of quality of service (QoS). This work divides QoS into three services (Q_s).
Q_{s_1}: guaranteed service. It provides end-to-end delay guarantees and ensures that data packets reach the receiver node on time with zero packet loss.
Q_{s_2}: controlled load-balance service. It is used where time delays are likely. In addition, when there is congestion in the network, the network can still provide service as if there were no congestion.
Q_{s_3}: best-effort service. It refers to an Internet delivery service in which the network gives no guarantee on the time within which data packets will be delivered.
We modified the traditional ACO algorithm to improve its performance and to accommodate the heterogeneity of our network. These modifications are explained as follows.
It is assumed that the transmission success rate of every link between the advanced nodes and fog nodes is 100%.
An advanced node has low communication performance if its data packet transmission success rate is less than 10%. Conversely, it has high communication performance if its data packet transmission success rate is more than 90%.
The network topology is modeled as a graph G = (K, E), where K denotes the number of advanced nodes in the network, and E represents the set of links connecting two (advanced and fog) nodes in the network.
Let T_f denote the forward link of node k and T_r the reverse link of node j. The link that directly joins nodes i and j is represented in Eq. (18). Each link l ∈ E carries the success probability of data transmission on that link. For a path T_path and a link l ∈ E, the distance between an advanced node and a fog node is the sum of the single-hop links and is expressed as follows.
l = (N_A(k), N_F(j))    (20)
Equation (21) shows that the smaller the T_path, the nearer the advanced node is to the fog node. Such a node should be used as a forwarding node for data packets. We use R_{k,j} to denote the relationship between a sender node k and a receiver node j, as expressed in Eq. (22).
where T_f(k) and T_r(k) denote the forward link and reverse link, respectively, of node k.

Local pheromone update
The optimal path is selected using the transfer probability decision rule in Eq. (21).
where τ_{kj}^{Q_{s_a}} denotes the pheromone of the service, η_{kj}^{Q_{s_a}} denotes the local heuristic value of the path between the sender node and the receiver node, a denotes the QoS type, and α and β are control parameters that regulate the relative weight of the pheromone trail and the heuristic value, respectively. q is a random variable in the range [0, 1], and q_o (0 ≤ q_o ≤ 1) is a given parameter. The nodes j ∈ allowed_u, for all u = 1, 2, 3, ..., w, are the paths that can be selected by node k in the next step.
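Because the decision rule itself is not reproduced in the text, the following sketch assumes the standard pseudo-random proportional rule of Ant Colony System, which matches the roles described for q, q_o, α, and β. The function name `choose_next_node`, the default parameter values, and the dictionary layout are hypothetical; one pheromone table per QoS class a is assumed to be passed in as `tau`.

```python
import random

def choose_next_node(k, allowed, tau, eta, alpha=1.0, beta=2.0, q0=0.9, rng=random):
    """Pick the next hop j for node k from the non-empty list `allowed`.
    tau[(k, j)] is the pheromone and eta[(k, j)] the heuristic value of link (k, j)."""
    scores = {j: (tau[(k, j)] ** alpha) * (eta[(k, j)] ** beta) for j in allowed}
    q = rng.random()
    if q <= q0:
        # Exploitation: take the link with the largest pheromone-heuristic product.
        return max(scores, key=scores.get)
    # Exploration: roulette-wheel selection proportional to the scores.
    threshold = rng.random() * sum(scores.values())
    cumulative = 0.0
    for j, score in scores.items():
        cumulative += score
        if threshold <= cumulative:
            return j
    return j  # guard against floating-point round-off on the last candidate

# Hypothetical pheromone and heuristic values for two candidate fog nodes.
tau = {("A1", "F1"): 0.5, ("A1", "F2"): 1.0}
eta = {("A1", "F1"): 1.0, ("A1", "F2"): 1.0}
greedy_choice = choose_next_node("A1", ["F1", "F2"], tau, eta, q0=1.0)
random_choice = choose_next_node("A1", ["F1", "F2"], tau, eta, q0=0.0,
                                 rng=random.Random(0))
```

With q0 close to 1 the rule almost always exploits the strongest trail; lowering q0 shifts the balance toward exploring alternative paths.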

Global pheromone update
Once the search ants have completed the communication paths between the sender nodes and receiver nodes, Eq. (24) is used to choose an optimal path, where Z_{path_u}^{Q_{s_a}} is the path utility value. For a path between nodes k and j, the pheromones are updated according to the update equation, where ρ is the pheromone evaporation factor, which serves to diminish the intensity of the existing trail over time, and the pheromone increment on the path between the advanced node and fog node is expressed as follows.
where ϖ is an adjustment coefficient, and T_path denotes the path length. Equation (26) is a pheromone update rule for the forward ants used to create the paths between the F_L and the D_center. ϕ_k denotes the contraindication (tabu) list of k.
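The update equations are missing from the text, so the sketch below is only an assumed form: trails evaporate by a factor (1 - ρ), and links on the chosen path receive an increment ϖ / T_path, which is consistent with the stated roles of ρ, ϖ, and T_path. The function name `global_pheromone_update` and the sample values are hypothetical.

```python
def global_pheromone_update(tau, best_path, path_length, rho=0.1, omega=1.0):
    """Evaporate every trail, then reinforce the links on the chosen path.
    Assumed form: tau <- (1 - rho) * tau + omega / path_length on path links."""
    increment = omega / path_length  # shorter paths receive a larger deposit
    path_links = set(zip(best_path, best_path[1:]))
    for link in tau:
        tau[link] = (1.0 - rho) * tau[link]  # evaporation on all links
        if link in path_links:
            tau[link] += increment  # reinforcement on the chosen path only
    return tau

# Hypothetical three-node example: path A -> B -> C of total length 2.0.
tau = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
global_pheromone_update(tau, ["A", "B", "C"], path_length=2.0)
```

After one update the links on the chosen path hold more pheromone than the unused link, so later ants are biased toward the shorter route.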

Experimental results and discussion
The performance of our proposed scheme is presented in this section. We simulated our scheme (EEHFC) and compared the results with the P-SEP [18], EDCF [27], and RABACO [13] schemes. Network Simulator 2 (NS2) was used to evaluate the proposed algorithms. The network parameters used in our simulation are presented in Table 1, and Table 2 contains a list of abbreviations. The network contained three types of nodes: normal sensor nodes, advanced nodes, and fog nodes. The values h = 0.05 and h = 0.1 give the fraction of advanced sensor nodes in the network; thus, h = 0.05 means that the number of advanced nodes is 5% of the total number of normal sensor nodes, for a network with 100 sensor nodes. The performance of EEHFC, P-SEP, EDCF, and RABACO was evaluated in a network area of 100 × 100 m^2. Each simulation was run 80 times with different topologies, and the averaged results were used to plot the figures. The simulation time (round) for each experiment was 5000 s. A round was defined as an equal period in which every node transmitted its sensed data to a data center located outside the network, at a relatively far distance from the WSN. The number of fog nodes was set to five (5), and each had higher energy than the advanced nodes in the network.

Network lifetime
We studied the number of alive nodes in EEHFC, P-SEP, EDCF, and RABACO when α = 3 and h = 0.1. The nodes dissipated their energy gradually as the simulation time increased (see Fig. 2). The first node died (FND) in EEHFC, P-SEP, EDCF, and RABACO at 1561, 1519, 1421, and 1387 rounds, respectively. In addition, half of the nodes died (HND) in EEHFC, P-SEP, EDCF, and RABACO at 2468, 2006, 1579, and 1536 rounds, respectively, while the last node died (LND) in EEHFC, P-SEP, EDCF, and RABACO at 4039, 3017, 3472, and 1864 rounds, respectively. In all the scenarios, the proposed algorithm had the highest number of alive nodes and prolonged the network lifetime beyond that of the selected algorithms.

Number of packets
We investigated the number of data packets transmitted to the final destination in each algorithm. EEHFC delivered more sensed data to the data center than all the other algorithms in terms of reliability, stability period, availability, and network lifetime, as depicted in Fig. 3. It had the highest data delivery ratio. This result suggests that our scheme performs better not only in a small network but also in a large network.

Total energy consumption of sensor nodes
Sensor nodes dissipate energy gradually as the network size increases in all the schemes, and the energy curve is shown in Fig. 4. The RABACO algorithm dissipated the most energy during data transmission because it kept to multiple paths and did not consider the residual energy of the neighboring node, which created an energy hole in the network. There was a high probability that the remaining nodes would transmit across a long distance and hence consume more energy. In addition, the energy dissipated by P-SEP was less than that of RABACO, because it transmitted along an alternate path whenever there was congestion along the primary path. The energy consumption of EEHFC is the lowest because the scheme selects nodes on paths with few hops to transmit the data packets. Finally, the overheads usually involved in the selection of CHs are eliminated in the proposed scheme, which can optimize the routing process to maintain load balance in the network.

Average energy dissipation in the network
The average energy dissipation as the number of advanced nodes varies from two to six follows the curves shown in Fig. 5. The average energy dissipation was highest for two advanced nodes because each normal node transmits to an advanced node across a long distance, thereby using more energy to transmit. When the number increases to six, the average energy dissipation was lowest, since the nodes transmitted across a short distance and conserved more energy. We observe that after 3000 rounds, the nodes dissipate more energy. This is due to congestion and the retransmission of data packets. Finally, as the rounds progress, energy holes created in the network increase the nodes' energy dissipation.

Network lifetimes with a varying number of advanced nodes
Table 3 shows the network lifetimes obtained for each of the four algorithms. The algorithms have a different number of alive nodes after a certain number of rounds (α = 2, h = 0.05). EEHFC extended the time to FND by 34.8%, 53.4%, and 82.9% relative to P-SEP, EDCF, and RABACO, respectively. In addition, it extended the time to HND by 21.0%, 33.5%, and 40.4% relative to P-SEP, EDCF, and RABACO, respectively. Finally, it extended the time to LND by 10.9%, 20.3%, and 33.8% relative to P-SEP, EDCF, and RABACO, respectively.
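The percentages above are presumably relative gains of EEHFC over each baseline; the text does not state the formula, so the definition below is an assumption. As input, the sketch uses the FND rounds reported in the Network lifetime section for α = 3 and h = 0.1, which is a different setting from Table 3, so it illustrates the calculation only and does not reproduce the Table 3 figures. The name `relative_gain` is hypothetical.

```python
def relative_gain(proposed_rounds, baseline_rounds):
    """Percentage by which the proposed scheme outlasts a baseline (assumed definition)."""
    return 100.0 * (proposed_rounds - baseline_rounds) / baseline_rounds

# FND rounds reported for alpha = 3, h = 0.1 (not the Table 3 setting).
fnd_rounds = {"EEHFC": 1561, "P-SEP": 1519, "EDCF": 1421, "RABACO": 1387}
gains = {
    name: round(relative_gain(fnd_rounds["EEHFC"], rounds), 1)
    for name, rounds in fnd_rounds.items()
    if name != "EEHFC"
}
print(gains)
```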

Performance of QoS levels for different schemes
The performance of the selected schemes and our scheme is compared based on the QoS levels. The results are presented in Fig. 6. They show that, in terms of energy consumption, the number of data packets transmitted, data packet reliability, and the average QoS values of the nodes, the EEHFC scheme performs better than the other selected schemes. Hence, it is an improvement over related schemes.

Data packets transmission loss rate
We performed experiments on the data packet transmission loss rate of the selected schemes and the proposed scheme under different QoS levels for sensor nodes. We observe that when the distance between a sender node and the destination node is short, the loss rates are almost the same for all the schemes. Conversely, as the transmission distance between a sender node and the destination node increases, the loss rate likewise increases. The reason is that as the distance between the nodes increases, the number of hops required for reliable data transmission also increases. As shown in the figure, the data packet loss rate increases in all the schemes. The loss rate percentages for the different schemes are shown in Table 4. The EEHFC scheme performs better than the other selected schemes because it can select the optimal path for its data transmission, minimizing the packet loss rate.

Conclusion
EEHFC is proposed as a fog-based algorithm for WSNs. This algorithm presents efficient methods for data transmission in the network. A three-tier architecture was developed for the proposed scheme, consisting of three types of nodes: normal sensor nodes, advanced nodes, and fog nodes. EEHFC presented a hierarchical routing architecture for data transmission from normal sensor nodes to the data center through fog nodes.
In addition, an algorithm was designed for the uniform distribution of advanced nodes within the network, to minimize the transmission distance of each sensor node. The advanced nodes forward gathered data from the normal sensor nodes of their clusters to the fog nodes. Thereafter, the fog nodes pre-process the received data and store them locally, which reduces the volume of data processed at the data center. Only sensed data that cannot be processed in fog nodes are transmitted to the data center. Moreover, the inclusion of various levels of energy in our scheme makes the network heterogeneous and helps maintain load balancing in the network. The proposed scheme is energy efficient and extends the network lifetime due to the higher energy of fog nodes. To increase data reliability at the data center, we developed Algorithm 2 for data transmission from the fog nodes to the data center. Algorithm 3 was designed to construct the optimal and most efficient path between the fog nodes and the data center, using the ACO approach. The proposed EEHFC approach is energy efficient, has high data reliability and availability, and prolongs the network lifespan more than related approaches such as the P-SEP, EDCF, and RABACO schemes. However, since the fog nodes (F_s) store only important data and each individual node has a balanced processing load, optimal resource allocation for them is one of the drawbacks of this approach.
In addition, data transmission in the network is open and susceptible to different attacks. There is a need to provide adequate security solutions to these attacks. These will be looked into in our future research. We are aware that sensed data generated by sensor nodes can change due to environmental factors and hardware failures, which might cause sensor nodes to produce incomplete or erroneous data. End-users need to be mindful of the uncertainty level of the collected data and of the various results obtained in the research.

Data traffic load distribution
We performed an experiment using the model developed in Sect. 3.3 to determine the energy consumption of sensor nodes based on the data traffic and the distance of each node from the fog nodes. In total, 100 sensor nodes were randomly distributed, and the transmission range of each node was set to 75 m. The improved ACO routing algorithm (Algorithm 3) was also used in this simulation, and the packet transmission rate was 1 bit per second. The results of the experiment are presented in Fig. 7. We can see that sensor nodes at a short distance from the fog nodes receive a higher traffic load than those that are far from them. The reason is that the closer nodes act as relay nodes for other nodes that are far away; hence, there is a high possibility that these nodes use up their limited energy and die faster than the distant nodes, as shown in the figure. It can be seen that the experimental traffic load distribution agrees with the theoretical traffic load distribution in this network of 100 sensor nodes. This consolidates the practical importance of the energy model presented in Sect. 3.3.

Fig. 1 Architecture of the proposed scheme

Fig. 2 Number of nodes alive against the number of rounds

Fig. 3 Number of packets transmitted to data center

Fig. 5 Average energy dissipation against the number of rounds

Fig. 6 Performance of different schemes

Fig. 7 Data traffic in a network with 100 sensor nodes and transmission range 75 m