# Route referencing and ordering for synchronization-free delay tomography in wireless networks

*EURASIP Journal on Wireless Communications and Networking*
**volume 2018**, Article number: 211 (2018)

## Abstract

Delay tomography is a technique for inferring link delays in a network from end-to-end route measurements, which keeps the measurement overhead low. By incorporating compressed sensing, delay tomography can also detect sparse anomalies efficiently. However, the route measurements in delay tomography inevitably require clock synchronization. In this paper, based on route referencing, we study synchronization-free delay tomography with compressed sensing. From theoretical analysis, we derive optimal route referencing and ordering methods for synchronization-free delay tomography as “subtractive and differential schemes,” which use a single reference and multiple references, respectively, to cancel or minimize the error factors caused by clock asynchronism, clock skew, and normal link delays. Simulation experiments confirm that the proposed methods identify abnormal links more accurately and more robustly against the error factors than a conventional scheme, and that the newly proposed differential scheme always shows the best performance thanks to its better cancelation of the error factors.

## Introduction

Anomaly detection/identification [1] is essential for wireless sensor networks since it can guarantee not only long-term workability but also quick recovery once an anomaly occurs. In order to detect and identify anomaly, it is necessary to monitor the network either passively, actively, or both. In a passive approach, the traffic flows on the network are passively monitored and the network information is collected. There are a number of routing algorithms proposed for wireless sensor networks [2], and among them, distributed algorithms are commonly used for their low transmission redundancy. In distributed routing algorithms, each sensor node locally exchanges hello packets for link state probing only between the neighboring sensor nodes, and then, data packets of each sensor node are sent according to the established routes in the network. This implies the drawback of the passive network monitoring, that is, it is limited to the periods of time when there is traffic on the network between the nodes of interest. On the other hand, an active approach that injects traffic into the network can quickly and accurately detect anomaly, but may adversely affect the normal data traffic. The primary purpose of wireless sensor networks is to reliably collect data from sensor nodes, so it should be accomplished by fewer probing packets and less interference for normal data traffic without change or addition of sensor network functions. In this paper, we address the challenge of the active anomaly detection in an asynchronous network, based on “network tomography [3–5].”

Network tomography encompasses a class of approaches that estimate internal link states from end-to-end route measurements in networks. When packet route delays are measured, network tomography is referred to as “delay tomography,” and it has recently been applied to anomaly identification in networks because it reduces the need for cooperation from internal nodes. In this problem, tomography schemes are often formulated as underdetermined linear systems [6–8], which are solved with compressed sensing [9, 10], a promising technique for reconstructing a finite-dimensional sparse vector from linear measurements whose dimension is smaller than the size of the unknown vector. The probability that multiple anomalies occur simultaneously is considered to be relatively low, so compressed sensing has brought high efficiency to delay tomography for the aforementioned challenge of active anomaly detection: it reduces the number of measurement routes relative to the number of links in a given network, that is, it reduces the number of probing packets. As in the previous studies [6–8], provided that the sparsity of the desired solution can be exploited, the primary objective of this paper is to identify the few links that give extraordinarily large delays as anomalies, which are referred to as “abnormal links.”

In a wireless sensor network, unattended sensor nodes are usually likely to connect to neighboring nodes in better link states, so it is difficult for its network manager to know whether there is something wrong in a network and where it is even if they notice it. When an anomaly happens due to a node failure, the active delay tomography can identify it, so the manager can fix or replace the failed node. This would be necessary for continuing the task of the wireless sensor network. In addition, when an anomaly happens due to a link disconnection by a physical obstruction, the manager can find or remove the physical obstruction for the identified link. This would be applicable for an intruder or wild animal detection in a wireless sensor network for environmental monitoring over an agricultural field. Although delay tomography brings these benefits into wireless sensor networks, few works have been dedicated for this interesting research topic [11–15].

One issue to consider is that delay tomography burdens the source and destination nodes with clock synchronization. A packet route delay, which the formulation requires, is obtained from the transmission time at the source node and the reception time at the destination node, so precise clock synchronization between them seems essential and unavoidable, yet it is difficult to achieve in practice. In wireless sensor networks, the electronic components of nodes are sometimes too unreliable to meet the requirements of clock synchronization in terms of precision and complexity [16, 17]. The global positioning system (GPS) may provide precise clock synchronization among wireless nodes [18], but not all nodes can be equipped with GPS receivers. The network time protocol (NTP) [19] is applicable, and several methods using medium access control (MAC) layer time stamps [20–22] have been proposed for clock synchronization, but they require all nodes to exchange packets specialized for clock synchronization. Accomplishing active delay tomography without such specialized packets should be more energy-efficient.

In [13], we tackled the clock synchronization problem and proposed a synchronization-free delay tomography based on compressed sensing. The proposed scheme cancels the time offset between the source and destination nodes by selecting a single reference route, inspired by the workability of a time-difference-of-arrival (TDOA)-based localization where the clocks of target node and reference nodes are not synchronized [18]. The trade-off for the asynchronism is a loss of equation in the delay tomography formulation, but compressed sensing is a method to obtain a solution from an underdetermined linear system, so it successfully compensates the trade-off. In [13], based on the compressed sensing theory, the potential performance is discussed for identifying abnormal links when measurement routes are given, and it is proven that the potential performance can be preserved, as long as reconstructing an underdetermined exactly sparse vector with a single non-zero entry. Here, we refer to the conventional synchronization-free delay tomography scheme with a single route reference as “a subtractive scheme.”

In addition to the success of the cancelation of the time offset in asynchronous wireless sensor networks, it needs to be carefully designed taking two more factors into consideration in order to more accurately identify abnormal links, which are inherent to wireless sensor networks. One is “link delay,” which always arises from some factors: buffering, medium access protocol, packet collision, and packet loss due to propagation characteristics such as fading, blocking, and near/far effect. The link delay in wireless networks is relatively large as compared to that in wired networks and is accumulated in the measured route delay as a large positive bias, so it gives an unignorable impact on the tomography performance, if the route delays are simply used in it. The other is clock error derived from clock frequency deviation, which is referred to as “clock skew.” Since the frequency of crystal oscillators inevitably deviates [23], the clock error between the source and destination nodes increases over time even if the clock of one node is temporarily synchronized to that of the other node at a time. For example, when the measurement interval gets longer to avoid interference from active probing packets to normal data traffic, the clock skew appears as a larger time-variant bias in the measured route delay. In [24], we simply analyzed the effect of the two error factors but did not deeply discuss the performance of the subtractive scheme.

In this paper, we propose the design strategies to make the synchronization-free delay tomography scheme more accurate in wireless sensor networks. We start by theoretically analyzing the effects of the link delay and clock skew on the tomography performance and propose the optimal route referencing and ordering method for the subtractive scheme. Next, extending the idea to the case of multiple routes reference, we newly propose a better synchronization-free delay tomography scheme, which is referred to as “a differential scheme,” with its optimal route referencing and ordering method as well. We show by simulation experiments that well-designed synchronization-free delay tomography schemes can be insensitive to the error factors and accurately identify abnormal links in an asynchronous wireless network. To the best of the authors’ knowledge, there have been no works on the theoretical analysis of such error factors and the design of route referencing and ordering suited for delay tomography in wireless sensor networks, although there are a number of works dealing with network tomography.

The remainder of this paper is organized as follows. Section 2 presents some preliminaries on compressed sensing for discussing the mathematical properties of the proposed schemes. Section 3 explains the system model. After Section 4 introduces the conventional delay tomography, Sections 5 and 6 propose the synchronization-free delay tomography methods with a single route reference and multiple routes reference, respectively. Section 7 prepares to evaluate the proposed methods, and then, Section 8 demonstrates their performance by simulation experiments. Finally, Section 9 concludes the paper.

## Preliminaries on compressed sensing

### Definition

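The *ℓ*_{p} norm (*p*≥1) of a vector \(\boldsymbol{\omega} = [\omega_{1}, \omega_{2}, \cdots, \omega_{j}, \cdots, \omega_{J}]^{\top} \in \mathbb{R}^{J \times 1}\) is defined as

\[\|\boldsymbol{\omega}\|_{p} = \left(\sum_{j=1}^{J} |\omega_{j}|^{p}\right)^{1/p}, \tag{1}\]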

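where ⊤ denotes the transpose operator. For *p*=0, using its support, we define ∥**ω**∥_{0} as

\[\|\boldsymbol{\omega}\|_{0} = |\mathrm{supp}(\boldsymbol{\omega})|, \tag{2}\]

\[\mathrm{supp}(\boldsymbol{\omega}) = \{j \mid \omega_{j} \neq 0\}. \tag{3}\]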

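*K*-sparse vectors are defined as those which have at most *K* non-zero elements, and the set of *K*-sparse vectors \(\boldsymbol {\Sigma }_{K} \subset \mathbb {R}^{J \times 1}\) is given by

\[\boldsymbol{\Sigma}_{K} = \left\{\mathbf{x} \in \mathbb{R}^{J \times 1} \mid \|\mathbf{x}\|_{0} \leq K\right\}. \tag{4}\]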

We consider that, through a matrix \(\mathbf {A} \in \mathbb {R}^{I \times J}\) (*I*<*J*), a linear measurement vector \(\mathbf {y} = [y_{1}, y_{2}, \cdots, y_{i}, \cdots, y_{I}]^{\top } \in \mathbb {R}^{I \times 1}\) for a vector \(\mathbf {x} = [x_{1}, x_{2}, \cdots, x_{j}, \cdots, x_{J}]^{\top } \in \mathbb {R}^{J \times 1}\) is obtained as

\[\mathbf{y} = \mathbf{A}\mathbf{x}. \tag{5}\]

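Whether or not one can recover a sparse vector **x** from **y** by means of compressed sensing can be evaluated by Spark(**A**), which is defined as

\[\mathrm{Spark}(\mathbf{A}) = \min_{\boldsymbol{\omega} \in \mathrm{Ker}(\mathbf{A}) \setminus \{\mathbf{0}\}} \|\boldsymbol{\omega}\|_{0}, \tag{6}\]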

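where Ker(**A**) denotes the kernel of **A**, that is,

\[\mathrm{Ker}(\mathbf{A}) = \left\{\boldsymbol{\omega} \in \mathbb{R}^{J \times 1} \mid \mathbf{A}\boldsymbol{\omega} = \mathbf{0}\right\}. \tag{7}\]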

The following theorem guarantees the unique recoverability of a sparse vector [10]:

### **Theorem 1**

For any **y**, there exists at most one **x**∈*Σ*_{K} such that **y**=**A****x**, if and only if Spark(**A**)>2*K*.

Theorem 1 yields the requirement *I*≥2*K*, since Spark(**A**)∈[2, *I*+1]. We refer to such a matrix **A** with Spark(**A**)>2*K* as *K*-identifiable.
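To make the identifiability condition concrete, the following minimal sketch computes Spark(**A**) by exhaustive search and checks whether a small matrix is *K*-identifiable. It is only an illustration under stated assumptions: the function names `rank`, `spark`, and `is_k_identifiable` are hypothetical, and the search is exponential in *J*, so it is practical only for small routing matrices.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def spark(A):
    """Smallest number of linearly dependent columns of A (brute force)."""
    n_rows, n_cols = len(A), len(A[0])
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            sub = [[A[i][j] for j in cols] for i in range(n_rows)]
            if rank(sub) < size:
                return size
    return n_cols + 1  # convention when all columns are independent


def is_k_identifiable(A, K):
    """Condition of Theorem 1: Spark(A) > 2K."""
    return spark(A) > 2 * K


# Toy routing matrix: 2 routes over 3 links.
A = [[1, 1, 0],
     [0, 1, 1]]
print(spark(A))                 # 3
print(is_k_identifiable(A, 1))  # True
```

In this toy case any two columns are independent but the three columns together are not, so Spark(**A**)=3 and a single abnormal link is uniquely determined by the two route measurements.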

### *ℓ*_{0}, *ℓ*_{1}, and *ℓ*_{1}/*ℓ*_{2} optimization problems

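When **A** is *K*-identifiable and **y** is not contaminated with noise in (5), **x**∈*Σ*_{K} is reconstructed by solving the following *ℓ*_{0} optimization problem:

\[\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_{0} \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{8}\]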

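Because of the discrete and discontinuous nature of the *ℓ*_{0} norm, it is very difficult to solve (8), so instead, the following *ℓ*_{1} relaxation of (8) is proposed:

\[\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_{1} \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x}. \tag{9}\]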

The problem given by (9) can be easily solved by linear programming algorithms, and, under suitable conditions on **A**, its solution is proven to be equivalent to the one for the original *ℓ*_{0} optimization problem [25].

On the other hand, when **y** is contaminated with some form of noise, Theorem 1 may not guarantee the recoverability. In this case, instead, we estimate \(\hat {\mathbf {x}}\) by solving the following *ℓ*_{1}/*ℓ*_{2} optimization problem [26, 27]:

where *ξ* is an adjustable parameter.

Several methods have been so far proposed to solve the *ℓ*_{1}/*ℓ*_{2} optimization problem [10], and we use the fast iterative shrinkage-thresholding (FISTA) algorithm [28] to obtain \(\hat {\mathbf {x}}\). In our simulation experiments, after selecting the element with the maximal magnitude from the elements of \(\hat {\mathbf {x}}\) as *x*_{max}, we finally identify the indexes of non-zero elements as

where *θ* is a constant parameter.
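The reconstruction and thresholding steps can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments, and it rests on several assumptions that are not taken from the text: the objective is written in the common form ½∥**y**−**A****x**∥_{2}^{2}+*ξ*∥**x**∥_{1}, the step size uses the squared Frobenius norm of **A** as a safe upper bound on the Lipschitz constant, and an element is declared non-zero when its magnitude is at least *θ* times the largest magnitude. The function names are hypothetical.

```python
import math


def matvec(A, x):
    """Compute A x for A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return [math.copysign(max(abs(a) - t, 0.0), a) for a in v]


def fista(A, y, xi, n_iter=500):
    """FISTA for min_x 0.5*||y - A x||_2^2 + xi*||x||_1 (assumed objective)."""
    L = sum(a * a for row in A for a in row)  # Frobenius bound >= ||A||_2^2
    x = [0.0] * len(A[0])
    z, t = x[:], 1.0
    for _ in range(n_iter):
        resid = [p - q for p, q in zip(matvec(A, z), y)]
        grad = rmatvec(A, resid)
        x_new = soft_threshold([zj - g / L for zj, g in zip(z, grad)], xi / L)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [xn + (t - 1.0) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


def identify(x_hat, theta=0.1):
    """Indexes whose magnitude reaches theta times the largest magnitude."""
    x_max = max(abs(v) for v in x_hat)
    if x_max == 0.0:
        return set()
    return {j for j, v in enumerate(x_hat) if abs(v) >= theta * x_max}


# Two routes, three links, one abnormal link (index 1) with delay 5.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
y = [5.0, 5.0]
x_hat = fista(A, y, xi=0.1)
print(identify(x_hat, theta=0.1))  # {1}
```

In this toy case the measurement is consistent with either one abnormal link shared by both routes or two abnormal links, and the *ℓ*_{1} penalty selects the sparser explanation.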

## System model

### Network and tomography session model

Figure 1 shows the network model with a defined boundary. Let \(\mathcal {G} = (\mathcal {N}, \mathcal {L})\) denote an undirected network, where \(\mathcal {N}\) and \(\mathcal {L} \subset \mathcal {N} \times \mathcal {N}\) are sets of nodes and links, respectively^{Footnote 1}. The nodes and links are numbered such that \(\mathcal {N} = \{n_{1}, n_{2}, \cdots, n_{q}, \cdots, n_{Q}\}\) and \(\mathcal {L} = \{l_{1}, l_{2}, \cdots, l_{j}, \cdots, l_{J}\}\), where \(Q= |\mathcal {N}|\) and \(J = |\mathcal {L}|\) denote the numbers of nodes and links, respectively. It is assumed that the topology is fixed while a tomography session defined below is conducted.

We consider delay tomography in a simple scenario [7, 13] for an easy-to-understand explanation, where a single source node *S* and a single destination node *D* are assigned out of the boundary nodes, and *S* has established several routes to *D* before a tomography session. We define a set of the routes as \(\mathcal {R} = \{r_{1}, r_{2}, \cdots, r_{i}, \cdots, r_{I}\}\), where *I* denotes the number of routes, and define a set of links which constitutes *r*_{i} as \(\mathcal {L}_{i}\). In a tomography session, *S* sequentially sends probing packets to *D* through the routes in the packet transmission interval of *T*_{probe}. Here, in order for a probing packet to traverse over each pre-selected route, *S* uses a source routing algorithm such as the dynamic source routing (DSR) [29].

For each of the probing packets, the corresponding end-to-end delay is calculated from the transmission time at *S* and the reception time at *D*. Note that, since the active probing traffic increases the load of the network, a longer transmission interval is preferable so as not to interfere with normal data traffic; on the other hand, since the assumption of stationarity in the network may be invalid over a long duration, the tomography session should be completed in a short time. One of the reasons for utilizing compressed sensing is that it makes it possible to reduce the number of probing packets while keeping the identifiability of abnormal links.

In addition to the above definitions and notations, we assume that there are *K* abnormal links giving extraordinarily large delays in the network. The abnormal links are distinguished from the other links referred to as normal links, and they are separated into different sets \(\mathcal {L}^{A}\) and \(\mathcal {L}^{N}\) with \(\left |\mathcal {L}^{A}\right |=K\) and \(\left |\mathcal {L}^{N}\right |=J-K\), respectively. Finally, we define sets of routes containing and not containing abnormal links as \(\mathcal {R}^{A}\) and \(\mathcal {R}^{N}\), respectively, and define the numbers of abnormal links and normal links over *r*_{i} as *κ*_{i} (*κ*_{i}≤*K*) and *ρ*_{i}, respectively.

### Clock model

Figure 2 shows a clock model. On the basis of the clock of *S*, the clocks of *S* and *D* are written respectively as

where *T*_{off} and *Δ*_{sk}=(*α*−1) are the time offset at *t*=0 and the skew parameter, respectively. As an example of clock skew, in the IEEE 802.11 standard for wireless local area networks (WLANs) and the IEEE 802.15.4 standard for wireless personal area networks (WPANs), the physical layer (PHY)/medium access control (MAC) protocols are designed to accept a clock error of up to ±40 ppm [30, 31]. In this paper, assuming that the clock frequency of each node deviates uniformly in [−*Δ*_{dev},+*Δ*_{dev}], *Δ*_{sk} follows the triangular distribution on [−2*Δ*_{dev},+2*Δ*_{dev}] with its mode at 0, which has the following average and variance:

\[E[\Delta_{\text{sk}}] = 0, \tag{14}\]

\[E\left[\Delta_{\text{sk}}^{2}\right] = \frac{2}{3}\Delta_{\text{dev}}^{2}, \tag{15}\]

where *E*[(·)] is the ensemble average of (·).
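As a quick numerical check of the skew statistics, the sketch below draws *Δ*_{sk} as the difference of two independent uniform frequency deviations, which is one way the triangular distribution above arises, and compares the sample variance with the closed-form variance of a symmetric triangular distribution on [−2*Δ*_{dev},+2*Δ*_{dev}], namely (2*Δ*_{dev})²/6. Modeling the skew as this difference, the function names, the seed, and the sample size are all assumptions made for illustration.

```python
import random


def sample_skew(delta_dev, rng):
    """Skew as the difference of two independent uniform frequency deviations."""
    return rng.uniform(-delta_dev, delta_dev) - rng.uniform(-delta_dev, delta_dev)


def skew_variance_theory(delta_dev):
    """Variance of the symmetric triangular law on [-2*delta_dev, +2*delta_dev]."""
    return (2.0 * delta_dev) ** 2 / 6.0


def max_drift(delta_dev, interval_s):
    """Largest clock error that can accumulate over interval_s seconds."""
    return 2.0 * delta_dev * interval_s


rng = random.Random(0)   # fixed seed so that the run is repeatable
delta_dev = 40e-6        # 40 ppm
samples = [sample_skew(delta_dev, rng) for _ in range(200_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

print("sample variance / theory:", var / skew_variance_theory(delta_dev))
print("worst-case drift over 10 s [s]:", max_drift(delta_dev, 10.0))
```

With *Δ*_{dev}=40 ppm, the worst-case drift over a 10 s interval is bounded by 80×10^{−6}×10 s, that is, below 1 msec.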

### Link delay model

Figure 3 shows the time chart between *S* and *D*. The *i*th probing packet (*i*=1,2,⋯,*I*), which is transmitted from *S* at time *t*_{Si} on the basis of the clock of *S*, traverses the *i*th route and finally arrives at *D* at time *t*_{Di} on the basis of the clock of *D*. It experiences a delay of *z*_{i} on the basis of the clock of *S*, which is the sum of the link delays over the *i*th route. In [6], it is assumed that the normal link delay behaves stochastically according to an exponential distribution whereas the abnormal link delay behaves deterministically, irrespective of wired or wireless networks. In this paper, we take the same approach as in [6], but we adopt a different model for the normal link delay that is suited to wireless sensor networks [32], namely a Gaussian distribution. The reason is the central limit theorem, which asserts that the distribution of the sum of a large number of independent and identically distributed (i.i.d.) random variables approaches that of a Gaussian random variable. This model is appropriate if the link delay is regarded as the sum of numerous independent random processes. So when *l*_{j} is a normal link, that is, \(l_{j} \in \mathcal {L}^{N}\), we assume that its delay is given as a Gaussian random variable *γ*_{j} that is i.i.d. among links. *γ*_{j} has the following statistical properties:

\[E[\gamma_{j}] = \eta_{N}, \tag{16}\]

\[E\left[(\gamma_{j} - \eta_{N})^{2}\right] = \sigma_{N}^{2}, \tag{17}\]

where *η*_{N} and \(\sigma _{N}^{2}\) are the average and variance of *γ*_{j}, respectively. On the other hand, for an abnormal link \(l_{j} \in \mathcal {L}^{A}\), we assign a constant large delay of *η*_{A} \(\left (\eta _{A} \gg \eta _{N},~\eta _{A}^{2} \gg \sigma _{N}^{2}\right)\). Finally, as the statistical properties of the link states, we assume that the link delays are stationary in a tomography session.
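The link delay model can be sketched as a small simulation in which each normal link contributes a Gaussian delay and each abnormal link contributes the constant *η*_{A}. The function name `route_delay`, the integer link labels, and the parameter values in the usage lines are hypothetical; the sketch also ignores the (very unlikely, for *η*_{N}=15 msec and *σ*_{N}=3 msec) case of a negative Gaussian sample.

```python
import random


def route_delay(route_links, abnormal_links, eta_n, sigma_n, eta_a, rng):
    """Sum of link delays z_i over one route, in seconds.

    route_links    -- iterable of link indexes that make up the route
    abnormal_links -- set of link indexes that are abnormal
    eta_n, sigma_n -- mean and standard deviation of a normal link delay
    eta_a          -- constant delay of an abnormal link
    """
    total = 0.0
    for link in route_links:
        if link in abnormal_links:
            total += eta_a                      # deterministic large delay
        else:
            total += rng.gauss(eta_n, sigma_n)  # i.i.d. Gaussian delay
    return total


rng = random.Random(1)  # fixed seed so that the run is repeatable
# A 4-link route whose link 7 is abnormal (eta_A = 150 msec).
z = route_delay([2, 7, 11, 13], {7}, eta_n=0.015, sigma_n=0.003, eta_a=0.150, rng=rng)
print("route delay [s]:", z)
```

Because the normal link delays add up along the route, a route with *ρ* normal links carries a bias of about *ρ**η*_{N} with variance *ρ*\(\sigma _{N}^{2}\), which is the error factor that the later sections aim to cancel.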

## Conventional delay tomography scheme

### Matrix/vector representation

The measured route delay over *r*_{i} (*i*=1,2,⋯,*I*) can be written as

Defining the link delay vector \({\mathbf {d}} \in \mathbb {R}^{J \times 1}\) as

it can be decomposed into the following two vectors:

where \({\mathbf {x}} \in \mathbb {R}^{J \times 1}\) and \({\mathbf {a}} \in \mathbb {R}^{J \times 1}\) are the link delay vectors containing only the abnormal link delays and only the normal link delays, respectively, that is,

Furthermore, defining the measurement route delay vector \({\mathbf {y}} \in \mathbb {R}^{I \times 1}\), the routing matrix **B**∈{0,1}^{I×J}, the *i*th row vector of the routing matrix \(\boldsymbol {\beta }_{i} \in \mathbb {R}^{1 \times J}\), the time offset vector \({\mathbf {v}} \in \mathbb {R}^{I \times 1}\), and the clock skew vector \({\mathbf {w}} \in \mathbb {R}^{I \times 1}\) as

where

the measurement route delay vector **y** is written as

Now, the purpose of the tomography is to estimate the *K* non-zero elements of **x**, so we distinguish between **x** and the other vectors which are the components of *error factor vector* defined as

where

is the route delay vector contributed from normal links \({\mathbf {u}} \in \mathbb {R}^{I \times 1}\). Consequently, we arrive at

### Design for reducing the effect of the error factors

Let us pay attention to the statistical properties of **e**. Regarding each component of **e**, from (16), (17), and (36), the statistical properties of the *i*th element of **u** are calculated as

from (32), the statistical properties of the *i*th element of **v** are written as

and from (14), (15), and (33), the statistical properties of the *i*th element of **w** are written as

Thus, taking into consideration

where \(\mathbf {0} \in \mathbb {R}^{I \times I}\) is the zero matrix, the mean squared norm of **e** is written as

where

From (38)–(43), they result in

where *P*_{A} and *P*_{N} are the probabilities that *r*_{i} (*i*=1,2,⋯,*I*) is included in \(\mathcal {R}^{A}\) and \(\mathcal {R}^{N}\), respectively. From these, in order for a conventional scheme to perform the delay tomography accurately, in other words, to minimize the mean squared norm of **e** given by (46), there is no way except for selecting clock oscillators with high clock frequency stabilities *Δ*_{dev}≈0 and estimating the time offset accurately to make *T*_{off}≈0.

## Subtractive scheme

### Matrix/vector representation

The subtractive scheme is achieved by getting rid of *T*_{off} completely, where the *m*th row components of **y**, **B**, **u**, and **w** are selected as references. Define a set of observation route indexes as \(\mathcal {I}_{\text {route}}=\{1, 2, \cdots, I \}\) with \(|\mathcal {I}_{\text {route}}|=I\). Using \(\mathcal {I}_{\text {route}}\), sets of selected reference route indexes \(\mathcal {F}_{\text {ref}}\) and not-selected (remaining) route indexes \(\mathcal {H}_{\text {rem}}\) are defined respectively as

with \(|\mathcal {F}_{\text {ref}}|=I-1\) and \(|\mathcal {H}_{\text {rem}}|=I-1\). Using \(\mathcal {F}_{\text {ref}}\) and \(\mathcal {H}_{\text {rem}}\), the reference route delay vector \(\mathbf {y}_{\text {ref}} \in \mathbb {R}^{(I-1) \times 1}\), the reference routing matrix \(\mathbf {B}_{\text {ref}} \in \mathbb {R}^{(I-1) \times J}\), the reference route delay vector contributed from normal links \(\mathbf {u}_{\text {ref}} \in \mathbb {R}^{(I-1) \times 1}\), the reference clock skew vector \(\mathbf {w}_{\text {ref}} \in \mathbb {R}^{(I-1) \times 1}\), the remaining route delay vector \(\mathbf {y}_{\text {rem}} \in \mathbb {R}^{(I-1) \times 1}\), the remaining routing matrix \(\mathbf {B}_{\text {rem}} \in \mathbb {R}^{(I-1) \times J}\), the remaining route delay vector contributed from normal links \(\mathbf {u}_{\text {rem}} \in \mathbb {R}^{(I-1) \times 1}\), and the remaining clock skew vector \(\mathbf {w}_{\text {rem}} \in \mathbb {R}^{(I-1) \times 1}\) are defined respectively as

Therefore, we finally have the following new matrix/vector equation:

where **B**^{(m)}∈{−1,0,1}^{(I−1)×J} is referred to as the subtractive routing matrix using the *m*th route reference and **v** disappears completely in (68). Note that, in [13], the principle of the synchronization-free delay tomography corresponding to (65)–(67) was simply derived, but it seemed that *m* can be neither 1 nor *I*. The expressions from (55) to (70) seem complicated but are more mathematically strict and expandable for the case of multiple routes reference. When **u**^{(m)}≈**0** and **w**^{(m)}≈**0**, (65), (66), and (67) result in the equations which are indeed equivalent to those in [13].
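The cancelation of *T*_{off} can be illustrated with a short sketch that subtracts the reference row from every remaining row. The sign convention (remaining minus reference), the zero-based index `m`, the function name `subtractive`, and the toy numbers are assumptions for illustration; normal link delays and clock skew are set to zero here so that only the offset is visible.

```python
def subtractive(B, y, m):
    """Build the subtractive routing matrix and delay vector for reference m.

    B -- routing matrix as a list of 0/1 rows (one row per route)
    y -- measured route delays, one per route
    m -- zero-based index of the reference route
    """
    ref_row, ref_y = B[m], y[m]
    B_m = [[b - r for b, r in zip(row, ref_row)]
           for i, row in enumerate(B) if i != m]
    y_m = [y_i - ref_y for i, y_i in enumerate(y) if i != m]
    return B_m, y_m


# Three routes over three links; delays in integer milliseconds.
B = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
x = [0, 2, 0]    # only link 1 is abnormal (2 ms)
t_off = 5000     # unknown clock offset between S and D (ms)
y = [sum(b * v for b, v in zip(row, x)) + t_off for row in B]

B_m, y_m = subtractive(B, y, m=2)
print(B_m)  # [[0, 1, -1], [-1, 1, 0]]
print(y_m)  # [2, 2]
```

The offset of 5000 ms appears in every entry of `y` but in none of `y_m`, and `y_m` equals `B_m` applied to `x`, at the cost of one equation.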

### Reference route selection preserving the identifiability

Assuming **e**^{(m)}=**0** in (65), it becomes a simple equation of linear observation (see (5)). In [13], it is proven that, if **B** is 1-identifiable, then **B**^{(m)} can be also 1-identifiable, and in this case, we can identify the abnormal link by solving the *ℓ*_{1} optimization problem (see (9)). Now, we add the following two theorems to the synchronization-free delay tomography scheme.

### **Theorem 2**

If the subtractive routing matrix using the *m*_{1}th route reference is *K*-identifiable, then another subtractive routing matrix using the *m*_{2}th route reference is also *K*-identifiable.

### *Proof*

See Appendix 1. □

### **Theorem 3**

If a subtractive routing matrix is *K*-identifiable and the number of abnormal links is more than *K*, then the possibility of identifying them is the same whichever route is selected as the reference.

### *Proof*

See Appendix 2. □

Note that the solution may be *algorithm*-dependent in the case of Theorem 3.

In reality, however, the assumption of **e**^{(m)}=**0** does not hold in (65), and in this case, we can identify the abnormal link by solving the *ℓ*_{1}/*ℓ*_{2} optimization problem (see (10)). **e**^{(m)} varies according to the reference and the order of the probing routes, so the solution is still affected by the design of the tomography session. The following subsections address how to order the measurement routes and select a preferable reference so as to make *E*[**e**^{(m)⊤}**e**^{(m)}] closer to 0.

### Reference route selection for reducing the error factors

Let us pay attention to **e**^{(m)} in (65). Its mean squared norm is written as

where

From (39), (55), (56), (59), (63), and (69), (72) results in

On the other hand, from (43), (55), (56), (60), (64), and (70), (73) results in

Applying \(\rho _{h_{i^{\prime }}}-\kappa _{h_{i^{\prime }}} \approx \rho _{h_{i^{\prime }}}\), \(\rho _{f_{i^{\prime }}}-\kappa _{f_{i^{\prime }}} \approx \rho _{f_{i^{\prime }}}\) and *η*_{A}≫*η*_{N} to (74) and (75), they can be approximated respectively as

Now, applying the Cauchy-Schwarz inequality to (76) results in

From (55), taking into consideration \(f_{i^{\prime }}=m\), that is, \(\rho _{f_{i^{\prime }}}=\rho _{m}\) (*i*^{′}=1,2,⋯,*I*−1), by finding the minimizer for (78), we can see that the following *ρ*_{m} minimizes the error contributed from random delays in normal links:

On the other hand, applying the Cauchy-Schwarz inequality to (77) results in

so by finding the minimizer for (80) as well, we can see that the following *m* minimizes the error contributed from clock skews:

### Design for reducing the effect of the error factors

Equation (79) means that the route whose number of links is closest to the average number of links over all the routes should be selected as the reference. On the other hand, in terms of reducing the error contributed from the clock skew, (81) means that the route whose probing packet is transmitted at around the middle of the tomography session should be selected as the reference. Consequently, one theoretical design strategy to jointly reduce the two error factors is to order the probing routes as below, send probing packets according to the order, and use as the reference the delay measured over the route probed at the middle of the tomography session:

In the following, we use the superscript “*s*” instead of (*m*) to indicate the subtractive scheme.
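The two selection criteria can be sketched as follows: pick the route whose number of links is closest to the average, and probe it in the middle slot of the session. This sketch covers only those two criteria; how the other routes are arranged around the reference is an arbitrary choice made here (their original order is kept), and the function names are hypothetical.

```python
def choose_reference(link_counts):
    """Index of the route whose link count is closest to the average."""
    mean = sum(link_counts) / len(link_counts)
    return min(range(len(link_counts)), key=lambda i: abs(link_counts[i] - mean))


def schedule_with_middle_reference(link_counts):
    """Return (probing order, slot of the reference) with the reference mid-session."""
    ref = choose_reference(link_counts)
    others = [i for i in range(len(link_counts)) if i != ref]
    mid = len(link_counts) // 2
    order = others[:mid] + [ref] + others[mid:]
    return order, mid


# Five routes with 4, 6, 2, 5, and 3 links; the average is 4 links.
order, slot = schedule_with_middle_reference([4, 6, 2, 5, 3])
print(order)        # [1, 2, 0, 3, 4]
print(order[slot])  # 0
```

Route 0 matches the average link count, so it is used as the reference and probed third out of five, which keeps it close in time to every other probe.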

## Differential scheme

We can also select multiple routes as references in order to get rid of *T*_{off}. In this case, (55) is modified as

Every route needs to be included in either the set of reference routes or the set of remaining routes, so (56) is modified as

From (76) and (77), the following conditions obviously minimize the error factors contributed from the random delays and clock skew

This means that, as a matter of course, the error factor should be canceled by another error factor with a similar value. Inspired by the workability of differential phase shift keying (DPSK) scheme in optical wired communication system which is rich in phase noise [33], one design strategy for achieving (88) and (89) is to order the observation routes with the numbers of links in an ascending (or descending) order and select the *i*^{′}th route as the reference for the (*i*^{′}+1)th route (*i*^{′}=1,2,⋯,*I*−1), namely,

Following this, we propose a differential delay tomography scheme, and modify (65)–(70) respectively as

where **B**^{d} is referred to as the differential routing matrix and the superscript “*d*” is used instead of (*m*) to indicate the differential scheme. Similar to the discussion in Appendix 1 and 2, we can derive

so in terms of the identifiability of the differential routing matrix, there is no difference from the subtractive routing matrices, contrary to the better error factor cancelation than the subtractive scheme.
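The differential construction can be sketched as sorting the routes by their number of links and subtracting each route from the next one in that order. The sign convention (next minus previous), the ascending order, the function name `differential`, and the toy numbers are assumptions for illustration; the sketch also assumes that the delays in `y` were measured with the routes probed in the sorted order, and it sets normal link delays and clock skew to zero so that only the offset cancelation is visible.

```python
def differential(B, y):
    """Build the differential routing matrix and delay vector.

    B -- routing matrix as a list of 0/1 rows (one row per route)
    y -- measured route delays, one per route
    Returns (order, B_d, y_d), where order lists route indexes by link count.
    """
    order = sorted(range(len(B)), key=lambda i: sum(B[i]))  # ascending link count
    pairs = list(zip(order[:-1], order[1:]))                # (reference, next route)
    B_d = [[a - b for a, b in zip(B[nxt], B[ref])] for ref, nxt in pairs]
    y_d = [y[nxt] - y[ref] for ref, nxt in pairs]
    return order, B_d, y_d


# Three routes over three links; delays in integer milliseconds.
B = [[1, 1, 1],
     [1, 0, 0],
     [0, 1, 1]]
x = [0, 0, 7]    # only link 2 is abnormal (7 ms)
t_off = 5000     # unknown clock offset between S and D (ms)
y = [sum(b * v for b, v in zip(row, x)) + t_off for row in B]

order, B_d, y_d = differential(B, y)
print(order)  # [1, 2, 0]
print(B_d)    # [[-1, 1, 1], [1, 0, 0]]
print(y_d)    # [7, 0]
```

Each difference pairs two routes with similar link counts that are probed back to back, which is why both the accumulated normal link delays and the clock skew between the paired measurements stay small.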

## Methods

To guarantee the repeatability of simulation experiments, we will use the network with *Q*=14 and *J*=29 for performance evaluation, which is shown in Fig. 4. This network is generated according to the random graph theory [34], where the degree of each node is adjusted to be more than 2 for satisfying 1-identifiability, and the leftmost and rightmost nodes are selected as *S* and *D*, respectively. In addition, according to the method proposed in [7], 15 routes between *S* and *D* are selected, which are summarized in Table 1. Note that we have the relationships among *Q*, *J*, and *I* as *Q*≈*J*/2 and *I*≈*J*/2 for the network in Fig. 4.

Furthermore, since the size of the network in Fig. 4 is limited, we will use other networks with larger sizes, which are also generated according to the same method for the one in Fig. 4. Here, the performance depends on *J* rather than *Q*, so we will show the performance as the function of *J* as the dependency on the network size, where we try to keep the relationships among *Q*, *J*, and *I* similar to those for the network in Fig. 4.

In the simulation experiments, after setting *η*_{N}=15 msec, *σ*_{N}=3 msec [35, 36], and *Δ*_{dev}=40 ppm [30, 31], we selected *K* abnormal links out of the 29 links randomly and evaluated the five schemes in 1000 tomography sessions. For each given *K*, all combinations of the locations of abnormal links were realized, so the total number of tomography sessions was \({}_{29}C_{K} \times 1000\). The threshold *θ* in (11) is set to 0.1, and the adjustable parameter *ξ* in the *ℓ*_{1}/*ℓ*_{2} optimization is optimized.

For the evaluation, we define two metrics [37] to quantify the two types of errors that can occur in our identification problem. The first type of error corresponds to the case where normal links are falsely identified as abnormal links, and we refer to such errors as false positives. We quantify these errors using the false positive rate (FPR), or erroneous identification rate, defined as

where \(\hat {\mathcal {E}^{A}}\) is the set of the links identified as abnormal, and \(\mathcal {E}^{A}\) is the set of the actual abnormal links. On the other hand, the second type of error occurs when the abnormal links are not identified correctly. We refer to these errors as false negatives, and we quantify them using the false negative rate (FNR), or unidentified error rate, defined as

We can evaluate a tomography scheme to be accurate if its errors in terms of the above two performance metrics are suitably small. Especially when both the FPR and FNR equal zero, we can say that a perfect identification is accomplished, defining perfect identification ratio (PIR) as the ratio of the number of perfect identifications divided by the number of tomography sessions.

We will show the performance of the following five delay tomography schemes by simulation experiments: we refer to the subtractive delay tomography schemes with and without the route ordering in Section 5.4 as “Mid-Ave/Sub” and “Rand/Sub,” respectively, the differential delay tomography schemes with and without (90) as “Order/Diff” and “Rand/Diff,” respectively, and the conventional delay tomography scheme assuming perfect clock synchronization between *S* and *D* only at the beginning of the tomography session as “Conv.” If we could select a *J*×*J* full-rank matrix **B** for (37), we could perform abnormal link identification without compressed sensing. However, a *J*×*J* full-rank matrix does not always exist for a given network. In fact, for the network with *Q*=14 and *J*=29 in Fig. 4, no 29×29 full-rank matrix can be constructed.

## Result and discussion

Figure 5a, b, and c show the dependencies of the PIR, FPR, and FNR on the abnormal link delay (*η*_{A}), respectively, for the network in Fig. 4 with *T*_{probe}=10 s and *K*=1. These figures show that the Conv scheme does not work at all for smaller abnormal link delays because of the normal link delays and clock skew, even though perfect clock synchronization is accomplished at the beginning of the tomography session. In contrast, the Sub and Diff schemes perform better than the Conv scheme, and the route ordering is effective for both. In particular, the Order/Diff scheme outperforms the other four schemes over the whole range of *η*_{A}, and when *η*_{A} is larger than 150 msec, that is, around ten times the normal link delay (*η*_{N}=15 msec), it perfectly identifies the abnormal link. The Rand/Diff scheme performs worse than the Sub schemes, especially for smaller abnormal link delays. Let us discuss the reason for this phenomenon with an additional simulation experiment. Since *T*_{probe}=10 s in these figures, the magnitude of the clock skew is still less than 1 msec (|*Δ*_{sk}*T*_{probe}|<80×10^{−6}×10=0.8×10^{−3} s), whereas the total normal link delay on a route is on the order of several tens of milliseconds under the system assumption. Therefore, the dominant source in **e**^{s} or **e**^{d} can be the Gaussian-distributed normal link delays. If the abnormal link delay is very small, that is, **x**≈**0**, we have **y**^{s}≈**u**^{s} or **y**^{d}≈**u**^{d}. A false positive error occurs when the delay over at least one route composed of only normal links reaches the identification threshold *θx*_{max}, so its probability is given by

where *p*(*u*_{1},*u*_{2},⋯,*u*_{I−1}) is the joint probability density function (pdf) of *u*_{1},*u*_{2},⋯,*u*_{I−1}. Figure 6 shows the probability that at least one normal route delay exceeds *ζ* seconds under the above condition. From this figure, we can see that, for *ζ*<0.13 s, the route delay for the Rand/Diff scheme exceeds *ζ* more frequently than that for the Sub schemes, which means \(P_{\text {FP}}^{s}<P_{\text {FP}}^{d}\) and results in the superiority of the Sub schemes over the Rand/Diff scheme for the smaller values of abnormal link delay in Fig. 5a.
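The probability that at least one all-normal route delay exceeds *ζ* can also be estimated by Monte Carlo simulation. The following Python sketch is not the simulator used for Fig. 6; it assumes independent Gaussian link delays with mean *η*_{N}=15 msec, a hypothetical standard deviation `sigma_n`, hypothetical route lengths, and negligible clock skew. It forms subtractive measurements as the delay of each route minus that of a reference route, and differential measurements as the delay of each route minus that of the preceding route.

```python
import numpy as np


def exceedance_probability(route_lengths, zeta, scheme, ref_index=0,
                           eta_n=0.015, sigma_n=0.005, trials=20000, seed=0):
    """Estimate P(max_i u_i >= zeta) when every link is normal.

    route_lengths : number of links on each measurement route (hypothetical)
    scheme        : "sub"  -> u_i = y_i - y_ref   (subtractive)
                    "diff" -> u_i = y_i - y_{i-1} (differential)
    sigma_n       : assumed std. dev. of one normal link delay [s]
    """
    rng = np.random.default_rng(seed)
    lengths = np.asarray(route_lengths, dtype=float)
    # A route delay is a sum of independent Gaussian link delays,
    # so it is Gaussian with mean n*eta_n and variance n*sigma_n^2.
    mean = lengths * eta_n
    std = np.sqrt(lengths) * sigma_n
    y = rng.normal(mean, std, size=(trials, lengths.size))
    if scheme == "sub":
        u = np.delete(y, ref_index, axis=1) - y[:, [ref_index]]
    elif scheme == "diff":
        u = y[:, 1:] - y[:, :-1]
    else:
        raise ValueError("scheme must be 'sub' or 'diff'")
    # Fraction of trials in which at least one measurement reaches zeta.
    return float(np.mean(u.max(axis=1) >= zeta))


if __name__ == "__main__":
    lengths = [3, 6, 4, 7, 5, 4, 6]  # hypothetical route lengths, unordered
    for zeta in (0.02, 0.05, 0.10):
        p_sub = exceedance_probability(lengths, zeta, "sub", ref_index=4)
        p_diff = exceedance_probability(lengths, zeta, "diff")
        print(f"zeta={zeta:.2f} s  sub={p_sub:.3f}  diff={p_diff:.3f}")
```

In this model, the result depends on how the route lengths are ordered for the differential scheme and on which route is the reference for the subtractive scheme, which is why route ordering and reference selection matter.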

Figure 7a, b, and c show the dependencies of the PIR, FPR, and FNR on the probing packet transmission interval (*T*_{probe}), respectively, for the network in Fig. 4 with *η*_{A}=1.0 s and *K*=1. *Δ*_{dev} is extremely small in the practical setting, so both the Sub and Diff schemes are insensitive to the clock skew caused by *Δ*_{dev} when *T*_{probe} is small. However, these figures indicate the superiority of the Diff schemes: they perfectly identify the abnormal link, keeping the FPR and FNR at zero, even for *T*_{probe} larger than 1000 s, whereas the Sub schemes misidentify the abnormal link once *T*_{probe} reaches around 300 s, because the clock frequency mismatch between *S* and *D* adds a monotonically increasing bias to the route delay measurements during the tomography session. Furthermore, the Rand/Sub scheme corresponds to the one proposed in [13], so comparing the Rand/Sub and Mid-Ave/Sub schemes in Figs. 5 and 7 shows that the route ordering/selection improves the performance of the Sub scheme.
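As a rough worked example of how the skew term scales with the probing interval, the bound |*Δ*_{sk}*T*_{probe}|<80×10^{−6}×*T*_{probe} used earlier can be evaluated for several intervals. The sketch below only performs this arithmetic; the constant names are ours, and the bound is per probing interval, not the bias accumulated over a whole session.

```python
DELTA_SK_MAX = 80e-6  # bound on the clock skew used in the text (dimensionless)
ETA_N = 0.015         # normal link delay eta_N = 15 msec, in seconds


def skew_bound(t_probe_s: float, delta_sk: float = DELTA_SK_MAX) -> float:
    """Upper bound on the skew-induced offset over one probing interval [s]."""
    return abs(delta_sk) * t_probe_s


if __name__ == "__main__":
    for t_probe in (10, 300, 1000):
        bound = skew_bound(t_probe)
        print(f"T_probe={t_probe:5d} s  bound={bound * 1e3:6.1f} msec  "
              f"ratio to eta_N={bound / ETA_N:5.2f}")
```

For *T*_{probe}=10 s the bound is 0.8 msec, far below *η*_{N}, whereas for *T*_{probe}=300 s it is 24 msec and already exceeds *η*_{N}=15 msec, so the skew can no longer be treated as negligible against the normal link delays.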

The routes were selected to be 1-identifiable; nevertheless, Fig. 8a, b, and c show the dependencies of the PIR, FPR, and FNR on the identifiability (*K*), respectively, for the network in Fig. 4 with *T*_{probe}=10 s and *η*_{A}=1.0 s. As expected, in Fig. 8a, the PIRs of all five schemes are one at *K*=1 and then decrease as *K* increases. The Sub schemes outperform the Rand/Diff scheme. Figure 8b and c show that the FPR dominates the degradation of the PIR, so this phenomenon has the same cause as the superiority of the Sub schemes observed in Fig. 5a; namely, the Diff schemes increase the probability that the delay of at least one normal route exceeds the identification threshold. However, the most important fact is that the Order/Diff scheme shows the best performance.

Figure 9a, b, and c show the dependencies of the PIR on the network size, assuming *T*_{probe}=10 s and *K*=1, for *η*_{A}=0.1 s, *η*_{A}=0.5 s, and *η*_{A}=0.9 s, respectively. In all three figures, as *J* increases, the PIRs tend to improve gradually toward 1.0, which means that larger networks are more advantageous in terms of abnormal link identifiability. This is because larger networks give more information on the single abnormal link through a greater number of different measurement routes. The PIR clearly improves with larger *η*_{A} for all five schemes, but the superiority of the Order/Diff scheme is outstanding: it keeps the PIR closer to 1.0 even for smaller *η*_{A} in smaller networks.

Finally, Table 2 compares the computational complexities of the tomography schemes, where the operations common to all five schemes, such as link delay estimation and abnormal link identification, are omitted. The Mid-Ave/Sub and Order/Diff schemes show better performance in the above simulation experiments at an additional computational cost for route ordering on top of that of route delay subtraction, *O*(*I*). For the Mid-Ave/Sub scheme, the computational complexities of calculating the average route length and of route ordering are *O*(*I*) and *O*(1), respectively, so the average calculation dominates. On the other hand, the route sorting for the Order/Diff scheme takes *O*(*I*^{2}) in the worst case. However, it is important to note that the route ordering needs to be executed only once, off-line, before the tomography session.
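The reference selection of the Mid-Ave/Sub scheme, as summarized in the conclusions of this paper (a reference route whose length is close to the average route length, placed at the middle of the session), might be sketched as follows. This is an illustrative reading, not the exact rule of Section 5.4: the function names, the tie-breaking by lowest index, and the list-based session order are assumptions.

```python
from typing import List, Sequence


def select_mid_ave_reference(route_lengths: Sequence[int]) -> int:
    """Index of the route whose link count is closest to the average (O(I))."""
    if not route_lengths:
        raise ValueError("at least one route is required")
    average = sum(route_lengths) / len(route_lengths)  # O(I) average
    # Ties are broken by the lower route index (an arbitrary choice).
    return min(range(len(route_lengths)),
               key=lambda i: (abs(route_lengths[i] - average), i))


def place_reference_in_middle(num_routes: int, ref: int) -> List[int]:
    """Probing order that keeps the other routes as they are and puts
    the reference route at the middle of the tomography session."""
    order = [i for i in range(num_routes) if i != ref]
    order.insert(len(order) // 2, ref)
    return order


if __name__ == "__main__":
    lengths = [2, 5, 3, 4, 6]  # hypothetical numbers of links per route
    ref = select_mid_ave_reference(lengths)
    print(ref, place_reference_in_middle(len(lengths), ref))
```

Because the selection depends only on the route lengths, it can be run once before the session, in line with the off-line remark above.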

## Conclusions

This paper proposed two kinds of synchronization-free delay tomography, namely, the subtractive and differential schemes. We theoretically derived the optimal route reference and ordering methods for the two schemes and then confirmed their robustness against clock asynchronism, clock skew, and normal link delays in a realistic wireless sensor network by simulation experiments. The subtractive scheme is simple in that it does not care about the order of the measurement routes; it just places, at the middle of the tomography session, a reference route whose number of links is closest to the average number of links over all the measurement routes. On the other hand, the differential scheme is a little more complicated; it needs to order all measurement routes with linear computational complexity in terms of the number of links, and it shows the highest robustness against the clock asynchronism, clock skew, and normal link delays. The differential scheme keeps its identification accuracy even for a much longer transmission interval of probing packets, so it is much less harmful to normal sensor data traffic.

The compressed sensing-based delay tomography scheme is applicable to networks where a non-compressed sensing-based scheme does not work, and it can identify abnormal links more accurately in larger networks. These two facts are its main advantages.

We have discussed the proposed schemes in a simple scenario, but we expect that they will be feasible in other scenarios, such as a passive probing strategy or settings where multiple source and destination nodes are selected. We leave the application to other scenarios and the verification of the proposed schemes with real-world traces as future work.

## Appendix 1: Proof of Theorem 2

Let the subtractive routing matrices using the *m*_{1}th and *m*_{2}th routes as references be \({\mathbf {B}}^{(m_{1})}\) and \({\mathbf {B}}^{(m_{2})}\), respectively. When \({\mathbf {B}}^{(m_{1})}\) is *K*-identifiable, \(\text {Spark}\left (\mathbf {B}^{(m_{1})}\right)=\Sigma _{1}>2K\). Now, \(\text {Spark}\left (\mathbf {B}^{(m_{1})}\right)=\Sigma _{1}\) means that the smallest number of linearly dependent column vectors of \(\mathbf {B}^{(m_{1})}\) is *Σ*_{1}, so for a suitable choice of *Σ*_{1} distinct column vectors out of the *J*−1 column vectors of \(\mathbf {B}^{(m_{1})}\), denoted by \(\mathbf {b}_{1}^{(m_{1})}, \mathbf {b}_{2}^{(m_{1})}, \cdots, \mathbf {b}_{q}^{(m_{1})}, \cdots, \mathbf {b}_{\Sigma _{1}}^{(m_{1})}\), and a nonzero vector \(\boldsymbol {\eta }=\left [ \eta _{1}, \eta _{2}, \cdots, \eta _{q}, \cdots, \eta _{\Sigma _{1}}\right ]^{\top }\), the following equation is satisfied:

Applying elementary operations (adding and subtracting the equations) to (104), we can convert it to

where \(\mathbf {b}_{1}^{(m_{2})}, \mathbf {b}_{2}^{(m_{2})}, \cdots, \mathbf {b}_{q}^{(m_{2})}, \cdots, \mathbf {b}_{\Sigma _{1}}^{(m_{2})}\) are the corresponding column vectors of \(\mathbf {B}^{(m_{2})}\). Conversely, we can derive

Consequently, \(\text {Spark}(\mathbf {B}^{(m_{1})})=\text {Spark}(\mathbf {B}^{(m_{2})})\), that is, whichever route we may select as the reference, the subtractive routing matrix is *K*-identifiable.
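The invariance of the spark under a change of reference can be checked numerically on a small example. The sketch below assumes that the subtractive routing matrix for reference *m* is obtained from a routing matrix by subtracting its *m*th row from every other row; this construction, the toy routing matrix, and the function names are assumptions made for illustration, and the brute-force spark search is only practical for tiny matrices.

```python
import itertools

import numpy as np


def spark(mat: np.ndarray) -> int:
    """Smallest number of linearly dependent columns (brute force).

    Returns n_cols + 1 if all columns are linearly independent.
    """
    n_cols = mat.shape[1]
    for k in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), k):
            if np.linalg.matrix_rank(mat[:, list(cols)]) < k:
                return k
    return n_cols + 1


def subtractive_matrix(routing: np.ndarray, m: int) -> np.ndarray:
    """Assumed construction: every other row minus the reference row m."""
    return np.delete(routing, m, axis=0) - routing[m]


if __name__ == "__main__":
    # Hypothetical routing matrix: 5 routes (rows) over 4 links (columns).
    routing = np.array([[1, 1, 0, 0],
                        [0, 1, 1, 0],
                        [0, 0, 1, 1],
                        [1, 0, 0, 1],
                        [1, 1, 1, 0]])
    sparks = [spark(subtractive_matrix(routing, m))
              for m in range(routing.shape[0])]
    print(sparks)  # the same value for every choice of reference route
```

Under this assumed construction, the matrices for two references differ only by an invertible combination of rows, which leaves every linear dependence among the columns unchanged; this is the numerical counterpart of the argument above.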

## Appendix 2: Proof of Theorem 3

In this case, there is no guarantee that the abnormal links are correctly identified. However, applying elementary operations to the subtractive matrix/vector equations, we have

that is, the set of solutions obtained when selecting the *m*_{1}th route as the reference is equivalent to that obtained when selecting the *m*_{2}th route as the reference.

## Notes

We intentionally use \(\mathcal {G} = (\mathcal {N}, \mathcal {L})\) instead of \(\mathcal {G} = (\mathcal {V}, \mathcal {E})\) because we call the elements not “vertex” and “edge” but “node” and “link,” respectively, in this paper.

## References

1. M. H. Bhuyan, D. K. Bhattacharyya, J. K. Kalita, Network anomaly detection: methods, systems and tools. IEEE Commun. Surveys Tuts. **16**(1), 303–336 (2014).
2. J. N. Al-Karakim, A. E. Kamal, Routing techniques in wireless sensor networks: a survey. IEEE Trans. Wireless Commun. **11**(6), 6–28 (2004).
3. R. Castro, M. Coates, G. Liang, R. Nowak, B. Yu, Network tomography: recent developments. Statist. Sci. **19**(3), 499–517 (2004).
4. M. Coates, A. O. Hero III, R. Nowak, B. Yu, Internet tomography. IEEE Signal Process. Mag. **19**(3), 47–65 (2002).
5. Y. Vardi, Network tomography: estimating source-destination traffic intensities from link data. J. Amer. Stat. Assoc. **91**(433), 365–377 (1996).
6. M. H. Firooz, S. Roy, Link delay estimation via expander graphs. IEEE Trans. Commun. **62**(1), 170–181 (2014).
7. K. Takemoto, T. Matsuda, T. Takine, Sequential loss tomography using compressed sensing. IEICE Trans. Commun. **E96-B**(11), 2756–2765 (2013).
8. W. Xu, A. Tang, in *Proc. 48th Annu. Allerton Conf. Commun., Control, Comput.: 29 Sept.-1 Oct. 2010*. Compressive sensing over graphs: how many measurements are needed? (IEEE, Allerton, 2010), pp. 16–27.
9. D. L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory. **52**(4), 1289–1306 (2006).
10. Y. C. Eldar, G. Kutyniok, *Compressed sensing: theory to applications* (Cambridge University Press, Cambridge, 2012).
11. J. Zhao, R. Govindan, D. Estrin, Sensor network tomography: monitoring wireless sensor networks. ACM SIGCOMM Comput. Commun. Rev. **32**(1), 64–64 (2002).
12. G. Hartl, B. Li, in *Proc. 3rd IPSN: 26 - 27 Apr. 2004*. Loss inference in wireless sensor networks based on data aggregation (IEEE, Berkeley, 2004), pp. 396–404.
13. K. Nakanishi, S. Hara, T. Matsuda, K. Takizawa, F. Ono, R. Miura, Synchronization-free delay tomography based on compressed sensing. IEEE Commun. Lett. **18**(8), 1343–1346 (2014).
14. K. Nakanishi, S. Hara, T. Matsuda, K. Takizawa, F. Ono, R. Miura, Reflective network tomography based on compressed sensing. Procedia Comput. Sci. **52**, 186–193 (2015).
15. T. Naka, S. Hara, Route selection algorithms utilizing the property of the ZDD for compressed sensing-based transmissive network tomography. Procedia Comput. Sci. **109**, 124–131 (2017).
16. I. F. Akyildiz, X. Wang, W. Wang, Wireless mesh networks: a survey. Comput. Netw. **47**(4), 445–487 (2005).
17. B. Sundararaman, U. Buy, A. D. Kshemkalyani, Clock synchronization for wireless sensor networks: a survey. Ad Hoc Netw. **3**(3), 281–323 (2005).
18. E. D. Kaplan, C. J. Hegarty, *Understanding GPS: principles and applications, Second Edition* (Artech House, Norwood, 2012).
19. D. L. Mills, Internet time synchronization: the network time protocol. IEEE Trans. Commun. **39**(10), 1482–1493 (1991).
20. S. Ganeriwal, R. Kumar, M. B. Srivastava, in *Proc. First Int. Conf. Embedded Networked Sensor Syst.: 5–9 Nov.* Timing-sync protocol for sensor networks (ACM, New York, 2003), pp. 138–149.
21. K. Noh, E. Serpedin, in *Proc. IEEE Int. Symp. World Wirel., Mob. Multimedia Netw.: 18–21 June; Espoo*. Pairwise broadcast clock synchronization for wireless sensor networks, (2007), pp. 1–6.
22. Y. Zhang, T. Qiu, L. Liu, Y. Sun, A. Zhao, F. Xia, in *Proc. ICSN 2016: 23–26 May*. Mac-time-stamping-based high-accuracy time synchronization for wireless sensor networks (IEEE, Jeju, 2016), pp. 1–4.
23. V. F. Kroupa, *Frequency stability: introduction and applications* (Wiley, Hoboken, 2012).
24. T. Otsuka, S. Hara, T. Matsuda, K. Takizawa, F. Ono, R. Miura, in *Proc. WPMC 2015: 13–16 Dec. 2015*. Path ordering and reference selection method for the differential delay tomography (IEEE, Hyderabad, 2015). in CD-ROM.
25. M. Elad, *Sparse and redundant representations: from theory to applications in signal and image processing* (Springer, New York, 2010).
26. M. Zibulevski, M. Elad, L1-l2 optimization in signal and image processing. IEEE Signal Process. Mag. **27**(3), 76–88 (2010).
27. T. Matsuda, M. Nagahara, K. Hayashi, Link quality classifier with compressed sensing based on *ℓ*_{1}-*ℓ*_{2} optimization. IEEE Commun. Lett. **15**(10), 1117–1119 (2011).
28. A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. **2**(1), 183–202 (2009).
29. D. Johnson, Y. Hu, D. Maltz, *RFC: 4728, The dynamic source routing protocol (DSR) for mobile ad hoc networks for IPv4* (IETF, Fremont, 2007).
30. IEEE: Std 802.11–2012 - Wireless LAN medium access control (MAC) and physical layer (PHY) specifications (2012).
31. IEEE: Std 802.15.4–2011 - Low-rate wireless personal area networks (LR-WPANs) (2011).
32. K. L. Noh, Q. M. Chaudhari, E. Serpedin, B. W. Suter, Novel clock phase offset and skew estimation using two-way timing message exchanges for wireless sensor networks. IEEE Trans. Commun. **55**(4), 766–777 (2007).
33. G. P. Agrawal, *Fiber-optic communication systems* (John Wiley & Sons, Hoboken, 2010).
34. B. Bollobas, *Random graphs, 2nd Ed* (Cambridge Univ. Press, Cambridge, 2001).
35. W. Zeng, X. Chen, X. Kim, Z. Bu, W. Wei, B. Wang, Z. J. Shi, in *Proc. IEEE MILCOM.: 18–21 Oct. 2009*. Delay monitoring for wireless sensor networks: an architecture using air sniffers (IEEE, Boston, 2009), pp. 1–8.
36. K. Liu, Q. Ma, H. Liu, Z. Cao, Y. Liu, in *Proc. IEEE MASS: 14–16 Oct. 2013*. End-to-end delay measurement in wireless sensor networks without synchronization (IEEE, Hangzhou, 2013), pp. 583–591.
37. J. Haupt, R. M. Castro, R. Nowak, Distilled sensing: adaptive sampling for sparse detection and estimation. IEEE Trans. Inf. Theory. **57**(9), 6222–6235 (2011).

### Funding

This work was supported in part by the Japanese Ministry of Internal Affairs and Communications in R&D on Cooperative Technologies and Frequency Sharing Between Unmanned Aircraft Systems (UAS) Based Wireless Relay Systems and Terrestrial Networks, and JSPS KAKENHI grant numbers JP16K00124, and 18H01445.

## Author information

### Contributions

KN, SH, and TM contributed to the main idea and analyzed the results. KN and TN designed and carried out the simulation. KT, FO, and RM encouraged this whole work. All authors read and approved the final manuscript.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Cite this article

Nakanishi, K., Naka, T., Hara, S. *et al.* Route referencing and ordering for synchronization-free delay tomography in wireless networks. *J Wireless Com Network* **2018**, 211 (2018). https://doi.org/10.1186/s13638-018-1227-x


### Keywords

- Anomaly detection
- Compressed sensing
- Network tomography
- Synchronization-free delay tomography