Distributed time synchronization in wireless sensor networks with coupled discrete-time oscillators (invited)

In wireless sensor networks, distributed timing synchronization based on pulse-coupled oscillators at the physical layer is currently being investigated as an interesting alternative to packet synchronization. In this paper, the convergence properties of such a system are studied through algebraic graph theory, by modelling the nodes as discrete-time clocks. A general scenario where clocks may have different free-oscillation frequencies is considered, and both time-invariant and time-variant network topologies (or fading channels) are discussed. Furthermore, it is shown that the system of oscillators can be seen as a set of coupled discrete-time PLLs. Based on this observation, a generalized system design is discussed, and it is proved that known results in the context of conventional PLLs for carrier acquisition have a counterpart in distributed systems. Finally, practical details of the implementation of the distributed synchronization algorithm over a bandlimited noisy channel are covered.


I. INTRODUCTION
Distributed timing synchronization refers to a decentralized procedure that ensures the achievement and maintenance of a common time-scale (frequency and phase) for all the nodes of the network [1]. This condition enables a wide range of applications and functionalities of a sensor network, including complex sensing tasks (distributed detection/estimation, data fusion), power saving (all nodes sleep and wake up at coordinated times) and medium access control for communication (e.g., Time Division Multiple Access).
Conventional design of distributed algorithms for timing synchronization prescribes the exchange of packets carrying a time-stamp to be appropriately processed by the transmitting and receiving nodes [1]. Packet-based synchronization has been widely studied, especially in the context of wireline networks. However, the specific features and requirements of wireless sensor networks call for alternative methods that improve both the computational complexity (and therefore the energy efficiency) and the scalability. Toward this goal, physical layer-based synchronization protocols are currently being investigated that exploit the broadcast nature of radio propagation. The idea is to build distributed algorithms based on the exchange of pulses at the physical layer, thus avoiding the need to perform complex processing at the packet level.
Physical layer-based synchronization was first proposed in [2], using a mathematical framework developed in [3] in order to model the spontaneous establishment of synchronous periodic activities in biological systems, such as the flashing of fireflies. In [3] (and [2]), nodes are modelled as integrate-and-fire oscillators coupled through the transmission of pulses. Convergence is proved under the assumption of an all-to-all interconnection among the nodes. The model was later extended in [4], by explicitly including constraints on the transmission range of each node. In particular, the authors derived a bound on the speed of convergence by using algebraic graph theory [5]. An implementation of distributed synchronization on a real sensor network testbed was reported in [6]. A related work is [8], where a generalization of the model in [3] is proposed and the regime of an asymptotically dense network is investigated. As a final remark, it should be noted that the framework of physical layer-based timing synchronization has recently been interpreted as a means to achieve distributed estimation/detection [7] [9] or data fusion [10].
In this paper, we reconsider physical layer-based synchronization by modelling the sensors as coupled discrete-time oscillators. Basically, each node modifies its current clock based on a weighted average of the residual differences of timing phases as measured with respect to other nodes. The synchronization algorithms proposed in [11] in the context of inter-base station communication and [12] for inter-vehicle transmission can be seen as instances of this general model. The analytical framework is at the same time a generalization and an application of the literature on discrete-time consensus problems for networks of agents (see, e.g., [13]). In particular, differently from [13], here we address the case of clocks with generally different free-oscillation frequencies, and account for the specific features of a wireless network, namely channel reciprocity and randomness (fading). The analysis of convergence of the synchronization process is carried out by algebraic graph theory as in [4], allowing us to relate global convergence properties to the local connectivity of the network. The results are first derived for a time-invariant scenario, and then extended to the case where the network topology (or fading) varies with time, building on the results presented in [15].
A central contribution of this paper is the observation that the distributed synchronization system at hand can be modelled as a set of coupled discrete-time Phase Locked Loops (PLLs). The system can thus be seen as a discrete-time version of the network synchronization scheme of [17], which is based on continuously coupled analog PLLs. This fact allows us to generalize the system design by introducing the concept of loop order. Moreover, we prove that known results about the convergence of conventional PLLs for carrier acquisition have a counterpart in distributed systems. In particular, it is shown that, under appropriate conditions on the interconnections between sensors: (i) a system of first-order distributed PLLs is able to recover perfectly a phase mismatch among the clocks; (ii) in case of a frequency error, first-order loops are able to recover the frequency gap, but at the expense of an asymptotic phase mismatch; (iii) this asymptotic phase mismatch can be reduced by considering second-order loops.
Finally, the analysis is complemented by addressing the issue of a practical implementation of the distributed synchronization algorithm over a bandlimited Gaussian channel.

II. SYSTEM MODEL AND MAIN ASSUMPTIONS
Let the wireless network be composed of K sensors, where each node, say the kth, has a discrete-time clock with period T_k. If the nodes are left isolated, the timing clock of the kth sensor evolves as t_k(n) = nT_k + τ_k(0), where 0 ≤ τ_k(0) < T_k is an arbitrary initial phase and n = 1, 2, ... runs over the periods of the timing signal. Two synchronization conditions are of interest. We say the K clocks are frequency synchronized if

t_k(n + 1) − t_k(n) = T (1)

for each k and for sufficiently large n, where 1/T is the common frequency. A stricter condition requires full frequency and phase synchronization¹:

t_k(n) = nT + τ for each k and sufficiently large n. (2)

We remark that the network is said to fractionate into, say, two clusters of synchronization if there exists a permutation function on the nodes' labels, π(i) : [1, ..., K] → [1, ..., K], such that, for n large enough,

t_π(i)(n) = nT_A + τ_A for i = 1, ..., r and t_π(i)(n) = nT_B + τ_B for i = r + 1, ..., K, (3)

where the number of nodes in the two clusters is r and K − r, respectively. The definition above generalizes naturally to more than two clusters. Toward the goal of achieving synchronization, the clocks of different sensors can be coupled by letting every node radiate a timing signal as the one sketched in fig. 1. A pulse² is transmitted at times t_k(n) by the kth node, and received through independent flat fading channels by the other sensors. It is assumed that all the nodes transmit with the same power, and that the power P_ki(n) received on the wireless link between the ith and the kth user reads

P_ki(n) = C G_ki(n) / d_ki(n)^γ, (4)

where C is an appropriate constant, d_ki(n) = d_ik(n) is the distance between node i and node k at the nth period, G_ki(n) is a random variable accounting for the fading process, and γ is the path loss exponent (γ = 2–4). Notice that the fading channel is reciprocal (all transmissions use the same carrier frequency), which implies G_ki(n) = G_ik(n) and therefore P_ki(n) = P_ik(n). As detailed in the following, each node (at any period n) processes the received signal in order to estimate the time difference between its own clock t_k(n) and the corresponding "firing" instant of the other nodes, i.e., t_i(n) − t_k(n), i ≠ k, and, based on this measure, it updates its own clock.

¹ In [6] a distinction is made between synchronization (the state where nodes of the network have a common notion of time) and synchronicity (nodes agree on "firing" period and phase). In this paper, as in most of the literature, we focus on the latter, and refer to it as either synchronization or synchronicity.
² The temporal width of the transmitted pulse (or equivalently the employed bandwidth) has to be selected so as to guarantee the desired resolution of timing synchronization (see Sec. VII).

A. The synchronization algorithm
In this section, we consider the synchronization procedure under the ideal assumption that any node, say the kth, is able to measure exactly the time differences t_i(n) − t_k(n) and the powers P_ki(n) of the other nodes (i ≠ k) based on the received signal. This model is elaborated upon in the first part of the paper. A practical implementation of the system that alleviates the said assumptions is then discussed in Sec. VII. At the nth period, the kth node updates its clock t_k(n) according to a weighted sum of the timing differences³:

t_k(n + 1) = t_k(n) + T_k + ε ∆t_k(n), (5a)
∆t_k(n) = Σ_{i≠k} α_ki(n) (t_i(n) − t_k(n)), (5b)

where ε is the step-size (0 < ε < 1) and the coefficients α_ki(n) are selected so that α_ki(n) ≥ 0 and Σ_{i≠k} α_ki(n) = 1. In this paper, we focus on the following choice:

α_ki(n) = P_ki(n) / Σ_{j≠k} P_kj(n). (6)

The selection of the weighting coefficients (6) is inspired by the algorithms proposed in [11] and [12]. The rationale of this design is that time differences measured over more unreliable (i.e., low-power) channels should be weighted less when updating the clock, thus rendering the algorithm robust against measurement errors over the fading channels (see also Sec. VII). Notice that by using (5b) we are implicitly neglecting the propagation delays among nodes, which are assumed to be smaller than the timing resolution. A method to handle propagation delays is described in [11]. As a final remark, we notice that the dynamic system (5) updates the clock t_k(n + 1) as a convex combination of the times {t_i(n)}_{i=1}^K [15]. By defining the vector containing the clocks of all nodes as t(n) = [t_1(n), ..., t_K(n)]^T, we can express (5) as the difference vector equation

t(n + 1) = A(n) t(n) + T, (7)

where T = [T_1, ..., T_K]^T and the K × K matrix A(n) has entries

[A(n)]_kk = 1 − ε, [A(n)]_ki = ε α_ki(n) for i ≠ k. (8)

Notice that even though we assume channel reciprocity, matrix A(n) is not symmetric. Moreover, by construction, matrix A(n) is non-negative and stochastic, since the elements on each row sum to one, or equivalently A(n)1 = 1.
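As a concrete illustration, the clock update of rule (5) with the power-based weights (6) can be sketched in a few lines of Python; the helper name `sync_step` and all toy parameter values are illustrative, not from the paper:

```python
import numpy as np

# Sketch of one clock update per rule (5) with power-based weights (6).
# Helper name and toy values are illustrative, not from the paper.

def sync_step(t, T, P, eps):
    """One update of all K clocks: t, T are length-K arrays of clock values
    and free-running periods; P is the K x K received-power matrix."""
    K = len(t)
    t_next = np.empty(K)
    for k in range(K):
        others = [i for i in range(K) if i != k]
        w = P[k, others] / P[k, others].sum()    # weights alpha_ki, sum to 1
        delta = w @ (t[others] - t[k])           # weighted timing error (5b)
        t_next[k] = t[k] + T[k] + eps * delta    # clock update (5a)
    return t_next

# Three nodes with equal periods and unit powers: the clocks align.
t = np.array([0.1, 0.4, 0.9])
P = np.ones((3, 3))
for _ in range(200):
    t = sync_step(t, np.ones(3), P, eps=0.3)
spread = t.max() - t.min()   # residual misalignment after 200 periods
```

With equal powers the weights are uniform and the iteration is a plain averaging consensus, so `spread` shrinks geometrically with n.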
III. TIME-INVARIANT FREQUENCY-SYNCHRONOUS NETWORK

In this section, we study the convergence properties of the distributed synchronization algorithm (5) under the following assumptions: (i) frequency-synchronous network, i.e., all the clocks share the same period T = T_1 = ··· = T_K; (ii) the network is time-invariant, i.e., P_ki(n) = P_ki for any n and k ≠ i. From assumption (i), the clock of the kth node can be expressed as

t_k(n) = nT + τ_k(n), (9)

where τ_k(n) is the timing phase (0 ≤ τ_k(n) < T) of the kth node in the nth period (see fig. 1). Moreover, from both assumptions (i) and (ii), it follows that the synchronization algorithm (5) can be written in terms of the phases τ_k(n) as

τ_k(n + 1) = τ_k(n) + ε Σ_{i≠k} α_ki (τ_i(n) − τ_k(n)), (10)

with coefficients

α_ki = P_ki / Σ_{j≠k} P_kj. (11)

Finally, by defining the vector containing the timings of all nodes as τ(n) = [τ_1(n), ..., τ_K(n)]^T, we have

τ(n + 1) = A τ(n), (12)

where A is a K × K matrix such that [A]_ii = 1 − ε on the main diagonal and [A]_ij = ε α_ij for i ≠ j. Model (12) resembles the one considered in the literature on multi-agent coordination (see, e.g., [13]). The goal of this section is to determine the conditions under which the system (10) converges to a unique cluster or to multiple clusters of synchronization for a fixed realization of the fading variables G_ki in (4), i.e., matrix A is assumed to be deterministic. We will define the conditions of convergence in terms of the properties of the graph associated to the wireless network under study, or equivalently in terms of the system matrix A.

A. The associated graph and useful definitions
A wireless sensor network can be represented by a weighted directed graph G = (V, E, A) of order K, where V = {1, ..., K} is the set of nodes and E ⊆ V × V is the set of edges weighted by the off-diagonal elements of the K × K adjacency matrix [A]_ij = α_ij. The edge connecting the ith and the jth nodes, i ≠ j, belongs to E if and only if α_ij > 0. Notice that the graph is directed (in general α_ij ≠ α_ji for i ≠ j), even though the fading links are reciprocal (P_ij = P_ji for i ≠ j). Moreover, notice that the system matrix reads

A = I − εL, (13)

where L is the graph Laplacian of the network, defined as [13]: [L]_ii = Σ_{j≠i} α_ij = 1 (which is the degree of node i) and [L]_ij = −α_ij for i ≠ j. The main result of this Section (Theorem 1) relates the convergence properties of the distributed synchronization procedure (10) with the connectivity of the graph G associated to the sensor network. We need the following definitions.
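The construction of the system matrix A = I − εL in (13) from a matrix of received powers can be sketched as follows; the helper name and the example power values are illustrative:

```python
import numpy as np

# Building the system matrix A = I - eps*L of (13) from received powers
# (illustrative values): weights alpha_ij = P_ij / sum_j' P_ij', Laplacian
# [L]_ii = 1, [L]_ij = -alpha_ij, so A is row-stochastic by construction.

def system_matrix(P, eps):
    K = P.shape[0]
    alpha = np.zeros((K, K))
    for i in range(K):
        others = [j for j in range(K) if j != i]
        alpha[i, others] = P[i, others] / P[i, others].sum()  # weights (11)
    L = np.eye(K) - alpha          # graph Laplacian
    A = np.eye(K) - eps * L        # system matrix (13)
    return A, L

P = np.array([[0.0, 4.0, 1.0],
              [4.0, 0.0, 2.0],
              [1.0, 2.0, 0.0]])    # symmetric received powers, zero diagonal
A, L = system_matrix(P, eps=0.3)
```

Note that even with symmetric powers the row normalization makes A non-symmetric, exactly as remarked after (8).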
Definition 1: A graph G is said to be strongly connected if there exists a directed path (i.e., a collection of edges in E) linking every ordered pair of nodes.
It can be proved that strong connectivity of graph G is equivalent to the irreducibility of matrix A [20].
Definition 2: A K × K matrix A is said to be reducible if there exists a K × K permutation matrix P and an integer r > 0 such that

P^T A P = [ B C ; 0 D ], (14)

where B is r × r, D is (K − r) × (K − r), C is r × (K − r) and 0 denotes the (K − r) × r all-zero matrix. The degree of irreducibility of a matrix A, or equivalently of strong connectivity of the associated graph G, can be measured by the following quantity (see, e.g., [19]):

σ = min_S Σ_{i∈S, j∉S} α_ij, (15)

where the minimum is taken over all non-empty proper subsets S of the node set V = {1, ..., K}. It can be shown that σ = 0 if and only if the matrix A is reducible, or equivalently the associated graph G is not strongly connected.
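A cut-based measure of this kind, minimizing the total weight of edges leaving a node subset, can be computed by brute force for small networks; the minimum is zero exactly when some subset has no outgoing edges, i.e., the weight matrix is reducible. The helper name and example weights below are illustrative:

```python
import itertools
import numpy as np

# Brute-force cut measure (illustrative): minimize, over all non-empty proper
# node subsets S, the total weight of edges leaving S. The minimum is zero
# exactly when some subset has no outgoing edges, i.e. the weight matrix is
# reducible and the associated graph is not strongly connected.

def connectivity_measure(alpha):
    K = alpha.shape[0]
    best = np.inf
    for r in range(1, K):
        for S in itertools.combinations(range(K), r):
            out = [j for j in range(K) if j not in S]
            best = min(best, alpha[np.ix_(S, out)].sum())
    return best

ring = np.array([[0.0, 0.5, 0.5],
                 [0.5, 0.0, 0.5],
                 [0.5, 0.5, 0.0]])       # strongly connected
split = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])      # node 2 isolated: reducible
s_ring = connectivity_measure(ring)
s_split = connectivity_measure(split)
```

The exhaustive subset search costs O(2^K) and is only meant to make the definition concrete for small K.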

B. Convergence properties
The main result of this Section can be now stated as follows.
Theorem 1: (i) The distributed synchronization (10) converges to a unique cluster of synchronized nodes, τ_1(n) = ... = τ_K(n) = τ* for n → ∞, if and only if the associated weighted directed graph G is strongly connected, or equivalently if the system matrix A is irreducible. (ii) In this case, the system (12) converges (for n → ∞) to

τ(n) → τ* 1 with τ* = v^T τ(0), (16)

where v is the normalized left eigenvector of matrix A corresponding to the eigenvalue 1: v^T A = v^T, v^T 1 = 1. An immediate consequence of Theorem 1 is that the timing vectors converge to the average of their initial values τ(0) if and only if the system matrix A is doubly stochastic (i.e., if A^T is stochastic as well). In fact, in this case A^T 1 = 1 and the vector v in (16) reads v = (1/K)·1. This condition occurs in balanced networks [13], where Σ_{i≠k} α_ki = Σ_{i≠k} α_ik for every node k. In sensor networks, this result is of interest in applications where the steady-state value of synchronization is used in order to infer the status of the process monitored by the sensors [16] [7] [9].
Proof: The proof of part (i) of Theorem 1 is available in the literature for applications where the graph G associated to the dynamic system (12) is undirected [5]. In the case of a directed graph, strong connectivity can generally be proved to be only a sufficient condition for synchronization. However, in a wireless fading scenario with reciprocal channels the result can be proved as shown in the following. The second part (ii) of Theorem 1 follows from a result derived, among others, in [13].
As explained above, in order to prove Theorem 1, we only need to show that strong connectivity is also a necessary condition for synchronization. As a by-product, the proposed proof brings insight into the formation of multiple clusters of synchronization (3). Let us assume that A is reducible (or equivalently that the associated graph G is not strongly connected). Then, by definition, there exists a permutation matrix P and an integer r > 0 such that (14) holds. But if α_ij = 0 in A, then by reciprocity P_ij = P_ji = 0 and therefore α_ji = 0 (i ≠ j). This property is sometimes referred to as bidirectionality of the graph (i.e., α_ij = 0 if and only if α_ji = 0, but α_ij and α_ji need not be equal [15]). Therefore, the r × (K − r) matrix C in (14) has all zero entries. Since the permuted matrix P^T AP is non-negative and stochastic, so are the submatrices B and D. By applying the permutation function π(k) induced by P (where P_k, the kth row of matrix P, selects the entry π(k)) to the nodes' labels, we can write the system (12) as

τ̃(n + 1) = (P^T AP) τ̃(n),

where τ̃(n) = P^T τ(n). Therefore, the set of r nodes {π(1), ..., π(r)} evolves independently from the remaining nodes {π(r + 1), ..., π(K)}. Now, if either B or D is reducible, the reasoning above can be iterated, leading to the formation of multiple independent sets of nodes evolving separately. At the end of this procedure, the system matrix can be written as a block matrix with irreducible stochastic blocks on the diagonal. Without loss of generality, let us then assume that B and D are irreducible. From the first part of the proof (see also Appendix-A), it follows that the two clusters of r and (K − r) nodes synchronize among themselves according to (3). Moreover, the steady-state values of the timing vectors depend on the left eigenvectors of B and D according to (16): the first r entries of τ̃(n) converge to (v_B^T τ̃_B(0))·1 and the remaining entries to (v_D^T τ̃_D(0))·1, where τ̃_B(0) is the r × 1 vector collecting the first r entries of τ̃(0) and τ̃_D(0) is the (K − r) × 1 vector collecting the remaining entries. The convergence of the dynamic system at hand could also be studied in terms of the subdominant eigenvalue of matrix A, similarly to the approach commonly adopted in the
context of the analysis of Markov chains. In particular, the following result can be proved, relating convergence to the multiplicity of the eigenvalue 1.
Theorem 2: The distributed synchronization (10) converges to a unique cluster of synchronized nodes as in (2) if and only if the subdominant eigenvalue of A satisfies λ_2 ≠ 1.
Proof: By recalling Theorem 1, it is enough to prove that: i) if λ_2 = 1 then the graph is not strongly connected; ii) if the graph is not strongly connected then λ_2 = 1. Part i) can be proved similarly to [13]; however, in Appendix-A we give an alternative proof based on the measure σ in (15) of the irreducibility of A. Part ii) does not hold in general for problems with directed graphs, but it is easily shown under the reciprocity assumption, similarly to Theorem 1.
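The fixed point predicted by Theorem 1(ii) can be checked numerically: for an irreducible stochastic A, the iterates of (12) converge to (v^T τ(0))·1, with v the left eigenvector of A for the eigenvalue 1 normalized to sum one. The example matrix below is arbitrary, chosen only to be strictly positive (hence irreducible):

```python
import numpy as np

# Numerical check of Theorem 1(ii) on an illustrative irreducible matrix:
# tau(n+1) = A tau(n) converges to (v^T tau(0)) * 1, where v is the left
# eigenvector of A for eigenvalue 1, normalized so that v^T 1 = 1.

A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.6, 0.1],
              [0.2, 0.2, 0.6]])          # row-stochastic, all entries > 0
tau0 = np.array([0.1, 0.5, 0.9])

w, V = np.linalg.eig(A.T)                # left eigenvectors of A
v = np.real(V[:, np.argmin(np.abs(w - 1))])
v = v / v.sum()                          # normalize so v^T 1 = 1
predicted = v @ tau0                     # the common value tau* of (16)

tau = tau0.copy()
for _ in range(500):
    tau = A @ tau                        # iterate the dynamic system (12)
```

After the transient, every entry of `tau` agrees with `predicted` up to numerical precision.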

C. Numerical results
Here we present a numerical example to corroborate the analysis discussed above. A network of K = 4 nodes is considered where the nodes are divided into two groups, V_1 = {1, 2} and V_2 = {3, 4}, as in fig. 2. The initial phases τ_k(0) are set to τ(0)/T = [0.1 0.4 0.6 0.8]^T. The fading variables G_ki are equal to 1, the path loss exponent is γ = 3, D/d = 2 and ε = 0.3. Notice that, given the definition (11), the performance is not affected by the value of C in (4) and depends only on relative distances. Fig. 3 shows the timing vector τ(n) versus n. After a transient where the nodes tend to synchronize in pairs within the two groups, the system reaches steady state at the average value τ*/T = 0.475, as stated in Theorem 1, since the system matrix is easily shown to be doubly stochastic for this specific example.
In order to quantify the rate of convergence, from Theorem 2, we notice that the convergence of the synchronization protocol (10) depends on the subdominant eigenvalue λ_2. In particular, as is well known from the theory of linear difference equations, the rate of convergence is ruled by a term proportional to |λ_2|^n (see, e.g., [24]). If we define a threshold λ_o, we can say that the protocol reaches the steady-state condition at the time instant n_o for which |λ_2|^{n_o} = λ_o, i.e., n_o = log λ_o / log |λ_2|, and use v = 1/n_o as a measure of the rate of convergence of the algorithm. Fig. 4 shows the rate of convergence v versus the normalized distance D/d for ε = 0.3, 0.7. As expected, the rate v decreases with increasing D/d and decreasing ε. Along with v, fig. 4 shows the measure of irreducibility (or strong connectivity) σ in (15) as dashed lines. It is interesting to note that the rate of convergence v and the measure of irreducibility σ have the same behavior as a function of D/d and ε. This confirms that convergence is strictly related to the connectivity properties of the associated graph, as proved in Theorem 1.
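The rate measure just described can be sketched as follows; the threshold value and the two-node example matrix are illustrative assumptions:

```python
import numpy as np

# Rate-of-convergence measure from the subdominant eigenvalue: the error
# decays like |lambda_2|^n, so n_o = log(lambda_o)/log(|lambda_2|) steps are
# needed to reach a threshold lambda_o, and v = 1/n_o serves as a rate.
# The threshold and the example matrix are illustrative assumptions.

def convergence_rate(A, lambda_o=1e-3):
    mags = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]
    lam2 = mags[1]                        # subdominant eigenvalue magnitude
    n_o = np.log(lambda_o) / np.log(lam2)
    return 1.0 / n_o

A = np.array([[0.7, 0.3],
              [0.3, 0.7]])               # two fully coupled nodes, eps = 0.3
rate = convergence_rate(A)               # here lambda_2 = 1 - 2*eps = 0.4
```

A larger |λ_2| (weaker coupling) yields a larger n_o and hence a smaller rate v, matching the trend in fig. 4.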

D. Effect of fading: an example
In this Section, the effect of fading on the rate of convergence v is investigated via simulation for linear, ring and star topologies (see fig. 5). Rayleigh fading is assumed, i.e., the fading power gain G_ki in (4) is an exponentially distributed random variable with unit mean. Fading is assumed to be constant over n during the evolution of the algorithm. Fig. 6 plots the average rate of convergence E[v] (where the average E[·] is taken with respect to the distribution of the fading) for the three networks versus the number of nodes K (ε = 0.3). Notice that for K = 2 the three networks coincide, and recall that only relative distances are of concern for the behavior of the system (10). As expected, the star topology has the largest rate of convergence, whereas the linear network yields the slowest convergence.

IV. TIME-VARYING FREQUENCY-SYNCHRONOUS NETWORK
Here we reconsider the performance of the synchronization procedure (5) by removing the assumption of time-invariance underlying the analysis of the previous section. However, we still assume a frequency-synchronous network. Overall, the system (5) can be written in vector form in terms of the phases as (recall (12))

τ(n + 1) = A(n) τ(n), (20)

with the definition of the system matrix A(n) as in Sec. II. In this case, the sensor network can be described by a sequence of directed graphs G(n) = (V, E(n), A(n)) of order K, defined similarly to Sec. III. In particular, A(n) is the adjacency matrix [A(n)]_ij = α_ij(n), and the edge connecting the ith and the jth nodes, i ≠ j, belongs to E(n) if and only if α_ij(n) > 0. At each time instant n, the dynamic system describing the synchronization process evolves as (20), where A(n) = I − εL(n), with L(n) being the graph Laplacian at time n (see Sec. III). The study of convergence of a family of algorithms encompassing (20) has been recently attempted in a few works (see [13], [15] and references therein). In particular, adapting a result first presented in [15] to our case, we are able to relate the convergence of the dynamic system (20) to the connectivity properties of the associated sequence of graphs G(n). We need the following definition.

Definition 3: A sequence of graphs G(n) is said to be strongly connected across an interval I if the union graph with node set V and edge set ∪_{n∈I} E(n) is strongly connected.

Theorem 3: The distributed synchronization (20) in a time-varying topology converges to a unique cluster of synchronized nodes, τ_1(n) = ... = τ_K(n) for n → ∞, if the sequence of graphs G(n) is strongly connected across every interval [n_0, ∞), n_0 ≥ 0.

Proof: Theorem 3 can be proved by specializing the proof of Theorem 3 in [15] to our scenario. The basic idea is to exploit the convexity, or the contractive property, of the transformation (10). An interesting remark is that the reciprocity of fading plays here a key role, as it did in the proof of Theorem 1 for the case of a fixed topology. Reciprocity of fading translates into bidirectionality of the associated graphs (see Sec. III). As proved in [15], in the presence of unidirectional communication among nodes (i.e., non-reciprocal fading in our scenario), convergence of synchronization is not necessarily guaranteed even if each sensor communicates with every other sensor (either directly or via intermediate nodes) in an interval [n_0, ∞). On the contrary, in order to guarantee convergence in a unidirectional graph, a limit should be imposed on the time it takes for the graph to become strongly connected, i.e., the interval in Theorem 3 should be modified as [n_0, n_0 + T] with T ≥ 0 finite.
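A small simulation in the spirit of this section: the topology alternates between two link sets, neither of which is strongly connected alone, yet the phases still agree because the union over consecutive steps is strongly connected. The alternating pattern and all values are illustrative assumptions:

```python
import numpy as np

# Time-varying topology (illustrative): the link set alternates between two
# pairings of four nodes. Neither graph is strongly connected by itself, but
# the union over any two consecutive steps is, and the phases still agree.

def step_matrix(K, edges, eps):
    """Build A(n) = I - eps*L(n) from a set of bidirectional unit-weight links."""
    alpha = np.zeros((K, K))
    for i, j in edges:
        alpha[i, j] = alpha[j, i] = 1.0
    alpha /= np.maximum(alpha.sum(axis=1, keepdims=True), 1.0)  # row-normalize
    return np.eye(K) - eps * (np.eye(K) - alpha)

K, eps = 4, 0.5
A_even = step_matrix(K, [(0, 1), (2, 3)], eps)   # pairs {0,1} and {2,3}
A_odd = step_matrix(K, [(1, 2), (3, 0)], eps)    # pairs {1,2} and {3,0}

tau = np.array([0.1, 0.4, 0.6, 0.8])
for n in range(400):
    tau = (A_even if n % 2 == 0 else A_odd) @ tau
spread = tau.max() - tau.min()                    # consensus on a common phase
```

Each step matrix is doubly stochastic here, so the common phase is the average of the initial phases, 0.475, as in the time-invariant example of Sec. III-C.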

V. FREQUENCY-ASYNCHRONOUS NETWORK
In the previous sections, it was assumed that all the nodes have the same clock period T (frequency-synchronous network). However, in practice, different nodes might have different frequencies {1/T_k}_{k=1}^K, and the question arises of whether or not the physical layer-based scheme (5) is still able to achieve synchronization on a strongly connected graph. For a time-invariant scenario (i.e., P_ki(n) = P_ki for any n and k ≠ i), it will be shown below that, in the presence of a frequency mismatch, the scheme (10) is able to synchronize the clock periods of the nodes (recall (1)), but not their timing phases, so that the full synchronization condition (2) is not achieved. In this regard, it should be noted that, while perfect synchronization (2) is necessary for many applications, in other scenarios having nodes with synchronized frequencies is the only requirement (e.g., to ensure equal sensor duty cycles).
For a frequency-asynchronous time-invariant network, the considered synchronization scheme (5) reads

t_k(n + 1) = t_k(n) + T_k + ε Σ_{i≠k} α_ki (t_i(n) − t_k(n)), (21)

or, in vector form (see (7)),

t(n + 1) = A t(n) + T, with T = [T_1, ..., T_K]^T. (22)

Let us now denote a possible common value for the clock period of all nodes as T (to be determined), as in (1). It follows that the clock of the kth sensor can then be written for sufficiently large n as t_k(n) = nT + τ_k(n), or equivalently, in vector form, as

t(n) = nT·1 + τ(n). (23)

We are interested in determining whether such a common frequency 1/T exists and, if so, whether the phases τ(n) eventually converge to the same value for n → ∞. The main conclusion is summarized in the theorem below, whose proof is inspired by the analysis of the convergence of coupled analog oscillators in [14].

Theorem 4: With reference to (23), under the assumption that the graph G is strongly connected, the system (21) synchronizes the clocks of the K nodes to the common period

T = v^T T = Σ_{k=1}^K v_k T_k, (24)

where v is the normalized left eigenvector of matrix A corresponding to the eigenvalue 1: v^T A = v^T, v^T 1 = 1. However, the timing phases τ(n) remain generally mismatched and are given for n → ∞ by

τ(n) → τ* = (1/ε) L† ∆T + η1, (25)

with (·)† denoting the pseudoinverse and with the definitions

∆T = T − T·1 = [T_1 − T, ..., T_K − T]^T, (26)
η = v^T (τ(0) − (1/ε) L† ∆T). (27)

The theorem above states that, in the presence of a frequency mismatch, the algorithm (21) is able to synchronize the frequencies of the different nodes to the common clock period T in (24). However, the system does not lead to phase-synchronous clocks, and the phase error is determined by the frequency (period) mismatch ∆T according to (25). Notice that if the network is such that the system matrix A is doubly stochastic (as in the example of Sec. III-C), the eigenvector v reads (1/K)·1 and the common period T is in this case the average T = (1/K) Σ_{k=1}^K T_k. Moreover, with a doubly stochastic matrix A, the constant in (27) simplifies as η = v^T τ(0), since 1^T L† = 0^T (see the proof below for further details). Finally, we remark that if the frequency mismatch is ∆T = 0 (or equivalently T_k = T), the Theorem above reduces to Theorem 1.
Proof: Under the assumption of a connected graph (or an irreducible matrix A), according to Theorem 2 or [13], the Laplacian L is easily shown to have rank K − 1, where the one-dimensional null subspaces are defined by the relationships

v^T L = 0^T and L1 = 0. (28)

Using the latter equality, recalling (13) and the definition of the common clock period T and phases τ(n) in (23), the vector difference equation (22) can be written as

τ(n + 1) = A τ(n) + ∆T. (29)

An equilibrium state τ(n + 1) = τ(n) = τ* for the difference equation (29) satisfies τ(n + 1) − τ(n) = 0, which yields the condition

εL τ* = ∆T. (30)

From (28), it follows that: (i) in order for (30) to be feasible (i.e., for an equilibrium point to exist), the common clock period T must satisfy v^T ∆T = 0, or equivalently (24); (ii) an equilibrium phase vector must read τ* = (1/ε) L† ∆T + η1, where η is an arbitrary constant. It remains to show that the system actually converges for n → ∞ to the equilibrium point τ* determined above, and to evaluate the constant η.
Toward the goal of studying convergence, let us define

τ'(n) = τ(n) − (1/ε) L† ∆T. (31)

With this change of variables, the difference equation (29) boils down to τ'(n + 1) = A τ'(n), where we used the relationship L L† ∆T = ∆T, which easily follows from the definition of the pseudoinverse and (24). As a consequence of (31), as per Theorem 1, we have τ'(n) → (v^T τ'(0))·1. This expression is equivalent to (25), thus proving the theorem. Notice that from (31) the rate of convergence is the same as in the case of no frequency mismatch.
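Theorem 4 can be verified numerically on a symmetric two-node example: both achieved periods match v^T T, and the residual phase gap matches the pseudoinverse expression. All parameter values below are illustrative:

```python
import numpy as np

# Two-node check of Theorem 4 (illustrative values). The coupled update
# t(n+1) = A t(n) + T_vec locks both periods to T = v^T T_vec, while the
# phases keep an offset governed by (1/eps) * pinv(L) @ DeltaT.
eps = 0.3
alpha = np.array([[0.0, 1.0], [1.0, 0.0]])   # alpha_12 = alpha_21 = 1
L = np.eye(2) - alpha
A = np.eye(2) - eps * L
T_vec = np.array([1.00, 1.10])               # distinct free-running periods

v = np.array([0.5, 0.5])                     # left Perron vector (A symmetric)
T_common = v @ T_vec                          # predicted common period

t = np.array([0.0, 0.3])
for _ in range(2000):
    t = A @ t + T_vec                         # vector update
t_prev = t
t = A @ t + T_vec
periods = t - t_prev                          # achieved per-node periods

DeltaT = T_vec - T_common                     # period mismatch vector
offset = (1 / eps) * np.linalg.pinv(L) @ DeltaT
predicted_gap = offset[1] - offset[0]         # predicted residual phase gap
gap = t[1] - t[0]                             # achieved residual phase gap
```

Here T_common = 1.05 and the faster node keeps a fixed phase lead proportional to its period surplus divided by the coupling strength ε.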
VI. DISTRIBUTED TIME SYNCHRONIZATION AS COUPLED DISCRETE-TIME PLLS

The purpose of this section is to discuss the distributed synchronization algorithm investigated throughout the paper by building an analogy with conventional discrete-time Phase Locked Loops (PLLs) [22]. The discussion is not only beneficial for a better understanding of the system, but it also allows us to generalize the system design. In order to appreciate the similarity with a discrete-time PLL, fig. 7 depicts the synchronization procedure (5) carried out at each sensor as a linear dynamic feedback system. Similarly to a discrete-time PLL, the adder at the input evaluates the timing error ∆t_k(n), which is then multiplied by ε and fed to a Voltage Controlled Clock (VCC) that updates the clock according to (5a). The constant ε plays the role of the loop filter in a discrete-time PLL. The procedure (5) can then be interpreted as a first-order discrete-time PLL, since the loop filter is a trivial pure gain [23].
From the discussion above, it is clear that second- or third-order discrete-time PLLs⁴ can be obtained by introducing a loop filter ε(z), with one or two poles respectively, instead of the constant ε in the synchronization system of fig. 7. For instance, in the case of a second-order loop, we can introduce a pole μ in the loop by setting ε(z) = ε/(1 − μz^{−1}) with 0 < μ < 1. The corresponding update rule (5a) becomes

t_k(n + 1) = t_k(n) + (1 − μ) T_k + μ (t_k(n) − t_k(n − 1)) + ε ∆t_k(n). (32)

The updating rule (32) essentially corrects the local period T_k by the estimate t_k(n) − t_k(n − 1) of the common clock period. The convergence analysis of the second-order loop (32) can be carried out similarly to the previous section, where a first-order loop was considered. In particular, the following result holds.

Theorem 5: If the network of distributed PLLs is strongly connected and the system (32) converges, then it synchronizes the clocks of the K nodes to the common period (24). However, under the same conditions, the timing phases τ(n) remain generally mismatched and are given for n → ∞ by

τ(n) → τ* = ((1 − μ)/ε) L† ∆T + η1, (33)

with (·)† denoting the pseudoinverse, ∆T given by (26), and η defined as in (27) with (1 − μ)/ε in place of 1/ε.
Proof: The proof is along the lines of the proof of Theorem 4 (see Appendix-B for details).
Comparing the statement of the previous Theorem with the results derived for a first-order loop (Theorem 4), it can be seen that introducing a pole μ in the loop reduces the steady-state phase error by a factor 1 − μ. However, this reduction comes at the expense of decreased margins of stability. In fact, convergence cannot be guaranteed for all values of 0 < ε < 1 and 0 < μ < 1. We refer to Appendix-C for further analysis on this point. Here we illustrate this issue by means of an example. Consider a network with two nodes. In this case, we have α_12 = α_21 = 1 and the graph is connected. Fig. 8 shows the four eigenvalues of the system matrix associated with (32) (see Appendix-B and C for further details),

Ã = [ (1 + μ)I − εL   −μI ; I   0 ], (35)

for different values of the pole μ and ε = 0.9. Notice that the system matrix (35) is 4 × 4, since (32) is a system of two second-order difference equations [24]. Moreover, one eigenvalue of Ã is 1 irrespective of the value of μ. The absolute value of the remaining eigenvalues tends to one for μ → 1, showing that increasing the value of the pole leads to a loss of stability of the equilibrium point (33). Moreover, the value of μ at which a pair of eigenvalues acquires a non-zero imaginary part can be calculated exactly as a function of the spectrum of matrix A, as shown in Appendix-C (see (47)).
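The two-node example can be simulated directly. The code below uses one plausible reading of a second-order update, in which the pole μ blends the nominal period T_k with the measured period t_k(n) − t_k(n − 1); consistently with Theorem 5, the residual phase gap shrinks by the factor 1 − μ. This is a sketch under these assumptions, not the paper's exact implementation:

```python
import numpy as np

# Two-node second-order distributed PLL (illustrative). One plausible reading
# of a loop filter eps(z) = eps/(1 - mu z^-1) is the update
#   t_k(n+1) = t_k(n) + (1-mu)*T_k + mu*(t_k(n) - t_k(n-1))
#              + eps * sum_i alpha_ki * (t_i(n) - t_k(n)),
# in which the pole mu blends the nominal period with the measured one.
eps, mu = 0.3, 0.5
alpha = np.array([[0.0, 1.0], [1.0, 0.0]])   # alpha_12 = alpha_21 = 1
T_vec = np.array([1.00, 1.10])               # distinct free-running periods
T_common = 0.5 * T_vec.sum()                 # v = [1/2, 1/2] by symmetry

t_prev = np.array([0.0, 0.3])
t = t_prev + T_vec
for _ in range(3000):
    coupling = eps * (alpha @ t - t)         # eps * sum_i alpha_ki (t_i - t_k)
    t_next = t + (1 - mu) * T_vec + mu * (t - t_prev) + coupling
    t_prev, t = t, t_next

periods = t - t_prev                          # both nodes lock to T_common
gap_second = t[1] - t[0]                      # residual phase gap, second order
gap_first = (T_vec[1] - T_common) / eps       # first-order gap for comparison
```

For these values (ε = 0.3, μ = 0.5) the loop is stable and the steady-state gap is half the first-order one; pushing μ toward 1 would erode the stability margin, as discussed above.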
In order to corroborate the conclusions above, fig. 9 shows the standard deviation ξ(n) of the entries of the timing vector t(n) versus n, where ξ(n) = ((1/K) Σ_{k=1}^K (t_k(n) − (1/K) Σ_{i=1}^K t_i(n))²)^{1/2}, for the network in fig. 2 with parameters γ = 3, D/d = 2, ε = 0.9 and ∆T_1 = ∆T_4 = 0, ∆T_2 = 0.05, ∆T_3 = −0.05 with T = 1. Recall that the graph associated to this network is symmetric. Different values of the pole μ are considered, showing the reduction in the steady-state synchronization error with increasing μ. Dashed lines correspond to the analytical result (33).
To conclude, it is interesting to revisit the results of Theorems 1, 4 and 5 in the light of the analogy with conventional PLLs drawn above. It has been shown that in a strongly connected network: (i) a phase error is perfectly recovered for n → ∞ by the distributed synchronization algorithm (5) (Theorem 1); (ii) a frequency error is perfectly recovered at the expense of a phase mismatch for n → ∞ (Theorem 4); (iii) the residual phase mismatch caused by a frequency error can be reduced by introducing a pole in the control loop (Theorem 5). All these results can be read as the counterpart of known facts in the analysis of linearized PLLs, which assert that a first-order loop is able to (i) recover phase errors and (ii) achieve a constant phase error (referred to as the static phase error) in the case of a frequency mismatch [23]. Moreover, in this second case, it is interesting to notice how the phase errors (25) and (33) depend on the frequency mismatch ∆T exactly as the static phase error of a PLL [23].

VII. IMPLEMENTATION OF DISTRIBUTED COUPLED DISCRETE-TIME OSCILLATORS
In the previous sections, it was assumed that each node is able to measure the time differences and powers of other nodes so as to calculate the phase update ∆t_k(n). Here we remove this assumption by presenting a practical scheme that implements the phase detector over a bandlimited noisy channel. Since the algorithm is based only on instantaneous power measurements at the different nodes, it applies to both time-invariant and time-variant scenarios. The scheme is inspired by the proposal in [12]. A carrier frequency is dedicated to the synchronization channel, where each node, say the kth, transmits a bandlimited waveform g(t) (say, a square-root raised cosine pulse) centered at the times t_k(n), with symbol period 1/F_s (i.e., the time between the peak and the first zero). The symbol period 1/F_s defines the timing resolution of the system.
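The overall pulse seen after matched filtering of a square-root raised-cosine waveform is a raised cosine, which is also the shape used in the numerical example later in this section. A minimal generator for it (handling the removable singularity at |t| = 1/(2δF_s) with the standard limit value) can be sketched as:

```python
import numpy as np

def raised_cosine(t, Fs=1.0, rolloff=0.2):
    """Raised-cosine pulse with symbol rate Fs and roll-off factor `rolloff`.
    Satisfies g(0) = 1 and g(k/Fs) = 0 for nonzero integers k."""
    tt = np.atleast_1d(np.asarray(t, dtype=float)) * Fs
    out = np.empty_like(tt)
    sing = np.isclose(np.abs(2.0 * rolloff * tt), 1.0)  # denominator zero
    out[~sing] = (np.sinc(tt[~sing])
                  * np.cos(np.pi * rolloff * tt[~sing])
                  / (1.0 - (2.0 * rolloff * tt[~sing]) ** 2))
    # L'Hopital limit at |t| = 1/(2 * rolloff * Fs)
    out[sing] = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * rolloff))
    return out
```

The zero crossings at multiples of 1/F_s are what make the peak position of g(t) a clean timing marker at the nominal resolution of the system.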
Each node works in half-duplex mode and measures the received signal over an interval of duration T_k around the current timing instant t_k(n). Due to the half-duplex constraint and the finite switching time between transmitting and receiving modes, each sensor is not able to measure the received signal in an interval of (unilateral) size θ around the firing instant t_k(n); the observation window is therefore the interval of duration T_k around t_k(n), excluding the blind zone of size θ on either side of t_k(n). Fig. 10-(a) illustrates a block diagram of the operations performed at the receiver side by each sensor. The receiver performs baseband filtering matched to the transmitted waveform and then samples the received signal at some multiple L of the symbol frequency F_s, i.e., at rate LF_s with L ≥ 1. Based on the N = LF_s T samples received in the nth observation window, the kth node computes the update (21a) in case a first-order loop is employed, or (32) if a second-order loop is considered. Not knowing the exact timings and powers of the other nodes, t_i(n) and P_ki with i ≠ k, the kth sensor cannot directly calculate the updating term ∆t_k(n) from (5b) and (6). Instead, it estimates these quantities from the received samples, as explained below.
After matched filtering and sampling, the discrete-time baseband signal y_k(n, m) received by the kth node in the nth time period reads as in (36), where the sampling index m ranges within −N/2 < m ≤ N/2, with m = 0 corresponding to the firing time t_k(n) of the kth node; the average energy per symbol reads E_ki = C/(d_ki^γ · F_s) (recall (4)); β_ki denotes the Rayleigh fading coefficient, a zero-mean, unit-power, circularly symmetric complex Gaussian random variable with E[|β_ki|²] = G_ki; and w(n, m) is additive Gaussian noise with zero mean and power N_0. Notice that the samples in the interval −θLF_s ≤ m ≤ θLF_s are not measured (i.e., they are set to zero) due to the half-duplex constraint and the switching time between the receive and transmit modes of node k. A sketch of a possible realization of the received signal (36) is provided in fig. 10-(b) for an arbitrary waveform g(t).
A simple estimate of ∆t_k(n) can then be obtained as in (37), where J is the subset of time instants m ∈ (−N/2, −θLF_s) ∪ (θLF_s, N/2] for which the received power |y_k(n, m)|² is above a given threshold, as in [12]. The threshold is a system parameter that can be optimized. Being based solely on instantaneous power measurements (i.e., on the samples |y_k(n, m)|²), the practical scheme (37) has the advantage of not requiring any a priori knowledge at the nodes about the network topology or the average received powers.
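Since the exact expression (37) is not reproduced here, the sketch below implements one plausible reading consistent with (5b)-(6): a power-weighted average of the above-threshold sample offsets, scaled by the loop gain ε. The pulse shape (a Gaussian stand-in for g), real-valued noise (the paper's is complex), and all numeric values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
L, Fs, N = 10, 1.0, 400                   # oversampling, symbol rate, window size
m = np.arange(-N // 2 + 1, N // 2 + 1)    # sample index, m = 0 at t_k(n)
theta_samp = 1 * L                        # blind zone of +/- theta = 1/Fs

def pulse(x):
    # placeholder bandlimited pulse (a narrow Gaussian stands in for g)
    return np.exp(-0.5 * (x / L) ** 2)

# Two neighbours firing at offsets dt1, dt2 (in symbol periods), energies E1, E2
dt1, dt2, E1, E2 = 3.0, -5.0, 1.0, 0.25
y = (np.sqrt(E1) * pulse(m - dt1 * L) + np.sqrt(E2) * pulse(m - dt2 * L)
     + np.sqrt(1e-3) * rng.standard_normal(N))
y[np.abs(m) <= theta_samp] = 0.0          # half-duplex blind zone around m = 0

def estimate_update(y, m, eps=0.9, L=10, Fs=1.0, threshold=0.05):
    """Plausible form of the estimator (37): power-weighted average of the
    above-threshold sample offsets, scaled by the loop gain eps."""
    p = np.abs(y) ** 2
    J = p > threshold
    return eps * np.sum(p[J] * m[J]) / (L * Fs * np.sum(p[J]))

print(estimate_update(y, m))
```

The estimate lands near the power-weighted target ε(α_1 dt1 + α_2 dt2) with α_i proportional to E_i, i.e. about 1.26 symbol periods here, without the node knowing the neighbours' timings or powers in advance.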
For the example of Sec. III-C with no fading (β_ki = 1 for every i and k), fig. 11 shows the standard deviation ξ(n) of the timing vector t(n) versus n, where the expectation E[·] defining ξ(n) is taken with respect to the noise. We consider equal clock periods T_k = T = 1 for k = 1, ..., K. Moreover, it is assumed that all nodes transmit with the same power, and the signal-to-noise ratio for transmission to the closest node (e.g., from node 2 to node 1) is set to SNR = E_12/N_0 = 15 dB. The other parameters are as follows: ε = 0.9; the threshold is set for simplicity to zero (see the discussion below); the distances satisfy D/d = 2; the normalized timing resolution is 1/F_s = 0.01; the waveform g(t) is a raised cosine with roll-off δ = 0.2; the switching time is set to θ = 1/F_s. We first consider the first-order loop (21a) (or, equivalently, (32) with pole μ = 0) with different oversampling factors L = 1, 2, 5, 10, 15. It can be seen that the finite resolution of the system produces a performance floor for increasing n, which can be lowered by increasing the oversampling factor L. In any case, an upper bound on the accuracy of synchronization is set by the finite switching time θ = 0.01; this bound is reached for n and L sufficiently large. Adding a pole in the loop as in (32) can increase the convergence speed, as shown in fig. 11 for μ = 0.2, 0.4, 0.6. Notice that the convergence speed could also be improved by setting an appropriately chosen threshold in (37) (not shown). Finally, fig. 11 shows that an upper bound on the performance of the practical implementation discussed here is set by the performance of the dynamic system (10), the performance gap being due to noisy observations, finite resolution and finite switching time.

VIII. CONCLUSIONS
An increasing number of sensor network applications requires the availability of a common time reference at all nodes. Aiming at scalability and complexity reduction, physical-layer based synchronization qualifies as a valid alternative to conventional packet-based synchronization. In this work, the convergence properties of such an approach have been investigated by modelling the network as a set of discrete-time coupled oscillators and relying on the analytical framework of algebraic graph theory. It has been shown that the system can be equivalently studied as a set of distributed discrete-time PLLs. This observation allowed us to generalize the design of the synchronization process and to prove that well-established results on conventional PLLs extend naturally to distributed systems. Finally, practical details of the implementation of the synchronization algorithm over a bandlimited Gaussian channel have been discussed.

IX. APPENDIX A. Proof of Theorem 2
We need to prove that if λ_2 = 1, then the graph is not strongly connected. Toward this goal, we note the bound on the measure of irreducibility σ (15) given in [19], from which it easily follows that if λ_2 = 1 then σ = 0, and thus the graph is not strongly connected.

B. Proof of Theorem 5
The system of difference equations (32) (k = 1, ..., K) can be stated in vector form, similarly to (22). Following the same steps as in the proof of Theorem 4, the system can be rewritten in terms of the phases τ(n), relative to the common period T, as in (40). An equilibrium state τ* for the difference equation (40) satisfies τ(n + 1) = τ(n) = τ(n − 1) = τ*, which yields the condition (41). From (41), it follows that: (i) in order for (41) to be feasible (i.e., for an equilibrium point to exist), the common clock period T must satisfy v^T ∆T = 0, or equivalently (24); (ii) an equilibrium phase vector τ* must read τ* = (1 − μ)L_ε† ∆T + η1, where η is an arbitrary constant. Let us define τ_0(n) = τ(n) − τ*. With this change of variables, the difference equation (40) boils down to (42). The system (42) is a second-order vector difference equation, which can be studied by recasting it as a first-order vector difference equation in terms of the vector τ̄(n) = [τ_0(n)^T τ_0(n − 1)^T]^T with system matrix Ã (35) (see, e.g., [24]). Convergence of the corresponding system τ̄(n) = Ãτ̄(n − 1) depends on the eigenvalues of Ã. It is easy to see that Ã has an eigenvalue equal to one, with left and right eigenvectors z_ℓ = 1 and z_r = [v^T v^T]^T (recall that v is the eigenvector of A corresponding to the eigenvalue λ = 1). Moreover, it will be shown in Appendix-C that this eigenvalue is unique. Therefore, the system (42) is stable if and only if all the remaining 2K − 1 eigenvalues of Ã have absolute value less than one (see, e.g., [13]). This point is further investigated in Appendix-C. Assuming that the stability condition above holds, we have Ã^n → z_ℓ z_r^T for n → ∞ (see, e.g., [13]), and the phases τ_0(n) converge as τ_0(n) → 1v^T τ_0(0) (having set τ(−1) = 0), which implies that the constant η in (33) is given by (34).
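For concreteness, here is a hedged reconstruction of the equilibrium condition (41). It assumes the vector recursion (40) has the explicit form below, with L_ε = ε(I − A); this reading is chosen to be consistent with the stated equilibrium τ* = (1 − μ)L_ε† ∆T + η1, but it is not necessarily the paper's exact notation.

```latex
% Assumed explicit form of (40):
\boldsymbol{\tau}(n+1) = \boldsymbol{\tau}(n) + \Delta\mathbf{T}
   + \epsilon\,(\mathbf{A}-\mathbf{I})\,\boldsymbol{\tau}(n)
   + \mu\big(\boldsymbol{\tau}(n)-\boldsymbol{\tau}(n-1)-\Delta\mathbf{T}\big).
% Imposing tau(n+1) = tau(n) = tau(n-1) = tau^* gives the condition (41):
(1-\mu)\,\Delta\mathbf{T} = \epsilon\,(\mathbf{I}-\mathbf{A})\,\boldsymbol{\tau}^{\ast}
   = \mathbf{L}_{\epsilon}\,\boldsymbol{\tau}^{\ast}.
% Since v^T L_eps = 0, feasibility requires v^T Delta T = 0, and then
\boldsymbol{\tau}^{\ast} = (1-\mu)\,\mathbf{L}_{\epsilon}^{\dagger}\,\Delta\mathbf{T}
   + \eta\,\mathbf{1}.
```

Under this reading, the factor (1 − μ) in τ* arises directly from the one-pole loop filter accumulating the phase-detector output.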

C. Discussion on the stability of second-order distributed PLLs (32)
In this appendix, we further investigate the stability of the system (32) by studying the eigenvalues of the system matrix Ã (35). As discussed in Appendix-B, stability is achieved if and only if all the 2K − 1 eigenvalues of Ã, apart from the eigenvalue equal to one, have absolute value less than one. In the following, we show how to relate the eigenvalues of Ã to those of the system matrix A.
In order to evaluate the eigenvalues of Ã, consider that for λ̃ to be an eigenvalue of Ã there must exist a vector w = [w_1^T w_2^T]^T such that Ãw = λ̃w. After some straightforward algebra, the latter equation leads to the relationship (44) between the spectra of the matrices A and Ã, or equivalently (45), where λ denotes any of the K eigenvalues of A. It follows that, for every eigenvalue λ of A, we obtain two eigenvalues of Ã, given by (45). In order to proceed, we need to recall the following result on the eigenvalues of A.
In particular, (46) states that, if G is connected, only one eigenvalue of A equals λ = 1 (Proof: see, e.g., [13]). From (45) and (46), it can be seen that the eigenvalue λ̃ of Ã (taking the positive sign in (45)) corresponding to λ = 1 equals one. Moreover, by substituting λ = 1 in (45), it can be verified that this eigenvalue is unique. In order to prove the stability of the system at hand, we then need to show that all the other 2K − 1 eigenvalues λ̃(λ, μ), corresponding to λ ≠ 1, satisfy |λ̃(λ, μ)| < 1. In general, it is not easy to find conditions on ε and μ that guarantee |λ̃(λ, μ)| < 1 for every λ in (46) (except λ = 1). However, a numerical study can easily be carried out from (45) and (46). For illustration, consider an undirected graph G. In this case, the eigenvalues of matrix A are real and, from (46), satisfy −1 < λ < 1 (recalling that 0 < ε < 1). The absolute value |λ̃(λ, μ)| from (45) (with the positive sign; a similar result holds for the negative sign) is plotted versus μ and λ in fig. 12. It can be seen that, unless μ → 1, the stability condition is satisfied. Moreover, by studying the sign of the polynomial in (45), it is also possible to evaluate the value of the pole μ, say μ_0(λ), such that for μ ≤ μ_0(λ) the eigenvalues λ̃ corresponding to λ are real, while for μ > μ_0(λ) they are complex conjugate. This value, given in (47), is shown in fig. 12.
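Since (47) is not reproduced in this extraction, the sketch below works under an assumed characteristic polynomial λ̃² − (1 + μ − ε(1 − λ))λ̃ + μ = 0, one plausible reading of (45) that matches the behaviour reported in this appendix (a unique eigenvalue at one for λ = 1, and moduli tending to one for μ → 1). Under that assumption, the discriminant vanishes at the conjectured pole value μ_0(λ) = (1 − √(ε(1 − λ)))², which is verified numerically.

```python
import numpy as np

def loop_eigs(lam, eps, mu):
    """Roots of the assumed characteristic polynomial
    z^2 - (1 + mu - eps*(1 - lam)) z + mu = 0 (one reading of (45))."""
    return np.roots([1.0, -(1.0 + mu - eps * (1.0 - lam)), mu])

def mu0(lam, eps):
    """Conjectured pole value (cf. (47)) at which the eigenvalue pair
    associated with lam turns complex-conjugate."""
    return (1.0 - np.sqrt(eps * (1.0 - lam))) ** 2

eps = 0.9
for lam in (-0.5, 0.0, 0.5, 0.9):
    m0 = mu0(lam, eps)
    below = loop_eigs(lam, eps, m0 - 1e-3)   # should be real
    above = loop_eigs(lam, eps, m0 + 1e-3)   # should be complex conjugate
    print(lam, m0, np.all(np.abs(below.imag) < 1e-9),
          np.any(np.abs(above.imag) > 1e-9))
```

Note that for the complex pair the modulus equals √μ, which is consistent with the observation that the moduli approach one only as μ → 1.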
As a final remark, we notice that, for μ = 1, the absolute value |λ̃(λ, μ)| equals one irrespective of λ, leading to an unstable system.

Fig. 1. Clock t_k(n) of the kth node. τ_k(n) is the timing phase in the nth period of the clock.

Fig. 2. The rectangular topology considered in the example in Sec. III-C.

Fig. 6. Average rate of convergence E[v] for the linear, ring and star networks in fig. 5 versus the number of nodes K (ε = 0.3, γ = 3).

Fig. 8. Eigenvalues of the second-order loop system (32) in the case of two users, with μ increasing from 0 to 1.