Critical data points retrieving method for big sensory data in wireless sensor networks

Zhu, Tongxin; Cheng, Siyao; Cai, Zhipeng; Li, Jianzhong

doi:10.1186/s13638-015-0505-0

Research
Open access
Published: 15 January 2016

EURASIP Journal on Wireless Communications and Networking volume 2016, Article number: 18 (2016)


Abstract

With the development and widespread application of wireless sensor networks (WSNs), the amount of sensory data is growing sharply, and the volumes of some sensory data sets have reached terabytes, petabytes, or even exabytes, which already exceeds the processing abilities of current WSNs. However, such big sensory data are not necessary for most applications of WSNs; a small subset containing the critical data points may be enough for analysis, where the critical data points include the extremum and inflection data points of the monitored physical world during a given period. Therefore, retrieving only the critical data points during the sensory data acquisition process is an efficient way to reduce the size of a big sensory data set. Since most traditional sensory data acquisition algorithms were designed only for discrete data and do not support retrieving critical points from a continuously varying physical world, this paper studies this problem. To solve it, we first provide the formal definition of δ-approximate critical points. Then, a data acquisition algorithm based on numerical analysis and Lagrange interpolation is proposed to acquire the critical points. Extensive theoretical analysis and simulation results are provided, which show that the proposed algorithm achieves high accuracy in retrieving the δ-approximate critical points from the monitored physical world.

1 Introduction

The appearance of wireless sensor networks (WSNs) makes it possible to observe the complicated physical world at low cost. Nowadays, WSNs are widely used in many applications, including military defense [1–4], environment monitoring [5, 6], traffic monitoring [7, 8], and structural health monitoring [9–11]. Meanwhile, the amount of sensory data grows fast with the wide use of sensor networks. For example, climate data will exceed 100 PB in 2020 according to the report in [12]. For the Large Hadron Collider in Europe, if all sensory data were recorded, the total amount of sensory data would be nearly 50 EB per day. Similarly, traffic data, including GPS data, the monitoring data captured by electronic eyes, and so on, also increase rapidly and have already exceeded a petabyte annually. Fortunately, not all sensory values are necessary for the users' analysis in most applications. Some critical data points, such as maxima, minima, and inflection points in a monitoring process, may be the only ones required by the users. Therefore, algorithms for retrieving the critical data points from the monitored physical world are quite important for WSNs.

Currently, most traditional applications assume that the sensor nodes sense and sample the data from the monitored environment with an equal sampling frequency, and the sensory data generated by WSNs are regarded as a set of discrete values. Under such assumptions, a great number of query processing techniques on discrete sensory data have been proposed, including curve query processing algorithms [13], aggregation query processing algorithms [14–17], top-k query processing algorithms [18–20], and skyline [21, 22] and quantile [23, 24] query processing algorithms. Although all the above algorithms are efficient for processing discrete sensory data, they cannot meet the complicated query requirements given by users and do not support retrieving the critical data points in current WSNs, since only discrete data were considered. For example, in an air pollution monitoring application, the users want to estimate the extremum pollution values and obtain the period when these values appear. In a climate monitoring system, the users may want to know the convexity of the wind velocity or rainfall curve and acquire the inflection points of such curves. The existing query processing techniques, which only consider discrete sensory data, cannot satisfy these query requirements for the following reasons. First, as pointed out by [25, 26], the monitored physical world varies continuously, and the discrete sensory data set omits many critical points, such as maxima, minima, and inflection points of the physical world, so that queries for critical points cannot be answered because these points do not belong to the discrete sensory data set. Second, to answer queries about the convexity, the monotonicity, or the positions of critical points, the original data need to have continuous first and second derivatives, which a discrete sensory data set cannot provide either.

To overcome the above problems, Cheng et al. [26] proposed an adaptive data acquisition algorithm to reconstruct the physical world as accurately as possible. For any given error bound ε, the authors proved that the result returned by their algorithm is an O(ε)-approximation of the monitored physical world, where ε can be arbitrarily small. Since the sampling frequency of each sensor is adjusted adaptively according to the variation of the monitored physical world, the algorithm proposed in [26] samples few sensory data and saves a lot of energy while the observation error is guaranteed. However, since the aim of [26] is to reconstruct the physical world, the amount of sensory data acquired by the sensors is still quite large. According to the above analysis, many applications may only want to acquire the critical point information, such as maxima, minima, and inflection points; for such applications, the energy cost of collecting the sensory data from the monitored physical world can be further reduced.

For the above reasons, we study the problem of retrieving the critical points, including the extremum and inflection points, in sensor networks. In this paper, a novel sensory data acquisition algorithm based on numerical analysis techniques [27] and Lagrange interpolation [28] is proposed to retrieve the critical points approximately. The algorithm adjusts the sampling frequency of the sensors adaptively according to the variation of the physical world in order to dramatically reduce the amount of sensory data. Furthermore, to evaluate the error of approximate critical points, the formal definitions of the δ-approximate extremum point and the δ-approximate inflection point are provided, where δ denotes the relative error between the approximate critical point and the exact one. The correctness of the algorithm is proved. In summary, the contributions of this paper are as follows.

1. The formal definitions of the δ-approximate extremum point and the δ-approximate inflection point are proposed. The problem of acquiring critical points from the monitored physical world is also defined.
2. Two critical point aware data acquisition algorithms are proposed based on numerical analysis [27] and Lagrange interpolation [28] techniques. The algorithms adjust the sampling frequency of the sensors automatically according to the variation of the physical world. The correctness of the algorithms is proved, and their complexities are analyzed.
3. Extensive simulations on a real data set are carried out. The experimental results show that both the precision and the recall rate of the proposed algorithms are high when retrieving δ-approximate extremum points and δ-approximate inflection points from the monitored physical world.

The organization of the paper is as follows. Section 2 gives the problem definition. Section 3 provides the mathematical foundations of the algorithms. Section 4 proposes two critical point aware data acquisition algorithms, which retrieve the δ-approximate extremum points and the δ-approximate inflection points, respectively. Section 5 shows the experimental results. Section 6 discusses the related work. Finally, Section 7 concludes the paper.

2 Problem definition

Let N denote the number of sensor nodes in a given WSN, and V={1,2,⋯,N} be the set of sensor nodes, where i (1≤i≤N) denotes the ID of a sensor node.

Suppose that $t_{\text {s}}$ and $t_{\text {f}}$ denote the start and final time of monitoring the physical world by a WSN, respectively. The variation of the physical world monitored by sensor node i over this period can be regarded as a curve, and we use $S_{i}$ (1≤i≤N) to denote this curve. According to the discussion in [26], the physical world always varies continuously and $S_{i}$ is smooth enough to have a continuous fourth-order derivative, i.e., $S_{i}\in C^{4}[t_{\text {s}},t_{\text {f}}]$, where $C^{4}[t_{\text {s}},t_{\text {f}}]$ denotes the set of functions whose fourth-order derivative is continuous in $[t_{\text {s}},t_{\text {f}}]$.

In this paper, the critical points we consider are the extremum points and inflection points of the physical curve $S_{i}$ (1≤i≤N). Since $S_{i}$ has a continuous fourth-order derivative according to the above analysis, the extremum points of $S_{i}$ in $[t_{\text {s}},t_{\text {f}}]$ can be denoted by $\left \{x|S^{(1)}_{i}(x)=0\wedge x\in [t_{\text {s}},t_{\text {f}}]\right \}$; similarly, the inflection points of $S_{i}$ in $[t_{\text {s}},t_{\text {f}}]$ can be denoted by $\left \{x|S^{(2)}_{i}(x)=0\wedge x\in [t_{\text {s}},t_{\text {f}}]\right \}$, where $S^{(1)}_{i}$ and $S^{(2)}_{i}$ are the first and second derivatives of $S_{i}$, respectively.

Since it is unknown when the critical points will appear in the future, obtaining all the critical points of $S_{i}$ exactly would require an infinite sampling frequency, which is impossible in practice. Thus, we study a sensory data acquisition algorithm that retrieves the critical points, including the extremum points and the inflection points, approximately. To evaluate the relative error between the approximate critical points and the exact critical points, the δ-approximate extremum point and the δ-approximate inflection point are defined as follows.

Definition 1.

( δ-approximate extremum points ) $\widehat {x_{i}}$ from $\widehat {S_{i}}$ is called a δ-approximate extremum point if and only if there exists $x_{i}\in [t_{\text {s}},t_{\text {f}}]$ satisfying $S_{i}^{(1)}(x_{i})=0$ and $\frac {|\widehat {x_{i}} - x_{i}|}{x_{i}}\leq \delta $.

Definition 2.

( δ-approximate inflection points ) $\widehat {x_{i}}$ is called a δ-approximate inflection point if and only if there exists $x_{i}\in [t_{\text {s}},t_{\text {f}}]$ satisfying $S_{i}^{(2)}(x_{i})=0$ and $\frac {|\widehat {x_{i}} - x_{i}|}{x_{i}}\leq \delta $.
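The relative-error test shared by Definitions 1 and 2 can be sketched in a few lines. This is a minimal illustration only: the function name `is_delta_approximate` and the assumption that all exact critical times are positive (so that the division is well defined) are ours, not part of the definitions.

```python
def is_delta_approximate(x_hat, exact_points, delta):
    """Return True if x_hat is within relative error delta of some exact critical point.

    exact_points: times x with S'(x) = 0 (extrema) or S''(x) = 0 (inflections).
    Assumption (not in the definitions): exact times are positive.
    """
    return any(abs(x_hat - x) / x <= delta for x in exact_points if x > 0)


# Hypothetical exact extremum times at 10.0 and 50.0, tolerance delta = 0.05.
print(is_delta_approximate(10.2, [10.0, 50.0], 0.05))
print(is_delta_approximate(12.0, [10.0, 50.0], 0.05))
```

Because the error is relative to the exact time, the same absolute deviation is tolerated more easily late in the monitoring period than early in it.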

The intuition of our algorithm is to forecast the first and second derivatives of $S_{i}$ using the sensory data collected so far. If the first or second derivative is close to 0, the sampling frequency is increased. Otherwise, the sampling frequency is reduced in order to save energy. Since the physical world varies continuously, it is acceptable to use the historical sensory data to forecast the future. The detailed algorithm is presented in Section 4, and the formal definition of the problem of retrieving the approximate critical points from the physical world is given as follows.

Input:
1. The start and final time of monitoring, $t_{0}$ ($=t_{\text {s}}$) and $t_{\text {f}}$
2. The maximum bound of the sampling frequency of a sensor node, $f_{\text {max}}$
3. The step-size increment $t^{\prime }$ and the decrease factor α

Output: The sets of approximate extremum and inflection points in $[t_{\text {s}},t_{\text {f}}]$, $X_{1}$ and $X_{2}$

We verify that both the precision and the recall rate of our algorithms are high when δ-approximate extremum points and δ-approximate inflection points are collected. To improve readability, the symbols used in this paper are listed in Table 1.

Table 1 The meanings of symbols


3 Mathematical foundations

Let $t_{c}$ denote the current sampling time, $t_{c-1}$ denote the last sampling time before $t_{c}$, and $t_{\frac {2c-1}{2}}$ be the midpoint of $t_{c-1}$ and $t_{c}$. For each sensor node i (1≤i≤N), $S_{i}\left (t_{\frac {2c-1}{2}}\right)$ is always sampled in $[t_{c-1},t_{c}]$ in addition to $S_{i}(t_{c-1})$ and $S_{i}(t_{c})$. Based on these samples and the three-point central difference formula [27], the first and second derivatives of $S_{i}$ at $t_{\frac {2c-1}{2}}$ can be estimated by the following two formulas.

$$ \begin{array}{l} \widehat{S_{i}}^{(1)}\left(t_{\frac{2c-1 }{ 2 }}\right)=\frac{S_{i}(t_{c})-S_{i}(t_{c-1}) }{ 2h}-\frac{{h}^{2} }{ 6 }S_{i}^{(3)}\left(t_{\frac{2c-1 } { 2 }}\right) \end{array} $$

(1)

$$ \begin{array}{ll} \widehat{S_{i}}^{(2)}\left(t_{\frac{2c-1}{2}}\right) & = \frac{S_{i}(t_{c})-2S_{i}(t_{\frac{2c-1 }{ 2 }})+S_{i}(t_{c-1})}{h^{2}}-\frac{{h}^{2}}{12}S_{i}^{(4)}\left(t_{\frac{2c-1}{2}}\right) \end{array} $$

(2)

where $h=t_{\frac {2c-1 }{ 2 }}-t_{c-1}=t_{c}-t_{\frac {2c-1 }{ 2 }}$ denotes the half length of the sampling interval.
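As a minimal sketch of Formulas (1) and (2), the two estimates can be written as small functions. The names `first_derivative` and `second_derivative` are hypothetical, and the higher-order terms `d3` and `d4` (the estimates of $S_{i}^{(3)}$ and $S_{i}^{(4)}$ at the midpoint) default to 0 here, which reduces both formulas to the plain three-point central differences.

```python
import math


def first_derivative(s_prev, s_cur, h, d3=0.0):
    """Formula (1): central difference minus (h^2 / 6) times a third-derivative estimate."""
    return (s_cur - s_prev) / (2.0 * h) - (h ** 2 / 6.0) * d3


def second_derivative(s_prev, s_mid, s_cur, h, d4=0.0):
    """Formula (2): second difference minus (h^2 / 12) times a fourth-derivative estimate."""
    return (s_cur - 2.0 * s_mid + s_prev) / h ** 2 - (h ** 2 / 12.0) * d4


# Illustration on S(t) = sin(t) around the midpoint t = 1 with half-interval h = 0.01.
t, h = 1.0, 0.01
s_prev, s_mid, s_cur = math.sin(t - h), math.sin(t), math.sin(t + h)
plain = first_derivative(s_prev, s_cur, h)
corrected = first_derivative(s_prev, s_cur, h, d3=-math.cos(t))  # exact S''' of sin
print(abs(plain - math.cos(t)), abs(corrected - math.cos(t)))
```

In this illustration the exact third derivative is supplied by hand; in the method described here it would come from the Lagrange interpolation estimate of Formula (5).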

The following theorem guarantees that the difference between $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ and $S_{i}^{(1)}\left (t_{\frac {2c-1}{2}}\right)$ is determined by $h^{2}$.

Theorem 1.

$\left |S_{i}^{(1)}\left (t_{\frac {2c-1}{2}}\right)-\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)\right |=\left |\frac {{h}^{2} }{ 6 }\left \{S_{i}^{(3)} \left (t_{\frac {2c-1 }{ 2 }}\right)-S_{i}^{(3)}(\xi)\right \}\right |$, where $\xi \in [t_{c-1},t_{c}]$.

Proof.

Expanding $S_{i}$ in a Taylor series around $t_{\frac {2c-1}{2}}$, we have

$$\begin{aligned} S_{i}(t_{c})=&\,S_{i}\left(t_{\frac{2c-1}{2}}\right)+hS_{i}^{(1)}\left(t_{\frac{2c-1}{2}}\right)\\&+\frac{h^{2}}{2}S_{i}^{(2)}\left(t_{\frac{2c-1}{2}}\right) +\frac{h^{3}}{3!}S_{i}^{(3)}(\xi_{1}) \end{aligned} $$

Similarly

$$\begin{aligned} S_{i}(t_{c-1})=&\,S_{i}\left(t_{\frac{2c-1}{2}}\right)-hS_{i}^{(1)}\left(t_{\frac{2c-1}{2}}\right)+\frac{h^{2}}{2}S_{i}^{(2)}\left(t_{\frac{2c-1}{2}}\right)\\&-\frac{h^{3}}{3!}S_{i}^{(3)}(\xi_{2}) \end{aligned} $$

where $\xi _{1}\in \left [t_{\frac {2c-1}{2}},t_{c}\right ]$ and $\xi _{2}\in \left [t_{c-1},t_{\frac {2c-1}{2}}\right ]$. Therefore,

$$ S_{i}(t_{c})-S_{i}(t_{c-1})=2hS_{i}^{(1)}\left(t_{\frac{2c-1}{2}}\right)+\frac{h^{3}}{3!}\left(S_{i}^{(3)}(\xi_{1})+S_{i}^{(3)}(\xi_{2})\right) $$

(3)

Let M and m be the maximum and minimum values of $S_{i}^{(3)}(t)$ over $t\in [t_{c-1},t_{c}]$. Since $\xi _{1},\xi _{2}\in [t_{c-1},t_{c}]$, we have $m\le \frac {S_{i}^{(3)}(\xi _{1})+S_{i}^{(3)}(\xi _{2})}{2}\le M$. Furthermore, since $S_{i}^{(3)}$ is continuous, the intermediate value theorem guarantees that there exists $\xi \in [t_{c-1},t_{c}]$ such that $S_{i}^{(3)}(\xi)=\frac {S_{i}^{(3)}(\xi _{1})+S_{i}^{(3)}(\xi _{2})}{2}$. Therefore, dividing Formula (3) by 2h and rearranging, we have

$$ S_{i}^{(1)}\left(t_{\frac{2c-1}{2}}\right)=\frac{S_{i}(t_{c})-S_{i}(t_{c-1})}{2h}-\frac{h^{2}}{6}S_{i}^{(3)}(\xi) $$

(4)

Comparing Formula (4) with Formula (1), we have

$$ \left|S_{i}^{(1)}\left(t_{\frac{2c-1}{2}}\right)-\widehat{S_{i}}^{(1)}\left(t_{\frac{2c-1}{2}}\right)\right|= \left|\frac{{h}^{2} }{ 6 }\left\{S_{i}^{(3)}\left(t_{\frac{2c-1 }{ 2 }}\right)-S_{i}^{(3)}(\xi)\right\}\right|. $$

□

Theorem 2.

$\left |S_{i}^{(2)}\left (t_{\frac {2c-1}{2}}\right)-\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)\right |=\left |\frac {{h}^{2} }{ 12 }\left (S_{i}^{(4)} \left (t_{\frac {2c-1 }{ 2 }}\right)-S_{i}^{(4)}(\xi)\right)\right |$, where $\xi \in [t_{c-1},t_{c}]$. □

The proof of Theorem 2 is similar to that of Theorem 1. These two theorems indicate that the error of the derivative estimates given by Formulas (1) and (2), and hence the difference between the exact critical points and the approximate ones calculated from them, can be made arbitrarily small by decreasing h.
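The quadratic dependence on h can be checked numerically. The sketch below uses the plain central difference, i.e., Formula (1) without the third-derivative term, so it illustrates the $h^{2}$ scaling rather than the exact bound of Theorem 1; the helper name `central_diff_error` and the test signal sin(t) are assumptions made for the example.

```python
import math


def central_diff_error(f, df, t, h):
    """Absolute error of the plain three-point central difference for f'(t)."""
    approx = (f(t + h) - f(t - h)) / (2.0 * h)
    return abs(df(t) - approx)


# Halving h should shrink the error by roughly a factor of four (error ~ h^2).
e_coarse = central_diff_error(math.sin, math.cos, 1.0, 0.1)
e_fine = central_diff_error(math.sin, math.cos, 1.0, 0.05)
ratio = e_coarse / e_fine
print(ratio)
```

A ratio near 4 when h is halved is what an $O(h^{2})$ error term predicts.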

In Theorems 1 and 2, $S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ and $S_{i}^{(4)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ can be estimated by Lagrange interpolation, using the sensory values collected by sensor node i (1≤i≤N) in the last and current sampling intervals, i.e., $S_{i}(t_{c-2})$, $S_{i}\left (t_{\frac {2c-3}{2}}\right)$, $S_{i}(t_{c-1})$, $S_{i}\left (t_{\frac {2c-1}{2}}\right)$, and $S_{i}(t_{c})$.

Let $L_{1}(t)$ and $L_{2}(t)$ denote the Lagrange interpolation polynomials used to estimate $S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ and $S_{i}^{(4)}\left (t_{\frac {2c-1 }{ 2 }}\right)$, respectively. The construction of $L_{1}(t)$ and $L_{2}(t)$ is shown in the Appendix. $S^{(3)}_{i}(t)$ can then be estimated by the following formula, where $t\in \left [t_{\frac {2c-3}{2}},t_{c}\right ]$.

$$ \begin{aligned} &S_{i}^{(3)}(t)\approx L_{1}^{(3)}(t)\\ &=6\left[\frac{S_{i}\left(t_{\frac{2c-3}{2}}\right)}{\left(t_{\frac{2c-3}{2}}-t_{c-1}\right)\left(t_{\frac{2c-3}{2}}-t_{\frac{2c-1}{2}}\right)\left(t_{\frac{2c-3}{2}}-t_{c}\right)}\right.\\ &~~~\left.+\frac{S_{i}(t_{c-1})}{\left(t_{c-1}-t_{\frac{2c-3}{2}}\right)\left(t_{c-1}-t_{\frac{2c-1}{2}}\right)\left(t_{c-1}-t_{c}\right)}\right.\\ &~~~\left.+\frac{S_{i}\left(t_{\frac{2c-1}{2}}\right)}{\left(t_{\frac{2c-1}{2}}-t_{\frac{2c-3}{2}}\right)\left(t_{\frac{2c-1}{2}}-t_{c-1}\right)\left(t_{\frac{2c-1}{2}}-t_{c}\right)}\right.\\ &~~~\left.+\frac{S_{i}(t_{c})}{\left(t_{c}-t_{\frac{2c-3}{2}}\right)(t_{c}-t_{c-1})\left(t_{c}-t_{\frac{2c-1}{2}}\right)}\right] \end{aligned} $$

(5)

Besides, $S_{i}^{(4)}(t)$ can be estimated by the following formula, where $t\in [t_{c-2},t_{c}]$.

$$ \begin{aligned} &S_{i}^{(4)}(t)\approx L_{2}^{(4)}(t)\\ &=24\left[\frac{S_{i}(t_{c-2})}{\left(t_{c-2}-t_{\frac{2c-3}{2}}\right)\left(t_{c-2}-t_{c-1}\right)\left(t_{c-2}-t_{\frac{2c-1}{2}}\right)\left(t_{c-2}-t_{c}\right)}\right.\\ &\quad\left.+\frac{S_{i}\left(t_{\frac{2c-3}{2}}\right)}{\left(t_{\frac{2c-3}{2}}-t_{c-2}\right)\left(t_{\frac{2c-3}{2}}-t_{c-1}\right)\left(t_{\frac{2c-3}{2}}-t_{\frac{2c-1}{2}}\right)\left(t_{\frac{2c-3}{2}}-t_{c}\right)}\right.\\ &\quad\left.+\frac{S_{i}(t_{c-1})}{\left(t_{c-1}-t_{c-2}\right)\left(t_{c-1}-t_{\frac{2c-3}{2}}\right)\left(t_{c-1}-t_{\frac{2c-1}{2}}\right)\left(t_{c-1}-t_{c}\right)}\right.\\ &\quad\left.+\frac{S_{i}\left(t_{\frac{2c-1}{2}}\right)}{\left(t_{\frac{2c-1}{2}}-t_{c-2}\right)\left(t_{\frac{2c-1}{2}}-t_{\frac{2c-3}{2}}\right)\left(t_{\frac{2c-1}{2}}-t_{c-1}\right)\left(t_{\frac{2c-1}{2}}-t_{c}\right)}\right.\\ &\quad\left.+\frac{S_{i}(t_{c})}{\left(t_{c}-t_{c-2}\right)\left(t_{c}-t_{\frac{2c-3}{2}}\right)\left(t_{c}-t_{c-1}\right)\left(t_{c}-t_{\frac{2c-1}{2}}\right)}\right] \end{aligned} $$

(6)
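Formulas (5) and (6) share one pattern: the n-th derivative of the degree-n Lagrange polynomial through n+1 samples is n! times the sum of each sample divided by the product of its distances to the other sampling times. A minimal sketch of that pattern follows; the function name `top_derivative` is hypothetical, and the polynomial test signals are chosen only because their derivatives are known exactly.

```python
import math


def top_derivative(ts, ys):
    """n-th derivative of the degree-n Lagrange polynomial through n+1 samples.

    Four samples give Formula (5) (factor 3! = 6); five samples give
    Formula (6) (factor 4! = 24). The sampling times need not be equally spaced.
    """
    total = 0.0
    for j, tj in enumerate(ts):
        denom = 1.0
        for k, tk in enumerate(ts):
            if k != j:
                denom *= tj - tk
        total += ys[j] / denom
    return math.factorial(len(ts) - 1) * total


# The estimate is exact for polynomials of matching degree.
ts4 = [0.0, 0.5, 1.0, 1.5]
ts5 = [0.0, 0.5, 1.0, 1.5, 2.0]
d3 = top_derivative(ts4, [t ** 3 for t in ts4])  # third derivative of t^3 is 6
d4 = top_derivative(ts5, [t ** 4 for t in ts5])  # fourth derivative of t^4 is 24
print(d3, d4)
```

Since the result is constant over the interpolation range, one evaluation per sampling interval is enough.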

The error of this estimation is characterized by Theorem 3 and is very small in practice.

Theorem 3.

The errors generated by Formulas (5) and (6) are equal to $\gamma _{1}(t)=\frac {S_{i}^{(4)}(\xi)}{4}\left [4t-\left (t_{\frac {2c-3}{2}}+t_{c-1}+t_{\frac {2c-1}{2}}+t_{c}\right)\right ]$ and $\gamma _{2}(t)=\frac {S_{i}^{(5)}(\xi)}{5}\left [5t-\left (t_{c-2}+t_{\frac {2c-3}{2}}+ t_{c-1}+t_{\frac {2c-1}{2}}+t_{c}\right)\right ]$, respectively.

Proof.

According to the property of Lagrange interpolation [28], the interpolation remainder, denoted by R(x), satisfies

$$ R(x)=f(x)-L_{f}(x)=\frac{f^{(n+1)}(\xi)}{(n+1)!}\omega(x) $$

(7)

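where f(x) is a function whose nth-order derivative is continuous and whose (n+1)th-order derivative exists, $L_{f}(x)$ is the n-degree Lagrange interpolation polynomial of f(x), $x_{0},x_{1},\cdots ,x_{n}$ denote the interpolation points, $\omega (x)=\prod _{j=0}^{n}(x-x_{j})$, and $\xi \in [x_{0},x_{n}]$.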

Therefore, the third derivative of the interpolation remainder of $L_{1}(t)$ satisfies

$$ \begin{aligned} \gamma_{1}(t)&=R^{(3)}(t)=\frac{S_{i}^{(4)}(\xi)}{4!}\omega_{1}^{(3)}(t)\\ &=\frac{S_{i}^{(4)}(\xi)}{4}\left[4t-\left(t_{\frac{2c-3}{2}}+t_{c-1}+t_{\frac{2c-1}{2}}+t_{c}\right)\right] \end{aligned} $$

(8)

which is the error generated during estimating $S_{i}^{(3)}(t)$.

Similarly, the fourth derivative of the interpolation remainder of $L_{2}(t)$ satisfies

$$ \begin{aligned} \gamma_{2}(t) & =R^{(4)}(t)=\frac{S_{i}^{(5)}(\xi)}{5!}\omega_{2}^{(4)}(t)\\ & = \frac{S_{i}^{(5)}(\xi)}{5}\left[5t-\left(t_{c-2}+t_{\frac{2c-3}{2}}+t_{c-1}+t_{\frac{2c-1}{2}}+t_{c}\right)\right] \end{aligned} $$

(9)

which is the error generated during estimating $S_{i}^{(4)}(t)$. Thus, Theorem 3 is proved. □

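Theorem 3 shows that the error introduced by the Lagrange interpolation estimate is proportional to the distance between t and the mean of the interpolation times, so it is small whenever the sampling intervals are short.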
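The factors 1/4 and 1/5 in Formulas (8) and (9) can be checked by differentiating the node polynomials directly, treating ξ as a constant as the derivation above does. The symbols $\sigma _{1}$ and $\sigma _{2}$ are introduced here only for brevity and denote the sums of the four and five interpolation times, respectively.

$$ \begin{aligned} \omega_{1}(t)&=\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{c-1}\right)\left(t-t_{\frac{2c-1}{2}}\right)\left(t-t_{c}\right)=t^{4}-\sigma_{1}t^{3}+\cdots,\\ \omega_{1}^{(3)}(t)&=24t-6\sigma_{1}=6\left(4t-\sigma_{1}\right),\qquad \frac{6}{4!}=\frac{1}{4},\\ \omega_{2}^{(4)}(t)&=120t-24\sigma_{2}=24\left(5t-\sigma_{2}\right),\qquad \frac{24}{5!}=\frac{1}{5}. \end{aligned} $$

As an illustration under the additional assumption that all samples are equally spaced by h, evaluating at $t=t_{\frac {2c-1}{2}}$ gives $4t-\sigma _{1}=2h$ and $5t-\sigma _{2}=5h$, so that $\gamma _{1}=\frac {h}{2}S_{i}^{(4)}(\xi)$ and $\gamma _{2}=hS_{i}^{(5)}(\xi)$; both errors shrink linearly with h.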

4 Critical point aware data acquisition algorithm

As in Section 3, $t_{c}$ and $t_{c-1}$ denote the current sampling time and the last one before $t_{c}$, respectively, and $t_{\frac {2c-1}{2}}$ is the midpoint of $t_{c-1}$ and $t_{c}$. Let $f_{\text {max}}$ be the maximum sampling frequency that a sensor node can achieve.

Based on these symbols, the whole critical point aware data acquisition algorithm can be divided into two phases. The first one is the initial phase, in which the sampling frequency is set to $f_{\text {max}}$ in order not to omit any critical points. The second one is the maintenance phase, in which the sampling frequency is determined according to the variation of the physical world. Since the future variation of the monitored physical world is unknown, posterior estimation is adopted; that is, the historical sensory data collected up to the current time are used to estimate the variation of the monitored physical world in the future. Because the physical world always varies continuously, such estimation is acceptable and can achieve high precision.

Specifically, the critical point aware data acquisition algorithm consists of the following five steps.

Step 1. Sample sensory values at times $t_{0}$, $t_{\frac {1}{2}}$, and $t_{1}$, where $t_{\frac {1}{2}}=t_{0}+\frac {1}{f_{\text {max}}}$ and $t_{1}=t_{\frac {1}{2}}+\frac {1}{f_{\text {max}}}$. That is, we initialize h with the minimum sampling interval $\frac {1}{f_{\text {max}}}$. Initialize c with 2, then execute a loop until $t_{c}>t_{\text {f}}$.
Step 2. Sensor node i (1≤i≤N) samples the sensory values at times $t_{\frac {2c-1 }{ 2 }}$ and $t_{c}$, where $t_{\frac {2c-1 }{ 2 }}=t_{c-1}+h$ and $t_{c}=t_{\frac {2c-1 }{ 2 }}+h$. Using the sensory values sampled in the current and last sampling intervals, i.e., $S_{i}(t_{c-2})$, $S_{i}\left (t_{\frac {2c-3 }{ 2 }}\right)$, $S_{i}(t_{c-1})$, $S_{i}\left (t_{\frac {2c-1 }{ 2 }}\right)$, and $S_{i}(t_{c})$, the Lagrange interpolation polynomials can be constructed. Therefore, $S_{i}^{(3)}(t)$ and $S_{i}^{(4)}(t)$ can be obtained according to Formulas (5) and (6) for $t\in \left [t_{\frac {2c-3}{2}},t_{c}\right ]$ and $t\in [t_{c-2},t_{c}]$, respectively.
Step 3. Call the extremum point retrieving algorithm to obtain the extremum point in the current sampling interval and determine the length of the next possible sampling interval $h_{1}$. The detailed method is shown in Section 4.1.
Step 4. Similarly, call the inflection point retrieving algorithm to collect the inflection point and determine the length of the next possible sampling interval $h_{2}$. The detailed algorithm is given in Section 4.2.
Step 5. Finally, select the smaller of the two values returned by the above two steps as the length of the next sampling interval, i.e., $h=\min \{h_{1},h_{2}\}$, which avoids omitting critical points. Increase c by 1 and go to Step 2, repeating the loop until $t_{c}>t_{\text {f}}$.

The detailed critical point aware data acquisition algorithm is shown in Algorithm 1.
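Since Algorithm 1 itself is not reproduced in the text, the five steps can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' implementation: the names `acquire`, `next_half_interval`, `top_derivative`, and `zero_crossing` are hypothetical, the step-size rule follows the three cases of Sections 4.1 and 4.2, and a linear zero-crossing estimate stands in for the curve tessellation techniques [29] when a sign change is detected.

```python
import math


def top_derivative(ts, ys):
    """Formula (5) for four samples, Formula (6) for five samples."""
    total = 0.0
    for j, tj in enumerate(ts):
        denom = 1.0
        for k, tk in enumerate(ts):
            if k != j:
                denom *= tj - tk
        total += ys[j] / denom
    return math.factorial(len(ts) - 1) * total


def next_half_interval(low, high, h, t_step, alpha, h_min):
    """Three-case rule: `low` is the derivative whose zero is sought,
    `high` is the next-higher derivative."""
    if low * high > 0:                                  # moving away from zero
        return h + t_step
    if low * high < 0 and abs(high) * 2 * h > abs(low):  # zero expected soon
        return max(h_min, alpha * h)
    if abs(low) >= (2 * h + t_step) * abs(high):
        return h + t_step
    return h


def zero_crossing(t_a, v_a, t_b, v_b):
    """Linear estimate of the zero between two values of opposite sign (simplification)."""
    return t_a + (t_b - t_a) * v_a / (v_a - v_b)


def acquire(signal, t_s, t_f, f_max, t_step, alpha):
    h_min = h = 1.0 / f_max
    ts = [t_s, t_s + h, t_s + 2 * h]                  # Step 1: t_0, t_1/2, t_1
    ys = [signal(t) for t in ts]
    extrema, inflections, prev = [], [], None
    while ts[-1] + 2 * h <= t_f:
        t_mid, t_cur = ts[-1] + h, ts[-1] + 2 * h     # Step 2: two new samples
        ts += [t_mid, t_cur]
        ys += [signal(t_mid), signal(t_cur)]
        d3 = top_derivative(ts[-4:], ys[-4:])
        d4 = top_derivative(ts[-5:], ys[-5:])
        d1 = (ys[-1] - ys[-3]) / (2 * h) - h * h / 6 * d3                 # Formula (1)
        d2 = (ys[-1] - 2 * ys[-2] + ys[-3]) / (h * h) - h * h / 12 * d4   # Formula (2)
        if prev is not None:                          # sign change between midpoints
            p_mid, p_d1, p_d2 = prev
            if p_d1 * d1 < 0:
                extrema.append(zero_crossing(p_mid, p_d1, t_mid, d1))
            if p_d2 * d2 < 0:
                inflections.append(zero_crossing(p_mid, p_d2, t_mid, d2))
        prev = (t_mid, d1, d2)
        h1 = next_half_interval(d1, d2, h, t_step, alpha, h_min)   # Step 3
        h2 = next_half_interval(d2, d3, h, t_step, alpha, h_min)   # Step 4
        h = min(h1, h2)                                            # Step 5
    return extrema, inflections, len(ts)


ext, infl, n_samples = acquire(math.sin, 0.1, 10.5, 100.0, 0.005, 0.5)
print(ext, infl, n_samples)
```

On the test signal sin(t), the returned lists should lie close to the odd multiples of π/2 (extrema) and the multiples of π (inflection points) inside the monitoring window, while `n_samples` indicates how many samples the adaptive schedule used.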

4.1 Extremum point retrieving algorithm

The extremum point retrieving algorithm has four steps.

Step 1. Sensor i (1≤i≤N) estimates the first derivative $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)$ based on Formula (1). Meanwhile, $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)$ is computed according to Formula (2).
Step 2. Find the extremum point in $\left [t_{\frac {2c-3}{2}}, t_{\frac {2c-1}{2}}\right ]$. If $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)=0$, return the extremum point $t_{\frac {2c-1}{2}}$. Otherwise, compare $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)$ with $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right)$ calculated in the last loop. If $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)\times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right)<0$, there must be an extremum point in $\left [t_{\frac {2c-3}{2}},t_{\frac {2c-1}{2}}\right ]$. Then, retrieve the extremum point by curve tessellation techniques [29].
Step 3. There exist three cases that need to be considered when determining the length of the next sampling interval.

1. If $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)\times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1 }{ 2 }}\right)>0$, the first derivative of the physical world is moving away from 0 (it is positive and increasing, or negative and decreasing); therefore, an extremum point is unlikely to be contained in the next sampling interval $[t_{c},t_{c+1}]$, and the length of the next sampling interval is increased in order to save energy. Let h and $h_{1}$ denote the half lengths of the current and next sampling intervals; then $h_{1}=h+t^{\prime }$, where $t^{\prime }$ is a given constant denoting the step-size increment.
2. If $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)\times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1 }{2}}\right)<0$ and $\left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)\right | \times 2h>\left |\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)\right |$, the next extremum point is likely to be included in the next sampling interval; thus, we decrease the length of the sampling interval in order to catch the extremum point. Therefore, $h_{1}$ is set to $h_{1}=\text {max}\left \{\frac {1}{f_{\text {max}}},\alpha h\right \}$, where α is the decrease factor and 0<α≤1.
3. Otherwise, the ratio between $\left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)\right |$ and $\left |\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)\right |$ is considered: if $\left |\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)\right |\geq \left (2h+t^{\prime }\right) \times \left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)\right |$, then $h_{1}=h+t^{\prime }$.

In all other situations, $h_{1}$ keeps the current half length h.

Step 4. Return h ₁ and the extremum point.

The detailed algorithm is shown in Algorithm 2.
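One way to read the thresholds in cases 2 and 3 is as a first-order prediction of where the first derivative vanishes. This reading is an interpretation offered for illustration; it is not stated explicitly in the description above. Writing $m=t_{\frac {2c-1}{2}}$ and using Δ for a hypothetical look-ahead time, the linear extrapolation

$$ S_{i}^{(1)}(m+\Delta)\approx \widehat{S_{i}}^{(1)}(m)+\Delta\,\widehat{S_{i}}^{(2)}(m) $$

vanishes at

$$ \Delta^{*}=-\frac{\widehat{S_{i}}^{(1)}(m)}{\widehat{S_{i}}^{(2)}(m)}, $$

which is positive exactly when $\widehat {S_{i}}^{(2)}(m)\times \widehat {S_{i}}^{(1)}(m)<0$. The second condition of case 2, $\left |\widehat {S_{i}}^{(2)}(m)\right |\times 2h>\left |\widehat {S_{i}}^{(1)}(m)\right |$, is then equivalent to $\Delta ^{*}<2h$, i.e., the predicted zero lies within one full interval length ahead, so the interval is shortened. The condition of case 3, $\left |\widehat {S_{i}}^{(1)}(m)\right |\geq (2h+t^{\prime })\left |\widehat {S_{i}}^{(2)}(m)\right |$, corresponds to $\Delta ^{*}\geq 2h+t^{\prime }$, i.e., the predicted zero is far enough away that the interval can safely grow.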

4.2 Inflection point retrieving algorithm

The inflection point retrieving algorithm also has four steps and is constructed in the same way as the extremum point retrieving algorithm in Section 4.1.

Step 1. Sensor i (1≤i≤N) estimates the first derivative $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ and the second derivative $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ based on Formula (1) and Formula (2).
Step 2. Find the inflection point in $\left [t_{\frac {2c-3 }{ 2 }}, t_{\frac {2c-1 }{ 2 }}\right ]$. If $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)=0$, the inflection point is $t_{\frac {2c-1 }{ 2 }}$; return it. Otherwise, compare $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)$ with $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-3 }{ 2 }}\right)$ calculated in the last loop. If $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)\times \widehat {S_{i}}^{(2)}\left (t_{\frac {2c-3 }{ 2 }}\right) <0$, there must be an inflection point in $\left [t_{\frac {2c-3 }{ 2 }},t_{\frac {2c-1 }{ 2 }}\right ]$ that we have missed. Then, retrieve the inflection point by curve tessellation techniques [29].
Step 3. There exist three cases that need to be considered when determining the length of the next sampling interval.

1. If $S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)\times \widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)>0$, where $S_{i}^{(3)}$ is estimated by Formula (5), the second derivative of the physical world is moving away from 0; therefore, an inflection point is unlikely to exist in the next sampling interval $[t_{c},t_{c+1}]$. In order to save energy, the length of the next sampling interval is increased, i.e., the half length is set to $h_{2}=h+t^{\prime }$, where $t^{\prime }$ is a given constant denoting the step-size increment.
2. If $S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)\times \widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)<0$ and $\left |S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right | \times 2h>\left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right |$, the next inflection point is likely to be included in the next sampling interval; thus, we decrease the length of the sampling interval in order to catch the inflection point. Therefore, $h_{2}$ is set to $h_{2}=\text {max}\left \{\frac {1}{f_{\text {max}}},\alpha h\right \}$, where α is the decrease factor and 0<α≤1.
3. Otherwise, the ratio between $\left |S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right |$ and $\left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right |$ is considered: if $\left |\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right |\geq (2h+t^{\prime }) \times \left |S_{i}^{(3)}\left (t_{\frac {2c-1 }{ 2 }}\right)\right |$, then $h_{2}$ is increased by $t^{\prime }$, i.e., $h_{2}=h+t^{\prime }$.

In all other situations, $h_{2}$ keeps the current half length h.

Step 4. Return h ₂ and the inflection point.

The detailed algorithm is shown in Algorithm 3.

In each sampling interval, the complexity of the above algorithm is O(1), since it only needs to sample two sensory values and the first and second derivatives can be calculated in O(1). Therefore, the total complexity of Algorithm 1 is determined by the number of sampling intervals it includes.

In the best case, the half length of the sampling interval, h, increases by $t^{\prime }$ in every loop. Therefore, there exist $O\left (\sqrt {\frac {t_{\mathrm {f}}-t_{\mathrm {s}}}{t^{\prime }}}\right)$ sampling intervals, so that the minimum complexity of the above algorithm is $O\left (\sqrt {\frac {t_{\mathrm {f}}-t_{\mathrm {s}}}{t^{\prime }}}\right)$.
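The square-root bound follows because n intervals whose half lengths grow by $t^{\prime }$ each cover roughly $t^{\prime }n^{2}$ time units in total. A small counting sketch makes this concrete; the function name `best_case_intervals` and the numeric values are assumptions chosen for illustration.

```python
import math


def best_case_intervals(t_s, t_f, h0, t_step):
    """Count sampling intervals when the half length h grows by t_step after each one."""
    t, h, n = t_s, h0, 0
    while t + 2 * h <= t_f:
        t += 2 * h      # one full sampling interval has length 2h
        h += t_step
        n += 1
    return n


# Hypothetical setting: monitoring window of 1000 time units, h0 = 0.01, t' = 0.1.
n = best_case_intervals(0.0, 1000.0, 0.01, 0.1)
print(n, math.sqrt(1000.0 / 0.1))
```

The count stays close to $\sqrt {(t_{\mathrm {f}}-t_{\mathrm {s}})/t^{\prime }}$ as long as the initial half length is small compared with the monitoring window.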

In the worst case, the sampling frequency is always $f_{\text {max}}$, so that the maximum complexity of the above algorithm is $O(f_{\text {max}}(t_{\text {f}}-t_{\text {s}}))$. However, the worst case requires that a large number of critical points exist in $[t_{\text {s}},t_{\text {f}}]$, which rarely happens. Therefore, in practice, the complexity of our algorithm is much better than the worst case.

4.3 Discussion: the method for estimating missing critical points

If the product of the first derivatives at $t_{\frac {2c-3}{2}}$ and $t_{\frac {2c-1}{2}}$ is less than 0, i.e., $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right) \times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right) <0$, then there exists an extremum point in the range $\left (t_{\frac {2c-3}{2}},t_{\frac {2c-1}{2}}\right)$ which is not captured by our extremum point retrieving algorithm. Since such missing extremum points are also very important for the users' observation, a method for estimating these missing critical points is required. The traditional methods, such as curve tessellation techniques [29], are too complicated for this problem and not practical for WSNs. However, in some situations, the missing critical points can be obtained easily, as shown in the following Theorem 4.

Theorem 4.

When $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right) \times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right) <0$ or $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right) \times \widehat {S_{i}}^{(2)}\left (t_{\frac {2c-3}{2}}\right) <0$, $t_{c-1}$ is a δ-approximate critical point of $S_{i}$ if $\frac {\text {max}\left \{\left |t_{c-1}-t_{\frac {2c-3}{2}}\right |,\left |t_{c-1}-t_{\frac {2c-1}{2}}\right |\right \}}{t_{\frac {2c-3}{2}}}\leq \delta $.

Proof.

When $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right) \times \widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right) <0$ or $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right) \times \widehat {S_{i}}^{(2)}\left (t_{\frac {2c-3}{2}}\right) <0$, there must exist a critical point in $\left [t_{\frac {2c-3}{2}},t_{\frac {2c-1}{2}}\right ]$, since $S_{i}$ has a continuous fourth-order derivative. We denote this critical point by t, $t\in \left [t_{\frac {2c-3}{2}},t_{\frac {2c-1}{2}}\right ]$. According to the definitions of δ-approximate extremum and inflection points, $t_{c-1}$ is a δ-approximate critical point if $\frac {|t_{c-1}-t|}{t} \leq \delta $.

Since $t\in \left [t_{\frac {2c-3}{2}},t_{\frac {2c-1}{2}}\right ]$, we have $|t_{c-1}-t|\leq \max \left \{\left |t_{c-1}-t_{\frac {2c-3}{2}}\right |,\left |t_{c-1}-t_{\frac {2c-1}{2}}\right |\right \}$. By the condition of the theorem, $\frac {\max \left \{\left |t_{c-1}-t_{\frac {2c-3}{2}}\right |,\left |t_{c-1}-t_{\frac {2c-1}{2}}\right |\right \}}{t_{\frac {2c-3}{2}}}\leq \delta $, so

$$\frac{|t_{c-1}-t|}{t_{\frac{2c-3}{2}}}\leq \frac{\text{max}\left\{\left|t_{c-1}-t_{\frac{2c-3}{2}}\right|,\left|t_{c-1}-t_{\frac{2c-1}{2}}\right|\right\}}{t_{\frac{2c-3}{2}}}\leq \delta. $$

Moreover, since $t\geq t_{\frac {2c-3}{2}}$, we have $\frac {|t_{c-1}-t|}{t}\leq \frac {|t_{c-1}-t|}{t_{\frac {2c-3}{2}}}\leq \delta $. That is, $\frac {|t_{c-1}-t|}{t}\leq \delta $. Therefore, $t_{c-1}$ is a δ-approximate critical point. □
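The sufficient condition of Theorem 4 can be checked with a handful of arithmetic operations. The following is a minimal sketch, not the authors' implementation; the function and parameter names are hypothetical, and it assumes that all times are positive so that the relative error is well defined.

```python
def is_delta_approx_critical(t_prev_half, t_mid, t_next_half, d_prev, d_next, delta):
    """Check the sufficient condition of Theorem 4 for t_mid = t_{c-1}.

    t_prev_half, t_next_half: the times t_{(2c-3)/2} and t_{(2c-1)/2}.
    d_prev, d_next: derivative estimates (both first order or both second
    order) at those two times.
    """
    if t_prev_half <= 0:
        raise ValueError("times must be positive for the relative error")
    # A sign change of the derivative estimate brackets a critical point.
    if d_prev * d_next >= 0:
        return False
    # Largest possible distance between t_mid and the bracketed point.
    worst_gap = max(abs(t_mid - t_prev_half), abs(t_mid - t_next_half))
    return worst_gap / t_prev_half <= delta
```

For instance, with half-step times 99.5 and 100.5 around $t_{c-1}=100$, the worst-case relative error is 0.5/99.5, so the check succeeds for δ=0.01 and fails for δ=0.004.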

According to Theorem 4, the missing extremum points can be obtained by the following three steps.

Step 1. Judge whether $t_{c-1}$ is a δ-approximate extremum point by Theorem 4.
Step 2. Use the values $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-3}{2}}\right)$, $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-3}{2}}\right)$, $\widehat {S_{i}}^{(1)}\left (t_{\frac {2c-1}{2}}\right)$ and $\widehat {S_{i}}^{(2)}\left (t_{\frac {2c-1}{2}}\right)$ to locate the interval in which the approximate extremum point lies.
Step 3. Use the Lagrange interpolation polynomial constructed before to estimate the sensory value at the estimated extremum point.

The detailed algorithm for estimating the missing extremum point is shown in Algorithm 4. As shown in the algorithm, we use a linear function to estimate the approximate extremum point if the condition in Theorem 4 is not satisfied; more complicated situations will be considered in our future work. For the missing inflection points, a similar method can be adopted to retrieve them approximately.

The complexity of Algorithm 4 is O(1), since the algorithm only needs a constant number of comparisons and additions. Therefore, the method for estimating these missing critical points is efficient.
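Algorithm 4 itself is not reproduced here. As an illustration of the linear-function fallback, the sketch below assumes that the estimated first derivative is interpolated by a straight line between the two half-step times and takes the zero of that line as the extremum estimate. The names are hypothetical, and `value_at` stands in for the Lagrange interpolation polynomial.

```python
def estimate_missing_extremum(t_a, t_b, d_a, d_b, value_at):
    """Estimate an extremum bracketed by (t_a, t_b).

    d_a, d_b: first-derivative estimates at t_a and t_b.
    value_at: hypothetical callable returning the interpolated sensory value.
    Returns (time, value), or None when the derivative does not change sign.
    """
    if d_a * d_b >= 0:
        return None  # no sign change, so no extremum is bracketed
    # Zero of the straight line through (t_a, d_a) and (t_b, d_b).
    t_star = t_a - d_a * (t_b - t_a) / (d_b - d_a)
    return t_star, value_at(t_star)
```

The function uses a fixed number of arithmetic operations plus one evaluation of `value_at`, which is consistent with the constant cost stated above as long as the interpolation polynomial has a fixed degree.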

4.4 Discussion: the in-network cooperation and communication algorithms

The algorithms above describe the sensing strategy of a single sensor, which reduces the amount of sensory data. When in-network cooperation and communication are considered, the sensory data can be reduced further. Since sensory data are spatially correlated, not all sensors need to transmit their critical points. That is, when two sensors are close enough, the critical points of their sensory data may appear at the same time. In this situation, not all of the critical points need to be transmitted, since they may represent the same area. We design an algorithm for in-network cooperation and communication, which further reduces the transmitted sensory data and the communication energy.

First, we divide the sensors into clusters according to their positions. The monitored area is divided into squares of side length D, where D is given by the users according to the specific application. Each cluster has a cluster head, and all the sensor nodes in the cluster transmit their critical points to the cluster head. The cluster heads form a spanning tree rooted at the base station, and they transmit the critical points along this tree. Second, each time a sensor samples a critical point, it transmits the critical point and its position to the cluster head. The cluster head then determines, according to the positions, whether the critical points it has received can be reduced. When the distance between two critical points is less than L, only one of them is transmitted, where L is a parameter given by the users. The detailed algorithm of in-network cooperation and communication is shown in Algorithm 5.

This algorithm exploits the spatial correlation of the sensors, which can further reduce the transmission energy. Only simple in-network cooperation and communication are considered in the algorithm, since this paper mainly focuses on acquiring critical points by a single sensor. In future work, we will study in-network cooperation to further reduce the energy consumption.
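The clustering and filtering described in this subsection can be sketched as follows. This is only an illustration under stated assumptions, not Algorithm 5: the function names are hypothetical, the points passed to the filter are assumed to have been reported for the same time window, and a simple greedy rule decides which of two nearby points is kept.

```python
import math

def cluster_id(x, y, side):
    """Index of the square cell of side length `side` (D in the text)
    that contains the position (x, y)."""
    return (int(x // side), int(y // side))

def suppress_nearby(points, min_dist):
    """Greedy filter at a cluster head: keep a critical point only if it is
    at least `min_dist` (L in the text) away from every point already kept.

    points: list of (x, y, time, value) tuples in arrival order.
    """
    kept = []
    for p in points:
        if all(math.hypot(p[0] - q[0], p[1] - q[1]) >= min_dist for q in kept):
            kept.append(p)
    return kept
```

With this rule the first report from a neighbourhood is forwarded and later reports closer than L are dropped. Other tie-breaking rules, such as keeping the point with the larger value, are equally compatible with the description above.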

5 Experiment result

5.1 Experiment setting

We use a simulated network with 200 sensor nodes to evaluate the performance of our sensory data acquisition algorithms. The network is deployed in a square region of 200 m × 200 m. The transmission range of each sensor node is set to 25 m.

The sensory data of the network come from a real sensor system, in which TelosB motes (http://www.willow.co.uk/TelosB_Datasheet.pdf) acquire indoor temperature, humidity, and light intensity continuously at a frequency of 1 Hz. The light intensity is used in the experiments.

5.2 The performance of the algorithm

The first group of experiments evaluates the recall rate and precision of our algorithm for retrieving δ-approximate extremum and inflection points. The recall rate is the fraction of the exact δ-approximate critical points that are returned. The precision is the fraction of the returned results that are exact δ-approximate critical points. Both are important measures of the accuracy of the proposed algorithm. In the following experiments, the recall rate and the precision of our algorithm were computed while δ increases from 0.005 to 0.025, with α=0.5 and t′=0.5. The experimental results are presented in Figs. 1 and 2.
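The two measures can be computed as in the sketch below. The matching rule is an assumption made for illustration: a returned time r is counted as matching an exact critical time t when |r − t|/t ≤ δ, following the relative-error form used in Theorem 4. The function name is hypothetical and all times are assumed to be positive.

```python
def recall_precision(exact_times, returned_times, delta):
    """Recall and precision under the matching rule |r - t| / t <= delta."""
    def matches(r, t):
        return abs(r - t) / t <= delta

    # Exact critical points that have at least one matching returned point.
    hit_exact = sum(
        any(matches(r, t) for r in returned_times) for t in exact_times
    )
    # Returned points that match at least one exact critical point.
    hit_returned = sum(
        any(matches(r, t) for t in exact_times) for r in returned_times
    )
    recall = hit_exact / len(exact_times) if exact_times else 1.0
    precision = hit_returned / len(returned_times) if returned_times else 1.0
    return recall, precision
```

For exact times 100, 200, and 300 and returned times 100.5 and 250 with δ=0.01, only the pair (100.5, 100) matches, so the recall rate is 1/3 and the precision is 1/2.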

From Fig. 1 a, b, we can see that the recall rate and precision of δ-approximate extremum points increase with the growth of δ, and both are close to 100 % even when δ is quite small. For example, when δ=0.024, the precision of δ-approximate extremum points is 100 % and the recall rate is close to 90 %. The results in Fig. 2 a, b show that the recall rate and precision of δ-approximate inflection points are also close to 100 % even when δ is relatively small. The recall rate and precision of δ-approximate critical points increase with δ because δ is the relative error bound. When the bound is looser, the algorithm can capture more approximate critical points, which leads to a higher recall rate and precision. Furthermore, when the relative error bound is not too large, the algorithm can still capture almost all critical points.

In summary, our algorithms can achieve high accuracy in practice.

The second group of experiments investigates the impact of α on the performance of our algorithm. In the following experiments, the recall rate, the precision, and the errors of the first and second derivatives generated by our algorithm are computed while α increases from 0 to 1, with δ=0.015 and t′=0.5. The experimental results are presented in Figs. 3 and 4.

Figure 3 a, b shows the recall rate and precision of δ-approximate extremum and inflection points. The recall rate and precision of δ-approximate inflection points decrease with the growth of α, because a larger α omits more inflection points. Nevertheless, the recall rate and precision of δ-approximate inflection points remain very high even when α is not very small. For example, the recall rate of δ-approximate inflection points reaches 96 % when α=0.1. Meanwhile, the recall rate and precision of δ-approximate extremum points remain stable as α varies, and they are high enough to be acceptable in practice.

Figure 4 a, b presents the errors of the first and second derivatives generated by our algorithm. The maximum, average, and 0.9-quantile errors of the first and second derivatives are quite small for different α, which indicates the high accuracy of our algorithm and also explains why its recall rate and precision are quite high. For example, the average error of the estimated first derivative is less than 10⁻³, and the average and 0.9-quantile errors of the estimated second derivative are about 10⁻⁴.

The third group of experiments investigates the impact of t′ on the performance of our algorithm. In the experiments, the recall rate, the precision, and the errors of the first and second derivatives generated by our algorithm are calculated while t′ increases from 0.1 to 1, with α=0.5 and δ=0.015. The experimental results are presented in Figs. 5 and 6.

Figure 5 a, b shows that the recall rate of δ-approximate inflection points is around 90 % and that the precision of δ-approximate inflection points decreases with the growth of t′, since more δ-approximate inflection points are omitted when t′ is larger. Although the recall rate and the precision of δ-approximate extremum points are lower than those of δ-approximate inflection points, we still capture a large portion of the extremum points.

Figure 6 a, b shows the maximum, average, and 0.9-quantile errors of the first and second derivatives generated by our algorithm while t′ increases. The figures show that all these errors are extremely small, which also verifies that our algorithm can achieve high precision in identifying the critical points. For example, the maximum error of the estimated first derivative is less than about 2×10⁻³. For the estimated second derivative, most of the errors are less than 0.5×10⁻⁴.

6 Related works

Currently, there exist few published works that consider adaptive sampling in sensor networks. Moreover, none of them supports retrieving the critical points from the monitored physical world.

Some adaptive sampling algorithms are proposed for particular applications. For example, the algorithm in [30] is designed for target tracking. Each sensor uses the sensory values sensed by its neighbours and itself to predict the target position and adjusts the sampling frequency adaptively. The work in [31] introduced an energy-efficient algorithm to adjust the sampling frequency in abnormal event detection applications. It applies the Fourier transform to predict the events and to adjust the sampling frequency automatically. Since these algorithms are designed for particular applications, their applicability is limited.

Most works on adaptive sampling apply prediction models to estimate sensory values instead of sampling them. Jain and Chang [32] use a Kalman filter-based prediction model to predict sensory values; if the estimates fall outside the acceptable range, the sampling frequency is adjusted. However, the prediction ability of the Kalman filter is limited, and the estimation error may be large. The prediction method in [33] is the Box-Jenkins approach. The main idea of that work is to skip some of the equal-frequency samplings and use forecast values instead, which adjusts the sampling frequency adaptively. However, since the method is based on equal-frequency sampling, its accuracy is even worse than that of the EFS method. The work in [34] proposed a heuristic adaptive sampling algorithm for a glacial sensor network. Each sensor locally adjusts its sensing frequency based on a linear regression forecasting model. This method reduces the energy cost of acquiring sensory data since the forecasting model is sufficiently utilized. However, the forecasting ability of the linear regression model is limited, and it does not consider the problem of retrieving the critical points either.

Some published works do focus on retrieving critical points in wireless sensor networks. However, few of them address extremum and inflection points. The most common critical points they consider are top-k values.

The work in [20] proposed a novel approach for top-k queries. Its basic idea is to install a filter at each node, which suppresses unnecessary sensor updates. Silberstein et al. [19] studied the optimization of top-k queries in wireless sensor networks. The authors proposed to use samples of past sensor readings, which can reduce the energy consumption significantly. The work in [18] focuses on location-aware peak value queries in sensor networks. It considers the locations of the top-k values, which are not considered in traditional top-k queries. The authors defined the LAP-(D,k) query, which finds the top-k values such that the distance between any two of them is larger than D. However, these papers only consider peak values. In fact, besides peak values, the extremum and inflection points are also critical points.

7 Conclusions

This paper studies a critical point aware data acquisition algorithm to obtain δ-approximate extremum and inflection points from the physical world. We first provide the formal definition of δ-approximate extremum and inflection points. Then, a data acquisition algorithm is proposed based on numerical analysis and Lagrange interpolation. The algorithm adjusts the sampling frequency of each sensor node adaptively according to the variation of the physical world. The correctness of the algorithm is proved and its complexity is analyzed in detail. Finally, extensive simulations are carried out, which show that our algorithm can achieve high accuracy in retrieving δ-approximate extremum and inflection points from the physical world.

8 Appendix

The Lagrange interpolation polynomials are constructed as follows.

The construction of $L_{1}(t)$: first, let $l_{1k}(t)$ be a cubic polynomial in $t$ for $0\leq k\leq 3$, where $\{l_{1k}(t)\mid 0\leq k\leq 3\}$ satisfies

$$l_{10}(t)=\frac{(t-t_{c-1})\left(t-t_{\frac{2c-1}{2}}\right)(t-t_{c})}{\left(t_{\frac{2c-3}{2}}-t_{c-1}\right)\left(t_{\frac{2c-3}{2}}-t_{\frac{2c-1}{2}}\right)\left(t_{\frac{2c-3}{2}}-t_{c}\right)} $$

$$l_{11}(t)=\frac{\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{\frac{2c-1}{2}}\right)(t-t_{c})}{\left(t_{c-1}-t_{\frac{2c-3}{2}}\right)\left(t_{c-1}-t_{\frac{2c-1}{2}}\right)\left(t_{c-1}-t_{c}\right)} $$

$$l_{12}(t)=\frac{\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{c-1}\right)(t-t_{c})}{\left(t_{\frac{2c-1}{2}}-t_{\frac{2c-3}{2}}\right)\left(t_{\frac{2c-1}{2}}-t_{c-1}\right)\left(t_{\frac{2c-1}{2}}-t_{c}\right)} $$

$$l_{13}(t)=\frac{\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{c-1}\right)\left(t-t_{\frac{2c-1}{2}}\right)}{\left(t_{c}-t_{\frac{2c-3}{2}}\right)\left(t_{c}-t_{c-1}\right)\left(t_{c}-t_{\frac{2c-1}{2}}\right)}. $$

Therefore, $L_{1}(t)=S_{i}\left (t_{\frac {2c-3}{2}}\right)l_{10}(t)+S_{i}(t_{c-1})l_{11}(t)+S_{i} \left (t_{\frac {2c-1}{2}}\right)l_{12}(t)+S_{i}(t_{c})l_{13}(t)$.

Similarly, $L_{2}(t)$ is a fourth-order Lagrange interpolation polynomial and can be calculated by $L_{2}(t)=S_{i}(t_{c-2})l_{20}(t)+S_{i}\left (t_{\frac {2c-3}{2}}\right)l_{21}(t)+S_{i}(t_{c-1})l_{22}(t)+S_{i}\left (t_{\frac {2c-1}{2}}\right)l_{23}(t)+S_{i}(t_{c})l_{24}(t)$, where

$$l_{20}(t)=\frac{\left(t-t_{\frac{2c-3}{2}}\right)(t-t_{c-1})\left(t-t_{\frac{2c-1}{2}}\right)\left(t-t_{c}\right)}{\prod_{j=1}^{4}\left(t_{c-2}-t_{c-2+j/2}\right)} $$

$$l_{21}(t)=\frac{(t-t_{c-2})(t-t_{c-1})\left(t-t_{\frac{2c-1}{2}}\right)(t-t_{c})}{\left(t_{\frac{2c-3}{2}}-t_{c-2}\right)\prod_{j=2}^{4}\left(t_{\frac{2c-3}{2}}-t_{c-2+j/2}\right)} $$

$$l_{22}(t)=\frac{(t-t_{c-2})\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{\frac{2c-1}{2}}\right)(t-t_{c})}{\prod_{j=0}^{1}\left(t_{c-1}-t_{c-2+j/2}\right)\prod_{j=3}^{4}(t_{c-1}-t_{c-2+j/2})} $$

$$l_{23}(t)=\frac{(t-t_{c-2})\left(t-t_{\frac{2c-3}{2}}\right)(t-t_{c-1})(t-t_{c})}{\left(t_{\frac{2c-1}{2}}-t_{c}\right)\prod_{j=0}^{2}\left(t_{\frac{2c-1}{2}}-t_{c-2+j/2}\right)} $$

$$l_{24}(t)=\frac{(t-t_{c-2})\left(t-t_{\frac{2c-3}{2}}\right)\left(t-t_{c-1}\right)\left(t-t_{\frac{2c-1}{2}}\right)}{(t_{c}-t_{c-2})\left(t_{c}-t_{\frac{2c-3}{2}}\right)(t_{c}-t_{c-1})\left(t_{c}-t_{\frac{2c-1}{2}}\right)}. $$
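Both $L_{1}(t)$ and $L_{2}(t)$ are instances of the standard Lagrange form, so a single routine can evaluate either one from its list of sampling times and sensory values. The sketch below is a minimal illustration with a hypothetical function name. It assumes the sampling times are pairwise distinct.

```python
def lagrange_eval(nodes, values, t):
    """Evaluate the Lagrange interpolation polynomial through the points
    (nodes[k], values[k]) at time t."""
    if len(nodes) != len(values):
        raise ValueError("nodes and values must have the same length")
    total = 0.0
    for k, (t_k, s_k) in enumerate(zip(nodes, values)):
        # Basis polynomial: equals 1 at t_k and 0 at every other node.
        basis = 1.0
        for j, t_j in enumerate(nodes):
            if j != k:
                basis *= (t - t_j) / (t_k - t_j)
        total += s_k * basis
    return total
```

Passing the four times $t_{\frac{2c-3}{2}}, t_{c-1}, t_{\frac{2c-1}{2}}, t_{c}$ gives $L_{1}(t)$, and adding $t_{c-2}$ as a fifth node gives $L_{2}(t)$. With four nodes the routine reproduces any cubic exactly up to rounding, and with five nodes any quartic.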

References

[1] Darpa Sensit Program. http://comlab.ecs.syr.edu/workshop2002/files/RichardButler.pdf.
[2] S Kumar, D Shepherd, in Proc. 4th Int. Conf. on Information Fusion. SensIT: Sensor information technology for the warfighter (Elsevier, 2001), pp. 1–7.
[3] Z Cai, Z-Z Chen, G Lin, A 3.4713-approximation algorithm for the capacitated multicast tree routing problem. Theor. Comput. Sci. 410(52), 5415–5424 (2008).
[4] Z Cai, G Lin, G Xue, in Proceedings of 11th Annual International Conference on Computing and Combinatorics 2005 (COCOON). Improved approximation algorithms for the capacitated multicast routing problem (Springer, Kunming, China, 2005), pp. 136–145.
[5] M Li, Y Liu, L Chen, Nonthreshold-based event detection for 3d environment monitoring in sensor networks. IEEE Trans. Knowl. Data Eng. 20(12), 1699–1711 (2008).
[6] Greenorbs. http://greenorbs.org/.
[7] K Liu, M Li, Y Liu, X Li, M Li, H Ma, in Proceedings of the 18th Annual IEEE International Conference on Network Protocols, ICNP 2010, Kyoto, Japan, 5–8 October, 2010. Exploring the hidden connectivity in urban vehicular networks (IEEE Computer Society, 2010), pp. 243–252.
[8] B Barbagli, L Bencini, I Magrini, G Manes, A Manes, in Proceedings of the 7th International Wireless Communications and Mobile Computing Conference, IWCMC 2011, Istanbul, Turkey, 4–8 July, 2011. A real-time traffic monitoring based on wireless sensor network technologies (IEEE, 2011), pp. 820–825.
[9] Structural Health Monitoring of the Golden Gate Bridge. http://www.cs.berkeley.edu/?binetude/ggb/.
[10] S Kim, S Pakzad, DE Culler, J Demmel, G Fenves, S Glaser, M Turon, in Proceedings of the 6th International Conference on Information Processing in Sensor Networks, IPSN 2007, Cambridge, Massachusetts, USA, April 25–27, 2007. Health monitoring of civil infrastructures using wireless sensor networks (ACM, 2007), pp. 254–263.
[11] S Kim, S Pakzad, DE Culler, J Demmel, G Fenves, S Glaser, M Turon, in Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, SenSys 2006, Boulder, Colorado, USA, October 31 - November 3, 2006. Wireless sensor networks for structural health monitoring (Springer, 2006), pp. 427–428.
[12] JT Overpeck, GA Meehl, S Bony, DR Easterling, Climate data challenges in the 21st century. Science (Washington). 331(6018), 700–702 (2011).
[13] S Cheng, Z Cai, J Li, Curve query processing in wireless sensor networks. IEEE Trans. Veh. Technol. 64(11), 5198–5209 (2015).
[14] J Li, S Cheng, (ε, δ)-approximate aggregation algorithms in dynamic sensor networks. IEEE Trans. Parallel Distrib. Syst. 23(3), 385–396 (2012).
[15] S Cheng, J Li, in 29th IEEE International Conference on Distributed Computing Systems (ICDCS 2009), 22–26 June 2009, Montreal, Québec, Canada. Sampling based (ε, δ)-approximate aggregation algorithm in sensor networks (IEEE Computer Society, 2009), pp. 273–280.
[16] S Cheng, J Li, Q Ren, L Yu, in INFOCOM 2010. 29th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 15–19 March 2010, San Diego, CA, USA. Bernoulli sampling based (ε, δ)-approximate aggregation in large-scale sensor networks (IEEE, 2010), pp. 1181–1189.
[17] Z He, Z Cai, S Cheng, X Wang, Approximate aggregation for tracking quantiles in wireless sensor networks. Theor. Comput. Sci. 607, 381–390 (2015).
[18] S Cheng, J Li, L Yu, in Proceedings of the IEEE INFOCOM 2012, Orlando, FL, USA, March 25–30, 2012. Location aware peak value queries in sensor networks (IEEE, 2012), pp. 486–494.
[19] A Silberstein, R Braynard, CS Ellis, K Munagala, J Yang, in Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3–8 April 2006, Atlanta, GA, USA. A sampling-based approach to optimizing top-k queries in sensor networks (IEEE Computer Society, 2006), p. 68.
[20] M Wu, J Xu, X Tang, W Lee, Top-k monitoring in wireless sensor networks. IEEE Trans. Knowl. Data Eng. 19(7), 962–976 (2007).
[21] I Su, Y Chung, C Lee, Y Lin, Efficient skyline query processing in wireless sensor networks. J. Parallel Distrib. Comput. 70(6), 680–698 (2010).
[22] Z Wu, M Wang, L Yuan, H Jiang, in Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics (BMEI). Optimized routing structure based skyline query algorithm in wireless sensor network (IEEE Computer Society, Tianjin, China, 2009), pp. 1–5.
[23] J Cao, LE Li, A Chen, T Bu, in Proceedings of the 1st ACM Workshop on Mobile Internet Through Cellular Networks. Incremental tracking of multiple quantiles for network monitoring in cellular networks (ACM, New York, NY, USA, 2009), pp. 7–12.
[24] Z Huang, L Wang, K Yi, Y Liu, in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12–16, 2011. Sampling based algorithms for quantile computation in sensor networks (ACM, 2011), pp. 745–756.
[25] J Li, S Cheng, H Gao, Z Cai, Approximate physical world reconstruction algorithms in sensor networks. IEEE Trans. Parallel Distrib. Syst. 25(12), 3099–3110 (2014).
[26] S Cheng, J Li, Z Cai, in Proceedings of the IEEE INFOCOM 2013, Turin, Italy, April 14–19, 2013. O(ε)-approximation to physical world by sensor networks (2013), pp. 3084–3092.
[27] C-E Fröberg, Introduction to Numerical Analysis (Addison-Wesley, Reading, Massachusetts, USA, 1969).
[28] MJD Powell, Approximation Theory and Methods (Cambridge University Press, England, 1981).
[29] T Lindgren, J Sanchez, J Hall, in Graphics Gems III. Curve tessellation criteria through sampling (ACM, San Diego, CA, USA, 1992), pp. 262–265.
[30] M Rahimi, R Safabakhsh, in Proceedings of the IEEE International Conference on Wireless Communications, Networking and Information Security, WCNIS 2010, 25–27 June 2010, Beijing, China. Adaptation of sampling in target tracking sensor networks (IEEE, 2010), pp. 301–305.
[31] C Alippi, G Anastasi, MD Francesco, M Roveri, An adaptive sampling algorithm for effective energy management in wireless sensor networks with energy-hungry sensors. IEEE Trans. Instrum. Meas. 59(2), 335–344 (2010).
[32] A Jain, EY Chang, in Proceedings of the 1st Workshop on Data Management for Sensor Networks, in Conjunction with VLDB, DMSN 2004, Toronto, Canada, August 30, 2004. Adaptive sampling for sensor networks (ACM, 2004), pp. 10–16.
[33] YW Law, S Chatterjea, J Jin, T Hanselmann, M Palaniswami, in Proceedings of the International Conference on Wireless Communications and Mobile Computing: Connecting the World Wirelessly, IWCMC 2009, Leipzig, Germany, June 21–24, 2009. Energy-efficient data acquisition by adaptive sampling for wireless sensor networks (ACM, 2009), pp. 1146–1151.
[34] P Padhy, RK Dash, K Martinez, NR Jennings, in 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), Hakodate, Japan, May 8–12, 2006. A utility-based sensing and communication model for a glacial sensor network (ACM, 2006), pp. 1353–1360.

Acknowledgements

This work was supported in part by the National Basic Research Program of China (973 Program) under Grant No. 2012CB316200, the National Natural Science Foundation of China (NSFC) under Grant No. 61190115, 61370217, the Fundamental Research Funds for the Central Universities under grant No. HIT.KISTP201415, the National Science Foundation (NSF) under Grants No. CNS-1152001, CNS-1252292, the Research Fund for the Doctoral Program of Higher Education of China under grant No. 20132302120045, and the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under grant No. HIT.NSRIF. 2014070.

Author information

Authors and Affiliations

School of Computer Science and Tech, Harbin Institute of Technology, Harbin, 150001, China
Tongxin Zhu, Siyao Cheng & Jianzhong Li
Department of Computer Science, Georgia State University, Atlanta, 30303, GA, USA
Zhipeng Cai

Corresponding author

Correspondence to Siyao Cheng.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Zhu, T., Cheng, S., Cai, Z. et al. Critical data points retrieving method for big sensory data in wireless sensor networks. J Wireless Com Network 2016, 18 (2016). https://doi.org/10.1186/s13638-015-0505-0

Received: 24 September 2015
Accepted: 20 December 2015
Published: 15 January 2016
DOI: https://doi.org/10.1186/s13638-015-0505-0
