Adaptive sparse random projections for wireless sensor networks with energy harvesting constraints

In a large-scale energy-harvesting wireless sensor network (EH-WSN) measuring compressible data, sparse random projections can approximate the data well, and the sparsity of the random projections affects both the mean square error (MSE) and the system delay. In this paper, we propose an adaptive algorithm for sparse random projections that achieves a better tradeoff between the MSE and the system delay. Under the energy-harvesting constraints, the sparsity is adapted to channel conditions via an optimal power allocation algorithm, and the structure of the optimal power allocation solution is analyzed for a special case. The performance is illustrated by numerical simulations.


Introduction
Energy supply is a major design constraint for conventional wireless sensor networks (WSNs), whose lifetime is limited by the total energy available in the batteries. Some specific sensors in WSNs may consume more energy than the radio during a long acquisition time [1]. Replacing the batteries periodically may prolong the lifetime, but it is not a viable option when the replacement is too inconvenient, too dangerous, or even impossible because the sensors are deployed in harsh conditions, e.g., in toxic environments or inside human bodies. Therefore, harvesting energy from the environment is a promising approach to cope with limited battery supplies and the increasing energy demand [2]. The energy that can be harvested includes solar, piezoelectric, and thermal energy, and it is theoretically unlimited. Besides, background radio-frequency (RF) signals radiated by ambient transmitters can also be a viable new source for wireless power transfer (WPT) [3,4]. Unlike conventional WSNs, which are subject to a power constraint or a sum energy constraint, each sensor with energy-harvesting capabilities is constrained, in every time slot, to use at most the amount of stored energy currently available, although more energy may become available in future slots. Therefore, a causality constraint is imposed on the use of the harvested energy. Current research on energy harvesting has mostly focused on wireless communication systems. Gatzianas et al. [5] considered a cross-layer resource allocation problem to maximize the total system utility, and Ho and Zhang [6] studied throughput maximization with causal side information and with full side information for wireless communication systems. Ng et al. [3] studied the design of a resource allocation algorithm that minimizes the total transmit power for a multiuser multiple-input single-output downlink system in which the legitimate receivers are able to harvest energy from RF signals.
Energy management policies were studied for energy-harvesting wireless sensor networks (EH-WSNs) in [7], where sensor nodes have energy-harvesting capabilities, with the aim of maximizing the system throughput and reducing the system delay.
For WSNs, however, accurately recovering signals is also important. Recent results in compressive sensing (CS) can provide an efficient signal reconstruction method for WSNs. Data collected from wireless sensors are typically correlated and thus compressible in an appropriate transform domain (e.g., the Fourier or wavelet domain) [8]. The main idea of CS is that n data values can be well approximated using only k << n transform coefficients if the data are compressible [8][9][10][11][12][13]. In particular, Wang et al. [13] proposed a distributed compressive sensing scheme for WSNs in order to reduce the computational complexity and the communication cost. The scheme uses an m × n sparse random matrix whose entries are nonzero with probability g, so that on average there are ng nonzeros per row. The resulting data-approximation error rate is comparable to that of the optimal k-term approximation if the energy of the signal is not concentrated in a few elements. The sparsity factor g of the random projections therefore affects the accuracy of signal reconstruction. Usually, the sparsity factor g is statistically determined according to the amount of harvested energy and is homogeneous for all sensors [14,15]. Rana et al. [14] only considered AWGN channels. Yang et al. [15] took fading channels into account and studied sufficient conditions that guarantee a reliable and computationally efficient data approximation for sparse random projections. It is not surprising that signal recovery based on such sparse random projections is suboptimal, since the sparsity factor g is fixed over all transmission slots and thus cannot reflect the effect of channel conditions. On the other hand, the system delay m, which is one of the key quantities used to characterize the performance of random projection-based CS schemes, is expected to be as small as possible.
Building on [14] and [15], we observe that the lower bound of the system delay is also related to the sparsity of the random projections: the larger g is, the shorter the achievable delay. Note that there is often a tradeoff between the system delay and the data approximation [8,16]. Therefore, in this paper, we consider fading channels and the energy-harvesting constraints, and we study how to adapt the sparsity of random projections according to full channel information, in order to improve the performance of signal recovery and reduce the system delay as well. To the best of our knowledge, very little work, such as [15], has touched upon this topic, and it provides only a rough discussion. The main contributions are as follows:
• Considering wireless fading channels, we verify that the random projection matrix preserves the inner product between two projected vectors in expectation, and we then provide a lower bound on the system delay for achieving an acceptable data approximation error probability.
• We give a new definition of the sparsity of random projections and formulate the optimal sparsity problem, which is converted into an optimal power allocation problem for maximizing the system throughput. Unlike the conventional energy allocation problem, a closed-form solution may not be available because of the battery dynamics and channel dynamics. Therefore, we study the special case in which the battery capacity is unbounded in order to find the structure of the optimal solution. Specifically, in this case the problem is converted into a convex optimization problem, and the closed-form solution is obtained in terms of Lagrangian multipliers.
The rest of the paper is organized as follows. Section 2 gives the system model and reviews previously known results on sparse random projections. Section 3 redefines the sparsity and formulates the optimal sparsity problem for EH-WSNs. Section 4 considers a specific case and addresses the structure of the optimal solution. Section 5 provides the simulation results. Finally, Section 6 concludes the paper.

System model
We consider a wireless sensor network of $n$ sensor nodes, each of which measures a data value $s_j \in \mathbb{C}$ and is capable of energy harvesting. We assume a Rayleigh-fading channel whose coefficients $h_{ij}$, where $1 \le i \le m$ denotes the slot index and $1 \le j \le n$ denotes the sensor index, are independent and identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance. We further assume that the channel remains constant within each slot. Sensor $j$ first multiplies its data $s_j$ by a random projection coefficient $\varphi_{ij} \in \mathbb{R}$ and then transmits in the $i$th time slot. The receiver obtains
$$y_i = \sum_{j=1}^{n} h_{ij} \varphi_{ij} s_j + e_i, \qquad (1)$$
where $e_i$ is white Gaussian noise with zero mean and variance $\sigma^2$. After $m$ time slots, the received vector is given as
$$y = Z s + e, \qquad (2)$$
where $H = [h_{ij}] \in \mathbb{C}^{m \times n}$, $\Phi = [\varphi_{ij}] \in \mathbb{R}^{m \times n}$, $Z = H \odot \Phi$, and the operation $\odot$ is the element-wise product of two matrices. The corresponding real-valued equation of (2) is where $\mathrm{Re}\{A\}$ and $\mathrm{Im}\{A\}$ denote the real part and the imaginary part of matrix $A$, respectively, and

Compressible data and sparse random projections
Suppose the aggregate sensor data $s \in \mathbb{C}^n$ from $n$ nodes is compressible, so that we can model it as sparse with respect to a fixed orthonormal basis $\{\psi_j \in \mathbb{C}^n : j = 1, \cdots, n\}$ [11], i.e., Generally, for a compressible signal $s$, the largest $k$ transform coefficients capture most of the signal information, and $k$ is usually referred to as the sparsity of $s$. The best k-term approximation method recovers only the $k$ largest transform coefficients and sets the remaining ones to zero [8], and it achieves near-optimal error probabilities. However, the random projection matrix used in [8] is dense, which results in a high computational complexity. Therefore, Wang et al. [13] proposed sparse random projections, which reduce the computational complexity while guaranteeing that the error probability is comparable to that achieved by dense random projections. More concretely, the matrix of sparse random projections $\Phi \in \mathbb{R}^{m \times n}$ contains i.i.d. entries where $g$ is a factor that gives the probability of a measurement and controls the degree of sparsity of the random projections; e.g., if $g = 1$, the random matrix has no sparsity, and if $g = \log n / n$, the expected number of nonzeros in each row is $\log n$. We can easily verify that the entries within each row are four-wise independent, while the entries across different rows are fully independent, i.e., Therefore, each random projection vector can be pseudorandomly generated and stored in a small space.

Corollary 1. [13] Consider a data vector $u \in \mathbb{R}^n$ which satisfies the condition In addition, let $V$ be any set of $n$ vectors $\{v_1, \cdots, v_n\} \subset \mathbb{R}^n$. Suppose a sparse random matrix $\Phi \in \mathbb{R}^{m \times n}$ satisfies the conditions as with probability at least $1 - n^{-\gamma}$, the random projections

Corollary 1 states that sparse random projections of the data vector and any set of $n$ vectors can produce estimates of their inner products to within a small error.
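To make the role of the sparsity factor g concrete, the following minimal sketch samples a sparse random projection matrix and checks two of its properties empirically. The entry law used here (values ±1/√g with probability g/2 each, and 0 otherwise) is an assumption taken from the usual sparse random projection construction, since the entry distribution is not spelled out above; the function name and the sizes n and m are hypothetical.

```python
import numpy as np

def sparse_projection_matrix(m, n, g, rng):
    """Sample an m x n sparse random projection matrix.

    Assumed entry law (not spelled out in the text above):
    +1/sqrt(g) or -1/sqrt(g) with probability g/2 each, and 0 otherwise.
    """
    values = rng.choice([-1.0, 0.0, 1.0], size=(m, n), p=[g / 2, 1 - g, g / 2])
    return values / np.sqrt(g)

rng = np.random.default_rng(0)
n, m = 1000, 400                       # hypothetical network size and slot count
g = np.log(n) / n                      # sparsity factor from the example in the text
phi = sparse_projection_matrix(m, n, g, rng)

# Average number of nonzeros per row; the text predicts about log n.
avg_nonzeros_per_row = np.count_nonzero(phi, axis=1).mean()
# Empirical second moment; close to 1 under the assumed 1/sqrt(g) scaling.
second_moment = np.mean(phi ** 2)
print(avg_nonzeros_per_row, np.log(n), second_moment)
```

Under this assumed scaling, each row touches only about log n sensors while the projections keep unit second moment, which is what allows inner products to be estimated without rescaling.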
Thus, sparse random projections can produce accurate estimates for the transform coefficients of the data, which are inner products between the data and the set of orthonormal bases. The sufficient condition (8) is to bound the peak-to-total energy of the data. This guarantees that the signal energy is not concentrated in a small number of components. If the data is compressible in the discrete Fourier transform with compressibility parameter θ, then [13]

Adaptive sparse random projections

Sparse random projections with channel fading
However, Wang et al. [13] only considered the AWGN channel. Under channel fading, we ask whether the inner products are still preserved by sparse random projections. Redefine the sparse random projection matrix as follows, where $g_{ij}$ gives the probability of a projection from sensor node $j$ at time slot $i$. The details of $g_{ij}$ will be given in the next section. Var Proof: With the assumption that $h_{ij}$ follows an i.i.d. complex Gaussian distribution with zero mean and unit variance, it is not difficult to verify the following equations: By defining independent random variables $w_i$ = Proposition 1 states that an estimate of the inner product between two vectors, using the matrix of sparse random projections (4), is correct in expectation and has bounded variance. If a signal and a matrix of sparse random projections satisfy the conditions (8) and (16), respectively, we obtain the following proposition: Proposition 2. Consider a data vector $u \in \mathbb{R}^n$ which satisfies the condition (8). In addition, suppose a sparse random matrix $\Phi \in \mathbb{R}^{m \times n}$ satisfies the condition (16). Let and consider an orthonormal transform $\Psi \in \mathbb{R}^{n \times n}$. Given only $x = \frac{1}{\sqrt{m}} \Phi u$, $\Phi$, and $\Psi$, the sparse random projections can produce an approximation with error with probability at least $1 - n^{-\gamma}$, if the $k$ largest transform coefficients in magnitude give an approximation with error $\|u - \hat{u}_{\mathrm{opt}}\|_2^2 \le \eta \|u\|_2^2$.
Proof: Follow the approach of [13] and define m = m 1 m 2 . Partition the m × n matrix into m 2 matrices 1 , 2 , · · · , m 2 , each of size m 1 ×n. Using the Chebyshev inequality, we have where (21)  min i g ij C 2 log n , the random projections can preserve all pairwise inner products within an approximation error with probability at least 1 − n −γ . Proposition 2 states that sparse random projections can produce a data approximation with error comparable to the best k-term approximation with high probability.
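The claim of Proposition 1, that inner products remain correct in expectation under fading, can be checked numerically. The sketch below is a Monte Carlo illustration and not the authors' derivation: it assumes the entry law ±1/√g_ij with probability g_ij/2 each (0 otherwise), builds Z = H ⊙ Φ with i.i.d. unit-variance complex Gaussian fading, and compares the averaged projected inner product with the exact one. The function names, the sizes n and m, and the range of g_ij are hypothetical.

```python
import numpy as np

def faded_sparse_matrix(g, rng):
    """Return Z = H * Phi (element-wise) for an m x n array g of probabilities g_ij."""
    m, n = g.shape
    # Rayleigh fading: h_ij is complex Gaussian with zero mean and unit variance.
    h = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    # Assumed entry law: +1/sqrt(g_ij) or -1/sqrt(g_ij) w.p. g_ij/2 each, else 0.
    draws = rng.random((m, n))
    signs = np.where(draws < g / 2, 1.0, np.where(draws < g, -1.0, 0.0))
    return h * (signs / np.sqrt(g))

def estimate_inner_product(z, u, v):
    """Average over slots of conj(z_i . u) * (z_i . v), real part."""
    return np.real(np.vdot(z @ u, z @ v)) / z.shape[0]

rng = np.random.default_rng(1)
n, m = 50, 20000                               # hypothetical sizes
g = rng.uniform(0.2, 1.0, size=(m, n))         # hypothetical per-slot, per-node sparsity
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

z = faded_sparse_matrix(g, rng)
estimate = estimate_inner_product(z, u, v)
exact = float(u @ v)
print(estimate, exact)
```

Because each entry of Z has unit second moment regardless of g_ij, the estimate is unbiased; smaller g_ij only inflates the fourth moment and hence the variance, which is the effect the next section trades off against the system delay.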

Optimal power allocation based sparsity adaption
From the above propositions, we notice that the factor $\sum_{j=1}^{n} \frac{1}{g_{ij}}$ controls both the estimation variance (15) and the lower bound of the system delay $m$ (18). If $g_{ij}$ is small for node $j$ at time slot $i$, the estimate may have a high variance and produce a low-accuracy approximation. Meanwhile, $m$ must be very large to guarantee an acceptable error probability. An energy-aware sparsity is given as $g_j = \frac{E_j}{\sum_{j=1}^{n} E_j} \cdot \frac{m}{n}$ in EH-WSNs [14], where $E_j$ denotes the harvested energy profile for node $j$. Usually, $g_j$ is predetermined and uniform regardless of nodes and time slots, i.e., $g_j = g$. This definition is coarse because it considers neither the different channel conditions across nodes and time slots nor the energy-harvesting constraints. Therefore, a more specific definition of sparsity is desired. We redefine the sparsity of random projections as follows, where $p^*_{ij}$ is the energy allocated to node $j$ during the $i$th time slot. $p^*_{ij}$ is determined in terms of full information, consisting of the past, present, and future channel conditions and amounts of harvested energy. The case of full information may be justified if the environment is highly predictable, e.g., when the energy is harvested from the vibration of motors that are turned on only during fixed operating hours and line-of-sight is available for communications.
If the energy-harvesting profile $E_{ij}$ for each node is known in advance and kept constant during all transmission time slots, the optimal sparsity problem is converted into an optimal power allocation problem. The question that arises is which performance measure should be used for power allocation. We know that the performance of random projection-based CS schemes is characterized by two quantities, i.e., the data approximation error probability (or the mean square error (MSE)) and the system delay. Note that there is often a tradeoff between these two quantities [16]. Under an allowable MSE $\eta > 0$, we thus define the achievable system delay $D(\eta)$ as is the lower bound of the short-term throughput of node $j$ and $B$ is the amount of data information that each node is required to transmit. The constraint (26) reflects that the harvested energy cannot be consumed before its arrival, and the constraint (27) reflects the limited battery capacity. Battery overflow happens when the reserved energy plus the harvested energy exceeds the battery capacity; this is undesirable because the data rate could be increased if the energy were used in advance instead of being lost to overflow. If we assume that there is an $m$ which satisfies the condition (24), the problem of minimizing the system delay is immediately converted into a throughput maximization problem, which can be formulated as follows: Note that the objective (28) is concave for all $i$ since it is a sum of log functions, and the constraints are all affine. Consequently, maximizing it is a convex optimization problem, and the optimal solution satisfies the Karush-Kuhn-Tucker (KKT) conditions [17]. With the assumption that the initial battery energy $E_{0j}$ is always known by node $j$, define the Lagrangian function for any multipliers $\lambda_i \ge 0$, $\mu_i \ge 0$, $\beta_i \ge 0$ as with additional complementary slackness conditions We apply the KKT optimality conditions to the Lagrangian function (30). By setting $\partial L / \partial p_{ij} = 0$, we obtain the unique optimal energy level $p^*_{ij}$ in terms of Lagrange multipliers as where
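The two energy-harvesting constraints described above, causality and limited battery capacity, can be illustrated with a small feasibility check for a single node. This is only a sketch under an indexing assumption that is not taken from the paper: `harvested[i]` is the energy that becomes available at the start of slot i (with the initial battery level folded into `harvested[0]`), and `p[i]` is the energy spent in slot i. The function name `is_feasible` is hypothetical.

```python
import numpy as np

def is_feasible(p, harvested, e_max=np.inf, tol=1e-9):
    """Check causality and battery-capacity constraints for one node.

    Assumed convention: harvested[i] arrives at the start of slot i and
    p[i] is spent during slot i.
    """
    p = np.asarray(p, dtype=float)
    harvested = np.asarray(harvested, dtype=float)
    if np.any(p < -tol):
        return False
    battery = 0.0
    for spent, arrived in zip(p, harvested):
        battery += arrived
        if battery > e_max + tol:      # stored plus harvested energy overflows
            return False
        if spent > battery + tol:      # energy used before it has arrived
            return False
        battery -= spent
    return True

print(is_feasible([1.0, 1.0], [2.0, 0.0]))             # saving energy for later is allowed
print(is_feasible([2.0, 0.0], [1.0, 1.0]))             # spends energy before it arrives
print(is_feasible([0.0, 2.0], [1.0, 1.0], e_max=1.5))  # battery would overflow in slot 1
```

The third case shows why overflow is wasteful: spending part of the first arrival early would have kept the battery within its capacity.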

Structural solution
If the battery capacity is finite, the optimal water level is not monotonic. Therefore, the structure of the optimal energy allocation cannot be described in a simple and clear way, and online programming may be required. Since we are more interested in an offline power allocation structure, we study the following special case.
Proposition 3. If $E_{\max} = \infty$, the optimal water levels are non-decreasing, i.e., $\alpha_i \le \alpha_{i+1}$. In addition, the water level changes when all the energy harvested before the current transmission has been used up.
Proof: Without the battery capacity constraint, the water level is given as The case of $E_{\max} = \infty$ represents an ideal energy buffer, i.e., a device that can store any amount of energy, does not have any inefficiency in charging, and does not leak any energy over time. As an example, consider a sensor node installed to monitor the health of heavy-duty industrial motors. Suppose that the node operates using energy harvested from the machine's vibrations, that the harvested energy is greater than the consumed power, and that the health monitoring function is desired only when the motor is powered on. Proposition 3 presents an analytically tractable structure of the optimal sparsity. Intuitively, the harvested energy is reserved in the battery for use in later transmissions, in order to reduce the effect of the causality constraint and improve the flexibility of harvested energy allocation. The optimal water level can be obtained by the power allocation policy, and it is structured as follows: the water level is non-decreasing and the harvested energy is used in a conservative way. Based on these structural properties, we can use the following reserve multi-stage waterfilling algorithm, modified from [18], to obtain the solution:

Algorithm 1: Reserve multi-stage waterfilling algorithm with harvested energy [18]
1: Set $t_0 = 0$, $\hat{\gamma}_i = \gamma_i$ and $\hat{E}_{ij} = E_{ij}$ for $i = 1, \ldots, m$
2: for all $i = 1$ to $m$ do
3: for all $k = m$ to $t_{i-1} + 1$ do
4: Find α l so that k l=t i−1 +1 p lj = k−1
end for k
8: If $t_i = m$ then exit
9: end for i
10:
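The structure stated in Proposition 3 can be illustrated with a short offline sketch for one node with unbounded battery. It is not a transcription of Algorithm 1, whose steps are only partly legible above; instead it uses a common staircase water-filling formulation: at each stage, compute the water level for every candidate horizon and keep the lowest one, which guarantees that no energy is used before it arrives. The sketch assumes a rate of the form log(1 + γ_i p_i) per slot and that `energy[i]` is available at the start of slot i; both, along with all function names, are assumptions.

```python
import numpy as np

def water_level(inv_gains, energy):
    """Level alpha such that sum(max(alpha - inv_gains, 0)) equals energy."""
    a = np.sort(np.asarray(inv_gains, dtype=float))
    for active in range(len(a), 0, -1):
        alpha = (energy + a[:active].sum()) / active
        if alpha >= a[active - 1]:     # all 'active' slots get non-negative power
            return alpha
    raise ValueError("energy must be non-negative")

def staircase_waterfilling(gains, energy):
    """Offline allocation for one node with unbounded battery capacity."""
    inv = 1.0 / np.asarray(gains, dtype=float)
    energy = np.asarray(energy, dtype=float)
    m = len(inv)
    p = np.zeros(m)
    levels = np.zeros(m)
    start = 0
    while start < m:
        # Water level if all energy arriving in slots [start, k) is spent there.
        candidates = np.array([
            water_level(inv[start:k], energy[start:k].sum())
            for k in range(start + 1, m + 1)
        ])
        lowest = candidates.min()
        # Extend the stage to the latest horizon that attains the lowest level.
        offset = np.flatnonzero(candidates <= lowest + 1e-12)[-1]
        stop = start + offset + 1
        p[start:stop] = np.maximum(lowest - inv[start:stop], 0.0)
        levels[start:stop] = lowest
        start = stop
    return p, levels

p, levels = staircase_waterfilling(gains=[1.0, 1.0, 1.0], energy=[1.0, 0.0, 5.0])
print(p, levels)
```

In this example the first two slots share the early arrival at a common level, and the level rises only at the third slot, after the stored energy has been exhausted, matching the non-decreasing structure in Proposition 3.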

Simulation results
We consider an EH-WSN containing n = 500 sensor nodes and a uniform energy-harvesting rate $E_{ij}$ = 2 dB for all nodes. We evaluate the performance of the proposed adaptive sparse random projections. One of the performance measures is the mean-square error (MSE), given as
$$\text{error} = \frac{\|s - \hat{s}\|_2^2}{\|s\|_2^2}. \qquad (35)$$
Figure 1 illustrates the data approximation performance using sparse random projections for different degrees of sparsity. The larger g is, the smaller the MSE. However, a larger g may bring a high computational complexity. Therefore, the sparsity factor g should be carefully chosen in order to balance the MSE and the complexity. Intuitively, when channel conditions are poor, a larger g should be selected to guarantee an acceptable MSE, whereas a smaller g should be selected to save computational complexity when channel conditions are good enough. This motivates us to study adapting the sparsity of random projections according to channel conditions in order to improve the data-approximation performance as well as the system delay.

Figures 2 and 3 compare the MSE performance obtained by our proposed adaptive sparse random projections (denoted as 'Adaptive' in the legend) with that obtained by the conventional sparse random projections (denoted as 'Fixed' in the legend) with respect to the number of transmission slots m for SNR = 15 dB and 30 dB, respectively. The conventional sparse random projections with a fixed sparsity g = 1/4 are taken as a baseline since they achieve an acceptable MSE with a modest complexity. We observe that the proposed adaptive sparse random projections achieve a better tradeoff between the MSE and the system delay than the conventional ones when k is either 10 or 5. However, the performance gap between the proposed scheme and the conventional one becomes smaller as the SNR increases. This makes sense because, as the channel conditions improve, the benefits of the adaptive sparsity become limited. For both SNR = 30 dB and 15 dB, we notice that the case of k = 5 provides better performance than the case of k = 10.
In Figure 4, we present the performance comparison between the conventional sparse random projections with a fixed sparsity and the proposed ones with respect to the number of transmissions (or the system delay) m for different SNRs. We still observe that the proposed scheme outperforms the conventional one for both SNR = 20 dB and 30 dB, resulting in a better tradeoff between the MSE and the system delay. We also notice that, for both the proposed scheme and the conventional scheme, there is no performance difference between the case of SNR = 20 dB and that of SNR = 30 dB when m < 80, but the MSE decreases as the SNR increases when m is over 80. That is because m is also one of the factors that control the variance of the estimation in (15). If m is not sufficiently large, it is one of the dominant factors that affect the MSE performance, and increasing the SNR therefore barely changes the MSE. When m is large enough, only a very limited improvement of the MSE may be achieved by further increasing m; the SNR then becomes a dominant factor, and increasing the SNR may benefit the MSE performance. Figure 5 shows the tradeoffs between the system delay and the MSE for the proposed adaptive sparse random projections and the conventional ones when SNR = 30 dB and k = 5. For an MSE of $3 \times 10^{-2}$, the conventional sparse random projections require about m = 95 transmissions, while the proposed scheme requires only m = 78 transmissions. Consequently, the proposed scheme achieves a better tradeoff than the conventional one.
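For readers who want to reproduce the error metric in (35), the following sketch recovers a signal from sparse random projections by keeping the k largest estimated transform coefficients and then evaluates the normalized error. It is a simplified, real-valued, noiseless illustration without fading, and it does not reproduce the simulation setup of the figures: the orthonormal basis is a random one obtained by QR factorization, the coefficient estimate is a plain average rather than the estimator of [13], and the sizes, coefficient values, entry law (±1/√g with probability g/2 each), and function names are all assumptions.

```python
import numpy as np

def normalized_error(s, s_hat):
    """Normalized squared error ||s - s_hat||^2 / ||s||^2, as in (35)."""
    return np.linalg.norm(s - s_hat) ** 2 / np.linalg.norm(s) ** 2

def k_term_recovery(x, phi, psi, k):
    """Estimate transform coefficients from projections and keep the k largest."""
    m = phi.shape[0]
    projected_basis = (phi @ psi) / np.sqrt(m)   # column j is the projection of psi_j
    theta_hat = projected_basis.T @ x            # inner-product estimates of psi_j^T s
    keep = np.argsort(np.abs(theta_hat))[-k:]
    theta_k = np.zeros_like(theta_hat)
    theta_k[keep] = theta_hat[keep]
    return psi @ theta_k

rng = np.random.default_rng(2)
n, m, k, g = 64, 4000, 3, 0.25                   # hypothetical sizes; g = 1/4 as in the baseline
psi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthonormal basis (assumption)
theta = np.zeros(n)
theta[[5, 20, 40]] = [5.0, -4.0, 3.0]            # exactly k-sparse coefficients
s = psi @ theta

# Assumed entry law: +1/sqrt(g) or -1/sqrt(g) w.p. g/2 each, 0 otherwise.
phi = rng.choice([-1.0, 0.0, 1.0], size=(m, n), p=[g / 2, 1 - g, g / 2]) / np.sqrt(g)
x = phi @ s / np.sqrt(m)

s_hat = k_term_recovery(x, phi, psi, k)
error = normalized_error(s, s_hat)
print(error)
```

Reducing m in this sketch increases the variance of the coefficient estimates and hence the error, which mirrors the dependence of the MSE on the system delay discussed above.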

Conclusions
In this paper, we proposed adapting the sparsity of random projections according to full channel information for EH-WSNs. Compared with the conventional sparse random projections, which keep the sparsity constant over all transmission slots, the proposed scheme achieves a better tradeoff between the MSE and the system delay. The optimal sparsity problem is turned into an optimal power allocation problem that maximizes throughput under the energy-harvesting constraints. An offline power allocation structure is available for the special case in which the battery capacity is infinite. Simulation results have shown that the proposed scheme achieves smaller MSEs than the conventional scheme. Meanwhile, the proposed scheme can also reduce the system delay for a given acceptable error rate. However, full channel information may not always be available. Therefore, in future work, we will study adaptive sparse random projections with partial channel information.