

Optimal harvest-use-store policy for energy-harvesting wireless systems in frequency-selective fading channels



Recent advances in energy-harvesting (EH) technology have enabled wireless systems composed of rechargeable devices. In this paper, we analyze the problem of maximizing the data transmission of a point-to-point (P2P) wireless communication system in which the transmitter harvests energy from the ambient environment. For generality, we consider the EH optimization problem under quasi-static frequency-selective fading channels. Our formulation also includes the energy storage loss of the battery; therefore, we adopt an efficient harvesting architecture, harvest-use-store (HUS), in which the harvested energy is prioritized for use in data transmission. To balance the energy stored in or extracted from the battery so as to maximize throughput under randomly arriving harvested energy, we first characterize the properties of the optimal policy, which imply a double-threshold structure of the solution, and then develop a dynamic programming (DP)-based double-layer optimal allocation policy. We further analyze the online setting. First, the optimal online policy is obtained by continuous-time stochastic dynamic programming. Then, building on the intuition from the optimal offline policy (i.e., the double-threshold structure), a heuristic online policy is proposed that is simple to implement. Numerical results validate the theoretical analysis, demonstrate superior performance over existing counterparts in the literature, and show that the proposed online policies track the optimal solution well.


Energy-harvesting (EH) technology has become a promising alternative energy supply for current battery-powered communication networks, such as wireless sensor networks, extending the network lifetime by harvesting ambient energy (solar, vibration, etc.) [1]. As opposed to battery-limited devices that are subject only to a power constraint or a sum energy constraint, EH transmitters are additionally subject to energy-harvesting constraints: the harvested energy arrives sporadically and in limited amounts, so the available energy profile differs from block to block. Thus, it is critical to reoptimize the transmission policy to respect the causality constraint imposed on the use of the harvested energy.

Recent works on optimizing the transmission policy of an energy-harvesting transmitter have drawn great attention [2-8]. Ozel et al. [2] introduced two related optimization problems in single-link fading channels: a) maximizing the data transmitted (throughput) within a deadline T and b) minimizing the time (delay) by which the transmission of B bits is completed. Gong et al. [3] considered joint energy-harvesting and grid power supply, formulating the problem of minimizing the grid power consumption while completing the required data transmission before a given deadline. Other communication scenarios with EH capability include the broadcast channel [4,5], the multiple access channel [6], and two-hop networks [7,8]. Furthermore, battery imperfections are key factors in energy harvesting and have also attracted research attention. Devillers and Gunduz [9] incorporated a constant leakage rate and battery degradation over time into the battery model. Tutuncuoglu and Yener [10] studied the data maximization problem under finite battery capacity constraints.

All these contributions assume flat fading channels; however, variations of the transmitter location within a dense urban wireless environment lead to constantly changing scattering scenarios, which in turn result in a varying channel law. Thus, in this paper, we consider the more general case of quasi-static frequency-selective fading channels. Moreover, we focus on a battery imperfection not addressed in the above papers, namely energy loss during storage; we therefore adopt an energy-efficient harvesting strategy, harvest-use-store (HUS) [11], which gives the highest priority to usage, followed by storage. This contrasts with the harvest-store-use (HSU) strategy, which suffers a severe energy loss because all harvested energy is stored first. We thus study throughput optimization for a P2P system over a finite number of blocks under constraints arising from the EH profile, quasi-static frequency-selective fading channels, and storage loss, and we propose a dynamic programming-based double-layer allocation policy with non-causal energy and fading information. We show that the optimal offline solution has a double-threshold structure. With this structural property in mind, we further extend the results to the causal case and present an optimal online policy and a heuristic one.

The remainder of the paper is organized as follows. We provide the system model and formulate the problem in Section 2. In Section 3, the optimal offline policy operating in HUS mode is obtained through a dynamic programming (DP)-based double-layer optimal allocation policy, followed by online policies in Section 4. Numerical results are presented in Section 5 to compare our optimal offline solution with various existing energy-harvesting architectures. We also provide a thorough numerical study of the proposed online policies under various algorithms and compare them to the offline policy. Section 6 concludes the paper.

System model and problem formulation

Consider a point-to-point wireless communication system with an energy-harvesting transmitter equipped with a rechargeable battery that suffers storage loss, as depicted in Figure 1. The energy comes from the ambient environment and is harvested by the energy unit. Based on the harvested energy pattern and the energy loss during storage, the transmitter operates in HUS mode, meaning that the decision device should optimize the scheduling of the energy stored in or drawn from the battery for data transmission.

Figure 1

A HUS energy-harvesting communication system diagram.

In our system, we focus on an N-block transmission starting from block 1, as shown in Figure 2. $E_{n}$ units of energy arrive at the beginning of block n, and the time interval between two consecutive energy arrivals is defined as the transmission time $\ell_{n}$. We assume that the harvested energy increments and their arrival times are exactly known at the transmitter prior to the transmission (similar to [12]). Further, by the lemmas in [12], the transmit power must remain constant within each block, because the rate function (i.e., $r=(1/2)\log(1+p)$) is concave in power. In what follows, we assume the L-tap quasi-static frequency-selective fading channel in block n to follow the Turin model:

$$ h_{n}(t)={\sum\nolimits}_{i=1}^{L} h_{n,i}\delta\left(t-\tau_{n,i}\right) $$
Figure 2

Transmission model with random energy arrivals. Energies arrive at the beginning of each block. The transmission starts from block 1.

where $h_{n,i}$ and $\tau_{n,i}$ denote the channel gain and the delay of the ith tap in block n, respectively. Under the time-invariant assumption, the discrete form of the channel is a Toeplitz matrix (i.e., each descending diagonal from left to right is constant) [13], which has a special eigenstructure $H=U\Lambda U^{H}$. The matrix $\Lambda$ is diagonal, with diagonal entries given by the Fourier transform of $h_{n}(t)$. Based on this, transmission over a frequency-selective channel can be simplified to an M-subcarrier system by adding a cyclic prefix of length L, as shown in [13]. The mth channel component of block n is defined as:

$$\begin{array}{*{20}l} {\tilde h_{n,m}}={\sum\nolimits}_{i=1}^{L} h_{n,i}\text{exp}(-j2{\pi}mi/M) \end{array} $$

Now the transmission channel can be viewed as a collection of parallel AWGN sub-channels, one for each subcarrier m with the fading gains ${\tilde h_{n,m}}$ , n=1,…,N, m=1,…,M. We assume that ${\tilde h_{n,m}}$ remains unchanged during a transmission block, i.e., block-fading mode.
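As an illustration of the per-subcarrier expression above, the following minimal Python sketch evaluates $\tilde h_{n,m}$ for one block and returns its squared magnitude, which is the sub-channel power gain used in the rate expressions. The function name `subcarrier_gains` and the tap values are assumptions made for this example; they do not come from the paper.

```python
import cmath


def subcarrier_gains(taps, num_subcarriers):
    """Frequency-domain gains of one block.

    taps[i - 1] holds the (possibly complex) tap gain h_{n,i}, i = 1..L.
    Returns (h_tilde, power_gains) for subcarriers m = 1..M.
    """
    h_tilde = []
    for m in range(1, num_subcarriers + 1):
        # h~_{n,m} = sum_i h_{n,i} * exp(-j * 2 * pi * m * i / M)
        total = sum(
            h * cmath.exp(-2j * cmath.pi * m * i / num_subcarriers)
            for i, h in enumerate(taps, start=1)
        )
        h_tilde.append(total)
    # Power gain of each parallel sub-channel: |h~_{n,m}|^2
    power_gains = [abs(x) ** 2 for x in h_tilde]
    return h_tilde, power_gains


# Two equal taps on M = 4 subcarriers (illustrative values only).
_, gains = subcarrier_gains([1.0, 1.0], 4)
print([round(g, 6) for g in gains])
```

With two equal taps, one subcarrier sees a deep fade (gain close to zero) while another sees constructive addition, which is the frequency selectivity that the later water-filling step exploits.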

Problem statement

The transmission rate in block n is then the sum rate of all the sub-channels, given by the mutual information $\mathcal {I}_{n}={\sum \nolimits }_{m} \mathcal {I}_{n,m}$ in bits per symbol. In general, we assume that $\mathcal {I}_{n,m}$ is concave and increasing in $p_{n,m}$, the power allocated to subcarrier m of block n. Consider a complex Gaussian channel with average signal power constraint $p_{n,m}$, channel gain $H_{n,m}={{{|{{{\tilde h}_{n,m}}}|}^{2}}}$, and unit noise power. The information-theoretically optimal channel coding scheme, which employs randomly generated codes, achieves the channel rate given by (see [14]):

$$ \mathcal{I}_{n,m} = \frac{1}{2}\log \left({1 + {p_{n,m}}{H_{n,m}}} \right) $$

Hence, the total data transmission (i.e., throughput) over N blocks for the P2P wireless communication is described as:

$$ \mathbb{C}= {\sum\nolimits}_{n=1}^{N} \frac{{{\ell_{n}}}}{2} {\sum\nolimits}_{m=1}^{M} \log(1+p_{n,m}H_{n,m}) $$

To proceed, we characterize the HUS strategy by the following battery modes:

  (a) Charging: when ${E_{n}} > \sum _{m=1}^{M}{p_{n,m}}{\ell _{n}}$, the transmitter uses $\sum _{m=1}^{M}{p_{n,m}}{\ell _{n}}$ of energy directly from the energy unit, and the battery stores the excess energy ${E_{n}} - \sum _{m=1}^{M}{p_{n,m}}{\ell _{n}}$, denoted by $D_{n}$ for simplicity.

  (b) Discharging: when ${E_{n}} < \sum _{m=1}^{M}{p_{n,m}}{\ell _{n}}$, the transmitter uses all the energy harvested in the current block, and the battery supplies the shortfall $\sum _{m=1}^{M}{p_{n,m}}{\ell _{n}} - {E_{n}}$, denoted correspondingly by $-D_{n}$.

  (c) Neutral: when ${E_{n}} = \sum _{m=1}^{M}{p_{n,m}}{\ell _{n}}$, the transmitter uses up all the harvested energy for transmission without any battery operation.

Note that under the HUS mode with storage efficiency $0<\eta_{B}<1$, only a fraction, or none, of the harvested energy is subject to storage loss. This is more energy efficient than HSU, in which all the harvested energy suffers the storage loss because it is always stored in the battery before its subsequent use. With the definition $[D_{i}]^{+}=\max(0,D_{i})$, the battery level at the end of block n (i.e., the residual battery level), denoted by $B_{n}$, is given by:

$$ {B_{n}} = {\eta_{B}}\sum\nolimits_{i = 1}^{n} {\left[ {D_{i}} \right]^{+} }- \sum\nolimits_{i = 1}^{n}{\left[{-D_{i}}\right]^{+}} $$

where ${\eta _{B}}{\sum \nolimits }_{i = 1}^{n} {{{\left [ {{D_{i}}} \right ]}^ + }}$ and ${\sum \nolimits }_{i = 1}^{n} {{{\left [ {{-D_{i}}} \right ]}^ + }}$ represent, respectively, the energy stored in and taken out of the battery up to the end of block n. For simplicity and ease of analysis, we assume that the initial battery level is zero (i.e., $B_{0}=0$) and that the battery has infinite capacity.
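The battery recursion implied by the three modes can be sketched in a few lines of Python. This is only an illustration under the stated assumptions ($B_{0}=0$, infinite capacity); the function name `battery_levels` and the sample numbers are hypothetical.

```python
def battery_levels(harvested, used, eta_b):
    """Residual battery level B_n after each block under HUS.

    harvested[n] is E_n; used[n] is the energy spent on transmission in
    block n, i.e. sum_m p_{n,m} * l_n. The battery starts empty (B_0 = 0).
    """
    levels = []
    level = 0.0
    for energy, spent in zip(harvested, used):
        surplus = energy - spent  # D_n
        if surplus > 0:
            # Charging: only the stored surplus pays the storage loss.
            level += eta_b * surplus
        else:
            # Discharging (or neutral): the shortfall is drawn at full value.
            level += surplus
        levels.append(level)
    return levels


# Illustrative numbers: charge in block 1, then discharge twice.
print(battery_levels([4.0, 1.0, 0.0], [2.0, 2.0, 0.6], eta_b=0.8))
```

A negative entry in the returned list would indicate a schedule that violates the causality constraint, since energy would be drawn before it had been stored.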

Thus, our throughput maximization problem over N transmission blocks can be expressed as:

$$\begin{array}{*{20}l} \mathbf{(P1)}~~~\mathbb{C}&= {\underset{p_{n,m}\ge0}{\max}} ~ \sum_{n=1}^{N} \ell_{n} \sum_{m=1}^{M} \log(1+p_{n,m}H_{n,m}) \end{array} $$
$$\begin{array}{*{20}l} \mathbf{s.~t.}~~~ {B_{n}} &= {\eta_{B}} \sum \nolimits_{i = 1}^{n} {\left[ {D_{i}} \right]^ + }- \sum \nolimits_{i = 1}^{n} {\left[ {-D_{i}} \right]^ + } \ge 0,\\ &~~~~~~\forall n \in \left\{ {1,2,\ldots,N} \right\} \end{array} $$
$$\begin{array}{*{20}l} B_{N}&=0 \end{array} $$

where the battery level must not be negative at the end of each block, so that sufficient energy is available for data transmission. The equality constraint on $B_{N}$ in Equation 8 holds because otherwise we could always increase the transmission rate by increasing $p_{n,m}$ without violating any constraint in Equation 7. Note that (P1) involves not only power allocation across blocks in the time domain but also power allocation across subcarriers within each block in the frequency domain, and the former affects the latter. Moreover, the non-linearity and non-differentiability of the constraints make (P1) more difficult to solve. Thus, we start with the properties of the solution and then propose a DP-based double-layer allocation algorithm to solve the problem.

Optimal offline policy for frequency selective fading channel

We first employ the Lagrangian technique to investigate the properties of the transmit power and gain intuitive insight into our optimization problem, and then introduce a dynamic programming algorithm to obtain the solution.

Solvability and properties of the solution

Theorem 1.

The optimal solution to Problem (P1) has the double-threshold structure, and the form is described as follows:

$$ p_{k,j}^{*}=\left\{\begin{array}{ll} \left[\xi_{k} - (H_{k,j})^{-1}\right]^{+}, & D_{k}>0\\ \left[\rho_{k} - (H_{k,j})^{-1}\right]^{+}, & D_{k}<0\\ \left[\sigma_{k} -(H_{k,j})^{-1}\right]^{+}, & D_{k}=0 \end{array}\right. $$


$$\begin{array}{*{20}l} \xi_{k} = \left(\sum\limits_{n = k}^{N} {{\lambda_{n}}} {\eta_{B}}\right)^{-1},~\rho_{k} = \left(\sum\limits_{n = k}^{N} {{\lambda_{n}}}\right)^{-1} \end{array} $$

where $\xi_{k}$, $\rho_{k}$, and $\sigma_{k}$ are the water levels of charging, discharging, and neutral blocks, respectively. In particular, the water level of a neutral block is obtained by maximizing the rate of block k under the total power constraint ${\sum \nolimits }_{m = 1}^{{M}} {{p_{k,m}}}=E_{k}/\ell _{k}$ (i.e., the harvested power, which follows from $D_{k}=0$) across the sub-channels, using the traditional water-filling method.


The objective function (Equation 6) is a sum of log functions and is thus concave in the power sequence. The convexity of the constraint set defined by Equation 7 can further be shown by induction. As such, our throughput maximization problem has a unique solution, according to the theory of convex optimization. For notational simplicity, denote the Lagrangian function for any $\lambda_{n}\ge 0$ by:

$$ {\cal L} = \mathbb{C} +\sum \limits_{n = 1}^{N} {\lambda_{n}}\left({{\eta_{B}}\sum \limits_{i = 1}^{n} {{\left[ {D_{i}} \right]}^ + } - \sum \limits_{i = 1}^{n} {{\left[ {-D_{i}} \right]}^ + }} \right) $$

The Lagrangian function in Equation 10 is, in essence, the summation of all the non-zero entries of a lower triangular matrix. Differentiating with respect to $p_{k,j}$, we obtain:

$$\begin{array}{*{20}l} \frac{{\partial {\cal L}}}{{\partial {p_{k,j}}}} = \frac{{\ell_{k}}{H_{k,j}}}{{1 + {p_{k,j}}{H_{k,j}}}}+\sum\limits_{n = k}^{N} {{\lambda_{n}}} \frac{{d\left({{\eta_{B}}{{\left[ {{D_{k}}} \right]}^ + } - {{\left[ { - {D_{k}}} \right]}^ + }} \right)}}{{d{p_{k,j}}}} \end{array} $$

To handle the non-linearity of the rectifier function $[D_{k}]^{+}$, we represent it in terms of the signum function as ${\left [ {{D_{k}}} \right ]^ + } = \left ({{D_{k}} + {D_{k}}{\text {sgn}} \left ({{D_{k}}} \right)} \right)/2$. The Kuhn-Tucker condition for the optimality of a power allocation is as follows [14]:

$$ \frac{{\partial {\cal L}}}{{\partial {p_{k,j}}}} = \left\{ \begin{array}{ll} = 0, & if{\kern 6pt} {p_{k,j}} > 0 \\ \le 0, & if{\kern 6pt} {p_{k,j}} = 0 \end{array} \right. $$

which guarantees that the constraint $p_{k,j}\ge 0$ is satisfied. Recognizing that ${{d{\text {sgn}} \left (x \right)}}/{{dx}} = 2\delta \left (x \right)$ and $x\delta(x)=0$, the optimal power allocation can be described as:

$$ p_{k,j} = \left[\left(\sum\limits_{n = k}^{N} \lambda_{n}\,\frac{\left({\eta_{B}} + 1\right) + \left({\eta_{B}} - 1\right){\text{sgn}}\left(D_{k}\right)}{2}\right)^{-1} - \left(H_{k,j}\right)^{-1}\right]^{+} $$

To solve this equation for the optimal power, we identify three cases for the signum function, which lead to Equation 9. In particular, $D_{k}=0$ means that all the harvested energy of block k is allocated to the same block. By the water-filling method in [13,14], the maximum throughput is then obtained with the water level $\sigma_{k}$, which is determined by:

$$ \sum\limits_{j=1}^{M}\left[\sigma_{k}-(H_{k,j})^{-1}\right]^{+}=E_{k}/\ell_{k} $$
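The neutral-block water level has no closed form in general, but it can be found numerically. The sketch below uses plain bisection on the water level, which is one standard way to perform water-filling and is not claimed to be the authors' implementation; the helper name `water_fill` and the sample gains are assumptions for illustration.

```python
def water_fill(gains, total_power, iterations=200):
    """Water-filling over parallel sub-channels with unit noise power.

    gains[j] is H_{k,j} > 0 and total_power is E_k / l_k.
    Returns (water_level, powers) with sum(powers) ~= total_power.
    """
    floors = [1.0 / g for g in gains]  # channel levels 1 / H_{k,j}
    lo = min(floors)                   # allocates zero power
    hi = max(floors) + total_power     # allocates at least total_power
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(max(mid - f, 0.0) for f in floors) > total_power:
            hi = mid
        else:
            lo = mid
    level = 0.5 * (lo + hi)
    return level, [max(level - f, 0.0) for f in floors]


# Three sub-channels; the weakest one stays below the water surface.
level, powers = water_fill([1.0, 0.5, 0.1], total_power=3.0)
print(round(level, 6), [round(p, 6) for p in powers])
```

In this example the channel levels are 1, 2, and 10, so a budget of 3 fills only the first two sub-channels and the third receives no power.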

Overall, the main result in Equation 9 provides a basis for investigating the properties of the optimal power allocation policy for the new energy-harvesting system. The theorem reveals further properties of the optimal power-allocation pattern, as summarized below. For ease of description, we define some terms for subsequent use.

Definition 1.

A block that hits the zero battery level is called a valley block or simply a valley. The blocks between any two closest valleys constitute a hill segment.

A hill segment starting from block a and ending at block s>a is denoted by HS(a,s), which means $B_{a-1}=B_{s}=0$. Now we can state the main properties of our problem.

Property 1.

The subcarriers in the same block possess the same water level.


Since the water level in Equation 9 depends only on the block index k and not on the subcarrier index j, the water level is the same for all subcarriers of a block.

Property 2.

Within a hill segment (e.g., HS(a,s)), all the energy-charging blocks have the same water level $w^{+}$, and all the energy-discharging blocks likewise share a lower water level $w^{-}$, where:

$$\begin{array}{*{20}l} w^{+} & =\left(\sum\limits_{n = s}^{N} {{\lambda_{n}}} {\eta_{B}}\right)^{-1}=\xi_{s} \\ w^{-} & =\left(\sum\limits_{n = s}^{N} {{\lambda_{n}}}\right)^{-1}=\rho_{s} \end{array} $$

In particular, an energy-neutral block $k\in[a,s]$ has the water level $w^{0}$, which is determined by Equation 14. Here, we always have $w^{+}\ge w^{0}\ge w^{-}$.


Since $B_{k}\neq 0$, we assert $\lambda_{k}=0$ for $k\in[a,s-1]$, in accordance with the slackness conditions ${\lambda _{k}}\left ({{\eta _{B}}\sum \nolimits _{i = 1}^{k} {{\left [ {D_{i}} \right ]}^ + }- \sum \nolimits _{i = 1}^{k} {{\left [ {-D_{i}} \right ]}^ + }} \right)=0$. Thus, for $D_{k}>0$, which corresponds to the battery state of energy charging, the water level for $j=1,2,\ldots,M$ is:

$$w_{k,j}=\left(\sum\limits_{n = k}^{N} {{\lambda_{n}}} {\eta_{B}}\right)^{-1}=\left(\sum\limits_{n = s}^{N} {{\lambda_{n}}} {\eta_{B}}\right)^{-1}=w^{+}. $$

Similarly, for $D_{k}<0$, which corresponds to the battery state of energy discharging, we obtain:

$$w_{k,j}=\left(\sum\limits_{n = k}^{N} {{\lambda_{n}}}\right)^{-1}=\left(\sum\limits_{n = s}^{N} {{\lambda_{n}}}\right)^{-1}=w^{-}. $$

Then, consider block s with $B_{s}=0$. The only possibility for $B_{s}=0$ is that block s is a discharging period and hence $D_{s}<0$. It follows from Equation 9 that:

$$w_{s,j}=\left(\sum\limits_{n = s}^{N} {{\lambda_{n}}}\right)^{-1}=w^{-}. $$

In particular, when $D_{k}=0$, which corresponds to the energy-neutral battery state, we obtain:

$${\small\begin{aligned} w_{k,j}=\left(\sum\limits_{n = k}^{N} {{\lambda_{n}}\left\{ {\left({\eta_{B}} + 1\right) + \left({{\eta_{B}} - 1} \right){\text{sgn}} \left({0} \right)} \right\}/2}\right)^{-1}\!\!=w^{0}=\sigma_{k} \end{aligned}} $$

Since $D_{k}=0$ means that all the harvested energy of block k is allocated to the same block, the water level $w^{0}$ is determined by Equation 14. Since $0\le\eta_{B}\le 1$, it follows that $w^{+}\ge w^{0}\ge w^{-}$.
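The gap between the charging and discharging levels can be made explicit with one line of algebra. The step below uses only the expressions for $w^{+}$ and $w^{-}$ in Property 2 and assumes ${\sum\nolimits}_{n=s}^{N}\lambda_{n}>0$; the numbers that follow are illustrative and are not taken from the experiments.

$$ w^{+}=\left(\eta_{B}\sum\limits_{n = s}^{N}\lambda_{n}\right)^{-1}=\frac{1}{\eta_{B}}\left(\sum\limits_{n = s}^{N}\lambda_{n}\right)^{-1}=\frac{w^{-}}{\eta_{B}} \quad\Longrightarrow\quad \eta_{B}\,w^{+}=w^{-} $$

For instance, with $\eta_{B}=0.8$ and $w^{-}=2$, the charging level is $w^{+}=2.5$. A lossier battery (smaller $\eta_{B}$) widens the gap between the two thresholds, so a wider range of harvested powers falls into the neutral mode, while $\eta_{B}=1$ collapses both thresholds into a single water level.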

Property 3.

The water levels of charging blocks, though equal within a hill segment, are monotonically non-decreasing from one hill segment to the next. The same assertion is true for the discharging blocks.


Assuming there are M valley blocks, we denote these blocks by $V_{1},V_{2},\ldots,V_{M}$ and denote by $w^{+}_{V_{i}}$ and $w^{-}_{V_{i}}$ the optimal water levels of the charging and discharging blocks, respectively, within the hill segment $\text{HS}(V_{i-1}+1, V_{i})$, yielding:

$$\begin{array}{*{20}l} w^{+}_{V_{i}} & =\left(\sum\limits_{n = V_{i}}^{N} {{\lambda_{n}}} {\eta_{B}}\right)^{-1}\\ w^{-}_{V_{i}} & = \left(\sum\limits_{n = V_{i}}^{N} {{\lambda_{n}}}\right)^{-1}\\ \end{array} $$

Since $\lambda_{n}\ge 0$ and $V_{i}>V_{i-1}$, the sums in the expressions above do not grow with i, and thus $w^{+}$ and $w^{-}$ are monotonically non-decreasing from one hill segment to the next.

These properties imply that:

  1. Within a block, the power allocation is equivalent to traditional water filling.

  2. The water level is closely related to the battery mode (i.e., charging $D_{k}>0$, discharging $D_{k}<0$, neutral $D_{k}=0$).

  3. To maximize the total throughput, the power management policy should balance the tradeoff across the whole transmission process. Thus, energy may be transferred from the current block to future blocks to obtain the greatest benefit. Also, on account of the causality constraint, energy cannot be used before its arrival. These two aspects together account for the properties above.

  4. The main insight is that the optimal solution has a double-threshold structure, as shown in Theorem 1.

The above leads to an intuitive understanding of our optimal solution. Based on this, we introduce a dynamic programming algorithm and formulate a double-layer allocation problem to obtain the optimal solution, as described next.

Dynamic programming

In this subsection, we develop a DP approach for our throughput maximization problem. Recall that the battery status of a block, say block n, depends on its charging and discharging history up to the time n. Specifically, it follows from Equation 5 that:

$$\begin{array}{*{20}l} {B_{n}} &= \sum \limits_{i = 1}^{n}\left({\eta_{B}} {\left[ {{D_{i}}} \right]^ + } - {\left[ {{-D_{i}}} \right]^{+}}\right) \\ &= \sum \limits_{i = 1}^{n-1}\left({\eta_{B}} {\left[ {{D_{i}}} \right]^ + } - {\left[ {{-D_{i}}} \right]^{+}}\right)+ \left({{\eta_{B}}{\left[ {{D_{n}}} \right]^ + } - {\left[ {{-D_{n}}} \right]^ +}}\right) \\ &= {B_{n-1}} + {\alpha_{n}} \end{array} $$

where $\alpha_{n}$ can be regarded as an operation of the battery (i.e., the battery mode), in accordance with the dynamic programming model (i.e., the basic discrete dynamic system) [15]. Equation 15 is a basic discrete-time dynamic system, which can be regarded as a first-order Markov process or, more precisely, a random walk model. In what follows, we introduce the first-layer allocation, which characterizes how to allocate the power within block n in the frequency domain to achieve the maximum throughput:

$$\begin{array}{*{20}l} \left(\mathrm{First~ layer}\right)~C_{n,M}^{P_{n}}:= \frac{{\ell_{n}}}{2}\sum\limits_{m = 1}^{{M} } {\log \left({1 + {{p_{n,m}}{{ {{H_{n,m}}} }}}} \right)} \end{array} $$
$$\begin{array}{*{20}l} s.t. ~~{p_{n,m}}=\left[w - \left(H_{n,m}\right)^{-1}\right]^{+},~~\sum\limits_{m = 1}^{{M}}{{p_{n,m}}}=P_{n} \end{array} $$

where the water level w is determined by the argument $D_{n}$ based on Equation 9. $P_{n}$ is the sum power allocated to block n and is related to $\alpha_{n}$ through the mapping:

$$\begin{array}{*{20}l} {P_{n}} = \left\{ {\begin{array}{*{20}c} \frac{{{E_{n}} - \frac{{{\alpha_{n}}}}{{{\eta_{B}}}}}}{{{\ell_{n}}}},& {\alpha_{n}} \ge 0, & {\alpha_{n}} = {\eta_{B}}{D_{n}}\\ \frac{{{E_{n}} - {\alpha_{n}}}}{{{\ell_{n}}}}, & {\alpha_{n}} < 0, & {\alpha_{n}}={D_{n}} \end{array}} \right. \; \end{array} $$
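The mapping between the battery operation $\alpha_{n}$ and the block sum power $P_{n}$ is a two-branch formula that is easy to get wrong in code, so a small sketch may help. The function names `block_power` and `battery_operation` are hypothetical, and the numbers are illustrative.

```python
def block_power(alpha, energy, duration, eta_b):
    """Sum power P_n implied by the battery operation alpha_n."""
    if alpha >= 0:
        # Charging (or neutral): alpha = eta_b * D_n, so D_n = alpha / eta_b.
        return (energy - alpha / eta_b) / duration
    # Discharging: alpha = D_n < 0, the battery adds |alpha| to the harvest.
    return (energy - alpha) / duration


def battery_operation(power, energy, duration, eta_b):
    """Inverse mapping: alpha_n implied by a chosen sum power P_n."""
    surplus = energy - power * duration  # D_n
    return eta_b * surplus if surplus >= 0 else surplus


# Storing 0.8 units at efficiency 0.8 removes 1.0 unit from the harvest.
print(block_power(0.8, energy=4.0, duration=2.0, eta_b=0.8))
print(battery_operation(1.5, energy=4.0, duration=2.0, eta_b=0.8))
```

The asymmetry between the branches is the whole point of HUS: a stored unit costs $1/\eta_{B}$ units of harvest, whereas a drawn unit is worth exactly one unit of transmit energy.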

Now, the benefit function is related to $\alpha_{n}$, and we can rewrite $C_{n,M}^{P_{n}}$ as $C_{n,M}^{{\alpha _{n}}}$. Therefore, maximizing the total benefit amounts to finding a sequence of battery operations $\alpha_{1},\alpha_{2},\ldots,\alpha_{N}$ for the N blocks, which leads to our second-layer allocation in the time domain:

$$\begin{array}{*{20}l} (\mathrm{Second~ layer})~~ &{\underset{{\alpha_{1}},{\alpha_{2}} \cdots {\alpha_{N}}}{\max }} \sum\limits_{n = 1}^{N} C_{n,M}^{{\alpha_{n}}} \end{array} $$
$$\begin{array}{*{20}l} s.t. ~~~~ {B_{n}} = {B_{n-1}} &+ {\alpha_{n}} \ge 0, ~~\forall n \in \left\{ {1,2,\ldots,N} \right\} \end{array} $$

The second-layer allocation problem can be solved by dynamic programming, recursively computing $J_{N},\ldots,J_{1}$ based on Bellman’s equation [15]:

$$\begin{array}{*{20}l} {J_{N}}\left({{B_{N-1}}} \right) &= {\underset{{\alpha_{N}}=- {B_{N-1}}}{\max }}{C_{N,M}^{\alpha_{N}}} \end{array} $$
$$\begin{array}{*{20}l} {J_{n}}\left({{B_{n-1}}} \right) &= {\underset {- {B_{n-1}} \le {\alpha_{n}} \le {\eta_{B}}{E_{n}}}{\max }} \left\{ {{C_{n,M}^{\alpha_{n}}} + {J_{n + 1}}\left({{B_{n}}} \right)} \right\},\\ &~~~~~~n=1,~2,~\ldots,~N-1 \end{array} $$

where $B_{n}$ is updated by Equation 15. Equation 22 gives the optimal benefit of the last $N-n+1$ blocks and describes the tradeoff between the current reward ${C_{n,M}^{\alpha _{n}}}$ and the future reward $J_{n+1}(B_{n})$. A battery operation is feasible if the energy constraint $-B_{n-1}\le\alpha_{n}\le\eta_{B}E_{n}$ is satisfied for all possible $B_{n-1}$, because during block n the system can store at most $\eta_{B}E_{n}$ of energy and can draw at most $B_{n-1}$ of energy from the battery. Equation 21 gives the optimal benefit of the last block, where the constraint (Equation 8) implies that the battery operation must be $\alpha_{N}=-B_{N-1}$ for optimality, whatever the previous state $B_{N-1}$. We compute $J_{n}(B_{n-1})$ as well as the optimal battery operation policy $\alpha_{n}=\mu_{n}(B_{n-1})$ for every $B_{n-1}$, $n\in\{1,2,\ldots,N\}$, where $\mu_{n}(B_{n-1})$ maps the given $B_{n-1}$ to the optimal $\alpha_{n}$. The search for the optimal battery operations $\alpha_{1},\alpha_{2},\ldots,\alpha_{N}$ is thus a dynamic program that starts with the last block and proceeds backward in time [15], yielding the optimal battery operation policy set $\left \{ {\mu _{1}^{*}\left ({{B_{0}}} \right),~\mu _{2}^{*}\left ({{B_{1}}} \right), ~\ldots,~\mu _{N}^{*}\left ({{B_{N - 1}}} \right)} \right \}$. Then, given the initial battery level $B_{0}=0$, the optimal battery operations ${\alpha _{1}^{*}}={\mu _{1}^{*}\left ({0} \right)},{\alpha _{2}^{*}}=\mu _{2}^{*}\left ({{\alpha _{1}^{*}}} \right),~\ldots,~{\alpha _{N}^{*}}=\mu _{N}^{*}\left ({{\sum \nolimits }_{n=1}^{N - 1} {\alpha _{n}^{*}}} \right)$ can be obtained. Through the mapping relation (Equation 18), the optimal power allocation follows. Algorithm 1 summarizes the procedure for finding the optimal power allocation.
It is interesting to note that the DP algorithm is similar in principle to the Viterbi algorithm, except that the former operates backward; thus, our algorithm enjoys the same computational efficiency as the Viterbi algorithm. The optimality follows from Bellman’s equation [15].
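The backward recursion can be sketched in Python as follows. This is a minimal illustration, not the paper's Algorithm 1: it assumes the battery level is quantized to a uniform grid of spacing `step` (a discretization introduced here so that the recursion becomes a finite table), it uses bisection for the first-layer water-filling, and all function names and numbers are hypothetical.

```python
import math


def water_fill(gains, total_power, iterations=100):
    """First layer: water-filling powers for one block (unit noise)."""
    floors = [1.0 / g for g in gains]
    lo, hi = min(floors), max(floors) + total_power
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(max(mid - f, 0.0) for f in floors) > total_power:
            hi = mid
        else:
            lo = mid
    return [max(lo - f, 0.0) for f in floors]


def block_reward(alpha, energy, duration, gains, eta_b):
    """C_n^{alpha_n}: block throughput for a given battery operation."""
    if alpha >= 0:
        power = (energy - alpha / eta_b) / duration
    else:
        power = (energy - alpha) / duration
    powers = water_fill(gains, max(power, 0.0))
    return 0.5 * duration * sum(
        math.log2(1.0 + p * h) for p, h in zip(powers, gains)
    )


def offline_dp(energies, durations, gains, eta_b, step):
    """Second layer: backward DP over quantized battery levels.

    Returns (best_throughput, alphas) starting from B_0 = 0.
    """
    n_blocks = len(energies)
    n_levels = int(eta_b * sum(energies) / step) + 1
    neg_inf = float("-inf")
    # Terminal condition B_N = 0: only level 0 is allowed after block N.
    value = [0.0] + [neg_inf] * (n_levels - 1)
    policy = []
    for n in reversed(range(n_blocks)):
        new_value = [neg_inf] * n_levels
        choice = [0] * n_levels
        for i in range(n_levels):        # B_{n-1} = i * step
            for j in range(n_levels):    # B_n     = j * step
                alpha = (j - i) * step
                if value[j] == neg_inf:
                    continue
                if alpha > eta_b * energies[n] + 1e-12:
                    continue             # cannot store more than eta_b * E_n
                total = value[j] + block_reward(
                    alpha, energies[n], durations[n], gains[n], eta_b
                )
                if total > new_value[i]:
                    new_value[i], choice[i] = total, j
        value = new_value
        policy.insert(0, choice)
    # Forward pass: read out alpha_1..alpha_N from B_0 = 0.
    alphas, i = [], 0
    for n in range(n_blocks):
        j = policy[n][i]
        alphas.append((j - i) * step)
        i = j
    return value[0], alphas


# Two unit-length blocks, one subcarrier, all energy arriving in block 1.
best, alphas = offline_dp([4.0, 0.0], [1.0, 1.0], [[1.0], [1.0]], 0.5, 0.25)
print(round(best, 6), alphas)
```

In this toy case the schedule stores 0.75 units in block 1 and spends them in block 2, giving water levels 3.5 (charging) and 1.75 (discharging), whose ratio equals the storage efficiency 0.5. The cost of the sketch grows with the number of blocks times the square of the number of grid levels, so the grid spacing trades accuracy against run time.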

Online policy

Previously, we solved the maximization problem non-causally, which requires knowing the realizations of the harvested energy and the channel in advance in order to determine the optimal transmission power. However, such information may not be available in all circumstances. Thus, in this section, building on the benchmark solution and the insights of the last section, we analyze online scheduling under the assumption that the transmitter knows only the energy amount of the current block and the probability density functions of the harvested energy and the channel gains. That is, only the causal current-block information (i.e., $s_{n}$) is available, as future states are not known a priori. This allows us to model the unpredictable nature of the wireless channel and the harvesting environment. Let the channel states of block n be ${H_{n}} \triangleq \left ({{H_{n,1}},{H_{n,2}}, \ldots,{H_{n,M}}} \right)$ and denote the state by $s_{n}=(H_{n},E_{n},B_{n-1})$, $n\in\{1,2,\ldots,N\}$. We assume the initial state $s_{1}=(H_{1},E_{1},B_{0})$ to be always known at the transmitter.

Optimal online policy

The optimal solution decides the optimal battery operation $\alpha_{n}$ for each block n. Hence, the objective now becomes maximizing the expected mutual information summed over a finite horizon of N blocks, by choosing a deterministic battery operation policy from the set $\pi=\{\alpha_{n}=\mu(s_{n}),\forall s_{n},n=1,2,\ldots,N\}$ based on the state $s_{n}$. Then, by applying Equation 18, we obtain the optimal power allocation for each subcarrier in each block. This can be solved by dynamic programming with knowledge of only the current block state, as detailed below.

Given the initial state $s_{1}=(H_{1},E_{1},B_{0})$, the maximum throughput is given by $J_{1}(s_{1})$, which can be obtained by recursively computing $J_{N}(s_{N}),\ldots,J_{1}(s_{1})$ based on Bellman’s equation [15]:

$$\begin{array}{*{20}l} {J_{N}}\left({{s_{N}}} \right) &= {\underset{{\alpha_{N}}=- {B_{N-1}}}{\max }}{C_{N,M}^{\alpha_{N}}} \end{array} $$
$$\begin{array}{*{20}l} {J_{n}}\left({{s_{n}}} \right) &= {\underset{- {B_{n-1}} \le {\alpha_{n}} \le {\eta_{B}}{E_{n}}}{\max }} \left\{ {{C_{n,M}^{\alpha_{n}}} + {\overline{J}_{n + 1}}\left({{{s_{n + 1}}|{s_{n}}}} \right)} \right\}, \end{array} $$

for n=1, 2, …, N−1, where

$$\begin{array}{*{20}l} & {\overline{J}_{n + 1}} \left({{{s_{n + 1}}|{s_{n}}}} \right) \\ &~~ =\mathbb{E}_{E_{n+1},H_{n+1}}\left[ {J_{n + 1}}\left.\left({{H_{n + 1}},{E_{n + 1}},\underbrace{{B_{n - 1}} + {\alpha_{n}}}_{{B_{n}}}} \right)\right|\right.\\ &~~~~~~~~~~~~~~~~~~~~~~~~~~~~\left.{H_{n}},{E_{n}},{B_{n - 1}},{\alpha_{n}}\vphantom{\underbrace{{B_{n - 1}} + {\alpha_{n}}}_{{B_{n}}}}\right] \end{array} $$

$\mathbb {E}(\cdot)$ denotes the expectation over the distributions of the harvesting process and the fading process. The optimal battery operation policy is denoted as ${{\pi ^{*}} = \left \{ {\alpha _{n}^{*}=\mu _{n}^{*}\left ({{s_{n}}} \right),\forall {s_{n}},n = 1,2, \ldots N} \right \}}$ and can be solved iteratively. It is possible, however, to further reduce the dimension of the problem to make it more tractable. For example, if the arrival process is Markovian or i.i.d., the past states do not provide any additional information about the process. The optimality follows from Bellman’s equation [15].
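To make the dimension reduction concrete, consider the i.i.d. case as a worked step, under the additional assumption that $(E_{n+1},H_{n+1})$ is independent of $(E_{n},H_{n})$. The conditioning on the current harvest and channel then drops out of the expected future reward:

$$ \overline{J}_{n+1}\left(s_{n+1}|s_{n}\right)=\mathbb{E}_{E_{n+1},H_{n+1}}\left[J_{n+1}\left(H_{n+1},E_{n+1},B_{n-1}+\alpha_{n}\right)\right]=:\overline{J}_{n+1}\left(B_{n}\right) $$

The expected future reward can therefore be tabulated as a function of the scalar battery level $B_{n}=B_{n-1}+\alpha_{n}$ alone, rather than of the full state $s_{n}$, which is what makes the backward recursion tractable in practice.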

Structure online policy

In this subsection, based on the structural properties of the optimal offline solution, we give an intuitive understanding of the optimal solution. During the hill segment HS(j,m), if $E_{k}/\ell_{k} \ge {\sum \nolimits }_{j = 1}^{M} {\beta _{k,j}}$ (i.e., charging), block $k\in[j,\ldots,m]$ is allocated the water level $w^{+}$; if $E_{k}/\ell_{k} \le {\sum \nolimits }_{j = 1}^{M} {\gamma _{k,j}}$ (i.e., discharging), block k is allocated the water level $w^{-}$. Otherwise, that is, if ${\sum \nolimits }_{j = 1}^{M} {\gamma _{k,j}}<E_{k}/\ell_{k}< {\sum \nolimits }_{j = 1}^{M} {\beta _{k,j}}$ (i.e., neutral), block k is allocated the water level $w^{0}$, where $\beta_{k,j}=[w^{+}-(H_{k,j})^{-1}]^{+}$ and $\gamma_{k,j}=[w^{-}-(H_{k,j})^{-1}]^{+}$. Hence, we propose the following heuristic online policy, based on the determination of $w^{+}$ and $w^{-}$. Without loss of generality, we assume unit energy arrival durations (i.e., $\ell_{k}=1$, $k\in\{1,2,\ldots,N\}$).

$$\begin{array}{*{20}l} p_{k,j}^{*} = \left\{ {\begin{array}{cl} &\left[w^{+} - (H_{k,j})^{-1}\right]^{+}, ~~E_{k} \ge \sum\limits_{j = 1}^{M} {\beta_{k,j}} \\[-2pt] &\left[w^{-} - (H_{k,j})^{-1}\right]^{+}, ~~E_{k} \le \sum\limits_{j = 1}^{M} {\gamma_{k,j}} \\[-2pt] &\left[w^{0} -(H_{k,j})^{-1}\right]^{+}, ~~\text{otherwise} \end{array}} \right. \; \end{array} $$

where $w^{0}$ can be obtained by the traditional water-filling method with the power constraint $E_{k}$. Assuming that the distribution of the harvesting process is known as $f(p_{E})$, we propose finding fixed water levels $w^{+}$ and $w^{-}$ that simultaneously satisfy:

$$\begin{array}{*{20}l} & { {{\eta_{B}}\int_{{\sum\nolimits}_{j = 1}^{M}{\beta_{k,j}}}^{\infty} {{ \left(p_{E}-\sum\limits_{j = 1}^{M}{\beta_{k,j}}\right)} f\left({{p_{E}}} \right)d{p_{E}}} } } \\[-2pt] &~~~~~~~~~~~~~~~~~~~~ = { {\int_{0}^{{\sum\nolimits}_{j = 1}^{M}{\gamma_{k,j}}} { {\left(\sum\limits_{j = 1}^{M}{\gamma_{k,j}}-p_{E}\right) }f\left({{p_{E}}} \right)d{p_{E}}} } } \end{array} $$
$$\begin{array}{*{20}l} {\eta_{B}}w^{+} = w^{-} \end{array} $$

Equation 26, the first condition above, provides long-term energy stability by ensuring that the expected energy stored in the battery equals the expected energy drawn from it. Equation 27, the second condition, can be obtained from Property 2. A simpler way to approximately determine the water levels $w^{+}$ and $w^{-}$ can be described as follows:

$$\begin{array}{*{20}l} {{\eta_{B}} {{ \left(E_{k}-\sum\limits_{j = 1}^{M}{\beta_{k,j}}\right)} } }= {\sum\limits_{j = 1}^{M}{\gamma_{k,j}}-E_{k} } \end{array} $$

where only knowledge of the current state is needed; this simplifies the computation because the distribution of the energy-harvesting process need not be known. For completeness, an implementation of the proposed online policy is given in Algorithm 2.
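As an illustration of the piecewise allocation rule above, the following Python sketch selects the charging, discharging, or neutral branch for a single block of unit duration. It is not the paper's Algorithm 2: the function names `water_level` and `online_power`, the mode labels, and the example numbers are assumptions made for illustration, the channel gains are assumed strictly positive, and the water levels $w^{+}$ and $w^{-}$ are treated as given inputs rather than computed.

```python
def water_level(inv_gains, power):
    """Classical water-filling level w such that sum([w - c_j]^+) == power."""
    ordered = sorted(inv_gains)
    level, prefix = ordered[0], 0.0
    for n, c_n in enumerate(ordered, start=1):
        prefix += c_n
        candidate = (power + prefix) / n
        if candidate <= c_n:
            # The n-th best subcarrier would receive no power, so stop here.
            break
        level = candidate
    return level


def online_power(energy, gains, w_plus, w_minus):
    """Double-threshold allocation for one block with unit duration.

    energy  -- harvested energy E_k in the block
    gains   -- channel gains H_{k,j} (assumed strictly positive)
    w_plus  -- charging water level w^+
    w_minus -- discharging water level w^-
    Returns a (mode, powers) pair; the mode labels are illustrative.
    """
    inv = [1.0 / h for h in gains]
    beta = [max(w_plus - c, 0.0) for c in inv]
    gamma = [max(w_minus - c, 0.0) for c in inv]
    if energy >= sum(beta):
        # Surplus energy: transmit at w^+ and store the remainder.
        return "charge", beta
    if energy <= sum(gamma):
        # Energy deficit: transmit at w^- and draw the shortfall from storage.
        return "discharge", gamma
    # Otherwise use exactly the harvested energy (battery untouched).
    w_zero = water_level(inv, energy)
    return "neutral", [max(w_zero - c, 0.0) for c in inv]


if __name__ == "__main__":
    eta_b = 0.8                      # assumed storage efficiency
    w_plus = 3.0                     # assumed charging water level
    w_minus = eta_b * w_plus         # second condition: eta_B * w^+ = w^-
    for e_k in (5.0, 2.5, 1.0):
        print(e_k, online_power(e_k, [1.0, 0.5, 0.25], w_plus, w_minus))
```

In the neutral branch the allocated powers sum to the harvested energy, so the battery is neither charged nor discharged in that block.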

Numerical results

In this section, we present the numerical results in order to demonstrate the performance of our offline and online policies.

We first give a pictorial view of the optimal offline power allocation strategy for our HUS harvesting system in Figure 3. The channel level, defined as the reciprocal of the channel gain, serves as the bottom of a vessel. Note that there are two hill segments, i.e., HS(1,4) and HS(5,8). We can see that the water levels of the charging blocks are equal within a hill segment, and the same holds for the discharging blocks. Moreover, the water levels of each mode are non-decreasing from one hill segment to the next. In the second hill segment, note that $w^{+}$, $w^{-}$, and $w^{0}$ satisfy the relationship $w^{+} \ge w^{0} \ge w^{-}$. In particular, no power is allocated to the first subcarrier of block 1, the third subcarrier of block 3, and the second subcarrier of block 5, because the corresponding channel gain is so poor that its reciprocal exceeds the water level.

Figure 3

Water-filling of frequency selective fading channel for inefficient energy-harvesting system. The block length N=8.

We compare the maximum throughput of our policy with that of other harvesting architectures in Figure 4, where each throughput point is obtained by averaging over 1,000 random realizations of the harvested energy in Rayleigh fading of unit power. We assume that there are a total of N=6 blocks, for which the harvested energy varies independently from one block to another following the uniform distribution over the range [1,8], symbolically denoted as $\mathcal{U}(1,8)$. Each block has a random duration uniformly distributed as $\mathcal{U}(1,4)$. We determine the HSU results by using the optimal power policy developed in [12] and taking into account the storage efficiency. HU (i.e., harvest-use) is the greedy policy, which immediately uses up the harvested energy without storage. It is observed from the figure that the HUS mode always outperforms its counterparts, regardless of the storage efficiency. For very low storage efficiency (less than 0.4), the performance of HUS coincides with that of HU, implying that no energy is stored, as shown in the left subfigure. In particular, when $\eta_{B} = 1$, HSU achieves the same performance as HUS, as shown in the right subfigure.

Figure 4

Maximum throughput of HU, HSU, and HUS versus the storage efficiency. The block length N=6, ${E_{k}} \in \mathcal {U}({1,8})$ , ${l_{k}} \in \mathcal {U}({1,4})$ .
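The simulation setup for the HU baseline can be sketched in Python as follows. This is a minimal Monte Carlo sketch under stated assumptions, not the authors' simulation code: the number of subcarriers (4), the rate function $\log_2(1 + H p)$ per subcarrier, the use of water-filling across subcarriers within each block, and all function names are assumptions; only N=6, $E_k \in \mathcal{U}(1,8)$, $l_k \in \mathcal{U}(1,4)$, unit-power Rayleigh fading, and the 1,000 trials come from the text.

```python
import math
import random


def water_fill(inv_gains, power):
    """Return per-subcarrier powers [w - c_j]^+ that sum to `power`."""
    ordered = sorted(inv_gains)
    level, prefix = ordered[0], 0.0
    for n, c_n in enumerate(ordered, start=1):
        prefix += c_n
        candidate = (power + prefix) / n
        if candidate <= c_n:
            # The n-th best subcarrier would receive no power, so stop here.
            break
        level = candidate
    return [max(level - c, 0.0) for c in inv_gains]


def hu_throughput(rng, n_blocks=6, n_subcarriers=4):
    """Throughput of one greedy harvest-use realization (no storage)."""
    total = 0.0
    for _ in range(n_blocks):
        energy = rng.uniform(1.0, 8.0)   # E_k ~ U(1, 8)
        length = rng.uniform(1.0, 4.0)   # l_k ~ U(1, 4)
        # Unit-power Rayleigh fading: the power gain is exponential, mean 1.
        gains = [max(rng.expovariate(1.0), 1e-12) for _ in range(n_subcarriers)]
        # Greedy HU: spend all harvested energy within the same block.
        powers = water_fill([1.0 / h for h in gains], energy / length)
        # Assumed rate function: log2(1 + H * p) per subcarrier.
        total += length * sum(
            math.log2(1.0 + h * p) for h, p in zip(gains, powers)
        )
    return total


def average_hu_throughput(trials=1000, seed=0):
    """Average HU throughput over independent random realizations."""
    rng = random.Random(seed)
    return sum(hu_throughput(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(average_hu_throughput())
```

Because HU never stores energy, its throughput does not depend on the storage efficiency, which is why it serves as a flat reference for the other architectures.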

Figure 5 shows the average throughput achieved with the optimal offline policy and the online policies. It is observed that the proposed online policy, with $w^{+}$ and $w^{-}$ determined either from Equation 26 or from Equation 28, performs well in comparison with the optimal online policy, and that all online policies remain close to the optimal offline upper bound despite the absence of non-causal harvesting and fading information.

Figure 5

Average transmission throughput versus battery efficiency for offline and online policies.


In this paper, we analyzed the problem of maximizing the data transmission for energy-harvesting wireless communication systems operating in HUS mode over a frequency-selective fading channel. We proposed a DP-based double-layer policy and analyzed the properties of the solution. It was shown that the optimal policy has a double-threshold structure. Based on this, we further provided an optimal online policy and a heuristic online one. Numerical results show that the proposed policy outperforms other offline strategies with different energy-harvesting architectures and that the proposed online policy performs notably well, closely tracking the optimal online policy.


  1. A Kansal, J Hsu, S Zahedi, MB Srivastava, Power management in energy harvesting sensor networks. ACM Trans. Embed. Comput. System. 6(4), Article 32, 1–38 (2007).

  2. O Ozel, K Tutuncuoglu, J Yang, S Ulukus, A Yener, Transmission with energy harvesting nodes in fading wireless channels: optimal policies. IEEE J. Sel. Areas Commun. 29(8), 1732–1743 (2011).

  3. J Gong, S Zhou, Z Niu, Optimal power allocation for energy harvesting and power grid coexisting wireless communication systems. IEEE Trans. Commun. 61(7), 3040–3049 (2013).

  4. M Antepli, E Uysal-Biyikoglu, H Erkal, Optimal packet scheduling on an energy harvesting broadcast link. IEEE J. Sel. Areas Commun. 29(8), 1721–1731 (2011).

  5. J Yang, O Ozel, S Ulukus, Broadcasting with an energy harvesting rechargeable transmitter. IEEE Trans. Wireless Commun. 11(2), 571–583 (2012).

  6. J Yang, S Ulukus, in Proceedings of the 2011 IEEE International Conference on Communications (ICC). Optimal packet scheduling in a multiple access channel with rechargeable nodes (IEEE, Kyoto, Japan, 2011), pp. 1–5.

  7. O Orhan, E Erkip, in Proceedings of the 2013 IEEE International Symposium on Information Theory (ISIT). Throughput maximization for energy harvesting two-hop networks (IEEE, Istanbul, Turkey, 2013), pp. 1596–1600.

  8. O Orhan, E Erkip, in Proceedings of 2012 46th Annual Conference on Information Sciences and Systems (CISS). Optimal transmission policies for energy harvesting two-hop networks (Princeton, NJ, USA, 2012), pp. 1–6.

  9. B Devillers, D Gunduz, A general framework for the optimization of energy harvesting communication systems with battery imperfections. J. Commun. Netw. Spec. Issue Energy Harvesting Wireless Netw. 14(2), 130–139 (2012).

  10. K Tutuncuoglu, A Yener, Optimum transmission policies for battery limited energy harvesting nodes. IEEE Trans. Wireless Commun. 11(3), 1180–1189 (2012).

  11. F Yuan, KQT Zhang, S Jin, H Zhu, in Proceedings of the 2014 IEEE International Conference on Communications (ICC). A harvest-use-store mode for energy harvesting communication systems with optimal power policy (IEEE, Sydney, Australia, 2014), pp. 5366–5371.

  12. J Yang, S Ulukus, Optimal packet scheduling in an energy harvesting communication system. IEEE Trans. Commun. 60(1), 220–230 (2012).

  13. D Tse, P Viswanath, Fundamentals of Wireless Communication (Cambridge University Press, Cambridge, UK, 2005).

  14. TM Cover, JA Thomas, Elements of Information Theory, 2nd ed. (Wiley, New York, NY, USA, 2006).

  15. DP Bertsekas, Dynamic Programming and Optimal Control, vol. 1 (Athena Scientific, Belmont, MA, 1995).



The authors would like to thank the editors and the anonymous reviewers for providing very detailed suggestions for revision, which helped considerably to improve the presentation of the paper. This material is supported by the National Basic Research Program of China (973 Program) under Grant No. (2013CB329005), the National Natural Science Foundation of China under Grant No. (61271237, 61222102, 61401235), the Natural Science Foundation of Jiangsu Province under Grant No. (BK2012021), and Jiangsu Scientific Innovation Research of University Graduate under Grant No. (KYLX_0807).

Author information

Correspondence to Hongbo Zhu.

Additional information

Competing interests

The authors declare that they have no competing interests.



  • Energy harvesting
  • Throughput maximization
  • Storage inefficiency
  • Harvest-use-store
  • Double threshold
  • Dynamic programming