 Research
 Open Access
Throughput maximization-based optimal power allocation for energy-harvesting cognitive radio networks with multiusers
 Yuanyi Wang^{1},
 Xiaohui Zhao†^{1} and
 Hui Liang†^{1}
https://doi.org/10.1186/s13638-017-1016-y
© The Author(s) 2018
 Received: 23 May 2017
 Accepted: 18 December 2017
 Published: 5 January 2018
Abstract
An optimal power allocation (OPA) policy is presented for orthogonal frequency division multiplexing (OFDM)-based cognitive radio networks (CRNs) that use the underlay spectrum access model and contain multiple secondary users (SUs) with energy harvesting (EH). The proposed algorithm allocates transmission power to each SU on each subcarrier with the objective of maximizing the average throughput of the secondary network over a finite time interval. We consider both the interference power constraint imposed by the primary user (PU) and the minimum throughput constraint of each SU, so as to improve the throughput of the SUs while guaranteeing the communication quality of the PU. To balance current throughput and expected future throughput, a dynamic programming (DP) problem is defined and solved by the backward induction method. Moreover, for each time slot, a convex immediate optimization problem is formulated and solved by the Lagrange dual method. Simulation results show that our policy achieves better performance than some traditional policies and ensures good quality of service (QoS) for the PU when SUs access the spectrum.
Keywords
 Optimal power allocation
 Energy harvesting
 Cognitive radio network
 Dynamic programming
 Lagrange dual method
1 Introduction
Cognitive radio (CR) [1] is a candidate solution to the spectrum shortage problem, as it can both improve the efficiency of spectrum utilization and ensure quality of service (QoS). In recent years, energy harvesting (EH), which enables uninterrupted and self-sustainable operation [2], has attracted widespread research attention in wireless communications. By combining EH technology with a cognitive radio network (CRN), the secondary users (SUs) can use an EH device to collect energy from radio frequency (RF) signals, such as the primary channel, and from ambient sources, such as solar power [3]. Therefore, with an energy harvesting device, SUs obtain renewable energy that improves energy efficiency and device flexibility.
Many studies have investigated optimal power allocation (OPA) policies with EH devices in wireless communication scenarios under different considerations [4–7]. An OPA policy for EH wireless communications with limited channel feedback from the receiver is investigated in [4], where the receiver periodically sends only 1-bit feedback obtained by comparing the channel power gain with a predetermined threshold. In [5], a power allocation policy for an access-controlled transmitter with EH capability based on causal observations of the channel fading state is considered. In [6], the authors study the OPA for an outage probability minimization problem in point-to-point fading channels with constraints on the EH and the channel distribution at the transmitter. In [7], a Markov decision process (MDP) model is proposed for the energy allocation problem over a finite horizon to maximize the throughput in point-to-point wireless communications, taking into account both channel conditions and time-varying energy sources. These results show that EH technology can improve energy efficiency; however, it also brings new challenges to the power allocation strategy, such as the uncertainty of the EH process.
In fact, the papers [4–7] consider only point-to-point wireless communications without CR technology. Recently, some studies have focused on the combination of EH technology and CRN, and different OPAs are discussed in [8–14]. The OPA for EH CRN is studied in [8], where the authors consider system throughput maximization over a finite horizon rather than at a single time slot and adopt a rate loss constraint to protect the transmission of the primary user (PU). The access strategy for hybrid underlay-overlay CR networks with EH is analyzed in [9]: a partially observable Markov decision process (POMDP) framework determines the action of the SU, while an energy threshold determines its transmission mode. In [10], for an EH-CR system, a power allocation policy with peak power constraints is proposed, and the throughput maximization problem is solved by recursion machinery and a geometric water-filling algorithm. A novel saving-sensing-transmitting (SST) frame structure for EH CRNs is proposed in [11], where the authors aim to maximize the energy utilization efficiency of the SU by jointly optimizing the save ratio and transmission power under both the energy causality constraint and the minimum throughput constraint. The SST structure can make full use of residual battery energy and ensure enough time for spectrum sensing and data transmission. A generalized multi-slot spectrum sensing paradigm and two types of fusion rules (data fusion and decision fusion) are proposed in [12]. The authors focus on the “harvesting-sensing-throughput” tradeoff and jointly optimize the save ratio, sensing duration, sensing threshold, and fusion rule to maximize the expected achievable throughput of the SU while protecting the PU.
A POMDP is proposed in [13] to trade off energy consumption and throughput gain in a hybrid CRN, where the SU dynamically determines its operation mode for each time slot (e.g., to be idle or to transmit), its sensing time, and its access mode. In [14], for an overlay EH CRN, the authors aim to find an optimal sensing time that maximizes the throughput of the SU and the harvested RF energy. Most existing works investigate the OPA problem only in CR systems with a single user, which is not practical for real communication systems.

We consider a system model with multiple SUs where each SU transmitter is equipped with an EH device. In our study, we focus on maximizing the average throughput of the secondary system within K time slots under the maximum transmission power constraint, the minimum throughput constraint of the SUs, and the interference power constraint defined by the PU. The proposed algorithm maximizes the average throughput while satisfying all constraints simultaneously, which yields better performance for the SUs and preserves the QoS of the PU.

The orthogonal frequency division multiple access (OFDMA) scheme is used in the process of spectrum sharing, where the available spectrum is divided into a set of subcarriers. The optimal transmission power of each SU can be obtained by using a dynamic programming (DP) algorithm and an immediate optimization, which are solved by the backward induction method and the Lagrange dual method, respectively.
The rest of the paper is organized as follows. In Section 2, a system model of EH CRNs, a subcarrier allocation model of SUs, and a subcarrier occupation state are described. We introduce the EH process in Section 3 and present the throughput optimization problem and the immediate OPA algorithm. Then, the system state and the DP-based scheme are formulated, and the backward induction method is given in Section 4. In Section 5, we present our simulation results and performance analysis through the comparison between our proposed policy and other policies. Finally, the conclusion of the whole paper is provided in Section 6.
2 System model
We consider Rayleigh fading channels modeled as a two-state Markov chain [16, 17], as shown in Fig. 1b. The channel gains from PU-Tx to PU-Rx, from the m^{th} SU-Tx to the m^{th} SU-Rx, from the m^{th} SU-Tx to PU-Rx, and from PU-Tx to the m^{th} SU-Rx over the n^{th} subcarrier are denoted by w^{n}, \(h_{m}^{n}\), \(a_{m}^{n}\), and \(b_{m}^{n}\), respectively. Moreover, we select the underlay spectrum access model, in which an SU may coexist with the PU as long as it keeps its interference to the PU below a certain threshold.
2.1 Subcarrier allocation model
According to the above system model, we determine a policy to allocate N subcarriers to M SUs. Both the PU and SU systems use the OFDMA scheme, and we assume that a subcarrier can be used by only one SU in each time slot, so interference between SUs is not considered. Each SU can use multiple subcarriers in each time slot; the subcarriers occupied by the m^{th} SU are denoted by the set \({{\mathcal {K}}_{m}}\), \({{\mathcal {K}}_{m}} \subseteq {\mathcal {N}}\). The minimum rate requirement [18] of each SU is denoted by \(R_{\min }^{m},\forall m \in {\mathcal {M}}\). The allocation policy can be described as follows.
First, we determine the priority of each SU according to \(R_{\min }^{m}\): the larger the value of \(R_{\min }^{m}\), the higher the priority of the SU. Then, the subcarriers that have a good channel condition and are not occupied by the PU are allocated to the SUs with high priority. Following this policy, the subcarriers are allocated to the SUs.
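The priority rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how many subcarriers each SU receives, so the sketch assumes that SUs take turns in priority order and that each picks the free subcarrier with its own best channel gain. The function name `allocate_subcarriers` and its arguments are hypothetical.

```python
from typing import Dict, List, Sequence


def allocate_subcarriers(
    r_min: Sequence[float],
    gains: Sequence[Sequence[float]],
    pu_busy: Sequence[bool],
) -> Dict[int, List[int]]:
    """Assign PU-free subcarriers to SUs in priority order.

    r_min[m]    : minimum rate requirement of SU m (larger means higher priority)
    gains[m][n] : channel gain of SU m on subcarrier n
    pu_busy[n]  : True if the PU occupies subcarrier n
    """
    num_su = len(r_min)
    # Higher R_min first; ties are broken by the SU index.
    priority = sorted(range(num_su), key=lambda m: (-r_min[m], m))
    free = {n for n, busy in enumerate(pu_busy) if not busy}
    allocation: Dict[int, List[int]] = {m: [] for m in range(num_su)}

    # Assumed rule (not from the paper): SUs take turns in priority order,
    # each picking the remaining free subcarrier with its best gain.
    while free:
        for m in priority:
            if not free:
                break
            best = max(free, key=lambda n: (gains[m][n], -n))
            allocation[m].append(best)
            free.remove(best)
    return allocation


alloc = allocate_subcarriers(
    r_min=[1.0, 3.0],
    gains=[[0.9, 0.2, 0.5, 0.4], [0.8, 0.7, 0.1, 0.3]],
    pu_busy=[False, False, True, False],
)
print(alloc)
```

In this example the second SU has the larger minimum rate requirement, so it chooses first and ends up with two of the three free subcarriers.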
2.2 Subcarrier occupation state
Since the subcarrier occupation state changes from time slot to time slot, each SU should perform spectrum sensing at the beginning of each time slot to determine the behavior of the PU. From Fig. 1b, the channel may be in one of two states: busy (B) or free (F). State B denotes that the PU is active; conversely, state F denotes that the PU is inactive. We consider K time slots in our study; the time slot set is defined as \({\mathcal {K}} = \left \{ {0,1, \ldots,k, \ldots,K - 1} \right \}\), with \(\forall k \in {\mathcal {K}}\).
We define \(x_{k}^{n}\) to indicate the state of n^{ th } channel at the time slot k, which has two possible values
According to \(P_{\text {BF}}^{n}\) and \(P_{\text {FB}}^{n}\), we can get the value of \(p_{ij}^{o}\).
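A minimal sketch of how such transition probabilities can be assembled is given below. It assumes that \(P_{\text {BF}}^{n}\) is the probability of moving from busy to free and \(P_{\text {FB}}^{n}\) the probability of moving from free to busy, and that the subcarriers evolve independently; both are assumptions, and the function names are hypothetical.

```python
def occupancy_transition_matrix(p_bf, p_fb):
    """Two-state chain with state order (F, B) = (0, 1).

    Entry [i][j] is Pr(next state = j | current state = i).
    """
    return [[1.0 - p_fb, p_fb],
            [p_bf, 1.0 - p_bf]]


def joint_transition_prob(current, nxt, p_bf, p_fb):
    """Probability that all subcarriers move from `current` to `nxt`.

    Assumes (not stated in the text) that subcarriers evolve independently,
    so the joint probability is the product of per-subcarrier probabilities.
    """
    prob = 1.0
    for n, (i, j) in enumerate(zip(current, nxt)):
        prob *= occupancy_transition_matrix(p_bf[n], p_fb[n])[i][j]
    return prob


# Subcarrier 0 stays free, subcarrier 1 goes from busy to free.
prob = joint_transition_prob(current=(0, 1), nxt=(0, 0),
                             p_bf=[0.3, 0.4], p_fb=[0.2, 0.1])
print(prob)
```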
3 Optimal power allocation policy
3.1 Energyharvesting process
where \(p_{k}^{m,n}\) denotes the transmission power allocated to the m^{ th } SU in the n^{ th } channel at the time slot k, T denotes the duration of one time slot, and B_{max} is the maximum battery capacity.
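As a rough illustration of how these quantities interact, the sketch below uses a battery update that is common in the EH literature: the stored energy decreases by the energy spent in the slot, increases by the harvested energy, and is capped at B_{max}. The exact update used in this paper is not reproduced here, so this form, the function name `next_battery`, and the argument `harvested` are assumptions.

```python
def next_battery(b_k, powers, harvested, slot_duration, b_max):
    """Assumed battery update for one SU over one time slot.

    b_k           : energy stored at the start of slot k (J)
    powers        : per-subcarrier transmission powers p_k^{m,n} (W)
    harvested     : energy harvested during slot k (J), hypothetical input
    slot_duration : duration T of one time slot (s)
    b_max         : maximum battery capacity B_max (J)
    """
    used = slot_duration * sum(powers)
    if used > b_k + 1e-12:
        raise ValueError("transmission energy exceeds the stored energy")
    # Spend, harvest, then clip at the battery capacity.
    return min(b_k - used + harvested, b_max)


b_small = next_battery(5.0, [1.0, 0.5], harvested=1.0, slot_duration=2.0, b_max=5.0)
b_capped = next_battery(5.0, [1.0, 0.5], harvested=4.0, slot_duration=2.0, b_max=5.0)
print(b_small, b_capped)
```

The second call shows the effect of the capacity limit: the harvested energy that does not fit into the battery is lost.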
3.2 Problem formulation
where E{·} denotes the expectation over the channel gain distribution and the subcarrier occupation state at each time slot. C1 denotes the maximum transmission power constraint, and \(B_{k}^{m}/T\) is the total transmission power budget for the m^{th} SU at time slot k. This constraint ensures that the transmission power of each SU at each time slot does not exceed the energy budget. C2 denotes the interference power constraint, which guarantees that the interference to the PU remains below I^{th}, where I^{th} is the interference threshold prescribed by the PU receiver. C3 represents the minimum throughput constraint, which keeps the throughput above the minimum throughput requirement in the network. Constraint C4 keeps the transmission power of each SU physically realistic.
3.3 Immediate optimization problem solution
where λ_{m}, μ, ξ_{m} ≥ 0 are the Lagrange multipliers. The dual variable λ_{m} is associated with the maximum transmission power constraint, the dual variable μ with the interference power constraint, and the dual variable ξ_{m} with the minimum throughput constraint.
where [·]^{+}= max(0,·).
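As a hedged illustration of how a solution of this \([\cdot]^{+}\) form typically arises, suppose (an assumption, since the rate expression and the closed-form solution are not reproduced here) that the rate of the m^{th} SU on subcarrier n is \(\log _{2}(1 + g_{k}^{m,n}p_{k}^{m,n})\), where \(g_{k}^{m,n}\) is a hypothetical effective gain-to-noise ratio, that the interference caused at the PU receiver is \(a_{m}^{n}p_{k}^{m,n}\), and that the Lagrangian L adds the constraint terms weighted by λ_{m}, μ, and ξ_{m} to the sum rate. Setting the derivative of L with respect to \(p_{k}^{m,n}\) to zero and projecting onto non-negative powers then gives a water-filling-type expression:

```latex
\[
\frac{\partial L}{\partial p_{k}^{m,n}}
  = \frac{(1+\xi_{m})\, g_{k}^{m,n}}{\ln 2\,\bigl(1 + g_{k}^{m,n} p_{k}^{m,n}\bigr)}
    - \lambda_{m} - \mu\, a_{m}^{n} = 0
\quad\Longrightarrow\quad
p_{k}^{m,n}
  = \left[ \frac{1+\xi_{m}}{\ln 2\,\bigl(\lambda_{m} + \mu\, a_{m}^{n}\bigr)}
           - \frac{1}{g_{k}^{m,n}} \right]^{+}
\]
```

Under these assumptions, a larger λ_{m} or μ lowers the water level and hence the power, whereas a larger ξ_{m} raises it.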
where i denotes the iteration number, and α_{1}, α_{2}, and α_{3} ≥ 0 are small step sizes. A proper selection of the step sizes ensures the stability and convergence of this dual algorithm [21]. Finally, we obtain the optimal solution \(p_{k}^{m,{n^ * }}\). Then, by substituting this solution into (11), the optimal throughput of each SU can be calculated.
Algorithm 1 Immediate OPA algorithm
1: Initialization: set i=0, λ_{m}(0)>0, μ(0)>0, ξ_{m}(0)>0; \({\alpha _{1}}\left (0 \right) > 0, {\alpha _{2}}\left (0 \right) > 0,{\alpha _{3}}\left (0 \right) > 0;{I^{th}} > 0,R_{\min }^{m} > 0\);
2: Solve the optimization problem (13) to obtain \(p_{k}^{m,n},\forall k\), and thus the throughput of the secondary system;
3: Update the Lagrange multipliers λ_{m}, μ, and ξ_{m} by (19) to (21);
4: Update the transmission power \(p_{k}^{m,n}\) by (18);
5: Go to 2 until \(\left | {\hat p_{k}^{m,n}\left ({i + 1} \right) - \hat p_{k}^{m,n}\left (i \right)} \right | \le \varepsilon \), where ε represents the iteration precision, usually a very small positive constant;
6: End: The optimal transmission power \(p_{k}^{m,{n^ * }}\) is calculated by (18); substituting \(p_{k}^{m,{n^ * }}\) into (11) gives the optimal throughput.
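The iteration of Algorithm 1 can be sketched as a projected dual subgradient loop. The sketch below is not the authors' implementation: it assumes a rate of log2(1 + g·p) with a hypothetical effective gain-to-noise ratio `gain[m][n]`, uses the water-filling-type power that follows from that assumption, clips each power to the budget B/T, and, as a design choice, also requires the multipliers to settle before stopping rather than checking the power change alone.

```python
import math


def immediate_opa(gain, interf, budget, i_th, r_min,
                  step=0.01, eps=1e-6, max_iter=10000):
    """Single-slot power allocation by a dual subgradient method (sketch).

    gain[m][n]   : assumed gain-to-noise ratio of SU m on subcarrier n
                   (0 if subcarrier n is not assigned to SU m)
    interf[m][n] : gain from SU m's transmitter to the PU receiver
    budget[m]    : power budget B_k^m / T of SU m
    """
    num_su, num_sc = len(gain), len(gain[0])
    lam = [1.0] * num_su   # multipliers of the power-budget constraints
    mu = 1.0               # multiplier of the interference constraint
    xi = [1.0] * num_su    # multipliers of the minimum-rate constraints
    p = [[0.0] * num_sc for _ in range(num_su)]
    rate = [0.0] * num_su

    for _ in range(max_iter):
        # Power for the current multipliers (assumed water-filling form).
        new_p = [[0.0] * num_sc for _ in range(num_su)]
        for m in range(num_su):
            for n in range(num_sc):
                if gain[m][n] <= 0:
                    continue  # subcarrier n is not assigned to SU m
                price = math.log(2) * (lam[m] + mu * interf[m][n])
                level = (1.0 + xi[m]) / max(price, 1e-12)
                # Clip to [0, budget]: no subcarrier can use more than B/T.
                new_p[m][n] = min(budget[m], max(0.0, level - 1.0 / gain[m][n]))

        rate = [sum(math.log2(1.0 + gain[m][n] * new_p[m][n])
                    for n in range(num_sc)) for m in range(num_su)]
        total_interf = sum(interf[m][n] * new_p[m][n]
                           for m in range(num_su) for n in range(num_sc))

        # Projected subgradient updates of the multipliers.
        new_lam = [max(0.0, lam[m] + step * (sum(new_p[m]) - budget[m]))
                   for m in range(num_su)]
        new_mu = max(0.0, mu + step * (total_interf - i_th))
        new_xi = [max(0.0, xi[m] - step * (rate[m] - r_min[m]))
                  for m in range(num_su)]

        change = max(
            max(abs(new_p[m][n] - p[m][n])
                for m in range(num_su) for n in range(num_sc)),
            max(abs(a - b) for a, b in zip(new_lam, lam)),
            abs(new_mu - mu),
            max(abs(a - b) for a, b in zip(new_xi, xi)),
        )
        p, lam, mu, xi = new_p, new_lam, new_mu, new_xi
        if change <= eps:
            break
    return p, rate


# One SU with two subcarriers; only the power budget is active here.
powers, rates = immediate_opa(gain=[[10.0, 2.0]], interf=[[0.1, 0.1]],
                              budget=[1.0], i_th=1.0, r_min=[0.5])
print(powers, rates)
```

In this toy case the interference and minimum-rate constraints are inactive, so the result should approach the classical water-filling split of the 1 W budget, roughly 0.7 W on the stronger subcarrier and 0.3 W on the weaker one.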
However, this solution does not directly apply to multiple time slots, namely, K>1, in which we should balance the current throughput and the expected future throughput. According to our system model, the subcarrier occupation state, the harvested energy state, and the total energy budget are all time-varying states. To extend this problem to K>1, we use a DP algorithm, which is introduced in the next section.
4 Dynamic programming formulation and backward induction method
4.1 System state
To account for the expected future throughput, the transition probability of the system state should be determined. From the previous section, we know that the system states include the subcarrier occupation and the harvested energy, which are independent of each other. Since the EH process of each SU is independent, the harvested energy states of the SUs are also independent.
According to (5), (6), and (7), the transition probability can be calculated.
4.2 Dynamic programming formulation
4.3 Backward induction method
The backward induction method [23] can be used to solve (25). The immediate reward function at the current time slot is defined as \({R_{k}}\left ({B_{k}^{1},B_{k}^{2}, \ldots,B_{k}^{M},{S_{k}}} \right)\), while \({V_{k + 1}}\left ({B_{k + 1}^{1},B_{k + 1}^{2}, \ldots,B_{k + 1}^{M},{S_{k + 1}}} \right)\) denotes the future reward function at the next time slot k+1.
Thus, the reward function can be further calculated in the timereversal order.
where \(p_{K - 1}^{m} = \sum \limits _{n = 1}^{N} {p_{K - 1}^{m,n}},\forall m \in {\mathcal {M}}\). In this case, we only need to consider the immediate reward function, which can be achieved when the transmission power of each SU on each subcarrier satisfies the optimal solution (18).
where \(B_{k + 1}^{m}\) can be updated by (8), \(\forall m \in {\mathcal {M}}\). Through the state transition probability \(p_{ij}^{s}\), we can get the expected reward function. By considering the tradeoff between the current reward and the potential reward at next time slot, we can get the optimal transmission power.
where \(B_{0}^{m}\) denotes the energy budget of the m^{th} SU at the beginning of transmission, \(\forall m \in {\mathcal {M}}\), and S_{0} is the initial system state. At the initial time slot, we only need to maximize the expected reward at time slot k=1.
Algorithm 2 DP power allocation algorithm
1: Input all statistics of the channel gain information, the subcarrier occupation state, and the EH process;
2: For k=0,1,…,K−1: calculate and store R_{k} for all system states according to Algorithm 1;
3: For k=K−1,K−2,…,0: calculate and store V_{k} for all system states based on the backward induction recursion. Then, calculate and store \(p_{k}^{m,{n^ * }} = \arg \mathop {\max }\limits _{p_{k}^{m,n}} {V_{k}}\).
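The backward pass of Algorithm 2 can be illustrated on a deliberately small toy problem. The sketch below is a simplification under stated assumptions, not the paper's model: it considers a single SU on a single channel, discretizes the battery into integer energy units, uses log2(1 + g·u/T) as a stand-in for the immediate reward that Algorithm 1 would provide, and treats the harvested energy as independent across slots with a hypothetical distribution `harvest_pmf`.

```python
import math


def backward_induction(num_slots, b_max, gains, trans, harvest_pmf, slot_len=1.0):
    """Finite-horizon DP over (battery level, channel state).

    gains[s]       : assumed gain-to-noise ratio in channel state s
    trans[s][s2]   : channel-state transition probability
    harvest_pmf[e] : probability of harvesting e energy units in a slot
    Returns value[k][b][s] and policy[k][b][s] (energy units to spend).
    """
    num_states = len(gains)
    # value[num_slots] is all zeros: no reward after the last slot.
    value = [[[0.0] * num_states for _ in range(b_max + 1)]
             for _ in range(num_slots + 1)]
    policy = [[[0] * num_states for _ in range(b_max + 1)]
              for _ in range(num_slots)]

    for k in range(num_slots - 1, -1, -1):        # time-reversed order
        for b in range(b_max + 1):
            for s in range(num_states):
                best_u, best_v = 0, -math.inf
                for u in range(b + 1):            # energy spent in slot k
                    reward = math.log2(1.0 + gains[s] * u / slot_len)
                    future = 0.0
                    for s_next in range(num_states):
                        for e, prob_e in harvest_pmf.items():
                            b_next = min(b - u + e, b_max)
                            future += (trans[s][s_next] * prob_e
                                       * value[k + 1][b_next][s_next])
                    if reward + future > best_v:
                        best_u, best_v = u, reward + future
                value[k][b][s] = best_v
                policy[k][b][s] = best_u
    return value, policy


K, B_MAX = 3, 4
value, policy = backward_induction(
    num_slots=K, b_max=B_MAX, gains=[2.0, 0.5],
    trans=[[0.8, 0.2], [0.3, 0.7]], harvest_pmf={0: 0.5, 1: 0.5},
)
print(policy[0][B_MAX], policy[K - 1][B_MAX])
```

The stored `policy` table plays the role of the lookup table indexed by the time slot: in the last slot the best action is to spend all stored energy, whereas earlier slots weigh the immediate reward against the expected future reward.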
As we can see, the results can be stored by a table with the time slot index; according to this table, SUs can determine the optimal transmission power.
4.4 Performance analysis
We first analyze the performance of the immediate power allocation algorithm. This algorithm keeps the interference to the PU below a given threshold and the throughput of each SU above a given threshold at each time slot.
From (19) to (21), we can see that the Lagrange multipliers can be updated using only local information, which improves the computation speed and reduces the algorithm complexity. If the transmission power \(p_{k}^{m,n}\) is relatively high, it results in \(\sum \limits _{n = 1}^{N} {p_{k}^{m,n} > B_{k}^{m}/T} \) and \(\sum \limits _{n = 1}^{N} {\sum \limits _{m = 1}^{M} {b_{k}^{m,n}p_{k}^{m,n}}} > {I^{th}}\), which violates the constraints, so that λ_{m} and μ increase and ξ_{m} decreases. Following (18), \(p_{k}^{m,n}\) then decreases, and the transmission power is adjusted to satisfy the constraints. However, this power does not decrease indefinitely: if \(p_{k}^{m,n}\) becomes relatively small, λ_{m} and μ decrease and ξ_{m} increases. Therefore, \(p_{k}^{m,n}\) increases and returns to the appropriate range. This adaptive iterative process ensures good QoS for both the PU and the SUs.
Based on the Lipschitz continuity [24] of the dual function and proper step sizes for the Lagrange multipliers, this algorithm can converge quickly. According to the Lipschitz continuity, there exists a Lipschitz constant δ such that the function d^{∗}(λ_{m},μ,ξ_{m}) satisfies the following condition: ∥d^{∗}(λ_{1}, μ_{1}, ξ_{1}) − d^{∗}(λ_{2},μ_{2},ξ_{2})∥_{2} ≤ δ∥[λ_{1}, μ_{1}, ξ_{1}]^{T}−[λ_{2},μ_{2},ξ_{2}]^{T}∥_{2}, where λ_{1},λ_{2}∈λ_{m}, μ_{1},μ_{2}∈μ, ξ_{1},ξ_{2}∈ξ_{m}, and ∥·∥_{2} denotes the norm of a vector. Thus, the dual function d^{∗} of (13) is uniformly continuous. When \(p_{k}^{m,n}\) satisfies all constraints C1 to C4 with λ_{m}, μ, and ξ_{m} ≥ 0, \(\left ({p{{_{k}^{m,n^{*}}}},\lambda _{m}^{*},{\mu ^{*}},\xi _{m}^{*}} \right)\) can converge to a feasible region. Owing to the duality between the dual problem and the original problem, the immediate power allocation algorithm can converge to the optimal solution.
Using the immediate power allocation algorithm, we can realize the DP power allocation algorithm, which addresses the throughput optimization problem over all K time slots. We store the system state S_{k}, the energy budget level \(B_{k}^{m}\), the immediate reward function R_{k}, and the reward function V_{k} in a lookup table indexed by the time slot, which contains all possible situations. Therefore, each SU can determine its optimal power policy from this table, which greatly reduces the online computational complexity.
5 Simulation results
In this section, we present simulation results to evaluate our proposed algorithm by comparing it with two other policies. The first is the conservative power policy, which allocates half of the available energy to each time slot, i.e., \(p_{k}^{m} = B_{k}^{m}/(2T),\forall k \in {\mathcal {K}}\). The second is the greedy power policy, which uses all of the available energy at each time slot, i.e., \(p_{k}^{m} = B_{k}^{m}/T,\forall k \in {\mathcal {K}}\).
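The two baseline rules are simple enough to state directly in code. The sketch below implements them as given by the formulas above and, purely for illustration, rolls them forward over a few slots using an assumed battery update (spend, harvest, clip at capacity); the helper `simulate`, the harvest sequence, and the capacity value are hypothetical.

```python
def conservative_power(battery, slot_len):
    """Conservative policy: spend half of the available energy, p = B / (2T)."""
    return battery / (2.0 * slot_len)


def greedy_power(battery, slot_len):
    """Greedy policy: spend all of the available energy, p = B / T."""
    return battery / slot_len


def simulate(policy, b0, harvest, slot_len, b_max):
    """Total transmission power per slot under an assumed battery update."""
    battery, powers = b0, []
    for e in harvest:
        p = policy(battery, slot_len)
        powers.append(p)
        # Assumed update: spend p*T, add harvested energy, clip at capacity.
        battery = min(battery - p * slot_len + e, b_max)
    return powers


harvest = [1.0, 1.0, 1.0]  # hypothetical harvested energy per slot (J)
greedy = simulate(greedy_power, b0=5.0, harvest=harvest, slot_len=1.0, b_max=5.0)
conservative = simulate(conservative_power, b0=5.0, harvest=harvest,
                        slot_len=1.0, b_max=5.0)
print(greedy)        # [5.0, 1.0, 1.0]
print(conservative)  # [2.5, 1.75, 1.375]
```

The greedy rule empties the battery in the first slot and then depends entirely on newly harvested energy, while the conservative rule spreads the initial energy over later slots. Neither rule looks at the channel state or at the expected future reward.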
Figure 2 shows the convergence of the average throughput of each of the four SUs with the proposed algorithm. In this simulation, we set the energy budget of each SU to 5 J. From Fig. 2, we can clearly see that this scheme quickly converges to the equilibrium point. Figure 2 also shows that the average throughput of SU_{1} is the highest of all SUs and that the average throughput of SU_{4} is the lowest. The reason is that, according to the priority of the SUs and the subcarrier allocation model, the SU with higher priority can transmit data under good channel conditions. However, SU_{4} must deal with the interference power constraint to ensure the QoS of the PU.
In Fig. 3, we compare the average throughput of our proposed power allocation policy, the conservative power policy, and the greedy power policy as the total energy budget of the secondary system varies. From this figure, the average throughput of our proposed policy is much higher than those of the other two policies over the examined range of total energy budget, since our policy considers not only the immediate OPA but also the transmission performance over all time slots. In addition, the average throughput of all three policies increases with the total energy budget, since a larger energy budget means more energy is available for power allocation and hence for throughput. This increase is much more rapid under our policy.
Figure 4 compares the total throughput of the three policies as the number of time slots varies. In this case, we again set the energy budget of each SU to 5 J. The throughput of our policy increases significantly as the number of time slots increases. Moreover, our policy has the best performance among the three policies over the whole range of the number of time slots. From this simulation result, we conclude that our policy performs best among the compared policies for long-run operations.
Figure 5 compares the interference power at the PU receiver for the three policies. We set the IT level to I^{th}=0.1 W. Figure 5a shows the convergence of the interference power in an arbitrary time slot. The interference of all three algorithms quickly converges to a stable point, and all three policies keep the interference power at the PU receiver below the IT level. Figure 5b compares the average interference of the three policies as the total energy budget varies, for K=5. As the total energy budget increases, the three average interferences increase gradually but always remain below the IT level, even when the energy budget reaches the battery capacity. From Fig. 5a, b, the interference of our policy is slightly higher than that of the other two policies, by about 0.01 W to 0.02 W, while still satisfying the interference power constraint. Moreover, from Figs. 3, 4, and 5, we conclude that the proposed policy provides better throughput for the secondary system at the cost of slightly more interference at the PU receiver, within the tolerance of the PU.
To show the impact of the IT level on the system performance, Fig. 6 presents the average throughput as the IT level varies, for different energy budgets. From Fig. 6, the average throughput first increases and then tends to a constant value as the IT level increases. The reason is that the higher the IT level, the more interference power the PU can tolerate. Thus, the SUs can use more transmission power to obtain more throughput. Meanwhile, due to the energy budget constraint in our policy, the transmission power eventually tends to a constant value. In addition, the interference power constraint can be interpreted in terms of distance: as the distance between the SU and the PU increases, more transmit power can be allocated to achieve higher throughput.
Finally, in Fig. 7, we present the average throughput of our proposed algorithm as the number of time slots varies, for different IT levels. In this simulation, we set the energy budget of each SU to 0.5 J. From Fig. 7, a higher IT level yields a higher total throughput, since SUs can be allocated higher transmission power if the PU can tolerate higher interference power. From another perspective, the overall trend of the throughput goes up as the number of time slots increases. However, the throughput decreases slightly at K=6 and K=9, since the channel gain, the subcarrier occupation state, and the harvested energy state are random at each time slot. According to Figs. 6 and 7, it is necessary to choose an appropriate IT level to balance PU protection and secondary system performance.
6 Conclusions
In this paper, we study the OPA problem in the EH CRN and propose an OPA policy to maximize the average throughput of the secondary system within K time slots, where the maximum transmission power constraint, the interference power constraint, and the minimum throughput constraint are considered. The optimal transmission power in each time slot is obtained by jointly using the immediate OPA algorithm and the DP method. The simulation results show that, compared with the conservative power policy and the greedy power policy, which do not use the DP approach, our policy achieves better throughput performance as the total energy budget and the number of time slots vary. Meanwhile, our algorithm provides good protection for the basic communication of the PU through the interference power constraint. However, this policy improves the performance of the secondary system at the expense of slightly more interference than the other two policies.
Declarations
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 61171079.
Authors’ contributions
YW contributed in the conception of the study and design of the study and wrote the manuscript. Furthermore, YW carried out the simulation and revised the manuscript. XZ and HL helped to perform the analysis with constructive discussions and helped to draft the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 J Mitola, GQ Maguire, Cognitive radio: making software radios more personal. IEEE Pers. Commun. 6(4), 13–18 (1999).
 D Gunduz, K Stamatiou, N Michelusi, et al, Designing intelligent energy harvesting communication systems. IEEE Commun. Mag. 52(1), 210–216 (2014).
 OH Hayoung, RAN Rong, Stochastic policy-based wireless energy harvesting in green cognitive radio network. Eurasip J. Wirel. Commun. Netw. 2015(177), 2–10 (2015).
 R Ma, W Zhang, Adaptive MQAM for energy harvesting wireless communications with 1-bit channel feedback. IEEE Trans. Wirel. Commun. 14(11), 6459–6470 (2015).
 Z Wang, V Aggarwal, X Wang, Power allocation for energy harvesting transmitter with causal information. IEEE Trans. Commun. 62(11), 4080–4093 (2014).
 C Huang, R Zhang, S Cui, Optimal power allocation for outage probability minimization in fading channels with energy harvesting constraints. IEEE Trans. Wirel. Commun. 13(2), 1074–1087 (2014).
 CK Ho, R Zhang, Optimal energy allocation for wireless communications with energy harvesting constraints. IEEE Trans. Signal Process. 60(9), 4808–4818 (2012).
 H Liang, X Zhao, in International Conference on Computing, Networking and Communications (ICNC). Optimal power allocation for energy harvesting cognitive radio networks with primary rate protection (IEEE, Kauai, 2016), pp. 1–6. doi:10.1109/ICCNC.2016.7440658.
 M Usman, I Koo, Access strategy for hybrid underlay-overlay cognitive radios with energy harvesting. IEEE Sensors J. 14(9), 3164–3173 (2014).
 P He, L Zhao, in IEEE International Conference on Communications (ICC). Optimal power control for energy harvesting cognitive radio networks (IEEE, London, 2015), pp. 92–97. doi:10.1109/ICC.2015.7248304.
 C Wu, Q Shi, C He, et al, Energy utilization efficient frame structure for energy harvesting cognitive radio networks. IEEE Wirel. Commun. Lett. 5(5), 488–491 (2016).
 S Yin, Z Qu, S Li, Achievable throughput optimization in energy harvesting cognitive radio systems. IEEE J. Sel. Areas Commun. 33(3), 407–422 (2015).
 Y Zhang, Q Zhang, Y Wang, et al, in IEEE/CIC International Conference on Communications in China (ICCC). Energy and throughput tradeoff in hybrid cognitive radio networks based on POMDP (IEEE, Xi’an, 2013), pp. 668–673. doi:10.1109/ICCChina.2013.6671196.
 A Bhowmick, SD Roy, S Kundu, Throughput of a cognitive radio network with energy-harvesting based on primary user signal. IEEE Wirel. Commun. Lett. 5(2), 136–139 (2016).
 KB Letaief, W Zhang, Cooperative communications for cognitive radio networks. Proc. IEEE. 97(5), 878–893 (2009).
 HS Wang, N Moayeri, Finite-state Markov channel—a useful model for radio communication channels. IEEE Trans. Veh. Technol. 44(1), 163–171 (1995).
 Q Zhang, SA Kassam, Finite-state Markov model for Rayleigh fading channels. IEEE Trans. Commun. 47(11), 1688–1692 (1999).
 C Huang, R Zhang, S Cui, Throughput maximization for the Gaussian relay channel with energy harvesting constraints. IEEE J. Sel. Areas Commun. 31(8), 1469–1479 (2013).
 S Boyd, L Vandenberghe, Convex optimization (Cambridge University Press, Cambridge, 2004).
 N Rahbari-Asr, MY Chow, Cooperative distributed demand management for community charging of PHEV/PEVs based on KKT conditions and consensus networks. IEEE Trans. Ind. Inform. 10(3), 1907–1916 (2014).
 DP Palomar, M Chiang, A tutorial on decomposition methods for network utility maximization. IEEE J. Sel. Areas Commun. 24(8), 1439–1451 (2006).
 DP Bertsekas, Dynamic Programming and Optimal Control, vol 1 (Athena Scientific, Belmont, 1995).
 ML Puterman, Markov Decision Process, Discrete Stochastic Dynamic Programming (Wiley, New York, 1994).
 K Eriksson, D Estep, C Johnson, Lipschitz continuity (Springer-Verlag, Berlin Heidelberg, 2004).