The aim of this section is to derive a distributed, energy-efficient resource allocation algorithm. The main challenge in deriving a distributed algorithm for interference networks is that global CSI is typically needed: each user would also need to know the other users’ channels, which requires feeding back a prohibitive amount of overhead information. To circumvent this problem, the proposed algorithm is designed to require only statistical CSI at the transmitter side. Since channel statistics vary at a very slow rate compared to the actual channel realizations, feeding back only statistical CSI significantly reduces the required overhead. Moreover, channel statistics can be estimated at the receivers more easily than channel coefficients. It should also be mentioned that present multi-cell networks are typically endowed with a high-speed backhaul link that allows the receivers to exchange information with one another. Therefore, each receiver can easily learn the channel statistics of users that are not associated with it and then feed this information back to its associated transmitters. Again, the overhead to be exchanged on the backhaul link for resource allocation purposes is quite limited, since only channel statistics need to be shared. Thanks to these features, the algorithm to be developed lends itself to a distributed implementation at the transmitter side.

To begin with, we remark that since the resource allocation takes place at the transmitters, the instantaneous SINR expression (2) cannot be used for resource allocation purposes, because each transmitter only has statistical CSI. Thus, before turning to the resource allocation algorithm itself, an average SINR expression is needed.

### 3.1 Users’ average SINR

The average power {U}_{k,{a}_{k}}\left(\ell \right) of the *k*-th user’s intended symbol, at its assigned receiver *a*_{k}, on subcarrier *ℓ*, is given by

\begin{array}{rl}{U}_{k,{a}_{k}}(\ell) &= {p}_{k}(\ell)\,E\left[\left|{\mathit{h}}_{k,{a}_{k}}^{H}(\ell)\,{\mathit{h}}_{k,{a}_{k}}(\ell)\right|^{2}\right]\\ &= {p}_{k}(\ell)\,E\left[\left|{\mathit{\alpha}}_{k,{a}_{k}}^{H}(\ell)\,{\mathit{R}}_{k,{a}_{k}}(\ell)\,{\mathit{\alpha}}_{k,{a}_{k}}(\ell)\right|^{2}\right],\end{array}

(3)

with {\mathit{\alpha}}_{k,{a}_{k}}(\ell)={\mathit{R}}_{k,{a}_{k}}^{-1/2}(\ell)\,{\mathit{h}}_{k,{a}_{k}}(\ell). Elaborating, {U}_{k,{a}_{k}}\left(\ell \right) can be expressed as

\begin{array}{rl}{U}_{k,{a}_{k}}(\ell) &= {p}_{k}(\ell)\,E\left[\left|\sum_{m=1}^{M}\sum_{v=1}^{M}{R}_{k,{a}_{k},\ell}(m,v)\,{\alpha}_{k,{a}_{k},\ell}(m)\,{\alpha}_{k,{a}_{k},\ell}^{\ast}(v)\right|^{2}\right]\\ &= {p}_{k}(\ell)\,E\left[\sum_{m=1}^{M}{R}_{k,{a}_{k},\ell}^{2}(m,m)\,|{\alpha}_{k,{a}_{k},\ell}(m)|^{4}+2\sum_{m=1}^{M}\sum_{v=m+1}^{M}{R}_{k,{a}_{k},\ell}(m,m)\,{R}_{k,{a}_{k},\ell}(v,v)\,|{\alpha}_{k,{a}_{k},\ell}(m)|^{2}\,|{\alpha}_{k,{a}_{k},\ell}(v)|^{2}\right.\\ &\quad\left.+\,4\sum_{m=1}^{M}\sum_{v=m+1}^{M}|{R}_{k,{a}_{k},\ell}(m,v)|^{2}\,|{\alpha}_{k,{a}_{k},\ell}(m)|^{2}\,|{\alpha}_{k,{a}_{k},\ell}(v)|^{2}\right]\\ &= {p}_{k}(\ell)\left[3\sum_{m=1}^{M}{R}_{k,{a}_{k},\ell}^{2}(m,m)+2\sum_{m=1}^{M}\sum_{v=m+1}^{M}{R}_{k,{a}_{k},\ell}(m,m)\,{R}_{k,{a}_{k},\ell}(v,v)+4\sum_{m=1}^{M}\sum_{v=m+1}^{M}|{R}_{k,{a}_{k},\ell}(m,v)|^{2}\right]\\ &= {p}_{k}(\ell)\left[{\mathrm{tr}}^{2}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}^{H}(\ell)\right)\right],\end{array}

wherein we have exploited the fact that E\left[|{\alpha}_{k,{a}_{k},\ell}(m)|^{4}\right]=3 and E\left[|{\alpha}_{k,{a}_{k},\ell}(m)|^{2}\right]=1 for all *m*=1,…,*M*, because {\alpha}_{k,{a}_{k},\ell}\left(m\right) is a standard Gaussian variable for all *m*=1,…,*M*. As for the average power of the interference-plus-noise term, we have

\begin{array}{rl}{I}_{k,{a}_{k}}(\ell) &= {\sigma}^{2}\,E\left[\|{\mathit{h}}_{k,{a}_{k}}(\ell)\|^{2}\right]+\sum_{i\ne k}{p}_{i}(\ell)\,\mathrm{tr}\left(E\left[{\mathit{h}}_{i,{a}_{k}}(\ell)\,{\mathit{h}}_{i,{a}_{k}}^{H}(\ell)\,{\mathit{h}}_{k,{a}_{k}}(\ell)\,{\mathit{h}}_{k,{a}_{k}}^{H}(\ell)\right]\right)\\ &= {\sigma}^{2}\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+\sum_{i\ne k}{p}_{i}(\ell)\,\mathrm{tr}\left({\mathit{R}}_{i,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}(\ell)\right),\end{array}

(4)

where the last equality stems from the assumption that the vectors {\mathit{h}}_{i,{a}_{k}}\left(\ell \right) and {\mathit{h}}_{k,{a}_{k}}\left(\ell \right) are statistically independent for all *k*≠*i*. Finally, the *k*-th user’s average SINR, at its intended receiver, on subcarrier *ℓ*, is computed as

\begin{array}{rl}{\gamma}_{k,{a}_{k}}(\ell) &= \frac{{U}_{k,{a}_{k}}(\ell)}{{I}_{k,{a}_{k}}(\ell)}\\ &= \frac{{p}_{k}(\ell)\left({\mathrm{tr}}^{2}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}^{H}(\ell)\right)\right)}{{\sigma}^{2}\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+\sum_{i\ne k}{p}_{i}(\ell)\,\mathrm{tr}\left({\mathit{R}}_{i,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}(\ell)\right)}\,.\end{array}

(5)
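As a sanity check, (5) is straightforward to evaluate numerically. The following sketch uses hypothetical toy covariance matrices, assumed real and symmetric for brevity (so that {\mathit{R}}^{H}={\mathit{R}} and traces are plain sums):

```python
def tr(A):
    """Trace of a square matrix given as a list of rows."""
    return sum(A[i][i] for i in range(len(A)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def avg_sinr(p_k, R_kk, interferers, sigma2):
    """Average SINR (5). `interferers` is a list of (p_i, R_i_ak) pairs for i != k.
    Covariances are assumed real symmetric, so tr(R R^H) = tr(R R)."""
    num = p_k * (tr(R_kk) ** 2 + 2.0 * tr(matmul(R_kk, R_kk)))
    den = sigma2 * tr(R_kk) + sum(p_i * tr(matmul(R_i, R_kk)) for p_i, R_i in interferers)
    return num / den
```

For instance, with 2×2 identity covariances, unit noise power, and no interferers, the expression reduces to (4 + 4)/2 = 4.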

### 3.2 Proposed distributed algorithm

Having derived an average expression for the users’ SINR, we are ready to start our analysis of the resource allocation problem. Mathematically, the distributed energy-efficient resource allocation problem is formulated as the *K* coupled problems

\begin{array}{ll}\underset{\{{p}_{k}(\ell)\}_{\ell=1}^{L}}{\mathrm{max}}\;G\left(\{{p}_{k}(\ell),{\gamma}_{k}(\ell)\}_{k=1,\ell=1}^{K,L}\right) & \forall\,k=1,\dots,K\\ \mathrm{s.t.}\;\;{p}_{k}(\ell)\ge 0\;\;\forall\,\ell=1,\dots,L\,,\quad\sum_{\ell=1}^{L}{p}_{k}(\ell)\le{P}_{\mathrm{max},k} & \forall\,k=1,\dots,K\,,\end{array}

(6)

with G\left(\{{p}_{k}(\ell),{\gamma}_{k}(\ell)\}_{k=1,\ell=1}^{K,L}\right) being the energy-efficient performance metric to optimize, which will be specified shortly. For all *k*=1,…,*K*, the solution to the *k*-th problem in (6) yields the *k*-th user’s power allocation for a fixed configuration of the other users’ powers, and any fixed point of iteration (6) represents a stable resource allocation policy. We remark that, for all *k*=1,…,*K*, only the transmit powers \{{p}_{k}(\ell)\}_{\ell=1}^{L} have been indicated as the optimization variables of the generic *k*-th problem, because by choosing the transmit powers, each user also automatically chooses the transmit subcarriers: a subcarrier can be discarded by simply transmitting zero power over it.

Now, traditionally, the EE of a single communication link is defined as the ratio between the achieved throughput and the transmitted power, which can be mathematically expressed as (see [4, 10, 30] and references therein)

{\mathrm{EE}}^{\mathit{pc}}=R\,\frac{D}{Q}\,\frac{\left(1-{e}^{-\gamma}\right)^{Q}}{p}\,,

(7)

wherein *R* is the transmit data rate, *Q*≥1 is the packet length, *D*≤*Q* is the number of information symbols contained in each packet, *γ* is the achieved SINR, *p* is the transmit power, and (1−*e*^{−γ})^{Q} is the so-called efficiency function, which approximates the probability of correct reception for a data packet of length *Q* (see [4, 10, 30] and references therein). We stress that the case of bit-oriented communications, i.e., *Q*=1, is included as a special case in our definition of the energy efficiency, and all results to follow hold true also for *Q*=1. Moreover, it should be mentioned that the case *Q*>1 is also of practical interest in modern OFDMA systems, such as LTE [31].

Another widely used efficiency function is the achievable rate log(1+*γ*) [21, 23, 32]. However, such a choice applies to strictly static channels and has no information-theoretic meaning in the considered scenario, where the channels vary rapidly. In our context, an information-theoretically meaningful function would be the ergodic achievable rate *E*[log(1+*γ*)]. Such an approach, which has been considered in [22] for the simpler scenario of single-user MIMO systems, appears more challenging in interference networks and is left for future work.

In the considered multi-carrier system, for all *k*=1,…,*K*, (7) can be seen as the per-carrier EE of a given user. Then, recalling that each user is assigned *L* subcarriers, the EE with which the generic *n*-th user transmits each packet of *Q* bits is given by [10]

{\mathrm{EE}}_{n}=R\,\frac{D}{Q}\,\frac{\sum_{\ell=1}^{L}\left(1-{e}^{-{\gamma}_{n,{a}_{n}}(\ell)}\right)^{Q}}{\sum_{\ell=1}^{L}{p}_{n}(\ell)+{P}_{c,n}}\,,

(8)

while the network’s global energy efficiency (GEE), defined as the ratio of network’s global throughput over the network’s total consumed power, is written as

\mathrm{GEE}=\frac{\sum_{n=1}^{K}R\,\frac{D}{Q}\,\sum_{\ell=1}^{L}\left(1-{e}^{-{\gamma}_{n,{a}_{n}}(\ell)}\right)^{Q}}{\sum_{n=1}^{K}\left({P}_{c,n}+\sum_{\ell=1}^{L}{p}_{n}(\ell)\right)}\,,

(9)

wherein the circuit power *P*_{c,n} needed to operate transmitter *n* has also been included in the expression of the consumed power. From a user-centric point of view, individual maximization of (8) for all *n*=1,…,*K* should be pursued. However, this would result in iterations (6) that are not always convergent, and even when convergence occurs, the resulting power allocation policy may not be efficient from a social welfare point of view. From that perspective, the GEE (9) would be the canonical choice as the objective G\left(\{{p}_{k}(\ell),{\gamma}_{k}(\ell)\}_{k=1,\ell=1}^{K,L}\right) of (6). The drawback of maximizing (9), however, is that it might lead to unfair power allocations: due to its additive nature, the maximum of (9) might be attained by having users with very low channel coefficients transmit at very low powers. In order to compromise between the need to achieve improved overall EE and the need to obtain a fair resource allocation, similarly to [15], the following multiplicative version of (9) will be considered as the objective of (6), namely

\stackrel{~}{\text{GEE}}=\frac{\prod_{n=1}^{K}R\,\frac{D}{Q}\,\prod_{\ell=1}^{L}\left(1-{e}^{-{\gamma}_{n,{a}_{n}}(\ell)}\right)^{Q}}{\prod_{n=1}^{K}\prod_{\ell=1}^{L}\left({p}_{n}(\ell)+{P}_{c,n}\right)}\,.

(10)

Due to its multiplicative nature, it is unlikely that a maximizer of (10) results in one of the users’ throughputs being very low, since each user’s throughput is a factor of the product in the numerator of (10). Moreover, (10) is also a system-wide performance function, since it is an increasing function of the players’ energy efficiencies. We stress that the maximization of products of utility functions in order to obtain fair resource allocation policies is also considered in contexts other than EE maximization [33, 34].
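To make the efficiency metrics concrete, the following sketch evaluates the per-user EE (8) and the GEE (9) for hypothetical SINR and power values; circuit power is counted once per user, consistent with (8):

```python
import math

def efficiency(gamma, Q):
    """Efficiency function (1 - e^{-gamma})^Q: approximate packet success probability."""
    return (1.0 - math.exp(-gamma)) ** Q

def user_ee(R, D, Q, gammas, powers, Pc):
    """Per-user energy efficiency (8): summed per-carrier efficiencies over consumed power."""
    throughput = R * (D / Q) * sum(efficiency(g, Q) for g in gammas)
    return throughput / (sum(powers) + Pc)

def gee(R, D, Q, gammas_per_user, powers_per_user, Pc_per_user):
    """Network GEE (9): total throughput over total consumed power."""
    num = sum(R * (D / Q) * sum(efficiency(g, Q) for g in gs) for gs in gammas_per_user)
    den = sum(Pc + sum(ps) for ps, Pc in zip(powers_per_user, Pc_per_user))
    return num / den
```

With a single user, (9) collapses to (8), which gives a quick consistency check between the two functions.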

However, the drawback of this approach is that the maximization of (10) is clearly more complex than that of (9). Thus, in order to obtain a mathematically tractable objective function, the traditional efficiency function (1−*e*^{−γ})^{Q} will be approximated by {\left({e}^{-{\beta}_{n}/\gamma}\right)}^{Q}, with *β*_{n} a suitable constant to be specified. Note that, similarly to (1−*e*^{−γ})^{Q}, the modified efficiency function {\left({e}^{-{\beta}_{n}/\gamma}\right)}^{Q} is still an S-shaped^{a}, increasing function of *γ*, approaching zero for *γ*→0 and approaching unity for *γ*→+*∞*. Aiming at approximating (1−*e*^{−γ})^{Q} with {\left({e}^{-{\beta}_{n}/\gamma}\right)}^{Q}, a natural choice for *β*_{n} is to set it so as to minimize the mean square error between the two functions. Accordingly, for all *n*=1,…,*K*, *β*_{n} is determined as the solution to the problem

\underset{{\beta}_{n}}{\mathrm{min}}\,{\int}_{0}^{{\gamma}_{\mathrm{max},n}}\left[\left(1-{e}^{-x}\right)^{Q}-{e}^{-Q{\beta}_{n}/x}\right]^{2}\mathit{dx}\,,

(11)

wherein *γ*_{max,n} is the maximum SINR that user *n* can attain. For the case at hand, we have

{\gamma}_{\mathrm{max},n}=\frac{{P}_{\mathrm{max},n}\,E\left[\|{\mathit{h}}_{n,{a}_{n}}(\ell)\|^{2}\right]}{{\sigma}^{2}}\,,

(12)

i.e., the *n*-th user’s SINR with interference-free transmission. The following proposition holds.

#### Proposition 1

*The solution to problem (11) is obtained as the solution to the equation*

\sum_{m=0}^{Q}\binom{Q}{m}{(-1)}^{m}{\int}_{0}^{{\gamma}_{\mathrm{max},n}}\frac{{e}^{-mx}}{x}\,{e}^{-Q{\beta}_{n}/x}\,\mathit{dx}=-\mathrm{Ei}\left(-\frac{2Q{\beta}_{n}}{{\gamma}_{\mathrm{max},n}}\right)

(13)

*with Ei(·) denoting the exponential integral function.*

#### Proof

Setting the first-order derivative of the objective of (11) to zero yields

{\int}_{0}^{{\gamma}_{\mathrm{max},n}}\frac{\left(1-{e}^{-x}\right)^{Q}{e}^{-Q{\beta}_{n}/x}}{x}\,\mathit{dx}={\int}_{0}^{{\gamma}_{\mathrm{max},n}}\frac{{e}^{-2Q{\beta}_{n}/x}}{x}\,\mathit{dx}\,.

(14)

Applying Newton’s binomial formula to the left-hand side of (14) yields the left-hand side of (13). Next, making the substitution y=\frac{2Q{\beta}_{n}}{x} in the right-hand side of (14) yields

{\int}_{\frac{2Q{\beta}_{n}}{{\gamma}_{\mathrm{max},n}}}^{\infty}\frac{{e}^{-y}}{y}\mathit{\text{dy}}=-\mathrm{Ei}\left(-\frac{2Q{\beta}_{n}}{{\gamma}_{\mathrm{max},n}}\right)\phantom{\rule{2.77626pt}{0ex}}.

(15)

Hence, the thesis. □

It should be stressed that the computation of the coefficients {\left\{{\beta}_{n}\right\}}_{n=1}^{K} needs to be performed just once and can be carried out off-line, because each *β*_{n} only depends on the constant network parameters *Q*, *P*_{max,n}, *σ*^{2}, and E\left[\|{\mathit{h}}_{n,{a}_{n}}(\ell)\|^{2}\right].
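Such an off-line computation can be sketched as a brute-force minimization of (11) over a grid of candidate *β* values, with the integral approximated by the trapezoidal rule. The grid range and step below are illustrative assumptions, not values from the paper:

```python
import math

def mse(beta, Q, gamma_max, n=2000):
    """Trapezoidal approximation of the mean-square-error integral in (11)."""
    h = gamma_max / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        if x == 0.0:
            d = 0.0  # both efficiency functions vanish as x -> 0+
        else:
            d = (1.0 - math.exp(-x)) ** Q - math.exp(-Q * beta / x)
        total += (0.5 if i in (0, n) else 1.0) * d * d
    return total * h

def best_beta(Q, gamma_max):
    """Coarse grid search for beta_n; a root-finder on (13) would be an alternative."""
    grid = [b / 100.0 for b in range(1, 301)]
    return min(grid, key=lambda b: mse(b, Q, gamma_max))
```

In practice, one could also solve the stationarity condition (13) directly with a scalar root-finder; the grid search above simply avoids special-function evaluations.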

In order to further motivate the validity of the modified efficiency function as a substitute for the classical one, let us consider the ratio \frac{{e}^{-{\beta}_{n}/\gamma}}{1-{e}^{-\gamma}}, which converges to one for increasing *γ*. Moreover, in order to give an insight as to how large *γ* must be for \frac{{e}^{-{\beta}_{n}/\gamma}}{1-{e}^{-\gamma}} to approach unity, Figure 1 reports this ratio for the case *γ*_{max,n}=100. It is seen that for *γ*>0 dB (the region of interest), \frac{{e}^{-{\beta}_{n}/\gamma}}{1-{e}^{-\gamma}} is very close to 1.

Thus, (10) can be approximated as

\hat{\mathrm{GEE}}=\frac{\prod_{n=1}^{K}R\,\frac{D}{Q}\,\prod_{\ell=1}^{L}{e}^{-Q{\beta}_{n}/{\gamma}_{n,{a}_{n}}(\ell)}}{\prod_{n=1}^{K}\prod_{\ell=1}^{L}\left({p}_{n}(\ell)+{P}_{c,n}\right)}\,,

(16)

and the *k*-th problem in (6) can be restated as

\left\{\begin{array}{l}\underset{\{{p}_{k}(\ell)\}_{\ell=1}^{L}}{\mathrm{max}}\;\frac{\prod_{n=1}^{K}R\,\frac{D}{Q}\,\prod_{\ell=1}^{L}{e}^{-Q{\beta}_{n}/{\gamma}_{n,{a}_{n}}(\ell)}}{\prod_{n=1}^{K}\prod_{\ell=1}^{L}\left({p}_{n}(\ell)+{P}_{c,n}\right)}\\ \mathrm{s.t.}\;\;{p}_{k}(\ell)\ge 0\;\;\forall\,\ell=1,\dots,L\,,\quad\sum_{\ell=1}^{L}{p}_{k}(\ell)\le{P}_{\mathrm{max},k}\end{array}\right.\,.

(17)

Accordingly, the resource allocation algorithm can be expressed as follows:

**Algorithm 1** Distributed resource allocation

Convergence in Algorithm 1 is declared when the difference between the values of the objective function (16) achieved at the end of two successive outer loops is below a predetermined tolerance. The following proposition guarantees the convergence of Algorithm 1.
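Since only the caption of Algorithm 1 is reproduced here, the following skeleton illustrates the structure described in the text: users cyclically solve their own problem (17) while the other powers stay fixed, and the outer loop stops when the improvement of the objective (16) falls below a tolerance. `solve_user_problem` is a placeholder for the per-user solver, not a function defined in the paper:

```python
def algorithm1(powers, objective, solve_user_problem, tol=1e-6, max_outer=100):
    """Sketch of Algorithm 1. `powers` maps each user k to its list of L transmit
    powers (the feasible initialization point); `objective` evaluates (16)."""
    prev = objective(powers)
    for _ in range(max_outer):
        for k in powers:                      # the "for cycle" over the K users
            powers[k] = solve_user_problem(k, powers)
        cur = objective(powers)
        if abs(cur - prev) < tol:             # convergence test on (16)
            break
        prev = cur
    return powers
```

Any solver for (17), e.g., the convex reformulation of Proposition 3 or the alternating maximization of Algorithm 2, can be plugged in as `solve_user_problem`.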

#### Proposition 2

*For any feasible initialization point* \{{p}_{k}^{(0)}(\ell)\}_{k=1,\ell=1}^{K,L}, *Algorithm 1 is guaranteed to converge.*

#### Proof

The objective (16) depends on all of the users’ *KL* transmit powers. After the initialization, we have {\hat{\mathrm{GEE}}}^{(0)}\left(\{{p}_{1}^{(0)}(\ell)\}_{\ell=1}^{L},\dots,\{{p}_{K}^{(0)}(\ell)\}_{\ell=1}^{L}\right). After the first iteration of the for cycle in Algorithm 1, (16) is maximized with respect to \{{p}_{1}(\ell)\}_{\ell=1}^{L} while keeping the other (*K*−1)*L* powers fixed. Let us denote by \{{p}_{1}^{(1)}(\ell)\}_{\ell=1}^{L} the *L* powers resulting from this optimization. Then, after this first optimization, the new value of the objective is {\hat{\mathrm{GEE}}}^{(1)}\left(\{{p}_{1}^{(1)}(\ell)\}_{\ell=1}^{L},\{{p}_{2}^{(0)}(\ell)\}_{\ell=1}^{L},\dots,\{{p}_{K}^{(0)}(\ell)\}_{\ell=1}^{L}\right), and clearly {\hat{\mathrm{GEE}}}^{(1)}\ge{\hat{\mathrm{GEE}}}^{(0)}. In the second iteration of the cycle, the powers \{{p}_{2}(\ell)\}_{\ell=1}^{L} are optimized; after the optimization, we have {\hat{\mathrm{GEE}}}^{(2)}\left(\{{p}_{1}^{(1)}(\ell)\}_{\ell=1}^{L},\{{p}_{2}^{(1)}(\ell)\}_{\ell=1}^{L},\{{p}_{3}^{(0)}(\ell)\}_{\ell=1}^{L},\dots,\{{p}_{K}^{(0)}(\ell)\}_{\ell=1}^{L}\right), and it holds that {\hat{\mathrm{GEE}}}^{(2)}\ge{\hat{\mathrm{GEE}}}^{(1)}. Cyclically iterating this procedure generates a non-decreasing sequence of values {\hat{\mathrm{GEE}}}^{(n)}. As a consequence, since \hat{\mathrm{GEE}} is upper-bounded with respect to the transmit powers, Algorithm 1 will eventually converge. □

Now, in order to complete the resource allocation design, the solution to problem (17) remains to be tackled. Such a problem is not convex, because the objective is not concave, but it can be recast as a convex problem without loss of optimality by exploiting the result in the following proposition. Afterwards, we will also provide an algorithm to solve problem (17) that employs the alternating maximization technique [35] rather than the convex reformulation.

#### Proposition 3

*For any k = 1,…,K, problem (17) can be restated as*

\left\{\begin{array}{l}\underset{\{{p}_{k}(\ell)\}_{\ell=1}^{L}}{\mathrm{max}}\;-\sum_{\ell=1}^{L}\left(\frac{{a}_{k}(\ell)}{{p}_{k}(\ell)}+{c}_{k}(\ell)\,{p}_{k}(\ell)+\mathrm{ln}\left({p}_{k}(\ell)+{P}_{c,k}\right)\right)\\ \mathrm{s.t.}\;\;{p}_{k}(\ell)\ge 0\;\;\forall\,\ell=1,\dots,L\,,\quad\sum_{\ell=1}^{L}{p}_{k}(\ell)\le{P}_{\mathrm{max},k}\,,\end{array}\right.

(18)

*wherein*

{a}_{k}(\ell)=Q{\beta}_{k}\,\frac{{\sigma}^{2}\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+\sum_{i\ne k}{p}_{i}(\ell)\,\mathrm{tr}\left({\mathit{R}}_{i,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}(\ell)\right)}{{\mathrm{tr}}^{2}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}^{H}(\ell)\right)}

(19)

{c}_{k}(\ell)=\sum_{n\ne k}\frac{Q{\beta}_{n}\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{n}}(\ell)\,{\mathit{R}}_{n,{a}_{n}}(\ell)\right)}{{p}_{n}(\ell)\left({\mathrm{tr}}^{2}\left({\mathit{R}}_{n,{a}_{n}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{n,{a}_{n}}(\ell)\,{\mathit{R}}_{n,{a}_{n}}^{H}(\ell)\right)\right)}\,.

(20)

*Moreover, the objective of (18) has a unique maximizer, which lies in its concave region.*

#### Proof

We start by observing that it is possible to apply any increasing function to (16) without changing its maximizers. Then, applying the logarithmic function yields

K\,\mathrm{ln}\left(\frac{\mathit{RD}}{Q}\right)+\mathrm{ln}\left(\prod_{n=1}^{K}\prod_{\ell=1}^{L}\frac{{e}^{-Q{\beta}_{n}/{\gamma}_{n,{a}_{n}}(\ell)}}{{p}_{n}(\ell)+{P}_{c,n}}\right)=K\,\mathrm{ln}\left(\frac{\mathit{RD}}{Q}\right)-\sum_{n=1}^{K}\sum_{\ell=1}^{L}\left(\frac{Q{\beta}_{n}}{{\gamma}_{n,{a}_{n}}(\ell)}+\mathrm{ln}\left({p}_{n}(\ell)+{P}_{c,n}\right)\right)\,.

(21)

Then, plugging the expression for the users’ SINRs and highlighting the terms that depend on the *k*-th user’s transmit powers, (21) can be rewritten as

\begin{array}{rl}&-\,Q{\beta}_{k}\sum_{\ell=1}^{L}\frac{{\sigma}^{2}\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+\sum_{i\ne k}{p}_{i}(\ell)\,\mathrm{tr}\left({\mathit{R}}_{i,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}(\ell)\right)}{{p}_{k}(\ell)\left({\mathrm{tr}}^{2}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{k}}(\ell)\,{\mathit{R}}_{k,{a}_{k}}^{H}(\ell)\right)\right)}\\ &-\sum_{n\ne k}\sum_{\ell=1}^{L}\frac{Q{\beta}_{n}\,{p}_{k}(\ell)\,\mathrm{tr}\left({\mathit{R}}_{k,{a}_{n}}(\ell)\,{\mathit{R}}_{n,{a}_{n}}(\ell)\right)}{{p}_{n}(\ell)\left({\mathrm{tr}}^{2}\left({\mathit{R}}_{n,{a}_{n}}(\ell)\right)+2\,\mathrm{tr}\left({\mathit{R}}_{n,{a}_{n}}(\ell)\,{\mathit{R}}_{n,{a}_{n}}^{H}(\ell)\right)\right)}\\ &-\sum_{\ell=1}^{L}\mathrm{ln}\left({p}_{k}(\ell)+{P}_{c,k}\right)+{C}_{k}\,,\end{array}

(22)

with *C*_{k} including all the terms that do not depend on the *k*-th user’s transmit powers. Then, by inspection, it is seen that (22) coincides with the objective of (18), and the first part of the thesis is proved. In order to prove the second part, let us first note that (18) is not a convex problem, because its objective is not concave, as can be verified by direct computation of its Hessian. However, the objective of (18) has a unique maximum, which lies within its concave region. To prove this, let us denote each summand of the objective of (18) as the function

{g}_{k}\left({p}_{k}(\ell)\right)=-\left(\frac{{a}_{k}(\ell)}{{p}_{k}(\ell)}+{c}_{k}(\ell)\,{p}_{k}(\ell)+\mathrm{ln}\left({p}_{k}(\ell)+{P}_{c,k}\right)\right)\,.

(23)

For all *k*=1,…,*K*, the first derivative of {g}_{k}\left({p}_{k}(\ell)\right) is given by

\frac{\partial {g}_{k}\left({p}_{k}(\ell)\right)}{\partial {p}_{k}(\ell)}=\frac{{a}_{k}(\ell)}{{p}_{k}^{2}(\ell)}-{c}_{k}(\ell)-\frac{1}{{p}_{k}(\ell)+{P}_{c,k}}\,.

(24)

Setting (24) to zero and elaborating yields the equation

{c}_{k}(\ell)\,{p}_{k}^{3}(\ell)+\left({c}_{k}(\ell)\,{P}_{c,k}+1\right){p}_{k}^{2}(\ell)={a}_{k}(\ell)\,{p}_{k}(\ell)+{a}_{k}(\ell)\,{P}_{c,k}\,,

(25)

which is guaranteed to admit a unique solution for positive {p}_{k}(\ell), because the left-hand side is an increasing cubic with zero intercept, while the right-hand side is a line with positive slope and positive intercept. Denoting by {p}_{k}^{\ast}\left(\ell \right) such a solution, it is also seen that, for all *k*=1,…,*K*, {g}_{k}\left({p}_{k}(\ell)\right) is an increasing function for {p}_{k}\left(\ell \right)\le {p}_{k}^{\ast}\left(\ell \right), thus implying that {p}_{k}^{\ast}\left(\ell \right) is a maximizer of {g}_{k}\left({p}_{k}(\ell)\right). Hence, for all *k*=1,…,*K*, it holds that

\underset{\{{p}_{k}(\ell)\}}{\mathrm{max}}\left(\sum_{\ell=1}^{L}{g}_{k}\left({p}_{k}(\ell)\right)\right)\le\sum_{\ell=1}^{L}\underset{{p}_{k}(\ell)}{\mathrm{max}}\;{g}_{k}\left({p}_{k}(\ell)\right)=\sum_{\ell=1}^{L}{g}_{k}\left({p}_{k}^{\ast}(\ell)\right)\,,

(26)

which shows that \{{p}_{k}^{\ast}(\ell)\}_{\ell=1}^{L} is the unique maximizer of the objective of (18). Next, let us compute the Hessian of the objective of (18). It is easy to realize that all the off-diagonal components equal zero, whereas for all *ℓ*=1,…,*L*, we have

\begin{array}{rl}\frac{{\partial}^{2}{u}_{k}}{\partial {p}_{k}^{2}(\ell)} &= \frac{1}{\left({p}_{k}(\ell)+{P}_{c,k}\right)^{2}}-\frac{2\,{a}_{k}(\ell)}{{p}_{k}^{3}(\ell)}\\ &= \frac{{p}_{k}^{3}(\ell)-2\,{a}_{k}(\ell)\left({p}_{k}(\ell)+{P}_{c,k}\right)^{2}}{{p}_{k}^{3}(\ell)\left({p}_{k}(\ell)+{P}_{c,k}\right)^{2}}\,.\end{array}

(27)

Thus, the resulting Hessian is a diagonal matrix with diagonal entries given by (27). Also, it is seen that (27) vanishes when {p}_{k}^{3}\left(\ell \right)=2{a}_{k}\left(\ell \right)\left({p}_{k}(\ell)+{P}_{c,k}\right)^{2}, which is guaranteed to admit a unique solution for positive {p}_{k}(\ell), because the intersection between a positive cubic curve with zero intercept and a convex parabola with positive intercept is unique. For all *ℓ*=1,…,*L*, denote by {\stackrel{\u0304}{p}}_{k}\left(\ell \right) such a solution. Then, by inspection, it can also be seen that (27) is negative for {p}_{k}\left(\ell \right)<{\stackrel{\u0304}{p}}_{k}\left(\ell \right), thus implying that {g}_{k}\left({p}_{k}(\ell)\right) is concave for {p}_{k}\left(\ell \right)<{\stackrel{\u0304}{p}}_{k}\left(\ell \right). Consequently, {u}_{k}=\sum_{\ell=1}^{L}{g}_{k}\left({p}_{k}(\ell)\right) is concave when {p}_{k}\left(\ell \right)<{\stackrel{\u0304}{p}}_{k}\left(\ell \right) for all *ℓ*=1,…,*L*. Now, in order to complete the proof, it is to be shown that {p}_{k}^{\ast}\left(\ell \right)\le {\stackrel{\u0304}{p}}_{k}\left(\ell \right) for all *ℓ*=1,…,*L*, which is equivalent to showing that the first-order derivative of {g}_{k}\left({p}_{k}(\ell)\right) is negative when evaluated at {\stackrel{\u0304}{p}}_{k}\left(\ell \right). To see this, note that from the numerator of (27), it follows that, for all *ℓ*=1,…,*L*, {\stackrel{\u0304}{p}}_{k}\left(\ell \right) has to satisfy the equation \frac{{a}_{k}(\ell)}{{\stackrel{\u0304}{p}}_{k}^{2}(\ell)}=\frac{{\stackrel{\u0304}{p}}_{k}(\ell)}{2\left({\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}\right)^{2}}. Therefore, from (24), we have

\frac{\partial {g}_{k}\left({\stackrel{\u0304}{p}}_{k}(\ell)\right)}{\partial {p}_{k}(\ell)}=\frac{{a}_{k}(\ell)}{{\stackrel{\u0304}{p}}_{k}^{2}(\ell)}-{c}_{k}(\ell)-\frac{1}{{\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}}=

(28)

\begin{array}{rl}&\frac{{\stackrel{\u0304}{p}}_{k}(\ell)}{2\left({\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}\right)^{2}}-{c}_{k}(\ell)-\frac{1}{{\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}}\\ =&-\frac{{\stackrel{\u0304}{p}}_{k}(\ell)+2{P}_{c,k}+2{c}_{k}(\ell)\left({\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}\right)^{2}}{2\left({\stackrel{\u0304}{p}}_{k}(\ell)+{P}_{c,k}\right)^{2}}<0\,.\end{array}

(29)

Hence, the thesis. □
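The unique positive root of (25), which the proof identifies as the per-carrier maximizer {p}_{k}^{\ast}(\ell), is easy to compute numerically. A minimal bisection sketch (the initial bracket size and iteration count are arbitrary choices, not values from the paper):

```python
def optimal_power(a, c, Pc, hi=1e6, iters=200):
    """Unique positive root of (25), i.e., the stationary point of (24).
    a, c, Pc stand for a_k(l), c_k(l), P_{c,k}; found by bisection."""
    f = lambda p: c * p ** 3 + (c * Pc + 1.0) * p ** 2 - a * p - a * Pc
    lo = 0.0                      # f(0) = -a*Pc < 0
    while f(hi) <= 0.0:           # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

At the returned root, the first-order derivative (24) vanishes, since (25) is exactly the stationarity condition of (24) after clearing denominators.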

Thus, Proposition 3 allows us to reformulate the non-convex problem (18) as a convex one by restricting the problem domain to the concave region of the objective, which can be done by simply imposing the additional constraints {p}_{k}\left(\ell \right)\le {\stackrel{\u0304}{p}}_{k}\left(\ell \right) for all *ℓ*=1,…,*L* in (18). This causes no loss of optimality, since the global maximum has been shown to lie in the concave region of the objective function.

Moreover, since numerical algorithms might still be too complex when the network’s load grows large, or in scenarios where computational complexity is a critical issue, in the following we provide another, even more computationally efficient, technique to solve (18), based on the alternating maximization algorithm [35]. In alternating maximization, a function is cyclically maximized with respect to one variable (or block of variables) while keeping the others fixed. For the case at hand, this means that in each cycle, problem (18) is solved with respect to one of the transmit powers, say {p}_{k}(\ell), while keeping the other powers \{{p}_{k}(q)\}_{q\ne\ell} fixed, thus converting problem (18) into a sequence of scalar sub-problems, each of which can be solved in closed form. Indeed, when only the generic power {p}_{k}(\ell) is to be optimized, problem (18) can be recast as

$$\begin{cases}\max_{p_k(\ell)}\; g_k\left(p_k(\ell)\right)\\ \text{s.t.}\;\; p_k(\ell)\ge 0\,,\;\; p_k(\ell)\le P_{\max,k}-\sum_{q\neq\ell}p_k(q)\end{cases}$$

(30)

In Proposition 2, it has already been proved that $g_k(p_k(\ell))$ admits a unique maximizer $p_k^{\ast}(\ell)$, which is given by the solution to (25). Consequently, the solution to problem (30) is given by

$$p_k(\ell)=\min\left\{P_{\max,k}-\sum_{q\neq\ell}p_k(q)\,,\;\, p_k^{\ast}(\ell)\right\}\,.$$

(31)

Equipped with this result, the formal alternating maximization algorithm to solve problem (18) can be stated as follows.

**Algorithm 2** Solution of problem (18)
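As a rough illustration of the listing above, the alternating maximization can be sketched in Python as follows. Here, $p_k^{\ast}(\ell)$ is obtained by bisection on the first-order condition (25), taken, consistently with the derivative in (28), as $a_k(\ell)/p^{2}-c_k(\ell)-1/(p+P_{c,k})=0$, and each scalar sub-problem is solved via (31). For simplicity, this sketch declares convergence when the power vector stops changing, rather than monitoring the objective of (18) as the paper prescribes; the inputs `a`, `c`, `P_c`, `P_max` stand for the coefficients $a_k(\ell)$, $c_k(\ell)$ and the constants $P_{c,k}$, $P_{\max,k}$.

```python
def p_star(a, c, P_c, tol=1e-10):
    """Unconstrained maximizer p*: root of a/p^2 - c - 1/(p + P_c) = 0,
    i.e., the first-order condition (25); unique by Proposition 2."""
    psi = lambda p: a / p**2 - c - 1.0 / (p + P_c)
    lo, hi = 1e-12, 1.0
    while psi(hi) > 0:          # grow the bracket until psi changes sign
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if psi(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def alternating_max(a, c, P_c, P_max, eps=1e-8, max_cycles=100):
    """Alternating maximization for user k (sketch of Algorithm 2).

    a, c: per-subcarrier coefficient lists; P_c, P_max: scalars.
    Each scalar sub-problem is solved in closed form via update (31).
    """
    L = len(a)
    p = [0.0] * L
    for _ in range(max_cycles):
        delta = 0.0
        for l in range(L):
            budget = P_max - sum(p[q] for q in range(L) if q != l)
            new = min(budget, p_star(a[l], c[l], P_c))   # update (31)
            delta = max(delta, abs(new - p[l]))
            p[l] = new
        if delta < eps:         # fixed point reached
            break
    return p
```

Consistently with the discussion below, for a large power budget the `min` in (31) is never active, and the sketch returns $p_k(\ell)=p_k^{\ast}(\ell)$ after a single cycle over the *L* subcarriers.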

As for Algorithm 1, convergence of Algorithm 2 is declared when the increase in the objective of (18) over two successive outer loops falls below a given threshold. Convergence of Algorithm 2 can be proved by arguments similar to those used for Algorithm 1.

From (31), it also follows that for large $P_{\max,k}$, Algorithm 2 is guaranteed to converge to the global solution of (18) in *L* iterations. Indeed, for large $P_{\max,k}$, after *L* iterations of the alternating maximization, the resulting transmit powers are $p_k(\ell)=p_k^{\ast}(\ell)$, which is the global maximizer of (18). In Section 6, the performance obtained using Algorithm 2 to solve (18) will be compared with that achieved by solving (18) through its convex reformulation.

### 3.3 The proposed algorithm as a potential game

In this section, we briefly take a different look at the proposed algorithm, showing how it fits into the framework of game theory, and in particular of potential games. We first recall some basics of non-cooperative games and potential games.

In its strategic form, a game $\mathcal{G}$ can be described as a triplet $\mathcal{G}=\{\mathcal{K},\{\mathcal{S}_k\}_{k=1}^{K},\{u_k\}_{k=1}^{K}\}$, wherein $\mathcal{K}$ is the set of players (e.g., the communicating devices in a wireless network), $\mathcal{S}_k$ is the set of all possible strategies of the *k*-th player, and $u_k$ is the utility function, or payoff, of the *k*-th player; $u_k$ is a scalar function of the strategies taken by all players of the game. Thus, a change in strategy by one player affects all the other players as well and triggers a dynamic process, in which players iteratively update their own strategies in reaction to changes in the strategies of the other players. This process is mathematically represented by the set of coupled problems

$$\max_{s_k}\; u_k\left(s_k,\mathbf{s}_{-k}\right)\,,\qquad \forall k\in\mathcal{K}\,,$$

(32)

with $s_k$ and $\mathbf{s}_{-k}$ being the *k*-th player's strategy and the set of the other players' strategies, respectively. The coupled problems (32) are usually referred to as the best-response dynamics (BRD), because for all $k\in\mathcal{K}$, in the *k*-th iteration, given the strategies $\mathbf{s}_{-k}$ of the other players, player *k* responds by choosing his own strategy $s_k$ so as to maximize his own utility function. Each fixed point of (32), if any, is termed a Nash equilibrium (NE). At an NE, no user can *unilaterally* improve his own utility by taking a different strategy; that is, provided the other users' strategies do not change, no user has an interest in changing his own strategy. In general, for a generic strategic-form game, convergence of the BRD to an NE is not guaranteed, even if one or more NEs exist.

We now give the formal definition of a potential game [16]. A strategic game $\mathcal{G}=\left[\mathcal{K},\{\mathcal{S}_k\},\{u_k\}\right]$ is called an *exact potential game* if there exists a function $V:\mathcal{S}_1\times\mathcal{S}_2\times\dots\times\mathcal{S}_K\to\mathbb{R}$ such that for any $k\in\mathcal{K}$ and for any $(s_k,\mathbf{s}_{-k}),(s_k^{\ast},\mathbf{s}_{-k})\in\mathcal{S}_1\times\mathcal{S}_2\times\dots\times\mathcal{S}_K$, we have

$$u_k\left(s_k,\mathbf{s}_{-k}\right)-u_k\left(s_k^{\ast},\mathbf{s}_{-k}\right)=V\left(s_k,\mathbf{s}_{-k}\right)-V\left(s_k^{\ast},\mathbf{s}_{-k}\right)\,.$$

(33)

The function *V* is called the potential function of the game. A very attractive property of potential games is that at least one NE is guaranteed to exist and that the BRD always converges to an NE, provided the potential function is upper-bounded. In our scenario, the distributed resource allocation algorithm can be seen as a potential game $\mathcal{G}_{\mathrm{pot}}$, with the mobile users as players, potential function *V* given by (21), and utility functions $u_k=V-C_k$, for all *k*=1,…,*K*, with $C_k$ being the additive constant that appears in (56). Thus, the resource allocation policy obtained at the fixed point of Algorithm 1 can be regarded as an NE of $\mathcal{G}_{\mathrm{pot}}$.
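As a toy numerical illustration of the exact-potential identity (33) (using a hypothetical potential on a finite strategy set, not the paper's *V* from (21)), one can check that utilities of the form $u_k=V-C_k$, with $C_k$ independent of the strategies, satisfy (33) by construction:

```python
import random

def check_exact_potential(V, utils, strategies, trials=200, tol=1e-12):
    """Numerically check the exact-potential identity (33):
    u_k(s_k, s_-k) - u_k(s_k', s_-k) == V(s_k, s_-k) - V(s_k', s_-k).

    V maps a strategy profile (tuple) to a real; utils[k] likewise.
    strategies[k] is the finite strategy set of player k (toy setting).
    """
    K = len(utils)
    rng = random.Random(0)      # seeded for reproducibility
    for _ in range(trials):
        s = [rng.choice(strategies[k]) for k in range(K)]
        for k in range(K):      # unilateral deviation of player k
            s_alt = list(s)
            s_alt[k] = rng.choice(strategies[k])
            du = utils[k](tuple(s)) - utils[k](tuple(s_alt))
            dV = V(tuple(s)) - V(tuple(s_alt))
            if abs(du - dV) > tol:
                return False
    return True
```

The per-player additive constants $C_k$ cancel on the left-hand side of (33), which is precisely why the $u_k=V-C_k$ structure makes $\mathcal{G}_{\mathrm{pot}}$ an exact potential game; a utility that scales *V* instead of shifting it would violate (33).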