3.1 Definition of energy efficiency with imperfect CSI
Traditionally, energy efficiency is defined for a single link (M = 1), under the assumption that both the link adaptation controller and the receiver know the channel state, and that the channel state stays the same during link adaptation and reception. Under this assumption, the link adaptation controller can always adapt the transmission power and rate so that the packet is successfully decoded with probability one, according to Shannon's channel coding theorem [13]. Then, given a channel coefficient h ∈ ℂ, the energy efficiency of the single link, where a UE always successfully transmits a packet to a receiver with transmission rate R by spending total power P = P_{T} + P_{C}, is defined (e.g., [9]) as
U_{h}\left(P_{T}\right)\triangleq \frac{R\left(P_{T},h\right)}{P_{T}+P_{C}}\quad \text{nats per Joule}
(1)
Assuming capacity-achieving channel codes and a sufficiently large block length, we have R(P_{T}, h) = ln(1 + |h|^{2}P_{T}) nats. The maximal energy efficiency U_{h}^{*} for a given channel coefficient h is achieved with the optimal rate and power allocation P_{T}^{*}(h):
U_{h}^{*}=\frac{R\left(P_{T}^{*}(h),h\right)}{P_{T}^{*}(h)+P_{C}}\quad \text{and}\quad P_{T}^{*}(h)=\underset{P_{T}}{\arg\max}\,\frac{R\left(P_{T},h\right)}{P_{T}+P_{C}}.
(2)
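As a minimal numerical sketch of definitions (1) and (2), the following evaluates U_h(P_T) and locates the maximizing power by a golden-section search. The function names, the search interval upper bound `p_max`, and the tolerance are illustrative assumptions, not part of the original text. The search relies on U_h being unimodal in P_T, which holds here because the numerator is concave and the denominator is affine in P_T.

```python
import math


def energy_efficiency(p_t, gain, p_c):
    """U_h(P_T) of (1) in nats per Joule, with gain = |h|^2 and R = ln(1 + gain * P_T)."""
    return math.log1p(gain * p_t) / (p_t + p_c)


def optimal_power(gain, p_c, p_max=1e3, tol=1e-9):
    """Golden-section search for the P_T in [0, p_max] that maximizes U_h(P_T).

    p_max is an assumed search bound (hypothetical); if the true maximizer
    lies above it, the returned value is only the best point in the interval.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, p_max
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if energy_efficiency(a, gain, p_c) < energy_efficiency(b, gain, p_c):
            lo = a  # the maximizer lies to the right of a
        else:
            hi = b  # the maximizer lies to the left of b
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    p_star = optimal_power(gain=1.0, p_c=1.0)
    print(p_star, energy_efficiency(p_star, 1.0, 1.0))
```

For |h|^2 = 1 and P_C = 1, setting the derivative of (1) to zero gives ln(1 + P_T) = 1, so the search should return a value close to e - 1 with an efficiency close to 1/e.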
Now, once the distribution of the random variable characterizing h is given, the optimal energy efficiency of the network U^{*} can be defined as U^{*}\triangleq E\left(U_{h}^{*}\right).
However, as the channel states used for link adaptation are not necessarily the same as those seen by the receiver, we consider a general framework that accounts for imperfect CSI at the link adaptation controller and extend the energy efficiency definition accordingly. For the uplink transmission considered in this section, we make the following assumptions.

(Assumption 1) The actual channel vector of the network during the transmission of a packet is h ∈ ℂ^{M×1}, and the m-th component of h, denoted by h_{m}, is the channel coefficient between BS m and the UE. The received signal at each BS is corrupted by circularly symmetric additive white Gaussian noise (AWGN) with zero mean and unit variance. We note that this assumption models a receiver that treats interference from other UEs transmitting to other base stations as noise and scales the received signal so that the interference-plus-noise power is one.

(Assumption 2) The receiver is aware of the actual channel vector h.

(Assumption 3) The link adaptation controller determines the transmission rate R and the transmission power P_{T} based on the CSI \tilde{h}=h-w, where w ∈ ℂ^{M×1} is a random vector that characterizes the CSI error and models the uncoordinated interference from other UEs in the cellular network. The link adaptation controller is aware^{2} of the distribution of w.

(Assumption 4) To cope with channel outages, the link adaptation controller applies a power backoff strategy to determine the transmission rate. For the rate calculation, the controller assumes that the transmission power is αP_{T}, where α ∈ (0,1], even though the actual transmission power is P_{T}. In this case, the transmission rate R is determined as R=R\left(P_{T},\tilde{h},\alpha\right)=\ln\left(1+\alpha P_{T}\|\tilde{h}\|^{2}\right).
Under these assumptions, we define the expected energy efficiency of the network seen at the link adaptation controller, \tilde{U}_{\tilde{h}}\left(P_{T},R\right), as
\tilde{U}_{\tilde{h}}\left(P_{T},R\right)\triangleq \frac{\left(1-\Pr\left(O\right)\right)R\left(P_{T},\tilde{h},\alpha\right)}{P_{T}+P_{C}}.
(3)
Here, the numerator is the average throughput (nats per channel use) achieved with the transmission rate R\left(P_{T},\tilde{h},\alpha\right) [14], and Pr(O) is the probability of the hypothetical channel outage event seen at the link adaptation controller:
\Pr\left(O\right)=\Pr\left(R\left(P_{T},\tilde{h},\alpha\right)\ge \bar{R}\left(P_{T},\tilde{h}+w\right)\right),
(4)
where \bar{R}\left(P_{T},\tilde{h}+w\right) is the maximum achievable rate with the hypothetical actual channel \tilde{h}+w, and hence \bar{R}\left(P_{T},\tilde{h}+w\right)=\ln\left(1+P_{T}\|\tilde{h}+w\|^{2}\right). Then, the optimal transmission power P_{T}^{*} and transmission rate R^{*} for a given \tilde{h} are determined by
\left(P_{T}^{*},R^{*}\right)=\underset{P_{T},R}{\arg\max}\,\tilde{U}_{\tilde{h}}\left(P_{T},R\right),
(5)
and the maximal expected energy efficiency for a given \tilde{h} is given by
\tilde{U}_{\tilde{h}}^{*}=\tilde{U}_{\tilde{h}}\left(P_{T}^{*},R^{*}\right).
(6)
Once the distribution of the random vector characterizing \tilde{h}=h-w is given, as well as the distribution of w, the optimal energy efficiency \tilde{U}^{*} of the network can be defined as \tilde{U}^{*}\triangleq E\left(\tilde{U}_{\tilde{h}}^{*}\right).
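The definitions (3) and (4) can be checked numerically with a small Monte Carlo sketch such as the one below. It is an illustration under stated assumptions rather than the procedure used for the results in this section: w is taken to have independent circularly symmetric complex Gaussian components whose total variance sigma_w^2 is split equally between the real and imaginary parts, and the function name, trial count, and seed are hypothetical choices.

```python
import math
import random


def expected_energy_efficiency(h_tilde, p_t, p_c, alpha, sigma_w,
                               n_trials=20000, seed=0):
    """Monte Carlo estimate of (3) and of the outage probability (4).

    h_tilde is a list of complex CSI values, one per base station.
    Assumption: each component of w is complex Gaussian with total variance
    sigma_w**2, i.e. sigma_w**2 / 2 per real dimension.
    """
    rng = random.Random(seed)
    gain = sum(abs(x) ** 2 for x in h_tilde)
    rate = math.log1p(alpha * p_t * gain)  # R(P_T, h_tilde, alpha) of Assumption 4
    s = sigma_w / math.sqrt(2.0)           # standard deviation per real dimension
    outages = 0
    for _ in range(n_trials):
        hyp_gain = sum(
            abs(x + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))) ** 2
            for x in h_tilde
        )
        # Outage: the chosen rate exceeds the rate supported by h_tilde + w.
        if rate > math.log1p(p_t * hyp_gain):
            outages += 1
    p_out = outages / n_trials
    return (1.0 - p_out) * rate / (p_t + p_c), p_out


if __name__ == "__main__":
    u_est, p_out = expected_energy_efficiency([1 + 0j], 1.0, 1.0, 0.5, 0.3)
    print(u_est, p_out)
```

The estimate uses a fixed seed so that repeated runs agree; the accuracy of the outage estimate scales with the square root of `n_trials`.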
3.2 With perfect CSI at the link adaptation controller
In this subsection, we consider the uplink transmission with perfect CSI at the link adaptation controller. To model this case, in addition to the four assumptions in Section 3.1, we further assume that the CSI error vector w is deterministically 0, so that \tilde{h}=h, and that the link adaptation controller chooses α = 1 regardless of the CSI h. In this case, the maximization problem of (6) reduces to (2) and has been analyzed in [9], as stated in the following theorem.
Theorem 1. The optimal power allocation P_{T}^{*} and the optimal transmission rate R^{*} for the uplink transmission with perfect CSI at the link adaptation controller, for a given h, are obtained from the following relation:
P_{T}^{*}=\left(\frac{1}{U_{h}^{*}}-\frac{1}{\|h\|^{2}}\right)^{+}\quad \text{and}\quad R^{*}=\ln\left(1+P_{T}^{*}\|h\|^{2}\right),
(7)
where
U_{h}^{*}=U_{h}\left(P_{T}^{*}\right)=\frac{\ln\left(1+P_{T}^{*}\|h\|^{2}\right)}{P_{T}^{*}+P_{C}}.
(8)
We note that efficient algorithms are available to solve the power maximization (7), e.g., from [9].
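Because (7) and (8) define P_{T}^{*} and U_{h}^{*} in terms of each other, one way to solve them is a fixed-point iteration that alternates between the two relations. The sketch below uses a Dinkelbach-style update chosen here for illustration; it is not necessarily the algorithm of [9], and the function name, starting point, and stopping rule are assumptions.

```python
import math


def dinkelbach_power(gain, p_c, p_init=1.0, tol=1e-12, max_iter=100):
    """Iterate (8) then (7) until the power stops changing.

    gain is ||h||^2 and must be positive, as must p_c and p_init.
    Returns the tuple (P_T, R, U_h).
    """
    p_t = p_init
    for _ in range(max_iter):
        u = math.log1p(gain * p_t) / (p_t + p_c)      # relation (8)
        p_next = max(1.0 / u - 1.0 / gain, 0.0)       # relation (7), with (.)^+
        converged = abs(p_next - p_t) < tol
        p_t = p_next
        if converged:
            break
    rate = math.log1p(gain * p_t)
    return p_t, rate, rate / (p_t + p_c)


if __name__ == "__main__":
    print(dinkelbach_power(gain=1.0, p_c=1.0))
```

With ||h|| = 1 and P_C = 1, the fixed point satisfies ln(1 + P_T) = 1, so the iteration should settle near P_T = e - 1 ≈ 1.718, R = 1 nat, and U_h = 1/e.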
3.3 With imperfect CSI at the link adaptation controller
The expected energy efficiency at the link adaptation controller, \tilde{U}_{\tilde{h}}\left(P_{T},\alpha\right), of the uplink transmission modeled in Section 3.1 is further expanded as
\begin{array}{ll}\tilde{U}_{\tilde{h}}\left(P_{T},\alpha\right)& =\frac{R\left(P_{T},\tilde{h},\alpha\right)}{P_{T}+P_{C}}\cdot \left(1-\Pr\left(O\right)\right)\\ & =\frac{\ln\left(1+\alpha P_{T}\|\tilde{h}\|^{2}\right)}{P_{T}+P_{C}}\cdot \Pr\left(\ln\left(1+\alpha P_{T}\|\tilde{h}\|^{2}\right)\le \ln\left(1+P_{T}\|\tilde{h}+w\|^{2}\right)\right)\\ & =\frac{\ln\left(1+\alpha P_{T}\|\tilde{h}\|^{2}\right)}{P_{T}+P_{C}}\cdot \Pr\left(\|\tilde{h}+w\|\ge \sqrt{\alpha}\|\tilde{h}\|\right).\end{array}
(9)
Denoting the probability density function and the cumulative distribution function of the random variable \|\tilde{h}+w\| by f_{\|\tilde{h}+w\|}\left(x\right) and F_{\|\tilde{h}+w\|}\left(x\right)=\int_{0}^{x}f_{\|\tilde{h}+w\|}\left(\xi\right)d\xi, respectively, the expected energy efficiency is
\tilde{U}_{\tilde{h}}\left(P_{T},\alpha\right)=\frac{\ln\left(1+\alpha P_{T}\|\tilde{h}\|^{2}\right)}{P_{T}+P_{C}}\cdot \left(1-F_{\|\tilde{h}+w\|}\left(\sqrt{\alpha}\|\tilde{h}\|\right)\right).
(10)
The maximal expected energy efficiency with this link adaptation controller is obtained through a joint maximization over P_{T} and α:
\tilde{U}_{\tilde{h}}^{*}=\underset{P_{T},\alpha}{\max}\,\tilde{U}_{\tilde{h}}\left(P_{T},\alpha\right)
(11)
We obtain the following theorem on this joint maximization.
Theorem 2. Define P_{T}^{(1)} and R^{(1)} as
P_{T}^{(1)}\triangleq \left(\frac{1}{\tilde{U}_{\tilde{h}}\left(P_{T}^{(1)},1\right)}-\frac{1}{\|\tilde{h}\|^{2}}\right)^{+}\quad \text{and}\quad R^{(1)}\triangleq \ln\left(1+P_{T}^{(1)}\|\tilde{h}\|^{2}\right),
(12)
and let A be a positive real number satisfying the following relation:
\frac{A\|\tilde{h}\|^{2}}{\left(1+A\|\tilde{h}\|^{2}\right)\ln\left(1+A\|\tilde{h}\|^{2}\right)}=\frac{f_{\|\tilde{h}+w\|}\left(\|\tilde{h}\|\right)}{1-F_{\|\tilde{h}+w\|}\left(\|\tilde{h}\|\right)}.
(13)
If P_{T}^{(1)}\le A, then the maximum expected energy efficiency is achieved with α^{*} = 1 and with the following transmission power and transmission rate:
P_{T}^{*}=P_{T}^{(1)}\quad \text{and}\quad R^{*}=R^{(1)}.
(14)
Otherwise, the maximum expected energy efficiency is achieved with α^{*} < 1, and P_{T}^{*} and α^{*} satisfy the following relation:
\frac{P_{T}^{*}\|\tilde{h}\|^{2}}{\left(1+\alpha^{*}P_{T}^{*}\|\tilde{h}\|^{2}\right)\ln\left(1+\alpha^{*}P_{T}^{*}\|\tilde{h}\|^{2}\right)}=\frac{f_{\|\tilde{h}+w\|}\left(\sqrt{\alpha^{*}}\|\tilde{h}\|\right)}{1-F_{\|\tilde{h}+w\|}\left(\sqrt{\alpha^{*}}\|\tilde{h}\|\right)}.
(15)
Proof. See Appendix A.
Theorem 2 states that when the condition P_{T}^{(1)}\le A holds, the link adaptation controller can maximize the expected energy efficiency by choosing P_{T}=P_{T}^{(1)} and R=\ln\left(1+P_{T}^{(1)}\|\tilde{h}\|^{2}\right), treating \tilde{h} as the actual CSI, where P_{T}^{(1)} can be found efficiently by the algorithms introduced in [9]. To see when the condition P_{T}^{(1)}\le A holds, we need to take a closer look at A, which is determined by the distribution of w.
As an example, we consider a circularly symmetric Gaussian distribution for each component of w, with mean 0 and variance \sigma_{w}^{2}. Then, \|\tilde{h}+w\|^{2} is noncentral chi-square distributed, with 2M degrees of freedom, mean \|\tilde{h}\|^{2} and noncentrality parameter 2\|\tilde{h}\|^{2}/\sigma_{w}^{2}. To evaluate A in (13), we further assume that \|\tilde{h}\|=1 and P_{C} = 1, and consider various \sigma_{w}^{2} with M = 1, 2, 3, 4. We first notice that P_{T}^{(1)}\approx 1.72 with \|\tilde{h}\|=1 regardless of the value of \sigma_{w}^{2}, as P_{T}^{(1)} does not depend on the distribution of w. In addition, we define B\triangleq f_{\|\tilde{h}+w\|}\left(\|\tilde{h}\|\right)/\left(1-F_{\|\tilde{h}+w\|}\left(\|\tilde{h}\|\right)\right) and evaluate B with these parameters, as shown in Figure 2. Since the left-hand side of (13) is a decreasing function of A, the A satisfying (13) gets smaller as B increases. As shown in Figure 2, B is bounded above for a given M and σ_{w} > 0, and hence the A satisfying (13) is no smaller than the A obtained with the supremum of B for the given M and σ_{w} > 0. For example, when M = 1, the supremum of B is equal to 0.5, and the A satisfying (13) with B = 0.5 is A ≃ e^{2}, which is greater than P_{T}^{(1)}\approx 1.72. Since the supremum of B decreases further as M increases, a greater A is needed to satisfy (13) with the supremum of B as M increases. From these observations, we find that for M ≥ 1, the condition A\ge P_{T}^{(1)}\approx 1.72 holds for all \sigma_{w}^{2}>0, and hence the link adaptation controller can simply assign P_{T}=P_{T}^{(1)} and R=\ln\left(1+P_{T}^{(1)}\|\tilde{h}\|^{2}\right), without explicitly considering the variance of the CSI error \sigma_{w}^{2}, to achieve the maximum expected energy efficiency of the network.
When the condition P_{T}^{(1)}\le A holds, so that α = α^{*} = 1, the maximum expected energy efficiency of the uplink coordinated multipoint reception with imperfect CSI at the link adaptation controller is
\tilde{U}_{\tilde{h}}^{*}=\frac{\ln\left(1+P_{T}^{(1)}\|\tilde{h}\|^{2}\right)}{P_{T}^{(1)}+P_{C}}\cdot \left(1-F_{\|\tilde{h}+w\|}\left(\sqrt{\alpha}\|\tilde{h}\|\right)\right).
(16)
For example, Figure 3 shows the resulting maximum expected energy efficiency with \|\tilde{h}\|=1, P_{C}=1, and w whose elements are circularly symmetric Gaussian with mean 0 and variance \sigma_{w}^{2}. \tilde{U}_{\tilde{h}}^{*} decreases as σ_{w} increases, and it increases as the number of cooperating base stations M increases.
In general, given the distribution of the random vector characterizing \tilde{h}, the maximum expected energy efficiency \tilde{U}^{*}=E\left(\tilde{U}_{\tilde{h}}^{*}\right) increases with the number M of base stations participating in the cooperative link adaptation and decoding. This is because of the increased transmission rates and the smaller outage probability for a given transmission rate, both of which result from the diversity reception of the coordinated multipoint reception.