In RBEBPC, the goal is to improve the system throughput by excluding the resource blocks with strong interference and by redistributing the system interference among the small cells. To conduct RBEBPC, some mutual-interference-related information, such as **G** and the previous **P**, must be exchanged among the small cells. Considering the large number of small cells and the limited backhaul capability of each small cell, a centralized cooperation structure is adopted to guarantee the efficiency of information exchange. The macro cell, which covers all the small cells in the system, acts as the centralized node to collect and forward all the necessary information. The procedure of RBEBPC is described in Algorithm 1.

In order to obtain **G**, in the initialization state, all small cells transmit with equal power on all available sub-channels (line 3 in Algorithm 1). Then, the macro cell collects and forwards the necessary parts of **G** to the small cells (lines 4 and 5). After information collection, the small cells conduct the cooperative power control for *K* rounds. In each round, the small cells first report their used power levels to the macro cell (line 7). The macro cell, based on the reported power levels, plays the resource block exclusion game and calculates the interference constraint for each small cell based on the exclusion results (*step 1*). Then each small cell solves a throughput maximization problem under the constraint received from the macro cell (*step 2*).
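The round structure of Algorithm 1 can be sketched as follows. This is a minimal skeleton, not the paper's pseudocode: `exclusion_step` and `power_step` are our stand-ins for step 1 and step 2, and all names are illustrative.

```python
import numpy as np

def rbebpc(G, K, P0, L, N, exclusion_step, power_step):
    """Skeleton of Algorithm 1. The callables `exclusion_step` and
    `power_step` are placeholders for the macro-cell exclusion game (step 1)
    and the per-cell power optimization (step 2)."""
    # Initialization: each small cell spreads its budget P0 equally over L sub-channels
    P = np.full((L, N), P0 / L)
    for _ in range(K):                      # K cooperation rounds
        # Step 1 (macro cell): resource block exclusion + interference constraints
        A, I_suffered, I_generated = exclusion_step(P, G)
        # Step 2 (each small cell j): throughput maximization under its constraint
        for j in range(N):
            P[:, j] = power_step(j, P, G, A, I_suffered, I_generated)
    return P
```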

The necessary signaling overhead among the small cells, the serving UEs, and the macro cell in the RBEBPC scheme can be estimated as follows. In the *Basic information collection* phase, if the quantification of a sub-channel gain of a certain small cell needs *α* bits, each UE should feed back *αNL* bits to the access small cell to indicate the interfering channel gains on all the sub-channels. Then, each small cell forwards the *αNL* bits to the macro cell via the backhaul link. Each small cell receives *αNL* bits of information to indicate the channel gains correlated to the interfering UEs in the system. In the *Cooperation based power control* phase, each small cell forwards *βL* bits to indicate the previous power levels in step 1, where *β* is the number of bits needed to quantify the power level on a sub-channel. Each small cell receives *NL* bits that indicate the resource block exclusion results and *γ*(*L*+1) bits that indicate the suffered interference on each sub-channel and the generated total interference, where *γ* is the number of bits needed to quantify the interference level on a sub-channel. So in Algorithm 1, the signaling overhead between a UE and the serving small cell is *αNL* bits, and the signaling overhead between a small cell and the macro cell is 2*αNL*+*K*(*βL*+*NL*+*γ*(*L*+1)) bits. For a typical wireless channel, the time scale on which the path amplitude changes is several hundreds of milliseconds [19]. In long-term evolution (LTE), the transmission time interval is 1 ms, which is much smaller than the time scale of path amplitude change. So in the extreme case, the backhaul link between the small cell and the macro cell should support a data rate of max{*αNL*, *βL*, *NL*+*γ*(*L*+1)} kbps. For the typical configuration *α*=*β*=*γ*=64, *N*=10, *L*=50, the required backhaul data rate is 0.032 Gbps. If the backhaul links are implemented with standardized passive optical network (PON) [20] systems, whose downstream rate is about 2.5 Gbps, the signaling overhead of RBEBPC is acceptable in practice.
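As a quick sanity check of these figures, the overhead expressions can be evaluated directly (the function name is ours, not the paper's):

```python
def rbebpc_overhead(alpha, beta, gamma, N, L, K):
    """Signaling overhead estimates from the text (all values in bits)."""
    ue_to_cell = alpha * N * L                                   # UE -> serving cell
    cell_to_macro = 2 * alpha * N * L + K * (beta * L + N * L + gamma * (L + 1))
    # The largest single message per 1-ms TTI drives the backhaul rate requirement
    peak = max(alpha * N * L, beta * L, N * L + gamma * (L + 1))
    return ue_to_cell, cell_to_macro, peak

ue, total, peak = rbebpc_overhead(64, 64, 64, 10, 50, 1)
rate_gbps = peak * 1000 / 1e9   # peak bits per 1-ms TTI -> Gbps
```

With the quoted configuration, `peak` is 32,000 bits per TTI, i.e., the 0.032 Gbps figure in the text.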

In the following subsections, the resource block exclusion, interference calculation, and power optimization in RBEBPC are described in detail.

### 4.1 Resource block exclusion

Given the system power allocation result **P** and system channel gain matrix **G**, a binary matrix **A**∈{0,1}^{L×N} is defined to map the available sub-channels in the system:

$$ [\mathbf{A}]_{lj} =\left\{ \begin{array}{ll} 1, & \text{if } [\mathbf{P}]_{lj}> 0, \\ 0, & \text{if } [\mathbf{P}]_{lj}\le 0. \end{array} \right. $$

((10))

For the *j*-th small cell, the *l*-th sub-channel is available if and only if [**A**]_{lj}=1. Otherwise, [**A**]_{lj}=0 and the *j*-th small cell does not use the *l*-th sub-channel.
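Equation 10 amounts to a simple thresholding of **P**; a one-line sketch, assuming **P** is given as an L×N array:

```python
import numpy as np

def availability_matrix(P):
    """Eq. (10): A[l, j] = 1 iff small cell j allocates positive power to
    sub-channel l. P is the L x N system power allocation matrix."""
    return (np.asarray(P) > 0).astype(int)
```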

In Figure 2, the total available system resource blocks can be partitioned into \(\mathcal {C}=\{\mathcal {N}_{1},\mathcal {N}_{2},\dots,\mathcal {N}_{L}\}\), where \(\vert \mathcal {C}\vert =L\) and \(\mathcal {N}_{l}\subseteq \mathcal {N}\) denotes the set of small cells for which the *l*-th sub-channel is available (i.e., \(\mathcal {N}_{l}=\{j\in \mathcal {N}|[\mathbf {A}]_{\textit {lj}}=1\}\)). Given a sub-channel, the allocated power is not transferable, i.e., the power cannot be shared among small cells in an arbitrary ratio. For the sub-channels with strong interference, the system throughput will improve if some small cells mute the sub-channels. In this subsection, we resort to the coalition game to determine the sub-channels that need to be muted. Note that the system interference on the *l*-th sub-channel is only determined by the small cells in \(\mathcal {N}_{l}\). Besides, the system interferences of different sub-channels are independent once **P** is given. So in the following analysis, we only consider the *l*-th sub-channel and the correlated small cells in \(\mathcal {N}_{l}\).

###
**Definition 1**.

Let \(\mathcal {N}=\{1,2,\dots,N\}\) be a fixed set of players, called the *grand coalition*. Non-empty subsets of \(\mathcal{N}\) are called *coalitions*. A *collection* (in the grand coalition \(\mathcal{N}\)) is any family \(\mathcal {D}=\{\mathcal {D}_{1},\mathcal {D}_{2},\dots,\mathcal {D}_{s}\}\) of mutually disjoint coalitions, and *s* is called its *size*. If additionally \(\mathop \cup \limits _{t=1}^{s}\mathcal {D}_{t}=\mathcal {N}\), the collection is called a *partition* of \(\mathcal{N}\).

Based on Definition 1, the throughput of a coalition \(\mathcal {E}\subseteq \mathcal {N}_{l}\) is as follows:

$$ R(\mathcal{E})=\sum\limits_{j\in\mathcal{E}}\frac{W}{L}\log_{2}\left(1+\frac{{p_{j}^{l}} G_{jj}^{l}}{\sigma^{2}+{I_{j}^{l}}}\right). $$

((11))

If the small cells in the coalition \(\mathcal{E}\) mute in a cooperative manner and only one small cell in \(\mathcal{E}\) transmits, the achievable throughput is as follows:

$$ \tilde{R}(\mathcal{E})=\max\limits_{j\in\mathcal{E}}\frac{W}{L}\log_{2}\left(1+\frac{{p_{j}^{l}} G_{jj}^{l}}{\sigma^{2}+\sum\limits_{i\in\mathcal{N}_{l}\backslash\mathcal{E}}{p_{i}^{l}} G_{ji}^{l}}\right). $$

((12))

The formulated coalition game of the *l*-th sub-channel is denoted by \(\mathcal {G}^{l}=(\mathcal {N}_{l},v)\), where \(\mathcal {N}_{l}\) is the set of players and \(v: \eta (\mathcal {N}_{l})\mapsto \mathbb {R}\) is the payoff function of the coalitions [21]. The payoff function should take both the interference and the system throughput into consideration. A suitable payoff function for a coalition \(\mathcal{E}\) (\(\forall \mathcal {E}\in \eta (\mathcal {N}_{l})\)) is as follows:

$$ v(\mathcal{E}) =\left\{ \begin{array}{ll} \tilde{R}(\mathcal{E}), & \text{if} \,\,\,R(\mathcal{E}) < \tilde{R}(\mathcal{E}), \\ 0, & \text{if} \,\,\,R(\mathcal{E}) \ge \tilde{R}(\mathcal{E}). \end{array} \right. $$

((13))

Note that for the empty set, we have *v*(*∅*)=0. Based on Equation 13, the establishment of a coalition \(\mathcal{E}\) (\(\vert \mathcal {E}\vert \ge 2\)) has two effects. The first effect is the reduction of transmission resource blocks for the small cells in \(\mathcal{E}\). Before the formation of \(\mathcal{E}\), the small cells in \(\mathcal{E}\) can have at most \(\vert \mathcal {E}\vert \) transmission resource blocks (i.e., each small cell owns a transmission resource block) and at least two transmission resource blocks (i.e., \(\mathcal{E}\) is established based on two existing coalitions). However, when \(\mathcal{E}\) is formed, the small cells in \(\mathcal{E}\) only have one transmission resource block. The second effect is the reduction of interference on the remaining transmission resource block. Since the small cells in \(\mathcal{E}\) mute in a cooperative manner, the interference from the small cells in \(\mathcal{E}\) is canceled.
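The payoff of Equations 11 to 13 can be evaluated directly for one sub-channel. A minimal sketch under our own indexing convention (`p[i]` is the power of cell *i*, `G[j][i]` the gain from transmitter *i* to the UE of cell *j*; the sub-channel index *l* is dropped):

```python
import math

def coalition_value(E, N_l, p, G, sigma2, W=1.0, L=1):
    """Payoff v(E) of Eq. (13) on one sub-channel. E and N_l are sets of
    player indices with E a subset of N_l."""
    z = W / L

    def rate(j, interferers):
        I = sum(p[i] * G[j][i] for i in interferers)
        return z * math.log2(1 + p[j] * G[j][j] / (sigma2 + I))

    # R(E), Eq. (11): every member of E transmits, all other cells interfere
    R = sum(rate(j, N_l - {j}) for j in E)
    # R~(E), Eq. (12): only the best member of E transmits, the rest of E mute
    R_tilde = max(rate(j, N_l - E) for j in E)
    # Eq. (13): the coalition is worth forming only if muting strictly helps
    return R_tilde if R < R_tilde else 0.0
```

Under strong cross gains the coalition value is positive (muting pays); under weak cross gains it is zero.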

Some properties of \(\mathcal {G}^{l}\) are summarized as follows.

###
**Definition 2**.

A coalition game \((\mathcal {N},v)\) is in characteristic form if and only if the value of a coalition \(\mathcal {E}\in \eta (\mathcal {N})\) solely depends on the players in \(\mathcal{E}\) and is independent of how the players in \(\mathcal {N}\backslash \mathcal {E}\) are structured.

###
**Property 1**.

The game \(\mathcal {G}^{l}\) is in characteristic form.

###
*Proof*.
This property follows directly from the definition of the payoff function (Equation 13).

From *Property 1*, we can see that the value of the formulated payoff function is only sensitive to the players in the given coalition. In small cell networks, small cells are coupled with each other via interference. In (Equation 13), the effect of the small cells beyond the given coalition is treated as a whole and is independent of their partition structure. By using Equation 13, we can obtain the value of a coalition once the coalition is given.

Given a partition \(\mathcal {P}=\{\mathcal {P}_{1},\mathcal {P}_{2},\dots,\mathcal {P}_{s}\}\) of \(\mathcal {N}_{l}\), we can find a unique vector \(\mathbf {v}=[v(\mathcal {P}_{1}),v(\mathcal {P}_{2}),\dots,v(\mathcal {P}_{s})]^{T}\in \mathbb {R}^{s\times 1}\) to represent the value of each coalition in \(\mathcal{P}\). Besides, since the coalition value of \(\mathcal {G}^{l}\) is the maximum throughput achievable by a single member small cell, game \(\mathcal {G}^{l}\) has transferable utility, i.e., the achievable throughput can be arbitrarily apportioned among the small cells of a coalition (for example, via a proper choice of coding strategy [18]).

###
**Definition 3**.

A transferable utility coalition game is superadditive if and only if \(\forall \mathcal {E}_{1}\subseteq \mathcal {N}_{l}\), \(\mathcal {E}_{2}\subseteq \mathcal {N}_{l}\) with \(\mathcal {E}_{1}\cap \mathcal {E}_{2}=\emptyset \):

$$ v(\mathcal{E}_{1}\cup\mathcal{E}_{2})\ge v(\mathcal{E}_{1})+v(\mathcal{E}_{2}). $$

((14))

For a transferable utility coalition game with superadditivity, establishing a bigger coalition is always beneficial.

###
**Property 2**.

Due to the tradeoff between interference and throughput implied in (Equation 13), the formulated game \(\mathcal {G}^{l}\) is not superadditive and the grand coalition is not always formed; thus, disjoint independent coalitions will form in the network.

The proof of *Property 2* can be found in the Appendix. The payoff function (Equation 13) shows that in a formed coalition, only one player transmits and the rest keep silent. Once a coalition is formed, the available number of resource blocks is reduced but the quality of the remaining resource block improves. Due to the random deployment of small cells and the fluctuation of wireless channel quality, small cells may not be inclined to form the grand coalition. In a small-scale area where strong interference exists, small cells form a coalition with higher probability, since the payoff of the coalition is likely higher once the interference is removed. The main obstacle to grand coalition construction lies in (Equation 13), because the reduction of the available resource blocks is the cost of coalition construction. In practice, the construction of the grand coalition requires harsh conditions: the grand coalition will form only when the interference among small cells is so strong that the total throughput of the sub-channel is reduced whenever more than one small cell shares it.

###
**Definition 4**.

A comparison relation *⊳* is defined for two collections \(\mathcal {D}=\{\mathcal {D}_{1},\mathcal {D}_{2},\dots,\mathcal {D}_{s}\}\) and \(\mathcal {F}=\{\mathcal {F}_{1},\mathcal {F}_{2},\dots,\mathcal {F}_{w}\}\) that satisfy \(\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m}=\mathop \cup \limits _{n=1}^{w}\mathcal {F}_{n}=\mathcal {H}\subseteq \mathcal {N}\). Thus, \(\mathcal {D}\triangleright \mathcal {F}\) means that the way \(\mathcal{D}\) partitions \(\mathcal{H}\) is preferred to the way \(\mathcal{F}\) partitions \(\mathcal{H}\).

###
**Definition 5**.

For the collections \(\mathcal{D}\) and \(\mathcal{F}\) defined in *Definition 4*:

$$ \mathcal{D}\triangleright\mathcal{F} \iff \sum\limits_{m=1}^{s}v(\mathcal{D}_{m})>\sum\limits_{n=1}^{w}v(\mathcal{F}_{n}). $$

((15))

Definitions 4 and 5 provide a preference among partitions. Definition 5 indicates that 'social welfare' (the total throughput) is taken as the baseline, so the defined preference is consistent with the target of system throughput improvement. We could use exhaustive search to obtain the maximum-throughput partition structure of \(\mathcal {G}^{l}\), but a rather large search space would have to be considered. For the player set \(\mathcal {N}_{l}\), the number of possible partitions, given by the Bell number, grows sharply with the number of players in \(\mathcal {N}_{l}\). For example, the Bell number of \(\mathcal {N}_{l}\) is 115,975 when \(\vert \mathcal {N}_{l}\vert =10\). However, by following some simple rules, we can obtain a stable practical partition structure of \(\mathcal {G}^{l}\).
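The quoted count can be reproduced with the Bell triangle (a standard recurrence, not part of the paper):

```python
def bell(n):
    """Bell number B(n): the number of partitions of an n-element player set,
    computed row by row with the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        nxt = [row[-1]]            # each row starts with the previous row's last entry
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
    return row[-1]
```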

For any two collections \(\mathcal {D}=\{\mathcal {D}_{1},\mathcal {D}_{2},\dots,\mathcal {D}_{s}\}\) and \(\mathcal {F}=\{\mathcal {F}_{1},\mathcal {F}_{2},\dots,\mathcal {F}_{w}\}\) in the grand coalition \(\mathcal{N}\) that satisfy \((\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m})\cup (\mathop \cup \limits _{n=1}^{w}\mathcal {F}_{n})=\mathcal {N}\) and \((\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m})\cap (\mathop \cup \limits _{n=1}^{w}\mathcal {F}_{n})=\emptyset \), we define two operation rules to construct the stable partition structure.

###
**Definition 6**.

**Merge rule:** If \(\{\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m}\}\cup \mathcal {F}\triangleright \mathcal {D}\cup \mathcal {F}\), merge \(\mathcal{D}\) into \(\{\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m}\}\):

$$ \begin{aligned} &\text{Merge}(\mathcal{D}\cup\mathcal{F})=\\ &\left\{ \begin{array}{l} \{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F},\quad \text{if } \{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F}\triangleright\mathcal{D}\cup\mathcal{F}, \\ \mathcal{D}\cup\mathcal{F},\quad \text{if } \mathcal{D}\cup\mathcal{F}\triangleright\{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F}. \end{array} \right. \end{aligned} $$

((16))

###
**Definition 7**.

**Split rule:** If \(\mathcal {D}\cup \mathcal {F}\triangleright \{\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m}\}\cup \mathcal {F}\), split \(\{\mathop \cup \limits _{m=1}^{s}\mathcal {D}_{m}\}\) into \(\mathcal{D}\):

$$ \begin{aligned} &\text{Split}(\{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F})=\\ &\left\{ \begin{array}{l} \{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F},\quad \text{if } \{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F}\triangleright\mathcal{D}\cup\mathcal{F}, \\ \mathcal{D}\cup\mathcal{F},\quad \text{if } \mathcal{D}\cup\mathcal{F}\triangleright\{\mathop\cup\limits_{m=1}^{s}\mathcal{D}_{m}\}\cup\mathcal{F}. \end{array} \right. \end{aligned} $$

((17))

Note that the above operation rules of Definitions 6 and 7 apply *⊳* 'locally', by focusing on the coalitions that take part in and result from the merge and split operations. Algorithm 2 summarizes the procedure that uses the merge-split rules to obtain the stable partition structure of \(\mathcal {G}^{l}\). Due to *Property 1*, the execution of the split rule in Algorithm 2 can be transformed into the merge rule if the coalition that remains to be split is treated as a smaller grand coalition. The split operation is equivalent to the merge operation in the smaller grand coalition, as long as we treat the effect of the players beyond the smaller grand coalition as a whole.
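A simplified sketch of the merge-split iteration follows. For tractability it merges coalitions pairwise and splits one coalition in two, whereas the paper's rules operate on whole collections, so this is an illustration rather than Algorithm 2 itself:

```python
from itertools import combinations

def merge_split(players, v, max_rounds=100):
    """Sketch of a merge-split iteration under the preference of Eq. (15),
    applied 'locally'. `v` maps a frozenset of players to its coalition value."""
    partition = [frozenset({p}) for p in players]   # start from singletons
    for _ in range(max_rounds):
        changed = False
        # Merge rule (Eq. 16): merge two coalitions if the union is worth more
        for A, B in combinations(partition, 2):
            if v(A | B) > v(A) + v(B):
                partition = [C for C in partition if C not in (A, B)] + [A | B]
                changed = True
                break
        if changed:
            continue
        # Split rule (Eq. 17): split one coalition in two if that is worth more
        for C in partition:
            for k in range(1, len(C)):
                for sub in combinations(sorted(C), k):
                    S = frozenset(sub)
                    if v(S) + v(C - S) > v(C):
                        partition = [D for D in partition if D != C] + [S, C - S]
                        changed = True
                        break
                if changed:
                    break
            if changed:
                break
        if not changed:
            return partition   # no merge or split helps any further
    return partition
```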

It is difficult to directly characterize the complexity of Algorithm 2, because its termination condition depends on the specific state of the small cell network. But we can estimate the complexity in some extreme cases. Since the split operation can be treated as a special kind of merge operation, we only analyze the complexity of the merge operation here. In the worst case, where no coalition of size larger than two is formed, the number of possible coalitions that should be considered is \(\zeta _{1}=2^{\vert \mathcal {N}_{l}\vert }-\vert \mathcal {N}_{l}\vert -1=\sum \limits _{k=2}^{\vert \mathcal {N}_{l}\vert }\begin {pmatrix}\vert \mathcal {N}_{l}\vert \\k\end {pmatrix}\). In the best case, where the grand coalition is formed, the number of possible coalitions that should be considered is \(\zeta _{2}=\vert \mathcal {N}_{l}\vert -1\). So for each merge or split operation, the number of coalitions that should be considered is between \(\zeta_{2}\) and \(\zeta_{1}\).
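These bounds are easy to evaluate (a small sketch; the function name is ours):

```python
from math import comb

def zeta_bounds(n):
    """Per-operation coalition counts for |N_l| = n: worst case zeta1
    (no coalition larger than two forms) and best case zeta2 (grand coalition)."""
    zeta1 = 2 ** n - n - 1      # equals sum_{k=2}^{n} C(n, k)
    zeta2 = n - 1
    return zeta1, zeta2
```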

###
**Definition 8**.

A partition \(\mathcal {P}=\{\mathcal {P}_{1},\mathcal {P}_{2},\dots,\mathcal {P}_{s}\}\) of the grand coalition is \(\mathbb {D}_{c}\) stable if no players in \(\mathcal{P}\) are interested in leaving \(\mathcal{P}\) through any operation (not necessarily merge or split) to form a partition different from \(\mathcal{P}\). A partition \(\mathcal {Q}=\{\mathcal {Q}_{1},\mathcal {Q}_{2},\dots,\mathcal {Q}_{w}\}\) of the grand coalition is \(\mathbb {D}_{\textit {hp}}\) stable if no coalition has the incentive to split or merge.

###
**Property 3**.

The partition structure of \(\mathcal {G}^{l}\) obtained by Algorithm 2 is \(\mathbb {D}_{\textit {hp}}\) stable. The existence of \(\mathbb {D}_{c}\) stable partition in game \(\mathcal {G}^{l}\) is not always guaranteed. If the \(\mathbb {D}_{c}\) stable partition of \(\mathcal {G}^{l}\) exists, the partition structure obtained by Algorithm 2 is \(\mathbb {D}_{c}\) stable.

The proof of *Property 3* can be found in the Appendix. Property 3 indicates that the final outcome of Algorithm 2 is stable: it cannot be improved by any merge or split operation. Generally speaking, the optimal partition could be found by exhaustive search, but the number of possible cases (the Bell number) is too large to manage. The outcome of Algorithm 2 may not be globally optimal, but it is at least stable.

Denote the outcome of Algorithm 2 by \(\mathcal {P}^{*}=\{\mathcal {P}_{1}^{*},\mathcal {P}_{2}^{*},\dots,\mathcal {P}_{x}^{*}\}\). From Equation 13, we can find the set of players \(\mathcal {U}=\{y_{1},y_{2},\dots,y_{x}\}\), where \(y_{t}\in \mathcal {P}_{t}^{*}\) is the player whose throughput on the sub-channel equals \(\tilde {R}(\mathcal {P}_{t}^{*})\) when the other players in \(\mathcal {P}_{t}^{*}\) mute on the sub-channel. So the resource blocks with strong interference are marked by modifying the available sub-channel mapping matrix:

$$ [\mathbf{A}]_{lj} =\left\{ \begin{array}{ll} 1, & \text{for } j \in \mathcal{U}, \\ 0, & \text{for } j\in \mathcal{N}_{l}\backslash\mathcal{U}. \end{array} \right. $$

((18))

The resource blocks with strong interference are excluded by setting the corresponding elements of the available sub-channel mapping matrix to zero. Figure 4 shows a schematic diagram of resource block exclusion: part of the total system resource blocks are excluded by playing the formulated games. A detailed example of the resource exclusion procedure can be found in the simulation section.

### 4.2 Interference calculation

After the resource block exclusion, the suffered interference of the *j*-th UE on the *l*-th sub-channel is as follows:

$$ \tilde{I}_{j}^{l}=\sum\limits_{i=1,i\neq j}^{N}{p_{i}^{l}}G_{ji}^{l}[\mathbf{A}]_{li}. $$

((19))

The interference power levels on the excluded resource blocks are not considered in \(\tilde {I}_{j}^{l}\). The generated interference of small cell *j* is defined as:

$$ \bar{I}_{j}=\sum\limits_{l=1}^{L}\sum\limits_{i=1,i\neq j}^{M}{p_{j}^{l}} G_{ij}^{l} [\mathbf{A}]_{li}. $$

((20))

The generated interference depends on three factors: the previously allocated power vector, the interference channel gains, and the channel availability. If a sub-channel is unavailable for some small cells, the interference to those small cells on this sub-channel contributes nothing to the generated interference. Based on the system channel gain matrix **G**, the previous system power allocation results **P**, and the current system mapping matrix **A**, the macro cell calculates the generated interference vector \(\bar {\mathbf {I}}=[\bar {I}_{1},\bar {I}_{2},\dots,\bar {I}_{N}]^{T}\in \mathbb {R}^{N\times 1}\) and the suffered interference vectors \(\tilde {\mathbf {I}}_{j}=[\tilde {I}_{j}^{1},\tilde {I}_{j}^{2},\dots,\tilde {I}_{j}^{L}]^{T}\in \mathbb {R}^{L\times 1}\) (\(\forall j\in \mathcal {M}\)) after the sub-channel exclusion operation. The two interference-correlated vectors are delivered to the small cells to be used in the power optimization procedure.
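Equations 19 and 20 can be sketched as follows, assuming one served UE per small cell so that both sums run over the N cells; the array shapes and names are ours:

```python
import numpy as np

def interference_vectors(P, G, A):
    """Suffered interference (Eq. 19) and generated interference (Eq. 20).
    P and A are L x N; G[l, j, i] is the gain from the transmitter of cell i
    to the UE of cell j on sub-channel l."""
    L, N = P.shape
    I_suffered = np.zeros((N, L))   # I~_j^l
    I_generated = np.zeros(N)       # I-_j
    for l in range(L):
        for j in range(N):
            for i in range(N):
                if i == j:
                    continue
                # Eq. (19): interference UE j suffers from cell i on sub-channel l
                I_suffered[j, l] += P[l, i] * G[l, j, i] * A[l, i]
                # Eq. (20): interference cell j generates towards UE i, counted
                # only if UE i's cell actually uses the sub-channel (A[l, i] = 1)
                I_generated[j] += P[l, j] * G[l, i, j] * A[l, i]
    return I_suffered, I_generated
```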

### 4.3 Power optimization

After obtaining the interference constraint \(\bar {\mathbf {I}}\) and the available sub-channel mapping matrix **A**, each small cell optimizes the transmission power on each available sub-channel based on the last transmission power allocation result **P**. Instead of solving **P1**, the *j*-th (\(\forall j\in \mathcal {N}\)) small cell solves the following problem:

$$ \textbf{P2:}\quad \max\sum\limits_{l\in\mathcal{L}_{j}}\frac{W}{L}\log_{2}\left(1+\frac{{p_{j}^{l}} G_{jj}^{l}}{\sigma^{2}+\tilde{I}_{j}^{l}}\right) $$

((21))

$$ s.t.\quad \sum\limits_{i\in\mathcal{M}\backslash\{j\}}\sum\limits_{l\in\mathcal{L}_{j}} {p_{j}^{l}} G_{ij}^{l}[\mathbf{A}]_{li}\le \bar{I}_{j}, $$

((22))

$$ \sum\limits_{l\in\mathcal{L}_{j}}{p_{j}^{l}}\le P0, $$

((23))

$$ {p_{j}^{l}}\ge0, \forall l\in\mathcal{L}_{j}, $$

((24))

where \(\mathcal {L}_{j}\) is the set of available sub-channels for the *j*-th small cell (i.e., \(\mathcal {L}_{j}=\{l\in \mathcal {L}|[\mathbf {A}]_{\textit {lj}}=1\}\)). The constraint (Equation 22) is used to redistribute the total generated interference over all the available sub-channels. The formulated **P2** is a concave problem; the proof can be found in the Appendix.

Since **P2** is concave over the domain \(\mathbf {p}_{j}\in \mathbb {R}^{\vert \mathcal {L}_{j}\vert \times 1}\), we can use the interior point method [22] to solve it. However, it is more convenient to solve the dual problem of **P2** if we take the problem structure into consideration. First, due to concavity, the optimal solution to **P2** is equivalent to the optimal solution to the dual problem. Second, the number of variables in **P2** is \(\vert \mathcal {L}_{j}\vert \), while in the dual problem the number of variables reduces to two: since the constraints (Equation 24) are treated as slack conditions, only (Equation 22) and (Equation 23) are used when solving the dual problem. Third, in some special cases, a closed-form solution to the dual problem can be obtained, which greatly accelerates the computation. The details of solving the dual problem are presented as follows. The Lagrangian of **P2** is as follows:

$$ \begin{aligned} L\left(\lambda,\mu,\mathbf{\theta},\mathbf{p}_{j}\right)=& \sum\limits_{l\in\mathcal{L}_{j}}\frac{W}{L}\log_{2}\left(1+\frac{{p_{j}^{l}} G_{jj}^{l}}{\sigma^{2}+\tilde{I}_{j}^{l}}\right)+\lambda\left(\bar{I}_{j}-\sum\limits_{i\in\mathcal{M}\backslash\{j\}}\sum\limits_{l\in\mathcal{L}_{j}} {p_{j}^{l}} G_{ij}^{l}[\mathbf{A}]_{li}\right)\\ &+\mu\left(P0- \sum\limits_{l\in\mathcal{L}_{j}}{p_{j}^{l}}\right)+\sum\limits_{l\in\mathcal{L}_{j}}\theta_{l} {p_{j}^{l}}\\ \equiv& \sum\limits_{l\in\mathcal{L}_{j}}\left[z\log_{2}\left({b_{j}^{l}}+ {p_{j}^{l}}\right)+{a_{j}^{l}}\right]+\lambda\left(\bar{I}_{j}-\sum\limits_{l\in\mathcal{L}_{j}}{c_{j}^{l}}{p_{j}^{l}}\right)+\mu\left(P0-\sum\limits_{l\in\mathcal{L}_{j}}{p_{j}^{l}}\right)+\sum\limits_{l\in\mathcal{L}_{j}}\theta_{l} {p_{j}^{l}}, \end{aligned} $$

((25))

where \(z=\frac {W}{L}\), \({a_{j}^{l}}=\frac {W}{L}\log _{2}\left (\frac {G_{\textit {jj}}^{l}}{\sigma ^{2}+\tilde {I}_{j}^{l}}\right)\), \({b_{j}^{l}}=\frac {\sigma ^{2}+\tilde {I}_{j}^{l}}{G_{\textit {jj}}^{l}}\), and \({c_{j}^{l}}=\sum \limits _{i\in \mathcal {M}\backslash \{j\}}G_{\textit {ij}}^{l}[\mathbf {A}]_{\textit {li}}\). The first-order partial derivative with respect to \({p_{j}^{l}}\) is as follows:

$$ \frac{\partial L(\lambda,\mu,\mathbf{\theta},\mathbf{p}_{j})}{\partial {p_{j}^{l}}}=\frac{z}{\ln 2}\frac{1}{{b_{j}^{l}}+ {p_{j}^{l}}}-\lambda {c_{j}^{l}}-\mu+\theta_{l}. $$

((26))

Based on the KKT conditions of **P2**, we can see that **θ** is a slack variable vector that can be eliminated [22]. By solving \(\frac {\partial L(\lambda,\mu,\mathbf {p}_{j})}{\partial {p_{j}^{l}}}=0\) and substituting the result into Equation 25, we obtain the dual problem of **P2**:

$$ \begin{aligned} &\mathbf{P3:}\quad \min g(\lambda,\mu)\\ &=\min\max\limits_{\mathbf{p}_{j}}L(\lambda,\mu,\mathbf{p}_{j}) \\ &=\min\limits_{\lambda,\mu}\sum\limits_{l\in\mathcal{L}_{j}}\left[-z\log_{2}(\lambda {c_{j}^{l}}+\mu)+(\lambda {c_{j}^{l}}+\mu){b_{j}^{l}}\right]+\\ &\lambda \bar{I}_{j}+\mu P0+\sum\limits_{l\in\mathcal{L}_{j}}\left[z\log_{2}(\frac{z}{\ln 2})+{a_{j}^{l}}-\frac{z}{\ln 2}\right] \end{aligned} $$

((27))

$$ s.t. \quad \lambda\ge 0,\mu\ge 0. $$

((28))

It is easy to verify that **P3** is a convex optimization problem. By using the KKT conditions, the constraints (Equation 28) can be neglected, i.e., we can first solve **P3** without the constraints (Equation 28) and then use (Equation 28) to examine the correctness of the solution. So **P3** can be efficiently solved by using the Newton method, which is suitable for convex optimization problems without constraints [22]. In some cases, we can obtain the closed-form optimal solution (*λ*^{∗},*μ*^{∗}) to **P3**.

###
**Case 1**.

*λ*=0. If *λ*=0, then *g*(*λ*,*μ*) degenerates into *g*(0,*μ*). By solving \(\frac {\partial g(0,\mu)}{\partial \mu }=0\), we can obtain the optimal *μ*:

$$ \mu^{*}=\frac{z}{\ln 2}\frac{\vert\mathcal{L}_{j}\vert}{ P0+\sum\limits_{l\in\mathcal{L}_{j}}{b_{j}^{l}}}. $$

((29))

###
**Case 2**.

*μ*=0. If *μ*=0, then *g*(*λ*,*μ*) degenerates into *g*(*λ*,0). By solving \(\frac {\partial g(\lambda,0)}{\partial \lambda }=0\), we can obtain the optimal *λ*:

$$ \lambda^{*}=\frac{z}{\ln 2}\frac{\vert\mathcal{L}_{j}\vert}{\bar{I}_{j} +\sum\limits_{l\in\mathcal{L}_{j}}{c_{j}^{l}}{b_{j}^{l}}}. $$

((30))
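Both closed-form cases can be written down directly. A sketch with dict-based per-sub-channel coefficients (the function names are ours):

```python
import math

def mu_star(z, b, P0):
    """Case 1 (Eq. 29): lambda = 0, only the total power budget P0 binds.
    b maps each available sub-channel l to b_j^l."""
    return (z / math.log(2)) * len(b) / (P0 + sum(b.values()))

def lambda_star(z, b, c, I_bar):
    """Case 2 (Eq. 30): mu = 0, only the interference budget I-_j binds.
    c maps each available sub-channel l to c_j^l."""
    return (z / math.log(2)) * len(b) / (I_bar + sum(c[l] * b[l] for l in b))
```

Inverting Equation 26 with *λ*=0 shows why Case 1 works: the resulting powers \(p_{j}^{l}=\frac{z}{\mu^{*}\ln 2}-b_{j}^{l}\) take the classic water-filling form and sum exactly to *P*0.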

When the optimal (*λ*^{∗},*μ*^{∗}) is obtained, we can use Equation 26 and solve \(\frac {\partial L(\lambda ^{*},\mu ^{*},\mathbf {p}_{j})}{\partial {p_{j}^{l}}}=0\) to obtain the optimal \( {p}_{j}^{l*}\). Due to the elimination of **θ**, we must verify the correctness of the power results. If \( p_{j}^{l*}\ge 0\) for all \(l\in \mathcal {L}_{j}\), we can conclude that \(\mathbf {p}_{j}^{*}\) is the solution to **P2**. If some elements of \(\mathbf {p}_{j}^{*}\) are negative, we must remove the sub-channels with the minimum power and solve **P3** again, until a solution with \({p}_{j}^{l*}\ge 0\) for all \(l\in \mathcal {L}_{j}\) is found.
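For the case where only the power budget binds (*λ*=0), this verify-and-remove loop reduces to iterative water-filling. A minimal sketch under that assumption (dict-based \(b_{j}^{l}\), names are ours):

```python
import math

def waterfill_power(P0, b, z):
    """Case 1 recovery (lambda = 0): invert Eq. (26) = 0 with mu from Eq. (29),
    drop the sub-channel with the most negative power, and re-solve, as the
    text prescribes. b maps available sub-channels to b_j^l."""
    active = dict(b)
    while active:
        # Eq. (29) restricted to the currently active sub-channels
        mu = (z / math.log(2)) * len(active) / (P0 + sum(active.values()))
        # Eq. (26) = 0  =>  p_l = z / (ln 2 * mu) - b_l   (water-filling form)
        p = {l: z / (math.log(2) * mu) - bl for l, bl in active.items()}
        if min(p.values()) >= 0:
            return p
        del active[min(p, key=p.get)]
    return {}
```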

The power optimization procedure is summarized in Algorithm 3. It is difficult to analyze the complexity of Algorithm 3 directly, but the number of iterations of the Newton method in Algorithm 3 can be estimated in some extreme cases. In the best case, where the optimal value is achieved by (Equation 29) or (Equation 30), the Newton method needs a single iteration. In the worst case, where the optimal value is obtained without these closed-form equations, the maximum number of iterations is bounded by \(\frac {g(\lambda ^{0},\mu ^{0})-g(\lambda ^{*},\mu ^{*})}{\tau }+6\) [22], where *λ*^{0} and *μ*^{0} represent the initial values of *λ* and *μ*, respectively, and *τ* is the guaranteed minimum reduction of the function *g*(*λ*,*μ*) per iteration.