Distributed stochastic power control in ad hoc networks: a nonconvex optimization case

Signal-to-interference-plus-noise-based power allocation in wireless ad hoc networks is inherently a nonconvex optimization problem because of the global coupling induced by the co-channel interference. To tackle this challenge, we first show that the globally optimal point lies on the boundary of the feasible region. This property is utilized to transform the utility maximization problem into an equivalent max–min problem with more structure. By using extended duality theory, penalty multipliers are introduced for penalizing the constraint violations, and the minimum weighted utility maximization problem is then decomposed into subproblems for individual users to devise a distributed stochastic power control algorithm, where each user stochastically adjusts its target utility to improve the total utility by simulated annealing (SA). The proposed distributed power control algorithm can guarantee global optimality at the cost of slower convergence due to SA involved in the global optimization. The geometric cooling scheme with suitable choice of penalty parameters is then used to improve the convergence rate. Next, by integrating the stochastic power control approach with the back-pressure algorithm, we develop a joint scheduling and power allocation policy to stabilize the queueing systems under random packet traffic. Finally, we generalize the above distributed power control algorithms to multicast communications, and show their global optimality for multicast traffic.


INTRODUCTION
T HE broadcast nature of wireless transmissions makes wireless networks susceptible to interference, which deteriorates quality of service (QoS) provisioning.Power control is considered as a promising technique to mitigate interference.One primary objective of power control is to maximize the system utility that can achieve a variety of fairness objectives among users [2], [3], [4], [5].However, maximizing the system utility, under the physical interference model, often involves nonconvex optimization and it is known to be NP-hard, due to the complicated coupling among users through interference [6].
Due to the nonconvex nature of the power control problem, it is challenging to find the globally optimal power allocation in a distributed manner.Notably, [7], [8] devised distributed power control algorithms to find power allocations that satisfy the local optimality conditions, but global optimality could not be guaranteed in general, except for some special convexifiable cases (e.g., with strictly increasing log-concave utility functions).Another thread of work applied game-theoretic approaches to power control by treating it as a noncooperative game among transmitters [9], [10].However, distributed solutions that converge to a Nash equilibrium may be suboptimal in terms of maximizing the
• Yalin E. Sagduyu and Jason H. Li are with Intelligent Automation, Inc., Rockville, MD 20855, USA (e-mail: ysagduyu@i-a-i.com;jli@i-a-i.com).• Part of this paper will be presented at the IEEE International Conference on Communications, ICC 2011 [1].
total system utility.Different from these approaches, [11] proposed a globally optimal power control scheme, named MAPEL, by exploiting the monotonic nature of the optimization problem.However, the complexity and the centralized nature of MAPEL hinder its applicability in practical scenarios, and thus it can be treated rather as a benchmark for performance evaluation in distributed networks.
To find the globally optimal power allocation in a distributed setting, an interesting work [12] has proposed the SEER algorithm based on Gibbs sampling [13], which can approach the globally optimal solution in an asymptotic sense when the control parameter in Gibbs sampling tends to infinity.Notably, for each iteration in the SEER algorithm, each user utilizes Gibbs sampling to compute its transition probability distribution for updating its transmission power, where the requirement for message passing and computing the transition probability distribution in each iteration can be demanding when applied to ad-hoc communications.
A challenging task in distributed power control in adhoc networks is to reduce the amount of message passing while preserving the global optimality.In this paper, we tackle this challenge by combining recent advances in extended duality theory (EDT) [14] with simulated annealing (SA) [15].Compared with the classical duality theory with nonzero duality gap for nonconvex optimization problems, EDT can guarantee zero duality gap between the primal and dual problems by utilizing nonlinear Lagrangian functions.This property allows for solving the nonconvex problem by its extended dual while preserving the global optimality with distributed implementation.Furthermore, as will be shown in Section II, for the subproblem of each individual user, the extended dual can then be solved through stochastic search using SA.In particular, we first transform the original utility maximization problem into an equivalent maxmin problem.This step is based on the key observation that in the case with continuous and strictly increasing utility functions, the globally optimal solution is always on the boundary of the feasible (utility) region.Then, appealing to EDT and SA, we develop a distributed stochastic power control (DSPC) algorithm that stochastically searches for the optimal power allocation in the neighborhood of the feasible region's boundary, instead of bouncing around in the entire feasible region.Specifically, we first show that DSPC can achieve the global optimality in the underlying nonconvex problem, although the convergence rate can be slow (but this is clearly due to the slow convergence nature of SA).Then, to improve the convergence rate of DSPC, we propose an enhanced DSPC (EDSPC) algorithm that employs the geometric cooling schedule and performs a careful selection of penalty parameters.As a benchmark for performance evaluation, we also develop a centralized algorithm to search for the globally optimal solution over simplices that cover the utility region.The performance gain is further verified by comparing our distributed algorithms with MAPEL [11], SEER [12], and ADP [7] algorithms.Worth noting is that the proposed DSPC and EDSPC algorithms do not require any knowledge of channel gains, which is typically needed in existing algorithms, and instead they need only the Signal-to-Interference-plus-Noise (SINR) feedback for adaptation.
Next, we integrate the above distributed power control with the back-pressure algorithm [22] and devise a joint scheduling and power allocation policy for improving the stability in the presence of dynamic packet arrivals and departures.This policy fits into the dynamic back-pressure and resource allocation framework and enables distributed utility maximization without extra technical conditions [23] [25].Then, we generalize the study to consider multicast communications, where a single transmission may simultaneously deliver packets to multiple recipients [16].Specifically, we extend DSPC and EDSPC algorithms to multicast communications with distributed implementation, and show that these algorithms can also achieve the global optimality in terms of jointly maximizing the minimum rates on bottleneck links in different multicast groups.
The rest of the paper is organized as follows.In Section 2, we first introduce the system model, establish the equivalence between the utility maximization problem and its max-min form, and then develop both centralized and distributed algorithms for the max-min problem.Next, in Section 3, building on these power control algorithms, we develop a joint scheduling and power allocation policy to stabilize queueing systems.The generalization to multicast communications is presented in Section 4. We conclude the paper in Section 5.

System Model
We consider an ad-hoc wireless network with a set L = {1, ..., L} of links, where the channel is interferencelimited, and all L links treat interference as noise, as illustrated in Fig. 1.Such a model of communication is also applicable to cellular networks [2].Each link consists of a dedicated transmitter-receiver pair. 1 We denote by h lk the fixed channel gain between user l's transmitter and user k's receiver, and by p l the transmission power of link l with P max l being its maximum power constraint.For static channels, the received SINR for the lth user with a matched filter receiver is given by where p = (p 1 , ..., p L ) is a vector of the users' transmission powers and n l is the noise power.Accordingly, the lth user receives the utility U l (γ l ), where U l (•) is continuous and strictly increasing.We assume that each user l's utility is zero when γ l = 0, i.e., U l (0) = 0.For ease of reference, some notation is listed in Table 1. 2

Network Utility Maximization
We seek to find the optimal power allocation p * that maximizes the overall system utility subject to the individual power constraints, given by the following optimization problem:  In general, ( 2) is a nonconvex problem 3 .In particular, if the utility function is the Shannon rate achievable over Gaussian flat fading channels, namely U l (γ l (p)) = w l log(1 + γ l (p)), where w l > 0 is a weight associated with user l, (2) boils down to the weighted sum rate maximization problem, which is known to be nonconvex and NP-hard [6].Note that the weights can serve as the fairness measures [17] for different scenarios.In particular, in queueing systems, for arrival rates within the stability region, packet queues can be stabilized by solving this weighted sum rate maximization problem, when the instantaneous queue lengths are chosen as the weights.In Section 3, we will discuss how to stabilize the packet queues by integrating our distributed power control algorithms with the back-pressure algorithm.
Let F denote the feasible utility region, where for each point U = (U 1 , ..., U L ) in F , there exists a power vector p such that U l = U l (γ l (p)) for all l ∈ L. The feasible utility region F is nonconvex, and in general, finding the globally optimal solution to (2) in F is challenging.In the following example, we illustrate the geometry of F for the utility U l (γ l (p)) = w l log(1+γ l (p)) and evaluate the solutions to (2) given by some existing power control approaches discussed in Section 1.
Example: For the case with two links, Fig. 2 illustrates the nonconvex feasible utility region F for different system parameters.We compare the performance of the existing approaches [2], [7], [11], [12] in Table 2.
Remarks: The solutions to (2) given by [2], [7], [11] are either distributed but suboptimal or optimal but centralized.In particular, [2] solves (2) by using geometric 3.For some special utility functions U l (.), (2) can be transformed into a convex problem [4].In this paper, we focus on the nonconvex case that cannot be transformed to a convex problem by change of variables.= 2.In both cases, the noise power is 0.1 for each link, and the weights are w 1 = 0.57, w 2 = 0.43.programming (GP) under the high-SINR assumption, which yields a suboptimal solution to (2) when the assumption does not hold (e.g., this is the case in this example above).The ADP algorithm [7] can guarantee only local optimality 4 in a distributed manner.The MAPEL algorithm [11] can achieve the globally optimal solutions but it is centralized with high computational complexity.Compared with these algorithms, the SEER algorithm [12] can guarantee global optimality in a distributed manner but message passing needed in each iteration can be demanding, i.e., each link needs the knowledge of the channel gains, the receiver SINR and the signal power of all the other links.It is worth noting that the performance of SEER hinges on the control parameter that can be challenging to choose on the fly.

From Network Utility Maximization to Minimum Weighted Utility Maximization
In order to devise low-complexity distributed algorithms that can guarantee global optimality, we first study the basic properties for the solutions to (2), and then convert (2) into a more structured max-min problem.
Lemma 2.1: The optimal solution to (2) is on the boundary of the feasible utility region F .
4. The local optimal solution found by ADP happens to be globally optimal only in one of the cases that are illustrated in Table 2.
Proof: Let U * denote a globally optimal solution to (2) over F , and γ * denote the corresponding SINR that supports U * .Since U l (•) is continuous and strictly increasing, proving that U * is on the boundary of F is equivalent to showing that γ * is on the boundary of the feasible SINR region.Suppose that γ * is not on the boundary of the feasible SINR region, which indicates that there exists some point γ such that γl ≥ γ * l for all l ∈ L and γl > γ * l for some l.Since U l (•) for any l ∈ L is strictly increasing in γ l , we have U l (γ l ) ≥ U l (γ * l ) for all l ∈ L and U l (γ l ) > U l (γ * l ) for some l, which contradicts the fact γ * is a globally optimal solution.Hence, Lemma 2.1 follows.
Based on Lemma 2.1, if we can characterize the boundary of F , then it is possible to solve (2) efficiently.Thus motivated, we first establish, by introducing a "contribution weight" for each user, the equivalence between (2) and the minimum weighted utility maximization problem.
Lemma 2.2: Problem ( 2) is equivalent to the following minimum weighted utility maximization: Proof: Let t = l∈L U l (γ l (p)) denote the total utility.Since U l (.) is nonnegative, we define x l ∈ [0, 1] as a ratio for the contribution of user l's utility to t.Therefore, U l (γ l (p)) = tx l and l∈L x l = 1.Then (2) can be rewritten as Then, in the context of maximizing t, it suffices to relax . Therefore, (4) can be treated as the hypograph form of (3), i.e., ( 4) and (3) are equivalent [18], thereby concluding the proof.
For given x, (3) is quasi-convex 5 .By introducing an auxiliary variable t, we obtain the following equivalent formulation: which can be solved in polynomial time through binary search on t [18].By transforming (2) to this more 5.By definition, a function f : R n → R is quasi-convex, if its domain domf and all its sublevel sets Sc Fig. 3: An illustration of the max-min problem for the case with two links.structured max-min problem (3), we are able to find each boundary point efficiently.Then, the problem is reduced to finding a globally optimal x * , given which we can obtain a globally optimal solution, i.e., the tangent point of the hyperplane and F , as illustrated in Fig. 3. Intuitively speaking, x represents a search direction.Once we find the best search direction x * , p * can be obtained efficiently by searching along the direction of x * .

Algorithms for Global Optimization
In this section, we study algorithms achieving global optimality for (3).First, we propose a centralized algorithm for (3), which will serve as a benchmark for performance comparison.Then, by using EDT and SA, we propose a distributed algorithm, DSPC, for the problem (3).Building on this, we propose an enhanced algorithm EDSPC to improve the convergence rate of DSPC.

1) A Centralized Algorithm
Based on Lemma 2.1 and Lemma 2.2, we develop a centralized algorithm (Algorithm 1) to solve the max-min optimization problem (3) under consideration.Roughly speaking, by dividing the simplex S = {x| l∈L x l = 1, 0 ≤ x l ≤ 1, ∀l ∈ L} into many small simplices, the algorithm can find the optimal point on the boundary of F .
Proposition 2.1: Algorithm 1 converges monotonically to a globally optimal solution to (3) as the approximation factor ǫ approaches zero.
Proof: For given ǫ, Algorithm 1 divides the simplex S = {x| l∈L x l = 1, 0 ≤ x l ≤ 1, ∀l ∈ L} into ⌈1/ǫ⌉ simplices 6 .Then Algorithm 1 computes the power allocation p * by solving (5) at x given by the center point of the simplex.Since the optimal search direction x * is in S, when the approximation factor ǫ approaches 6. ⌈1/ǫ⌉ denotes the smallest integer greater than 1/ǫ.zero, Algorithm 1 exhaustively searches every point in the simplex S. Therefore, Algorithm 1 can converge monotonically to a globally optimal solution to (3).
Remarks: In Algorithm 1, by controlling ǫ, one can obtain a solution arbitrarily close to a globally optimal one.Accordingly, Algorithm 1 can guarantee global optimality.However, the complexity of this algorithm can be high and this is possible only with centralized implementation.Algorithm 1 will be used only as a benchmark for performance evaluation of distributed algorithms.

Algorithm 1
Initialization: Choose the approximation factor ǫ > 0, and construct the initial simplex S with the vertex set V = {v 1 , ..., v L }, where v l = e l and e l is the lth unit coordinate vector.Let v c = 1 L l∈L v l be the center of S. Compute p * by solving (5) Then, each simplex S l is defined as having vertex set V \v l v.
2) For each new simplex, compute p * by solving (5) at x given by the center point of the simplex.3) Find the current best solution to (3).Until The number of simplices is greater than 1/ǫ.

2) DSPC Algorithm
Next, we devise a distributed stochastic power control (DSPC) algorithm based on EDT [14] and SA [15].To this end, we first introduce auxiliary variables and use EDT to transform (3) with the auxiliary variables into an unconstrained problem.Then, we solve the unconstrained problem by using the SA mechanism.Specifically, define t l = U l (γ l (p)) x l and rewrite (3) as Next, we use EDT to write Lagrangian for (6) as where (y) + = max(0, y), and α ∈ R and β ∈ R L are the penalty multipliers for penalizing the constraint violations.Based on EDT [14], there exist finite α * ≥ 0 and β * l ≥ 0 for all l ∈ L such that, for any α > α * and β l > β * l , ∀ l ∈ L, the solution to (7) is the same as (6).Note that (7) does not include the power constraints, due to the fact that there is no coupling among the user powers.Therefore, the minimization of ( 7) with respect to the primal variables (p, x, and t) can be carried out individually by each user in a distributed fashion.
The next key step is to perform a stochastic local search by each user based on SA.Let t l , x l and p l denote the primal values of the lth user, and t ′ l and x ′ l denote the new values randomly chosen by the lth user.Accordingly, t ′ l x ′ l can be treated as a new target utility for the lth user.To achieve this target utility, the lth user updates p ′ l by where γ l is the current SINR measured at the lth user's receiver.Note that ( 8) does not need any information of channel gains except the SINR feedback, i.e., γ l .Since (8) corresponds to the distributed power control algorithm of standard form as described in [19]  7 , it converges geometrically fast to the target utility.Thus, we assume that each user l updates p l in a faster timescale than t l and x l such that p l always converges before the next update of t l and x l .Let ∆ denote the difference between L(p l , x l , t l |p −l , x −l , t −l , α, β) and , where y −l is the vector y without the lth user's variable.If ∆ ≥ 0, i.e., t ′ l , x ′ l and p ′ l reduce Lagrangian (7), then they are accepted with probability 1; otherwise, they are accepted with probability exp ∆ T , where T is a control parameter (it is also called temperature).Note that, as T decreases, the acceptance of uphill move becomes less and less probable, and therefore a fine-grained search takes place.It has been shown that, as T tends to 0 according to a logarithmic cooling schedule, SA converges to a globally optimal point [13], [20].To compute ∆ locally by each user l, we assume that user l broadcasts the terms t l , x l and β l (t l x l − U l (γ l (p))) + , whenever any of these terms changes.
Besides updating the primal variables, each user l also needs to update α and β l to satisfy α > α * and β l > β * l .Here, we apply the method given by [14] to update α and β l .In particular, if any constraint is violated, α and β l are updated as follows: where σ and ̺ l are used to control the rate of updating α and β l .Thus, after initialization, α and β l increase in proportion to the violation of the corresponding constraint, which may lead to excessively large penalty values.Since it is beneficial to periodically scale down the penalty values to ease the unconstrained optimization, α and β l are scaled down by multiplying with a random value (it is chosen empirically between 0.7 to 0.95) 8 if the penalty decrease condition is satisfied, i.e., the maximum violation 7. A power control algorithm is of standard form, if the interference function (the effective interference each link must overcome) is positive, monotonic and scalable in power allocation [19].
8. See [14] for the detailed description on the choice of these parameters.
of constraints is not decreased after running Step 1 in Algorithm 2 several times consecutively, e.g., five times 8 .A detailed description of DSPC algorithm is given in Algorithm 2.
Proposition 2.2: The distributed stochastic power control algorithm (Algorithm 2) converges monotonically to a globally optimal solution to (3), as temperature T in SA decreases to zero.
Proof: For a given pair of α and β, Algorithm 2 converges to a globally optimal solution to (7) by using the logarithmic cooling schedule [13], [20].If the solution satisfies the constraints of (6), it is also a globally optimal solution to (6) based on EDT, i.e., current α and β satisfy α > α * and β l > β * l for all l ∈ L [14].By iteratively updating α and β, Algorithm 2 will converge to a globally optimal solution to (3), when α and β satisfy α > α * and β l > β * l for all l ∈ L. Remarks: The DSPC algorithm can guarantee global optimality in a distributed manner without the need of channel information.In particular, it can adapt to channel variations by utilizing the SINR feedback.However, the convergence rate of DSPC is slow due to the use of logarithmic cooling schedule.
Repeat for each user l 1) Randomly pick t ′ l and x ′ l in the feasible region, and update p ′ l according to (8). 2) Keep sensing the change of β l (t l x l − U l (γ l (p))) +  broadcast by other users.3) Compute ∆, and accept t ′ l , x ′ l , and p ′ l with probability 1, if ∆ ≥ 0, or with probability exp( ∆ T ), otherwise.4) Broadcast t ′ l and x ′ l , if t ′ l and x ′ l are updated.5) For each time epoch τ i , update T = T 0 / log(i + 1).Until T < ǫ.

Step 2: update penalty variables
For each user l, 1) Update α and β l according to (9), and scale down α and β l , if the penalty decrease condition is satisfied.

2) Goto
Step 1 until no constraint is violated.

3) Enhanced DSPC Algorithm
It can be seen from Algorithm 2 that it is critical to find the optimal penalty variables α and β for computing (7).Moreover, a logarithmic cooling schedule is used to ensure convergence to a global optimum.To improve the convergence rate, we propose next an enhanced algorithm for DSPC (EDSPC) by empirically choosing the ini-tial penalty values α 0 and β 0 and employing a geometric cooling schedule [15], which reduces the temperature T in SA by T = ξT , 0 < ξ < 1, at each time epoch.Compared with the logarithmic cooling schedule, T converges to 0 much faster under the geometric cooling schedule, which in turn improves the convergence rate of DSPC.The resulting solution is given in Algorithm 3.
We note that although EDSPC converges much faster than DSPC, it may yield only near-optimal solutions.Based on EDT, we choose α 0 > α * and β 0l > β * l , ∀ l ∈ L, to satisfy the optimality conditions for penalty variables.Obviously, by choosing large α 0 and β 0l , these conditions can be always satisfied.Nevertheless, very large penalties introduce heavy costs for constraint violations such that EDSPC may end up with a feasible but suboptimal solution.Therefore, the selection of initial penalty values plays a critical role in the performance of EDSPC and deserves more attention in future work.

Algorithm 3 Enhanced Distributed Stochastic Power Control (EDSPC)
Initialization: Choose ǫ > 0. Let α = α 0 , β l = β 0l , ∀l ∈ L, and randomly choose p, x and t.Set T = T 0 , and select a sequence of time epochs {τ 1 , τ 2 , ...} in continuous time.Repeat for each user l 1) Randomly pick t ′ l and x ′ l in the feasible region, and update p ′ l according to (8). 2) Keep sensing the change of β l (t l x l − U l (γ l (p))) +  broadcast by other users.3) Compute ∆, and accept t ′ l , x ′ l , and p ′ l with probability 1, if ∆ ≥ 0, or with probability exp( ∆ T ), otherwise.4) Broadcast t ′ l and x ′ l , if t ′ l and x ′ l are updated.5) For each time epoch τ i , update T = ξT .Until T < ǫ.

Numerical Examples
In this section, we evaluate the utility and convergence performance of Algorithms 2 and 3 (DSPC 9 and EDSPC).We consider a wireless network with six links randomly distributed on a 10m-by-10m square area.The channel gains h lk are equal to d −4  lk , where d lk represents the distance between the transmitter of user l and the receiver of user k.We assume U l (γ l (p)) = log(1+γ l (p)), P max l = 1 and n l = 10 −4 for all l ∈ L, and consider one randomly generated realization of channel gains given by 0.3318 0.0049 0.0141 0.0021 0.0016 0.0007 0.0031 0.9554 0.0063 0.0140 0.0012 0.0025 0.0155 0.0042 0.6166 0.0046 0.0108 0.0018 0.0017 0.2188 0.0340 0.6754 0.0062 0.0215 0.0020 0.0017 0.2216 0.0042 0.2955 0.0028 0.0007 0.0079 0.0254 0.2553 0.0404 0.3025  Fig. 4 shows how the total utility in the EDSPC algorithm converges over time, where we choose all the initial penalty values equal to 10. Also, we choose ξ = 0.9, ρ = 1 and ̺ = 1, and use Algorithm 1 as a benchmark to evaluate the optimal performance.As shown in Fig. 4, the EDSPC algorithm approaches the optimal utility, when the initial penalty values are carefully chosen.Moreover, the convergence rate of the EDSPC algorithm is much faster than DSPC, since DSPC continues updating the penalty values after the optimal solution is found for the current penalty values.Fig. 5 illustrates the average performance (with confidence interval) of DSPC, EDSPC, and SEER under 100 random initializations, with the same system parameters as in Fig. 4. As shown in Fig. 5, both DSPC and EDSPC are robust against the initial value variations.
A comparison with the SEER and ADP is also depicted in Fig. 4 and 5.As mentioned in Section 1, ADP can only guarantee local optimality.Therefore, for nonconvex problems (e.g., in this example), ADP may converge to a suboptimal solution.As noted in [12], the performance of SEER heavily hinges on the control parameter that can be challenging to choose on the fly.In contrast, DSPC can approach the globally optimal solution regardless of the initial parameter selection, but the convergence rate may be slower.Further, EDSPC improves the convergence rate, but the initial penalty values would impact how close it can approach the optimal point.From the point of view of reducing the amount of message passing, in our algorithms each link does not need any knowledge of the channel gains (including its own channel gain), the receiver SINR of the other links and the signal power of the other links, which are all used in the SEER algorithm.

JOINT SCHEDULING AND POWER CONTROL FOR STABILITY OF QUEUEING SYSTEMS
In Section 2, we studied the distributed power allocation, by using DSPC and EDSPC, for utility maximization in the saturated case.In this section, we generalize the study by considering a queueing system with dynamic packet arrivals and departures.Specifically, we develop a joint scheduling and power allocation policy to stabilize packet queues by integrating our power control algorithms with the celebrated back-pressure algorithm [22].

Stability Region and Throughput Optimal Power Allocation Policy
Consider the same wireless network model with L links as in Section 2. We assume that there are S classes of users in the system, and that the traffic brought by users of class s follows {A sl (t)} ∞ t=1 , which are i.i.d.sequences of random variables for all l = 1, ..., L and s = 1, ..., S, where A sl (t) denotes the amount of traffic generated by users of class s that enters the link l in slot t.Let Q s T (l) (t) and Q s R(l) (t) denote the current backlog in the queue of class s in slot t on the transmitter and receiver sides of link l, respectively.The queue length Q s T (l) (t) evolves over time as ) where r s l (t) denotes the transmission rate of link l for users of class s.The third term in (10) denotes the traffic from the other links.Assuming that the second moments of the arrival process {A sl (t)} ∞ t=1 are finite, the queue length process {Q s T (l) (t)} ∞ t=1 forms a Markov chain.Let E sl = 1 be the indicator that the path of users of class s uses link l, and E sl = 0, otherwise.As is standard [21], [22], [23], the stability region is defined as follows.
Definition 3.1: The stability region Λ is the closure of the set of all {ψ s } S s=1 for which there exists some feasible power allocation policy under which the system is stable, i.e., Λ = p∈P Λ(p), where Λ(p) = {{ψ s } S s=1 | S s=1 E sl ψ s < r l (p), ∀l}, and P denotes the set of feasible power allocation.Here ψ s denotes the first moment of {A sl (t)} ∞ t=1 , i.e., the load brought by users of class s, and r l (p) denotes the rate of link l under power allocation p.
For the sake of comparison, the throughput region 10 F of the corresponding saturated case is defined as the set of all feasible link rates, i.e., F = {r|r l = r l (p), p ∈ P}.In general, the throughput region F may be different from the stability region Λ, except for some special cases (e.g., in slotted ALOHA systems the throughput region and the stability region are the same [28] for two links and in a multiple-access channel the information theoretic capacity region is equivalent to its stability region [29]).
The system is stable if the arrival rates of packet queues are less than the service rates such that the queue lengths do not grow to infinity.In order to stabilize packet queues, it is critical to find the optimal scheduling and power allocation policy that maximizes the weighted sum rate given by (11).By integrating our power control algorithms and the back-pressure algorithm, we propose the following joint scheduling and power allocation policy (presented in Algorithm 4) to stabilize the queueing system.
Proposition 3.1: The joint scheduling and power allocation policy (Algorithm 4) can stabilize the system when the load {ψ s } S s=1 is strictly interior to the stability region Λ, i.e., there exists some ǫ > 0 such that {ψ s + ǫ} S s=1 ∈ Λ.The proof is similar to that in [23], [24], and is omitted for brevity.
Note that Algorithm 4 can be viewed as a single-hop dynamic back-pressure and resource allocation policy [24], crafted towards solving the weighted sum rate maximization problem (11).Specifically, by using the DSPC algorithm, Algorithm 4 can be implemented distributively to find the globally optimal resource allocation.We should caution that EDSPC can be applied to improve the convergence rate of Stage 2 in Algorithm 4 but it may render a suboptimal schedule (i.e., it can not stabilize all possible {ψ s } S s=1 within Λ), due to the fact that EDSPC may not always find the global optimal power allocation.
To reduce the complexity, we can consider a policy that computes (11) every few slots, and it can be shown that this policy can also stabilize the system, when {ψ s } S s=1 is strictly interior to the stability region Λ [26], [27].

Numerical Examples
In this section, we present numerical results to illustrate the use of Algorithm 4 for stabilizing a queueing system.We consider a one-hop network (i.e., E = {E sl } is the identity matrix) with two users (classes), where the channel gains are given by h 11 = 0.3, h 12 = 0.5, h 21 = 0.03, and h 22 = 0.8, and the noise power is 0.1 for each link.The maximum transmission power is set to 1 and 2 for links 1 and 2, respectively.Besides, we assume that the users of class s arrive at the network according to a Poisson process with rate λ s , and that the size of file brought by each user follows an exponential distribution 10.Note that the feasible utility region F defined in Section 2 is the throughput region, when the utility function is the same as the rate function.

Algorithm 4 Joint Scheduling and Power Allocation Policy
Stage 1: For each link l, select a link weight according to w l (t) = max s=1,...,S D s l (t), where the difference of queue lengths of class s is D s l (t) = max(Q s T (l) (t)−Q s R(l) (t), 0), if the receiver of link l is not the destination of class s's traffic, and D s l (t) = Q s T (l) , otherwise.Stage 2: Compute the optimal power allocation p * in each slot t by solving the following problem with DSPC algorithm Thus, the transmission rate of link l in slot t is given by r D sl (t) denote the class scheduled in slot t; if multiple classes satisfy this condition, then s * l is randomly chosen as one of these classes.Then, schedule these classes according to the solution given by Stage 2. with mean ν s .The load brought by users of class s is then ψ s = λ s ν s .For this example, we also study the stability region Λ and compare it with the throughput region F of the corresponding saturated case as illustrated in Fig. 6.The stability region follows from the union of link rates that are conditioned on whether the other link is backlogged or not [28], [29].First, we derive the stability region for the given power allocation.Then, we vary power allocation in the feasible region, and by taking the envelope of these regions, we obtain the overall stability region shown in Fig. 6.However, different from the previous cases, where the throughput region is the same as the stability region, e.g., in a slotted ALOHA system with two links [28] and in a multiple-access channel [29], in our case under the SINR model, the throughput region F is strictly smaller than the stability region (due to the underlying nonconvex optimization problem), as observed from Fig. 6, which is the convex hull of F , i.e., Co(F ), achievable by timesharing across different transmission modes 11 .
Then, we vary the arrival rate λ and the average file size ν to change the traffic intensity ψ = λν.Assuming that the arrival rate and the average file size of each user are the same, we compare the sample paths of each user's queue length for ψ = 1 (λ = 1, ν = 1) with ψ = 1.5 (λ = 1.5, ν = 1) in Fig. 7.When ψ = 1, which falls in the stability region shown in Fig. 6, the system is stabilized by using Algorithm 4, while, when ψ = 1.5, which is outside the stability region, the system becomes unstable.Fig. 8 illustrates the average delay of the system as a function of the arrival rates.The delay is finite for small loads and grows unbounded when the 11.The transmission mode is defined as the transmission rate pair within the throughput region F .

POWER CONTROL FOR MULTICAST COMMU-NICATIONS
Due to wireless multicast advantage [16], multicasting enables efficient data delivery to multiple recipients with a single transmission.In this section, we extend the distributed stochastic power control algorithms in Section 2 to support multicast communications.

System Model
Beyond the model described in Section 2, we consider that each user l has one transmitter and a set M l of receivers.The corresponding transmission rate, r l , is determined by the bottleneck link among these transmitterreceiver pairs, i.e., r l = min m∈M l r lm , where r lm denotes the link rate between the transmitter of user l and its receiver m, and it is calculated based on the Shannon Here, we do not consider the general broadcast capacity region but rather focus on maximizing the bottleneck link rates.

Network Utility Maximization
We seek to find the optimal power allocation p * that maximizes the overall system utility subject to the power constraints in multicast communications, as follows: maximize l∈L U l (r l ) subject to r l = min m∈M l r lm , ∀ l ∈ L r lm = log(1 + γ lm (p)), ∀ l ∈ L, m ∈ M l 0 ≤ p l ≤ P max l , ∀l ∈ L variables {p, {r l }, {r lm }}.
(12) Similar to (2), ( 12) is nonconvex due to the complicated interference coupling between individual links.In order to devise distributed algorithms to solve (12), it suffices to relax r l = min m∈M l r lm in (12) as r l ≤ log(1 + γ lm (p)), ∀ l ∈ L, m ∈ M l .Thus, (12) can be rewritten as

Distributed Global Optimization Algorithms
We develop next distributed algorithms that can find the globally optimal solutions to (13) based on EDT and SA.
To this end, we first rewrite the optimization problem (13) as Next, we use EDT to write Lagrangian for (14) as where α lm ∈ R are the penalty multipliers.Based on EDT, there exist finite α * lm ≥ 0 for all l ∈ L, m ∈ M l such that, for any α lm > α * lm , ∀ l ∈ L, m ∈ M l , the solution to (15) is the same as ( 14) [14].Since there is no coupling among the power constraints of the individual users, (15) does not include the power constraints.Thus, each user satisfies its own power constraint while minimizing (15) in a distributed operation.
As in Section 2, the key step is to let each user perform a local stochastic search based on SA.Let r l and p l denote the primal values of the lth user, and r ′ l denote the new values randomly chosen by the lth user, which is treated as a new target transmission rate for the lth user.Different from the unicast communications case, the lth user updates p ′ l by where γ lm is the current SINR measured at the receiver m of user l.Note that ( 16) does not need any information of the channel gains except the SINR feedback from the intended receivers, i.e., γ lm .Since ( 16) is in standard form as described in [19], it converges geometrically fast to the target utility.The steps to update r l and α lm are similar to DSPC Algorithm 2 in Section 2. A detailed description of DSPC algorithm for multicast communications is presented in Algorithm 5. Proposition 4.1: The distributed stochastic power control algorithm for multicast communications (Algorithm 5) converges to a globally optimal solution to (13), as temperature T in SA approaches zero.
Proof: The proof is based on EDT and SA, and follows similar steps used in the proof of Proposition 2.2, and it is omitted here for brevity.
Likewise, to improve the convergence rate, we also propose an enhanced algorithm for Algorithm 5 by empirically choosing the initial penalty values and employing a geometric cooling schedule.The resulting algorithm is given in Algorithm 6.Similar to the unicast case, Algorithms 5 and 6 do not need any knowledge of channel information (or the bottleneck link) and they are dynamically updated by the SINR feedback from the intended receivers.

Numerical Examples
In this section, we evaluate the performance of Algorithms 5 and 6 for multicast communications.We consider a wireless network with four transmitters and each transmitter has two receivers.These transmitters and receivers are randomly placed in a 10m-by-10m square area.The channel gains h lm are equal to d −4  lm , where d lm represents the distance between the transmitter l and the receiver m.  transmitter l and the receiver m.We assume U l (r l ) = r l , P max l = 1, and n lm = 10 −4 for all l ∈ L and m ∈ M l .Fig. 9 illustrates the fast convergence performance of Algorithms 5 and 6 in multicast communications. 12 Besides, we examine the average performance (with confidence interval) of DSPC and EDSPC for multicast communications under 100 random initializations with the same system parameters as in Fig. 9.As illustrated 12.The other existing algorithms have been specifically designed for unicast communications; therefore, they are excluded here from the performance comparison.

CONCLUSION
We studied the distributed power control problem of optimizing the system utility as a function of the achievable rates in wireless ad hoc networks.Based on the observation that the global optimum lies on the boundary of the feasible region for unicast communications, we focused on the equivalent but more structured problem in the form of maximizing the minimum weighted utility.Appealing to extended duality theory, we decomposed the minimum weighted utility maximization problem into subproblems by using penalty multipliers for constraint violations.We then proposed a distributed stochastic power control (DSPC) algorithm to seek a globally optimal solution, where each user stochastically announces its target utility to improve the total system utility via simulated annealing.In spite of the nonconvexity of the underlying problem, the DSPC algorithm can guarantee global optimality, but only with a slow convergence rate.Therefore, we proposed an enhanced distributed algorithm (EDSPC) to improve the convergence rate with geometric cooling schedule in simulated annealing.We then compared DSPC and EDSPC with the existing power control algorithms and verified the optimality and complexity reduction.
Next, we proposed the joint scheduling and power allocation policy for queueing systems by integrating our distributed power control algorithms with the backpressure algorithm.The stability region was evaluated, which is shown to be strictly greater than the throughput region in the corresponding saturated case.Beyond unicast communications, we generalized our power control algorithms to multicast communications by jointly maximizing the minimum rates on bottleneck links in different multicast groups.Our distributed stochastic power control approach guarantees global optimality without the need of channel information, while reducing the computation complexity, in general systems with unicast and multicast communications, and applies to both backlogged and random traffic patterns.

9 .
The geometric cooling schedule is employed to accelerate the convergence rate of DSPC in the simulation.DSPC updates penalty values until they satisfy the threshold-based optimality condition.

Fig. 6 :Fig. 7 :
Fig. 6: Comparison of the stability region and the throughput region.

TABLE 1 :
Summary of the notations and definitions.

TABLE 2 :
The performance of the existing approaches for Case I and II.
The channel gains h lm are equal to d −4 lm , where d lm represents the distance between the Algorithm 5 DSPC for Multicast Communications Initialization: