Joint Utility-Based Power Control and Receive Beamforming in Decentralized Wireless Networks



Introduction
Two central mechanisms for resource allocation and interference management in wireless networks are power control and beamforming. To ensure a high utilization of wireless resources, transmit powers and beamformers should be optimized jointly to exploit the interdependencies between them. As is widely known, the overall network can be optimized with respect to different goals, and two main approaches are typically used. The classical QoS-based approach aims at satisfying a certain quality-of-service (QoS) requirement with minimum power. To circumvent the feasibility problem, a related approach is to solve the so-called max-min SIR-balancing problem. In contrast, in the utility-based resource allocation problem the network operator aims at optimizing a weighted aggregate utility so as to maximize the overall network performance. By appropriately choosing the utility function, one can trade overall system efficiency against fairness. Widely known are the α-fair strictly concave utility functions introduced in [1].
In decentralized wireless networks, however, in addition to efficiently managing wireless resources, two challenging tasks are to assign these resources in a distributed manner and to apply stochastic algorithms that cope with noisy measurements and estimates. Thus, in this paper we focus on the following problem: maximizing an aggregate utility jointly over powers and receive beamformers in real-world decentralized wireless networks.
1.1. Related Work. Classical QoS-based power control has been studied extensively (e.g., [2][3][4]). It aims at allocating transmit powers to the users such that each user meets its SIR target. Provided that the SIR requirements are feasible, there exist iterative distributed algorithms that attain the target SIR [3,5,6]. Note that a closely related approach is to maximize the minimum SIR [7][8][9][10]. In contrast to classical QoS-based power control, the objective of utility-based power control is to optimize the overall network performance with respect to some aggregate utility function [11][12][13][14][15][16][17][18]. Recently, distributed utility-based power control algorithms have been developed in [14,16,17,18]. In [17] the problem of joint power control and end-to-end congestion control is addressed, where the power control part is a special case of the power control problem in [14]. The approach of [16] is a game-theoretic one. References [16,17] apply a flooding protocol to pass locally available quantities to other nodes. The authors of [18] interpret the utility-based power control problem as a joint optimization of powers and SIR assignment over the feasibility region. They proposed a distributed power control and SIR assignment algorithm for the uplink in a multicell wireless network. In contrast, [14] proposed a distributed utility-based power control algorithm for general wireless networks that applies the notion of the adjoint network and thus avoids a relatively expensive flooding protocol. In addition, the authors touched on the problem of stochastic approximation and showed how to deal with it in practice.
Independently of and simultaneously with our work, the authors of [19] have proposed a distributed utility-based joint power control and receive beamforming algorithm for cellular uplinks applying the scheme of [18]. Apart from that, most work on joint power control and beamforming has so far focused on QoS-based resource allocation, especially on the so-called max-min SIR-balancing and its related problem. For example, in [20][21][22] the duality between uplink and downlink channels is exploited. Another strategy was proposed in [23,24], showing that the problem can be embedded in semidefinite and conic optimization programs. The work of [25] extended [22] to solve the max-min SIR-balancing problem under general power constraints.
However, apart from [15], the above set of publications considered only the deterministic case. First works incorporating imprecise knowledge of received waveforms can be found in [26][27][28]. Recently, stochastic algorithms for joint QoS-based power control and receive beamforming and their convergence analysis have been proposed in [29,30].

1.2. Summary of Main Results and Paper Organization.
In the following we consider the problem of joint power control and receive beamforming in order to maximize a certain aggregate utility function that represents the QoS attained and is a function of the SIR. However, in contrast to the pure power control problem [14], it is not known which class of utility functions allows a convex formulation of this joint optimization problem and thus enables an efficient global solution in distributed wireless networks. In particular, in the case of the logarithmic function, the aggregate utility function appears to have relatively many local maxima. In this paper, under the assumption of perfect synchronization, we first reformulate the joint power and receiver control problem as a pure power control problem under optimal receivers. This follows from the fact that an optimal receiver can be obtained in closed form for any power vector. However, an efficient implementation of the equivalent gradient projection algorithm is notoriously difficult to achieve in decentralized wireless networks. Thus, we decompose the problem into two coupled subproblems and propose an alternating algorithm that converges to a stationary point. If we confine our attention to utility functions whose relative concavity is larger than that of the logarithmic function, numerical experiments suggest that the proposed algorithm may converge to a global maximum for a large set of initial SIRs.
In contrast to [15], which touches on the problem only briefly, this paper provides a more detailed analysis. In addition, it addresses practical implementation aspects that have been missing so far. As already mentioned, noisy measurements and estimates occur in real-world networks. We therefore embed the proposed alternating algorithm into the framework of stochastic approximation. In particular, we discuss in detail the imprecise knowledge of received waveforms and the influence of step size control on the convergence properties. Finally, we provide extensive simulations of the convergence behavior as well as performance comparisons with pure power control schemes.
Potential applications of the resource allocation scheme presented in this paper are envisaged, for example, in wireless mesh networks to control transmit powers and beamformers of base stations (mesh routers). These base stations create a wireless backbone via multihop ad hoc networking and have a practically unlimited energy supply.

System Model and Problem Statement
2.1. System Model. We consider a general multiple-antenna wireless network with an established network topology, in which all links share a common wireless spectrum. All users are equipped with $M \geq 2$ antennas. Let $K \geq 2$ users compete for access to the wireless links and let $\mathcal{K} = \{1, \dots, K\}$ denote the index set of all users. Assume that $k \in \mathcal{K}$ is arbitrary but fixed and define $u_l := u_l^{(k)} = (u_{1,l}^{(k)}, \dots, u_{M,l}^{(k)}) \in \mathbb{C}^M$ to be the effective transmit vector of transmitter $l \in \mathcal{K}$ associated with receiver $k$. The effective transmit vector $u_l$ is the product of the channel matrix between transmitter $l \in \mathcal{K}$ and receiver $k$ and its transmit beamformer. It determines the "direction" of the transmit signal. The effective transmit vector is assumed to be arbitrary but fixed, which implies that the channels and transmit beamformers are fixed. In contrast, the receive beamformers acting as linear receivers should be jointly optimized with the transmit powers of the users. We use $v_k \in \mathbb{C}^M$ and $p_k \geq 0$ to denote the receive beamformer and transmit power of user $k$, respectively. The receive beamformers of all users are collected in the receive beamforming matrix $V = (v_1, \dots, v_K) \in \mathbb{C}^{M \times K}$ and their transmit powers in the power vector $p = (p_1, \dots, p_K) \in \mathbb{R}_+^K$ (in what follows, $\mathbb{R}_+$ and $\mathbb{R}_{++}$ denote the sets of nonnegative and positive reals, respectively). The transmit powers of the users are subject to individual power constraints $P_1, \dots, P_K > 0$ so that $p \in \mathcal{P}$ must hold, where $\mathcal{P} := \{p \in \mathbb{R}_+^K : p_k \leq P_k,\ k \in \mathcal{K}\}$. Furthermore, since the signal-to-interference ratio (SIR) is independent of the norm of the receive beamformers, we can assume that $\|v_k\|_2 = 1$ for each $1 \leq k \leq K$, and hence $\mathcal{V} := \{V \in \mathbb{C}^{M \times K} : \|v_k\|_2 = 1,\ k \in \mathcal{K}\}$ denotes the set of all beamforming matrices. Note that both $\mathcal{P}$ and $\mathcal{V}$ are compact sets, and so is their Cartesian product $\mathcal{P} \times \mathcal{V}$.
Finally, we define $\mathcal{P}_+ = \mathcal{P} \cap \mathbb{R}_{++}^K$. In words, $\mathcal{P}_+$ is the set of positive power vectors satisfying the power constraints.
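The main figure of merit is the SIR at the output of each receiver. Using the above notation and the assumption that all users are perfectly synchronized, the SIR of user $k$ is given by
$$\mathrm{SIR}_k(p, v_k) := \frac{p_k\,|\langle v_k, u_k\rangle|^2}{\sum_{l \neq k} p_l\,|\langle v_k, u_l\rangle|^2 + \sigma^2}, \quad k \in \mathcal{K}, \tag{1}$$
where $\sigma^2 > 0$ is the variance of the independent zero-mean additive Gaussian noise and $\|v_k\|_2 = 1$. The utility function $U : \mathbb{R}_{++} \to Q$ is assumed to satisfy a set of regularity assumptions, among them that it is twice continuously differentiable, strictly increasing, and bijective. Since $\mathbb{R}_{++}$ is an open set, all these assumptions imply that the first derivative $U'(x)$ is positive on $\mathbb{R}_{++}$, that is, there are no isolated points $x > 0$ such that $U'(x) = 0$.
The joint utility-based power control and receive beamforming problem can be written as follows. Given any weight vector $w = (w_1, \dots, w_K) > 0$, we search for a power vector $p^* \in \mathcal{P}$ and a beamforming matrix $V^* \in \mathcal{V}$ such that
$$(p^*, V^*) = \arg\max_{(p, V) \in \mathcal{P} \times \mathcal{V}} G(p, V), \tag{2}$$
where
$$G(p, V) := \sum_{k \in \mathcal{K}} w_k\, U\bigl(\mathrm{SIR}_k(p, v_k)\bigr). \tag{3}$$
Since the noise variance is strictly positive, standard arguments can be used to show that, with our choice of the utility functions, the maximum exists. The convexity discussion of this problem, the development of a distributed algorithm, and its implementation in real-world environments with noisy measurements, together with the performance evaluation by simulations, are the main tasks of this paper.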
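To make the objective concrete, the following minimal Python sketch evaluates the SIR of a user for a given receive beamformer and the weighted aggregate utility. It assumes the inner product convention ⟨v, u⟩ = v^H u, scales the noise by ‖v‖² so that the norm of the beamformer does not matter, and uses made-up channel numbers; the function names `inner`, `sir`, and `aggregate_utility` are illustrative and not taken from the paper.

```python
import math

def inner(v, u):
    """Inner product <v, u> = v^H u of two complex vectors."""
    return sum(vm.conjugate() * um for vm, um in zip(v, u))

def sir(k, p, v_k, u_rx, sigma2):
    """SIR of user k for receive beamformer v_k.

    u_rx[l] is the effective transmit vector of transmitter l as seen at
    receiver k. The noise term is scaled by ||v_k||^2, so the result does
    not depend on the norm of v_k.
    """
    norm2 = sum(abs(vm) ** 2 for vm in v_k)
    signal = p[k] * abs(inner(v_k, u_rx[k])) ** 2
    interference = sum(
        p[l] * abs(inner(v_k, u_rx[l])) ** 2
        for l in range(len(p)) if l != k
    )
    return signal / (interference + sigma2 * norm2)

def aggregate_utility(p, V, u, w, sigma2, utility=math.log):
    """Weighted aggregate utility: sum_k w_k * U(SIR_k(p, v_k))."""
    return sum(
        w[k] * utility(sir(k, p, V[k], u[k], sigma2))
        for k in range(len(p))
    )

# Toy example with K = 2 users and M = 2 antennas (hypothetical numbers).
u = [
    [[1 + 0j, 0j], [0.5 + 0j, 0.5j]],   # vectors seen at receiver 0
    [[0.3 + 0j, 0.1j], [0j, 1 + 0j]],   # vectors seen at receiver 1
]
p = [2.0, 3.0]
V = [[1 + 0j, 0j], [0j, 1 + 0j]]        # one unit-norm beamformer per user
sigma2 = 0.5
G_value = aggregate_utility(p, V, u, w=[1.0, 1.0], sigma2=sigma2)
```

Scaling a beamformer by any nonzero constant leaves the SIR unchanged in this sketch, which is why the unit-norm restriction costs nothing.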

A Class of Utility Functions
Suppose that the utility function $U : \mathbb{R}_{++} \to Q$ is further confined to satisfy
$$r(x, U) := -\frac{x\,U''(x)}{U'(x)} \geq 1, \quad x > 0, \tag{4}$$
where $U'(x) > 0$ and $U''(x) < 0$, $x > 0$, denote the first and second derivatives of $U$, respectively. Then, we know from [15] that, for any fixed $V \in \mathcal{V}$, $G_e(s, V) := G(e^s, V)$ is concave in the logarithmic power vector $s := \log p \in \mathcal{S}$ with $p \in \mathcal{P}_+$ and $\mathcal{S} := \{s = \log p : p \in \mathcal{P}_+\}$. Here and hereafter, $\log p$, $p \in \mathbb{R}_{++}^K$, and $e^s$, $s \in \mathbb{R}^K$, are defined component-wise. Since the logarithm is a bijection from $\mathbb{R}_{++}$ onto $\mathbb{R}$ and $p^* > 0$, there is a one-to-one relationship $s^* = \log p^*$ between optimal power vectors $p^*$ and optimal logarithmic power vectors $s^*$. The motivation behind this substitution is the following fact [15].
Therefore, for $G_e(s, V)$ to be concave in $s$, it is sufficient that $\mathrm{SIR}_k(e^s)$ is a log-concave function of $s \in \mathbb{R}^K$. In economics, the quantity $r(x, U) \geq 0$ is known as the coefficient of relative risk aversion [31] and is used to measure the relative concavity of $U(x)$. The larger the value of $r(x, U) \geq 0$, the larger the relative concavity of $U(x)$ at $x > 0$, and therefore a better fairness performance (at the cost of throughput performance) can be expected. A prominent example of a function that satisfies (4) is the logarithmic function $U(x) = \log(x)$, $x > 0$.
Now the question is what happens if we use this class of utility functions in the joint power control and receive beamforming problem (2). First note that this problem can be written as a power control problem because
$$\max_{(p, V) \in \mathcal{P} \times \mathcal{V}} G(p, V) = \max_{p \in \mathcal{P}} \max_{V \in \mathcal{V}} G(p, V) = \max_{p \in \mathcal{P}_+} G(p, V^*(p)),$$
where $V^*(p) := \arg\max_{V \in \mathcal{V}} G(p, V)$ is used to denote an optimal receive beamforming matrix for a given $p > 0$ and the last step follows from (A.2). Obviously, since the SIR of user $k$ depends only on the $k$th receive beamformer, one obtains
$$\max_{V \in \mathcal{V}} G(p, V) = \sum_{k \in \mathcal{K}} w_k \max_{\|v_k\|_2 = 1} U\bigl(\mathrm{SIR}_k(p, v_k)\bigr) = \sum_{k \in \mathcal{K}} w_k\, U\bigl(\mathrm{SIR}_k^*(p)\bigr),$$
where
$$\mathrm{SIR}_k^*(p) := \max_{\|v_k\|_2 = 1} \mathrm{SIR}_k(p, v_k) \tag{8}$$
is the SIR under optimal receive beamforming (see Section 4) and the last step follows since $U(x)$, $x > 0$, is a strictly increasing function and $w$ is a positive vector. Now, using the substitution $s = \log p$, $p \in \mathcal{P}_+$ (in accordance with the power control problem of [15]), and
$$F(s) := \sum_{k \in \mathcal{K}} w_k\, U\bigl(\mathrm{SIR}_k^*(e^s)\bigr), \tag{9}$$
it follows that a solution to (2) is any pair $(p^*, V^*) \in \mathcal{P}_+ \times \mathcal{V}$ given by $p^* = e^{s^*}$ and $V^* = V^*(p^*)$, where
$$s^* = \arg\max_{s \in \mathcal{S}} F(s). \tag{10}$$
In words, as in [15], the problem reduces to a power control problem except that now each SIR is assumed to attain its maximum over all receive beamformers. Unfortunately, the condition (4) is not sufficient for $F(s)$ defined by (9) to be a concave function of $s \in \mathcal{S}$. A simple counterexample is constructed in the appendix for $U(x) = \log(x)$, $x > 0$.
Numerical experiments show that if $U(x) = \log(x)$, $x > 0$, then the gradient projection algorithm is not globally convergent, that is, it may converge to a local maximum that is not global. Given $\mathcal{V}$ and $\mathcal{P}$, the aggregate utility function seems to have relatively many local maxima.
A simple idea is to further restrict the class of utility functions by requiring larger values of $r(x, U)$ for all $x > 0$. For instance, we could demand that
$$r(x, U) > 1, \quad x > 0. \tag{11}$$
This excludes the logarithmic function and implies that $U(e^x)$ is strictly concave. A class of utility functions that satisfies (11) is given by
$$U^{(\alpha)}(x) := \frac{x^{1-\alpha}}{1-\alpha}, \quad x > 0,\ \alpha \geq 2. \tag{12}$$
Indeed, it may be easily verified that $r(x, U^{(\alpha)}) = \alpha$, and hence (11) holds for all $\alpha \geq 2$, $\alpha \in \mathbb{N}$. Another example is
$$U(x) = \log\frac{x}{1+x}, \quad x > 0. \tag{13}$$
So, at low values of $x > 0$, the function in (13) behaves like the logarithmic function. In contrast, as $x$ increases, it is similar to the negative inverse function. Numerical experiments with the utility function (13) suggest that in this case the gradient projection algorithm (see Section 4.1) converges to a global maximum for a relatively large set of initial SIR values. Compared with the logarithmic utility function, convergence to a local maximum that is not global was observed in significantly fewer cases. However, we can show that $F(s)$ with (13) is not concave in general and that the standard gradient projection algorithms are not globally convergent for all initial SIR levels.
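The relative concavity of the utility functions mentioned here can be checked numerically. The sketch below assumes the standard definition of the coefficient of relative risk aversion, r(x, U) = −x U''(x)/U'(x), and estimates it with central finite differences; the helper names `relative_concavity`, `u_alpha`, and `u_log_ratio` are illustrative only.

```python
import math

def relative_concavity(U, x, rel_step=1e-3):
    """Estimate r(x, U) = -x * U''(x) / U'(x) by central differences."""
    h = rel_step * x
    d1 = (U(x + h) - U(x - h)) / (2 * h)            # first derivative
    d2 = (U(x + h) - 2 * U(x) + U(x - h)) / (h * h)  # second derivative
    return -x * d2 / d1

def u_alpha(alpha):
    """alpha-fair utility x^(1 - alpha) / (1 - alpha) for alpha >= 2."""
    return lambda x: x ** (1 - alpha) / (1 - alpha)

def u_log_ratio(x):
    """Utility log(x / (1 + x))."""
    return math.log(x / (1 + x))

x0 = 2.0
r_log = relative_concavity(math.log, x0)       # close to 1 for every x
r_alpha3 = relative_concavity(u_alpha(3), x0)  # close to alpha = 3
r_ratio = relative_concavity(u_log_ratio, x0)  # close to (2x + 1) / (x + 1)
```

For log(x/(1 + x)) a short calculation gives r(x, U) = 1 + x/(1 + x), which lies strictly between 1 and 2: the function is only slightly more concave than the logarithm at low x and approaches the behavior of −1/x (for which r = 2) as x grows.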
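An interesting problem is whether global convergence (if not for all starting points, then at least for most of them) of the gradient projection algorithm can be achieved by requiring that $r(x, U) \geq c$, $x > 0$, for some sufficiently large constant $c \geq 2$. Increasing the constant $c$ leads to utility functions with larger relative concavities. In particular, as shown below, if there is a utility function for which each addend in (9) is concave on $\mathbb{R}^K$, then $F(s)$ is concave for all utility functions with a larger coefficient $r(x, U)$.
Observation 1. Let $g : \mathbb{R}_{++} \to Q_1$ be any utility function for which (4) holds, and suppose that each addend in (9) with $U = g$ is concave on $\mathbb{R}^K$. Then $F(s)$ is concave on $\mathbb{R}^K$ for any utility function $f : \mathbb{R}_{++} \to Q_2$ with $r(x, f) \geq r(x, g)$, $x > 0$.
Proof. Since $g$ and $f$ are bijective utility functions, there is a twice continuously differentiable and strictly increasing function $h : Q_1 \to Q_2$ such that $f(x) = h(g(x))$, $x > 0$. A direct calculation gives $r(x, f) = r(x, g) - x\,g'(x)\,h''(g(x))/h'(g(x))$. Together with $r(x, f) \geq r(x, g)$, $g'(x) > 0$, and $h'(g(x)) > 0$, this implies that $h''(g(x)) \leq 0$, $x > 0$, and hence one obtains $h''(x) \leq 0$, $x \in Q_1$, due to the bijectivity of $g$. Thus $h$ is concave and increasing, and the composition of such a function with a concave function is concave.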
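The calculation behind this observation can be written out in a few lines. The following is a sketch that assumes $f = h \circ g$ with $g'(x) > 0$ and $h'(g(x)) > 0$, and the standard definition $r(x, U) = -x\,U''(x)/U'(x)$ of the coefficient of relative risk aversion.

```latex
\begin{align*}
f'(x)  &= h'(g(x))\,g'(x), \\
f''(x) &= h''(g(x))\,g'(x)^2 + h'(g(x))\,g''(x), \\
r(x,f) &= -\frac{x\,f''(x)}{f'(x)}
        = -\frac{x\,g''(x)}{g'(x)} - \frac{x\,g'(x)\,h''(g(x))}{h'(g(x))}
        = r(x,g) - \frac{x\,g'(x)\,h''(g(x))}{h'(g(x))}.
\end{align*}
```

Since $x$, $g'(x)$, and $h'(g(x))$ are all positive, $r(x, f) \geq r(x, g)$ holds exactly when $h''(g(x)) \leq 0$, that is, when the outer function $h$ is concave on the range of $g$.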
Applying this observation to the class in (12) reveals that if there were some $\alpha \geq 2$ such that $U^{(\alpha)}(\mathrm{SIR}_k^*(e^s))$ is concave on $\mathbb{R}^K$ for each $k \in \mathcal{K}$, then the problem (10) would be a convex problem for all $U^{(\alpha')}(x)$ with $\alpha \leq \alpha'$. Then, as discussed in the following section, we would be able to approximate the max-min fair rate allocation efficiently and arbitrarily closely for any power constraints.

3.1. An Arbitrarily Close Approximation of the Max-Min Fair Allocation. Reference [1] introduced the class of utility functions in (12) to obtain different tradeoffs between throughput and fairness performance in wireline communications networks. In particular, it was shown that if each source is assigned the utility function $U^{(\alpha)}(x)$, $x > 0$, then the corresponding rate allocation tends to the max-min rate allocation as $\alpha \to \infty$. For a large family of modulations determining the relationship between data rates attainable on wireless links and the SIR at the receiver output, this result carries over to our setting. To be precise, assume that $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$ is a one-to-one continuously differentiable function that maps the SIR values onto the data rates. A common assumption is that $\Phi(x) = \log(1 + x)$, $x \geq 0$. By this model, the set of all simultaneously achievable data rates is
$$\bigl\{\bigl(\Phi(\mathrm{SIR}_1(p, v_1)), \dots, \Phi(\mathrm{SIR}_K(p, v_K))\bigr) : p \in \mathcal{P},\ V \in \mathcal{V}\bigr\},$$
which is a (connected) compact set since $\Phi(\mathrm{SIR}_k(p, v_k))$ is continuous on the compact set $\mathcal{P} \times S^{M-1}$, where $S^{M-1}$ is the unit sphere in $\mathbb{C}^M$. This yields the following observation (see [15] and [1, Lemma 3]).
By the observation, $p^* = e^{s^*}$ with $U(x) = U^{(\alpha)}(x)$ converges to the max-min power allocation as $\alpha$ tends to infinity [22]. Moreover, for every $\alpha$, $\nabla F(s)$ exists and is continuous on $\mathbb{R}^K$, so that efficient gradient projection algorithms could be used to approximate the max-min power allocation for any power constraints if the algorithms were globally convergent for some sufficiently large $\alpha$ (as discussed before).

Utility-Based Power Control and Beamforming Algorithm
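In this section, we derive a gradient projection algorithm for the problem (19) and prove its convergence. To this end, let us first identify optimal receive beamformers. By (8), an optimal receive beamformer of user $k$ is exactly that beamformer for which the $k$th SIR attains its maximum. Hence,
$$v_k^*(p) = \arg\max_{\|v_k\|_2 = 1} \mathrm{SIR}_k(p, v_k) = \arg\max_{\|v_k\|_2 = 1} \frac{p_k\, v_k^H u_k u_k^H v_k}{v_k^H Z_k(p)\, v_k},$$
where
$$Z_k(p) := \sum_{l \neq k} p_l\, u_l u_l^H + \sigma^2 I$$
is positive definite since $\sigma^2$ is positive. As a consequence, the inverse matrix of $Z_k(p)$ exists regardless of the choice of the effective transmit vectors $u_k$ and $p \in \mathcal{P}$. Note that the SIR can be written in this compact form due to the assumption of perfect synchronization. An optimal receive beamformer $v_k^*(p)$ can be easily found when $\mathrm{SIR}_k$ is rewritten as a Rayleigh quotient to obtain [32]
$$v_k^*(p) = c_k\, Z_k^{-1}(p)\, u_k, \tag{17}$$
where $c_k > 0$ is a constant chosen such that $\|v_k^*(p)\|_2 = 1$. Consequently, with an optimal beamformer, the SIR of user $k$ is equal to
$$\mathrm{SIR}_k^*(p) = p_k\, u_k^H Z_k^{-1}(p)\, u_k. \tag{18}$$
From this, it follows that
$$V^*(p^*) = \bigl(c_1 Z_1^{-1}(p^*) u_1, \dots, c_K Z_K^{-1}(p^*) u_K\bigr)$$
and ($s^* = \log p^*$)
$$s^* = \arg\max_{s \in \mathcal{S}} F(s), \quad F(s) = \sum_{k \in \mathcal{K}} w_k\, U\bigl(e^{s_k} u_k^H Z_k^{-1}(e^s)\, u_k\bigr), \tag{19}$$
with appropriately chosen constants $c_1, \dots, c_K > 0$.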
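If we assume the utility function (13) or the functions (12), then $F(s)$ in (19) can be written using the inverse of
$$Z(p) := \sum_{l \in \mathcal{K}} p_l\, u_l u_l^H + \sigma^2 I = Z_k(p) + p_k\, u_k u_k^H,$$
which is independent of the index $k$. Indeed, by the Sherman-Morrison formula [33], it follows that
$$Z^{-1}(p) = Z_k^{-1}(p) - \frac{p_k\, Z_k^{-1}(p)\, u_k u_k^H Z_k^{-1}(p)}{1 + p_k\, u_k^H Z_k^{-1}(p)\, u_k},$$
and hence
$$p_k\, u_k^H Z^{-1}(p)\, u_k = \frac{\mathrm{SIR}_k^*(p)}{1 + \mathrm{SIR}_k^*(p)}.$$
So, if $U(x) = \log(x/(1 + x))$, $x > 0$, the aggregate utility function in (9) yields
$$F(s) = \sum_{k \in \mathcal{K}} w_k \log\bigl(e^{s_k} u_k^H Z^{-1}(e^s)\, u_k\bigr).$$
Choosing $U(x) = U^{(\alpha)}(x)$ given by (12) gives
$$F(s) = \frac{1}{1 - \alpha} \sum_{k \in \mathcal{K}} w_k \left(\frac{1}{e^{s_k} u_k^H Z^{-1}(e^s)\, u_k} - 1\right)^{\alpha - 1},$$
where $\alpha \geq 2$. The constant factor $1/(1 - \alpha)$ has no impact on the location of the optimum; since it is negative, maximizing $F(s)$ is equivalent to minimizing the remaining sum.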
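The consequence of the Sherman-Morrison formula used here, namely that p_k u_k^H Z^{-1} u_k equals SIR*/(1 + SIR*) when SIR* = p_k u_k^H Z_k^{-1} u_k, can be checked numerically. The sketch below is restricted to M = 2 antennas so that a 2x2 solver suffices, assumes the same effective transmit vectors at every receiver, and uses made-up numbers; the helper names are illustrative.

```python
def solve2(A, b):
    """Solve the 2x2 complex linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (A[1][1] * b[0] - A[0][1] * b[1]) / det,
        (A[0][0] * b[1] - A[1][0] * b[0]) / det,
    ]

def quad_inv(A, u):
    """Return u^H A^{-1} u, which is real for Hermitian positive definite A."""
    x = solve2(A, u)
    return sum(ui.conjugate() * xi for ui, xi in zip(u, x)).real

def interference_matrix(p, u, sigma2, skip=None):
    """sigma2 * I + sum_l p_l u_l u_l^H, optionally leaving out index skip."""
    Z = [[sigma2 + 0j, 0j], [0j, sigma2 + 0j]]
    for l, (pl, ul) in enumerate(zip(p, u)):
        if l == skip:
            continue
        for i in range(2):
            for j in range(2):
                Z[i][j] += pl * ul[i] * ul[j].conjugate()
    return Z

# Hypothetical effective transmit vectors, powers, and noise variance.
u = [[1 + 0j, 0.2j], [0.5 + 0j, 0.5j], [0.1 - 0.3j, 0.8 + 0j]]
p = [2.0, 3.0, 1.5]
sigma2 = 0.5

Z = interference_matrix(p, u, sigma2)            # includes all users
checks = []
for k in range(len(p)):
    Z_k = interference_matrix(p, u, sigma2, skip=k)
    sir_opt = p[k] * quad_inv(Z_k, u[k])         # SIR under an optimal receiver
    q = p[k] * quad_inv(Z, u[k])                 # same user, full matrix Z
    checks.append((sir_opt, q))
```

Each pair in `checks` should satisfy q = sir_opt / (1 + sir_opt) up to rounding, so a single matrix inverse can serve all users.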

4.1. Gradient Projection Algorithm.
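All partial derivatives of $\mathrm{SIR}_k^*(e^s)$ with $\mathrm{SIR}_k^*(p)$ given by (18) exist and are continuous functions on $\mathbb{R}^K$ because the inverse matrix $Z_k^{-1}(e^s)$ exists for all $s \in \mathbb{R}^K$, regardless of the choice of the effective transmit vectors, and the entries in $Z_k^{-1}(e^s)$ vary continuously with the entries in $Z_k(e^s)$. Hence, we can consider a gradient projection algorithm with a constant (sufficiently small) step size $\delta > 0$,
$$s(n + 1) = \Pi_{\mathcal{S}}\bigl(s(n) + \delta\, \nabla F(s(n))\bigr), \quad n \in \mathbb{N}_0, \tag{26}$$
where $\Pi_{\mathcal{S}}(x)$ is the projection of $x \in \mathbb{R}^K$ on the closed convex set $\mathcal{S}$ [34,35] and the $k$th partial derivative $\nabla_k F(s) = (\partial F/\partial s_k)(s)$ yields
$$\nabla_k F(s) = w_k\, U'\bigl(\mathrm{SIR}_k^*(e^s)\bigr)\, \mathrm{SIR}_k^*(e^s) - e^{s_k} \sum_{j \neq k} w_j\, U'\bigl(\mathrm{SIR}_j^*(e^s)\bigr)\, e^{s_j}\, \bigl|(u_k^{(j)})^H Z_j^{-1}(e^s)\, u_j^{(j)}\bigr|^2,$$
where the following identity was used. For an invertible and differentiable matrix function $A(x)$, there holds
$$\frac{\partial A^{-1}(x)}{\partial x} = -A^{-1}(x)\, \frac{\partial A(x)}{\partial x}\, A^{-1}(x).$$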

EURASIP Journal on Wireless Communications and Networking
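Hence, due to the individual power constraints $s_k \leq \log P_k$ on each user $k$, the algorithm (26) takes the form
$$s_k(n + 1) = \min\bigl\{s_k(n) + \delta\, \nabla_k F(s(n)),\ \log P_k\bigr\}, \quad k \in \mathcal{K}, \tag{29}$$
where $\mathrm{SIR}_k^*(e^s)$ is defined by (18).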
Lemma 2. For a sufficiently small step size $\delta > 0$, the sequence $\{p(n)\}$ generated by the algorithm (29) with $p(n) = e^{s(n)}$ converges to a stationary point.
Proof. By standard results [34,35], the gradient projection algorithm converges to a stationary point for sufficiently small values of $\delta > 0$ if $F(s)$ is bounded above, continuously differentiable on $\mathcal{S}$, and the gradient $\nabla F(s)$ is Lipschitz continuous on any bounded subset of $\mathcal{S}$. The first condition is clearly satisfied due to the power constraints. The second condition holds as well since, by assumption, the utility function $U(x)$ is twice continuously differentiable. For the same reason, the Hessian of $F(s)$ is bounded in the matrix 2-norm on any bounded subset of $\mathcal{S}$. This implies that $\nabla F(s)$ is Lipschitz continuous on any bounded subset of $\mathcal{S}$ [36, page 70].
Note that the maximum feasible step size in the algorithm may depend on the choice of the starting point s(0).
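A centralized version of the gradient projection iteration can be sketched as follows. The code assumes M = 2 antennas, the same effective transmit vectors at every receiver, the logarithmic utility, and made-up channel numbers; the gradient expression is the one obtained by differentiating p_k u_k^H Z_k^{-1} u_k with the matrix-inverse derivative identity, and all function names are illustrative.

```python
import math

def herm(a, b):
    """Inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def solve2(A, b):
    """Solve a 2x2 complex linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def z_matrix(p, u, sigma2, skip):
    """Z_k(p) = sigma2 * I + sum over l != k of p_l u_l u_l^H (M = 2)."""
    Z = [[sigma2 + 0j, 0j], [0j, sigma2 + 0j]]
    for l in range(len(p)):
        if l != skip:
            for i in range(2):
                for j in range(2):
                    Z[i][j] += p[l] * u[l][i] * u[l][j].conjugate()
    return Z

def optimal_sirs(s, u, sigma2):
    """SIR of every user under an optimal receive beamformer."""
    p = [math.exp(x) for x in s]
    return [p[k] * herm(u[k], solve2(z_matrix(p, u, sigma2, k), u[k])).real
            for k in range(len(s))]

def F(s, u, w, sigma2, U=math.log):
    return sum(wk * U(x) for wk, x in zip(w, optimal_sirs(s, u, sigma2)))

def grad_F(s, u, w, sigma2, dU=lambda x: 1.0 / x):
    """Gradient of F with respect to the logarithmic powers s."""
    p = [math.exp(x) for x in s]
    K = len(s)
    sirs = optimal_sirs(s, u, sigma2)
    g = [w[k] * dU(sirs[k]) * sirs[k] for k in range(K)]
    for j in range(K):
        x = solve2(z_matrix(p, u, sigma2, j), u[j])      # Z_j^{-1} u_j
        for k in range(K):
            if k != j:
                g[k] -= w[j] * dU(sirs[j]) * p[j] * p[k] * abs(herm(u[k], x)) ** 2
    return g

def gradient_projection(s, u, w, sigma2, log_pmax, delta=0.05, n_iter=100):
    """Constant-step gradient ascent, projected onto s_k <= log P_k."""
    for _ in range(n_iter):
        g = grad_F(s, u, w, sigma2)
        s = [min(sk + delta * gk, cap) for sk, gk, cap in zip(s, g, log_pmax)]
    return s

u = [[1 + 0j, 0.2j], [0.5 + 0j, 0.5j], [0.1 - 0.3j, 0.8 + 0j]]
w = [1.0, 2.0, 1.0]
sigma2 = 0.1
log_pmax = [math.log(10.0)] * 3
s0 = [0.0, 0.5, -0.5]
s_final = gradient_projection(s0, u, w, sigma2, log_pmax)
```

The step size 0.05 is an arbitrary choice for this toy setup; as noted above, the largest admissible value may depend on the starting point.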

Distributed Implementation
The computation of the gradient in (29) might be too expensive to be implemented in a distributed environment. In this section, we slightly modify the algorithm so that it can be implemented in a distributed manner. The basic idea is to increase the value of the function $G(p, V)$ in the following alternating fashion. For some given receive beamforming matrix $V(n)$ and power vector $p(n)$, a new power vector $p(n + 1)$ is computed such that $G(p(n + 1), V(n)) \geq G(p(n), V(n))$. Then, for the fixed power vector $p(n + 1)$, a new receive beamforming matrix $V(n + 1)$ is computed such that $G(p(n + 1), V(n + 1)) \geq G(p(n + 1), V(n))$. This alternating process is repeated until convergence.
Let us first consider the power vector update. To this end, let $V$ be fixed and define $F_V(s) := G(e^s, V)$. Then, the power vector can be updated according to the following algorithm:
$$s_k(t + 1) = \min\bigl\{s_k(t) + \delta_p(t)\, \hat{\nabla}_k F_{V(t)}(s(t)),\ \log P_k\bigr\}, \quad k \in \mathcal{K}, \tag{30}$$
for some $s(0) \in \mathbb{R}^K$, where, with some abuse of notation, $\hat{\nabla} F_{V(t)}(s(t))$ is used to denote a noisy estimate of the gradient vector $\nabla F_{V(t)}(s(t))$ and $\{\delta_p(t)\}$ with $\delta_p(t) > 0$ is an appropriately chosen sequence of diminishing step sizes [37]. If $\{s_k(t)\}_{t=1}^{L}$ is a sequence generated by (30) for some $L \geq 1$, then we put $s(n + 1) = (s_1(L), \dots, s_K(L))$. Note that the estimate $\hat{\nabla}_k F_{V(t)}(s(t))$, $k \in \mathcal{K}$, can be computed in a distributed manner using the adjoint network of [15]. This scheme enables each transmitter to estimate its current update direction from the received signal power. This mitigates the problem of global coordination of the transmitters when carrying out gradient projection algorithms in distributed wireless networks. More precisely, instead of each node sending its message separately as in classical flooding protocols, nodes transmit simultaneously (only coarse synchronization is required) over the adjoint network such that each node can estimate its gradient component from the received power. The price is possible estimation errors, which can usually be handled with a diminishing step size [37], as briefly discussed in Section 5.1.
Now assume that $s = \log p$ is fixed. Distributed algorithms for computing the optimal receive beamformers defined by (17) are widely established. These algorithms are based either on blind or on pilot-based estimation methods [38]. In the latter case, if $X_k$ is a pilot symbol of user $k$ with zero mean and $\mathbb{E}[|X_k|^2] = e^{s_k}$, and $r_k \in \mathbb{C}^M$ represents the observations at receiver $k$, then $v_k^*$ given by (17) minimizes the mean square error
$$\theta_k(v_k) := \mathbb{E}\bigl[|\kappa\, v_k^H r_k - X_k|^2\bigr],$$
where $\kappa > 0$ is a normalizing constant chosen such that, in the minimum, $\|v_k\|_2 = 1$. For practical implementation, we can assume $\kappa = 1$ and then normalize the beamformers so that their $l_2$-norms are equal to one. Note also that the expectation is taken with respect to $r^{(k)} := (r_k, X_k)$, which depends on the logarithmic power vector $s \in \mathbb{R}^K$. Now if the convex function $\theta_k(v_k)$ were explicitly known, then the algorithm (with the complex gradient operator $\nabla$, which gives the direction of steepest ascent of $\theta_k$)
$$v_k(n + 1) = v_k(n) - \delta_k\, \nabla \theta_k(v_k(n)) \tag{31}$$
would converge to $v_k^*$ defined by (17) for a sufficiently small step size $\delta_k > 0$. The problem is that the function $\theta_k$ is usually not known since the distribution of $r_k$ is not known [38]. Therefore, $\nabla \theta_k(v_k(n))$ cannot be computed and the algorithm must be modified using the framework of stochastic approximation [37]. The idea is to consider the functions $\theta_{r^{(k)}}(v_k) := |\kappa\, v_k^H r_k - X_k|^2$, whose expectation is $\theta_k(v_k)$, and to replace the gradient in (31) by its instantaneous estimate:
$$v_k(t + 1) = v_k(t) - \delta_k(t)\, \nabla \theta_{r^{(k)}(t)}(v_k(t)). \tag{32}$$
Then, under some conditions on the estimation error and for any $v_k(0) \in \mathbb{C}^M$, the algorithm converges to $v_k^*$ (in some probabilistic sense), provided that the step size $\delta_k(t) > 0$ with $\lim_{t \to \infty} \delta_k(t) = 0$ and $\sum_{t=0}^{\infty} \delta_k(t) = +\infty$ is chosen suitably [26].
Now combining these two ingredients leads to the following joint power control and receive beamforming algorithm. At the beginning of every frame, $s(0)$ and $V(0)$ are set equal to the current transmit powers and receive beamformers. Then, all users concurrently execute $N \geq 1$ updates of their transmit powers and receive beamformers. The $n$th update consists of the following intermediate steps.
(i) For fixed $V(n)$ and some $L \geq 1$, each user $k \in \mathcal{K}$ generates a sequence $\{s_k^{(l)}\}_{l=1}^{L}$ by carrying out (30) and defines $s_k(n + 1) = s_k^{(L)}$.
(ii) For some $L' \geq 1$ and with $s = s(n + 1)$, each user $k \in \mathcal{K}$ executes $L'$ iterations of the algorithm (32) to obtain the sequence $\{v_k^{(l)}\}_{l=1}^{L'}$ and defines $v_k(n + 1) = v_k^{(L')}$.
The convergence of the algorithm (in some probabilistic sense) strongly depends on the choice of the step sizes in (30) and (32) as well as on the properties of the estimation errors in (30) and (32). However, we point out that the algorithm is motivated by the following observation. If the estimates in (30) are known perfectly, meaning that we can use $\delta_p(t) = \delta_p$ for sufficiently small $\delta_p > 0$, and (31) is used instead of (32), then the sequence $\{(s(n), V(n))\}$ generated by the resulting algorithm converges to a stationary point. This is because, under this assumption, (30) and (31) are both monotonic, and hence we have (for all $n \in \mathbb{N}_0$)
$$G(p(n), V(n)) \leq G(p(n + 1), V(n)) \leq G(p(n + 1), V(n + 1)).$$
This implies that the sequence $\{G(p(n), V(n))\}$ is monotonically increasing, provided that the step sizes are sufficiently small. Moreover, it is bounded since
$$G(p(n), V(n)) \leq \max_{(p, V) \in \mathcal{P} \times \mathcal{V}} G(p, V) < +\infty.$$
Therefore, the algorithm converges to a stationary point. In addition, verifying the second-order sufficiency conditions would show that this stationary point is also a local maximizer for the problem (2).
Due to scarce resources in wireless networks, it is reasonable to choose the number of updates $N = 1$ in every frame. In addition, instead of transmitting pilot signals in the intermediate step (ii), the optimal receive beamformers can be estimated during the data transmission using some blind estimation method (see [38] and references therein). So, at the beginning of every frame, the step (i) is executed only once. Then, the resulting transmit powers are used for data transmission. During this time, the receive beamformers are updated online after each transmitted symbol. However, numerical experiments suggest that the scheme should not rely exclusively on blind methods if the optimal receivers are to be estimated with sufficient accuracy.
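The noise-free version of the alternating scheme can be sketched in a few lines. This is an illustration under several assumptions that are not part of the distributed algorithm itself: M = 2 antennas, identical effective transmit vectors at all receivers, the utility U(x) = −1/x, exact gradients, a closed-form optimal beamformer in place of the pilot-based iteration, and a simple backtracking safeguard that halves the step until the utility does not decrease. All names and numbers are hypothetical.

```python
import math

def herm(a, b):
    """Inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def solve2(A, b):
    """Solve a 2x2 complex linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def z_matrix(p, u, sigma2, skip):
    """Z_k(p) = sigma2 * I + sum over l != k of p_l u_l u_l^H (M = 2)."""
    Z = [[sigma2 + 0j, 0j], [0j, sigma2 + 0j]]
    for l in range(len(p)):
        if l != skip:
            for i in range(2):
                for j in range(2):
                    Z[i][j] += p[l] * u[l][i] * u[l][j].conjugate()
    return Z

def unit(v):
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v]

def sir(k, p, v, u, sigma2):
    """SIR of user k for a unit-norm receive beamformer v."""
    itf = sum(p[l] * abs(herm(v, u[l])) ** 2 for l in range(len(p)) if l != k)
    return p[k] * abs(herm(v, u[k])) ** 2 / (itf + sigma2)

def G(s, V, u, w, sigma2, U):
    p = [math.exp(x) for x in s]
    return sum(w[k] * U(sir(k, p, V[k], u, sigma2)) for k in range(len(s)))

def grad_power(s, V, u, w, sigma2, dU):
    """Gradient of G(e^s, V) with respect to s for fixed beamformers V."""
    p = [math.exp(x) for x in s]
    K = len(s)
    g = [0.0] * K
    for j in range(K):
        x = sir(j, p, V[j], u, sigma2)
        denom = p[j] * abs(herm(V[j], u[j])) ** 2 / x    # interference + noise
        g[j] += w[j] * dU(x) * x
        for k in range(K):
            if k != j:
                g[k] -= w[j] * dU(x) * x * p[k] * abs(herm(V[j], u[k])) ** 2 / denom
    return g

def alternate(s, V, u, w, sigma2, log_pmax, U, dU, n_iter=20, delta0=0.5):
    history = [G(s, V, u, w, sigma2, U)]
    for _n in range(n_iter):
        g, delta = grad_power(s, V, u, w, sigma2, dU), delta0
        for _b in range(30):                             # backtracking safeguard
            trial = [min(sk + delta * gk, cap)
                     for sk, gk, cap in zip(s, g, log_pmax)]
            if G(trial, V, u, w, sigma2, U) >= history[-1]:
                s = trial
                break
            delta /= 2
        p = [math.exp(x) for x in s]
        # Beamformer step: v_k proportional to Z_k^{-1} u_k, normalized.
        V = [unit(solve2(z_matrix(p, u, sigma2, k), u[k])) for k in range(len(s))]
        history.append(G(s, V, u, w, sigma2, U))
    return s, V, history

u = [[1 + 0j, 0.2j], [0.5 + 0j, 0.5j], [0.1 - 0.3j, 0.8 + 0j]]
log_pmax = [math.log(10.0)] * 3
V0 = [unit(uk) for uk in u]                              # matched-filter start
s_end, V_end, history = alternate(
    [0.0, 0.0, 0.0], V0, u, [1.0, 1.0, 1.0], 0.1, log_pmax,
    U=lambda x: -1.0 / x, dU=lambda x: 1.0 / x ** 2)
```

Because the power step is only accepted when it does not decrease the utility and the beamformer step maximizes each SIR individually, the recorded utility values in `history` form a nondecreasing sequence, mirroring the monotonicity argument above.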

5.1. Stochastic Approximation View.
As already mentioned, estimation errors and other distorting factors such as quantization noise occur in real-world networks. The interesting question is what impact these noisy measurements have on the convergence properties: does the proposed algorithm still converge and, if so, under what conditions? In the presence of such uncertainties, the proposed algorithm has to be analyzed in the context of stochastic approximation theory.
In the following we give several insights. The topic is too broad to be discussed in full detail here, and we refer to [37] as a comprehensive reference. We assume that the estimated gradient component $\hat{\nabla}_k F_{V(t)}(s(t))$ is a random variable of the form
$$\hat{\nabla}_k F_{V(t)}(s(t)) = \nabla_k F_{V(t)}(s(t)) + M_k(t),$$
where $M_k(t)$, $k \in \mathcal{K}$, is the estimation noise process that fulfills the following conditions:
(A.4) The estimation noise process depends on the receiver noise process, which is assumed to be a martingale difference that is uncorrelated with the transmit symbols and has a finite variance.
(A.5) The estimation noise is zero mean and exogenous, in the sense that $M_k(t)$, $k \in \mathcal{K}$, is independent of the iterate value.
Under these two conditions, one can deal with the estimation noise by applying a diminishing step size sequence that satisfies $\delta_p(t) > 0$ with $\lim_{t \to \infty} \delta_p(t) = 0$ and $\sum_{t=0}^{\infty} \delta_p(t) = +\infty$. A typical choice is, for instance, $\delta_p(t) = c\,t^{-y}$ for some $c > 0$ and $y \in (0, 1]$. The choice of the step size is central to the effectiveness of the algorithm, as is shown by simulations in the next section.
In the previous algorithm the powers and beamformers are updated in parallel, meaning that the power control algorithm does not wait for the convergence of the receive beamformers and vice versa. The convergence of this practical stochastic algorithm is therefore verified only by the simulations presented in the following section. In addition, note that condition (A.5) is not necessarily fulfilled by the distributed power control algorithm. Thus the estimates $\hat{\nabla}_k F_{V(t)}(s(t))$ may be biased by some $b_k(t)$, meaning that
$$\mathbb{E}\bigl[\hat{\nabla}_k F_{V(t)}(s(t))\bigr] = \nabla_k F_{V(t)}(s(t)) + b_k(t).$$
Simulation results indicate that the algorithm still converges to a contraction region around the optimal point provided that the bias is bounded by a scaled version of the true gradients.
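The role of the step size parameters can be illustrated with a small helper. The sketch below uses the form c/(t + 1)^y, which is defined for t = 0 and is the form used in the simulations; the function name `step_size` and the parameter values are chosen here for illustration only.

```python
import math

def step_size(t, c=1.0, y=0.5):
    """Diminishing step size c / (t + 1)^y for t = 0, 1, 2, ..."""
    if c <= 0 or not 0 < y <= 1:
        raise ValueError("require c > 0 and 0 < y <= 1")
    return c / (t + 1) ** y

# Slowly decaying sequence (y = 0.5) with a large initial step.
steps = [step_size(t, c=30.0, y=0.5) for t in range(1000)]

# Even for the fastest admissible decay (y = 1) the partial sums keep
# growing, roughly like log(T), so the sum of all steps is unbounded.
partial_sum = sum(step_size(t, c=1.0, y=1.0) for t in range(1000))
```

A smaller exponent y keeps the steps large for longer, which speeds up convergence but leaves more noise in the iterates; a larger y averages the noise more aggressively at the cost of slower progress.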

6.1. Influence of Step Size Control. In the following, we illustrate the convergence behavior of the proposed scheme for $U(x) = -1/x$, $x > 0$, and a random channel realization. We consider a wireless system with $M = 2$ transmit and receive antennas and $K = 4$ users operating at an SNR level of 30 dB. The weight vector is $w = 1$. The measurement noise on the gradient is assumed to be an independent zero-mean Gaussian random variable (which thus fulfills the conditions of a martingale difference noise) whose variance $\sigma_z^2(t)$ depends on $t$ and is 10 percent of the absolute gradient value. We have $L = 1$ and $L' = 16$ steps in (i) and (ii), respectively. Hence, during each iteration step $n$, the algorithm performs one power control step and estimates the beamformers using 16 pilot symbols. The diminishing step sizes for the intermediate steps are $\delta_p(t) = c_1/(t + 1)^y$ and $\delta_k(t) = c_2/(t + 1)^y$ for some positive constants $c_1$, $c_2$ and some exponent $y \in (0, 1]$. Figure 1 depicts the aggregate utility, the mean square error of the SIR, and the SIR for two users over the number of iterations $n$ for different values of $y$ to show the influence of the diminishing step size. If the step size vanishes fast, the algorithm converges much more slowly than with a slowly decreasing step size. However, the behavior is very smooth, with nearly no oscillations, in contrast to a slowly decreasing step size. Figure 2 depicts the aggregate utility, the mean square error of the SIR, and the SIR for two users over the number of iterations $n$ for different values of $c_1$ and $c_2$ and a fixed $y = 0.5$ to show the influence of the initial step size values. Here a larger (but still sufficiently small) initial step size leads to faster but oscillating convergence, whereas smaller initial step sizes lead to slow but smooth convergence.
In summary, it is important that the initial step sizes $c_1$ and $c_2$ are sufficiently small to ensure that the algorithm does not diverge. In addition, the step sizes should not decay too quickly (that is, the exponent $y$ should not be too large), since otherwise the convergence becomes very slow. In a dynamic environment where the channel changes over time, $y$ should be chosen such that the algorithm is able to follow the channel changes. The price for this is a more oscillating behavior. Finally, note that the length of the pilot sequences also depends on the number of users because the link-specific pilot sequence is typically a pseudo-noise sequence with good autocorrelation properties.

6.2. Influence of Biased Gradient Estimators.
In Figure 3 a convergence example is depicted for the case that the estimates $\hat{\nabla}_k F_{V(t)}(s(t))$ are biased by some $b_k(t)$. Further independent simulations suggest that the proposed algorithm converges to a contraction region around the optimal point if the bias is small enough. Otherwise the algorithm may diverge. However, the conditions on the bias that ensure convergence to a contraction region remain an open question.
6.3. Comparison with Pure Utility-Based Power Control. In this last section, we compare utility-based power control with joint utility-based power control and receive beamforming. In Figure 4 the maximum and minimum SIRs are depicted as a function of $\alpha$, which represents the concavity of the chosen utility function. The figures show that a significant performance gain can be achieved by a joint optimization. Note that in this simulation example only a total throughput of 4.3 can be supported if the users transmit with maximum power and receive with a filter that is matched to the channel. In addition, the simulations confirm that with increasing concavity ($\alpha$) the utility-based resource allocation strategy achieves fairness at the expense of a decreasing throughput performance. For $\alpha \to \infty$, max-min fairness is achieved.
Finally, in Figure 5 we show the performance gains that can be achieved in an exemplary wireless mesh network (downlink) where the base stations are connected wirelessly to an access point (AP). More precisely, the total network throughput and the delay performance are depicted over the arrival rate. For a fixed routing and a fixed scheduling strategy, we compare the static resource allocation, which adapts the beamformers to the channel and transmits with the maximum available transmit powers, with utility-based power control and with joint utility-based power control and receive beamforming for $U(x) = \log(x)$. The weights are chosen to represent the queue differences in order to support low delays. As can be seen, the joint resource allocation outperforms the utility-based power control.

Conclusions
We proposed a framework for joint power control and receive beamforming in wireless networks, with the goal of maximizing some aggregate utility function of the SIRs. The paper is a step toward a better understanding of the problem of utility-based power control and receive beamforming. In particular, we give insights into practical implementation issues and illustrate the effects of noisy estimates (unbiased and biased) as well as the influence of step size control on the convergence properties. However, the interesting theoretical issue of global convergence remains unresolved.

Figure 1: Convergence behavior of the distributed algorithm in case of noisy measurements for different values of $y$, with $c_1 = 30$ and $c_2 = 0.5$.

Figure 2: Convergence behavior of the distributed algorithm in case of noisy measurements for different values of the step sizes $c_1$, $c_2$ and $y = 0.5$.

Figure 3: Convergence behavior of the distributed algorithm in case of noisy measurements with $y = 0.4$, $c_1 = 40$, $c_2 = 1$ for the biased and unbiased case.

6.3. Comparison with Pure Utility-Based Power Control. In this last section, we compare utility-based power control with

Figure 4: Maximum and minimum SIR (a), throughput (b) over α for utility-based power control (dashed lines) and utility-based joint power and receiver control (solid lines).

Figure 5: Throughput (a) and delay (b) performance as a function of the arrival rate for different resource allocation schemes and a given mesh network (c).
In the SIR expression, $|\langle v_k, u_l\rangle|^2$ is the attenuation of the power from the transmitter of user $l$ to the receiver of user $k$, where $\langle v_k, u_l\rangle$ denotes the inner product of the vectors $v_k$ and $u_l$. Note that the SIR of user $k$ depends only on the $k$th receive beamformer $v_k$.