The Waterfilling Game-Theoretical Framework for Distributed Wireless Network Information Flow

We present a general game-theoretical framework for the resource allocation problem in the downlink scenario of distributed wireless small-cell networks, where multiple access points (APs) or small base stations send independent coded network information to multiple mobile terminals (MTs) through orthogonal frequency division multiplexing (OFDM) channels. In such a game-theoretic study, the central question is whether a Nash equilibrium (NE) exists, and if so, whether the network operates efficiently at the NE. For independent continuous fading channels, we prove that the probability of a unique NE existing in the game is equal to 1. We show that this resource allocation problem can be studied as a potential game, and hence efficiently solved. We discuss the convergence of the waterfilling-based best-response algorithm. Finally, numerical results are provided to investigate the inefficiency of the NE.


I. INTRODUCTION
Recently, there has been increasing interest in small-cell networks, where people can access the Internet over many different APs or small base stations (also known as out-door femtocells or small cells [1,2]). Typically, in such a wireless network, several femto-cells are installed out-door on a given backbone network (with heterogeneous links such as fiber, ADSL, or power line) to provide signal coverage in dense environments. The general idea is to exploit the heterogeneous wired infrastructure without any new cabling and to provide high wireless data rates to the users through self-organized mechanisms. Unfortunately, if users are connected to a single out-door femto-cell, they may suffer from low throughput from time to time due to the limited backhaul capacity (some high-speed wireless femto-cells access the Internet through low-capacity DSL or power line links, e.g., 1 Mbps), despite the presence of a high-speed wireless link. As a result, users need to access different APs in the nearby femto-cells in order to aggregate the capacity of the backhaul links. An interesting concept is to consider the nearby femto-cells as a virtual femto-cell group, whose backhaul capacity is the sum of the access capacities of all the femto-cells in the group (as shown in Fig. 1). The issue of load balancing [3] in the wired network (and how the different packets are split with respect to the backhaul capacity from a main decentralized scheduler), although important, is not dealt with in this contribution, and we will suppose that perfect load balancing holds.
In this paper, we focus on the resource allocation problem for the downlink scenario (from femto-cell group to MTs) using OFDM air-interface [4] over a number of dedicated sub-channels.
We assume that all these femto-cells get independent packets (network coding is applied at the source) from the Internet via their backhauls, and send them physically to each MT in a distributed manner. Usually, in this situation each femto-cell needs to decide how to distribute the total available transmit power over N downlink sub-channels (sub-carriers or clusters of sub-carriers), i.e., should it allocate all its power to a single sub-channel, spread the power over all the sub-channels, or choose some subset of sub-channels on which to transmit? Traditionally, this resource allocation problem is considered as a global optimization problem.
It is well known that the problem of maximizing a single user's sum-rate (corresponding to the Shannon transmission rate [5]) over all the sub-channels is a classic convex optimization problem [6], whose solution is "waterfilling" [7,8,9]. The multi-user version of this problem is a non-convex optimization problem whose exact solution is generally difficult to find, since it may have several local optima [10,11,12]. Moreover, solving the multi-user problem usually requires a central computing resource (a scheduler with comprehensive knowledge of the channel state information (CSI)) to globally manage the system resources. This process is centralized: it involves feedback and overhead communication whose load scales linearly with the number of transmitters and receivers in the network.
It is certainly possible to improve the useful data transmissions by reducing transmissions of insignificant or unnecessary feedback information. In this direction, a selective multi-user diversity algorithm has been introduced in [13]. The key idea is to find a suitable trade-off between the network performance and the feedback load. Nevertheless, this partial feedback approach still has its own limitations in network scaling problems. As wireless networks become more and more dense, it will be increasingly difficult for the global optimization approach to meet the needs of future wireless communication development.
Over the last ten years, increased research interest has been given to self-organizing wireless networks in which nodes allocate resources in a decentralized manner [1]. Non-cooperative game theory [14], borrowed from many economic applications [15], provides an alternative solution by considering every femto-cell as a selfish player who "plays" the game by rationally choosing its transmit power levels. In this respect, it is important to study the NE [16] (the solution concept of non-cooperative games) because it represents a predictable outcome for a self-organizing network.
It is worth mentioning that a special case of this game has been studied in [10], where the authors show that an infinite number of NE exist under their specific channel gain assumptions. However, up to now, the characterization of NE in the wireless setting is still not clear, as it depends on the channel fading statistics and the number of players. The goal of this paper is therefore to address this fundamental problem as well as the convergence issue.
The paper is organized as follows. In Section II, we introduce the problem formulation. In Section III, we study the existence and uniqueness of NE and we characterize the NE set. In Section IV, we study the problem as a potential game. Finally, numerical results are provided in Section V, followed by conclusions in Section VI.

A. Multi-user OFDM model
We consider an OFDM downlink scenario with $M$ non-cooperative APs simultaneously sending information to $N$ MTs over $N$ sub-channels (as shown in Fig. 2). We assume that each sub-channel is pre-assigned to a different MT by a scheduler, i.e., each MT receives signals only on its assigned sub-channel. Without loss of generality, throughout this paper we assign sub-channel $n$ to MT $n$, for $n = 1, \ldots, N$. This implies that the MT set and the sub-channel set share the same index in our context.
October 2, 2009 DRAFT
Furthermore, we assume that the sub-channels are block fading, i.e., the channel fading coefficients are constant during the transmission of a codeword or block. Within a given transmission block, let $G \in \mathbb{R}_{++}^{M \times N}$ be the channel gain matrix whose $(m, n)$ entry is $g_{m,n}$, the channel gain of the link from AP $m$ to MT $n$ on the pre-assigned sub-channel $n$. We assume that $G$ is a random $M \times N$ matrix with i.i.d. (due to independent fading) entries. We further assume that the distribution function of each positive entry $g_{m,n}$ is a continuous function.
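By assuming that the MTs use low-complexity single-user decoders [7], we can write the signal-to-interference-plus-noise ratio (SINR) of the signal from AP $m$ received at MT $n$ as
$$\mathrm{SINR}_{m,n} = \frac{g_{m,n}\, p_{m,n}}{\sigma^2 + \sum_{j=1, j \neq m}^{M} g_{j,n}\, p_{j,n}}$$
where $p_{m,n}$ is the power transmitted from AP $m$ on sub-channel $n$, and $\sigma^2$ is the variance of the white Gaussian noise. For AP $m$, write the maximum achievable sum-rate as [7]
$$u_m = \sum_{n=1}^{N} \log\left(1 + \mathrm{SINR}_{m,n}\right)$$
and the power constraint as
$$\sum_{n=1}^{N} p_{m,n} \le P_m^{\max}$$
where $P_m^{\max}$ is the maximum transmit power of AP $m$ and $P_m^{\max} > 0$, $\forall m$.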
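As an illustration of the rate model of this subsection, the following Python sketch evaluates the SINR and the per-AP sum-rate for a given power allocation. It assumes that the SINR of AP m on sub-channel n is the received power g[m][n]*p[m][n] divided by the noise variance plus the interference from all other APs, and that rates are measured with the natural logarithm (the base is an assumption; another base only rescales the rates). The function names `sinr` and `sum_rate` and the toy values are hypothetical.

```python
import math

def sinr(G, P, sigma2, m, n):
    """SINR of the signal from AP m at MT n, treating other APs as noise."""
    interference = sum(G[j][n] * P[j][n] for j in range(len(G)) if j != m)
    return G[m][n] * P[m][n] / (sigma2 + interference)

def sum_rate(G, P, sigma2, m):
    """Achievable sum-rate of AP m over all sub-channels (in nats)."""
    n_sub = len(G[0])
    return sum(math.log(1.0 + sinr(G, P, sigma2, m, n)) for n in range(n_sub))

# Toy example with M = 2 APs and N = 2 sub-channels (hypothetical values).
G = [[1.0, 0.5], [0.2, 2.0]]   # G[m][n]: gain from AP m on sub-channel n
P = [[0.5, 0.5], [0.0, 1.0]]   # P[m][n]: power of AP m on sub-channel n
rates = [sum_rate(G, P, 1.0, m) for m in range(len(G))]
```

Each row of `P` sums to at most the power budget of the corresponding AP, which is the only feasibility requirement of the model.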

B. As a non-cooperative game
Here, we introduce a non-cooperative strategic game for this OFDM model.

A. Definition of Nash equilibrium
In such a non-cooperative game setting, each player m acts selfishly, aiming to maximize its own payoff given the other players' strategies, regardless of the impact its strategy may have on other players and thus on the overall performance. The process of such selfish behavior usually results in a Nash equilibrium, a common solution concept for non-cooperative games [16].
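Definition 3.1: A power strategy profile $p^\star$ is a Nash equilibrium if, for every $m \in \mathcal{M}$,
$$u_m(p_m^\star, p_{-m}^\star) \ge u_m(p_m, p_{-m}^\star) \tag{3}$$
for all $p_m \in \mathcal{P}_m$.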
From above, it is clear that a NE simply represents a particular "steady" state of a system, in the sense that, once reached, no player has any motivation to unilaterally deviate from it.
In many cases, the NE represents the result of learning and evolution of all the participants.
Therefore, it becomes fundamentally important to predict and characterize such point(s) from the system design perspective of wireless networks. In the rest of the paper, we will focus on characterizing such point(s). The following questions will be addressed one by one:
• Does a NE exist in our game?
• Is the NE unique, or do there exist multiple NE points?
• How can a NE be reached if it exists?
• How does the system perform at the NE?

B. Existence and uniqueness of Nash equilibrium
It is known that, in general, a NE point does not necessarily exist. Therefore, we first investigate the existence of NE in our game, which is established by Theorem 3.2.
Once existence is established, it is natural to consider the characterization of the equilibrium set. The uniqueness of an equilibrium is quite a desirable property if we wish to predict the network behavior. Unfortunately, many game problems have more than one equilibrium point [15]. As an example of a system with infinitely many NE, we could consider a special instance of our game, namely the symmetric waterfilling game. This case is studied in [10] and is characterized by equal cross-talk channel coefficients. Hence, in general, our game G does not have a unique equilibrium. Nevertheless, under the assumption of i.i.d. continuous entries in G, we will show that the probability of having a unique Nash equilibrium is equal to 1.
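For any player $m$, given all other players' strategy profile $p_{-m}$, the best-response power strategy $p_m$ can be found by solving the following maximization problem,
$$\max_{p_m} \; u_m(p_m, p_{-m}) \quad \text{s.t.} \quad \sum_{n=1}^{N} p_{m,n} \le P_m^{\max}, \quad p_{m,n} \ge 0, \; \forall n \tag{4}$$
which is a convex optimization problem, since the objective function $u_m$ is concave in $p_m$ and the constraint set is convex. Therefore, the Karush-Kuhn-Tucker (KKT) conditions are sufficient and necessary for optimality [6]. The KKT conditions are derived from the Lagrangian for each player $m$, and are given by
$$\frac{g_{m,n}}{\sigma^2 + \sum_{j=1}^{M} g_{j,n}\, p_{j,n}} - \lambda_m + \nu_{m,n} = 0, \quad \forall n \tag{5}$$
$$\lambda_m \left( \sum_{n=1}^{N} p_{m,n} - P_m^{\max} \right) = 0 \tag{6}$$
$$\nu_{m,n}\, p_{m,n} = 0, \quad \forall n \tag{7}$$
where $\lambda_m \ge 0$, $\nu_{m,n} \ge 0$, $\forall m\, \forall n$, are dual variables associated with the power constraint and transmit power positivity, respectively. The solution to (5)-(7) is known as waterfilling [7]
$$p_{m,n} = \left( \frac{1}{\lambda_m} - \frac{\sigma^2 + \sum_{j \neq m} g_{j,n}\, p_{j,n}}{g_{m,n}} \right)^{+} \tag{8}$$
where $(x)^{+} = \max\{0, x\}$ and $\lambda_m$ satisfies
$$\sum_{n=1}^{N} \left( \frac{1}{\lambda_m} - \frac{\sigma^2 + \sum_{j \neq m} g_{j,n}\, p_{j,n}}{g_{m,n}} \right)^{+} = P_m^{\max}. \tag{9}$$
Before analyzing the equilibrium set, we derive the following theorem:
Theorem 3.3: A power strategy profile $p$ is a Nash equilibrium of the game G if and only if each player's power $p_m$ is the single-player waterfilling solution obtained by treating all other players' signals as noise, i.e., if and only if the following conditions hold for all $m$ and $n$:
$$\frac{g_{m,n}}{\sigma^2 + \sum_{j=1}^{M} g_{j,n}\, p_{j,n}} - \lambda_m + \nu_{m,n} = 0 \tag{10}$$
$$\lambda_m \left( \sum_{n=1}^{N} p_{m,n} - P_m^{\max} \right) = 0 \tag{11}$$
$$\nu_{m,n}\, p_{m,n} = 0 \tag{12}$$
with $\lambda_m \ge 0$ and $\nu_{m,n} \ge 0$.
The proof can be found in Appendix A.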
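The single-player waterfilling best response can be computed exactly with a short sorting procedure. The Python sketch below is a minimal illustration, assuming the standard form p_n = max(0, mu - zeta_n / g_n), where mu = 1/lambda is the water level, zeta_n is the noise plus the interference from the other players on sub-channel n, and the budget is fully used. The function name `waterfill` and its argument names are hypothetical.

```python
def waterfill(g, zeta, p_max):
    """Single-player waterfilling over N sub-channels.

    g[n]    : own channel gain on sub-channel n (positive)
    zeta[n] : noise plus interference on sub-channel n (positive)
    p_max   : total power budget (positive)
    Returns the list of powers p[n] >= 0 with sum(p) == p_max.
    """
    # "Floor" heights zeta_n / g_n, sorted from best to worst sub-channel.
    floors = sorted(z / h for z, h in zip(zeta, g))
    # Try to use the k best sub-channels, starting with all of them.
    for k in range(len(floors), 0, -1):
        mu = (p_max + sum(floors[:k])) / k      # candidate water level
        if mu > floors[k - 1]:                  # all k sub-channels are active
            break
    return [max(0.0, mu - z / h) for z, h in zip(zeta, g)]

# Two sub-channels with equal gains but different interference levels.
p = waterfill([1.0, 1.0], [1.0, 3.0], 1.0)   # all power goes to sub-channel 0
```

The loop always terminates with a valid water level, because for k = 1 the candidate level exceeds the lowest floor whenever the budget is positive.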
From (10), it is easy to find $\lambda_m > 0$, since $\nu_{m,n} \ge 0$, $g_{m,n} > 0$, $\forall m\, \forall n$. From (11), we have
$$\sum_{n=1}^{N} p_{m,n} = P_m^{\max}, \quad \forall m. \tag{13}$$
This equation implies that, at the NE, all APs must use all of their available power. However, it is still difficult to find an analytical solution for (10)-(12), since the system consisting of (8) and (9) is nonlinear. To simplify this problem, we consider linear equations instead of nonlinear ones. The following lemma provides a key step in that direction.
Lemma 3.4: For any realization of the channel matrix $G$, there exist unique values of the Lagrange dual variables $\lambda$ and $\nu$ for any Nash equilibrium of the game G. Furthermore, there is a unique vector $s = [s_1, \ldots, s_N]^T$ such that any vector $p$ corresponding to a Nash equilibrium satisfies
$$\sum_{m=1}^{M} g_{m,n}\, p_{m,n} = s_n, \quad n = 1, \ldots, N. \tag{14}$$
The proof can be found in Appendix B.
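Now, let $Z$ be the following $(M + N) \times MN$ matrix:
$$Z = \begin{bmatrix} I_M & I_M & \cdots & I_M \\ g_1^T & 0_M^T & \cdots & 0_M^T \\ 0_M^T & g_2^T & \cdots & 0_M^T \\ \vdots & \vdots & \ddots & \vdots \\ 0_M^T & 0_M^T & \cdots & g_N^T \end{bmatrix}$$
where $g_n$ is the $n$-th column of $G$, $I_M$ is the $M \times M$ identity matrix, and $0_M$ is the zero vector of length $M$. Let $c$ be the following vector of length $M + N$:
$$c = [P_1^{\max}, \ldots, P_M^{\max}, s_1, \ldots, s_N]^T.$$
Then, (13) and (14) can be written in the form of the linear matrix equation
$$Zp = c \tag{15}$$
where $p = [p_{1,1}, \ldots, p_{M,1}, \ldots, p_{1,N}, \ldots, p_{M,N}]^T$ stacks the powers sub-channel by sub-channel.
Define the following sets
Lemma 3.6: 2)
The proofs of Lemma 3.5 and 3.6 can be found in Appendix C and D, respectively.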
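To make the linear form concrete, the following Python sketch builds a matrix Z and checks that Zp reproduces the per-AP total powers followed by the per-sub-channel received sums s_n. It assumes, as an illustration, that the M power rows come first, followed by the N sub-channel rows, and that p is stacked sub-channel by sub-channel. The helper names `build_Z`, `stack`, and `matvec` are hypothetical.

```python
def build_Z(G):
    """(M + N) x MN matrix: M rows [I_M ... I_M], then N rows holding g_n^T."""
    M, N = len(G), len(G[0])
    rows = []
    for m in range(M):            # row m sums the powers of AP m over n
        rows.append([1.0 if i % M == m else 0.0 for i in range(M * N)])
    for n in range(N):            # row M + n weights sub-channel n by g_n
        rows.append([G[i % M][n] if i // M == n else 0.0 for i in range(M * N)])
    return rows

def stack(P):
    """Stack P[m][n] as [p_{1,1}..p_{M,1}, ..., p_{1,N}..p_{M,N}]."""
    M, N = len(P), len(P[0])
    return [P[m][n] for n in range(N) for m in range(M)]

def matvec(Z, p):
    """Plain matrix-vector product."""
    return [sum(z * x for z, x in zip(row, p)) for row in Z]

G = [[1.0, 0.5], [0.2, 2.0]]      # hypothetical gains
P = [[0.5, 0.5], [0.0, 1.0]]      # hypothetical power allocation
c = matvec(build_Z(G), stack(P))  # [total powers..., received sums s_n...]
```

For this toy allocation the first two entries of `c` are the total powers of the two APs and the last two are the received sums on the two sub-channels.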
Based on the results from Lemma 3.4 to Lemma 3.6, we derive the following theorem.
Theorem 3.7: For any realization of a random $M \times N$ channel gain matrix $G$ with i.i.d. continuous entries, the probability that a unique Nash equilibrium exists in the game G is equal to 1.
The proof can be found in Appendix E.
Thus, from Theorem 3.2 and 3.7, we have established the existence and uniqueness of NE in our game G.

IV. CONVERGENCE TO THE NASH EQUILIBRIUM
Equilibrium is meaningful in practice only if it is reachable from non-equilibrium states. In fact, there is no reason to expect a system to operate initially at equilibrium. Convergence to equilibrium is in general a much harder problem, which is usually related to the analysis of synchronous or asynchronous update mechanisms (see, e.g., [20,21] for interference channels).

A. Potential game approaches
Fortunately, our game G can be studied as a potential game¹. Potential games are known to have nice properties for the convergence of the best-response or greedy algorithms to the equilibrium.
All the potential games admit a potential function.This potential function is a unique global function that all the players optimize when they optimize their own utility functions.Thus, the set of pure Nash equilibria can be found by simply locating the local optima of the potential function.
Such games have received increasing attention recently in wireless networks [24,25,26], since the existence of potential function enables the design of fully distributed algorithms for resource allocation problems.
In fact, there are various notions of potential games (with different definitions related to slightly different properties for the existence and convergence of equilibrium), such as exact potential, weighted potential, ordinal potential, generalized ordinal potential, pseudo potential, etc. Here we only give the definition of exact potential games, the notion most closely related to our game.

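Definition 4.1: A strategic game G is called an exact potential game if there exists a function $v : \mathcal{P} \to \mathbb{R}$ such that, for every $m \in \mathcal{M}$,
$$u_m(p_m, p_{-m}) - u_m(q_m, p_{-m}) = v(p_m, p_{-m}) - v(q_m, p_{-m}) \tag{18}$$
for all $(p_m, p_{-m}), (q_m, p_{-m}) \in \mathcal{P}$. The function $v$ is called an exact potential of the game.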
Obviously, equation (18) implies that the NE of the original game G must coincide with the NE of the potential game, which is defined as a new game taking the potential function $v$ as the utility function of all the players. Therefore, we can transform the non-cooperative strategic game G into a potential game if we can find a potential function that quantifies the difference in each player's utility function due to a unilateral deviation of that player, as indicated in (18).
Taking inspiration from the result derived in the single-channel case [25], it is not difficult to see that in our multi-channel case, G is an exact potential game with the following potential function:
$$v^\star(p) = \sum_{n=1}^{N} \log\left( \sigma^2 + \sum_{m=1}^{M} g_{m,n}\, p_{m,n} \right). \tag{19}$$
¹ The notion of potential games was first used for games in strategic form by Rosenthal (1973) [19], and later generalized and summarized by Monderer (1996) [22].
Denote by $\zeta_{m,n}$ the term $\sigma^2 + \sum_{j \neq m} g_{j,n}\, p_{j,n}$, which represents the aggregate interference plus noise to user $m$'s signal on sub-channel $n$. Now, the potential function $v^\star$ is a common utility for all players in the potential game.
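The exact potential property can be checked numerically on a toy instance. The Python sketch below assumes the potential v(p) = sum_n log(sigma^2 + sum_m g_{m,n} p_{m,n}) and sum-rate utilities with the natural logarithm, and verifies that a unilateral deviation of one player changes its utility and the potential by the same amount. The function names and the numerical values are hypothetical.

```python
import math

def utility(G, P, sigma2, m):
    """Sum-rate of player m, treating the other players as noise."""
    total = 0.0
    for n in range(len(G[0])):
        zeta = sigma2 + sum(G[j][n] * P[j][n] for j in range(len(G)) if j != m)
        total += math.log(1.0 + G[m][n] * P[m][n] / zeta)
    return total

def potential(G, P, sigma2):
    """Candidate exact potential: log of total received power plus noise."""
    return sum(
        math.log(sigma2 + sum(G[m][n] * P[m][n] for m in range(len(G))))
        for n in range(len(G[0]))
    )

G = [[1.0, 0.5], [0.2, 2.0]]
P = [[0.5, 0.5], [0.0, 1.0]]   # original profile
Q = [[1.0, 0.0], [0.0, 1.0]]   # player 0 deviates, player 1 is unchanged

du = utility(G, P, 1.0, 0) - utility(G, Q, 1.0, 0)
dv = potential(G, P, 1.0) - potential(G, Q, 1.0)
# du and dv agree up to floating-point rounding.
```

The two differences agree because the interference-plus-noise term seen by the deviating player does not depend on its own power and therefore cancels.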
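In order to find the single-user best-response in the potential game, one needs to solve the following maximization problem:
$$\max_{p_m} \; \sum_{n=1}^{N} \log\left( \zeta_{m,n} + g_{m,n}\, p_{m,n} \right) \quad \text{s.t.} \quad \sum_{n=1}^{N} p_{m,n} \le P_m^{\max}, \quad p_{m,n} \ge 0, \; \forall n \tag{20}$$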

B. Distributed algorithm and convergence property
Note also that if each AP has complete knowledge of the channel state information, i.e., the matrix $G$ (as considered in Section II), the uniqueness of the Nash equilibrium guarantees that each AP can independently determine, in a decentralized way, the power allocation at the Nash equilibrium. Acquiring information about the whole channel matrix $G$ typically requires a feedback channel from the MTs to the APs to transmit the channel estimates. In this case, each AP can locally run the best-response algorithm described below, which is based on repeated maximization of problem (20), starting from a random point $p_{-m} \in \prod_{j \neq m} \mathcal{P}_j$. However, the structure of problem (20) suggests an alternative approach that reduces the signalling on the feedback channel: the repeated optimization of problem (20) could be performed in a distributed way by feeding back to each AP only its private channel gain $g_m$ and the aggregate interference plus noise $\zeta_m$. Nevertheless, note that such a distributed implementation of the algorithm would lead to a transient phase during which the APs are not transmitting at an equilibrium point. In our numerical results, we ignore the cost of feedback and focus on analyzing the theoretical upper bound.
From the above discussion, we introduce a simple algorithm, based on iterative waterfilling [28], that players can follow to reach the NE. The algorithm, referred to as DPIWF, initializes $p_{m,n} = 0$, $\forall m\, \forall n$, and then repeats the single-player waterfilling update for each player until convergence. In this algorithm, we assume that the same game can be played repeatedly and myopically: in each round, every myopic player (a player with no memory of past game rounds) chooses its best response according to the single-player waterfilling, which depends on the current state of the game. The following theorem shows the convergence and optimality of the algorithm.

Theorem 4.2: The DPIWF algorithm converges to a Nash equilibrium of the OFDM noncooperative game G.
The proof can be found in Appendix F.
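The following Python sketch is one possible reading of such an iterative waterfilling procedure, not a reproduction of the DPIWF listing. It assumes a sequential schedule in which the players update one after another, each applying single-player waterfilling against the current interference, starting from zero power. The function names, the stopping tolerance, and the round limit are hypothetical choices.

```python
def waterfill(g, zeta, p_max):
    """Single-player waterfilling: p_n = max(0, mu - zeta_n / g_n)."""
    floors = sorted(z / h for z, h in zip(zeta, g))
    for k in range(len(floors), 0, -1):
        mu = (p_max + sum(floors[:k])) / k   # water level with k active channels
        if mu > floors[k - 1]:
            break
    return [max(0.0, mu - z / h) for z, h in zip(zeta, g)]

def iterative_waterfilling(G, sigma2, p_max, max_rounds=1000, tol=1e-10):
    """Sequential best-response dynamics; returns the final power matrix."""
    M, N = len(G), len(G[0])
    P = [[0.0] * N for _ in range(M)]        # start from zero power
    for _ in range(max_rounds):
        change = 0.0
        for m in range(M):                   # players update one after another
            zeta = [
                sigma2 + sum(G[j][n] * P[j][n] for j in range(M) if j != m)
                for n in range(N)
            ]
            new = waterfill(G[m], zeta, p_max[m])
            change = max(change, max(abs(a - b) for a, b in zip(new, P[m])))
            P[m] = new
        if change < tol:                     # no player moved: fixed point
            break
    return P

G = [[1.0, 0.5], [0.2, 2.0]]                 # hypothetical gains
P_ne = iterative_waterfilling(G, 1.0, [1.0, 1.0])
```

In this toy instance each AP ends up concentrating its power on the sub-channel where its own gain is strongest, and a further round of updates leaves the allocation unchanged.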
Although the final convergence (in power allocation) of DPIWF is proved, one may wonder whether the convergence behavior of the actual total network rate (the objective function in (21)) coincides with the convergence behavior of the corresponding potential function (19). We will discuss this issue in our simulation part.
A more general discussion about the convergence properties of potential games can be found in [22], where it is shown that every bounded potential game has the approximate finite improvement property (AFIP), i.e., for every $\epsilon > 0$, every $\epsilon$-improvement path is finite. Then, it is obvious that every such finite improvement path of an exact potential game terminates at an $\epsilon$-equilibrium point. In other words, the sequential best-response (players move in turn and always choose a best response) converges to an $\epsilon$-equilibrium independently of the initial point.
Note that this is a very flexible condition for convergence, since the order of play can be deterministic or random and need not be synchronized. It is one of the most interesting properties of potential games, especially for finding the equilibrium in a distributed manner in self-organizing systems.
It is not difficult to see that the simultaneous best-response (at each iteration, all the players choose their best responses simultaneously) does not necessarily converge, due to the "ping-pong" effect generated by myopic players. However, [23] has shown that for infinite pseudo-potential games (a generalization of exact potential games) with convex strategy spaces and single-valued best responses, the sequence of simultaneous best responses (reminiscent of fictitious play) also converges to the equilibrium.
It is interesting to note that for many practical systems with finite transmit power states, similar results still hold for the convergence of the sequential best-response. The only difference is that, in the finite case, the existence of an exact potential function implies the finite improvement property (FIP), and therefore the sequential best-response converges to the exact Nash equilibrium (instead of an $\epsilon$-equilibrium).

V. NUMERICAL EVALUATION
In this part, numerical results are provided to validate our theoretical claims. We consider frequency-selective fading channels with channel matrix $G$ of size $M \times N$, where $M$ is the total number of transmitters (players) and $N$ is the total number of receivers. We assume that the Rayleigh fading channel gains $g_{m,n}$ are i.i.d. among players and across sub-channels. The maximum power constraint for each player $m$ is assumed to be identical and normalized as $P_m^{\max} = 1$.
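A channel realization for such a simulation can be drawn as in the following Python sketch. It assumes, as one common interpretation of Rayleigh fading, that each power gain is g = |h|^2 with h a zero-mean, unit-variance circularly symmetric complex Gaussian coefficient; this normalization is an assumption, as are the function name `rayleigh_gain_matrix` and the seed.

```python
import math
import random

def rayleigh_gain_matrix(M, N, rng):
    """M x N matrix of i.i.d. power gains g = |h|^2 with h ~ CN(0, 1)."""
    def gain():
        # Real and imaginary parts each have variance 1/2.
        re = rng.gauss(0.0, math.sqrt(0.5))
        im = rng.gauss(0.0, math.sqrt(0.5))
        return re * re + im * im
    return [[gain() for _ in range(N)] for _ in range(M)]

rng = random.Random(0)                 # fixed seed for reproducibility
G = rayleigh_gain_matrix(10, 10, rng)  # M = 10 transmitters, N = 10 receivers
```

Under this assumption the gains are exponentially distributed with unit mean, so their distribution is continuous and the entries are positive with probability 1.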
In Fig. 3, we show the convergence behavior of the potential function and of the actual total network rate (we will use the short term "actual rate") when using the proposed DPIWF algorithm for a random channel realization. We set the number of transmitters to $M = 10$ and the number of receivers to $N = 10$. As expected, in both Fig. 3a and Fig. 3b the potential function converges rapidly (at the 4th iteration). In Fig. 3a, the actual rate converges slightly more slowly (at the 6th iteration) and remains monotonically increasing. However, in Fig. 3b, the actual rate eventually converges, but unfortunately it neither increases monotonically nor converges rapidly (it converges at the 34th iteration) compared to its potential function. Note that we use this example in order to show readers that a "defective" convergence (of the actual rate) may happen during the iteration steps of the DPIWF algorithm, whereas (as we will show immediately) the actual rate converges "ideally" in most cases for a random channel gain matrix with i.i.d. Gaussian entries.
In order to measure the performance efficiency of distributed networks operating at the unique NE, we provide here the optimal power allocation strategy in centralized approaches as a target upper bound for the total network rate (which is the transmit sum-rate of all players in the network). We will ignore the performance loss due to the necessary uplink and downlink signalling transmission. The total network rate maximization problem can be formulated as
$$\max_{p} \; \sum_{m=1}^{M} u_m(p) \quad \text{s.t.} \quad \sum_{n=1}^{N} p_{m,n} \le P_m^{\max}, \; \forall m, \quad p_{m,n} \ge 0, \; \forall m\, \forall n \tag{21}$$
which unfortunately is a difficult problem, since the objective function is non-convex in $p$.
However, a relaxation of this optimization problem (see [12]) can be cast as a geometric programming problem [27] and can therefore be transformed into a convex optimization problem.
A low-complexity algorithm was proposed in [12] to solve the dual problem by updating the dual variables through gradient descent. Note that the algorithm always converges, but in a few cases it may converge to a local maximum. We will use this algorithm in our simulations.
In the following part, we will address two main practical questions through numerical results:
1) How does the network performance behave at the unique NE (the decentralized optimality) in comparison to the global optimal solution (the centralized optimality)? More precisely, we are interested in comparing the average total network rate instead of the instantaneous total network rate, i.e., $\bar{u}(M, N)$ is the average total network rate for a network with $M$ transmitters and $N$ receivers.
2) What about the convergence behavior of the actual total network rate when using the DPIWF algorithm? Does it converge rapidly (as in Fig. 3a) in most cases?
Let us consider the first question. In Fig. 4, we compare the average total network rate of both decentralized and centralized networks for two different channel noise levels, $\sigma^2 = 0.1$ and $\sigma^2 = 1$, respectively. Network parameters are selected as follows: the number of transmitters $M \in [1, 25]$, and the number of receivers $N$ takes several representative values, namely 5, 10 and 15. The plots are obtained through Monte-Carlo simulations over $10^4$ realizations of the channel gain matrix $G$. First, we can see in both Fig. 4a and Fig. 4b that the centralized optimality always outperforms the decentralized optimality. Second, for a fixed number of receivers $N$, when we increase the number of transmitters $M$, the performance loss of decentralized systems (compared to centralized systems) becomes more and more apparent. In fact, this phenomenon can be intuitively understood as follows: when there are a great number of selfish players, the hostile competition turns the multi-user communication system into an interference-limited environment, where interference begins to dominate the performance efficiency.
Moreover, we note that in Fig. 4 the average performance of centralized systems is an increasing function of $M$ (for a fixed value of $N$), whereas the average performance of decentralized systems at the NE first increases, then decreases, and finally levels off. For the typical values of $N$, i.e., $N = 5$, 10 and 15, in Fig. 4a ($\sigma^2 = 0.1$) the average performance of decentralized systems is maximized approximately at $M = 4, 9, 14$, respectively; in Fig. 4b ($\sigma^2 = 1$) it is maximized approximately at $M = 6, 11, 16$, respectively. This shows that different noise variances (more generally, different channel conditions) have a different impact on the decentralized system performance. This observation is fundamentally important for improving the spectral efficiency of a distributed multi-user OFDM hot-spot network: for a given area (given the number of receivers $N$ and the current channel condition), there exists an optimal number of hot-spots (denoted as $M^\star$) to be put in the network.
Roughly speaking: when M > M ⋆ , the system is overloaded due to the increase of competition over limited resources; when M < M ⋆ , the system is operated at the unsaturated state, since system resources are not fully exploited.
Let us now consider the second question. In Fig. 5, we show the probability of convergence to the decentralized optimality (NE) within 5 iterations for $\sigma^2 = 0.1$ and $\sigma^2 = 1$, respectively. To be more precise, we define "convergence" as follows: the total network rate exceeds 99% of the final rate. We find that the probability of convergence is quite satisfactory (more than 98.2% in all cases), and that it tends to 1 when $M \gg N$ or $M \ll N$. An interesting observation is that the minimal convergence probability always occurs when $M = N$, regardless of the noise variance $\sigma^2$.
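The convergence criterion can be turned into a small helper that operates on a recorded rate trace. The Python sketch below assumes one particular reading of the criterion: the convergence iteration is the first iteration from which the rate stays at or above 99% of the final rate, which also handles non-monotone traces. The function names, the 1-based iteration index, and the sample traces are hypothetical.

```python
def iterations_to_converge(rates, fraction=0.99):
    """First iteration (1-based) from which the rate stays >= fraction * final."""
    target = fraction * rates[-1]
    k = len(rates)
    # Walk backwards while the preceding iterations already meet the target.
    while k > 1 and rates[k - 2] >= target:
        k -= 1
    return k

def convergence_probability(traces, max_iter=5, fraction=0.99):
    """Fraction of traces that converge within max_iter iterations."""
    hits = sum(1 for r in traces if iterations_to_converge(r, fraction) <= max_iter)
    return hits / len(traces)

smooth = [1.0, 1.8, 1.99, 2.0, 2.0]          # monotone trace
bumpy = [1.0, 2.0, 1.5, 1.7, 1.9, 1.95, 2.0]  # non-monotone trace
prob = convergence_probability([smooth, bumpy])
```

With this reading, a trace that touches the target early but later drops below it is counted from its last crossing, which is the stricter of the two natural interpretations.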

VI. CONCLUSIONS AND FUTURE WORKS
In this paper we described the wireless small-cell networks as a strategic non-cooperative game.Each transmitter (AP) is modeled as a player in the game who decides, in a distributed way, the strategy of how to allocate its total power through several independent fading channels.
We studied the existence and uniqueness of NE.Under the condition of independent continuous fading channels, we showed that the probability of the equilibrium being unique is equal to 1.
Convergence issues have been addressed based on potential game analysis. Numerical studies have shown that, with very high probability, the DPIWF algorithm reaches 99% of the final rate within 5 iterations.

APPENDIX
A. Proof of Theorem 3.3
Proof: We prove the necessary and sufficient parts separately.

1) Proof of necessary condition (the only if part):
From the definition of NE (Definition 3.1), if a power set $\{p_m\}$ is a NE, it must satisfy all the best-response conditions in (3) simultaneously. Consider a situation in which the powers of all players except player $m$ have reached the NE point: $p_1^\star, \ldots, p_{m-1}^\star, p_m, p_{m+1}^\star, \ldots, p_M^\star$. In this case, when all other players' powers are fixed, as shown in (4), the best response of player $m$ is to set its power according to (8), which is exactly the single-player waterfilling solution treating all other players' signals as noise.
2) Proof of sufficient condition (the if part): From convex optimization theory [6], we know that the KKT conditions of a convex optimization problem are necessary and sufficient conditions for optimality. Therefore, a power strategy $p_m$ satisfies the best-response condition if and only if it satisfies the single-player KKT conditions (5)-(7). Collectively, a set $\{p_m\}$ satisfies all the best-response conditions simultaneously if and only if it satisfies (10)-(12). From Definition 3.1, if a set $\{p_m\}$ satisfies all the best-response conditions, it must be a NE.
This completes the proof.

B. Proof of Lemma 3.4
Proof: Consider a NE $p \in \mathbb{R}^{MN \times 1}$. From Theorem 3.3, the following equation is true
Now, assume that there exist two different Nash equilibria, e.g., $p^0$, $p^1$ ($p^0 \neq p^1$); the following equation must also hold
from where we have
From above, it is easy to see that (22) holds if and only if we have $\alpha^T \beta = 0$ and $\alpha^T \gamma = 0$, which are equivalent to the following two equations, respectively,
First, from (23), we observe that the value of $s_n$ ($= \sum_k g_{k,n}\, p_{k,n}$) is fixed for any NE point.

C. Proof of Lemma 3.5
where $d_n \triangleq \frac{1}{\sigma^2 + s_n}$. From Lemma 3.4, we know that all the Nash equilibria must satisfy (25), with the same $\lambda_m$ and $d_n$. In (25), the number of independent linear equations is $|\mathcal{X}|$, while the number of unknown parameters is $M + \tilde{N}$ (since the rest of the $d_n$, $n \notin \mathcal{N}$, are known to be $d_n = \frac{1}{\sigma^2}$). It is well known that the solution set of a system of linear equations is empty if the number of independent equations is larger than the number of variables [18]. Since each entry $g_{m,n}$ is i.i.d. random, it is obvious that, with probability 1, the equations in (25)

Any NE must satisfy (17); assume that two different power strategies $p$ and $p'$ are both solutions to (17). Then $\hat{Z}(p - p') = 0$. By the rank-nullity theorem [18], since the rank of $\hat{Z}$ is equal to the number of its columns, this implies $p - p' = 0$, which means there must be exactly one NE.

Theorem 3.2: A Nash equilibrium exists in the OFDM game G.
Proof: Since $\mathcal{P}_m$ is convex, closed, and bounded for each $m$; $u_m(p_m, p_{-m})$ is continuous in both $p_m$ and $p_{-m}$; and $u_m(p_m, p_{-m})$ is concave in $p_m$ for any set $p_{-m}$, at least one Nash equilibrium point exists for G [17], [15].
and denote by $|\mathcal{X}|$ and $|\mathcal{N}|$ their cardinalities. From equation (12), if an index $(m, n) \notin \mathcal{X}$ we must have $p_{m,n} = 0$. Without loss of generality, we assume that $\mathcal{N} = \{1, \ldots, \tilde{N}\}$ for $\tilde{N} \le N$. Let $\tilde{Z}$ be the $(M + \tilde{N}) \times M\tilde{N}$ matrix formed from the first $M + \tilde{N}$ rows and first $M\tilde{N}$ columns of $Z$, let $\tilde{p}$ be formed from the first $M\tilde{N}$ elements of $p$, and let $\tilde{c}$ be formed from the first $M + \tilde{N}$ elements of $c$. Then, any NE solution must satisfy
$$\tilde{Z}\tilde{p} = \tilde{c}. \tag{16}$$
Let $\hat{Z}$ be the $(M + \tilde{N}) \times |\mathcal{X}|$ matrix formed from the columns of $\tilde{Z}$ that correspond to the elements of $\mathcal{X}$. Similarly, let $\hat{p}$ be the vector of length $|\mathcal{X}|$ with entries $p_{m,n}$ such that $(m, n) \in \mathcal{X}$ (in the same order as they were in $\tilde{p}$). Then any NE solution must satisfy
$$\hat{Z}\hat{p} = \tilde{c}. \tag{17}$$
Lemma 3.5: For any realization of a random $M \times N$ channel gain matrix $G$ with i.i.d. continuous entries, if $M\tilde{N} > M + \tilde{N}$, the probability that $|\mathcal{X}| \le M + \tilde{N}$ is equal to 1.
Only when the private channel gain $g_m = \{g_{m,1}, \ldots, g_{m,N}\}$ and the aggregate interference plus noise $\zeta_m = \{\zeta_{m,1}, \ldots, \zeta_{m,N}\}$ are both known to player $m$ can (20) be solved as a convex optimization problem. It is easy to verify that this single-user best response is the same waterfilling solution expressed in (8), due to the property of the potential function.

Fig. 1. Illustration of a femto-cell group with distributed network information flow.