An OFDMA resource allocation algorithm based on coalitional games

This work investigates a fair adaptive resource management criterion (in terms of transmit powers and subchannel assignment) for the uplink of an orthogonal frequency-division multiple access network, populated by mobile users with constraints in terms of target data rates. The inherent optimization problem is tackled with the analytical tools of coalitional game theory, and a practical algorithm based on Markov modeling is introduced. The proposed scheme allows the mobile devices to fulfill their rate demands exactly with a minimum utilization of network resources. Simulation results show that the average number of operations of the proposed iterative algorithm is much lower than K · N, where N and K are the numbers of allocated subcarriers and of mobile terminals, respectively.


Introduction
The advent of high-definition entertainment services justifies the need for wideband, high-capacity wireless communication technologies that use the available bandwidth efficiently and provide data rates close to channel capacity [1]. Multicarrier channel access techniques such as orthogonal frequency-division multiple access (OFDMA) can be exploited to increase data rates, by dividing a frequency-selective broadband channel into a multitude of orthogonal narrowband flat-fading subchannels. An intelligent and scalable joint power and bandwidth allocation mechanism is crucial to ensure the quality of service (QoS) to the consumer at a reasonable cost [2].
The problem of subcarrier and power assignment in OFDMA has been extensively considered in the literature during the last few years. The proposed solutions mainly fall into two different categories: margin-adaptive and rate-adaptive methods. The goal of margin-adaptive schemes (such as [3]) is to minimize the total transmit power expenditure needed to achieve the (minimum) QoS requirements. Algorithms based on the rate-adaptive criterion (such as [4]) aim, on the contrary, at achieving the maximum data rate subject to different QoS constraints.
Most algorithms focus on the downlink scenario, with constraints on the total power transmitted by the radio base station. In the uplink scenario, the restrictions apply on an individual basis to each user terminal, and the simplest solution to maximize channel capacity of mobile devices under a power constraint is the water filling (WF) criterion [5]. In this case, channel capacity is increased when every subcarrier is assigned to the user with the best path gain, and the power is distributed according to the WF criterion. However, the WF solution is highly unfair, since only users with the best channel gains receive an acceptable channel capacity, while users with bad channel conditions achieve very low data rates. To derive fair resource allocation schemes, we resort to other techniques, described in the following.
Generally, a resource allocation algorithm can be either centralized or distributed. In centralized schemes like [6,7], the algorithm is executed by a central unit (like the radio base station) that is aware of the channel conditions and the demands of all mobile terminals. In a distributed model (such as [8]), each mobile terminal tries to accomplish its own (minimum) QoS autonomously. In general, centralized techniques show better performance at the expense of a higher signaling between terminals and central unit, and lower scalability. In the context of distributed algorithms, several cross-layer approaches were developed (e.g., [9,10]) to reduce the total power consumption and to support different services and traffic classes in the downlink channel of an OFDMA system. Maximizing the power efficiency in uplink OFDMA has also been tackled in [7,11,12] using different formulations for the joint resource allocation problem.
Recently, coalitional game theory [13,14] has been used to address the problem of fair resource allocation for OFDMA systems using either centralized or distributed algorithms. Roughly speaking, coalitional game theory studies the actions of a group of individual agents (such as mobile devices) that compete for a common resource (such as the wireless medium) by possibly finding synergies and forming coalitions with each other. Han et al. in [6] introduce a distributed algorithm for the OFDMA uplink based on the Nash bargaining solution (NBS) [13] and the Hungarian method [15] to maximize the overall system rate under individual power and rate constraints. The NBS guarantees that each user achieves its own demand, thus providing fairness to the resource allocation. The proposed algorithm shows a complexity $O(K^2 N \log_2 N + K^4)$, without considering the expensive computational load to solve the (convex) equations of the NBS. In [16], Chee et al. propose a centralized algorithm for the OFDMA downlink scenario based on NBS and the Raiffa-Kalai-Smorodinsky bargaining solution (RBS) [17]. NBS guarantees the minimum rate, while RBS bounds the maximal rate achieved by each user. The results show a good performance only when the gap between the maximum and the minimum rate is large. The complexity of this algorithm is $O(KN + K^2)$, again without considering the solution of the RBS. In [18], Noh proposes a distributed and iterative auction-based algorithm in the OFDMA uplink scenario with incomplete information. The experimental complexity of the algorithm is $O(KN \log_2 K)$. However, the simulation parameters are not realistic (three users and subcarriers), and it is thus hard to estimate the computational complexity when using real-world network parameters.
All the mentioned schemes, which represent, to the authors' knowledge, the most relevant algorithms for OFDMA resource allocation with coalitional game theory, exhibit a good trade-off between overall system rate and fairness. Unfortunately, they also present a number of common problems: (i) most algorithms are based on non-linear programming, which is computationally expensive and hardly scalable when considering thousands of subcarriers and tens of users. Thus, they are not suitable for implementation by network designers; (ii) although the resource apportionment turns out to be fair from the users' point of view, the achieved QoS may be much larger than demanded. This implies a waste of network resources from the service provider's perspective, which has not been considered by previous works; and (iii) to reduce the computational burden, each subcarrier is allocated to a mobile terminal in an exclusive manner, although this may limit the number of simultaneous connections in the uplink channel.
In this work, we aim at fulfilling each user's QoS requirement in terms of target transmit rates exactly with the best utilization of the network resources, so as to satisfy both the users and the service provider. We also aim at designing a low-complexity algorithm that allows a centralized solution for the joint power and bandwidth allocation for OFDMA uplink channels to be achieved in a few steps using typical network parameters. In our approach, we allow every subcarrier to be possibly shared among more than one user, and we add a constraint on the maximum number of used subcarriers per terminal. This is achieved by dividing the available bandwidth into a number of disjoint blocks of consecutive subcarriers and forcing each terminal to use at most one subcarrier per block. The motivation of this is twofold: we wish to (i) increase the signal-to-interference-plus-noise ratio (SINR) on the used subcarriers, which also simplifies channel estimation; and (ii) exploit the frequency diversity to increase the performance of forward error correction techniques.
The remainder of the paper is structured as follows. Section 2 introduces the basics of coalitional game theory. In Section 3, we formulate the resource allocation problem in the uplink OFDMA scenario as a coalitional game, whereas in Section 4 we introduce a solution algorithm based on Markov modeling. Section 5 presents our experimental results, and some conclusions are drawn in Section 6.
Notation: For the reader's convenience, Section 7 reports the list of symbols used throughout the paper.

Brief review of coalitional game theory
A coalitional game is a game where groups of players (the coalitions), instead of single players, interact and compete [13,14]. It is denoted as G = (M, ν), where M denotes the set of players and ν the coalition function. We also denote with x m the payoff of player m in M, m = 1, 2, . . . , M = |M|. If S ⊆ M is a coalition (subset) of M formed in G, then its members get an overall payoff ν (S), with ν (S) = 0 when S = ∅. In a cooperative game with transferable utility (TU), the payoff of a coalition can be expressed by a real value.
A relevant issue in coalitional games is how the players make mutual binding agreements to form the coalition that provides them with the highest payoff. When the players are better off when staying together, they tend to form the grand coalition (i.e., the coalition of all the agents) [14]. The grand coalition is formed only if the game is superadditive:
Definition 1: A TU game G = (M, ν) is superadditive if, for any two disjoint coalitions $S_1, S_2 \subset \mathcal{M}$, $\nu(S_1 \cup S_2) \ge \nu(S_1) + \nu(S_2)$. ■
An important issue in a coalitional TU game is how to distribute the payoff of the grand coalition among agents. The fundamental solution is the core solution, defined as follows:
Definition 2: Let M be the set of M players of the superadditive TU game G, and let ν be the payoff of the game. The core of G is the set
$$\left\{ x \in \mathbb{R}^M : \sum_{m \in \mathcal{M}} x_m = \nu(\mathcal{M}), \ \sum_{m \in S} x_m \ge \nu(S) \ \ \forall S \subseteq \mathcal{M} \right\}.$$
In other words, $x \in \mathbb{R}^M$ belongs to the core of G if and only if no coalition can improve upon the payoffs $x_m$ that x assigns to its members, $\forall m \in \mathcal{M}$. ■
That is, the core of a coalitional game is the set of all payoff vectors (i.e., all those vectors whose entries add up to the utility of the grand coalition) such that the sum of the payoffs of the players in any coalition S is no smaller than the utility of that coalition.
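As a small illustration of Definition 2, the following Python sketch checks whether a payoff vector lies in the core of a toy three-player TU game. The game itself, the coalition function `nu`, and the helper `in_core` are hypothetical and are not taken from the paper; they only mirror the two conditions of the core (efficiency and coalitional rationality).

```python
from itertools import combinations

def in_core(x, players, nu, tol=1e-9):
    """Return True if the payoff vector x (dict: player -> payoff) is in the core."""
    # Efficiency: the payoffs must add up to the value of the grand coalition.
    if abs(sum(x[m] for m in players) - nu(frozenset(players))) > tol:
        return False
    # Coalitional rationality: no proper coalition S may be worth more than
    # what its members already receive under x.
    for size in range(1, len(players)):
        for S in combinations(players, size):
            if sum(x[m] for m in S) < nu(frozenset(S)) - tol:
                return False
    return True

# Hypothetical superadditive game: a coalition is worth the square of its size.
players = (1, 2, 3)
nu = lambda S: float(len(S) ** 2)

equal_split = {1: 3.0, 2: 3.0, 3: 3.0}      # every player gets 3 out of nu(M) = 9
unequal_split = {1: 8.0, 2: 0.5, 3: 0.5}    # players 2 and 3 get less than nu({m}) = 1

print(in_core(equal_split, players, nu))
print(in_core(unequal_split, players, nu))
```

The exhaustive loop over coalitions grows as $2^M$, so this brute-force check is only meant for very small games.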
For a non-superadditive coalitional game, the network formation process does not lead the players to form a grand coalition. In this case, Definition 2 does not apply. Let us redefine the core set in a general (not necessarily superadditive) coalitional formation TU game. Let $\psi = [S_1, S_2, \ldots, S_m]$ denote a partition of the set M wherein $S_i \cap S_j = \emptyset$ for $i \ne j$, $\bigcup_{i=1}^{m} S_i = \mathcal{M}$ and $S_i \ne \emptyset$ for $i = 1, \ldots, m$, and let Ψ denote the set of all possible partitions ψ. Let us also define $F = [S_1, S_2, \ldots, S_m]$, such that $\bigcup_{i=1}^{m} S_i = \mathcal{M}$ and $S_i \ne \emptyset$ for $i = 1, \ldots, m$, as a family of (non-disjoint) coalitions.
Definition 3: A core apportionment $x \in \mathbb{R}^M$ is a payoff distribution with the following property: $\sum_{S \in \psi} \nu(S) = \nu(\mathcal{M})$. ■ The core allocation set can be found through linear programming and can also be an empty set. We can study the non-emptiness of the core without explicitly solving the core equation, using the following lemma: Lemma 1 [13]: A necessary and sufficient condition for the core of a TU game to be non-empty is that the TU game be balanced.

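Definition 4: A superadditive TU game G for a family F of coalitions is balanced if, for any $S \in F$, the inequality
$$\sum_{S \in F} \mu_S \, \nu(S) \le \nu(\mathcal{M})$$
holds, where $\mu_S$ is a collection of numbers in [0, 1] (balanced weights) such that
$$\sum_{S \in F} \mu_S \, \mathbf{1}_S = \mathbf{1}_{\mathcal{M}},$$
with $\mathbf{1}_S \in \mathbb{R}^M$ denoting the characteristic vector whose elements are $(\mathbf{1}_S)_m = 1$ if $m \in S$ and $(\mathbf{1}_S)_m = 0$ otherwise. ■
Definition 5: A non-superadditive TU game G for a family F of coalitions is balanced if, for every balanced collection of weights $\mu_S$, and for any $S \in F$,
$$\sum_{S \in F} \mu_S \, \nu(S) \le \max_{\psi \in \Psi} \sum_{S \in \psi} \nu(S).$$ ■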

Problem formulation
Let us consider the uplink of a single-cell infrastructure OFDMA system with total bandwidth B, subdivided into N subcarriers with frequency spacing Δf = B/N. The cell is populated by K mobile terminals, each terminal $k \in \mathcal{K} = \{1, \ldots, K\}$ experiencing a complex-valued channel gain $H_{kn}$ on the nth subcarrier to the base station and having a data rate requirement $R_k$ (in bit/s). We assume that fulfilling such constraints simultaneously by all terminals is feasible.
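To ensure fairness among users, the set $\mathcal{N}$ of subcarriers is divided into D disjoint subblocks $\mathcal{N}^{(d)}$, $d = 1, \ldots, D$, of consecutive subcarriers, as shown in Figure 1. Each terminal is allowed to take at most one subcarrier per subblock. This is done to avoid assigning contiguous blocks of subcarriers to users that may be in a deep-fading frequency range.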
Our resource allocation strategy consists in finding a vector of transmit powers P k , where P k = [p k1 , ..., p kN ], with p kn representing the power allocated by terminal k over the nth subcarrier, that allows the QoS constraint R k to be satisfied. We decouple the problem into the cascade of subchannel assignment and (subsequent) power allocation.

A. Subchannel assignment
We describe here two different options to perform this function:
1) Best-carrier assignment: For every subblock $\mathcal{N}^{(d)}$, every terminal $k \in \mathcal{K}$ is assigned its best subcarrier $n_k^{(d)} = \arg\max_{n \in \mathcal{N}^{(d)}} |H_{kn}|$. The probability of assigning the same subcarrier to multiple mobile terminals is nonzero.
2) Vacant-carrier assignment: In a sequential manner, for every subblock $\mathcal{N}^{(d)}$, every terminal $k \in \mathcal{K}$ is assigned its best subcarrier $n_k^{(d)}$. However, we would like to ensure exclusive use of each subcarrier $n \in \mathcal{N}^{(d)}$ to better exploit the available bandwidth B (i.e., to reduce the multiple access interference). So, if $n_k^{(d)}$ has already been assigned to some other terminal ℓ < k, then terminal k is assigned the best vacant (unassigned) subcarrier near $n_k^{(d)}$, within the channel coherence bandwidth. Clearly, this is not possible if k > N/D, in which case terminal k is assigned its best subcarrier in the subblock anyway. Note that the ordering of $\mathcal{K}$ has a negligible impact on system performance when N is, as usual, sufficiently high.
Both assignment strategies can be modified to address the case in which each terminal is allowed to have a different number of assigned subcarriers (different D k for each mobile terminal), based on its own data rate requirement R k . This can be done, for instance, by assigning the subcarriers on a terminal basis rather than on a subblock basis. This modification to the algorithm might lead to a bad performance given particular configurations of the network, whereas the average performance in the long run proves to be experimentally equivalent to the case of equal number of blocks D across all users. However, for the sake of simplicity, we consider the same D for all terminals from now on.
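The two assignment options can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under simplifying assumptions that are not in the paper: all subblocks have the same size N/D, the same D is used for every terminal, terminals are served in index order, and the restriction of the vacant-carrier search to the channel coherence bandwidth is ignored (the best vacant subcarrier of the whole subblock is taken). The function names and the random gains are hypothetical.

```python
import numpy as np

def best_carrier(gains, D):
    """gains: (K, N) array of |H_kn|. Returns a (K, D) array of subcarrier indices."""
    K, N = gains.shape
    L = N // D                                   # subcarriers per subblock (N = D * L assumed)
    blocks = gains.reshape(K, D, L)
    # Best subcarrier inside each subblock, converted back to a global index.
    return blocks.argmax(axis=2) + np.arange(D) * L

def vacant_carrier(gains, D):
    """Sequential assignment that prefers subcarriers not yet taken by other terminals."""
    K, N = gains.shape
    L = N // D
    assign = np.empty((K, D), dtype=int)
    taken = np.zeros(N, dtype=bool)
    for k in range(K):                           # terminals are served in index order
        for d in range(D):
            idx = np.arange(d * L, (d + 1) * L)
            free = idx[~taken[idx]]
            # If no subcarrier is vacant (k > N/D), share the best one anyway.
            pool = free if free.size > 0 else idx
            n = pool[gains[k, pool].argmax()]
            assign[k, d] = n
            taken[n] = True
    return assign

rng = np.random.default_rng(1)
gains = rng.rayleigh(size=(4, 16))               # hypothetical |H_kn| for K = 4, N = 16
best = best_carrier(gains, D=4)
vacant = vacant_carrier(gains, D=4)
print(best)
print(vacant)
```

With K ≤ N/D, as in this toy setting, the vacant-carrier variant never shares a subcarrier, whereas the best-carrier variant may assign the same index to several terminals.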

B. Power allocation
To derive a stable solution to the power allocation subproblem, we consider it as a coalitional game, in which each subchannel $n_k^{(d)} \in \mathcal{N}$ is identified as a player in the game. To model the coalitional game, we build K coalitions $\psi = [S_1, \ldots, S_K]$, to be assigned to the K terminals. Each coalition $S_k$, $k \in \mathcal{K}$, contains the D players $n_k^{(1)}, \ldots, n_k^{(D)}$. Note that (i) the members of each coalition are fixed, since one player cannot move from one coalition to another; and (ii) since a subcarrier $n \in \mathcal{N}$ can be shared among multiple users, there exist virtual copies of it belonging to different coalitions. For the sake of notation, we will identify with a generic $n \in S_k$ any of the subcarriers assigned to terminal k. The strategy of each player $n \in S_k$ is represented by the optimal power expenditure $p_{kn} \in [0, \bar{p}_{kn}]$, where $\bar{p}_{kn}$ is the maximum power expenditure over subcarrier n by terminal k. Note that (i) if $n \notin S_k$, $p_{kn} = 0$; and (ii) if $n \in S_k$, we can also have $p_{kn} = 0$, which means that the kth terminal does not transmit on the nth subcarrier, and it thus bears an actual number of active subcarriers $D_k < D$.
The system under investigation aims at fulfilling the QoS requirement of every terminal k in terms of target rate $R_k$. For simplicity, we estimate the achieved data rate as the Shannon capacity $C_k$ of terminal k, which can be approached by using suitable channel coding techniques [19]:
$$C_k = \sum_{n \in \mathcal{N}} C_{kn}, \qquad (8)$$
where $C_{kn}$ is the Shannon capacity achieved by terminal k on its subcarrier $n \in \mathcal{N}$:
$$C_{kn} = \Delta f \, \log_2 (1 + \gamma_{kn}). \qquad (9)$$
Clearly, $C_{kn} = 0$ if $n \notin S_k$, since $p_{kn} = 0$. If $n \in S_k$, $C_{kn}$ depends on the received SINR $\gamma_{kn}$ at the base station on subcarrier n, which is a function of the strategy (i.e., the transmit power) chosen by player n (i.e., one of the D subcarriers assigned to the kth terminal), of the transmit power of other terminals on the same subcarrier (if $n \notin S_j$, $p_{jn} = 0$), of the corresponding channel gains, and of the power of the additive white Gaussian noise (AWGN) $\sigma_w^2$. Note that, in an OFDMA system, there is no interference between adjacent subcarriers. Hence, $C_{kn}$ considers only the intra-subcarrier noise that occurs when the same subcarrier is shared by more terminals. Each player $n \in S_k$ causes interference only to its virtual copies $n_j^{(d')} = n \in S_j$, with $j \ne k$ and for any d', 1 ≤ d' ≤ D. The mobile terminals and the service provider are most satisfied when each mobile terminal k achieves its own data rate requirement exactly: $C_k = R_k$. In view of this goal, we can force all players in each coalition $S_k$ to select their strategies (i.e., the power allocation for terminal k over the available bandwidth B) so as to maximize a utility function for the kth coalition $S_k$, defined as in (10), where u(·) is the step function, with u(y) = 1 if y ≥ 0 and u(y) = 0 otherwise (see Figure 2). If $C_k = R_k$, $S_k$ earns the highest possible payoff $\nu(S_k) = +\infty$. If $C_k > R_k$, $S_k$ gets a positive payoff, whereas it obtains a negative payoff if $C_k < R_k$. The factor a is a finite positive constant (much) greater than one (i.e., 1 ≪ a < +∞) that ensures $\nu(S_k)$ to be negative when $C_k < R_k$.
This allows the players to tell whether the capacity $C_k$ is lower or higher than $R_k$ just by knowing their own coalition's payoff. Note that, in practice, +∞ can be represented by the largest representable number (e.g., $2^{64} - 1$) in a given simulation platform.
The payoff of each coalition is a real number and, in our formulation, the most important parameter is the gain of each coalition, whereas the outcome of each player does not matter at all. For instance, we can equally divide the payoff of the coalition among all players. Therefore, this game is a TU one [13,14]. The specific shape of our utility function (10) is actually immaterial and was chosen to ensure fast convergence of the iterative algorithm that will be introduced later on. We could have considered any utility function that increases as the difference C k -R k moves from ±∞ to 0, just to make sure that, for any C k ≠ R k , each coalition has an incentive to move toward C k = R k .
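To make the rate model and the role of the payoff concrete, the sketch below computes per-terminal Shannon capacities from the SINR on each subcarrier and evaluates a stand-in payoff. The SINR expression (own received power over noise plus co-channel received power) follows the verbal description in the text. The function `payoff` is NOT the utility (10) of the paper: it is an assumed shape chosen only because it has the stated properties (infinite at $C_k = R_k$, positive above the target, negative below it, and increasing as $C_k - R_k$ approaches 0 from either side). The tolerance `tol` and all numerical values are hypothetical.

```python
import numpy as np

def capacities(p, H, noise, delta_f):
    """p, H: (K, N) arrays of transmit powers and complex channel gains.
    Returns the capacity C_k (bit/s) of every terminal."""
    rx = np.abs(H) ** 2 * p                      # received power per terminal and subcarrier
    interference = rx.sum(axis=0) - rx           # power of the other terminals on subcarrier n
    sinr = rx / (noise + interference)
    return delta_f * np.log2(1.0 + sinr).sum(axis=1)

def payoff(C_k, R_k, a=5000.0, tol=1e-9):
    """ASSUMED utility shape (not the paper's equation (10))."""
    diff = C_k - R_k
    if abs(diff) <= tol * R_k:
        return float("inf")                      # target met exactly
    return 1.0 / diff if diff > 0 else a * diff  # positive above, negative below

H = np.ones((2, 2), dtype=complex)               # hypothetical flat unit-gain channel
exclusive = np.array([[1e-6, 0.0], [0.0, 1e-6]]) # each terminal on its own subcarrier
shared = np.array([[1e-6, 0.0], [1e-6, 0.0]])    # both terminals on subcarrier 0

C_excl = capacities(exclusive, H, noise=1e-7, delta_f=1e4)
C_shar = capacities(shared, H, noise=1e-7, delta_f=1e4)
print(C_excl, C_shar)
print(payoff(C_excl[0], 30e3), payoff(C_shar[0], 30e3))
```

In this toy setting, sharing a subcarrier lowers the capacity of both terminals, which is the intra-subcarrier effect the coalitions have to compensate for with their transmit powers.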
To provide further insight into the problem, we now investigate some properties of the proposed game G. As a first step, we note that the players in $G = (\mathcal{M} = \bigcup_{k \in \mathcal{K}} S_k, \nu)$ with the utility function (10) do not tend to form the grand coalition. This is because every player $n \in S_k$ cannot leave its coalition $S_k$: the members of every coalition are fixed and do not change during the game. This may appear inappropriate to the notion of a coalitional game. However, our assumption is fairly common in economic problems like the study of a bargaining game between two corporations when each corporation has its own business branches. In this case, the members (branches) of each coalition (corporation) are fixed [20].
A relevant result for our game is the following:
Theorem 1: The core of the game G with the utility function (10) is not empty.
Proof: The number of coalitions and the number of players in each coalition are both fixed. Since each player belongs just to one coalition, the unique balanced collection of weights $(\mu_S)_{S \in \psi}$ is $\mu_S = 1$ $\forall S \in \psi$. To conclude the proof, we must verify that $\sum_{S \in \psi} \nu(S) \le \max_{\psi' \in \Psi} \sum_{S \in \psi'} \nu(S)$. Since the target rates of all terminals are assumed to be feasible, then every coalition expects $C_k$ to approach $R_k$. Therefore, every coalition is allowed to earn the highest possible payoff. ■ In the following section, we will show how the fundamental properties of our game lead to a practical allocation algorithm.

The best-response algorithm
We are interested in answering questions like: How do the players set their proper transmit power amounts? Dynamic learning models provide a framework for analyzing the way the players may set their proper strategies. A player adopts a certain power amount if and only if this matches its coalition's interests, and this goal can be achieved through a best-response iterative algorithm [21] based on Markov modeling [22]. Each player takes its own decisions individually, myopically, and concurrently with the others, so as to lead its own coalition's payoff toward +∞ (i.e., $C_k = R_k$). At each (discrete) time step of the algorithm, the autonomous players simultaneously adjust their transmit powers based on a model to increase the payoff of their own coalitions. Although this leads to interference when virtual copies of the same subcarriers simultaneously change their powers, we show that this dynamic myopic procedure guarantees the maximum payoff to each coalition.
The process starts up at time step t = 0 with an arbitrary assignment of the transmit powers $p_{kn}^{t=0}$ to all K · D players in the game (that are grouped in K coalitions, with players $n \in S_k$ and $n = n_k^{(d)}$). At the generic time step t, our system is in the state $\omega^t = (\psi^t, \nu^t)$, where $\psi^t$ collects the coalitions and $\nu^t = [\nu(S_1^t), \ldots, \nu(S_K^t)] \in \mathbb{R}^K$ contains the payoffs of the coalitions in $\psi^t$. The evolution of the Markov chain is then dictated by the strategy of the game. The strategy of each player $n \in S_k$ is to find the best power amount $p_{kn}^t$ that leads to an increase in the payoff $\nu(S_k^t)$ of its own coalition $S_k$. In practice, player $n \in S_k$ decides whether to change its power allocation, making its coalition better off, or to keep transmitting at the same power level (e.g., when its coalition's payoff is infinite). The following pseudocode snippet shows how each player $n \in S_k$ takes its decision during time step t.
    if $\nu(S_k^t) \le 0$, then $\tilde{p}_{kn} = \min(p_{kn}^t + \Delta p_{kn}, \bar{p}_{kn})$; //increase
    else if $\nu(S_k^t) < +\infty$, then $\tilde{p}_{kn} = \max(p_{kn}^t - \Delta p_{kn}, 0)$; //decrease
    if $\tilde{\nu}(S_k) > \nu(S_k^t)$, then $p_{kn}^{t+1} = \tilde{p}_{kn}$; //accept
    else $p_{kn}^{t+1} = p_{kn}^t$; //discard
In this algorithm, $\tilde{\nu}(S_k)$ is the "trial" value of the current payoff of the coalition when the tentative power $\tilde{p}_{kn}$ is adopted: it is computed with $p_{jn} = p_{jn}^t$ for all $n \in \mathcal{N}$ and for any $j \ne k$, and $p_{kn} = \tilde{p}_{kn}$. At each step of the update process, the power step $\Delta p_{kn}$ is the particular outcome (value) of a random variable uniformly distributed between 0 and $\Delta\bar{p}_{kn}$, with $\Delta\bar{p}_{kn} \ll \bar{p}_{kn}$. As detailed in Section 5, optimal values for $\Delta\bar{p}_{kn}$ can be found experimentally so as to minimize the computational load of the algorithm. If $\nu(S_k^t) \le 0$, then $C_k < R_k$, and the best strategy for player $n \in S_k$ is to increase its current transmit power so as to increase its coalition's payoff. As a result of the random power stepping, the tentative power is a random number in the interval $[p_{kn}^t, \bar{p}_{kn}]$. Player $n \in S_k$ accepts this value if and only if the coalition payoff $\nu(S_k^t)$ increases; otherwise, it keeps transmitting at its previous value. If $0 < \nu(S_k^t) < \infty$, the best strategy of player $n \in S_k$ is, on the contrary, to decrease $p_{kn}^t$, and thus the tentative (random) transmit power belongs to the interval $[0, p_{kn}^t]$. At the end of each time step t, the base station computes the payoff $\nu(S_k)$, $\forall k \in \mathcal{K}$, with the updated power amounts. A uniformly distributed random power stepping is adopted to increase the probability of picking the (unknown) best adjustment value, and thus both to reduce the convergence time of the algorithm and to possibly minimize the overall power consumption. As is apparent, the convergence speed of the algorithm depends not only on the parameters of the network but also on the choice of the maximum update step $\Delta\bar{p}_{kn}$.
As already stated, two copies $n \in S_k$ and $n \in S_j$ (the virtual copies of the same subcarrier n) may happen to wish to adjust their transmit powers in a conflicting (and thus incompatible) way. If we assume that each player just follows the decision rules listed in the pseudocode above, then the probability of conflicting decisions will be high. To reduce the occurrence of this event, we modify our algorithm by requesting each player not to update its transmit power at every step of the game with a probability $\lambda \in [0, 1]$. At each time step t, every player $n \in S_k$ selects a random number $\xi_{kn}^t$ uniformly distributed in [0, 1]. If $\xi_{kn}^t > \lambda$, then the player applies the algorithm and (possibly) updates $p_{kn}^{t+1}$; otherwise, $p_{kn}^{t+1} = p_{kn}^t$ (i.e., during time step t, it skips the update process, and the value of $p_{kn}^t$ is maintained). If λ is close to 1, then the probability of conflicting decisions tends to 0, but the algorithm will have a large convergence time, since the probability of updates is low. In addition to the conflicts described above, another potentially disruptive condition may arise between different subcarriers n and n' belonging to the same coalition: if both (myopic) players simultaneously increase their powers $p_{kn}^t > 0$ and $p_{kn'}^t > 0$, it may occur that $C_k > R_k$. To optimize the update mechanism and to cope with both negative kinds of events, we could consider a variable and adaptive threshold $\lambda_{kn}^t$ for each virtual copy of the same subcarrier (each player). However, to reduce the complexity of the algorithm, we assume $\lambda_{kn}^t = \lambda > 0$ for all the players (i.e., virtual copies of the subcarriers). As detailed in Section 5, the value of λ must be selected as a suitable trade-off. Note that the value of λ is common knowledge among the players at every step of the algorithm. Nevertheless, interference between concurrent, conflicting decisions may prevent the coalitions from achieving the expected payoff.
If all coalitions earn less than in the previous time step, all players revert to the previous power amounts for the next time step. There may exist network configurations in which the iterative algorithm is not guaranteed to converge. To account for these situations, we set a maximum number of operations Θ, beyond which the algorithm is stopped and the sum of the users' demands is deemed infeasible.
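The update rules described in this section can be put together in a compact simulation. The Python sketch below is a simplified illustration, not the authors' implementation: the payoff is an assumed stand-in for (10) (infinite within a relative tolerance `eps`, `1/gap` above the target, `a * gap` below it), the stopping threshold Θ is expressed as a maximum number of time steps `max_steps` rather than of operations, and the toy network (unit channel gains, two terminals with disjoint subcarriers) is hypothetical. `G` holds the power gains $|H_{kn}|^2$ and `mask` marks which subcarriers belong to which coalition.

```python
import numpy as np

def capacities(p, G, noise, delta_f):
    """Shannon capacity C_k (bit/s) of each terminal; G holds |H_kn|^2."""
    rx = G * p                                   # received powers
    sinr = rx / (noise + rx.sum(axis=0) - rx)    # co-channel terminals interfere
    return delta_f * np.log2(1.0 + sinr).sum(axis=1)

def payoffs(C, R, a, eps):
    """ASSUMED utility: +inf within tolerance, 1/gap above target, a*gap below."""
    gap = C / R - 1.0
    above = 1.0 / np.maximum(gap, eps)           # guard avoids division by zero
    return np.where(np.abs(gap) < eps, np.inf, np.where(gap > 0, above, a * gap))

def best_response(G, mask, R, p_max, dp_max, noise, delta_f,
                  lam=0.97, a=5000.0, eps=0.01, max_steps=20000, seed=0):
    rng = np.random.default_rng(seed)
    p = np.zeros_like(G)                         # start from zero power
    nu = payoffs(capacities(p, G, noise, delta_f), R, a, eps)
    for t in range(max_steps):
        if np.all(np.isinf(nu)):
            return p, t                          # every coalition meets its target
        # A player updates only if it owns the subcarrier, draws xi > lambda,
        # and its coalition has not yet reached the +inf payoff.
        active = mask & (rng.random(G.shape) > lam) & ~np.isinf(nu)[:, None]
        step = rng.uniform(0.0, dp_max, size=G.shape)
        sign = np.where(nu <= 0, 1.0, -1.0)[:, None]   # raise if C_k < R_k, else lower
        trial = np.clip(p + sign * step, 0.0, p_max)
        p_next = p.copy()
        for k, n in zip(*np.nonzero(active)):
            q = p.copy()
            q[k, n] = trial[k, n]                # all other powers stay at p^t
            nu_trial = payoffs(capacities(q, G, noise, delta_f), R, a, eps)[k]
            if nu_trial > nu[k]:
                p_next[k, n] = trial[k, n]       # accept the tentative power
        nu_next = payoffs(capacities(p_next, G, noise, delta_f), R, a, eps)
        if np.all(nu_next < nu):
            continue                             # all coalitions worse off: revert
        p, nu = p_next, nu_next
    return p, max_steps

G = np.ones((2, 4))                              # hypothetical unit power gains
mask = np.array([[True, False, True, False],     # terminal 0: one subcarrier per subblock
                 [False, True, False, True]])    # terminal 1: the remaining ones
R = np.array([40e3, 60e3])                       # target rates in bit/s
p, steps = best_response(G, mask, R, p_max=3e-6, dp_max=1.2e-7,
                         noise=1e-7, delta_f=1e4)
C = capacities(p, G, 1e-7, 1e4)
print(steps, C / R)
```

Each trial payoff is evaluated with the other players frozen at their current powers, so simultaneous acceptances can still overshoot the target; the sketch then simply continues with decreasing steps, as the myopic rule prescribes.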
We now show that our proposed algorithm reaches a stable state, which corresponds to the core apportionment of the game. We model the evolution of the algorithm as the output of a finite-state Markov chain with state space $\Omega = \{\omega = (\psi, \nu) \mid \psi \in \Psi, \nu \in \mathbb{R}^K\}$. For all time steps t, $\psi^t = \psi$ belongs to the subset of all possible disjoint coalitions Ψ with exactly D members, and remains fixed for the whole duration of the algorithm. The time evolution of the algorithm as a Markov chain is due to the time variability of $\nu^t$, which depends on the power levels $p_{kn}^t$ chosen by the players in the coalitions collected by $\psi^t$. We use this notation for the sake of convenience, to emphasize that $\nu^t$ is directly connected to $\psi^t$. The Markov process asymptotically tends toward a stable coalition structure state, where no player has any incentive to change its power. In other words, all coalitions get their maximum payoffs. Our algorithm guarantees that, when $t \to \infty$, this Markov chain tends toward a singleton steady state with probability 1.
Definition 6 [22]: A set Φ ⊂ Ω is an ergodic set if, for any $\omega \in \Phi$ and $\omega' \notin \Phi$, the probability of reaching the state ω' starting from ω is zero. Once the Markov chain falls into a state belonging to an ergodic set, it never leaves that set, and it wavers between the states in that ergodic set from then on. The probability of reaching any state in the ergodic set is strictly positive. ■ Lemma 2 [22]: In any finite Markov chain, no matter which state the process starts from, the probability of ending up in an ergodic set tends to 1 as time tends to infinity.
Definition 7 [22]: Singleton ergodic sets are called absorbing states. ■ If Φ is an absorbing state and $\omega \in \Phi$, the probability of ending up in state ω when beginning from ω is one. In fact, absorbing states individually represent points of equilibrium.
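The notions of absorbing state and of Lemma 2 can be illustrated on a toy finite Markov chain. The three-state transition matrix below is hypothetical and unrelated to the state space Ω of the game; it only shows that a state with self-transition probability 1 is absorbing, and that the probability mass concentrates on it as time grows.

```python
import numpy as np

# Hypothetical row-stochastic transition matrix: state 2 can never be left.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])

# A state i is absorbing when it returns to itself with probability 1.
absorbing = [i for i in range(len(P)) if np.isclose(P[i, i], 1.0)]

# Distribution after 200 steps when the chain starts in state 0.
start = np.array([1.0, 0.0, 0.0])
dist = start @ np.linalg.matrix_power(P, 200)

print(absorbing)
print(dist)
```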
Lemma 3: The state ω = (ψ, ν) is an absorbing state of the best-response process if and only if
$$\nu(S_k) = +\infty \quad \forall S_k \in \psi. \qquad (11)$$
Proof: This condition ensures that no player has any incentive to change its power amount. If this condition is met, then no coalition can get a higher payoff by deviating from state ω = (ψ, ν). Since all the target rates are feasible, this condition is also necessary.

Theorem 2:
The best-response process has at least one absorbing state.
Proof: Since the best-response algorithm is a Markov process, Lemma 2 ensures that the best-response process reaches an ergodic set Φ. To conclude the proof, it is enough to show that Φ is a singleton. Suppose that the number of states in the ergodic set is |Φ| > 1. Then, all players revise their strategies without conflicting decisions with a non-null probability. As a consequence, the Markov process moves to a new state, in which all coalitions' payoffs are higher than those achieved in the previous state. This means that the probability of going back to the previous state is null, which contradicts the notion of an ergodic set. ■ Note that Theorem 2 does not ensure the uniqueness of the ergodic set in the best-response process. There may exist different combinations of the power allocation through which the players reach a steady state. This means that the game possesses multiple equilibria. The major finding of Theorem 2 is that, given the way the players adjust their strategies, the best-response process leads to one of the steady states, in which no player has any incentive to revise its power allocation.
Theorem 3: The set of payoffs associated with an absorbing state of the best-response process coincides with the set of core allocation: i. if ω = (ψ, ν) is an absorbing state, then ν is a core allocation. ii. if ν is a core allocation, then all ω = (ψ, ν) are absorbing states.

Proof:
Part (i) Suppose ω = (ψ, ν) is an absorbing state but ν is not a core allocation. In this case, there exist some coalitions that can obtain a higher payoff. This is contradictory, since the game reaches an absorbing state when every coalition gets the maximum payoff.
Part (ii) If ν is a core allocation, then no coalition can earn more by letting its members change their powers. This implies that the state will not move to a new state, and thus the current state is absorbing. ■ Coalitional games aim at identifying the best coalitions of the agents and a fair distribution of the payoff among the agents. Interestingly, in this game the absorbing state coincides with one of the Nash equilibria [13] of the game. Suppose there are K = 2 mobiles connected to a base station with N = 1 subcarrier only. In this case, the M = K · N = 2 copies of the subcarrier, each constituting a coalition, are engaged in a 2 × 2 game. Every player has two strategies: either $p_k = 0$ or $p_k = \bar{p}_k$. It is straightforward to verify that, in this game, a mixed (versus pure) Nash equilibrium exists which satisfies the stability of the static game. With due attention to the notation, we can extend this result to a general case.
Theorem 4: The set of absorbing states in the best-response process and the set of Nash equilibria of the static game are asymptotically (in the long run) equivalent.
Proof: Let us consider the coalitions in the best-response process as players in a static game. Lemma 2 ensures that this process reaches an ergodic set in the long run. According to Theorem 2, this set is a singleton, and thus its member is an absorbing state. Hence, no coalition (i.e., no player in the static game) has any incentive to revise its strategy. In static games, this is the definition of a Nash equilibrium. ■ We can now conclude that the absorbing state is an extension of the Nash equilibrium, since the coalitions bind agreements with each other as economic agents and earn a vector value rather than a real number. Once the coalitions reach the absorbing state, their payoff is the highest possible (+∞), and no coalition is willing to revise its current strategy. In general, as follows from Theorem 4, the Nash equilibrium of the game is Pareto-optimal (efficient), since no other strategy can achieve a payoff greater than +∞.

Numerical results
In this section, we evaluate the performance of the best-response algorithm presented in Section 4. We consider some cases with different numbers of mobile terminals, target data rates, and subcarriers, showing that our suggested scheme reaches a steady state after a few steps only. To increase the convergence speed of the algorithm, we introduce a tolerance parameter ε in our utility function, such that if $|C_k/R_k - 1| < \varepsilon$, then we assume that the payoff is +∞. We can possibly set an asymmetric range $[\varepsilon_1, \varepsilon_2]$ such that $\varepsilon_1 \le (C_k/R_k - 1) \le \varepsilon_2$, so as to favor solutions with $C_k > R_k$.
We consider the following parameters for our simulations: the maximum power of each terminal k on each subcarrier n is $\bar{p}_{kn} = \bar{p} = 3$ μW; the power of the ambient AWGN noise on each subcarrier is $\sigma_w^2 = 100$ nW; and the constant in (10) is a = 5000. We also set Θ = 10 · K · N as the stopping criterion of the iterative algorithm, where K and N depend on the network parameters of the simulation. The path coefficients $H_{kn}$, corresponding to the frequency response of the multipath wireless channel at the carrier frequency nΔf, are computed using the 24-tap ITU modified vehicular-B channel model adopted by the IEEE 802.16m standard [23]. To account for the large-scale path loss, we assume the terminals to be uniformly distributed at distances between 3 and 100 m. Based on numerical optimizations, the parameter l that reduces the probability of conflicting decisions among members of different coalitions, for different numbers of terminals, subcarriers, and signal bandwidths, is set to l = 0.97.
The initial power allocation is $p_{kn} = 0$ $\forall k \in \mathcal{K}$ and $\forall n \in \mathcal{N}$. This experimentally provides the minimal power consumption at the steady state and, in most cases, the minimum number of steps of the algorithm.
Figure 3 reports the behavior of the achievable rate $C_k$ as a function of the time step t in a network with K = 10 terminals, N = 1024 subcarriers, and bandwidth B = 10 MHz using the vacant-carrier assignment scheme. The target rates, reported in Figure 3 with solid markers on the right axis, are assigned randomly to each terminal using a uniform distribution in the range [100, 250] kb/s. Further parameters are as follows: tolerances $\varepsilon_1 = 0$ and $\varepsilon_2 = 0.01$, a power update step of $\bar{p}_{kn}/25 = 120$ nW, and number of subblocks D = 32. Numerical results show the convergence of $C_k$ to the respective target rates $R_k$ after 31 steps of the best-response algorithm.
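The parameter values quoted for the Figure 3 experiment can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' simulator; the variable names are illustrative. The last quantity is only the ratio between the maximum transmit power and the noise power: it ignores channel gain and path loss, so it should not be read as the SNR experienced by any terminal.

```python
import math

P_MAX_W = 3e-6      # maximum power per terminal and subcarrier (3 uW)
NOISE_W = 100e-9    # AWGN power per subcarrier (100 nW)
K, N = 10, 1024     # terminals and subcarriers in the Figure 3 example
STEP_DIVISOR = 25   # the power update step is P_MAX / 25

step_w = P_MAX_W / STEP_DIVISOR   # power update step in watts
theta = 10 * K * N                # stopping criterion, Theta = 10 * K * N
ratio_db = 10 * math.log10(P_MAX_W / NOISE_W)  # transmit-to-noise ratio, no channel

print(f"power update step: {step_w * 1e9:.0f} nW")
print(f"stopping criterion: {theta} iterations")
print(f"max power over noise: {ratio_db:.2f} dB")
```

For this configuration the stopping criterion is far larger than the 31 steps after which convergence is reported, so it acts as a safeguard rather than as the typical termination condition.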
In the remainder of this section, we evaluate the average performance of the proposed algorithm in terms of power expenditure and computational burden using realistic system parameters and extensive simulation campaigns. Note that we are not able to implement the joint resource allocation techniques available in the literature and reviewed in Section 1, mainly because of their infeasible algorithmic complexity when using tens of terminals, hundreds of subcarriers, and high data rates (on the order of Mb/s). As a consequence, in the following we compare our measured results with the theoretical performance reported in the literature. The complexity figures given in Section 1 are used as a reference to compare the performance of the proposed scheme in terms of computational demand.
Figure 4 shows the average normalized power expenditure $\zeta_k$ at the steady state as a function of K, computed by averaging $\zeta_k = \frac{1}{N} \sum_{n \in \mathcal{N}} p_{kn}/\bar{p}_{kn}$ over all terminals. This serves as a measure of the average total power consumption normalized to the maximum power expenditure available to each terminal. As can be noticed, $\zeta_k$ increases for K ≥ N/D, since the number of shared subcarriers increases and the terminals must spend more power to overcome the intra-subcarrier noise. Interestingly, the power expenditure of the proposed centralized algorithm shows higher efficiency than the distributed and cross-layer schemes available in the literature (e.g., see [7,10,12]). For instance, when considering 500 random realizations of a system with bandwidth B = 10 MHz and N = 1024 subcarriers, and using the vacant-carrier assignment model, we find that, in the case of a total sum-rate demand of 20 Mb/s (i.e., with a spectral efficiency of 2 b/s/Hz) and $R_k = R = 200$ kb/s (i.e., K = 100 terminals), the maximum power consumption per user is 31 μW and the average power consumption of the system is 0.53 mW.
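The two quantities used in this comparison, the normalized power expenditure and the spectral efficiency, are straightforward to compute. The sketch below is illustrative only: the function names are assumptions, and the eight-subcarrier power vector is a hypothetical example rather than simulation output. The spectral-efficiency check uses the figures quoted above (100 terminals at 200 kb/s over 10 MHz).

```python
def normalized_power(p_k, p_max_k):
    """zeta_k = (1/N) * sum over n of p_kn / p_max_kn, for a single terminal k."""
    if not p_k or len(p_k) != len(p_max_k):
        raise ValueError("power vectors must be non-empty and of equal length")
    return sum(p / p_max for p, p_max in zip(p_k, p_max_k)) / len(p_k)


def spectral_efficiency(rates_bps, bandwidth_hz):
    """Sum rate divided by bandwidth, in b/s/Hz."""
    return sum(rates_bps) / bandwidth_hz


# Hypothetical terminal with N = 8 subcarriers, two of them active.
p_max = [3e-6] * 8
p_k = [3e-6, 1.5e-6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(normalized_power(p_k, p_max))

# Figures quoted in the text: K = 100 terminals, R = 200 kb/s, B = 10 MHz.
print(spectral_efficiency([200e3] * 100, 10e6))
```

Averaging `normalized_power` over all terminals gives the quantity plotted against K; a value close to zero means that most subcarriers of most terminals are left unused.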
In the multicell scenario of [7], the average power expenditure for each cell is 8 mW when the achievable data rate is 40 Mb/s. When considering the cross-layer algorithm proposed in [10], the average power expenditure per mobile terminal is 0.4 W with a maximal spectral efficiency of 2 b/s/Hz, whereas the average power expenditure per mobile terminal required by the energy-efficient techniques proposed in [12] is 0.4 and 1.2 W when the achieved data rate is equal to 40 and 140 kb/s, respectively.
Figure 5 shows the computational burden of our algorithm, expressed in terms of the average number of operations per terminal required to reach the steady state as a function of the number of terminals K, with the vacant-carrier assignment model. The number of operations is measured experimentally by counting the number of steps required by the subchannel assignment plus the total number of trials required to update the transmit power according to the best-response algorithm. As can be seen, the number of operations increases as D increases. This is because increasing D increases the number of players K · D, which in turn increases the number of conflicting decisions. Note that the proposed algorithm is able to provide a spectral efficiency higher than 1 b/s/Hz, which occurs, for instance, when we assume more than K = 50 users with rates $R_k = 200$ kb/s over a bandwidth B = 10 MHz in the proposed scenario, with a linear computational burden at the base station using appropriate values for the parameters. In this particular example, a good trade-off between performance and complexity is D = {8, 16} with a power update step of 600 nW. Using these values, the number of operations of the proposed algorithm is experimentally lower than the product K · N, and so considerably lower than the number of operations required by the schemes available in the literature (e.g., see [6,16,18]).
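The threshold of 50 users mentioned above follows from simple arithmetic, which the sketch below reproduces together with the reference value K · N. The helper names are illustrative and not taken from the paper; the code only evaluates the reference product and makes no claim about the operation counts actually measured.

```python
import math


def min_terminals_above(target_bps_per_hz, bandwidth_hz, rate_bps):
    """Smallest K such that K * rate / bandwidth strictly exceeds the target efficiency."""
    return math.floor(target_bps_per_hz * bandwidth_hz / rate_bps) + 1


def operations_reference(num_terminals, num_subcarriers):
    """Reference value K * N against which the measured operations are compared."""
    return num_terminals * num_subcarriers


# Values quoted in the text: R_k = 200 kb/s, B = 10 MHz, N = 1024.
k_min = min_terminals_above(1.0, 10e6, 200e3)
print(k_min)
print(operations_reference(k_min, 1024))
```

With exactly K = 50 users the spectral efficiency equals 1 b/s/Hz, so the first integer K that exceeds it is 51, which is consistent with the phrase "more than K = 50 users".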
Our experiments with different data rate demands show that a smaller data rate also reduces the number of operations significantly. To further reduce the number of operations, we can also increase the tolerance parameters (e.g., with $\varepsilon_2 = 0.1$, we observe a reduction in the number of operations on the order of 20-30%). Note also that the spectral efficiency achieved by the proposed fair resource allocation method, while showing a linear computational burden, is comparable with that provided by sum-rate maximizing algorithms (e.g., see [24]). In practice, a reasonable value for the maximum spectral efficiency achieved by the network in the region of linear computational load in all simulated scenarios (not reported here for the sake of brevity) is slightly lower than 2 b/s/Hz. For higher spectral efficiencies, no parameter selection can achieve the optimal resource allocation with linear complexity, and the number of operations appears to increase exponentially with the number of mobile terminals. However, note that a solution can still be found in most cases.
Figures 6 and 7 depict the simulation results of a network with $R_k = R = 200$ kb/s $\forall k \in \mathcal{K}$, N = 1024, B = 10 MHz, and $\varepsilon_1 = 0$, $\varepsilon_2 = 0.04$ using the best-carrier assignment model. Solid lines represent the case of a power update step equal to $\bar{p}_{kn}/5 = 600$ nW, whereas dashed lines depict the case of a step equal to $\bar{p}_{kn}/25 = 120$ nW. Squares, upper triangles, and lower triangles correspond to D = {16, 32, 64}, respectively. Figure 6 shows the average normalized power expenditure $\zeta_k$ at the steady state as a function of K. As can be seen, the average power expenditure using the best-carrier assignment model is lower than with the vacant-carrier assignment, since the terminals having better channel conditions can spend less power.
A drawback of the best-carrier assignment is the increased number of operations required by the algorithm. Figure 7 shows the average number of operations per terminal required to reach the steady state as a function of the number of terminals K. As can be seen, the best-carrier assignment model has a higher computational burden than the vacant-carrier assignment model, since the number of shared subcarriers in the best-carrier assignment model is larger than in the vacant-carrier assignment, which increases the probability of interference between simultaneous decisions in the best-reply algorithm. Note that, using the best-carrier assignment model, the case D = 16 appears to be computationally expensive.
Figure 8 shows the average number of operations per terminal in the case of a network with parameters $R_k = R = 500$ kb/s $\forall k \in \mathcal{K}$, N = 512, B = 10 MHz, and $\varepsilon_1 = 0$, $\varepsilon_2 = 0.04$ using the vacant-carrier assignment model. Solid and dashed lines represent the cases of a power update step of 3 μW and 600 nW, respectively, whereas circles, squares, upper triangles, and lower triangles depict D = {8, 16, 32, 64}, respectively. Even in this case, with more severe requirements in terms of target data rates, the number of operations is shown to be lower than the product K · N, again with spectral efficiencies higher than 1 b/s/Hz.
Finally, Figure 9 shows the average number of operations per terminal in the case of a network with parameters B = 20 MHz, N = 2048, $R_k = 2$ Mb/s, $\varepsilon_1 = 0$, and $\varepsilon_2 = 0.04$ with the vacant-carrier assignment model. Solid and dashed lines represent the cases of a power update step of 3 μW and 600 nW, respectively, whereas circles, squares, and upper triangles depict D = {64, 128, 256}, respectively. The number of operations is again lower than K · N even in the case of high data rate demands.
As can be seen in Figures 5, 7, 8, and 9, due to the random behavior of the proposed algorithm, there is a strict relation between the average number of operations, the network parameters, and the algorithm parameters (including the channel assignment model). Depending on the parameter selection, we see different shapes (linear or exponential behavior) for the average number of operations. Thus, deriving an analytical complexity function for the best-response algorithm is difficult. However, for all tested scenarios (not reported here for the sake of brevity), there exist properly tuned parameter values (such as D and the power update step) that provide an average number of operations for the proposed algorithm that is lower than the product K · N, even with high data rate demands as in the cases of Figures 8 and 9. The parameter that most affects the number of operations is D. Our experiments show that, for the optimal parameter selection (i.e., when the number of operations scales linearly with N and K), the average number of active subcarriers per terminal (i.e., those which bear $p_{kn} > 0$) is approximately D/2 when the vacant-carrier model is adopted. This rule of thumb can be used as a design criterion for the proposed algorithm.
Let us consider Figure 10, which reports the average number of active subcarriers for each mobile terminal as a function of the achieved rate R, in the linear computational load regime and using a power update step of 600 nW. Dashed and solid lines depict the cases B = {10, 20} MHz, respectively, whereas circles, squares, and upper triangles represent N = {512, 1024, 2048}, respectively. For instance, when B = 20 MHz, N = 512, and R = 500 kb/s, the average number of active subcarriers is 4. If we look back at Figure 8, we can verify that the linear number of operations can be achieved using D = 8. Note that the number of active subcarriers in the case of B = 10 MHz is higher than in the case B = 20 MHz, since the subcarrier spacing is halved.
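The rule of thumb above (about D/2 active subcarriers per terminal in the linear regime) can be turned into a simple selection routine. The following is a hedged sketch rather than part of the proposed algorithm: `suggest_subblocks` is a hypothetical helper, and the candidate list is an assumption that merely collects the values of D appearing in the experiments of this section.

```python
def suggest_subblocks(avg_active_subcarriers, candidates=(8, 16, 32, 64, 128, 256)):
    """Pick the candidate D closest to twice the average number of active subcarriers."""
    if avg_active_subcarriers <= 0:
        raise ValueError("the average number of active subcarriers must be positive")
    target = 2 * avg_active_subcarriers
    return min(candidates, key=lambda d: abs(d - target))


# Example from the text: 4 active subcarriers on average suggests D = 8.
print(suggest_subblocks(4))
```

In practice the average number of active subcarriers is itself an outcome of the simulation, so such a routine would be applied to values read from curves like those of Figure 10 before running the algorithm with a new configuration.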

Conclusion
This paper described a computationally inexpensive centralized algorithm based on coalitional game theory to address the issue of fair optimal resource allocation (in terms of subcarrier assignment and power control) for the uplink of an infrastructure OFDMA wireless network. The scheme derived here is designed to meet the required data rates exactly, thus ensuring a fair