Cross-layer distributed power control: a repeated game formulation to improve the sum energy efficiency

The main objective of this work is to improve the energy efficiency (EE) of a multiple access channel (MAC) system, through power control, in a distributed manner. In contrast with many existing works on energy-efficient power control, which ignore the possible presence of a queue at the transmitter, we consider a new generalized cross-layer EE metric. This approach is relevant when the transmitters have a non-zero energy cost even when the radiated power is zero, and it takes into account the presence of a finite packet buffer and packet arrivals at the transmitter. As the Nash equilibrium (NE) is an energy-inefficient solution, the present work aims at overcoming this deficit by improving the global energy efficiency. Indeed, as the considered system has multiple agents, each with its own interest, the performance metric reflecting the individual interest of each decision-maker is the global energy efficiency, defined here as the sum over individual energy efficiencies. Repeated games (RG) are investigated through the study of two dynamic games (finite RG and discounted RG), whose equilibrium is defined by introducing a new operating point (OP) that Pareto-dominates the NE and relies only on individual channel state information (CSI). Accordingly, closed-form expressions of the minimum number of stages of the game for the finite RG (FRG) and of the maximum discount factor of the discounted RG (DRG) are established. Our contributions consist of improving the system performance in terms of powers and utilities when using the new OP compared to the NE and the Nash bargaining (NB) solution. Moreover, the cross-layer model in the RG formulation leads to a smaller minimum number of stages in the FRG, even for a higher number of users. In addition, the social welfare (sum of utilities) in the DRG decreases only slightly with the cross-layer model when the number of users increases, while it is reduced considerably with the Goodman model.
Finally, we show that in real systems with random packet arrivals, the cross-layer power control algorithm outperforms the Goodman algorithm.


Motivation
The design and management of green wireless networks [1,2,3] has become increasingly important for modern wireless networks, in particular, to manage operating costs. Futuristic (beyond 5G) cellular networks face the dual challenges of being able to respond to the explosion of data rates and also to manage network energy consumption. Due to the limited spectrum and large number of active users in modern networks, energy-efficient distributed power control is an important issue. Sensor networks, which have multiple sensors sending information to a common receiver with a limited energy capacity, have also recently surged in popularity. Energy minimization in sensor networks has been analysed in many recent works [4,5,6]. Several of the above described systems have some common features:
1. Multiple transmitters connected to a common receiver.
2. Lack of centralization or coordination, i.e., a distributed and de-centralized network.
3. Relevance of minimizing energy consumption or maximizing energy-efficiency (EE).
4. Transmitters that have arbitrary data transmission.
These features are present in many modern systems like a sensor network which has multiple sensors with limited energy connected in a distributed manner to a common receiver. These sensors don't always have information to transmit, resulting in sporadic data transmission. Another example would be several mobile devices connected to a hot-spot (via Wi-Fi or even Bluetooth).
Due to these features of the network, inter-transmitter communication is not possible and the transmitters are independent decision makers. Therefore, implementing frequency or time division multiple access becomes harder and a MAC protocol (with single carrier) is often the preferred or natural method of channel access.

Novelty
In many existing works, both network-centric and user-centric approaches have been studied. In a network-centric approach, the global energy-efficiency (GEE) is defined as the ratio of the system benefit (sum-throughput or sum-rate) to the total cost in terms of consumed power [7,8]. However, when targeting an efficient solution in a user-centric problem, the GEE is not ideal as it has no significance to any of the decision makers. In this case, other metrics are required to reflect the individual interest of each decision maker. Therefore, we redefine the GEE to be the sum over individual energy-efficiencies as a suitable metric of interest [9].
The major novelty of this work is in improving the sum of energy-efficiencies for a communication system with all the features listed above. In such a decentralized and distributed network, as each transmitter operates independently, implementing a frequency division or a time division multiple access is not trivial. Therefore, we are interested in looking at a MAC system where all transmitters operate on the same band. Additionally, EE will be our preferred metric due to its relevance. This metric has been defined in [10] as the ratio between the average net data rate and the transmitted power. In [11,12], the total power consumed by the transmitter was taken into account in the EE expression to design distributed power control, which is one of the most well known techniques for improving EE. However, many of the works available on energy-efficient power control consider the EE defined in [10], where the possible presence of a queue at the transmitter is ignored. In contrast with the existing works, we consider a new generalized EE based on a cross-layer approach developed recently in [13,14]. This approach is important since it takes into account: 1) a fixed cost in terms of power, namely a cost which does not depend on the radiated power; and 2) the presence of a finite packet buffer and sporadic packet arrival at the transmitter (which corresponds to including the 4th feature mentioned above). Although providing a more general model, the distributed system in [14] may operate at a point which is energy-inefficient. Indeed, the point at which the system operates is a Nash equilibrium (NE) of a certain non-cooperative static game. The present work aims at filling this gap by not only considering a cross-layer approach of energy-efficient power control but also improving the system performance in terms of sum of energy-efficiencies.

State of the art
The Nash bargaining (NB) solution in a cooperative game can provide a possible efficient solution concept for the problem of interest as it is Pareto-efficient. However, it generally requires global channel state information (CSI) [15]. Therefore, we are interested in improving the average performance of the system by considering long-term utilities. We focus then on repeated games (RG), where repetition allows efficient equilibrium points to be implemented.
Unlike static games, which are played in one shot, RG are a special case of dynamic games which consider a cooperation plan and consist in repeating the same static game at each step; the utilities result from averaging the static game utilities over time [16]. There are two relevant dynamic RG models: finite (FRG) and discounted (DRG). The FRG is defined when the number of stages during which the players interact is finite. For the DRG model, the discount factor is seen as the stopping probability at each stage [17]. The power control problem using the classic EE developed by Goodman et al. in [10] has been solved with RG only in [18], where the authors developed an operating point (OP) relying on individual CSI and showed that RG lead to an efficient distributed solution. Here, we investigate the power control problem of a MAC system by referring to RG (finite and discounted) where the utility function is based on a cross-layer approach. Accordingly, our contributions are to:
1. determine the closed-form expressions of the minimum number of stages for the FRG and the maximum discount factor for the DRG. These two parameters identify the two considered RG.
2. determine a distributed solution Pareto-dominating the NE and improving the system performance in terms of powers and utilities compared not only to the NE but also to the NB solution, even for a high number of users.
3. show that the RG formulation, when using the new EE and the new OP, leads to significant gains in terms of social welfare (sum of utilities of all the users) compared to the NE.
4. show that the following aspects of the cross-layer model considerably improve the system performance compared to the Goodman model, even for a large number of users:
- the minimum number of stages in the cross-layer EE model can always be shorter than the minimum number of stages in the Goodman EE formulation.
- the social welfare for the DRG in the cross-layer model decreases slightly when the number of users increases while it decreases considerably in the Goodman model.
5. show that in real systems with random packet arrivals, the cross-layer power control algorithm outperforms the Goodman algorithm, so that the new OP with the cross-layer approach is more efficient.

Structure
This paper is structured as follows. In section 2, we define the system model under study, introduce the generalized EE metric and define the non-cooperative static game. This is followed (section 3) by the study of the NB solution. In section 4, we introduce the new OP, give the formulation of both RG models (FRG and DRG) and determine the closed-form expressions of the minimum number of stages and the maximum discount factor as well. Numerical results are presented in section 5 and finally we draw several concluding remarks.
2 Problem statement

System model
We consider a MAC system composed of $N$ small transmitters communicating with a receiver. The $i$-th transmitter transmits a signal $x_i$ with a power $p_i \in [0, P_i^{\max}]$, where $P_i^{\max}$ is the maximum transmit power, assumed identical for all users ($P_i^{\max} = P^{\max}$). The additive noise, which is the same for all users, is an additive white Gaussian noise denoted as $n$ with zero mean and variance $\sigma^2$. We assume that the users transmit their data over block fading channels. The channel gain between user $i$ and the receiver is given by $g_i$. Thus, the baseband signal received at the receiver is written as:
$$y = \sum_{i=1}^{N} g_i x_i + n \qquad (1)$$
Therefore, the resulting SINR $\gamma_i$ corresponding to the $i$-th transmitter is given by [18,19]:
$$\gamma_i(p) = \frac{p_i |g_i|^2}{\sigma^2 + \sum_{j \neq i} p_j |g_j|^2} \qquad (2)$$
where $p = (p_1, p_2, \ldots, p_N)$ defines the power vector of all users and can be written as $p = (p_i, p_{-i})$ with $p_{-i} = (p_1, \ldots, p_{i-1}, p_{i+1}, \ldots, p_N)$.
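As a minimal sketch of the SINR computation, the snippet below assumes the standard MAC form in which the received power of user $i$, $p_i |g_i|^2$, is divided by the noise variance plus the received power of all the other users. The names `sinr`, `p` and `g2` are illustrative and are not taken from the paper.

```python
import numpy as np

def sinr(p, g2, sigma2):
    """SINR of each transmitter (assumed form, no spreading factor).

    p      : transmit powers p_i
    g2     : channel gains |g_i|^2
    sigma2 : noise variance
    """
    p = np.asarray(p, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    received = p * g2                          # useful received power per user
    interference = received.sum() - received   # received power of all other users
    return received / (sigma2 + interference)

# Two users whose received powers are equal see the same SINR.
gammas = sinr([0.05, 0.1], [1.0, 0.5], 1e-3)
```

With a single user the interference term vanishes and the expression reduces to $p_i |g_i|^2 / \sigma^2$.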
The purpose of this work is to determine how each user is going to control its power in an optimum way. Game theory, as a powerful mathematical tool, helps to solve such an optimization problem where the utility function is the EE, which is a function of the users' powers. Since the system under study has multiple agents, each with an individual interest, the sum over individual energy-efficiencies will be considered as the performance metric reflecting the individual interest of each decision maker.

Energy-efficiency metric
The EE is defined in [10] as the ratio of the net data rate to the transmit power level and is given by:
$$\mathrm{EE}_i(p) = \frac{R f(\gamma_i)}{p_i} \qquad (3)$$
where $R$ is the transmission rate (in bit/s) while $f : [0, +\infty) \to [0, 1]$ denotes the efficiency function, which is sigmoidal and corresponds to the packet success rate, verifying $f(0) = 0$ and $\lim_{x \to +\infty} f(x) = 1$. The authors of [11] were the first to consider a total transmission cost of the type radiated power ($p_i$) + consumed power ($b$) to design distributed power control strategies for multiple access channels [13,14], as follows:
$$\mathrm{EE}_i(p) = \frac{R f(\gamma_i)}{p_i + b} \qquad (4)$$
In [13,14], a more generalized EE metric has been developed by considering a packet arrival process following a Bernoulli process with a constant probability $q$ and a finite memory buffer of size $K$. The new EE expression is given by:
$$\chi_i(p) = \frac{R\, q\, (1 - \Phi(\gamma_i))}{b + \dfrac{q\, p_i\, (1 - \Phi(\gamma_i))}{f(\gamma_i)}} \qquad (5)$$
where the function $\Phi$ identifies the packet loss due to both bad channel conditions and the finiteness of the packet buffer and is expressed as follows:
$$\Phi(\gamma_i) = (1 - f(\gamma_i))\, \Pi_K(\gamma_i) \qquad (6)$$
where $\Pi_K(\gamma_i)$ is the stationary probability that the buffer is full and is given by:
$$\Pi_K(\gamma_i) = \frac{\rho^K(\gamma_i)}{1 + \rho(\gamma_i) + \cdots + \rho^K(\gamma_i)} \qquad (7)$$
with:
$$\rho(\gamma_i) = \frac{q\,(1 - f(\gamma_i))}{(1 - q)\, f(\gamma_i)} \qquad (8)$$
It is important to highlight that this new generalized EE given by (5) includes the conventional case of (4) when making $q \to 1$.
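The relation between the cross-layer EE and the conventional EE with a fixed power cost can be checked numerically. The sketch below is written under stated assumptions: the efficiency function is taken as $f(x) = e^{-c/x}$, the packet loss as $\Phi = (1 - f)\Pi_K$, the buffer-full probability as $\Pi_K = \rho^K / (1 + \rho + \cdots + \rho^K)$ with $\rho = q(1 - f)/((1 - q) f)$, and the cross-layer EE as $Rq(1 - \Phi) / (b + q p (1 - \Phi)/f)$. These forms follow the cross-layer model of [13,14] as we understand it; the function names are hypothetical.

```python
import math

def efficiency(gamma, c=1.0):
    """Sigmoidal packet success rate f(x) = exp(-c/x), with f(0) = 0."""
    return math.exp(-c / gamma) if gamma > 0 else 0.0

def buffer_full_probability(gamma, q, K, c=1.0):
    """Assumed form Pi_K = rho^K / (1 + rho + ... + rho^K), for 0 < q < 1."""
    f = efficiency(gamma, c)
    if f == 0.0:
        return 1.0  # rho -> infinity: the buffer is always full
    rho = q * (1.0 - f) / ((1.0 - q) * f)
    if rho > 1.0:
        # Equivalent form 1 / (1 + rho^-1 + ... + rho^-K), safe for large rho.
        return 1.0 / sum(rho ** (-k) for k in range(K + 1))
    return rho ** K / sum(rho ** k for k in range(K + 1))

def ee_fixed_cost(p, gamma, R=1e6, b=5e-3, c=1.0):
    """Conventional EE with a fixed power cost b: R f(gamma) / (p + b)."""
    return R * efficiency(gamma, c) / (p + b)

def ee_cross_layer(p, gamma, q, K, R=1e6, b=5e-3, c=1.0):
    """Assumed cross-layer EE: R q (1 - Phi) / (b + q p (1 - Phi) / f)."""
    f = efficiency(gamma, c)
    if p <= 0.0 or f == 0.0:
        return 0.0
    phi = (1.0 - f) * buffer_full_probability(gamma, q, K, c)
    delivered = q * (1.0 - phi)  # fraction of time slots with a delivered packet
    return R * delivered / (b + p * delivered / f)

# When q approaches 1, the cross-layer EE approaches the fixed-cost EE.
near_one = ee_cross_layer(0.01, 2.0, q=1 - 1e-9, K=10)
reference = ee_fixed_cost(0.01, 2.0)
```

The two branches in `buffer_full_probability` compute the same quantity; the split only avoids floating-point overflow when $\rho$ is very large.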

Static cross-layer power control game
The static cross-layer power control game is a non-cooperative game which can be defined as a strategic form game [17].

Definition 1
The game is defined by the ordered triplet $G = \left(\mathcal{N}, (S_i)_{i \in \mathcal{N}}, (u_i)_{i \in \mathcal{N}}\right)$, where $\mathcal{N}$ is the set of players (the $N$ transmitters), $S_1, \ldots, S_N$ are the corresponding sets of strategies with $S_i = [0, P_i^{\max}]$, and $u_1, \ldots, u_N$ are the utility functions given by:
$$u_i(p) = \chi_i(p)$$
where $\chi_i(p)$ is given by equation (5).
In a non-cooperative game, each user (player) selfishly seeks to maximize its individual utility function. The optimal power then results from setting $\partial u_i / \partial p_i$ to zero. The authors of [13,14] proved that the resulting equation has a unique solution, which defines the best response of each player.
In the game $G$, this best response defines the NE, which is denoted as $p^* = (p_1^*, \ldots, p_N^*)$. However, the NE is not Pareto-efficient in many scenarios. We highlight in Fig. 1 that the NE is not on the Pareto frontier (the outer boundary of the achievable utilities region). Therefore, we are motivated to design a more efficient solution than the NE. For this, as a first step we investigate the NB solution.
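A best response can also be approximated numerically instead of solving the first-order condition. The sketch below is only an illustration: it uses the simpler fixed-cost utility $R f(\gamma_i)/(p_i + b)$ with $f(x) = e^{-c/x}$ (the $q \to 1$ case) rather than the full cross-layer utility, treats the interference from the other users as a given constant, and searches a uniform grid of powers. The names `utility` and `best_response` are hypothetical.

```python
import numpy as np

def utility(p_i, interference, g2_i, sigma2=1e-3, R=1e6, b=5e-3, c=1.0):
    """Fixed-cost EE: R * f(gamma) / (p + b) with f(x) = exp(-c / x)."""
    gamma = p_i * g2_i / (sigma2 + interference)
    return R * np.exp(-c / gamma) / (p_i + b)

def best_response(interference, g2_i, p_max=0.1, n_grid=10000, **kwargs):
    """Grid search for the power maximizing the utility against fixed interference."""
    # Strictly positive grid, so that the SINR is never zero.
    grid = np.linspace(p_max / n_grid, p_max, n_grid)
    values = utility(grid, interference, g2_i, **kwargs)
    return grid[np.argmax(values)]

# Without interference, the first-order condition 1000 p^2 = p + b
# has the positive root (1 + sqrt(21)) / 2000.
p_best = best_response(interference=0.0, g2_i=1.0)
p_with_interference = best_response(interference=0.01, g2_i=1.0)
```

In this toy setting the best-response power grows with the interference level, which is the mechanism that pushes the one-shot NE towards high powers.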

Nash bargaining solution
Due to the inefficiency of the NE, a Pareto-efficient solution can be achieved by introducing cooperation between the players. The resulting solution is called the NB solution, whose determination requires two elements [20]:
- the region of achievable utilities, formed by the set of the feasible utilities of all the players, should be compact and convex [21];
- the threat point, which is defined by the NE of the one-shot game [22].

Compactness and convexity of the achievable utilities region
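We denote by $\mathcal{R}$ the achievable utilities region, defined as follows:
$$\mathcal{R} = \left\{ (u_1(p), \ldots, u_N(p)) \mid p \in S_1 \times \cdots \times S_N \right\}$$
As the strategy sets $S_1, \ldots, S_N$ are compact, since $S_i = [0, P_i^{\max}]$, and the utility function $u_i$ is continuous, the region $\mathcal{R}$ is compact for a given channel configuration [22]. Since it is generally not convex, time-sharing can be used to convexify it. In order to illustrate the main idea of this technique applied to our problem, let us consider a system of 2 users [22]. During a time fraction $\tau$, the users use the powers $(p_1, p_2)$ to have utilities $(u_1, u_2)$. During the remaining time fraction $(1 - \tau)$, they use another combination of powers $(p'_1, p'_2)$ to have utilities $(u'_1, u'_2)$ [15,22]. Thus, the new achievable utilities region (for the 2-user system) is:
$$\bar{\mathcal{R}} = \left\{ \left(\tau u_1 + (1 - \tau) u'_1,\ \tau u_2 + (1 - \tau) u'_2\right) \mid 0 \le \tau \le 1,\ (u_1, u_2) \in \mathcal{R},\ (u'_1, u'_2) \in \mathcal{R} \right\}$$
We define $\bar{\mathcal{R}}^*$ as the Pareto boundary (the outer frontier) of the convex hull of $\mathcal{R}$. Fig. 1 shows the convexified achievable utilities region with the NE point, the NB solution and the Nash curve (both will be defined next).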

Existence and uniqueness of the NB solution
Let $\mathcal{R}_{NB}$ define the improvement region of utilities versus the NE; it is given by:
$$\mathcal{R}_{NB} = \left\{ (u_1, \ldots, u_N) \in \bar{\mathcal{R}} \mid u_i \ge u_i(p^*),\ \forall i \in \mathcal{N} \right\}$$
where $u_i(p^*)$ is the utility of player $i$ at the NE. The NB solution belongs to the region $\mathcal{R}_{NB}$. Here, in the power control game $G$, there exists a unique NB solution, denoted as $u^{NB}$ and given by [21]:
$$u^{NB} = \arg\max_{(u_1, \ldots, u_N) \in \mathcal{R}_{NB}} \prod_{i=1}^{N} \left(u_i - u_i(p^*)\right)$$
Since the NE can always be reached and the achievable utility region is a compact convex set, the NB solution exists. It is unique since it verifies certain axioms: individual rationality and feasibility, independence of irrelevant alternatives, symmetry, Pareto optimality (efficiency) and independence of linear transformations [21]. The NB solution results from the intersection of the Pareto boundary ($\bar{\mathcal{R}}^*$) with the Nash curve, whose form for the 2-user system is $(u_1 - u_1(p^*))(u_2 - u_2(p^*)) = m$, where $m$ is a constant chosen such that there is precisely one intersection point [22] (see Fig. 1). Although the NB solution is Pareto-efficient, it generally requires global CSI at the transmitters, since the Nash product ($m$) involves the utilities of all the users [15]. For this reason, we are looking for another efficient solution through the study of the dynamic RG.
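The selection rule behind the NB solution can be illustrated on a finite set of candidate utility vectors. The sketch below assumes that the NB point maximizes the Nash product $\prod_i (u_i - u_i^{\text{threat}})$ among the candidates that weakly dominate the threat point; the candidate points and the threat point are made-up numbers, and `nash_bargaining` is a hypothetical helper, not the procedure used in the paper.

```python
def nash_bargaining(candidates, threat):
    """Return the candidate utility vector maximizing the Nash product.

    Only candidates that weakly dominate the threat point are considered.
    Returns None when no candidate dominates the threat point.
    """
    best, best_product = None, -1.0
    for u in candidates:
        if all(ui >= ti for ui, ti in zip(u, threat)):
            product = 1.0
            for ui, ti in zip(u, threat):
                product *= ui - ti
            if product > best_product:
                best, best_product = u, product
    return best

# Toy 2-user example: (3.0, 0.5) and (0.5, 4.0) do not dominate the threat point.
points = [(1.0, 3.0), (2.0, 2.0), (3.0, 0.5), (0.5, 4.0)]
nb = nash_bargaining(points, threat=(0.8, 0.8))
```

Evaluating the product requires the utilities of all the users, which is why this solution concept needs global CSI.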

Repeated games formulation
RG consist, in their standard formulation, in repeating the same static game at every time instance, and the players seek to maximize their utility averaged over the whole game duration [16]. Repetition allows efficient equilibrium points to be implemented; these points can be predicted from the one-shot static game according to the Folk theorem, which provides the set of possible Nash equilibria of the repeated game [18,23]. In a repeated game, the players can agree on a common cooperation plan and on a punishment policy that is applied to the deviators [16]. In what follows, we introduce the new OP and characterize the two RG models.

New OP
The new OP consists in setting $p_i |g_i|^2$ to a constant $\alpha$, which is unique when maximizing the expected sum utility over all the channel states; its expression is given in [19]. The power of the $i$-th player is then deduced as follows:
$$\tilde{p}_i = \frac{\alpha}{|g_i|^2}$$
The new OP Pareto-dominates the NE and relies on individual CSI at the transmitter. In order to implement a cooperation plan between the players, we assume, in addition to the individual CSI assumption, that every player is able to know the power of the received signal at each game stage, which is denoted by [18]:
$$P_y = \sigma^2 + \sum_{j=1}^{N} p_j |g_j|^2$$
When assuming that $p_i |g_i|^2$ is set to the constant $\alpha$, the received signal power can be written as:
$$P_y = \alpha\, \frac{\gamma_i + 1}{\gamma_i}$$
Accordingly, each transmitter needs only its individual SINR and the constant $\alpha$ (depending only on $p_i$ and $|g_i|^2$) to establish the received signal power $P_y$.
We assume that the data transmission is over block fading channels and that the channel gains $|g_i|^2$ lie in a compact set $[\nu_i^{\min}, \nu_i^{\max}]$. Thus, the interval to which the received signal power belongs is $\Delta = \left[\sigma^2,\ \sigma^2 + \sum_{i=1}^{N} P_i^{\max} \nu_i^{\max}\right]$. When the players detect a variation of the received signal power, a deviation from the cooperation plan has occurred. Indeed, when playing at the new OP, the received signal power is constant and equal to $\sigma^2 + N\alpha$. Consequently, when any player deviates from the new OP, the latter quantity changes and the deviation is then detected [18].
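The detection mechanism can be sketched in a few lines. The snippet assumes that the received signal power is $P_y = \sigma^2 + \sum_j p_j |g_j|^2$, so that it equals $\sigma^2 + N\alpha$ when every player sets $p_i |g_i|^2 = \alpha$. The numerical values and the tolerance used for the comparison are illustrative assumptions.

```python
import math

def received_power(powers, gains, sigma2):
    """Assumed form P_y = sigma^2 + sum_j p_j |g_j|^2."""
    return sigma2 + sum(p * g for p, g in zip(powers, gains))

def deviation_detected(p_y, n_users, alpha, sigma2, rel_tol=1e-9):
    """Flag a deviation when P_y differs from its value at the operating point."""
    return not math.isclose(p_y, sigma2 + n_users * alpha, rel_tol=rel_tol)

sigma2, alpha = 1e-3, 2e-3
gains = [0.5, 1.0, 2.0]                        # channel gains |g_i|^2
op_powers = [alpha / g for g in gains]         # p_i = alpha / |g_i|^2
cheat_powers = [1.5 * op_powers[0]] + op_powers[1:]   # user 1 deviates

p_y_op = received_power(op_powers, gains, sigma2)
p_y_cheat = received_power(cheat_powers, gains, sigma2)
```

Each player only needs the common constant $\alpha$, the noise variance and the number of users to run this check; the channel gains of the other users are never used by the detector itself.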

Repeated games characterization
A RG is a long-term interaction game where players react to past experience by taking into account what happened in all previous stages and make decisions about their future choices [24,25]. The resulting payoff is an average over all the stage payoffs. We denote by $t$ the game stage, which corresponds to the instant at which all players choose their actions. Accordingly, a profile of actions can be defined for all players as $p(t) = (p_1(t), p_2(t), \ldots, p_N(t))$. A history $h(t)$ of player $i$ at time $t$ is the pair of vectors $(P_{y,t}, p_{i,t}) = (P_y(1), P_y(2), \ldots, P_y(t-1), p_i(1), p_i(2), \ldots, p_i(t-1))$, which lies in the set $H_t = (\Delta^{t-1}, P_i^{t-1})$ with $\Delta = \left[\sigma^2,\ \sigma^2 + P^{\max} \sum_{i} \nu_i^{\max}\right]$ and $P_i = [0, P^{\max}]$ (as all the users have the same maximum power) [18]. Histories are fundamental in RG as they allow players to coordinate their behavior at each stage, so that previous histories are known by all the players [25]. We denote by $\delta_{i,t}$ the pure strategy of the $i$-th player. It defines the action to select after each history [18,25]:
$$\delta_{i,t} : H_t \to [0, P^{\max}], \qquad h(t) \mapsto p_i(t)$$
In RG literature, there are two important models [17]:
- the finite RG, where the number of stages of the game (denoted as $T \ge 1$) during which the players interact is finite;
- the discounted RG, where the discount factor (denoted as $\lambda \in\, ]0, 1[$) is seen as the stopping probability at each stage.
The utility function of each player results from averaging over the instantaneous utilities over all the game stages in the FRG while it is a geometric average of the instantaneous utilities during the game stages in the DRG [18,25,26].We denote δ = (δ 1 , δ 2 , . . ., δ N ) the joint strategy of all players.
Definition 2 A joint strategy $\delta$ satisfies the equilibrium condition for the repeated game defined by $\left(\mathcal{N}, (S_i)_{i \in \mathcal{N}}, (v_i)_{i \in \mathcal{N}}\right)$ if, for all $i \in \mathcal{N}$ and all $\delta'_i$, $v_i(\delta) \ge v_i(\delta'_i, \delta_{-i})$, with $v_i = v_i^{\lambda}$ for the DRG and $v_i = v_i^{T}$ for the FRG, such that:
$$v_i^{\lambda}(\delta) = \sum_{t \ge 1} \lambda (1 - \lambda)^{t-1} u_i(p(t)) \ \text{ for the DRG}, \qquad v_i^{T}(\delta) = \frac{1}{T} \sum_{t=1}^{T} u_i(p(t)) \ \text{ for the FRG} \qquad (20)$$
In RG with complete information and full monitoring, the Folk theorem characterizes the set of possible equilibrium utilities. It ensures that the set of NE in a RG is precisely the set of feasible and individually rational outcomes of the one-shot game [24,25]. A cooperation/punishment plan is established between the players before playing [18]. The players cooperate by always transmitting at the new OP with powers $\tilde{p}_i$. When the power of the received signal changes, a deviation is detected and the players punish the deviator by transmitting with their maximum transmit power $P_i^{\max}$ in the FRG and by playing the NE of the one-shot game in the DRG. In what follows, we give the equilibrium solution of each repeated game model and mention the corresponding algorithm [27,28,29]. It is important to note that, in contrast with iterative algorithms (e.g., iterative water-filling type algorithms), there is no convergence problem in repeated games (FRG and DRG). Indeed, the transmitters implement an equilibrium strategy (referred to as the operating point) at every stage of the repeated game.
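The two long-term utilities can be sketched as follows, assuming that the FRG utility is the arithmetic mean of the stage utilities and that the DRG utility weights stage $t$ by $\lambda(1 - \lambda)^{t-1}$, i.e. by the probability that the game stops at stage $t$. The function names and the utility streams are illustrative.

```python
def frg_utility(stage_utilities):
    """Arithmetic mean of the stage utilities over the T stages."""
    return sum(stage_utilities) / len(stage_utilities)

def drg_utility(stage_utilities, lam):
    """Discounted sum with weights lam * (1 - lam)**(t - 1), t = 1, 2, ...

    enumerate() starts at 0, which matches the exponent t - 1.
    """
    return sum(lam * (1.0 - lam) ** t * u for t, u in enumerate(stage_utilities))

# A constant stream keeps the same value under both criteria
# (up to the truncation of the discounted sum).
constant_stream = [4.0] * 500
v_frg = frg_utility(constant_stream)
v_drg = drg_utility(constant_stream, lam=0.1)

# A one-off gain at the first stage is weighted by lam in the DRG.
v_front_loaded = drg_utility([10.0, 0.0, 0.0], lam=0.5)
```

The last line shows why a large stopping probability makes an immediate deviation gain weigh more against the later punishment.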

Finite RG
The FRG is characterized by the minimum number of stages ($T_{\min}$). If the number of stages $T$ of the game verifies $T \ge T_{\min}$, a more efficient equilibrium point can be reached. However, if $T$ is less than $T_{\min}$, the NE is played.

Assuming that the channel gains $|g_i|^2$ lie in a compact set $[\nu_i^{\min}, \nu_i^{\max}]$, we have the following proposition [19]:
Proposition 1 (FRG equilibrium): Suppose that the condition $T \ge T_{\min}$ is met, where the expression of $T_{\min}$ depends on $\gamma_i^*$, the SINR at the NE, and on $\bar{\gamma}_i$ and $\underline{\gamma}_i$, the SINRs related to the maximal utility and to the min-max utility respectively. Then, the NE corresponding to the $T$-stage FRG is given by the following action plan for any $(T, T_{\min})$ and $\forall t \ge 1$:
$$\delta_{i,t} = \begin{cases} \tilde{p}_i & \text{for } t \in \{1, 2, \ldots, T - T_{\min}\} \\ p_i^* & \text{for } t \in \{T - T_{\min} + 1, \ldots, T\} \\ P_i^{\max} & \text{for any deviation detected} \end{cases}$$
The proof of this proposition is detailed in [19]. The corresponding algorithm is as follows.
Algorithm 1: FRG Algorithm
1) Each user transmits at the new OP with power $\tilde{p}_i$ during the first phase of the game, $t \in \{1, 2, \ldots, T - T_{\min}\}$.
2) Each user transmits at the NE with power $p_i^*$ during the second phase of the game, $t \in \{T - T_{\min} + 1, \ldots, T\}$. As the FRG has a finite number of stages, this phase ensures the punishment of the deviator for two reasons [18]:
⋄ if it deviates at the last stage, it cannot be punished;
⋄ if it deviates earlier, the punishment may not be sufficiently severe.
3) The power of the received signal is assumed to be constant during the first phase. When it changes, a deviation is then detected.
4) The deviator is punished by the other transmitters, which play at their maximum transmit power $P_i^{\max}$.
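The action plan of Algorithm 1 can be sketched as a simple decision rule. This is a sketch under assumptions: the second phase is taken to be the last $T_{\min}$ stages played at the one-shot NE power, detection is abstracted into a boolean flag, and the numerical powers are placeholders rather than values computed from the game.

```python
def frg_power(t, T, T_min, deviation_detected, p_op, p_ne, p_max):
    """Power played at stage t (1-indexed) by a player following the FRG plan."""
    if deviation_detected:
        return p_max      # punishment: transmit at maximum power
    if t <= T - T_min:
        return p_op       # first phase: cooperate at the operating point
    return p_ne           # last T_min stages: play the one-shot NE

# Illustrative values only: T = 6 stages, T_min = 2.
schedule = [
    frg_power(t, T=6, T_min=2, deviation_detected=False,
              p_op=0.01, p_ne=0.03, p_max=0.1)
    for t in range(1, 7)
]
punished = frg_power(3, T=6, T_min=2, deviation_detected=True,
                     p_op=0.01, p_ne=0.03, p_max=0.1)
```

When $T < T_{\min}$ the first branch of the schedule is never reached and the rule plays the NE at every stage, which matches the behavior described for short games.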

Discounted RG
In the DRG, the probability that the game stops at stage $t$ is $\lambda (1 - \lambda)^{t-1}$, where $\lambda \in\, ]0, 1[$ is the discount factor [17]. Accordingly, we can express the analytic form of the maximum discount factor in a DRG when assuming that the channel gains $|g_i|^2$ lie in a compact set $[\nu_i^{\min}, \nu_i^{\max}]$.
Proposition 2 (DRG equilibrium): Assume that the condition $\lambda \le \lambda_{\max}$ is met, where $\lambda_{\max}$ is the maximum discount factor. Then, the NE corresponding to the DRG is given by the following action plan $\forall t \ge 1$:
$$\delta_{i,t} = \begin{cases} \tilde{p}_i & \text{when all other players play } \tilde{p}_{-i} \\ p_i^* & \text{otherwise} \end{cases}$$
For the proof, see App. A. The corresponding algorithm is as follows.
Algorithm 2: DRG Algorithm
1) Each user transmits at the new OP with power $\tilde{p}_i$.
2) When the power of the received signal changes, a deviation is detected.
3) The other transmitters punish the deviator by playing the NE of the one-shot game with power $p_i^*$.
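Algorithm 2 can be read as a trigger strategy, sketched below for one compliant player. Two assumptions are made that the text does not state explicitly: a deviation at stage $s$ is only observed at the end of that stage, so the punishment starts at stage $s + 1$, and the punishment lasts for the rest of the game. The function name and the power values are illustrative.

```python
def simulate_drg(n_stages, p_op, p_ne, deviation_stage=None):
    """Powers played by a compliant player over n_stages.

    deviation_stage: stage (1-indexed) at which another player deviates,
    or None when everybody follows the cooperation plan.
    """
    punished = False
    played = []
    for t in range(1, n_stages + 1):
        played.append(p_ne if punished else p_op)
        # The change in received power is observed at the end of stage t.
        if deviation_stage is not None and t == deviation_stage:
            punished = True
    return played

cooperative = simulate_drg(5, p_op=0.01, p_ne=0.03)
after_deviation = simulate_drg(5, p_op=0.01, p_ne=0.03, deviation_stage=2)
```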

Numerical results
In this section, we consider the efficiency function $f(x) = e^{-c/x}$ with $c = 2^{R/R_0} - 1$. It has been proven in [30,31] that such a function is sigmoidal, as it is convex on the interval $(0, c/2]$ and concave on $(c/2, +\infty)$. The throughput $R$ and the used bandwidth $R_0$ are equal to 1 Mbps and 1 MHz respectively.
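The sigmoidal shape can be verified directly from the second derivative, $f''(x) = c\,(c - 2x)\,e^{-c/x}/x^4$, which changes sign at $x = c/2$. The short sketch below evaluates it for the stated values $R = 1$ Mbps and $R_0 = 1$ MHz; the helper names are illustrative.

```python
import math

R, R0 = 1e6, 1e6            # throughput (bit/s) and bandwidth (Hz)
c = 2 ** (R / R0) - 1       # equals 1 for these values

def f(x):
    """Efficiency function f(x) = exp(-c / x), with f(0) = 0."""
    return math.exp(-c / x) if x > 0 else 0.0

def f_second_derivative(x):
    """Analytic second derivative: c * (c - 2x) / x^4 * exp(-c / x)."""
    return c * (c - 2 * x) / x ** 4 * math.exp(-c / x)

convex_side = f_second_derivative(0.4)    # x < c/2: positive
inflection = f_second_derivative(0.5)     # x = c/2: zero
concave_side = f_second_derivative(0.6)   # x > c/2: negative
```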
The maximum power $P^{\max}$ is set to 0.1 Watt while the noise variance is set to $10^{-3}$ Watt. The buffer size $K$, the packet arrival rate $q$ and the consumed power $b$ are fixed to 10, 0.5 and $5 \times 10^{-3}$ Watt respectively. We consider Rayleigh fading channels and a spreading factor $L$ introducing an interference processing ($1/L$) in the interference term of the SINR.
In Fig. 2, we present the achievable utility region, the new OP, the NE and the NB solution. We stress that the new OP and the NB solution both dominate the NE in the sense of Pareto. The region between the Pareto frontier and the min-max level is the possible set of equilibrium utilities of the RG according to the Folk theorem. In order to study the efficiency of the new OP versus the NB solution and the NE, we compare the powers and utilities of the three equilibria by averaging over channel gains for different scenarios (different numbers of users $N$ in the system). In Fig. 3, we plot the power and the utility that a user (in a system of $N$ users) can reach for each equilibrium. We highlight that the new OP and the NB solution have better performance than the NE as they Pareto-dominate it. When $N = 2$, the NB solution requires less power and provides higher utility than the new OP, but it is important to stress that the values, in terms of powers and utilities, are only slightly different for both equilibria (new OP and NB solution).
When $N > 2$, we highlight that the new OP provides lower powers, which also leads to higher values of the utilities. Thus, the new OP gives better performance than both the NE and the NB solution. Therefore, the new OP not only improves the system performance compared to the NE for any given scenario but also enables important gains in terms of powers and utilities when compared to the NB solution for a system with a large number of users ($N > 2$). We are interested in studying the social welfare ($\sum_i u_i$) of the FRG versus that of the NE in a multi-user system. In Fig. 4, we present the ratio of the social welfare corresponding to the FRG ($\omega_{FRG}$) to the NE social welfare ($\omega_{NE}$). We proceed by averaging over channel gains lying in a compact set such that $10 \log_{10}(\nu^{\max}/\nu^{\min}) = 20$. We highlight that the social welfare of the FRG reaches higher values than that of the NE ($\omega_{FRG} > \omega_{NE}$). In addition, we notice that the social welfare ratio increases with the number of users for both models (Goodman and cross-layer). The minimum number of stages $T_{\min}$ according to the cross-layer model is much lower than the one related to the Goodman model. To illustrate this, when $N = 3$, $T_{\min}$ for the Goodman model is equal to 4600 while it is 3700 for the cross-layer model. This difference becomes considerable with the increase of the number of users. Indeed, when $N = 4$, the minimum number of stages for the Goodman EE is 14300 while it is equal to 10900 for the cross-layer approach.
We are interested in plotting the minimum number of stages as a function of the consumed power $b$ and the packet arrival rate $q$ according to both EE models. Results, obtained by averaging over channel realizations, are drawn in figures 5 and 6. According to Fig. 5, we stress that $T_{\min}$ increases with the number of users while it decreases with the spreading factor. For any values of $N$ and $L$, there exists a consumed power $b \neq 0$ for which $T_{\min}$ is less than $T_{\min}$ when $b = 0$. Thus, a good choice of the fixed consumed power leads to a lower minimum number of stages for the cross-layer model compared to the Goodman model.
In Fig. 6, we highlight that the minimum number of stages is an increasing function of the packet arrival rate $q$ according to the cross-layer model, while it is constant for the Goodman model since the latter does not take into account the packet arrival process. One can confirm that the minimum number of stages is an increasing function of the number of users, as deduced previously. Simulations show that there exists a packet arrival rate $q_0$ below which $T_{\min}$ of the cross-layer model is much lower than $T_{\min}$ of the Goodman model for different numbers of users. Simulations show that $q_0 \approx 0.6$ and that, for $q \ge q_0$, $T_{\min}$ of the cross-layer model converges to $T_{\min}$ corresponding to the Goodman model. It is important to highlight that when $N = 3$ and $q \ge q_0$, $T_{\min}$ of the cross-layer model takes higher values than $T_{\min}$ corresponding to the Goodman model, but the values are quite similar. With the increase of the number of users, the difference between the minimum number of stages for both models becomes noticeable. According to figures 5 and 6, one can conclude that the cross-layer model can be exploited for short games. For this reason, we studied the variation of $\lambda_{\max}$ as a function of $\eta$ and $q$ for both EE models and for different numbers of users. Results are given in figures 8 and 9. According to Fig. 8, we deduce that $\lambda_{\max}$ decreases with the number of users for both EE models. In addition, we stress that the values reached by $\lambda_{\max}$ become closer when $N$ takes higher values. This can explain Fig. 7. The study of the variation of $\lambda_{\max}$ versus the packet arrival rate $q$ (in Fig. 9) shows that the maximum discount factor $\lambda_{\max}$ decreases with the number of users and with the packet arrival rate $q$ as well. Simulations show that there exists a packet arrival rate $q_1$ below which the $\lambda_{\max}$ corresponding to the cross-layer model takes higher values than the maximum discount factor of the Goodman model for different numbers of users. We notice that, starting from $q_1$, the maximum discount factor of the cross-layer model converges to the $\lambda_{\max}$ corresponding to the Goodman model.
In a second step, we plotted in Fig. 10 the variation of the DRG social welfare as a function of $\lambda \le \lambda_{\max}$. We notice that $\omega_{DRG}$ is an increasing function of $\lambda$. Thus, when $\lambda = \lambda_{\max}$, $\omega_{DRG}$ reaches its highest value. However, we stress that $\omega_{DRG}$ decreases with the number of users especially for the Goodman model, while it is quite similar for the cross-layer model. This confirms that the proposed new OP is still quite efficient and can be utilized for games with a high number of users. Finally, we plot for both RG models (FRG and DRG) the social welfare when using the cross-layer approach against the constant power $b$ for two different values of the packet arrival rate $q$ (0.5 and 0.7). The considered system is composed of 2 users and the spreading factor $L$ is fixed to 4. The idea consists in studying the efficiency of the cross-layer approach with respect to the Goodman algorithm.

Fig. 12: Plotting the DRG social welfare (bits/Joule) against b for q = 0.5 and q = 0.7, each with the cross-layer power and with the power p[q → 1]: the cross-layer approach improves the power control when compared to the Goodman algorithm.

Conclusion
In this paper, we have investigated RG for distributed power control in a MAC system. As the NE is not always energy-efficient, the NB solution might be a possible efficient solution since it is Pareto-efficient. However, the latter, in general, requires global CSI at each transmitter node. Thus, we were motivated to use the repeated game formulation and to develop a new OP that is both more efficient than the NE and achievable with only individual CSI at the transmitter. Also, we consider a new EE metric taking into account the presence of a queue at the transmitter with arbitrary packet arrivals.
Cooperation plans are proposed where the new OP is considered, and closed-form expressions of the minimum number of stages for the FRG and the maximum discount factor for the DRG have been established. The study of the social welfare (sum of utilities of all the users) shows that considerable gains are reached compared to the NE (for the FRG and DRG). Moreover, our model shows that even with a high number of users, the FRG can always be played with a minimum number of stages shorter than when using the Goodman model. In addition, the social welfare in the DRG decreases slightly with the number of users with the cross-layer approach while it decreases considerably with the Goodman model. Finally, the comparison of the cross-layer algorithm versus the Goodman algorithm shows that in real systems with random packet arrivals, the cross-layer power control algorithm outperforms the Goodman algorithm. Thus, the new OP with the cross-layer approach is more efficient.
An interesting extension to this work would be to consider the interference channel instead of the MAC and generalize the framework applied here. Another possible extension would be to consider the multi-carrier case and the resulting repeated game.
Appendix A (Proof of λ max )

A.1 Determination of the maximal utility
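Let us determine the maximal utility that a player can get, which is denoted $\bar{u}_i$. We denote by $\dot{p}_i$ the power maximizing the utility function $u_i$ for given powers $p_{-i}$ of the other players; it is the solution of the first-order condition $\partial u_i / \partial p_i = 0$. The maximal utility is therefore $\bar{u}_i(\dot{p}_i, p_{-i})$. We then have to study the behavior of $\bar{u}_i(\dot{p}_i, p_{-i})$ with respect to $p_j$ for $j \neq i$, that is, to determine the sign of $\partial \bar{u}_i(\dot{p}_i, p_{-i}) / \partial p_j$. We are interested in the sign of its numerator. The next step is to determine the sign of the expression $-b\, \partial \Phi(\dot{\gamma}_i) / \partial \dot{\gamma}_i$, since $f$ is an increasing function of the SINR. Therefore, we need to determine the sign of $\partial \Phi(\dot{\gamma}_i) / \partial \dot{\gamma}_i$. We have:
$$\frac{\partial \Phi(\gamma_i)}{\partial \gamma_i} = -f'(\gamma_i)\, \Pi_K(\gamma_i) + (1 - f(\gamma_i))\, \frac{\partial \Pi_K(\gamma_i)}{\partial \gamma_i}$$
The sign of the first term is negative, while the sign of the second term is the same as that of $\partial \Pi_K(\gamma_i) / \partial \gamma_i$ since $(1 - f(\gamma_i)) > 0$, and we have:
$$\frac{\partial \Pi_K(\gamma_i)}{\partial \gamma_i} = \frac{\partial \Pi_K}{\partial \rho}\, \frac{\partial \rho(\gamma_i)}{\partial \gamma_i}$$
However, $\rho(\gamma_i) = \dfrac{q(1 - f(\gamma_i))}{(1 - q) f(\gamma_i)}$ and then:
$$\frac{\partial \rho(\gamma_i)}{\partial \gamma_i} = -\frac{q\, f'(\gamma_i)}{(1 - q)\, f^2(\gamma_i)} < 0$$
As shown in [13], we have:
$$\Pi_K = \frac{\rho^K}{1 + \rho + \cdots + \rho^K}$$
The latter quantity can be expressed as:
$$\Pi_K = \frac{1}{1 + \rho^{-1} + \cdots + \rho^{-K}}$$
Consequently, we have:
$$\frac{\partial \Pi_K}{\partial \rho} > 0 \qquad (39)$$
Therefore, $\partial \Pi_K(\gamma_i) / \partial \gamma_i < 0$ and hence $\partial \Phi(\gamma_i) / \partial \gamma_i < 0$. In particular, we have $\partial \Phi(\dot{\gamma}_i) / \partial \dot{\gamma}_i < 0$. Thus, we have $-b\, \partial \Phi(\dot{\gamma}_i) / \partial \dot{\gamma}_i > 0$ and finally $\partial \bar{u}_i(\dot{p}_i, p_{-i}) / \partial p_j < 0$.
We deduce then that $\bar{u}_i$ is a decreasing function of $p_j$. It reaches its maximum when $p_j = 0$ and its minimum when $p_j = P_j^{\max}$ (for all $j \neq i$). Substituting $p_j = 0$ in the SINR expression gives $\gamma_i = p_i |g_i|^2 / \sigma^2$, so that the equation determining the optimal power becomes a function of the SINR only. We then determine the solution in terms of SINR, which we denote $\bar{\gamma}_i$ and for which the optimal power is $\bar{p}_i = \bar{\gamma}_i \sigma^2 / |g_i|^2$. This SINR exists due to the quasi-concavity of $u_i$ in $(p_i, p_{-i})$ [13,14].

A.2 Determination of λ max
The SINR $\tilde{\gamma}_i$ refers to the SINR when playing the new OP, while $\gamma_i^*$, $\bar{\gamma}_i$ and $\underline{\gamma}_i$ are the SINRs at the NE, at the maximal utility and at the min-max utility respectively. At a stage $t$, the equilibrium condition is [18]:
$$\lambda\, \bar{u}_i(p(t)) + \sum_{s \ge t+1} \lambda (1 - \lambda)^{s-t}\, E_g\!\left[u_i^*(p(s))\right] \le \lambda\, \tilde{u}_i(p(t)) + \sum_{s \ge t+1} \lambda (1 - \lambda)^{s-t}\, E_g\!\left[\tilde{u}_i(p(s))\right] \qquad (42)$$
Knowing that $\sum_{s \ge t+1} (1 - \lambda)^{s-t} = (1 - \lambda)/\lambda$, we have:
$$\lambda\, \bar{u}_i + (1 - \lambda)\, E_g\!\left[u_i^*\right] \le \lambda\, \tilde{u}_i + (1 - \lambda)\, E_g\!\left[\tilde{u}_i\right]$$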
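Rearranging the equilibrium condition (42) for $\lambda$ can be sketched numerically. Assuming stage-independent expected utilities, the condition simplifies to $\lambda \bar{u}_i + (1 - \lambda) E_g[u_i^*] \le \lambda \tilde{u}_i + (1 - \lambda) E_g[\tilde{u}_i]$, which gives $\lambda \le (E_g[\tilde{u}_i] - E_g[u_i^*]) / (\bar{u}_i - \tilde{u}_i + E_g[\tilde{u}_i] - E_g[u_i^*])$ whenever deviating yields an immediate gain. The helper below only implements this algebraic step; it is not the closed-form $\lambda_{\max}$ of the paper, which additionally bounds the utilities over the channel gains. All names and numbers are illustrative.

```python
import math

def discount_bound(u_dev, u_op, mean_u_op, mean_u_ne):
    """Largest lam with lam*u_dev + (1-lam)*mean_u_ne <= lam*u_op + (1-lam)*mean_u_op."""
    gain_now = u_dev - u_op              # one-stage gain obtained by deviating
    loss_later = mean_u_op - mean_u_ne   # expected per-stage loss under punishment
    if gain_now <= 0:
        return 1.0                       # deviating never pays: any lam in ]0, 1[ works
    if loss_later <= 0:
        return 0.0                       # punishment does not hurt: no lam works
    return loss_later / (gain_now + loss_later)

# Illustrative utilities: deviation 10, OP 6, expected OP 5, expected NE 3.
lam = discount_bound(u_dev=10.0, u_op=6.0, mean_u_op=5.0, mean_u_ne=3.0)
lhs = lam * 10.0 + (1 - lam) * 3.0   # long-term value when deviating
rhs = lam * 6.0 + (1 - lam) * 5.0    # long-term value when cooperating
```

At the returned value the two sides of the condition coincide, so any smaller stopping probability makes cooperation strictly preferable in this toy example.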

Fig. 1: Pareto-efficiency of the NB solution vs the NE.

Fig. 2: Pareto-dominance of the new OP and the NB solution vs the NE (L = 2).

Fig. 3: Better performances in terms of power and utility with the new OP for different number of users N .

Fig. 4: Improvement of the social welfare in FRG vs the NE as a function of the number of stages of the game T (L = 5).

Fig. 6: Lower values of T min of the cross-layer model when comparing to Goodman model (L = 5).

Fig. 7: Improvement of the social welfare in DRG vs the NE for Goodman and cross-layer models as a function of the spectral efficiency η for different number of users N .

Fig. 8: Variation of λ max for Goodman and cross-layer models as a function of the spectral efficiency η with different number of users N .
