The random access NUM with multiclass traffic

In this article, we consider the network utility maximization (NUM) problem for the random access network with multiclass traffic. The utilities associated with the users are not only concave, but also nonconcave functions. Consequently, the random access NUM problem becomes more difficult to solve. Based on the successive approximation method, we propose an algorithm that jointly controls the rate and the persistent probability of the users. The proposed algorithm converges to a suboptimal solution to the original problem which also satisfies the Karush–Kuhn–Tucker conditions. We also generalize the framework so that a broader choice of utility functions can be applied.


Introduction
Network utility maximization (NUM) for random access wireless networks has been thoroughly studied in the literature, e.g., [1][2][3]. The assumption of strictly concave utilities in conventional works restricts the NUM to elastic traffic, which comes from nonreal-time applications. In the current Internet, there are many kinds of traffic, both elastic and inelastic. The utilities of inelastic traffic from real-time applications are no longer strictly concave. They are usually modeled by sigmoidal utilities, which are convex in the lower region and concave in the higher region, as depicted in Figure 1 [4]. As a result, the analysis frameworks in [1][2][3] cannot be applied in the case of multiclass traffic, and the nonconvexity of the problem is very difficult to address.
Figure 1 Utility functions: …, $x^4/(x^4+400)$, $U_4(x) = x^6/(x^6+10^6)$, $U_5(x) = x/24$, $U_6(x) = x^2/24^2$ ($U_1$, $U_5$, and $U_6$ are normalized at $x_{\max} = 24$ Mbps)
The early studies that deal with inelastic traffic in the basic NUM problem for wired networks are [5,6]. The authors utilize the standard dual-based algorithm to allocate the rate. This algorithm does not lead to an optimal solution because of the nonconvexity of the primal problem. The duality gap is not always zero and the result is suboptimal or even infeasible. Therefore, the authors of [5] offer a 'self-regulate' mechanism for the users to access the network without fluctuation.
On the other hand, the authors of [6] find the conditions under which the dual-based algorithm converges to a global optimum. It turns out that the link capacity must be higher than a critical value. They then propose link 'capacity provisioning' to satisfy those conditions. Another method to solve the basic NUM is the sum-of-squares method in [7]. The nonconvex NUM is relaxed and solved by semidefinite programming. However, this method requires centralized and offline computation. Its framework is also difficult to integrate into the cross-layer optimization problem, in which the dual decomposition approach has shown its efficiency [8]. Extending the work in [6] to random access WLANs, the authors of [9] design a dual-based algorithm to jointly allocate the rate and the persistent probability of elastic and inelastic traffic. Consequently, their algorithm converges only when the link capacities are higher than critical values. Otherwise, only a lower bound and an upper bound are specified.
In this article, we address the random access NUM for multiclass traffic using the successive approximation method. The solutions to the convex approximation problems converge to a suboptimal solution which also satisfies the Karush-Kuhn-Tucker (KKT) conditions of the original problem. The successive approximation method was first introduced in [10]. It is usually applied to geometric programming in power control problems such as [11][12][13]. Similar to our previous work [14], which jointly controls the rate and power in a multi-hop wireless network with multiclass traffic, the nonconcave objective of the problem is approximated by a concave function. After solving a series of approximation problems, the algorithm converges. Moreover, we generalize our analysis framework and show that a broader choice of utilities can be used.
The rest of the article is organized as follows. Section 'Design of the successive approximation algorithm' introduces the network model and proposes the successive approximation algorithm. Section 'More general utility functions and analysis' generalizes the analysis framework and finds the conditions on the utility. The numerical results and some discussions are presented in Section 'Numerical results and discussions'. Finally, conclusions are given in Section 'Conclusions'.

Network model
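We consider a wireless LAN with the set of users $\mathcal{N}$. We assume that every user is a one-hop neighbor of every other user. Each user generates saturated traffic, i.e., it always has packets to transmit. If each user $i$ attempts to access the medium with probability $p_i$, then the probability of successful transmission of user $i$ is $p_i \prod_{j \neq i} (1 - p_j)$. As a result, the long-term transmission rate of user $i$ is
$$x_i = c_i\, p_i \prod_{j \neq i} (1 - p_j),$$
where $c_i$ is the wireless link capacity of user $i$. The random access NUM is stated as follows [1,9]
$$\text{(P1)}: \quad \max_{\mathbf{x}, \mathbf{p}} \ \sum_{i \in \mathcal{N}} U_i(x_i) \quad \text{s.t.} \quad x_i \le c_i\, p_i \prod_{j \neq i} (1 - p_j), \ \forall i \in \mathcal{N}, \qquad \mathbf{x}_{\min} \preceq \mathbf{x} \preceq \mathbf{x}_{\max}, \qquad \mathbf{0} \preceq \mathbf{p} \preceq \mathbf{1},$$
where $U_i$ is the utility function of user $i$. In this article, we assume that $x_{\min}$ is strictly greater than 0 to avoid dividing by zero in the mathematical analysis.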
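As a small illustration of the rate model above, the following sketch computes the success probability and the long-term rate of each user from a vector of persistent probabilities. The function names and the two-user numbers are hypothetical and chosen only for the example.

```python
from math import prod

def success_probabilities(p):
    """Probability that user i transmits alone: p_i * prod_{j != i} (1 - p_j)."""
    return [
        p_i * prod(1.0 - p_j for j, p_j in enumerate(p) if j != i)
        for i, p_i in enumerate(p)
    ]

def long_term_rates(c, p):
    """Long-term rate of each user: link capacity times success probability."""
    return [c_i * s_i for c_i, s_i in zip(c, success_probabilities(p))]

# Hypothetical example: two users on 6 Mbps links, each accessing with probability 0.5.
rates = long_term_rates(c=[6.0, 6.0], p=[0.5, 0.5])
```

In this symmetric example each user succeeds in a quarter of the slots, so each obtains a quarter of its link capacity.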
Each user is associated with a utility function. A broader choice of utility functions that can be applied to our framework is discussed later in Section 'More general utility functions and analysis'. In this section, we consider two groups of utility functions:
1. The concave utilities for elastic traffic
$$U_i(x_i) = \begin{cases} \ln(x_i + 1), & \alpha = 1, \\ \dfrac{(x_i + 1)^{1-\alpha} - 1}{1 - \alpha}, & \alpha > 0,\ \alpha \neq 1. \end{cases} \qquad (1)$$
2. The sigmoidal utilities for inelastic traffic
$$U_i(x_i) = \frac{x_i^{a}}{k + x_i^{a}}, \qquad a > 1,\ k > 0. \qquad (2)$$
The sigmoidal function (2) has an inflection point at $x_{in} = \left( \frac{k(a-1)}{a+1} \right)^{1/a}$. It is convex in $(x_{\min}, x_{in})$ and concave in $(x_{in}, x_{\max})$. In the literature, the sigmoidal function is usually used for the real-time utility because it is small when the rate is below $x_{in}$ and increases quickly when the rate exceeds $x_{in}$. As a result, $x_{in}$ is also considered the demand of a real-time connection (see Figure 1).
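The two utility groups can be sketched numerically as follows. The elastic form used here (a shifted α-fair utility that equals x/(x+1) for α = 2) and the sigmoidal form x^a/(x^a + k) are assumptions inferred from the numerical section; the helper names are hypothetical. The finite-difference helper only illustrates the convex-to-concave switch at the inflection point.

```python
from math import log

def elastic_utility(x, alpha):
    """Shifted alpha-fair utility (assumed form of (1)); equals x/(x+1) when alpha = 2."""
    if alpha == 1:
        return log(x + 1.0)
    return ((x + 1.0) ** (1.0 - alpha) - 1.0) / (1.0 - alpha)

def sigmoidal_utility(x, a, k):
    """Sigmoidal utility x^a / (x^a + k) (assumed form of (2))."""
    return x ** a / (x ** a + k)

def inflection_point(a, k):
    """Rate at which the sigmoid switches from convex to concave (needs a > 1)."""
    return (k * (a - 1.0) / (a + 1.0)) ** (1.0 / a)

def second_difference(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

x_in = inflection_point(a=4, k=400)  # about 3.94 Mbps for a = 4, k = 400
```

For a = 4 and k = 400, the second difference is positive one unit below x_in and negative one unit above it, which matches the convex and concave regions described in the text.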
Similar to the articles on utility optimality of multiclass traffic, e.g., [5,7,9,15], the concave utilities usually cannot take the conventional form of the α-fair utility, which is $\ln(x)$ if $\alpha = 1$ and $\frac{x^{1-\alpha}}{1-\alpha}$ if $\alpha > 0$ and $\alpha \neq 1$ [16]. Instead, it is shifted 1 unit on the x-axis. In the presence of sigmoidal utilities, which in the literature usually take the form (2) or $\frac{1}{1 + e^{-a(x-b)}}$ with $a, b > 0$, the utilities of the users are normalized, or at least have close values at $x_{\max}$, in order to be comparable. Otherwise, the inelastic flows always have the advantage over the elastic flows because the conventional α-fair utility is negative when $\alpha > 1$. So the concave utilities usually have the form (1) in these articles.

Approximation problem
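Since the utilities (1) and (2) are always positive for $x > 0$, we maximize the logarithm of the aggregate utility instead of the aggregate utility itself and replace (P1) by the following equivalent problem
$$\text{(P2)}: \quad \max_{\mathbf{x}, \mathbf{p}} \ \ln\Big( \sum_{i \in \mathcal{N}} U_i(x_i) \Big) \quad \text{s.t.} \quad x_i \le c_i\, p_i \prod_{j \neq i} (1 - p_j), \ \forall i \in \mathcal{N}, \qquad \mathbf{x}_{\min} \preceq \mathbf{x} \preceq \mathbf{x}_{\max}, \qquad \mathbf{0} \preceq \mathbf{p} \preceq \mathbf{1}.$$
The Lagrangian of (P2) is given by
$$L(\mathbf{x}, \mathbf{p}, \boldsymbol{\nu}) = \ln\Big( \sum_{i \in \mathcal{N}} U_i(x_i) \Big) + \sum_{i \in \mathcal{N}} \nu_i \Big( c_i\, p_i \prod_{j \neq i} (1 - p_j) - x_i \Big),$$
where $\nu_i$ is the multiplier associated with the constraint $x_i \le c_i p_i \prod_{j \neq i} (1 - p_j)$ for all $i \in \mathcal{N}$. We have the following result.
Lemma 1. (P1) and (P2) share the same optimal/suboptimal solutions. Moreover, if $(\mathbf{x}^*, \mathbf{p}^*, \boldsymbol{\nu}^*)$ is a KKT point of (P2), which means that the KKT conditions (4)-(7) of (P2) are satisfied, then $\big(\mathbf{x}^*, \mathbf{p}^*, (\sum_{i} U_i(x_i^*))\, \boldsymbol{\nu}^*\big)$ is a KKT point of (P1).
Proof. Since the logarithm is a monotonically increasing function, the first statement is obvious. We now verify the second statement. The Lagrangian of (P1) is given by
$$\sum_{i \in \mathcal{N}} U_i(x_i) + \sum_{i \in \mathcal{N}} \mu_i \Big( c_i\, p_i \prod_{j \neq i} (1 - p_j) - x_i \Big),$$
where $\mu_i$ denotes the corresponding multiplier of (P1). We can easily verify that (4)-(7) are equivalent to the KKT conditions of (P1) when $\boldsymbol{\mu} = (\sum_{i} U_i(x_i^*))\, \boldsymbol{\nu}^*$. For instance, at an interior point, stationarity with respect to $x_i$ reads $U_i'(x_i^*) / \sum_{j} U_j(x_j^*) = \nu_i^*$ for (P2) and $U_i'(x_i^*) = \mu_i$ for (P1), which coincide under this scaling.
We now derive an inequality to approximate (P2) by a new problem that can be equivalently transformed into a convex one. From the arithmetic-geometric mean inequality, we have
$$\sum_{i} \theta_i u_i \ \ge\ \prod_{i} u_i^{\theta_i}$$
for all $\mathbf{u} \succeq 0$, $\boldsymbol{\theta} \succ 0$, and $\mathbf{1}^T \boldsymbol{\theta} = 1$. Replacing $u_i$ with $U_i(x_i)/\theta_i$ and taking the logarithm of both sides of the inequality yields
$$\ln\Big( \sum_{i \in \mathcal{N}} U_i(x_i) \Big) \ \ge\ \sum_{i \in \mathcal{N}} \theta_i \ln\frac{U_i(x_i)}{\theta_i}. \qquad (13)$$
The equality of (13) holds if and only if
$$\theta_i = \frac{U_i(x_i)}{\sum_{j \in \mathcal{N}} U_j(x_j)}, \quad \forall i \in \mathcal{N}. \qquad (14)$$
Now we consider the approximation problem as follows
$$\text{(P3}^{\tau}\text{)}: \quad \max_{\mathbf{x}, \mathbf{p}} \ \sum_{i \in \mathcal{N}} \theta_i^{(\tau)} \ln\frac{U_i(x_i)}{\theta_i^{(\tau)}} \quad \text{s.t.} \quad x_i \le c_i\, p_i \prod_{j \neq i} (1 - p_j), \ \forall i \in \mathcal{N}, \qquad \mathbf{x}_{\min} \preceq \mathbf{x} \preceq \mathbf{x}_{\max}, \qquad \mathbf{0} \preceq \mathbf{p} \preceq \mathbf{1}.$$
As mentioned earlier, there is a sequence of approximations. The superscript $\tau$ indicates the $\tau$th approximation problem, and $\boldsymbol{\theta}^{(\tau)}$ is a fixed value in the $\tau$th approximation problem.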
It will be proved that, by updating θ and solving the approximation problem many times, the solution to the approximation problem converges.At the stationary point, the approximation becomes exact.
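The role of θ can be checked numerically. The sketch below assumes the approximation is the arithmetic-geometric mean bound, i.e., the sum of θ_i ln(U_i/θ_i) is a lower bound on ln(ΣU_i), and that the bound is tight when θ_i = U_i/ΣU_j. The utility values are hypothetical.

```python
from math import log

def log_sum(utilities):
    """Objective of the log-transformed problem: ln(sum_i U_i)."""
    return log(sum(utilities))

def am_gm_lower_bound(utilities, theta):
    """Concave surrogate: sum_i theta_i * ln(U_i / theta_i) <= ln(sum_i U_i)."""
    return sum(t * log(u / t) for u, t in zip(utilities, theta))

def update_theta(utilities):
    """Weights that make the bound tight: theta_i = U_i / sum_j U_j."""
    total = sum(utilities)
    return [u / total for u in utilities]

utilities = [0.8, 0.1, 0.6]      # hypothetical utility values U_i(x_i) > 0
uniform = [1.0 / 3.0] * 3        # an arbitrary feasible weight vector
tight = update_theta(utilities)  # weights after one outer update

gap_uniform = log_sum(utilities) - am_gm_lower_bound(utilities, uniform)
gap_tight = log_sum(utilities) - am_gm_lower_bound(utilities, tight)
```

With arbitrary weights the gap is strictly positive, whereas after the θ update it vanishes up to rounding, which is the sense in which the approximation becomes exact at the stationary point.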
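Changing the variables $\tilde{x}_i = \ln(x_i)$ as in [1,9] to separate the product form of the constraints, the following problem is obtained
$$\text{(P4}^{\tau}\text{)}: \quad \max_{\tilde{\mathbf{x}}, \mathbf{p}} \ \sum_{i \in \mathcal{N}} \tilde{U}_i(\tilde{x}_i) \quad \text{s.t.} \quad \tilde{x}_i \le \tilde{c}_i + \ln p_i + \sum_{j \neq i} \ln(1 - p_j), \ \forall i \in \mathcal{N}, \qquad \tilde{\mathbf{x}}_{\min} \preceq \tilde{\mathbf{x}} \preceq \tilde{\mathbf{x}}_{\max}, \qquad \mathbf{0} \preceq \mathbf{p} \preceq \mathbf{1},$$
where $\tilde{U}_i(\tilde{x}_i) = \theta_i^{(\tau)} \ln\big( U_i(e^{\tilde{x}_i}) / \theta_i^{(\tau)} \big)$ is a function of $\tilde{x}_i$ parameterized by $\theta_i$, and $\tilde{c}_i = \ln(c_i)$.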
Proof.See the Appendix for the proof.
From Lemma 2, (P4 τ ) is a convex problem; therefore, it can be solved efficiently for an optimal solution.In the next section, we will solve (P4 τ ) using the dual-based decomposition approach.
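As a worked check of the concavity claim for one case, consider a sigmoidal utility of the form $U_i(x) = x^a/(x^a + k)$ (assumed here to be the form of (2), consistent with the parameters $k = 400$ and $a = 4$ used in the numerical section). After the change of variables, the transformed objective and its derivatives are:

```latex
\begin{align*}
\tilde U_i(\tilde x_i)
  &= \theta_i \ln\frac{U_i(e^{\tilde x_i})}{\theta_i}
   = \theta_i\left[a\tilde x_i - \ln\!\left(e^{a\tilde x_i}+k\right) - \ln\theta_i\right],\\
\frac{d\tilde U_i}{d\tilde x_i}
  &= \frac{\theta_i\, a\, k}{e^{a\tilde x_i}+k},\\
\frac{d^2\tilde U_i}{d\tilde x_i^2}
  &= -\frac{\theta_i\, a^2 k\, e^{a\tilde x_i}}{\left(e^{a\tilde x_i}+k\right)^2} \;<\; 0 .
\end{align*}
```

The second derivative is negative for every $\tilde x_i$ whenever $\theta_i, a, k > 0$, so the sigmoid, although nonconcave in $x_i$, yields a strictly concave term in $\tilde x_i$.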

Solution to the approximation problem and the algorithm
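We apply the dual decomposition method to solve (P4$^{\tau}$). Its Lagrangian is given by
$$L(\tilde{\mathbf{x}}, \mathbf{p}, \boldsymbol{\lambda}) = \sum_{i \in \mathcal{N}} \tilde{U}_i(\tilde{x}_i) + \sum_{i \in \mathcal{N}} \lambda_i \Big( \tilde{c}_i + \ln p_i + \sum_{j \neq i} \ln(1 - p_j) - \tilde{x}_i \Big) = \sum_{i \in \mathcal{N}} \big[ \tilde{U}_i(\tilde{x}_i) - \lambda_i \tilde{x}_i \big] + \sum_{i \in \mathcal{N}} \Big[ \lambda_i \ln p_i + \Big( \sum_{j \neq i} \lambda_j \Big) \ln(1 - p_i) \Big] + \sum_{i \in \mathcal{N}} \lambda_i \tilde{c}_i,$$
where $\lambda_i$ is the multiplier associated with the rate constraint of user $i$. Given $\boldsymbol{\lambda}$, maximizing the Lagrangian decomposes into the two subproblems
$$\max_{\tilde{x}_{\min} \le \tilde{x}_i \le \tilde{x}_{\max}} \ \tilde{U}_i(\tilde{x}_i) - \lambda_i \tilde{x}_i, \qquad (15)$$
$$\max_{0 \le p_i \le 1} \ \lambda_i \ln p_i + \Big( \sum_{j \neq i} \lambda_j \Big) \ln(1 - p_i). \qquad (16)$$
Since both subproblems (15) and (16) are convex problems, the first-order conditions are sufficient to establish their optimal solutions. The solution to the first subproblem (15) at time instant $t$ is given by
$$\tilde{x}_i^{(\tau)}(t) = \Big[ \big( \tilde{U}_i' \big)^{-1} \big( \lambda_i^{(\tau)}(t) \big) \Big]_{\tilde{x}_{\min}}^{\tilde{x}_{\max}}, \qquad (17)$$
where $[z]_{z_{\min}}^{z_{\max}} = \min(\max(z, z_{\min}), z_{\max})$ is the projection of $z$ on $[z_{\min}, z_{\max}]$. Solving the second subproblem (16) yields the persistent probability [1]
$$p_i^{(\tau)}(t) = \frac{\lambda_i^{(\tau)}(t)}{\sum_{j \in \mathcal{N}} \lambda_j^{(\tau)}(t)}. \qquad (18)$$
We now apply the subgradient algorithm to solve the dual problem. A subgradient of the dual function with respect to $\lambda_i$ is $\tilde{c}_i + \ln p_i + \sum_{j \neq i} \ln(1 - p_j) - \tilde{x}_i$, where $\tilde{x}_i$ and $p_i$ are specified by (17) and (18), respectively. Hence, the subgradient update is as follows [17]
$$\lambda_i^{(\tau)}(t+1) = \Big[ \lambda_i^{(\tau)}(t) - \gamma(t) \Big( \tilde{c}_i + \ln p_i^{(\tau)}(t) + \sum_{j \neq i} \ln\big(1 - p_j^{(\tau)}(t)\big) - \tilde{x}_i^{(\tau)}(t) \Big) \Big]^{+}, \qquad (19)$$
where $\gamma(t)$ is the step-size sequence, $\tilde{x}_i^{(\tau)}(t)$ and $p_i^{(\tau)}(t)$ are calculated according to (17) and (18), respectively, at time instant $t$, and $[a]^{+} = \max(a, 0)$. Once again, we use the superscript $\tau$ in (17)-(19) to indicate that they are the values in solving the $\tau$th approximation problem. From the above analysis, we develop the successive approximation algorithm for multiclass traffic in the one-hop random access wireless network as described in Algorithm 1.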
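One inner iteration can be sketched as three small functions. This is an illustration under stated assumptions, not the authors' implementation: the rate update is written in closed form for a sigmoidal utility x^a/(x^a + k), the probability update is assumed to be p_i = λ_i/Σλ_j, and the multiplier update is assumed to be a projected subgradient step on the log-domain constraint slack. All function and parameter names are hypothetical, and the multipliers are assumed strictly positive so that the logarithms are defined.

```python
from math import log

def rate_update(lam, theta, a, k, x_min, x_max):
    """Maximize theta*ln(U(e^y)/theta) - lam*y over y = ln(x) for U = x^a/(x^a+k).

    Stationarity gives theta*a*k / (e^{a*y} + k) = lam; the result is then
    projected onto [ln(x_min), ln(x_max)].
    """
    ratio = theta * a / lam - 1.0
    if ratio <= 0.0:                 # objective decreases everywhere: lowest rate
        return log(x_min)
    y = log(k * ratio) / a
    return min(max(y, log(x_min)), log(x_max))

def probability_update(lams):
    """Persistent probabilities p_i = lam_i / sum_j lam_j (assumed form of (18))."""
    total = sum(lams)
    return [lam / total for lam in lams]

def multiplier_update(lams, log_rates, probs, log_caps, step):
    """Projected subgradient step on the dual variables (assumed form of (19))."""
    new_lams = []
    for i, lam in enumerate(lams):
        log_success = log(probs[i]) + sum(
            log(1.0 - q) for j, q in enumerate(probs) if j != i
        )
        slack = log_caps[i] + log_success - log_rates[i]
        new_lams.append(max(lam - step * slack, 0.0))
    return new_lams
```

When the rate of a user is below what its link and access probability allow, the slack is positive and its multiplier decreases, which in turn raises the rate in the next iteration.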

Algorithm 1 Successive approximation algorithm for multiclass traffic
1. Initialize from $\boldsymbol{\theta}^{(0)}$ …
Here, $\mathbf{x}^{(\tau)*}$ is the stationary value of the $\tau$th (outer-)iteration. At step 10, the new value of $\boldsymbol{\theta}$ is calculated from the stationary rate of the previous outer-iteration. Moreover, at step 11, the initial value of a new outer-iteration is set to the stationary value of the previous outer-iteration.
Proof.See the Appendix for the proof.
We now discuss the distributed implementation and the message passing mechanism of the proposed algorithm. There are two kinds of updates in Algorithm 1, the inner-updates (17)-(19) and the outer-updates (14). In each inner-iteration, a user uses the information ∑ j∈N λ j (t) to update its persistent probability according to (18). The persistent probabilities of all the nodes are also needed to update the user's multiplier according to (19). Hence, after each inner-iteration, each user broadcasts its information (p i and λ i ) to all the other users in the network. In each outer-iteration, each user needs the total utility of all the users to update its θ-value according to (14). Therefore, each user also broadcasts its current utility value to all the other users in each outer-iteration. Note that the users update their θ-values only once they recognize that the inner-iterations have become stationary. The following technique can be used for this purpose. The users broadcast their utilities periodically, every T time-slots, so each user can always keep track of the aggregate utility value of the system. A user updates its θ-value only when it recognizes that this aggregate value has become stationary.
Finally, there are some mechanisms to reduce the amount of message passing in the network:
1. Each node piggybacks its information $p_i$, $\lambda_i$, and $\theta_i$ by inserting it into its data packets. Since all nodes are one-hop neighbors of each other, the other nodes can overhear this information and update their values based on it.
2. The multiplier update (19) can be a local update as follows. We rewrite the update (19) as
$$\lambda_i(t+1) = \Big[ \lambda_i(t) - \gamma(t) \big( \tilde{c}_i + \ln p_i^{succ}(t) - \tilde{x}_i(t) \big) \Big]^{+},$$
where $p_i^{succ} = p_i \prod_{j \neq i} (1 - p_j)$ is the successful transmission probability of node $i$. The value $p_i^{succ}$ can be estimated locally. For example, (1) $p_i^{succ} \approx \frac{\text{number of successful transmissions of } i}{\text{number of transmissions of } i}$, or (2) we can estimate the probability that the channel is idle, $p^{idle} \approx \frac{\text{number of timeslots the channel is idle}}{\text{number of timeslots}}$, and the successful transmission probability will be $p_i^{succ} = \frac{p_i\, p^{idle}}{1 - p_i}$. By estimating this parameter locally, the multipliers can be implicitly updated. Therefore, the amount of message passing in the network is reduced significantly.
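The idle-channel estimate in item 2 can be sketched as below. It relies on the identity that the idle probability is the product of (1 - p_j) over all users, so a node's success probability p_i ∏_{j≠i}(1 - p_j) equals p_i · p_idle / (1 - p_i). The function names and slot counts are hypothetical.

```python
from math import prod

def estimate_idle(idle_slots, total_slots):
    """Empirical fraction of time-slots in which nobody transmitted."""
    return idle_slots / total_slots

def success_from_idle(p_i, p_idle):
    """p_i * prod_{j != i} (1 - p_j), rewritten as p_i * p_idle / (1 - p_i)."""
    return p_i * p_idle / (1.0 - p_i)

# Consistency check against the exact product form for hypothetical probabilities.
p = [0.2, 0.3, 0.5]
p_idle_exact = prod(1.0 - q for q in p)
local = success_from_idle(p[0], p_idle_exact)
exact = p[0] * prod(1.0 - q for q in p[1:])
```

A node therefore only needs its own access probability and a count of idle slots, both of which it can observe without any message exchange.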

More general utility functions and analysis
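In the first part of this section, we focus on the conditions on the utility functions under which the above analysis can still be applied. It is easy to see that the first criteria are
• twice continuously differentiable and monotonically increasing function;
The important condition is that the function $\tilde{U}_i(\tilde{x}_i) = \theta_i \ln\big( U_i(e^{\tilde{x}_i}) / \theta_i \big)$ must be strictly concave.
Equivalently, we must have $\frac{d^2 \tilde{U}_i(\tilde{x}_i)}{d \tilde{x}_i^2} < 0$. With the assumption $U_i(x_i) > 0$, $\forall x_i > 0$, the condition is equivalent to
$$U_i(x_i) \big[ U_i'(x_i) + x_i U_i''(x_i) \big] < x_i \big( U_i'(x_i) \big)^2, \qquad (20)$$
where $U_i'$ and $U_i''$ are the first and second derivatives of $U_i$.
We next consider the logarithm transformation from (P1) to (P2). Indeed, the log-transformation $\ln(u)$ transforms $u$ into a 'more' concave function, for example, $x + 1$ is linear but $\ln(x + 1)$ is strictly concave; … is concave. We generalize the analysis by using a general concave function $f(u)$ which is monotonically increasing. Instead of using the approximation inequality (13) from the arithmetic-geometric mean inequality, we use Jensen's inequality
$$f\Big( \sum_{i \in \mathcal{N}} U_i(x_i) \Big) \ \ge\ \sum_{i \in \mathcal{N}} \theta_i f\Big( \frac{U_i(x_i)}{\theta_i} \Big) \qquad (21)$$
for all vectors $\boldsymbol{\theta}$ such that $\boldsymbol{\theta} \succ 0$ and $\mathbf{1}^T \boldsymbol{\theta} = 1$. In this case, the condition on the utility function in order to perform the analysis is that $\theta_i f\big( U_i(e^{\tilde{x}_i}) / \theta_i \big)$ must be concave or, equivalently, that its second derivative with respect to $\tilde{x}_i$ must be negative. Hence,
$$\frac{U_i(x_i) \big[ U_i'(x_i) + x_i U_i''(x_i) \big]}{x_i \big( U_i'(x_i) \big)^2} < -\frac{u_i f''(u_i)}{f'(u_i)}, \qquad u_i = \frac{U_i(x_i)}{\theta_i}. \qquad (22)$$
We note the factor $-\frac{u f''(u)}{f'(u)}$. It is always positive because $f$ is a monotonically increasing and concave function. The higher the factor, the quicker the slope of $f$ changes, and the more relaxed the condition on the utility.
Particularly, if $f(\cdot)$ has the form of the widely used α-fair family,
$$f(u) = \begin{cases} \ln(u), & \beta = 1, \\ \dfrac{u^{1-\beta}}{1-\beta}, & \beta > 0,\ \beta \neq 1, \end{cases}$$
($\beta$ is used here to distinguish it from the $\alpha$ parameter in (1)), we can see that the analysis in Section 'Design of the successive approximation algorithm' is a special case with $\beta = 1$, and the condition (22) becomes exactly (20) in this case. In the case of $\beta > 0$ and $\beta \neq 1$, the factor $-\frac{u f''(u)}{f'(u)}$ equals $\beta$. So, the higher the value of $\beta$, the more relaxed the condition (22). We consider the following examples:
$U(x) = \frac{x^{1-\alpha}}{1-\alpha}$ with $0 < \alpha < 1$: although this function is a canonical α-fair concave function, it cannot be applied to [1]. Lemma 1 therein requires a 'sufficiently' concave utility function, i.e., $\alpha > 1$ for the α-fair family. However, with the transform function $f(u) = -1/u$ (which corresponds to $\beta = 2$) and the new approximation (21) instead of (13), our framework can be applied.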
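The effect of β can be explored numerically. The sketch below assumes the concavity condition can be written as U(x)[U'(x) + xU''(x)] / (x U'(x)²) < β for an α-fair transform with parameter β, which follows from differentiating θ f(U(e^x̃)/θ) twice with respect to x̃; this form is derived here and is not quoted from the article. The utilities, derivative lambdas, and rate grid are hypothetical examples.

```python
from math import exp

def concavity_ratio(u, du, d2u, x):
    """U(x) * (U'(x) + x*U''(x)) / (x * U'(x)**2), to be compared against beta."""
    return u(x) * (du(x) + x * d2u(x)) / (x * du(x) ** 2)

def admissible(u, du, d2u, beta, grid):
    """True if the ratio stays strictly below beta on every grid point."""
    return all(concavity_ratio(u, du, d2u, x) < beta for x in grid)

grid = [0.5 + 0.5 * i for i in range(48)]   # hypothetical rates from 0.5 to 24 Mbps

# Each utility is given as (U, U', U'').
linear = (lambda x: x, lambda x: 1.0, lambda x: 0.0)
root = (lambda x: 2.0 * x ** 0.5, lambda x: x ** -0.5, lambda x: -0.5 * x ** -1.5)
exponential = (exp, exp, exp)               # U(x) = e^x, ratio equals 1 + 1/x
```

For the linear utility and for the α-fair utility with α = 0.5 the ratio is identically 1, so the log transform (β = 1) is not sufficient while any β > 1 is; for the exponential utility the ratio is 1 + 1/x, so the required β depends on the smallest rate considered.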

Linear/convex utility function U
… is a linear function, the analysis in Section 'Design of the successive approximation algorithm' cannot be applied. With the use of … $e^{M x_i}$ … is a strictly concave function. Note that this utility function certainly leads to the nonconvergence of the standard dual-based algorithm in [1,9] because it is not a concave function. …, then the exponential utility can still be applied.

Numerical results and discussions
In this section, we use $\frac{x_i}{x_i + 1}$ as the elastic utility with $\alpha = 2$ and $\frac{x_i^4}{x_i^4 + 400}$ as the inelastic utility with $k = 400$ and $a = 4$ (see Figure 1). The rate unit for calculating utilities is Mbps. The inner-iteration is considered stationary if $\frac{|\mathbf{x}(t) - \mathbf{x}(t-1)|}{\mathbf{x}(t-1)} \prec 10^{-4}$, where the operations are taken componentwise. We set $x_{\min} = 0.01$ Mbps and $x_{\max} = c$ Mbps. The diminishing step size $0.001/t$ is used for Algorithm 1, and $\lambda^{(0)}(0)$ is 0.1.
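The stopping rule for the inner-iterations can be written as a short helper. The sketch assumes the test is a componentwise relative change of the rate vector below 10^-4; the function name and sample vectors are hypothetical. Division is safe because the rates are bounded below by x_min > 0.

```python
def is_stationary(current, previous, tol=1e-4):
    """True when every component's relative change is below tol."""
    return all(abs(c - p) / abs(p) < tol for c, p in zip(current, previous))

# Hypothetical consecutive rate vectors (Mbps).
settled = is_stationary([4.20001, 3.36000], [4.20000, 3.36000])
moving = is_stationary([4.30, 3.36], [4.20, 3.36])
```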

Convergence of the algorithm
In the first experiment, we examine the convergence of Algorithm 1 when resources are scarce. We consider a network with two inelastic users. The link capacities are all 6 Mbps. With the standard dual-based algorithm presented in [9, Alg. 1], although the persistent probabilities of the two flows converge, we cannot find any step size for which the rates converge. With Algorithm 1, however, both rates and persistent probabilities converge to a stationary point, as shown in Figure 2a,b.

Figure 2 The rate and persistent probability of two inelastic flows, c = [6 6] Mbps
We can see that although the two users are symmetric, i.e., they have the same utilities and link capacities, one of them accesses the channel most of the time whereas the other one is mostly abandoned. This result shows the major difference from the resource allocation of elastic flows, in which all elastic flows are allocated the resource fairly. Therefore, by using sigmoidal utilities, admission control is implicitly integrated as we solve the NUM. This is an advantage of using the sigmoidal utility. We also remark that fairness among inelastic users is rarely achieved. Intuitively, when there is not enough resource for both flows, it is better to drop one flow and keep the other than to maintain both inelastic flows with bad quality. This unfairness is similar to that of a real-time system with an explicit admission control scheme, in which some real-time connections can be dropped to guarantee the system performance when resources are lacking.
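To mitigate the unfairness among the users as well as to avoid the starvation of some users in the network, we can guarantee a minimum persistent probability for each user. The constraint $\mathbf{0} \preceq \mathbf{p} \preceq \mathbf{1}$ is replaced by the new one $\mathbf{p}^{\min} \preceq \mathbf{p} \preceq \mathbf{1}$, where $\max_i p_i^{\min} \le 1/|\mathcal{N}|$ to avoid infeasibility. As a result, the persistent probability update (18) for each user in the $\tau$th outer-iteration becomes
$$p_i^{(\tau)}(t) = \left[ \frac{\lambda_i^{(\tau)}(t)}{\sum_{j \in \mathcal{N}} \lambda_j^{(\tau)}(t)} \right]_{p_i^{\min}}^{1}$$
for all $i \in \mathcal{N}$. With the new lower bound $\mathbf{p}^{\min}$, all the users have a minimum chance to access the channel.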
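A minimal sketch of the bounded update is given below, assuming it is the unconstrained ratio λ_i/Σλ_j projected onto [p_i^min, 1]. The projection is the constrained maximizer because the per-user objective is concave in p_i. The function name and the numbers are hypothetical.

```python
def probability_update_with_floor(lams, p_min):
    """Project lam_i / sum_j lam_j onto [p_min_i, 1] for every user."""
    total = sum(lams)
    return [min(max(lam / total, lo), 1.0) for lam, lo in zip(lams, p_min)]

# Hypothetical multipliers in which the first user would otherwise be starved.
probs = probability_update_with_floor([0.001, 0.999], p_min=[0.1, 0.1])
```

In this example the first user is lifted to the floor of 0.1 while the second keeps its unconstrained value of about 0.999.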

A heuristic implementation
We implement a heuristic algorithm in this experiment by limiting the number of inner-iterations in each τ -step to a fixed value T. As we have seen, Algorithm 1 has two levels of convergence.The outer-iterations update θ and the inner-iterations solve the convex approximation problem.
Theoretically, the number of inner-iterations must be large enough for convergence in every outer-step. In the heuristic algorithm, we limit the number of inner-iterations to a fixed value T. Moreover, we also apply a constant step size to the subgradient update (19) since it usually converges faster than the diminishing step size. It is known that with the dual-based subgradient algorithm using a constant step size, the primal function sequence calculated from the running average primal values $\{ \bar{\mathbf{x}}^{(\tau)}(t) = \frac{1}{t} \sum_{k=1}^{t} \mathbf{x}^{(\tau)}(k),\ t = 1, 2, \ldots \}$ converges to an optimal value (of (P3$^{\tau}$)) within an error ([18], Sec. 1.2). The feasibility violation of the running average primal sequence also converges to zero. So, in the heuristic algorithm, $\theta_i$ corresponding to user $i$ is updated according to
$$\theta_i^{(\tau)} = \frac{U_i\big( \bar{x}_i^{(\tau-1)}(T) \big)}{\sum_{k \in \mathcal{N}} U_k\big( \bar{x}_k^{(\tau-1)}(T) \big)},$$
where $\bar{\mathbf{x}}^{(\tau-1)}(T)$ is the running average value of the previous outer-iteration. The heuristic algorithm converges to the same solution as Algorithm 1 in most of our experiments. However, its convergence cannot be guaranteed theoretically. The reason is that, with the dual-based subgradient update solving the approximation problem, the primal value $\bar{x}_k^{(\tau-1)}(T)$ can be infeasible. Therefore, the inequality (27) is no longer valid, i.e., we cannot guarantee a feasible improvement of the objective in every outer-iteration.
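The running average used by the heuristic can be computed incrementally, without storing the whole primal sequence. The sketch below shows this for a scalar sequence; the function name and the sample values are hypothetical.

```python
def running_average(samples):
    """Averages x_bar(t) = (1/t) * sum_{k<=t} x(k), computed incrementally."""
    averages, mean = [], 0.0
    for t, x in enumerate(samples, start=1):
        mean += (x - mean) / t      # incremental mean update
        averages.append(mean)
    return averages

# A hypothetical oscillating primal sequence and its running averages.
oscillating = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
averaged = running_average(oscillating)
```

Even when the raw iterates keep oscillating, as can happen with a constant step size, the averaged sequence settles, which is why the heuristic feeds the average rather than the last iterate into the θ update.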
We repeat the experiment in Subsection 'Convergence of the algorithm' with T = 5. Figure 2c,d shows the evolution of the rate and persistent probability with the heuristic algorithm. The convergence is much faster than with stationary inner-iterations, as shown in Figure 2a,b. We consider another example in which there are four users, two elastic and two inelastic. The link capacities are c = [36 24 6 48] Mbps. Figure 3c,d also shows the convergence of the heuristic algorithm, which is again much faster than that of Algorithm 1 in Figure 3a,b.
Given θ, (P4 τ ) as well as (P3 τ ) have a unique optimal solution due to the strict convexity of (P4 τ ). So, the result of Algorithm 1 depends only on the choice of the initial θ (0) . In this experiment, we evaluate the stationary point for different initial θ (0) . Let us consider again the network with four users in Section 'A heuristic implementation'. We uniformly generate 100 random initial vectors θ (0) and run Algorithm 1 from these 100 initial points. Figure 4 shows the results of the 100 experiments starting from these initial points. We can see that 72% of the experiments reach the globally optimal point x * = [4.20 3.36 0.01 9.03] Mbps, p * = [0.28 0.32 0.01 0.39], and U sum * = 2.52.
Figure 4 The stationary points obtained when the initial point θ (0) is chosen randomly

Compare to the standard dual-based algorithm
We compare the aggregate utility achieved by Algorithm 1 to the lower and upper bounds calculated from the standard dual-based algorithm in [9] as the number of users in the network increases gradually. In [9], after log-transforming the rate variables of the original NUM, the standard dual-based algorithm (Algorithm 1 therein) can achieve the stationary value of the multipliers, i.e., λ * , due to the convexity of the dual problem. Therefore, the lower bound is calculated by … . The upper bound is the value of the dual function at the point λ * . Notice that this upper bound is not a feasible solution in the case of a nonzero duality gap.
We fix the link capacities at 12 Mbps and increase the number of users gradually.Half of the users have the elastic utilities and the other ones have the inelastic utilities.Figure 5 shows that when the number of users increases, the aggregate utility also increases.It is always higher than the lower bound specified by the standard dual-based algorithm in [9].
Figure 5 Aggregate utility comparison between the proposed algorithm and the upper and lower bounds specified by the standard dual-based algorithm [9]

Compare to binary exponential backoff MAC protocol
In this experiment, we compare our proposed algorithm to a MAC protocol running the binary exponential backoff (BEB) rule, such as IEEE 802.11 DCF. It is known that the window-based BEB MAC protocol implicitly maximizes its own utility function in a noncooperative game model [19]. Its equilibrium persistent probability depends on the maximum and minimum contention windows (CW). In this experiment, the minimum CW for BEB MAC is 7 time-slots and the maximum CW is 1,023 time-slots. All the links are fixed at 12 Mbps. We vary the number of users from 4 to 50. Half of the users are elastic and the other half are inelastic. The collision probability is the probability that more than one user accesses the channel at the same time. The system throughput is calculated according to [20] with the parameters listed in Table 1. Figure 6 shows the system throughput and collision probability of the proposed algorithm and BEB MAC. When the number of nodes is small, the collision probability of our proposed protocol is slightly higher than that of BEB MAC and the system throughput of our proposed protocol is slightly lower than that of BEB MAC. However, when the number of nodes in the network increases, the collision probability of BEB MAC also increases since the users use incomplete information about the network condition in their distributed operation. With our proposed algorithm, many users tend to decrease their access probability (equivalently, extend their contention window) to decrease the number of collisions for each user (see Figure 6a). As a result, the system throughput of BEB MAC decreases much faster than that of our proposed protocol as we increase the number of nodes in the network (see Figure 6b).
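The collision probability defined above can be computed from the persistent probabilities alone. The sketch assumes independent access decisions in each slot, so a slot is idle, carries exactly one transmission, or has a collision; the function name and the probability vectors are hypothetical.

```python
from math import prod

def channel_probabilities(p):
    """Return (idle, success, collision) probabilities of a slot."""
    idle = prod(1.0 - q for q in p)
    success = sum(
        q * prod(1.0 - r for j, r in enumerate(p) if j != i)
        for i, q in enumerate(p)
    )
    collision = 1.0 - idle - success   # two or more users transmit together
    return idle, success, collision

idle, success, collision = channel_probabilities([0.5, 0.5])
```

For two users that each access with probability 0.5, a quarter of the slots are idle, half carry a successful transmission, and a quarter are lost to collisions.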

Figure 6 System throughput and collision probability comparison between the proposed algorithm and the BEB MAC protocol
Proof of Theorem 1
Define $\mathbf{x}^{(\tau)}(0)$ to be the initial point of step $\tau$, and $\mathbf{x}^{(\tau)*}$ to be the stationary point of step $\tau$. First of all, we show that $\mathbf{x}^{(\tau)*}$ is obtainable in each outer-iteration. Given $\boldsymbol{\theta}$, it is known that problem (P3$^{\tau}$) has a unique optimal solution because it is a strictly convex problem with a strictly concave objective. With the assumptions on the step size $\gamma(t) > 0$, $\lim_{t \to \infty} \gamma(t) = 0$, and $\sum_{t=1}^{\infty} \gamma(t) = \infty$, the dual-based subgradient algorithm converges to the optimal point given $\boldsymbol{\theta}^{(\tau)}$ in each $\tau$-step according to ([17], Prop. 8.2.5).

In this article, we use italic characters to denote variables and bold characters to denote vectors. For example, x = [x 1 , . . ., x |N | ], p = [p 1 , . . ., p |N | ], and c = [c 1 , . . ., c |N | ] are |N |-dimensional vectors whose elements are x i , p i , and c i , respectively. The words 'user' and 'node' are sometimes used interchangeably.

Figure 3 The rate and persistent probability of four flows, two elastic and two inelastic, c = [36 24 6 48] Mbps

… it is clear that we cannot use the standard dual-based algorithm in [1] for the same reason as in the above examples. The inequality (22) becomes $\beta > 1 + \frac{1}{x_i}$. Therefore, if we choose $\beta$ such that $\beta > 1 + 1$…

Table 1
Setting parameters for experiment 4.5