A Distributed Cross-Layer Optimization Method for Multicast in Interference-Limited Multihop Wireless Networks

We consider joint optimization of data routing and resource allocation in multicast multihop wireless networks where interference between links is taken into account. The use of network coding in such scenarios leads to a nonconvex optimization problem. By applying the probability collectives (PCs) technique, the original problem is turned into a new problem which is convex over probability distributions. The resulting problem is then further decomposed into a data routing subproblem at the network layer and a power allocation subproblem at the physical layer in order to achieve a cross-layer distributed solution for the whole range of SINR values. The proposed approach is also extended to minimum cost multicast problems and to routing problems based on multicommodity flow and single Steiner tree, resulting in new distributed algorithms for such problems.


INTRODUCTION
In this paper, we consider the problem of resource allocation in wireless multihop networks, where a source node simultaneously transmits common information to a set of destinations via relay nodes. In contrast with wired networks, link capacities are not fixed; in general, they are functions of communication resources such as transmit power. Hence, achieving optimal throughput requires joint optimization of data flow routing and resource allocation.
As shown by Ahlswede et al. in [1], data routing can be performed efficiently through network coding, where nodes are allowed to mix information and send certain functions of received data on their outgoing links. Network coding was originally developed for wired networks (more precisely, networks with fixed-capacity and lossless links). In such networks, multicast capacity (which is an upper bound for multicast throughput) is always achievable by network coding, whereas in general it may not be achievable with routing [1,2]. Li et al. showed in [3] that linear coding usually suffices in achieving the maximum rate. A polynomial-time algorithm to achieve the maximum multicast rate in directed networks is proposed in [4]. Alternatively, Ho et al. in [5] designed a distributed algorithm based on random network coding. Li et al. in [6] formulated the problem of computing optimal throughput as a linear optimization problem and proposed a distributed algorithm to solve this problem.
The problem of joint optimization of data flow routing and resource allocation has also been investigated by different researchers. In wireline networks, when multicast routing is considered, the problem is equivalent to the Steiner tree problem, which is known to be NP-hard [7]. However, by the use of network coding, this problem can be solved efficiently in a distributed manner. The main idea is to assume a convex cost (or concave utility) objective function so that the problem can be formulated as a convex optimization problem and be solved efficiently by using Lagrange relaxation and subgradient methods [8,9]. A game theoretic solution to this problem has also been proposed by Bhadra et al. in [10]. However, solving the aforementioned problem is more difficult in wireless networks. Since link capacity is in general a function of link power, achieving the optimal result requires consideration of both the network and physical layers. Finding optimal multicast routing in routed wireless networks is an NP-hard problem [11]. The joint optimization of routing and resource allocation based on multicommodity flow is investigated in [12,13], where distributed cross-layer solutions are offered. As shown in [12], with the assumption that link capacity is a concave and increasing function of the communication resources allocated to the link, the problem becomes a convex optimization problem which can be solved efficiently by dual decomposition. In [13], CDMA wireless networks are considered and it is shown that for relatively high values of SINR, this problem can also be turned into a convex optimization problem. In addition, based on single Steiner tree routing, Cheng et al. in [14] addressed energy-efficient routing in multihop ad hoc wireless networks. They proposed a distributed algorithm for optimal routing in interference-free networks through proper power allocation to each link.
Recently, the problem of joint optimization of data flow routing and resource allocation in wireless networks, when network coding is used in the network layer, has also become of interest. The minimum cost multicast problem has been considered in [15,16]. The authors formulated the problem as a convex optimization problem, using time sharing to eliminate interference between links, and offered a centralized cross-layer approach. Yuan et al. in [17] have offered a cross-layer optimization framework to achieve optimal throughput in wireless networks. They showed that by use of time (frequency) sharing or by applying a logarithmic transformation at high SINR values, as well as assuming a concave utility function, a distributed solution can be obtained via dual decomposition. An analogous approach has also been adopted in [18].
The main goal of this paper is to extend the scope of such problems to high interference scenarios (low SINR) as well as nonconvex cost (or nonconcave utility) objective functions, where we deal with a nonconvex optimization problem and traditional optimization techniques are no longer applicable. Our approach focuses on cases where network coding or routing is applied in the network layer. When network coding is applied in the network layer, we use the max flow-min cut theorem [1] to formulate the problem as a nonlinear constrained problem. Then, by the use of the probability collectives (PCs) method, the problem is turned into a convex optimization problem over the space of probability distribution functions. Consequently, it will be shown that the new problem can be decomposed into two subproblems that are coupled via a set of Lagrangian multipliers: data routing in the network layer and power management in the physical layer. Subsequently, distributed cross-layer algorithms are proposed in order to obtain the solution in the new framework. It should be mentioned that one of the main features of our method is that it provides a distributed and parallel solution, in contrast with traditional centralized schemes for solving nonlinear constrained optimization problems (e.g., the projection method [19]) or evolutionary algorithms (e.g., genetic algorithms [20] or particle swarm optimization [21]). This feature makes it possible to apply the method to multihop wireless networks without infrastructure support.
Finally, extension of the proposed method to routing problems based on traditional multicommodity and single Steiner tree is also presented and it is shown that as expected, network coding-based solutions can generally lead to better performance in comparison with routing-based solutions.
The organization of the paper is as follows. Section 2 describes the original optimization problem addressed in the paper and in Section 3, it is shown how by use of probability collectives the problem will be transformed into a convex form and subsequently decomposed to achieve a fully distributed solution. Instead of maximizing the throughput, in some scenarios the goal is to minimize a cost function (e.g., energy) while fulfilling a certain achievable multicast throughput. Section 4 extends the methods described in Section 3 to such min-cost multicast problems. Subsequently, extension of the proposed approach to single tree solutions is provided in Section 5. Simulation results are presented in Section 6, and finally Section 7 concludes the paper. A summary of probability collectives optimization scheme is also presented in the appendix.

PROBLEM FORMULATION
In traditional routing, nodes are only allowed to replicate and forward received data packets. In such networks, each data unit is transmitted over a tree structure. This tree, known as a Steiner tree, includes a path from the source to each destination. The maximum achievable throughput can be obtained by computing the maximum number of pairwise capacity-disjoint trees, resulting in a centralized process with high computational complexity. In order to reduce the complexity, two suboptimal solutions can be applied: multicommodity flow routing and single Steiner tree routing [22]. In multicommodity flow routing, a multicast session is treated as multiple unicast sessions and dedicated bit rate resources are allocated to different destinations. In this case, the multicast rate r is feasible if there is a flow vector between the source and each destination with a rate equal to or greater than r, and the sum of these flows at each link does not exceed the link capacity. As will be shown in Section 6, this property simplifies the problem formulation and enables us to achieve a distributed solution. Another special case of the general routing problem is to send information via a single Steiner tree. Although this case is of special importance in networks modelled by unlimited capacity links (e.g., wireless optical networks [23]), it is still applicable in limited capacity networks if the capacity of each link of the tree is not less than r, so that data can be sent to the destinations at rate r via such a tree. By use of network coding, the multicast rate r is feasible if and only if there is a flow vector between the source and each destination (called conceptual flow) with a rate equal to or greater than r, and the maximum of these flows at each link (called max of flows or link flow) does not exceed the link capacity. In this paper, we will consider both approaches and provide corresponding optimization solutions in each scenario.
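The difference between the two feasibility conditions above (sum of per-destination flows for multicommodity routing, maximum of conceptual flows for network coding) can be illustrated with a small sketch. The nine-link butterfly topology, the link ordering, the unit capacities, and all function names below are illustrative assumptions and are not taken from the paper.

```python
def link_load_network_coding(conceptual_flows):
    """Link flow under network coding: max over destinations of the conceptual flows."""
    return [max(column) for column in zip(*conceptual_flows)]

def link_load_multicommodity(flows):
    """Link flow under multicommodity routing: sum over destinations of the flows."""
    return [sum(column) for column in zip(*flows)]

def is_feasible(load, capacity):
    """A load vector is supported if no link carries more than its capacity."""
    return all(f <= c for f, c in zip(load, capacity))

# Hypothetical butterfly network with unit-capacity links, ordered as
# [s-a, s-b, a-d1, a-c, b-c, b-d2, c-e, e-d1, e-d2].
capacity = [1] * 9
# Rate-2 flow to d1: paths s-a-d1 and s-b-c-e-d1.
e1 = [1, 1, 1, 0, 1, 0, 1, 1, 0]
# Rate-2 flow to d2: paths s-b-d2 and s-a-c-e-d2.
e2 = [1, 1, 0, 1, 0, 1, 1, 0, 1]

coded_load = link_load_network_coding([e1, e2])
routed_load = link_load_multicommodity([e1, e2])
print("network coding feasible:", is_feasible(coded_load, capacity))
print("multicommodity feasible:", is_feasible(routed_load, capacity))
```

In this toy instance the same pair of per-destination flows fits within the capacities when links are loaded by the maximum, but not when they are loaded by the sum, which is the reason the multicommodity solution is only suboptimal.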
A data network can be represented by a directed graph $G = (V, E)$, where the vertex set $V$ and the edge set $E$ denote the nodes and links, respectively. An $s - d$ flow with value $r$ is a length-$|E|$ nonnegative vector $x$ satisfying the flow conservation constraint

$$\sum_{l \in O(n)} x_l - \sum_{l \in I(n)} x_l = \begin{cases} r, & n = s, \\ -r, & n = d, \\ 0, & \text{otherwise}, \end{cases}$$

where $I(n)$ and $O(n)$ are defined as the sets of incoming and outgoing links at node $n$, respectively. Also, $s$, $d_i$, and $N$ denote the sender (source), the $i$th receiver (destination), and the number of receivers, respectively. Let $f_l$, $c_l$, and $e_{i,l}$, respectively, denote the flow, the capacity, and the conceptual flow associated with the $i$th destination on link $l$. The problem addressed in this paper is difficult to solve due to its inherently nonconvex structure. In order to achieve a tractable solution, it is assumed that the network topology is time invariant; in other words, nodes are static and connected via fixed links. Such an assumption is valid in quasistationary wireless mesh networks as well as static ad hoc networks. However, in multihop wireless networks, due to interference, each achievable link rate depends not only on the power allocated to the link itself, but also on the power allocated to other links. Consequently, the achievable rate of a link may be formulated as a function of the SINR, defined as

$$\mathrm{SINR}_l = \frac{G_{ll}\, p_l}{\sigma_l^2 + \sum_{j \neq l} G_{lj}\, p_j},$$

where $G_{ll}$, $p_l$, and $\sigma_l^2$ are the link gain, power, and noise variance, respectively, and $G_{lj}$ is the interference gain from link $j$ to link $l$. For example, in CDMA wireless networks the achievable rate can be expressed as a logarithmic function of $\mathrm{SINR}_l$. Power constraints apply both to each link, whose power is limited to the range $[0, p_{l,\max}]$, and to each node, whose total transmit power is limited by its power budget. Consequently, the maximum utility derived by a feasible multicast rate can be achieved by the following optimization problem: maximize $U(r)$ ≡ minimize $-U(r)$, subject to: $r \in [r_{\min}, r_{\max}]$, (5)
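As a rough illustration of how link capacities depend on the whole power vector, the following sketch evaluates $c_l(p) = \log(1 + \mathrm{SINR}_l)$, the rate function used later in the simulation section, for a given power allocation. The function names and the uniform gain matrix (every link interfering with every other link with the same gain) are assumptions made for the example; the numerical values are those quoted in the simulation section.

```python
import math

def link_capacities(powers, gains, noise):
    """Return c_l = ln(1 + SINR_l) for every link l.

    gains[l][l] is the link gain and gains[l][j] (j != l) the interference
    gain from link j to link l; noise[l] is the noise variance of link l.
    """
    capacities = []
    for l, p_l in enumerate(powers):
        interference = sum(gains[l][j] * powers[j]
                           for j in range(len(powers)) if j != l)
        sinr = gains[l][l] * p_l / (noise[l] + interference)
        capacities.append(math.log1p(sinr))
    return capacities

def uniform_gains(num_links, direct=1.0, cross=0.05):
    """Assumed gain matrix: identical direct gains and identical cross gains."""
    return [[direct if i == j else cross for j in range(num_links)]
            for i in range(num_links)]

# Power vector reported for the single Steiner tree example in the simulations.
powers = [4, 4, 4, 0, 0, 4, 0, 0, 0]
caps = link_capacities(powers, uniform_gains(9), [0.1] * 9)
print([round(c, 2) for c in caps])
```

Under these assumptions, each active link sees an SINR of 4 / (0.1 + 0.05 · 12) ≈ 5.71, and raising the power of one link lowers the capacity of all the others, which is the coupling that makes the problem nonconvex.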

OBTAINING A DISTRIBUTED PC-BASED SOLUTION
In order to obtain a manageable solution for the problem presented in Section 2, we adopt the probability collectives (PCs) optimization method. As will be shown subsequently, by proper use of the PC approach, the problem will be transformed into a convex form and subsequently decomposed to achieve a fully distributed solution. A brief introduction to PC and its key concepts, such as the Maxent Lagrangian, is presented in the appendix.
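To give a feel for the PC idea before it is applied to the multicast problem, the following toy sketch lets two agents, each holding a probability distribution over a discrete variable, minimize a small cost function. It is a minimal sketch under stated assumptions and not the algorithm of this paper: in place of the Newton update of the appendix, each agent moves toward the closed-form (Boltzmann) minimizer of its own Maxent Lagrangian, and the cost function, cooling schedule, damping factor, and all names are hypothetical.

```python
import itertools
import math

def conditional_costs(G, domains, q, i):
    """Return E[G | x_i = v] for each value v of agent i under the product distribution q."""
    others = [j for j in range(len(domains)) if j != i]
    costs = []
    for v in domains[i]:
        total = 0.0
        # Enumerate the joint values of all other agents (only viable for toy sizes).
        for idx in itertools.product(*(range(len(domains[j])) for j in others)):
            x = [None] * len(domains)
            x[i] = v
            weight = 1.0
            for j, k in zip(others, idx):
                x[j] = domains[j][k]
                weight *= q[j][k]
            total += weight * G(x)
        costs.append(total)
    return costs

def boltzmann(costs, T):
    """Minimizer of sum_v q(v) * costs[v] - T * S(q) over the probability simplex."""
    low = min(costs)  # shift for numerical stability
    weights = [math.exp(-(c - low) / T) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]

def pc_minimize(G, domains, T0=1.0, cooling=0.9, alpha=0.5, iterations=100):
    q = [[1.0 / len(d)] * len(d) for d in domains]  # start from uniform distributions
    T = T0
    for _ in range(iterations):
        # All agents update in parallel, using only the previous distributions.
        targets = [boltzmann(conditional_costs(G, domains, q, i), T)
                   for i in range(len(domains))]
        q = [[(1 - alpha) * a + alpha * b for a, b in zip(qi, ti)]
             for qi, ti in zip(q, targets)]
        T *= cooling  # gradually decrease T
    return q

def toy_cost(x):
    """Hypothetical cost with a unique minimum at x = (2, 1)."""
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2 + 0.5 * abs(x[0] - x[1])

domains = [[0, 1, 2, 3], [0, 1, 2, 3]]
q = pc_minimize(toy_cost, domains)
best = [d[max(range(len(d)), key=qi.__getitem__)] for d, qi in zip(domains, q)]
print(best)
```

As T is lowered, each distribution concentrates on a single value, which mirrors the convergence of the probability distributions to impulse functions discussed later in this section.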

The general framework of PC-based optimization
Let us assume that the variables $r$, $f_l$, $e_{i,l}$, and $p_l$ take a finite number of values in the ranges $[r_{\min}, r_{\max}]$, $[0, r_{\max}]$, $[0, r_{\max}]$, and $[0, p_{l,\max}]$, respectively. In this way, it is ensured that the solutions obtained at each step satisfy the constraints (5), (7), (8), and (11). It should be noted that the other constraints are already included in the Maxent Lagrangian and that all feasible values for $e_{i,l}$ and $f_l$ are in the range $[0, r_{\max}]$. The equality constraint (6) can be rewritten in inequality form. The rewritten constraint ensures that the source node injects a flow of at most $r$ into the network, that at each intermediate node the outgoing flow does not exceed the incoming flow, and that each of the receivers receives a flow at a rate greater than or equal to $r$. This is possible if and only if the flow conservation constraint (6) is satisfied. This is an important issue, since we assumed that all constraints are in the form of inequalities. Let $q^{(t)}_r$, $q^{(t)}_{e_{i,l}}$, $q^{(t)}_{f_l}$, and $q^{(t)}_{p_l}$ denote the probability distributions associated with the variables $r$, $e_{i,l}$, $f_l$, and $p_l$ at step $t$, respectively. By expanding the Lagrangian, a convex optimization problem over these distributions is obtained, in which $T$ and $S$ are part of the PC optimization framework briefly described in the appendix. In addition, in order to reduce the number of equations, the constraints for nonnegative probabilities and unity probability distributions are not explicitly mentioned. Also, the time dependency of the probability distributions is assumed implicitly. Finally, the Lagrange multipliers are updated according to (A.9).

Problem decomposition
Minimizing the Maxent Lagrangian can subsequently be decomposed into one subproblem at the network layer and one subproblem at the physical layer. The network layer subproblem can be further decomposed into a set of single-variable subproblems, and the physical layer subproblem can likewise be decomposed into a set of subproblems, one at each link. By use of the Newton updating scheme for subproblems (20)-(22), we obtain updating rules similar to (A.7) for $q_r(x_i)$, $q_{e_{i,l}}(x_i)$, $q_{f_l}(x_i)$, and $q_{p_l}(x_i)$, where $G$ is replaced by $G_1$ to $G_4$, respectively.

Proposed distributed algorithm
The overall distributed algorithm is given by Algorithm 1. "Exact" convergence is achieved when all constraints are satisfied and the probability distributions converge to impulse functions. However, in practice, "approximate" convergence criteria can also be defined [24]. For example, approximate convergence is achieved if the conditions $C_i(x) \le \varepsilon_i$ and $\| q_i^{(t+1)} - q_i^{(t)} \| \le \delta_i$ are satisfied at iteration $t + 1$, where $C_i$ is an inequality constraint of the form $C_i(x) \le 0$, and $\delta_i$ and $\varepsilon_i$ are sufficiently small positive scalars.
The aforementioned algorithm can consequently be performed in a distributed fashion. At the network layer, $q_r$, $q_{e_{i,l}}$, $q_{f_l}$, and $\xi_{i,l}$ are updated based on local information: updating $q_r$, $q_{e_{i,l}}$, and $q_{f_l}$ needs only the previous probability distributions associated with the variables $r$, $e_{i,l}$, and $f_l$, respectively (see (23)-(25) and (A.7)). Also, $\xi_{i,l}$ can be updated by computing $E(e_{i,l} - f_l)$, which requires only the probability distributions $q_{e_{i,l}}$ and $q_{f_l}$. The multipliers $\mu_{i,n}$ can be updated at each node (except at the receivers) using the probability distributions of the flow and conceptual flows of the incoming and outgoing links. At the receivers, $E(r)$ should also be taken into account. Therefore, in step (a), the source also broadcasts $E(r)$. At the physical layer, each link can calculate its expected capacity and broadcast $E(\lambda_l c_l)$ to the other links. Consequently, each link can update its probability distribution based on (26).
Algorithm 1
At network layer:
(a) At the source node, $q_r$ is updated according to (23) and (A.7). $E(r)$ is calculated and broadcast to the network.
(b) For each link, $q_{e_{i,l}}$ and $q_{f_l}$ are updated according to (24), (25), and (A.7).
(c) Lagrange multipliers $\mu_{i,n}$ and $\xi_{i,l}$ are updated according to (15) and (16).
At physical layer:
(d) The $q_{p_l}$'s are updated according to (26) and (A.7) and broadcast to the network.
(e) Lagrange multipliers $\nu_n$ are updated based on (17).
Cross-layer optimization:
(f) Lagrange multipliers $\lambda_l$ are updated based on (18).

The overall algorithm then works as follows: at the network layer, at iteration $t$, each node $n$ uses the previous probability distributions associated with its outgoing links (i.e., $q^{(t-1)}_{f_l}$ and $q^{(t-1)}_{e_{i,l}}$, $l \in O(n)$) as well as the Lagrange multipliers $\mu^{(t-1)}_{i,n}$ and $\xi^{(t-1)}_{i,l}$ ($l \in O(n)$) in order to coordinate with other nodes and obtain new appropriate values for its outgoing link flows (i.e., $f^{*(t)}_{i,l}$). This procedure can be performed in parallel, since each node uses the previous probability distributions and Lagrange multipliers corresponding to its neighboring nodes (i.e., nodes that have at least one link in common with this node). In a similar way, nodes at the physical layer update the probability distributions associated with the powers of their outgoing links in order to achieve new appropriate values for the link capacities. The two layers coordinate with each other in order to balance link flows and link capacities. Finally, the algorithm continues until approximate convergence is achieved. In order to achieve approximate convergence, all the problem constraints (which can be rewritten in the form $C_i(x) \le 0$) should not exceed a small specific positive value (i.e., $C_i(x) \le \varepsilon_i$), and for all probability distributions we should have $\| q_i^{(t+1)} - q_i^{(t)} \| \le \delta_i$. In other words, all constraints should be approximately satisfied and the probability distributions should converge to an approximate steady state. It is not hard to check that all the problem constraints can be calculated in a distributed fashion (see (6)-(12)) at the appropriate node at the physical or network layer. Therefore, at each step, after updating the probability distributions and obtaining new appropriate values (i.e., $x_i^{*(t)}$), each node can evaluate its related constraints and probability distributions in order to check whether they meet the convergence conditions, and subsequently announce the result to the network. The algorithm terminates when every node has achieved the aforementioned approximate convergence.
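The approximate convergence test described above can be sketched as a small local check that each node could run on its own constraints and distributions. The function name and argument layout are hypothetical, and the maximum absolute difference is assumed as the norm for comparing successive distributions, since the text does not fix a particular norm.

```python
def approximately_converged(constraint_values, eps, q_prev, q_next, delta):
    """Return True if C_i(x) <= eps_i for every constraint and every
    distribution changed by at most delta_i between two iterations.

    constraint_values[i] is C_i evaluated at the current candidate solution;
    q_prev[i] and q_next[i] are the distributions of variable i at t and t + 1.
    """
    constraints_ok = all(c <= e for c, e in zip(constraint_values, eps))
    # Assumed norm: largest absolute change of any single probability.
    distributions_ok = all(
        max(abs(a - b) for a, b in zip(before, after)) <= d
        for before, after, d in zip(q_prev, q_next, delta)
    )
    return constraints_ok and distributions_ok

# One constraint slightly violated but within tolerance, distributions nearly static.
done = approximately_converged(
    constraint_values=[-0.2, 0.004],
    eps=[0.01, 0.01],
    q_prev=[[0.001, 0.999], [1.0, 0.0]],
    q_next=[[0.0005, 0.9995], [1.0, 0.0]],
    delta=[1e-3, 1e-3],
)
print(done)
```

Each node would announce its own result, and the algorithm would stop once every node reports convergence.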
While the network layer tries to allocate appropriate flow (i.e., bandwidth) to each link in order to achieve an optimal multicast throughput, the physical layer assigns link powers in order to support the required bandwidths. The Lagrange multipliers $\lambda_l$ play an important role in this coordination between layers. When the (expected) capacity supported by the physical layer is less than the expected flow of the link, $\lambda_l$ is increased in order to force the physical layer to increase the link capacity by increasing the link power, and to signal the network layer to decrease the link flow. On the other hand, if the physical layer assigns more bandwidth than is required by the network layer, excess power is allocated by the physical layer to the link. This in turn causes interference to other links, resulting in a decrease in their capacity. In this case, by decreasing the Lagrange multipliers, the physical layer decreases the link power and consequently the link capacity, while the network layer realizes that it can inject more flow into this link. The optimal solution is achieved when link capacity and link flow become equal (if we are interested in maximum throughput regardless of how much power is consumed, it suffices that each link flow does not exceed the link capacity). However, in the proposed method, since link powers as well as link flows are selected from a discrete set, these two values may not be equal in the final solution, and the resulting link capacities are usually larger than the link flows. The Lagrange multiplier $\lambda_l$ can also be interpreted as the bandwidth cost of link $l$. The network layer tries to send data via links with relatively lower cost in order to minimize the total cost incurred, while the physical layer tries to maximize the total benefit achieved by providing more bandwidth to the network layer.
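The qualitative behaviour of $\lambda_l$ described above can be sketched with a projected subgradient-style price update. This is only an assumed form chosen to match the description (the price rises when expected flow exceeds expected capacity and falls otherwise, never becoming negative); it is not the update rule (18) of the paper, and the function name and step size are hypothetical.

```python
def update_bandwidth_price(lam, expected_flow, expected_capacity, step=0.1):
    """Assumed price update: raise lambda_l when the link is overloaded,
    lower it when capacity is unused, and keep it nonnegative."""
    return max(0.0, lam + step * (expected_flow - expected_capacity))

lam = 1.0
overloaded = update_bandwidth_price(lam, expected_flow=1.5, expected_capacity=1.0)
underused = update_bandwidth_price(lam, expected_flow=0.5, expected_capacity=1.0)
print(overloaded, underused)
```

A higher price makes the link less attractive to the network layer and more valuable to the physical layer, which is the coordination mechanism between the two layers.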
As mentioned in the appendix, the PC algorithm converges to at least a local minimum that satisfies the given constraints. Therefore, the proposed algorithm achieves a feasible multicast rate (corresponding to a local maximum of the utility function). The proposed method is more complex than traditional convex optimization methods, since it requires updating a probability distribution (associated with each scalar variable) rather than a scalar value, resulting in higher computational complexity as well as more memory space. However, this additional complexity is inevitable due to the nonconvexity of the original problem. It should be noted that it is possible to reduce this complexity by selecting variables from a smaller set, but this may result in further suboptimality.

EXTENSION TO MINIMUM COST MULTICAST
In Section 3, we considered joint optimization of data flow routing and link power adjustment in order to achieve the optimal throughput. Alternatively, we can investigate the problem of link power allocation in order to minimize a cost (e.g., total consumed power) while fulfilling a certain achievable multicast throughput. This problem can be formulated as minimizing $\sum_{l \in E} g_l(p)$ subject to the problem constraints, where $g_l(p)$ is an arbitrary (not necessarily convex) function of link powers. Following a similar approach to that presented in Section 3, a distributed algorithm can be designed by decomposing the Maxent Lagrangian. In addition, we can modify the multicast rate optimization problem to the maximization of a net utility function similar to [9]. In the aforementioned problems, we concentrated on finding the optimal data flow in the network layer, rather than on the code design problem. In order to establish a multicast session with network coding, it suffices to compute the appropriate data flow and then compute a code that determines the content of each link flow following the method presented in [4,5]. Joint optimization of data routing and resource allocation using multicommodity flow can be formulated in a similar way, by replacing the max flow with the accumulated flow in the constraints. Therefore, the constraint $f_{i,l} \le c_l$ should be replaced with $\sum_{i \in D} f_{i,l} \le c_l$. Clearly, in this case, less flow can be dedicated to each destination, resulting in a suboptimal solution compared with the network coding-based solutions. In this respect, our solution can be considered as an extension of the work in [13] to nonconvex cost functions. In addition, while in [13] only low-interference scenarios where link capacities are approximated by log(SINR) are taken into account, our approach does not assume such an approximation and can consequently be applied in both low and high interference scenarios.

A SOLUTION BASED ON SINGLE TREE ROUTING
In earlier sections, we have offered a distributed algorithm for a general network by applying network coding at the network layer. Also, it has been shown that when routing is used at the network layer, with some modifications, we can achieve a distributed solution by using multicommodity flow routing scheme. Another routing-based solution of interest is based on single Steiner tree. Although such solution is only suboptimal in relation to that of a general Steiner tree problem, it can be implemented in a distributed fashion with lower complexity. Therefore, in this section, we will also extend our method by presenting a solution based on single Steiner trees. We study both acyclic and general networks, where in each case, a Steiner tree is constructed through which data can be multicasted from source to the destinations.

Acyclic networks
First, we consider a network with no cycles (i.e., an acyclic network) and will address the general problem in Section 5.2. Consider an arbitrary subgraph $G' = (V', E')$ with $V' \subseteq V$ and $E' \subseteq E$. An indicator variable $e_l$ is associated with each link, with $e_l = 1$ if $l \in E'$ and $e_l = 0$ otherwise. Note that a subgraph can be characterized by the indicator vector $e = (e_l)_{l \in E}$.
An intermediate node (a node which is neither a source nor a destination node) in the optimum multicast subgraph should act as a relay node, that is, it should only retransmit received packets. Therefore, the search for optimum subgraphs can be restricted to subgraphs with this property.

Theorem 1. A subgraph includes a path from source to each destination, if and only if, constraints (31)-(33) are satisfied:

$$\sum_{l \in O(S)} e_l > 0, \quad (31)$$

Proof. Assume a subgraph includes a path from the source to each destination; then it includes the source and one of its outgoing links, and constraint (31) is satisfied. If an intermediate node included in the subgraph acts as a relay node, at least one of its outgoing links and one of its incoming links will be included in the subgraph. Otherwise, none of its links will be included in the subgraph. In both cases, constraint (32) is satisfied. The subgraph should include all destinations and at least one incoming link of each destination. Consequently, constraint (33) is also satisfied.
Conversely, satisfying constraints (31)-(33) ensures that the subgraph includes a path from source to each destination. Since the network has no cycles, if there were no path from the source to a destination, that destination would have to form a cycle with some relay nodes and/or other destinations in order to satisfy constraint (31), contradicting the definition of an acyclic network.
Constraints (31)-(33) can be interpreted as follows: constraint (31) states that the source sends data packets to the network via at least one of its outgoing links. Constraint (32) states that intermediate nodes act as relay nodes and retransmit received packets. Constraint (33) ensures that all destination nodes receive packets. Consequently, finding the optimal multicast subgraph can be performed by searching the set of subgraphs satisfying constraints (31)-(33). It should be noted that the minimum-cost subgraph has a tree structure corresponding to the minimum-cost Steiner tree. Since the optimum subgraph includes a path from the source to each destination, it contains a tree consisting of such paths. This tree is sufficient for transmitting information from the source to the receivers. Consequently, every other link in the optimum subgraph is redundant. The subgraph with minimum cost is the optimal solution, and the corresponding problem is formulated in terms of the functions $h_i(e)$ defined in (35). By using the PC theory, this problem can be solved as follows: a discrete probability distribution $q_{e_l}$ is associated with each variable $e_l$, and the resulting problem over these distributions is then solved.

Now consider the problem of multicasting data at an achievable rate $r_0$ with minimum cost. Based on the earlier discussion, in order to multicast data at rate $r_0$, it suffices to construct a single Steiner tree with link capacities greater than or equal to $r_0$. This problem can be formulated as minimizing $\sum_{l \in E} g_l(p)$ subject to the subgraph, capacity, and power constraints. Using PC, the problem can be rewritten as (38), with the Lagrange multipliers updated according to (A.9). The minimization problem in (38) can then be decomposed into subproblems at the network and physical layers, respectively. Comparing (40) with (36), it can be seen that the network layer problem corresponds to finding the minimum-cost multicast subgraph (i.e., Steiner tree) with link costs equal to $\lambda_l r_0$. The network problem can in turn be decomposed into the following single-variable subproblems:

$$\min_{q_{e_l}} \; E\big[\lambda_l r_0 e_l + \xi_{\mathrm{head}(l)} h_{\mathrm{head}(l)}(e) + \xi_{\mathrm{tail}(l)} h_{\mathrm{tail}(l)}(e)\big] - T\, S(q_{e_l}). \quad (42)$$

It should be noted that link $l$, corresponding to $e_l$, is connected to exactly two nodes: the node it exits from (head($l$)) and the node it enters (tail($l$)). Therefore, only $h_{\mathrm{tail}(l)}$ and $h_{\mathrm{head}(l)}$ are functions of $e_l$ and need to be considered in (42). The physical layer problem can be decomposed per link in a similar way, and the $q_{e_l}$'s and $q_{p_l}$'s are updated according to (A.7), where $G$ is replaced by $G_5$ and $G_6$, respectively.

The probability distributions associated with the indicator variables and link powers can be updated in a distributed fashion, at the network and physical layers, respectively. Updating the $q_{e_l}$'s requires computing $E(e_l)$, $E(h_{\mathrm{head}(l)}(e))$, and $E(h_{\mathrm{tail}(l)}(e))$. $E(h_{\mathrm{head}(l)}(e))$ and $E(h_{\mathrm{tail}(l)}(e))$ can be computed by using the probability distributions of the indicator variables associated with the links connected to nodes head($l$) and tail($l$), respectively. Each node can update its outgoing links by exchanging link probability distributions with its neighbors. Lagrange multipliers can also be updated at each link $l$ (more precisely, at the node this link originates from), using $q_{e_l}$ and $q_{p_l}$. Hence a distributed algorithm can be designed. The proposed approach can also be extended to maximizing the net utility function. In that case, the network layer subproblems can be derived in the same manner, and the subproblems at the physical layer are again of the form given in (41). However, since the variable $r$ couples the subproblems, the network layer problem cannot be decomposed in a way similar to (40).

General networks
In this part, we propose a method that can be applied in an arbitrary (cyclic or acyclic) network, however, at a higher complexity cost. The $s - d_i$ binary flow with rate $r$ is defined as a length-$|E|$ vector $f_i$ satisfying the flow conservation constraint, where each component $f_{i,l}$ of $f_i$ takes its value from the set {0, 1}. Note that, by this definition, the $s - d_i$ binary flow with unit value corresponds to a path from the source to the $i$th destination. A set of $N$ binary flows from the source to the destinations constructs a multicast subgraph, since it ensures the existence of a path between the source and each destination. Therefore, link $l$ of the network graph $G$ is included in this multicast subgraph (i.e., $e_l = 1$) if it is included in at least one path from the source to a destination (or equivalently, if $f_{1,l} \vee f_{2,l} \vee \dots \vee f_{N,l} = 1$, where ∨ denotes logical or). Consequently, $e_l$ is defined as a function of the binary flows $f_{i,l}$. The optimum graph can be found by exploring all subgraphs constructed in this way. This problem can be formulated as minimizing $\sum_{l \in E} e_l b_l$ subject to the binary flow constraints, where $h_j(e)$ is defined as before. A probability distribution $q_{f_{i,l}}$ is associated with each variable $f_{i,l}$ (rather than with $e_l$), and the resulting problem over these distributions is then solved.
Based on the discussion presented in Section 5.1, the minimum cost multicast problem at rate $r_0$ can be formulated as minimizing $\sum_{l \in E} g_l(p)$ subject to the corresponding binary flow, capacity, and power constraints. The Lagrange multipliers $\mu_{i,n}$ are updated according to the flow constraints, while the Lagrange multipliers $\xi_i$, $\lambda_i$, and $\nu_n$ can be updated as before. Finally, the above problem can be decomposed into subproblems (54) and (40), where the $q_{f_{i,l}}$'s can be updated based on the above equations and the $q_{p_l}$'s are updated as before. In this way, a distributed algorithm can be designed in a similar way to the algorithm presented earlier.
Based on the above discussions, the difference between the proposed method and the distributed algorithm presented in [14] becomes more evident. In fact, in [14] a link is assumed to be either enabled, if the received power exceeds a threshold value and data can be transferred via this link at a desired rate, or disabled, if the received power does not reach the threshold level. However, our approach takes both link capacity and interference into account, reflecting a more realistic cooperation between different nodes (in physical and network layers) in order to achieve the desired throughput.

SIMULATION RESULTS
Consider, as an example, the network shown in Figure 1. The source node (S) multicasts data to receivers $d_1$ and $d_2$ via the network. The goal is to achieve optimal throughput in the range [0, 2]. The utility function is assumed to be equal to $r^2$ (which is monotonic but not concave in $r$). We define the net-utility as $r^2 - \sum_{l \in E} 0.001\, p_l$ (therefore, the main emphasis is on achieving the maximum rate rather than on minimizing the total consumed power). Each link is assumed to select its transmit power from the discrete set {0, 1, 2, ..., 5}, and each node has a power budget equal to 10. $f_l$ and $e_{i,l}$ are also assumed to take values from the set {0, 0.2, 0.4, ..., 2}. Link gains, interference gains, and noise variances are assumed to be equal to 1, 0.05, and 0.1, respectively. Also, we assume that the achievable rate of each link is given by $c_l(p) = \log(1 + \mathrm{SINR}_l)$.

Figures 2, 3, 4, and 5 show the flow and capacity associated with each link in this scenario. Due to the symmetric structure of the network, some links have the same link flow and capacity. After 2000 iterations, the optimal multicast throughput is achieved. Figure 6 shows the conceptual flows and link flows ($e_{1,l}$, $e_{2,l}$, $f_l$) of the links, where $e_{1,l}$ and $e_{2,l}$ satisfy the flow conservation and link capacity constraints and the multicast rate of 2 is achieved. This multicast rate is feasible since all the link flows are supported by the physical layer (i.e., each link has a capacity greater than its flow). Link capacities and link power vectors are consequently given by [1.35, 1.35, ...

It should be noted that if the utility function is assumed to be given by $\log(1 + r)$ rather than $r^2$, we can apply the method proposed in [13] (i.e., underestimate the link capacity by $\log(\mathrm{SINR}_l)$) and use a logarithmic transformation. However, the maximum multicast throughput in this case would only reach the value of 1.54. This is because, with the relatively high interference between links, such underestimation does not lead to the optimal solution.
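The size of the underestimation mentioned above can be illustrated numerically. The following is a minimal sketch, assuming natural logarithms; the SINR values in the loop are hypothetical and are not taken from the simulation. It compares $\log(1 + \mathrm{SINR})$ with the approximation $\log(\mathrm{SINR})$ and shows that the gap, $\log(1 + 1/\mathrm{SINR})$, is largest at low and moderate SINR.

```python
import math

def exact_capacity(sinr):
    """Link rate log(1 + SINR), the model used in the simulations."""
    return math.log(1.0 + sinr)

def high_sinr_capacity(sinr):
    """Underestimate log(SINR), accurate only when SINR is large."""
    return math.log(sinr)

def underestimation_gap(sinr):
    """Rate lost by the approximation; equals log(1 + 1/SINR)."""
    return exact_capacity(sinr) - high_sinr_capacity(sinr)

# Hypothetical SINR values, chosen only to show the trend.
for sinr in (1.0, 3.0, 10.0, 100.0):
    print(f"SINR={sinr:6.1f}  exact={exact_capacity(sinr):.3f}  "
          f"approx={high_sinr_capacity(sinr):.3f}  gap={underestimation_gap(sinr):.3f}")
```

The gap shrinks as the SINR grows, which is consistent with the approximation being adequate only when interference is weak.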
As an example of minimum cost multicast, consider the case of multicasting data based on a single Steiner tree at rate $r_0 = 1.9$ with minimum total link power. It can be verified that the optimal solution is achieved when the link power vector is equal to [4, 4, 4, 0, 0, 4, 0, 0, 0] and the link capacity vector is given by [1.9, 1.9, 1.9, 0, 0, 1.9, 0, 0, 0]. Figure 7 shows the optimum Steiner tree, which can be represented by the indicator vector e = [1, 1, 1, 0, 0, 1, 0, 0, 0]. As shown in Figures 8 and 9, the proposed method converges to the optimum value, where each link flow is equal to $r_0 e_l$. These figures show that this multicast rate is feasible, since all link flows are supported by the physical layer.

Also, it is important to note that, while we have considered the network presented in Figure 1 in our simulations, the proposed algorithms, as presented in earlier sections, are quite general and can be applied to a network with arbitrary topology. Since the network is assumed to be arbitrary in the problem formulation, our PC-based algorithm converges to at least a suboptimum solution, corresponding to a local minimum cost.
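The capacity vector quoted for the Steiner tree example can be checked with a few lines of code. This is a sketch under two assumptions that are not stated explicitly in the text: every other link interferes with a given link with the same gain 0.05, and the logarithm in $c_l(p) = \log(1 + \mathrm{SINR}_l)$ is the natural logarithm. The function name `link_capacities` is hypothetical.

```python
import math

LINK_GAIN = 1.0            # direct link gain, from the simulation setup
INTERFERENCE_GAIN = 0.05   # assumed identical between every pair of links
NOISE = 0.1                # noise variance, from the simulation setup

def link_capacities(powers):
    """Return log(1 + SINR_l) for every link, given all transmit powers."""
    total_power = sum(powers)
    capacities = []
    for p in powers:
        # Interference seen by this link comes from all other links.
        interference = INTERFERENCE_GAIN * (total_power - p)
        sinr = LINK_GAIN * p / (NOISE + interference)
        capacities.append(math.log(1.0 + sinr))
    return capacities

# Power vector quoted for the single Steiner tree example.
powers = [4, 4, 4, 0, 0, 4, 0, 0, 0]
capacities = link_capacities(powers)
rounded = [round(c, 1) for c in capacities]
print(rounded)
```

Under these assumptions, each active link sees an SINR of 4 / (0.1 + 0.05 · 12) ≈ 5.71, so its capacity is about 1.90, which matches the quoted capacity vector and supports the rate $r_0 = 1.9$.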

CONCLUSIONS
In this paper, the problem of finding an optimal multicast solution in multihop wireless networks with interference has been addressed. Using the PC method, the problem has been turned into a convex optimization problem over probability distributions. Consequently, it was shown that the new problem can be decomposed into two subproblems at network and physical layers and a corresponding cross-layer distributed approach to solve the problem has been proposed. Also, distributed cross-layer algorithms for multicommodity flow-based routing and single Steiner tree routing have been proposed.
As expected, the network coding-based solution performs better than the solution based on routing.

APPENDIX PROBABILITY COLLECTIVES (PCS) OPTIMIZATION SCHEME
Consider the optimization problem $\min_x G(x)$, where each component $x_i$ of $x$ is a discrete scalar variable that takes one of a finite number $|X_i|$ of possible values. A continuous variable with a finite domain can also be approximated by a variable with a finite number of values. In the PC method, each variable is considered as an agent [24][25][26], and at each step each agent updates its probability distribution according to the maximum entropy (Maxent) principle, independently of the other agents [27]. Based on this approach, given the prior knowledge of the utility function, the goal is to find the probability distribution that is consistent with the a priori knowledge and also attains the maximum entropy. We will denote the entropy by

$$S(q_i) = -\sum_{x_i \in X_i} q_i(x_i) \ln q_i(x_i), \tag{A.1}$$

where $q_i$ denotes the probability distribution of agent $i$ and $q_i(x_i)$ is the probability that agent $i$ takes the value $x_i$. Since agents are assumed to be independent, the joint distribution $q$ of the agents will be of the form

$$q(x) = \prod_i q_i(x_i), \tag{A.2}$$

and consequently,

$$S(q) = \sum_i S(q_i). \tag{A.3}$$

Using the Maxent principle, the original optimization problem can then be converted into the following optimization problem over the probability distribution [25]:

$$\begin{aligned} &\text{minimize} && l(q, T) = E(G) - T S(q), \\ &\text{subject to} && \sum_{x_i} q_i(x_i) = 1, \quad q_i(x_i) \ge 0, \ \forall x_i, \end{aligned} \tag{A.4}$$

where $l(q, T)$ is the Maxent Lagrangian, $T$ is a positive Lagrange multiplier, and

$$E(G) = \sum_x G(x) \prod_i q_i(x_i). \tag{A.5}$$

Each agent is aware of the previous probability distributions of the other agents and updates its probability distribution as the solution of the following convex optimization problem:

$$\begin{aligned} &\text{minimize} && \sum_{x_i} q_i(x_i) E[G \mid x_i] - T\Big(-\sum_{x_i} q_i(x_i) \ln q_i(x_i)\Big), \\ &\text{subject to} && \sum_{x_i} q_i(x_i) = 1, \quad q_i(x_i) \ge 0, \ \forall x_i, \end{aligned} \tag{A.6}$$

where $E[G \mid x_i] = \sum_{x_j,\, j \ne i} G(x) \prod_{j \ne i} q_j(x_j)$. Using Newton updating, the updating rule (A.7) can be obtained [24], where $\alpha$ and $t$, respectively, denote the step size and the iteration number. As the parameter $T$ is gradually decreased, in the limit $T \to 0$ the set of probability distributions that simultaneously minimize the Maxent Lagrangian becomes the same as the set of delta functions at the local minima of the objective function $G$ [24,25].
Updating can be performed in parallel for different agents, since calculating the expectations in (A.7) requires only the probability distributions from the previous step. Therefore, by exchanging their previous probability distributions, all agents can update their probability distributions simultaneously.
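The parallel update can be sketched on a toy problem. The snippet below is an illustration only and not the algorithm of this paper: it assumes the nearest-Newton form of the update commonly given in the PC literature, $q_i(x_i) \leftarrow q_i(x_i) - \alpha\, q_i(x_i)\,[(E[G \mid x_i] - E(G))/T + S(q_i) + \ln q_i(x_i)]$. The objective `G`, the value set `VALUES`, the cooling schedule, and the clipping step are all hypothetical choices made for the example.

```python
import itertools
import math

VALUES = [0, 1, 2]  # hypothetical discrete domain shared by both agents

def G(x):
    """Hypothetical toy objective with a unique minimum at x = (2, 1)."""
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2 + 0.5 * abs(x[0] - x[1])

def conditional_expectations(q, i):
    """E[G | x_i] for each value of agent i, using the other agents' distributions."""
    cond = [0.0] * len(VALUES)
    for idx in itertools.product(range(len(VALUES)), repeat=len(q)):
        weight = 1.0
        for j, k in enumerate(idx):
            if j != i:
                weight *= q[j][k]
        cond[idx[i]] += weight * G([VALUES[k] for k in idx])
    return cond

def entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0.0)

def pc_step(q, T, alpha, eps=1e-12):
    """One parallel update: every agent uses only the previous-step distributions."""
    new_q = []
    for i, qi in enumerate(q):
        cond = conditional_expectations(q, i)
        mean = sum(p * c for p, c in zip(qi, cond))  # E(G)
        S = entropy(qi)
        updated = [p - alpha * p * ((c - mean) / T + S + math.log(p))
                   for p, c in zip(qi, cond)]
        # Clip and renormalize so the result stays a valid distribution.
        updated = [max(u, eps) for u in updated]
        total = sum(updated)
        new_q.append([u / total for u in updated])
    return new_q

def run_pc(iters=300, T=1.0, alpha=0.1, cooling=0.98, T_min=0.05):
    q = [[1.0 / len(VALUES)] * len(VALUES) for _ in range(2)]  # uniform start
    for _ in range(iters):
        q = pc_step(q, T, alpha)
        T = max(T * cooling, T_min)  # gradually decrease T
    return q

q_final = run_pc()
best = [VALUES[max(range(len(VALUES)), key=qi.__getitem__)] for qi in q_final]
print(best)
```

Because `pc_step` builds the new list from the old distributions only, the loop over agents could be executed concurrently, one agent per processor. For this toy objective the distributions concentrate on the minimizer (2, 1) as `T` is lowered.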
Other constraints can also be included by augmenting the objective function with Lagrange multipliers $\lambda_i$ and the constraint functions $C_i(x)$, that is, by replacing $G(x)$ with $G(x) + \sum_i \lambda_i C_i(x)$, where each $C_i(x)$ defines an inequality constraint of the form $C_i(x) \le 0$ [28]. The updating rule for the Lagrange multipliers is obtained by taking the derivative of the augmented Lagrangian with respect to each Lagrange multiplier, which gives $\lambda_i^{t+1} = [\lambda_i^t + \eta\, E(C_i(x))]^+$, where $[\cdot]^+ = \max\{\cdot, 0\}$ and $\eta$ is a positive step size. Consequently, the minimizer of (A.4) when the function $G$ is augmented with Lagrange multipliers $\lambda_i$ corresponds to at least a local minimum of the original objective function $G$ subject to the same constraints [26,29].
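The multiplier update can be illustrated with a per-node power budget. The sketch below assumes the projected step $\lambda \leftarrow [\lambda + \eta\, E(C(x))]^+$ implied by the description above, with the constraint $C(p) = \sum_l p_l - 10 \le 0$. The node with three outgoing links, the two example distributions, and the function names are hypothetical.

```python
import itertools

POWER_LEVELS = [0, 1, 2, 3, 4, 5]  # discrete transmit powers from the simulation setup
POWER_BUDGET = 10.0                # per-node power budget from the simulation setup

def expected_constraint(link_dists):
    """E[C(p)] for C(p) = sum(p) - POWER_BUDGET under independent link distributions."""
    expectation = 0.0
    for combo in itertools.product(range(len(POWER_LEVELS)), repeat=len(link_dists)):
        prob = 1.0
        for dist, k in zip(link_dists, combo):
            prob *= dist[k]
        total_power = sum(POWER_LEVELS[k] for k in combo)
        expectation += prob * (total_power - POWER_BUDGET)
    return expectation

def update_multiplier(lam, expected_c, eta):
    """Projected step: increase lam when the constraint is violated on average."""
    return max(lam + eta * expected_c, 0.0)

# Hypothetical node with three outgoing links.
uniform = [1.0 / 6] * 6
always_max = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

slack = expected_constraint([uniform] * 3)        # mean total power 7.5, below budget
violated = expected_constraint([always_max] * 3)  # total power 15, above budget

lam_slack = update_multiplier(0.2, slack, eta=0.1)
lam_violated = update_multiplier(0.0, violated, eta=0.1)
print(lam_slack, lam_violated)
```

When the budget is satisfied on average the multiplier decays and is clipped at zero, and when it is violated the multiplier grows, which penalizes high-power choices in the next probability update.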

(1) Initialize
    (a) Assign the starting probabilities for each variable, typically a uniform distribution over its possible values.
    (b) Set the parameters {T, α, η}.
(2) Optimize the Lagrangian
    At network layer:

Figure 3: The flow and capacity of links 3 and 6.