Multi-user video streaming using unequal error protection network coding in wireless networks

In this article, we investigate a multi-user video streaming system applying unequal error protection (UEP) network coding (NC) for simultaneous real-time exchange of scalable video streams among multiple users. We focus on a simple wireless scenario, intended to capture the fundamental system behaviour, where users exchange encoded data packets over a common central network node (e.g., a base station or an access point). Our goal is to present analytical tools that provide both the decoding probability analysis and the expected delay guarantees for different importance layers of scalable video streams. Using the proposed tools, we offer a simple framework for the design and analysis of UEP NC-based multi-user video streaming systems and provide examples of system design for a video conferencing scenario in broadband wireless cellular networks.


Introduction
Real-time multi-user (or multi-party) video streaming refers to a scenario where multiple users, interconnected by a common communication network, perform real-time exchange of video streams [1,2]. Each of the users continuously creates its own video stream and is interested in the continuous and real-time recovery of the streams generated by a subset or the set of all the other users. Application examples include video conferencing, multiview video systems, multi-party peer-to-peer (P2P) video exchange, emerging multimedia-oriented social networking (e.g., "see-what-i-see" applications), etc. However, designing robust and efficient multi-user video streaming systems over wireless networks faces a number of challenges, most notably, the strict delay limits enforced by real-time requirements and time-variable wireless channel conditions responsible for frequent packet losses.
Network coding (NC) is a novel information processing technique applied in network nodes in which, instead of simple forwarding of received data packets, the data packets are combined and resulting network coded packets are transmitted instead. The idea was first introduced for the single-source multicast problem, where it was shown that, unlike routing, it achieves the capacity of the multicast connection [3]. For the single-source multicast problems represented by directed acyclic graphs with unit-capacity error-free edges, the class of linear network codes achieves the multicast connection capacity [4]. Furthermore, random linear codes over sufficiently large finite fields open the way for simple and fully distributed network code design [5]. The random linear coding (RLC) approach is adapted for practical implementation in lossy packet networks [6,7], and suggested in a number of wireless networking applications [8,9].
To increase throughput and improve error resilience, NC has been recently suggested for applications in multimedia streaming [10][11][12][13][14], and in particular, for multi-user video conferencing [15][16][17]. In [15], which is closest to our work, RLC is investigated for multi-party video conferencing in wireless broadband cellular systems. This study demonstrates that RLC applied within the central node has the potential to reduce the end-to-end delay, increase throughput, and improve the transmission reliability and system fairness.
In this article, we explore analytical tools for the design and analysis of a real-time multi-user video streaming system that applies scalable video coding and unequal error protection (UEP) RLC. We focus on a simple scenario where wireless users exchange video streams over a common central node with the goal of capturing the fundamental system behaviour. This work builds upon our recent theoretical analysis of UEP RLC schemes for erasure channels [18]. Addressing a specific UEP RLC application, real-time multi-user video streaming, this article extends the layer decoding probability analysis of UEP RLC addressed in previous work to include additional performance measures, such as the expected decoding delays of different video layers and the evolution over time of the expected received video quality of the exchanged scalable video streams. Using the presented set of analytical tools, we offer a simple framework for the design and analysis of UEP RLC based real-time multi-user scalable video streaming systems. The framework provides a flexible approach for reliable exchange of layered video streams over dynamically changing wireless channels. The application of the proposed framework and its benefits over the standard RLC approach applied in [15] are demonstrated through distortion-optimized system design examples.
http://jwcn.eurasipjournals.com/content/2012/1/218
The article is organized as follows. Section "RLC: an overview and UEP extension" provides a background on RLC and its UEP extensions, and provides a decoding performance analysis of the UEP RLC. The proposed multi-user video streaming setup is introduced in Section "Multi-user video streaming using UEP NC". The same section formulates the distortion-based optimization of the multi-user video streaming system based on UEP NC. Selected UEP NC code design examples are discussed in Section "System optimization and results". The article is concluded in Section "Conclusions".

Background and motivation
For wireless broadcasting, NC is usually motivated by the two-user packet exchange example in Figure 1 (see e.g., [19]). Instead of replicating and independently transmitting each user packet, the central node XORs the incoming user packets and broadcasts a single coded packet. As a result, the number of packet transmissions required for two-way packet exchange between users reduces from four to three.
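The two-user exchange described above can be sketched in a few lines of Python. This is only an illustration: the packet contents and the helper name `xor_bytes` are hypothetical and are not taken from [19].

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Uplink: each user sends its packet to the central node (2 transmissions).
packet_u1 = b"hello from U1"
packet_u2 = b"howdy from U2"

# Downlink: a single broadcast of the XOR replaces two separate forwards.
coded = xor_bytes(packet_u1, packet_u2)

# Each user cancels its own packet to obtain the other user's packet.
recovered_at_u1 = xor_bytes(coded, packet_u1)
recovered_at_u2 = xor_bytes(coded, packet_u2)

transmissions = 2 + 1  # two uploads plus one coded broadcast
```

Without coding, the central node would forward each packet separately, giving the four transmissions mentioned above.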
Figure 1
Simple two-user NC example.

The two-user example can be extended to multi-user scenarios using opportunistic binary NC (XOR-ing) for wireless broadcast networks, proposed in [20]. However, the broadcasting node needs to know the buffer content of its neighbours in order to construct the optimal encoded packet. Alternatively, the extension to the multi-user scenario is possible by applying RLC over the received data packets using non-binary finite field coefficients [15]. After $N_u$ users upload their data to the central node, the central node broadcasts random linear combinations of the users' packets in a "rateless" fashion, until each user recovers the data packets of the other users (Figure 2). A user needs to receive any $N_u - 1$ encoded packets broadcast by the central node in order to recover the other users' packets with high probability, provided that the finite field size is sufficiently large. In contrast, without NC, the alternatives are repeated broadcasting of user packets in a "data carousel" fashion, or managing a one-to-many automatic repeat request (ARQ) mechanism, which is known to suffer from a number of drawbacks (e.g., excessive delay and the feedback implosion problem).
In this article, we extend the basic idea of Figure 2 to the case where users exchange scalable coded video streams. Users' messages are organized into layers of different importance, starting with the most important and continuing with progressively less important layers. If, e.g., due to poor channel conditions or low access data rates, a user is unable to fully recover other users, it benefits from recovering as many importance layers of other users' messages starting from the most important layer onwards. For increased protection of more important layers over error-prone wireless links, the advantages of the UEP forward error correction (FEC) coding are demonstrated in a number of research studies [21][22][23]. In this article, we focus on the UEP RLC to explore the benefits of both rateless UEP FEC coding and NC.

Random linear coding
Let $x = \{x_1, x_2, \ldots, x_K\}$ be a source message that consists of $K$ equal-length source packets. RLC applied over the message $x$ produces encoded packets $\omega$ as random linear combinations of source packets with coefficients randomly selected from a given finite field GF($2^q$):
$$\omega = \sum_{i=1}^{K} \alpha_i x_i,$$
where $\alpha_i$ is a randomly selected element of GF($2^q$), and $\omega$ is of the same length as the source packets. Each encoded packet carries header information containing the global encoding coefficients within the global encoding vector $g = \{\alpha_1, \alpha_2, \ldots, \alpha_K\}$ [6]. For unicast transmission, to relax the overhead requirements, the transmitter and the receiver may use a pair of synchronized random number generators (RNG) to produce the same sequence of global encoding coefficients. In this case, only a short RNG seed needs to be conveyed within the packet overhead. The RLC encoding can be repeated at the transmitter in a rateless fashion, until the receiver collects enough encoded packets to decode the source message using Gaussian elimination (GE) decoding. GE decoding introduces complexity limitations on the message length $K$. However, for real-time interactive multimedia applications, small values of $K$ are acceptable for practical deployment [24,25].

Figure 2
Simple multi-user NC example ($N_u = 5$).
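As a minimal sketch of rateless RLC encoding and GE decoding, the following Python code works over GF(2), i.e., the special case $q = 1$, where addition is XOR and every non-zero coefficient equals 1. This choice is an assumption made for brevity; the non-binary fields discussed in the text would require finite-field multiplication. Packets are represented by 32-bit integers, and the names `rlc_encode` and `ge_decode` are hypothetical.

```python
import random

def rlc_encode(source, rng):
    """Return (g, omega): a random GF(2) encoding vector and the coded packet."""
    g = [rng.randint(0, 1) for _ in source]
    omega = 0
    for alpha, x in zip(g, source):
        if alpha:
            omega ^= x  # addition in GF(2) is XOR
    return g, omega

def ge_decode(received, K):
    """Gauss-Jordan elimination over GF(2); returns the K packets or None."""
    pivots = {}  # pivot column -> (reduced encoding vector, payload)
    for g, omega in received:
        g = list(g)
        # Reduce the incoming packet by all rows found so far.
        for col, (pg, pw) in pivots.items():
            if g[col]:
                g = [a ^ b for a, b in zip(g, pg)]
                omega ^= pw
        if not any(g):
            continue  # linearly dependent packet, carries no new information
        col = g.index(1)
        # Clear the new pivot column from the existing rows.
        for c, (pg, pw) in list(pivots.items()):
            if pg[col]:
                pivots[c] = ([a ^ b for a, b in zip(pg, g)], pw ^ omega)
        pivots[col] = (g, omega)
    if len(pivots) < K:
        return None
    return [pivots[c][1] for c in range(K)]

K = 8
rng = random.Random(1)
source = [rng.getrandbits(32) for _ in range(K)]

# Rateless transmission: keep sending until the receiver can decode.
received, decoded = [], None
while decoded is None:
    received.append(rlc_encode(source, rng))
    decoded = ge_decode(received, K)
```

Over GF(2), the receiver usually needs a few packets more than $K$; larger fields reduce this overhead, which is why the text assumes a sufficiently large field size.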

UEP random linear coding
Let $x = \{x_1, x_2, \ldots, x_K\}$ be a layered source message containing $K$ equal-length source packets classified into $L$ importance layers. The source message starts with the most important base layer (BL) and continues with progressively less important enhancement layers (EL). The subset of the source message containing the $l$-th layer is denoted as $x_l$ and contains $k_l$ source packets, where $\sum_{i=1}^{L} k_i = K$. We denote the subset of the source message containing the first $l$ layers as $x_{1:l}$, and the number of source packets in the first $l$ layers is $K_l = \sum_{i=1}^{l} k_i$ (Figure 3).
The UEP RLC scheme called expanding window (EW) RLC was investigated in [18]. The EW RLC introduces a set of EWs over the layered source message following the importance structure of the message. More precisely, for the $L$-layer message, the set of $L$ EWs is defined where the $l$-th EW, $1 \le l \le L$, contains the source block subset $x_{1:l}$ (Figure 3).

EW RLC encoding
The EW RLC introduces a probabilistic encoding process over the set of EWs. For each encoded packet, one of the EWs is first selected using the predefined window selection distribution $\Gamma(\xi) = \sum_{i=1}^{L} \Gamma_i \xi^i$, where $\Gamma_i$ is the probability of selection of the $i$-th window and $\sum_{i=1}^{L} \Gamma_i = 1$. Then, an encoded packet is created by applying RLC only over the selected window. Arbitrarily many EW RLC encoded packets can be produced by independently repeating the encoding process for each encoded packet.
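The window selection step can be sketched as follows. The function name `ew_rlc_coefficients`, its arguments, and the default field size $q = 8$ are illustrative assumptions; only the encoding vector is generated, since the packet combination itself is the standard RLC operation.

```python
import random

def ew_rlc_coefficients(layer_sizes, gamma, rng, q=8):
    """Draw one EW RLC global encoding vector.

    layer_sizes = [k_1, ..., k_L]; gamma = [Gamma_1, ..., Gamma_L].
    Returns (l, coeffs) with the selected window index l (1-based).
    """
    if abs(sum(gamma) - 1.0) > 1e-9:
        raise ValueError("window selection probabilities must sum to 1")
    K = sum(layer_sizes)
    # Select the l-th window with probability Gamma_l.
    l = rng.choices(range(1, len(gamma) + 1), weights=gamma)[0]
    K_l = sum(layer_sizes[:l])
    # Random GF(2^q) coefficients inside the window, zeros outside it.
    coeffs = [rng.randrange(2 ** q) for _ in range(K_l)] + [0] * (K - K_l)
    return l, coeffs

window, g = ew_rlc_coefficients([2, 3], [0.5, 0.5], random.Random(0))
```

Packets drawn from the first window involve only base layer packets, which is what gives the base layer its stronger protection.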

EW RLC decoding
A receiver collects correctly received EW RLC coded packets and decodes the source message (or a subset of its layers) using the standard GE decoder, as if standard RLC were used. An important difference is that parts of the layered source message can be decoded even if fewer than $K$ encoded packets are received.
For more details on the design of EW RLC, we refer the interested reader to [18].

Performance analysis of EW RLC
In the following, we review a simple upper bound for the set of decoding probabilities $P_{d,l}(N)$ that the content of the $l$-th window, $1 \le l \le L$, is recovered at the receiver after receiving any $N$ EW RLC encoded packets. The upper bound is general and holds for any packet-level coding and decoding scheme that applies probabilistic encoding and the expanding windowing approach. Moreover, the EW RLC in combination with GE decoding achieves this bound as the field size increases [18].
Let $y = \{y_1, y_2, \ldots, y_N\}$ be a sequence of $N$ received EW RLC encoded packets. For the derivation of upper bounds, $y$ is completely described by the corresponding vector $n = \{n_1, n_2, \ldots, n_L\}$, where $n_l$ denotes the number of received packets obtained by EW RLC coding over the $l$-th window. We denote by $y_l$ (and $y_{1:l}$, respectively) the subset of $y$ containing the set of $n_l$ ($N_l = \sum_{i=1}^{l} n_i$) received packets obtained by EW RLC encoding over the $l$-th (the first $l$) window(s).
For a given $n$, we define a set of variables $R_l(n)$, $1 \le l \le L$, using the following recursion:
$$R_1(n) = \min(n_1, K_1), \qquad R_l(n) = \min\big(n_l + R_{l-1}(n),\, K_l\big), \quad 2 \le l \le L. \tag{1}$$
Thus any received $y$ can be recursively transformed into $R = \{R_1(n), R_2(n), \ldots, R_L(n)\}$. The values of $R_l(n)$ provide an upper bound on the rank of the $N_l \times K$ matrices whose rows are the global encoding vectors of the $N_l$ received packets in $y_{1:l}$. In other words, $R_l(n)$ is the maximum number of source packets in $x_{1:l}$ that can be recovered from $y_{1:l}$. Using $R$, we can simply determine the set of layers of $x$ the receiver can recover after receiving $y$. Namely, $x_{1:l}$ can be fully recovered if $R_l(n) = K_l$. In addition, $x_{1:l}$ can also be recovered if any of the larger windows is recovered, i.e., if $R_m(n) = K_m$ for any $l < m \le L$, because larger windows contain smaller ones.
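Formally, the upper bound on $P_{d,l}(N)$ follows by conditioning on $n$ (endnote a):
$$P_{d,l}(N \mid n) \le I\Big(\bigvee_{i=l}^{L} \big(R_i(n) = K_i\big)\Big), \tag{2}$$
where $I(\cdot)$ represents an indicator function equal to 1 if its argument, which is a logical expression, is true, and $I(\cdot) = 0$ otherwise, and $\bigvee_i(\cdot)$ ($\bigwedge_j(\cdot)$) is the logical or (and) of a sequence of logical expressions parametrized by $i$ ($j$). Conditioning on $n$ is removed using the prior distribution over $n$:
$$P_{\Gamma(\xi)}(n) = \frac{N!}{n_1!\, n_2! \cdots n_L!}\, \Gamma_1^{n_1} \Gamma_2^{n_2} \cdots \Gamma_L^{n_L}. \tag{3}$$
Finally, the upper bound on $P_{d,l}(N)$ is obtained as:
$$P_{d,l}(N) \le \sum_{n:\, \sum_{i=1}^{L} n_i = N} P_{\Gamma(\xi)}(n)\, I\Big(\bigvee_{i=l}^{L} \big(R_i(n) = K_i\big)\Big). \tag{4}$$
For a given layered source message, $P_{d,l}(N)$ depends only on the selected window selection distribution $\Gamma(\xi)$ (endnote b). Therefore, designing EW RLC codes with the desired $P_{d,l}(N)$ behaviour reduces to the design of an appropriate $\Gamma(\xi)$ [18].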
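The bound can be evaluated by brute-force enumeration of all vectors $n$ that sum to $N$. The sketch below assumes the rank recursion $R_l(n) = \min(n_l + R_{l-1}(n), K_l)$ with $R_0 = 0$, which follows from the description of $R_l(n)$ as the largest achievable rank; the function names and the toy layer sizes are hypothetical.

```python
from itertools import accumulate
from math import factorial, prod

def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def rank_bounds(n, layer_sizes):
    """R_l(n) = min(n_l + R_{l-1}(n), K_l), starting from R_0 = 0."""
    bounds, r = [], 0
    for n_l, K_l in zip(n, accumulate(layer_sizes)):
        r = min(n_l + r, K_l)
        bounds.append(r)
    return bounds

def decoding_bound(N, layer_sizes, gamma):
    """Upper bounds [P_{d,1}(N), ..., P_{d,L}(N)] for window probabilities gamma."""
    L = len(layer_sizes)
    K_cum = list(accumulate(layer_sizes))
    p = [0.0] * L
    for n in compositions(N, L):
        # Multinomial prior over the per-window packet counts.
        weight = factorial(N) // prod(factorial(v) for v in n)
        weight *= prod(g ** v for g, v in zip(gamma, n))
        if weight == 0.0:
            continue
        R = rank_bounds(n, layer_sizes)
        full = [R[i] == K_cum[i] for i in range(L)]
        for l in range(L):
            # Window l is recovered if it, or any larger window, has full rank.
            if any(full[l:]):
                p[l] += weight
    return p

p_after_two_packets = decoding_bound(2, [2, 2], [0.5, 0.5])
```

For this toy message with $k_1 = k_2 = 2$ and $\Gamma_1 = \Gamma_2 = 0.5$, two received packets recover the base layer only when both were coded over the first window, so the bound on $P_{d,1}(2)$ is 0.25.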

System model
In this article, we propose the UEP NC as a core component of a real-time multi-user video streaming system. For simplicity, we focus on a wireless cellular network example where $N_u$ mobile users ($U_1, U_2, \ldots, U_{N_u}$) participate in a multi-party video conferencing session over a common base station (BS) within a single cell (Figure 4).
The presented results are applicable in similar scenarios, e.g., if instead of a single BS users connect to different BSs mutually interconnected by high-speed links (e.g., fiber optic) or if instead of a cellular network we observe a Wi-Fi access point.
In our scenario, every user continuously segments its own video stream into groups of frames (GOF), where each GOF contains $N_{gof}$ frames, and compresses every GOF using a scalable video coder. For each user $U_i$ and each compressed GOF, the output of the video coder is a layered source message $x^{(i)}$ that contains $K^{(i)}$ source packets, each of length $b$ bits, organized into $L$ layers, where the $l$-th layer contains $k_l^{(i)}$ packets. The values of $b$ and $L$ are the same across all users, whereas for each user $U_i$ and each GOF, the values $K^{(i)}$ and $\{k_1^{(i)}, \ldots, k_L^{(i)}\}$ may differ.
Video streaming among users in a session may be observed as a GOF-by-GOF exchange process. A single GOF exchange period repeats every $T_{gof} = N_{gof}/N_{fps}$ seconds, where $N_{fps}$ is the number of frames per second of the video stream, and both $N_{gof}$ and $N_{fps}$ are equal across all users. For simplicity, we assume that GOF periods are aligned among different users, i.e., the messages $x^{(i)}$ are synchronously generated by all the users. For every GOF period, the goal of each user is to share its own GOF and collect the other users' GOFs, or at least as many of their layers as possible starting from the most important one, within strict delay limits.
During each GOF period, the data exchange process can be divided into two phases. In the first, upload phase, users simultaneously upload their data to the BS, and in the second, broadcast phase, the BS broadcasts the received data to all the users. We assume that orthogonal channels are allocated between each user and the BS, and for the broadcast transmission by the BS, allowing for simultaneous transmission on all channels. Each wireless link is modeled as a synchronous time-slotted packet-erasure link where the fixed-size encoded packets of length $b$ bits are transmitted using a fixed transmission data rate $R$ and erasure probability $\varepsilon$. We assume that the pair $(R, \varepsilon)$ is in general different on different wireless links, that it remains fixed during the transmission period $T_{gof}$ of a single GOF, and that it may change between different GOF transmissions. The time slot duration $T_p = b/R$ represents the time required for a single encoded packet transmission (endnote c).
In the upload phase, to protect data from erasures, the user $U_i$ encodes the source message $x^{(i)}$ using the EW RLC scheme defined by a window selection distribution $\Gamma^{(i)}(\xi)$ and streams the encoded packets using the user rate $R^{(i)}$. The BS recovers the users' messages using $N_u$ independent GE decoders dedicated to different users. After decoding as many user layers as possible, the BS creates its own source message $x^{(BS)}$ that contains all or a subset of the users' message layers. For example, if all $N_u$ users are completely recovered, the message $x^{(BS)}$ is of length $\sum_{i=1}^{N_u} K^{(i)}$ packets (see Figure 4 for the two-layer scenario). In general, the BS may create the $l$-th layer $x_l^{(BS)}$ from the $l$-th layers of a subset of users. In the broadcast phase, the BS applies the EW RLC scheme defined by a window selection distribution $\Gamma^{(BS)}(\xi)$ over $x^{(BS)}$ and broadcasts the stream of encoded packets using the broadcast rate $R^{(BS)}$.

Figure 4
Multi-user video streaming system model.
Each user recovers the BS message $x^{(BS)}$ using the GE decoder where, prior to decoding, the user cancels out its own packets from the received encoded packets. We note that the two phases may overlap in time, i.e., the BS may start broadcasting a subset of (already recovered) layers of $x^{(BS)}$ before the upload phase of all users is completed.

Single-link analysis: decoding and delay performance
To analyze the system in Figure 4, we focus on a single-link transmission of a UEP RLC coded layered message between any transmitter-receiver pair during a fixed time period $T$. Instead of layer decoding probabilities after a fixed number of received packets, $P_{d,l}(N)$ (Section "Performance analysis of EW RLC"), we shift our interest to layer decoding probabilities after a fixed transmission time, $P_{d,l}(T)$. However, unlike $P_{d,l}(N)$, deriving $P_{d,l}(T)$ requires the introduction of a packet-level channel model.
During the period of duration $T$, the transmission process consists of $N_T = RT/b$ encoded packet transmissions (packet slots). We redefine the received sequence $y = \{y_1, y_2, \ldots, y_{N_T}\}$ to describe the outcome of all $N_T$ transmissions, where $y_i$ may represent either a received encoded packet or a lost (erased) packet. The received sequence $y$ can be described by the vector $n = \{n_1, n_2, \ldots, n_L, n_e\}$, where, as before, $n_i$ is the number of received encoded packets obtained by encoding over the $i$-th EW, $N = \sum_{i=1}^{L} n_i \le N_T$ is the number of (correctly) received encoded packets during the interval $T$, and $n_e = N_T - N$ is the number of erased packets.
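A short numerical illustration of the slot arithmetic follows, using a 400-byte packet, a 2 Mbit/s link, and a 125 ms period, values that appear in the examples of this section. Rounding $N_T$ down to a whole number of slots is an assumption made here; the function names are hypothetical.

```python
from math import floor

def slot_duration(b_bits, rate_bps):
    """T_p = b / R, in seconds."""
    return b_bits / rate_bps

def slots_in_period(T_seconds, b_bits, rate_bps):
    """N_T = floor(R * T / b): whole packet slots that fit into T."""
    return floor(rate_bps * T_seconds / b_bits)

b = 400 * 8          # packet length in bits
R = 2_000_000        # link rate in bit/s
T_p = slot_duration(b, R)
N_T = slots_in_period(0.125, b, R)
```

With these values, a packet slot lasts 1.6 ms and 78 complete slots fit into 125 ms.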
For a fixed $T$, the number of received encoded packets $N$ depends on the underlying channel packet-loss model. For simplicity, we assume a packet erasure channel model that erases encoded packets independently with erasure probability $\varepsilon$. To obtain $P_{d,l}(T)$, we use conditioning over $n$:
$$P_{d,l}(T) = \sum_{n} P_{\Gamma(\xi),\varepsilon}(n)\, P_{d,l}(T \mid n), \tag{5}$$
where $P_{\Gamma(\xi),\varepsilon}(n)$ is a slightly altered version of (3) that accounts for $n_e$ erased packet events:
$$P_{\Gamma(\xi),\varepsilon}(n) = \frac{N_T!}{n_1! \cdots n_L!\, n_e!}\, \big((1-\varepsilon)\Gamma_1\big)^{n_1} \cdots \big((1-\varepsilon)\Gamma_L\big)^{n_L}\, \varepsilon^{n_e}. \tag{6}$$
$P_{d,l}(T \mid n)$ can be obtained as the layer decoding probability $P_{d,l}(N \mid n)$ after $N$ received encoded packets (Section "Performance analysis of EW RLC", Equations (1)-(2)) since, by knowing $n$, we directly obtain $N = \sum_{i=1}^{L} n_i$. The knowledge of $P_{d,l}(T)$ implicitly provides information on the decoding delay of the $l$-th message layer. For example, one can search for the minimal transmission period $T_{th}^{(l)}$ such that the $l$-th message layer decoding probability satisfies $P_{d,l}(T_{th}^{(l)}) > P_{th}^{(l)}$, where $P_{th}^{(l)}$ is the threshold decoding probability set in advance. For more explicit delay information, one can obtain the probability distribution $p_{d,l}(N_T)$ that the $l$-th message layer is recovered after exactly the $N_T$-th time slot (and cannot be recovered before):
$$p_{d,l}(N_T) = P_{d,l}(N_T T_p) - P_{d,l}\big((N_T - 1) T_p\big), \tag{7}$$
where $T_p = b/R$. The expected delay $E_l[N_T]$ for recovery of the $l$-th message layer is:
$$E_l[N_T] = \sum_{N_T = 1}^{\infty} N_T\, p_{d,l}(N_T). \tag{8}$$

Example 1.
Consider a two-layer source message $x$. The EW RLC scheme defined by $\Gamma(\xi) = \Gamma_1 \xi + (1 - \Gamma_1)\xi^2$ is applied over $x$, producing a continuous stream of 400-byte encoded packets. The wireless link towards the receiver is modeled as a synchronous rate $R = 2$ Mbit/s link with packet erasure probability $\varepsilon = 0.1$. Figure 5 presents the evolution of the layer decoding probabilities $P_{d,1}(T)$ and $P_{d,2}(T)$ over time $T$ at the receiver for a range of $\Gamma_1$ values. As a baseline scheme, we start from the middle solid curve for $\Gamma_1 = 0$, representing standard RLC applied over the whole message, which results in simultaneous recovery of both message layers.
By increasing $\Gamma_1$, we obtain the UEP effect of the EW RLC scheme, where the solid curves for $P_{d,1}(T)$ gradually shift to the left, i.e., towards earlier recovery of the most important data. The extreme case of $\Gamma_1 = 1$ results in the earliest recovery of the most important part, represented by the leftmost dashed curve. The increase in $\Gamma_1$ comes at the price of delayed decoding of the less important layer, $P_{d,2}(T)$.

Figure 6

The single-link analysis can be extended to the scenario where the transmitter changes the applied EW RLC code during the transmission (i.e., switches between different $\Gamma(\xi)$). As an example, we derive $P_{d,l}(T)$ for $T > T_1$, given that the transmitter has applied the EW RLC defined by $\Gamma^{(a)}(\xi)$ during $0 \le t \le T_1$, and the EW RLC defined by $\Gamma^{(b)}(\xi)$ for $t > T_1$. For the link parameters $R$ and $\varepsilon$, the transmitter sends $N_1 = RT_1/b$ encoded packets using $\Gamma^{(a)}(\xi)$ and $N_2 = R(T - T_1)/b$ encoded packets using $\Gamma^{(b)}(\xi)$. The received sequence $y = [y_1\; y_2]$ is a concatenation of two sequences $y_1$ and $y_2$ of length $N_1$ and $N_2$, respectively. It can be described by the vector $n = n_1 \oplus n_2$, where the vectors $n_1$ and $n_2$ represent the descriptions of $y_1$ and $y_2$, respectively (as defined earlier), and $\oplus$ is the component-wise sum of two equal-length vectors. Since $n_1$ and $n_2$ follow the probability distributions $P_{\Gamma^{(a)}(\xi),\varepsilon}(n_1)$ and $P_{\Gamma^{(b)}(\xi),\varepsilon}(n_2)$ given by Equation (6), the layer decoding probabilities are obtained as in (5):
$$P_{d,l}(T) = \sum_{n_1} \sum_{n_2} P_{\Gamma^{(a)}(\xi),\varepsilon}(n_1)\, P_{\Gamma^{(b)}(\xi),\varepsilon}(n_2)\, P_{d,l}(T \mid n_1 \oplus n_2), \tag{9}$$
where $P_{d,l}(T \mid n)$ follows from $P_{d,l}(N = \sum_{i=1}^{L} n_i \mid n)$.
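The time-domain decoding probability and the expected delay can be sketched numerically for a single fixed code. The sketch relies on the observation that, under independent erasures, the number of received packets after $N_T$ slots is binomially distributed, so averaging any $P_{d,l}(N)$ over that distribution gives $P_{d,l}(T)$. As a simplifying assumption, the demonstration uses standard RLC over a hypothetical $K = 4$ packet message, for which the bound on $P_{d}(N)$ is a step at $N = K$; the function names are illustrative.

```python
from math import comb

def p_decode_after_slots(N_T, eps, p_given_N):
    """P_d(T) for T = N_T slots: average p_given_N(N), N ~ Binomial(N_T, 1 - eps)."""
    return sum(
        comb(N_T, N) * (1 - eps) ** N * eps ** (N_T - N) * p_given_N(N)
        for N in range(N_T + 1)
    )

def expected_delay_slots(eps, p_given_N, max_slots):
    """Sum of N_T * p_d(N_T), with p_d(N_T) = P_d(N_T) - P_d(N_T - 1), truncated."""
    expected, previous = 0.0, 0.0
    for N_T in range(1, max_slots + 1):
        current = p_decode_after_slots(N_T, eps, p_given_N)
        expected += N_T * (current - previous)
        previous = current
    return expected

K = 4  # hypothetical message length

def standard_rlc(N):
    """Bound on the decoding probability of standard RLC: decodable once N >= K."""
    return 1.0 if N >= K else 0.0

delay = expected_delay_slots(0.5, standard_rlc, 200)
```

In this special case the expected delay is $K/(1-\varepsilon)$ slots, i.e., 8 slots for $\varepsilon = 0.5$, which the truncated sum reproduces. Passing a layer-specific $P_{d,l}(N)$ instead of `standard_rlc` gives the corresponding per-layer quantities.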

Example 2.
We continue Example 1 by investigating the evolution of $P_{d,1}(T)$ and $P_{d,2}(T)$ over time $T$ at the receiver if the transmitter applies $\Gamma^{(a)}(\xi) = 0.5\xi + 0.5\xi^2$ for the first $T_1 = 125$ ms, and then changes to $\Gamma^{(b)}(\xi) = 0.1\xi + 0.9\xi^2$ (the remaining parameters are the same as in the previous example). Figure 7 compares the case where $\Gamma^{(a)}(\xi)$ changes to $\Gamma^{(b)}(\xi)$ with the case where $\Gamma^{(a)}(\xi)$ is used throughout the transmission. The figure illustrates that the change of $\Gamma(\xi)$ has no effect on $P_{d,1}(T)$ as it comes too late ($P_{d,1}(T_1 = 0.125) = 1$), whereas the improvement of $P_{d,2}(T)$ for $T > T_1$ is notable due to the increase in the second window selection probability from 0.5 to 0.9. This points to possible adaptive (e.g., feedback-driven) updates of $\Gamma(\xi)$ during transmission. In addition, Figure 8 demonstrates that the upper bound expressions for $P_{d,l}(T)$ used in this article match very well the exact calculation of $P_{d,l}(T)$ for the finite field GF($2^8$) (markers) [18].
Finally, the analytical results presented above can be easily extended to the Gilbert-Elliott erasure channel model with two states: the good and the bad state. This follows from the fact that the probability distribution of the number $n_e$ of erasures over $N_T$ consecutive realizations of the channel (i.e., over the time interval $T$) is known (e.g., see [26]). The remaining $N_T - n_e$ non-erased channel realizations deliver encoded symbols from different EWs according to the multinomial distribution law (3).

Distortion-optimal system design
In this section, we apply the single-link analysis to analyze the multi-user video conferencing setup introduced in Section "System model". Our goal is to formulate the system design problem that leads to the EW RLC code design providing distortion-optimal system performance. As a distortion measure, we use the peak signal-to-noise ratio (PSNR) as a standard video quality metric following directly from the mean square error (MSE) distortion measure. We use the terms video quality and distortion interchangeably while referring to the video quality (PSNR) measure.
We focus on a single message (GOF) exchange cycle among the system users. After the initial delay of $t = T_{gof}$ needed for each user to acquire and compress $N_{gof}$ frames of video (assuming negligible compression delay), the users start the upload phase. During the upload phase, the BS waits to receive enough encoded packets to recover, with sufficiently high probability, the users' messages or as many of their layers as possible. The upload phase duration $T_{ul}$ is upper bounded by the GOF period duration $T_{gof}$ since, after this period expires, users are supplied with a new set of compressed messages, which marks the beginning of the upload phase of the next message exchange cycle. From the set of recovered layers, the BS creates its own message $x^{(BS)}$, which is broadcast back to the users over the broadcast downlink channel during the broadcast phase of duration $T_{dl}$. To simplify the analysis, we assume that the upload and the broadcast phases do not overlap, i.e., after the upload phase of duration $T_{ul}$, the users stop transmitting and start listening to the BS for the following period of duration $T_{dl}$. This analysis provides guaranteed (lower-bound) performance for the case of overlapping phases, as we discuss later.
We are interested in the system design that maximizes the total average received video quality at the user terminals after a given target system delay $T = T_{gof} + T_{ul} + T_{dl}$. Note that, as $T_{gof}$ is constant and $T$ is given in advance, it follows that the system design should optimally balance between $T_{ul}$ and $T_{dl}$ ($T_{ul} + T_{dl} = T - T_{gof} = \text{const}$). The time diagram of the system model, ignoring the propagation and data processing delays, is illustrated in Figure 8.

Upload phase
The upload phase is represented by $N_u$ parallel and independent single-link transmission processes, each characterized by different message/layer sizes $(K^{(i)}, \{k_1^{(i)}, \ldots, k_L^{(i)}\})$ and channel state pairs $(R^{(i)}, \varepsilon^{(i)})$. Assuming that the user knows $(R^{(i)}, \varepsilon^{(i)})$ (endnote d) and that the value of $T_{ul}$ is fixed in advance by the BS, the set of layer decoding probabilities $P_{d,l}^{(i-BS)}(T_{ul})$ of the $i$-th user message at the BS can be calculated for any selected EW RLC parameter $\Gamma^{(i)}(\xi)$. In the following, we focus on a simple user upload strategy where the user applies standard RLC over the largest window $l$ such that the decoding probability satisfies $P_{d,l}^{(i-BS)}(T_{ul}) \ge P_{th}$, where $P_{th}$ is a (close to one) value of the threshold probability selected in advance. More formally, the $i$-th user will apply RLC only over the $l^{(i)}$-th window, where $l^{(i)}$ is obtained as:
$$l^{(i)} = \max\big\{\, l : P_{d,l}^{(i-BS)}(T_{ul}) \ge P_{th} \,\big\}. \tag{10}$$
Note that applying RLC only over the $l^{(i)}$-th window is equivalent to the special case of applying UEP RLC with the window selection distribution $\Gamma^{(i)}(\xi) = \xi^{l^{(i)}}$ (i.e., the one which places probability one on the $l^{(i)}$-th window).
Overall, the set of $N_u$ users will upload a subset of their layers, jointly described by the vector $\mathbf{l} = \{l^{(1)}, l^{(2)}, \ldots, l^{(N_u)}\}$, within the upload phase of duration $T_{ul}$. The probability $P_{th}$ can be selected so as to keep sufficiently high the overall probability $P_{d,\mathbf{l}}^{(BS)}(T_{ul}) \ge P_{th}^{N_u}$ that the BS will recover the set of users' layers described by $\mathbf{l}$ during the upload phase of duration $T_{ul}$.
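The upload strategy can be sketched as a search for the largest window that meets the threshold. The sketch assumes the upper-bound model in which standard RLC over a window of $K_l$ packets is decodable once at least $K_l$ encoded packets survive the erasures; the function names and the toy layer sizes in the checks are hypothetical.

```python
from math import comb

def p_rlc_window_decoded(K_l, N_T, eps):
    """Probability that at least K_l of N_T transmitted packets are received."""
    return sum(
        comb(N_T, N) * (1 - eps) ** N * eps ** (N_T - N)
        for N in range(K_l, N_T + 1)
    )

def layers_to_upload(layer_sizes, N_T, eps, p_th):
    """Largest l whose window decodes with probability >= p_th (0 if none)."""
    best, K_l = 0, 0
    for l, k in enumerate(layer_sizes, start=1):
        K_l += k  # cumulative window size K_l = k_1 + ... + k_l
        if p_rlc_window_decoded(K_l, N_T, eps) >= p_th:
            best = l
    return best

# Overall reliability guaranteed when N_u users each meet the threshold.
overall = 0.99 ** 4
```

For instance, with $N_u = 4$ users and $P_{th} = 0.99$, the BS recovers all selected layers with probability of at least $0.99^4 \approx 0.961$.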

Broadcast phase
During the broadcast phase, the BS applies the EW RLC code defined by $\Gamma^{(BS)}(\xi)$, which has to be designed for the set of heterogeneous broadcast channel conditions $(R^{(BS)}, \{\varepsilon_{BS-i}\}_{1 \le i \le N_u})$. This problem has been recently addressed for expanding window fountain (EWF) code design in a video multicast setup [27], with the difference that in this article, instead of broadcasting a single stream, the BS simultaneously broadcasts a mixture of $N_u$ user streams.
Each user simultaneously receives $N_u - 1$ video streams originating at the remaining system users. The average received video quality $D^{(i)}$ perceived by the $i$-th user is obtained by averaging over the received video qualities of all $N_u - 1$ video streams:
$$D^{(i)} = \frac{1}{N_u - 1} \sum_{j=1,\, j \ne i}^{N_u} D_j^{(i)}, \tag{11}$$
where $D_j^{(i)}$ is the average received PSNR of the $j$-th user video content as perceived by the $i$-th user. $D_j^{(i)}$ can be obtained by combining the results of the upload and the broadcast phase analysis:
$$D_j^{(i)} = P_{d,\mathbf{l}}^{(BS)}(T_{ul}) \sum_{l=1}^{l^{(j)}} P_l^{(BS-i)}(T_{dl})\, D_{j,1:l}, \tag{12}$$
where the sum is taken over the set of $l^{(j)} \le L$ layers of the $j$-th user included in $x^{(BS)}(\mathbf{l})$. In the above expression, $P_l^{(BS-i)}(T_{dl})$ is the probability that exactly the first $l$ layers of the BS message $x^{(BS)}(\mathbf{l})$ are recovered at the user $i$:
$$P_l^{(BS-i)}(T_{dl}) = P_{d,l}^{(BS-i)}(T_{dl}) - P_{d,l+1}^{(BS-i)}(T_{dl}), \tag{13}$$
with $P_{d,L+1}^{(BS-i)}(T_{dl}) = 0$, and $D_{j,1:l}$ is the average received PSNR of the $j$-th user video content after recovery of the first $l$ layers (averaged over all the frames of the transmitted GOF). Finally, the average received PSNR, averaged across all the users of the multi-user video streaming session, is equal to:
$$D = \frac{1}{N_u} \sum_{i=1}^{N_u} D^{(i)}. \tag{14}$$
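The averaging steps can be illustrated with a small sketch. It assumes that the probability of recovering exactly the first $l$ layers is the difference of consecutive (nested) layer decoding probabilities, and it uses made-up PSNR values; the function names are hypothetical and the numbers are not results from this article.

```python
def exact_layer_probs(p_d):
    """P(exactly the first l layers recovered), l = 1..L, from nested P_{d,l}."""
    return [p - nxt for p, nxt in zip(p_d, p_d[1:] + [0.0])]

def expected_psnr(p_d, psnr_cum, p_upload=1.0):
    """p_upload * sum over l of P(exactly l layers) * PSNR after l layers."""
    probs = exact_layer_probs(p_d)
    return p_upload * sum(p * d for p, d in zip(probs, psnr_cum))

def session_average(per_user_quality):
    """Plain average of the per-user received qualities."""
    return sum(per_user_quality) / len(per_user_quality)

# Hypothetical two-layer stream: base layer always decoded, enhancement
# layer decoded half of the time; PSNR of 30 dB and 36 dB, respectively.
quality = expected_psnr([1.0, 0.5], [30.0, 36.0])
```

In this toy case the expected PSNR is the midpoint of the two quality levels, 33 dB.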

System parameters and design
From (11)-(14), by factoring out $P_{d,\mathbf{l}}^{(BS)}(T_{ul})$, we note that the distortion-optimized system design allows for independent design of the upload and the broadcast phase, given that the duration $T_{ul}$ and the decoding probability threshold $P_{th}$ are fixed. In other words, by fixing the values of $T_{ul}$ and $P_{th}$ and informing the users of them, the set of layers $\mathbf{l}(T_{ul})$ that can be reliably uploaded in the upload phase can be determined by the corresponding users. Consequently, the BS message $x^{(BS)}(\mathbf{l})$ and $T_{dl} = T - T_{gof} - T_{ul}$ are also determined, which reduces the broadcast phase design to the optimization of the EW RLC code parameter $\Gamma^{(BS)}(\xi)$ such that the average received video quality $D$ is maximized after the target system delay $T$.
Overall, for the distortion-optimized system design, the BS should optimally balance between the upload and the broadcast phase by selecting an appropriate $T_{ul}$ and an appropriate threshold probability $P_{th}$, and optimally satisfy heterogeneous user requirements by selecting an optimized $\Gamma^{(BS)}(\xi)$. The optimal solution balances the number of layers that can be uploaded to the BS with reliability $P_{th}$ after $T_{ul}$ against their quality of reconstruction at the set of heterogeneous users after $T_{dl}$.

System optimization
For the system model and distortion-optimized design discussed above, the system optimization process is performed centrally, e.g., at the video conference server collocated with the central BS node. Given the parameters of all the user messages, the uplink channel conditions $(R^{(i)}, \varepsilon_i)$ and the broadcast channel conditions $(R^{(BS)}, \{\varepsilon_{BS-i}\}_{1 \le i \le N_u})$, the BS should provide the duration of the upload phase $T_{ul}$, the threshold probability $P_{th}$ and the EW RLC code design parameter $\Gamma^{(BS)}(\xi)$, such that the average received PSNR $D$ is maximized after the target system delay $T$. In other words, the BS solves the following problem:
$$\max_{T_{ul},\, P_{th},\, \Gamma^{(BS)}(\xi)} D, \tag{15}$$
where $0 \le T_{ul} \le \min\{T_{gof}, T - T_{gof}\}$ and for $\Gamma^{(BS)}(\xi)$ we require $\sum_{l} \Gamma_l^{(BS)} = 1$. Assuming that the BS knows the channel conditions (e.g., by measurements and user reporting), it still needs to know the user message parameters to be able to perform the above optimization. Since these data cannot be obtained instantaneously at the BS, to avoid delays, we assume that the BS uses information available from recent GOF exchanges (e.g., the last GOF or the average over the last several GOFs). This way, the BS is able to perform system optimization prior to the start of the upload phase and to broadcast the required parameters $T_{ul}$ and $P_{th}$ back to the users. The users then determine the number of layers $l^{(i)}$ to upload to the BS and start the upload phase.
In general, the complexity of calculating the set of layer decoding probabilities in Sections "Performance analysis of EW RLC" and "Single-link analysis: decoding and delay performance" grows exponentially as $K$ and $L$ grow, due to an exponential number of terms in the sums given in Equations (4) and (5). However, in practical applications, the calculations are tractable due to the fact that $K$, $L$ and $N_u$ are usually small. For example, $K$ is already bounded by GE decoding complexity and should not exceed $K \sim 100$; the number of scalable video layers is typically small, e.g., $L < 5$; and for comfortable use of a real-time multi-user video conferencing system, $N_u$ should also be small, e.g., $N_u < 5$ (note that $N_u$ can be larger as long as each user displays only a small subset of active user streams). With these restrictions on $K$, $L$ and $N_u$, the optimization problem can be evaluated at the BS side server with acceptable delay. Alternatively, the BS may run the optimization less frequently than on a GOF-by-GOF basis, using accumulated averages of channel conditions and GOF message lengths, and periodically update the users and the BS transmitter with the new values of $(T_{ul}, P_{th})$ and $\Gamma^{(BS)}(\xi)$, respectively.
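To get a feeling for the size of these sums, the number of vectors $n$ with $L$ non-negative integer entries summing to $N$ can be counted with the standard "stars and bars" formula. The helper below is a hypothetical illustration, not part of the original analysis.

```python
from math import comb

def num_terms(N, L):
    """Number of vectors (n_1, ..., n_L) of non-negative integers with sum N."""
    return comb(N + L - 1, L - 1)

# Terms in the sum over n for N = 100 received packets and L = 4 windows.
terms = num_terms(100, 4)
```

For $N = 100$ and $L = 4$ this gives 176851 terms per evaluated value of $N$, which stays manageable for the small $K$ and $L$ considered here. For the time-domain sum, the extra component $n_e$ plays the role of one additional entry in $n$.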

Design examples
The multi-user video streaming system design proposed in this article is illustrated using numerical examples.

Example 3.
In this example, we present a distortion-optimized UEP NC solution for the multi-user video conferencing system with $N_u = 4$ users (Figure 4) [...] $\epsilon_3 = 0.05$ and ($R^{(4)} = 1.5$ Mbps, $\epsilon_4 = 0.12$), to account for the variations in particular uplink conditions. The BS broadcast rate is set to $R^{(BS)} = 6$ Mbps and, for simplicity, the broadcast erasure rates towards each user are set equal to the erasure rates of the corresponding uplink channels, i.e., $\epsilon_{BS-i} = \epsilon_i$. Given the system parameters above, we seek the optimal system parameters $(T_{ul}, \Gamma^{(BS)}(\xi))$ such that the average received PSNR $D$ across all system users is maximized after the target delay $T = 250$ ms. For simplicity, we fix $P_{th} = 0.99$. The solution is illustrated in Figure 9, where the average PSNR is plotted as a (two-dimensional) function of $(T_{ul}, \Gamma^{(BS)}_1)$ [...] $E[T] = 58.58 < 66 = T_{ul}$. This points to the possibility of an approximate system design based on expected delay calculations.
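The layer selection step performed by each user can be sketched as follows. This is a minimal sketch under stated assumptions, not the decoding probability analysis of the article: it idealises RLC by assuming that a window of $K$ source packets is decodable as soon as any $K$ encoded packets are received (reasonable only for large finite fields), and it models erasures as independent. The packet size `packet_bits` and the cumulative layer sizes `cum_packets` are hypothetical values; only the rate, erasure rate and upload duration in the usage line are taken from the example.

```python
from math import comb


def prob_at_least(k, n, erasure):
    """Probability that at least k of n transmitted packets survive i.i.d. erasures."""
    if k > n:
        return 0.0
    p = 1.0 - erasure
    return sum(comb(n, j) * p**j * erasure ** (n - j) for j in range(k, n + 1))


def largest_decodable_layer(rate_bps, erasure, t_ul_s, packet_bits, cum_packets, p_th):
    """Return the highest layer index whose window is received with probability >= p_th.

    cum_packets[l-1] is the number of source packets in layers 1..l (an assumption).
    Returns 0 if not even the base layer meets the threshold.
    """
    n_sent = int(rate_bps * t_ul_s / packet_bits)  # packets sent during the upload phase
    best = 0
    for layer, k in enumerate(cum_packets, start=1):
        if prob_at_least(k, n_sent, erasure) >= p_th:
            best = layer
    return best


# User 4 of the example; packet size and layer sizes are illustrative assumptions.
layers_user4 = largest_decodable_layer(1.5e6, 0.12, 0.066, 4000, [8, 16], 0.99)
```

Because the binomial tail only shrinks as the window grows, the loop effectively returns the largest window that still meets the threshold probability.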

Example 4.
Additional flexibility in the system design is obtained if the users compress their video streams into a larger number of layers. In this example, we observe the performance of the distortion-optimized system design for the same transmission parameters as in the previous example, but where the layered source message is compressed into $L = 4$ quality layers (see Table 3 for the message parameters). The system performs optimally for $T_{ul} = 64$ ms, where the users are able to upload the set of layers $l = (2, 3, 3, 2)$ and $P_{th} = 0.99$ is assumed fixed. The EW RLC broadcast phase achieves the optimal value $D = 34.88$ for the window selection distribution $\Gamma^{(BS)}(\xi) = 0.5\xi + 0.5\xi^3$. We note that the gain obtained in the average system distortion $D$ is not large, because compressing video into a larger number of layers introduces small performance penalties, but the finer layer resolution provides more options for the system design process.
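To make the role of the window selection distribution concrete, the sketch below represents the polynomial $0.5\xi + 0.5\xi^3$ from this example as a list of per-window probabilities for $L = 4$ and computes, for each layer, the fraction of broadcast packets that cover it. It relies on the expanding window property that window $l$ spans layers 1 to $l$; the function names are illustrative and not taken from the article.

```python
import random


def layer_coverage(gamma):
    """Fraction of encoded packets that include each layer.

    gamma[l-1] is the probability of selecting window l. Since window l spans
    layers 1..l, layer j is covered by every window l >= j.
    """
    return [sum(gamma[j:]) for j in range(len(gamma))]


def sample_window(gamma, rng):
    """Draw a window index (1-based) according to the selection distribution."""
    return rng.choices(range(1, len(gamma) + 1), weights=gamma)[0]


# 0.5*xi + 0.5*xi^3 with L = 4: windows 1 and 3 are each chosen half of the time.
gamma_bs = [0.5, 0.0, 0.5, 0.0]
coverage = layer_coverage(gamma_bs)

rng = random.Random(1)  # fixed seed only to make the illustration repeatable
windows = [sample_window(gamma_bs, rng) for _ in range(20)]
```

Under this distribution every packet protects the base layer, half of the packets also protect layers 2 and 3, and no packet carries the fourth layer.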

Decode-and-broadcast versus buffer-and-broadcast
In the proposed system, we apply the decode-and-broadcast operation at the central multi-user video streaming point: the uploaded streams are first decoded and then broadcast within a non-overlapping broadcast phase. This approach simplifies the application of our analytical tools and enables a simple and elegant system design. However, improvements are possible if the broadcast phase is initiated before the incoming user messages are completely recovered. Possible improvements are briefly discussed below.

Layer-by-layer decode-and-broadcast
Let us assume an upload phase in which a general UEP RLC is applied instead of the specific RLC case that encodes the largest window decodable within $T_{ul}$. In this case, the unequal recovery time (URT) property enables the central point to decode user layers sequentially over time, starting from the BL onwards [18]. Thus, the central point is able to produce encoded packets as soon as the BL of the message $x^{(BS)}$ is decoded, and to include additional layers as soon as they become available, while updating the broadcast EW RLC window selection distribution $\Gamma^{(BS)}(\xi)$ "on the fly," as illustrated in Example 2. We note that this scenario introduces a trade-off between the increased upload delays of the higher layers and the earlier start of the broadcast phase, which has to be balanced by the optimal solution. Unfortunately, the distortion-optimized system design for this scenario would result in a tedious optimization problem, which is why we leave it out of consideration. However, we note that an expected delay analysis, similar to the one presented in Table 2, could be used as a simple approximation for the layer-by-layer decode-and-broadcast system design.

Buffer-and-broadcast
Finally, the simplest buffer-and-broadcast solution follows the standard NC approach in which all the received encoded packets are buffered, and new encoded packets are produced by applying RLC over the buffer content [6,7]. In the proposed UEP RLC case, the central point maintains $L$ separate buffers, each collecting the encoded packets of different users produced over one of the $L$ windows. As soon as the upload phase starts filling the buffers, the broadcast phase starts producing encoded packets, where each encoded packet results from applying RLC over one of the buffers, selected independently according to the appropriate window (i.e., buffer) selection distribution $\Gamma^{(BS)}(\xi)$. Although simple to implement, this solution lacks efficient analysis and distortion-based optimization tools. In addition, the problem of broadcasting linearly dependent encoded packets may become significant as the user upload rates decrease and the broadcast rate increases (i.e., when the rate at which encoded packets are generated exceeds the rate of incoming source data).
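The buffer-and-broadcast operation described above can be sketched as follows. This is a minimal illustration under simplifying assumptions: coding is done over GF(2) (bytewise XOR) rather than a larger finite field, all packets have equal length, and only non-empty buffers are eligible for selection, which implicitly renormalises the selection probabilities. These choices, and the function names, are assumptions made for the example and are not prescribed by the article.

```python
import random


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length packets (addition over GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))


def broadcast_packet(buffers, gamma, rng):
    """Produce one broadcast packet from per-window buffers.

    buffers[l-1] holds the encoded packets received so far for window l, and
    gamma[l-1] is the probability of selecting that window. Returns a tuple
    (window, coefficients, payload), or None if no eligible buffer has data.
    """
    candidates = [l for l, buf in enumerate(buffers) if buf and gamma[l] > 0]
    if not candidates:
        return None
    l = rng.choices(candidates, weights=[gamma[c] for c in candidates])[0]
    buf = buffers[l]
    coeffs = [rng.randint(0, 1) for _ in buf]  # random binary coefficients
    if not any(coeffs):
        coeffs[rng.randrange(len(buf))] = 1  # avoid the useless all-zero combination
    payload = bytes(len(buf[0]))
    for c, pkt in zip(coeffs, buf):
        if c:
            payload = xor_bytes(payload, pkt)
    return l + 1, coeffs, payload


rng = random.Random(0)  # fixed seed only to make the illustration repeatable
buffers = [[b"\x01\x02", b"\x04\x08"], []]  # window 2 has not received anything yet
result = broadcast_packet(buffers, [0.5, 0.5], rng)
```

The returned coefficients refer to the buffered packets, which are themselves encoded; a receiver would need the composed coefficient vectors with respect to the source packets. With few buffered packets and many broadcast packets, repeated combinations become likely, which is the linear dependence problem noted above.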

Conclusions
Real-time sharing of video content among multiple users over wireless networks underlies a number of existing and upcoming mobile multimedia services. For a robust, flexible and efficient implementation of such services, this article considered a combination of scalable video coding and UEP NC. We have presented analytical tools capable of producing the values of the key system design parameters that result in distortion-optimal system performance. The application of the proposed tools is illustrated through several examples involving a simple single access point multi-user scenario.
Endnotes
a For compactness, we denote R l (n) as R l .
b Note that, due to the probabilistic encoding, the decoding performance is independent of the packet erasure process in the channel and depends only on the number $N$ of received packets.
c This model roughly captures the behaviour of adaptive modulation and coding (AMC) at the physical layer of cellular systems where, depending on the channel quality feedback available at the BS, different AMC modes could be approximated by different $(R, \epsilon)$ pairs. We assume slowly-varying channels where AMC mode changes are of the order of $T_{gof}$.
d In state-of-the-art wireless cellular broadband systems such as LTE or WiMAX, channel quality indicators (CQI) are continuously fed back by user equipment to the BS.
e For presentation purposes, Figure 9 is obtained by brute-force calculation over a grid of points in $(T_{ul}, \Gamma^{(BS)}_1)$ space. In general, (one of) the optimal solution(s) can be obtained by applying nonlinear programming methods such as sequential quadratic programming (e.g., using MATLAB).