Open Access

Multi-user video streaming using unequal error protection network coding in wireless networks

EURASIP Journal on Wireless Communications and Networking 2012, 2012:218

https://doi.org/10.1186/1687-1499-2012-218

Received: 27 February 2012

Accepted: 20 June 2012

Published: 13 July 2012

Abstract

In this article, we investigate a multi-user video streaming system applying unequal error protection (UEP) network coding (NC) for simultaneous real-time exchange of scalable video streams among multiple users. We focus on a simple wireless scenario, chosen to capture the fundamental system behaviour, where users exchange encoded data packets over a common central network node (e.g., a base station or an access point). Our goal is to present analytical tools that provide both the decoding probability analysis and the expected delay guarantees for different importance layers of scalable video streams. Using the proposed tools, we offer a simple framework for the design and analysis of UEP NC-based multi-user video streaming systems and provide examples of system design for a video conferencing scenario in broadband wireless cellular networks.

Introduction

Real-time multi-user (or multi-party) video streaming refers to a scenario where multiple users, interconnected by a common communication network, perform real-time exchange of video streams [1, 2]. Each of the users continuously creates its own video stream and is interested in the continuous and real-time recovery of the streams generated by a subset or the set of all the other users. Application examples include video conferencing, multi-view video systems, multi-party peer-to-peer (P2P) video exchange, emerging multimedia-oriented social networking (e.g., “see-what-i-see” applications), etc. However, designing robust and efficient multi-user video streaming systems over wireless networks faces a number of challenges, most notably, the strict delay limits enforced by real-time requirements and time-variable wireless channel conditions responsible for frequent packet losses.

Network coding (NC) is a novel information processing technique applied in network nodes in which, instead of simple forwarding of received data packets, the data packets are combined and resulting network coded packets are transmitted instead. The idea was first introduced for the single-source multicast problem, where it was shown that, unlike routing, it achieves the capacity of the multicast connection [3]. For the single-source multicast problems represented by directed acyclic graphs with unit-capacity error-free edges, the class of linear network codes achieves the multicast connection capacity [4]. Furthermore, random linear codes over sufficiently large finite fields open the way for simple and fully distributed network code design [5]. The random linear coding (RLC) approach is adapted for practical implementation in lossy packet networks [6, 7], and suggested in a number of wireless networking applications [8, 9].

To increase throughput and improve error resilience, NC has recently been suggested for applications in multimedia streaming [10–14], and in particular, for multi-user video conferencing [15–17]. In [15], which is closest to our work, RLC is investigated for multi-party video conferencing in wireless broadband cellular systems. This study demonstrates that RLC applied within the central node has the potential to reduce the end-to-end delay, increase throughput and improve the transmission reliability and system fairness.

In this article, we explore analytical tools for the design and analysis of a real-time multi-user video streaming system that applies scalable video coding and unequal error protection (UEP) RLC. We focus on a simple scenario where wireless users exchange video streams over a common central node with the goal of capturing the fundamental system behaviour. This work builds upon our recent theoretical analysis of UEP RLC schemes for erasure channels [18]. Addressing a specific UEP RLC application, real-time multi-user video streaming, this article extends the layer decoding probability analysis of UEP RLC addressed in previous work to include additional performance measures such as the expected decoding delays of different video layers and the evolution over time of the expected received video quality of the exchanged scalable video streams. Using the presented set of analytical tools, we offer a simple framework for the design and analysis of UEP RLC based real-time multi-user scalable video streaming systems. The framework provides a flexible approach for reliable exchange of layered video streams over dynamically changing wireless channels. The application of the proposed framework and its benefits over the standard RLC approach applied in [15] are demonstrated through distortion-optimized system design examples.

The article is organized as follows. Section “RLC: an overview and UEP extension” provides a background on RLC and its UEP extensions, and provides a decoding performance analysis of the UEP RLC. The proposed multi-user video streaming setup is introduced in Section “Multi-user video streaming using UEP NC”. The same section formulates the distortion-based optimization of the multi-user video streaming system based on UEP NC. Selected UEP NC code design examples are discussed in Section “System optimization and results”. The article is concluded in Section “Conclusions”.

RLC: an overview and UEP extension

Background and motivation

For wireless broadcasting, NC is usually motivated by the two-user packet exchange example in Figure 1 (see e.g., [19]). Instead of replicating and independently transmitting each user packet, the central node XORs the incoming user packets and broadcasts a single coded packet. As a result, the number of packet transmissions required for two-way packet exchange between users reduces from four to three.

Figure 1

Simple two-user NC example.
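The two-user exchange of Figure 1 can be illustrated with a minimal Python sketch. The packet contents and the helper name `xor_packets` are illustrative assumptions; the only point shown is that one XOR-coded broadcast lets each user recover the other user's packet by cancelling its own.

```python
def xor_packets(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Uplink: two transmissions, one per user, reach the central node.
packet_u1 = b"frame-from-U1"   # hypothetical payloads of equal length
packet_u2 = b"frame-from-U2"

# Downlink: a single coded broadcast replaces two separate forwards,
# so the exchange needs three transmissions instead of four.
coded = xor_packets(packet_u1, packet_u2)

# Each user cancels its own packet to obtain the other user's packet.
recovered_at_u1 = xor_packets(coded, packet_u1)
recovered_at_u2 = xor_packets(coded, packet_u2)
```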

The two-user example can be extended to multi-user scenarios using opportunistic binary NC (XOR-ing) for wireless broadcast networks, proposed in [20]. However, the broadcasting node needs to know the buffer content of its neighbours in order to construct the optimal encoded packet. Alternatively, the extension to the multi-user scenario is possible by applying RLC over the received data packets using non-binary finite field coefficients [15]. After $N_u$ users upload their data to the central node, the central node broadcasts random linear combinations of users’ packets in a “rateless” fashion, until each user recovers the data packets of the other users (Figure 2). A user needs to receive any $N_u - 1$ encoded packets broadcast by the central node in order to recover the packets of the other users with high probability, provided the finite field size is sufficiently large. In contrast, without NC, the alternatives are repeated broadcasting of user packets in a “data carousel” fashion, or managing a one-to-many automatic repeat request (ARQ) mechanism, which is known to suffer from a number of drawbacks (e.g., excessive delay and the feedback implosion problem).

Figure 2

Simple multi-user NC example ($N_u = 5$).

In this article, we extend the basic idea of Figure 2 to the case where users exchange scalable coded video streams. Users’ messages are organized into layers of different importance, starting with the most important and continuing with progressively less important layers. If, e.g., due to poor channel conditions or low access data rates, a user is unable to fully recover the messages of the other users, it benefits from recovering as many importance layers of other users’ messages as possible, starting from the most important layer onwards. For increased protection of more important layers over error-prone wireless links, the advantages of UEP forward error correction (FEC) coding are demonstrated in a number of research studies [21–23]. In this article, we focus on UEP RLC to explore the benefits of both rateless UEP FEC coding and NC.

Random linear coding

Let $\mathbf{x} = \{x_1, x_2, \ldots, x_K\}$ be a source message that consists of $K$ equal-length source packets. RLC applied over the message $\mathbf{x}$ produces encoded packets $\omega$ as random linear combinations of source packets with coefficients randomly selected from a given finite field $\mathrm{GF}(2^q)$: $\omega = \sum_{i=1}^{K} \alpha_i \cdot x_i$, where $\alpha_i$ is a randomly selected element of $\mathrm{GF}(2^q)$, and $\omega$ is of the same length as the source packets. Each encoded packet carries header information containing the global encoding coefficients within the global encoding vector $\mathbf{g} = \{\alpha_1, \alpha_2, \ldots, \alpha_K\}$ [6]. For unicast transmission, to relax the overhead requirements, the transmitter and the receiver may use a pair of synchronized random number generators (RNG) to produce the same sequence of global encoding coefficients. In this case, only a short RNG seed needs to be conveyed within the packet overhead. The RLC encoding can be repeated at the transmitter in a rateless fashion, until the receiver collects enough encoded packets to decode the source message using Gaussian elimination (GE) decoding. GE decoding introduces complexity limitations on the message length $K$. However, for real-time interactive multimedia applications, small values of $K$ are acceptable for practical deployment [24, 25].
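As an illustration of the encoding and GE decoding steps described above, the following Python sketch implements RLC over GF(2^8). The choice $q = 8$, the reduction polynomial 0x11B, and all function names are assumptions made for this example only; the text does not prescribe a particular field representation.

```python
import random

def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return p

def gf_inv(a):
    """Multiplicative inverse of a nonzero element, computed as a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def rlc_encode(source, rng):
    """Return (g, omega): a random global encoding vector and the coded packet."""
    g = [rng.randrange(256) for _ in source]
    omega = [0] * len(source[0])
    for alpha, pkt in zip(g, source):
        for pos, byte in enumerate(pkt):
            omega[pos] ^= gf_mul(alpha, byte)   # addition in GF(2^8) is XOR
    return g, omega

def ge_decode(coded, K):
    """Gauss-Jordan elimination; returns the K source packets or None if rank < K."""
    rows = [list(g) + list(w) for g, w in coded]
    for col in range(K):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [v ^ gf_mul(f, p) for v, p in zip(rows[r], rows[col])]
    return [rows[i][K:] for i in range(K)]

rng = random.Random(7)
source = [b"pkt0", b"pkt1", b"pkt2"]                  # K = 3 equal-length packets
coded = [rlc_encode(source, rng) for _ in range(5)]   # rateless: send a few extra
decoded = ge_decode(coded, len(source))
```

The decoder returns `None` whenever the received encoding vectors do not span all $K$ dimensions, which is the situation in which a receiver would simply wait for further encoded packets.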

UEP random linear coding

Let $\mathbf{x} = \{x_1, x_2, \ldots, x_K\}$ be a layered source message containing $K$ equal-length source packets classified into $L$ importance layers. The source message starts with the most important base layer (BL) and continues with progressively less important enhancement layers (EL). The subset of the source message containing the $l$-th layer is denoted as $\mathbf{x}_l$ and contains $k_l$ source packets, where $\sum_{i=1}^{L} k_i = K$. We denote the subset of the source message containing the first $l$ layers as $\mathbf{x}_{1:l}$, and the number of source packets in the first $l$ layers is $K_l = \sum_{i=1}^{l} k_i$ (Figure 3).

Figure 3

Expanding window RLC.

The UEP RLC scheme called expanding window (EW) RLC was investigated in [18]. The EW RLC introduces a set of EWs over the layered source message following the importance structure of the message. More precisely, for the $L$-layer message, a set of $L$ EWs is defined where the $l$-th EW, $1 \le l \le L$, contains the source block subset $\mathbf{x}_{1:l}$ (Figure 3).

EW RLC encoding

The EW RLC introduces a probabilistic encoding process over the set of EWs. For each encoded packet, one of the EWs is first selected using the predefined window selection distribution $\Gamma(\xi) = \sum_{i=1}^{L} \Gamma_i \xi^i$, where $\Gamma_i$ is the probability of selecting the $i$-th window and $\sum_{i=1}^{L} \Gamma_i = 1$. Then, an encoded packet is created by applying RLC only over the selected window. Arbitrarily many EW RLC encoded packets can be produced by independently repeating the encoding process for each encoded packet.
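The window selection step can be sketched in Python as follows. The function name `ew_encoding_vector` and the field size of 256 are assumptions for illustration; the sketch only shows how a window drawn from $\Gamma(\xi)$ restricts the nonzero coefficients of the global encoding vector to the first $K_l$ source packets.

```python
import random
from itertools import accumulate

def ew_encoding_vector(layer_sizes, gamma, rng, field_size=256):
    """Draw one EW RLC global encoding vector.

    layer_sizes -- [k_1, ..., k_L], packets per layer
    gamma       -- [Gamma_1, ..., Gamma_L], window selection probabilities
    Returns (l, g): the selected window index (1-based) and the length-K vector.
    """
    window_ends = list(accumulate(layer_sizes))        # K_1, ..., K_L
    # Select the l-th window with probability Gamma_l.
    idx = rng.choices(range(len(gamma)), weights=gamma, k=1)[0]
    K_l, K = window_ends[idx], window_ends[-1]
    # Random coefficients over the selected window, zeros outside it.
    coeffs = [rng.randrange(field_size) for _ in range(K_l)]
    return idx + 1, coeffs + [0] * (K - K_l)

rng = random.Random(1)
# Two layers (k_1 = 2, k_2 = 3) with Gamma(xi) = 0.5 xi + 0.5 xi^2.
window, g = ew_encoding_vector([2, 3], [0.5, 0.5], rng)
```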

EW RLC decoding

A receiver collects correctly received EW RLC coded packets and decodes the source message (or a subset of its layers) using the standard GE decoder, as if standard RLC were used. An important difference is that parts of the layered source message can be decoded even if fewer than $K$ encoded packets are received.

For more details on the design of EW RLC, we refer the interested reader to [18].

Performance analysis of EW RLC

In the following, we review a simple upper bound for the set of decoding probabilities $P_{d,l}(N)$ that the content of the $l$-th window, $1 \le l \le L$, is recovered at the receiver after receiving any $N$ EW RLC encoded packets. The upper bound is general and holds for any packet-level coding and decoding scheme that applies probabilistic encoding and the expanding window approach. Moreover, the EW RLC in combination with GE decoding achieves this bound as the field size increases [18].

Let $\mathbf{y} = \{y_1, y_2, \ldots, y_N\}$ be a sequence of $N$ received EW RLC encoded packets. For the derivation of upper bounds, $\mathbf{y}$ is completely described by the corresponding vector $\mathbf{n} = \{n_1, n_2, \ldots, n_L\}$, where $n_l$ denotes the number of received packets obtained by EW RLC coding over the $l$-th window. We denote by $\mathbf{y}_l$ (and $\mathbf{y}_{1:l}$, respectively) the subset of $\mathbf{y}$ containing the set of $n_l$ ($N_l = \sum_{i=1}^{l} n_i$, respectively) received packets obtained by EW RLC encoding over the $l$-th (the first $l$) window(s).

For a given $\mathbf{n}$, we define a set of variables $R_l(\mathbf{n})$, $1 \le l \le L$, using the following recursion:
$$R_1(\mathbf{n}) = \min(n_1, K_1), \qquad R_l(\mathbf{n}) = \min(R_{l-1}(\mathbf{n}) + n_l, K_l), \quad 2 \le l \le L.$$
(1)

Thus any received $\mathbf{y}$ can be recursively transformed into $\mathbf{R} = \{R_1(\mathbf{n}), R_2(\mathbf{n}), \ldots, R_L(\mathbf{n})\}$.

The values of $R_l(\mathbf{n})$ provide an upper bound on the rank of the $N_l \times K$ matrices whose rows are the global encoding vectors of the $N_l$ received packets in $\mathbf{y}_{1:l}$. In other words, $R_l(\mathbf{n})$ is the maximum number of source packets in $\mathbf{x}_{1:l}$ that can be recovered from $\mathbf{y}_{1:l}$. Using $\mathbf{R}$, we can simply determine the set of layers of $\mathbf{x}$ the receiver can recover after receiving $\mathbf{y}$. Namely, $\mathbf{x}_{1:l}$ can be fully recovered if $R_l(\mathbf{n}) = K_l$. In addition, $\mathbf{x}_{1:l}$ can also be recovered if any of the larger windows is recovered, i.e., if $R_m(\mathbf{n}) = K_m$ for any $l < m \le L$, because larger windows contain smaller ones.
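The recursion in (1) and the resulting recoverability rule can be checked with a short Python sketch. The function names are illustrative, and the two-layer sizes ($K_1 = 20$, $K_2 = 60$) are borrowed from Example 1 later in the article purely as sample input.

```python
def rank_bounds(n, K_cum):
    """R_l(n) of Eq. (1).

    n[l-1]     -- number of received packets encoded over window l
    K_cum[l-1] -- K_l, number of source packets in the first l layers
    """
    R, prev = [], 0
    for n_l, K_l in zip(n, K_cum):
        prev = min(prev + n_l, K_l)
        R.append(prev)
    return R

def recoverable_layers(n, K_cum):
    """Largest l such that R_l(n) = K_l, i.e. layers 1..l are recoverable (0 if none)."""
    best = 0
    for l, (R_l, K_l) in enumerate(zip(rank_bounds(n, K_cum), K_cum), start=1):
        if R_l == K_l:
            best = l
    return best

K_cum = [20, 60]
only_base = recoverable_layers([25, 30], K_cum)   # R = [20, 50]: base layer only
both = recoverable_layers([10, 50], K_cum)        # R = [10, 60]: window 2 is full rank
none = recoverable_layers([5, 40], K_cum)         # R = [5, 45]: nothing yet
```

The second call shows the expanding window effect: the base layer is recovered through the larger window even though fewer than $K_1$ packets were encoded over the first window.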

Formally, the upper bound on $P_{d,l}(N)$ follows by conditioning on $\mathbf{n}$:
$$P_{d,l}(N \mid \mathbf{n}) = I\left( (R_l = K_l) \vee \bigvee_{i=l+1}^{L} \left[ \bigwedge_{j=l}^{i-1} (R_j < K_j) \wedge (R_i = K_i) \right] \right),$$
(2)
where $I(\cdot)$ represents an indicator function equal to 1 if its argument, which is a logical expression, is true, otherwise $I(\cdot) = 0$, and $\bigvee_i(\cdot)$ ($\bigwedge_j(\cdot)$) is the logical or (and) of a sequence of logical expressions parametrized by $i$ ($j$). Conditioning on $\mathbf{n}$ is removed using the prior distribution over $\mathbf{n}$:
$$P_{\Gamma(\xi),N}(\mathbf{n}) = \frac{N!}{n_1!\, n_2! \cdots n_L!}\, \Gamma_1^{n_1} \Gamma_2^{n_2} \cdots \Gamma_L^{n_L}.$$
(3)
Finally, the upper bound on $P_{d,l}(N)$ is obtained as:
$$P_{d,l}(N) \le \sum_{(n_1, n_2, \ldots, n_L):\ \sum_{i=1}^{L} n_i = N} P_{\Gamma(\xi),N}(\mathbf{n})\, P_{d,l}(N \mid \mathbf{n}).$$
(4)

For a given layered source message, $P_{d,l}(N)$ depends only on the selected window selection distribution $\Gamma(\xi)$. Therefore, designing EW RLC codes with the desired $P_{d,l}(N)$ behaviour reduces to the design of an appropriate $\Gamma(\xi)$ [18].
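A direct, brute-force evaluation of the bound in (1)–(4) can be sketched in Python by enumerating every vector $\mathbf{n}$ that sums to $N$. All function names are illustrative assumptions, and the enumeration is only practical for a small number of layers; the indicator of (2) is implemented in the equivalent form "some window $m \ge l$ has $R_m = K_m$".

```python
from itertools import accumulate
from math import factorial, prod

def compositions(total, parts):
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def rank_bounds(n, K_cum):
    """R_l(n) of Eq. (1)."""
    R, prev = [], 0
    for n_l, K_l in zip(n, K_cum):
        prev = min(prev + n_l, K_l)
        R.append(prev)
    return R

def multinomial_pmf(n, gamma):
    """Prior over n, Eq. (3)."""
    coeff = factorial(sum(n))
    for n_i in n:
        coeff //= factorial(n_i)
    return coeff * prod(g ** n_i for g, n_i in zip(gamma, n))

def decoding_prob_bound(N, layer_sizes, gamma):
    """Upper bound of Eq. (4) on P_{d,l}(N), returned for l = 1..L."""
    K_cum = list(accumulate(layer_sizes))
    L = len(K_cum)
    P = [0.0] * L
    for n in compositions(N, L):
        w = multinomial_pmf(n, gamma)
        R = rank_bounds(n, K_cum)
        for l in range(L):
            # Eq. (2): window l+1 or any larger window is full rank.
            if any(R[m] == K_cum[m] for m in range(l, L)):
                P[l] += w
    return P

# Toy case: one packet per layer, Gamma(xi) = 0.5 xi + 0.5 xi^2, N = 2.
P_toy = decoding_prob_bound(2, [1, 1], [0.5, 0.5])
```

In the toy case the only failure for the second window is drawing both packets from the first window, which happens with probability 0.25.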

Multi-user video streaming using UEP NC

System model

In this article, we propose the UEP NC as a core component of a real-time multi-user video streaming system. For simplicity, we focus on a wireless cellular network example where $N_u$ mobile users ($U_1, U_2, \ldots, U_{N_u}$) participate in a multi-party video conferencing session over a common base station (BS) within a single cell (Figure 4).

Figure 4

Multi-user video streaming system model.

The presented results are applicable in similar scenarios, e.g., if instead of a single BS users connect to different BSs mutually interconnected by high-speed links (e.g., fiber optic) or if instead of a cellular network we observe a Wi-Fi access point.

In our scenario, every user continuously segments its own video stream into groups of frames (GOF), where each GOF contains $N_{\mathrm{gof}}$ frames, and compresses every GOF using a scalable video coder. For each user $U_i$ and each compressed GOF, the output of the video coder is a layered source message $\mathbf{x}^{(i)}$ that contains $K^{(i)}$ source packets, each of length $b$ bits, organized into $L$ layers, where the $l$-th layer contains $k_l^{(i)}$ packets. The values of $b$ and $L$ are the same across all users, whereas for each user $U_i$ and each GOF, the values $K^{(i)}$ and $\{k_1^{(i)}, k_2^{(i)}, \ldots, k_L^{(i)}\}$ are in general different.

Video streaming among users in a session may be observed as a GOF-by-GOF exchange process. A single GOF exchange period repeats every $T_{\mathrm{gof}} = N_{\mathrm{gof}}/N_{\mathrm{fps}}$ seconds, where $N_{\mathrm{fps}}$ is the number of frames per second of the video stream, and both $N_{\mathrm{gof}}$ and $N_{\mathrm{fps}}$ are equal across all users. For simplicity, we assume that GOF periods are aligned among different users, i.e., the messages $\mathbf{x}^{(i)}$ are synchronously generated by all the users. For every GOF period, the goal of each user is to share its own GOF and collect the other users’ GOFs, or at least as many of their layers as possible starting from the most important one, within strict delay limits.

During each GOF period, the data exchange process can be divided into two phases. In the first, upload phase, users simultaneously upload their data to the BS, and in the second, broadcast phase, the BS broadcasts the received data to all the users. We assume that orthogonal channels are allocated between each user and the BS, and for the broadcast transmission by the BS, allowing for simultaneous transmission on all channels. Each wireless link is modeled as a synchronous time-slotted packet-erasure link where fixed-size encoded packets of length $b$ bits are transmitted using a fixed transmission data rate $R$ and erasure probability $\varepsilon$. We assume that the pair $(R, \varepsilon)$ is in general different on different wireless links, that it remains fixed during the transmission period $T_{\mathrm{gof}}$ of a single GOF, and that it may change between different GOF transmissions. The time slot duration $T_p = b/R$ represents the time required for a single encoded packet transmission.

In the upload phase, to protect data from erasures, the user $U_i$ encodes the source message $\mathbf{x}^{(i)}$ using the EW RLC scheme defined by a window selection distribution $\Gamma^{(i)}(\xi)$ and streams the encoded packets using the user rate $R^{(i)}$. The BS recovers the users’ messages using $N_u$ independent GE decoders dedicated to different users. After decoding as many user layers as possible, the BS creates its own source message $\mathbf{x}^{(\mathrm{BS})}$ that contains all or a subset of the users’ message layers. For example, if all $N_u$ users are completely recovered, the message $\mathbf{x}^{(\mathrm{BS})}$ is of length $K^{(\mathrm{BS})} = \sum_{i=1}^{N_u} K^{(i)}$ packets and contains $L$ layers. The $l$-th layer $\mathbf{x}_l^{(\mathrm{BS})}$ comprises a total of $k_l^{(\mathrm{BS})} = \sum_{i=1}^{N_u} k_l^{(i)}$ packets from the $l$-th layer data of all $N_u$ user messages, $\mathbf{x}_l^{(\mathrm{BS})} = \{\mathbf{x}_l^{(1)}, \mathbf{x}_l^{(2)}, \ldots, \mathbf{x}_l^{(N_u)}\}$, as illustrated in Figure 4 for the two-layer scenario. In general, the BS may create the $l$-th layer $\mathbf{x}_l^{(\mathrm{BS})}$ from the $l$-th layers of a subset of users.

In the broadcast phase, the BS applies the EW RLC scheme defined by a window selection distribution $\Gamma^{(\mathrm{BS})}(\xi)$ over $\mathbf{x}^{(\mathrm{BS})}$ and broadcasts the stream of encoded packets using the broadcast rate $R^{(\mathrm{BS})}$. Each user recovers the BS message $\mathbf{x}^{(\mathrm{BS})}$ using the GE decoder where, prior to decoding, the user cancels out its own packets from the received encoded packets. We note that the two phases may overlap in time, i.e., the BS may start broadcasting a subset of (already recovered) layers of $\mathbf{x}^{(\mathrm{BS})}$ before the upload phase of all users is completed.

Single-link analysis: decoding and delay performance

To analyze the system in Figure 4, we focus on a single-link transmission of a UEP RLC coded layered message between any transmitter-receiver pair during a fixed time period $T$. Instead of the layer decoding probabilities after a fixed number of received packets, $P_{d,l}(N)$ (Section “Performance analysis of EW RLC”), we shift our interest to the layer decoding probabilities after a fixed transmission time, $P_{d,l}(T)$. However, unlike $P_{d,l}(N)$, deriving $P_{d,l}(T)$ requires the introduction of a packet-level channel model.

During the period of duration $T$, the transmission process consists of $N_T = RT/b$ encoded packet transmissions (packet slots). We redefine the received sequence $\mathbf{y} = \{y_1, y_2, \ldots, y_{N_T}\}$ to describe the outcome of all $N_T$ transmissions, where $y_i$ may represent either a received encoded packet or a lost (erased) packet. The received sequence $\mathbf{y}$ can be described by the vector $\mathbf{n} = \{n_1, n_2, \ldots, n_L, n_e\}$, where, as before, $n_i$ is the number of received encoded packets obtained by encoding over the $i$-th EW, $N = \sum_{i=1}^{L} n_i \le N_T$ is the number of (correctly) received encoded packets during the interval $T$, and $n_e = N_T - N$ is the number of erased packets.

For a fixed $T$, the number of received encoded packets $N$ depends on the underlying channel packet-loss model. For simplicity, we assume a packet erasure channel model that erases encoded packets independently with erasure probability $\varepsilon$. To obtain $P_{d,l}(T)$, we use conditioning over $\mathbf{n}$:
$$P_{d,l}(T) = \sum_{\mathbf{n}} P_{\Gamma(\xi),\varepsilon}(\mathbf{n})\, P_{d,l}(T \mid \mathbf{n}),$$
(5)
where $P_{\Gamma(\xi),\varepsilon}(\mathbf{n})$ is a slightly altered version of (3) that accounts for the $n_e$ erased packet events:
$$P_{\Gamma(\xi),\varepsilon}(\mathbf{n}) = \frac{N_T!}{n_1!\, n_2! \cdots n_L!\, n_e!} \cdot [\Gamma_1 (1-\varepsilon)]^{n_1} [\Gamma_2 (1-\varepsilon)]^{n_2} \cdots [\Gamma_L (1-\varepsilon)]^{n_L} \cdot \varepsilon^{n_e}.$$
(6)

$P_{d,l}(T \mid \mathbf{n})$ can be obtained as the layer decoding probability $P_{d,l}(N \mid \mathbf{n})$ after $N$ received encoded packets (Section “Performance analysis of EW RLC”, Equations (1)–(2)) since, by knowing $\mathbf{n}$, we directly obtain $N = \sum_{i=1}^{L} n_i$.
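Equations (5)–(6) can be evaluated numerically with the following Python sketch, which enumerates all outcomes of the $N_T$ slots, including the erasure count $n_e$. Function and parameter names are illustrative, and rounding $N_T = RT/b$ down to a whole number of slots is an assumption of this sketch.

```python
from itertools import accumulate
from math import factorial, floor, prod

def compositions(total, parts):
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def layer_decoding_probs(T, rate_bps, b_bits, eps, layer_sizes, gamma):
    """P_{d,l}(T) for l = 1..L via Eqs. (5)-(6), using the bound of Eqs. (1)-(2)."""
    K_cum = list(accumulate(layer_sizes))
    L = len(K_cum)
    N_T = floor(rate_bps * T / b_bits + 1e-9)        # whole packet slots within T
    # Per-slot outcome probabilities: window 1..L received, or erased.
    probs = [g * (1 - eps) for g in gamma] + [eps]
    P = [0.0] * L
    for n in compositions(N_T, L + 1):               # n = (n_1, ..., n_L, n_e)
        coeff = factorial(N_T)
        for n_i in n:
            coeff //= factorial(n_i)
        w = coeff * prod(p ** n_i for p, n_i in zip(probs, n))   # Eq. (6)
        R, prev = [], 0
        for n_l, K_l in zip(n[:L], K_cum):           # Eq. (1)
            prev = min(prev + n_l, K_l)
            R.append(prev)
        for l in range(L):                           # Eq. (2)
            if any(R[m] == K_cum[m] for m in range(l, L)):
                P[l] += w
    return P

# Single-packet message over a link with eps = 0.5: two slots give 1 - 0.5^2.
P_two_slots = layer_decoding_probs(2.0, 1000, 1000, 0.5, [1], [1.0])
```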

The knowledge of $P_{d,l}(T)$ implicitly provides information on the decoding delay of the $l$-th message layer. For example, one can search for the minimal transmission period $T_{\mathrm{th}}^{(l)}$ such that the $l$-th message layer decoding probability satisfies $P_{d,l}(T_{\mathrm{th}}^{(l)}) > P_{\mathrm{th}}^{(l)}$, where $P_{\mathrm{th}}^{(l)}$ is the threshold decoding probability set in advance. For more explicit delay information, one can obtain the probability distribution $p_{d,l}(N_T)$ that the $l$-th message layer is recovered after exactly the $N_T$-th time slot (and cannot be recovered before):
$$p_{d,l}(N_T) = \prod_{i=1}^{N_T - 1} \left( 1 - P_{d,l}(T = i \cdot T_p) \right) \cdot P_{d,l}(T = N_T \cdot T_p),$$
(7)
where $T_p = b/R$. The expected delay $E_l[N_T]$ for recovery of the $l$-th message layer is:
$$E_l[N_T] = \sum_{N_T = 1}^{\infty} N_T \cdot p_{d,l}(N_T).$$
(8)

The expected delay of recovery of the complete source message, $E[N_T]$, for the EW RLC scheme is equal to the recovery delay of the last, $L$-th layer: $E[N_T] = E_L[N_T]$. $E_l[N_T]$ is expressed in terms of the number of time slots, but it can be easily converted into absolute time values as $E_l[T] = E_l[N_T] \cdot T_p$.
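Equations (7) and (8) translate into a few lines of Python once the sequence $P_{d,l}(T = i \cdot T_p)$ is available. In this sketch the function names are illustrative, the input probabilities are hypothetical, and the infinite sum in (8) is truncated at the length of the supplied sequence, which is adequate when the last entries are close to one.

```python
def delay_distribution(P):
    """p_{d,l}(N_T) of Eq. (7), where P[i-1] = P_{d,l}(T = i * T_p)."""
    p, not_yet = [], 1.0
    for P_i in P:
        p.append(not_yet * P_i)     # product over earlier slots times P at this slot
        not_yet *= 1.0 - P_i
    return p

def expected_delay_slots(P):
    """E_l[N_T] of Eq. (8), truncated to len(P) slots."""
    return sum(i * p_i for i, p_i in enumerate(delay_distribution(P), start=1))

# Hypothetical decoding probabilities after 1, 2 and 3 slots.
P_slots = [0.0, 0.5, 1.0]
slots = expected_delay_slots(P_slots)     # 2 * 0.5 + 3 * 0.5
T_p = 3200 / 2e6                          # slot duration b / R for b = 3,200 bits, R = 2 Mbit/s
seconds = slots * T_p                     # E_l[T] = E_l[N_T] * T_p
```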

Example 1

Let $\mathbf{x}$ be a layered source message containing $K = 60$ source packets of size $b = 3{,}200$ bits (400 bytes) divided into $L = 2$ layers: the BL containing $k_1 = 20$ packets and the EL containing $k_2 = 40$ packets. The EW RLC scheme defined by $\Gamma(\xi) = \Gamma_1 \xi + (1 - \Gamma_1) \xi^2$ is applied over $\mathbf{x}$, producing a continuous stream of 400-byte encoded packets. The wireless link towards the receiver is modeled as a synchronous link of rate $R = 2$ Mbit/s with packet erasure probability $\varepsilon = 0.1$. Figure 5 presents the evolution of the layer decoding probabilities $P_{d,1}(T)$ and $P_{d,2}(T)$ over time $T$ at the receiver for a range of $\Gamma_1$ values. As a baseline scheme, we start from the middle solid curve for $\Gamma_1 = 0$, representing standard RLC applied over the whole message, which results in simultaneous recovery of both message layers. By increasing $\Gamma_1$, we obtain the UEP effect of the EW RLC scheme, where the solid curves for $P_{d,1}(T)$ gradually shift to the left, i.e., towards earlier recovery of the most important data. The extreme case of $\Gamma_1 = 1$ results in the earliest recovery of the most important part, represented by the leftmost dashed curve. The increase in $\Gamma_1$ comes at the price of delayed decoding of the less important layer, $P_{d,2}(T)$. Figure 6 presents the expected layer decoding delays $E_1[T]$ and $E_2[T]$ as a function of $\Gamma_1$. Note that a significant decrease of $E_1[T]$ with the increase in $\Gamma_1$ initially comes with a relatively small loss in $E_2[T]$. For example, for $\Gamma_1 = 0.5$, $E_1[T]$ drops from 105 to 62 ms (−41%) while $E_2[T]$ rises from 105 to 126 ms (+20%).

Figure 5

$P_{d,1}(T)$ and $P_{d,2}(T)$ for EW RLC over the range of $\Gamma_1$ values.

Figure 6

$E_1[T]$ and $E_2[T]$ for EW RLC over the range of $\Gamma_1$ values.
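As a rough sanity check on the time scale of Example 1, the Python sketch below computes the slot duration $T_p = b/R$ and the mean time needed to collect a given number of packets over an independent-erasure link. This is a first-order estimate introduced here for orientation, not the delay computation of (7)–(8); the helper name is an assumption.

```python
b_bits = 3200          # packet size b in bits, from Example 1
rate_bps = 2e6         # link rate R in bit/s
eps = 0.1              # packet erasure probability
k1, k2 = 20, 40        # BL and EL sizes in packets

T_p = b_bits / rate_bps        # slot duration T_p = b / R, in seconds (1.6 ms)

def mean_time_to_collect(num_packets):
    """Mean time until num_packets packets survive an i.i.d. erasure link."""
    return num_packets / (1.0 - eps) * T_p

t_full_message = mean_time_to_collect(k1 + k2)   # standard RLC over all K = 60 packets
t_base_layer = mean_time_to_collect(k1)          # every packet encoded over window 1
```

The first value is about 107 ms, close to the 105 ms quoted for $\Gamma_1 = 0$; the second, about 36 ms, corresponds to the limiting case $\Gamma_1 = 1$ in which all encoded packets protect the BL.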

The single-link analysis can be extended to the scenario where the transmitter changes the applied EW RLC code during the transmission (i.e., switches between different $\Gamma(\xi)$). As an example, we derive $P_{d,l}(T)$ for $T > T_1$, given that the transmitter has applied the EW RLC defined by $\Gamma_a(\xi)$ during $0 \le t \le T_1$, and the EW RLC defined by $\Gamma_b(\xi)$ for $t > T_1$. For the link parameters $R$ and $\varepsilon$, the transmitter sends $N_1 = R T_1 / b$ encoded packets using $\Gamma_a(\xi)$ and $N_2 = R (T - T_1) / b$ encoded packets using $\Gamma_b(\xi)$. The received sequence $\mathbf{y} = [\mathbf{y}_1 \mathbf{y}_2]$ is a concatenation of two sequences $\mathbf{y}_1$ and $\mathbf{y}_2$ of length $N_1$ and $N_2$, respectively. It can be described by the vector $\mathbf{n} = \mathbf{n}_1 \oplus \mathbf{n}_2$, where the vectors $\mathbf{n}_1$ and $\mathbf{n}_2$ represent the description of $\mathbf{y}_1$ and $\mathbf{y}_2$, respectively (as defined earlier), and $\oplus$ is the component-wise sum of two equal-length vectors. Since $\mathbf{n}_1$ and $\mathbf{n}_2$ follow the probability distributions $P_{\Gamma_a(\xi),\varepsilon}(\mathbf{n}_1)$ and $P_{\Gamma_b(\xi),\varepsilon}(\mathbf{n}_2)$ given by Equation (6), the layer decoding probabilities are obtained as in (5):
$$P_{d,l}(T) = \sum_{\mathbf{n}_1} \sum_{\mathbf{n}_2} P_{\Gamma_a(\xi),\varepsilon}(\mathbf{n}_1)\, P_{\Gamma_b(\xi),\varepsilon}(\mathbf{n}_2) \cdot P_{d,l}(T \mid \mathbf{n}),$$
(9)

where $P_{d,l}(T \mid \mathbf{n})$ follows from $P_{d,l}(N = \sum_{i=1}^{L} n_i \mid \mathbf{n})$.
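The double sum in (9) can be sketched in Python by building the distribution of received-packet counts for each phase and combining them component-wise. Function names are illustrative assumptions, and the brute-force pairing of outcomes is meant for small slot counts rather than as an efficient implementation.

```python
from itertools import accumulate
from math import factorial, prod

def compositions(total, parts):
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def phase_distribution(num_slots, gamma, eps):
    """Eq. (6): pmf over (n_1, ..., n_L) for one phase; the erasure count is implied."""
    probs = [g * (1 - eps) for g in gamma] + [eps]
    dist = {}
    for n in compositions(num_slots, len(probs)):
        coeff = factorial(num_slots)
        for n_i in n:
            coeff //= factorial(n_i)
        dist[n[:-1]] = coeff * prod(p ** n_i for p, n_i in zip(probs, n))
    return dist

def switched_decoding_probs(N1, N2, gamma_a, gamma_b, eps, layer_sizes):
    """P_{d,l}(T) of Eq. (9): N1 slots with Gamma_a followed by N2 slots with Gamma_b."""
    K_cum = list(accumulate(layer_sizes))
    L = len(K_cum)
    P = [0.0] * L
    dist_b = phase_distribution(N2, gamma_b, eps)
    for n1, w1 in phase_distribution(N1, gamma_a, eps).items():
        for n2, w2 in dist_b.items():
            n = [x + y for x, y in zip(n1, n2)]      # component-wise sum of n_1 and n_2
            R, prev = [], 0
            for n_l, K_l in zip(n, K_cum):           # Eq. (1)
                prev = min(prev + n_l, K_l)
                R.append(prev)
            for l in range(L):                       # Eq. (2)
                if any(R[m] == K_cum[m] for m in range(l, L)):
                    P[l] += w1 * w2
    return P

# Loss-free toy case: one slot over window 1, then one slot over window 2.
P_switch = switched_decoding_probs(1, 1, [1.0, 0.0], [0.0, 1.0], 0.0, [1, 1])
```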

Example 2

We continue Example 1 by investigating the evolution of $P_{d,1}(T)$ and $P_{d,2}(T)$ over time $T$ at the receiver if the transmitter applies $\Gamma_a(\xi) = 0.5\xi + 0.5\xi^2$ for the first $T_1 = 125$ ms, and then changes to $\Gamma_b(\xi) = 0.1\xi + 0.9\xi^2$ (the remaining parameters are the same as in the previous example). Figure 7 compares the case where $\Gamma_a(\xi)$ changes to $\Gamma_b(\xi)$ with the case where $\Gamma_a(\xi)$ is used throughout the transmission. The figure illustrates that the change of $\Gamma(\xi)$ has no effect on $P_{d,1}(T)$, as it comes too late ($P_{d,1}(T_1 = 0.125\ \mathrm{s}) = 1$), whereas the improvement of $P_{d,2}(T)$ for $T > T_1$ is notable due to the increase in the second window selection probability from 0.5 to 0.9. This points to possible adaptive (e.g., feedback-driven) updates of $\Gamma(\xi)$ during transmission. In addition, Figure 7 demonstrates that the upper bound expressions for $P_{d,l}(T)$ used in this article match very well the exact calculation of $P_{d,l}(T)$ for a finite field of size $\mathrm{GF}(2^8)$ (markers) [18].

Figure 7

$P_{d,1}(T)$ and $P_{d,2}(T)$ for EW RLC which changes $\Gamma_1^{(a)} = 0.5$ to $\Gamma_1^{(b)} = 0.1$ at the time instant $T_1 = 0.125$ s.

Figure 8

Time-diagram of different transmission phases in the system model.

Finally, the analytical results presented above can be easily extended to the Gilbert–Elliott erasure channel model with two states: the good and the bad state. This follows from the fact that the probability distribution of the number $n_e$ of erasures over $N_T$ consecutive realizations of the channel (i.e., over the time interval $T$) is known (e.g., see [26]). The remaining $N_T - n_e$ non-erased channel realizations deliver encoded symbols from different EWs according to the multinomial distribution law (3).
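One way to obtain the erasure-count distribution numerically is a small dynamic program over the two channel states, sketched below in Python. This is an illustrative computation and not necessarily the method of [26]; the transition probabilities `p_gb` and `p_bg`, the per-state erasure probabilities, and the assumption that the channel starts in its stationary distribution are all choices made for this example.

```python
def erasure_count_pmf(N_T, p_gb, p_bg, eps_good, eps_bad):
    """pmf of the number of erasures n_e over N_T slots of a two-state Markov channel.

    p_gb, p_bg        -- good-to-bad and bad-to-good transition probabilities
    eps_good, eps_bad -- erasure probability while in the good or bad state
    """
    pi_good = p_bg / (p_gb + p_bg)              # stationary probability of the good state
    # good[k] / bad[k]: probability of being in that state with k erasures so far.
    good = [pi_good] + [0.0] * N_T
    bad = [1.0 - pi_good] + [0.0] * N_T
    for _ in range(N_T):
        g_next = [0.0] * (N_T + 1)
        b_next = [0.0] * (N_T + 1)
        for k in range(N_T + 1):
            # The packet sent in this slot is delivered ...
            g_next[k] += good[k] * (1.0 - eps_good)
            b_next[k] += bad[k] * (1.0 - eps_bad)
            # ... or erased, which increments the erasure count.
            if k < N_T:
                g_next[k + 1] += good[k] * eps_good
                b_next[k + 1] += bad[k] * eps_bad
        # State transition before the next slot.
        good = [g * (1.0 - p_gb) + b * p_bg for g, b in zip(g_next, b_next)]
        bad = [g * p_gb + b * (1.0 - p_bg) for g, b in zip(g_next, b_next)]
    return [g + b for g, b in zip(good, bad)]

# With equal erasure probability in both states the result reduces to a binomial pmf.
pmf = erasure_count_pmf(3, 0.1, 0.3, 0.1, 0.1)
```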

Distortion-optimal system design

In this section, we apply the single-link analysis to analyze the multi-user video conferencing setup introduced in Section “System model”. Our goal is to formulate the system design problem that leads to the EW RLC code design providing distortion-optimal system performance. As a distortion measure, we use the peak signal-to-noise ratio (PSNR), a standard video quality metric that follows directly from the mean square error (MSE) distortion measure. We use the terms video quality and distortion interchangeably when referring to the video quality (PSNR) measure.

We focus on a single message (GOF) exchange cycle among the system users. After the initial delay of $t = T_{\mathrm{gof}}$ needed for each user to acquire and compress $N_{\mathrm{gof}}$ frames of video (assuming negligible compression delay), the users start the upload phase. During the upload phase, the BS waits to receive enough encoded packets to recover the users’ messages, or as many of their layers as possible, with sufficiently high probability. The upload phase duration $T_{\mathrm{ul}}$ is upper bounded by the GOF period duration $T_{\mathrm{gof}}$ because, after this period expires, users are supplied with a new set of compressed messages, which marks the beginning of the upload phase of the next message exchange cycle. From the set of recovered layers, the BS creates its own message $\mathbf{x}^{(\mathrm{BS})}$, which is broadcast back to the users over the broadcast downlink channel during the broadcast phase of duration $T_{\mathrm{dl}}$. To simplify the analysis, we assume the upload and the broadcast phases do not overlap, i.e., after the upload phase of duration $T_{\mathrm{ul}}$, the users stop transmitting and start listening to the BS for the following period of duration $T_{\mathrm{dl}}$. This analysis provides guaranteed (lower-bound) performance for the case of overlapping phases, as we discuss later.

We are interested in the system design that maximizes the total average received video quality at the user terminals after a given target system delay $T = T_{\mathrm{gof}} + T_{\mathrm{ul}} + T_{\mathrm{dl}}$. Note that, as $T_{\mathrm{gof}}$ is constant and $T$ is given in advance, the system design should optimally balance between $T_{\mathrm{ul}}$ and $T_{\mathrm{dl}}$ ($T_{\mathrm{ul}} + T_{\mathrm{dl}} = T - T_{\mathrm{gof}} = \mathrm{const.}$). The time diagram of the system model, ignoring the propagation and data processing delays, is illustrated in Figure 8.

Upload phase

The upload phase is represented by $N_u$ parallel and independent single-link transmission processes, each characterized by different message/layer sizes ($K^{(i)}$, $\{k_1^{(i)}, k_2^{(i)}, \ldots, k_L^{(i)}\}$) and channel state pairs ($R^{(i)}, \varepsilon^{(i)}$). Assuming that the user knows ($R^{(i)}, \varepsilon^{(i)}$) and that the value of $T_{\mathrm{ul}}$ is fixed in advance by the BS, the set of layer decoding probabilities $P_{d,l}^{(i\text{-}\mathrm{BS})}(T_{\mathrm{ul}})$ of the $i$-th user message at the BS can be calculated for any selected EW RLC parameter $\Gamma^{(i)}(\xi)$. In the following, we focus on a simple user upload strategy where the user applies standard RLC over the largest window $l$ such that the decoding probability $P_{d,l}^{(i\text{-}\mathrm{BS})}(T_{\mathrm{ul}}) > P_{\mathrm{th}}$ (if such a window exists), where $P_{\mathrm{th}}$ is a threshold probability close to one, selected in advance. More formally, the $i$-th user will apply RLC only over the $l^{(i)}$-th window, where $l^{(i)}$ is obtained as:
$$l^{(i)} = \max \left\{ l : P_{d,l}^{(i\text{-}\mathrm{BS})}(T_{\mathrm{ul}}) > P_{\mathrm{th}} \right\}.$$
(10)

Note that applying RLC only over the $l^{(i)}$-th window is equivalent to the special case of applying UEP RLC with the window selection distribution $\Gamma^{(i)}(\xi) = \xi^{l^{(i)}}$ (i.e., the one which places probability one on the $l^{(i)}$-th window).

Overall, the set of $N_u$ users will upload the subset of their layers, jointly described by the vector $\mathbf{l} = \{l^{(1)}, l^{(2)}, \ldots, l^{(N_u)}\}$, within the upload phase of duration $T_{\mathrm{ul}}$. The probability $P_{\mathrm{th}}$ can be selected so as to keep the overall probability $P_{d,\mathbf{l}}^{(\mathrm{BS})}(T_{\mathrm{ul}}) \ge P_{\mathrm{th}}^{N_u}$ that the BS will recover the set of users’ layers described by $\mathbf{l}$ during the upload phase of duration $T_{\mathrm{ul}}$ sufficiently high.
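The upload rule of (10) amounts to a threshold search over per-window decoding probabilities, as the following Python sketch shows. The function name and the numerical probabilities are hypothetical; in practice each entry would come from the single-link analysis evaluated for standard RLC over the corresponding window.

```python
def select_upload_window(P_ul, P_th):
    """Eq. (10): largest window l whose decoding probability at the BS exceeds P_th.

    P_ul[l-1] -- probability that window l is decoded at the BS within T_ul
                 when standard RLC is applied over that window
    Returns 0 if no window qualifies.
    """
    qualifying = [l for l, P in enumerate(P_ul, start=1) if P > P_th]
    return max(qualifying, default=0)

P_th = 0.99
# Hypothetical per-window probabilities for three users with L = 3 layers.
users = [
    [1.000, 0.999, 0.995],   # good uplink: all three layers fit into T_ul
    [0.999, 0.992, 0.410],   # medium uplink: two layers
    [0.997, 0.620, 0.003],   # poor uplink: base layer only
]
l_vec = [select_upload_window(P_ul, P_th) for P_ul in users]

# Lower bound on the probability that the BS recovers all selected layers.
joint_lower_bound = P_th ** len(users)
```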

Broadcast phase

During the broadcast phase, the BS applies the EW RLC code defined by $\Gamma^{(\mathrm{BS})}(\xi)$ over the BS message $\mathbf{x}^{(\mathrm{BS})}(\mathbf{l})$, which is determined by the set of uploaded user layers $\mathbf{l}$. From $\mathbf{l}$, one can easily obtain the BS message size parameters ($K^{(\mathrm{BS})}$, $\{k_1^{(\mathrm{BS})}, k_2^{(\mathrm{BS})}, \ldots, k_L^{(\mathrm{BS})}\}$). The broadcast phase can also be analyzed using the single-link analysis applied to the parameters of the broadcast transmission, as seen by each of the system users. In other words, given $\Gamma^{(\mathrm{BS})}(\xi)$, $\mathbf{x}^{(\mathrm{BS})}(\mathbf{l})$ and the BS-to-user-$i$ (BS-$i$) transmission link parameters ($R^{(\mathrm{BS})}$, $\{\varepsilon^{(\mathrm{BS}\text{-}i)}\}_{1 \le i \le N_u}$), the single-link analysis provides the set of layer decoding probabilities $P_{d,l}^{(\mathrm{BS}\text{-}i)}(T_{\mathrm{dl}})$, $1 \le l \le L$, describing the capability of the $i$-th user to recover the layers of the BS message $\mathbf{x}^{(\mathrm{BS})}(\mathbf{l})$ after the broadcast phase. Thus the broadcast phase reduces to the EW RLC design problem for a multicast/broadcast setup that aims to simultaneously satisfy heterogeneous user link conditions ($\{\varepsilon^{(\mathrm{BS}\text{-}i)}\}_{1 \le i \le N_u}$). This problem has recently been addressed for expanding window fountain (EWF) code design in a video multicast setup [27], with the difference that in this article, instead of broadcasting a single stream, the BS simultaneously broadcasts a mixture of $N_u$ user streams.

Each user simultaneously receives $N_u - 1$ video streams originating at the remaining system users. The average received video quality $D^{(i)}$ perceived by the $i$-th user is obtained by averaging over the received video qualities of all $N_u - 1$ video streams:

$$D^{(i)} = \frac{1}{N_u - 1} \sum_{j=1,\, j \neq i}^{N_u} D_j^{(i)}, \qquad (11)$$

where $D_j^{(i)}$ is the average received PSNR of the $j$-th user video content as perceived by the $i$-th user. $D_j^{(i)}$ can be obtained by combining the results of the upload and the broadcast phase analysis:

$$D_j^{(i)} = P_{d,l}^{(BS)}(T_{ul}) \sum_{l=1}^{l^{(j)}} P_{d,1:l}^{(BS-i)}(T_{dl}) \cdot D_{j,1:l}, \qquad (12)$$

where the sum is taken over the set of $l^{(j)} \le L$ layers of the $j$-th user included in $x^{(BS)}(\mathbf{l})$. In the above expression, $P_{d,1:l}^{(BS-i)}(T_{dl})$ is the probability that exactly the first $l$ layers of the BS message $x^{(BS)}(\mathbf{l})$ are recovered at the user $i$:

$$P_{d,1:l}^{(BS-i)}(T_{dl}) = \begin{cases} 1 - P_{d,1}^{(BS-i)}(T_{dl}), & l = 0, \\ P_{d,l}^{(BS-i)}(T_{dl}) \cdot \left(1 - P_{d,l+1}^{(BS-i)}(T_{dl})\right), & 0 < l < L, \\ P_{d,l}^{(BS-i)}(T_{dl}), & l = L, \end{cases} \qquad (13)$$

and $D_{j,1:l}$ is the average received PSNR of the $j$-th user video content after recovery of the first $l$ layers (averaged over all the frames of the transmitted GOF). Finally, the average received PSNR, averaged across all the users of the multi-user video streaming session, is:

$$D = \frac{1}{N_u} \sum_{i=1}^{N_u} D^{(i)}. \qquad (14)$$
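The chain from layer decoding probabilities to the session-wide average PSNR in (11)–(14) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names and argument layout are assumptions, and the layer decoding probabilities are taken as given inputs (they would come from the single-link analysis).

```python
from typing import Sequence


def exact_prefix_probs(p_layer: Sequence[float]) -> list[float]:
    """Eq. (13): entry l (l = 0..L) is the probability assigned to recovering
    exactly the first l layers. p_layer[l-1] is the decoding probability of layer l."""
    L = len(p_layer)
    probs = [1.0 - p_layer[0]]                       # case l = 0
    for l in range(1, L):                            # case 0 < l < L
        probs.append(p_layer[l - 1] * (1.0 - p_layer[l]))
    probs.append(p_layer[L - 1])                     # case l = L
    return probs


def psnr_j_at_i(p_upload: float,
                p_layer_bs_i: Sequence[float],
                psnr_prefix_j: Sequence[float],
                n_layers_j: int) -> float:
    """Eq. (12): expected PSNR of user j's stream at user i.
    psnr_prefix_j[l-1] is D_{j,1:l}; n_layers_j is l^(j), the uploaded layer count."""
    prefix = exact_prefix_probs(p_layer_bs_i)
    return p_upload * sum(prefix[l] * psnr_prefix_j[l - 1]
                          for l in range(1, n_layers_j + 1))


def average_psnr(d: Sequence[Sequence[float]]) -> float:
    """Eqs. (11) and (14): d[i][j] holds D_j^(i); diagonal entries are ignored."""
    n_u = len(d)
    per_user = [sum(d[i][j] for j in range(n_u) if j != i) / (n_u - 1)
                for i in range(n_u)]
    return sum(per_user) / n_u
```

For instance, with layer decoding probabilities (1.0, 0.5) and prefix PSNR values (30, 36) dB, `psnr_j_at_i(1.0, [1.0, 0.5], [30.0, 36.0], 2)` evaluates the two terms of the sum in (12) with weights 0.5 and 0.5.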

System parameters and design

From (11)–(14), by factoring out $P_{d,l}^{(BS)}(T_{ul})$, we note that the distortion-optimized system design allows for independent design of the upload and the broadcast phase, provided that the duration $T_{ul}$ and the decoding probability threshold $P_{th}$ are fixed. In other words, once the values of $T_{ul}$ and $P_{th}$ are fixed and communicated to the users, each user can determine the set of layers $\mathbf{l}(T_{ul})$ that it can reliably upload in the upload phase. Consequently, the BS message $x^{(BS)}(\mathbf{l})$ and the broadcast duration $T_{dl} = T - T_{gof} - T_{ul}$ are also determined, which reduces the broadcast phase design to the optimization of the EW RLC code parameter $\Gamma^{(BS)}(\xi)$ such that the average received video quality $D$ is maximized after the target system delay $T$.

Overall, for the distortion-optimized system design, the BS should optimally balance the upload and the broadcast phase by selecting an appropriate $T_{ul}$ and an appropriate threshold probability $P_{th}$, and should satisfy heterogeneous user requirements by selecting an optimized $\Gamma^{(BS)}(\xi)$. The optimal solution trades off the number of layers that can be uploaded to the BS with reliability $P_{th}$ within $T_{ul}$ against their quality of reconstruction at the set of heterogeneous users after $T_{dl}$.

System optimization and results

System optimization

For the system model and distortion-optimized design discussed above, the system optimization process is performed centrally, e.g., at the video conference server collocated with the central BS node. Given the parameters of all the user messages $(K^{(i)}, \{k_1^{(i)}, k_2^{(i)}, \ldots, k_L^{(i)}\})$, the uplink channel conditions $(R^{(i)}, \varepsilon_i)$ and the broadcast channel conditions $(R^{(BS)}, \{\varepsilon_{BS-i}\}_{1 \le i \le N_u})$, the BS should provide the duration of the upload phase $T_{ul}$, the threshold probability $P_{th}$ and the EW RLC code design parameter $\Gamma^{(BS)}(\xi)$, such that the average received PSNR $D$ is maximized after the target system delay $T$. In other words, the BS solves the following problem:

$$\max_{T_{ul},\, P_{th},\, \Gamma^{(BS)}(\xi)} D, \qquad (15)$$

where $0 \le T_{ul} \le \min\{T_{gof}, T - T_{gof}\}$ and for $\Gamma^{(BS)}(\xi)$ we have $0 \le \Gamma_l^{(BS)} \le 1$, $1 \le l \le L$, and $\sum_{l=1}^{L} \Gamma_l^{(BS)} = 1$.
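For $L = 2$ and fixed $P_{th}$, problem (15) has two free scalars, $T_{ul}$ and $\Gamma_1^{(BS)}$ (with $\Gamma_2^{(BS)} = 1 - \Gamma_1^{(BS)}$), so a brute-force grid evaluation is feasible. The sketch below shows only the search loop and the feasibility bound on $T_{ul}$; the `objective` callable, which would return the average PSNR $D$ for a given pair, is a hypothetical placeholder and is not specified here.

```python
from itertools import product
from typing import Callable, Sequence


def t_ul_candidates(t_total: float, t_gof: float, step: float) -> list[float]:
    """Grid over 0 <= T_ul <= min{T_gof, T - T_gof} with the given step (seconds)."""
    upper = min(t_gof, t_total - t_gof)
    n = int(upper / step + 1e-9)
    return [k * step for k in range(n + 1)]


def grid_search(objective: Callable[[float, float], float],
                t_ul_grid: Sequence[float],
                gamma1_grid: Sequence[float]) -> tuple[float, float, float]:
    """Return (best value, T_ul, Gamma_1) maximizing objective(T_ul, Gamma_1).
    `objective` is a caller-supplied stand-in for the average PSNR D."""
    best = None
    for t_ul, gamma1 in product(t_ul_grid, gamma1_grid):
        value = objective(t_ul, gamma1)
        if best is None or value > best[0]:
            best = (value, t_ul, gamma1)
    if best is None:
        raise ValueError("empty search grid")
    return best
```

A grid search of this kind scales poorly with the number of layers, since $\Gamma^{(BS)}(\xi)$ then has $L - 1$ free components; a nonlinear programming method would be the natural alternative in that case.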

Assuming that the BS knows the channel conditions (e.g., by measurements and user reporting), it still needs to know the user message parameters $(K^{(i)}, \{k_1^{(i)}, k_2^{(i)}, \ldots, k_L^{(i)}\})$ to be able to perform the above optimization. Since these data cannot be obtained instantaneously at the BS, to avoid delays, we assume that the BS uses information available from recent GOF exchanges (e.g., the last GOF or the average over the last several GOFs). This way, the BS is able to perform system optimization prior to the start of the upload phase and to broadcast the required parameters $T_{ul}$ and $P_{th}$ back to the users. The users then determine the number of layers $l^{(i)}$ to upload to the BS and start the upload phase.

In general, the complexity of calculating the set of layer decoding probabilities in Sections “Performance analysis of EW RLC” and 2 grows exponentially as $K$ and $L$ grow, due to the exponential number of terms in the sums given in Equations (4) and (5). However, in practical applications, the calculations are tractable because $K$, $L$ and $N_u$ are usually small. For example, $K$ is already bounded by the GE decoding complexity and should not exceed 100; the number of scalable video layers is typically small, e.g., $L < 5$; and for comfortable use of a real-time multi-user video conferencing system, $N_u$ should also be small, e.g., $N_u < 5$ (note that $N_u$ can be larger as long as each user displays only a small subset of active user streams). With these restrictions on $K$, $L$ and $N_u$, the optimization problem can be evaluated at the BS side server with acceptable delay. Alternatively, the BS may run the optimization less frequently than on a GOF-by-GOF basis, using accumulated averages of channel conditions and GOF message lengths, and periodically update the users and the BS transmitter with the new values of $(T_{ul}, P_{th})$ and $\Gamma^{(BS)}(\xi)$, respectively.

Design examples

The multi-user video streaming system design proposed in this article is illustrated using numerical examples.

Example 3

In this example, we present a distortion-optimized UEP NC solution for the multi-user video conferencing system with $N_u = 4$ users (Figure 4). We assume users perform real-time exchange of H.264/SVC compressed CIF Stefan, Foreman, News and Coastguard sequences (352×288, $N_{fps} = 30$), each user sharing a different video sequence. Users encode the sequences into $L = 2$ quality layers (BL and one EL) using the coarse-grain scalable (CGS) coding feature. The GOF size is set to a very low value of $N_{gof} = 4$ in order to reduce the start-up coding delay $T_{gof} = 4/30$ s ≈ 133 ms to an acceptable value. The parameters of the obtained layered source messages, after H.264/SVC compression and averaged over the frames of a sample GOF we use for optimization, are given in Table 1.

Table 1

Parameters of H.264/SVC sequences ($L = 2$, $N_{gof} = 4$)

| Sequence/layers | Number of packets ($b$ = 3,200 bits) | Bit rate [kbps] | Y-PSNR [dB] |
|---|---|---|---|
| Stefan BL | $k_1^{(1)} = 20$ | 476.68 | 28.44 |
| Stefan BL + EL | $K^{(1)} = 60$ | 1432.56 | 34.53 |
| Foreman BL | $k_1^{(2)} = 12$ | 282.68 | 33.62 |
| Foreman BL + EL | $K^{(2)} = 42$ | 992.56 | 38.63 |
| News BL | $k_1^{(3)} = 16$ | 379.68 | 33.47 |
| News BL + EL | $K^{(3)} = 40$ | 948.56 | 38.36 |
| Coast BL | $k_1^{(4)} = 20$ | 474.68 | 30.32 |
| Coast BL + EL | $K^{(4)} = 64$ | 1522.56 | 34.69 |

For the uplink channel parameters, we select for each user a rate around 2 Mbps and an erasure probability in the range from 0.05 to 0.15: ($R^{(1)} = 1.5$ Mbps, $\varepsilon_1 = 0.07$), ($R^{(2)} = 1.8$ Mbps, $\varepsilon_2 = 0.15$), ($R^{(3)} = 2.3$ Mbps, $\varepsilon_3 = 0.05$) and ($R^{(4)} = 1.5$ Mbps, $\varepsilon_4 = 0.12$), to account for the variations in particular uplink conditions. The BS broadcast rate is set to $R^{(BS)} = 6$ Mbps and, for simplicity, the broadcast erasure rates towards each user are set equal to the erasure rates of the corresponding uplink channels, i.e., $\varepsilon_{BS-i} = \varepsilon_i$.

Given the system parameters above, we seek the optimal system parameters $(T_{ul}, \Gamma^{(BS)}(\xi))$ such that the average received PSNR $D$ across all system users is maximized after the target delay $T = 250$ ms. For simplicity, we fix $P_{th} = 0.99$. The solution is illustrated in Figure 9, where the average PSNR is plotted as a (two-dimensional) function of $(T_{ul}, \Gamma_1^{(BS)})$. The system achieves the best average performance for $T_{ul} = 66$ ms, where all users are able to share at least their BL, while user 3 is able to upload both layers to the BS, i.e., $\mathbf{l} = (1,1,2,1)$. For the optimal $T_{ul} = 66$ ms, a separate (lower) graph shows the system performance for different EW RLC codes at the BS. Although the maximum of $D$ is achieved over a range of first window selection probabilities $\Gamma_1^{(BS)}$, it is favourable to select as large a $\Gamma_1^{(BS)}$ as possible to reduce the decoding delay for the first layer, while still maintaining a high probability of recovery of the second layer of $x^{(BS)}$ at all users (see endnote e).

Figure 9

Two-layer multi-user video conferencing optimization example.
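As a rough sanity check on the uploaded layer set, one can count how many packets each user can transmit within $T_{ul}$ and ask whether enough of them survive the erasure channel. The sketch below is an idealization and not the analysis used in this article: it assumes that a window of $k$ source packets is decodable as soon as any $k$ encoded packets are received (ignoring the rank deficiency of RLC over a finite field) and that each user encodes only over the window under test. The function names are illustrative.

```python
from math import comb, floor

PACKET_BITS = 3200  # packet size b from Table 1


def packets_sent(rate_bps: float, t_ul_s: float) -> int:
    """Number of whole packets a user can transmit within the upload phase."""
    return floor(rate_bps * t_ul_s / PACKET_BITS)


def prob_at_least(k: int, n: int, eps: float) -> float:
    """P[at least k of n packets survive i.i.d. erasures with probability eps]."""
    if k > n:
        return 0.0
    return sum(comb(n, r) * (1 - eps) ** r * eps ** (n - r)
               for r in range(k, n + 1))


def uploadable_layers(window_sizes, rate_bps, eps, t_ul_s, p_th):
    """Largest l such that window l is received with probability >= p_th (idealized)."""
    n = packets_sent(rate_bps, t_ul_s)
    layers = 0
    for l, k in enumerate(window_sizes, start=1):
        if prob_at_least(k, n, eps) < p_th:
            break
        layers = l
    return layers


# (window sizes from Table 1, uplink rate in bit/s, erasure probability)
users = [([20, 60], 1.5e6, 0.07), ([12, 42], 1.8e6, 0.15),
         ([16, 40], 2.3e6, 0.05), ([20, 64], 1.5e6, 0.12)]
layer_set = [uploadable_layers(w, r, e, 0.066, 0.99) for w, r, e in users]
```

Under these assumptions, `layer_set` gives one layer for users 1, 2 and 4 and two layers for user 3, which agrees with $\mathbf{l} = (1,1,2,1)$ for $T_{ul} = 66$ ms and $P_{th} = 0.99$.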

Table 2 illustrates the average decoding delays for upload and broadcast phase transmissions for the set of uploaded layers $\mathbf{l} = (1,1,2,1)$ and the solution point $(T_{ul}, \Gamma_1^{(BS)}) = (0.066, 0)$. The sum of the maximum expected delays experienced during the upload and the broadcast phase satisfies the delay limit imposed by the system: $\mathbb{E}_2^{(3-BS)}[T] + \mathbb{E}_2^{(BS-2)}[T] = 108.78 < 117 = T - T_{gof}$ (all values in ms), where the maximum upload delay is below the selected upload duration, $\mathbb{E}_2^{(3-BS)}[T] = 58.58 < 66 = T_{ul}$. This points to the possibility of an approximate system design using expected delay calculations.
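The delay check described above amounts to comparing the worst-case expected upload and broadcast delays against the available time budget. A minimal sketch using the expected delays reported in Table 2 (the dictionary layout and function name are illustrative assumptions):

```python
# Expected decoding delays in ms, as reported in Table 2.
upload_ms = {"U1-BS": 45.80, "U2-BS": 25.09, "U3-BS": 58.58, "U4-BS": 48.48}
broadcast_ms = {"BS-U1": 41.29, "BS-U2": 50.20, "BS-U3": 29.20, "BS-U4": 43.36}


def delay_budget_check(upload, broadcast, t_ul_ms, t_total_ms, t_gof_ms):
    """Compare worst-case expected delays with T_ul and with T - T_gof."""
    worst_up = max(upload.values())
    worst_down = max(broadcast.values())
    return {
        "worst_upload_ms": worst_up,
        "worst_broadcast_ms": worst_down,
        "sum_ms": worst_up + worst_down,
        "upload_fits": worst_up < t_ul_ms,
        "total_fits": worst_up + worst_down < t_total_ms - t_gof_ms,
    }


# T_ul = 66 ms, T = 250 ms, T_gof rounded to 133 ms as in the text.
report = delay_budget_check(upload_ms, broadcast_ms, 66.0, 250.0, 133.0)
```

Here the worst upload link is U3-BS (58.58 ms) and the worst broadcast link is BS-U2 (50.20 ms), so both checks pass with a margin of roughly 8 ms on the total budget.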

Table 2

Expected delays in Example 3

| Upload transmission | $\mathbb{E}_l[T]$ [ms] | Broadcast transmission | $\mathbb{E}_l[T]$ [ms] |
|---|---|---|---|
| U1-BS | $\mathbb{E}_1[T] = 45.80$ | BS-U1 | $\mathbb{E}_2[T] = 41.29$ |
| U2-BS | $\mathbb{E}_1[T] = 25.09$ | BS-U2 | $\mathbb{E}_2[T] = 50.20$ |
| U3-BS | $\mathbb{E}_2[T] = 58.58$ | BS-U3 | $\mathbb{E}_2[T] = 29.20$ |
| U4-BS | $\mathbb{E}_1[T] = 48.48$ | BS-U4 | $\mathbb{E}_2[T] = 43.36$ |

Example 4

Additional flexibility in the system design is obtained if the users compress their video streams into a larger number of layers. In this example, we observe the performance of the distortion-optimized system design for the same transmission parameters as in the previous example, but where the layered source message is compressed into $L = 4$ quality layers (see Table 3 for the message parameters). With $P_{th} = 0.99$ assumed fixed, the system performs optimally for $T_{ul} = 64$ ms, where the users are able to upload the set of layers $\mathbf{l} = (2,3,3,2)$. The optimal value $D = 34.88$ dB is achieved for the EW RLC broadcast phase window selection distribution $\Gamma^{(BS)}(\xi) = 0.5\xi + 0.5\xi^3$. We note that the gain obtained in the average system distortion $D$ is not large, because compressing video into a larger number of layers introduces small performance penalties, but the system flexibility reflected through better layer resolution provides more options for the system design process.
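To make the role of the window selection distribution concrete, the sketch below draws windows according to $\Gamma^{(BS)}(\xi) = 0.5\xi + 0.5\xi^3$, i.e., $\Gamma_1 = \Gamma_3 = 0.5$ and $\Gamma_2 = \Gamma_4 = 0$, and forms one encoded packet per draw. Several details are assumptions made for illustration only: coding is done over GF(2) (bitwise XOR) rather than a larger field, packets are represented as integers, and the window boundaries are arbitrary example values rather than the BS message sizes of this example.

```python
import random


def sample_window(gamma, rng):
    """Pick a window index l in 1..L with probability gamma[l-1]."""
    return rng.choices(range(1, len(gamma) + 1), weights=gamma, k=1)[0]


def ew_rlc_packet(source, window_ends, gamma, rng):
    """Form one encoded packet as a random GF(2) combination of the source
    packets inside the selected expanding window (the first window_ends[l-1] packets)."""
    l = sample_window(gamma, rng)
    k_l = window_ends[l - 1]
    # Random binary coefficients inside the window, zeros outside it.
    coeffs = [rng.randint(0, 1) for _ in range(k_l)] + [0] * (len(source) - k_l)
    payload = 0
    for c, s in zip(coeffs, source):
        if c:
            payload ^= s
    return l, coeffs, payload


gamma = [0.5, 0.0, 0.5, 0.0]     # coefficients of 0.5*xi + 0.5*xi^3
window_ends = [4, 8, 12, 16]     # hypothetical cumulative window sizes
source = list(range(1, 17))      # toy source packets
rng = random.Random(0)
packets = [ew_rlc_packet(source, window_ends, gamma, rng) for _ in range(200)]
```

With this distribution only the first and the third window are ever selected, so packets from the first window protect the most important data while packets from the third window cover the first three layers jointly.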

Table 3

Parameters of H.264/SVC sequences ($L = 4$, $N_{gof} = 4$)

| Sequence/layers | Number of packets ($b$ = 3,200 bits) | Bit rate [kbps] | Y-PSNR [dB] |
|---|---|---|---|
| Stefan BL | $k_1^{(1)} = 15$ | 356.22 | 25.89 |
| Stefan BL + EL | $K_2^{(1)} = 24$ | 567.46 | 28.15 |
| Stefan BL + 2EL | $K_3^{(1)} = 40$ | 951.18 | 30.65 |
| Stefan BL + 3EL | $K_4^{(1)} = 64$ | 1522.05 | 33.23 |
| Foreman BL | $k_1^{(2)} = 7$ | 162.68 | 29.45 |
| Foreman BL + EL | $K_2^{(2)} = 13$ | 309.87 | 32.3 |
| Foreman BL + 2EL | $K_3^{(2)} = 24$ | 569.93 | 34.52 |
| Foreman BL + 3EL | $K_4^{(2)} = 48$ | 1150.6 | 38.41 |
| News BL | $k_1^{(3)} = 10$ | 235.67 | 28.99 |
| News BL + EL | $K_2^{(3)} = 19$ | 448.66 | 32.55 |
| News BL + 2EL | $K_3^{(3)} = 32$ | 759.32 | 35.21 |
| News BL + 3EL | $K_4^{(3)} = 50$ | 1118.19 | 38.05 |
| Coast BL | $k_1^{(4)} = 7$ | 164.76 | 26.66 |
| Coast BL + EL | $K_2^{(4)} = 16$ | 378.52 | 28.95 |
| Coast BL + 2EL | $K_3^{(4)} = 33$ | 787.74 | 30.74 |
| Coast BL + 3EL | $K_4^{(4)} = 60$ | 1435.02 | 33.55 |

Decode-and-broadcast versus buffer-and-broadcast

In the proposed system, we apply the decode-and-broadcast operation at the central multi-user video streaming point: the uploaded streams are first decoded and then broadcast within a non-overlapping broadcast phase. This approach simplifies the application of our analytical tools and enables a simple and elegant system design; however, improvements are possible if the broadcast phase is initiated before the incoming user messages are completely recovered. Possible improvements are briefly discussed below.

Layer-by-layer decode-and-broadcast

Let us assume an upload phase where a general UEP RLC is applied instead of the specific RLC case that encodes the largest window decodable within $T_{ul}$. In this case, the unequal recovery time (URT) property enables the central point to decode user layers sequentially over time, starting from the BL onwards [18]. Thus the central point is able to produce encoded packets as soon as the BL of the message $x^{(BS)}$ is decoded and to include additional layers as soon as they become available, while updating the broadcast EW RLC code parameter $\Gamma^{(BS)}(\xi)$ “on the fly,” as illustrated in Example 2. We note that this scenario introduces a trade-off between the increase in the upload delays of higher layers and the earlier start of the broadcast phase, which has to be balanced by the optimal solution. Unfortunately, the distortion-optimized system design for this scenario would result in a tedious optimization problem, which is why we leave it out of consideration. However, we note that expected delay analysis, similar to the one presented in Table 2, could be used as a simple approximation for the layer-by-layer decode-and-broadcast system design.

Buffer-and-broadcast

Finally, the simplest buffer-and-broadcast solution follows the standard NC approach in which all the received encoded packets are buffered, and new encoded packets are produced by applying RLC over the buffer content [6, 7]. In the proposed UEP RLC case, the central point maintains $L$ separate buffers, each collecting encoded packets of different users produced over one of the $L$ windows. As soon as the upload phase starts filling the buffers, the broadcast phase starts producing encoded packets, where each encoded packet results from applying RLC over one of the buffers, selected independently according to the appropriate window (i.e., buffer) selection distribution $\Gamma^{(BS)}(\xi)$. Although very efficient, this solution lacks efficient analysis and distortion-based optimization tools. In addition, the problem of broadcasting linearly dependent encoded packets may become significant as the upload user rates decrease and the broadcast rate increases (i.e., when the rate of encoded packet generation exceeds the rate of incoming source data).
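The recoding step of buffer-and-broadcast can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not a description of a specific implementation: coding is over GF(2), each buffered packet is stored as a pair of a global coefficient vector and an integer payload, and an empty buffer simply yields no packet.

```python
import random


def recode(buffers, gamma, rng):
    """Select a buffer according to gamma and return (window, coeffs, payload),
    a random GF(2) combination of the packets currently in that buffer.
    Returns None when the selected buffer is still empty."""
    idx = rng.choices(range(len(buffers)), weights=gamma, k=1)[0]
    buf = buffers[idx]
    if not buf:
        return None
    n = len(buf[0][0])
    coeffs, payload = [0] * n, 0
    for c_in, p_in in buf:
        if rng.randint(0, 1):
            # XOR both the coefficient vector and the payload of the chosen packet.
            coeffs = [a ^ b for a, b in zip(coeffs, c_in)]
            payload ^= p_in
    return idx + 1, coeffs, payload


# Toy state: three source packets, two buffers with already received packets.
source = [5, 9, 12]
buffers = [
    [([1, 0, 0], 5), ([1, 1, 0], 5 ^ 9)],
    [([1, 1, 1], 5 ^ 9 ^ 12), ([0, 1, 1], 9 ^ 12)],
]
rng = random.Random(1)
out = [recode(buffers, [0.5, 0.5], rng) for _ in range(100)]
```

Because a buffer holding only a few packets spans a small subspace, repeated calls soon return coefficient vectors that are linear combinations of earlier ones, which is the linear dependence issue noted above when the broadcast rate exceeds the rate of incoming data.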

Conclusions

Real-time sharing of video content among multiple users over wireless networks underlies a number of existing and upcoming mobile multimedia services. For robust, flexible and efficient implementation of such services, this article considered a combination of scalable video coding and UEP NC. We have presented analytical tools capable of producing the values of key system design parameters that result in distortion-optimal system performance. The application of the proposed tools is illustrated through several examples involving a simple single access point multi-user scenario.

Endnotes

a For compactness, we denote R l (n) as R l .

b Note that, due to the probabilistic encoding, the decoding performance is independent of the packet erasure process in the channel and depends only on the number N of received packets.

c This model roughly captures the behaviour of adaptive modulation and coding (AMC) at the physical layer of cellular systems where, depending on the channel quality feedback available at the BS, different AMC modes could be approximated by different (R,ε) pairs. We assume slowly-varying channels where AMC mode changes are of the order of Tgof.

d In state-of-the-art wireless cellular broadband systems such as LTE or WiMAX, channel quality indicators (CQI) are continuously fed back by user equipment to the BS.

e For presentation purposes, Figure 9 is obtained by brute-force calculation over a grid of points in the $(T_{ul}, \Gamma_1^{(BS)})$ space. In general, (one of) the optimal solution(s) can be obtained by applying nonlinear programming methods such as sequential quadratic programming (e.g., using MATLAB).

Declarations

Acknowledgements

Dejan Vukobratovic was supported by a Marie Curie European Reintegration Grant FP7-PEOPLE-ERG-2010 “MMCODESTREAM” within the 7th European Community Framework Programme.

Authors’ Affiliations

(1)
Department of Power, Electronics and Communication Engineering, University of Novi Sad
(2)
Department of Electronic and Electrical Engineering, University of Strathclyde

References

  1. Shiang H, van der Schaar M: Multi-user video streaming over multi-hop wireless networks: a distributed, cross-layer approach based on priority queuing. IEEE J. Sel. Areas Commun 2007, 25(4):770-785.
  2. Zhu X, Agrawal P, Pal Singh J, Alpcan T, Girod B: Rate allocation for multi-user video streaming over heterogenous access networks. ACM MULTIMEDIA ’07, 2007, 37-46. (Augsburg, Germany)
  3. Ahlswede R, Cai N, Li S-YR, Yeung RW: Network information flow. IEEE Trans. Inf. Theory 2000, 46(4):1204-1216. 10.1109/18.850663
  4. Li S-YR, Yeung RW, Cai N: Linear network coding. IEEE Trans. Inf. Theory 2003, 49(2):371-381.
  5. Ho T, Medard M, Koetter R, Karger DR, Effros M, Shi J, Leong B: A random linear network coding approach to multicast. IEEE Trans. Inf. Theory 2006, 52(10):4413-4430.
  6. Chou PA, Wu Y, Jain K: Practical network coding. Allerton 2003 Conference, 2003.
  7. Lun DS, Medard M, Koetter R, Effros M: On coding for reliable communication over packet networks. Phys. Commun 2008, 1: 3-20. 10.1016/j.phycom.2008.01.006
  8. Gkantsidis C, Rodriguez P: Network coding for large scale content distribution. IEEE INFOCOM 2005 2005, 2235-2245. (Miami, FL, USA)
  9. Chou P, Wu Y: Network coding for the internet and wireless networks. IEEE Signal Process. Mag 2007, 24(5):77-85.
  10. Magli E, Frossard P: An overview of network coding for multimedia streaming. IEEE ICME 2009 2009, 1488-1491. (New York, NY, USA)
  11. Zhao J, Yang F, Zhang Q, Zhang Z, Zhang F: LION: layered overlay multicast with network coding. IEEE Trans. Multimed 2006, 8(5):1021-1032.
  12. Wang M, Li B: R2: random push with random network coding in live peer-to-peer streaming. IEEE J. Sel Areas Commun 2007, 25(9):1655-1666.
  13. Seferoglu H, Markopoulou A: Video-aware opportunistic network coding over wireless networks. IEEE J. Sel. Areas Commun 2009, 27(5):713-728.
  14. Thomos N, Frossard P: Network coding of rateless video in streaming overlays. IEEE Trans. Circ. Syst. Video Techn 2010, 20(12):1834-1847.
  15. Wang H, Chang R, Kuo CCJ: Wireless multi-party video conferencing with network coding. IEEE ICME 2009 2009, 1492-1495. (New York, NY, USA)
  16. Ponec M, Sengupta S, Chen M, Li J, Chou PA: Multi-rate peer-to-peer video conferencing: a distributed approach using scalable coding. IEEE ICME 2009, 2009, 1406-1413. (New York, NY, USA)
  17. Zhang H, Zhou J, Chen Z, Li J: Minimizing delay for video conference with network coding. ACM SIGCOMM 2009 2009. (Barcelona, Spain)
  18. Vukobratović D, Stanković V: Unequal error protection random linear coding strategies for erasure channels. IEEE Trans. Commun 2012, 60(5):1243-1252.
  19. Wu Y, Chou P, Kung SY: Information exchange in wireless networks with network coding and physical-layer broadcast. Proc. CISS 2005 2005. (Baltimore, MD, USA)
  20. Katti S, Rahul H, Hu W, Katabi D, Medard M, Crowcroft J: XORs in the air: practical wireless network coding. ACM SIGCOMM 2006 2006, 243-254. (Pisa, Italy)
  21. Horn U, Stuhlmuller K, Link M, Girod B: Robust internet video transmission based on scalable coding and unequal error protection. Signal Process. Image Commun 1999, 15: 77-94. 10.1016/S0923-5965(99)00025-9
  22. Stankovic V, Hamzaoui R: Live video streaming over packet networks and wireless channels. IEEE Packet Video 2003, 2003. (Nantes, France)
  23. Maani E, Katsaggelos AK: Unequal error protection for robust streaming of scalable video over packet lossy networks. IEEE Trans. Circ. Syst. Video Tech 2010, 20(3):407-416.
  24. Shojania H, Li B: Random network coding on the iPhone: fact or fiction? ACM NOSSDAV 2009, USA, 2009, 37-42. (Williamsburg, VA, USA)
  25. Vingelmann P, Fitzek F, Pedersen M, Heide J, Charaf H: Synchronized multimedia streaming on the iPhone platform with network coding. IEEE CCNC 2011, USA, 2011, 875-879. (Las Vegas, NV, USA)
  26. Wilhelmsson L, Milstein LB: On the effect of imperfect interleaving for the Gilbert–Elliott channel. IEEE Trans. Commun 1999, 47(5):681-688. 10.1109/26.768760
  27. Vukobratovic D, Stankovic V, Sejdinovic D, Stankovic L, Xiong Z: Scalable video multicast using expanding window fountain codes. IEEE Trans. Multimed 2009, 11(6):1094-1104.

Copyright

© Vukobratović and Stanković; licensee Springer. 2012

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.