
Joint source-channel coding and optimization for mobile video streaming in heterogeneous wireless networks


This paper investigates mobile video delivery in a heterogeneous wireless network from a video server to a multi-homed client. Joint source-channel coding (JSCC) has proven to be an effective solution for video transmission over bandwidth-limited, error-prone wireless networks. However, one major problem with the existing JSCC approaches is that they consider the network between the server and the client as a single transport link. The situation becomes more complicated in the context of multiple available links because involving a low-bandwidth, highly lossy, or long-delay wireless network in the transmission will only degrade the video quality. To address the critical problem, we propose a novel flow rate allocation-based JSCC (FRA-JSCC) approach that includes three key phases: (1) forward error correction redundancy estimation under loss requirement, (2) source rate adaption under delay constraint, and (3) dynamic rate allocation to minimize end-to-end video distortion. We present a mathematical formulation of JSCC to optimize video quality over multiple wireless channels and provide comprehensive analysis for channel distortion. We evaluate the performance of FRA-JSCC through emulations in Exata and compare it with the existing schemes. Experimental results show that FRA-JSCC outperforms the competing models in improving the video peak signal-to-noise ratio as well as in reducing the end-to-end delay.

1 Introduction

In the past few years, mobile video streaming (e.g., YouTube [1] and Hulu [2]) has become one of the most popular applications, and video traffic destined for handheld devices (e.g., smartphones and iPads) has experienced explosive growth. According to the Cisco Visual Networking Index [3] report, video streaming accounted for 57% of mobile data usage in 2012 and will reach 69% by the year 2017. Global mobile data is expected to increase 13-fold between 2012 and 2017. Furthermore, high-definition video surpassed standard-definition video by the end of 2012 and will comprise 79% of video traffic by 2016.

Although the proliferation of wireless infrastructures has offered users many access options (e.g., cellular networks, wireless local area networks (WLAN), and Worldwide Interoperability for Microwave Access (WiMAX)), efficiently providing mobile video streaming services remains challenging because of the performance limitations of any single wireless network. Current WLAN systems cannot provide satisfactory video streaming quality because of their small coverage and relatively limited bandwidth as the number of mobile users increases [4, 5]. Even worse, WLAN systems are not robust enough to sustain user mobility [6, 7]. On the other hand, cellular networks, e.g., Universal Mobile Telecommunications System (UMTS) and High-Speed Downlink Packet Access (HSDPA), can provide more robust wireless connections to mobile users. However, their bandwidth is not adequate to support high-quality video streaming with stringent bandwidth requirements [6]. Although 4G LTE and WiMAX can provide a much higher peak data rate and extended coverage, they are not widely deployed yet, and bandwidth will remain a limitation because the wireless spectrum is shared by many users [7]. These limitations have naturally turned research attention to aggregating the bandwidth of heterogeneous wireless networks [8-10]. Conventionally, these bandwidth aggregation algorithms are designed to allocate video flows dynamically and seldom consider inherent channel errors and fluctuations, which can significantly affect the streaming video quality [6, 11].

To address these challenges, joint source-channel coding (JSCC) has proven to be an effective solution for designing error-resilient wireless video transmission systems [12, 13]. However, one major problem with the existing JSCC approaches (e.g., [14, 15]) is that they consider the network between the server and the client as a single transport link [16]. The problem becomes more complicated in the context of integrated heterogeneous wireless networks, in which multiple access networks may be simultaneously available. In [17], Jurca et al. studied physical path selection and source rate allocation for video streaming over multi-path networks, and their experimental results show that streaming through only certain reliable wireless networks gives better video quality than streaming through all possible wireless networks. The problem statement is presented in Figure 1, which illustrates that involving an unreliable wireless access network in the transmission while the client is moving will only degrade the user-perceived video quality.

Figure 1

Illustration of a mobile video streaming service in a heterogeneous wireless network. In location 1, the user experiences video glitches as the cellular link is unable to support the video streaming well. Then, the user requests to the video server and simultaneously connects to the WLAN access point in location 2. However, the video quality further degrades in the dual mode as the WLAN link is unstable. Then, in location 3, the user switches the WLAN to WiMAX, and the perceived quality is better than that in locations 1 and 2.

Motivated by optimizing the JSCC for mobile video delivery in heterogeneous wireless networks, we propose a flow rate allocation-based JSCC (FRA-JSCC) approach in this work. By the term ‘flow rate allocation’, we mean dynamically picking the appropriate wireless access networks and assigning the transmission rates to each of them. First, the video source rate adaption scheme is designed to satisfy the delay requirements of real-time video applications. Second, forward error correction (FEC) redundancy estimation is performed to meet the tolerable loss rate. Third, a simple but effective search algorithm for flow rate allocation is proposed to minimize end-to-end video distortion. Specifically, the contributions of this paper can be summarized in the following:

  •  An efficient end-to-end video delivery scheme in integrated heterogeneous wireless networks that uses JSCC in conjunction with flow rate allocation in order to improve the perceived video quality.

  •  A mathematical model of JSCC to minimize the end-to-end video distortion over multiple wireless channels. The channel distortion is comprehensively analyzed with both transmission and overdue loss.

  •  Extensive semi-physical emulations in Exata with the real-time H.264 video streaming. Experimental results show that (1) FRA-JSCC improves the average video peak signal-to-noise ratio (PSNR) by up to 3.5, 8.45, and 11 dB compared to the fountain code-based virtual path (FCVP) [6], joint multimedia-FEC rate (JMFR) [16], and dynamic multi-path (DMP) [18]; (2) FRA-JSCC reduces the average end-to-end delay by up to 20.8, 11.5, and 40.3 ms compared to the FCVP, JMFR, and DMP; (3) FRA-JSCC mitigates the effective loss rate by up to 6.05%, 10.5%, and 15.5% compared to the FCVP, JMFR, and DMP.

The remainder of this paper is organized as follows. Section 2 briefly discusses the related work. Section 3 presents the system model and problem formulation. Section 4 describes the design of the proposed FRA-JSCC in detail. The performance evaluation is provided in Section 5. Concluding remarks are given in Section 6. The basic notations used throughout this paper are listed in Table 1.

Table 1 Basic notations used in this paper

2 Related work

The work related to this paper can be broadly divided into two branches: joint source-channel coding and video delivery in heterogeneous wireless networks. We discuss each topic in turn in this section.

2.1 Joint source-channel coding

In summary, the JSCC problem includes joint coding and optimal rate calculation for video coding and channel coding, which provides various protection levels to the video data according to its level of importance and the channel conditions. Most of the related work in video transmission focuses on (1) finding an optimal bit rate for video coding and channel coding, e.g., [19, 20]; (2) designing the video coding mechanism to achieve the target source rate under given channel conditions, e.g., [21]; (3) designing the channel coding to achieve the required reliability, e.g., low-density parity check [22], turbo [23], Reed-Solomon (RS) [24], and fountain [25] codes; (4) designing a joint optimization framework, including all available error control components together with error concealment and transmission control, to improve global system performance, e.g., [26]. The authors of [14] deal with the optimal allocation of MPEG-2 encoding and media-independent forward error correction rates under a given total bandwidth. They define optimality in terms of minimum perceptual distortion given a set of video and network parameters. They compute the network error parameters after FEC decoding and derive the global set of equations that lead to optimal dynamic rate allocation. In a more recent work [13], Ji et al. studied a JSCC optimization approach for layered video broadcasting to heterogeneous devices. The objective is to maximize the overall receiving quality of the heterogeneous quality of service (QoS) receivers.

All these works consider the network as a single transport link between the server and the client. They do not address multi-path streaming scenarios, where more than one network path is allocated to the application. Different from previous JSCC approaches, Jurca et al. [16] studied the optimal FEC scheme and layer selection in a multi-path scenario. This approach uses a multi-layer coded video stream, and the base-layer stream is protected by duplicated transmission over multiple physical paths during the handoff. However, its major flaw is that it generally assumes that every wireless network is reliable enough to improve the overall video quality, and it thus lacks an effective network selection algorithm.

2.2 Video delivery in heterogeneous wireless networks

Video delivery in heterogeneous wireless networks has recently attracted much attention, and general reviews can be found in [27, 28]. The Earliest Delivery Path First algorithm [8] takes into account the available bandwidth, propagation delay, and video frame size to estimate the arrival time and aims to find the earliest path to deliver each video packet. The load balancing algorithm (LBA) [10] performs stream adaption in response to varying network status by transmitting only those packets that are estimated to arrive at the client within the decoding deadline, and it conserves bandwidth by dropping packets that cannot be decoded because they rely on previous packets that have been dropped. A packet prioritization scheme in LBA gives a higher weight to I frames over B and P frames and also to base layer packets over enhancement layer packets. The LBA scheduler sorts packets according to priority weighting and sacrifices lower priority packets to ensure the delivery of those with a higher priority. Song et al. [9] propose a probabilistic multi-path transmission (PMT) scheme, which sends video traffic bursts over multiple available channels based on a probability generation function of packet delay. PMT is not robust to client mobility as it does not dynamically adjust the flow splitting probability according to time-varying channel status. Han et al. [6] proposed an end-to-end virtual path construction system over heterogeneous wireless networks based on fountain code. The goal of this system is to maximize the encoding bit rate on the basis of the aggregate bandwidth while overcoming the channel loss. However, the large block size of the fountain code leads to a long delay, which is not appropriate for real-time video streaming over bandwidth-limited and time-varying wireless networks.

Besides, encoded multi-path streaming (EMS) [29] and multi-path loss tolerant (MPLOT) [30] are typical protocols exploiting path diversity in wired/wireless multi-path networks based on erasure code. EMS scheme splits traffic loads over multiple paths according to the path loss rate and dynamically adjusts FEC redundancy. However, EMS was generally under the assumption that all the available paths could be beneficial for the transmission as in [16]. MPLOT is a transport protocol that aims at maximizing the throughput of the upper layer application. However, MPLOT cannot guarantee real-time video delivery as it does not address tight delay constraints.

3 System model and problem formulation

The system model for the proposed FRA-JSCC is depicted in Figure 2. We consider a heterogeneous wireless network that integrates several wireless connections from a single video server to a single destination node. This system involves models for the network path, end-to-end video distortion, and forward error correction. The following subsections describe each of them.

Figure 2

Abstract system model for JSCC in conjunction with flow rate allocation over multiple wireless access networks.

3.1 Network model

The end-to-end connection from the video server to the wireless interface of the mobile client is considered as an independent physical path that includes the wired and wireless domains. It is well known that the wireless access is most likely to be the bottleneck link for the end-to-end transmission because of the limited bandwidth and time-varying channel status. Transmitted data packets may be lost because of buffer overflow in intermediate routers or erasures caused by channel fading in the error-prone wireless channels. To simplify the discussion, we treat every packet loss as a link fault, whether it occurs in the wired or the wireless packet-switching network. Each physical path $P_r$ is associated with the following metrics:

  •  Available bandwidth $\mu_r$ (expressed in Kbps). $\mu_r$ captures the variation of background traffic and bandwidth fluctuation.

  •  Propagation delay $t_r$, which includes the link delays of the wired and wireless networks.

  •  Average loss probability $\pi_B^r \in [0,1]$, assumed to be an i.i.d. process and independent of the video streaming rate.

We model the burst loss behavior on each physical path by the continuous-time Gilbert model. It is a two-state stationary continuous-time Markov chain. The state $X_r(t)$ assumes one of two values: $G$ (good) or $B$ (bad). If a packet is sent at time $t$ with $X_r(t) = G$, then the packet is successfully delivered. Otherwise, when $X_r(t) = B$, the packet is lost.

We denote by $\pi_G^r$ and $\pi_B^r$ the stationary probabilities that $P_r$ is good or bad. Let $\xi_B^r$ and $\xi_G^r$ represent the transition rates from $B$ to $G$ and from $G$ to $B$, respectively, so that $1/\xi_B^r$ is the mean sojourn time in the bad state. In this work, we adopt two system-dependent parameters to specify the continuous-time Markov chain packet loss model: (1) the average loss rate $\pi_B^r$ and (2) the average loss burst length $1/\xi_B^r$. Then, we have

$$\pi_G^r = \frac{\xi_B^r}{\xi_B^r + \xi_G^r} \quad \text{and} \quad \pi_B^r = \frac{\xi_G^r}{\xi_B^r + \xi_G^r}.$$

The available bandwidth and propagation delay of each wireless network can be estimated by packet probing mechanisms (e.g., the pathChirp [31] algorithm employed in this work) over each interface of the mobile client. The loss parameters $\pi_B^r$ and $\xi_B^r$ can be sensed through control protocols or delay measurements [32].
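The two measurable parameters above fix the Gilbert model completely. The following Python sketch, which is an illustration and not part of the original system, recovers the two rates from an average loss rate and a mean burst length using the relation $\pi_B^r = \xi_G^r / (\xi_B^r + \xi_G^r)$. The function names and the numeric values (2% loss, 10 ms bursts) are hypothetical.

```python
def gilbert_rates(loss_rate, mean_burst_len):
    """Return (xi_B, xi_G) for the two-state continuous-time Gilbert model.

    xi_B is the rate of leaving the bad state (1 / mean loss burst length);
    xi_G is the rate of leaving the good state, chosen so that the
    stationary bad-state probability equals loss_rate.
    """
    if not 0.0 < loss_rate < 1.0:
        raise ValueError("loss_rate must lie strictly between 0 and 1")
    if mean_burst_len <= 0.0:
        raise ValueError("mean_burst_len must be positive")
    xi_b = 1.0 / mean_burst_len
    xi_g = xi_b * loss_rate / (1.0 - loss_rate)
    return xi_b, xi_g


def stationary(xi_b, xi_g):
    """Return (pi_G, pi_B), the stationary good/bad probabilities."""
    total = xi_b + xi_g
    return xi_b / total, xi_g / total


# Hypothetical path: 2% average loss, 10 ms mean loss burst (assumed values).
xi_b, xi_g = gilbert_rates(loss_rate=0.02, mean_burst_len=10.0)
pi_g, pi_b = stationary(xi_b, xi_g)
print(f"xi_B={xi_b:.4f}/ms xi_G={xi_g:.6f}/ms pi_G={pi_g:.3f} pi_B={pi_b:.3f}")
```

The round trip through `stationary` is only a consistency check: it should return the loss rate that was supplied.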

3.2 Video distortion model

In this subsection, we introduce a generic video distortion model [33]. The end-to-end distortion ($D_{total}$), perceived by the end user, can generally be computed as the sum of the source distortion ($D_{src}$) and the channel distortion ($D_{chl}$). Overall, the end-to-end distortion can thus be written as

$$D_{total} = D_{src} + D_{chl}.$$

The video quality depends on both the distortion due to a lossy encoding of the media information and the distortion due to losses experienced in the network. $D_{src}$ is mostly determined by the video source rate and the video sequence parameters (e.g., for the same encoding bit rate, the more complex the sequence, the higher the source distortion). The source distortion decays with increasing encoding rate; the decay is quite steep for low bit rate values, but it becomes very slow at high bit rate. The channel distortion is dependent on the effective loss rate $\pi_B$, which is caused by the transmission loss and expired arrivals of video packets. It is roughly proportional to the number of video frames that cannot be decoded correctly. Hence, we can explicitly formulate $D_{total}$ (in units of mean square error) as

$$D_{total} = \underbrace{D_0 + \frac{\alpha}{V - V_0}}_{D_{src}} + \underbrace{\beta \times \pi_B}_{D_{chl}},$$

in which $\alpha$, $V_0$, $D_0$, and $\beta$ are constants for a specific video codec and video sequence. These parameters can be estimated from three or more trial encodings using nonlinear regression techniques. To allow fast adaptation of the flow rate allocation to abrupt changes in the video content, these parameters can be updated for each group of pictures (GOP) in the encoded video sequence [34]. Since this model takes into account the effects of intra-coding and spatial loop filtering, it provides accurate estimates for end-to-end distortion [32].
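To make the shape of this model concrete, the sketch below evaluates the distortion formula for a few loss rates and converts the resulting MSE to PSNR with the standard 8-bit relation $10 \log_{10}(255^2 / \mathrm{MSE})$. The parameter values are hypothetical placeholders, not values fitted in this paper, and the function names are illustrative.

```python
import math


def total_distortion(rate_kbps, eff_loss, d0, alpha, v0, beta):
    """End-to-end distortion (MSE): source term plus channel term."""
    if rate_kbps <= v0:
        raise ValueError("source rate must exceed V0 for the model to hold")
    d_src = d0 + alpha / (rate_kbps - v0)   # decays as the rate grows
    d_chl = beta * eff_loss                 # linear in the effective loss rate
    return d_src + d_chl


def psnr_db(mse, peak=255.0):
    """PSNR in dB for 8-bit video."""
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical codec/sequence constants (assumptions for illustration only).
params = dict(d0=0.5, alpha=3000.0, v0=50.0, beta=400.0)
for loss in (0.0, 0.02, 0.05):
    d = total_distortion(800.0, loss, **params)
    print(f"loss={loss:.2f} MSE={d:.2f} PSNR={psnr_db(d):.2f} dB")
```

With these placeholder constants the channel term quickly dominates the source term, which is the trade-off the rate allocation has to balance.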

3.3 Forward error correction

In this work, we use the systematic RS block erasure code to protect video data against channel losses. In an FEC($n$, $k$) code, for every $k$ source packets, $n - k$ redundant packets are introduced to make up a codeword of $n$ packets. As long as a client receives at least $k$ out of the $n$ packets, it can recover all the source packets. If the number of received packets is less than $k$, the source packets that did arrive can still contribute to the video decoding process because the systematic RS encoding leaves them intact. In general, for the same code rate $k/n$, increasing the value of $n$ enhances the performance of the RS code. The FEC code rate $k/n$ needs to be chosen dynamically based on the loss requirement and channel status.

Practically, frame-level [35], GOP-level [36], or sub-GOP-level [37] FEC coding is often applied for video data protection. In this work, we implement GOP-level FEC coding (see Figure 3) in order to integrate seamlessly with the source rate adaption mechanism.

Figure 3

Illustration of GOP-level FEC coding used in this paper.

3.4 Effective loss rate

The effective loss rate $\pi_B$ represents the combined rate of packets lost due to channel losses and expired arrivals, i.e.,

$$\pi_B = \pi_{tran} + (1 - \pi_{tran}) \times \pi_{over}.$$
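As a quick numerical illustration with assumed values (not measurements from this paper), take $\pi_{tran} = 0.02$ and $\pi_{over} = 0.01$:

```latex
\pi_B = 0.02 + (1 - 0.02) \times 0.01 = 0.02 + 0.0098 = 0.0298
```

The overdue term is weighted by $1 - \pi_{tran}$ because only packets that survive transmission can arrive late.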

For real-time video applications, each video frame is associated with a decoding deadline. This deadline sets a maximum delay bound for a frame to be successfully delivered to the client in order to contribute to the decoding process. Next, we will provide a comprehensive analysis for the transmission and overdue loss, respectively.

3.4.1 Transmission loss rate

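Let $c$ denote an $n$-tuple that represents a particular failure configuration. If the $i$th FEC data packet is lost during the transmission, then $c_i = B$. By taking into account all the possible configurations, we can compute the transmission loss rate as

$$\pi_{tran} = \frac{1}{k} \sum_{\text{all } c} \mathcal{L}(c) \times P(c),$$

in which $0 \le \mathcal{L}(c) \le k$ is the number of lost source packets for a given $c$. For the systematic FEC($n$, $k$) code, no source packet is lost when at most $n - k$ packets of the block are lost; otherwise, the lost source packets cannot be recovered. Hence, we have

$$\mathcal{L}(c) = \begin{cases} 0 & \text{if } \sum_{i=1}^{n} 1_{\{c_i = B\}} \le n - k, \\ \sum_{i=1}^{k} 1_{\{c_i = B\}} & \text{otherwise}. \end{cases}$$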
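The recovery rule of the systematic code described in Section 3.3 can be sketched in a few lines of Python. This is an illustration only; the function name and the string encoding of a configuration (`'G'`/`'B'` per packet, source packets first) are assumptions made for the example.

```python
def lost_source_packets(config, k):
    """Number of unrecoverable source packets in one FEC(n, k) block.

    config: sequence of 'G'/'B' states, one per packet; the first k
            entries are the source packets (systematic code).
    """
    n = len(config)
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    total_lost = sum(1 for state in config if state == "B")
    if total_lost <= n - k:
        return 0  # at least k packets arrived, so the block is fully recovered
    # Decoding fails: only source packets that arrived intact are usable.
    return sum(1 for state in config[:k] if state == "B")


print(lost_source_packets("GBGGGG", k=4))  # one loss, two redundant packets
print(lost_source_packets("BBGGBG", k=4))  # three losses exceed n - k = 2
```

In the second call the block cannot be decoded, so only the two source packets that were themselves lost count as lost.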

As the physical paths to the multi-homed client are independent of each other, we can compute $P(c)$ as follows:

$$P(c) = \prod_{r=1}^{R} \phi_r \times P(c^r),$$

where $P(c^r)$ is the probability of a failure configuration $c^r$ on $P_r$ and $\phi_r$ is an element of the selection vector ($\Phi = \{\phi_1, \ldots, \phi_R\}$) for wireless access networks. $\phi_r$ is defined by

$$\phi_r = \begin{cases} 1 & \text{if the } r\text{th wireless access network is picked}, \\ 0 & \text{otherwise}. \end{cases}$$

Let $p_{i,j}^r(\theta)$ denote the probability of the transition from state $i$ to $j$ on $P_r$ in time $\theta$; then we have

$$p_{i,j}^r(\theta) = P[X_r(\theta) = j \mid X_r(0) = i].$$

From classic Markov chain analysis, we have

$$\begin{aligned} p_{G,G}^r(\theta) &= \pi_G^r + \pi_B^r \times \kappa, & p_{G,B}^r(\theta) &= \pi_B^r - \pi_B^r \times \kappa, \\ p_{B,G}^r(\theta) &= \pi_G^r - \pi_G^r \times \kappa, & p_{B,B}^r(\theta) &= \pi_B^r + \pi_G^r \times \kappa, \end{aligned}$$

in which $\kappa = \exp\left[-(\xi_B^r + \xi_G^r) \times \theta\right]$. We assume that each element in the vector $N = \{n_1, n_2, \ldots, n_R\}$, with $\sum_r n_r = n$, represents the number of packets dispatched onto each physical path. Now, the value of $P(c^r)$ can be computed as follows:

$$P(c^r) = \phi_r \times \pi_{c_1^r}^r \prod_{i=1}^{n_r - 1} p_{c_i^r, c_{i+1}^r}^r(\theta_r).$$

After a sequence of algebraic computations, we can obtain

$$\pi_{tran} = \frac{1}{k} \sum_{\text{all } c} \mathcal{L}(c) \prod_{r=1}^{R} \pi_{c_1^r}^r \prod_{i=1}^{n_r - 1} p_{c_i^r, c_{i+1}^r}^r(\theta_r).$$

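The above equation allows us to compute the transmission loss rate of a specific scheduling approach. Based on Equation 11 in [38], we can obtain the expected value of $\pi_{tran}$ as in Equation 12,

$$\pi_{tran} = \frac{1}{k} \sum_{j=n-k+1}^{n} \; \sum_{\substack{0 \le j_1, \ldots, j_R \le j \\ j_1 + \cdots + j_R = j}} \; \prod_{r=1}^{R} \phi_r \left[ \pi_G^r P\!\left(\left[\begin{smallmatrix} n_r - 1 \\ j_r \end{smallmatrix}\right] \middle| G\right) + \pi_B^r P\!\left(\left[\begin{smallmatrix} n_r - 1 \\ j_r - 1 \end{smallmatrix}\right] \middle| B\right) \right] \times \sum_{r=1}^{R} \phi_r \times \frac{\sum_{i=0}^{k_r} i \times \left[ \pi_G^r P\!\left(\left[\begin{smallmatrix} k_r - 1 \\ i \end{smallmatrix}\right] \middle| G\right) P\!\left(\left[\begin{smallmatrix} n_r - k_r \\ j_r - i \end{smallmatrix}\right] \middle| G\right) + \pi_B^r P\!\left(\left[\begin{smallmatrix} k_r - 1 \\ i - 1 \end{smallmatrix}\right] \middle| B\right) P\!\left(\left[\begin{smallmatrix} n_r - k_r \\ j_r - i \end{smallmatrix}\right] \middle| B\right) \right]}{\pi_G^r P\!\left(\left[\begin{smallmatrix} n_r - 1 \\ j_r \end{smallmatrix}\right] \middle| G\right) + \pi_B^r P\!\left(\left[\begin{smallmatrix} n_r - 1 \\ j_r - 1 \end{smallmatrix}\right] \middle| B\right)}, \tag{12}$$

in which $P\!\left(\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right] \middle| q\right)$, $q \in \{G, B\}$, denotes the probability that any $b$ out of $a$ consecutive packets are lost, given that this block is preceded by a packet in state $q$ (e.g., $a = n_r - 1$ and $b = j_r - 1$). The detailed computation of these probabilities can be found in [14].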
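For small blocks, the configuration-sum form of $\pi_{tran}$ given earlier in this subsection can be checked by brute force. The Python sketch below enumerates all $2^n$ failure configurations; it is not the closed-form computation of Equation 12. It assumes that only the selected paths are passed in (so $\phi_r$ is implicit), that the block is split into consecutive per-path segments with the first $k$ packets being the source packets, and that $\xi_G^r$ is derived from $\pi_B^r$ and $\xi_B^r$. All names and numeric values are hypothetical.

```python
import itertools
import math


def transition(pi_g, pi_b, xi_sum, theta):
    """Gilbert transition probabilities p[(i, j)] over an interval theta."""
    kappa = math.exp(-xi_sum * theta)
    return {
        ("G", "G"): pi_g + pi_b * kappa,
        ("G", "B"): pi_b - pi_b * kappa,
        ("B", "G"): pi_g - pi_g * kappa,
        ("B", "B"): pi_b + pi_g * kappa,
    }


def path_config_prob(cfg, pi_b, xi_b, theta):
    """P(c^r): stationary first packet, then a chain of transitions."""
    if not cfg:
        return 1.0
    xi_g = xi_b * pi_b / (1.0 - pi_b)          # from pi_B = xi_G / (xi_B + xi_G)
    pi = {"G": 1.0 - pi_b, "B": pi_b}
    p = transition(pi["G"], pi["B"], xi_b + xi_g, theta)
    prob = pi[cfg[0]]
    for a, b in zip(cfg, cfg[1:]):
        prob *= p[(a, b)]
    return prob


def transmission_loss(paths, counts, k):
    """Brute-force pi_tran over all 2^n failure configurations.

    paths:  list of (pi_b, xi_b, theta) for the selected paths
    counts: packets per path, n_r (block order follows path order)
    """
    n = sum(counts)
    expected_lost = 0.0
    for c in itertools.product("GB", repeat=n):
        lost_src = 0 if c.count("B") <= n - k else c[:k].count("B")
        if lost_src == 0:
            continue
        prob, start = 1.0, 0
        for (pi_b, xi_b, theta), n_r in zip(paths, counts):
            prob *= path_config_prob(c[start:start + n_r], pi_b, xi_b, theta)
            start += n_r
        expected_lost += lost_src * prob
    return expected_lost / k


# Hypothetical two-path example: FEC(6, 4), three packets on each path.
print(transmission_loss([(0.05, 0.1, 5.0), (0.02, 0.2, 5.0)], [3, 3], k=4))
```

A useful sanity check is the no-redundancy case $n = k$ on a single path, where the result should equal the stationary loss rate $\pi_B^r$. The enumeration grows exponentially in $n$, which is why a closed form such as Equation 12 is needed in practice.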

3.4.2 Overdue loss rate

The end-to-end packet delay over a single wireless network ($d_r$) is dominated by the queueing delay at the bottleneck link, and it can be approximated by an exponential distribution [39], i.e.,

$$P\{d_r > T\} \approx \frac{1}{2\pi} \exp\left(-\frac{T}{d_r}\right),$$

in which $T$ denotes the maximum delay constraint that prevents playback buffer starvation. The value of $d_r$ can be obtained with the following equation:

$$d_r = t_r + \frac{\mu_r (1 - \pi_r)}{n_r \times S},$$

where $\mu_r (1 - \pi_r)$ represents the ‘loss-free’ bandwidth of $P_r$ and $S$ represents the packet payload size. Then, the probability of expired packet arrival can be obtained with

$$P\{d_r > T\} \approx \frac{1}{2\pi} \exp\left(-\frac{T \times n_r \times S}{t_r \times n_r \times S + \mu_r (1 - \pi_r)}\right).$$

The overdue loss rate can then be obtained as

$$\pi_{over} = \frac{\sum_{r=1}^{R} n_r \times \phi_r \times P\{d_r > T\}}{n} = \frac{1}{2\pi \times n} \sum_{r=1}^{R} n_r \times \phi_r \times \exp\left(-\frac{T \times n_r \times S}{t_r \times n_r \times S + \mu_r (1 - \pi_r)}\right). \tag{16}$$

3.5 Problem formulation

We are now ready to formulate the problem of flow rate allocation combined with JSCC for video delivery in heterogeneous wireless networks. Note that it is not practical for the video encoder to track frequent variations in the source rate. Therefore, we adapt the source rate in units of GOP, based on the channel status, FEC code rate, and delay requirements. To allow fast adaptation of the source rate to abrupt changes in the video content, this parameter is updated for each GOP in the encoded video sequence, roughly once every 0.27 s (with $J = 8$ frames and $F = 30$ fps, i.e., $J/F \approx 0.267$ s). The objective is to minimize the total distortion $D_{total}$ subject to loss, delay, and bandwidth constraints:

For each GOP, determine the values of $\Phi$, $\Omega$, $V$, and $n$ to

$$\text{minimize} \quad D_{total} = \underbrace{D_0 + \frac{\alpha}{V - V_0}}_{D_{src}} + \underbrace{\underbrace{\beta \times \pi_{tran}}_{D_{tran}} + \underbrace{\beta \times (1 - \pi_{tran}) \times \pi_{over}}_{D_{over}}}_{D_{chl}}, \tag{17}$$

$$\text{subject to:} \quad \begin{cases} V \times \dfrac{n}{k} \times \dfrac{\omega_r}{\sum_{i=1}^{R} \omega_i} < \mu_r, & \text{for } 1 \le r \le R, \\ V \times \dfrac{n}{k} \le \sum_{r=1}^{R} \mu_r, \\ \pi_{tran} + (1 - \pi_{tran}) \times \pi_{over} < \Delta, \\ \pi_{tran} = \text{Equation 12}, \\ \pi_{over} = \text{Equation 16}. \end{cases}$$

This is a nonlinear optimization problem with linear constraints. With regard to the computational cost and convergence, it is impractical to derive the exact solution for the minimal video distortion. In the next section, we will show how to resolve this optimization problem in the design of the proposed FRA-JSCC.

4 Design of flow rate allocation-based joint source-channel coding

In this section, we describe the overall design of the proposed FRA-JSCC and outline the functionality of its major components. The system design is presented in Figure 4, and it includes components implemented on both the server and client sides. In order to solve the optimization problem (17), the proposed FRA-JSCC performs the following steps at the server side: (1) FEC redundancy estimation, (2) source rate adaption, and (3) flow rate allocation. Specifically, the values of the FEC redundancy ($(n - k)/k$) and the video source rate ($V$) depend on the rate allocation vector ($\Phi$). The input and feedback information (e.g., the loss and delay constraints and the channel status) is necessary for these computation steps. The loss and delay requirements are imposed by the video application in order to achieve the required QoS. The encoded video stream is split among multiple available wireless networks at the weighted round robin distributor, and the packet transmitter is responsible for dispatching the FEC data packets onto different channels.

Figure 4

Overall design of the proposed FRA-JSCC consisting of working components at the server and client sides.

At the client side, the video frames will be stored in the playback buffer after the FEC decoding process. The inter-frame resequence step aims at reordering the video frames in case they arrive at the client out-of-order. As each video frame is associated with a decoding deadline, the overdue frames will be discarded and concealed by copying from the last received ones. Next, we will describe the key components in the system design and their working steps.

4.1 FEC redundancy estimation

To estimate the FEC redundancy, we model the multiple wireless networks as a single virtual link with effective loss rate $\pi_B$. Consider the transmission of $k$ FEC packets (each of size $S$) over the virtual link from the source to the destination. Let $(n - k)/k$ denote the redundancy (i.e., the ratio of redundant packets to source packets in the FEC block). There is an inherent tradeoff between FEC redundancy and its error correction power [29]. With more redundant packets, the receiver can recover from more severe losses, at the cost of larger end-to-end delays and higher loads imposed on networks. Therefore, in the design of FRA-JSCC, the goal is to use ‘just enough’ FEC redundancy to meet the video application’s loss requirement ($\Delta$). With this objective, the FEC adaption policy can be derived under fairly general assumptions by simply bounding the loss tail probability.

Therefore, the FEC redundancy estimation problem can be stated as

$$n = \arg\min \, \mathrm{diff}(\Delta - \pi_B),$$

in which

$$\mathrm{diff}(\Delta - \pi_B) = \begin{cases} \Delta - \pi_B & \text{if } \pi_B < \Delta, \\ \infty & \text{otherwise}, \end{cases}$$

and $\pi_B$ can be estimated using Equation 12. Therefore, the FEC redundancy can be obtained if and only if $\Phi$ is determined.
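The ‘just enough’ policy can be sketched as a search for the smallest block size whose residual loss stays below $\Delta$. The Python sketch below deliberately swaps in a simpler loss model than the one used in this paper: it assumes i.i.d. packet loss with probability `p` on the virtual link instead of the Gilbert-based Equation 12. Under that assumption, the residual loss is non-increasing in $n$, so the smallest feasible $n$ is also the one whose loss is closest to $\Delta$ from below. The function names and the cap `n_max=255` (the usual block-length limit of RS codes over GF(256)) are choices made for this example.

```python
from math import comb


def residual_loss(n, k, p):
    """Post-FEC source loss rate for systematic FEC(n, k) under i.i.d. loss p.

    A source packet stays lost when it is lost itself and at least
    n - k of the other n - 1 packets in the block are lost as well.
    """
    tail = sum(comb(n - 1, j) * p**j * (1 - p)**(n - 1 - j)
               for j in range(n - k, n))
    return p * tail


def estimate_block_size(k, p, delta, n_max=255):
    """Smallest n >= k whose residual loss is below delta, or None."""
    for n in range(k, n_max + 1):
        if residual_loss(n, k, p) < delta:
            return n
    return None


# Hypothetical setting: k = 10 source packets, 5% loss, 1% loss requirement.
n = estimate_block_size(k=10, p=0.05, delta=0.01)
print(n, residual_loss(n, 10, 0.05))
```

Replacing `residual_loss` with an implementation of Equation 12 would turn this sketch into the estimation step described above, at a higher computational cost.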

4.2 Source rate adaption

According to information theory [40], video source distortion can be reduced by increasing the effective encoding rate. On the other hand, a higher encoding rate leads to a higher transmission rate, which imposes a heavier load on the channels. If the imposed load exceeds the network capacity, it will in turn result in longer delay and packet loss due to network congestion. There is thus an inherent conflict between the source and channel distortion. Therefore, the critical point in source rate adaption is to find the upper bound of the source rate under application and channel constraints. The constraint imposed by video applications is the delay requirement. In real-time video applications, delay plays a vital role in streaming video quality: if a video frame arrives at the destination past its decoding deadline, it is considered lost. In this paper, we propose a source rate adaption algorithm under delay requirements, taking into account the FEC redundancy estimated in the previous subsection.

The maximum number of packets that can be transmitted through the $r$th wireless network within the maximum tolerable delay is calculated by

$$\omega_r = \left\lfloor \frac{\mu_r \times (1 - \pi_r) \times (T - t_r)}{S} \right\rfloor, \quad \text{for } 1 \le r \le R,$$

where $\lfloor x \rfloor$ denotes the largest integer not greater than $x$. Now, we set the weighting factor for the $r$th wireless channel of the weighted round robin distributor to $\omega_r \times \phi_r$ for $1 \le r \le R$. The proposed joint source and FEC control scheme calculates the FEC decoding failure rate based on the effective loss rate to determine the code rate. First, we define the maximum number of packets that can be transmitted using the constructed virtual link by

$$\Theta = \sum_{r=1}^{R} \phi_r \times \omega_r.$$

Then, the duration of a GOP to be displayed at the client side can be obtained with $J/F$, in which $J$ is the number of frames in a GOP and $F$ is the video frame rate (in frames per second). The number of bytes within a GOP after being encoded within the duration is $\Theta \times S \times k/n$, where $k/n$ denotes the FEC code rate. Consequently, the resulting maximum video source rate $V$ for a FEC block is determined by the equation

$$V = \frac{\Theta \times S \times k/n}{J/F}.$$
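The three formulas of this subsection chain together as sketched below in Python. The unit handling is an assumption made for the example: bandwidth in Kbps is taken as 1000 bit/s units and the payload size is given in bytes. The function names and all numeric values are hypothetical.

```python
import math


def max_packets(bw_kbps, loss, delay_s, deadline_s, payload_bytes):
    """omega_r: packets deliverable on one path within the deadline T."""
    if deadline_s <= delay_s:
        return 0  # the path cannot deliver anything in time
    bits = bw_kbps * 1000.0 * (1.0 - loss) * (deadline_s - delay_s)
    return math.floor(bits / (payload_bytes * 8))


def max_source_rate_kbps(omegas, selected, payload_bytes, n, k, gop_frames, fps):
    """V = Theta * S * (k/n) / (J/F), returned in Kbps."""
    theta = sum(w * s for w, s in zip(omegas, selected))   # virtual-link budget
    source_bits = theta * payload_bytes * 8 * k / n        # strip FEC overhead
    return source_bits / (gop_frames / fps) / 1000.0


# Hypothetical paths: (bandwidth Kbps, loss rate, propagation delay s).
paths = [(1000.0, 0.02, 0.05), (400.0, 0.05, 0.08)]
omegas = [max_packets(bw, p, t, deadline_s=0.25, payload_bytes=1000)
          for bw, p, t in paths]
rate = max_source_rate_kbps(omegas, selected=[1, 0], payload_bytes=1000,
                            n=12, k=10, gop_frames=8, fps=30)
print(omegas, rate)
```

Here `selected` plays the role of $\Phi$: setting an entry to 0 removes that path from the packet budget $\Theta$ and therefore lowers the achievable source rate.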

4.3 Flow rate allocation

The source packets together with the redundancy packets constitute the ‘flow’ mentioned throughout this paper. The goal of the flow rate allocation is to select appropriate wireless access networks out of all the candidates so as to minimize the end-to-end video distortion:

$$\text{Minimize:} \quad D_{total}(\Phi) = D_0 + \frac{\alpha}{V(\Phi) - V_0} + \beta \times \pi_B(\Phi).$$

Until now, we have obtained the expressions of $n$ and $V$. According to Theorem 1 in [17], the optimal flow rate allocation solution takes the form of a consecutive series of 1’s, followed by a consecutive series of 0’s, i.e., $\Phi = [1, 1, \ldots, 1, 0, 0, \ldots, 0]$. Indeed, the inclusion of a wireless access network with high loss rate, long propagation delay, or low bandwidth can theoretically increase $D_{total}$ because more FEC redundancy may be required to compensate for the increased uncertainty. In order to find the optimal solution, we first rank all the available wireless networks according to their ‘loss-free’ bandwidth ($\mu_r (1 - \pi_r)$), which has proven to be a good indicator of the network path quality [30]. Then, the optimal flow rate allocation vector can be obtained with a simple but effective search algorithm, i.e.,

Practically, a mobile device has a small number of network interfaces due to the limited battery life, mobility, cost, etc. Thus, the computational complexity required for the proposed flow rate allocation algorithm is negligible although the full search method is used.
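A minimal Python sketch of such a search is given below: rank the paths by loss-free bandwidth, evaluate $D_{total}$ for each prefix of the ranking, and keep the best one. It is an illustration under strong simplifications and is not the authors' Algorithm 1: the selected paths are pooled into one i.i.d. loss rate (weighted by $\omega_r$) instead of using Equation 12, the overdue loss is ignored because $\omega_r$ already caps each path at its deadline budget, and all constants are hypothetical.

```python
from math import comb, floor

# Hypothetical constants (assumed for illustration only).
DELTA, K, S_BYTES, T_S, GOP_S = 0.01, 10, 1000, 0.25, 8 / 30
D0, ALPHA, V0, BETA = 0.5, 3000.0, 50.0, 400.0


def residual_loss(n, k, p):
    """Post-FEC source loss for systematic FEC(n, k) under i.i.d. loss p."""
    tail = sum(comb(n - 1, j) * p**j * (1 - p)**(n - 1 - j)
               for j in range(n - k, n))
    return p * tail


def evaluate(paths):
    """D_total for the selected paths, or None if no feasible setting exists.

    paths: list of (bw_kbps, loss, delay_s).
    """
    omegas = [max(0, floor(bw * 1000 * (1 - p) * (T_S - t) / (S_BYTES * 8)))
              for bw, p, t in paths]
    theta = sum(omegas)
    if theta == 0:
        return None
    # Simplification: one pooled loss rate, weighted by packets per path.
    p_mix = sum(w * p for w, (_, p, _) in zip(omegas, paths)) / theta
    n = next((m for m in range(K, 256)
              if residual_loss(m, K, p_mix) < DELTA), None)
    if n is None:
        return None
    rate = theta * S_BYTES * 8 * (K / n) / GOP_S / 1000.0   # Kbps
    if rate <= V0:
        return None
    return D0 + ALPHA / (rate - V0) + BETA * residual_loss(n, K, p_mix)


def allocate(paths):
    """Search the prefixes of the ranked path list; return (paths, D_total)."""
    ranked = sorted(paths, key=lambda q: q[0] * (1 - q[1]), reverse=True)
    best = (None, float("inf"))
    for m in range(1, len(ranked) + 1):
        d = evaluate(ranked[:m])
        if d is not None and d < best[1]:
            best = (ranked[:m], d)
    return best


# Hypothetical cellular, WLAN, and WiMAX paths: (Kbps, loss, delay s).
chosen, distortion = allocate([(400.0, 0.02, 0.08),
                               (2000.0, 0.15, 0.03),
                               (1500.0, 0.03, 0.05)])
print(len(chosen), distortion)
```

Because only the $R$ prefixes of the ranking are evaluated, the cost grows linearly with the number of interfaces, which matches the observation that the search overhead is negligible on a device with few network interfaces.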

4.4 Channel status monitoring

Estimating channel status information based on end-to-end monitoring has been attracting research attention for years. Over heterogeneous wireless networks, it is very important to identify the physical characteristics of each wireless channel in order to utilize channel resources efficiently. The available bandwidth, propagation delay, and channel loss rate are especially important properties for a high-quality video streaming service. So far, numerous algorithms have been proposed to estimate the available bandwidth over wired/wireless networks in the literature [31, 41, 42]. In this paper, the pathChirp algorithm [31] is employed to estimate the available bandwidth through each wireless network with high accuracy and efficiency. During the transient state, a server sends some probe packets with exponentially distributed intervals through each wireless network interface when a video request arrives from a client. Based on the probe packet arrival intervals, the client estimates the available bandwidth using the pathChirp algorithm (for detailed descriptions, please refer to [31]). In the steady state, video data packets are transmitted at a fixed interval, and the client continuously monitors the packet arrival intervals in a sliding window and estimates the available bandwidth based on these intervals. We can easily calculate the propagation delay by the time stamp in each packet header. Now, we can obtain the following information

$\mu = \{\mu_1, \mu_2, \ldots, \mu_R\}$, $\pi = \{\pi_B^1, \pi_B^2, \ldots, \pi_B^R\}$, and $t = \{t_1, t_2, \ldots, t_R\}$.

The client periodically reports information on each physical path to the parameter control unit of the server through the most reliable uplink channel. This information is used to determine the results of FEC parameter tuning, source rate adaption, and flow rate allocation in the system design. The procedures of the proposed FRA-JSCC are presented in Algorithm 1.
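The steady-state monitoring on one path can be illustrated with a small sliding-window sketch. It is a simplified stand-in and not pathChirp: it reports the receive rate over the window rather than the available bandwidth, it assumes in-order delivery without duplicates, and it assumes synchronized sender and client clocks for the one-way delay. The names `PacketRecord` and `ChannelMonitor` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PacketRecord:
    seq: int            # sender sequence number
    sent_at: float      # time stamp carried in the packet header (s)
    received_at: float  # arrival time at the client (s)
    size_bytes: int


class ChannelMonitor:
    """Sliding-window statistics for one wireless path (illustrative only)."""

    def __init__(self, window: int = 50):
        self.records = deque(maxlen=window)

    def on_packet(self, rec: PacketRecord) -> None:
        self.records.append(rec)

    def report(self) -> dict:
        recs = list(self.records)
        if len(recs) < 2:
            return {}
        span = recs[-1].received_at - recs[0].received_at
        # Bits that arrived after the first packet, divided by the elapsed time.
        bits = 8 * sum(r.size_bytes for r in recs[1:])
        rate = bits / span if span > 0 else float("inf")
        # Mean one-way delay from header time stamps.
        delay = sum(r.received_at - r.sent_at for r in recs) / len(recs)
        # Gaps in the sequence numbers are counted as losses.
        expected = recs[-1].seq - recs[0].seq + 1
        loss = 1.0 - len(recs) / expected
        return {"rate_bps": rate, "delay_s": delay, "loss": loss}
```

One such monitor per interface would yield the per-path values that the client feeds back to the server.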

Algorithm 1 Flow Rate Allocation based Joint Source Channel Coding.

5 Performance evaluation

In this section, we evaluate the efficacy of the proposed FRA-JSCC by comparing it with the existing schemes for video delivery over heterogeneous wireless networks. We first describe the emulation methodology that includes the emulation setup, reference schemes, performance metrics, and emulation scenario.

5.1 Emulation methodology

5.1.1 Emulation setup

We adopt Exata and the Joint Scalable Video Model (JSVM) as the network emulator and video codec, respectively. The architecture of the evaluation system is presented in Figure 5, and the main configurations are as follows:

Figure 5 System architecture for performance evaluation.

  •  Exata 2.1 [43] is used as the network emulator. Exata is an advanced edition of QualNet [44] in which we can perform semi-physical emulations. To implement real video streaming-based emulations, we integrate the source code of JSVM (see endnote a), compiled as an object file library (.LIB), with Exata and develop an application layer protocol named ‘Video Transmission’. For the detailed development steps, please refer to the Exata Programmer’s Guide [43]. In the emulation topology, the video server has one wired network interface and the mobile client has three wireless network interfaces, i.e., cellular, WLAN, and WiMAX. We can construct an end-to-end connection to a specific wireless network interface by binding a pair of IP addresses from the server and the client. The configurations of the emulated background traffic in the wired networks are listed in Table 2. The server and client are mapped to real computers, which are connected to the emulation server through the Exata Connection Manager. IEEE 802.11b is adopted as the WLAN protocol. The configurations of the heterogeneous wireless networks are summarized in Table 3 [4, 5, 45].

Table 2 Parameters of background traffic
Table 3 Parameter configuration of wireless networks
  •  H.264/SVC reference software JSVM 9.18 [46] is adopted as the video encoder. The generated video stream is encoded at 30 frames per second, and a GOP consists of 8 frames. The test video sequences are Foreman, Mother & Daughter, Hall, and Container in QCIF (quarter common intermediate format) with 300 frames. Each sequence features a different pattern of temporal motion and spatial characteristics, which is reflected in its video quality versus encoding rate dependency. We concatenate each sequence 10 times to obtain 3,000 frames for statistically meaningful results. The loss requirement (Δ) and the delay constraint are set to 1% and 250 ms, respectively.

5.1.2 Reference schemes

We compare the performance of FRA-JSCC with the following schemes for video delivery in heterogeneous wireless networks:

  • FCVP [6]. Because the system proposed in [6] exploits the path diversity in heterogeneous wireless networks based on the fountain code, we name it the fountain code-based virtual path construction (FCVP) system. In the implementation of FCVP, the control parameters are updated every 0.5 s. The symbol and packet sizes are set to 8 and 512 bytes, respectively.

  • JMFR [16]. The joint multimedia-FEC rate allocation scheme computes the optimal source and FEC rates for scalable video over multi-path networks based on the utility algorithm. The number of video layers is set to 1 in all the emulations.

  • DMP [18]. The dynamic multi-path streaming scheme utilizes multiple paths by maintaining a transmission control protocol (TCP) connection on each path. The sender puts the data packets in a single sender queue. At any time, only one TCP connection can access the sender queue. The winning TCP connection keeps sending data until it is blocked. Another available TCP connection then gains access to the sender queue and continues sending data. To compare fairly with the other competing models, we dynamically adjust the video encoding rate based on the aggregate bandwidth of the available links.
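The single-queue behaviour of DMP described in the last item can be pictured with a toy dispatcher. It is a deliberately simplified sketch and not the scheme of [18]: `send_budget[i]` is a hypothetical number of packets that connection i can send before it blocks, and the budget is assumed to be restored by the time the connection wins the queue again.

```python
from collections import deque
from typing import Dict, List


def dmp_dispatch(packets: List[int], send_budget: List[int]) -> Dict[int, List[int]]:
    """Drain one shared sender queue with only one connection active at a time."""
    if not any(b > 0 for b in send_budget):
        raise ValueError("at least one connection must be able to send")
    queue = deque(packets)
    sent: Dict[int, List[int]] = {i: [] for i in range(len(send_budget))}
    conn = 0
    while queue:
        # The active connection keeps sending until it blocks (budget used up).
        for _ in range(send_budget[conn]):
            if not queue:
                break
            sent[conn].append(queue.popleft())
        # The next connection then gains access to the queue.
        conn = (conn + 1) % len(send_budget)
    return sent
```

In this toy model a path with a larger budget naturally carries a larger share of the packets, without any explicit rate allocation.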

5.1.3 Performance metrics

We adopt the following performance metrics to evaluate the proposed approach against the above competing approaches:

  • PSNR. Peak signal-to-noise ratio is a standard metric of video quality and is a function of the mean square error between the original and the received video frames. If a video frame is dropped in transmission or arrives past its deadline, it is considered lost but may be concealed by copying the last received frame before it.

  • Average end-to-end delay. The end-to-end delay of a video frame consists of delay in the network and the resequencing time at the client. It is counted from the generation time of a video frame to the time when it can be decoded.

  • Effective loss rate. As introduced in Section 3.4, the effective loss rate $\pi_B$ includes both the transmission loss and the overdue loss. Because PSNR measures video quality after error concealment of the lost video frames, we also measure the effective loss rate to assess how well the competing models mitigate packet loss.
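As an illustration of the PSNR metric with frame-copy concealment, the following sketch uses the common definition PSNR = 10 log10(peak² / MSE) for 8-bit samples. The frame representation (a flat sequence of luma samples) and the function names are assumptions made for the example, not part of the evaluation software.

```python
import math
from typing import List, Optional, Sequence

Frame = Sequence[int]  # flattened 8-bit luma samples of one frame


def psnr(original: Frame, received: Frame, peak: int = 255) -> float:
    """PSNR in dB between two frames of equal size."""
    mse = sum((a - b) ** 2 for a, b in zip(original, received)) / len(original)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def conceal(received: List[Optional[Frame]]) -> List[Frame]:
    """Replace each lost frame (None) with the last frame received before it."""
    out: List[Frame] = []
    last: Optional[Frame] = None
    for frame in received:
        if frame is not None:
            last = frame
        if last is None:
            raise ValueError("no earlier frame available for concealment")
        out.append(last)
    return out
```

A sequence-level value can then be obtained by averaging `psnr` over the original frames and the concealed output.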

5.1.4 Emulation scenario

We conduct all the emulations in a mobile scenario with trajectories indexed from 1 to 4, as shown in Figure 5. The four mobile trajectories represent the different access options for the mobile user in the integrated heterogeneous wireless networks, e.g., the user can simultaneously access UMTS and WiMAX while moving along the first trajectory. The mobile client sends a request to the server through a wireless interface and establishes the connection whenever it moves into the coverage of that network. The moving speed of the client is set to 2 m/s in all the emulations. The components of FRA-JSCC work at the GOP level, i.e., every 0.25 s. It is necessary to update the JSCC parameters for each GOP because of the time-varying wireless channel status. However, with regard to coding efficiency, it is impractical to track the rate variation at the video frame level.

To obtain statistically confident results, we repeat each set of emulations with different video sequences more than five times and report the average results with 95% confidence intervals. The microscopic and mobility results are presented with measurements of finer granularity.
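The text does not state how the confidence intervals are computed. One common choice for a small number of repeated runs is a Student-t interval on the per-run averages, sketched below under that assumption; the critical values are the standard two-sided 95% values, and `mean_ci95` is a hypothetical helper name.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% Student-t critical values, indexed by degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci95(samples: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, half-width) of a 95% confidence interval for the mean."""
    n = len(samples)
    if n < 2 or (n - 1) not in T_95:
        raise ValueError("this sketch supports between 2 and 11 samples")
    mean = statistics.mean(samples)
    # Sample standard deviation (n - 1 in the denominator).
    half_width = T_95[n - 1] * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width
```

The interval is then reported as the mean plus or minus the returned half-width.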

5.2 Evaluation results

Before presenting the results for each performance metric in detail, we first show the channel status information that the client feeds back every 0.25 s. Figure 6 plots the available bandwidth of different wireless access networks while the client moves along mobile trajectory 3. The available bandwidth of both WLAN and WiMAX fluctuates because of the injected background traffic and client mobility. The instantaneous loss rates are shown in Figure 7. Due to lack of space, we do not present all the channel status information collected during the experiments.

Figure 6 Available bandwidth of different wireless networks while moving along mobile trajectory 3. (a) WLAN and (b) WiMAX.

Figure 7 Instantaneous channel loss rate of different wireless networks while moving along mobile trajectory 3. (a) WLAN and (b) WiMAX.

5.2.1 PSNR

As shown in Figure 8, FRA-JSCC achieves higher PSNR values and lower variations than the other competing models. The average video PSNR in trajectory 2 is lower than that in trajectory 1 as the WLAN is less stable than the WiMAX. The results are consistent with the example in Figure 1 and the conclusions in related work [6, 7]. Besides, the advantage of FRA-JSCC and FCVP over the other two schemes is larger in trajectories 3 and 4 as more wireless access networks are available. The substantial improvements in video quality confirm the importance of JSCC in conjunction with flow rate allocation in heterogeneous wireless networks. FRA-JSCC outperforms FCVP because the Reed-Solomon code is more appropriate than the fountain code for real-time video and thus reduces the erasure-coding-induced delay. To give a microscopic view of the results, we also list the mean values and standard deviations (Stddev) for mobile trajectory 4 in Table 4. The per-frame video PSNR during the interval [0, 20] s is presented in Figure 9. It can be observed that FRA-JSCC keeps the PSNR values in a relatively higher range. In mobile trajectory 1, the superiority of FRA-JSCC over JMFR becomes more obvious, which is due to the increased number of access options.

Figure 8 Average PSNR values and variances under different evaluation scenarios.

Figure 9 PSNR values of the received 600 video frames in the Foreman sequence. (a) FRA-JSCC, (b) FCVP, (c) JMFR, and (d) DMP.

Table 4 Average PSNR values for different compared models

5.2.2 Average end-to-end delay

Figure 10 plots the average end-to-end delays together with the confidence intervals. FRA-JSCC achieves the lowest delay among all the competing models. The delay performance of FCVP is inferior to that of FRA-JSCC and JMFR because of the large block size of the fountain code and its coding inefficiency. The results indicate that the Reed-Solomon code is more suitable for real-time video applications than the fountain code. Figure 11a depicts the cumulative distribution function of the end-to-end video frame delay from a single experiment. The per-frame delay of FRA-JSCC is significantly lower than that of the three reference schemes. Although FEC encoding is not employed in DMP, the lost video frames need to be retransmitted, which increases the end-to-end delay. As each video frame is associated with a decoding deadline in real-time applications, we plot the ratio of video frames past the decoding deadline of 200 ms in Figure 11b.
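The two delay statistics discussed here, the empirical cumulative distribution function and the ratio of frames past the deadline, can be computed from a list of per-frame delays as in the sketch below. The function names are illustrative, and the input delays are assumed to be given in milliseconds.

```python
from bisect import bisect_right
from typing import Sequence


def empirical_cdf(delays_ms: Sequence[float], x_ms: float) -> float:
    """Fraction of frames whose end-to-end delay is at most x_ms."""
    if not delays_ms:
        raise ValueError("at least one delay sample is required")
    ordered = sorted(delays_ms)
    # bisect_right counts the samples that are <= x_ms.
    return bisect_right(ordered, x_ms) / len(ordered)


def overdue_ratio(delays_ms: Sequence[float], deadline_ms: float = 200.0) -> float:
    """Fraction of frames that arrive after the decoding deadline."""
    return 1.0 - empirical_cdf(delays_ms, deadline_ms)
```

Evaluating `empirical_cdf` on a grid of delay values gives a curve of the kind shown in a CDF plot, while `overdue_ratio` gives a single number per scheme.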

Figure 10 The average end-to-end delays of all the compared schemes.

Figure 11 Delay performance of the competing models. (a) Cumulative distribution function and (b) ratio of video frames past the decoding deadline of 200 ms.

5.2.3 Effective loss rate

Figure 12 depicts the effective loss rates of all the competing schemes under different mobile trajectories. The pattern is consistent with the results presented in Figure 8, as the PSNR generally decreases as the ratio of lost video frames increases. FRA-JSCC significantly outperforms the reference schemes as it takes into account both the loss and delay requirements. However, in contrast to the end-to-end delay results, FCVP outperforms JMFR and DMP in minimizing the effective loss rate because it includes a physical path selection algorithm in its system design, which substantially decreases the transmission loss.
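A measured effective loss rate can be split into its two components by simple counting, as sketched below. This is an empirical illustration and not the analytical model of Section 3.4: a packet is represented by its delay in milliseconds, or by `None` if it was lost in transmission, and the function name and return keys are hypothetical.

```python
from typing import Dict, Optional, Sequence


def effective_loss_rate(
    delays_ms: Sequence[Optional[float]], deadline_ms: float
) -> Dict[str, float]:
    """Split the loss into transmission loss and overdue loss by counting."""
    total = len(delays_ms)
    if total == 0:
        raise ValueError("at least one packet is required")
    # Packets that never arrived.
    transmission = sum(1 for d in delays_ms if d is None)
    # Packets that arrived, but only after the deadline.
    overdue = sum(1 for d in delays_ms if d is not None and d > deadline_ms)
    return {
        "transmission": transmission / total,
        "overdue": overdue / total,
        "effective": (transmission + overdue) / total,
    }
```

Reporting the two components separately makes it visible whether a scheme loses packets mainly on the channel or mainly by missing the deadline.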

Figure 12 The effective loss rates of all the competing models.

6 Conclusions

In this paper, we have presented a flow rate allocation-based JSCC approach for mobile video delivery in heterogeneous wireless networks. Through modeling and analysis, we have developed solutions for FEC redundancy adaption, video source rate adaption, and flow rate allocation. Experimental results show that the proposed FRA-JSCC is able to dynamically select the appropriate wireless access networks out of all candidates and significantly improve the video PSNR. As future work, we will consider (1) designing a seamless vertical handoff algorithm for optimal-quality video in integrated WLAN, WiMAX, and cellular networks. The work in [5] formulates the heterogeneous wireless networks as restless bandit systems; however, it does not provide an in-depth analysis of the physical characteristics (e.g., the coverage and received signal strength) of each wireless network. We will also consider (2) including an optimal path interleaving mechanism in FRA-JSCC to overcome burst loss.


a We choose JSVM for the convenience of source code integration, as both Exata and JSVM are developed in C++, while the H.264/AVC JM software is developed in C.


  1. Hurley C, Chen S, Karim J: YouTube. 2005. Accessed 1 Sept 2013

  2. Hulu: NBC Universal & News Corp. Los Angeles, CA; 2007. Accessed 1 Sept 2013

  3. Cisco: Cisco visual networking index: Global mobile data traffic forecast update, 2012–2017, May 2013 (online). Accessed 29 Nov 2013

  4. Oliveira T, Mahadevan S, Agrawal DP: Handling network uncertainty in heterogeneous wireless networks. In INFOCOM, 2011 Proceedings IEEE. IEEE Piscataway; 2011:2390-2398.

  5. Si P, Ji H, Yu FR: Optimal network selection in heterogeneous wireless multimedia networks. Wireless Netw 2009, 16(5):1277-1288.

  6. Han S, Joo H, Lee D, Song H: An end-to-end virtual path construction system for stable live video streaming over heterogeneous wireless networks. IEEE J. Select. Areas Commun 2011, 29(5):1032-1041.

  7. Yoon J, Zhang H, Banerjee S, Rangarajan S: MuVi: a multicast video delivery scheme for 4G cellular networks. In Proceedings of the 18th Annual International Conference on Mobile Computing and Networking (Mobicom ’12). ACM, New York; 2012:209-220.

  8. Chebrolu K, Rao R: Bandwidth aggregation for real-time applications in heterogeneous wireless networks. IEEE Trans. Mobile Comput 2006, 5(4):388-403.

  9. Song W, Zhuang W: Performance analysis of probabilistic multipath transmission of video streaming traffic over multi-radio wireless devices. IEEE Trans. Wireless Commun 2012, 11(4):1554-1564.

  10. Jurca D, Frossard P: Video packet selection and scheduling for multipath streaming. IEEE Trans. Multimedia 2007, 9(3):629-641.

  11. Khalek AA, Heath RW, Caramanis C: A cross-layer design for perceptual optimization of H.264/SVC with unequal error protection. IEEE J. Select. Areas Commun 2012, 30(7):1157-1171.

  12. Zhang Y, Gao W, Lu Y, Huang Q, Zhao D: Joint source-channel rate-distortion optimization for H.264 video coding over error-prone networks. IEEE Trans. Multimedia 2007, 9(3):445-454.

  13. Ji W, Li Z, Chen Y: Joint source-channel coding and optimization for layered video broadcasting to heterogeneous devices. IEEE Trans. Multimedia 2012, 14(2):443-455.

  14. Frossard P, Verscheure O: Joint source/FEC rate selection for quality-optimal MPEG-2 video delivery. IEEE Trans. Image Process 2001, 10(2):1815-1825.

  15. Ahmad S, Hamzaoui R, Akaidi MA: Adaptive unicast video streaming with rateless codes and feedback. IEEE Trans. Circuits Syst. Video Technol 2010, 20(2):275-285.

  16. Jurca D, Frossard P, Jovanovic A: Forward error correction for multipath media streaming. IEEE Trans. Circuits Syst. Video Technol 2009, 19(9):1315-1326.

  17. Jurca D, Frossard P: Media flow rate allocation in multipath networks. IEEE Trans. Multimedia 2007, 9(6):1227-1240.

  18. Wang B, Wei W, Guo Z, Towsley D: Multipath live streaming via TCP: scheme, performance and benefits. TOMCCAP 2009, 5(3). doi:10.1145/1556134.1556142

  19. Bystrom M, Modestino JW: Combined source-channel coding schemes for video transmission over an additive white gaussian noise channel. IEEE J. Select. Areas Commun 2000, 18(6):880-890.

  20. Cernea DC, Munteanu A, Alecu A, Cornelis J, Schelkens P: Scalable joint source and channel coding of meshes. IEEE Trans. Multimedia 2008, 10(3):503-513.

  21. He Z, Cai J, Chen CW: Joint source channel rate-distortion analysis for adaptive mode selection and rate control in wireless video coding. IEEE Trans. Circuits Syst. Video Technol 2002, 12(6):511-523. 10.1109/TCSVT.2002.800313

  22. Duyck D, Capirone D, Boutros J, Moeneclaey M: Analysis and construction of full-diversity joint network-LDPC codes for cooperative communications. EURASIP J. Wireless Commun. Netw 2010, 2010: 805216. 10.1155/2010/805216

  23. Jaspar X, Guillemot C, Vandendorpe L: Joint source-channel turbo techniques for discrete-valued sources: from theory to practice. Proc. IEEE 2007, 95(6):1345-1361.

  24. Qian L, Jones DL, Ramchandran K, Appadwedula S: A general joint source-channel matching method for wireless video transmission. In Data Compression Conference (DCC ’99), Snowbird, 29–31 Mar 1999. IEEE Piscataway; 1999:414-423.

  25. Xu Q, Stankovic V, Xiong Z: Distributed joint source-channel coding of video using raptor codes. IEEE Trans. Circuits Syst. Video Technol 2007, 25(4):851-861.

  26. Zhai F, Eisenberg Y, Pappas TN, Berry R, Katsaggelos AK: Rate-distortion optimized hybrid error control for real-time packetized video transmission. IEEE Trans. Image Process 2006, 15(1):40-53.

  27. Apostolopoulos J, Trott M: Path diversity for enhanced media streaming. IEEE Commun. Mag 2004, 42: 80-87.

  28. Ramaboli A, Falowo O, Chan A: Bandwidth aggregation in heterogeneous wireless networks: a survey of current approaches and issues. J. Netw. Comput. Appl 2012, 35(6):1674-1690. 10.1016/j.jnca.2012.05.015

  29. Chow ALH, Yang H, Xia CH, Kim M, Liu Z, Lei H: EMS: Encoded multipath streaming for real-time live streaming applications. In 17th IEEE International Conference on Network Protocols (ICNP 2009), Princeton 13–16 Oct 2009. IEEE, Piscataway; 2010:233-243.

  30. Sharma V, Kar K, Ramakrishnan KK, Kalyanaraman S: A transport protocol to exploit multipath diversity in wireless networks. IEEE/ACM Trans. Netw 2012, 20(4):1024-1039.

  31. Ribeiro V, Riedi R, Baraniuk R, Navratil J, Cottrell L: pathChirp: efficient available bandwidth estimation for network paths. In Proceedings of Passive and Active Measurement Workshop. La Jolla; 6–8 April 2003.

  32. Paxson V, Almes G, Mahdavi J, Mathis M: Framework for IP performance metrics. IETF Technical Report, RFC 2330 1998.

  33. Stuhlmüller K, Färber N, Link M, Girod B: Analysis of video transmission over lossy channels. IEEE J. Select. Areas Commun 2000, 18(6):1012-1032.

  34. Zhu X, Agrawal P, Singh JP, Alpcan T, Girod B: Distributed rate allocation policies for multihomed video streaming over heterogeneous access networks. IEEE Trans. Multimedia 2009, 11(4):752-764.

  35. Thomos N, Argyropoulos S, Boulgouris N, Strintzis M: Robust transmission of h.264/avc video using adaptive slice grouping and unequal error protection. In IEEE International Conference on Multimedia and Expo, Toronto, 9–12 July 2006. IEEE, Piscataway; 2006:593-596.

  36. Baccaglini E, Tillo T, Olmo G: Slice sorting for unequal loss protection of video streams. IEEE Signal Process. Lett 2008, 15: 581-584.

  37. Xiao J, Tillo T, Lin C, Zhao Y: Dynamic sub-GOP forward error correction code for real-time video applications. IEEE Trans. Multimedia 2012, 14(4):1298-1308.

  38. Kurant M: Exploiting the path propagation time differences in multipath transmission with FEC. IEEE J. Select. Areas Commun 2011, 29(5):1021-1031.

  39. Kompella S, Mao S, Hou YT, Sherali HD: On path selection and rate allocation for video in wireless mesh networks. IEEE/ACM Trans. Netw 2009, 17(1):212-224.

  40. Sun M, Reibman A: Compressed Video Over Networks. Marcel Dekker Inc., New York; 2000.

  41. Jain M, Dovrolis C: Pathload: a measurement tool for end-to-end available bandwidth. In Proceedings of Passive and Active Measurement Workshop. Fort Collins; 25–27 Mar 2002.

  42. Zhou A, Liu M, Song Y, et al.: A new method for end-to-end available bandwidth estimation. In Proceedings of IEEE GLOBECOM, New Orleans, 30 Nov–4 Dec 2008. IEEE, Piscataway; 2008:1-5.

  43. Exata: (SCALABLE Network Technologies, Inc., Culver City, 2008). Accessed 29 Nov 2013

  44. QualNet: (SCALABLE Network Technologies, Inc., Culver City, 2008). Accessed 29 Nov 2013

  45. Song W, Cheng Y, Zhuang W: Improving voice and data services in cellular/WLAN integrated networks by admission control. IEEE Trans. Wireless Commun 2007, 6(11):4025-4037.

  46. JSVM software, 2006. Accessed 20 Jun 2012


This research is supported by the National Grand Fundamental Research 973 Program of China under grant nos. 2011CB302506, 2012CB315802, and 2013CB329102; Research Program of Chongqing Municipal Education Commission (grant no. KJ130523); CQUPT Research Fund for Young Scholars (grant no. A2012-79); National Key Technology Research and Development Program of China ‘Research on the mobile community cultural service aggregation supporting technology’ (grant no. 2012BAH94F02); Novel Mobile Service Control Network Architecture and Key Technologies (2010ZX03004-001-01); National High-tech R&D Program of China (863 Program) under grant no. 2013AA102301; National Natural Science Foundation of China under grant nos. 61003067, 61171102, 61001118, and 61132001; Program for New Century Excellent Talents in University (grant no. NCET-11-0592); Project of New Generation Broadband Wireless Network under grant no. 2011ZX03002-002-01; and Beijing Nova Program under grant no. 2008B50. The authors would like to express their gratitude to the anonymous reviewers who provided comments to improve the paper quality.

Author information


Corresponding author

Correspondence to Jiyan Wu.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wu, J., Shang, Y., Huang, J. et al. Joint source-channel coding and optimization for mobile video streaming in heterogeneous wireless networks. J Wireless Com Network 2013, 283 (2013).


  • Mobile video streaming
  • Heterogeneous wireless networks
  • Multi-homing
  • Joint source-channel coding
  • Flow rate allocation