Forward error correction (FEC) codes that can provide unequal error protection (UEP) have been used recently for video transmission over wireless channels. These video transmission schemes may also benefit from the use of FEC codes both at the application layer (AL) and the physical layer (PL). However, the interaction and optimal setup of UEP FEC codes at the AL and the PL have not been previously investigated. In this paper, we study the cross-layer design of FEC codes at both layers for H.264 video transmission over wireless channels. In our scheme, UEP Luby transform codes are employed at the AL and rate-compatible punctured convolutional codes at the PL. In the proposed scheme, video slices are first prioritized based on their contribution to video quality. Next, we investigate the four combinations of cross-layer FEC schemes at both layers and concurrently optimize their parameters to minimize the video distortion and maximize the peak signal-to-noise ratio. We evaluate the performance of these schemes on four test H.264 video streams and show the superiority of optimized cross-layer FEC design.

1 Introduction

Multimedia applications such as video streaming, which are delay sensitive and bandwidth intensive, are growing rapidly over wireless networks. However, existing wireless networks provide only limited bandwidth and time-varying quality of service (QoS) support for these applications. Due to limited wireless bandwidth, the video is compressed using sophisticated compression techniques such as H.264 AVC, which is the state-of-the-art video compression standard jointly developed by the ITU and ISO [1]. The compressed video is vulnerable to channel impairments as the corrupted packets induce different levels of quality degradation due to temporal and spatial dependencies in the compressed bitstream. The most important problem that affects video quality is error propagation where an error in a reference frame is propagated by the decoder to all future reconstructed frames, which are predicted from the corrupted reference frame. This problem has led to the design of error-resiliency features, such as flexible macroblock ordering (FMO) [2], data partitioning, and error concealment schemes in H.264 [1, 3, 4]. Recent research has demonstrated the promise of cross-layer protocols for supporting the QoS demands of multimedia applications over wireless networks [5–7]. For example, van der Schaar and Shankar [6] showed the benefits of the joint APP-MAC-PHY approach for transmitting video over wireless networks.

Forward error correction (FEC) schemes are used to protect the video data against channel errors in order to improve the successful data transmission probability and to eliminate the costly retransmissions. However, the maximum throughput does not guarantee the minimum video distortion at the receiver for the following reasons. First, unlike data packets, loss of H.264 compressed video slices induces different amounts of distortion in the received video. Therefore, the FEC code rates should be adaptive to the slice priority. Second, video data are delay sensitive; therefore, the retransmission of corrupted slices may not be feasible. Third, a video stream can tolerate loss of some slices because the lost slices can be error-concealed. This is true especially for the low-priority slices, which introduce low distortion in the received video and result in graceful quality degradation. In this paper, we consider H.264 AVC streams with fixed slice sizes, where each slice can be independently decoded. The video slices are classified into four priority classes based on the distortion contributed by their loss to the received video quality.

An FEC code that provides unequal error protection (UEP), i.e., a higher (lower) protection to high (low)-priority video slices, can achieve considerable quality improvement compared to the equal error protection (EEP) FEC codes [8, 9]. Note that the UEP FEC codes may be employed both at the application layer (AL) and physical layer (PL). Recently, some schemes [5, 10, 11] have considered the precise tuning of EEP FEC schemes at the AL and the PL. However, to the best of our knowledge, existing schemes have not investigated the cross-layer design of UEP FEC codes at the AL and the PL for prioritized video transmission. Employing FEC codes at both layers introduces two interesting trade-offs that we investigate in this paper. First, both FEC codes share a common channel bandwidth to add their redundancy and the optimal ratio of overhead added by each needs to be determined for a given channel signal-to-noise ratio (SNR) and bandwidth. Second, since UEP can be provided at both layers, we need to find the optimal UEP/EEP FEC setup to maximize the video peak SNR (PSNR). To tackle these trade-offs, we concurrently tune the parameters of two FEC codes at both layers.

We use UEP Luby transform (LT) codes [12, 13] at the AL and rate-compatible punctured convolutional (RCPC) codes [14] at the PL. LT codes [15] are modern and efficient FEC codes that are particularly suitable for packet-level coding at the AL. These codes are rateless [12, 13, 15, 16] in the sense that they can generate an unlimited number of encoded symbols from a finite-length block of source symbols.

Next, we carry out a cross-layer optimization to find the optimal parameters of both FEC codes by considering the relative priorities of video packets. For a known channel SNR (i.e., $E_s/N_0$), we address the problem of assigning optimal FEC code rates at the AL and the PL to the individual priority slices within the channel bit-rate limitations. The information about the channel conditions can be obtained from the receiver in the form of channel side information [5–7, 17, 18].

The scheme provides higher transmission reliability to high-priority slices at the expense of the higher loss rates for low-priority slices and, whenever necessary, also discards some low-priority slices to meet the channel bit-rate limitations. We show that adapting the FEC code rates to the slice priority reduces the overall expected video distortion at the receiver. Our scheme does not assume retransmission of lost slices. The preliminary results of this paper appeared in [8].

This paper is organized as follows: Section 2 provides an overview of the related work on FEC coding for video streams. Section 3 provides a brief background on the LT and RCPC FEC codes. Section 4 describes the video slice priority assignment, design of LT and RCPC codes, and cross-layer FEC schemes. Section 5 presents the cross-layer optimization and performance of the proposed FEC schemes. The simulation results of the proposed cross-layer FEC schemes on sample H.264 videos are presented in Section 6, followed by conclusions in Section 7.

2 Related work

LT codes have recently become popular in video transmission schemes due to their good performance and low complexity [15]. Kushwaha et al. [19] used LT codes to encode the group of pictures (GOP) of each layer of H.264 SVC video for transmission over cognitive radio wireless networks. Ahmad et al. [17] took advantage of the ratelessness of LT codes and proposed an adaptive FEC scheme for video transmission over the Internet by employing feedback from receivers in the form of acknowledgements. Cataldi et al. [18] proposed novel LT-based codes, called sliding-window Raptor codes, with a higher efficiency than regular LT codes. They used these codes to provide UEP for a two-layer H.264 SVC video. LT codes were also used in [20–25] to design streaming schemes with lower complexity.

Stockhammer et al. [5] defined the protocol stack, including the FEC coding at the AL and the PL, for the multimedia broadcast multicast service (MBMS) download and streaming in universal mobile telecommunication system (UMTS). In [5], a Raptor code [16] is used at the AL and a turbo code at the PL. Gomez-Barquero and Bria [10] suggested employing the Raptor codes as the AL FEC in DVB-H systems for mobile terminals and demonstrated its advantages over conventional multiprotocol encapsulation (MPE) FEC. Conventional MPE FEC employs the Reed-Solomon codes to encode the video stream; hence, it lacks the flexibility of LT coding at the AL. Courtade and Wesel [11] considered a setup with LT coding at the AL and turbo coding at the PL, and showed that the available channel bandwidth should be optimally split between the AL and PL FEC codes to improve the system performance.

Luby et al. [26] also considered employing two layers of EEP FEC at the AL and the PL for MBMS download delivery in UMTS. They investigated the trade-off between the AL FEC and PL FEC codes, and studied the advantages of the AL FEC on the system performance. Stockhammer and Liebl [27] used the Raptor codes at the AL in 3GPP streaming applications. They investigated how the AL FEC coding may guarantee the ratio of satisfied users who are receiving the video stream. Afzal et al. [28] investigated the overall system performance when the AL FEC codes are used in video streaming in UMTS and packet radio services. Alexiou et al. [29] studied the power control of streaming over high-speed downlink packet access systems when the AL FEC is employed. Munaretto et al. [30] proposed an interesting optimization of the AL FEC coding, video source coding, and the PL rate selection to improve the PSNR of delivered video on cellular networks. The authors in [31] also considered employing the Raptor codes at the AL to improve the quality of service for video in MBMS in long-term evolution (LTE) networks. They investigated the benefits of the AL FEC to multicast multimedia contents and examined how much FEC redundancy should be used under different packet loss patterns.

In [8], we investigated UEP rateless coding at the AL and assumed an ideal PL coding. We found the optimal parameters of a UEP rateless code that maximizes the video quality at the receiver for known channel bandwidth. In this paper, we extend the results of [8] and consider the interaction of the AL coding with the PL coding in video transmission schemes.

3 Background

In this section, we briefly review LT and RCPC FEC codes that will be used at the AL and the PL, respectively, in our proposed cross-layer FEC scheme.

3.1 LT codes

Recently, a new class of FEC codes called rateless (Fountain) codes has been invented. LT codes [15] and Raptor codes [16] are examples of such codes. Unlike other FEC codes, such as LDPC codes [32], rateless codes can adapt to any erasure channel with unknown or varying characteristics as they do not impose any code rate constraint. Fountain codes are especially desirable for packet-level coding at the application layer, where the underlying channel can be modeled as a packet erasure channel.

LT codes can generate a limitless number of output symbols from $N_s$ input symbols based on a degree distribution $\{\Omega_1, \Omega_2, \dots, \Omega_{N_s}\}$, where $\Omega_i$ is the probability that an output symbol has degree $i$, and $\sum_{i=1}^{N_s} \Omega_i = 1$. This probability distribution can also be represented by its generator polynomial $\Omega(x) = \sum_{i=1}^{N_s} \Omega_i x^i$. In LT coding, first an output symbol degree $d$ is randomly chosen from $\Omega(\cdot)$. Next, $d$ input symbols are chosen uniformly at random from the $N_s$ input symbols and are bitwise XORed together to generate an output symbol. $\Omega(\cdot)$ is usually fine-tuned such that the $N_s$ input symbols can be decoded from any $\gamma_r N_s$ output symbols, for $\gamma_r$ slightly greater than 1. Here, $\gamma_r$ is the received coding overhead. LT decoding is performed iteratively. At each iteration, an output symbol is found such that the value of all but one of its neighboring input symbols is known. The value of the unknown input symbol is computed by a simple XOR. This step is applied iteratively until no more such output symbols can be found.
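The encoding and decoding steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: each symbol is a single integer rather than a 150-byte slice, `TOY_DIST` is a made-up toy degree distribution (it is not the distribution (1)), and all function names are hypothetical.

```python
import random

# Toy degree distribution {degree: probability}; illustrative only, not (1).
TOY_DIST = {1: 0.1, 2: 0.5, 3: 0.2, 4: 0.2}


def sample_degree(rng, dist):
    """Draw an output-symbol degree d from the degree distribution."""
    u, acc = rng.random(), 0.0
    for d in sorted(dist):
        acc += dist[d]
        if u < acc:
            return d
    return max(dist)  # guards against floating-point round-off


def lt_encode_symbol(rng, source, dist):
    """XOR d uniformly chosen input symbols into one output symbol."""
    d = min(sample_degree(rng, dist), len(source))
    neighbors = rng.sample(range(len(source)), d)
    value = 0
    for i in neighbors:
        value ^= source[i]
    return set(neighbors), value


def lt_decode(symbols, n):
    """Iterative (peeling) decoder; unrecovered inputs stay None."""
    decoded = [None] * n
    work = [[set(nb), val] for nb, val in symbols]  # private copies
    progress = True
    while progress:
        progress = False
        for sym in work:
            # Remove already-decoded neighbors from this output symbol.
            for i in [i for i in sym[0] if decoded[i] is not None]:
                sym[0].discard(i)
                sym[1] ^= decoded[i]
            # Exactly one unknown neighbor left: recover it by XOR.
            if len(sym[0]) == 1:
                i = sym[0].pop()
                decoded[i] = sym[1]
                progress = True
    return decoded


def receive_until_decoded(source, dist, seed=1, max_symbols=5000):
    """Collect output symbols until every input symbol is recovered."""
    rng = random.Random(seed)
    received = []
    while None in lt_decode(received, len(source)) and len(received) < max_symbols:
        received.append(lt_encode_symbol(rng, source, dist))
    return received
```

The helper `receive_until_decoded` mirrors the rateless property: the receiver simply keeps collecting output symbols until decoding succeeds, so the ratio of collected symbols to input symbols plays the role of $\gamma_r$.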

The Robust-Soliton degree distribution was designed by Luby for LT codes [15]. LT coding with the Robust-Soliton distribution results in asymptotically capacity-achieving codes with an encoding complexity of $O(N_s \log N_s)$. To reduce the coding complexity to linear (at the cost of a slight performance loss), new degree distributions for LT codes have been introduced, such as [16]

In this paper, we use (1) as the degree distribution of LT codes.

Interestingly, it has been shown that LT codes can easily provide the UEP property with a slight change in the encoding process. In [12, 13], the authors proposed UEP LT codes by modifying the source symbol selection from uniform to non-uniform. In UEP LT codes, the $N_s$ source symbols are partitioned into $r$ sets $s_1, s_2, \dots, s_r$ of sizes $\tau_1 N_s, \tau_2 N_s, \dots, \tau_r N_s$, such that $\sum_{j=1}^{r} \tau_j = 1$. Let $p_j$ be the probability that a source symbol from set $s_j$ is chosen to form an encoded symbol. Consequently, we define the protection level of the priority $i$ group as $k_i = p_i N_s$, where $\sum_{j=1}^{r} k_j \tau_j = 1$. Further, let $y_{l,j}$ be the probability that a source symbol in $s_j$ is not recovered after $l$ LT decoding iterations at the receiver. For $j = 1, \dots, r$ we have [12, 13]

where $y_{0,j} = 1$, $\beta(x) = \Omega'(x)/\Omega'(1)$, and $\delta_j(x) = e^{N_s p_j \Omega'(1) \gamma_r (x - 1)}$.

It can be shown that the sequences $\{y_{l,j}\}_l$, $\forall j$, converge to a fixed point $y_j$ [12, 13], where $y_j$ is the final decoding error rate of symbols in set $j \in \{1, 2, \dots, r\}$ for a UEP LT code with the parameters $\{\Omega(x), \gamma_r, \tau_1, \tau_2, \dots, \tau_r, p_1, p_2, \dots, p_r\}$. For EEP LT coding, we have $k_j = 1$, $j \in \{1, 2, \dots, r\}$; hence, $y_j = y$ for all $j \in \{1, 2, \dots, r\}$. Note that (2) has been derived from a tree-graph approximation of LT codes and provides the $y_j$'s for the asymptotic case ($N_s \to \infty$) [12, 13, 16].
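The fixed point $y_j$ can be found numerically by iterating the recursion from $y_{0,j} = 1$. The sketch below rests on two assumptions that should be checked against [12, 13, 16]: that (2) has the form $y_{l,j} = \delta_j\big(1 - \beta\big(1 - \sum_{m=1}^{r} p_m \tau_m N_s\, y_{l-1,m}\big)\big)$, which is consistent with the definitions of $\beta$ and $\delta_j$ given above, and that the degree distribution (1) is the linear-complexity distribution commonly quoted from [16]. The function names and the iteration count are hypothetical choices.

```python
import math

# Degree distribution widely quoted from the Raptor-code literature [16];
# ASSUMED here to be the distribution referred to as (1) in the text.
OMEGA = {1: 0.007969, 2: 0.493570, 3: 0.166220, 4: 0.072646, 5: 0.082558,
         8: 0.056058, 9: 0.037229, 19: 0.055590, 65: 0.025023, 66: 0.003135}


def omega_prime(x):
    """Derivative of the generator polynomial Omega(x)."""
    return sum(d * w * x ** (d - 1) for d, w in OMEGA.items())


def uep_lt_error_rates(k, tau, gamma_r, iterations=2000):
    """Iterate the assumed form of (2) and return [y_1, ..., y_r].

    k[j]   : protection level k_j = p_j * N_s (sum of k_j * tau_j must be 1)
    tau[j] : fraction of source symbols in priority set j
    gamma_r: received coding overhead
    """
    mu = omega_prime(1.0)            # Omega'(1), the average output degree
    y = [1.0] * len(k)               # y_{0,j} = 1
    for _ in range(iterations):
        # sum_m p_m * tau_m * N_s * y_m, written with k_m = p_m * N_s
        s = sum(kj * tj * yj for kj, tj, yj in zip(k, tau, y))
        beta = omega_prime(max(0.0, 1.0 - s)) / mu
        # delta_j(1 - beta) = exp(-k_j * Omega'(1) * gamma_r * beta)
        y = [math.exp(-kj * mu * gamma_r * beta) for kj in k]
    return y
```

With $k_1 = \dots = k_4 = 1$ the routine returns the same error rate for every class (EEP), while unequal $k_j$ yield a lower residual error rate for the classes with larger $k_j$.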

3.2 RCPC codes

We choose RCPC codes [14] due to their flexibility in providing various code rates. RCPC codes use a low-rate convolutional mother code and employ various puncturing patterns to obtain various code rates. The RCPC decoder employs a Viterbi decoder. The bit error rate $P_b$ of the Viterbi decoder is upper bounded by [14]

$$P_b \le \frac{1}{P} \sum_{d=d_f}^{\infty} c_d P_d, \qquad (3)$$

where $d_f$ is the free distance of the convolutional code, $P$ is the puncturing period, and $c_d$ is the total number of error bits produced by the incorrect paths and is known as the distance spectrum [14]. Finally, $P_d$ is the probability of selecting a wrong path in Viterbi decoding with Hamming distance $d$, which depends on the modulation and channel characteristics. For an RCPC code with rate $R$, using the additive white Gaussian noise (AWGN) channel, binary phase shift keying (BPSK) modulation, and the symbol-to-noise power ratio $E_s/N_0 = R\,E_b/N_0$, the value of $P_d$ (using soft-decision Viterbi decoding) is given by [14]

$$P_d = Q\left(\sqrt{2d\,\frac{E_s}{N_0}}\right), \qquad (4)$$

where $Q(\lambda) = \frac{1}{\sqrt{2\pi}} \int_{\lambda}^{\infty} e^{-a^2/2}\, da$.
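Once a distance spectrum is available, the bit error rate bound can be evaluated numerically. The sketch below assumes the standard forms from [14], namely $P_b \le \frac{1}{P} \sum_{d \ge d_f} c_d P_d$ with $P_d = Q\big(\sqrt{2 d E_s/N_0}\big)$ for BPSK over AWGN with soft decisions, and truncates the sum to the terms supplied. The function names are hypothetical, and any spectrum passed in must come from the tables in [14]; no such values are reproduced here.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), expressed through erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pairwise_error(d, es_n0_db):
    """P_d for Hamming distance d at the given Es/N0 in dB (assumed form of (4))."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return q_function(math.sqrt(2.0 * d * es_n0))


def ber_upper_bound(spectrum, period, es_n0_db):
    """Truncated bound on P_b (assumed form of (3)).

    spectrum: {d: c_d} for the distances d >= d_f that are included
    period  : puncturing period P
    """
    total = sum(c_d * pairwise_error(d, es_n0_db) for d, c_d in spectrum.items())
    return total / period
```

Because the sum is truncated, the returned value is an approximation of the bound that is tight only when the omitted terms are negligible, which is typically the case at moderate to high $E_s/N_0$.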

4 Cross-layer FEC coding for H.264 video bitstream

In this section, we discuss a priority assignment scheme for H.264 AVC video slices, the design of LT and RCPC codes, and our proposed cross-layer FEC scheme. We consider a unicast video transmission from a source node (at the transmitter) to a destination node (at the receiver) in a single-hop wireless network and ignore the intermediate network layers, i.e., transport layer (TL), network layer (NL), and link layer (LL). This allows our algorithm to be employed with different existing network protocol stacks.

4.1 Priority assignment for H.264 video slices

In H.264 AVC, the video frames are grouped into GOPs, and each GOP is encoded as a unit. For the sake of simplicity, we use a GOP length of 30 frames, which corresponds to a duration of 1 s. We encode each GOP independently by employing FEC codes. We use a fixed slice size configuration in which the macroblocks of a frame are aggregated to form slices of a fixed size. Let $N_s$ be the average number of slices in 1 s of the video. More details of the video encoding parameters are given in Section 6.

H.264 slices can be prioritized based on their distortion contribution to the received video quality [9, 33–37]. In this paper, the total distortion of a slice loss is computed using the cumulative mean square error (CMSE), which takes into consideration the error propagation within the entire GOP [9, 34]. Let the original uncompressed video frame at time $t$ be $f(t)$, the decoded frame without the slice loss be $\widehat{f}(t)$, and the decoded frame with the slice loss be $\tilde{f}(t)$. Assuming that each frame consists of $N \times M$ pixels, the MSE introduced by the loss of a slice in the video frame is computed by

The loss of a slice in a reference frame can also introduce error propagation in the current and subsequent frames until the end of GOP. The CMSE contributed by the loss of the slice is thus computed as the sum of MSE over the current and all the subsequent frames in the GOP. Note that computation of slice CMSE requires decoding of the entire GOP for every slice loss, which introduces computational overhead. This overhead can be avoided by predicting the slice CMSE using a low-complexity scheme recently proposed by us in [9]. This slice CMSE prediction scheme uses certain parameters from the current encoded frame alone without using the future frames in the GOP.
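The CMSE computation can be illustrated with a short Python sketch. It assumes that the per-frame MSE is the mean squared pixel difference between the frame decoded without the slice loss, $\widehat{f}(t)$, and the frame decoded with the slice loss, $\tilde{f}(t)$; frames are represented as nested lists of pixel values, and the function names are hypothetical.

```python
def frame_mse(decoded_ok, decoded_lost):
    """MSE between two frames of identical size (lists of pixel rows)."""
    n_rows = len(decoded_ok)
    n_cols = len(decoded_ok[0])
    squared_error = sum(
        (a - b) ** 2
        for row_ok, row_lost in zip(decoded_ok, decoded_lost)
        for a, b in zip(row_ok, row_lost)
    )
    return squared_error / (n_rows * n_cols)


def slice_cmse(frames_ok, frames_lost):
    """Sum the per-frame MSE from the frame with the lost slice to the end of the GOP.

    frames_ok  : frames decoded without the slice loss
    frames_lost: the same frames decoded with the slice loss
    """
    return sum(frame_mse(ok, lost) for ok, lost in zip(frames_ok, frames_lost))
```

For example, if a loss changes one pixel by 2 in the current 2×2 frame (MSE 1.0) and propagates as two pixels changed by 1 in the next frame (MSE 0.5), the slice CMSE is 1.5.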

We use the CMSE metric to determine the slice priority. All slices in a GOP are distributed into $r = 4$ priority classes of equal size based on their CMSE value. The priority 1 slices induce the highest distortion, whereas the priority 4 slices induce the least distortion in the received video quality. Note that using more than four slice priorities would result in a more accurate and flexible UEP coding at the cost of higher complexity due to a larger number of design parameters. In fact, using $N_s$ priority levels would achieve the best performance, since each slice would be separately protected based on its CMSE. On the other hand, using fewer than four priority levels would limit the flexibility of our scheme and hence decrease its performance.

Let $\text{CMSE}_i$ denote the average CMSE of all slices in priority class $i$. Therefore, we have $\text{CMSE}_1 > \text{CMSE}_2 > \text{CMSE}_3 > \text{CMSE}_4$. Since $\text{CMSE}_i$ may vary considerably for various videos depending on their content, we use the normalized $\text{CMSE}_i$, $\overline{\text{CMSE}}_i = \frac{\text{CMSE}_i}{\sum_{j=1}^{4} \text{CMSE}_j}$, to represent the relative importance of a priority class. We show $\overline{\text{CMSE}}_i$ for six H.264 test video sequences in Table 1. These video sequences have widely different spatial and temporal content.
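The priority assignment and normalization can be sketched as follows. The code sorts the slices of a GOP by CMSE, splits them into $r$ classes of equal size, and normalizes the class averages so that they sum to one. It is a hedged illustration only: the function names are hypothetical, and placing any leftover slices in the lowest-priority class when the slice count is not divisible by $r$ is an assumption, since the text does not address that case.

```python
def assign_priority_classes(cmse_values, r=4):
    """Return r lists of slice indices, highest-CMSE class first."""
    order = sorted(range(len(cmse_values)),
                   key=lambda i: cmse_values[i], reverse=True)
    size = len(order) // r
    classes = [order[c * size:(c + 1) * size] for c in range(r - 1)]
    classes.append(order[(r - 1) * size:])  # leftovers go to the last class
    return classes


def normalized_class_cmse(cmse_values, classes):
    """Average CMSE per class divided by the sum of the class averages."""
    averages = [sum(cmse_values[i] for i in cls) / len(cls) for cls in classes]
    total = sum(averages)
    return [avg / total for avg in averages]
```

With eight slices whose CMSE values are 8, 1, 6, 3, 7, 2, 5, 4, the class averages are 7.5, 5.5, 3.5, and 1.5, giving normalized values of roughly 0.42, 0.31, 0.19, and 0.08.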

Table 1 shows that the first five videos, which have very different characteristics (such as slow, moderate, and high motion), have similar $\overline{\text{CMSE}}_i$ values. We also observed similar $\overline{\text{CMSE}}_i$ values for other video sequences, such as Table Tennis and Mother Daughter. However, Akiyo, which is a static sequence, has different $\overline{\text{CMSE}}_i$ values from the other sequences. The $\overline{\text{CMSE}}_i$ values changed only slightly when these videos were encoded at different bit rates (i.e., 512 kbps and 1 Mbps) and slice sizes (150 to 900 bytes). When these videos are encoded at 840 kbps with 150-byte slices, we get $N_s \approx 700$.

We choose the $\overline{\text{CMSE}}_i$ values of Bus, which are similar to those of most other videos discussed above, to tune our proposed cross-layer scheme for all videos in Section 5. Since the $\overline{\text{CMSE}}_i$ values of Akiyo are different, we also study the performance of the proposed cross-layer FEC scheme for Akiyo by using its own $\overline{\text{CMSE}}_i$ values and compare it to the performance of the scheme designed using the $\overline{\text{CMSE}}_i$ values of Bus in Section 6.

4.2 Design of LT codes at the AL

The video slices may be either directly passed to the PL or encoded using an EEP/UEP LT code before being passed to the PL. Therefore, the AL frames contain either uncoded or LT-coded video slices. When no LT coding is performed at the AL, each video slice forms an AL frame and the $N_s$ AL frames are given to the lower network layers. When LT coding is performed at the AL, $\gamma_t N_s$ AL frames, containing LT-coded output symbols, are generated from the $N_s$ video slices, where $\gamma_t \ge 1$ denotes the LT coding overhead at the transmitter. Note that the size of each LT-coded AL frame is still 150 bytes, i.e., the same as the input video slice size, whereas the number of AL frames increases from $N_s$ to $\gamma_t N_s$. We emphasize that the transmitted LT overhead $\gamma_t$ should not be confused with the received LT coding overhead $\gamma_r$. Generally, $\gamma_r \ne \gamma_t$ since some AL frames may not be correctly delivered to the receiver due to channel-induced losses.

The parameters of the UEP LT code at the AL are $k_i$, $i \in \{1, \dots, 4\}$, and $\gamma_t$, which need to be optimized while considering the FEC at the PL in the cross-layer setup. Since all $r = 4$ priority levels have equal size, we have $\tau_1 = \tau_2 = \tau_3 = \tau_4 = \frac{1}{4}$ (see Section 3.1). For EEP/UEP LT coding, we use the standard degree distribution given by (1) [12, 13, 16].

When the UEP rateless codes designed in [12, 13] are used at the AL, all $\gamma_t N_s$ LT-coded symbols have equal importance. In other words, while more emphasis is given to the higher priority video slices than to the lower priority slices in generating each encoded symbol, the UEP property is embedded in all the encoded symbols equally. Therefore, when the UEP rateless codes designed in [12, 13] are used, only EEP FEC coding should be performed at the PL. On the other hand, when video slices are passed to the lower layers without AL FEC coding, UEP FEC coding can be performed at the PL based on the slice priority. Note, however, that the rateless codes discussed in [21, 25] are capable of generating encoded symbols with unequal importance.

4.3 Design of RCPC codes at the PL

At the PL, cyclic redundancy check (CRC) bits are added to each AL frame to detect any RCPC decoding errors. We use the industry-standard CRC-8 defined by the polynomial $1 + x^2 + x^4 + x^6 + x^7 + x^8$ [38]. Next, each AL frame is encoded using a UEP/EEP RCPC code. As mentioned earlier, we employ an RCPC code designed in [14] with a mother code rate of $R = \frac{1}{3}$ and a memory of $M = 6$. Based on the AL frame priority level, the RCPC codes may be punctured to get appropriate higher rates. For four priority groups of AL frames, we have $R_1 \le R_2 \le R_3 \le R_4$ and $R_i \in \left\{\frac{8}{8}, \frac{8}{9}, \frac{8}{10}, \frac{8}{12}, \frac{8}{14}, \frac{8}{16}, \frac{8}{18}, \frac{8}{20}, \frac{8}{22}, \frac{8}{24}\right\}$, where $R_i$ represents the RCPC code rate of priority $i$ AL frames. Therefore, the parameters that need to be tuned at the PL are $R_1$ through $R_4$. For EEP RCPC codes, we have $R_1 = R_2 = R_3 = R_4$. We refer to a frame encoded by the RCPC code as a PL frame.
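The CRC step can be illustrated with a bitwise CRC-8 over the generator polynomial given above, $x^8 + x^7 + x^6 + x^4 + x^2 + 1$, whose low-order eight coefficients form the byte 0xD5. The text does not state the register conventions, so the zero initial value, the absence of bit reflection, and the absence of a final XOR in this sketch are assumptions; the function name is hypothetical.

```python
CRC8_POLY = 0xD5  # x^8 + x^7 + x^6 + x^4 + x^2 + 1, leading x^8 term implicit


def crc8(data, poly=CRC8_POLY):
    """Bitwise CRC-8, MSB first, zero initial value, no final XOR (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc
```

Under these conventions, the receiver can validate a decoded frame by recomputing the CRC over the 150-byte payload followed by the received CRC byte: a result of zero indicates that no error was detected.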

For the sake of simplicity and without loss of generality, we assume that each transmitted packet contains one PL frame. Note that the number of PL frames in a packet does not affect the optimal cross-layer setup of FEC codes in our scheme. We use conventional BPSK modulation and a simple AWGN channel. Our model can be easily extended to more complex channel models by using an appropriate $P_d$ in (4) from [14]. To obtain the packet error rates at the PL on the receiver side, we first employ (3) and (4) to obtain the bit error rate of the received bitstream. Next, we employ a Monte Carlo method to obtain the packet error rate at the receiver: we perform numerical RCPC encoding and CRC calculations and simulate the transmission. Finally, we find the ratio of correctly received packets by averaging over $10^3$ packet transmissions in $10^3$ iterations.
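The Monte Carlo step can be illustrated with a deliberately simplified Python sketch. It assumes that residual bit errors after decoding are independent with a fixed bit error rate, which is a simplification: errors at the output of a Viterbi decoder are bursty, and the procedure described above simulates the actual RCPC encoding and CRC check instead. The function names, the trial count, and the seed are hypothetical.

```python
import random


def monte_carlo_per(bit_error_rate, frame_bits, trials=2000, seed=0):
    """Estimate the packet error rate by simulating independent bit errors."""
    rng = random.Random(seed)
    corrupted = 0
    for _ in range(trials):
        # A packet is lost as soon as one of its bits is in error.
        if any(rng.random() < bit_error_rate for _ in range(frame_bits)):
            corrupted += 1
    return corrupted / trials


def analytic_per(bit_error_rate, frame_bits):
    """Closed-form packet error rate under the same independence assumption."""
    return 1.0 - (1.0 - bit_error_rate) ** frame_bits
```

For a 151-byte frame (1208 bits) and a bit error rate of $10^{-4}$, the closed form gives a packet error rate of about 0.11, and the simulated estimate should fall close to that value.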

4.4 System model at transmitter

Based on our discussions so far, we can use four combinations of cross-layer FEC coding schemes at the AL and the PL (summarized in Table 2). Note that the FEC coding is necessary at the PL but optional at the AL. We illustrate the layout of cross-layer FEC schemes in Figure 1 for S-I and S-II schemes and in Figure 2 for S-III and S-IV schemes. The cross-layer optimization of these FEC schemes is discussed in Section 5.

In S-I and S-II, FEC coding is applied only at the PL. In S-I, the equal protection (i.e., EEP RCPC coding) is provided to all frames regardless of their importance. In S-II, the video slices are protected at the PL with various protection levels based on their priority by using the UEP RCPC coding. We expect this scheme to have a considerably improved performance compared to S-I. Note that the priority of each AL frame is conveyed to the PL by using the cross-layer communication. This setup represents the schemes proposed in [36, 39–45].

In S-III and S-IV, FEC coding is applied at both the AL and the PL in a cross-layer fashion. In the S-III scheme, we add FEC coding at the AL to the base S-I setup by using regular EEP LT codes. As we will see later, S-III cannot outperform S-I for all channel conditions since LT codes require extra coding overhead. However, this scheme has the ratelessness property, meaning that it can tolerate the loss of AL frames and still recover the original video slices after LT decoding. This is in contrast to S-I and S-II, where the corrupted frames are considered lost. This setup represents the cross-layer FEC schemes proposed in [5, 10, 11, 26–31, 46].

In the proposed S-IV scheme, we apply the UEP LT codes where different slices are protected according to their priority. This scheme benefits both from ratelessness and UEP property. We expect this scheme to achieve the best performance. When LT coding is applied at the AL, the rateless coded symbols are uniformly generated and all the encoded AL frames have equal importance. As a result, using UEP FEC coding at the PL would not be beneficial. This is why we have used EEP FEC coding at the PL in the cross-layer S-III and S-IV schemes.

4.5 Decoding at receiver

Let $\text{PER}_i$ denote the packet error rate of AL frames of priority $i$ at the receiver after RCPC decoding and before LT decoding at the AL. $\text{PER}_i$ can be computed using (3).

In the S-I and S-II schemes, each AL frame consists of an uncoded video slice (i.e., LT coding is not performed at the AL). Therefore, the video slice loss rate (VSLR) of slices in priority $i$ is $\text{VSLR}_i = \text{PER}_i$. In the S-III and S-IV schemes, on the other hand, LT decoding should also be performed, and the decoding error rate of the LT codes should be considered in $\text{VSLR}_i$. In the S-III and S-IV schemes, the EEP RCPC code is used at the PL; hence, we have $\text{PER}_1 = \text{PER}_2 = \text{PER}_3 = \text{PER}_4 = \text{PER}$. In this case, $\gamma_r N_s = \gamma_t N_s (1 - \text{PER})$ AL frames are received on average, so we employ (2) with $\gamma_r = \gamma_t (1 - \text{PER})$, the degree distribution (1), and a given set of $k_i$, $i \in \{1, \dots, 4\}$, to find the final LT decoding symbol error rates $y_i$, $i \in \{1, \dots, 4\}$, for each priority at the receiver (see Section 3.1). If the symbol decoding error rate of priority $i$ is $y_i$, then $\text{VSLR}_i = y_i$.
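As a quick numerical illustration of how the received overhead follows from the transmitted overhead, consider the worked example below. It relies on the definition in Section 3.1 that $\gamma_r N_s$ is the number of output symbols available to the LT decoder; the values of $\gamma_t$ and PER are made up for illustration and are not taken from the results tables.

```latex
\gamma_r N_s = \gamma_t N_s \,(1-\mathrm{PER})
\;\Longrightarrow\;
\gamma_r = \gamma_t \,(1-\mathrm{PER}),
\qquad
\gamma_t = 1.10,\ \mathrm{PER} = 0.05
\;\Longrightarrow\;
\gamma_r = 1.10 \times 0.95 = 1.045 .
```

With $N_s \approx 700$, this corresponds to 770 transmitted AL frames, of which roughly 730 reach the LT decoder.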

5 Cross-layer optimization of the proposed FEC schemes

In our cross-layer FEC schemes, we consider the following issues. First, the AL and PL FEC codes share the same available channel bandwidth to add their coding redundancy. As the channel $E_s/N_0$ increases, the RCPC code rate at the PL can be increased. Thus, more channel bandwidth becomes available for improving the LT coding at the AL. For low values of $E_s/N_0$, assigning a higher portion of the available redundancy to LT codes at the AL may not improve the delivered video quality since almost all PL frames would be corrupted during transmission. Therefore, a stronger (lower-rate) RCPC code should be used at the PL. This consumes a larger portion of the channel bandwidth, allowing only a weaker LT code at the AL. Second, UEP FEC may be used either at the AL or the PL. We study how using UEP relates to varying $E_s/N_0$ and the bandwidth portions assigned to each FEC code. Third, the optimal FEC code rates for one scheme in Table 2 may be substantially different from those for another scheme.

To find the optimal parameters for both the FEC schemes and the portion of channel bandwidth they share, we discuss below the cross-layer optimization for the four schemes given in Table 2.

5.1 Formulation of optimization problem

The goal of cross-layer optimization in our scheme is to deliver a video with the highest possible PSNR for a given channel bandwidth $C$ and SNR. Since computing the video PSNR requires decoding the video at the receiver, it is not feasible to use PSNR directly as the optimization metric due to its heavy computational complexity. The PSNR of a compressed video stream depends on several factors, including the video characteristics, bit rate, the percentage of lost slices, and their CMSE values [9, 34]. Therefore, we define a function ‘normalized F,’ denoted by $\overline{F}$, which represents the weighted distortion contributed by the slice loss rates and their corresponding normalized CMSE values, as

Here, we use a parameter $\alpha \ge 0$ that needs to be tuned so that $\overline{F}$ can correctly capture the behavior of PSNR. For a compressed video whose PSNR for error-free transmission is already known, minimizing $F$ results in minimizing the decrease in its PSNR. Selecting the optimal $\alpha$ is discussed in the next section.
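The role of $\alpha$ can be illustrated with a small Python sketch. It assumes that the objective has the form $\overline{F} = \sum_{i=1}^{r} \overline{\text{CMSE}}_i \cdot \text{VSLR}_i^{\alpha}$, i.e., that $\alpha$ acts as an exponent on the slice loss rates. This form is an assumption inferred from the surrounding description and should be checked against the original definition; the function name and the numerical inputs are hypothetical.

```python
def normalized_distortion(cmse_norm, vslr, alpha=1.0):
    """Weighted distortion F-bar under the ASSUMED form sum_i CMSE_i * VSLR_i**alpha.

    cmse_norm: normalized CMSE of each priority class (sums to 1)
    vslr     : video slice loss rate of each priority class
    alpha    : tuning exponent, alpha >= 0
    """
    return sum(c * v ** alpha for c, v in zip(cmse_norm, vslr))
```

Under this form, when all $\text{VSLR}_i$ are equal to a common value $v$ (the EEP case), $\overline{F}$ reduces to $v^{\alpha}$ because the normalized CMSE values sum to one, so the choice of $\alpha > 0$ does not change which parameter setting minimizes $\overline{F}$.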

To minimize $\overline{F}$, we tune the parameters of the FEC codes at the AL and the PL. In the S-I scheme, the optimization function finds the optimal RCPC code rate $R$ for a given channel data rate $C$ as

where $S + 1$ is the slice size $S = 150$ bytes plus 1 byte of CRC.
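The bandwidth budget behind this optimization can be checked with a few lines of Python. The sketch assumes that the constraint has the form $N (S + 1) \cdot 8 / R \le C$ for $N$ AL frames per second, and it ignores trellis termination bits and any packet headers. The function names are hypothetical, and selecting the strongest code that fits is only a feasibility check, not the full optimization.

```python
from fractions import Fraction

# RCPC rate family listed in Section 4.3.
RCPC_RATES = [Fraction(8, n) for n in (8, 9, 10, 12, 14, 16, 18, 20, 22, 24)]


def pl_bit_rate(num_al_frames, rate, slice_bytes=150, crc_bytes=1):
    """Channel bits per second needed for the AL frames of 1 s of video."""
    return num_al_frames * (slice_bytes + crc_bytes) * 8 / rate


def strongest_feasible_rate(num_al_frames, channel_bps):
    """Lowest RCPC rate (strongest code) that fits the channel, or None."""
    feasible = [r for r in RCPC_RATES
                if pl_bit_rate(num_al_frames, r) <= channel_bps]
    return min(feasible) if feasible else None


def max_lt_overhead(num_slices, rate, channel_bps):
    """Largest transmit overhead gamma_t that fits once the RCPC rate is fixed."""
    return Fraction(channel_bps) / pl_bit_rate(num_slices, rate)
```

With $N_s \approx 700$ and $C = 1.4$ Mbps, the strongest rate that fits without any AL coding is $\frac{8}{12}$, which agrees with the S-I code rate quoted in Section 5.3; at that rate, LT coding at the AL could use a transmit overhead of at most about 1.10.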

In S-II, the optimization parameters are $R_1$ through $R_4$, such that $R_1 \le R_2 \le R_3 \le R_4$. For this scheme, the optimization function can be written as

The optimization parameters for S-III are $\gamma_t$ and $R$. In S-III, we have $k_1 = k_2 = k_3 = k_4 = 1$ since EEP LT coding is used at the AL. The channel data rate is shared between the two FEC codes, and the split needs to be tuned by selecting an appropriate $\gamma_t$. The optimization function is

In S-IV, UEP LT codes are used and the optimization parameters are $k_1$ through $k_3$, along with $\gamma_t$ and $R$. Here, the value of $k_4$ can be determined from $k_1$ through $k_3$ since $\sum_{j=1}^{r} k_j \tau_j = 1$ (see Section 3.1). As a result, the optimization function is

The optimization of the LT code’s parameters involves employing (2) for various priority levels. Since (2) has a recursive form, it cannot be represented by a linear function. Furthermore, the concatenation of two FEC codes presents a non-linear optimization problem, which cannot be solved using linear programming techniques. Therefore, we use genetic algorithms (GAs) to perform the optimizations [47, 48]. Although GAs are computationally complex, they can give solutions that are close to the global optimum [47–49]. There are numerous implementations of GAs. We used the GA toolbox available in MATLAB [50]. We have provided a brief review of GAs in the Appendix.

5.2 Optimal value of α

In Table 1, the normalized CMSE values ($\overline{\text{CMSE}}_i$) of the video sequences, except Akiyo, were similar. Therefore, the optimal parameters computed for the Bus video would be almost optimal for the other four video sequences generated with the same encoding parameters. We therefore use the $\overline{\text{CMSE}}_i$ of the Bus video with a data rate of 840 kbps to perform our optimizations, followed by the Akiyo sequence. We implement our cross-layer FEC setup, including LT coding at the AL and RCPC coding at the PL, for S-I through S-IV (see Table 2) in the MATLAB environment.

We find the optimal value of $\alpha$ such that minimizing $\overline{F}$ maximizes the PSNR of the decoded video. For this, we perform the optimization to minimize $\overline{F}$ for various values of $\alpha$ and also compute the corresponding video PSNR. Note that the value of $\alpha$ has no effect on a cross-layer scheme with an EEP FEC code since all $\text{VSLR}_i$'s are equal in this case. Therefore, we perform our optimization for S-II, which is the simplest UEP FEC scheme. Table 3 reports the PSNR of the Bus video for three values of $\alpha$ and $E_s/N_0$ for $C = 1.4$ Mbps when $\overline{F}$ is minimized in S-II. The value of $\alpha$ that concurrently maximizes the PSNR of the video for all values of $E_s/N_0$ is $\alpha = 1$. Although not shown in Table 3, non-integer values of $\alpha$ and $\alpha < 1$ were also considered in the optimization. $\alpha = 1$ also gave the best results for Akiyo.

5.3 Discussion of cross-layer optimization results

We report the cross-layer optimization results, including the FEC parameters (e.g., R_{i}, γ_{t}, and k_{i}), VSLR_{i}, normalized \overline{F}, and non-normalized F for the {\overline{\text{CMSE}}}_{i} values of the Bus video. Note that F is calculated by replacing the {\overline{\text{CMSE}}}_{i} by the actual average CMSE_{i} for the video sequence under consideration. The results of all four FEC schemes for three video sequences (Bus, Foreman, and Coastguard) are reported in Tables 4, 5, 6, and 7 for channel bit rate C=1.4 Mbps. The results for Akiyo are discussed in Section 6.

From Tables 4 and 5, we observe that the use of UEP RCPC coding at the PL in the S-II scheme achieves much better performance (i.e., lower F_{Bus}) than the use of EEP RCPC coding in the S-I scheme. Neither scheme uses FEC coding at the AL.

Since the RCPC code rate of \frac{8}{12} at the PL is not strong enough for \frac{{E}_{s}}{{N}_{0}}\le 2 dB, the value of F_{Bus} in the S-I scheme is high (F_{Bus}>300 in Table 4) because many packets are corrupted by channel errors. For successful LT decoding, the number of error-free packets received must be above a threshold, which is not reached when so many packets are corrupted. As a result, the S-III scheme (which also uses RCPC with the same code rate as in S-I) performs worse (higher value of F_{Bus}) than S-I for \frac{{E}_{s}}{{N}_{0}}\le 2 dB (see Tables 4 and 6). However, the S-III scheme achieves much better performance (F_{Bus}<10) than S-I for \frac{{E}_{s}}{{N}_{0}}\ge 2.5 dB because fewer packets are now corrupted at the PL and the LT coding becomes effective.

From Tables 6 and 7, we observe that the proposed S-IV scheme achieves much lower values of F_{Bus} than S-III at all values of \frac{{E}_{s}}{{N}_{0}}. This demonstrates that using UEP LT codes at the AL along with EEP RCPC codes at the PL performs far better than using EEP codes at both layers.

From Table 7 for the S-IV scheme, we observe an interesting trade-off between the code rates assigned to the FEC codes at the AL and the PL. For lower values of \frac{{E}_{s}}{{N}_{0}}, a larger portion of the bit budget is assigned to RCPC codes at the PL rather than LT codes at the AL because the LT coding cannot be effective when a large number of packets are corrupted due to channel errors. Furthermore, a stronger UEP (i.e., a higher value of k_{i} for higher priority video slices) is provided at the AL. For higher values of \frac{{E}_{s}}{{N}_{0}}, the RCPC code rate is relatively high and more of the bit budget is assigned to LT codes at the AL. Also, the UEP (i.e., value of k_{i}) at the AL is relatively less strong now.
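The trade-off above can be made concrete with a back-of-the-envelope budget calculation. The following Python sketch is an illustration under simplifying assumptions, not the authors' model: the function name `al_expansion_budget` is hypothetical, packet headers and CRC bits are ignored, and the returned ratio of AL-coded bits to source bits is not necessarily defined the same way as γ_{t} in the paper. The code rates other than \frac{8}{12} are illustrative values only.

```python
from fractions import Fraction


def al_expansion_budget(channel_kbps, rcpc_rate, source_kbps):
    """Largest ratio of AL-coded bits to source bits that fits the channel.

    channel_kbps: channel bit rate C (e.g., 1400 for 1.4 Mbps).
    rcpc_rate:    EEP RCPC code rate R as a Fraction (e.g., Fraction(8, 12)).
    source_kbps:  encoded video rate (e.g., 840).
    Headers, CRC bits, and packetization overhead are ignored (assumption).
    """
    # Bits per second left for the AL output after PL coding.
    payload_kbps = Fraction(channel_kbps) * rcpc_rate
    return payload_kbps / source_kbps


# Stronger PL codes (lower R) leave less room for LT redundancy at the AL.
for rate in (Fraction(8, 10), Fraction(8, 12), Fraction(8, 14)):
    ratio = al_expansion_budget(1400, rate, 840)
    print(f"R = {rate}: AL expansion ratio <= {float(ratio):.3f}")
```

For C=1.4 Mbps, R=\frac{8}{12}, and 840 kbps video, the ratio is 10/9, so roughly 11% of the source rate would remain for LT redundancy under these assumptions. A ratio below 1 indicates that the source stream alone would not fit at that code rate.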

Overall, the proposed S-IV scheme achieves the best performance at different channel SNRs, followed by the S-II scheme for \frac{{E}_{s}}{{N}_{0}}\le 2.5 dB. S-III outperforms S-II at higher channel SNRs. We observe similar results for the Foreman and Coastguard videos. Therefore, we can generally conclude that it is optimal to provide UEP at the AL and EEP at the PL using a cross-layer design.

Note that the optimization is performed only once for a given set of {\overline{\text{CMSE}}}_{i} values, a GOP structure, and a set of channel SNRs, and need not be run separately for each GOP. The same set of optimized parameters can be used for any video stream with similar properties. Further, we should note that similar performance improvement is also observed for the 1.8-Mbps channel bit rate.

6 Performance evaluation of FEC schemes for test videos

In this section, we evaluate the performance of our optimized cross-layer FEC schemes for four CIF (352×288 pixels) video sequences: Bus, Foreman, Coastguard, and Akiyo. These sequences were encoded using H.264/AVC JM 14.2 reference software [51] at 840 kbps with a slice size of 150 bytes, for a GOP length of 30 frames with GOP structure IDRBPB…PB at 30 frames/s. The slices were formed using dispersed-mode FMO with two slice groups per frame. Two reference frames were used for predicting the P and B frames, with error concealment enabled using temporal concealment and spatial interpolation. We used a channel transmission rate of C=1.4 Mbps to study the performance over AWGN channels.

We used the slice loss rates reported in Tables 4 through 7 to evaluate the average PSNR of three video sequences (Bus, Foreman, and Coastguard) in Figures 3, 4, and 5. These figures confirm that our proposed cross-layer S-IV scheme, with UEP FEC coding at the AL and EEP FEC coding at the PL, achieves considerable improvement in average video PSNR, especially at low values of \frac{{E}_{s}}{{N}_{0}}. It outperforms the S-II scheme, which uses UEP RCPC code at the PL, by about 2 to 7 dB for \frac{{E}_{s}}{{N}_{0}}\le 3.5 dB. Only S-III has a comparable performance at \frac{{E}_{s}}{{N}_{0}}\ge 2.5 dB. However, at low values of \frac{{E}_{s}}{{N}_{0}}, the S-IV scheme considerably outperforms S-III.
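To put the PSNR gains quoted above in perspective, the standard PSNR definition for 8-bit video can be sketched in Python as follows. This is a generic illustration, not the authors' evaluation code: the helper names are hypothetical, and whether the paper averages PSNR per frame or derives it from an average MSE is not assumed here.

```python
import math


def psnr_db(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit samples assumed)."""
    if mse <= 0:
        raise ValueError("MSE must be positive for a finite PSNR")
    return 10.0 * math.log10(peak * peak / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when the PSNR rises by gain_db."""
    return 10.0 ** (gain_db / 10.0)


# MSE reduction factors corresponding to gains of 2 dB and 7 dB.
low_gain = mse_ratio_for_gain(2.0)
high_gain = mse_ratio_for_gain(7.0)
```

Under this definition, a 2 dB gain corresponds to the MSE shrinking by a factor of about 1.6, and a 7 dB gain to a factor of about 5.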

Although our cross-layer FEC parameters were optimized for the Bus sequence, the average PSNR performance is similar for the other two test video sequences, i.e., Foreman and Coastguard. As mentioned earlier, both sequences have characteristics that differ from those of the Bus sequence.

Since Akiyo has considerably different values of {\overline{\text{CMSE}}}_{i}, the proposed S-IV scheme designed by using the Bus video’s {\overline{\text{CMSE}}}_{i} values would be suboptimal for Akiyo. In order to study the effect of these CMSE variations, we also designed the S-IV scheme by using the {\overline{\text{CMSE}}}_{i} values of Akiyo and compared its performance with its suboptimal version. The optimization results are reported in Table 8. In this table, we also included the suboptimal values of F_{sub} and PSNR_{sub}, which were obtained by using the optimized parameters of the Bus video from Table 7. The values of PSNR_{opt} and PSNR_{sub} are also shown in Figure 6.

In Table 8 (for the optimal scheme) and Table 7 (for the suboptimal scheme), the LT code overhead (i.e., γ_{t}) and RCPC code strength (R) are the same for both schemes, whereas the values of the LT code protection level k_{i} for each priority class vary slightly (e.g., k_{1} is higher for the optimal scheme compared to the suboptimal scheme). Similarly, the values of VSLR_{i} for higher priority slices (which have the most impact on F and PSNR) are similar in both tables, except for channel SNRs of 2.25, 2.5, and 2.75 dB in the decreasing order of the difference in values. The maximum PSNR degradation of the suboptimal scheme compared to the optimal scheme is 1.7 dB at the channel SNR of 2.25 dB, with only about 0.1 to 0.3 dB PSNR degradation at other channel SNRs. We can, therefore, conclude that the performance of the proposed cross-layer FEC scheme is not very sensitive to the precise values of normalized CMSE.

7 Conclusion

Previously, EEP and UEP FEC coding schemes have been used for video transmission over lossy channels. However, the joint optimization of cross-layer UEP FEC codes at the AL and the PL for robust video transmission has never been considered. In this paper, we used UEP LT coding at the AL and RCPC coding at the PL for robust H.264 video transmission over wireless channels. H.264 video slices were prioritized based on their contribution to video quality. We performed cross-layer optimization to concurrently tune the FEC code parameters at both layers, to minimize the video distortion, and to maximize the PSNR. We observed that our cross-layer FEC scheme outperformed other FEC schemes that use either UEP coding at the PL alone or EEP FEC schemes at the AL as well as the PL. Further, we showed that our optimization works well for different H.264-encoded video sequences, which have widely different characteristics.

Appendix

Introduction to genetic algorithms

J. Holland in [47] showed how the evolutionary process can be applied to solve a wide variety of problems using a parallel technique that is now called genetic algorithms [48]. Non-linear and complicated optimization problems which cannot be solved employing conventional optimization algorithms such as linear programming can be effectively solved using genetic algorithms. Let W and \overline{w}=\{{w}_{1},{w}_{2},\dots ,{w}_{k}\} denote the decision space and k decision variables, respectively. Let F\left(\overline{w}\right) denote the objective function that we need to optimize (minimize/maximize). In conventional genetic algorithms, each w_{i} is translated to a binary format. The steps to find the optimum answer are as follows:

1. Generate a random initial population of i members \overline{{w}_{j}}, j\in \{1,2,\dots ,i\}, each consisting of k decision variables.

2. Translate the generated population from real numbers to a binary format considering the desired precision.

3. Concatenate the translated versions of the k decision variables together to generate i binary population members.

4. Evaluate the i fitness values F(\overline{{w}_{j}}), j\in \{1,2,\dots ,i\}, of the current population.

5. Select two parents randomly, assigning a higher probability of selection to the parents with a better fitness value.

6. Perform crossover and mutation [47] on the parents to generate two offspring. For crossover, cut the two parents at a random location and exchange the second parts to generate the offspring. For mutation, with a small probability, flip a random bit in the offspring’s bit streams.

7. Go to step 5 until i−2 offspring are generated.

8. Keep the two parents with the best fitness values and replace the remaining i−2 members with the new offspring.

9. If the maximum number of iterations has not been reached, go to step 4; otherwise, translate the member of the population with the best fitness value from binary to real format and report it as the final answer.
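The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the Matlab GA toolbox used for the results in this paper: the function names, the rank-based selection weights, the default parameter values, and the toy objective `sphere` are all hypothetical choices, and the random population is drawn directly in binary form, which merges steps 1 to 3.

```python
import random


def decode(bits, k, n_bits, lo, hi):
    """Map a concatenated bit list back to k real values in [lo, hi]."""
    values = []
    for v in range(k):
        chunk = bits[v * n_bits:(v + 1) * n_bits]
        as_int = int("".join(map(str, chunk)), 2)
        values.append(lo + (hi - lo) * as_int / (2 ** n_bits - 1))
    return values


def genetic_minimize(objective, k, lo, hi, pop_size=20, n_bits=12,
                     p_mut=0.05, iterations=100, seed=0):
    """Binary-coded GA with elitism; returns the best decision vector."""
    rng = random.Random(seed)
    length = k * n_bits

    def cost(bits):
        return objective(decode(bits, k, n_bits, lo, hi))

    # Steps 1-3: random population stored as concatenated bit lists.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(iterations):
        # Step 4: evaluate fitness; a lower objective value is better.
        pop.sort(key=cost)
        # Step 5: rank-based weights favor fitter parents (assumed scheme).
        weights = [pop_size - rank for rank in range(pop_size)]
        children = []
        while len(children) < pop_size - 2:            # step 7
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, length)             # step 6: crossover point
            for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
                if rng.random() < p_mut:               # step 6: mutation
                    child[rng.randrange(length)] ^= 1
                children.append(child)
        # Step 8: keep the two best members, replace the rest.
        pop = pop[:2] + children[:pop_size - 2]
    # Step 9: report the best member in real-valued form.
    return decode(min(pop, key=cost), k, n_bits, lo, hi)


def sphere(w):
    """Toy objective with its minimum at w = (0.3, 0.3)."""
    return sum((x - 0.3) ** 2 for x in w)


best = genetic_minimize(sphere, k=2, lo=0.0, hi=1.0)
```

Because the two best members always survive (step 8), the best objective value never gets worse from one iteration to the next. For the cross-layer problem, `objective` would be replaced by a function that evaluates \overline{F} for a candidate parameter vector.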

The above algorithm is an overall view of conventional genetic algorithms. However, many variations have been proposed since genetic algorithms were first introduced. For instance, the translation from real to binary and vice versa is no longer performed, and the algorithm, including crossover and mutation, operates directly on real numbers. A more detailed explanation of genetic algorithms is beyond the scope of this paper. For performance evaluations of genetic algorithm methods, we refer interested readers to [48, 52] and the numerous available surveys.

References

Wiegand T, Sullivan GJ, Bjøntegaard G, Luthra A: Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol 2003, 13(7):560-576.

Wiegand T, Sullivan G: ITU-T, JVT, T Rec: H.264/IEC 14496-10 AVC 2003 Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification. International Telecommunication Union, Geneva; 2002.

Stockhammer T, Shokrollahi A, Watson M, Luby M, Gasiba T: Application layer forward error correction for mobile multimedia broadcasting. In Handbook of Mobile Broadcasting: DVB-H, DMB, ISDB-T and MEDIAFLO. Taylor & Francis, Boca Raton; 2008:239-280.

van der Schaar M, Shankar S: Cross-layer wireless multimedia transmission: challenges, principles, and new paradigms. IEEE Wireless Commun 2005, 12(4):50-58. 10.1109/MWC.2005.1497858

Setton E, Yoo T, Zhu X, Goldsmith A, Girod B: Cross-layer design of ad hoc networks for real-time video streaming. IEEE Wireless Commun 2005, 12(4):59-65. 10.1109/MWC.2005.1497859

Talari A, Rahnavard N: Unequal error protection rateless coding for efficient MPEG video transmission. In IEEE Military Communications Conference. IEEE, Piscataway; 2009:1-7.

Courtade T, Wesel R: A cross-layer perspective on rateless coding for wireless channels. In IEEE International Conference on Communications, ICC. IEEE, Piscataway; 2009:1-6.

Ahmad S, Hamzaoui R, Al-Akaidi M: Adaptive unicast video streaming with rateless codes and feedback. IEEE Trans. Circuits Syst. Video Technol 2010, 20(2):275-285.

Cataldi P, Grangetto M, Tillo T, Magli E, Olmo G: Sliding-window raptor codes for efficient scalable wireless video broadcasting with unequal loss protection. IEEE Trans. Image Process 2010, 19(6):1491-1503.

Kushwaha H, Xing Y, Chandramouli R, Heffes H: Reliable multimedia transmission over cognitive radio networks using fountain codes. Proc. IEEE 2008, 96: 155-165.

Vukobratovic D, Stankovic V, Sejdinovic D, Stankovic L, Xiong Z: Expanding Window Fountain codes for scalable video multicast. In IEEE International Conference on Multimedia and Expo. IEEE, Piscataway; 2008:77-80.

Tan AS, Aksay A, Bilen C, Akar GB, Arikan E: Rate-distortion optimized layered stereoscopic video streaming with raptor codes. In Packet Video. IEEE, Piscataway; 2007:98-104.

Jenkac H, Stockhammer T: Asynchronous media streaming over wireless broadcast channels. In IEEE International Conference on Multimedia and Expo. IEEE, Piscataway; 2005:1318-1321.

Luby M, Watson M, Gasiba T, Stockhammer T: Mobile data broadcasting over MBMS tradeoffs in forward error correction. In Proceedings of the 5th International Conference on Mobile and Ubiquitous Multimedia. ACM, New York; 2006:10-10.

Stockhammer T, Liebl G: On practical crosslayer aspects in 3GPP video services. In Proceedings of the International Workshop on Workshop on Mobile Video. ACM, New York; 2007:7-12.

Munaretto D, Jurca D, Widmer J: Broadcast video streaming in cellular networks: an adaptation framework for channel, video and AL-FEC rates allocation. In 2010 The 5th Annual ICST Wireless Internet Conference (WICON). IEEE, Piscataway; 2010:1-9.

Thomos N, Argyropoulos S, Boulgouris N, Strintzis M: Robust transmission of H.264/AVC streams using adaptive group slicing and unequal error protection. EURASIP J. Appl. Signal Process 2006, 2006: 120-120.

Argyropoulos S, Tan A, Thomos N, Arikan E, Strintzis M: Robust transmission of multi-view video streams using flexible macroblock ordering and systematic LT codes. In 3DTV Conference, 2007. IEEE, Piscataway; 2007:1-4.

Koopman P, Chakravarty T: Cyclic redundancy code (CRC) polynomial selection for embedded networks. In International Conference on Dependable Systems and Networks. IEEE, Piscataway; 2004:145-154.

Maani E, Katsaggelos A: Unequal error protection for robust streaming of scalable video over packet lossy networks. IEEE Trans. Circuits Syst. Video Technol 2010, 20(3):407-416.

Ha H, Yim C: Layer-weighted unequal error protection for scalable video coding extension of H.264/AVC. IEEE Trans. Consum. Electron 2008, 54(2):736-744.

Liu Y, Yu S: Adaptive unequal loss protection for scalable video streaming over IP networks. IEEE Trans. Consum. Electron 2005, 51(4):1277-1282. 10.1109/TCE.2005.1561856

Shi Y, Wu C, Du J: A novel unequal loss protection approach for scalable video streaming over wireless networks. IEEE Trans. Consum. Electron 2007, 53(2):363-368.

Zhang XJ, Peng XH, Haywood R, Porter T: Robust video transmission over lossy network by exploiting H.264/AVC data partitioning. In 5th International Conference on Broadband Communications, Networks and Systems, BROADNETS. IEEE, Piscataway; 2008:307-314.

Gasiba T, Xu W, Stockhammer T: Enhanced system design for download and streaming services using Raptor codes. European Trans. Telecommun 2009, 20(2):159-173. 10.1002/ett.1275

Deb K, Pratap A, Agarwal S, Meyarivan T: A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput 2002, 6(2):182-197. 10.1109/4235.996017

This material is based upon work supported by the National Science Foundation under grants ECCS-1056065 and CCF-0915994, and by the Air Force Research Laboratory under award FA8750-11-1-0048.

Disclaimer

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US Air Force Research Laboratory.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK, 74078, USA

Ali Talari & Nazanin Rahnavard

Department of Electrical and Computer Engineering, San Diego State University, San Diego, CA, 92182, USA

Sunil Kumar & Seethal Paluri

Air Force Research Laboratory, Rome, NY, 13441, USA

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Talari, A., Kumar, S., Rahnavard, N. et al. Optimized cross-layer forward error correction coding for H.264 AVC video transmission over wireless channels.
J Wireless Com Network 2013, 206 (2013). https://doi.org/10.1186/1687-1499-2013-206