Optimized cross-layer forward error correction coding for H.264 AVC video transmission over wireless channels

Forward error correction (FEC) codes that can provide unequal error protection (UEP) have been used recently for video transmission over wireless channels. These video transmission schemes may also benefit from the use of FEC codes both at the application layer (AL) and the physical layer (PL). However, the interaction and optimal setup of UEP FEC codes at the AL and the PL have not been previously investigated. In this paper, we study the cross-layer design of FEC codes at both layers for H.264 video transmission over wireless channels. In our scheme, UEP Luby transform codes are employed at the AL and rate-compatible punctured convolutional codes at the PL. In the proposed scheme, video slices are first prioritized based on their contribution to video quality. Next, we investigate the four combinations of cross-layer FEC schemes at both layers and concurrently optimize their parameters to minimize the video distortion and maximize the peak signal-to-noise ratio. We evaluate the performance of these schemes on four test H.264 video streams and show the superiority of optimized cross-layer FEC design.


*Correspondence: ali.talari@okstate.edu. Department of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078, USA. Full list of author information is available at the end of the article.

Introduction
Multimedia applications such as video streaming, which are delay sensitive and bandwidth intensive, are growing rapidly over wireless networks. However, existing wireless networks provide only limited bandwidth and time-varying quality of service (QoS) support for these applications. Due to limited wireless bandwidth, the video is compressed using sophisticated compression techniques such as H.264 AVC, which is the state-of-the-art video compression standard jointly developed by the ITU and ISO [1]. The compressed video is vulnerable to channel impairments as the corrupted packets induce different levels of quality degradation due to temporal and spatial dependencies in the compressed bitstream. The most important problem that affects video quality is error propagation, where an error in a reference frame is propagated by the decoder to all future reconstructed frames that are predicted from the corrupted reference frame. This problem has led to the design of error-resiliency features, such as flexible macroblock ordering (FMO) [2], data partitioning, and error concealment schemes in H.264 [1,3,4]. Recent research has demonstrated the promise of cross-layer protocols for supporting the QoS demands of multimedia applications over wireless networks [5-7]. For example, van der Schaar and Shankar [6] showed the benefits of the joint APP-MAC-PHY approach for transmitting video over wireless networks.
Forward error correction (FEC) schemes are used to protect the video data against channel errors in order to improve the successful data transmission probability and to eliminate costly retransmissions. However, the maximum throughput does not guarantee the minimum video distortion at the receiver for the following reasons. First, unlike data packets, loss of H.264 compressed video slices induces different amounts of distortion in the received video. Therefore, the FEC code rates should be adaptive to the slice priority. Second, video data are delay sensitive; therefore, the retransmission of corrupted slices may not be feasible. Third, a video stream can tolerate the loss of some slices because the lost slices can be error-concealed. This is true especially for the low-priority slices, which introduce low distortion in the received video and result in graceful quality degradation. In this paper, we consider H.264 AVC streams with fixed slice sizes, where each slice can be independently decoded. The video slices are classified into four priority classes based on the distortion contributed by their loss to the received video quality.

http://jwcn.eurasipjournals.com/content/2013/1/206
An FEC code that provides unequal error protection (UEP), i.e., a higher (lower) protection to high (low) priority video slices, can achieve considerable quality improvement compared to the equal error protection (EEP) FEC codes [8,9]. Note that the UEP FEC codes may be employed both at the application layer (AL) and physical layer (PL). Recently, some schemes [5,10,11] have considered the precise tuning of EEP FEC schemes at the AL and the PL. However, to the best of our knowledge, existing schemes have not investigated the cross-layer design of UEP FEC codes at the AL and the PL for prioritized video transmission. Employing FEC codes at both layers introduces two interesting trade-offs that we investigate in this paper. First, both FEC codes share a common channel bandwidth to add their redundancy and the optimal ratio of overhead added by each needs to be determined for a given channel signal-to-noise ratio (SNR) and bandwidth. Second, since UEP can be provided at both layers, we need to find the optimal UEP/EEP FEC setup to maximize the video peak SNR (PSNR). To tackle these trade-offs, we concurrently tune the parameters of two FEC codes at both layers.
We use UEP Luby transform (LT) codes [12,13] at the AL and rate-compatible punctured convolutional (RCPC) codes [14] at the PL. LT codes [15] are modern and efficient FEC codes that are specifically suitable for packet-level coding at the AL. These codes are rateless [12,13,15,16] in the sense that they can generate an unlimited number of encoded symbols from a finite-length source.
Next, we carry out a cross-layer optimization to find the optimal parameters of both FEC codes by considering the relative priorities of video packets. For a known channel SNR (i.e., $E_s/N_0$), we address the problem of assigning optimal FEC code rates at the AL and the PL to the individual priority slices within the channel bit-rate limitations. The information about the channel conditions can be obtained from the receiver in the form of channel side information [5-7,17,18].
The scheme provides higher transmission reliability to high-priority slices at the expense of the higher loss rates for low-priority slices and, whenever necessary, also discards some low-priority slices to meet the channel bit-rate limitations. We show that adapting the FEC code rates to the slice priority reduces the overall expected video distortion at the receiver. Our scheme does not assume retransmission of lost slices. The preliminary results of this paper appeared in [8].
This paper is organized as follows: Section 2 provides an overview of the related work on FEC coding for video streams. Section 3 provides a brief background on the LT and RCPC FEC codes. Section 4 describes the video slice priority assignment, design of LT and RCPC codes, and cross-layer FEC schemes. Section 5 presents the cross-layer optimization and performance of the proposed FEC schemes. The simulation results of the proposed cross-layer FEC schemes on sample H.264 videos are presented in Section 6, followed by conclusions in Section 7.

Related work
LT codes have recently become popular in video transmission schemes due to their good performance and low complexity [15]. Kushwaha et al. [19] used LT codes to encode the group of pictures (GOP) of each layer of H.264 SVC video for transmission over cognitive radio wireless networks. Ahmad et al. [17] took advantage of the ratelessness of LT codes and proposed an adaptive FEC scheme for video transmission over the Internet by employing feedback from receivers in the form of acknowledgements. Cataldi et al. [18] proposed a novel LT code, called sliding-window Raptor codes, with a higher efficiency than regular LT codes. They used these codes to provide UEP for a two-layer H.264 SVC scalable video. LT codes were also used in [20-25] to design streaming schemes with lower complexity.
Stockhammer et al. [5] defined the protocol stack, including the FEC coding at the AL and the PL, for the multimedia broadcast multicast service (MBMS) download and streaming in universal mobile telecommunication system (UMTS). In [5], a Raptor code [16] is used at the AL and a turbo code at the PL. Gomez-Barquero and Bria [10] suggested employing the Raptor codes as the AL FEC in DVB-H systems for mobile terminals and demonstrated its advantages over conventional multiprotocol encapsulation (MPE) FEC. Conventional MPE FEC employs the Reed-Solomon codes to encode the video stream; hence, it lacks the flexibility of LT coding at the AL. Courtade and Wesel [11] considered a setup with LT coding at the AL and turbo coding at the PL, and showed that the available channel bandwidth should be optimally split between the AL and PL FEC codes to improve the system performance.
Luby et al. [26] also considered employing two layers of EEP FEC at the AL and the PL for MBMS download delivery in UMTS. They investigated the trade-off between the AL FEC and PL FEC codes, and studied the advantages of the AL FEC on the system performance. Stockhammer and Liebl [27] used the Raptor codes at the AL in 3GPP streaming applications. They investigated how the AL FEC coding may guarantee the ratio of satisfied users who are receiving the video stream. Afzal et al. [28] investigated the overall system performance when the AL FEC codes are used in video streaming in UMTS and packet radio services. Alexiou et al. [29] studied the power control of streaming over high-speed downlink packet access systems when the AL FEC is employed. Munaretto et al. [30] proposed an interesting optimization of the AL FEC coding, video source coding, and the PL rate selection to improve the PSNR of delivered video on cellular networks. The authors in [31] also considered employing the Raptor codes at the AL to improve the quality of service for video in MBMS in long-term evolution (LTE) networks. They investigated the benefits of the AL FEC to multicast multimedia contents and examined how much FEC redundancy should be used under different packet loss patterns.
In [8], we investigated UEP rateless coding at the AL and assumed an ideal PL coding. We found the optimal parameters of a UEP rateless code that maximizes the video quality at the receiver for known channel bandwidth. In this paper, we extend the results of [8] and consider the interaction of the AL coding with the PL coding in video transmission schemes.

Background
In this section, we briefly review LT and RCPC FEC codes that will be used at the AL and the PL, respectively, in our proposed cross-layer FEC scheme.

LT codes
Recently, a new class of FEC codes called rateless (Fountain) codes has been invented. LT codes [15] and Raptor codes [16] are examples of such codes. Unlike other FEC codes, such as LDPC codes [32], rateless codes can adapt to any erasure channel with unknown or varying characteristics as they do not impose any code rate constraint. Fountain codes are especially desirable for packet-level coding at the application layer, where the underlying channel can be modeled as a packet erasure channel.
LT codes can generate a limitless number of output symbols from $N_s$ input symbols based on a degree distribution $\{\Omega_1, \Omega_2, \ldots, \Omega_{N_s}\}$, where $\Omega_i$ is the probability that an output symbol has degree $i$, and $\sum_{i=1}^{N_s} \Omega_i = 1$. This probability distribution can also be represented by its generator polynomial $\Omega(x) = \sum_{i=1}^{N_s} \Omega_i x^i$. In LT coding, first an output symbol degree $d$ is randomly chosen from $\Omega(\cdot)$. Next, $d$ input symbols are chosen uniformly at random from the $N_s$ input symbols and are bit-wise XORed together to generate an output symbol. $\Omega(\cdot)$ is usually fine-tuned such that the $N_s$ input symbols can be decoded from any $\gamma_r N_s$ output symbols, for $\gamma_r$ slightly greater than 1. Here, $\gamma_r$ is the received coding overhead. LT decoding is performed iteratively. At each iteration, an output symbol is found such that the value of all but one of its neighboring input symbols is known. The value of the unknown input symbol is computed by a simple XOR. This step is applied iteratively until no more such output symbols can be found.
The Robust-Soliton degree distribution was designed by Luby for LT codes [15]. LT coding with the Robust-Soliton distribution results in asymptotically capacity-achieving codes with an encoding complexity of $O(N_s \log N_s)$. To reduce the coding complexity to linear (at the cost of a slight performance loss), new degree distributions for LT codes have been introduced, such as [16]

$$\Omega(x) = 0.00797x + 0.49357x^{2} + 0.16622x^{3} + 0.07265x^{4} + 0.08256x^{5} + 0.05606x^{8} + 0.03723x^{9} + 0.05559x^{19} + 0.02502x^{65} + 0.00314x^{66}. \tag{1}$$

In this paper, we use (1) as the degree distribution of LT codes.
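The encoding and iterative decoding steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the byte-string symbol representation, and the cap on the degree for toy-sized inputs are hypothetical choices made for this example and are not taken from the paper's implementation.

```python
import random

# Degree distribution (1): degree -> probability.
OMEGA = {1: 0.00797, 2: 0.49357, 3: 0.16622, 4: 0.07265, 5: 0.08256,
         8: 0.05606, 9: 0.03723, 19: 0.05559, 65: 0.02502, 66: 0.00314}


def sample_degree(rng, n_source):
    """Draw an output-symbol degree from (1)."""
    d = rng.choices(list(OMEGA), weights=list(OMEGA.values()), k=1)[0]
    # Assumption for toy inputs: cap the degree at the number of source symbols.
    return min(d, n_source)


def lt_encode_symbol(source, rng):
    """Generate one output symbol: (sorted neighbor indices, XOR of those symbols)."""
    d = sample_degree(rng, len(source))
    neighbors = rng.sample(range(len(source)), d)  # uniform, without replacement
    value = bytearray(len(source[0]))
    for idx in neighbors:
        for pos, byte in enumerate(source[idx]):
            value[pos] ^= byte
    return sorted(neighbors), bytes(value)


def lt_decode(encoded):
    """Iterative (peeling) decoder: returns {source index: recovered bytes}."""
    pending = [(set(nb), bytearray(val)) for nb, val in encoded]
    recovered = {}
    progress = True
    while progress:
        progress = False
        for neighbors, value in pending:
            # Remove the contribution of already-recovered input symbols.
            for idx in [i for i in neighbors if i in recovered]:
                neighbors.discard(idx)
                for pos, byte in enumerate(recovered[idx]):
                    value[pos] ^= byte
            # Exactly one unknown neighbor left: its value is now known.
            if len(neighbors) == 1:
                idx = neighbors.pop()
                recovered[idx] = bytes(value)
                progress = True
    return recovered
```

For example, one could encode 150-byte slices with `lt_encode_symbol(slices, random.Random(0))` repeated $\gamma_t N_s$ times and pass the surviving pairs to `lt_decode`. Decoding may stop early for short blocks, since (1) was tuned for large $N_s$.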
Interestingly, it has been shown that LT codes can provide the UEP property with a slight change in the encoding process. In [12,13], the authors proposed UEP LT codes by changing the source symbol selection from uniform to non-uniform. In UEP LT codes, the $N_s$ source symbols are partitioned into $r$ sets $s_1, s_2, \ldots, s_r$ of sizes $\tau_1 N_s, \tau_2 N_s, \ldots, \tau_r N_s$, such that $\sum_{j=1}^{r} \tau_j = 1$. Let $p_j$ be the probability that a source symbol from set $s_j$ is chosen to form an encoded symbol. Consequently, we define the protection level of the priority $i$ group as $k_i = p_i N_s$, where $\sum_{j=1}^{r} k_j \tau_j = 1$. Further, let $y_{l,j}$ be the probability that a source symbol in $s_j$ is not recovered after $l$ LT decoding iterations at the receiver. For $j = 1, \ldots, r$, $y_{l,j}$ follows the recursion (2) derived in [12,13]. It can be shown that the sequences $\{y_{l,j}\}_l$, $\forall j$, converge to a fixed point $y_j$ [12,13], where $y_j$ is the final decoding error rate of symbols in set $j \in \{1, 2, \ldots, r\}$ for a UEP LT code with the parameters $\{\Omega(x), \gamma_r, \tau_1, \tau_2, \ldots, \tau_r, p_1, p_2, \ldots, p_r\}$. For EEP LT coding, we have $k_j = 1$, $j \in \{1, 2, \ldots, r\}$; hence, $\forall j \in \{1, 2, \ldots, r\}$, $y_j = y$. Note that (2) has been derived from the tree-graph approximation of LT codes and provides the $y_j$'s for the asymptotic case ($N_s \to \infty$) [12,13,16].
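To make the fixed-point computation concrete, the sketch below iterates an and-or tree recursion of the kind used for UEP LT codes. The exact form of the recursion is an assumption here: it uses the commonly cited expression $y_{l,j} = \exp\!\big(-k_j\,\mu\,\gamma_r\,\beta(1 - \sum_m \tau_m k_m y_{l-1,m})\big)$ with $\beta(x) = \Omega'(x)/\Omega'(1)$, $\mu = \Omega'(1)$, and $y_{0,j} = 1$, which should be checked against [12,13] before use. The function names and the example $k_i$ values are hypothetical.

```python
import math

# Degree distribution (1): degree -> probability.
OMEGA = {1: 0.00797, 2: 0.49357, 3: 0.16622, 4: 0.07265, 5: 0.08256,
         8: 0.05606, 9: 0.03723, 19: 0.05559, 65: 0.02502, 66: 0.00314}


def omega_prime(x):
    """Derivative of the generator polynomial Omega(x)."""
    return sum(d * w * x ** (d - 1) for d, w in OMEGA.items())


def uep_lt_error_rates(k, tau, gamma_r, iterations=200):
    """Iterate the assumed and-or tree recursion; returns [y_1, ..., y_r].

    k[j]   : protection level of priority class j (sum of k[j]*tau[j] must be 1)
    tau[j] : fraction of source symbols in class j
    gamma_r: received coding overhead
    """
    mu = omega_prime(1.0)            # average output-symbol degree
    y = [1.0] * len(k)               # nothing is recovered before decoding starts
    for _ in range(iterations):
        x = 1.0 - sum(t * kj * yj for t, kj, yj in zip(tau, k, y))
        beta = omega_prime(x) / mu   # edge-perspective degree polynomial
        y = [math.exp(-kj * mu * gamma_r * beta) for kj in k]
    return y
```

With four equal-size classes, `uep_lt_error_rates([1, 1, 1, 1], [0.25] * 4, 1.2)` gives the EEP case (all classes equal), while an illustrative UEP setting such as `[2.0, 1.0, 0.6, 0.4]` lowers the error rate of the first class at the expense of the last.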

RCPC codes
We choose RCPC codes [14] due to their flexibility in providing various code rates. RCPC codes use a low-rate convolutional mother code and employ various puncturing patterns to obtain various code rates. The RCPC decoder employs a Viterbi decoder. The bit error rate $P_b$ of the Viterbi decoder is upper bounded by [14]

$$P_b \le \frac{1}{P} \sum_{d=d_f}^{\infty} c_d P_d, \tag{3}$$

where $d_f$ is the free distance of the convolutional code, $P$ is the puncturing period, and $c_d$ is the total number of error bits produced by the incorrect paths and is known as the distance spectrum [14]. Finally, $P_d$ is the probability of selecting a wrong path in Viterbi decoding with Hamming distance $d$, which depends on the modulation and channel characteristics. For an RCPC code with rate $R$, using the additive white Gaussian noise (AWGN) channel, binary phase shift keying (BPSK) modulation, and the symbol-to-noise power ratio $E_s/N_0 = R\,E_b/N_0$, the value of $P_d$ (using soft Viterbi decoding) is given by [14]

$$P_d = \frac{1}{2}\operatorname{erfc}\!\left(\sqrt{d\,\frac{E_s}{N_0}}\right), \tag{4}$$

where $\operatorname{erfc}(\cdot)$ is the complementary error function.
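As a small numerical illustration of the bound on $P_b$, the following sketch evaluates a truncated union bound for a given distance spectrum. It assumes the standard soft-decision BPSK/AWGN pairwise error probability $P_d = \tfrac{1}{2}\operatorname{erfc}(\sqrt{d\,E_s/N_0})$. The function names are hypothetical, and the distance spectrum must be supplied by the caller (for example, from the tables in [14]); no spectrum values are provided here.

```python
import math


def pairwise_error_prob(d, es_n0_db):
    """P_d for soft-decision Viterbi decoding, BPSK over AWGN (assumed form)."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)      # dB -> linear
    return 0.5 * math.erfc(math.sqrt(d * es_n0))


def viterbi_ber_bound(distance_spectrum, period, es_n0_db):
    """Truncated union bound on the bit error rate.

    distance_spectrum: dict mapping Hamming distance d -> c_d (only the first
                       few terms starting at the free distance are needed)
    period           : puncturing period P
    """
    total = sum(c_d * pairwise_error_prob(d, es_n0_db)
                for d, c_d in distance_spectrum.items())
    return min(1.0, total / period)        # a probability cannot exceed 1
```

Because the bound is dominated by the terms near the free distance at moderate and high $E_s/N_0$, truncating the spectrum after a handful of terms is usually adequate. At very low $E_s/N_0$ the union bound becomes loose.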

Cross-layer FEC coding for H.264 video bitstream
In this section, we discuss a priority assignment scheme for H.264 AVC video slices, design of LT and RCPC codes, and our proposed cross-layer FEC scheme. We consider a unicast video transmission from a source node (at the transmitter) to a destination node (at the receiver) in a single-hop wireless network and ignore the intermediate network layers, i.e., transport layer (TL), network layer (NL), and link layer (LL). This allows our algorithm to be employed with different existing network protocol stacks.

Priority assignment for H.264 video slices
In H.264 AVC, the video frames are grouped into GOPs, and each GOP is encoded as a unit. For the sake of simplicity, we use a GOP length of 30 frames, which corresponds to a duration of 1 s. We encode each GOP independently by employing FEC codes. We have used a fixed slice size configuration where macroblocks of a frame are aggregated to form a fixed slice size. Let $N_s$ be the average number of slices in 1 s of the video. More details of the video encoding parameters are given in Section 6. H.264 slices can be prioritized based on their distortion contribution to the received video quality [9,33-37]. In this paper, the total distortion of a slice loss is computed using the cumulative mean square error (CMSE), which takes into consideration the error propagation within the entire GOP [9,34]. Let the original uncompressed video frame at time $t$ be $f(t)$, the decoded frame without the slice loss be $\hat{f}(t)$, and the decoded frame with the slice loss be $\tilde{f}(t)$. Assuming that each frame consists of $N \times M$ pixels, the MSE introduced by the loss of a slice in the video frame is computed by

$$\text{MSE} = \frac{1}{N M} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[\hat{f}_{i,j}(t) - \tilde{f}_{i,j}(t)\right]^2,$$

where the subscript $(i,j)$ denotes the pixel position. The loss of a slice in a reference frame can also introduce error propagation in the current and subsequent frames until the end of the GOP. The CMSE contributed by the loss of the slice is thus computed as the sum of the MSE over the current and all the subsequent frames in the GOP. Note that computation of the slice CMSE requires decoding of the entire GOP for every slice loss, which introduces computational overhead. This overhead can be avoided by predicting the slice CMSE using a low-complexity scheme recently proposed by us in [9]. This slice CMSE prediction scheme uses certain parameters from the current encoded frame alone without using the future frames in the GOP.
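The CMSE computation described above can be sketched as follows. This is a minimal illustration that assumes the per-frame MSE is taken between the decoded frame without the slice loss and the decoded frame with the slice loss; frames are represented as nested lists of pixel values, and the function names are hypothetical.

```python
def frame_mse(decoded_clean, decoded_lossy):
    """Mean square error between two frames given as lists of pixel rows."""
    n_pixels = sum(len(row) for row in decoded_clean)
    squared_error = sum((a - b) ** 2
                        for clean_row, lossy_row in zip(decoded_clean, decoded_lossy)
                        for a, b in zip(clean_row, lossy_row))
    return squared_error / n_pixels


def slice_cmse(clean_gop, lossy_gop, loss_frame):
    """CMSE of one slice loss: sum of per-frame MSE from the frame containing
    the lost slice (index loss_frame) to the end of the GOP."""
    return sum(frame_mse(clean, lossy)
               for clean, lossy in zip(clean_gop[loss_frame:], lossy_gop[loss_frame:]))
```

Here `lossy_gop` stands for the GOP decoded (with error concealment) after dropping a single slice, so one full decode per slice is needed. That per-slice decode is the overhead that the prediction scheme in [9] avoids.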
We use the CMSE metric to determine the slice priority. All slices in a GOP are distributed into r = 4 priority classes of equal size based on their CMSE value. The priority 1 slices induce the highest distortion whereas the priority 4 slices induce the least distortion to received video quality. Note that using more than four slice priorities would result in a more accurate and flexible UEP coding at the cost of higher complexity due to a larger number of design parameters. In fact, using N s priority levels would achieve the best performance where each slice is separately protected based on its CMSE. On the other hand, using fewer than four priority levels would limit the flexibility of our scheme and hence decrease its performance.
Let $\text{CMSE}_i$ denote the average CMSE of all slices in priority class $i$. Therefore, we have $\text{CMSE}_1 > \text{CMSE}_2 > \text{CMSE}_3 > \text{CMSE}_4$. Since $\text{CMSE}_i$ may vary considerably for various videos depending on their content, we use the normalized CMSE, $\overline{\text{CMSE}}_i$, to represent the relative importance of a priority class. We show $\overline{\text{CMSE}}_i$ for six H.264 test video sequences in Table 1. These video sequences have widely different spatial and temporal content. Table 1 shows that the first five videos, which have very different characteristics (such as slow, moderate, and high motion), have similar $\overline{\text{CMSE}}_i$ values. We also observed similar $\overline{\text{CMSE}}_i$ values for other video sequences, such as Table Tennis and Mother Daughter. However, Akiyo, which is a static sequence, has different $\overline{\text{CMSE}}_i$ values than the other sequences. The $\overline{\text{CMSE}}_i$ values changed only slightly when these videos were encoded at different bit rates (i.e., 512 kbps and 1 Mbps) and slice sizes (150 to 900 bytes). When these videos are encoded at 840 kbps with 150-byte slices, we get $N_s \approx 700$. We choose the $\overline{\text{CMSE}}_i$ values of Bus, which are similar to those of most other videos discussed above, to tune our proposed cross-layer scheme for all videos in Section 5. Since the $\overline{\text{CMSE}}_i$ values of Akiyo are different, we also study the performance of the proposed cross-layer FEC scheme for Akiyo by using its own $\overline{\text{CMSE}}_i$ values and compare it to the performance of the scheme designed using the $\overline{\text{CMSE}}_i$ values of Bus in Section 6.
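The priority assignment and per-class averaging can be illustrated with the short sketch below. Two points are assumptions made for this example: slices are split into equal-size classes by sorting on CMSE, and the class averages are normalized by dividing by their sum (the exact normalization is not reproduced here). All function names are hypothetical.

```python
def assign_priorities(cmse_values, n_classes=4):
    """Return a priority (1 = most important) for each slice, using equal-size
    classes ordered by decreasing CMSE."""
    order = sorted(range(len(cmse_values)),
                   key=lambda i: cmse_values[i], reverse=True)
    priorities = [0] * len(cmse_values)
    for rank, idx in enumerate(order):
        priorities[idx] = 1 + rank * n_classes // len(cmse_values)
    return priorities


def class_average_cmse(cmse_values, priorities, n_classes=4):
    """Average CMSE of the slices in each priority class."""
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for value, prio in zip(cmse_values, priorities):
        sums[prio - 1] += value
        counts[prio - 1] += 1
    return [s / c for s, c in zip(sums, counts)]


def normalize(values):
    """Assumed normalization: scale the class averages so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]
```

A call such as `normalize(class_average_cmse(cmse, assign_priorities(cmse)))` then yields one relative-importance weight per priority class for a GOP.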

Design of LT codes at the AL
The video slices may be either directly passed to the PL or encoded using an EEP/UEP LT code before passing to the PL. Therefore, the AL frames contain either uncoded or LT-coded video slices. When no LT coding is performed at the AL, each video slice forms an AL frame and the $N_s$ AL frames are given to the lower network layers. When the LT coding is performed at the AL, $\gamma_t N_s$ AL frames, containing LT-coded output symbols, are generated from $N_s$ video slices, where $\gamma_t \ge 1$ denotes the LT coding overhead at the transmitter. Note that the size of each LT-coded AL frame is still 150 bytes, i.e., the same as the input video slice size, whereas the number of AL frames increases from $N_s$ to $\gamma_t N_s$. We emphasize that the transmitted LT overhead $\gamma_t$ should not be confused with the received LT coding overhead $\gamma_r$. Generally, $\gamma_r \le \gamma_t$ since some AL frames may not be correctly delivered to the receiver due to channel-induced losses.
The parameters of the UEP LT code at the AL are $k_i$, $i \in \{1, \ldots, 4\}$, and $\gamma_t$, which need to be optimized while considering the FEC at the PL in the cross-layer setup. Since all $r = 4$ priority levels have equal size, we have $\tau_1 = \tau_2 = \tau_3 = \tau_4 = \frac{1}{4}$ (see Section 3.1). For EEP/UEP LT coding, we use the standard degree distribution given by (1) [12,13,16].
When UEP rateless codes designed in [12,13] are used at the AL, all $\gamma_t N_s$ LT-coded symbols have equal importance. In other words, while more emphasis is placed on the higher priority video slices than on the lower priority slices in generating each encoded symbol, the UEP property is embedded in all the encoded symbols equally. Therefore, when UEP rateless codes designed in [12,13] are used, only EEP FEC coding should be performed at the PL. On the other hand, when video slices are passed to the lower layers without the AL FEC coding, the UEP FEC coding can be performed at the PL based on the slice priority. However, the rateless codes discussed in [21,25] are capable of generating encoded symbols with unequal importance.

Design of RCPC codes at the PL
At the PL, cyclic redundancy check (CRC) bits are added to each AL frame to detect any RCPC decoding errors. We use the industry-standard CRC-8 defined by the polynomial $1 + x^2 + x^4 + x^6 + x^7 + x^8$ [38]. Next, each AL frame is encoded using a UEP/EEP RCPC code. As mentioned earlier, we employ an RCPC code designed in [14] with the mother code rate of $R = \frac{1}{3}$ and memory of $M = 6$. Based on the AL frame priority level, the RCPC codes may be punctured to get appropriate higher rates. For four priority groups of AL frames, we have $R_1 \le R_2 \le R_3 \le R_4$ and $R_i \in \left\{\frac{8}{8}, \frac{8}{9}, \frac{8}{10}, \frac{8}{12}, \frac{8}{14}, \frac{8}{16}, \frac{8}{18}, \frac{8}{20}, \frac{8}{22}, \frac{8}{24}\right\}$, where $R_i$ represents the RCPC code rate of priority $i$ AL frames. Therefore, the parameters that need to be tuned at the PL are $R_1$ through $R_4$. For EEP RCPC codes, we have $R_1 = R_2 = R_3 = R_4$. We refer to a frame encoded by the RCPC code as a PL frame.
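For illustration, a bitwise CRC-8 with the generator polynomial $x^8 + x^7 + x^6 + x^4 + x^2 + 1$ (0xD5 once the leading $x^8$ term is dropped) can be written as below. The initial value of zero, the MSB-first bit order, and the absence of a final XOR are assumptions made for this sketch; the paper does not state these conventions, and the helper names are hypothetical.

```python
CRC8_POLY = 0xD5  # x^8 + x^7 + x^6 + x^4 + x^2 + 1, with the x^8 term implicit


def crc8(data: bytes) -> int:
    """Bitwise CRC-8, MSB first, initial value 0, no final XOR (assumed conventions)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ CRC8_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_frame(al_frame: bytes) -> bytes:
    """Append the 1-byte CRC to an AL frame before RCPC encoding."""
    return al_frame + bytes([crc8(al_frame)])


def frame_ok(frame: bytes) -> bool:
    """With these conventions, a frame followed by its own CRC has remainder 0."""
    return crc8(frame) == 0
```

A 150-byte slice therefore becomes a 151-byte frame, and the receiver can flag a frame as lost whenever `frame_ok` returns `False` after RCPC decoding.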
For the sake of simplicity and without loss of generality, we assume that each transmitted packet contains one PL frame. Note that the number of PL frames in a packet does not affect the optimal cross-layer setup of FEC codes in our scheme. We have used a conventional BPSK modulation and a simple AWGN channel. Our model can be easily extended to more complex channel models by using an appropriate $P_d$ in (4) from [14]. To obtain the packet error rates at the PL on the receiver side, we first employ (4) to obtain the bit error rate of the received bitstream. Next, we employ a Monte Carlo method to obtain the packet error rate at the receiver. We perform numerical RCPC encoding and CRC calculations and simulate the transmission. Finally, we find the ratio of correctly received packets by averaging over $10^3$ packet transmissions in $10^3$ iterations.
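A much simplified version of this Monte Carlo step is sketched below. It assumes independent bit errors after decoding at a given bit error rate, which is a simplification: errors at the output of a Viterbi decoder are bursty, and the simulation described above runs the actual RCPC encoding and CRC check instead. The function names are hypothetical.

```python
import random


def packet_error_rate_iid(ber, n_bits):
    """Closed-form packet error rate if bit errors were independent."""
    return 1.0 - (1.0 - ber) ** n_bits


def packet_error_rate_mc(ber, n_bits, n_packets, rng):
    """Monte Carlo estimate: a packet is lost if any of its bits is in error."""
    lost = 0
    for _ in range(n_packets):
        # all() stops at the first erroneous bit, so lost packets are cheap.
        packet_ok = all(rng.random() >= ber for _ in range(n_bits))
        if not packet_ok:
            lost += 1
    return lost / n_packets
```

For a 151-byte frame (1208 bits), `packet_error_rate_mc(ber, 1208, 1000, random.Random(1))` can be compared against `packet_error_rate_iid(ber, 1208)` as a sanity check of the simulation loop.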

System model at transmitter
Based on our discussions so far, we can use four combinations of cross-layer FEC coding schemes at the AL and the PL (summarized in Table 2). Note that the FEC coding is necessary at the PL but optional at the AL. We illustrate the layout of the cross-layer FEC schemes in Figure 1 for the S-I and S-II schemes and in Figure 2 for the S-III and S-IV schemes. The cross-layer optimization of these FEC schemes is discussed in Section 5.

Figure 1 The proposed S-I and S-II cross-layer FEC schemes. In these schemes, the video slices are prioritized at the AL and UEP/EEP FEC coding is performed only at the PL. In S-I, we have $R_1 = R_2 = R_3 = R_4$. Here, TL, NL, and LL represent the transport, network, and link layers, respectively.

In S-I and S-II, FEC coding is applied only at the PL. In S-I, equal protection (i.e., EEP RCPC coding) is provided to all frames regardless of their importance. In S-II, the video slices are protected at the PL with various protection levels based on their priority by using the UEP RCPC coding. We expect this scheme to have a considerably improved performance compared to S-I. Note that the priority of each AL frame is conveyed to the PL by using cross-layer communication. This setup represents the schemes proposed in [36,39-45].
In S-III and S-IV, FEC coding is applied at both the AL and the PL in a cross-layer fashion. In the S-III scheme, we add the FEC coding at the AL by using regular EEP LT codes to the base S-I setup. As we will see later, S-III cannot outperform S-I for all channel conditions since LT codes require extra coding overhead. However, this scheme has the ratelessness property, meaning that it can tolerate loss of the AL frames and still recover the original video slices after LT decoding. This is in contrast to S-I and S-II, where the corrupted frames are considered lost. This setup represents the cross-layer FEC schemes proposed in [5,10,11,26-31,46].
In the proposed S-IV scheme, we apply the UEP LT codes where different slices are protected according to their priority. This scheme benefits both from ratelessness and UEP property. We expect this scheme to achieve the best performance. When LT coding is applied at the AL, the rateless coded symbols are uniformly generated and all the encoded AL frames have equal importance. As a result, using UEP FEC coding at the PL would not be beneficial. This is why we have used EEP FEC coding at the PL in the cross-layer S-III and S-IV schemes.

Decoding at receiver
Let $\text{PER}_i$ denote the packet error rate of AL frames of priority $i$ at the receiver after RCPC decoding and before LT decoding at the AL. $\text{PER}_i$ can be computed using (3). In the S-I and S-II schemes, each AL frame consists of an uncoded video slice (i.e., LT coding is not performed at the AL). Therefore, the video slice loss rate (VSLR) of slices in priority $i$ is $\text{VSLR}_i = \text{PER}_i$. In the S-III and S-IV schemes, on the other hand, the LT decoding should also be performed, and the decoding error rate of LT codes should be considered in $\text{VSLR}_i$. In the S-III and S-IV schemes, the EEP RCPC code is used at the PL; hence, we have $\text{PER}_1 = \text{PER}_2 = \text{PER}_3 = \text{PER}_4 = \text{PER}$. In this case, we employ (2) with $\gamma_r = \gamma_t (1 - \text{PER})$ (i.e., on average $\gamma_t N_s (1 - \text{PER})$ AL frames are received correctly), degree distribution (1), and a given set of $k_i$, $i \in \{1, \ldots, 4\}$, to find the final LT decoding symbol error rates $y_i$, $i \in \{1, \ldots, 4\}$, for each priority at the receiver (see Section 3.1). If the symbol decoding error rate of priority $i$ is $y_i$, then $\text{VSLR}_i = y_i$.

Figure 2 The proposed S-III and S-IV cross-layer FEC schemes. In these schemes, the video slices are prioritized at the AL and two layers of FEC coding at the AL and the PL are performed. We perform UEP/EEP LT coding at the AL and EEP RCPC coding at the PL. In S-III, we have $k_1 = k_2 = k_3 = k_4 = 1$ for EEP LT coding.

Cross-layer optimization of the proposed FEC schemes
In our cross-layer FEC schemes, we consider the following issues. First, the AL and PL FEC codes share the same available channel bandwidth to add their coding redundancy. As the channel $E_s/N_0$ increases, the RCPC code rate at the PL can be increased. Thus, more channel bandwidth becomes available for improving the LT coding at the AL. For low values of $E_s/N_0$, assigning a higher portion of the available redundancy to LT codes at the AL may not improve the delivered video quality since almost all PL frames would be corrupted during transmission. Therefore, a stronger (lower-rate) RCPC code should be used at the PL. This consumes a larger portion of the channel bandwidth, allowing only a weaker LT code at the AL. Second, UEP FEC may be used either at the AL or the PL. We study how using UEP relates to varying $E_s/N_0$ and the bandwidth portions assigned to each FEC code. Third, the optimal FEC code rates for one scheme in Table 2 may be substantially different from those of another scheme.
To find the optimal parameters for both the FEC schemes and the portion of channel bandwidth they share, we discuss below the cross-layer optimization for the four schemes given in Table 2.

Formulation of optimization problem
The goal of cross-layer optimization in our scheme is to deliver a video with the highest possible PSNR for a given channel bandwidth $C$ and SNR. Since computing the video PSNR requires decoding the video at the receiver, it is not feasible to use PSNR directly as the optimization metric due to its heavy computational complexity. The PSNR of a compressed video stream depends on several factors, including the video characteristics, bit rate, the percentage of lost slices, and their CMSE values [9,34]. Therefore, we define a function, the 'normalized F' (denoted by $F$), which represents the weighted distortion contributed by the slice loss rates $\text{VSLR}_i$ and their corresponding normalized CMSE values. The definition of $F$ uses a parameter $\alpha \ge 0$ that needs to be tuned so that $F$ can correctly capture the behavior of PSNR. For a compressed video whose PSNR for error-free transmission is already known, minimizing $F$ results in minimizing the decrease in its PSNR. Selecting the optimal $\alpha$ is discussed in the next section.
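A weighted-distortion metric of this kind can be sketched as follows. The functional form used here, $\sum_i (\overline{\text{CMSE}}_i)^{\alpha}\,\text{VSLR}_i$, is an assumption chosen only to illustrate how $\alpha$ trades off the priority classes; the paper's exact definition and scaling of $F$ may differ. The function name and the example weights are hypothetical.

```python
def weighted_distortion(norm_cmse, vslr, alpha=1.0):
    """Assumed form of the metric: sum over classes of (normalized CMSE)^alpha
    times the video slice loss rate of that class."""
    return sum((c ** alpha) * v for c, v in zip(norm_cmse, vslr))


# Hypothetical normalized CMSE weights for four priority classes.
EXAMPLE_WEIGHTS = [0.6, 0.25, 0.1, 0.05]
```

With equal loss rates in all classes, the metric reduces to the common loss rate times a constant, so the choice of $\alpha$ does not change which EEP setting is best. With unequal loss rates, a larger $\alpha$ puts more weight on losses in the high-priority classes.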
To minimize $F$, we tune the parameters of the FEC codes at the AL and the PL. In the S-I scheme, the optimization finds the RCPC code rate $R$ that minimizes $F$ for a given channel data rate $C$. Each AL frame passed to the RCPC encoder has $S + 1$ bytes, i.e., the slice size $S = 150$ bytes plus 1 byte of CRC.
In S-II, the optimization parameters are $R_1$ through $R_4$, such that $R_1 \le R_2 \le R_3 \le R_4$. For this scheme, the optimization minimizes $F$ over these four code rates within the channel data rate $C$.

The optimization parameters for S-III are $\gamma_t$ and $R$. In S-III, we have $k_1 = k_2 = k_3 = k_4 = 1$ since EEP LT coding is used at the AL. The channel data rate is shared between the two FEC codes, and the split is tuned by selecting an appropriate $\gamma_t$.

In S-IV, the UEP LT codes are used and the optimization parameters are $k_1$ through $k_3$, along with $\gamma_t$ and $R$. Here, the value of $k_4$ can be determined from $k_1$ through $k_3$ since $\sum_{j=1}^{r} k_j \tau_j = 1$ (see Section 3.1).

The optimization of the LT code's parameters involves employing (2) for the various priority levels. Since (2) has a recursive form, it may not be represented by a linear function. Furthermore, the concatenation of two FEC codes presents a non-linear optimization problem, which cannot be solved using linear programming techniques. Therefore, we use genetic algorithms (GA) to perform the optimizations [47,48]. Although GA are computationally complex, they can give solutions that are close to the global optimum [47-49]. There are numerous implementations of GA. We used the GA toolbox available in Matlab [50]. We have provided a brief review of GA in the Appendix.
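For the S-II scheme, the search space is small enough that the rate assignment can be illustrated with a plain exhaustive search instead of a GA. The sketch below is not the paper's optimizer: the bandwidth constraint (total coded bits per second of video must not exceed $C$), the objective (sum of weight$^{\alpha}$ times loss rate), and the `per_of_rate` callable that maps an RCPC rate to a packet error rate at the operating $E_s/N_0$ are all assumptions introduced for this example.

```python
from itertools import combinations_with_replacement

# RCPC rates 8/24 ... 8/8 in ascending order (strongest code first).
RATES = [8 / n for n in (24, 22, 20, 18, 16, 14, 12, 10, 9, 8)]


def best_uep_rates(per_of_rate, norm_cmse, n_slices, frame_bytes,
                   capacity_bps, alpha=1.0):
    """Exhaustively search rate tuples with R_1 <= R_2 <= ... <= R_r.

    per_of_rate : callable, RCPC rate -> packet error rate (user supplied)
    norm_cmse   : per-class importance weights
    n_slices    : slices per second of video (N_s), split equally among classes
    frame_bytes : AL frame size including CRC (S + 1)
    capacity_bps: channel data rate C in bits per second
    Returns (cost, rates) for the best feasible tuple, or None if none fits.
    """
    slices_per_class = n_slices / len(norm_cmse)
    best = None
    # An ascending input list yields only non-decreasing tuples.
    for rates in combinations_with_replacement(RATES, len(norm_cmse)):
        coded_bits = sum(slices_per_class * frame_bytes * 8 / r for r in rates)
        if coded_bits > capacity_bps:
            continue  # violates the assumed bandwidth constraint
        cost = sum((c ** alpha) * per_of_rate(r)
                   for c, r in zip(norm_cmse, rates))
        if best is None or cost < best[0]:
            best = (cost, rates)
    return best
```

With $N_s = 700$, 151-byte frames, and $C = 1.4$ Mbps, the uncoded payload is about 0.85 Mbps, so under this assumed constraint an EEP rate of 8/12 fits while 8/14 does not. For S-III and S-IV, where $\gamma_t$ and the $k_i$ are continuous, a heuristic search such as the GA used in the paper is the more natural choice.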

Optimal value of α
In Table 1, the normalized CMSE values ($\overline{\text{CMSE}}_i$) of the video sequences, except Akiyo, were similar. Therefore, the optimal parameters computed for the Bus video would be almost optimal for the other four video sequences encoded with the same parameters. We therefore use the $\overline{\text{CMSE}}_i$ of the Bus video with a data rate of 840 kbps to perform our optimizations, followed by the Akiyo sequence. We implement our cross-layer FEC setup, including LT coding at the AL and RCPC coding at the PL for S-I through S-IV (see Table 2), in the Matlab environment.
We find the optimal value of α such that minimizing F maximizes the PSNR of the decoded video. For this, we perform the optimization to minimize F for various values of α and also compute the corresponding video PSNR. Note that the value of α has no effect on a cross-layer scheme with an EEP FEC code since all $\mathrm{VSLR}_i$'s are equal in this case. Therefore, we perform our optimization for S-II, which is the simplest UEP FEC scheme. Table 3 reports the PSNR of the Bus video for three values of α and several values of $E_s/N_0$ at C = 1.4 Mbps when F is minimized in S-II. The value of α that concurrently maximizes the PSNR of the video for all values of $E_s/N_0$ is α = 1. Although not shown in Table 3, non-integer values of α and values of α < 1 were also considered in the optimization. α = 1 also gave the best results for Akiyo.

Discussion of cross-layer optimization results
We report the cross-layer optimization results, including the FEC parameters (e.g., $R_i$, $\gamma_t$, and $k_i$), in Tables 4 through 7. From Tables 4 and 5, we observe that the use of UEP RCPC coding at the PL in the S-II scheme achieves much better performance (i.e., lower $F_{\mathrm{Bus}}$) than the use of EEP RCPC coding in the S-I scheme. Neither scheme uses FEC coding at the AL.
Since the RCPC code rate of 8/12 at the PL is not strong enough for $E_s/N_0 \le 2$ dB, the value of $F_{\mathrm{Bus}}$ in the S-I scheme is high ($F_{\mathrm{Bus}} > 300$ in Table 4) because many packets are corrupted by channel errors. Successful LT decoding requires the number of error-free received packets to exceed a threshold. As a result, the S-III scheme (which uses RCPC with the same code rate as S-I) performs worse (higher $F_{\mathrm{Bus}}$) than S-I for $E_s/N_0 \le 2$ dB (see Tables 4 and 6). However, the S-III scheme achieves much better performance ($F_{\mathrm{Bus}} < 10$) than S-I for $E_s/N_0 \ge 2.5$ dB because fewer packets are now corrupted at the PL and the LT coding becomes effective.
From Tables 6 and 7, we observe that the proposed S-IV scheme achieves much lower values of $F_{\mathrm{Bus}}$ than S-III at all values of $E_s/N_0$. This demonstrates that using UEP LT codes at the AL along with EEP RCPC codes at the PL gives far better performance than using EEP codes at both layers. From Table 7 for the S-IV scheme, we observe an interesting trade-off between the code rates assigned to the FEC codes at the AL and the PL. For lower values of $E_s/N_0$, a larger portion of the bit budget is assigned to the RCPC codes at the PL than to the LT codes at the AL because LT coding cannot be effective when a large number of packets are corrupted by channel errors. Furthermore, a stronger UEP (i.e., a higher value of $k_i$ for higher priority video slices) is provided at the AL. For higher values of $E_s/N_0$, the RCPC code rate is relatively high and a larger portion of the bit budget is assigned to the LT codes at the AL. Also, the UEP (i.e., the value of $k_i$) at the AL is now relatively weaker. Overall, the proposed S-IV scheme achieves the best performance at all channel SNRs, followed by the S-II scheme for $E_s/N_0 \le 2.5$ dB. S-III outperforms S-II at higher channel SNRs. We observe similar results for the Foreman and Coastguard videos. Therefore, we can generally conclude that it is optimal to provide UEP at the AL and EEP at the PL using a cross-layer design.
Note that the optimization is performed only once for a given set of $\mathrm{CMSE}_i$ values, a GOP structure, and a set of channel SNRs, and need not be run separately for each GOP. The same set of optimized parameters can be used for any video stream with similar properties. Further, a similar performance improvement was also observed for the 1.8-Mbps channel bit rate.

Performance evaluation of FEC schemes for test videos
In this section, we evaluate the performance of our optimized cross-layer FEC schemes for four CIF (352 × 288 pixels) video sequences: Bus, Foreman, Coastguard, and Akiyo. These sequences were encoded using the H.264/AVC JM 14.2 reference software [51] at 840 kbps with a slice size of 150 bytes, for a GOP length of 30 frames with GOP structure IDR B P B . . . P B at 30 frames/s. The slices were formed using dispersed-mode FMO with two slice groups per frame. Two reference frames were used for predicting the P and B frames, with error concealment enabled using temporal concealment and spatial interpolation. We used a channel transmission rate of C = 1.4 Mbps to study the performance over AWGN channels. We used the slice loss rates reported in Tables 4 through 7 to evaluate the average PSNR of three video sequences (Bus, Foreman, and Coastguard) in Figures 3, 4, and 5. Figures 3, 4, and 5 confirm that our proposed cross-layer S-IV scheme, with UEP FEC coding at the AL and EEP FEC coding at the PL, achieves the best performance among the four schemes. Although our cross-layer FEC parameters were optimized for the Bus sequence, the average PSNR performance is similar for the other two test video sequences, i.e., Foreman and Coastguard. As mentioned earlier, both sequences have different characteristics compared to the Bus sequence.
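For reference, the PSNR figures discussed here follow from the mean squared error between the original and decoded frames. The sketch below uses the standard definition for 8-bit video (peak value 255). Averaging the per-frame PSNR values, rather than computing the PSNR of the averaged MSE, is an assumption, since the averaging convention is not stated in this section. The function names are hypothetical.

```python
import math

def psnr_db(mse, peak=255.0):
    """PSNR in dB for one frame, given its mean squared error."""
    if mse < 0:
        raise ValueError("MSE cannot be negative")
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)

def average_psnr_db(frame_mses, peak=255.0):
    """Mean of the per-frame PSNR values (assumed averaging convention)."""
    if not frame_mses:
        raise ValueError("at least one frame is required")
    values = [psnr_db(m, peak) for m in frame_mses]
    return sum(values) / len(values)

# Hypothetical per-frame MSE values, for illustration only.
print(average_psnr_db([65.025, 65.025, 650.25]))
```

Because the scale is logarithmic, a tenfold increase in MSE lowers the PSNR by 10 dB.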
Since Akiyo has considerably different values of $\mathrm{CMSE}_i$, the proposed S-IV scheme designed by using the Bus video's $\mathrm{CMSE}_i$ values would be suboptimal for Akiyo. In order to study the effect of these CMSE variations, we also designed the S-IV scheme by using the $\mathrm{CMSE}_i$ values of Akiyo and compared its performance with that of its suboptimal version. The optimization results are reported in Table 8. In this table, we also include the suboptimal values $F_{\mathrm{sub}}$ and $\mathrm{PSNR}_{\mathrm{sub}}$, which were obtained by using the optimized parameters of the Bus video from Table 7. The values of $\mathrm{PSNR}_{\mathrm{opt}}$ and $\mathrm{PSNR}_{\mathrm{sub}}$ are also shown in Figure 6. In Table 8 (for the optimal scheme) and Table 7 (for the suboptimal scheme), the LT code overhead (i.e., $\gamma_t$) and the RCPC code strength ($R$) are the same for both schemes, whereas the values of the LT code protection level $k_i$ for each priority class vary slightly (e.g., $k_1$ is higher for the optimal scheme than for the suboptimal scheme). Similarly, the values of $\mathrm{VSLR}_i$ for higher priority slices (which have the most impact on F and PSNR) are similar in both tables, except for channel SNRs of 2.25, 2.5, and 2.75 dB, listed in decreasing order of the difference in values. The maximum PSNR degradation of the suboptimal scheme compared to the optimal scheme is 1.7 dB at the channel SNR of 2.25 dB, with only about 0.1 to 0.3 dB PSNR degradation at other channel SNRs. We can, therefore, conclude that the performance of the proposed cross-layer FEC scheme is not very sensitive to the precise values of normalized CMSE.
Figure 6 Average PSNR performance of the optimal and suboptimal cross-layer FEC schemes for Akiyo video sequence.

Conclusion
Previously, EEP and UEP FEC coding schemes have been used for video transmission over lossy channels. However, the joint optimization of cross-layer UEP FEC codes at the AL and the PL for robust video transmission has never been considered. In this paper, we used UEP LT coding at the AL and RCPC coding at the PL for robust H.264 video transmission over wireless channels. H.264 video slices were prioritized based on their contribution to video quality. We performed cross-layer optimization to concurrently tune the FEC code parameters at both layers, to minimize the video distortion, and to maximize the PSNR. We observed that our cross-layer FEC scheme outperformed other FEC schemes that use either UEP coding at the PL alone or EEP FEC schemes at the AL as well as the PL. Further, we showed that our optimization works well for different H.264-encoded video sequences, which have widely different characteristics.