# Optimized cross-layer forward error correction coding for H.264 AVC video transmission over wireless channels

- Ali Talari^{1} (Email author)
- Sunil Kumar^{2}
- Nazanin Rahnavard^{1}
- Seethal Paluri^{2}
- John D Matyjas^{3}

**2013**:206

https://doi.org/10.1186/1687-1499-2013-206

© Talari et al.; licensee Springer. 2013

**Received:** 16 November 2012

**Accepted:** 9 July 2013

**Published:** 13 August 2013

## Abstract

Forward error correction (FEC) codes that can provide *unequal error protection* (UEP) have been used recently for video transmission over wireless channels. These video transmission schemes may also benefit from the use of FEC codes both at the *application layer* (AL) and the *physical layer* (PL). However, the interaction and optimal setup of UEP FEC codes at the AL and the PL have not been previously investigated. In this paper, we study the cross-layer design of FEC codes at both layers for H.264 video transmission over wireless channels. In our scheme, UEP *Luby transform codes* are employed at the AL and *rate-compatible punctured convolutional* codes at the PL. In the proposed scheme, video slices are first prioritized based on their contribution to video quality. Next, we investigate the four combinations of cross-layer FEC schemes at both layers and concurrently optimize their parameters to minimize the video distortion and maximize the peak signal-to-noise ratio. We evaluate the performance of these schemes on four test H.264 video streams and show the superiority of optimized cross-layer FEC design.

## 1 Introduction

Multimedia applications such as video streaming, which are delay sensitive and bandwidth intensive, are growing rapidly over wireless networks. However, existing wireless networks provide only limited bandwidth and time-varying *quality of service* (QoS) support for these applications. Due to limited wireless bandwidth, the video is compressed using sophisticated compression techniques such as H.264 AVC, which is the state-of-the-art video compression standard jointly developed by the ITU and ISO [1]. The compressed video is vulnerable to channel impairments as the corrupted packets induce different levels of quality degradation due to temporal and spatial dependencies in the compressed bitstream. The most important problem that affects video quality is error propagation where an error in a reference frame is propagated by the decoder to all future reconstructed frames, which are predicted from the corrupted reference frame. This problem has led to the design of error-resiliency features, such as *flexible macroblock ordering* (FMO) [2], data partitioning, and error concealment schemes in H.264 [1, 3, 4]. Recent research has demonstrated the promise of cross-layer protocols for supporting the QoS demands of multimedia applications over wireless networks [5–7]. For example, van der Schaar and Shankar [6] showed the benefits of the joint APP-MAC-PHY approach for transmitting video over wireless networks.

*Forward error correction* (FEC) schemes are used to protect the video data against channel errors in order to improve the successful data transmission probability and to eliminate the costly retransmissions. However, the maximum throughput does not guarantee the minimum video distortion at the receiver for the following reasons. *First*, unlike data packets, loss of H.264 compressed video slices induces different amounts of distortion in the received video. Therefore, the FEC code rates should be adaptive to the slice priority. *Second*, video data are delay sensitive; therefore, the retransmission of corrupted slices may not be feasible. *Third*, a video stream can tolerate loss of some slices because the lost slices can be error-concealed. This is true especially for the low-priority slices, which introduce low distortion in the received video and result in graceful quality degradation. In this paper, we consider H.264 AVC streams with fixed slice sizes, where each slice can be independently decoded. The video slices are classified into *four* priority classes based on the distortion contributed by their loss to the received video quality.

An FEC code that provides *unequal error protection* (UEP), i.e., a higher (lower) protection to high (low)-priority video slices, can achieve considerable quality improvement compared to the *equal error protection* (EEP) FEC codes [8, 9]. Note that the UEP FEC codes may be employed both at the *application layer* (AL) and *physical layer* (PL). Recently, some schemes [5, 10, 11] have considered the precise tuning of EEP FEC schemes at the AL and the PL. However, to the best of our knowledge, existing schemes have not investigated the cross-layer design of *UEP* FEC codes at the AL and the PL for prioritized video transmission. Employing FEC codes at both layers introduces two interesting trade-offs that we investigate in this paper. *First*, both FEC codes share a common channel bandwidth to add their redundancy and the optimal ratio of overhead added by each needs to be determined for a given channel signal-to-noise ratio (SNR) and bandwidth. *Second*, since UEP can be provided at both layers, we need to find the optimal UEP/EEP FEC setup to maximize the video peak SNR (PSNR). To tackle these trade-offs, we *concurrently* tune the parameters of two FEC codes at both layers.

We use *UEP Luby transform (LT)* codes [12, 13] at the AL and *rate-compatible punctured convolutional* (RCPC) codes [14] at the PL. LT codes [15] are modern and efficient FEC codes that are particularly suitable for packet-level coding at the AL. These codes are *rateless* [12, 13, 15, 16] in the sense that they can generate an unlimited number of encoded symbols from a finite-length block of source information.

Next, we carry out a cross-layer optimization to find the optimal parameters of both FEC codes by considering the relative priorities of video packets. For a known channel SNR (i.e., $\frac{{E}_{s}}{{N}_{0}}$), we address the problem of assigning optimal FEC code rates at the AL and the PL to the individual priority slices within the channel bit-rate limitations. The information about the channel conditions can be obtained from the receiver in the form of channel side information [5–7, 17, 18].

The scheme provides higher transmission reliability to high-priority slices at the expense of the higher loss rates for low-priority slices and, whenever necessary, also discards some low-priority slices to meet the channel bit-rate limitations. We show that adapting the FEC code rates to the slice priority reduces the overall expected video distortion at the receiver. Our scheme does not assume retransmission of lost slices. The preliminary results of this paper appeared in [8].

This paper is organized as follows: Section 2 provides an overview of the related work on FEC coding for video streams. Section 3 provides a brief background on the LT and RCPC FEC codes. Section 4 describes the video slice priority assignment, design of LT and RCPC codes, and cross-layer FEC schemes. Section 5 presents the cross-layer optimization and performance of the proposed FEC schemes. The simulation results of the proposed cross-layer FEC schemes on sample H.264 videos are presented in Section 6, followed by conclusions in Section 7.

## 2 Related work

LT codes have recently become popular in video transmission schemes due to their good performance and low complexity [15]. Kushwaha et al. [19] used LT codes to encode *group of pictures* (GOP) of each layer of H.264 SVC video for transmission over cognitive radio wireless networks. Ahmad et al. [17] took advantage of the ratelessness of LT codes and proposed an adaptive FEC scheme for video transmission over the Internet by employing feedback from receivers in the form of acknowledgement. Cataldi et al. [18] proposed a novel LT code, called sliding-window Raptor codes, with a higher efficiency than regular LT codes. They used these codes to provide UEP for a two-layer H.264 SVC scalable video. LT codes were also used in [20–25] to design streaming schemes with lower complexity.

Stockhammer et al. [5] defined the protocol stack, including the FEC coding at the AL and the PL, for the *multimedia broadcast multicast service* (MBMS) download and streaming in universal mobile telecommunication system (UMTS). In [5], a Raptor code [16] is used at the AL and a turbo code at the PL. Gomez-Barquero and Bria [10] suggested employing the Raptor codes as the AL FEC in DVB-H systems for mobile terminals and demonstrated its advantages over conventional *multiprotocol encapsulation* (MPE) FEC. Conventional MPE FEC employs the Reed-Solomon codes to encode the video stream; hence, it lacks the flexibility of LT coding at the AL. Courtade and Wesel [11] considered a setup with LT coding at the AL and turbo coding at the PL, and showed that the available channel bandwidth should be optimally split between the AL and PL FEC codes to improve the system performance.

Luby et al. [26] also considered employing two layers of EEP FEC at the AL and the PL for MBMS download delivery in UMTS. They investigated the trade-off between the AL FEC and PL FEC codes, and studied the advantages of the AL FEC on the system performance. Stockhammer and Liebl [27] used the Raptor codes at the AL in 3GPP streaming applications. They investigated how the AL FEC coding may guarantee the ratio of satisfied users who are receiving the video stream. Afzal et al. [28] investigated the overall system performance when the AL FEC codes are used in video streaming in UMTS and packet radio services. Alexiou et al. [29] studied the power control of streaming over *high-speed downlink packet access* systems when the AL FEC is employed. Munaretto et al. [30] proposed an interesting optimization of the AL FEC coding, video source coding, and the PL rate selection to improve the PSNR of delivered video on cellular networks. The authors in [31] also considered employing the Raptor codes at the AL to improve the quality of service for video in MBMS in *long-term evolution (LTE)* networks. They investigated the benefits of the AL FEC to multicast multimedia contents and examined how much FEC redundancy should be used under different packet loss patterns.

In [8], we investigated UEP rateless coding at the AL and assumed an ideal PL coding. We found the optimal parameters of a UEP rateless code that maximizes the video quality at the receiver for known channel bandwidth. In this paper, we extend the results of [8] and consider the interaction of the AL coding with the PL coding in video transmission schemes.

## 3 Background

In this section, we briefly review LT and RCPC FEC codes that will be used at the AL and the PL, respectively, in our proposed cross-layer FEC scheme.

### 3.1 LT codes

Recently, a new class of FEC codes called rateless (Fountain) codes has been invented. LT codes [15] and Raptor codes [16] are examples of such codes. Unlike other FEC codes, such as LDPC codes [32], rateless codes can adapt to any erasure channel with unknown or varying characteristics as they do not impose any code rate constraint. Fountain codes are especially desirable for packet-level coding at the application layer, where the underlying channel can be modeled as a packet erasure channel.

LT codes can generate a limitless number of *output symbols* from $N_s$ *input symbols* based on a degree distribution $\{\Omega_1,\Omega_2,\dots,\Omega_{N_s}\}$, where $\Omega_i$ is the probability that an output symbol has degree $i$, and $\sum_{i=1}^{N_s}\Omega_i=1$. This probability distribution can also be shown by its generator polynomial $\Omega(x)=\sum_{i=1}^{N_s}\Omega_i x^i$. In LT coding, first an output symbol degree $d$ is randomly chosen from $\Omega(\cdot)$. Next, $d$ input symbols are chosen *uniformly* and *randomly* from the $N_s$ input symbols and are bit-wise *XORed* together to generate an output symbol. $\Omega(\cdot)$ is usually fine-tuned such that the $N_s$ input symbols can be decoded from any $\gamma_r N_s$ output symbols, for $\gamma_r$ slightly greater than 1. Here, $\gamma_r$ is the *received coding overhead*. LT decoding is performed *iteratively*. At each iteration, an output symbol is found such that the value of all but one of its neighboring input symbols is known. The value of the unknown input symbol is computed by a simple XOR. This step is applied iteratively until no more such output symbols can be found.
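The encoding and iterative decoding steps described above can be sketched as follows. This is a minimal Python illustration rather than the authors' implementation: symbols are modeled as integers, and the function names `lt_encode` and `lt_decode`, as well as the toy degree distribution in the usage note, are assumptions introduced here.

```python
import random

def lt_encode(source, degree_dist, n_out, rng):
    """Generate n_out LT output symbols from integer-valued source symbols.

    degree_dist maps degree -> probability (the Omega_i of the text).
    Returns a list of (neighbor_index_set, xor_value) pairs.
    """
    degrees = list(degree_dist)
    weights = [degree_dist[d] for d in degrees]
    outputs = []
    for _ in range(n_out):
        # Draw the output degree, capped at the number of source symbols.
        d = min(rng.choices(degrees, weights)[0], len(source))
        # Choose d distinct input symbols uniformly at random.
        neighbors = set(rng.sample(range(len(source)), d))
        value = 0
        for i in neighbors:
            value ^= source[i]
        outputs.append((neighbors, value))
    return outputs

def lt_decode(outputs, n_source):
    """Iterative decoder: returns the source list, None where unrecovered."""
    decoded = [None] * n_source
    pending = [[set(nb), val] for nb, val in outputs]
    progress = True
    while progress:
        progress = False
        for item in pending:
            nb = item[0]
            # Remove the contribution of input symbols that are already known.
            for i in [i for i in nb if decoded[i] is not None]:
                item[1] ^= decoded[i]
                nb.discard(i)
            # Exactly one unknown neighbor left: its value is the residual XOR.
            if len(nb) == 1:
                decoded[nb.pop()] = item[1]
                progress = True
    return decoded
```

For example, `lt_decode(lt_encode(src, {1: 0.1, 2: 0.5, 3: 0.4}, 3 * len(src), random.Random(1)), len(src))` recovers a subset of `src`; how many symbols are recovered depends on the degree distribution and on the number of received output symbols.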

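The encoding and decoding complexity of LT codes is of the order of $O(N_s \log N_s)$. To reduce the coding complexity to linear (at the cost of a slight performance loss), new degree distributions for LT codes have been introduced such as [16]

$$\Omega(x)=0.007969x+0.493570x^{2}+0.166220x^{3}+0.072646x^{4}+0.082558x^{5}+0.056058x^{8}+0.037229x^{9}+0.055590x^{19}+0.025023x^{65}+0.003135x^{66}. \qquad (1)$$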

In this paper, we use (1) as the degree distribution of LT codes.

UEP LT codes [12, 13] extend LT codes by making the selection of input symbols *non-uniform*. In UEP LT codes, $N_s$ source symbols are partitioned into $r$ sets, $s_1, s_2, \dots, s_r$, of sizes $\tau_1 N_s, \tau_2 N_s, \dots, \tau_r N_s$, such that $\sum_{j=1}^{r}\tau_j=1$. Let $p_j$ be the probability that a source symbol from set $s_j$ is chosen to form an encoded symbol. Consequently, we define the *protection level* of the priority $i$ group as $k_i=p_i N_s$, where $\sum_{j=1}^{r}k_j\tau_j=1$. Further, let $y_{l,j}$ be the probability that a source symbol in $s_j$ is not recovered after $l$ LT decoding iterations at the receiver. For $j=1,\dots,r$ we have [12, 13]

$$y_{l,j}=\delta_j\left(1-\beta\left(1-\sum_{m=1}^{r}p_m\tau_m N_s\,y_{l-1,m}\right)\right),\quad l\ge 1, \qquad (2)$$

where $y_{0,j}=1$, $\beta(x)=\Omega'(x)/\Omega'(1)$, and $\delta_j(x)=e^{N_s p_j \Omega'(1)\gamma_r(x-1)}$.

It can be shown that the sequences $\{y_{l,j}\}_l$, $\forall j$, converge to a fixed point $y_j$ [12, 13], where $y_j$ is the final decoding error rate of symbols in set $j\in\{1,2,\dots,r\}$ for a UEP LT code with the parameters $\{\Omega(x),\gamma_r,\tau_1,\tau_2,\dots,\tau_r,p_1,p_2,\dots,p_r\}$. For EEP LT coding, we have $k_j=1$, $j\in\{1,2,\dots,r\}$; hence, $\forall j\in\{1,2,\dots,r\}$, $y_j=y$. Note that (2) has been derived from the tree-graph approximation of LT codes and provides the $y_j$'s for the asymptotic case ($N_s\to\infty$) [12, 13, 16].
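The fixed point $y_j$ can be approximated numerically by iterating the recursion. The sketch below assumes the recursion of [12, 13] in the form $y_{l,j}=\delta_j\big(1-\beta(1-\sum_m k_m\tau_m y_{l-1,m})\big)$ with $k_m=p_m N_s$, and assumes that the degree distribution is the one commonly taken from the Raptor-code literature [16]; the function names and the fixed iteration count are choices made here for illustration.

```python
import math

# Degree distribution commonly quoted from the Raptor-code literature [16].
# Assumed here to be the distribution (1) referred to in the text.
OMEGA = {1: 0.007969, 2: 0.493570, 3: 0.166220, 4: 0.072646, 5: 0.082558,
         8: 0.056058, 9: 0.037229, 19: 0.055590, 65: 0.025023, 66: 0.003135}

def beta(x, omega):
    """beta(x) = Omega'(x) / Omega'(1)."""
    num = sum(d * w * x ** (d - 1) for d, w in omega.items())
    den = sum(d * w for d, w in omega.items())
    return num / den

def uep_lt_error_rates(k, tau, gamma_r, omega=OMEGA, iters=500):
    """Iterate the asymptotic recursion and return [y_1, ..., y_r].

    k[j] is the protection level of class j (k_j = p_j * N_s), tau[j] its
    relative size, and gamma_r the received coding overhead.
    """
    mean_degree = sum(d * w for d, w in omega.items())  # Omega'(1)
    y = [1.0] * len(k)                                  # y_{0,j} = 1
    for _ in range(iters):
        s = sum(km * tm * ym for km, tm, ym in zip(k, tau, y))
        b = beta(1.0 - s, omega)
        # delta_j(1 - b) = exp(-k_j * Omega'(1) * gamma_r * b)
        y = [math.exp(-kj * mean_degree * gamma_r * b) for kj in k]
    return y
```

With `k = [1, 1, 1, 1]` all classes obtain the same error rate (EEP), whereas a choice such as `k = [1.5, 1.0, 0.9, 0.6]` with `tau = [0.25] * 4` satisfies $\sum_j k_j\tau_j=1$ and yields a lower error rate for the classes with larger $k_j$.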

### 3.2 RCPC codes

RCPC codes [14] use a *low-rate* convolutional mother code and employ various puncturing patterns to obtain various code rates. The RCPC decoder employs a *Viterbi* decoder. The bit error rate $P_b$ of the Viterbi decoder is upper bounded by [14]

$$P_b\le\frac{1}{P}\sum_{d=d_f}^{\infty}c_d P_d, \qquad (3)$$

where $d_f$ is the free distance of the convolutional code, $P$ is the puncturing period, and $c_d$ is the total number of error bits produced by the incorrect paths and is known as the *distance spectrum* [14]. Finally, $P_d$ is the probability of selecting a wrong path in Viterbi decoding with Hamming distance $d$, which depends on the modulation and channel characteristics. For an RCPC code with rate $R$, using the additive white Gaussian noise (AWGN) channel, binary phase shift keying (BPSK) modulation, and the symbol to noise power ratio $\frac{E_s}{N_0}=R\frac{E_b}{N_0}$, the value of $P_d$ (using soft Viterbi decoding) is given by [14]

$$P_d=Q\left(\sqrt{\frac{2dE_s}{N_0}}\right), \qquad (4)$$

where $Q(\lambda)=\frac{1}{\sqrt{2\pi}}\int_{\lambda}^{\infty}e^{-\frac{a^{2}}{2}}\,da$.
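The bound can be evaluated numerically once a (truncated) distance spectrum is available. The sketch below assumes the standard forms $P_b\le\frac{1}{P}\sum_d c_d P_d$ and $P_d=Q(\sqrt{2dE_s/N_0})$ for soft-decision decoding of BPSK over AWGN; the spectrum `TOY_SPECTRUM` is a hypothetical placeholder and not the spectrum of the code in [14].

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x), computed via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pairwise_error_prob(d, es_n0_db):
    """P_d for soft-decision Viterbi decoding, BPSK over AWGN (assumed form)."""
    es_n0 = 10.0 ** (es_n0_db / 10.0)
    return q_func(math.sqrt(2.0 * d * es_n0))

def viterbi_ber_bound(spectrum, period, es_n0_db):
    """Truncated union bound on the bit error rate, clipped to 1.

    spectrum maps Hamming distance d -> c_d; period is the puncturing period P.
    """
    total = sum(c * pairwise_error_prob(d, es_n0_db) for d, c in spectrum.items())
    return min(1.0, total / period)

# Hypothetical distance spectrum, for illustration only.
TOY_SPECTRUM = {6: 1.0, 7: 3.0, 8: 10.0}
```

For instance, `viterbi_ber_bound(TOY_SPECTRUM, 8, 3.0)` evaluates the truncated bound at $\frac{E_s}{N_0}=3$ dB; replacing `TOY_SPECTRUM` with the tabulated $c_d$ values of an actual RCPC code gives the bound for that code.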

## 4 Cross-layer FEC coding for H.264 video bitstream

In this section, we discuss a priority assignment scheme for H.264 AVC video slices, the design of LT and RCPC codes, and our proposed cross-layer FEC scheme. We consider a unicast video transmission from a source node (at the transmitter) to a destination node (at the receiver) in a single-hop wireless network and ignore the intermediate network layers, i.e., transport layer (TL), network layer (NL), and link layer (LL). This allows our algorithm to be employed with different existing network protocol stacks.

### 4.1 Priority assignment for H.264 video slices

In H.264 AVC, the video frames are grouped into GOPs, and each GOP is encoded as a unit. For the sake of simplicity, we use a GOP length of 30 frames, which corresponds to a duration of 1 s. We encode each GOP independently by employing FEC codes. We have used a fixed slice size configuration, where the macroblocks of a frame are aggregated to form slices of a fixed size. Let $N_s$ be the *average* number of slices in 1 s of the video. More details of the video encoding parameters are given in Section 6.

The slices of an H.264 video can be prioritized based on their *distortion* contribution to the received video quality [9, 33–37]. In this paper, the total distortion of a slice loss is computed using the *cumulative mean square error* (CMSE), which takes into consideration the error propagation within the entire GOP [9, 34]. Let the original uncompressed video frame at time $t$ be $f(t)$, the decoded frame without the slice loss be $\widehat{f}(t)$, and the decoded frame with the slice loss be $\tilde{f}(t)$. Assuming that each frame consists of $N\times M$ pixels, the MSE introduced by the loss of a slice in the video frame is computed by

$$\text{MSE}=\frac{1}{NM}\sum_{n=1}^{N}\sum_{m=1}^{M}\left[\widehat{f}_{n,m}(t)-\tilde{f}_{n,m}(t)\right]^{2}. \qquad (5)$$

The loss of a slice in a reference frame can also introduce error propagation in the current and subsequent frames until the end of GOP. The CMSE contributed by the loss of the slice is thus computed as the sum of MSE over the current and all the subsequent frames in the GOP. Note that computation of slice CMSE requires decoding of the entire GOP for every slice loss, which introduces computational overhead. This overhead can be avoided by predicting the slice CMSE using a low-complexity scheme recently proposed by us in [9]. This slice CMSE prediction scheme uses certain parameters from the current encoded frame alone without using the future frames in the GOP.
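The exhaustive CMSE computation described above (not the low-complexity prediction of [9]) can be sketched as follows. The sketch assumes that frames are given as lists of pixel rows and that both decoded sequences start at the frame containing the lost slice; the function names are chosen here for illustration.

```python
def frame_mse(decoded_ok, decoded_lossy):
    """MSE between the frame decoded without and with the slice loss.

    Both frames are lists of rows of pixel values with identical dimensions.
    """
    n_pixels = len(decoded_ok) * len(decoded_ok[0])
    squared_error = sum(
        (a - b) ** 2
        for row_ok, row_bad in zip(decoded_ok, decoded_lossy)
        for a, b in zip(row_ok, row_bad)
    )
    return squared_error / n_pixels

def slice_cmse(frames_ok, frames_lossy):
    """CMSE of one slice loss: sum of per-frame MSE from the affected frame
    to the end of the GOP, so that error propagation is included."""
    return sum(frame_mse(ok, bad) for ok, bad in zip(frames_ok, frames_lossy))
```

Because `frames_lossy` must be produced by decoding the GOP once per candidate slice loss, the cost of this direct approach grows with the number of slices, which is the overhead that the prediction scheme of [9] avoids.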

We use the CMSE metric to determine the slice priority. All slices in a GOP are distributed into $r=4$ priority classes of *equal size* based on their CMSE value. The priority 1 slices induce the highest distortion whereas the priority 4 slices induce the least distortion to received video quality. Note that using more than four slice priorities would result in a more accurate and flexible UEP coding at the cost of higher complexity due to a larger number of design parameters. In fact, using $N_s$ priority levels would achieve the best performance where each slice is separately protected based on its CMSE. On the other hand, using fewer than four priority levels would limit the flexibility of our scheme and hence decrease its performance.

Let $\text{CMSE}_i$ denote the *average* CMSE of all slices in a priority class $i$. Therefore, we have $\text{CMSE}_1>\text{CMSE}_2>\text{CMSE}_3>\text{CMSE}_4$. Since $\text{CMSE}_i$ may vary considerably for various videos depending on their content, we use the *normalized* $\text{CMSE}_i$, $\overline{\text{CMSE}}_i=\frac{\text{CMSE}_i}{\sum_{j=1}^{4}\text{CMSE}_j}$, to represent the relative importance of a priority class. We show $\overline{\text{CMSE}}_i$ for six H.264 test video sequences in Table 1. These video sequences have widely different spatial and temporal content.
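The priority assignment and normalization described above can be sketched as follows. The function name `prioritize` and the handling of leftover slices (assigned to the lowest-priority class when the slice count is not a multiple of the number of classes) are assumptions made here, since the text only states that the classes have equal size.

```python
def prioritize(slice_cmse, n_classes=4):
    """Split slices into equal-size priority classes by descending CMSE.

    Returns (priority, normalized): priority[i] is in 1..n_classes, and
    normalized[c] is the average CMSE of class c+1 divided by the sum of
    all class averages. Assumes len(slice_cmse) >= n_classes.
    """
    order = sorted(range(len(slice_cmse)),
                   key=lambda i: slice_cmse[i], reverse=True)
    size = len(order) // n_classes
    priority = [0] * len(slice_cmse)
    members = [[] for _ in range(n_classes)]
    for rank, idx in enumerate(order):
        c = min(rank // size, n_classes - 1)  # leftovers go to the last class
        priority[idx] = c + 1
        members[c].append(slice_cmse[idx])
    averages = [sum(m) / len(m) for m in members]
    total = sum(averages)
    return priority, [a / total for a in averages]
```

For eight slices with CMSE values `[1, 8, 3, 6, 5, 4, 7, 2]`, the two slices with the largest CMSE (8 and 7) receive priority 1, and the normalized class averages sum to 1 and decrease from priority 1 to priority 4.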

**Table 1 Normalized CMSE, $\overline{\text{CMSE}}_i$, for slices in different priorities of sample videos**

Sequence | $\overline{\text{CMSE}}_1$ | $\overline{\text{CMSE}}_2$ | $\overline{\text{CMSE}}_3$ | $\overline{\text{CMSE}}_4$ |
---|---|---|---|---|
Coastguard | 0.61 | 0.22 | 0.12 | 0.05 |
Foreman | 0.63 | 0.21 | 0.11 | 0.05 |
Bus | 0.64 | 0.21 | 0.10 | 0.04 |
Football | 0.65 | 0.21 | 0.10 | 0.04 |
Silent | 0.68 | 0.20 | 0.09 | 0.03 |
Akiyo | 0.85 | 0.12 | 0.03 | 0.01 |

Table 1 shows that the first five videos, which have very different characteristics (such as slow, moderate, and high motion), have similar $\overline{\text{CMSE}}_i$ values. We also observed similar $\overline{\text{CMSE}}_i$ values for other video sequences, such as Table Tennis and Mother Daughter. However, Akiyo, which is a static sequence, has different $\overline{\text{CMSE}}_i$ values than the other sequences. The $\overline{\text{CMSE}}_i$ values changed only slightly when these videos were encoded at different bit rates (i.e., 512 kbps and 1 Mbps) and slice sizes (150 to 900 bytes). When these videos are encoded at 840 kbps with 150-byte slices, we get $N_s\approx 700$.
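As a quick consistency check of this figure (a back-of-the-envelope calculation, assuming that the whole 840 kbps is spent on 150-byte slices over a 1-s GOP), the slice count follows from:

```latex
N_s \approx \frac{840\,000\ \text{bits/s} \times 1\ \text{s}}
                 {150\ \text{bytes} \times 8\ \text{bits/byte}}
    = \frac{840\,000}{1\,200} = 700 .
```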

We choose the $\overline{\text{CMSE}}_i$ values of Bus, which are similar to most other videos discussed above, to tune our proposed cross-layer scheme for all videos in Section 5. Since the $\overline{\text{CMSE}}_i$ values of Akiyo are different, we also study the performance of the proposed cross-layer FEC scheme for Akiyo by using its own $\overline{\text{CMSE}}_i$ values and compare it to the performance of the scheme designed using the $\overline{\text{CMSE}}_i$ values of Bus in Section 6.

### 4.2 Design of LT codes at the AL

The video slices may be either directly passed to the PL or encoded using an EEP/UEP LT code before passing to the PL. Therefore, the AL frames contain either uncoded or LT-coded video slices. When no LT coding is performed at the AL, each video slice forms an AL frame and the $N_s$ AL frames are given to the lower network layers. When the LT coding is performed at the AL, $\gamma_t N_s$ AL frames, containing LT-coded output symbols, are generated from $N_s$ video slices, where $\gamma_t\ge 1$ denotes the LT coding overhead at the *transmitter*. Note that the size of each LT-coded AL frame is still 150 bytes, i.e., the same as the input video slice size, whereas the number of AL frames increases from $N_s$ to $\gamma_t N_s$. We emphasize that the transmitted LT overhead $\gamma_t$ should not be confused with the received LT coding overhead $\gamma_r$. Generally, $\gamma_r\ne\gamma_t$ since some AL frames may not be correctly delivered to the receiver due to channel-induced losses.

The parameters of the UEP LT code at the AL are $k_i$, $i\in\{1,\dots,4\}$, and $\gamma_t$, which need to be optimized while considering the FEC at the PL in the cross-layer setup. Since all $r=4$ priority levels have equal size, we have $\tau_1=\tau_2=\tau_3=\tau_4=\frac{1}{4}$ (see Section 3.1). For EEP/UEP LT coding, we use the standard degree distribution given by (1) [12, 13, 16].

When UEP rateless codes designed in [12, 13] are used at the AL, all $\gamma_t N_s$ LT-coded symbols have equal importance. In other words, while more emphasis is given to the higher priority video slices, compared with the lower priority slices, in generating each encoded symbol, the UEP property is embedded in all the encoded symbols equally. Therefore, when UEP rateless codes designed in [12, 13] are used, only EEP FEC coding should be performed at the PL. On the other hand, when video slices are passed to the lower layers without the AL FEC coding, the UEP FEC coding can be performed at the PL based on the slice priority. However, the rateless codes discussed in [21, 25] are capable of generating encoded symbols with unequal importance.

### 4.3 Design of RCPC codes at the PL

At the PL, *cyclic redundancy check* (CRC) bits are added to each AL frame to detect any RCPC decoding errors. We use the industry-standard CRC-8 defined by the polynomial $1+x^{2}+x^{4}+x^{6}+x^{7}+x^{8}$ [38]. Next, each AL frame is encoded using a UEP/EEP RCPC code. As mentioned earlier, we employ an RCPC code designed in [14] with the mother code rate of $R=\frac{1}{3}$ and memory of $M=6$. Based on the AL frame priority level, the RCPC codes may be punctured to get appropriate higher rates. For four priority groups of AL frames, we have $R_1\le R_2\le R_3\le R_4$ and $R_i\in\left\{\frac{8}{8},\frac{8}{9},\frac{8}{10},\frac{8}{12},\frac{8}{14},\frac{8}{16},\frac{8}{18},\frac{8}{20},\frac{8}{22},\frac{8}{24}\right\}$, where $R_i$ represents the RCPC code rate of priority $i$ AL frames. Therefore, the parameters that need to be tuned at the PL are $R_1$ through $R_4$. For EEP RCPC codes, we have $R_1=R_2=R_3=R_4$. We refer to a frame encoded by the RCPC code as a PL frame.
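The CRC step can be illustrated with a bitwise implementation of the stated polynomial, whose coefficients of $x^7$ through $x^0$ form the byte `0xD5`. The text does not specify the register initialization or bit ordering, so the sketch below assumes a zero initial value, most-significant-bit-first processing, and no final XOR.

```python
CRC8_POLY = 0xD5  # x^8 + x^7 + x^6 + x^4 + x^2 + 1, with the x^8 term implicit

def crc8(data: bytes) -> int:
    """Bitwise CRC-8 (assumed: initial value 0, MSB first, no final XOR)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ CRC8_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def add_crc(al_frame: bytes) -> bytes:
    """Append the 1-byte CRC to an AL frame before RCPC encoding."""
    return al_frame + bytes([crc8(al_frame)])

def crc_ok(frame_with_crc: bytes) -> bool:
    """With the conventions above, a frame followed by its CRC has remainder 0."""
    return crc8(frame_with_crc) == 0
```

At the receiver, `crc_ok` would be applied to each frame after Viterbi decoding; a frame that fails the check is treated as lost.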

For the sake of simplicity and without the loss of generality, we assume that each transmitted packet contains one PL frame. Note that the number of PL frames in a packet does not affect the optimal cross-layer setup of FEC codes in our scheme. We have used a conventional BPSK modulation and a simple AWGN channel. Our model can be easily extended to more complex channel models by using an appropriate $P_d$ in (4) from [14]. To obtain the packet error rates at the PL on the receiver side, we first employ (4) to obtain the bit error rate of the received bitstream. Next, we employ the Monte Carlo method to obtain the packet error rate at the receiver. We perform numerical RCPC encoding and CRC calculations and simulate the transmission. Finally, we find the ratio of correctly received packets by taking the average over $10^{3}$ packet transmissions in $10^{3}$ iterations.
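The Monte Carlo step can be illustrated with a deliberately simplified model. The sketch below does not perform RCPC encoding or Viterbi decoding; instead, it assumes that decoded bit errors are independent with probability equal to the bit error rate, which ignores the bursty nature of Viterbi decoder errors. Function names and the frame length of 1,208 bits (151 bytes) in the usage note are illustrative assumptions.

```python
import random

def monte_carlo_per(bit_error_rate, frame_bits, n_packets, rng):
    """Estimate the packet error rate under independent bit errors.

    A packet counts as lost if at least one of its bits is in error.
    """
    lost = 0
    for _ in range(n_packets):
        if any(rng.random() < bit_error_rate for _ in range(frame_bits)):
            lost += 1
    return lost / n_packets

def analytic_per(bit_error_rate, frame_bits):
    """Closed form for the same simplified model, useful as a cross-check."""
    return 1.0 - (1.0 - bit_error_rate) ** frame_bits
```

For example, `monte_carlo_per(1e-3, 1208, 1000, random.Random(0))` should be close to `analytic_per(1e-3, 1208)`; a full simulation of the coded transmission, as described in the text, would replace the independent-error assumption.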

### 4.4 System model at transmitter

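Table 2 summarizes the four combinations of cross-layer FEC coding schemes considered in this paper, denoted S-I through S-IV.

**Table 2 Various combinations of cross-layer FEC coding schemes**

Model | S-I | S-II | S-III | S-IV |
---|---|---|---|---|
AL FEC | No FEC | No FEC | EEP | UEP |
PL FEC | EEP | UEP | EEP | EEP |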

In S-I and S-II, FEC coding is applied only at the PL. In S-I, the equal protection (i.e., EEP RCPC coding) is provided to all frames regardless of their importance. In S-II, the video slices are protected at the PL with various protection levels based on their priority by using the UEP RCPC coding. We expect this scheme to have a considerably improved performance compared to S-I. Note that the priority of each AL frame is conveyed to the PL by using the cross-layer communication. This setup represents the schemes proposed in [36, 39–45].

In S-III and S-IV, FEC coding is applied at both the AL and the PL in a cross-layer fashion. In the S-III scheme, we add FEC coding at the AL to the base S-I setup by using regular EEP LT codes. As we will see later, S-III cannot outperform S-I for all channel conditions since LT codes require extra coding overhead. However, this scheme has the ratelessness property, meaning that it can tolerate the loss of some AL frames and still recover the original video slices after LT decoding. This is in contrast to S-I and S-II, where the corrupted frames are considered lost. This setup represents the cross-layer FEC schemes proposed in [5, 10, 11, 26–31, 46].

In the proposed S-IV scheme, we apply the UEP LT codes where different slices are protected according to their priority. This scheme benefits both from ratelessness and UEP property. We expect this scheme to achieve the best performance. When LT coding is applied at the AL, the rateless coded symbols are uniformly generated and all the encoded AL frames have equal importance. As a result, using UEP FEC coding at the PL would not be beneficial. This is why we have used EEP FEC coding at the PL in the cross-layer S-III and S-IV schemes.

### 4.5 Decoding at receiver

Let $\text{PER}_i$ denote the packet error rate of AL frames of priority $i$ at the receiver after RCPC decoding and before LT decoding at the AL. $\text{PER}_i$ can be computed using (3).

In S-I and S-II schemes, each AL frame consists of an uncoded video slice (i.e., LT coding is not performed at the AL). Therefore, the *video slice loss rate* (VSLR) of slices in priority $i$ is $\text{VSLR}_i=\text{PER}_i$. In S-III and S-IV schemes, on the other hand, the LT decoding should also be performed, and the decoding error rate of LT codes should be considered in $\text{VSLR}_i$. In S-III and S-IV schemes, the EEP RCPC code is used at the PL; hence, we have $\text{PER}_1=\text{PER}_2=\text{PER}_3=\text{PER}_4=\text{PER}$. In this case, on average $\gamma_r N_s=\gamma_t N_s(1-\text{PER})$ AL frames are correctly received, i.e., $\gamma_r=\gamma_t(1-\text{PER})$. We employ (2) with this $\gamma_r$, degree distribution (1), and a given set of $k_i$, $i\in\{1,\dots,4\}$, to find the final LT decoding symbol error rates $y_i$, $i\in\{1,\dots,4\}$, for each priority at the receiver (see Section 3.1). If the symbol decoding error rate of priority $i$ is $y_i$, then $\text{VSLR}_i=y_i$.

## 5 Cross-layer optimization of the proposed FEC schemes

In our cross-layer FEC schemes, we consider the following issues. *First*, the AL and PL FEC codes share the same available channel bandwidth to add their coding redundancy. As the channel $\frac{E_s}{N_0}$ increases, the RCPC code rate at the PL can be increased. Thus, more channel bandwidth becomes available for improving the LT coding at the AL. For low values of $\frac{E_s}{N_0}$, assigning a higher portion of the available redundancy to LT codes at the AL may not improve the delivered video quality since almost all PL frames would be corrupted during transmission. Therefore, a stronger RCPC code (i.e., a lower code rate) should be used at the PL. This consumes a larger portion of the channel bandwidth, allowing only a weaker LT code at the AL. *Second*, UEP FEC may be used either at the AL or the PL. We study how using UEP relates to varying $\frac{E_s}{N_0}$ and the bandwidth portions assigned to each FEC code. *Third*, the optimal FEC code rates for one scheme in Table 2 may be substantially different from those of another scheme.

To find the optimal parameters for both the FEC schemes and the portion of channel bandwidth they share, we discuss below the cross-layer optimization for the four schemes given in Table 2.

### 5.1 Formulation of optimization problem

Our goal is to find the FEC parameters that maximize the PSNR of the received video for a given channel bit rate $C$ and SNR. Since computing the video PSNR requires decoding the video at the receiver, it is not feasible to use PSNR directly as the optimization metric due to its heavy computational complexity. The PSNR of a compressed video stream depends on several factors, including the video characteristics, bit rate, the percentage of lost slices, and their CMSE values [9, 34]. Therefore, we define a function 'normalized $F$,' denoted by $\overline{F}$, which represents the weighted distortion contributed by the slice loss rates and their corresponding normalized CMSE values, as

$$\overline{F}=\sum_{i=1}^{r}\overline{\text{CMSE}}_i^{\,\alpha}\,\text{VSLR}_i. \qquad (6)$$

Here, we use a parameter $\alpha\ge 0$ that needs to be *tuned* so that $\overline{F}$ can correctly capture the behavior of PSNR. For a compressed video whose PSNR for error-free transmission is already known, minimizing $F$ results in minimizing the decrease in its PSNR. Selecting the optimal $\alpha$ is discussed in the next section.
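To make the role of the weighting concrete, the sketch below evaluates the metric under the assumption that it has the form $\overline{F}=\sum_i \overline{\text{CMSE}}_i^{\,\alpha}\,\text{VSLR}_i$, i.e., that $\alpha$ acts as an exponent on the normalized CMSE. The normalized CMSE values are those of the Bus sequence in Table 1; the loss-rate vectors in the usage note are made-up inputs.

```python
# Normalized CMSE of the Bus sequence (Table 1), priorities 1..4.
BUS_NCMSE = [0.64, 0.21, 0.10, 0.04]

def normalized_f(vslr, ncmse=BUS_NCMSE, alpha=1.0):
    """Weighted distortion metric (assumed form, see lead-in).

    vslr[i] is the video slice loss rate of priority i+1.
    """
    return sum(c ** alpha * v for c, v in zip(ncmse, vslr))
```

For instance, `normalized_f([0.0, 0.1, 0.2, 0.3])` is smaller than `normalized_f([0.3, 0.2, 0.1, 0.0])` although both vectors have the same average loss rate, which reflects the benefit of shifting losses toward the low-priority slices.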

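In S-I, only EEP RCPC coding is applied at the PL. Therefore, we only need to determine the RCPC code rate $R$ for a given channel data rate $C$ as

$$R=\frac{8N_s(S+1)}{C}, \qquad (7)$$

where $S+1$ is the AL frame size in bytes, i.e., the slice size $S=150$ bytes plus 1 byte of CRC.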
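Because the available RCPC rates form a discrete set, the rate has to be picked from that set in practice. The sketch below assumes that the constraint is that the coded bit rate, $8N_s(S+1)/R$, must not exceed $C$, and that the strongest (lowest-rate) feasible code is preferred; both the function name and this selection rule are assumptions, and the sketch does not model discarding of slices.

```python
from fractions import Fraction

# RCPC code rates listed in Section 4.3.
RCPC_RATES = [Fraction(8, n) for n in (8, 9, 10, 12, 14, 16, 18, 20, 22, 24)]

def strongest_feasible_rate(n_slices, channel_bps, slice_bytes=150, crc_bytes=1):
    """Lowest RCPC rate whose coded bit rate fits into the channel bit rate.

    Returns None if even the uncoded stream (rate 8/8) does not fit.
    """
    source_bits = 8 * n_slices * (slice_bytes + crc_bytes)
    feasible = [r for r in RCPC_RATES if source_bits / r <= channel_bps]
    return min(feasible) if feasible else None
```

With $N_s=700$ and $C=1.4$ Mbps, the source produces 845,600 bits per second including CRC, so the rate must be at least about 0.604, and the function returns $\frac{8}{12}$.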

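In S-II, UEP RCPC coding is applied at the PL, and the parameters to be optimized are the RCPC code rates $R_1$ through $R_4$, such that $R_1\le R_2\le R_3\le R_4$. For this scheme, the optimization function can be written as

$$\{R_1,R_2,R_3,R_4\}=\arg\min_{R_1\le R_2\le R_3\le R_4}\overline{F}, \qquad (8)$$

subject to the total transmitted bit rate not exceeding the channel bit rate $C$.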

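The parameters to be optimized for S-III are $\gamma_t$ and $R$. In S-III, we have $k_1=k_2=k_3=k_4=1$ since EEP LT coding is used at the AL. The channel data rate is shared among the two FEC codes and needs to be tuned by selecting an appropriate $\gamma_t$. The optimization function is

$$\{\gamma_t,R\}=\arg\min_{\gamma_t,\,R}\overline{F}, \qquad (9)$$

subject to the total transmitted bit rate not exceeding the channel bit rate $C$.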

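In S-IV, the parameters to be optimized are $k_1$ through $k_3$, along with $\gamma_t$ and $R$. Here, the value of $k_4$ can be determined based on $k_1$ through $k_3$ since $\sum_{j=1}^{r}k_j\tau_j=1$ (see Section 3.1). As a result, the optimization function is

$$\{k_1,k_2,k_3,\gamma_t,R\}=\arg\min_{k_1,\,k_2,\,k_3,\,\gamma_t,\,R}\overline{F}, \qquad (10)$$

subject to the total transmitted bit rate not exceeding the channel bit rate $C$.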

The optimization of the LT code’s parameters involves employing (2) for various priority levels. Since (2) has a recursive form, it may not be represented by a linear function. Furthermore, the concatenation of two FEC codes presents a non-linear optimization problem, which cannot be solved using *linear programming* techniques. Therefore, we use the *genetic algorithms* (GA) to perform optimizations [47, 48]. Although GA are computationally complex, they can give solutions which are close to the global optimum [47–49]. There are numerous implementations of GA. We used the GA toolbox available in Matlab [50]. We have provided a brief review on GA in the Appendix.
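For the S-II scheme, the search space is small enough that the optimization can also be illustrated without a genetic algorithm. The sketch below uses a plain exhaustive search over non-decreasing rate tuples instead of GA. It assumes the metric $\overline{F}=\sum_i\overline{\text{CMSE}}_i^{\,\alpha}\,\text{PER}(R_i)$, a bit-rate constraint of the form $\sum_i \frac{N_s}{4}\cdot\frac{8(S+1)}{R_i}\le C$, and a caller-supplied function `per_of_rate` mapping a code rate to a packet error rate; the toy PER model in the usage note is hypothetical, and discarding of slices is not modeled.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# RCPC code rates listed in Section 4.3, sorted from strongest to weakest.
RATES = sorted(Fraction(8, n) for n in (8, 9, 10, 12, 14, 16, 18, 20, 22, 24))

def coded_bit_rate(rates, n_slices, frame_bytes=151):
    """Bits per second after RCPC coding, with equal-size priority classes."""
    bits_per_class = 8 * frame_bytes * n_slices / len(rates)
    return sum(bits_per_class / float(r) for r in rates)

def optimize_s2(per_of_rate, ncmse, n_slices, channel_bps, alpha=1.0):
    """Exhaustive search for (R_1 <= ... <= R_r) minimizing the metric.

    Returns (cost, rates), or None if no rate tuple fits the channel.
    """
    best = None
    # Sorted input makes every generated tuple non-decreasing.
    for rates in combinations_with_replacement(RATES, len(ncmse)):
        if coded_bit_rate(rates, n_slices) > channel_bps:
            continue
        cost = sum(c ** alpha * per_of_rate(r) for c, r in zip(ncmse, rates))
        if best is None or cost < best[0]:
            best = (cost, rates)
    return best
```

As a usage example with a hypothetical PER model, `optimize_s2(lambda r: float(r) ** 6, [0.64, 0.21, 0.10, 0.04], 700, 1.4e6)` returns a rate tuple that protects priority 1 at least as strongly as priority 4. The S-III and S-IV problems also involve the continuous parameters $\gamma_t$ and $k_i$ and the recursive relation (2), which is why a heuristic such as GA is used for them.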

### 5.2 Optimal value of *α*

In Table 1, the normalized CMSE values ($\overline{\text{CMSE}}_i$) of the video sequences, except Akiyo, were similar. Therefore, the optimal parameters computed for the Bus video would be *almost* optimal for the other four video sequences generated by the same encoding parameters. We therefore use the $\overline{\text{CMSE}}_i$ of the Bus video with a data rate of 840 kbps to perform our optimizations, followed by the Akiyo sequence. We implement our cross-layer FEC setup, including LT coding at the AL and RCPC coding at the PL, for S-I through S-IV (see Table 2) in the Matlab environment.

$\alpha$ such that minimizing $\overline{F}$ maximizes the PSNR of the decoded video. For this, we perform the optimization to minimize $\overline{F}$ for various values of $\alpha$ and also compute the corresponding video PSNR. Note that the value of $\alpha$ has no effect on a cross-layer scheme with EEP FEC code since all $\text{VSLR}_i$’s are equal in this case. Therefore, we perform our optimization for S-II, which is the simplest UEP FEC scheme. Table 3 reports the PSNR of the Bus video for three values of $\alpha$ and $\frac{E_s}{N_0}$ for $C = 1.4$ Mbps when $\overline{F}$ is minimized in S-II. The value of $\alpha$ that concurrently maximizes the PSNR of the video for all values of $\frac{E_s}{N_0}$ is $\alpha = 1$. Although not shown in Table 3, the non-integer values of $\alpha$ and $\alpha < 1$ were also considered in optimization. $\alpha = 1$ also gave the best results for Akiyo.

**Table 3 PSNR of Bus video sequence for various values of $\alpha$ and $\frac{E_s}{N_0}$ with optimized $\overline{F}$ for S-II**

| $\frac{E_s}{N_0}$ | 1 dB | 1 dB | 2 dB | 2 dB | 3 dB | 3 dB | 4 dB | 4 dB |
|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 1, 2 | 3 | 1, 2 | 3 | 1 | 2, 3 | 1 | 2, 3 |
| PSNR (dB) | 18.2 | 16.85 | 22.3 | 19.8 | 25.8 | 20.6 | 29.69 | 29 |
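The selection rule for $\alpha$ described above (keep the value whose PSNR is the highest at every channel SNR) can be sketched as follows. The PSNR values are transcribed from Table 3; the function name `alphas_best_everywhere` is a hypothetical helper introduced for this illustration and is not part of the original Matlab implementation.

```python
# PSNR (dB) of the Bus video in S-II, transcribed from Table 3.
# Keys are alpha values; list entries follow the Es/N0 order in snr_db.
snr_db = [1, 2, 3, 4]
psnr = {
    1: [18.2, 22.3, 25.8, 29.69],
    2: [18.2, 22.3, 20.6, 29.0],
    3: [16.85, 19.8, 20.6, 29.0],
}


def alphas_best_everywhere(psnr_by_alpha):
    """Return the alpha values whose PSNR is maximal at every SNR point."""
    n_points = len(next(iter(psnr_by_alpha.values())))
    # Best PSNR achieved by any alpha at each SNR point.
    best = [max(v[i] for v in psnr_by_alpha.values()) for i in range(n_points)]
    return [
        alpha
        for alpha, values in psnr_by_alpha.items()
        if all(values[i] == best[i] for i in range(n_points))
    ]


print(alphas_best_everywhere(psnr))
```

Values of $\alpha$ that tie at some SNRs (here $\alpha = 2$ at 1 and 2 dB) are filtered out as soon as they fall behind at any other SNR.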

### 5.3 Discussion of cross-layer optimization results

$R_i$, $\gamma_t$, and $k_i$), $\text{VSLR}_i$, normalized $\overline{F}$, and non-normalized $F$ for the $\overline{\text{CMSE}}_i$ values of the Bus video. Note that $F$ is calculated by replacing the $\overline{\text{CMSE}}_i$ by the actual average $\text{CMSE}_i$ for the video sequence under consideration. The results of all four FEC schemes for three video sequences (Bus, Foreman, and Coastguard) are reported in Tables 4, 5, 6, and 7 for channel bit rate $C = 1.4$ Mbps. The results for Akiyo are discussed in Section 6.

**Table 4 Optimal cross-layer parameters for S-I scheme with $C = 1.4$ Mbps**

| $\frac{E_s}{N_0}$ | 1 dB | 1.25 dB | 1.5 dB | 1.75 dB | 2 dB | 2.25 dB | 2.5 dB | 2.75 dB | 3 dB | 4 dB | 5 dB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\overline{F}$ | 0.998 | 0.988 | 0.949 | 0.852 | 0.694 | 0.503 | 0.328 | 0.197 | 0.11 | 0.008 | 0 |
| $F_{\text{Bus}}$ | 443.4 | 438.9 | 421.6 | 378.5 | 307.9 | 223.5 | 145.7 | 87.5 | 48.9 | 3.1 | 0 |
| $F_{\text{Foreman}}$ | 214.7 | 212.5 | 204.1 | 183.3 | 149.1 | 108.2 | 70.6 | 42.4 | 23.7 | 1.5 | 0 |
| $F_{\text{Coastguard}}$ | 179.8 | 178.0 | 171.0 | 153.5 | 124.9 | 90.6 | 59.1 | 35.5 | 19.8 | 1.3 | 0 |
| $R$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ |
| VSLR | 0.998 | 0.988 | 0.949 | 0.852 | 0.693 | 0.503 | 0.328 | 0.197 | 0.11 | 0.007 | 0 |

**Table 5 Optimal cross-layer parameters for S-II scheme with $C = 1.4$ Mbps**

| $\frac{E_s}{N_0}$ | 1 dB | 1.25 dB | 1.5 dB | 1.75 dB | 2 dB | 2.25 dB | 2.5 dB | 2.75 dB | 3 dB | 4 dB | 5 dB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\overline{F}$ | 0.172 | 0.163 | 0.158 | 0.111 | 0.077 | 0.059 | 0.05 | 0.046 | 0.041 | 0.003 | 0 |
| $F_{\text{Bus}}$ | 76.1 | 72.2 | 70.1 | 49.3 | 34.0 | 25.9 | 22.1 | 20.4 | 17.9 | 1.1 | 0 |
| $F_{\text{Foreman}}$ | 30.2 | 28.4 | 27.4 | 21.8 | 14.3 | 10.3 | 8.4 | 7.6 | 7.7 | 0.5 | 0 |
| $F_{\text{Coastguard}}$ | 30.7 | 29.1 | 28.2 | 20.5 | 14.3 | 11.1 | 9.5 | 8.8 | 7.4 | 0.5 | 0 |
| $R_1$ | $\frac{8}{18}$ | $\frac{8}{18}$ | $\frac{8}{18}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ |
| $R_2$ | $\frac{8}{16}$ | $\frac{8}{16}$ | $\frac{8}{16}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ |
| $R_3$ | $\frac{8}{9}$ | $\frac{8}{9}$ | $\frac{8}{9}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{14}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ |
| $R_4$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ |
| $\text{VSLR}_1$ | 0.007 | 0.003 | 0.001 | 0.072 | 0.036 | 0.017 | 0.008 | 0.004 | 0.001 | 0 | 0 |
| $\text{VSLR}_2$ | 0.063 | 0.033 | 0.0162 | 0.072 | 0.036 | 0.017 | 0.008 | 0.004 | 0.11 | 0.007 | 0 |
| $\text{VSLR}_3$ | 1 | 1 | 1 | 0.072 | 0.036 | 0.017 | 0.008 | 0.004 | 0.11 | 0.007 | 0 |
| $\text{VSLR}_4$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.11 | 0.007 | 0 |

**Table 6 Optimal cross-layer parameters for S-III scheme with $C = 1.4$ Mbps**

| $\frac{E_s}{N_0}$ | 1.75 dB | 2 dB | 2.25 dB | 2.5 dB | 2.75 dB | 3 dB | 4 dB | 5 dB |
|---|---|---|---|---|---|---|---|---|
| $\overline{F}$ | 1 | 0.972 | 0.268 | 0.022 | 0.021 | 0.017 | 0.007 | 0.006 |
| $F_{\text{Bus}}$ | 444.3 | 431.9 | 119.2 | 9.8 | 9.3 | 5.3 | 2.1 | 0.8 |
| $F_{\text{Foreman}}$ | 215.1 | 209.1 | 57.7 | 4.7 | 4.5 | 2.6 | 1.0 | 0.4 |
| $F_{\text{Coastguard}}$ | 180.2 | 175.2 | 48.3 | 4.0 | 3.8 | 2.1 | 0.8 | 0.3 |
| $R$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{10}$ | $\frac{8}{10}$ | $\frac{8}{9}$ |
| $\gamma_t$ | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.25 | 1.25 | 1.4 |
| VSLR | 1 | 0.972 | 0.268 | 0.022 | 0.021 | 0.012 | 0.005 | 0.002 |

**Table 7 Optimal cross-layer parameters for S-IV scheme with $C = 1.4$ Mbps**

| $\frac{E_s}{N_0}$ | 1 dB | 1.25 dB | 1.5 dB | 1.75 dB | 2 dB | 2.25 dB | 2.5 dB | 2.75 dB | 3 dB | 4 dB | 5 dB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\overline{F}$ | 0.157 | 0.058 | 0.047 | 0.045 | 0.044 | 0.026 | 0.017 | 0.016 | 0.013 | 0.005 | 0.004 |
| $F_{\text{Bus}}$ | 69.7 | 25.6 | 20.9 | 19.9 | 19.6 | 11.4 | 7.6 | 7.2 | 5.8 | 2.1 | 2.0 |
| $F_{\text{Foreman}}$ | 27.3 | 10.1 | 7.8 | 7.3 | 7.2 | 5.1 | 3.4 | 3.2 | 2.6 | 0.9 | 0.9 |
| $F_{\text{Coastguard}}$ | 28.0 | 10.9 | 9.0 | 8.6 | 8.5 | 4.7 | 3.1 | 2.9 | 2.4 | 0.9 | 0.8 |
| $R$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{10}$ | $\frac{8}{10}$ | $\frac{8}{10}$ |
| $\gamma_t$ | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.2 | 1.2 | 1.2 |
| $k_1$ | 2 | 1.4 | 1.4 | 1.4 | 1.4 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 |
| $k_2$ | 2 | 1.3 | 1.3 | 1.3 | 1.3 | 1.1 | 1 | 1 | 1 | 1 | 1 |
| $k_3$ | 0 | 1.3 | 1.3 | 1.3 | 1.3 | 0.9 | 0.9 | 0.9 | 0.9 | 1 | 1 |
| $k_4$ | 0 | 0 | 0 | 0 | 0 | 0.8 | 0.9 | 0.9 | 0.9 | 0.8 | 0.8 |
| $\text{VSLR}_1$ | 0.004 | 0.014 | 0.004 | 0.002 | 0.002 | 0.015 | 0.008 | 0.008 | 0.006 | 0.002 | 0.002 |
| $\text{VSLR}_2$ | 0.004 | 0.021 | 0.007 | 0.004 | 0.003 | 0.024 | 0.025 | 0.024 | 0.019 | 0.007 | 0.007 |
| $\text{VSLR}_3$ | 1 | 0.021 | 0.007 | 0.004 | 0.003 | 0.064 | 0.043 | 0.041 | 0.034 | 0.007 | 0.007 |
| $\text{VSLR}_4$ | 1 | 1 | 1 | 1 | 1 | 0.107 | 0.043 | 0.041 | 0.034 | 0.028 | 0.026 |

From Tables 4 and 5, we observe that the use of UEP RCPC coding at the PL in the S-II scheme achieves much better performance (i.e., lower $F_{\text{Bus}}$) than the use of EEP RCPC coding in the S-I scheme. Neither scheme uses FEC coding at the AL.

Since the RCPC code rate of $\frac{8}{12}$ at the PL is not strong enough for $\frac{E_s}{N_0} \le 2$ dB, the value of $F_{\text{Bus}}$ in the S-I scheme is high ($F_{\text{Bus}} > 300$ in Table 4) because many packets are corrupted by channel errors. For successful LT decoding, the number of error-free packets received must be above a threshold. As a result, the S-III scheme (which also uses RCPC with the same code rate as in S-I) achieves a lower performance (higher value of $F_{\text{Bus}}$) than S-I for $\frac{E_s}{N_0} \le 2$ dB (see Tables 4 and 6). However, the S-III scheme achieves much better performance ($F_{\text{Bus}} < 10$) than S-I for $\frac{E_s}{N_0} \ge 2.5$ dB because fewer packets are now corrupted at the PL and the LT coding becomes effective.

From Tables 6 and 7, we observe that the proposed S-IV scheme achieves much lower values of $F_{\text{Bus}}$ than S-III at all values of $\frac{E_s}{N_0}$. This demonstrates that using UEP LT codes at the AL along with EEP RCPC codes at the PL gives a far superior performance than using EEP codes at both layers.

From Table 7 for the S-IV scheme, we observe an interesting trade-off between the code rates assigned to FEC codes at the AL and the PL. For lower values of $\frac{E_s}{N_0}$, a larger portion of the bit budget is assigned to RCPC codes at the PL rather than LT codes at the AL because the LT coding cannot be effective when a large number of packets are corrupted due to channel errors. Furthermore, a stronger UEP (i.e., higher value of $k_i$ for higher priority video slices) is provided at the AL. For higher values of $\frac{E_s}{N_0}$, the RCPC code rate is relatively high and a larger share of the bit budget is assigned to LT codes at the AL. Also, the UEP (i.e., value of $k_i$) at the AL is relatively less strong now.
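The trade-off between the two layers can be made concrete with a rough bit-budget check. The sketch below is not the paper's constraint equation: it assumes, for illustration only, that the transmitted rate is the source rate scaled by the CRC overhead $(S+1)/S$, multiplied by the LT overhead $\gamma_t$, and divided by the RCPC code rate $R$. The function name `channel_rate_kbps` and this simplified model are assumptions; the $(\gamma_t, R)$ pairs are those appearing in Tables 6 and 7.

```python
def channel_rate_kbps(source_kbps, gamma_t, rcpc_rate, slice_bytes=150, crc_bytes=1):
    """Approximate channel rate after CRC, LT overhead and RCPC coding.

    Assumed model (a simplification, not the paper's constraint):
    rate = source * (S + CRC) / S * gamma_t / R.
    """
    crc_factor = (slice_bytes + crc_bytes) / slice_bytes
    return source_kbps * crc_factor * gamma_t / rcpc_rate


C_KBPS = 1400       # channel bit rate used in the optimization
SOURCE_KBPS = 840   # encoded video rate

# (gamma_t, R) pairs that appear in Tables 6 and 7.
pairs = [(1.05, 8 / 12), (1.2, 8 / 10), (1.25, 8 / 10), (1.4, 8 / 9)]
for gamma_t, rate in pairs:
    needed = channel_rate_kbps(SOURCE_KBPS, gamma_t, rate)
    print(f"gamma_t={gamma_t:<4} R={rate:.3f} -> {needed:7.1f} kbps, fits={needed <= C_KBPS}")

# A strong RCPC code combined with a large LT overhead exceeds the budget
# under this model, which is why one must be traded against the other.
print(channel_rate_kbps(SOURCE_KBPS, 1.4, 8 / 12) <= C_KBPS)
```

Under this simplified model, every reported pair fits within the 1.4-Mbps channel, whereas pairing the strongest listed RCPC rate ($\frac{8}{12}$) with the largest listed LT overhead (1.4) does not.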

Overall, the proposed S-IV scheme achieves the best performance at different channel SNRs, followed by the S-II scheme for $\frac{E_s}{N_0} \le 2.5$ dB. S-III outperforms S-II at higher channel SNRs. We observe similar results for the Foreman and Coastguard videos. Therefore, we can generally conclude that it is optimal to provide UEP at the AL and EEP at the PL using a cross-layer design.

Note that the optimization is performed only once for a given set of $\overline{\text{CMSE}}_i$ values, a GOP structure, and a set of channel SNRs, and need not be run separately for each GOP. The same set of optimized parameters can be used for any video stream with similar properties. Further, we should note that similar performance improvement is also observed for the 1.8-Mbps channel bit rate.

## 6 Performance evaluation of FEC schemes for test videos

In this section, we evaluate the performance of our optimized cross-layer FEC schemes for four CIF (352×288 pixels) video sequences: Bus, Foreman, Coastguard, and Akiyo. These sequences were encoded using H.264/AVC JM 14.2 reference software [51] at 840 kbps with a 150-byte slice size, for a GOP length of 30 frames with GOP structure *IDR* *B* *P* *B*…*P* *B* at 30 frames/s. The slices were formed using dispersed-mode FMO with two slice groups per frame. Two reference frames were used for predicting the *P* and *B* frames, with error concealment enabled using temporal concealment and spatial interpolation. We have used a channel transmission rate of $C = 1.4$ Mbps to study the performance over AWGN channels.

Although our cross-layer FEC parameters were optimized for the Bus sequence, the average PSNR performance is similar for the other two test video sequences, i.e., Foreman and Coastguard. As mentioned earlier, both sequences have different characteristics compared to the Bus sequence.

$F_{\text{sub}}$ and $\text{PSNR}_{\text{sub}}$, which were obtained by using the optimized parameters of the Bus video from Table 7. The values of $\text{PSNR}_{\text{opt}}$ and $\text{PSNR}_{\text{sub}}$ are also shown in Figure 6.

**Table 8 Optimal cross-layer parameters for S-IV at $C = 1.4$ Mbps for Akiyo video sequence**

| $\frac{E_s}{N_0}$ | 1 dB | 1.25 dB | 1.5 dB | 1.75 dB | 2 dB | 2.25 dB | 2.5 dB | 2.75 dB | 3 dB | 4 dB | 5 dB |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $F_{\text{opt}}$ | 1.111 | 0.600 | 0.287 | 0.243 | 0.229 | 0.223 | 0.221 | 0.219 | 0.215 | 0.066 | 0.062 |
| $F_{\text{sub}}$ | 1.141 | 0.600 | 0.317 | 0.259 | 0.239 | 0.494 | 0.325 | 0.306 | 0.240 | 0.079 | 0.074 |
| $\text{PSNR}_{\text{opt}}$ (dB) | 29.78 | 38.36 | 40.39 | 40.6 | 41.0 | 41.12 | 41.15 | 41.15 | 41.23 | 45.62 | 45.96 |
| $\text{PSNR}_{\text{sub}}$ (dB) | 29.62 | 38.20 | 40.2 | 40.3 | 40.8 | 39.42 | 41.04 | 41.05 | 41.15 | 45.49 | 45.85 |
| $R$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{12}$ | $\frac{8}{10}$ | $\frac{8}{10}$ | $\frac{8}{10}$ |
| $\gamma_t$ | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.2 | 1.2 | 1.2 |
| $k_1$ | 2.3 | 1.4 | 1.8 | 1.8 | 1.8 | 1.8 | 1.8 | 1.8 | 1.8 | 1.3 | 1.3 |
| $k_2$ | 1.7 | 1.3 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1 | 1 |
| $k_3$ | 0 | 1.3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.9 | 0.9 |
| $k_4$ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.8 | 0.8 |
| $\text{VSLR}_1$ | 0.001 | 0.014 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.001 |
| $\text{VSLR}_2$ | 0.012 | 0.021 | 0.014 | 0.008 | 0.006 | 0.005 | 0.005 | 0.004 | 0.004 | 0.008 | 0.007 |
| $\text{VSLR}_3$ | 1 | 0.021 | 0.039 | 0.024 | 0.018 | 0.016 | 0.015 | 0.015 | 0.013 | 0.015 | 0.014 |
| $\text{VSLR}_4$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.028 | 0.027 |

In Table 8 (for optimal scheme) and Table 7 (for suboptimal scheme), the LT code overhead (i.e., $\gamma_t$) and RCPC code strength ($R$) are the same for both schemes, whereas the values of LT code protection level $k_i$ for each priority class vary slightly (e.g., $k_1$ is higher for the optimal scheme compared to the suboptimal scheme). Similarly, the values of $\text{VSLR}_i$ for higher priority slices (which have the most impact on $F$ and PSNR) are similar in both tables, except for channel SNRs of 2.25, 2.5, and 2.75 dB in the decreasing order of the difference in values. The maximum PSNR degradation of the suboptimal scheme compared to the optimal scheme is 1.7 dB at the channel SNR of 2.25 dB, with only about 0.1 to 0.3 dB PSNR degradation at other channel SNRs. We can, therefore, conclude that the performance of the proposed cross-layer FEC scheme is not very sensitive to the precise values of normalized CMSE.
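The degradation figures quoted above can be checked directly from the two PSNR rows of Table 8. The sketch below takes the first PSNR row as $\text{PSNR}_{\text{opt}}$ and the second as $\text{PSNR}_{\text{sub}}$ (an assumption about the row order, consistent with the first row being uniformly higher) and reports the largest gap and the range of the remaining gaps.

```python
# Es/N0 grid and PSNR values (dB) transcribed from Table 8.
snr_db = [1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 4, 5]
psnr_opt = [29.78, 38.36, 40.39, 40.6, 41.0, 41.12, 41.15, 41.15, 41.23, 45.62, 45.96]
psnr_sub = [29.62, 38.20, 40.2, 40.3, 40.8, 39.42, 41.04, 41.05, 41.15, 45.49, 45.85]

# Per-SNR PSNR loss of the suboptimal (Bus-tuned) parameters, rounded to 0.01 dB.
gaps = [round(opt - sub, 2) for opt, sub in zip(psnr_opt, psnr_sub)]

# Index of the SNR point with the largest loss.
worst = max(range(len(gaps)), key=gaps.__getitem__)
others = [gap for i, gap in enumerate(gaps) if i != worst]

print(f"largest gap: {gaps[worst]} dB at Es/N0 = {snr_db[worst]} dB")
print(f"remaining gaps: {min(others)} to {max(others)} dB")
```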

## 7 Conclusion

Previously, EEP and UEP FEC coding schemes have been used for video transmission over lossy channels. However, the joint optimization of cross-layer UEP FEC codes at the AL and the PL for robust video transmission has never been considered. In this paper, we used UEP LT coding at the AL and RCPC coding at the PL for robust H.264 video transmission over wireless channels. H.264 video slices were prioritized based on their contribution to video quality. We performed cross-layer optimization to concurrently tune the FEC code parameters at both layers, to minimize the video distortion, and to maximize the PSNR. We observed that our cross-layer FEC scheme outperformed other FEC schemes that use either UEP coding at the PL alone or EEP FEC schemes at the AL as well as the PL. Further, we showed that our optimization works well for different H.264-encoded video sequences, which have widely different characteristics.

## Appendix

### Introduction to genetic algorithms

J. Holland in [47] showed how the evolutionary process can be applied to solve a wide variety of problems using a parallel technique that is now called genetic algorithms [48]. Non-linear and complicated optimization problems that cannot be solved by conventional optimization algorithms such as *linear programming* can be effectively solved using genetic algorithms. Let $W$ and $\overline{w} = \{w_1, w_2, \dots, w_k\}$ denote the decision space and the $k$ decision variables, respectively. Let $F\left(\overline{w}\right)$ denote the objective function that we need to optimize (minimize/maximize). In conventional genetic algorithms, each $w_i$ is translated to a binary format. The steps to find the optimum answer are as follows:

1. Generate a random initial population of size $i$, each including $k$ members $\overline{w_j}, j = \{1, 2, \dots, k\}$.
2. Translate the generated population from real numbers to a binary format considering the desired precision.
3. Concatenate the translated versions of the $k$ decision variables together to generate $i$ binary population members.
4. Evaluate the $i$ fitnesses $F(\overline{w_j}), j \in \{1, 2, \dots, i\}$, of the current population.
5. Select two parents randomly, assigning a higher probability of selection to the parents with a better fitness value.
6. Perform *crossover* and *mutation* [47] on the parents to generate two offspring. For crossover, cut the two parents at a random location and exchange the second parts to generate the offspring. For mutation, with a small probability, flip a random bit in the offspring’s bit streams.
7. Go to step 5 until $i-2$ offspring are generated.
8. Keep the two parents with the best fitness values and replace the remaining $i-2$ with the new offspring.
9. If the maximum number of iterations is not reached, go to step 4; otherwise, translate the member of the population with the best fitness value from binary to real format and report it as the final answer.

The above algorithm is an overall view of conventional genetic algorithms. However, many variations have been proposed since genetic algorithms were first introduced. For instance, the translation from real to binary and vice versa is no longer performed, and the crossover and mutation are all performed on real numbers. A more detailed explanation of genetic algorithms is beyond the scope of this paper. We refer interested readers to [48, 52] and the numerous available surveys for performance evaluations of genetic algorithm methods.
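The conventional binary-coded procedure listed in this appendix can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Matlab toolbox implementation used for the results: the names `decode` and `genetic_minimize`, the rank-based selection weights, and all default parameter values are choices made here. The sketch also collapses steps 1 to 3 by drawing the concatenated bit strings directly, and it treats the objective as something to minimize.

```python
import random


def decode(bits, bounds, n_bits):
    """Map a concatenated bit string back to real-valued decision variables."""
    values = []
    for idx, (lo, hi) in enumerate(bounds):
        chunk = bits[idx * n_bits:(idx + 1) * n_bits]
        as_int = int("".join(map(str, chunk)), 2)
        values.append(lo + (hi - lo) * as_int / (2 ** n_bits - 1))
    return values


def genetic_minimize(objective, bounds, pop_size=30, n_bits=12,
                     generations=80, p_mutation=0.05, seed=0):
    """Binary-coded GA with one-point crossover, bit-flip mutation, 2 elites."""
    rng = random.Random(seed)
    length = n_bits * len(bounds)
    # Steps 1-3: random population held directly as concatenated bit strings.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 4: evaluate fitness and rank (lower objective is better).
        ranked = sorted(population,
                        key=lambda b: objective(decode(b, bounds, n_bits)))
        # Step 5: rank-based weights, so better members are chosen more often.
        weights = [pop_size - rank for rank in range(pop_size)]
        offspring = []
        while len(offspring) < pop_size - 2:            # Step 7
            p1, p2 = rng.choices(ranked, weights=weights, k=2)
            cut = rng.randrange(1, length)              # Step 6: crossover point
            for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
                if rng.random() < p_mutation:           # Step 6: mutation
                    child[rng.randrange(length)] ^= 1
                offspring.append(child)
        # Step 8: keep the two best members and replace the rest.
        population = ranked[:2] + offspring[:pop_size - 2]
    # Step 9: report the best member in real-valued form.
    best = min(population, key=lambda b: objective(decode(b, bounds, n_bits)))
    return decode(best, bounds, n_bits)


def toy_objective(w):
    """Hypothetical test function with its minimum at (0.3, 0.7)."""
    return (w[0] - 0.3) ** 2 + (w[1] - 0.7) ** 2


best = genetic_minimize(toy_objective, [(0.0, 1.0), (0.0, 1.0)])
print(best, toy_objective(best))
```

Because the two best members are carried over unchanged in every generation, the best objective value in the population never gets worse from one iteration to the next.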

## Declarations

### Acknowledgements

This material is based upon work supported by the National Science Foundation under grants ECCS-1056065 and CCF-0915994, and by the Air Force Research Laboratory under award FA8750-11-1-0048.

**Disclaimer**

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US Air Force Research Laboratory.

## References

1. Wiegand T, Sullivan GJ, Bjøntegaard G, Luthra A: Overview of the H.264/AVC video coding standard. *IEEE Trans. Circuits Syst. Video Technol* 2003, 13(7):560-576.
2. Wiegand T, Sullivan G: *ITU-T, JVT, T Rec: H.264/IEC 14496-10 AVC 2003 Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification*. International Telecommunication Union, Geneva; 2002.
3. Kumar S, Xu L, Mandal MK, Panchanathan S: Error resiliency schemes in H.264/AVC standard. *Elsevier J. Vis. Commun. Image Representation Spec. Issue Emerg. H.264/AVC Video Coding Stand* 2006, 17(2):183-185.
4. Stockhammer T, Hannuksela M, Wiegand T: H.264/AVC in wireless environments. *IEEE Trans. Circuits Syst. Video Technol* 2003, 13(7):657-673. 10.1109/TCSVT.2003.815167
5. Stockhammer T, Shokrollahi A, Watson M, Luby M, Gasiba T: Application layer forward error correction for mobile multimedia broadcasting. In *Handbook of Mobile Broadcasting: DVB-H, DMB, ISDB-T and MEDIAFLO*. Taylor & Francis, Boca Raton; 2008:239-280.
6. van der Schaar M, Shankar S: Cross-layer wireless multimedia transmission: challenges, principles, and new paradigms. *IEEE Wireless Commun* 2005, 12(4):50-58. 10.1109/MWC.2005.1497858
7. Setton E, Yoo T, Zhu X, Goldsmith A, Girod B: Cross-layer design of ad hoc networks for real-time video streaming. *IEEE Wireless Commun* 2005, 12(4):59-65. 10.1109/MWC.2005.1497859
8. Talari A, Rahnavard N: Unequal error protection rateless coding for efficient MPEG video transmission. In *IEEE Military Communications Conference*. IEEE, Piscataway; 2009:1-7.
9. Paluri S, Kambhatla K, Kumar S, Bailey B, Cosman P, Matyjas JD: Predicting slice loss distortion in H.264/AVC video for low complexity data prioritization. In *IEEE Int. Conf. Image Processing Proceedings (ICIP 2012)*. Orlando; 30 Sept–3 Oct 2012.
10. Gomez-Barquero D, Bria A: Application layer FEC for improved mobile reception of DVB-H streaming services. In *IEEE 64th Vehicular Technology Conference, VTC-2006 Fall*. IEEE, Piscataway; 2006:1-5.
11. Courtade T, Wesel R: A cross-layer perspective on rateless coding for wireless channels. In *IEEE International Conference on Communications, ICC*. IEEE, Piscataway; 2009:1-6.
12. Rahnavard N, Vellambi B, Fekri F: Rateless codes with unequal error protection property. *IEEE Trans. Inf. Theory* 2007, 53(4):1521-1532.
13. Rahnavard N, Fekri F: Generalization of rateless codes for unequal error protection and recovery time: asymptotic analysis. In *IEEE Int. Symp. Inf. Theory*. IEEE, Piscataway; 2006:523-527.
14. Hagenauer J: Rate-compatible punctured convolutional codes (RCPC codes) and their applications. *IEEE Trans. Commun* 1988, 36(4):389-400. 10.1109/26.2763
15. Luby M: LT codes. In *The 43rd Annual IEEE Symposium on Foundations of Computer Science*. IEEE, Piscataway; 2002:271-280.
16. Shokrollahi A: Raptor codes. *IEEE Trans. Inf. Theory* 2006, 52(6):2551-2567.
17. Ahmad S, Hamzaoui R, Al-Akaidi M: Adaptive unicast video streaming with rateless codes and feedback. *IEEE Trans. Circuits Syst. Video Technol* 2010, 20(2):275-285.
18. Cataldi P, Grangetto M, Tillo T, Magli E, Olmo G: Sliding-window raptor codes for efficient scalable wireless video broadcasting with unequal loss protection. *IEEE Trans. Image Process* 2010, 19(6):1491-1503.
19. Kushwaha H, Xing Y, Chandramouli R, Heffes H: Reliable multimedia transmission over cognitive radio networks using fountain codes. *Proc. IEEE* 2008, 96:155-165.
20. Hellge C, Schierl T, Wiegand T: Receiver driven layered multicast with layer-aware forward error correction. In *15th IEEE International Conference on Image Processing, ICIP*. IEEE, Piscataway; 2008:2304-2307.
21. Vukobratovic D, Stankovic V, Sejdinovic D, Stankovic L, Xiong Z: Expanding Window Fountain codes for scalable video multicast. In *IEEE International Conference on Multimedia and Expo*. IEEE, Piscataway; 2008:77-80.
22. Tan AS, Aksay A, Bilen C, Akar GB, Arikan E: Rate-distortion optimized layered stereoscopic video streaming with raptor codes. In *Packet Video*. IEEE, Piscataway; 2007:98-104.
23. Jenkac H, Stockhammer T: Asynchronous media streaming over wireless broadcast channels. In *IEEE International Conference on Multimedia and Expo*. IEEE, Piscataway; 2005:1318-1321.
24. Ahmad S, Hamzaoui R, Al-Akaidi M: Robust live unicast video streaming with rateless codes. In *Packet Video*. IEEE, Piscataway; 2007:78-84.
25. Vukobratovic D, Stankovic V, Sejdinovic D, Stankovic L, Xiong Z: Scalable video multicast using expanding window fountain codes. *IEEE Trans. Multimedia* 2009, 11(6):1094-1104.
26. Luby M, Watson M, Gasiba T, Stockhammer T: Mobile data broadcasting over MBMS tradeoffs in forward error correction. In *Proceedings of the 5th International Conference on Mobile and Ubiquitous Multimedia*. ACM, New York; 2006:10-10.
27. Stockhammer T, Liebl G: On practical crosslayer aspects in 3GPP video services. In *Proceedings of the International Workshop on Workshop on Mobile Video*. ACM, New York; 2007:7-12.
28. Afzal J, Stockhammer T, Gasiba T, Xu W: Video streaming over MBMS: a system design approach. *J. Multimedia* 2006, 1(5):25-35.
29. Alexiou A, Bouras C, Papazois A: A study of forward error correction for mobile multicast. *Int. J. Commun. Syst* 2011, 24(5):607-627. 10.1002/dac.1178
30. Munaretto D, Jurca D, Widmer J: Broadcast video streaming in cellular networks: an adaptation framework for channel, video and AL-FEC rates allocation. In *2010 The 5th Annual ICST Wireless Internet Conference (WICON)*. IEEE, Piscataway; 2010:1-9.
31. Bouras C, Kokkinos V, Papazois A: Application layer forward error correction for multicast streaming over LTE networks. *Int. J. Commun. Syst* 2012. http://onlinelibrary.wiley.com/doi/10.1002/dac.2321/abstract
32. Gallager R: Low-density parity-check codes. *Inf. Theory IRE Trans* 1962, 8:21-28. 10.1109/TIT.1962.1057683
33. Thomos N, Argyropoulos S, Boulgouris N, Strintzis M: Robust transmission of H.264/AVC streams using adaptive group slicing and unequal error protection. *EURASIP J. Appl. Signal Process* 2006, 2006:120-120.
34. Kambhatla K, Kumar S, Cosman P: Wireless H.264 video quality enhancement through optimal prioritized packet fragmentation. *IEEE Trans. Multimedia* 2012, 14(5):1480-1495.
35. Kumar S, Janarthanan A, Shakeel MM, Maroo S, Matyjas JD, Medley M: Robust H.264/AVC video coding with priority classification, adaptive NALU size and fragmentation. In *IEEE MILCOM Proceedings, 2009*. Boston; 18.
36. Baccaglini E, Tillo T, Olmo G: Slice sorting for unequal loss protection of video streams. *IEEE Signal Process. Lett* 2008, 15:581-584.
37. Argyropoulos S, Tan A, Thomos N, Arikan E, Strintzis M: Robust transmission of multi-view video streams using flexible macroblock ordering and systematic LT codes. In *3DTV Conference, 2007*. IEEE, Piscataway; 2007:1-4.
38. Koopman P, Chakravarty T: Cyclic redundancy code (CRC) polynomial selection for embedded networks. In *International Conference on Dependable Systems and Networks*. IEEE, Piscataway; 2004:145-154.
39. Bouabdallah A, Lacan J: Dependency-aware unequal erasure protection codes. *J. Zhejiang Univ. Sci. A* 2006, 7:27-33. 10.1631/jzus.2006.AS0027
40. Maani E, Katsaggelos A: Unequal error protection for robust streaming of scalable video over packet lossy networks. *IEEE Trans. Circuits Syst. Video Technol* 2010, 20(3):407-416.
41. Xiang W, Zhu C, Siew CK, Xu Y, Liu M: Forward error correction-based 2-D layered multiple description coding for error-resilient H.264 SVC video transmission. *IEEE Trans. Circuits Syst. Video Technol* 2009, 19(12):1730-1738.
42. Ha H, Yim C: Layer-weighted unequal error protection for scalable video coding extension of H.264/AVC. *IEEE Trans. Consum. Electron* 2008, 54(2):736-744.
43. Liu Y, Yu S: Adaptive unequal loss protection for scalable video streaming over IP networks. *IEEE Trans. Consum. Electron* 2005, 51(4):1277-1282. 10.1109/TCE.2005.1561856
44. Yingbo Shi S, Chengke Wu W, Jianchao Du D: A novel unequal loss protection approach for scalable video streaming over wireless networks. *IEEE Trans. Consum. Electron* 2007, 53(2):363-368.
45. Zhang XJ, Peng XH, Haywood R, Porter T: Robust video transmission over lossy network by exploiting H.264/AVC data partitioning. In *5th International Conference on Broadband Communications, Networks and Systems, BROADNETS*. IEEE, Piscataway; 2008:307-314.
46. Gasiba T, Xu W, Stockhammer T: Enhanced system design for download and streaming services using Raptor codes. *European Trans. Telecommun* 2009, 20(2):159-173. 10.1002/ett.1275
47. Holland J: *Adaptation in Natural and Artificial Systems*. MIT Press, Cambridge; 1992.
48. Koza JR: Survey of genetic algorithms and genetic programming. In *WESCON/’95. Conference Record*. IEEE, Piscataway; 1995:589-589.
49. Coley D: *An Introduction to Genetic Algorithms for Scientists and Engineers*. World Scientific, Singapore; 1999.
50. MathWorks: *Global Optimization Toolbox: User’s Guide (R2011b)*. MathWorks, Natick; 2011.
51. JVT: H.264/AVC Reference Software JM14.2. ISO/IEC Std. Accessed 12 Feb 2012. http://iphome.hhi.de/suehring/tml/download/
52. Deb K, Pratap A, Agarwal S, Meyarivan T: A fast and elitist multiobjective genetic algorithm: NSGA-II. *IEEE Trans. Evol. Comput* 2002, 6(2):182-197. 10.1109/4235.996017

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.