Graceful Degradation in 3GPP MBMS Mobile TV Services Using H.264/AVC Temporal Scalability
EURASIP Journal on Wireless Communications and Networking, volume 2009, Article number: 207210 (2009)
These days, customers as well as service providers show an increasing interest in Mobile TV broadcast services. One general problem of Mobile TV broadcast services is to maximize the coverage of users receiving an acceptable service quality, which is mainly influenced by the user's position and mobility within the cell. In this paper, graceful degradation is considered as an approach for improved service availability and coverage. We present a layered transmission approach for 3GPP's Release 6 Multimedia Broadcast and Multicast Service (MBMS) based on temporal scalability using H.264/AVC Baseline Profile. A differentiation in robustness between temporal quality layers is achieved by an unequal error protection approach based on application layer Forward Error Correction (FEC), on unequal transmit power for the layers, or on a combination of both. We discuss the corresponding MBMS service as well as network settings and define measures for evaluating the number of users reached with a certain mobile terminal play-out quality while considering the network cell capacity usage. Using simulated 3GPP Rel. 6 network conditions, we show that if the service and network settings are chosen carefully, a noticeable extension of the coverage of the MBMS service can be achieved.
These days, customers as well as service and network providers show an increasing interest in Mobile TV broadcast services. Such services may be based on cellular networks as specified in Release 6 of the 3rd Generation Partnership Project (3GPP) Multimedia Broadcast and Multicast Service (MBMS) [1–4]. One general problem of Mobile TV broadcast services over cellular networks is the coverage, which may change depending on the user's location within a cell; that is, users at cell boundaries typically receive a weaker signal than users closer to the cell center. This affects the packet loss rate and, in turn, the network throughput, which has a noticeable influence on the received video quality. In order to become widely accepted, Mobile TV services need to offer service reliability throughout the transmission cell. Mobile TV broadcast services may also be based on other radio access bearer techniques, for example, single frequency networks of DVB-H or DVB-SH, which may show similar coverage problems. In this paper, we focus on cellular access bearers of 3GPP Rel. 6 MBMS only.
To increase the coverage in terms of served users or to reduce costs in terms of required transmission power and network resources, graceful degradation is considered as an approach for extended coverage. The idea is to transmit the part of the service that provides a base user experience with higher protection or higher transmission power. The remaining fraction of the service's data rate, representing one or more quality enhancements, is transmitted at lower cost with less protection or less transmission power. This improves the availability and coverage of the service at fixed transmission cost, or reduces the transmission cost at slightly reduced quality for some of the users. Since such a transmission approach is only feasible with scalable media types, we rely in this work on H.264/AVC temporal scalability as supported in the H.264/AVC Baseline Profile, which is already supported by 3GPP's Rel. 6 MBMS.
A key factor for the success of Mobile TV services is the continuity of service availability. Differentiating the robustness of the base and enhancement layers of the scalable media yields different received qualities depending on the user's reception conditions. This graceful degradation can in fact lead to higher acceptance of the service due to higher service availability, for some users possibly at reduced quality.
Prior research work on graceful degradation for wireless broadcasting includes papers focusing on transmission techniques like hierarchical modulation [8, 9], scalable video coding [10–13], or combinations of both [14, 15]. However, an application of graceful degradation to cellular mobile broadcasting systems together with a performance analysis in terms of experienced service quality of different network and service settings is still missing.
In this paper we present a layered transmission approach for 3GPP's Rel.6 MBMS based on H.264/AVC Baseline Profile Temporal Scalability and different Raptor application layer FEC protection [2, 16] and different radio power spreading as well as combinations of both. We evaluate the usage and the integration of video scalability with the H.264/AVC video coding standard in 3GPP Multimedia Broadcast/Multicast Services (MBMS) Rel.6 as a standardized mobile broadcasting system. We present the different transmission schemes for audio and H.264/AVC video realizing graceful degradation in MBMS.
For evaluating the benefit of graceful degradation in 3GPP Rel. 6, we define assessment techniques for measuring the quality coverage, which describes the number of users reached with a certain mobile terminal play-out quality. We connect this quality coverage to a measure of the used network cell capacity, which gives the network resources consumed in 3GPP Rel. 6 network cells. Finally, we evaluate graceful degradation under simulated 3GPP Rel. 6 network conditions. We show that if the service and network settings are chosen carefully, a noticeable extension of the coverage of the MBMS service can be achieved, especially for receivers with poor reception conditions.
In the next section, we give an overview of MBMS in 3GPP Rel. 6. In Section 3, we present H.264/AVC Baseline Profile Temporal Scalability as used in this work. The proposed transmission schemes for audio and H.264/AVC video are introduced in Section 4. Section 5 covers the media play-out quality assessment, including the measure for used cell capacity. Section 6 presents the simulation environment and Section 7 provides a detailed overview on the simulation results. We finalize the paper in Section 8 with a summary and a conclusion.
3GPP has defined an MBMS feature for UMTS and GSM-EDGE systems [1, 17]. The key motivation for integrating multicast and broadcast extensions into mobile communication systems is to enable efficient one-to-many data distribution services. As shown in Figure 1 the basic idea is to use multicasting in the service layer and core network in order to save on server and transmission network capacity. On the radio interface the multicast service can be provided with a broadcast transmission. MBMS services can be multiplexed flexibly with unicast services in the same cell and on the same carrier frequency.
Within the UMTS radio access network MBMS was introduced in Release 6 of the 3GPP specifications. MBMS integrates so-called point-to-multipoint (PTM) bearers for broadcast in cells with a high number of group members requesting the same content with point-to-point (PTP) bearers for unicast. Thus a service delivered over MBMS typically uses PTM transmission within geographical areas of high density of group members and PTP transmission in cells with low number of group members. The PTP transmission uses one of the existing unicast radio bearers, preferably HSDPA (High Speed Downlink Packet Access), where the transmission parameters can be adapted to the individual user equipment (UE) reception conditions. In contrast, for the PTM mode no control channel for feedback of reception quality can be used by the UEs; therefore the transmission needs to be statically configured. Due to the lack of a feedback control channel the content server cannot be informed about possible reception problems at particular receivers in difficult radio propagation environments. The present work therefore investigates opportunities for the PTM mode to provide a graceful degradation of media quality in such conditions.
If the broadcast bearer service is used, UEs do not need to inform the network when they start receiving the service. Nevertheless, the radio network controller (RNC) nodes can initiate a counting procedure where a statistically representative number of receiving UEs need to respond. Based on this information the RNCs can decide for each cell if the transmission mode should be kept or changed from PTM to PTP or vice versa. Transmission in a cell may also be stopped entirely if no UE is present. The cell continues to announce the considered MBMS service, and if a new UE desires to receive the service, it will contact the RNC to start transmission again. All these procedures are completely invisible to the user as they serve only to maximize the transmission efficiency.
The MBMS bearer service addresses MBMS transmission procedures below the IP layer. It provides a new PTM transmission bearer, which may use common radio resources (i.e., broadcast) in cells of high receiver density. The MBMS bearer service is supported by both UMTS Terrestrial Radio Access Network (UTRAN) and GSM/EDGE Radio Access Network (GERAN) [1, 18]. The MBMS Broadcast Mode offers a PTM distribution system. The PTM transmission bearer uses the MBMS Traffic Channel (MTCH). This logical channel is mapped to a transport channel of type Forward Access Channel (FACH) which uses Turbo coding. A FACH is mapped to a physical channel of type Secondary Common Control Physical Channel (S-CCPCH), which uses QPSK (or, in Release 7, also 16 QAM) modulation at a constant transmission power. Multiple MTCHs can be configured in a cell, either time multiplexed on one FACH or transmitted on separate FACHs. Multiple FACHs may be multiplexed on one S-CCPCH or transmitted on separate S-CCPCHs. A brief summary of the channels is given in Table 1.
The investigations in this work are based on 3GPP Release 6 functionality. Meanwhile, however, 3GPP has further evolved MBMS in Release 7. A new transmission mode has been added that increases the system capacity by avoiding intercell interference. If the same multimedia service is to be provided over geographical areas comprising several cells or even an entire nation, advantage can be taken of the inherent single frequency operation of Wideband CDMA (WCDMA) and Long Term Evolution (LTE); that is, all cells can use the same carrier frequency and the terminal is therefore inherently able to receive the signals from several cells simultaneously. If different cells transmit different information, then the signals interfere in the terminal receiver. Whereas this cannot be avoided for user-individual services, it is possible to avoid intercell interference for broadcast services by sending the same signal from all cells in the service area at the same time. By eliminating intercell interference, a significant increase in spectral efficiency and thereby in throughput on the used radio resources is achieved. This efficient exploitation of the single frequency network nature for MBMS is called MBSFN in 3GPP. MBSFN transmission in FDD mode requires that a carrier is dedicated to MBSFN transmission; that is, this carrier cannot be used for unicast. For this reason, and in scenarios where broadcasting is intended over a few cells only so that the MBSFN gains are small, the Release 6 mode of MBMS transmission remains an important complement to MBSFN transmissions.
On the application layer, the Raptor FEC code may be used to increase bearer reliability for MBMS transmissions. The Raptor code belongs to the class of rateless codes and can thus generate an arbitrary number of redundant FEC symbols from one source block. The FEC protection level is expressed by the so-called code rate $R$. In this paper the code rate is defined as the ratio $R = k/n$ of the number of Raptor source symbols $k$ to the total number of Raptor symbols $n = k + r$, where $r$ denotes the number of repair symbols used.
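As a small worked illustration of this definition, the following sketch computes the code rate from the number of source and repair symbols and, conversely, the number of repair symbols needed so that the code rate does not exceed a target. The function names are hypothetical and are not taken from the MBMS specification.

```python
import math


def raptor_code_rate(num_source_symbols: int, num_repair_symbols: int) -> float:
    """Code rate = source symbols / (source symbols + repair symbols)."""
    if num_source_symbols <= 0 or num_repair_symbols < 0:
        raise ValueError("need at least one source symbol and no negative repair count")
    return num_source_symbols / (num_source_symbols + num_repair_symbols)


def repair_symbols_for_rate(num_source_symbols: int, target_rate: float) -> int:
    """Smallest repair symbol count whose code rate does not exceed target_rate."""
    if not 0.0 < target_rate <= 1.0:
        raise ValueError("target_rate must lie in (0, 1]")
    total_symbols = math.ceil(num_source_symbols / target_rate)
    return total_symbols - num_source_symbols


if __name__ == "__main__":
    repair = repair_symbols_for_rate(100, 0.5)
    print(repair, raptor_code_rate(100, repair))
```

A lower code rate means more repair symbols per source block and therefore stronger protection at a higher transmission rate.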
3. H.264/AVC Baseline Profile Temporal Scalability
The MBMS Rel. 6 specification supports the use of H.264/AVC Baseline Profile Level 1b, which, amongst other restrictions, does not allow the use of bipredictive pictures. In order to provide temporal scalability, in this work we use a prediction structure with referenced (I-pictures, P-pictures) and nonreferenced (p-pictures) pictures, where the p-pictures are not used for temporal prediction by any other picture, so that discarding such a picture does not affect any other picture. Thus a coding structure with p-pictures allows for temporal scalability by dropping the p-pictures without causing any prediction errors. We examined three different coding structures, each starting with an I-picture: IpP (Figure 2) with one, IppP (Figure 3) with two, and IpppP (Figure 4) with three consecutive nonreferenced pictures between two referenced pictures. The I-picture period depends on the required Random Access Point (RAP) frequency.
Both the number of p-pictures and the characteristics of the sequence influence the scalability features and the compression efficiency of the video stream in terms of bit rate and frame rate. Keeping the frame rate constant, the IpP and the IppP coding structures allow for only one additional temporal level, with the I- and P-pictures in one layer (base layer) and the p-pictures in a second layer (temporal enhancement layer). The IpppP coding structure allows for two additional temporal levels; that is, the base layer incorporates the I- and the P-pictures, the first enhancement layer the middle p-picture, and the second enhancement layer the remaining p-pictures. Table 2 depicts the base layer and the possible temporal enhancement (enh.) layer(s), where the grayed out pictures are not included in the layer.
To get an impression of the bit rate distribution across the layers, we examined the layer bit rates of five sequences (Foreman, Football, Mobile, Paris, and Tempete) of different characteristics, using the aforementioned coding structures. All sequences have qCIF resolution at 12.5 fps and are encoded at 96 kbps. In order to allow tuning in after a video outage, there is an I-picture every 2 seconds. The results are depicted in Figures 5, 6, and 7, where the x-axis shows the frame rate in frames per second (fps) and the y-axis shows the bit rate reduction as a percentage of the full stream bit rate.
Increasing the number of p-pictures also increases the scalability ratio in terms of bit rate and frame rate. Depending on the sequence, the IpP coding allows for up to 33% bit rate reduction, which means that 33% of the total bit rate lies in the temporal enhancement layer. The frame rate can be reduced by half. Using IppP allows for a bit rate reduction of slightly over 50%, whereas the frame rate can be reduced to one-third of the full frame rate. The IpppP coding structure can offer two temporal enhancement layers. Dropping the first temporal layer results in scalability features similar to those observed with IpP coding. Additionally dropping the second temporal layer allows a bit rate reduction of up to 60% with the Football sequence and a frame rate reduction down to one-fourth of the full sequence frame rate.
As mentioned above, the number of p-pictures also influences the coding efficiency. In the following examination, we analyzed the sequences already presented. We used three different bit rates with different resolutions (qCIF and CIF), different frame rates (12.5 fps and 25 fps), and a different number of p-pictures. The results are depicted in Figures 8, 9, 10, 11, and 12, where the x-axis shows the number of p-pictures and the y-axis the video quality in terms of PSNR.
The effect on the coding efficiency depends on the characteristics of the sequence. For sequences with relatively low motion, for example, Mobile or Tempete, the coding efficiency in terms of PSNR increases with a higher number of nonreferenced p-pictures, whereas for sequences with relatively high motion, for example, Football, the coding efficiency decreases with the number of p-pictures. The Foreman and Paris sequences show a similar coding efficiency for different numbers of p-pictures.
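The layer assignment described above can be sketched as follows. This is a minimal illustration, assuming a picture type string in display order with "I" and "P" for referenced and "p" for nonreferenced pictures; the function names are hypothetical, and the frame rate helper ignores the I-picture period and looks only at the steady-state pattern.

```python
def assign_temporal_layers(picture_types: str) -> list[int]:
    """Map each picture to a layer: 0 = base, 1 = first enh., 2 = second enh."""
    layers = [0] * len(picture_types)
    i = 0
    while i < len(picture_types):
        if picture_types[i] != "p":
            i += 1  # I- and P-pictures stay in the base layer
            continue
        j = i
        while j < len(picture_types) and picture_types[j] == "p":
            j += 1
        run_length = j - i
        for pos in range(i, j):
            # Only IpppP has two enhancement layers: the middle p-picture
            # forms layer 1, the two outer p-pictures form layer 2.
            if run_length == 3 and pos != i + 1:
                layers[pos] = 2
            else:
                layers[pos] = 1
        i = j
    return layers


def frame_rate_up_to_layer(num_p: int, full_fps: float, highest_layer: int) -> float:
    """Frame rate when decoding all layers up to highest_layer (steady state)."""
    pictures_per_period = num_p + 1  # one referenced picture plus num_p p-pictures
    if highest_layer <= 0:
        kept = 1
    elif num_p == 3 and highest_layer == 1:
        kept = 2
    else:
        kept = pictures_per_period
    return full_fps * kept / pictures_per_period


if __name__ == "__main__":
    print(assign_temporal_layers("IpppPpppP"))
    print(frame_rate_up_to_layer(3, 12.5, 0))
```

Dropping every picture above a chosen layer index never removes a picture that another picture depends on, which is why the frame rate drops without prediction errors.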
Due to the better scalability feature in terms of bit rate, for the conducted simulations we use an IpppP coding structure as depicted in Figure 4. Though there are two potential temporal layers as shown in Table 2, we focused on a setting, where we combine all three p-pictures to the temporal enhancement layer, that is, to get an almost even bit rate distribution between the layers. A summary of the used layer setting can be found in Table 3.
4. Layered Transmission
The transmission schemes we have investigated can be subdivided into single layer (SL) transmission schemes and multilayer (ML) transmission schemes. When SL transmission is used, a media stream is uniformly protected against transmission errors. Using higher radio transmission power (SingleLayer) or a lower Raptor code rate (SingleLayerFEC) increases the robustness of SL transmission in MBMS. Because of the uniform protection, errors affect all parts of the media stream independent of their importance.
As already shown in the preceding section, some parts of a media stream can be considered more important than other parts; for example, referenced pictures (I- and P-) can be considered more important than nonreferenced pictures (p-pictures). Hence, it is reasonable to subdivide the media stream into two or more layers accordingly and increase the protection of the more important layers, for example, the layer containing I- and P-pictures, as compared to the less important layers. Such schemes are denoted as ML transmission schemes. To increase the reliability of the more important layer (base layer) while keeping the transmission cost constant, such an increase in reliability requires a decrease in reliability of the less important layer(s). In order to support streaming with different quality layers per service stream over 3GPP MBMS, the following methods could be used to differentiate the protection for the different layers.
(1) Transmission power per layer: the less important the layer the lower the transmission power (Unequal Transmission Power: UTP).
(2) Turbo coding scheme per layer: the less important the layer the higher the code rate.
(3) Raptor code rate per layer: the less important the layer the higher the code rate (Unequal Error Protection: UEP).
For method 1 it is necessary to map the different layers to different S-CCPCHs. Method 2 can be implemented by mapping the layers to different FACHs, which are separately channel coded. As methods 1 and 2 are expected to behave quite similarly, we focused on method 1 in the simulations.
Figure 13 illustrates the effect of transmit power setting on the transport block error probability (BLEP) for a radio bearer with 256 kbps data rate. The transport block (TB) is the smallest unit of the Cyclic Redundancy Check (CRC) protected information on the radio interface. Section 6.1 describes how TBs are mapped to video frames in the simulation environment. By choosing different power levels, the BLEP distribution that the users experience on the radio bearer can be adjusted.
For method 3, the application layer FEC (Raptor FEC) code rate can be adjusted to increase the protection of a transmission layer. For SL transmission, the SingleLayerFEC method has the same code rate for the whole stream. Using ML transmission, the UEP method has a lower code rate (higher protection) for the more important layer. For keeping the transmission rate constant, the code rate of the less important layer must be increased at the same time. Since method 3 works on application layer, the SingleLayerFEC or the UEP can be used in combination with method 1 and method 2 to further increase the robustness of a distinct layer.
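The constant-rate constraint of the UEP method can be illustrated with a short calculation. The sketch below assumes that the transmission rate of a layer is its source rate divided by its code rate and that packet header overhead is ignored; the function names and the example layer rates are illustrative assumptions, not values from the simulations.

```python
def transmission_rate_kbps(layer_rates_kbps: list[float], code_rates: list[float]) -> float:
    """Total rate after FEC: each layer's source rate divided by its code rate."""
    return sum(rate / code_rate for rate, code_rate in zip(layer_rates_kbps, code_rates))


def enhancement_code_rate(
    base_kbps: float,
    enh_kbps: float,
    single_layer_code_rate: float,
    base_code_rate: float,
) -> float:
    """Code rate of the enhancement layer that keeps the total rate equal to
    that of a single layer scheme with single_layer_code_rate."""
    budget = (base_kbps + enh_kbps) / single_layer_code_rate
    remaining = budget - base_kbps / base_code_rate
    if remaining <= 0:
        raise ValueError("base layer protection alone exceeds the rate budget")
    rate = enh_kbps / remaining
    if rate > 1.0:
        raise ValueError("budget cannot carry even an unprotected enhancement layer")
    return rate


if __name__ == "__main__":
    # Hypothetical example: two 48 kbps layers, single layer code rate 0.5,
    # base layer strengthened to code rate 0.375.
    enh = enhancement_code_rate(48.0, 48.0, 0.5, 0.375)
    print(enh, transmission_rate_kbps([48.0, 48.0], [0.375, enh]))
```

In this example the stronger base layer protection is paid for by raising the enhancement layer code rate from 0.5 to 0.75, while the total rate stays at 192 kbps.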
UEP can be achieved within the FEC framework of 3GPP Rel. 6 MBMS by adapting FEC protection under consideration of the media layers within a single media stream. Thus method 3 could principally be supported in 3GPP Rel. 6 MBMS without further modifications. On the other hand, 3GPP Rel. 6 MBMS does not provide the signaling mechanisms that would be required to allow subdivision of media streams into two or more layers and mapping of those layers to different S-CCPCHs or FACHs. Thus methods 1 and 2 could not be supported in 3GPP Rel. 6 MBMS systems without further system modifications. We still consider these methods in our simulations to compare their potential benefits relative to method 3.
In our experiments we compared the SL transmission schemes (SingleLayer/SingleLayerFEC) with ML transmission using methods 1 (UTP) and 3 (UEP), both individually and in combination (UTP_FEC, UTP_UEP).
5. Performance Criteria
5.1. Quality Assessment
The Peak Signal-to-Noise Ratio (PSNR), which is derived from the mean square distortion between the original and the reconstructed video sequence, is the most popular objective quality measure in video coding. Although we use PSNR for evaluating the encoding results in Section 3, PSNR cannot properly show the effect of prediction errors (lost I- and P-pictures), which might be highly visible but create only small distortion in terms of mean square error, nor the effect of a reduced frame rate caused by the loss of nonreferenced p-pictures. It is also obviously not suitable for representing outages in the audio play-out. Much work has been done in the area of objective video quality assessment (VQA) to find an appropriate measure. A survey of this research area is given by Engelke and Zepernick. This survey distinguishes between full-reference, reduced-reference, and nonreference methods, where "reference" indicates the original, unimpaired video sequence. The full-reference methods use the original sequence as reference to predict the quality degradation of the distorted medium. The reduced-reference methods send additional information about the original reference along with the video sequence, which can be used for the quality assessment. The nonreference or "blind" methods rate the video quality without any information about the original reference. Lotfallah et al. present a reduced-reference method in which they estimate the experienced video quality using information about the coding structure, mainly the prediction range, and the position of the lost frame in the Group of Pictures (GOP). A similar approach is presented by Da Silva et al., who propose a "Pseudo Subjective Quality Assessment" approach. In this work, they calculate an objective measure based on the loss rate of the different frame types, that is, I-, P-, and B-pictures.
However, neither approach takes the audio loss rate into account, and neither is sufficient to rate the behavior of temporal scalability. A nonreference method for audio/video streaming has also been presented, which estimates the audio-visual quality based on information about the used audio and video codecs, encoding bit rates, packet loss rates, and the duration of potential rebuffering events. Its targeted application is quality monitoring for multimedia services in 3G networks; thus both video and audio are considered. However, the approach is not tuned to consider the effects of temporal scalability or the loss of different frame types. For our quality assessment, we use a measurement that is related to [21, 22, 24], picking up the ideas of weighting the loss rate of different frame types (I-, P-, and p-pictures and audio frames), the prediction length of the video coding structure, and the PSNR measure.
We use a simple and intuitive objective quality measure that reflects the behavior of the introduced temporal scalability in a comprehensible way. We define four quality categories: maximum (max), medium (med), minimum (min), and unacceptable. The thresholds of the quality categories were selected such that each category reflects a certain user experience. A media stream sorted into the max quality has hardly any errors. Received media streams of the med quality can have a reduced frame rate, but hardly any prediction errors or audio outages. That is, in this category we assume that the loss of nonreferenced p-pictures is tolerable for the user because it only results in a reduced frame rate and does not cause any prediction errors. The min quality tolerates the loss of all p-pictures as well as some prediction errors and audio outages. The remaining simulated media streams, which do not fit into any of the aforementioned categories, are sorted into the unacceptable category.
Each of these quality categories is defined by three measurable values: lost video, lost NRF (nonreferenced frames, that is, p-pictures), and lost audio, each given as a fraction of the media. The lost video value reflects the fraction of the video sequence affected by prediction errors due to lost reference frames (I- or P-pictures). A picture is rated as affected if the PSNR value of the received picture is more than 0.5 dB below the error-free PSNR value. The lost NRF value defines the fraction of lost p-pictures, which is related to the experienced frame rate. The lost audio value shows the fraction of the media stream suffering from audio outages due to lost audio frames. The thresholds defined in this work are depicted in Table 4, where "0" stands for no loss and "1" denotes that all associated frames are lost. The depicted conditions must be fulfilled after a simulation for the test run to be rated with the related quality. Note that if a receiver's quality experience is included in the max quality category, it is also included in the med and min quality categories.
To rate the used transmission setting, we compare the percentage of all users in the cell experiencing each quality category, which reflects the coverage of that quality category in the cell. Certainly, such a quality measure cannot substitute for subjective tests, but it gives an understandable and comprehensible rating of the experienced media quality in a cell. That is, it can easily be seen how many users experience a video with no errors, with a few errors, or with some errors, and how many experience a video quality that is not acceptable.
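The categorization can be sketched in a few lines. The threshold numbers in the example below are placeholders chosen only to make the sketch runnable; they are not the values of Table 4. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Upper limits (fractions in [0, 1]) for one quality category."""
    lost_video: float
    lost_nrf: float
    lost_audio: float


def lost_video_fraction(psnr_received: list[float], psnr_error_free: list[float],
                        margin_db: float = 0.5) -> float:
    """Fraction of pictures whose PSNR is more than margin_db below error-free."""
    affected = sum(
        1 for received, clean in zip(psnr_received, psnr_error_free)
        if clean - received > margin_db
    )
    return affected / len(psnr_error_free)


def quality_categories(lost_video: float, lost_nrf: float, lost_audio: float,
                       thresholds: dict[str, Thresholds]) -> list[str]:
    """All categories whose limits are met; empty list means 'unacceptable'."""
    return [
        name for name, limit in thresholds.items()
        if lost_video <= limit.lost_video
        and lost_nrf <= limit.lost_nrf
        and lost_audio <= limit.lost_audio
    ]


# Placeholder limits, NOT the values of Table 4.
EXAMPLE_THRESHOLDS = {
    "max": Thresholds(lost_video=0.01, lost_nrf=0.01, lost_audio=0.01),
    "med": Thresholds(lost_video=0.01, lost_nrf=1.0, lost_audio=0.01),
    "min": Thresholds(lost_video=0.10, lost_nrf=1.0, lost_audio=0.10),
}

if __name__ == "__main__":
    print(quality_categories(0.0, 1.0, 0.0, EXAMPLE_THRESHOLDS))
```

Because the limits loosen from max to min, a run that satisfies max automatically appears in med and min as well, matching the nesting noted in the text.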
5.2. Used Channel Capacity (Ucc)
Each transmission scheme has a certain transmission cost, which is determined by the transmission power and the total content bit rate over all layers of a content channel, where a content channel denotes the transmission of one media stream with the selected transmission scheme. To rate a transmission scheme, these transmission costs must be taken into account.
We define the used channel capacity (Ucc) as a metric of the necessary transmission cost. A Ucc value of 1.0 means that the full transmission capacity of a cell is necessary for the transmission of one content channel with the selected transmission scheme. For instance, with a Ucc value of 0.3, three content channels with the same transmission cost can be broadcast with the given cell capacity. We have considered radio bearers with 128 kbps and 256 kbps. However, the layer rates do not match the radio bearer rates. We assume that the difference is accounted for by the normal UMTS rate matching. Thereby, the unused bearer capacity can be translated into an equivalent saving in transmit power. This is taken into account in the Ucc. An example calculation of the Ucc measure for a UTP_UEP scenario with two temporal layers is demonstrated in Table 5.
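The relation between a Ucc value and the number of content channels that fit into one cell is a simple floor division, sketched below. The function name is hypothetical, and the sketch does not reproduce the Ucc calculation of Table 5 itself.

```python
import math


def content_channels_per_cell(ucc: float) -> int:
    """Number of equally expensive content channels that fit into one cell."""
    if ucc <= 0.0:
        raise ValueError("Ucc must be positive")
    # The small epsilon guards against values such as 1 / 0.1 landing just
    # below an integer because of floating-point rounding.
    return math.floor(1.0 / ucc + 1e-9)


if __name__ == "__main__":
    print(content_channels_per_cell(0.3))
```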
6. Simulation Environment
To simulate the transmission of the presented multilayer schemes (see Section 4) over a 3GPP MBMS system we implemented a test system, which simulates a standard compliant data transmission. The simulation environment covers the audio and video encoding, "RTP/UDP/IP" packetization, Raptor en-/decoding, and the mapping on the MBMS transport packets. We build up a simulation chain as shown in Figure 14. Here one can distinguish between three different sections of functionality: the "Encoding-section", the "Simulation-section," and the "Decoding-section".
6.1. Simulation System
The "Encoding-section" covers the audio and video encoding and the Real-Time Transport Protocol (RTP) packetization. This section uses the original audio and video files of the selected sequence as input. The input files are encoded using an H.264/AVC Baseline Profile encoder for video and an audio encoder for AAC HEv2 encoding. After media encoding, the "RTP Packetizer" generates the RTP stream. To fit the Maximum Transmission Unit (MTU) of the IP layer, which in our system is set to 832 Bytes, it fragments the Network Abstraction Layer Units (NALUs), as depicted by way of example in Figure 15, using the fragmentation units (FUs) specified in RFC 3984. Finally, the "RTP Packetizer" extracts the RTP header information, for example, packet size, presentation timestamp, and sequence number, from the RTP packets and stores this information in a text file. These "virtual" RTP packets are further used in the "Simulation-section" as input.
The "Simulation-section" simulates the transmission over an MBMS channel including IP/UDP packetization and application layer FEC en-/decoding. The "Streamparser" receives the virtual RTP packets from the text files delivered by the "Encoding-section" and the "IP/UDP-packet-generator" generates virtual IP/UDP packets. Further this section distinguishes between different layers (ProtClass) and attaches virtual Raptor packets to the different IP streams of the different layers. The virtual IP streams of the different protection classes are further sent to the "MBMS-channel-simulator." The channel is simulated by the use of the MBMS loss traces (loss patterns). The generation of these loss patterns is described in Section 6.2. Each Transmit Time Interval (TTI) has 16 TBs. After the transmission, the "FEC Analyzer" checks, if the amount of received Raptor packets is sufficient for error correction. Finally, the recovered virtual RTP packets are stored in a text file (Received video packets) and statistics of the lost packets and the type of lost packets are written to a second text file (ErrorLog).
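The check performed by the "FEC Analyzer" can be sketched as follows, under the simplifying assumption of an ideal erasure code: a source block is treated as recovered as soon as the number of received symbols reaches the number of source symbols plus an optional reception overhead. The function name and the `overhead_symbols` parameter are hypothetical. Because the MBMS Raptor code is systematic, source symbols that arrived remain usable even if decoding fails.

```python
def fec_analyze(source_received: list[bool], repair_received: list[bool],
                overhead_symbols: int = 0) -> list[bool]:
    """Return, per source symbol, whether it is available after FEC decoding."""
    num_source = len(source_received)
    num_received = sum(source_received) + sum(repair_received)
    if num_received >= num_source + overhead_symbols:
        # Enough symbols arrived: the whole source block is recovered.
        return [True] * num_source
    # Decoding fails: only the source symbols that arrived are usable.
    return list(source_received)


if __name__ == "__main__":
    # One lost source symbol is repaired by one received repair symbol.
    print(fec_analyze([True, False, True, True], [True, False]))
    # Two lost source symbols, but only one repair symbol arrived.
    print(fec_analyze([True, False, False, True], [True, False]))
```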
The "Decoding-section" rebuilds an H.264/AVC data stream from the encoded H.264/AVC stream, using the information about the lost RTP packets. The "Video Rebuild Module" extracts the successfully received H.264/AVC NALUs and writes them back to an H.264/AVC data file, which is further decoded by an H.264 decoder. If referenced frames are lost, error concealment is applied. To control the subjective quality of the received video, the "Decoding-section" allows for reconstructing the received video. The "Media Rebuild Module" combines the received YUV file with the original audio stream, which is reassembled using the same error characteristics as recorded by the simulation. As a simple form of audio concealment, we assume that the volume drops to zero whenever audio packets are lost. The output is a YUV file and an AVI file, where the latter also comprises the audio. The YUV file can be used for calculating the PSNR value and the AVI file for subjective tests.
Figure 15 depicts the mapping of the encoded H.264/AVC NALUs onto the MBMS transport blocks (TBs). In our simulation we use one slice per picture; therefore, the size of a NALU mainly depends on its picture type and the sequence characteristics. To match the IP MTU size, in our simulation we use fragmented NALUs. The Raptor generation is done according to the standard specification, and the header extensions are added to the source and to the repair packets. The IP and UDP packets are finally mapped onto the MBMS transport blocks (TBs) of fixed size 82 Byte.
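The size arithmetic behind this mapping can be sketched as follows. The sketch uses the 832 Byte MTU and the 82 Byte TB size stated in the text, but it deliberately ignores all header overhead (RTP/UDP/IP, FU, Raptor, and radio layer headers) and assumes that each packet occupies whole TBs; both are simplifying assumptions, and the function names are hypothetical.

```python
MTU_BYTES = 832  # IP MTU used in the simulation system
TB_BYTES = 82    # fixed MBMS transport block size


def fragment_sizes(nalu_bytes: int, max_payload: int = MTU_BYTES) -> list[int]:
    """Split a NALU into fragments of at most max_payload bytes (headers ignored)."""
    full_fragments, rest = divmod(nalu_bytes, max_payload)
    return [max_payload] * full_fragments + ([rest] if rest else [])


def transport_blocks_needed(packet_bytes: int, tb_bytes: int = TB_BYTES) -> int:
    """Number of whole TBs a packet occupies (integer ceiling division)."""
    return -(-packet_bytes // tb_bytes)


if __name__ == "__main__":
    fragments = fragment_sizes(2000)
    print(fragments, [transport_blocks_needed(size) for size in fragments])
```

Since a lost TB invalidates the packet it carries, large pictures that span many TBs are more exposed to loss than small ones.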
6.2. MBMS Channel Simulation
In our investigation we used traces of the physical layer transport block (TB) error probability from an MBMS multicell radio network simulation of a typical urban area. The most important parameters of this simulation are summarized in Table 6.
The BLEP traces capture the error distribution and time correlations due to multipath fading. It can be shown that the coherence time of the Rayleigh fast fading process (for the so-called classical Jakes Doppler spectrum) is
$$T_c \approx \frac{0.423}{f_m},$$
with the maximum Doppler frequency:
$$f_m = \frac{v \, f_c}{c},$$
with $v$ being the terminal speed, $f_c$ the carrier frequency, and $c$ the speed of light. For the simulation, $v = 3$ km/h and $f_c = 2.1$ GHz are chosen, which gives $f_m \approx 5.8$ Hz and $T_c \approx 72.5$ milliseconds. With this, $T_c$ is smaller than the TTI, which is equal to 80 milliseconds. Therefore, little correlation of the BLEP is expected even between adjacent TTIs. For the 40 ms TTI used with 256 kbps bearers, there will be some correlation. Due to interleaving and the properties of the Turbo Code, the error correlation of transport blocks within one TTI is very high. The simulation therefore models a TTI such that either all of its TBs are correct or all of its TBs are in error. There is one trace for each of the 500 users, and each trace covers 40 s. For video sequences that last longer than the traces, the respective traces have been used multiple times, with a random starting point offset. For ML method 1, we used transport block loss traces recorded simultaneously for 2 radio bearers transmitted at different power fractions. The error probabilities of the trace are compared with the values of a pseudorandom generator to determine whether the data in a TB is lost or not. For every user the simulation is repeated four times with a different starting point in the loss pattern.
Figure 16 shows a recorded trace of such a simulation channel. The first number is the importance layer; here we have only layer 0. The following characters denote the packet type, where "_V" stands for video, "_S" for Recovery SEI, "_R" for an RTCP packet, "_A" for audio, and "_FEC0" for repair packets of layer 0. The next value is the CTS ("composition time stamp") value, followed by the IP packet size. After the arrow follows the transmission trace of the MBMS TB packets. The first value is the loss probability taken from the loss trace, and the next value is the random check value. If the random check value is lower, all TBs of the following TTI are lost, which is marked by the surrounding stars "*z*."
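The per-TTI loss decision described above can be sketched in Python as follows. This is a minimal sketch, not the authors' simulator: the function name, the wrap-around reuse of a trace, and the use of Python's `random.Random` as the pseudorandom generator are assumptions made for illustration. The comparison rule (a TTI is lost when the random check value is lower than the trace's loss probability) and the all-or-nothing loss of the TBs in a TTI follow the text.

```python
import random


def simulate_tti_losses(blep_trace, num_ttis, rng, start_offset=None):
    """Return one bool per TTI; True means all TBs of that TTI are lost.

    blep_trace   -- per-TTI loss probabilities from a loss trace
    num_ttis     -- number of TTIs to simulate
    rng          -- a random.Random instance (assumed generator)
    start_offset -- starting point in the trace; drawn at random if None
    """
    if not blep_trace:
        raise ValueError("trace must not be empty")
    if start_offset is None:
        start_offset = rng.randrange(len(blep_trace))
    lost = []
    for i in range(num_ttis):
        # Assumption: a trace shorter than the sequence is reused cyclically.
        p_loss = blep_trace[(start_offset + i) % len(blep_trace)]
        check = rng.random()  # random check value in [0, 1)
        lost.append(check < p_loss)
    return lost


if __name__ == "__main__":
    rng = random.Random(1)
    trace = [0.0, 0.05, 0.8, 1.0]
    print(simulate_tti_losses(trace, 8, rng))
```

Passing a seeded generator keeps a run reproducible, which is convenient when the same user is simulated several times with different starting points.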
6.3. Video Sequences and Application Layer FEC
An MBMS bearer in Rel. 6 allows for the transmission of up to 250 kbps. Since the MBMS Rel. 6 specification supports at most H.264/AVC Baseline Profile Level 1b, the encoding of a sequence is limited to QCIF at 15 fps and a maximum bit rate of 128 kbps. To fully exploit the given bearer rate, we assume that MBMS terminals can cope with slightly higher minimum requirements such as, for example, Level 1.1. As MBMS Rel. 7 supports H.264/AVC Baseline Profile Level 1.2, this assumption is even more justified.
We encoded three different sequences using a rate-controlled encoder with an IpppP coding structure and a random access point interval (I-picture/Recovery Point SEI) of two seconds. An overview of the sequence parameters is given in the following. The Reuter sequence is part of a news program and lasts 26 seconds. The sequence has QCIF resolution and a frame rate of 12.5 fps. It is encoded with an average PSNR of 38 dB and a resulting bit rate of 42 kbps. The Stronger sequence is part of a music clip of 64 seconds duration. The sequence has QCIF resolution and a frame rate of 25 fps. It is encoded with an average PSNR of 36 dB and a resulting bit rate of 148 kbps. The Wineyard sequence is part of a documentary and lasts 64 seconds. It has QCIF resolution at 25 fps and is encoded with an average PSNR of 36 dB at a bit rate of 88 kbps. For each sequence an audio layer is encoded with an AAC-HE v2 codec at a constant bit rate of 32 kbps and a sampling frequency of 48 kHz.
Instead of implementing the Raptor code as defined in , we model it by assuming that a Raptor code block can be decoded if the number of received symbols of that code block is at least the number of source symbols plus a code-specific symbol overhead of 3%. We chose the code block size of the Raptor code so that it covers 2 seconds of video data, with each source block containing one random access point. Longer intervals would have been preferable for better correction efficiency; however, the coding interval is also the maximum time that a UE that has just tuned into the ongoing stream must wait before it can play out the stream continuously. The tune-in time also becomes relevant when the user switches between channels, so it needs to be kept low. Following the guidelines in , the symbol size of the Raptor code is set to 32 bytes, and one Raptor repair packet contains 16 symbols.
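The simplified Raptor decodability model can be written down in a few lines of Python. The sketch below is an illustration under the stated model only (no actual Raptor encoding or decoding takes place); the function names are hypothetical, and rounding the 3% overhead up to a whole symbol is an assumption.

```python
SYMBOL_SIZE_BYTES = 32          # Raptor symbol size given in the text
SYMBOLS_PER_REPAIR_PACKET = 16  # symbols per repair packet given in the text
OVERHEAD_PERCENT = 3            # code-specific symbol overhead given in the text


def source_symbols(source_block_bytes: int) -> int:
    """Number of source symbols needed to hold one source block."""
    return -(-source_block_bytes // SYMBOL_SIZE_BYTES)  # ceiling division


def symbols_needed(k_source: int) -> int:
    """Received symbols required for decoding under the 3% overhead model.

    Assumption: the overhead is rounded up to a whole symbol.
    """
    overhead = -(-OVERHEAD_PERCENT * k_source // 100)
    return k_source + overhead


def block_decodable(k_source: int, received_symbols: int) -> bool:
    """True if the modeled Raptor code block can be recovered."""
    return received_symbols >= symbols_needed(k_source)


def repair_packet_payload_bytes() -> int:
    """Repair symbol payload carried by one Raptor repair packet."""
    return SYMBOL_SIZE_BYTES * SYMBOLS_PER_REPAIR_PACKET
```

For example, a block of 1000 source symbols is treated as decodable once any 1030 of its source and repair symbols have arrived, and each repair packet carries 512 bytes of repair symbols.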
7. Simulation Results
In the conducted simulations we compare the SL transmission schemes with the ML transmission schemes (cf. Section 4). The SL schemes carry all data within one layer, whereas in the ML schemes the more important layer 0 contains the referenced I- and P-pictures as well as the audio stream (AU), and layer 1 contains the nonreferenced p-pictures. The associated bit rates of the SL and ML streams including IP/UDP/RTP packetization overhead, the frame rate, and the video quality in terms of PSNR are shown for all sequences in Tables 8, 9, and 10. Because its importance is similar to that of the video base layer, we define the audio layer to be part of the media base layer.
To keep the total bit rate constant, the increase in reliability for the base layer must be compensated by a decrease in reliability for the enhancement layer. Hence, an even bit rate distribution is advantageous for the ML transmission schemes. Depending on the characteristics of the sequences, the bit rates in the ML case are distributed more evenly (e.g., Stronger) or less evenly (e.g., Reuter) over the layers.
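The trade-off can be made concrete with a small Python sketch. It considers application layer FEC only and ignores transmission power and packetization overhead, so it is a simplification of the comparison made in this paper; the function name and the normalization of the media bit rate to 1 are assumptions. Given the share of the bit rate carried by the base layer, the code rate of an SL reference scheme, and a stronger code rate for layer 0, it returns the code rate left for layer 1 if the total transmitted bit rate is to stay the same.

```python
def enhancement_code_rate(base_share, sl_code_rate, base_code_rate):
    """Code rate left for layer 1 at a constant total transmitted bit rate.

    base_share     -- fraction of the media bit rate in layer 0 (0 < share < 1)
    sl_code_rate   -- FEC code rate of the single-layer reference scheme
    base_code_rate -- FEC code rate chosen for layer 0

    A result above 1 means that even an unprotected layer 1 does not fit
    into the bit rate budget of the reference scheme.
    """
    if not 0.0 < base_share < 1.0:
        raise ValueError("base_share must be between 0 and 1")
    budget = 1.0 / sl_code_rate              # transmitted rate of the SL scheme
    base_cost = base_share / base_code_rate  # transmitted rate spent on layer 0
    remaining = budget - base_cost
    if remaining <= 0.0:
        return float("inf")
    return (1.0 - base_share) / remaining


if __name__ == "__main__":
    for share in (0.5, 0.72):
        print(share, enhancement_code_rate(share, 0.8, 0.7))
```

With an even split (a hypothetical base share of 0.5), protecting layer 0 at code rate 0.7 instead of 0.8 still leaves a code rate of roughly 0.93 for layer 1. With a base share of 72%, the result is above 1, so in this simplified view the same budget cannot accommodate layer 1 at all.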
The plots in Figures 17, 19, and 21 show the average quality coverage over Ucc for 500 users in the same network area, for a simulation time from 104 seconds up to 256 seconds depending on the length of the sequence. Selected results for the Stronger, Wineyard, and Reuter sequences are shown in more detail in Figures 18, 20, and 22. Each transmission scheme is marked by a different color. Different lines of the same color denote different settings, which differ in FEC code rate, in transmission power, or in both. The lowest of the connected points denotes the coverage of users receiving the max quality, the middle point the coverage of the med quality, and the highest point the coverage of the min quality. For SL transmission, the coverage of the med and max quality is almost the same, owing to the selection of quality layers in Tables 8, 9, and 10 and the quality categories defined in Table 4. For SL, if one source block is lost, I-, P-, and p-pictures as well as audio frames are affected. Because the defined quality categories are more sensitive to losses of I- and P-pictures and audio frames, it is quite unlikely for SL transmission schemes that a UE experiences the med quality category, where only losses of p-pictures are tolerated.
The cost of a service providing graceful degradation is a weaker reliability of the less important layer compared to SL transmission. Therefore, all ML transmission schemes lose coverage in the max quality compared to SL transmission. Since the overall radio resources used (transmission power and time) are about the same in this comparison, the radio resources needed for the higher robustness of layer 0 are taken away from layer 1; this layer thereby becomes less reliable, which is responsible for the decrease in max quality. The same behavior can be observed for unequal error protection using application layer FEC. Each transmission scheme is simulated with manually selected code rates and transmission power values, and each setting requires a different Ucc value. With increasing Ucc, the coverage of all transmission schemes increases due to higher transmission power or a lower code rate.
Observing the overview plots in Figures 17, 19, and 21, the use of additional FEC in the SingleLayerFEC scheme outperforms the SingleLayer approach for all sequences. That is, increasing the transmission power seems to be less efficient than decreasing the code rate in preventing error bursts that affect reference frames. The ML schemes show a wider spread between the min and max quality coverage. This results from a loss in max quality coverage caused by the lower protection of the nonreferenced pictures, and from an increase in min quality coverage caused by the higher protection of audio and the referenced pictures.
In Figures 18, 20, and 22, the simulated code rate and transmission power are shown in the legend; for example, "UTP_UEP 0.7/0.9 @ 4 dB/6 dB" denotes a UTP_UEP transmission scheme in which the code rate and transmission power are 0.7 and 4 dB for layer 0 and 0.9 and 6 dB for layer 1, respectively. The transmit power values are relative to the maximum base station total power of 20 W.
The results for the Stronger sequence are depicted in Figure 17 for Ucc values from 0.2 to 0.7. The blue curves (SingleLayerFEC_0.8_max and SingleLayerFEC_0.8_min) depict the best SL transmission setting, which is SingleLayerFEC with code rate 0.8 under varying transmission power. These blue curves are taken as the reference against which each ML scheme and setting is compared.
With the advantageous bit rate distribution of the Stronger sequence (cf. Table 8), most ML transmission schemes allow for increasing the coverage at least in the min quality and med quality categories. Due to the less robust enhancement layer, the max quality category of the ML schemes shows a weaker coverage. Figure 18 depicts a cut-out of selected results in the range from 0.35 to 0.50 in terms of Ucc, which is marked by the red box in Figure 17.
With the assumption that the loss of nonreferenced p-pictures is tolerable (cf. Section 5), we can compare the med quality coverage of the ML scheme with the max quality coverage of the SL scheme. That is, the "UEP 0.7/0.9 @ 5 dB" scheme shows a gain of 5% coverage compared to the "SingleLayerFEC 0.8" SL scheme. Whereas all selected ML schemes show a lower coverage for the max quality category, a gain for the med and the min quality category coverage can be observed. The results show that for the Stronger sequence using certain settings of code rate and transmission power, an increase in coverage can be introduced by an ML transmission scheme at similar cost.
The bit rate distribution for the Wineyard sequence using an IpppP coding structure (cf. Table 9) is less favorable for ML transmission than that of the Stronger sequence. The base layer's bit rate, which includes the audio stream, accounts for about 72% of the total bit rate, which makes it quite expensive for the enhancement layer to increase the protection of the more important base layer. The simulation results are shown in Figure 19 for Ucc from 0.23 to 0.50.
As can be observed in Figure 19, the ML schemes for the Wineyard sequence can increase the coverage for the min quality category but also show a high loss for the max quality. That is, compared to the SL schemes, using the ML schemes the loss for the coverage of the max category is higher than the gain in the min category. Figure 20 depicts the area in the red box, where selected transmission schemes can be compared in more detail.
As an exemplary setting, the med quality category of the "UEP 0.6/0.8 @ 6 dB" scheme shows a 2% gain in comparison to the max quality coverage of the "SingleLayerFEC_0.6_max" curve. On the other hand, there is a large drop of about 20% in max quality coverage. That is, the unfavorable bit rate distribution of the Wineyard sequence at QCIF resolution makes it difficult for the ML schemes to increase coverage.
Due to the low bit rate of the encoded video stream and the relatively high bit rate of the audio stream, the bit rate distribution across the layers of the Reuter sequence with IpppP coding structure (cf. Table 10) is even worse than for the Wineyard sequence. The ML schemes transport 77% of the total bit rate in the base layer. Figure 21 shows the simulation results for the Reuter sequence for Ucc from 0.06 to 0.31, and Figure 22 gives a more detailed view of exemplary results for Ucc from 0.16 to 0.23.
The ML schemes do not show any gain in terms of coverage compared to the "SingleLayerFEC_0.7" curve. The results show that, with the Reuter sequence and its very low bit rates and unfavorable distribution across the layers, an ML scheme cannot show any gain compared to a standard SL scheme with application layer FEC protection.
The bit rate distribution across the layers is the main influence on the performance of the ML transmission schemes. As can be observed for the Stronger sequence, with an even bit rate distribution and advantageous scheme settings, the ML transmission schemes can increase the coverage at a similar Ucc value. The results for the Wineyard sequence show only a small increase in coverage, and those for the Reuter sequence show none. For both sequences, the video stream alone provides a fairly even bit rate distribution, but the audio bit rate is relatively high compared to the bit rate of the video at QCIF resolution. Note that at higher resolutions the video bit rate share is much higher, whereas the audio bit rate remains constant. That is, at higher resolutions the total bit rate distribution for the Reuter and Wineyard sequences can be more favorable for the temporal scalability feature introduced in this work.
8. Conclusion and Summary
In this work, we propose transmission schemes for 3GPP MBMS Rel. 6 that allow for graceful degradation with H.264/AVC Baseline Profile temporal scalability using unequal error protection, radio transmission power spreading, or a combination of both. We implemented a test system simulating a 3GPP MBMS Rel. 6 compliant transmission, including application layer FEC using the systematic Raptor code. We introduce quality categories for video and audio that pick up the ideas of weighting the loss rate of different frame types (I-, P-, and p-pictures and audio frames), the prediction length of the video coding structure, and the PSNR measure. The simulation results show that, in principle, graceful degradation can be applied to 3GPP MBMS. Using graceful degradation, a higher number of users is provided with an acceptable media quality. As expected, the number of users receiving perfect media quality is reduced. The results depend strongly on the bit rate distribution between the layers. With an even bit rate distribution between base and enhancement layer, the most significant gains are observed. With an uneven bit rate distribution, gains are lower or not present. Future work could involve the use of temporal scalability together with a hierarchical prediction coding structure.
References
3GPP TS 25.346: Introduction of the multimedia broadcast multicast service (MBMS) in the radio access network (RAN), stage 2. V6.13.0, March 2008
3GPP TS 26.346: Technical specification group services and system aspects; multimedia broadcast/multicast service (MBMS); Protocols and codecs. V6.13.0, March 2009
Hartung F, Horn U, Huschke J, Kampmann M, Lohmar T, Lundevall M: Delivery of broadcast services in 3G networks. IEEE Transactions on Broadcasting 2007, 53(1):188-199.
Afzal J, Stockhammer T, Gasiba T, Xu W: System design options for video broadcasting over wireless networks. Proceedings of the 3rd IEEE Consumer Communications and Networking Conference (CCNC '06), January 2006, Las Vegas, Nev, USA 2: 938-943.
Digital Video Broadcasting (DVB): IP datacast over DVB-H: architecture. ETSI TR 102 469, 2006
Digital Video Broadcasting (DVB): System specifications for satellite services to handheld devices (SH) below 3 GHz. ETSI TR 102 585, 2007
ITU-T, ISO/IEC JTC 1: Advanced video coding for generic audiovisual services, ITU-T Recommendation H.264 and ISO/IEC 14496-10 (AVC). Version 8, July 2007
Barmada B, Ghandi MM, Jones EV, Ghanbari M: Prioritized transmission of data partitioned H.264 video with hierarchical QAM. IEEE Signal Processing Letters 2005, 12(8):577-580.
Chang YC, Lee SW, Komiya R: A low-complexity unequal error protection of H.264/AVC video using adaptive hierarchical QAM. IEEE Transactions on Consumer Electronics 2006, 52(4):1153-1158.
De Cock J, Notebaert S, Van de Walle R: Combined SNR and temporal scalability for H.264/AVC using requantization transcoding and hierarchical B pictures. Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '07), July 2007, Beijing, China 448-451.
Schierl T, Schwarz H, Marpe D, Wiegand T: Wireless broadcasting using the scalable extension of H.264/AVC. Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '05), July 2005, Amsterdam, The Netherlands 884-887.
Girod B, Horn U, Belzer B: Scalable video coding with multiscale motion compensation and unequal error protection. Proceedings of the International Symposium on Multimedia Communications and Video Coding, October 1995, New York, NY, USA
Albanese A, Bloemer J, Edmonds J, Luby M, Sudan M: Priority encoding transmission. IEEE Transactions on Information Theory 1996, 42(6):1737-1744. 10.1109/18.556670
FLO technology overview Whitepaper, Qualcomm, San Diego, Calif, USA, 2007
Correia AMC, Silva JCM, Souto NMB, Silva LAC, Boal AB, Soares AB: Multi-resolution broadcast/multicast systems for MBMS. IEEE Transactions on Broadcasting 2007, 53(1):224-234.
Luby M, Gasiba T, Stockhammer T, Watson M: Reliable multimedia download delivery in cellular broadcast networks. IEEE Transactions on Broadcasting 2007, 53(1):235-245.
3GPP TS 25.992: Multimedia broadcast multicast service (MBMS); UTRAN/GERAN requirements. V6.0.0, October 2006
3GPP TS 25.331: Technical specification group radio access network; radio resource control (RRC); protocol specification (release 6). V6.21.0, March 2009
3GPP TR 25.905: Improvement of the multimedia broadcast multicast service (MBMS) in UTRAN. V7.0.0, 2006
Engelke U, Zepernick H: Perceptual-based quality metrics for image and video services: a survey. Proceedings of the 3rd EuroNGI Conference on Next Generation Internet Networks (NGI '07), May 2007, Trondheim, Norway 190-197.
Lotfallah OA, Reisslein M, Panchanathan S: A framework for advanced video traces: evaluating visual quality for video transmission over lossy networks. EURASIP Journal on Applied Signal Processing 2006, 2006:-21.
Couto da Silva AP, Rodríguez-Bocca P, Rubino G: Optimal quality-of-experience design for a P2P multi-source video streaming. Proceedings of the IEEE International Conference on Communications (ICC '08), May 2008, Beijing, China 22-26.
Mohamed S, Rubino G: A study of real-time packet video quality using random neural networks. IEEE Transactions on Circuits and Systems for Video Technology 2002, 12(12):1071-1083. 10.1109/TCSVT.2002.806808
Video streaming quality measurement with VSQI Ericsson, Stockholm, Sweden; 2006.
ISO/IEC 14496-3:2001: Coding of audio-visual objects, part 3: audio, final committee draft. July 2001
Wenger S, Hannuksela M, Stockhammer T, Westerlund M, Singer D: RTP payload format for H.264 video. RFC 3984, February 2005
ITU-Telecommunications Standardization Sector; Video Coding Experts Group (VCEG): Non-normative error concealment algorithms. Document VCEG-N62; September 2001
Rappaport TS: Wireless Communications. Prentice-Hall, Englewood Cliffs, NJ, USA; 1996.
This work has been partly funded by the German Federal Ministry of Education and Research (BMBF) under Grant 01BU355. The authors are responsible for the content of this publication.