# Minimum decoding trellis length and truncation depth of wrap-around Viterbi algorithm for TBCC in mobile WiMAX

- Yu-Sun Liu^{1} (corresponding author)
- Yao-Yu Tsai^{1}

**2011**:111

https://doi.org/10.1186/1687-1499-2011-111

© Liu and Tsai; licensee Springer. 2011

**Received:** 21 June 2011

**Accepted:** 25 September 2011

**Published:** 25 September 2011

## Abstract

The performance of the wrap-around Viterbi decoding algorithm with finite truncation depth and fixed decoding trellis length is investigated for tail-biting convolutional codes in the mobile WiMAX standard. Upper bounds on the error probabilities induced by finite truncation depth and the uncertainty of the initial state are derived for the AWGN channel. The truncation depth and the decoding trellis length that yield negligible performance loss are obtained for all transmission rates over the Rayleigh channel using computer simulations. The results show that the circular decoding algorithm with an appropriately chosen truncation depth and a decoding trellis just a fraction longer than the original received code words can achieve almost the same performance as the optimal maximum likelihood decoding algorithm in mobile WiMAX. A rule of thumb for the values of the truncation depth and the trellis tail length is also proposed.

## 1 Introduction

The IEEE 802.16 standard defines the wireless metropolitan area network (MAN) technology that is commonly referred to as WiMAX. It includes two sets of standards, IEEE 802.16-2004 (802.16d) [1] for fixed WiMAX and IEEE 802.16-2005 (802.16e) [2] for mobile WiMAX. In mobile WiMAX, tail-biting convolutional codes (TBCCs) [3] are designated as the mandatory error-correcting codes. In the WiMAX transmitters, data bursts are divided into data blocks, and each data block is separately encoded by a TBCC encoder. The circular decoding algorithm [4–6], in which the wrap-around Viterbi algorithm traverses the circular code trellis, has been shown to be a simple and effective decoding method for TBCCs. Its performance depends on both the truncation depth of the Viterbi algorithm [7] and the length of the circular decoding trellis [8]. A larger truncation depth or a longer decoding trellis improves the performance, but at the cost of more computational overhead and a longer delay.

The goal of this paper is to investigate how to choose the truncation depth and the decoding trellis length in mobile WiMAX. The rule of thumb for the truncation depth has been studied in the literature [9, 10], but never for higher order modulations on the Rayleigh channel. Several circular decoding algorithms with adaptive decoding trellis length were proposed in [11–14]. These methods do not guarantee a fixed number of computations. For DSP/ASIC implementation, however, a fixed decoding trellis length with a fixed number of computations and a fixed delay is preferable. In this paper, we examine the performance of the circular decoding algorithm with finite truncation depth and fixed trellis length for all transmission rates in mobile WiMAX. We first derive upper bounds on the error probabilities induced by finite truncation depth and finite trellis length. We show that the circular decoding algorithm with an appropriately chosen truncation depth and a fixed-length decoding trellis just a fraction longer than the original one can achieve almost the same performance as the maximum likelihood (ML) decoding algorithm in mobile WiMAX. Moreover, the truncation depths and trellis lengths that yield losses of 0.05 dB relative to the ML decoding algorithm are obtained for all transmission rates on the Rayleigh channel. Finally, we also obtain a rule of thumb for the relative values of the truncation depth and the trellis tail length.

## 2 Circular decoding algorithm

In mobile WiMAX systems, data bursts are divided into data blocks. Each data block is separately encoded by the binary (171, 133) convolutional encoder with memory *m* = 6. Before encoding, the convolutional encoder memory is initialized with the last 6 bits of the data block being encoded. Thus, the initial state of the code trellis is the same as the end state. After encoding, the TBCC code word is then punctured to realize the designated code rate *r*, where *r* can be one of the three possible code rates 1/2, 2/3, or 3/4. Let *L* denote the length of a data block, and let (*d*_{0}, *d*_{1},⋯, *d*_{L-1}) denote the data block. It follows that the resulting TBCC code word with length *n* = *L/r* can be viewed as one period of the periodic convolutional code word generated by periodic data bits with period *L*. The circular decoding algorithm (similar to the one in [6]) with truncation depth *W* and trellis tail length *U* discussed in this paper is described as follows:

**Step 1:** For each received codeword metric sequence $\bar{v} = (v_0, v_1, \dots, v_{n-1})$, lengthen the sequence by copying the first *U/r* entries of the sequence and appending them to the end of the sequence.

**Step 2:** Data bits are decoded by using the soft-decision Viterbi algorithm with truncation depth *W* [9] and decoding trellis length *L* + *U*. It is convenient to explain the Viterbi decoding algorithm by means of a trellis diagram. Figure 1 illustrates an example of a decoding trellis for a convolutional code with *m* = 2. The Viterbi algorithm is initialized by assigning the same metric value to all possible initial states. At each decoding depth *t*, *t* ≥ *W* - 1, the information bit on the branch at depth *t* - *W* + 1 is decoded by selecting the best survivor path at state *S*_{t+1} and tracing back the path to find the information bit *d*_{t-W+1}. Thus, a total of *L* + *U* - *W* + 1 data bits are decoded by the Viterbi algorithm. It is to be noted that the last *U* - *W* + 1 decoded bits are obtained in the second round of traversing the circular decoding trellis.

**Step 3:** Replace the first *U - W* + 1 decoded data bits by the last *U - W* + 1 decoded bits to obtain the final data sequence of length *L*. Since the initial state of the TBCC encoder is unknown to the Viterbi decoder, the bit error rates (BERs) of the first few decoded data bits are much larger than those of the rest. Thus, the first few unreliable decoded bits are replaced by those decoded bits obtained in the second traverse of the circular decoding trellis.
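Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers only the unpunctured rate-1/2 case, assumes a ±1 soft-value convention (a positive value favours a coded 0), assumes that the most significant bit of each octal generator taps the current input bit, and stores the full survivor history rather than only the last *W* decisions. The function names, the bit ordering, and the block length of 48 bits in the demo are choices made here for illustration.

```python
import random

M = 6                       # encoder memory
GENS = (0o171, 0o133)       # octal generators; MSB is assumed to tap the current input bit
NSTATES = 1 << M


def step(state, bit):
    """One encoder transition: returns (next_state, (c0, c1))."""
    reg = (bit << M) | state
    out = tuple(bin(reg & g).count("1") & 1 for g in GENS)
    return reg >> 1, out


def tbcc_encode(data):
    """Unpunctured rate-1/2 tail-biting encoding of one data block."""
    state = 0
    for b in data[-M:]:                 # preload memory with the last M data bits
        state, _ = step(state, b)
    start, code = state, []
    for b in data:
        state, out = step(state, b)
        code.extend(out)
    assert state == start               # initial state equals end state
    return code


def circular_viterbi(v, W, U):
    """Steps 1-3 for rate 1/2; v holds n = 2L soft values (positive favours a coded 0)."""
    L = len(v) // 2
    assert W - 1 <= U <= L
    ext = list(v) + list(v[:2 * U])     # Step 1: append the first U/r = 2U entries
    metric = [0.0] * NSTATES            # Step 2: equal metric for every initial state
    history, decoded = [], []
    for t in range(L + U):
        r0, r1 = ext[2 * t], ext[2 * t + 1]
        new_metric, choice = [0.0] * NSTATES, [0] * NSTATES
        for ns in range(NSTATES):
            bit = ns >> (M - 1)         # input bit on every branch entering ns
            best = None
            for lsb in (0, 1):          # the two predecessor states of ns
                prev = ((ns << 1) & (NSTATES - 1)) | lsb
                c0, c1 = step(prev, bit)[1]
                cand = metric[prev] + r0 * (1 - 2 * c0) + r1 * (1 - 2 * c1)
                if best is None or cand > best:
                    best, choice[ns] = cand, lsb
            new_metric[ns] = best
        metric = new_metric
        history.append(choice)
        if t >= W - 1:                  # decode the bit on branch t - W + 1
            s = max(range(NSTATES), key=metric.__getitem__)
            for j in range(W - 1):      # trace back W - 1 branches
                s = ((s << 1) & (NSTATES - 1)) | history[t - j][s]
            decoded.append(s >> (M - 1))
    r = U - W + 1                       # Step 3: swap in the second-round bits
    return decoded[L:L + r] + decoded[r:L]


random.seed(0)
data = [random.randint(0, 1) for _ in range(48)]
soft = [1.0 - 2.0 * c for c in tbcc_encode(data)]   # noiseless +/-1 mapping
recovered = circular_viterbi(soft, W=28, U=46)
print(recovered == data)
```

The demo only exercises a noiseless channel, so it checks the bookkeeping of the three steps rather than error performance. A fixed-complexity implementation would keep only the most recent *W* survivor decisions and would add depuncturing for the rate-2/3 and rate-3/4 codes.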

## 3 Upper bounds on error probabilities

[…] *S*_{i} and data bit *d*_{i} in the example in Figure 1), respectively. The bit error probability of the *k*th decoded data bit in Step 2, *k* = 0, 1, ..., *L* + *U* - *W*, is upper bounded by the sum of the probabilities of the following four error events.

1. The chosen path at decoding depth *k* + *W* - 1 diverges from the correct path at state $S_{t_1}$, 0 ≤ *t*_{1} ≤ *k*, and merges into the correct path for the first time at state $S_{t_2}$, *k* < *t*_{2} ≤ *k* + *W*, and the decoded data bit $d_k^* \ne 0$.
2. The chosen path at decoding depth *k* + *W* - 1 diverges from the correct path at state $S_{t_1}$, 0 ≤ *t*_{1} ≤ *k*, never merges with the correct path, and reaches state $S_{k+W}^*$, $S_{k+W}^* \ne \bar{0}$.
3. The chosen path has an initial state $S_0^* \ne \bar{0}$ and merges into the correct path for the first time at state $S_{t_2}$ with *k* + *m* < *t*_{2} ≤ *k* + *W*. (This is because if a path merges into the correct path at state $S_{t_2}$, the last *m* data bits must be correct.)
4. The chosen path has an initial state $S_0^* \ne \bar{0}$, never merges with the correct path, and reaches state $S_{k+W}^* \ne \bar{0}$.

The probability of the first error event, *P*_{1}, is upper bounded by the bit error probability of ML decoding for zero-tail convolutional codes. Let […], where *d*_{free} is the free distance of the convolutional code, *a*_{ij} is the number of paths with Hamming weight *i* that are generated by data sequences containing *j* non-zero bits, and the exponents of *D* and *N* describe the Hamming weights of coded sequences and data sequences of the paths, respectively. From [15], we get […], where *E*_{s} is the energy of a QPSK signal, *N*_{0}/2 is the power spectral density of AWGN, and […] *S*_{B}, $S_B \in \mathcal{B}$, end in a state *S*_{E}, $S_E \in \mathcal{E}$, and never merge with the all-zero path in between, where *b*_{i} is the number of such paths with Hamming weight *i*, and the exponent of *D* describes the Hamming weights of such paths.

The probability of the second error event, *P*_{2}, is upper bounded by the sum of the error probabilities caused by each possible error path in the second error event. Thus, by following an argument similar to the ones in [10, 15], *P*_{2} satisfies […], where *d*_{2} is the minimum weight of error paths in the second error event, and $\mathcal{A}$ is the set of all 64 states. The third error event is caused by the uncertainty of the encoder's initial state. Similarly, the error probability *P*_{3} of the third error event satisfies […], where *d*_{3} is the minimum weight of error paths in the third error event. Finally, the error probability of the last error event, *P*_{4}, satisfies […],

where *d*_{4} is the minimum weight of error paths in the fourth error event. This error probability is caused by both finite truncation depth and the uncertainty of the encoder's initial state. The bit error probability of the *k*th decoded bit is upper bounded by the sum of the four upper bounds in (2), (7), (8), and (9).

[…] *D*_{0} is very small, and only the term with the smallest power of *D* is significant. In the upper bounds of *P*_{1}, *P*_{2}, *P*_{3}, and *P*_{4}, the smallest powers of *D* are *d*_{free}, *d*_{2}, *d*_{3}, and *d*_{4}, respectively. Let *W** and *k** be the least values of *W* and *k* such that *d*_{2} > *d*_{free} and *d*_{3} > *d*_{free}, respectively. It follows that, if the truncation depth is *W** and the first *k** - 1 decoded bits in Step 2 are replaced in Step 3 (equivalently, trellis tail length *U** = *W** + *k** - 2), the error probabilities *P*_{2} and *P*_{3} of each bit in the final data sequence will be small compared to the error probability of ML decoding for zero-tail convolutional codes. The values of *W**, *k**, and *U** for the three code rates are obtained using a method similar to the one in [10] and are listed in Table 1. It is to be noted that for the rate-2/3 and rate-3/4 TBCCs, *W** may be different for different bits in a puncturing period. Thus, the values of *W** in Table 1 are the maximum values of *W** over a puncturing period. Table 1 also lists $d_4^*$, the value of *d*_{4} for the case *W* = *W** and *k* = *k**. Observe that $d_4^* > d_{\mathrm{free}}$ for every code rate. Therefore, we conclude that the bit error rate of the circular decoding algorithm with *W* = *W** and *U* = *U** will asymptotically approach that of the ML decoding algorithm for zero-tail convolutional codes at high signal-to-noise ratio. From (7) and (8), it follows that if the generator polynomials of convolutional codes are symmetric [10], then $k^* = W^* - m - 1$.

Table 1. The values of *W**, *k**, and *U** for TBCCs

| Code rate | *d*_{free} | *W** | *k** | *U** | $d_4^*$ |
|---|---|---|---|---|---|
| 1/2 | 10 | 28 | 20 | 46 | 14 |
| 2/3 | 6 | 34 | 26 | 58 | 7 |
| 3/4 | 5 | 48 | 42 | 88 | 7 |

From Table 1 we observe that even though the three codes in mobile WiMAX do not have symmetric generator polynomials, *W** - *m* - 1 is still a good estimate of *k**.
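The relations read off Table 1 can be verified with a few lines of Python. This is only a consistency check on the tabulated numbers; the variable and function names are illustrative and do not come from the paper.

```python
M = 6  # memory of the (171, 133) encoder

# (code rate, d_free, W*, k*, U*, d4*) transcribed from Table 1
TABLE_1 = [
    ("1/2", 10, 28, 20, 46, 14),
    ("2/3", 6, 34, 26, 58, 7),
    ("3/4", 5, 48, 42, 88, 7),
]


def check_row(rate, d_free, w, k, u, d4):
    """Check one row against the relations stated in the text."""
    return {
        "rate": rate,
        "tail_ok": u == w + k - 2,          # U* = W* + k* - 2
        "k_estimate": w - M - 1,            # estimate W* - m - 1
        "k_error": (w - M - 1) - k,         # estimate minus tabulated k*
        "d4_exceeds_dfree": d4 > d_free,    # d4* > d_free
    }


rows = [check_row(*row) for row in TABLE_1]
for r in rows:
    print(r)
```

For all three code rates the estimate *W** - *m* - 1 differs from the tabulated *k** by a single trellis section.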

[…] *P*_{3} and *P*_{4} for each data bit (after replacement in Step 3) are much smaller than those of *P*_{1} and *P*_{2}. Thus, the average bit error rate is upper bounded by *P*_{1} + *P*_{2}. In other words, the degradation of decoder performance is mainly caused by finite truncation depth. It is to be noted that even if the trellis tail length is only 60 + *W* (equivalently, the first 61 decoded bits are replaced in Step 3), the value of *P*_{3} + *P*_{4} is many orders of magnitude smaller than the upper bounds of *P*_{1} + *P*_{2}. Figure 2 plots the upper bounds of *P*_{1} and *P*_{2} and their sum versus the truncation depth *W* for *E*_{b}/*N*_{0} = 4 dB. At this signal-to-noise ratio, the BER of the optimal ML decoding algorithm (without memory truncation) is approximately 10^{-5}. For comparison, simulation results of BER with tail length *U* = 120 are also plotted in the figure. We observe that the upper bound of *P*_{2} decreases exponentially with the truncation depth *W*, so that BER is dominated by the error probability *P*_{1} for truncation depth *W* ≥ *W'* = 35.

[…] *P*_{2} and *P*_{4} are much smaller than those of *P*_{1} and *P*_{3} for each decoded bit. Thus, the BER is upper bounded by *P*_{1} + *P*_{3}. It is to be noted that even if the truncation depth is only 60, the value of *P*_{2} + *P*_{4} is many orders of magnitude smaller than the upper bounds of *P*_{1} + *P*_{3}. Figure 3 plots the upper bounds of *P*_{1} and *P*_{3} and their sum for each decoded bit with *E*_{b}/*N*_{0} = 4 dB. For comparison, the simulated BER for each decoded bit with truncation depth *W* = 100 is also plotted in the figure. We observe that the upper bound of *P*_{3} decreases exponentially as the bit index *k* increases. In other words, the performance degradation caused by the uncertainty of the initial state abates rapidly as the decoder traverses the trellis. From the figure, we see that BER is dominated by the error probability *P*_{1} for *k* ≥ *k'* = 27. Thus, if the first *k'* - 1 decoded bits are replaced in Step 3, all the resulting data bits will have almost the same bit error probability. It is noteworthy that *W'* - *m* - 1 is still a good estimate of *k'*. Figures 2 and 3 are plotted for *E*_{b}/*N*_{0} = 4 dB. As *E*_{b}/*N*_{0} increases, the values of *W'* and *k'* decrease. Moreover, the values of *W'* and *k'* approach the values of *W** and *k** in Table 1 as *E*_{b}/*N*_{0} approaches 5 dB and BER ≈ 5 × 10^{-7}.

[…] *P*_{4}. Figure 4 plots the upper bounds of *P*_{4} and *P*_{1} + *P*_{2} + *P*_{3} for each decoded bit with *E*_{b}/*N*_{0} = 4 dB. The upper bound of *P*_{4} depends on both the truncation depth *W* and the bit index *k*. From the figure, we observe that if the truncation depth is chosen as *W'* = 35, the sum of the upper bounds on the other three error probabilities is much larger than the upper bound of *P*_{4}. As the truncation depth *W* increases, the contribution of *P*_{4} to BER becomes even more insignificant.

## 4 Simulation results

[…] *L* equal to the maximum data block length in the mobile WiMAX standard. Figures 5, 6, and 7 plot the average BERs of the circular decoding algorithm versus truncation depth for QPSK rate-1/2, 64QAM rate-2/3, and QPSK rate-3/4, respectively, with very long tail length (*U* = 120). As a benchmark for comparison, the average BER of the optimal ML decoding algorithm (without memory truncation) is also plotted in the figures. These figures show that the circular decoding algorithm with a sufficiently large truncation depth can achieve almost the same error performance as optimal ML decoding. We also observe that all TBCCs require smaller truncation depth as *E*_{b}/*N*_{0} increases, which agrees with the observation in the previous section. Table 2 lists the least value of truncation depth $\tilde{W}$ that yields losses within 0.05 dB of optimal ML decoding for BER ≈ 10^{-5}.

The *E*_{b}/*N*_{0} values in Table 2 are the required bit signal-to-noise ratios for BER ≈ 10^{-5}. From the table, we obtain a rule of thumb for the truncation depth $\tilde{W}$. The rate-1/2 code requires a truncation depth of six to seven times the memory of the convolutional code, and the rate-2/3 and rate-3/4 codes require a truncation depth of ten to eleven times the memory. From the table, we also observe that high-order modulations require larger truncation depths than low-order ones.

Table 2. The values of $\tilde{W}$, $\tilde{k}$, and $\tilde{U}$ for all transmission rates.

| Modulation | Code rate | *E*_{b}/*N*_{0} (dB) | $\tilde{W}$ | $\tilde{k}$ | $\tilde{U}$ |
|---|---|---|---|---|---|
| QPSK | 1/2 | 8 | 35 | 26 | 59 |
| QPSK | 3/4 | 12.5 | 60 | 45 | 103 |
| 16QAM | 1/2 | 11 | 39 | 27 | 64 |
| 16QAM | 3/4 | 16 | 65 | 42 | 105 |
| 64QAM | 1/2 | 14 | 43 | 36 | 77 |
| 64QAM | 2/3 | 17 | 64 | 52 | 114 |
| 64QAM | 3/4 | 20 | 67 | 57 | 122 |
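The rule of thumb for the truncation depth can be checked against Table 2 by dividing each tabulated depth by the encoder memory *m* = 6. The short Python sketch below does this; the names are illustrative only and the numbers are transcribed from the table.

```python
from collections import defaultdict

M = 6  # memory of the (171, 133) encoder

# (modulation, code rate, truncation depth) transcribed from Table 2
TABLE_2_W = [
    ("QPSK", "1/2", 35), ("QPSK", "3/4", 60),
    ("16QAM", "1/2", 39), ("16QAM", "3/4", 65),
    ("64QAM", "1/2", 43), ("64QAM", "2/3", 64), ("64QAM", "3/4", 67),
]


def depth_to_memory_ratios(rows, memory=M):
    """Group depth/memory ratios by code rate, keeping the table's modulation order."""
    ratios = defaultdict(list)
    for _modulation, rate, w in rows:
        ratios[rate].append(round(w / memory, 2))
    return dict(ratios)


ratios = depth_to_memory_ratios(TABLE_2_W)
for rate, values in ratios.items():
    print(f"rate {rate}: depth / m = {values}")
```

The ratios fall close to, though not strictly inside, the nominal ranges (for example about 5.8 for QPSK rate-1/2 and about 11.2 for 64QAM rate-3/4), which is consistent with the statement being a rule of thumb. Within each code rate the ratio grows with the modulation order.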

[…] *E*_{b}/*N*_{0} = 8 dB), 64QAM rate-2/3 (with *E*_{b}/*N*_{0} = 17 dB), and QPSK rate-3/4 (with *E*_{b}/*N*_{0} = 12.5 dB) with large truncation depth (*W* = 100), respectively. We observe that even though the BER tends to decrease in general as the Viterbi decoder traverses the trellis, the BER is not a monotonically decreasing function of the index *k*. This is caused by a short block length *L* and small interleaving depth in Figure 8. When pairs of coded bits for QPSK signals are deinterleaved to form a codeword trellis in the receivers, coded bits in some pairs end up being very close to each other on the trellis, while others are further apart. In Figures 9 and 10, this problem is further complicated by code puncturing in rate-2/3 and rate-3/4 convolutional codes and unequal protection of each coded bit in high-order modulation. Define $\tilde{k}$ as the index of the first decoded bit that attains losses within 0.05 dB of optimal ML decoding. Table 2 lists the values of $\tilde{k}$ for BER ≈ 10^{-5}. It is to be noted that for rate-2/3 and rate-3/4 TBCCs, each data bit in a puncturing period has a different protection level, and the values of $\tilde{k}$ in the table are obtained by using the average BERs over a puncturing period. We observe that only a small fraction (less than 1/3) of the decoded bits in the first decoding round are unreliable, and thus should be replaced. We conclude that if the truncation depth is $\tilde{W}$ and the first $\tilde{k} - 1$ decoded bits in Step 2 are replaced in Step 3 (equivalently, trellis tail length $\tilde{U} = \tilde{W} + \tilde{k} - 2$), the losses caused by truncation and by the uncertainty of the initial state will both be within 0.05 dB relative to ML decoding. We also observe that $\tilde{k} \le \tilde{W} - m - 1$ for all transmission rates. Thus, $\tilde{W} - m - 1$ can be used as a rule of thumb for the choice of $\tilde{k}$. It is noted that if the tail length is instead chosen so that the average BER over the whole TBCC codeword attains a loss of less than 0.05 dB, the tail length will be substantially less than $\tilde{U}$. Finally, Figure 11 plots the average BERs of the circular decoding algorithm with truncation depth $\tilde{W}$ and trellis tail length $\tilde{U}$ for all transmission rates in mobile WiMAX.
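The two relations used in this paragraph, the tail length as truncation depth plus first reliable bit minus 2 and the bound of the first reliable bit by truncation depth minus *m* minus 1, can be checked row by row against Table 2. The Python sketch below is only such a check; the names are illustrative and the numbers are transcribed from the table.

```python
M = 6  # memory of the (171, 133) encoder

# (modulation, code rate, W~, k~, U~) transcribed from Table 2
TABLE_2 = [
    ("QPSK", "1/2", 35, 26, 59), ("QPSK", "3/4", 60, 45, 103),
    ("16QAM", "1/2", 39, 27, 64), ("16QAM", "3/4", 65, 42, 105),
    ("64QAM", "1/2", 43, 36, 77), ("64QAM", "2/3", 64, 52, 114),
    ("64QAM", "3/4", 67, 57, 122),
]


def summarize(w, k, u, memory=M):
    """Evaluate the tail-length relation and the bound on k~ for one row."""
    return {
        "tail_ok": u == w + k - 2,            # U~ = W~ + k~ - 2
        "replaced_bits": k - 1,               # bits replaced in Step 3
        "k_bound": w - memory - 1,            # W~ - m - 1
        "bound_holds": k <= w - memory - 1,   # k~ <= W~ - m - 1
    }


summary = {(mod, rate): summarize(w, k, u) for mod, rate, w, k, u in TABLE_2}
for key, value in summary.items():
    print(key, value)
```

Every row satisfies both relations, and the bound is met with equality for 64QAM rate-1/2.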

## 5 Conclusions

We have investigated the error probabilities of TBCCs caused by memory truncation and the uncertainty of the initial state. From the upper bounds on the error probabilities, we found that if the same criterion is used to choose the truncation depth *W* and the first reliable decoded bit *k*, then *k* = *W* - *m* - 1 for symmetric convolutional codes. The truncation depth, the index of the first reliable bit, and the trellis tail length with 0.05 dB losses on the Rayleigh channel were obtained by simulation for each transmission rate in the mobile WiMAX standard. From the results, we obtain a rule of thumb for the truncation depth *W* and the trellis tail length *U*. The rate-1/2 code requires a truncation depth of six to seven times the memory *m*, and the rate-2/3 and rate-3/4 codes require a truncation depth of ten to eleven times *m*. Moreover, *W* - *m* - 1 is an appropriate rule of thumb for the first reliable decoded bit *k*. Thus, the rule of thumb for the trellis tail length is *U* = 2*W* - *m* - 3. The results show that the circular decoding algorithm with an appropriately chosen truncation depth and a circular trellis just a fraction longer than the original trellis can achieve almost the same performance as the optimal ML decoding algorithm in mobile WiMAX. Moreover, it is observed that high-rate TBCCs require larger truncation depths and longer trellis lengths than low-rate ones, and high-order modulations require larger truncation depths and longer trellis lengths than low-order ones.
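The tail-length rule follows by substituting the estimate of the first reliable bit into the relation between tail length, truncation depth, and first reliable bit used in Sections 3 and 4. The numerical case below takes *m* = 6 and the QPSK rate-1/2 truncation depth from Table 2 purely as an illustration.

$$
U = W + k - 2, \qquad k = W - m - 1 \quad\Longrightarrow\quad U = W + (W - m - 1) - 2 = 2W - m - 3 .
$$

$$
W = 35,\ m = 6: \qquad U = 2 \cdot 35 - 6 - 3 = 61 .
$$

This is slightly larger than the simulated value $\tilde{U} = 59$ in Table 2, because the simulated $\tilde{k} = 26$ is smaller than the estimate $W - m - 1 = 28$.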

## Declarations

### Acknowledgements

This work was supported by the National Science Council, R.O.C., under the contract NSC 97-2218-E-027-010.

## References

1. IEEE: IEEE Standard for Local and Metropolitan Area Networks. *Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, IEEE Std 802.16-2004. New York* 2004.
2. IEEE: IEEE Standard for Local and Metropolitan Area Networks. *Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, IEEE 802.16e-2005 and IEEE Std 802.16-2004/Cor 1-2005 (Amendment and Corrigendum to IEEE Std 802.16-2004). New York* 2006.
3. Ma HH, Wolf JK: On tail biting convolutional codes. *IEEE Trans Commun* 1986, 34: 104-111. doi:10.1109/TCOM.1986.1096498
4. Yehushua M, Watson J, Parr M: System and method for decoding tailbiting code especially applicable digital cellular base stations and mobile units. *U.S. Patent 5,369,671* 1994.
5. Chennakeshu S, Toy RL: Generalized Viterbi algorithm with tail-biting. *U.S. Patent 5,349,589* 1994.
6. Wang YE, Ramésh R: To bite or not to bite--A study of tail bits versus tail-biting. *Proceedings of the Seventh IEEE International Symposium on Personal, Indoor and Mobile Radio Communications* 1996, 2: 317-321.
7. Lin S, Costello DJ Jr: *Error Control Coding: Fundamentals and Applications*. Prentice-Hall, Englewood Cliffs, NJ, USA; 1983.
8. Sung W: Minimum decoding trellis lengths for tail-biting convolutional codes. *Electron Lett* 2000, 36: 643-645. doi:10.1049/el:20000517
9. Hemmati F, Costello DJ Jr: Truncation error probability in Viterbi decoding. *IEEE Trans Commun* 1977, 25: 530-532. doi:10.1109/TCOM.1977.1093861
10. Onyszchuk IM: Truncation length for Viterbi decoding. *IEEE Trans Commun* 1991, 39: 1023-1026. doi:10.1109/26.87203
11. Cox RV, Sundberg CW: An efficient adaptive circular Viterbi algorithm for decoding generalized tailbiting convolutional codes. *IEEE Trans Veh Technol* 1994, 43: 57-68. doi:10.1109/25.282266
12. Zigangirov KS, Chepyzhov VV: Study of decoding tailbiting convolutional codes. *Proceedings of the 4th Joint Swedish-Soviet International Workshop Information Theory* 1989, 52-56.
13. Wang Q, Bhargava VK: An efficient maximum likelihood decoding algorithm for generalized tail biting convolutional codes including quasicyclic codes. *IEEE Trans Commun* 1989, 37: 875-879. doi:10.1109/26.31187
14. Shao RY, Lin S, Fossorier MPC: Two decoding algorithms for tailbiting codes. *IEEE Trans Commun* 2003, 51: 1658-1665. doi:10.1109/TCOMM.2003.818084
15. Viterbi AJ: Convolutional codes and their performance in communication systems. *IEEE Trans Commun* 1971, 19: 751-772. doi:10.1109/TCOM.1971.1090700

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.