Reduced complexity Log-MAP algorithm with Jensen inequality based non-recursive max∗ operator for turbo TCM decoding

In this paper, a reduced complexity Log-MAP algorithm based on a non-recursive approximation of the max∗ operator is presented and studied for turbo trellis-coded modulation (TTCM) systems. In the algorithm, denoted as AvN Log-MAP, the max∗ operation is generalized and performed on n ≥ 2 arguments. The approximation is derived from the Jensen inequality. The non-recursive form of the max∗ calculations yields a significant reduction in the decoding effort in comparison to the conventional Log-MAP algorithm. Bit-error rate performance simulation results for serial and parallel TTCM schemes in the additive white Gaussian noise and uncorrelated Rayleigh fading channels show that the AvN Log-MAP algorithm performs close to the Log-MAP. Performance and complexity comparisons of the AvN Log-MAP algorithm against the Log-MAP and several relevant reduced complexity turbo decoding algorithms proposed in the literature reveal that it offers a favorably low computational effort at the price of a small performance degradation.


Introduction
The invention of turbo codes and their iterative (turbo) decoding principle [1,2] has opened new perspectives on digital transmission and receiver design. Owing to their excellent performance, turbo codes have been extensively studied and have found applications in various wireless communication systems. Moreover, several enhanced turbo-like schemes have been proposed in recent years with the aim of improving the overall bandwidth efficiency. The idea of iterative processing has also found widespread application beyond error control coding, in other areas of digital communications such as detection, interference suppression, equalization and synchronization. Nowadays, iterative processing has become prevalent in state-of-the-art receiver design.
It is known that in turbo decoding, the optimal algorithm for the soft-input/soft-output (SISO) component decoders is the maximum a posteriori (MAP) probability algorithm [3]. In practice, in order to reduce numerical computation problems, the MAP algorithm is implemented in the logarithmic domain, which results in the so-called Log-MAP algorithm [4,5]. The core operation of the Log-MAP is the calculation of the logarithm of a sum of exponential terms, denoted as the max∗ operator, using the so-called Jacobian logarithm [6]. In recent years, several algorithmic approaches have been proposed that aim to simplify the max∗ operator and thus reduce the implementation complexity of the SISO decoders without a substantial loss of decoding performance (e.g., [7][8][9][10][11][12][13]). In all these algorithms, a conventional max∗ operator, i.e., one defined for n = 2 input values, is modified, and for n > 2, these approximations are applied recursively n − 1 times. Generalized non-recursive approximation methods for the max∗ operator with n > 2 arguments have been recently presented in [14] and [15].
*Correspondence: tyczka@et.put.poznan.pl Chair of Wireless Communications, Poznan University of Technology, ul. Polanka 3, 60-965 Poznan, Poland
In this paper, we revisit a reduced-complexity Log-MAP algorithm based on a non-recursive approximation of the max∗ operator with n ≥ 2 input values, which has been recently proposed in [16]. The novel approximation given in [16] is derived from the Jensen inequality [17] and yields a significant reduction in the decoding effort in comparison to the conventional Log-MAP algorithm. The purpose of this paper is to expand the initial work reported in [16] and to present comprehensive performance and complexity results and analysis of the algorithm, hereafter called the AvN Log-MAP algorithm (AvN is the acronym of 'Average with the parameter N'), for turbo trellis-coded modulation (TTCM) schemes [18,19]. Performance evaluation results of the AvN Log-MAP algorithm are presented for both parallel and serial TTCM schemes in the additive white Gaussian noise (AWGN) and uncorrelated (i.e., fully interleaved) Rayleigh fading channels. For TTCM, the trellises are non-binary and hence, in the Log-MAP operation, all variables are calculated by means of the max∗ operator with n > 2 arguments. Bit-error rate (BER) results of the AvN Log-MAP algorithm are compared with the results of the Log-MAP and some relevant reduced complexity decoding algorithms, namely Linear Log-MAP [8], Average Log-MAP [9], Shift Log-MAP [10] and Max-Log-MAP [4]. A complexity comparison of the AvN Log-MAP algorithm against those algorithms for the investigated TTCM schemes is also given.
The paper is organized as follows. In section 2, for the sake of completeness, we review the derivation of the non-recursive max∗ approximation for the AvN Log-MAP algorithm, given in [16]. Section 3 presents the performance evaluation results and their discussion. Complexity comparison of the algorithms is covered in section 4. Finally, section 5 contains concluding remarks.
http://jwcn.eurasipjournals.com/content/2013/1/238

Non-recursive max * approximation based on the Jensen inequality
In the Log-MAP algorithm, the calculation of the soft output as well as the forward and backward metrics and the branch metrics of trellis transitions requires computation of the max∗ operator defined as
$${\max}^{*}(x_1, x_2, \ldots, x_n) = \ln\left(\sum_{i=1}^{n} e^{x_i}\right). \tag{1}$$
An exact solution to this problem, used in the Log-MAP algorithm, is the application of the Jacobian logarithm
$${\max}^{*}(x_1, x_2) = \ln\left(e^{x_1} + e^{x_2}\right) = \max(x_1, x_2) + \ln\left(1 + e^{-|x_1 - x_2|}\right) = \max(x_1, x_2) + f_c(|x_1 - x_2|), \tag{2}$$
where $f_c(\cdot)$ is a correction function, usually implemented with an eight-element look-up table (LUT) [5]. To obtain the max∗ operator for more than two arguments, i.e., n > 2, the Jacobian logarithm (2) is applied recursively n − 1 times. For example, assuming n = 3, it yields
$${\max}^{*}(x_1, x_2, x_3) = {\max}^{*}\left({\max}^{*}(x_1, x_2), x_3\right). \tag{3}$$
In the approach taken in [16], in order to reduce the computational effort, the max∗ operator with n ≥ 2 arguments is approximated directly, that is, without the recursive computations required for the Jacobian logarithm, as shown in (3). The motivation for such a treatment comes from the observation that the recursion performed in the Log-MAP to calculate the Jacobian logarithm for n > 2 arguments has a major influence on the complexity of this algorithm.
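The recursive evaluation described above can be sketched in a few lines of Python. This is only a floating-point reference model under stated assumptions: the function names are illustrative, and the correction term is computed exactly instead of being read from an eight-element LUT as in a practical decoder.

```python
import math
from functools import reduce


def correction(d):
    """Correction function f_c(d) = ln(1 + e^(-d)) for d >= 0 (exact, no LUT)."""
    return math.log1p(math.exp(-d))


def max_star2(a, b):
    """Jacobian logarithm for two arguments: max(a, b) + f_c(|a - b|)."""
    return max(a, b) + correction(abs(a - b))


def max_star(values):
    """max* for n >= 2 arguments, applying the two-input form n - 1 times."""
    return reduce(max_star2, values)


def log_sum_exp(values):
    """Direct evaluation of ln(sum(e^x)), used here only as a cross-check."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    print(max_star(xs), log_sum_exp(xs))
```

Each additional argument costs one more comparison, one subtraction, one correction lookup and one addition, which is the recursion overhead that the non-recursive approximation aims to remove.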
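In [16], the novel approximation is derived from two inequalities. The first originates from the definition of the max∗ operator given in (1). We identify that
$${\max}^{*}(x_1, \ldots, x_n) = \ln\left(\sum_{i=1}^{n} e^{x_i}\right) = \max(x_1, \ldots, x_n) + \ln\left(\sum_{i=1}^{n} e^{x_i - \max(x_1, \ldots, x_n)}\right). \tag{4}$$
Since the second term of the right-hand side (RHS) in (4) can be treated as a correcting term and is always greater than zero, we obtain
$${\max}^{*}(x_1, \ldots, x_n) \ge \max(x_1, \ldots, x_n). \tag{5}$$
Note that the equality in (5) corresponds to the Max-Log-MAP algorithm. For the second inequality, consider the Jensen inequality
$$\frac{1}{n}\sum_{i=1}^{n} \alpha_i \ge \left(\prod_{i=1}^{n} \alpha_i\right)^{1/n}, \quad \alpha_i > 0. \tag{6}$$
By substituting $\alpha_i = e^{x_i}$ and taking the logarithm of the two sides of (6), we have
$$\ln\left(\frac{1}{n}\sum_{i=1}^{n} e^{x_i}\right) \ge \frac{1}{n}\sum_{i=1}^{n} \ln e^{x_i}. \tag{7}$$
Identifying from (1) that $\ln\left(e^{x_1} + \cdots + e^{x_n}\right) = {\max}^{*}(x_1, \ldots, x_n)$, we may rewrite (7) as
$${\max}^{*}(x_1, \ldots, x_n) - \ln(n) \ge \frac{1}{n}\sum_{i=1}^{n} \ln e^{x_i} \tag{8}$$
and further
$${\max}^{*}(x_1, \ldots, x_n) \ge \ln(n) + \frac{1}{n}\sum_{i=1}^{n} \ln e^{x_i}. \tag{9}$$
Since ln(n) in (9) is a positive constant value that does not depend on $x_i$, it can be omitted and hence, we obtain the final inequality
$${\max}^{*}(x_1, \ldots, x_n) \ge \frac{1}{n}\sum_{i=1}^{n} x_i. \tag{10}$$
It can be easily noticed that the RHS in (10) is the average value of the input values of the max∗ operator.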
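As a numerical sanity check of the two lower bounds discussed above (the maximum of the inputs, and their average with or without the ln(n) offset), the following sketch evaluates them on random inputs. The function names, the value range of the inputs and the tolerance are assumptions made for illustration only.

```python
import math
import random


def max_star_exact(xs):
    """Exact max*: ln(sum(e^x)), evaluated in a numerically stable way."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def lower_bounds(xs):
    """Lower bounds on max*: the maximum, the mean plus ln(n), and the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return {"max": max(xs), "mean_plus_ln_n": mean + math.log(n), "mean": mean}


def check_bounds(trials=1000, seed=1, tol=1e-9):
    """Return True if every bound holds on all randomly drawn input vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 16)
        xs = [rng.uniform(-20.0, 20.0) for _ in range(n)]
        exact = max_star_exact(xs)
        if any(exact < bound - tol for bound in lower_bounds(xs).values()):
            return False
    return True


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0]
    print(max_star_exact(xs), lower_bounds(xs), check_bounds())
```

For the inputs (1, 2, 3), the exact value is about 3.4076, while the bounds are 3 (maximum), about 3.0986 (average plus ln 3) and 2 (average).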
As can be seen in (5) and (10), the exact value of the max∗ operator in both inequalities is greater than the RHS values. In fact, we may think of the RHSs of (5) and (10) as lower bounds to the max∗ operator. Merging inequalities (5) and (10), we may write
$${\max}^{*}(x_1, \ldots, x_n) \approx \max\left(\max(x_1, \ldots, x_n), \frac{1}{n}\sum_{i=1}^{n} x_i\right). \tag{11}$$
Inspection of (11) reveals, however, that the decoding algorithm based on this approximation would achieve performance equal to that of the Max-Log-MAP algorithm. This conclusion results from the following relation
$$\max(x_1, \ldots, x_n) \ge \frac{1}{n}\sum_{i=1}^{n} x_i. \tag{12}$$
The idea for overcoming this drawback, proposed in [16], is to replace the part of the approximation (11) that computes the average value of the input arguments $x_i$ by a part that computes the sum of all $x_i$ and then divides it by a certain parameter N, suitably selected so as to optimize the performance of the algorithm. Thus, the new approximation, which is expected to perform better than the Max-Log-MAP algorithm, can be formulated as follows [16]:
$${\max}^{*}(x_1, \ldots, x_n) \approx \max\left(\max(x_1, \ldots, x_n), \frac{1}{N}\sum_{i=1}^{n} x_i\right), \tag{13}$$
where N is the parameter of the approximation. For a given transmission scheme, an optimal value of N minimizing the BER at the assumed signal-to-noise ratio (SNR) level can be found by means of computer simulations. The Log-MAP algorithm with the approximation of the max∗ operator given in (13) is referred to as the AvN Log-MAP algorithm.
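A minimal sketch of the approximation (13) is given below, assuming it takes the form described in the text: the larger of the maximum input and the sum of all inputs divided by N. The function name and the sample metric values are hypothetical and are not taken from [16].

```python
def avn_max_star(xs, n_param):
    """Non-recursive max* approximation: max(max(x_i), sum(x_i) / N)."""
    return max(max(xs), sum(xs) / n_param)


if __name__ == "__main__":
    metrics = [-4.0, -4.0, -4.0, -4.0]  # hypothetical log-domain metrics
    # With N equal to the number of inputs, the result is the plain maximum,
    # i.e. the Max-Log-MAP value.
    print(avn_max_star(metrics, len(metrics)))
    # With N larger than the number of inputs, the scaled sum can exceed the
    # maximum when the metrics are negative.
    print(avn_max_star(metrics, 18.5))
```

Only one maximum search, n − 1 additions and a single scaling are needed regardless of n, in contrast to the n − 1 nested two-input evaluations of the recursive Jacobian logarithm.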

Performance evaluation results
In this section, we evaluate the performance of the AvN Log-MAP decoding algorithm by means of computer simulations for parallel and serial concatenated TTCM schemes. In the simulations, the AWGN channel and the uncorrelated, i.e., fully interleaved, Rayleigh fading channel with perfect channel state information (CSI) were considered. Furthermore, block sizes of K = 684 and K = 5,000 symbols with S-random interleavers with spreading factors S = 7 and S = 13, respectively, were assumed. For comparison purposes, the BER performance curves for the conventional Log-MAP (with a LUT storing eight values), Linear Log-MAP, Average Log-MAP, Shift Log-MAP, and Max-Log-MAP algorithms were also evaluated. At the receiver, eight decoding iterations were performed for all algorithms.
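The text does not state how the S-random interleavers were generated. As an illustration only, the sketch below uses a common greedy construction with restarts, in which a candidate position is accepted if it differs by more than S from each of the S most recently accepted positions. The function names, the seed and the restart limit are assumptions.

```python
import random


def s_random_interleaver(k, s, seed=0, max_restarts=1000):
    """Greedy S-random permutation of range(k); restarts if it gets stuck."""
    if s < 1:
        raise ValueError("s must be at least 1")
    rng = random.Random(seed)
    for _ in range(max_restarts):
        pool = list(range(k))
        rng.shuffle(pool)
        perm = []
        while pool:
            recent = perm[-s:]  # the (up to) s most recently accepted values
            for idx, cand in enumerate(pool):
                if all(abs(cand - r) > s for r in recent):
                    perm.append(pool.pop(idx))
                    break
            else:
                break  # no remaining candidate fits: abandon this attempt
        if len(perm) == k:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")


def has_spread(perm, s):
    """Check that positions at most s apart map to values more than s apart."""
    return all(abs(perm[i] - perm[j]) > s
               for i in range(len(perm))
               for j in range(max(0, i - s), i))


if __name__ == "__main__":
    perm = s_random_interleaver(684, 7)
    print(len(perm), has_spread(perm, 7))
```

The same call with `k=5000` and `s=13` corresponds to the larger block size quoted above, at the cost of a longer search.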

Parallel TTCM scheme
The block diagram of the encoder structure for a parallel concatenated TTCM scheme is shown in Figure 1.
The AvN Log-MAP algorithm was examined in the TTCM schemes with two rate-3/4 8-state Ungerboeck's trellis-coded modulation (TCM) encoders and 16-QAM modulation [18]. The results of the search for an optimal value of the parameter N of the approximation (13) for parallel turbo TCM schemes with K = 684 and K = 5,000 are depicted in Figures 2 and 3, respectively. Figures 2 and 3 show the BER performance curves versus parameter N for the AvN Log-MAP algorithm in the AWGN channel at several values of SNR. As can be seen in Figures 2 and 3, for the values of SNR that correspond to the waterfall region of the performance curves, the value of N has an influence on the error performance. It can be concluded from these figures that for both block sizes, the optimal value of the parameter N, i.e., the value that minimizes the BER, is 18.5. It should also be noted that the search for the optimal value of N for the serial turbo TCM schemes investigated in section 3.2 gives a very similar result. Thus, in all simulations the AvN Log-MAP algorithm uses N = 18.5 in the approximation (13). One may comment at this point that the same optimal value of N has been obtained for TTCM systems that differ in structure, overall coding rate, and block size, yet have component codes with an equal number of states (i.e., eight states). This may suggest that the trellis structure of the component codes is the key factor that influences the value of the parameter N. Nevertheless, this remains an open problem for further studies.
In terms of hardware or digital signal processor (DSP) implementation of decoding algorithms, it is desirable to avoid multiplications, which are rather complex operations. Therefore, the number of multiplications required by the algorithm can be regarded as one of the crucial factors in determining its complexity. As will be shown in section 4, the AvN Log-MAP algorithm requires some multiplications. These multiplications are due to the division by the parameter N in (13). It can be noticed, however, that if we select $N = 2^m$, $m = 1, 2, \ldots$, then the multiplications can be replaced by bit shifts, which simplifies the implementation. In order to obtain such a simplified algorithm for parallel TTCM, based on the results from Figures 2 and 3, we may select N = 16; this value is not optimal but is close to the optimal N = 18.5. Hence, one may expect only a small performance degradation in exchange for a further reduction of the algorithm complexity when N = 16 is selected. In the remainder of this paper, the AvN Log-MAP algorithm with N = 16 will be denoted as the AvN Log-MAP hardware efficient (HE).
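The replacement of the division by N = 16 with a bit shift can be illustrated as follows. The sketch assumes that the metrics are held as signed fixed-point integers; the function name and the sample values are hypothetical. Python's `>>` on negative integers rounds toward minus infinity, like an arithmetic right shift on two's-complement hardware, so the result may differ by one least significant bit from a rounded division.

```python
def avn_max_star_he(metrics, shift=4):
    """max(max(x_i), sum(x_i) >> shift), i.e. division by N = 2**shift."""
    total = sum(metrics)       # integer fixed-point metrics
    scaled = total >> shift    # shift by 4 bits divides by 16 (floor)
    return max(max(metrics), scaled)


if __name__ == "__main__":
    # Hypothetical fixed-point metrics
    print(avn_max_star_he([-64, -64, -64, -64]))  # -256 >> 4 = -16
    print(avn_max_star_he([100, 20, 4]))          # 124 >> 4 = 7, max is 100
```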
BER performance evaluation results in the AWGN channel for the small (K = 684 symbols) and large (K = 5, 000 symbols) interleaver sizes are given in

Serial TTCM scheme
The encoder structure for a serial concatenated TTCM scheme is depicted in Figure 9 [19]. A rate-2/3 (systematic feedback) convolutional encoder with parity polynomials (in octal notation) $h_0 = 13$, $h_1 = 15$, and $h_2 = 17$ was applied as an outer encoder. For the inner encoder, we used the same encoder as in the parallel TTCM scheme. Hence, the overall code rate of the serial TTCM scheme is $R_c = 1/2$.
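The overall code rate follows from multiplying the rates of the two concatenated encoders. The symbols $R_{\mathrm{outer}}$, $R_{\mathrm{inner}}$ and $\eta$ are notation introduced here for illustration, and trellis termination overhead is ignored:

```latex
R_c = R_{\mathrm{outer}} \cdot R_{\mathrm{inner}}
    = \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{2},
\qquad
\eta = R_c \log_2 16 = \frac{1}{2} \cdot 4 = 2 \ \text{information bits per 16-QAM symbol}.
```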

Complexity comparison of the algorithms
From the implementation point of view, the key aspect of the AvN Log-MAP algorithm is its complexity compared against the Log-MAP and the reduced complexity algorithms previously proposed in the literature. Complexity comparison of the algorithms has been performed for both TTCM schemes from section 3 in software (i.e., computer based) and DSP implementations. DSP evaluation was accomplished on an Ultra Low Power TMS320VC5510 processor that uses a fixed-point number representation [20].
Tables 1 and 2 list the required number of operations (i.e., additions, comparisons, bit shifts, conversions to integer and assignments) per decoding step in a software implementation of the decoding algorithms for the parallel and serial concatenated TTCM schemes, respectively. As shown in Table 1, the proposed AvN Log-MAP algorithm in the parallel TTCM scheme is 41.1% simpler than the Log-MAP algorithm. This significant reduction in the number of operations comes at the expense of a small BER performance degradation, as was shown by the simulation results. When compared with the Max-Log-MAP algorithm, the AvN Log-MAP algorithm requires 34.0% more operations.
According to Table 2, the reduction in complexity of the AvN Log-MAP algorithm against the Log-MAP is also significant in the serial TTCM scheme and amounts to 31.2%. Comparison to the simple Max-Log-MAP algorithm reveals that the AvN Log-MAP requires 43.3% more operations but, according to the BER results presented in the previous section, in exchange improves the performance by 0.3 to 0.5 dB.
It can also be observed in Tables 1 and 2 that the AvN Log-MAP HE algorithm performs the same overall number of operations as the AvN Log-MAP but uses bit shifts instead of multiplications. This feature of the AvN Log-MAP HE is favorable in a hardware implementation of the algorithm and can be considered an additional complexity reduction. Tables 3 and 4 summarize the comparison of the AvN Log-MAP against all competing algorithms in terms of the overall number of operations in the parallel and serial scenarios, respectively. It can be easily seen that the AvN Log-MAP algorithm offers a much higher reduction in the number of operations with respect to the Log-MAP than the Linear Log-MAP, Average Log-MAP, and Shift Log-MAP algorithms. Inspection of Tables 3 and 4 shows that this reduction is more than twice that offered by the least complex algorithm among the competitors, the Average Log-MAP: 41.1% vs. 15.7% in the parallel scheme and 31.2% vs. 14.8% in the serial scheme. Taking into account the only small performance degradation of the AvN Log-MAP algorithm, this substantial reduction in computational effort makes the proposed algorithm attractive for practical implementation.
A DSP implementation-based complexity comparison of the algorithms is presented in Tables 5 and 6, which show the numbers of processor cycles needed per single decoding step for the considered algorithms in the parallel and serial concatenated TTCM schemes. As can be seen in Table 5, the AvN Log-MAP algorithm in the parallel TTCM scheme is 48.9% simpler than the Log-MAP algorithm and requires only 8.0% more processor cycles than the simple Max-Log-MAP. In the serial concatenated TTCM scheme (Table 6), these comparisons give 41.6% and 27.9%, respectively. Comparison of the AvN Log-MAP algorithm to the remaining algorithms (Table 5) shows that its reduction in the number of processor cycles against the Log-MAP is more than 30% higher than that achieved by the Linear Log-MAP. From the analysis given in this section, we conclude that the AvN Log-MAP algorithm offers significant savings in decoding effort with respect to the Log-MAP and a relatively small increase in complexity as compared to the Max-Log-MAP algorithm. It is also substantially less complex than all reduced complexity algorithms given as a reference in this paper. It should be emphasized that all complexity comparisons presented in Tables 1 to 6 are per single decoding step in a constituent decoder and hence, these results do not depend on the number of iterations or the block size.
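The percentages quoted in this section can be placed on a common scale by normalizing the Log-MAP complexity to 1. The sketch below derives the implied relative complexity of the AvN Log-MAP and Max-Log-MAP algorithms from the quoted figures alone; the function name and the dictionary layout are illustrative, and the derived values are arithmetic consequences of the quoted percentages, not entries read from the tables.

```python
def relative_to_log_map(reduction_pct, increase_vs_max_log_pct):
    """Return (AvN, Max-Log-MAP) complexity with Log-MAP normalized to 1.

    reduction_pct: how much simpler AvN Log-MAP is than Log-MAP, in percent.
    increase_vs_max_log_pct: how much more AvN needs than Max-Log-MAP, in percent.
    """
    avn = 1.0 - reduction_pct / 100.0
    max_log = avn / (1.0 + increase_vs_max_log_pct / 100.0)
    return avn, max_log


# Percentages quoted in the text for each implementation and scheme
CASES = {
    "software, parallel": (41.1, 34.0),
    "software, serial": (31.2, 43.3),
    "DSP, parallel": (48.9, 8.0),
    "DSP, serial": (41.6, 27.9),
}

if __name__ == "__main__":
    for name, (reduction, increase) in CASES.items():
        avn, max_log = relative_to_log_map(reduction, increase)
        print(f"{name}: AvN = {avn:.3f}, Max-Log-MAP = {max_log:.3f}")
```

For example, in the parallel scheme in software, the AvN Log-MAP needs about 0.589 of the Log-MAP operations, which implies about 0.440 for the Max-Log-MAP.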

Conclusions
A low complexity AvN Log-MAP algorithm has been presented and investigated for turbo TCM schemes. It is based on the approximation of the max∗ operator with n ≥ 2 values, derived from the Jensen inequality. As is apparent from the simulation results obtained for the AWGN and uncorrelated Rayleigh fading channels, the performance of the proposed algorithm is slightly inferior to that of the Log-MAP, but its computational requirements are substantially lower. In particular, we have found that the AvN Log-MAP algorithm performs about 0.1 to 0.4 dB worse than the Log-MAP at a BER of $10^{-4}$, depending on the TTCM scheme and radio channel, while being 31.2% to 48.9% simpler. Performance and complexity comparisons with some relevant reduced complexity decoding algorithms proposed in the literature, i.e., Linear Log-MAP, Average Log-MAP, and Shift Log-MAP, have shown that the AvN Log-MAP algorithm achieves a significant decrease in decoding complexity and that its computational requirements are much lower than those of these algorithms. The penalty paid for this favorably low complexity of the AvN Log-MAP algorithm is a small performance degradation with respect to those algorithms. Moreover, a proper selection of the parameter N in the approximation offers a further simplification of the algorithm by replacing multiplication operations with simple bit shifts. Finally, it should be emphasized that the AvN Log-MAP algorithm is also suitable for implementation in SISO modules for iterative scenarios other than turbo TCM.