Sparsening filter design for iterative soft-input soft-output detectors
 Raquel G Machado^{1},
 Andrew G Klein^{1} and
 Richard K Martin^{2}
https://doi.org/10.1186/16871499201272
© Machado et al; licensee Springer. 2012
Received: 15 May 2011
Accepted: 29 February 2012
Published: 29 February 2012
Abstract
A large body of research exists around the idea of channel shortening, where a prefilter is designed to reduce the effective channel impulse response to some smaller number of contiguous taps. This idea was originally conceived to reduce the complexity of Viterbi-based maximum-likelihood equalizers. Here, we consider a generalization of channel shortening which we term "channel sparsening". In this case, a prefilter is designed to reduce the effective channel to a small number of nonzero taps which do not need to be contiguous. When used in combination with belief-propagation-based maximum a posteriori (MAP) detectors, an analogous complexity reduction can be realized. We address the design aspects of sparsening filters, including several approaches to minimize the bit error rate of MAP detectors. We devote attention to the interaction of the sparsening filter and detector, and demonstrate the performance gains through simulation.
Keywords
belief propagation, turbo equalization, channel sparsening, channel shortening
1 Introduction
Intersymbol interference (ISI) caused by frequency-selective channels is one of the chief impairments faced by modern, high data-rate communication receivers. The issue of compensating for ISI has been studied at length over the past five decades, and a wide range of strategies are available for use by communication system designers. For optimal performance, a maximum a posteriori (MAP) or maximum-likelihood (ML) sequence estimator may be implemented using the Bahl-Cocke-Jelinek-Raviv (BCJR) or Viterbi algorithm, respectively. These optimal approaches, however, are exponentially complex in the number of channel coefficients, and consequently suboptimal ISI compensation techniques are used in most applications.
Sparse impulse responses are characterized as having only a small fraction of nonzero coefficients. This behavior can arise, for example, in underwater acoustic communication channels or in terrestrial communication channels over hilly terrain. Compensation of sparse ISI channels is particularly challenging since these channels can often have very long delay spreads, and optimal approaches like BCJR and Viterbi are therefore infeasible. Recently, a MAP detector employing belief propagation (BP) was proposed [1] for ISI compensation in sparse channels. The proposed scheme is attractive because it permits near-optimal performance with complexity that depends only on the number of nonzero coefficients. The complexity of this algorithm is exponential in the number of nonzero channel coefficients, however, so it may still be prohibitively complex for the majority of applications. A hybrid version of this detector was proposed [2] which uses a linear prefilter in the receiver just before the BP-based MAP detector. By designing the prefilter so that the combined response of the sparse channel and prefilter has a reduced, limited number of nonzero coefficients, the use of the BP-based detector becomes feasible in a wider range of applications.
The utility of the hybrid structure in [2] has been demonstrated through an extensive simulation study that showed significant error-rate improvement in sparse channels when compared with competing approaches that employ decision-feedback equalizers. While the simulation results are encouraging, relatively little attention is paid to the interaction of the prefilter and the BP-based detector. For example, the prefilter is arbitrarily designed so that the taps of the combined response of the sparse channel and prefilter coincide with the dominant taps in the original channel, yet no motivation is provided for this choice.
In this work, we focus on the design of sparsening prefilters for use with soft-input soft-output MAP detectors of the form considered in [1, 2]. While [1, 2] primarily focused on the case where the original channel is sparse, we note that even non-sparse channels can be sparsened with a simple linear, finite impulse response (FIR) filter. Consequently, our work can be applied in general situations, even where the original channel is not sparse. We address the issue of sparsening filter design with the goal of minimizing the detector bit error rate (BER). We consider the interaction of the sparsening filter and BP detector, and develop a practically implementable sparsening filter design method.
We note that channel sparsening filters can be seen as a generalization of so-called channel shortening filters proposed in [3–7]. Given an FIR channel h of length L_{ h }, the channel shortening problem roughly amounts to designing a filter w so that the energy in the combined response h⋆w is concentrated in μ < L_{ h } contiguous taps. Channel sparsening is nearly the same, though the μ taps which contain the majority of the energy are not constrained to be contiguous. Furthermore, while much of the recent interest in channel shortening has been for application to multicarrier systems, the original idea of channel shortening [3] was proposed for a reduced-complexity hybrid prefilter/ML detector which bears some resemblance to the one considered here. More recent works such as [8] have considered channel shortening in conjunction with iterative MAP detectors. Again, however, these works impose a constraint that the taps in the combined channel/filter response must be contiguous. One very recent work [9] has considered use of matching pursuit to find a sparse, non-contiguous target impulse response (TIR), and it is shown to yield a lower mean squared error (MSE) compared to the conventional contiguous approach.
In Section 2 we present the basic system model, and in Section 3 we describe the basic operation of the BP detector. Sections 4 through 6 focus on the design aspects of the channel sparsening filter (CSF), while Section 7 presents simulation results which compare various sparsening filter design methods. Finally, Section 8 concludes the article.
2 System model
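Let c[k] = h[k]⋆w[k] denote the effective channel, i.e., the combined response of the channel h and the CSF w. For a transmitted sequence x[k] and additive white Gaussian noise n[k] at the receiver input, the CSF output can be written as
y[k] = c[k]⋆x[k] + v[k], (1)
where the CSF output noise, which is colored, is given by v[k] = n[k]⋆w[k]. Finally, the output of the CSF is passed to the soft-input soft-output BP detector which outputs likelihood values that can be used to make decisions as to what was transmitted.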
In this work, we focus our attention on the optimal design of the sparsening filter. As such, we make the simplifying assumption that the channel h is known perfectly to the receiver. It is rather straightforward, however, to extend our proposed design method to adaptive implementations which can be employed in situations where the channel is unknown and/or slowly time-varying.
3 Belief propagation detector
Before discussing the sparsening filter design in detail, we first provide some details about the BP detector. The BP algorithm used in the detector is in the class of message passing algorithms, and is sometimes called the sum-product algorithm [1]. By representing the ISI channel as a factor graph, the BP algorithm can be used to implement MAP detection, thereby finding the sequence x which maximizes the joint a posteriori probability density function P(x | y). The BP algorithm proceeds iteratively, and computes log likelihood ratios of the transmitted bits which become more reliable with each iteration. After a sufficient number of iterations, the log likelihood ratios can be used to make bit decisions.
To compute the likelihood ratios, the BP detector needs to know the effective channel impulse response. Given a finite-length filter h, it is not possible in general to find a finite-length filter w such that the combined response is exactly equal to some prescribed FIR response c since the problem is overdetermined.^{a} In other words, in designing the CSF w, we must accept that it is not possible to perfectly sparsen the channel so that the effective channel c consists of only μ nonzero taps. Consequently, the remaining L_{ c } − μ taps of c will not be exactly equal to zero in general unless the CSF is chosen to have infinite length. Nevertheless, to keep computational complexity at the level prescribed by the choice of μ, we only use the largest μ taps of c in the computation of the likelihood ratios used inside the BP detector. As such, the residual ISI contribution from the smallest L_{ c } − μ taps of c in (1) will be treated as noise by the BP detector. A sufficiently large choice of CSF length L_{ w }, however, can ensure arbitrarily small additional ISI.
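As a minimal sketch of this truncation step, the following Python fragment forms c = h⋆w, keeps the μ largest-magnitude taps, and reports the energy left in the remaining taps, which the detector would see as additional noise. The function names and the trivial CSF w = [1] are illustrative assumptions rather than part of the design method; the channel is h_1 from Section 7.

```python
def convolve(h, w):
    """Full linear convolution of two real sequences."""
    c = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            c[i + j] += hi * wj
    return c


def split_taps(c, mu):
    """Return (0-based indices of the mu largest taps, kept energy, residual energy)."""
    order = sorted(range(len(c)), key=lambda k: abs(c[k]), reverse=True)
    kept = sorted(order[:mu])
    e_kept = sum(c[k] ** 2 for k in kept)
    e_resid = sum(ck ** 2 for ck in c) - e_kept
    return kept, e_kept, e_resid


h1 = [0.0722, 0, 0, 0.7217, 0.6495, 0, 0, 0.2165, 0, 0.0722]
w = [1.0]  # hypothetical placeholder: no sparsening at all
c = convolve(h1, w)
kept, e_kept, e_resid = split_taps(c, mu=2)
print(kept, round(e_kept, 4), round(e_resid, 4))
```

With no sparsening, roughly 6% of the channel energy sits outside the two retained taps; a designed CSF aims to shrink that residual without amplifying the noise.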
Since the BP detector is typically implemented in the log domain, the majority of its complexity is due to the many summation operations which must be performed [1]. If the BP algorithm proceeds over N total iterations, the total complexity requires on the order of N(μ + 1)M^{μ+1} summations, where M is the size of the source alphabet and μ is the number of significant effective channel taps used in the detection. As such, the complexity of the BP is exponential in μ, and so the system designer can specify the total complexity by appropriate choice of μ.
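To get a feel for how quickly this count grows, the expression N(μ + 1)M^{μ+1} can be evaluated directly. The helper below is only an order-of-magnitude illustration (the function name is an assumption), using BPSK (M = 2) and ten iterations as in the simulations of Section 7.

```python
def bp_summations(n_iter, mu, alphabet_size):
    """Order-of-magnitude summation count N * (mu + 1) * M**(mu + 1)."""
    return n_iter * (mu + 1) * alphabet_size ** (mu + 1)


for mu in (1, 2, 5, 10):
    print(mu, bp_summations(n_iter=10, mu=mu, alphabet_size=2))
```

Going from μ = 2 to μ = 10 raises the count from 240 to 225280, which is the motivation for sparsening before detection.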
We note that the BP detector performance only truly coincides with the MAP detector when two conditions are met: (1) there are no cycles in the factor graph corresponding to the channel, and (2) the additive noise is white and Gaussian. In general, the first of these conditions is never satisfied. In practice cycles have been shown to be of little concern since they are a low probability event (in the case of potentially detrimental length 4 cycles) [10], or the cycles themselves do not pose a noticeable performance penalty [1]. The second condition on the noise, however, is more serious for this hybrid structure. Since the AWGN gets colored by the CSF, the noise at the input of the BP detector is no longer white.^{b} We will address this issue in the sequel.
We emphasize that the CSF does not change the operation of the BP detector. As the CSF changes the effective channel taps, however, and passes the μ largest effective channel taps to the BP detector, the CSF obviously affects the behavior and performance of the combined filter/detector structure. Since the BP detector itself is unaltered from [1], it can accommodate a system employing channel codes such as LDPC encoding considered in [1], or can readily be extended to the MIMO case with, for example, space-time coding as in [2, 8]. Since our focus is on the design of the CSF, we consider an uncoded system.
4 Channel sparsening
In the design of the CSF w, the goal is for the number of significant (nonzero) taps of c to be μ or less, regardless of where they lie in c or whether they are contiguous or not. We note that μ ∈ {1, 2,..., L_{ h }} is a parameter chosen by the system designer. If μ = 1, then the detector coincides with traditional linear equalization since the goal of the CSF design is to make the effective channel be a single spike. At the other extreme, the choice μ = L_{ h } corresponds to "pure" BP detection as in [1] since the CSF need not do any sparsening and can be a simple unity gain filter. Larger choices of μ will result in an exponentially more complicated BP detector, but will also result in better BER performance.
where $\mathcal{S}$ is the set of the locations of the desired largest μ taps in c. The numerator of (2) is the signal power scaled by the power of the μ significant taps in c, and the denominator contains the total received signal power. Ideally, the energy in the significant taps, $\sum_{k\in\mathcal{S}} |c_k|^2$, will make up almost all of the energy in the channel, $\sum_{k} |c_k|^2$, since we want all other taps to be as close to zero as possible. Ignoring noise for a moment, 0 ≤ J_{ S } ≤ 1, and the only way to force J_{ S } → 1 is to make all but μ taps in c go to zero. Adding in the noise term to the denominator ensures that the residual self-interference is weighted comparably to the noise, so that the excess taps are not made small at the expense of noise amplification.
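For orientation, a ratio consistent with this description can be written as follows. The symbols $\sigma_x^2$ (signal power) and $\sigma_n^2$ (noise power at the CSF input) are notation assumed here, and the noise is taken to be white so that its power at the CSF output is $\sigma_n^2 \sum_k |w_k|^2$; the exact notation of (2) may differ.

```latex
J_S \;=\; \frac{\sigma_x^2 \sum_{k\in\mathcal{S}} |c_k|^2}
               {\sigma_x^2 \sum_{k} |c_k|^2 \;+\; \sigma_n^2 \sum_{k} |w_k|^2},
\qquad c = h \star w .
```

With $\sigma_n^2 = 0$ the ratio lies between 0 and 1 and reaches 1 only when all energy of c is confined to the taps in $\mathcal{S}$, matching the behavior described above.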
The SSSNR in (2) is analogous to Melsa's Shortening SNR (SSNR) [4], with a few distinctions: the set of desired taps $\mathcal{S}$ is not contiguous, the denominator includes the noise power, and the denominator includes both the desired and undesired taps (rather than just the latter). The last distinction is for numerical reasons, and it can be shown that keeping or omitting the desired taps in the denominator leads to the same solution in this type of problem [11, Section III.B].
The SSSNR expression in (3) is a generalized Rayleigh quotient. The value of w that maximizes this quantity, i.e., the SSSNR-optimal CSF for a given set $\mathcal{S}$, is computed by finding the generalized eigenvector of the matrix pair $(\mathbf{B}_{\mathcal{S}}, \mathbf{C})$ corresponding to the largest generalized eigenvalue. An algorithm for this general problem is given in [12, Section 8.7.2].
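A minimal NumPy sketch of this eigenvector computation is given below. It assumes a real-valued channel, white noise, and a particular construction of the matrix pair: with H the convolution matrix of h (so that c = Hw), B_S = H_S^T H_S uses only the rows of H indexed by S, and C = H^T H + (σ_n²/σ_x²) I. These constructions, the function names, the 0-based tap set, and the mapping from an 8 dB SNR to the noise-to-signal ratio are assumptions made for illustration. Since C is symmetric positive definite, the generalized problem is reduced to an ordinary symmetric one through a Cholesky factor.

```python
import numpy as np


def convolution_matrix(h, Lw):
    """Return H of shape (Lh + Lw - 1, Lw) such that H @ w == np.convolve(h, w)."""
    h = np.asarray(h, dtype=float)
    H = np.zeros((len(h) + Lw - 1, Lw))
    for j in range(Lw):
        H[j:j + len(h), j] = h
    return H


def design_csf(h, Lw, S, noise_to_signal):
    """Maximize (w' B_S w) / (w' C w) for a fixed tap set S (0-based indices into c)."""
    H = convolution_matrix(h, Lw)
    Hs = H[sorted(S), :]
    B = Hs.T @ Hs                                    # energy in the desired taps
    C = H.T @ H + noise_to_signal * np.eye(Lw)       # total energy plus noise
    L = np.linalg.cholesky(C)                        # C = L L^T
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ B @ Linv.T)   # eigenvalues in ascending order
    w = Linv.T @ vecs[:, -1]                         # map back: w = L^{-T} u
    return w / np.linalg.norm(w), vals[-1]


h1 = [0.0722, 0, 0, 0.7217, 0.6495, 0, 0, 0.2165, 0, 0.0722]
nsr = 10 ** (-8 / 10)   # assumed noise-to-signal ratio for an 8 dB SNR
w, ratio = design_csf(h1, Lw=25, S=[16, 17], noise_to_signal=nsr)
print(ratio)
```

The returned ratio is the largest generalized eigenvalue, i.e., the maximum of the quotient over w for the chosen S; a dedicated generalized symmetric eigensolver could be used instead of the explicit Cholesky reduction.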
The method used to compute the tap values in [2], which is based on [8], is mathematically similar to our approach, with two key differences. Most importantly, the set $\mathcal{S}$ is fixed in [2]. Second, [2] uses the concept of a TIR. The optimal CSF is written as a function of the TIR, and then the TIR is optimized. (This is implicit within [2, Equation (25)].) The choice of CSF is "optimal" in the sense that it minimizes the MSE between the outputs of the CSF and TIR, and the TIR is optimal in the sense that it maximizes the signal-to-noise ratio (SNR) at the CSF output. Similar to the channel shortening literature where the minimum MSE and the maximum SSNR channel shorteners are equivalent [13, Section 5], the approach in [2] is mathematically equivalent to our approach with the exception of the fixed sparse coefficient locations. However, the minimum MSE approach is more convoluted to implement, as two filters must be designed rather than one.
The CSF that maximizes (3) is only optimal for a given choice of $\mathcal{S}$. As such, the design of w involves two issues: picking the best locations for the μ nonzero taps in c (i.e., choosing the set $\mathcal{S}$), and picking the values of the filter coefficients so that (3) is maximized. The first issue is related to the problem of choosing the optimal delay in linear minimum mean squared error (LMMSE) equalization, which is known to be nontrivial since there is no known expression for the optimal delay [14]. In the classical equalization problem, it is feasible to conduct a brute-force search over the L_{ c } possible delays. Here, however, the problem is considerably more challenging since there are $\binom{L_c}{\mu} = \frac{L_c!}{(L_c-\mu)!\,\mu!}$ possible choices of $\mathcal{S}$. In this article, we consider three methods of choosing the set $\mathcal{S}$.

Use the indices of the μ largest magnitude taps of h, as in [2]. This will be referred to as Roy's tap selection method.

Try all of the possible combinations. This will be referred to as the combinatorial tap selection method, and it is optimal (though expensive).

Try the heuristic approach outlined below, which will be referred to as the greedy tap selection method.
The greedy method is as follows. First, set $\bar{\mu}=1$, and find the location $\mathcal{S}_1$ of a single tap that maximizes the SSSNR. This involves computing w for all L_{ c } possible tap choices. Next, set $\bar{\mu}=2$ and $\mathcal{S}=\{\mathcal{S}_1,\mathcal{S}_2\}$. Keep $\mathcal{S}_1$ from the prior step, and find the location $\mathcal{S}_2$ such that the best two-tap channel is produced. This involves computing w for each of L_{ c } − 1 values. Continue adding one tap at a time until μ locations have been chosen.
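The greedy steps above can be sketched generically as follows. The routine takes a caller-supplied `score(S)` function, which in the setting of this article would return the SSSNR of the best CSF for the candidate tap set S; here a toy additive score is used purely as a stand-in so that the example is self-contained. All names are illustrative assumptions.

```python
def greedy_tap_selection(num_taps, mu, score):
    """Greedily choose mu tap locations from range(num_taps).

    score(S) returns the figure of merit (higher is better) for a tuple S
    of 0-based tap locations. Returns the chosen locations in selection
    order and the number of score evaluations (filter designs) performed.
    """
    chosen = []
    evaluations = 0
    for _ in range(mu):
        best_k, best_val = None, float("-inf")
        for k in range(num_taps):
            if k in chosen:
                continue  # locations from earlier steps are kept fixed
            val = score(tuple(chosen + [k]))
            evaluations += 1
            if val > best_val:
                best_k, best_val = k, val
        chosen.append(best_k)
    return chosen, evaluations


# Toy stand-in for the SSSNR: sum of hypothetical per-tap energies.
energies = [0.1, 0.9, 0.3, 0.7, 0.2]


def toy_score(S):
    return sum(energies[k] for k in S)


taps, n_eval = greedy_tap_selection(len(energies), 2, toy_score)
print(taps, n_eval)
```

For this additive toy score the greedy choice is also the optimal one; for the SSSNR that is not guaranteed, which is why the method is described as a heuristic.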
Roy's method requires designing a single CSF, although the tap locations are likely far from optimal (as will be demonstrated in Section 7). The combinatorial method requires designing $\frac{L_c!}{(L_c-\mu)!\,\mu!}$ filters. Finally, the greedy method requires designing $\frac{1}{2}\mu(2L_c-\mu+1)$ filters. It is far cheaper than the combinatorial method, yet its performance approaches that of the combinatorial method, as will be demonstrated in Section 7. For example, with L_{ c } = 20 and μ = 2, the greedy method is 4.9 times cheaper; and with L_{ c } = 25 and μ = 5, the greedy method is 460 times cheaper than the combinatorial method.
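These counts are easy to check numerically. The short script below (function names are illustrative) evaluates both expressions for the two cases quoted above and for L_c = 34, μ = 2, which is the effective channel length obtained with the ten-tap channel and L_w = 25 used in Section 7.

```python
from math import comb


def combinatorial_designs(Lc, mu):
    """Number of tap sets examined by exhaustive search."""
    return comb(Lc, mu)


def greedy_designs(Lc, mu):
    """Lc + (Lc - 1) + ... + (Lc - mu + 1) = mu * (2 * Lc - mu + 1) / 2."""
    return mu * (2 * Lc - mu + 1) // 2


for Lc, mu in [(20, 2), (25, 5), (34, 2)]:
    n_comb = combinatorial_designs(Lc, mu)
    n_greedy = greedy_designs(Lc, mu)
    print(Lc, mu, n_comb, n_greedy, round(n_comb / n_greedy, 1))
```

The ratios come out as about 4.9 and 462, in line with the figures quoted in the text.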
5 Noise coloration
where 2πf = ω. Thus, J_{ A } penalizes non-flatness of the spectrum of w, since J_{ A } drops to its minimum value of 1 as the spectrum W(ω) approaches any constant value ∀f.
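One penalty with the properties just described is the sum of the squared autocorrelation lags of w divided by the squared zero-lag value. This is an assumption made here for illustration and is not necessarily the exact J_A used in (4). By Parseval's relation this ratio equals the integral of |W|^4 divided by the square of the integral of |W|^2, which is at least 1 and equals 1 only when |W| is constant. The sketch below assumes a real-valued w.

```python
def autocorrelation(w):
    """Deterministic autocorrelation r[k] = sum_n w[n] * w[n + k], lags k >= 0."""
    n = len(w)
    return [sum(w[i] * w[i + k] for i in range(n - k)) for k in range(n)]


def flatness_penalty(w):
    """Sum over all lags (positive and negative) of r[k]^2, divided by r[0]^2."""
    r = autocorrelation(w)
    total = r[0] ** 2 + 2 * sum(rk ** 2 for rk in r[1:])
    return total / r[0] ** 2


print(flatness_penalty([0, 0, 1, 0]))   # pure delay: flat magnitude spectrum
print(flatness_penalty([1, 1]))         # two-tap averager: non-flat spectrum
```

The penalty is invariant to scaling and delay of w, so it only reacts to the shape of the magnitude spectrum.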
The weight β can be chosen to try to force the minimum of J to be in the proximity of the BER cost surface, J_{ E }. In the next section, we look at the surfaces J_{ S }, J, and J_{ E } in order to visualize the effect of β.
The value of β can be set several ways. The simplest is to try various values of β and get a sense of which values lead to good results for the class of parameter values of interest. For example, for the parameters in our simulations, β ∈ [0.1, 0.5] seems to yield good results. Alternatively, β can be included in the optimization problem. One could search the objective function of (4) for a new value of w (but without changing β), then occasionally adjust β (but not w) to improve the BER, and repeat. If β is updated on a much slower time scale than w, then the computationally intensive BER does not have to be evaluated very often during the search.
For a given value of β, (4) can be minimized over w by any method of unconstrained nonlinear optimization. We chose to use the simplex method of [15], since it was already available in Matlab, via the "fminsearch" function.
6 Cost surfaces
7 Simulations
Having demonstrated that SSSNR is a good proxy for BER, we now compare the BER of the various CSF design approaches. We consider a longer channel than in the low-dimensional example of the previous sections. First, we compare SSSNR and computational time among the different design metrics and tap selection methods. Second, we evaluate the BER for two channels employing different sparsening filters in conjunction with BP.
In the first example, we consider the channel h_{1} = [0.0722,0,0,0.7217,0.6495,0,0,0.2165,0,0.0722]. We design the CSF to sparsen the channel to μ = 2 taps, we let L_{ w }= 25, we transmit uncoded BPSK symbols, and we use ten iterations in the BP detector.
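Computational complexity, taps selected, and SSSNR achieved at 8 dB SNR
| Tap-selection method | Number of filters designed | Taps selected | SSSNR  |
|----------------------|----------------------------|---------------|--------|
| Combinatorial        | 561                        | {17,18}       | 6.9344 |
| Greedy               | 67                         | {18,19}       | 6.9336 |
| Roy                  | 1                          | {4,5}         | 5.7961 |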
In conducting simulations, we noticed that occasionally the hybrid CSF/BP structure yielded BER performance which was inferior to that of a simple linear equalizer with a memoryless slicer. Upon further investigation, it became clear that the performance degradation in such cases was due to noise coloring by the CSF, as addressed in Section 5. We now consider such an example, and show that the use of the modified cost function given in (4) results in flatter sparsening filters, and improves the BER performance. We now consider the channel h_{2} = [0.21, −0.36, 0.72, 0.5, 0.21], and we again design the CSF to sparsen the channel to μ = 2 taps with a CSF and equalizer of length L_{ w } = 25. As before, we transmit uncoded BPSK symbols, and use ten iterations in the BP detector. We also add the squared autocorrelation penalizing term to the combinatorial and greedy SSSNR CSF design metrics. To choose the β value, we performed a grid search with 11 values between 0.0 and 1.0 at a 14 dB SNR and the best value obtained was β = 0.1.
Conversely, by adding the penalizing term to the combinatorial SSSNR approach, the resultant sparsening filter becomes flatter, producing an effective channel similar to the original one. This, in fact, reduces the noise enhancement. Moreover, the flatter frequency response shown by the CSF will also reduce noise coloring, thereby improving BP detector BER performance. Both reasons explain the error performance improvement and motivate the incorporation of the squared penalizing term to the CSF design. The optimal choice of β, however, remains an open issue and is likely to be channel-dependent.
In addition, to provide further evidence of the proposed method's efficacy, we also considered the ITU Vehicular A channel [16] that has six paths arriving at [0, 310, 710, 1090, 1730, 2510] ns and a power-delay profile of [0, −1, −9, −10, −15, −20] dB. In our simulations we used a square-root raised-cosine pulse and a symbol duration of T = 80 ns, which generally resulted in a sparse equivalent discrete channel with average length of 21 taps. Also, we transmit uncoded BPSK symbols, use ten iterations in the BP detector, let μ = 2 nonzero taps, and use a CSF (and for comparison, an equalizer) of length L_{ w } = 32. Again, to calculate the β value, we used a grid search at 20 dB SNR and the value obtained was β = 0.2.
Finally, we again emphasize that this hybrid detector is quite flexible since its complexity can be adjusted by the system designer. While the complexity scales exponentially with μ, implementations are quite feasible on modern hardware in a wide range of applications [17]. The complexity of linear and decision-feedback equalizers scales only linearly with the channel length L_{ h }, which makes them attractive for applications where hardware simplicity is at a premium; nevertheless, the performance advantage offered by the hybrid BP detector (reported here and in [2]) may well be worth the additional complexity. Moreover, when compared with traditional Viterbi and BCJR detectors, whose complexity scales exponentially with L_{ h }, the hybrid BP detector appears to have a considerable advantage in terms of complexity [1].
8 Conclusions
In this work we have considered the design of sparsening filters as a way to reduce the complexity of iterative soft-input soft-output MAP detectors. By designing the sparsening filter so that the combined response of the (possibly non-sparse) channel and filter has a sparse impulse response, i.e., a response with only a handful of significant taps, the use of a BP-based MAP detector becomes feasible for detecting the bits. We proposed a filter design metric called the SSSNR, and showed that maximizing this quantity serves as a good proxy for minimizing BER. We developed a greedy algorithm for tap selection, and showed that this approach yields near-optimal performance with a significant reduction in complexity when compared to the optimal, combinatorial tap selection approach. In addition, we treated the issue of noise coloration introduced by the sparsening filter, and showed that the addition of a noise penalty term to the SSSNR results in solutions with a flatter frequency response, thereby limiting the amount of noise coloration. Numerical simulations compared our scheme with an existing approach due to Roy, and showed that significant performance gains can be had by intelligently choosing the tap locations.
Future work in this area will consider fractionally spaced or adaptive versions of this approach, as well as the effects of sparsening filter length on performance. The authors would like to thank Yanjie Peng at WPI for providing the initial version of the simulation code to implement the BP detector.
Endnotes
^{a}Note that a SIMO system employing either multiple receive antennas or fractional sampling can perfectly sparsen the channel under certain conditions on subchannel roots [18].
^{b}While it is possible that the noise observed at the receiver front end is not white to begin with, we make the standard AWGN assumption throughout.
Declarations
Acknowledgements
Martin is funded in part by the Air Force Research Labs, Sensors Directorate. The views expressed in this article are those of the authors, and do not reflect the official policy or position of the United States Air Force, Department of Defense, or the U.S. Government. This document has been approved for public release; distribution unlimited.
References
1. Colavolpe G, Germi G: On the application of factor graphs and the sum-product algorithm to ISI channels. IEEE Trans Commun 2005, 53(5):818-825. 10.1109/TCOMM.2005.847129
2. Roy S, Duman TM, McDonald VK: Error rate improvement in underwater MIMO communications using sparse partial response equalization. IEEE J Ocean Eng 2009, 34(2):181-201.
3. Falconer DD, Magee FR: Adaptive channel memory truncation for maximum likelihood sequence estimation. Bell Sys Tech J 1973, 52:1541-1562.
4. Melsa PJW, Younce RC, Rohrs CE: Impulse response shortening for discrete multitone transceivers. IEEE Trans Commun 1996, 44:1662-1672. 10.1109/26.545896
5. Kallinger M, Mertins A: Room impulse response shortening by channel shortening concepts. In Proc IEEE Asilomar Conf on Signals, Systems and Comp. Pacific Grove, CA; 2005:898-902.
6. Arslan G, Evans BL, Kiaei S: Equalization for discrete multitone receivers to maximize bit rate. IEEE Trans Signal Process 2001, 49(12):3123-3135. 10.1109/78.969519
7. Vanbleu K, Ysebaert G, Cuypers G, Moonen M, Van Acker K: Bitrate maximizing time-domain equalizer design for DMT-based systems. IEEE Trans Commun 2004, 52(6):871-876. 10.1109/TCOMM.2004.829564
8. Bauch G, Al-Dhahir N: Reduced-complexity space-time turbo-equalization for frequency-selective MIMO channels. IEEE Trans Wirel Commun 2002, 1(4):819-828. 10.1109/TWC.2002.805094
9. Gomaa A, Al-Dhahir N: A new design framework for sparse FIR MIMO equalizers. IEEE Trans Commun 2011, 59(8):2132-2140.
10. Kaynak M, Duman T, Kurtas E: Belief propagation over frequency selective fading channels. In Proc IEEE Vehicular Technology Conf (VTC'04). Volume 2. Los Angeles CA; 2004:1367-1371.
11. Martin RK, Walsh JM, Johnson CR Jr: Low complexity MIMO blind, adaptive channel shortening. IEEE Trans Signal Process 2005, 53(4):1324-1334.
12. Golub GH, Van Loan CF: Matrix Computations. The Johns Hopkins University Press, Baltimore; 1996.
13. Martin RK, Vanbleu K, Ding M, Ysebaert G, Milosevic M, Evans BL, Moonen M, Johnson CR Jr: Unification and evaluation of equalization structures and design algorithms for discrete multitone modulation systems. IEEE Trans Signal Process 2005, 53(10):3880-3894.
14. Szczecinski L: Low-complexity search for optimal delay in linear FIR MMSE equalization. IEEE Signal Process Lett 2005, 12(8):549-552.
15. Lagarias JC, Reeds JA, Wright MH, Wright PE: Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J Optim 1998, 9:112-147. 10.1137/S1052623496303470
16. Guidelines for the evaluation of radio transmission technologies for IMT-2000, Recommendation ITU-R M.1225 1997.
17. Peng Y, Zhang K, Klein AG, Huang X: Design and implementation of a belief propagation detector for sparse channels. In Proc IEEE Intl Conf on Application-specific Systems, Architectures and Processors (ASAP'11). Santa Monica, CA; 2011:259-262.
18. Tong L, Xu G, Kailath T: Blind identification and equalization based on second-order statistics: a time domain approach. IEEE Trans Inf Theory 1994, 40(2):455-466. 10.1109/18.312168
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.