Minimum Probability of Error-Based Equalization Algorithms for Fading Channels

Novel channel equalizer algorithms are introduced for wireless communication systems to combat channel distortions resulting from multipath propagation. The novel algorithms are based on newly derived bounds on the probability of error (PE) and guarantee better performance than the traditional zero forcing (ZF) or minimum mean square error (MMSE) algorithms. The new equalization methods require channel state information which is obtained by a fast adaptive channel identification algorithm. As a result, the combined convergence time needed for channel identification and PE minimization still remains smaller than the convergence time of traditional adaptive algorithms, yielding real-time equalization. The performance of the new algorithms is tested by extensive simulations on standard mobile channels.


INTRODUCTION
Since radio spectrum became scarce and expensive, one of the major concerns of wireless communication is to maximize spectral efficiency (SE). This implies that broadband services are implemented over narrowband radio channels, which makes them susceptible to selective fading due to multipath propagation, which in turn may yield severe performance degradation [1]. As a result, efficient channel equalization techniques prove to be instrumental in combating intersymbol interference (ISI) in order to avoid large-scale drops in system performance.
The effect of interference is especially crucial in mobile communication systems, which have two evolutionary paths: (i) 3G systems are launched based on WCDMA [2]; and (ii) the current 2G systems (GSM and IS-136) are updated to provide broadband services [1]. The latter strategy introduces a novel common physical layer, "enhanced data rates for GSM evolution" (EDGE), for both TDMA schemes. EDGE improves spectral efficiency by applying the 8PSK modulation format instead of binary Gaussian minimum-shift keying (GMSK). For the sake of seamless GSM-EDGE transfer, most of the system parameters remain unchanged (e.g., symbol time and symbol duration). However, in the case of 8PSK modulation the maximum likelihood sequence estimator (involving the Viterbi algorithm) can no longer be implemented on the current DSP technology due to its complexity [2]. As a result, fast channel equalizer algorithms have to be developed which are simple enough to run on the currently available hardware architectures even in the case of multilevel PSK modulation schemes.
This paper aims at developing low-complexity channel equalizer algorithms by directly minimizing the PE instead of minimizing the mean square error (MSE) or the peak distortion (PD) [3]. Unfortunately, the direct minimization of PE with respect to the equalizer coefficients is of exponential complexity. Thus, we develop new bounds on the basis of which the equalizer coefficients can be optimized by fast algorithms. For the sake of simplicity, the novel algorithms presented in the paper are treated assuming a two-state modulation scheme (however, the analysis can be easily extended to many-state schemes by introducing complex variables).
The first attempt to derive an equalizer based on the minimum PE strategy can be found in the work of Shimbo and Celebiler [4]. The optimal equalizer coefficients were only sought by exhaustive search, thus real-time adaptivity was not guaranteed. In recent years, some new results have been developed for minimum PE equalization. In [5] a low-complexity adaptive algorithm is proposed for 2- or 4-state modulation systems, but the convergence is rather slow, while in [6, 7] near-minimum PE equalization is carried out by radial basis function neural networks, which considerably increases the equalizer complexity. Minimum BER equalization for decision feedback equalizers can be found in [8, 9]. On the other hand, very complex equalizer schemes have been proposed for DS-CDMA systems in [10, 11, 12].
Recently other equalization strategies have also been developed, such as iterative (turbo) equalization algorithms which jointly optimize the equalization and the detection, yielding performance similar to that of maximum likelihood sequence estimation. For some recent results see [13, 14, 15, 16]. Other solutions are based on the negentropy minimization principle [17] or the nearest neighbor classifier [18], but they yield complex algorithms.
The results are presented in the following structure.
(i) In Section 2, the communication model is outlined.
(ii) In Section 3, PE is expressed as a function of the equalizer coefficients and a gradient-based algorithm is introduced for its minimization. Then new bounds are derived on PE to develop new equalizer algorithms with low complexity.
(iii) In Section 4, the performance and convergence properties of the new equalizer algorithms are analyzed numerically.

THE MODEL
To describe single-user communication over a fading channel, we use the so-called equivalent discrete-time white noise filter model (for further details see [3]).
The corresponding quantities are defined as follows:
(i) $y_k \in \{-1, 1\}$ denotes the transmitted information bit at time instant $k$, being a sequence of independent, identically distributed Bernoulli random variables with $P(y_k = 1) = P(y_k = -1) = 0.5$;
(ii) the discrete impulse response of the channel is denoted by $h_k$, $k = 0, \ldots, M$, where $M$ denotes the span of ISI;
(iii) the noise is denoted by $\nu_k$ and is assumed to be a stationary zero-mean white Gaussian random sequence with constant spectral density $N_0$;
(iv) the received sequence is denoted by $x_k$, which is a linearly distorted and noisy version of the transmitted sequence given as
$$x_k = \sum_{n=0}^{M} h_n y_{k-n} + \nu_k, \tag{1}$$
and with the assumption of BPSK modulation and coherent demodulation, $x_k$ is real;
(v) the equalizer is a linear FIR filter, the output of which is denoted by
$$\tilde{y}_k = \sum_{i=0}^{J} w_i x_{k-i}, \tag{2}$$
where $w_i$, $i = 0, \ldots, J$, denote the free parameters of the equalizer which are subject to further optimization;
(vi) the decision is carried out by threshold detection in a symbol-by-symbol fashion:
$$\hat{y}_k = \operatorname{sgn}\{\tilde{y}_k\}; \tag{3}$$
(vii) the overall channel impulse response function is determined by the cascade of the channel and the equalizer,
$$q_k = \sum_{j=0}^{J} w_j h_{k-j}, \quad k = 0, \ldots, L, \tag{4}$$
where $h_n = 0$ outside $0 \le n \le M$ and $L = M + J$ denotes the support of the overall impulse response.
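The model above can be sketched in a few lines of pure Python. This is only an illustrative simulation under stated assumptions: the noise variance is taken to be $N_0$, symbols before time 0 are treated as zero, and the helper names `convolve` and `simulate_ber` as well as the pass-through equalizer are hypothetical, not taken from the paper.

```python
import random


def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def simulate_ber(h, w, n0, n_bits, seed=0):
    """Empirical bit error rate of a linear FIR equalizer (decision delay 0)."""
    rng = random.Random(seed)
    y = [rng.choice((-1, 1)) for _ in range(n_bits)]
    # Received sequence: x_k = sum_n h_n y_{k-n} + nu_k
    # (assumption: noise variance equals n0, symbols before k = 0 are zero).
    x = [
        sum(h[n] * y[k - n] for n in range(len(h)) if k - n >= 0)
        + rng.gauss(0.0, n0 ** 0.5)
        for k in range(n_bits)
    ]
    errors = 0
    for k in range(n_bits):
        # Equalizer output followed by symbol-by-symbol threshold detection.
        out = sum(w[i] * x[k - i] for i in range(len(w)) if k - i >= 0)
        decision = 1 if out >= 0 else -1
        errors += decision != y[k]
    return errors / n_bits


h = [1.0, 0.6, -0.45]   # three-tap channel used later in the simulations
w = [1.0, 0.0, 0.0]     # hypothetical pass-through equalizer (J = 2)
q = convolve(h, w)      # overall response with L + 1 = M + J + 1 taps
print(q, simulate_ber(h, w, n0=0.05, n_bits=20000))
```

The overall response `q` is simply the convolution of the channel and equalizer taps, which is why its support is $L = M + J$.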
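Traditional equalization algorithms aim at minimizing either the PD, defined as
$$\mathrm{PD} = \sum_{k=1}^{L} |q_k|, \tag{5}$$
or the MSE, defined as
$$\mathrm{MSE} = E\big[(y_k - \tilde{y}_k)^2\big]. \tag{6}$$
The corresponding adaptive equalizer algorithm that minimizes the PD is called zero-forcing (ZF):
$$w_i(k+1) = w_i(k) + \gamma\,(y_k - \tilde{y}_k)\,y_{k-i}, \quad i = 0, \ldots, J, \tag{7}$$
and that which minimizes the MSE (often referred to as LMS) is
$$w_i(k+1) = w_i(k) + \gamma\,(y_k - \tilde{y}_k)\,x_{k-i}, \quad i = 0, \ldots, J, \tag{8}$$
where $\gamma$ is a sufficiently small step-size which governs the convergence. Both approaches involve linear stochastic approximation schemes, but they fall short of efficient estimation because their goal functions have no direct relationship with PE.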
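To make the difference between the two classical updates concrete, the following sketch implements one step of each in their standard textbook form. The function names and the window convention (`x_window[i]` holds $x_{k-i}$, `y_window[i]` holds $y_{k-i}$) are assumptions made for illustration only.

```python
def equalizer_output(w, x_window):
    """Equalizer output sum_i w_i x_{k-i}; x_window[i] holds x_{k-i}."""
    return sum(wi * xi for wi, xi in zip(w, x_window))


def lms_update(w, x_window, y_k, gamma):
    """One LMS step: the error is correlated with the received samples."""
    err = y_k - equalizer_output(w, x_window)
    return [wi + gamma * err * xi for wi, xi in zip(w, x_window)]


def zf_update(w, x_window, y_window, gamma):
    """One zero-forcing step: the error is correlated with the known symbols."""
    err = y_window[0] - equalizer_output(w, x_window)
    return [wi + gamma * err * yi for wi, yi in zip(w, y_window)]


# Starting from all-zero weights, the error equals the training symbol itself.
w_lms = lms_update([0.0, 0.0], [1.0, -1.0], 1, 0.1)
w_zf = zf_update([0.0, 0.0], [0.5, 0.2], [1, -1], 0.1)
print(w_lms, w_zf)
```

Both updates share the same error term; they differ only in the sequence the error is correlated with, which is why neither is tied directly to the probability of error.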

NOVEL CHANNEL EQUALIZATION METHODS
In this section, we express PE as a function of the equalizer coefficients w and we also demonstrate that equalization by direct PE minimization is of exponential complexity. In order to circumvent this difficulty, we develop new bounds on PE, so that the equalizer coefficients can be optimized by minimizing these bounds in real time.

Weight optimization subject to minimizing the PE
Since our approach to equalization is based on minimizing the probability of error, we first express PE as a function of the equalizer coefficients, as given in [4]:
$$P_E(w) = \frac{1}{2^L} \sum_{y \in Y} \Phi\left(\frac{\sum_{l=0}^{L} q_l y_l}{\sigma}\right), \tag{9}$$
where $\Phi(\cdot)$ denotes the standard normal cumulative distribution function (cdf), $\sigma^2 = N_0 \sum_{j=0}^{J} w_j^2$, and $Y = \{y = (y_0, y_1, \ldots, y_L) \mid y_0 = -1;\ y_i \in \{-1, 1\},\ i = 1, \ldots, L\}$. Substituting (4) into (9), we obtain
$$P_E(w) = \frac{1}{2^L} \sum_{y \in Y} \Phi\left(\frac{\sum_{l=0}^{L} \sum_{j=0}^{J} w_j h_{l-j}\, y_l}{\sqrt{N_0 \sum_{j=0}^{J} w_j^2}}\right). \tag{10}$$
To find the optimal weights of the equalizer which minimize this error probability, we have to solve the following equation:
$$\nabla_w P_E(w) = 0, \tag{11}$$
where the $i$th component of the gradient is given by (12). The weights can be optimized by gradient descent, which yields the following equalization algorithm:
$$w(k+1) = w(k) - \gamma \nabla_w P_E\big(w(k)\big). \tag{13}$$
Here $w(k)$ is the value of the weight vector at the $k$th iteration and $\gamma$ is a sufficiently small step-size.
In the forthcoming discussion, this procedure is termed true gradient search (TGS). Unfortunately, performing TGS is computationally prohibitive because of the summation over an exponentially growing number of vectors in expression (12). This summation must be calculated in each step of algorithm (13). Thus, TGS can only be applied in practice if the support of the overall impulse response defined in (4) is very limited. Otherwise, near-optimal algorithms must be sought which lend themselves to real-time implementations. To ease this complexity, new bounds are derived on PE.
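The exponential cost of evaluating PE exactly can be seen in a direct implementation. The sketch below enumerates all $2^L$ interfering bit patterns with $y_0 = -1$; it assumes that the noise variance at the equalizer output is $N_0 \sum_j w_j^2$, and the function names are hypothetical.

```python
import math
from itertools import product


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overall_response(h, w):
    """Overall impulse response q = h * w (discrete convolution)."""
    q = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            q[i + j] += hi * wj
    return q


def exact_pe(h, w, n0):
    """Exact PE by enumerating the 2^L interfering bit patterns (y_0 = -1)."""
    q = overall_response(h, w)
    # Assumption: output noise variance is n0 * sum_j w_j^2.
    sigma = math.sqrt(n0 * sum(wj * wj for wj in w))
    big_l = len(q) - 1
    total = 0.0
    for tail in product((-1, 1), repeat=big_l):   # 2^L terms
        arg = -q[0] + sum(ql * yl for ql, yl in zip(q[1:], tail))
        total += phi(arg / sigma)
    return total / 2 ** big_l


print(exact_pe([1.0, 0.6, -0.45], [1.0, 0.0, 0.0], 0.1))
```

Each additional equalizer or channel tap doubles the number of terms in the loop, which is the reason a gradient step built on this sum quickly becomes impractical.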

New bounds on PE
In this section, we derive new upperbounds on PE which can be used for channel equalization. For the sake of performance analysis, lower bounds are also derived, which help to evaluate the accuracy of the upperbounds.
To develop these upperbounds, first we introduce the concept of "diagonal dominancy" (or "eye-openness") which is often used in the literature related to digital communication theory [3,4].
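Definition 1. A sequence $a_k$, $k = 0, \ldots, L$, is called "eye-opened" (diagonally dominant) if $|a_0| > \sum_{k=1}^{L} |a_k|$.
It should be noted that if a sequence $a_k$ is "eye-opened," then the associated Toeplitz matrix $A$, defined by $A_{ij} = a_{i-j}$, is diagonally dominant.
Definition 2. The peak distortion (PD) of a linear filter with impulse response function $a_k$ is defined as
$$\mathrm{PD} = \sum_{k=1}^{L} |a_k|.$$
In the forthcoming discussion, the PD is related to the overall impulse response function $q_k$ of the communication system given in (4), which is calculated as follows:
$$\mathrm{PD}(w) = \sum_{l=1}^{L} |q_l| = \sum_{l=1}^{L} \Big|\sum_{j=0}^{J} w_j h_{l-j}\Big|. \tag{15}$$
The appearance of $w$ in the notation for PD in (15) is due to the dependence of PD on the weights of the equalizer. Note that if $q_k$ is eye-opened, then $\mathrm{PD}(w) < |q_0|$.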
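As a small illustration of these notions, the sketch below computes the peak distortion of an overall response and checks eye-openness, using the condition $\mathrm{PD}(w) < |q_0|$ noted above. The function names and the second example sequence are hypothetical.

```python
def peak_distortion(q):
    """PD = sum of |q_l| over the interfering taps l = 1, ..., L."""
    return sum(abs(v) for v in q[1:])


def is_eye_open(q):
    """Eye-openness: the main tap dominates the sum of all other taps."""
    return abs(q[0]) > peak_distortion(q)


q_closed = [1.0, 0.6, -0.45]   # PD = 1.05 exceeds |q_0| = 1, eye is closed
q_open = [1.0, 0.3, -0.2]      # PD = 0.5 is below |q_0| = 1, eye is open
print(peak_distortion(q_closed), is_eye_open(q_closed))
print(peak_distortion(q_open), is_eye_open(q_open))
```

When the eye is closed, at least one interfering bit pattern can flip the sign of the noiseless equalizer output, so the starred bounds of the next theorem do not apply.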

Theorem 1.
The following bounds (16a), (16b), and (16c) can be derived on $P_E(w)$, where the inequalities marked with a star hold under the assumption of the "eye-openness" of the overall channel response function $q_k$ (i.e., when $\mathrm{PD}(w) < |q_0|$). The proof of this theorem can be found in Appendix A. It should be noted that the upperbound in (16b) is tighter than the upperbound in (16a). One must not forget, however, that upperbound (16b) is obtained under a stronger condition (eye-openness) than upperbound (16a), thus the latter can be used in more general circumstances.
Unfortunately, due to the nondifferentiability of $\mathrm{PD}(w)$, it is still difficult to minimize the newly obtained bounds with respect to the weight vector. By using the Cauchy-Schwarz inequality to upperbound $\mathrm{PD}(w)$, differentiable bounds can be derived on $P_E(w)$.

Theorem 2. The following additional bounds on P E (w) can be derived, where the inequality (19b) (provided with a star) holds under the assumption of "eye-openness" of the overall channel response function:
Note that bounds (16a) and (16b) are tighter than (19a) and (19b), respectively (due to the application of the Cauchy-Schwarz inequality in the latter ones). On the other hand, the advantage of (19a) and (19b) is that both $G(w)$ and $Q(w)$ are differentiable functions of the weights, which can give rise to gradient-based equalization algorithms.
The proof of this theorem can be found in Appendix B.
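The loosening introduced by the Cauchy-Schwarz step can be checked numerically. The sketch below compares $\mathrm{PD}$ with $\sqrt{L \sum_{l=1}^{L} q_l^2}$ (the quantity used in Appendix B) and shows the effect on a bound of the assumed form $\Phi((-q_0 + \cdot)/\sigma)$; this form, the example response, and the value of `sigma` are illustrative assumptions.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def peak_distortion(q):
    """Non-differentiable peak distortion: sum of |q_l| for l >= 1."""
    return sum(abs(v) for v in q[1:])


def cauchy_schwarz_pd(q):
    """Differentiable upper bound on PD: sqrt(L * sum_{l>=1} q_l^2)."""
    big_l = len(q) - 1
    return math.sqrt(big_l * sum(v * v for v in q[1:]))


q = [1.0, 0.3, -0.2]   # hypothetical eye-opened overall response
sigma = 0.5            # hypothetical output noise standard deviation

pd = peak_distortion(q)
pd_cs = cauchy_schwarz_pd(q)
# Assumed bound form: Phi is increasing, so a larger PD surrogate
# yields a looser (larger) upper bound on the error probability.
tight = phi((-q[0] + pd) / sigma)
loose = phi((-q[0] + pd_cs) / sigma)
print(pd, pd_cs, tight, loose)
```

The gap between the two values vanishes only when all interfering taps have equal magnitude, which is the equality case of the Cauchy-Schwarz inequality.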

Channel equalization by minimizing the bounds on PE
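Bound-based equalization algorithm related to bound (19a) (BBEAd)
$$w(k+1) = w(k) - \gamma \nabla_w G\big(w(k)\big), \tag{20}$$
where $\gamma$ is a sufficiently small step-size and the gradient of $G(w)$ is given by (21). This procedure is obtained by minimizing $G(w)$ with gradient search, where $G(w)$ is defined in Theorem 2. Since $\Phi(\cdot)$ is monotone, it is enough to minimize $G(w)$.
Bound-based equalization algorithm related to bound (19b) (BBEAe)
$$w(k+1) = w(k) - \gamma \nabla_w Q\big(w(k)\big), \tag{22}$$
where $\gamma$ is a sufficiently small step-size and the gradient of $Q(w)$ is given by (23). One must note that the advantage of minimizing bounds (19a) and (19b) is that the gradients of $G(w)$ and $Q(w)$ contain no summation over an exponentially growing set. In this way a much faster equalization can be obtained by applying algorithm (20) or (22) than by applying (13).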
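A bound-based gradient search of this kind might look as follows. This is a sketch under explicit assumptions, not the paper's algorithm: $G(w)$ is assumed to have the form $(-q_0 + \sqrt{L \sum_{l \ge 1} q_l^2})/\sigma$ with $\sigma^2 = N_0 \sum_j w_j^2$, the gradient is approximated by central differences instead of a closed-form expression, and the example channel and all function names are hypothetical.

```python
import math


def overall_response(h, w):
    """Overall impulse response q = h * w (discrete convolution)."""
    q = [0.0] * (len(h) + len(w) - 1)
    for i, hi in enumerate(h):
        for j, wj in enumerate(w):
            q[i + j] += hi * wj
    return q


def g_bound(h, w, n0):
    """Assumed form of G(w): (-q_0 + sqrt(L * sum_{l>=1} q_l^2)) / sigma."""
    q = overall_response(h, w)
    big_l = len(q) - 1
    sigma = math.sqrt(n0 * sum(wj * wj for wj in w))
    return (-q[0] + math.sqrt(big_l * sum(v * v for v in q[1:]))) / sigma


def numerical_gradient(f, w, eps=1e-6):
    """Central-difference gradient (stands in for the analytic gradient)."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def minimize_bound(h, n0, n_taps, gamma=0.002, n_iter=2000):
    """Fixed-step gradient descent on G(w), keeping w normalized."""
    w = [1.0] + [0.0] * (n_taps - 1)
    f = lambda v: g_bound(h, v, n0)
    history = [f(w)]
    for _ in range(n_iter):
        grad = numerical_gradient(f, w)
        w = [wi - gamma * gi for wi, gi in zip(w, grad)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]   # enforce w^T w = 1
        history.append(f(w))
    return w, history


w_opt, hist = minimize_bound([1.0, 0.3, -0.2], n0=0.1, n_taps=3)
print(w_opt, hist[0], hist[-1])
```

The cost of one iteration grows only polynomially with the number of taps, in contrast to the $2^L$ terms needed for the exact PE gradient.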

Obtaining channel-state information
In order to run the proposed algorithms, channel-state information is needed (the channel impulse response function $h_k$ appears in expressions (13) and (20)). There are plenty of real-time adaptive channel identification algorithms [1] which provide fast and simple channel-state information. In this paper, we identify the channel with an adaptive FIR filter, the coefficients of which are updated according to algorithm (24), where the symbols $y_k$ come from a sufficiently long training sequence (known at the receiver side prior to the start of the real communication), while $x_k$ is the observed input at the receiver, that is, the received sequence at the output of the channel. Algorithm (24) minimizes the MSE between the unknown channel impulse response function $h_i$, $i = 0, \ldots, M$, and the FIR filter coefficients $g_i$, $i = 0, 1, 2, \ldots, M$. In steady state, algorithm (24) provides weights for which $g_i = h_i$ in mean square if the degree of the FIR filter is larger than the length of the channel impulse response (overmodeling).
It is noteworthy that the adaptive channel identifier (24) converges rather fast to the true channel state because of the narrow eigenvalue spectrum of the underlying matrices (for further details see [3]). Hence, the combination of identification and equalization can provide real-time solutions for low-PE reception of digital information.
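A training-based identifier of this type can be sketched with a standard LMS system-identification loop. The update rule used here is the textbook LMS form and is an assumption, as are the noise-free demonstration, the step-size, and the function name `identify_channel`.

```python
import random


def identify_channel(x, y, n_taps, gamma=0.05):
    """LMS identification: adapt g so that sum_i g_i y_{k-i} tracks x_k."""
    g = [0.0] * n_taps
    for k in range(len(x)):
        # Training symbols y_k, ..., y_{k-n_taps+1} (zero before the start).
        window = [y[k - i] if k - i >= 0 else 0 for i in range(n_taps)]
        err = x[k] - sum(gi * yi for gi, yi in zip(g, window))
        g = [gi + gamma * err * yi for gi, yi in zip(g, window)]
    return g


rng = random.Random(1)
h = [1.2, 1.1, -0.2]                                  # channel to be identified
y = [rng.choice((-1, 1)) for _ in range(2000)]        # known training sequence
# Noise-free received sequence, used here only to keep the demo deterministic.
x = [
    sum(h[n] * y[k - n] for n in range(len(h)) if k - n >= 0)
    for k in range(len(y))
]
g = identify_channel(x, y, n_taps=3)
print(g)
```

Because the training symbols are white, the correlation matrix of the identifier input is close to a scaled identity, which is consistent with the fast convergence noted above.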

NUMERICAL RESULTS
In this section, a detailed performance analysis is given to evaluate the PE achieved by the different equalization methods and to compare their convergence speed and algorithmic complexity.

Channel characteristics and channel-state information
The simulations were carried out for four different channel models representing multipath propagation in different practical scenarios. The corresponding channel characteristics are given by their impulse responses $h^{(1)}$, $h^{(2)}$, $h^{(3)} = [1;\ 0.6;\ -0.45]^T$, and $h^{(4)} = [1.2;\ 1.1;\ -0.2]^T$. One must note that $h^{(3)}$ and $h^{(4)}$ are non-minimumphase channels. In this case, PE can be decreased by introducing a delay $D$ into the equalization in the following way: (i) instead of (3), the decision is carried out by $\hat{y}_{k-D} = \operatorname{sgn}\{\tilde{y}_k\} = \operatorname{sgn}\{\sum_{i=0}^{J} w_i x_{k-i}\}$; (ii) bound (19a) (see Theorem 2) must be modified accordingly. In order to achieve the best performance, one should choose $D = 0$ in the case of minimumphase channels, or $D = J$ in the case of non-minimumphase channels.
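The non-minimumphase property of the two channels quoted above can be verified by locating the zeros of their transfer functions. The sketch below handles three-tap channels only; the function names and the third (minimumphase) example are hypothetical.

```python
import cmath


def zeros_of_three_tap(h):
    """Zeros of H(z) = h0 + h1 z^-1 + h2 z^-2, i.e. roots of h0 z^2 + h1 z + h2."""
    h0, h1, h2 = h
    disc = cmath.sqrt(h1 * h1 - 4 * h0 * h2)
    return ((-h1 + disc) / (2 * h0), (-h1 - disc) / (2 * h0))


def is_minimum_phase(h):
    """A channel is minimumphase if all zeros lie inside the unit circle."""
    return all(abs(z) < 1.0 for z in zeros_of_three_tap(h))


for name, h in (("h3", [1.0, 0.6, -0.45]), ("h4", [1.2, 1.1, -0.2])):
    magnitudes = [abs(z) for z in zeros_of_three_tap(h)]
    print(name, magnitudes, is_minimum_phase(h))
```

Both channels have one zero slightly outside the unit circle, which is why a decision delay helps the linear equalizer in these cases.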
As far as the channel-state information is concerned, we investigated two scenarios: (i) first, the exact channel-state information (the impulse response of the channel) was assumed to be known at the receiver side, and therefore the equalizer algorithms were run using the corresponding h vector; (ii) second, no channel-state information was assumed to be available at the receiver side, thus channel equalization was preceded by the adaptive channel identifier algorithm given in (24).

The PE versus SNR
In this section, we numerically investigate PE with respect to SNR. The performance was analyzed with 2 up to 8 equalizer coefficients. In the case of the TGS, BBEAd, and BBEAe algorithms, the weight vector of the equalizer was normalized by setting $w^T w = 1$, since PE is invariant to the normalization. The step-size of the gradient descent algorithms was kept constant during the optimization. For the sake of comparison, the exact PE was calculated by formula (9) using the exact channel-state information. The PE-SNR curves are plotted for the different channels in Figures 1, 2, 3, and 4, respectively. In the case of non-minimumphase channels (Figures 3 and 4) the new methods far outperform the classical ones, while in the case of minimumphase channels the benefit is not so large. The best results were obtained by the TGS method, which yields a 1-6 dB gain in SNR relative to the traditional solutions. Furthermore, Figure 3 depicts an example in which the traditional algorithms cannot provide better performance even with increasing SNR, while the new methods are capable of further decreasing PE. The BBEAd algorithm gets very close to the performance of TGS, but it runs much faster (due to the newly derived bounds on PE). It is noteworthy that TGS needs exponential complexity in each step to calculate the gradient of PE by the exact summation according to formula (10). Therefore, TGS can only be applied in practice if the support of the overall impulse response (channel impulse response convolved with the equalizer impulse response, defined in (4)) is very limited. This, in turn, puts severe limitations on the number of equalizer coefficients when using TGS. This argument also prompts the use of the new algorithms, whose complexity is not exponential with respect to the support of the overall impulse response.
The performance of BBEAe is very close to that of BBEAd for minimumphase channels, which is explained by the fact that, of the two terms in bound (19b), one dominates the other in the case of small noise. As a result, bound (19b) converges to bound (19a). When the noise is large, all algorithms yield similar performance, as demonstrated by Figures 1-4. Furthermore, BBEAe does not converge in the case of non-minimumphase channels. For the sake of comparison we also plotted the PE-SNR curve of the AMBER algorithm (for details see [5]), which exhibits almost the same performance as TGS. On the other hand, the traditional equalizer methods (ZF and LMS) yield significantly worse performance than the new bound-based methods. It is noteworthy, however, that in the case of minimumphase channels the performance of the LMS method can come close to the minimum PE solution.
In Figures 5 and 6, the ratio $P_E^{\mathrm{LMS}} / P_E^{\min}$ is depicted with respect to the number of equalizer coefficients for different SNR values and for two different channels. One can see that, on the one hand (in the case of $h^{(3)}$), the difference in performance increases in favor of the minimum PE solution as the number of equalizer coefficients grows. On the other hand, in the case of $h^{(4)}$, the performances of LMS and of the minimum PE solution converge to each other as the number of equalizer coefficients grows. Hence, the gain obtained by increasing the number of equalizer coefficients depends on the type of channel to be equalized.

Convergence time and numerical complexity
In this section, the convergence properties of the obtained algorithms are analyzed in comparison with their numerical complexity. The numerical complexity is measured by the number of additions and multiplications required for a single update of the equalizer coefficients. The complexity of the different algorithms is depicted in Figure 7 in the case of 5 channel coefficients. Note that TGS needs an exponential number of summations, while the other methods are much simpler, providing real-time equalization.
The convergence properties of the different algorithms in the case of SNR = 30 dB are compared in Figure 8, where the convergence time is averaged over channel characteristics $h^{(1)}$, $h^{(2)}$, $h^{(3)}$, and shown in the case of two and six equalizer coefficients. The convergence time is measured by the number of iterations from the initial state to the one after which the relative changes of PE do not exceed 5%. All the equalizer algorithms were started from the same initial state $w = [1, 0, \ldots, 0]^T$ and the step-sizes of the algorithms were optimized empirically. In the case of Figures 9 and 10, we set $\gamma = 0.1$ for TGS and TGSI, $\gamma = 0.002$ for BBEAd, $\gamma = 0.01$ for LMS, and $\gamma = 0.2$ for AMBER. In the case of AMBER, we set the learning threshold to $\tau = 0.5$, providing increased convergence speed (for further details see [5]).
Figure 9: PE versus the number of iterations in the case of channel $h^{(3)}$ and 30 dB SNR (10 runs are averaged); symbol "I" in the legend refers to the case of applying a plugged-in channel identifier.
Figure 10: PE versus the number of iterations in the case of channel $h^{(4)}$ and 27 dB SNR (10 runs are averaged); symbol "I" in the legend refers to the case of applying a plugged-in channel identifier.
Fast convergence can still be maintained in the case of an unknown channel state, when an adaptive channel identification precedes the equalization algorithms. We started the identification and the equalizer algorithms at the same time instant and all updates of the equalizer were calculated by applying the current estimate of the channel. We used the simple traditional channel identification algorithm given in (24). The number of channel coefficients was assumed to be known. Convergence curves for channels $h^{(3)}$ and $h^{(4)}$ for a given SNR with adaptive channel identification can be seen in Figures 9 and 10. One must note that the convergence time of the proposed algorithms is about 10 times smaller than that of the traditional algorithms or the AMBER algorithm (which also minimizes PE but whose convergence is apparently much slower). This justifies the use of these algorithms in real-time, high-data-speed applications.
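The convergence-time measure described above can be made concrete as follows. The sketch adopts one possible reading of the 5% criterion, namely the first iteration after which every subsequent relative change of PE stays within the tolerance; this interpretation, the function name, and the sample trajectory are assumptions.

```python
def convergence_time(pe_history, tol=0.05):
    """First iteration index after which all relative PE changes are <= tol.

    Assumes every PE value in the trajectory is strictly positive.
    """
    changes = [
        abs(after - before) / before
        for before, after in zip(pe_history, pe_history[1:])
    ]
    k = len(changes)
    # Walk backwards while the trailing changes stay within the tolerance.
    while k > 0 and changes[k - 1] <= tol:
        k -= 1
    return k


# Hypothetical PE trajectory of an adaptive equalizer.
trajectory = [0.4, 0.2, 0.1, 0.098, 0.097]
print(convergence_time(trajectory))
```

With this reading, a trajectory that is still changing by more than 5% at its last step is reported as not yet converged (the function returns the index of the last sample).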

CONCLUSIONS
In this paper, novel channel equalizer algorithms have been developed based on newly derived bounds on PE. Due to the simplicity of the bounds, fast equalization algorithms can be obtained, the performance of which is close to optimum. Since these bounds need channel state information, the equalizer is preceded by an adaptive channel identifier. The combined convergence of channel identification and the new bound-based equalization is still much faster than that of other algorithms (e.g., AMBER, ZF, or LMS). The new methods yielded better performance than the traditional ZF and LMS equalizer algorithms. The operational complexity of the new bound-based algorithms is also low, requiring very simple calculations, similarly to AMBER, ZF, or LMS. These benefits make the new algorithms suitable for real-time applications. In Table 1, the properties of the new and traditional algorithms are compared.

A. PROOF OF THEOREM 1
First we prove the right-hand side inequality of (16a), which can be deduced from (10). Since $\sum_{l=1}^{L} q_l y_l \le \sum_{l=1}^{L} |q_l| = \mathrm{PD}(w)$ for every $y \in Y$ and $\Phi(\cdot)$ is monotonically increasing, each term under the summation sign in (10) can be upperbounded by replacing $\sum_{l=1}^{L} q_l y_l$ with $\mathrm{PD}(w)$. The overall expression (10) can therefore be upperbounded in the same way, which proves the right-hand side of (16a). The lowerbound of (16a) can be proven by recalling the bit error probability given in the form of (9), which can be rewritten as (A.4). If $q_k$ is eye-opened, then, recalling (14) and the ensuing discussion, we obtain
$$-q_0 + \sum_{l=1}^{L} q_l y_l \le -q_0 + \mathrm{PD}(w) < 0, \qquad -q_0 - \mathrm{PD}(w) \le -q_0 - \sum_{l=1}^{L} q_l y_l < 0. \tag{A.5}$$
Since the corresponding inequality for the Gaussian cdf is fulfilled for all $a > \varepsilon > 0$, we can apply this result, taking $a = -q_0/\sigma$ and $\varepsilon = \sum_{l=1}^{L} q_l y_l / \sigma$, to lowerbound the bit error probability, proving the left-hand side of (16a).

EURASIP Journal on Wireless Communications and
The proof of the upperbound in (16b) is based on an inequality which can be easily verified from the properties of the Gaussian cdf. Casting $a = q_0$, $\varepsilon = \sum_{l=1}^{L} q_l y_l$, and $b = \mathrm{PD}(w)$, and making use of the eye-openness again, we obtain the upperbound in (16b) from representation (14). For the lowerbound in (16b), we reshuffle the sum of the bit error probability in expression (A.4) and apply the definition of $\sigma^2$ and $\mathrm{PD}(w)$. Due to expression (A.7), the resulting expression can be lowerbounded, which completes the proof of the lowerbound in (16b).
To prove the lower bound (16c), we first make use of the fact that $q_0 = w_0 h_0$. Applying the resulting formulas to the lowerbound of (16a) concludes the derivation of the lowerbound (16c).

B. PROOF OF THEOREM 2
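The proof of inequalities (19a) and (19b) follows from the application of the Cauchy-Schwarz inequality to $\mathrm{PD}(w)$ in the bounds given by the right-hand side of (16a) and (16b), respectively. Namely,
$$\mathrm{PD}(w) = \sum_{l=1}^{L} |q_l| = \sum_{l=1}^{L} q_l \operatorname{sgn}\{q_l\} = \langle q, \operatorname{sgn}\{q\} \rangle \le \|q\|\,\|\operatorname{sgn}\{q\}\| \le \sqrt{L \sum_{l=1}^{L} q_l^2} = \sqrt{L \sum_{l=1}^{L} \Big(\sum_{j=0}^{J} w_j h_{l-j}\Big)^2}.$$
The quantity $\sqrt{L \sum_{l=1}^{L} \big(\sum_{j=0}^{J} w_j h_{l-j}\big)^2}$ can now be substituted for $\mathrm{PD}(w)$ in the upperbounds of (16a) and (16b), which results in expressions (19a) and (19b), respectively.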

C. COMPARISON TO CLASSICAL EQUALIZATION ALGORITHMS
For the sake of performance analysis, one may want to compare the newly derived bounds on PE with the classical ones related to the PD and MSE [3], given as (C.1a) and (C.1b), where $w_0 = 1/h_0$.
It is noteworthy that bound (C.1a) depends on $w$ through $\mathrm{PD}(w)$ and this dependence is monotone. This fact allows the minimization of bound (C.1a) to be reduced to the minimization of $\mathrm{PD}(w)$. It can be proven [3] that the optimal weight vector $w^{(f)}$ which minimizes the function $\mathrm{PD}(w)$ under the condition that $w_0 = 1/h_0$ is the solution of a set of linear equations. The minimization of the MSE has likewise been treated in numerous papers (see, e.g., [3]), and its solution is also given by a set of linear equations. One must note that these bounds yield the optimal equalizer coefficients as the solution of a set of linear equations, whereas the gradients of the newly derived bounds are nonlinear. As a result, there is a tradeoff between the performance (sharpness of the bound) and the complexity of the weight optimization. Namely, low-complexity weight optimization (which reduces to solving a set of linear equations in the case of minimizing the PD or MSE) can result in poor performance. When, however, more sophisticated bounds, which yield better performance, are minimized, a relatively time-consuming gradient descent must be performed. Finally, when one wants to minimize the error probability itself, the complexity of the weight optimization algorithm becomes enormous, which prevents its practical implementation.

Figure 1: PE versus SNR performance of the different methods in the case of channel $h^{(1)}$ and 3 equalizer coefficients.

Figure 2: PE versus SNR performance of the different methods in the case of channel $h^{(2)}$ and 6 equalizer coefficients.

Figure 3: PE versus SNR performance of the different methods in the case of channel $h^{(3)}$, 3 equalizer coefficients and D = 2.

Figure 5: The ratio $P_E^{\mathrm{LMS}} / P_E^{\min}$ versus the number of equalizer coefficients for channel $h^{(3)}$.

Figure 6: The ratio $P_E^{\mathrm{LMS}} / P_E^{\min}$ versus the number of equalizer coefficients for channel $h^{(4)}$.

Figure 7: Number of operations required for a single update of the equalizer coefficients in the case of 5 channel coefficients.

Figure 8: Convergence time of the different algorithms in the case of 2 and 6 equalizer coefficients.
Here E[X] denotes the expected value of X. Inequality (C.1a) can be easily derived from the upperbound in expression (16a), taking into account the monotonicity of $\Phi(\cdot)$.

Table 1: Comparison of the new and traditional algorithms.