Open Access

Bidirectional algorithms for interference suppression in multiuser systems

EURASIP Journal on Wireless Communications and Networking 2015, 2015:228

Received: 2 March 2015

Accepted: 31 August 2015

Published: 19 October 2015


This paper presents adaptive bidirectional minimum mean-square error parameter estimation algorithms for fast-fading channels. The time correlation between successive channel gains is exploited to improve the estimation and tracking capabilities of adaptive algorithms and provide robustness against time-varying channels. Bidirectional normalized least mean-square and conjugate gradient algorithms are devised along with adaptive mixing parameters that adjust to the time-varying channel correlation properties. An analysis of the proposed algorithms is provided along with a discussion of their performance advantages. Simulations for an application to interference suppression in multiuser DS-CDMA systems show the advantages of the proposed algorithms.


Keywords: Multiuser systems, Interference suppression, Adaptive algorithms, Mobile channels

1 Introduction

Low-complexity reception and interference suppression are essential in multiuser mobile systems if battery power is to be conserved, data rates improved, and quality of service enhanced. Conventional adaptive schemes fulfill many of these requirements and have been a significant focus of the research literature [1–8]. However, in the time-varying fading channels commonly associated with mobile systems, these adaptive techniques encounter tracking and convergence problems. Optimum closed-form solutions can address these problems, but their computational complexity is high and channel state information (CSI) is required. Low-complexity adaptive channel estimation can provide CSI, but in highly dynamic channels tracking problems arise because of the finite adaptation rate of the estimators [9]. An alternative statistical approach is to obtain the correlation structures required for optimal minimum mean-square error (MMSE) or least-squares (LS) filtering [10, 11]. Although this relieves the tracking demands placed on the filtering process, in a Rayleigh fading channel the result is a zero correlator, because the expectation of a Rayleigh fading coefficient, and therefore the cross-correlation vector, equals zero, i.e., \(E\left[h_{1}[n]\right]=0\) and \(E\left [b_{1}^{\ast }[\!n]\mathbf {r}[\!n]\!\right ]=\mathbf{0}\). In slowly fading channels, this problem may be overcome by using a time-averaged approach where the averaging period is equal to or less than the coherence time of the channel. However, in fast fading channels, an averaging period equal to the coherence time of the channel is insufficient to overcome the effects of additive noise and characterize the multiuser interference (MUI) [1].

Furthermore, incorporating optimized convergence parameters such as step sizes and forgetting factors into conventional adaptive algorithms extends their fading range and leads to improved convergence and tracking performance [8, 12–18]. However, the stability of adaptive step sizes and forgetting factors can be a concern unless they are constrained to lie within a predefined region [19]. Other alternative schemes include those based on processing the received data in subblocks [20–22] and subspace algorithms [23–28]. In addition, the fundamental problem of obtaining the unfaded symbols whilst suppressing MUI remains. Consequently, the application of such algorithms is restricted to low and moderate fading rates. The limitations of conventional estimation approaches led to the development of methods that attempt to track the faded symbol, such as the channel-compensated MMSE solution [29, 30]. This removes the burden of fading coefficient estimation from the receive filter. However, a secondary process is required to perform explicit estimation of the fading coefficients in order to perform symbol estimation [31].

Approaches that avoid tracking and estimation of the fading coefficients were proposed in [31–33]. Although a channel might be highly time variant, two adjacent fading coefficients will be similar and have a significant level of correlation, as studied in [31–33]. These properties can then be exploited to obtain a sequence of faded symbols where the primary purpose of the filter is to suppress multiuser interference and track the ratio between successive fading coefficients, so that it is not burdened with estimation of the fading coefficients themselves. However, this scheme has a number of limitations stemming from the use of only one correlation time instant and a single class of adaptive algorithms.

In this work, a bidirectional MMSE-based interference suppression scheme for highly dynamic fading channels is presented. The non-zero correlation between multiple time instants is exploited to improve the robustness, tracking, and convergence performance of existing MMSE schemes. Unlike existing adaptive solutions [8, 31–33], which do not fully exploit the fading correlation between multiple successive time instants, the proposed bidirectional approach exploits the correlation and adaptively weighs the output of the receive filter in order to optimize the estimation performance. Normalized least-mean square (NLMS) and conjugate gradient (CG)-type algorithms are presented that overcome a number of problems associated with applying the recursive least-squares (RLS) algorithm to bidirectional problems. Novel mixing strategies that weigh the contribution of the considered time instants and improve the convergence and steady-state performance, increasing the robustness against channel discontinuities, are also presented. An analysis of the proposed schemes is developed and establishes the mechanisms and factors behind their behavior and expected performance. The proposed schemes are applied to conventional multiuser DS-CDMA [2] and cooperative DS-CDMA systems [3, 4] to assess their MUI suppression and tracking capabilities. The application of the proposed scheme and algorithms to multiple-antenna and multicarrier systems is also possible. Simulations show that the algorithms improve upon existing schemes with minimal increase in complexity.

The main contributions of this work can be summarized as:
  • Bidirectional MMSE-based interference suppression scheme for highly dynamic fading channels.

  • Bidirectional adaptive parameter estimation algorithms based on NLMS and CG techniques.

  • An analysis of the convergence and the computational complexity of the proposed algorithms.

  • A study of the proposed and existing algorithms in DS-CDMA and cooperative DS-CDMA multiuser systems.

This paper is organized as follows. Section 2 briefly details the signal models of a conventional DS-CDMA system and a cooperative DS-CDMA system. Section 3 presents the proposed scheme and its corresponding optimization problems and the motivation behind their development. Switching and mixing strategies that optimize performance are proposed and assessed in Section 4, followed by the derivation of the proposed algorithms in Section 5. An analysis of the proposed algorithms is given in Section 6, whereas performance evaluation results are presented in Section 7. Conclusions are drawn in Section 8.

2 Signal models

In this section, we describe the signal models of a DS-CDMA system operating in the uplink and of a cooperative DS-CDMA system in the uplink equipped with relays and the amplify-and-forward (AF) cooperation protocol. These systems are employed to test the proposed algorithms, although extensions to multiple-antenna and multicarrier systems can also be considered with appropriate modifications of the algorithms.

2.1 DS-CDMA signal model

We consider the uplink of a synchronous DS-CDMA system with \(K\) users, \(N\) chips per symbol, and \(L_{p}\) (\(L_{p}<N\)) propagation paths for each link. We assume that the delay is a multiple of the chip period, that the channel is constant during each symbol interval, and that the spreading codes are repeated from symbol to symbol. The received signal, after filtering by a chip-pulse matched filter and sampling at the chip rate, yields the \(M\)-dimensional received vector given by
$${} {\fontsize{9.2pt}{9.6pt}\selectfont{\begin{aligned} \mathbf{r}[\!i] = A_{1}b_{1}[\!i]\mathbf{H}_{1}[\!i]\mathbf{c}_{1}[\!i] +\underbrace{\sum^{K}_{k=2}A_{k}b_{k}[\!i]\mathbf{H}_{k}[\!i]\mathbf{c}_{k}[\!i]}_{\text{MUI}} +\boldsymbol{\eta}[\!i] + \mathbf{n}[\!i], \end{aligned}}} $$
where \(M=N+L_{p}-1\), and \(\mathbf{c}_{k}[i]\) and \(A_{k}\) are the spreading sequence and signal amplitude of the \(k\)th user, respectively. The \(M\times N\) channel matrix with \(L_{p}\) paths is given by \(\mathbf{H}_{k}[i]\) for the \(k\)th user, the \(M\times 1\) vector \(\boldsymbol{\eta}[i]\) corresponds to the intersymbol interference, and \(\mathbf{n}[i]\) is the noise vector. Conventional schemes use BPSK modulation, whereas the differential and bidirectional schemes employ differential BPSK, where the sequence of data symbols transmitted by the \(k\)th user is given by \(b_{k}[i]=a_{k}[i]b_{k}[i-1]\) and \(a_{k}[i]\) is the unmodulated baseband data. Assuming that linear receive processing is adopted, the output of the receive filter is given by
$$ x[\!i] = {\mathbf w}^{H}[\!i]{\boldsymbol{r}}[\!i], $$

where \(\mathbf{w}[i]\) is an \(M\)-dimensional vector that corresponds to the receive filter.
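The differential BPSK rule \(b_{k}[i]=a_{k}[i]b_{k}[i-1]\) can be illustrated with a minimal Python sketch. The function names and the reference symbol `b_init` (standing in for \(b_{k}[-1]\)) are hypothetical choices made for illustration; they are not specified in the signal model above.

```python
def dbpsk_encode(a, b_init=1):
    """Differentially encode data a[i] in {+1, -1}: b[i] = a[i] * b[i-1]."""
    b = []
    prev = b_init  # reference symbol b[-1]; its value is an assumption
    for a_i in a:
        prev = a_i * prev
        b.append(prev)
    return b


def dbpsk_decode(b, b_init=1):
    """Recover a[i] = b[i] * b[i-1], which holds for +/-1 symbols."""
    a = []
    prev = b_init
    for b_i in b:
        a.append(b_i * prev)
        prev = b_i
    return a


a = [1, -1, -1, 1, -1]   # unmodulated baseband data
b = dbpsk_encode(a)      # transmitted differential symbols
a_hat = dbpsk_decode(b)  # data recovered from successive symbol pairs
```

Because the data is carried by the product of adjacent symbols, a sign flip common to all symbols (including the reference) leaves the decoded sequence unchanged, which is why only the ratio between successive received samples needs to be tracked.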

2.2 Cooperative DS-CDMA signal model

We also consider the uplink of a cooperative DS-CDMA system with \(K\) users, \(N_{r}\) relays, \(N\) chips per symbol, and \(L_{p}\) (\(L_{p}<N\)) propagation paths for each link. The system shown in Fig. 1 is equipped with an AF protocol at each relay. The received signals at the \(n\)th relay and at the destination node are filtered by a chip-pulse matched filter, sampled at the chip rate to obtain sufficient statistics, and organized into \(M\times 1\) vectors as described by
$$ \mathbf{r}_{\text{sr}_{n}}[\!i] = \sum^{K}_{k=1} a_{\mathrm{s}_{k}}[\!i]b_{k}[\!i]h_{\text{sr}_{n}}[\!i]\mathbf{c}_{k}[\!i] + \mathbf{n}_{\mathrm{r}_{n}}[\!i], $$
$$ \mathbf{r}_{\text{rd}}[\!i] = \sum^{N_{\mathrm{r}}}_{n=1}a_{\mathrm{r}_{n}}[\!i]h_{\mathrm{r}_{n}\mathrm{d}}[\!i] \mathbf{r}_{\text{sr}_{n}}[\!i] + \mathbf{n}_{\mathrm{d}}[\!i] $$
$$ \begin{aligned} \mathbf{r}_{\text{rd}}[\!i] &=\sum^{N_{\mathrm{r}}}_{n=1}\sum^{K}_{k=1}a_{\mathrm{s}_{k}}[\!i] a_{\mathrm{r}_{n}}[\!i] h_{\text{sr}_{n}}[\!i] h_{\mathrm{r}_{n}\mathrm{d}}[\!i] \mathbf{c}_{k}[\!i] b_{k}[\!i]\\ &\quad + \sum^{N_{\mathrm{r}}}_{n=1} a_{\mathrm{r}_{n}}[\!i] h_{\mathrm{r}_{n}\mathrm{d}}[\!i]\mathbf{n}_{\mathrm{r}_{n}}[\!i] + \mathbf{n}_{\mathrm{d}}[\!i], \end{aligned} $$

where \(h_{\text {sr}_{n}}[\!i]\) and \(h_{\mathrm {r}_{n}\mathrm {d}}[\!i]\) are the fading channel coefficients between the source and the \(n\)th relay, and between the \(n\)th relay and the destination, respectively, and \(\mathbf {n}_{\mathrm {r}_{n}}[\!i]\) and \(\mathbf{n}_{\mathrm{d}}[i]\) are additive white Gaussian noise vectors at the relays and the destination, respectively.

Fig. 1

Cooperative DS-CDMA system model

The received data is processed by a linear receive filter, which produces the output given by
$$ x[\!i] = {\mathbf w}^{H}[\!i]{\boldsymbol{r}}_{\text{rd}}[\!i], $$

where \(\mathbf{w}[i]\) is an \(M\)-dimensional vector that corresponds to the receive filter for the cooperative system.

3 Proposed bidirectional scheme

Adaptive parameter estimation has two primary objectives: estimation and tracking of the desired parameters. When applied to multiuser wireless systems, these translate into recovery of the desired symbol, tracking of channel variations and suppression of MUI. However, in fast fading channels, these objectives place unrealistic demands on conventional filtering and estimation schemes. Differential techniques reduce these demands by relieving adaptive receivers from the task of tracking fading coefficients [31]. This is achieved by posing an optimization problem where the ratio between two successive received samples is the quantity to be tracked. Such an approach is enabled by the presumption that, although the fading is fast, there is correlation between the adjacent channel samples as described by
$$ f_{1}[\!i]=E\left[h_{1}[\!i]h_{1}^{\ast}[i+1]\!\right]\geq 0, $$

where \(h_{1}[i]\) is the channel coefficient of the desired user. The interference suppression of the resulting receive filter is improved in fast fading environments compared to conventional adaptive receivers, but only the ratio of adjacent fading samples is obtained. Consequently, differential MMSE schemes are suited to differential modulation, where the ratio between adjacent symbols is the data-carrying mechanism.

However, limiting the optimization to two adjacent samples not only exposes these processes to the negative effects of uncorrelated samples,
$$ E\left[h_{1}[\!i]h_{1}^{\ast}[\!i+1]\!\right]\approx 0, $$
but also fails to exploit the correlation that may be present between two or more adjacent samples, i.e.,
$$ \begin{aligned} f_{2}[\!i]&=E\left[h_{1}[\!i]h_{1}^{\ast}[\!i-1]\!\right]>0\\ f_{3}[\!i]&=E\left[h_{1}[i+1]h_{1}^{\ast}[\!i-1]\!\right]>0.\\ \end{aligned} $$
In order to address these weaknesses, we propose a bidirectional MSE cost function based on multiple adjacent samples so that the number of channel scenarios under which the differential MMSE performs beneficial adaptation is substantially increased. The scheme is termed the bidirectional MMSE due to its use of multiple time instants. The motivation behind this proposition is illustrated by the plots of fading channel coefficients in Fig. 2, where \(\mathrm{J}_{1}\) represents the two-sample differential MMSE. When there is a low level of correlation between samples \(i\) and \(i-1\), any adaptation of the receive filter based on \(\mathrm{J}_{1}\) alone will bring little benefit. However, the proposed scheme for three time instants operates over \(\mathrm{J}_{1}\), \(\mathrm{J}_{2}\), and \(\mathrm{J}_{3}\); therefore, it can exploit the correlation between \(i+1\) and \(i-1\) and past data. Figure 2 also gives an example of a channel where there is a significant level of correlation between samples.
Fig. 2

Fading channels: an uncorrelated fading channel in the top figure and a correlated fading channel in the bottom figure

The proposed bidirectional scheme can be expressed in the form of an optimization problem as described by
$${} {\fontsize{9.4pt}{9.6pt}\selectfont{\begin{aligned} \mathbf{w}_{o}[\!i] = \underset{\mathbf{w}[i]}{\mathrm{arg \; min}}\, & E\left[\left|b[\!i]\mathbf{w}^{H}[\!i] \mathbf{r}[\!i-1]-b[i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right|^{2}\right.\\ &\vdots\\ &+\left|b[\!i]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-(D-1)]\right.\\& \left.- b[i-(D-1)]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right|^{2}\\ &\\ &+\left|b[i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-2]\right. \\ &\left. -b[i-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-1]\right|^{2}\\ &\vdots \\ &+\left|b[i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-(D-1)] \right. \\& \left.-b[i-(D-1)]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-1]\right|^{2}\\ &\\ &+\left.\left|b[i-(D-2)]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-(D-1)]\right.\right.\\&\left.\left.-b[i-(D-1)]\mathbf{w}^{H}[\!i] \mathbf{r}[\!i-(D-2)]\right|^{2}\right], \end{aligned}}} $$
where D denotes the number of considered time instants. Introducing summations into (10) yields a more concise form
$$ \begin{aligned} \mathbf{w}_{o}[\!i] =&\, \underset{\mathbf{w}[i]}{\mathrm{arg \; min}}\ E\left[\sum^{D-2}_{d=0}\sum^{D-1}_{l=d+1} \left|b[\!i-d]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-l]\right.\right.\\& \qquad\qquad\quad\left.\left. -b[\!i-l] \mathbf{w}^{H}[\!i]\mathbf{r}[\!i-d]\right|^{2}{\vphantom{\sum^{D-2}_{d=0}}}\right], \end{aligned} $$
where an output power constraint is required to avoid the trivial all-zero receive filter solution
$$ E\left[\left|\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right|^{2}\right]=1. $$

Although the existing differential scheme operates over 2 correlated samples, the proposed scheme is able to exploit the additional correlation present between multiple adjacent samples. Moreover, it is also possible to obtain further gain by weighting the correlation between multiple adjacent samples. However, the benefit of using multiple time instants depends on the fading rate of the channel and the related correlation of the channel coefficients. We have investigated the use of multiple time instants and found that a scheme which exploits 3 adjacent samples captures most of the performance benefits. In particular, we have tested the proposed bidirectional scheme and algorithms with various numbers of adjacent time instants (between 4 and 8) and verified that exploiting more than 3 time instants does not yield significant gains. In practice, the number of time instants is a parameter to be chosen by the designer.

The optimization problem of the proposed scheme for 3 time instants is given by
$$ {\fontsize{7.9}{6}\begin{aligned} \mathbf{w}_{o}[\!i] = \underset{\mathbf{w}[i]}{\mathrm{arg \; min}}\ & E\left[\left|b[\!i]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-1]-b[\!i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i] \right|^{2}\right. &(\mathrm{J}_{1})\\ &+\left|b[\!i]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-2]-b[\!i-2]\mathbf{w}^{H}[\!i] \mathbf{r}[\!i] \right|^{2}&(\mathrm{J}_{2})\\ &+\left.\left|b[\!i\,-\,1\!]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i\,-\,2]\!- b[\!i-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i\,-\,1] \right|^{2}\right]&\!\!(\mathrm{J}_{3})\\ \end{aligned},} $$
where \(\mathrm{J}_{1}\)–\(\mathrm{J}_{3}\) equate to those of Fig. 2, and the time instants of interest have been altered to avoid the use of future samples. In addition to (13), an output power constraint is again required to avoid an all-zero trivial solution as given by
$$ E\left[\left|\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right|^{2}\right]=1. $$

In what follows, we describe switching and weighting strategies to optimize the proposed scheme and obtain further performance gain.

4 Switching and weighting strategies

The advantages of a bidirectional scheme operating over 3 or more time instants have been verified in our studies. However, the performance of the scheme may be degraded when received vectors based on uncorrelated fading coefficients are employed in the update of the receive filter. This is particularly evident from the example with an uncorrelated channel illustrated in Fig. 2, where the contribution to the cost function represented by \(\mathrm{J}_{3}\) is unlikely to aid the accurate adaptation of \(\mathbf{w}[i]\). To avoid this, we introduce a set of switching or mixing parameters that determine the weighting of the constituent elements of the bidirectional cost function. The proposed generalized bidirectional cost function with weighting factors is described by
$$ {\fontsize{9}{6} \begin{aligned} \mathbf{w}_{o}[\!i] &= \underset{\mathbf{w}[i]}{\mathrm{arg \; min}}\ E\left[\sum^{D-2}_{d=0}\sum^{D-1}_{l=d+1}\rho_{n}[\!i]\left|b[\!i-d]\mathbf{w}^{H}[\!i] \mathbf{r}[\!i-l]\right.\right.\\&\qquad\qquad\qquad\left.\left.- b[i-l]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-d]\right|^{2}{\vphantom{\sum^{D-2}_{d=0}}}\right], \end{aligned}} $$
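where \(n\) numbers the \((d,l)\) pairs in the order in which they are visited by the double summation, i.e., \(n=d(2D-d-3)/2+l\). However, we again focus on the case with \(D=3\) in the remainder of this work, for which \((d,l)=(0,1),(0,2),(1,2)\) correspond to \(n=1,2,3\). With these modifications, the proposed bidirectional MSE cost function with three time instants and weighting factors is given by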
$$ {\fontsize{7.4}{6}\begin{aligned} \mathbf{w}_{o}[\!i] \,=\, \underset{\mathbf{w}[i]}{\mathrm{arg \; min}}\ &E\left[\!\rho_{1}[\!i]\left|b[\!i]\mathbf{w}^{H}[\!i] \mathbf{r}[\!i-1]- b[\!i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i] \right|^{2}\right.&(\mathrm{J}_{1})\\ &+\rho_{2}[\!i]\left|b[\!i]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-2]-b[\!i-2] \mathbf{w}^{H}[\!i] \mathbf{r}[\!i] \right|^{2}&(\mathrm{J}_{2})\\ &+\left.\rho_{3}[\!i]\!\left|b[\!i\,-\,1]\!\mathbf{w}^{H}[\!i] \mathbf{r}[\!i\,-\,2] \!-b[\!i\,-\,2]\mathbf{w}^{H}[\!i]\!\mathbf{r}[\!i\,-\,1]\! \right|^{2}\!\right]&(\mathrm{J}_{3})\\ \end{aligned},} $$

where \(0\leq\rho_{n}[i]\leq 1\) for \(n=1,2,3\) are the weighting factors.
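To make the structure of the double summation concrete, the following Python sketch enumerates the \(D(D-1)/2\) terms of the generalized cost function. It assumes that the weighting factors \(\rho_{n}\) are numbered in the order in which the pairs \((d,l)\) are visited, which matches the \(D=3\) case \(\mathrm{J}_{1}\), \(\mathrm{J}_{2}\), \(\mathrm{J}_{3}\); the function names and the closed-form index are illustrative assumptions.

```python
def cost_pairs(D):
    """List (n, d, l) for every term of the bidirectional cost function.

    Each term compares the filter outputs at time instants i-d and i-l,
    with d = 0, ..., D-2 and l = d+1, ..., D-1.
    """
    pairs = []
    n = 0
    for d in range(D - 1):
        for l in range(d + 1, D):
            n += 1  # weighting-factor index, numbered in visiting order
            pairs.append((n, d, l))
    return pairs


def pair_index(d, l, D):
    """Closed form of the same numbering (assumed, not from the paper)."""
    return d * (2 * D - d - 3) // 2 + l


pairs_3 = cost_pairs(3)  # the three terms J1, J2, J3
pairs_4 = cost_pairs(4)  # six terms when four time instants are used
```

For \(D=3\) the pairs are \((0,1)\), \((0,2)\), and \((1,2)\), i.e., the sample pairs \((i,i-1)\), \((i,i-2)\), and \((i-1,i-2)\) used in \(\mathrm{J}_{1}\) to \(\mathrm{J}_{3}\). The number of terms grows quadratically with \(D\), which is one practical reason to keep \(D\) small.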

The determination of the receive vector samples that correspond to the scenarios depicted in Fig. 2 is essential if correct optimization of the ρ is to be achieved. The use of CSI to achieve this would be an effective but impractical solution due to the difficulty in obtaining CSI; consequently, other methods must be sought. In this section, we propose the use of two alternative metrics: the signal power differential after interference suppression between the considered time instants, and the error between the considered time instants.

Firstly, we consider a switching scheme where \(\rho_{n}[i]\in\{0,1\}\), \(n=1,2,3\), are determined at each time instant based on the following post-filtering power differential metrics:
$$ \begin{array}{l} P_{1}[\!i]=\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i]\right\vert^{2}-\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i-1]\right\vert^{2}\\ P_{2}[\!i]=\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i]\right\vert^{2}-\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i-2]\right\vert^{2}\\ P_{3}[\!i]=\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i-1]\right\vert^{2}-\left\vert\mathbf{w}[\!i]^{H}\mathbf{r}[\!i-2]\right\vert^{2}.\\ \end{array} $$
If the power difference associated with any of \(\mathrm{J}_{1}\)–\(\mathrm{J}_{3}\) exceeds a predefined threshold, the corresponding \(\rho_{n}\) is set to zero, thereby removing the corresponding element of the cost function from the adaptation process at that time instant. For highly dynamic channels, one requires an adaptive threshold which is able to track the changes in the system and determine appropriate time instants based on successive samples. Consequently, for each \(\rho_{n}\), a threshold \(T_{n}[i]\) related to a time-averaged, windowed root-mean-square of the relevant differential power is used. The value of \(\rho_{n}\) is then determined in the following manner:
$$ \rho_{n}[\!i] = \left\{ \begin{array}{ll} 0& \text{if}~ P_{n}[\!i] \geq T_{n}[\!i]\\ 1& \text{otherwise}\\ \end{array}, \right. $$
$$ T_{n}[\!i]= \nu\left[\lambda_{P}P_{n_{\text{RMS}}}[\!i] + (1-\lambda_{P})P_{n_{\text{RMS}}}[\!i]\right], $$
$$ P_{n_{\text{RMS}}}[\!i]=\sqrt{\frac{1}{m-1}\sum^{i}_{l=i-m}P_{n}[\!l]^{2}}, $$

where \(\nu\) is a positive user-defined constant greater than unity that scales the threshold and \(m\) sets the length of the averaging window. The constant \(\nu\) is set with the help of computer experiments, in a similar way to how the step size of the NLMS algorithm is tuned. The aim is to scale the threshold so that it identifies the differential power terms that are too large for the corresponding cost function element to be used in the update.
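The switching rule can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the threshold is taken as \(\nu\) times the windowed RMS (which is what the expression for \(T_{n}[i]\) reduces to as written), and the window uses the \(1/(m-1)\) scaling of the RMS expression, so \(m\geq 2\) is assumed.

```python
import math


def power_diffs(w, r0, r1, r2):
    """Return [P1, P2, P3] from the filter w and r[i], r[i-1], r[i-2]."""
    def out_power(r):
        # |w^H r|^2 with w^H r = sum(conj(w_m) * r_m)
        return abs(sum(wm.conjugate() * rm for wm, rm in zip(w, r))) ** 2

    p0, p1, p2 = out_power(r0), out_power(r1), out_power(r2)
    return [p0 - p1, p0 - p2, p1 - p2]


def windowed_rms(history, m):
    """RMS of P_n over l = i-m, ..., i with the 1/(m-1) scaling of the text."""
    window = history[-(m + 1):]
    return math.sqrt(sum(p * p for p in window) / (m - 1))


def switch(P_n, rms_n, nu):
    """rho_n = 0 if P_n reaches the threshold nu * rms_n, otherwise 1."""
    return 0 if P_n >= nu * rms_n else 1


# Toy values, chosen only to exercise the rule.
w = [1 + 0j, 0j]
P = power_diffs(w, [2, 0], [1, 0], [1, 0])
rho_1 = switch(P[0], windowed_rms([1.0, -1.0, 1.0, P[0]], m=3), nu=1.1)
rho_3 = switch(P[2], windowed_rms([1.0, -1.0, 1.0, P[2]], m=3), nu=1.1)
```

In this toy case the jump in output power between \(i\) and \(i-1\) switches off the first cost function element, whereas the third element, whose two outputs have equal power, remains active.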

Although the current sample corresponding to J n may bring little benefit in terms of adaptation, this does not indicate that all previous cost function elements corresponding to J n should be discarded. An alternative approach is to use a set of convex mixing parameters that are not restricted to 1 or 0. This allows each element of the cost function to be more precisely weighted based on its previous and current values. However, the setting of these mixing parameters is once again problematic if they are fixed. Accordingly, an adaptive implementation that can take account of the time-varying channels and previous values which continue to have an impact on the adaptation of the filter is sought. The errors extracted from the cost function (16) are chosen as the metric for developing algorithms. These provide an input to the weighting factor calculation process that is directly related to the cost function of (16). The time-varying mixing factors are given by
$$ \rho_{n}[\!i] = \lambda_{e}\rho_{n}[\!i-1] + (1-\lambda_{e})\frac{e_{T}[\!i]-|e_{n}[\!i]|}{e_{T}[\!i]} $$
$$ e_{T}[\!i] = |e_{1}[\!i]|+|e_{2}[\!i]|+|e_{3}[\!i]|, $$
and the individual error terms are calculated as
$${} {\fontsize{8.8pt}{9.6pt}\selectfont{\begin{aligned} e_{1}[\!i] &= b[\!i]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]-b[\!i-1]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i]\\ e_{2}[\!i] &= b[\!i]\mathbf{w}^{H}[\!i-1]\mathbf{r}[i-2]-b[\!i-2]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i]\\ e_{3}[\!i] &= b[\!i-1]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-2]-b[\!i-2]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1].\\ \end{aligned}}} $$

The forgetting factor, \(0\leq\lambda_{e}\leq 1\), is user defined and, along with normalization by the total error, \(e_{T}[i]\), and \({\sum ^{3}_{n=1}}\rho _{n}[\!0]=1\), ensures \({\sum ^{3}_{n=1}}\rho _{n}[\!i]=1\) and a convex combination at each time instant.
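The mixing-factor recursion can be sketched in Python as shown next. One point deserves care: summing \((e_{T}[i]-|e_{n}[i]|)/e_{T}[i]\) over \(n=1,2,3\) gives 2 rather than 1, so the sketch exposes a `scale` argument. With `scale=1.0` the update is exactly as written above; with `scale=2.0`, which is an assumption made here and not taken from the paper, the factors continue to sum to one. The function name and the handling of a zero total error are also illustrative choices.

```python
def mixing_update(rho_prev, errors, lam_e, scale=2.0):
    """One step of the error-based mixing-factor recursion for D = 3.

    rho_prev : previous factors [rho_1, rho_2, rho_3]
    errors   : current errors [e_1, e_2, e_3] (possibly complex)
    lam_e    : forgetting factor in [0, 1]
    scale    : 1.0 reproduces the update as written; 2.0 (assumed here)
               keeps the factors summing to one
    """
    e_abs = [abs(e) for e in errors]
    e_tot = sum(e_abs)
    if e_tot == 0.0:
        # No error information at this instant: keep the previous weights.
        return list(rho_prev)
    return [
        lam_e * rp + (1.0 - lam_e) * (e_tot - ea) / (scale * e_tot)
        for rp, ea in zip(rho_prev, e_abs)
    ]


rho0 = [1.0 / 3.0] * 3
errs = [0.1, 0.2 + 0.1j, 1.0]
rho_norm = mixing_update(rho0, errs, lam_e=0.9)              # sums to one
rho_text = mixing_update(rho0, errs, lam_e=0.9, scale=1.0)   # as written
```

In both variants the cost function element with the largest error magnitude receives the smallest weight, which is the intended behavior: elements built on poorly correlated samples contribute less to the filter update.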

5 Adaptive algorithms

In order to devise low-complexity adaptive algorithms based on the proposed bidirectional schemes, we consider the minimization of the cost function given by
$${} {\fontsize{8.4pt}{9.6pt}\selectfont{\begin{aligned} C(\mathbf{w}[\!i]\!) =\ &E\left[\left|b[\!i]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]-b[\!i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i] \right|^{2}\right.\\ & +\left|b[\!i]\mathbf{w}^{H}[\!i-2]\mathbf{r}[\!i-2]-b[\!i-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i] \right|^{2}\\ & +\left.\left|b[\!i-1]\mathbf{w}^{H}[\!i-2]\mathbf{r}[\!i-2]-b[\!i-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-1] \right|^{2}\!\right]\\ &\\ \mathrm{subject\,to}&\ E\left[\left|\mathbf{w}[\!i]^{H}\mathbf{r}[\!i]\right|^{2}\right]=1. \end{aligned}}} $$

This cost function then forms the basis of the adaptive algorithms derived in this section. However, to reduce the complexity of the derivations, the output power constraint is not included in the derivation and is instead enforced in a stochastic manner at each time instant after the adaptation step is complete [31].

5.1 Normalized least-mean square algorithm

We begin with the low-complexity NLMS implementation that employs an instantaneous gradient in a steepest descent framework. Firstly, the instantaneous gradient of (24) is taken with respect to \(\mathbf{w}^{\ast}[i]\), yielding
$$ {\begin{aligned} \nabla_{\mathbf{w}^{*}[i]}C(\mathbf{w}[\!i]\!)=&-b[\!i-1]\mathbf{r}[\!i] \left(b[\!i]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]\right.\\&\left.\qquad\qquad\qquad\quad-b[\!i-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right)^{H}\\ &-b[i-2]\mathbf{r}[\!i]\left(b[\!i]\mathbf{w}^{H}[\!i-2]\mathbf{r}[\!i-2]\right.\\&\left.\qquad\qquad\qquad\quad-b[\!i-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!i]\right)^{H}\\ &-b[\!i-2]\mathbf{r}[\!i-1]\left(b[\!i-1]\mathbf{w}^{H}[\!i-2]\right.\\&\left.\quad\qquad\qquad\qquad\quad\mathbf{r}[\!i-2]-b[\!i-2]\right.\\&\left.\quad\qquad\qquad\qquad\quad\mathbf{w}^{H}[\!i]\mathbf{r}[\!i-1] \right)^{H} \end{aligned}} $$
At this point, in order to improve the convergence performance of the NLMS algorithm, the bracketed error terms of (25) are modified by replacing the receive filters with the most recently calculated one, w[ i−1]. The resulting gradient expression is given by
$$ \begin{array}{ll} \nabla_{\mathbf{w}^{*}[i]}C(\mathbf{w}[\!i]\!)=&-b[\!i-1]\mathbf{r}[\!i] \underbrace{\left(b[\!i]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]-b[i-1]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i]\right)^{H}}_{e_{1}[\!i]}\\ &-b[\!i-2]\mathbf{r}[\!i]\underbrace{\left(b[\!i]\mathbf{w}^{H}[\!i-1] \mathbf{r}[\!i-2]-b[\!i-2]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i]\right)^{H}}_{e_{2}[\!i]}\\ &-b[\!i-2]\mathbf{r}[\!i-1]\underbrace{\left(b[\!i-1]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-2]- b[\!i-2]\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]\right)^{H}}_{e_{3}[\!i]} \end{array}. $$
Placing the above gradient expression in the steepest descent update recursion, we obtain
$$ \begin{aligned} \mathbf{w}[\!i] =&\, \mathbf{w}[\!i-1] + \frac{\mu}{M[\!i]|\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]|}\\&\left[b[\!i-1] \mathbf{r}[\!i]e_{1}^{\ast}[\!i] + b[\!i-2]\mathbf{r}[\!i]e_{2}^{\ast}[\!i] + b[\!i-2]\right.\\&\left.\mathbf{r}[i-1]e_{3}^{\ast}[\!i]\!\right], \end{aligned} $$
where μ is the step-size and the normalization factor, M[ i], is given by
$$ M[\!i] = \lambda_{M}M[i-1] + (1-\lambda_{M})\mathbf{r}^{H}[\!i]\mathbf{r}[\!i], $$

where \(\lambda_{M}\) is an exponential forgetting factor [31]. The enforcement of the constraint is performed by the denominator of (27), which ensures that the receive filter \(\mathbf{w}[i]\) does not tend towards a zero correlator as the adaptation progresses.

The incorporation of the variable switching and mixing factors of Section 4 has the potential to improve the performance of the above algorithm by optimizing the weighting of the error terms of (26). Integration of the factors given by (18) and (21) yields
$$ \begin{aligned} \mathbf{w}[\!i] =&\, \mathbf{w}[\!i-1] + \frac{\mu}{M[\!i]|\mathbf{w}^{H}[\!i-1]\mathbf{r}[\!i-1]|}\\& \left[\rho_{1}[\!i]b[\!i-1]\mathbf{r}[\!i]e_{1}^{\ast}[\!i] + \rho_{2}[\!i]b[\!i-2]\mathbf{r}[\!i]e_{2}^{\ast}[\!i] \right.\\&\left.+ \rho_{3}[\!i]b[\!i-2]\mathbf{r}[\!i-1]e_{3}^{\ast}[\!i]\!\right] \end{aligned} $$

as the receive filter update equation.
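A single iteration of the bidirectional NLMS recursion can be sketched in pure Python as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function names, the argument order, and the small constant `eps` guarding the division are hypothetical, and the symbols \(b[i]\), \(b[i-1]\), \(b[i-2]\) are assumed to be available (e.g., from a training sequence or decisions).

```python
def inner(w, r):
    """w^H r for equal-length sequences of (complex) numbers."""
    return sum(wm.conjugate() * rm for wm, rm in zip(w, r))


def nlms_step(w_prev, r0, r1, r2, b0, b1, b2, M_prev, mu, lam_M,
              rho=(1.0, 1.0, 1.0), eps=1e-12):
    """One bidirectional NLMS update for three time instants.

    r0, r1, r2 : received vectors r[i], r[i-1], r[i-2]
    b0, b1, b2 : symbols b[i], b[i-1], b[i-2]
    rho        : weighting factors (all ones gives the unweighted update)
    eps        : guard against division by zero (an added assumption)
    """
    y0, y1, y2 = inner(w_prev, r0), inner(w_prev, r1), inner(w_prev, r2)
    # Error terms computed with the most recent filter w[i-1].
    e1 = b0 * y1 - b1 * y0
    e2 = b0 * y2 - b2 * y0
    e3 = b1 * y2 - b2 * y1
    # Exponentially averaged input power used for normalization.
    M = lam_M * M_prev + (1.0 - lam_M) * inner(r0, r0).real
    gain = mu / (M * abs(y1) + eps)
    w = [
        wm + gain * (rho[0] * b1 * r0m * e1.conjugate()
                     + rho[1] * b2 * r0m * e2.conjugate()
                     + rho[2] * b2 * r1m * e3.conjugate())
        for wm, r0m, r1m in zip(w_prev, r0, r1)
    ]
    return w, M, (e1, e2, e3)


# Noise-free single-user check with a static channel coefficient h.
c = [1.0, -1.0, 1.0, 1.0]          # spreading code (toy example)
h = 0.5 + 0.5j
b0, b1, b2 = 1, -1, -1
r0, r1, r2 = ([b * h * x for x in c] for b in (b0, b1, b2))
w_prev = [0.25 * x for x in c]
w_new, M_new, errs = nlms_step(w_prev, r0, r1, r2, b0, b1, b2,
                               M_prev=1.0, mu=0.1, lam_M=0.9)
```

In this toy setting the channel does not change between the three instants, so all three error terms vanish and the filter is left unchanged. This illustrates that the update is driven by the mismatch between successive faded symbols rather than by the fading coefficient itself.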

5.2 Least squares algorithm

To achieve faster convergence and increased robustness to fading, we now pursue an LS-based solution. Firstly, the bidirectional cost function of (24) is cast as an LS problem by replacing the expected value with a weighted summation, as described by
$$ \begin{aligned} C_{\text{LS}}(\mathbf{w}[\!i]\!) = {\sum\limits^{i}_{l=1}}\lambda^{i-l}&\left[\left|b[\!l]\mathbf{w}^{H}[\!l-1]\mathbf{r}[\!l-1]-b[\!l-1]\mathbf{w}^{H}[\!i]\mathbf{r}[\!l] \right|^{2}\right.\\ & +\left|b[\!l]\mathbf{w}^{H}[\!l-2]\mathbf{r}[\!l-2]-b[\!l-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!l] \right|^{2}\\ & +\left.\left|b[\!l-1]\mathbf{w}^{H}[\!l-2]\mathbf{r}[\!l-2]-b[\!l-2]\mathbf{w}^{H}[\!i]\mathbf{r}[\!l-1] \right|^{2}\right], \end{aligned} $$
where \(\lambda\) is an exponential forgetting factor. Proceeding as with the conventional LS derivation, and modifying the equivalent error terms in a manner similar to (26), we arrive at the following expressions for the component autocorrelation matrices:
$$ \begin{aligned} &\bar{\mathbf{R}}_{1}[\!i] = \lambda\bar{\mathbf{R}}_{1}[\!i-1] + b[\!i-1]\mathbf{r}[\!i]\mathbf{r}^{H}[\!i]b^{\ast}[\!i-1]\\ &\bar{\mathbf{R}}_{2}[\!i] = \lambda\bar{\mathbf{R}}_{2}[\!i-1] + b[\!i-2]\mathbf{r}[\!i]\mathbf{r}^{H}[\!i]b^{\ast}[\!i-2]\\ &\bar{\mathbf{R}}_{3}[\!i] = \lambda\bar{\mathbf{R}}_{3}[\!i-1] + b[\!i-2]\mathbf{r}[\!i-1]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i-2]\\ \end{aligned} $$
and the component cross-correlation vectors:
$$ {\fontsize{8.6}{6} \begin{aligned} &\bar{\mathbf{t}}_{1}[\!i] = \lambda\bar{\mathbf{t}}_{1}[\!i-1] + b[\!i-1]\mathbf{r}[\!i]\mathbf{r}^{H}[\!i-1]\mathbf{w}[\!i-1]b^{\ast}[\!i]\\ &\bar{\mathbf{t}}_{2}[\!i] = \lambda\bar{\mathbf{t}}_{2}[\!i-1] + b[\!i-2]\mathbf{r}[\!i]\mathbf{r}^{H}[\!i-2]\mathbf{w}[\!i-1]b^{\ast}[\!i]\\ &\bar{\mathbf{t}}_{3}[\!i] = \lambda\bar{\mathbf{t}}_{3}[\!i-1] + b[\!i-2]\mathbf{r}[\!i-1]\mathbf{r}^{H}[\!i-2]\mathbf{w}[\!i-1]b^{\ast}[\!i-1]\\ \end{aligned}}. $$
The overall correlation structure is then formed from the summation of the preceding expressions, yielding
$$ \bar{\mathbf{R}}[\!i] = \bar{\mathbf{R}}_{1}[\!i] + \bar{\mathbf{R}}_{2}[\!i] + \bar{\mathbf{R}}_{3}[\!i] $$
$$ \bar{\mathbf{t}}[\!i] = \bar{\mathbf{t}}_{1}[\!i] + \bar{\mathbf{t}}_{2}[\!i] + \bar{\mathbf{t}}_{3}[\!i], $$
and the LS receive filter is then obtained as
$$ \mathbf{w}[\!i] = \bar{\mathbf{R}}^{-1}[\!i]\bar{\mathbf{t}}[\!i]. $$
Similarly to the NLMS algorithm, performance improvements can be expected if the variable switching and mixing factors, (18) and (21), are incorporated into the correlation expressions. The resulting expressions are
$$ \mathbf{R}[\!i] = \rho_{1}[\!i]\bar{\mathbf{R}}_{1}[\!i] + \rho_{2}[\!i]\bar{\mathbf{R}}_{2}[\!i] + \rho_{3}[\!i]\bar{\mathbf{R}}_{3}[\!i] $$
$$ \mathbf{t}[\!i] = \rho_{1}[\!i]\bar{\mathbf{t}}_{1}[\!i] + \rho_{2}[\!i]\bar{\mathbf{t}}_{2}[\!i] + \rho_{3}[\!i]\bar{\mathbf{t}}_{3}[\!i]. $$
Introducing the above expression into the RLS framework would lead to a low-complexity algorithm with improved convergence and robustness compared to the NLMS of Section 5.1. This requires the integration of (31) with the matrix inversion lemma [9, 34]. However, the derivation requires an expression with a rank-1 update of the form
$$ \mathbf{R}[\!i] = \lambda\mathbf{R}[\!i-1] + \mathbf{r}[\!i]\mathbf{r}^{H}[\!i] $$

for the autocorrelation matrix; a form which (31) is unable to fit into without assumptions that cause a significant performance degradation. Consequently, an alternative low-complexity algorithm to implement the LS solution given by (31)–(35) is required.
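The obstacle can be made explicit with a short worked step, offered as a sketch under the assumption of BPSK symbols, for which \(|b[i]|^{2}=1\). Summing the three component autocorrelation recursions gives
$$ \bar{\mathbf{R}}[\!i] = \lambda\bar{\mathbf{R}}[\!i-1] + 2\,\mathbf{r}[\!i]\mathbf{r}^{H}[\!i] + \mathbf{r}[\!i-1]\mathbf{r}^{H}[\!i-1], $$
which is a rank-two rather than a rank-one update, since both \(\mathbf{r}[i]\) and \(\mathbf{r}[i-1]\) enter at each time instant. When the time-varying factors \(\rho_{n}[i]\) are also applied to the accumulated matrices \(\bar{\mathbf{R}}_{n}[i]\), the weighted matrix \(\mathbf{R}[i]\) can in general no longer be written as \(\lambda\mathbf{R}[i-1]\) plus a low-rank term, because a change in \(\rho_{n}[i]\) rescales the entire history contained in \(\bar{\mathbf{R}}_{n}[i]\).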

5.3 Conjugate gradient algorithm

Due to the particular form of the bidirectional LS formulation and the conventional RLS recursion, an alternative low-complexity method is now derived. The CG technique has been chosen because it avoids matrix inversions and has excellent convergence properties [35–37]. We begin the derivation of the proposed CG-type algorithm with the autocorrelation (31) and cross-correlation (32) structures of Subsection 5.2. Inserting them into the standard CG quadratic form yields
$$ J(\mathbf{w}) =\mathbf{w}^{H}[\!i]\mathbf{R}[\!i]\mathbf{w}[\!i] -\mathbf{t}^{H}[\!i]\mathbf{w}[\!i]. $$
From [36], the unique minimiser of (39) is also the solution of the linear system
$$ \mathbf{R}[\!i]\mathbf{w}[\!i]=\mathbf{t}[\!i]. $$
This shows the suitability of the CG algorithm to the bidirectional problem. At each time instant, a number of iterations of the following method are required to reach an accurate solution, where the iterations are indexed with the variable j. Other single iteration CG methods are available but these depend upon degeneracy—a term that describes the situation where the successive CG vectors are not orthogonal [38]. Consequently, we employ the conventional method to ensure satisfactory convergence. At the ith time instant, the gradient and direction vectors are initialized as
$$ \mathbf{g}_{0}[\!i]=\nabla_{\mathbf{w}[\!i]}C_{\text{LS}}(\mathbf{w}[\!i]\!)=\mathbf{R}[\!i]\mathbf{w}_{0}[\!i]-\mathbf{t}[\!i] $$
$$ \mathbf{d}_{0}[\!i] = -\mathbf{g}_{0}[\!i], $$
respectively, where the gradient expression is equivalent to those used in the derivation of the previous algorithms. The direction vectors \(\mathbf{d}_{j}[i]\) and \(\mathbf{d}_{l}[i]\) are \(\mathbf{R}[i]\)-orthogonal, i.e., \(\mathbf{d}_{j}^{H}[i]\mathbf{R}[i]\mathbf{d}_{l}[i]=0\) for \(j\neq l\). At each iteration, the receive filter is updated as
$$ \mathbf{w}_{j+1}[\!i] = \mathbf{w}_{j}[\!i] + \alpha_{j}[\!i]\mathbf{d}_{j}[\!i] $$
where α j [ i] is the minimizer of J(w j+1[ i]) such that
$$ \alpha_{j}[\!i] = \frac{-\mathbf{d}_{j}^{H}[\!i]\mathbf{g}_{j}[\!i]}{\mathbf{d}_{j}^{H}[\!i]\mathbf{R}[\!i]\mathbf{d}_{j}[\!i]}. $$
The gradient vector is then updated according to
$$ \mathbf{g}_{j+1}[\!i]=\mathbf{R}[\!i]\mathbf{w}_{j+1}[\!i]-\mathbf{t}[\!i] $$
and a new conjugate direction vector is obtained as
$$ \mathbf{d}_{j+1}[\!i] = -\mathbf{g}_{j+1}[\!i] + \beta_{j}[\!i]\mathbf{d}_{j}[\!i] $$
where
$$ \beta_{j}[\!i] = \frac{\mathbf{g}_{j+1}^{H}[\!i]\mathbf{R}[\!i]\mathbf{d}_{j}[\!i]}{\mathbf{d}_{j}^{H}[\!i]\mathbf{R}[\!i]\mathbf{d}_{j}[\!i]} $$

ensures the \(\mathbf{R}[i]\)-orthogonality between \(\mathbf{d}_{j}[i]\) and \(\mathbf{d}_{l}[i]\) for \(j\neq l\). The iterations (43)–(47) are then repeated until \(j=j_{\max}\).

The variable switching and mixing factors can be incorporated into the algorithm to improve performance. This is achieved by operating the CG algorithm over the modified correlation structures given by (36) and (37).
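
The CG recursion of this subsection can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' code. The helper names `cg_solve` and `mixed_correlations`, the mixing factors and the random test matrices are assumptions made for the example. The gradient is taken as \(\mathbf{R}\mathbf{w}-\mathbf{t}\) evaluated at the most recent filter, in line with its initialization.

```python
import numpy as np

def mixed_correlations(R_parts, t_parts, rho):
    """Weighted sums rho_1 R_1 + rho_2 R_2 + rho_3 R_3 and likewise for t."""
    R = sum(p * Rk for p, Rk in zip(rho, R_parts))
    t = sum(p * tk for p, tk in zip(rho, t_parts))
    return R, t

def cg_solve(R, t, w0, j_max):
    """Run j_max conjugate gradient iterations on R w = t (R Hermitian PD)."""
    w = w0.astype(complex)
    g = R @ w - t                        # initial gradient g_0
    d = -g                               # initial direction d_0
    for _ in range(j_max):
        Rd = R @ d
        dRd = np.vdot(d, Rd).real        # d^H R d
        if dRd <= 0.0:                   # zero direction: already converged
            break
        alpha = -np.vdot(d, g) / dRd     # step size along d
        w = w + alpha * d                # filter update
        g = R @ w - t                    # gradient at the updated filter
        beta = np.vdot(g, Rd) / dRd      # g^H R d / d^H R d
        d = -g + beta * d                # next R-orthogonal direction
    return w

# Toy problem with hypothetical correlation structures and mixing factors.
rng = np.random.default_rng(1)
N = 4

def random_hpd(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M @ M.conj().T + np.eye(n)

R_parts = [random_hpd(N) for _ in range(3)]
t_parts = [rng.standard_normal(N) + 1j * rng.standard_normal(N) for _ in range(3)]
R, t = mixed_correlations(R_parts, t_parts, rho=(0.5, 0.3, 0.2))
w = cg_solve(R, t, np.zeros(N), j_max=N)
print(np.linalg.norm(R @ w - t))         # residual norm after N iterations
```

In exact arithmetic, CG reaches the solution of an \(N\)-dimensional system in at most \(N\) iterations, so a smaller `j_max` trades accuracy for complexity.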

6 Analysis

In this section, we analyze the proposed bidirectional algorithms to gain insight into their expected performance and into the operation of both the proposed and existing algorithms. The unconventional form of the proposed cost functions precludes the application of standard MSE analysis. Consequently, we concentrate on the signal-to-interference-plus-noise ratio (SINR) of the proposed algorithms in order to analyze their interference suppression and tracking performance. We first study the NLMS algorithm and the features of its weight error correlation matrix in order to arrive at an analytical SINR expression. Following this, we explore the analogy between the form of the bidirectional expression and convex combinations of adaptive receive filters [39, 40].

6.1 SINR analysis

Let us first convert the instantaneous SINR expression, given by
$$ \text{SINR}_{\text{inst}} \triangleq \frac{\mathbf{w}^{H}[\!i]\mathbf{R}_{\mathrm{S}} \mathbf{w}[\!i]}{\mathbf{w}^{H}[\!i]\mathbf{R}_{\mathrm{I}}\mathbf{w}[\!i]}, $$

where \(\mathbf{R}_{\mathrm{S}}\) and \(\mathbf{R}_{\mathrm{I}}\) are the signal and interference-plus-noise correlation matrices, respectively, into a form amenable to analysis.

The receive filter error weight vector is described by
$$ {\boldsymbol \varepsilon}[\!i] = \mathbf{w}[\!i]-\mathbf{w}_{o}[\!i], $$

where w o is the instantaneous standard MMSE receiver.

Substituting \(\mathbf{w}[i]=\boldsymbol{\varepsilon}[i]+\mathbf{w}_{o}[i]\), which follows from the definition above, into the numerator of (48) gives the desired signal component
$$ \begin{aligned} \mathrm{S}_{\mathrm{c}} = \boldsymbol{\varepsilon}^{H}[\!i]\mathbf{R}_{\mathrm{S}}\boldsymbol{\varepsilon}[\!i] &+ \boldsymbol{\varepsilon}^{H}[\!i]\mathbf{R}_{\mathrm{S}}\mathbf{w}_{o}[\!i]+ \overbrace{\mathbf{w}_{o}^{H}[\!i]\mathbf{R}_{\mathrm{S}}\mathbf{w}_{o}[\!i]}^{P_{\mathrm{S,opt}}[\!i]}\\&+ \mathbf{w}_{o}^{H}[\!i]\mathbf{R}_{\mathrm{S}}\boldsymbol{\varepsilon}[\!i], \end{aligned} $$
and substituting it into the denominator gives the interference-plus-noise component
$$ \begin{aligned} \mathrm{S}_{i+n}= \boldsymbol{\varepsilon}^{H}[\!i]\mathbf{R}_{\mathrm{I}} \boldsymbol{\varepsilon}[\!i]&+ \boldsymbol{\varepsilon}^{H}[\!i] \mathbf{R}_{\mathrm{I}}\mathbf{w}_{o}[\!i] + \underbrace{ \mathbf{w}_{o}^{H}[\!i]\mathbf{R}_{\mathrm{I}} \mathbf{w}_{o}[\!i]}_{P_{\mathrm{I,opt}}[\!i]}\\&+ {\mathbf w}_{o}^{H}[\!i] {\mathbf R}_{\mathrm{I}}{\boldsymbol \varepsilon}[\!i]. \end{aligned} $$
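Taking the expectation and the trace of \(\mathrm{S}_{\mathrm{c}}\) and \(\mathrm{S}_{i+n}\), and defining \(\mathbf{K}[i]=E\left[\boldsymbol{\varepsilon}[i]\boldsymbol{\varepsilon}^{H}[i]\right]\) and \(\mathbf{G}[i]=E\left[\mathbf{w}_{o}[i]\boldsymbol{\varepsilon}^{H}[i]\right]\), we can define the following SINR expression: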
$$ \begin{aligned} \text{SINR}& \triangleq \frac{\text{tr}\{E[S_{c}]\}}{\text{tr}\{E[S_{i+n}]\}}\\ &=\frac{\text{tr}[\mathbf{K}[\!i]\mathbf{R}_{\mathrm{S}}+ \mathbf{G}[\!i]\mathbf{R}_{\mathrm{S}}+ P_{\mathrm{S,opt}}[\!i]+ \mathbf{G}^{H}[\!i]\mathbf{R}_{\mathrm{S}}]} {{\text{tr}}[\mathbf{K}[\!i]\mathbf{R}_{\mathrm{I}}+ \mathbf{G}[\!i]\mathbf{R}_{\mathrm{I}}+ P_{\mathrm{I,opt}}[\!i]+ \mathbf{G}^{H}[\!i]\mathbf{R}_{\mathrm{I}}]}. \end{aligned} $$

From (52), it is clear that we need to pursue expressions for K[ i] and G[ i] in order to reach an analytical interpretation of the bidirectional NLMS scheme.
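
As a quick numerical check of the expansion used above, the sketch below evaluates the instantaneous SINR for a perturbed filter and confirms that the four-term expansion in terms of the error weight vector reproduces the signal power \(\mathbf{w}^{H}\mathbf{R}_{\mathrm{S}}\mathbf{w}\). The helper names (`quad`, `instantaneous_sinr`, `expanded_power`) and the toy matrices are assumptions introduced for illustration only.

```python
import numpy as np

def quad(x, R, y):
    """Return the bilinear form x^H R y."""
    return np.vdot(x, R @ y)

def instantaneous_sinr(w, R_S, R_I):
    """Ratio of signal power to interference-plus-noise power at the filter output."""
    return quad(w, R_S, w).real / quad(w, R_I, w).real

def expanded_power(eps, w_o, R):
    """Four-term expansion of w^H R w with w = eps + w_o."""
    return (quad(eps, R, eps) + quad(eps, R, w_o)
            + quad(w_o, R, w_o) + quad(w_o, R, eps)).real

# Toy setup (hypothetical values): rank-1 signal matrix, full-rank interference.
rng = np.random.default_rng(2)
N = 6
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)
c /= np.linalg.norm(c)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_S = np.outer(c, c.conj())
R_I = 0.1 * (A @ A.conj().T) + 0.1 * np.eye(N)

w_o = np.linalg.solve(R_S + R_I, c)                  # MMSE-type reference filter
w = w_o + 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
eps = w - w_o                                        # error weight vector

print(instantaneous_sinr(w, R_S, R_I))
print(np.isclose(expanded_power(eps, w_o, R_S), quad(w, R_S, w).real))
```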

Substituting the filter error weight vector into the filter update expression of (27) yields a recursive expression for the receive filter error weight vector described by
$${} \begin{aligned} \boldsymbol{\varepsilon}[\!i]=&\boldsymbol{\varepsilon}[\!i-1]\\ &+\left[\mathbf{I}+\mu\mathbf{r}[\!i]b[\!i-1]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i] -\mu\mathbf{r}[\!i]b[\!i-1]\right.\\&\left.\quad\; \mathbf{r}^{H}[\!i]b^{\ast}[\!i-1]\right.\\ &+\ \mu\mathbf{r}[\!i]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i]- \mu\mathbf{r}[\!i]b[\!i-2]\\&\quad\; \mathbf{r}^{H}[\!i]b^{\ast}[\!i-2]\\ &+\left.\mu\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i-1]\right. \\&-\left.\mu\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i-2]\right]\boldsymbol{\varepsilon}[\!i-1]\\ &+\ \mu\mathbf{r}[\!i]b[\!i-1]e^{\ast}_{o,1}[\!i]\\ &+\ \mu\mathbf{r}[\!i]b[\!i-2]e^{\ast}_{o,2}[\!i]\\ &+\ \mu\mathbf{r}[\!i-1]b[\!i-2]e^{\ast}_{o,3}[\!i], \end{aligned} $$
where the terms e o,1−3 denote the error terms of (26) when the optimum filter w o is used. Utilizing the direct averaging approach developed by Kushner [41], often invoked in this type of stochastic analysis, the solution to the stochastic difference equation of (53) can be approximated by the solution to a second equation [9, 42], such that
$${} \begin{aligned} &E\left[\mathbf{I} + \mu\mathbf{r}[\!i]b[\!i-1]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i] -\mu\mathbf{r}[\!i]b[\!i-1]\mathbf{r}^{H}[\!i]\right.\\&\quad b^{\ast}[\!i-1]+\mu\mathbf{r}[\!i]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i] -\mu\mathbf{r}[\!i]b[\!i-2]\\&\quad\mathbf{r}^{H}[\!i]b^{\ast}[\!i-2] +\mu\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i-1]\\&\quad\left. -\mu\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i-2]\right]\\& = \mathbf{I}+\mu\mathbf{F}_{1}-\mu\mathbf{R}_{1}+\mu\mathbf{F}_{2}-\mu\mathbf{R}_{2} +\mu\mathbf{F}_{3}-\mu\mathbf{R}_{3}, \end{aligned} $$
where F and R are correlation matrices. Specifically, \(\mathbf{R}_{1-3}\) are autocorrelation matrices given by
$$ \begin{array}{l} \mathbf{R}_{1} = E\left[\mathbf{r}[\!i]b[\!i-1]\mathbf{r}^{H}[\!i]b^{\ast}[\!i-1]\right]\\ \mathbf{R}_{2} = E\left[\mathbf{r}[\!i]b[\!i-2]\mathbf{r}^{H}[\!i]b^{\ast}[\!i-2]\right]\\ \mathbf{R}_{3} = E\left[\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i-2]\right]\\ \end{array} $$
and \(\mathbf{F}_{1-3}\) are cross-time-instant correlation matrices given by
$$ \begin{array}{l} \mathbf{F}_{1} = E\left[\mathbf{r}[\!i]b[\!i-1]\mathbf{r}^{H}[\!i-1]b^{\ast}[\!i]\right]\\ \mathbf{F}_{2} = E\left[\mathbf{r}[\!i]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i]\right]\\ \mathbf{F}_{3} = E\left[\mathbf{r}[\!i-1]b[\!i-2]\mathbf{r}^{H}[\!i-2]b^{\ast}[\!i-1]\right].\\ \end{array} $$
Using (54) and the independence assumptions \(E\left[e_{o,1-3}[i]\boldsymbol{\varepsilon}[i]\right]=0\), \(E\left[\mathbf{r}^{H}[i]\mathbf{r}[i-1]\right]=0\) and \(E\left[b_{k}[i]b_{k}[i-1]\right]=0\), we arrive at the expression for \(\mathbf{K}[i]\)
$$ \begin{array}{ll} \mathbf{K}[\!i]=&\left[\mathbf{I}+\mu\mathbf{F}_{1}-\mu\mathbf{R}_{1} +\mu\mathbf{F}_{2}-\mu\mathbf{R}_{2} +\mu\mathbf{F}_{3} -\mu\mathbf{R}_{3} \right]\\&\quad \mathbf{K}[\!i-1] \left[\mathbf{I}+\mu\mathbf{F}_{1}-\mu\mathbf{R}_{1}+\mu\mathbf{F}_{2}-\mu\mathbf{R}_{2}\right.\\&\left. \quad +\mu\mathbf{F}_{3} -\mu\mathbf{R}_{3} \right] \\ &+ \mu^{2}\mathbf{R}_{1}J_{\mathrm{min,1}}[\!i] \\ &+ \mu^{2}\mathbf{R}_{2}J_{\mathrm{min,2}}[\!i] \\ &+ \mu^{2}\mathbf{R}_{3}J_{\mathrm{min,3}}[\!i] \\ \end{array} $$
where \(J_{\mathrm{min},j}[i]=|e_{o,j}[i]|^{2}\). Following a similar method, an expression for \(\mathbf{G}[i]\) can also be obtained:
$$ \mathbf{G}[\!i] = \mathbf{G}[\!i-1]\left[\mu\mathbf{F}_{1} - \mu\mathbf{R}_{1} + \mu\mathbf{F}_{2} - \mu\mathbf{R}_{2} + \mu\mathbf{F}_{3} - \mu\mathbf{R}_{3}\right]. $$
At this point, we study the derived expression to gain an insight into the operation of the bidirectional algorithm and the origins of its advantages over the conventional differential scheme. Equivalent expressions for the conventional stochastic gradient scheme are given by
$$ \begin{array}{ll} \mathbf{K}[\!i]=&\left[\mathbf{I}+\mu\mathbf{F}_{1}-\mu\mathbf{R}_{1} \right]\mathbf{K}[\!i-1] \left[\mathbf{I}+\mu\mathbf{F}_{1}-\mu\mathbf{R}_{1}\right] \\ &+ \mu^{2}\mathbf{R}_{1}J_{\mathrm{min,1}}[\!i] \\ \mathbf{G}[\!i] = &\mathbf{G}[\!i-1]\left[\mu\mathbf{F}_{1} - \mu\mathbf{R}_{1}\right]. \end{array} $$
The bidirectional scheme has a number of additional correlation terms compared to the conventional scheme. Evaluating the cross-time instant matrices yields
$$ \begin{array}{l} \mathbf{F}_{1} = |a_{1}|^{2}\mathbf{c}_{1}\mathbf{c}_{1}^{H}\underbrace{E\left[h[\!i]h^{\ast}[\!i-1]\!\right]}_{f_{1}[\!i]}\\ \mathbf{F}_{2} = |a_{1}|^{2}\mathbf{c}_{1}\mathbf{c}_{1}^{H}\underbrace{E\left[h[\!i]h^{\ast}[\!i-2]\!\right]}_{f_{2}[\!i]}\\ \mathbf{F}_{3} = |a_{1}|^{2}\mathbf{c}_{1}\mathbf{c}_{1}^{H}\underbrace{E\left[h[\!i-1]h^{\ast}[\!i-2]\!\right]}_{f_{3}[\!i]}\\ \end{array}. $$

From the expression above, it is clear that the underlying factors that govern the SINR performance of the algorithms are the correlation between the considered time instants, \(f_{1-3}\), data reuse, and the use of \(f_{2}\). Accordingly, it is the additional correlation factors that the proposed bidirectional algorithms possess that enhance their performance compared to the conventional scheme, confirming the initial motivation behind the proposition of the bidirectional approach. Lastly, the \(f_{1-3}\) expressions of (60) are the factors that influence the optimum number of considered time instants.
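
The recursions for \(\mathbf{K}[i]\) and \(\mathbf{G}[i]\) and the trace-based SINR expression can be iterated numerically. The sketch below is an illustration under stated assumptions, not the authors' simulation code: the function name `analytical_sinr`, the toy single-path matrices, the assumed values of \(f_{1-3}\) and the constant \(J_{\mathrm{min}}\) terms are all hypothetical. The update uses the Hermitian transpose of the bracketed matrix on the right, which coincides with the written form when that matrix is Hermitian, as it is in this toy case. Passing one matrix of each kind gives the conventional scheme and passing three gives the bidirectional scheme.

```python
import numpy as np

def analytical_sinr(F, R, J_min, R_S, R_I, w_o, mu, n_iter):
    """Iterate the K[i] and G[i] recursions and return the SINR per iteration.

    F, R  : lists of cross-time and autocorrelation matrices (three of each
            for the bidirectional scheme, one of each for the conventional one).
    J_min : list of minimum-error values, held constant over i
            (a simplifying assumption of this sketch).
    """
    N = R_S.shape[0]
    B = mu * (sum(F) - sum(R))                    # mu * sum_k (F_k - R_k)
    A = np.eye(N) + B
    noise = mu ** 2 * sum(J * Rk for J, Rk in zip(J_min, R))
    K = np.eye(N, dtype=complex)                  # K[0] = I
    G = np.eye(N, dtype=complex)                  # G[0] = I
    P_S = np.vdot(w_o, R_S @ w_o).real            # P_S,opt
    P_I = np.vdot(w_o, R_I @ w_o).real            # P_I,opt
    sinr = np.empty(n_iter)
    for i in range(n_iter):
        K = A @ K @ A.conj().T + noise
        G = G @ B
        num = np.trace(K @ R_S + G @ R_S + G.conj().T @ R_S).real + P_S
        den = np.trace(K @ R_I + G @ R_I + G.conj().T @ R_I).real + P_I
        sinr[i] = num / den
    return sinr

# Toy single-user, single-path example (all values hypothetical).
N, sigma2, mu = 8, 0.1, 0.1
rng = np.random.default_rng(0)
c = rng.choice([-1.0, 1.0], N) / np.sqrt(N)       # unit-norm spreading code
f = [0.999, 0.996, 0.999]                         # assumed correlation factors
R_S = np.outer(c, c)
R_I = sigma2 * np.eye(N)
F = [fk * R_S for fk in f]
R = [R_S + R_I for _ in f]
w_o = np.linalg.solve(R_S + R_I, c)

bidirectional = analytical_sinr(F, R, [0.01] * 3, R_S, R_I, w_o, mu, 200)
conventional = analytical_sinr(F[:1], R[:1], [0.01], R_S, R_I, w_o, mu, 200)
print(10 * np.log10(bidirectional[-1]), 10 * np.log10(conventional[-1]))
```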

6.2 Combinations of adaptive receive filters

To further our understanding of the bidirectional algorithms, we follow a heuristic and complementary approach that leads to an analogy with a combination of adaptive filters [39]. The bidirectional LS solution given by (35) is made up of 6 constituent correlation structures that result in a filter output of
$$ \begin{aligned} y[\!i] =&\, \left[\left(\rho_{1}\mathbf{R}_{1}[\!i] + \rho_{2}\mathbf{R}_{2}[\!i] + \rho_{3}\mathbf{R}_{3}[\!i]\right)^{-1} \right.\\ &\times \; \left. \left(\rho_{1}\mathbf{t}_{1}[\!i] + \rho_{2}\mathbf{t}_{2}[\!i] + \rho_{3}\mathbf{t}_{3}[\!i]\right)\right]^{H} \mathbf{r}[\!i]. \end{aligned} $$
Decomposing the expression above leads us to an expression where the signal y[ i] is formed from the output of 3 individual adaptive receive filters
$$ \begin{array}{ll} y[\!i] =&\left[\left(\mathbf{R}_{1}[\!i]+\frac{\rho_{2}}{\rho_{1}}\mathbf{R}_{2}[\!i]+\frac{\rho_{3}}{\rho_{1}}\mathbf{R}_{3}[\!i]\right)^{-1}\mathbf{t}_{1}[\!i]\right]^{H}\mathbf{r}[\!i]\\ &\\ &+\left[\left(\frac{\rho_{1}}{\rho_{2}}\mathbf{R}_{1}[\!i]+\mathbf{R}_{2}[\!i]+\frac{\rho_{3}}{\rho_{2}}\mathbf{R}_{3}[\!i]\right)^{-1}\mathbf{t}_{2}[\!i]\right]^{H}\mathbf{r}[\!i]\\ &\\ &+\left[\left(\frac{\rho_{1}}{\rho_{3}}\mathbf{R}_{1}[\!i]+\frac{\rho_{2}}{\rho_{3}}\mathbf{R}_{2}[\!i]+\mathbf{R}_{3}[\!i]\right)^{-1}\mathbf{t}_{3}[\!i]\right]^{H}\mathbf{r}[\!i]. \end{array} $$

This is equivalent to a convex combination of adaptive receive filters with varying λ [39, 40], where each of the 3 filters focuses on the correlation between 2 of the 3 considered time instants. However, the presence of all three autocorrelation matrices in each inverse also indicates that the remaining time instants influence the structure of each filter. Although the mixing factors are not separable, we can interpret them as a form of the weighting that is present in conventional combinations of adaptive filters. This explains in part the additional control and performance they provide.
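
The decomposition into three component filters can be verified numerically. In the hedged sketch below, the names `combined_filter` and `component_filters` and the random test data are assumptions. Each component filter is obtained by dividing the common weighted autocorrelation sum by its own mixing factor \(\rho_{k}\), so that the three filters add up to the single bidirectional LS filter.

```python
import numpy as np

def combined_filter(R_parts, t_parts, rho):
    """Single filter (sum_k rho_k R_k)^{-1} (sum_k rho_k t_k)."""
    R = sum(p * Rk for p, Rk in zip(rho, R_parts))
    t = sum(p * tk for p, tk in zip(rho, t_parts))
    return np.linalg.solve(R, t)

def component_filters(R_parts, t_parts, rho):
    """Three filters (sum_m (rho_m / rho_k) R_m)^{-1} t_k, for k = 1, 2, 3."""
    filters = []
    for k, tk in enumerate(t_parts):
        R_k = sum((rho[m] / rho[k]) * Rm for m, Rm in enumerate(R_parts))
        filters.append(np.linalg.solve(R_k, tk))
    return filters

# Hypothetical correlation structures, mixing factors and received vector.
rng = np.random.default_rng(4)
N = 5

def random_hpd(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M @ M.conj().T + np.eye(n)

R_parts = [random_hpd(N) for _ in range(3)]
t_parts = [rng.standard_normal(N) + 1j * rng.standard_normal(N) for _ in range(3)]
rho = (0.6, 0.3, 0.1)
r = rng.standard_normal(N) + 1j * rng.standard_normal(N)

y_single = np.vdot(combined_filter(R_parts, t_parts, rho), r)
y_split = sum(np.vdot(w_k, r) for w_k in component_filters(R_parts, t_parts, rho))
print(np.isclose(y_single, y_split))
```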

7 Simulations

In this section, the proposed bidirectional adaptive algorithms are applied to conventional multiuser and cooperative DS-CDMA systems using the signal models described in Section 2. The application of the proposed algorithms to multiple-antenna and multi-carrier systems is straightforward and requires a change in the signal models. The individual Rayleigh fading channel coefficients, \(h[i]\), are generated using Clarke’s model [43] where 20 scatterers are assumed. In all simulations, the number of packets is denoted by \(N_{p}\) and the fading rate is given by the dimensionless normalized fading parameter, \(T_{s}f_{d}\), where \(T_{s}\) is the symbol period and \(f_{d}\) is the Doppler frequency shift. The convergence parameters of the algorithms have been optimized, resulting in step sizes and forgetting factors of 0.1 and 0.99, respectively, \(\lambda_{e}=0.95\), \(\lambda_{M}=0.99\), and \(j_{\max}=5\) CG iterations.
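
For readers who wish to reproduce a comparable channel, the following sketch generates fading coefficients with one common sum-of-sinusoids form of Clarke's model. The exact generator used for the simulations is not specified here, so the function name `clarke_fading`, the uniform arrival angles and phases, and the unit-power normalization are assumptions.

```python
import numpy as np

def clarke_fading(n_symbols, ts_fd, n_scatterers=20, rng=None):
    """Fading coefficients h[i], i = 0..n_symbols-1, from a sum of sinusoids.

    ts_fd is the normalized fading rate T_s * f_d. Arrival angles and initial
    phases are drawn uniformly on [0, 2*pi); the 1/sqrt(n_scatterers) scaling
    gives unit average power over realizations.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi, n_scatterers)   # arrival angles
    phi = rng.uniform(0.0, 2.0 * np.pi, n_scatterers)     # initial phases
    i = np.arange(n_symbols)[:, None]                     # symbol index (column)
    phase = 2.0 * np.pi * ts_fd * i * np.cos(theta)[None, :] + phi[None, :]
    return np.exp(1j * phase).sum(axis=1) / np.sqrt(n_scatterers)

h = clarke_fading(1000, ts_fd=0.01, rng=np.random.default_rng(0))
# Sample estimate of the lag-1 correlation E[h[i] h*[i-1]] for this realization.
f1_hat = np.mean(h[1:] * h[:-1].conj())
print(abs(h[:5]), f1_hat)
```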

As detailed in Section 6, the proposed algorithms do not minimize the same MSE as a conventional MMSE receiver; therefore, the MSE is not an adequate performance metric. As a result, BER- and SINR-based metrics are chosen for the purpose of comparison between existing algorithms and the optimum MMSE solution. Due to the rapidly fading channel, the instantaneous SNR, \(SNR_{\mathrm{i}}\), is highly variable and so the SINR alone is also not a satisfactory metric. To overcome this, it is normalized by the instantaneous SNR to give \(\frac {SINR}{SNR_{\mathrm {i}}}\). This value is negative in all simulations and directly reflects the MUI suppression and tracking capabilities of the proposed algorithms [31, 32].
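
The sketch below illustrates why the normalized metric is non-positive when expressed in dB. It assumes a single-path model in which the instantaneous SNR is the desired user's received power times the code energy, divided by the noise variance; this definition, the function name `normalized_sinr_db` and the toy system are assumptions, since the exact definition belongs to the signal model of Section 2. Under this assumption, no linear filter can exceed the single-user bound, so the ratio never exceeds one.

```python
import numpy as np

def normalized_sinr_db(w, c, gain, R_mui, sigma2):
    """10*log10(SINR / SNR_i) for a linear receive filter w.

    gain  : complex product of the desired user's amplitude and fading coefficient
    R_mui : correlation matrix of the multiuser interference (noise excluded)
    """
    p = abs(gain) ** 2
    signal = p * abs(np.vdot(w, c)) ** 2                       # w^H R_S w
    interf = (np.vdot(w, R_mui @ w) + sigma2 * np.vdot(w, w)).real
    snr_i = p * np.vdot(c, c).real / sigma2                    # assumed definition
    return 10.0 * np.log10((signal / interf) / snr_i)

# Toy 4-user system with random binary codes (hypothetical values).
rng = np.random.default_rng(5)
N, n_users, sigma2 = 16, 4, 0.1
codes = rng.choice([-1.0, 1.0], (n_users, N)) / np.sqrt(N)
c = codes[0]
R_mui = sum(np.outer(ck, ck) for ck in codes[1:])
gain = 0.7 * np.exp(1j * 0.3)

w_mf = c.astype(complex)                                       # matched filter
w_mmse = np.linalg.solve(R_mui + sigma2 * np.eye(N), c)        # max-SINR filter
print(normalized_sinr_db(w_mf, c, gain, R_mui, sigma2),
      normalized_sinr_db(w_mmse, c, gain, R_mui, sigma2))
```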

7.1 Conventional DS-CDMA

Here we apply the adaptive algorithms of Section 5 to interference suppression in the uplink of a multiuser DS-CDMA system described in Section 2. Each simulation is averaged over N p packets and detailed parameters are specified in each plot.

7.1.1 Analytical results

We first assess the analytical expressions derived in Section 6.1 and their agreement with simulated results. Central to the performance of the differential and bidirectional schemes are the correlation factors \(f_{1-3}\) and the related assumption of \(h_{1}[i]\approx h_{1}[i-1]\). Examining the effect of the fading rate on the value of \(f_{1-3}\) shows that \(f_{1}\approx f_{2}\approx f_{3}\) at fading rates of up to \(T_{s}f_{d}=0.01\). Consequently, after a large number of received symbols with high total receive power
$$ \begin{aligned} 3\left[\mathbf{I}+\mu\mathbf{F}_{1} - \mu\mathbf{R}_{1}\right]\approx&\left[\mathbf{I}+\mu\mathbf{F}_{1} - \mu\mathbf{R}_{1} + \mu\mathbf{F}_{2} - \mu\mathbf{R}_{2}\right.\\ &\left.+ \mu\mathbf{F}_{3} - \mu\mathbf{R}_{3}\right], \end{aligned} $$
due to the decreasing significance of the identity matrix. This indicates that the expected value of the SINR of the bidirectional scheme, once \(f_{1}\approx f_{2}\approx f_{3}\) have stabilized, should be similar to that of the differential scheme. A second implication is that the bidirectional scheme should converge towards the MMSE level due to the equivalence between the differential scheme and the MMSE solution [31]. Figure 3 illustrates the analytical performance using the expressions given in Section 6.1.
Fig. 3 SINR performance comparison of simulated and analytical proposed NLMS algorithms over a single path channel

The correlation matrices are calculated via ensemble averages prior to the start of the algorithm and G[0]=K[ 0]=I. In Fig. 3, one can see the convergence of the simulated schemes to the analytical and MMSE plots, validating the presented analysis. Due to the highly dynamic nature of the channel, using the expected values of the correlation matrix alone cannot capture the true transient performance of the algorithms. However, the convergence period of the analytical plots within the first 200 iterations can be considered to be within the coherence time and therefore give an indication of the transient performance relative to other analytical plots. Using this justification and the aforementioned analysis, advantages should be present in the transient phase due to the additional correlation information supplied by F 2 and F 3. This conclusion is supported by Fig. 3 and the similar forms of the analytical and simulated schemes relative to each other and their subsequent convergence.
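
The dependence of the correlation factors on the fading rate can be illustrated with a short calculation. Under the isotropic-scattering assumption of Clarke's model, the normalized autocorrelation of the fading process at a lag of \(k\) symbols is the zeroth-order Bessel function \(J_{0}(2\pi T_{s}f_{d}k)\). This is a standard property of that model rather than a result stated here, and the function names below are hypothetical. With a stationary channel, \(f_{1}\) and \(f_{3}\) both correspond to a lag of one symbol and \(f_{2}\) to a lag of two.

```python
import math

def bessel_j0(x, n_points=2000):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin(theta)) dtheta,
    evaluated with the midpoint rule."""
    step = math.pi / n_points
    total = sum(math.cos(x * math.sin((k + 0.5) * step)) for k in range(n_points))
    return total / n_points

def correlation_factor(ts_fd, lag):
    """Normalized correlation between fading coefficients `lag` symbols apart."""
    return bessel_j0(2.0 * math.pi * ts_fd * lag)

for ts_fd in (0.001, 0.005, 0.01, 0.02, 0.05):
    f1 = correlation_factor(ts_fd, 1)   # lag 1 (also f3 for a stationary channel)
    f2 = correlation_factor(ts_fd, 2)   # lag 2
    print(f"Ts*fd={ts_fd}: f1={f1:.5f}, f2={f2:.5f}, f1-f2={f1 - f2:.5f}")
```

The gap between the lag-1 and lag-2 factors grows with the fading rate, which is consistent with equal weighting of the three terms becoming less appropriate at higher fading rates.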

7.1.2 SINR performance

The SINR/SNR performance of the proposed algorithms is shown in Figs. 4 and 5. The performance of the CG implementation of the differential algorithm is marginally below that of the RLS during convergence but the bidirectional scheme provides noticeable improvements. The differential and bidirectional algorithms converge close to the MMSE optimum as expected from the previous analysis. The bidirectional NLMS algorithm provides more significant improvements over the differential scheme, both in the final stages of convergence and in steady state. These differences can be accounted for by the reduced receive signal power; by the matrices equivalent to \(\mathbf{F}_{2}\) and \(\mathbf{F}_{3}\), which improve the consistency of the steady-state performance by reducing the impact of weakly correlated samples; and by the NLMS’s suitability to data reuse, as in the affine projection (AP) algorithm. As expected, the conventional adaptive schemes are unable to converge or track the solution due to the more demanding task of both tracking the fading coefficients and suppressing MUI.
Fig. 4 SINR performance comparison of proposed CG algorithms over a single path channel where all schemes have been trained with 150 symbols and then switched to decision-directed mode

Fig. 5 SINR performance comparison of proposed NLMS algorithms over a single path channel where all schemes have been trained with 150 symbols and then switched to decision-directed mode

The BER performance of the differential and bidirectional schemes is illustrated in Fig. 6, where the system parameters are equal to those of Figs. 4 and 5. The RLS and CG algorithms converge to near the MMSE level with the bidirectional scheme providing a performance improvement over other considered algorithms. The NLMS schemes exhibit slower BER convergence compared to their SINR performance but reach a level where a decision-directed operation can take place in a severely fading channel. Due to the superior performance of the CG- and RLS-based algorithms, we predominantly focus on their performance for the remainder of this section.
Fig. 6 BER performance comparison of proposed schemes during training over a single path channel

Figure 7 illustrates the performance of the proposed CG and RLS algorithms as the fading rate is increased. The conventional schemes with RLS and CG algorithms are unable to cope with fading rates in excess of \(T_{s}f_{d}=0.005\) and begin to diverge at the completion of the training sequence. The proposed bidirectional scheme outperforms the differential schemes but the performance begins to decline once fading rates above \(T_{s}f_{d}=0.01\) are reached. Once again, the increase in performance of the bidirectional scheme can be accounted for by the increased correlation information supplied by the matrices \(\mathbf{F}_{2}\) and \(\mathbf{F}_{3}\) and by data reuse. The introduction of the mixing factors into the bidirectional algorithm improves performance further, especially at higher fading rates. A first reason for this is the improvement in consistency as previously mentioned. However, a second more significant reason can be established by referring back to the observations on the correlation factors \(f_{1-3}\). Although fading rates of 0.01 may be fast, the assumption \(h[i-2]\approx h[i-1]\approx h[i]\) is still valid. Consequently, \(f_{1}\approx f_{2}\approx f_{3}\) and equal weighting is adequate. However, as the fading rate increases beyond \(T_{s}f_{d}=0.01\), this assumption breaks down and the correlation information requires unequal weighting for optimum performance, a task fulfilled by the adaptive mixing factor.
Fig. 7 SINR performance versus fading rate of the proposed CG schemes over a single path channel after 200 training symbols

A more detailed plot illustrating the performance advantages of the CG switching and mixing parameters is given in Fig. 8. The switching approach provides little improvement over the standard bidirectional scheme due to its discrete and non-adaptive operation. As previously covered, a low instantaneous value of \(f_{1-3}\), as indicated by a large power differential, does not indicate that all information gathered on \(f_{1-3}\) is redundant. The mixing parameter implementations address this shortcoming by adaptively setting the parameters via the error weight expression (21) that accurately reflects the averaged correlation factors. At a fading rate of \(T_{s}f_{d}=0.02\), the assumption of \(f_{1}\approx f_{2}\approx f_{3}\) begins to diminish in accuracy and therefore unequal weighting is required for performance in excess of the standard bidirectional scheme, as previously mentioned and shown in Fig. 8.
Fig. 8 SINR performance over a single path channel of the proposed schemes with switching and mixing factors

The MUI suppression of the proposed and existing schemes is given by Fig. 9. The bidirectional scheme has significantly improved multiuser performance compared to the differential algorithms at low system loads, but this advantage diminishes as the number of users increases. This behavior supports the analytical conclusions of Section 6.1 by virtue of the convergence of the differential and bidirectional schemes and the increasing accuracy of (63) as system loading, and therefore received power, increases.
Fig. 9 BER performance against system loading after 500 symbols of the proposed schemes over a single-path channel. Schemes are trained with 150 symbols and then switch to decision-directed operation

7.2 Cooperative DS-CDMA

To further demonstrate the performance of the proposed schemes in cooperative relaying systems [5], we apply them to an AF cooperative DS-CDMA system detailed in Section 2.

Figure 10 shows that the bidirectional scheme obtains performance benefits over the differential schemes during convergence but, as expected, the performance gap closes as steady state is reached. The inclusion of variable mixing parameters improves performance but to a lesser extent than in non-cooperative networks due to the more challenging scenario of compounding highly time-variant channels.
Fig. 10 SINR performance of the proposed CG schemes during training in a single-path cooperative DS-CDMA system

The BER improvement brought about by the bidirectional schemes is evident from Fig. 11. However, the more challenging environment of a cooperative system with compounded rapid fading has affected the BER performance of the schemes, as evidenced by the increased performance gap between the proposed schemes and MMSE reception.
Fig. 11 BER performance of the proposed CG schemes in a single-path cooperative DS-CDMA system

8 Conclusions

In this paper, we have presented a bidirectional MMSE framework that exploits the correlation characteristics of rapidly varying fading channels to overcome the problems associated with conventional adaptive interference suppression techniques in such channels.

An analysis of the proposed schemes has been performed and the reasons behind the performance improvements shown to be the additional correlation information, data reuse, and optimized correlation factor weighting. The conditions under which the differential and bidirectional schemes are equivalent have also been established and the steady-state implications of this detailed. Finally, the proposed algorithms have been assessed in standard and cooperative multiuser DS-CDMA systems and shown to outperform both differential and conventional schemes.



Part of this manuscript was presented at the International Symposium on Wireless Communications Systems (ISWCS) in 2013. The work of R. C. de Lamare is partly funded by CNPq and FAPERJ.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

Department of Electronics


  1. S Verdu, Multiuser Detection (Cambridge University Press, NY, 1998).
  2. U Madhow, M Honig, MMSE interference suppression for direct-sequence spread spectrum CDMA. IEEE Trans. Commun. 42(12), 3178–3188 (1994).
  3. W Huang, Y Wong, C Kuo, in IEEE Global Communications Conference. Decode-and-forward cooperative relay with multi-user detection in uplink CDMA networks (Phoenix, USA, 2007).
  4. RC de Lamare, Joint iterative power allocation and linear interference suppression algorithms for cooperative ds-cdma networks. IET Commun. 6(13), 1930–1942 (2012).
  5. P Clarke, RC de Lamare, Transmit diversity and relay selection algorithms for multirelay cooperative mimo systems. IEEE Trans. Veh. Technol. 61(3), 1084–1098 (2012).
  6. T Wang, RC de Lamare, PD Mitchell, Low-complexity set-membership channel estimation for cooperative wireless sensor networks. IEEE Trans. Veh. Technol. 60(6), 2594–2607 (2011).
  7. RC de Lamare, R Sampaio-Neto, Minimum mean squared error iterative successive parallel arbitrated decision feedback detectors for DS-CDMA systems. IEEE Trans. Commun. 5(56), 778–789 (2008).
  8. R de Lamare, P Diniz, Set-membership adaptive algorithms based on time-varying error bounds for cdma interference suppression. IEEE Trans. Veh. Technol. 58(2), 644–654 (2009).
  9. S Haykin, Adaptive Filter Theory, 4th edn. (Prentice Hall, NJ, 2002).
  10. D Sadler, A Manikas, MMSE multiuser detection for array multicarrier DS-CDMA in fading channels. IEEE Trans. Signal Process. 53(7), 2348–2358 (2005).
  11. R Schober, W Gerstacker, A Lampe, Noncoherent MMSE interference suppression for DS-CDMA. IEEE Trans. Commun. 50(4), 577–587 (2002).
  12. S Haykin, A Sayed, J Zeidler, P Yee, P Wei, Adaptive tracking of linear time variant systems by extended RLS algorithms. IEEE Trans. Signal Process. 45(5), 1118–1128 (1997).
  13. J Wang, Fast tracking RLS algorithm using novel variable forgetting factor with unity zone. IEEE Electron. Lett. 27(23), 2550–2551 (1991).
  14. R Kwong, E Johnston, A variable step size LMS algorithm. IEEE Trans. Signal Process. 40(7), 1633–1642 (1992).
  15. B Toplis, S Pasupathy, Tracking improvements in fast RLS algorithm using a variable forgetting factor. IEEE Trans. Acoust., Speech, Signal Process. 36(2), 206–227 (1988).
  16. S-H Leung, C So, Gradient-based variable forgetting factor RLS algorithm in time-varying environments. IEEE Trans. Signal Process. 53(8), 3141–3150 (2005).
  17. S Hyun-Chool, A Sayed, S Woo-Jin, Variable step-size NLMS and affine projection algorithms. IEEE Signal Process. Lett. 11(2), 132–135 (2004).
  18. Y Zhang, N Li, J Chambers, Y Hao, New gradient-based variable step size LMS algorithms. EURASIP J. Adv. Signal Process, 1–9 (2008).
  19. S Gelfand, Y Wei, J Krogmeier, The stability of variable step-size LMS algorithms. IEEE Trans. Signal Process. 47(12), 3277–3288 (1999).
  20. P Baracca, S Tomasin, L Vangelista, N Benvenuto, A Morello, Per sub-block equalization of very long ofdm blocks in mobile communications. IEEE Trans. Commun. 59(2), 363–368 (2011).
  21. S Yerramalli, M Stojanovic, U Mitra, Partial fft demodulation: A detection method for highly doppler distorted ofdm systems. IEEE Trans. Signal Process. 60(11), 5906–5918 (2012).
  22. L Li, A Burr, R de Lamare, in IEEE 77th Vehicular Technology Conference (VTC Spring) 2013. Joint iterative receiver design and multi-segmental channel estimation for ofdm systems over rapidly time-varying channels (Dresden, Germany, 2013), pp. 1–5.
  23. ML Honig, JS Goldstein, Adaptive reduced-rank interference suppression based on the multistage wiener filter. IEEE Trans. Commun. 6(50), 986–994 (2002).
  24. Y Sun, V Tripathi, ML Honig, Adaptive, iterative, reduced-rank (turbo) equalization. IEEE Trans. Wirel. Commun. 4(6), 2789–2800 (2005).
  25. RC de Lamare, R Sampaio-Neto, Reduced-rank adaptive filtering based on joint iterative optimization of adaptive filters. IEEE Signal Process. Lett. 14(12), 980–983 (2007).
  26. RC de Lamare, R Sampaio-Neto, Reduced-rank space–time adaptive interference suppression with joint iterative least squares algorithms for spread-spectrum systems. IEEE Trans. Veh. Technol. 59(3), 1217–1228 (2010).
  27. RC de Lamare, R Sampaio-Neto, Adaptive reduced-rank equalization algorithms based on alternating optimization design techniques for MIMO systems. IEEE Trans. Veh. Technol. 60(6), 2482–2494 (2011).
  28. RC de Lamare, R Sampaio-Neto, Adaptive reduced-rank processing based on joint and iterative interpolation, decimation, and filtering. IEEE Trans. Signal Process. 57(7), 2503–2514 (2009).
  29. M Honig, M Shensa, S Miller, L Milstein, in IEEE Vehicular Technology Conf. Performance of adaptive linear interference suppression for DS-CDMA in the presence of flat rayleigh fading (Phoenix, 1997).
  30. HV Poor, X Wang, in Annual Allerton Conf. Commun., Control and Computing. Adaptive multiuser detection in fading channels (Monticello, 1996).
  31. U Madhow, K Bruvold, LJ Zhu, Differential MMSE: A framework for robust adaptive interference suppression for DS-CDMA over fading channels. IEEE Trans. Commun. 53(8), 1377–1390 (2005).
  32. LJ Zhu, U Madhow, in IEEE Global Telecoms. Conf. Adaptive interference suppression for direct sequence CDMA over severely time-varying channels (Phoenix, 1997).
  33. M Honig, S Miller, M Shensa, L Milstein, Performance of adaptive linear interference suppression in the presence of dynamic fading. IEEE Trans. Commun. 49(4), 635–645 (2001).
  34. PSR Diniz, Adaptive Filtering: Algorithms and Practical Implementation, 3rd edn. (Springer, 2008).
  35. C Meyer, Matrix Analysis and Applied Linear Algebra (SIAM, 2001).
  36. DG Luenberger, Y Ye, Linear and Nonlinear Programming, 3rd edn. (Springer, 2008).
  37. J Lee, H Yu, Y Sung, Beam tracking for interference alignment in time-varying mimo interference channels: A conjugate-gradient-based approach. IEEE Trans. Veh. Technol. 63(2), 958–964 (2014).
  38. PS Chang, AN Willson, Analysis of conjugate gradient algorithms for adaptive filtering. IEEE Trans. Signal Process. 48(2), 409–418 (2000).
  39. NJ Bershad, JCM Bermudez, J-Y Tourneret, An affine combination of two LMS adaptive filters - transient mean square analysis. IEEE Trans. Signal Process. 56(8), 1853–1864 (2008).
  40. J Arenas-Garcia, A Figueiras-Vidal, A Sayed, Mean-square performance of a convex combination of two adaptive filters. IEEE Trans. Signal Process. 45(3), 1078–1090 (2006).
  41. H Kushner, Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic System Theory (MIT Press, 1984).
  42. A Sayed, Adaptive Filters (John Wiley & Sons, 2008).
  43. W Jakes, Microwave Mobile Communications (Wiley-IEEE Press, 1994).


© Clarke and de Lamare. 2015