Feedforward Data-Aided Phase Noise Estimation from a DCT Basis Expansion

This contribution deals with phase noise estimation from pilot symbols. The phase noise process is approximated by an expansion of discrete cosine transform (DCT) basis functions containing only a few terms.We propose a feedforward algorithm that estimates the DCT coe ﬃ cients without requiring detailed knowledge about the phase noise statistics. We demonstrate that the resulting (linearized) mean-square phase estimation error consists of two contributions: a contribution from the additive noise, that equals the Cramer-Rao lower bound, and a noise independent contribution, that results from the phase noise modeling error. We investigate the e ﬀ ect of the symbol sequence length, the pilot symbol positions, the number of pilot symbols, and the number of estimated DCT coe ﬃ cients on the estimation accuracy and on the corresponding bit error rate (BER). We propose a pilot symbol conﬁguration allowing to estimate any number of DCT coe ﬃ cients not exceeding the number of pilot symbols, providing a considerable performance improvement as compared to other pilot symbol conﬁgurations. For large block sizes, the DCT-based estimation algorithm substantially outperforms algorithms that estimate only the time-average or the linear trend of the carrier phase.


Introduction
Phase noise refers to random perturbations in the carrier phase, caused by imperfections in both transmitter and receiver oscillators.Compensation of this phase noise is critical since these disturbances can considerably degrade the system performance.The phase noise process typically has a low-pass spectrum [1].A description of the characteristics of oscillator phase noise is given in [2].Discrete-time processes that have a bandwidth which is considerably less than the sampling frequency can often be modeled as an expansion of suitable basis functions, that contains only a few terms.Such a basis expansion has been successfully applied in the context of channel estimation and equalization in wireless communications, where the coefficients of the channel impulse response are low-pass processes with a bandwidth that is limited by the Doppler frequency [3][4][5].Several methods trying to tackle the phase noise problem exist.
(i) Designing oscillators operating at low-phase noise reduces the need of accurate phase noise compensation algorithms.This, however, leads to expensive oscillators which are difficult to integrate on chip [6][7][8].
(ii) Phase noise can be tracked by means of a feedback algorithm that operates according to the principle of the phase-locked loop (PLL).As feedback algorithms give rise to rather long acquisition transients, they are not well suited to burst transmission systems [9,10].
(iii) The observation interval is divided into subintervals and a feedforward algorithm is used to estimate within each subinterval the local time-average (or the linear trend) of the phase [9][10][11].This corresponds to approximating the phase noise by a function that is constant (or linear) within each subinterval.Such algorithms avoid the long acquisition transients encountered with feedback algorithms.However, in order that the piecewise constant (or linear) approximation of the phase noise be accurate, the subintervals should be short, in which case a high sensitivity to additive noise occurs.
(iv) Recently, iterative joint estimation and decoding/detection algorithms have been proposed that make use of the a priori statistics of the phase noise process.A factor graph approach for the estimation of the Markov-type phase noise has been presented in [12,13], while in [14,15] sequential Monte Carlo methods combined with Kalman filtering are used to perform detection in the presence of phase noise.These algorithms are computationally rather complex, prevent the use of off-the-shelf decoders, and assume detailed knowledge about the phase noise statistics at the receiver.Less complex iterative phase noise estimation algorithms based on Wiener filtering have been presented in [16], but still require knowledge about the phase noise autocorrelation function at the receiver.
In this contribution, we apply the basis expansion model to the problem of phase noise estimation from pilot symbols only, using the orthogonal basis functions from the discrete cosine transform (DCT).In contrast to the case of channel estimation, the phase noise does not enter the observation model in a linear way.Section 2 presents the system description which includes the observation model and a general phase noise model.Also, the phase noise estimation algorithm, based on the estimation of only a few DCT coefficients, is derived.Section 3 contains the performance analysis of the proposed algorithm in terms of the meansquare error (MSE) of the phase estimate.The behavior of the linearized model in the frequency domain is examined in Section 4. Analysis results are confirmed by computer simulations in Section 5, which consider both the meansquare phase estimation error and the associated bit error rate (BER) degradation.Section 6 gives a complexity analysis of our algorithm.Conclusions are drawn in Section 7.

System Description
We consider the transmission of a block of K data symbols over an AWGN channel that is affected by phase noise.The resulting received signal is represented as where the index k refers to the kth symbol interval of length T, {a(k)} is a sequence of data symbols with symbol energy E[|a(k)| 2 ] = E s , the additive noise {w(k)} is a sequence of i.i.d.zero-mean circularly symmetric complexvalued Gaussian random variables with E[|w(k)| 2 ] = N 0 , and θ(k) is a time-varying phase noise process with K× K correlation matrix R θ .The symbol sequence {a(k)} contains K P known pilot symbols at positions k i , i = 0, . . ., K P − 1, with constant magnitude |a(k i )| 2 = E s .From the observation of the received signal at the pilot symbol positions k i , an estimate θ(k) of the time-varying phase θ(k) is to be produced.This phase estimate will be used to rotate the received signal before data detection, that is, the detection of the data symbols is based on {z(k)} = {r(k)exp(− j θ(k))}.The detector is designed under the assumption of perfect carrier synchronization, that is, θ(k) = θ(k).For uncoded transmission, the detection algorithm reduces to symbol-bysymbol detection: with A denoting the symbol constellation.The phase θ(k) can be represented as a weighed sum of K basis functions over the interval [0, K − 1]: As θ(k) is essentially a low-pass process, it can be well approximated by the weighed sum of a limited number N( K) of suitable basis functions: In this contribution, we make use of the orthonormal discrete cosine transform (DCT) basis functions, that are defined as , n > 0. ( Hence, from (3), x n is the nth DCT coefficient of θ(k).
As ψ n (k) has its energy concentrated near the frequencies n/2KT and −n/2KT, the DCT basis functions are well suited to represent a low-pass process by means of a small number of basis functions.
In the following, we produce from the observation {r(k i )} at the pilot symbol positions k i , with i = 0, . . ., K P −1, an estimate x n of the coefficients x n , with n = 0, . . ., N − 1, using the phase model (4) with equality.The final estimate θ(k) is obtained by computing the inverse DCT of { x n }: However, as (4) is not an exact model of the true phase θ(k), the phase estimate is affected not only by the additive noise contained in the observation, but also by a phase noise modeling error.Considering the observations (1) at instants k i , and assuming that (4) holds with equality, we obtain where for i = 0, . . ., K P − 1; (r P ) i = r(k i ), (w P ) i = w(k i ), (a P ) i = a(k i ), and D(x) is a K P × K P diagonal matrix with D(x) i = e j(ΨPx) i (8) and (Ψ P ) i,n = ψ n (k i ), (x) n = x n , n = 0, . . ., N − 1 with N ≤ K P .The K P × 1 vectors r P , a P , and w P can be viewed as resulting from subsampling {r(k)}, {a(k)}, and {w(k)} at the instants k i that correspond to the pilot symbol positions.Similarly, the nth column of the K P × N matrix Ψ P is obtained by subsampling the n th DCT basis function ψ n (k).Maximum likelihood estimation of x from r P results in As x enters the observation r P in a nonlinear way, the ML estimate is not easily obtained.Therefore, we resort to a suboptimum ad hoc estimation of x, which is based on the argument (angle) of the complex-valued observations.However, as the function arg(z) reduces the argument of z to an interval [−π, π], taking arg(r(k i )) might give rise to phase wrapping, especially when the time-average of θ(k) is close to −π or π.In order to reduce the probability of phase wrapping, we first rotate the observation r over an angle θ avg that is close to the time-average of θ(k), then we estimate the DCT coefficients of the fluctuation θ(k) − θ avg and finally we compute the phase estimate θ(k).We select and construct r with (r We obtain an estimate x of the DCT coefficients of the fluctuation θ(k) − θ avg through a least-squares fit x = arg min x |r − Ψ P x| 2 , yielding In order that (Ψ P T Ψ P ) −1 exists, we need N ≤ K P .Finally, the phase estimate is given by where Note from (13) that the estimation algorithm does not need specific knowledge about the phase noise process.As r (k i ) from ( 11) can be viewed as a noisy version of θ(k i ) − θ avg , the phase estimate θ from (13), or, equivalently, the phase estimate θ(k) from ( 6), can be interpreted as an interpolated version of the subsampled noisy phase trajectory.The estimation of the phase trajectory involves the inversion of the N ×N matrix Ψ P T Ψ P , which depends on the pilot symbol positions {k i , i = 0, . . ., K P − 1}.Now, we point out that the pilot symbol positions can be selected such that Ψ P T Ψ P is diagonal, or, equivalently, that the N columns of the K P × N matrix Ψ P are orthogonal.Such selection of {k i } avoids the need for matrix inversion in (12).Denoting by φ n (i) the orthonormal DCT basis functions of length K P , it is easily verified that selecting {k i } such that gives rise to so that with I N denoting the N × N identity matrix.Equations ( 12) and ( 13) then reduce to In order that all k i from ( 14) be integer, K must be an odd multiple of K P , that is, K = (2d + 1)K P , yielding k i = (2d+1)i+d.The resulting pilot symbol configuration is suited for estimating any number of DCT coefficients not exceeding K P .When K is not an odd multiple of K P , rounding the righthand side of ( 14) to the nearest integer gives rise to pilot symbol positions that still yield an essentially diagonal matrix Ψ P T Ψ P in which case the simplified equations ( 17) and ( 18) can still be used.

Performance Analysis
As the observation vector r P is a nonlinear function of the carrier phase, an exact analytical performance analysis is not feasible.Instead, we will resort to a linearization of the argument function in (11) in order to obtain tractable results.
Linearization of the argument function yields for i = 0, . . ., K P − 1, where {n P (i)} is a sequence of i.i.d.zero-mean Gaussian random variables with variance N 0 /2E s .Note that (19) incorporates the true phase θ(k i ) instead of the approximate model ( 4), so that our performance analysis will take the modeling error into account.In order that the linearization in (19) be valid, we need |θ E s ; hence, the phase noise fluctuations should not cause phase wrapping and E s /N 0 should be sufficiently large.Substituting ( 19) into ( 13) yields where (n P ) i = n P (i), (θ P ) i = θ(k i ), and the K P × K matrix S is such that its ith row has a 1 at the k i th column and zeroes elsewhere (i = 0, . . ., K P − 1).The estimation error resulting from (20) is given by where I K denotes the K × K identity matrix.If the model ( 4) was exact, we would have θ = Ψ K x and θ P = Ψ P x, yielding in which case the estimation error would be caused only by the additive noise.
As a performance measure of the estimation algorithm, we consider the mean-square error (MSE), defined as Substituting ( 21) into (23) yields where The first term in (24) denotes the contribution from the additive noise, whereas the second term in (24) constitutes an MSE floor, caused by the phase noise modeling error.The phase noise statistics affect the MSE floor through the autocorrelation matrix R θ .The MSE floor decreases with increasing N (because the modeling error is reduced when more DCT coefficients are taken into account), whereas the additive noise contribution to the MSE increases with N (because N parameters need to be estimated).Hence, there is an optimum value of N that minimizes the MSE.
From the nonlinear observation model (7), which assumes that (4) holds with equality, we compute the Cramer-Rao lower bound on the MSE (23) resulting from any unbiased estimate x of the DCT coefficients of θ(k): In (26), J denotes the Fisher information matrix related to the estimation of x from (7), which is found to be Combining ( 26) with (27) yields the following performance bound: Comparison of ( 24) and (28) indicates that our ad hoc algorithm (13) yields the minimum possible (over all unbiased estimates) noise contribution to the MSE (assuming that the linearization of the observation model is valid).
When the pilot symbol positions {k i } are selected according to (14), the Cramer-Rao bound (28) reduces to which indicates that the sensitivity to additive noise increases with the number (N) of estimated DCT coefficients.

Frequency-Domain Analysis
After linearization, (20) relates the phase estimate θ to the actual phase θ and the additive noise n P .In the absence of additive noise, the estimator can be viewed as a linear system that transforms θ into θ by means of the transfer matrix MS.
In order to analyze this system in the frequency domain, we consider an input θ n with (θ n ) k = exp( j2πkn/K), that is, θ n contains only the frequency n/K.We investigate the meansquare error MSE n between the input θ n and the output θ = MSθ n ; MSE n is given by (25), with R θ replaced by θ n θ H n , where the superscript H indicates conjugate transpose.
As θ n is periodic in n with period K, the same periodicity holds for MSE n .Assuming the pilot symbol positions are according to (14) with K = 105 and K p = 15, Figure 1 shows MSE n as a function of n/K, with n/K in the interval [−1/2, 1/2] and N = 7.The behavior of MSE n is explained by noting that subsampling θ n at the instants k i (with spacing K P ) gives rise to aliasing.Frequencies n/K and (n + K P )/K yield the same subsampled phase trajectory.In the following discussion, the intervals I KP and I N are defined as [−(K P − 1)/(2K), (K P − 1)/(2K)] and [−(N − 1)/(2K), (N − 1)/(2K)], respectively; note that I N ⊂ I KP .
(i) As the first N basis functions of the DCT transform cover the frequency interval I N , we get θ n ≈ θ n and MSE n ≈ 0 when n/K is in I N .
(ii) When n/K is in the interval I KP , but outside I N , we get θ n ≈ 0 and MSE n ≈ 1.
(iii) Suppose n = mK P + n , with m / = 0, |m| < K/(2K P ) and n in I KP , because of aliasing, θ n is interpreted as θ n exp( jφ m ) with φ m = 2πm(K − K P )/(2K).When n is in the interval I N , we get θ n ≈ θ n exp( jφ m ).The resulting estimation error is the sum of two complex exponentials with frequencies n/K and n /K, yielding MSE n ≈ 2. When n is not in I N , we get θ n ≈ 0 and MSE n ≈ 1.
It follows from Figure 1 that the estimator can be viewed as a low-pass system with bandwidth B = (N − 1)/(2K).Basically, the frequency components n/K of θ with |n/K | < B are tracked by the estimator, whereas the components with |n/K | > B contribute to the MSE.

Simulation Results
In this section, we assess the performance of the proposed technique in terms of the MSE of the phase estimate and the resulting BER degradation by means of computer simulations.In our simulations, we will consider two types of phase noise, that is, Wiener phase noise and first-order phase noise.The (discrete time) first-order phase noise process θ(k) can be viewed as the output of a one-pole filter driven by white Gaussian noise: where {Δ(k)} is a sequence of i.i.d.zero-mean Gaussian random variables with variance σ 2 Δ .The corresponding phase noise power spectrum and phase noise variance are given by The approximations in (31) and (32) hold for f T 1/2 and α 1.It follows from (31) that α/2πT is the 3 dB frequency of the power spectrum.The first-order phase noise models the phase instabilities of an oscillator signal that results from a phase-locked loop (PLL) circuit.The (discretetime) Wiener phase noise θ(k) is described by the following system equation: where the initial phase noise value θ(0) is uniformly distributed in [−π, π] and Δ(k) has the same meaning as in (30).Hence, θ(k) can be viewed as the output of an integrator with a white noise input.From (33), it follows that the variance of the Wiener phase noise increases linearly with the time index k, which indicates that the process is nonstationary.
Comparing (33) and (30), it follows that the Wiener phase noise can be interpreted as a limiting case of first-order phase noise, in the limit for α → 0. Hence, one can formally define the Wiener phase noise spectrum as the limit of the first-order spectrum (31); for α → 0, where the approximation in (34) holds for | f T| 1/2.Note that the Wiener phase noise spectrum becomes unbounded at f = 0, which is a consequence of the variance increasing linearly with time.In contrast, the complex envelope exp( jθ) of the oscillator signal can be shown to be a stationary process (with [1, the Lorentzian power spectrum]).The Wiener phase noise model is often used to describe the phase noise process of a free-running oscillator, although also more elaborate models exist, involving a phase noise spectrum that consists of a sum of terms of the form A m f −m , m = 0, . . ., 4 [10,[17][18][19].In order to reduce the strong low-frequency components of the phase noise resulting from a free-running oscillator, the oscillator is often incorporated in a PLL circuit;  a first-order PLL gives rise to the first-order phase noise process (30) [17].
Figure 2 shows the first-order phase noise power spectrum, normalized by its value S 0 at f = 0, as a function of the normalized frequency f / f 3 dB , with f 3 dB = α/(2πT); also displayed is the Wiener phase noise power spectrum (normalized by the same S 0 ).As for both types of phase noise, the same value of σ 2 Δ has been used, both spectra have the same high-frequency content.
In the following simulations, Wiener phase noise is assumed, unless noted otherwise.First, we assume transmission of a block of length K = 105 symbols, consisting of K D = 90 uncoded QPSK data symbols and K P = 15 constant-energy pilot symbols that are inserted into the sequence according to (14).
(i) Figure 3 shows the MSE of the phase estimate in the absence of phase noise as a function of E s /N 0 when N = 1, 4 and 10 DCT coefficients are estimated; in addition, these simulation results are compared to the corresponding CRB (29).We observe that the CRB is achieved for sufficiently high values of E s /N 0 .For small E s /N 0 , the MSE exceeds the CRB, which is in agreement with the fact that the linearized observation model from (19) is no longer accurate in the low-SNR region.Furthermore, it is confirmed that the contribution from the additive noise to the MSE is proportional to the number of estimated coefficients N.
(ii) Figure 4 shows the MSE as function of E s /N 0 for N = 1, 4 and 10, but this time in the presence of Wiener phase noise with σ 2 Δ = 0.0027 rad 2 (which corresponds to "strong" phase noise, with σ Δ = 3 • ).We observe an MSE floor in the high-E s /N 0 region, which can be reduced by increasing the number N of estimated coefficients.Figure 4 also confirms that for low E s /N 0 , the MSE increases when N increases.This high-E s /N 0 and low-E s /N 0 behaviors indicate that for given K, K P , and E s /N 0 , the MSE can be minimized by proper selection of N.
(iii) Figure 5 shows the bit error rate (BER) as a function of E b /N 0 (E b is the energy per transmitted bit, E s = 2(1−η)E b for QPSK) for N = 1, 4, and 10.The reference BER curve corresponds to a system with perfect synchronization and no pilot symbols (η = 0).We observe that for low E b /N 0 , it is sufficient to estimate only the time-average of the phase (i.e., N = 1).Estimating a higher number of DCT coefficients can lead to a worse BER performance for low E b /N 0 because the MSE of the phase estimate due to additive noise increases with N. At high E b /N 0 , a BER floor occurs which decreases with increasing N, so in this region it becomes beneficial to estimate more than just one DCT coefficient.Hence, the optimal number of estimated coefficients N opt will depend on the operating E b /N 0 .
(iv) Figure 6 compares the BER degradations at BER ref = 10 −4 resulting from Wiener phase noise and firstorder phase noise; the value of σ 2 Δ is the same for both phase noise processes, such that the Wiener phase noise spectrum and first-order phase noise spectrum are the same for large f .(The BER degradation caused by some impairment is characterized by the increase (in dB) of E b /N 0 (as compared to the case of no impairment) needed to maintain the BER at a specified reference level.)As the 3 dB frequency α/(2πT) of the first-order phase noise is less than BT, the frequency contents of the Wiener phase noise and the first-order phase noise outside the estimator bandwidth are essentially the same, and the corresponding BER curves are nearly coincident; this is in agreement with the analysis from Section 4, where we showed that the low-frequency components of the phase noise practically do not contribute to the phase error.It is also confirmed that there is an optimum value of N that minimizes the BER degradation; this optimum N increases with σ Δ .
Next, we study the influence of the pilot symbol positions in the symbol sequence, assuming Wiener phase noise with σ Δ = 3 • .The following scenarios are considered (see Figure 7), with K P = 15.
(ii) All pilot symbols are located in the middle of the sequence (SCEN2).
(iii) K P /2 pilot symbols are inserted at the beginning of the sequence, the remaining K P /2 pilot symbols are placed at the end (SCEN3).
(v) We divide the total number of 15 pilot symbols into 3 clusters of 5 consecutive pilot symbols each.The 3 clusters are centered at the positions ( 14) that correspond to K P = 3 (SCEN5).
(vi) We divide the total number of 15 pilot symbols into 5 clusters of 3 consecutive pilot symbols each.The 5 clusters are centered at the positions ( 14) that correspond to K P = 5 (SCEN6).
Figure 8 shows the BER for each scenario with N = 4.We observe that SCEN2 and SCEN3 lead to essentially the same BER performance, that turns out to be very poor.The BER resulting from SCEN5 is slightly better, but still poor.Much better BER performance is obtained for SCEN1, SCEN4, and SCEN6, with SCEN1 yielding the best performance.The poor performance resulting from SCEN2, SCEN3, and SCEN5 comes from the poor conditioning of the 15 × 4 matrix Ψ P , yielding very large values when computing the inverse of Ψ P T Ψ P .As the DCT basis functions ψ 0 (k), . . ., ψ 3 (k) change only slowly with k, SCEN2 yields a matrix Ψ P with nearly identical rows, so it behaves like a matrix of rank 1.Similarly, the matrices Ψ P that correspond to SCEN3 and SCEN5 behave like matrices of ranks 2 and 3, respectively.Hence, when the pilot symbols are placed in a number of clusters that are less than the number (N) of DCT coefficients to be estimated, poor performance results.For SCEN1, SCEN4, and SCEN6, the number of pilot symbol clusters exceeds N; the corresponding matrices Ψ P are fullrank (rank = 4), and good performance results.Note that SCEN1 and SCEN4 can cope with values of N up to K P , whereas SCEN6 cannot handle values of N in excess of 5.
In the following, we investigate the influence of the number of pilot symbols on the MSE and the BER.The constant-energy pilot symbols are inserted into the data sequence according to (14).For (14) to hold, the block length K should be an odd multiple of the number of pilot symbols K P .We assume a total block length K = 105 and simulate the BER and MSE for K P = 7, 15, and 35. Figure 9 shows the BER degradation at BER= 10 −4 with respect to the reference system, for a fixed ratio η = K P /K = 20% and various values of the block length K.The BER degradation −10 log(1 − η) due to the insertion of pilot symbols (which amounts to 0.97 dB for η = 0.2) is included.The following observation can be made.
(i) For given block size K, there is an optimum number N opt of DCT coefficients to be estimated that minimizes the BER degradation.This is consistent with the observation that the MSE of the phase estimate can be minimized by a suitable choice of N.
(ii) For very small K, N opt = 1.The optimum value N opt increases with increasing K because more DCT coefficients      are needed to model the phase fluctuations when K gets larger.Keeping N = 1 yields very large degradations when K increases.
(iii) The BER degradation that corresponds to N = N opt exhibits a (broad) minimum as a function of K.As long as the fluctuation of θ(k) about its time-average is small, so that linearization of the argument function in (11) applies, the degradation decreases with increasing K because the number K P of noisy observations of the phase noise increases when the ratio K P /K is fixed.However, for too large K, the fluctuation of the Wiener phase noise is so large that  linearization is no longer valid (for Wiener phase noise, we need Kσ 2 Δ 1 for the linearization to be accurate) and the resulting degradation increases with increasing K.
For the considered scenario, the minimum degradation occurs at (K opt , N opt ) ≈ (400, 20) and amounts to about 2.1 dB.When the actual block size K exceeds K opt , the degradation can be limited by dividing the block in subblocks of at most K opt symbols, and estimating the phase trajectory for each subblock separately.
(i) The proposed DCT-based algorithm with pilot symbol placement according to SCEN1 (14) and selection of the optimum N.
(ii) Estimation of only the time-average of the phase noise, with the pilot symbols arranged according to SCEN3.
(iii) The method from Luise et al. [11], with the pilot symbols arranged according to SCEN3.The phase noise over the total symbol block is approximated as a linear interpolation between the average phase values over the first and the second pilot symbol clusters .
We observe that estimating only the time-average or the linear trend of the phase noise yields poor BER performance, except for small K.For K = 10, the DCT-based algorithm also estimates the time-average only (because N = 1 is optimum for K = 10); we observe that SCEN3 (with pilot symbols at positions 0 and 9) performs slightly better than the DCT-based algorithm (with pilot symbols at positions 2 and 7) for K = 10.However, when the block length is increased, the DCT algorithm that estimates multiple DCT coefficients outperforms both SCEN3 and Luise et al. and leads to a BER degradation that decreases with increasing K until an optimal value for K is reached.

Complexity Analysis
In order to assess the complexity of the proposed algorithm, we determine the number of complex multiplications required per symbol interval.The calculation of the second term in (18) requires the highest number of computations.This term can be evaluated in the following ways.
(1) In a first approach, (K/K P )Ψ K Ψ P T r is calculated via two matrix multiplications: first Ψ P T (dimension N × K P ) and r (dimension K P × 1) are multiplied and then (K/K P )Ψ K (dimension K × N)and Ψ P T r (dimension N × 1) are multiplied.The resulting complexity is of the order O(NK P + KN) ≈ O(KN), with the approximation holding for K K P .Hence, the complexity per symbol interval amounts to O(N).(2) In a second approach, (K/K P )Ψ K Ψ P T r is calculated via a single-matrix multiplication: (K/K P )Ψ K Ψ P T (dimension K ×K P ) and r (dimension K P ×1) are multiplied.Taking into account that (K/K P )Ψ K Ψ P T can be computed offline, the resulting complexity per symbol is O(K P ).As N ≤ K P , the first approach is to be preferred over the second approach.
(3) The third approach exploits the fact that Ψ K and Ψ P are submatrices of K × K and K P × K P DCT transform matrices, respectively.Hence, the two matrix multiplications from the first approach can be replaced by an inverse DCT transform (size K P ) followed by a DCT transform (size K).As K K P , the complexity of the size-K DCT dominates.The DCT of a vector {s(0), s(1), . . ., s(K − 1)} of length K can be obtained by calculating the discrete Fourier transform (DFT) of its even expansion {s(K − 1), . . ., s(1), s(0), s(0), s(1), . . ., s(K − 1)} (note that the even expansion has length 2K).As the FFT algorithm used for calculating the DFT of length M has a computational complexity O(M log 2 (M)), the complexity of the size-K DCT is O(2K log 2 (2K)), yielding a complexity per symbol interval of O(log 2 (4K 2 )).
The complexity per symbol interval of the phase noise estimation method used by Benvenuti et al. [11] is about O (1). Figure 11 shows the order of complexity as a function of the block length K, for the proposed algorithm (approaches 1 and 3) and for Luise et al. algorithm; the result related to the first approach in the proposed algorithm corresponds to taking for each K the value of N that is optimum for σ Δ = 3 • .Luise et al. algorithm has

Figure 2 :
Figure 2: Power spectrum for Wiener phase noise and first-order phase noise.

6 EURASIPFigure 3 :
Figure 3: MSE in the absence of phase noise compared to the corresponding CRB.K = 105, K P = 15.

Figure 11 :
Figure 11: Complexity comparison for the proposed algorithm (approaches 1 and 3) and for Luise et al. algorithm.