We propose to apply a subspace-based algorithm to jointly estimate the SI and intended channel coefficients along with the nonlinear coefficients. Subspace methods rely on the orthogonality between the signal and noise subspaces, which are obtained from the eigendecomposition of the covariance matrix of the received signal *y*. Denoting by *R*_{u} the covariance matrix of *u*, the covariance matrix *R*_{y} of the received vector *y* is given by

$$ \boldsymbol{R}_{y} = \boldsymbol{H} \boldsymbol{R}_{u} \boldsymbol{H}^{H} +\sigma^{2} \boldsymbol{I}_{MN_{r}}, $$

(17)

provided that the signal samples are uncorrelated with the noise samples^{2}.
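As a numerical sanity check on (17), the following sketch (all dimensions and values are illustrative stand-ins, not taken from the system model) builds *R*_{y} for a tall, full-rank channel matrix and confirms that its eigenvalues split into a signal part and a noise floor at *σ*^{2}:

```python
import numpy as np

rng = np.random.default_rng(0)
rows, cols = 8, 3          # stand-ins for M*Nr and 2*N*Nt (tall H => nondegenerate noise subspace)
sigma2 = 0.1

H = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
Ru = np.eye(cols)          # white, unit-power transmit signal (toy assumption)

# Covariance of y = H u + w with u and w uncorrelated, as in Eq. (17)
Ry = H @ Ru @ H.conj().T + sigma2 * np.eye(rows)

eigvals = np.linalg.eigvalsh(Ry)   # ascending order
noise_eigs, signal_eigs = eigvals[:rows - cols], eigvals[rows - cols:]

# The rows-cols smallest eigenvalues all equal sigma^2 and span the noise subspace
assert np.allclose(noise_eigs, sigma2)
assert np.all(signal_eigs > sigma2)
```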

The signal subspace is spanned by the columns of the matrix *H*. Since the columns of *H* are, by construction, linearly independent as soon as there exists an *l* ∈ [0, *L*] such that *H*(*l*) is full rank^{3}, the matrix *H* is full rank. Therefore, the dimension of the signal subspace is 2*NN*_{t}. It follows that, to obtain a nondegenerate noise subspace, its dimension *N*_{r}*M*−2*N*_{t}*N* must be strictly positive; thus, the number of receive antennas must exceed the number of transmit antennas for the subspace method to work. In [5], we developed the linear subspace algorithm for this setting. In the following, we develop the subspace-based algorithm for general numbers of transmit and receive antennas. When *N*_{t}=*N*_{r}, the matrix *R*_{y} cannot be directly used to find the noise subspace. As an alternative, we consider the augmented received vector

$$\begin{array}{@{}rcl@{}} \widetilde{\boldsymbol{y}} = \left(\begin{array}{l} \boldsymbol{y} \\ \boldsymbol{y}^{*} \end{array}\right) = \left(\begin{array}{ll} \boldsymbol{H} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{H}^{*} \end{array}\right) \left(\begin{array}{l} \boldsymbol{u} \\ \boldsymbol{u}^{*} \end{array}\right) + \left(\begin{array}{l} \boldsymbol{w} \\ \boldsymbol{w}^{*} \end{array}\right). \end{array} $$

(18)

The use of the augmented received vector is usually referred to as widely linear processing. In this case, the augmented covariance matrix \(\boldsymbol {R}_{\widetilde y}\) of \(\widetilde {\boldsymbol {y}}\) has the following structure:

$$\begin{array}{@{}rcl@{}} \boldsymbol{R}_{\widetilde y} = \widetilde{\boldsymbol{H}} \boldsymbol{R}_{\widetilde u} \widetilde{\boldsymbol{H}}^{H} + \sigma^{2} \boldsymbol{I}_{2MN_{r}}, \end{array} $$

(19)

where \(\boldsymbol {R}_{\widetilde u}\) denotes the covariance matrix of the augmented transmit signal \(\widetilde {\boldsymbol {u}} = \left (\begin {array}{l} \boldsymbol {u} \\ \boldsymbol {u}^{*} \end {array}\right)\) and

$$ \widetilde{\boldsymbol{H}} = \left(\begin{array}{ll} \boldsymbol{H} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{H}^{*} \end{array}\right). $$

(20)
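Equation (19) implicitly uses the fact that proper (circularly symmetric) noise has a vanishing pseudo-covariance, so the augmented noise covariance remains *σ*^{2}*I*. A quick numerical check of this property, with toy dimensions and a sample-average estimate of the two moments:

```python
import numpy as np

rng = np.random.default_rng(1)
n, samples = 4, 200_000
sigma2 = 1.0

# Circularly symmetric (proper) complex Gaussian noise of total power sigma^2
w = np.sqrt(sigma2 / 2) * (rng.standard_normal((n, samples))
                           + 1j * rng.standard_normal((n, samples)))

R_w = w @ w.conj().T / samples   # covariance        E[w w^H] -> sigma^2 I
C_w = w @ w.T / samples          # pseudo-covariance E[w w^T] -> 0

assert np.allclose(R_w, sigma2 * np.eye(n), atol=0.02)
assert np.allclose(C_w, 0, atol=0.02)
```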

It is worth mentioning that proper noise has a vanishing pseudo-covariance [34]. The main purpose of using the extended received signal is to increase the dimension of the observation and thus avoid a degenerate noise subspace. The subspace identification procedure can be derived only if the signal-part covariance matrix \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) of the covariance matrix \(\boldsymbol {R}_{\widetilde y}\) is singular, i.e., \(d_{s} = \text {rank}(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}) < 2MN_{r}\). In this case, the signal is confined to a *d*_{s}-dimensional subspace and the remaining noise subspace has dimension 2*MN*_{r}−*d*_{s}. Singularity of \(\boldsymbol {R}_{\widetilde u}\) is a necessary condition for a nondegenerate noise subspace: since \(\widetilde {\boldsymbol {H}}\) is full rank, a nonsingular \(\boldsymbol {R}_{\widetilde u}\) yields \(\text {rank}(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}) = 2MN_{r}\), so the matrix \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) spans the whole observation space. On the other hand, since the matrix \(\widetilde {\boldsymbol {H}}\) is wide in this setting (it has more columns than rows), singularity of \(\boldsymbol {R}_{\widetilde u}\) is not a sufficient condition to guarantee the singularity of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\).

The matrix \(\boldsymbol {R}_{\widetilde u}\) can be expressed in block form in terms of the covariance matrix of *u*, *R*_{u}=*E*(*u**u*^{H}), the pseudo-covariance matrix *C*_{u}=*E*(*u**u*^{T}), and their complex conjugates as

$$ \boldsymbol{R}_{\widetilde u} = \left(\begin{array}{ll} \boldsymbol{R}_{u} & \boldsymbol{C}_{u} \\ \boldsymbol{C}_{u}^{*} & \boldsymbol{R}_{u}^{*} \end{array}\right). $$

(21)

In the following, we distinguish two cases of real and complex modulated symbols.

For real modulated symbols, it can be shown that \(\boldsymbol {R}_{\widetilde u} = \alpha ^{2} \boldsymbol {M} \otimes \boldsymbol {I}_{2N_{t}}\), with the 2*N*×2*N* matrix *M* having the form given in (22).

From (22), we note that each column of *M* appears exactly twice (the first column of *M* is the same as the (*N*+1)^{th} column, and the *i*^{th} column of *M* is the same as the (2*N*−*i*+2)^{th} column, for *i*=2,…, *N*). Therefore, the matrix *M* has exactly *N* independent columns, and thus its rank is *N*. It follows that the rank of \(\boldsymbol {R}_{\widetilde u}\) is 2*NN*_{t}. In Appendix 1, we show that \(\boldsymbol {R}_{\widetilde u}\) has the eigenvalue zero with multiplicity 2*NN*_{t} and the eigenvalue 2*α*^{2}, also with multiplicity 2*NN*_{t}. Then, the matrix \(\boldsymbol {R}_{\widetilde u}\) is decomposed as *U**D**U*^{H}, where *D* is the 4*NN*_{t}×4*NN*_{t} diagonal matrix with zeros in the first 2*NN*_{t} diagonal elements and 2*α*^{2} in the last 2*NN*_{t} diagonal elements, and *U* is a unitary matrix whose columns are the corresponding eigenvectors of \(\boldsymbol {R}_{\widetilde u}\).
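The eigenvalue structure claimed above can be checked on a toy case. For real symbols, *C*_{u}=*E*(*u**u*^{T})=*R*_{u}, so the augmented covariance (21) becomes the rank-deficient block matrix below; with white symbols (*R*_{u}=*α*^{2}*I*, an illustrative assumption), the eigenvalues are exactly 0 and 2*α*^{2}, each with half multiplicity:

```python
import numpy as np

k = 6          # stand-in for 2*N*Nt
alpha2 = 1.0

# Real-valued symbols: u = u*, hence C_u = E[u u^T] = R_u
Ru = alpha2 * np.eye(k)                    # white real symbols (toy case)
R_aug = np.block([[Ru, Ru], [Ru, Ru]])     # R_u~ = [[R_u, C_u], [C_u*, R_u*]]

eigs = np.sort(np.linalg.eigvalsh(R_aug))
# Half of the eigenvalues are 0, the other half 2*alpha^2
assert np.allclose(eigs[:k], 0)
assert np.allclose(eigs[k:], 2 * alpha2)
```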

For complex symbols, the pseudo-covariance matrix *C*_{u} is generally equal to the zero matrix, which makes the matrix \(\boldsymbol {R}_{\widetilde u}\) full rank. To avoid this problem, we apply a simple precoding at the input of the IFFT. It transforms the data symbol *X*_{q} to

$$\begin{array}{@{}rcl@{}} \widetilde{\boldsymbol{X}}_{q} = \boldsymbol{P} \boldsymbol{X}_{q} + \boldsymbol{Q} \boldsymbol{X}_{q}^{*}, \end{array} $$

(23)

where *P* and *Q* are two precoding matrices. By combining the data symbol *X*_{q} and its complex conjugate, we force the pseudo-covariance matrix to be different from zero. Appendix 2 gives a detailed discussion of the choice of the matrices *P* and *Q* so that the covariance matrix \(\boldsymbol {R}_{\widetilde u}\) has rank 2*NN*_{t} and can be decomposed as *U**D**U*^{H}, with *D* the 4*NN*_{t}×4*NN*_{t} diagonal matrix with zeros in the first 2*NN*_{t} diagonal elements.
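The effect of the precoding (23) can be illustrated numerically. The choice of *P* and *Q* below is purely a toy example (the actual design is the subject of Appendix 2); it shows how mixing *X*_{q} with its conjugate turns a vanishing pseudo-covariance into a nonzero one:

```python
import numpy as np

rng = np.random.default_rng(3)
k, samples = 4, 200_000

# Proper (circular) QPSK-like symbols: pseudo-covariance E[X X^T] = 0
X = (rng.choice([-1, 1], (k, samples))
     + 1j * rng.choice([-1, 1], (k, samples))) / np.sqrt(2)

# Hypothetical precoder, for illustration only
P = np.eye(k)
Q = 0.5 * np.eye(k)
Xt = P @ X + Q @ X.conj()        # Eq. (23)

C_before = X @ X.T / samples     # ~ 0 for proper symbols
C_after = Xt @ Xt.T / samples    # forced nonzero by the precoding

assert np.allclose(C_before, 0, atol=0.02)
assert np.linalg.norm(C_after) > 1.0
```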

The noise subspace is the span of the *p*=2*MN*_{r}−2*NN*_{t} eigenvectors of \(\boldsymbol {R}_{\widetilde y}\) corresponding to the smallest eigenvalue *σ*^{2}, and the columns of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) belong to the signal subspace. Owing to the orthogonality between the signal and noise subspaces, each column of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) is orthogonal to any vector in the noise subspace. Let \(\{\boldsymbol {\nu }_{i}\}_{i=1}^{p}\) denote the *p* orthonormal eigenvectors corresponding to the smallest eigenvalue of \(\boldsymbol {R}_{\widetilde y}\). Then, we have the following set of equations:

$$ \boldsymbol{\nu}_{i}^{H} \widetilde{\boldsymbol{H}} \boldsymbol{R}_{\widetilde u} \widetilde{\boldsymbol{H}}^{H} = \mathbf{0},~i=1,~2,\dots,~p. $$

(24)
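The orthogonality in (24) is exact, not approximate, because a noise eigenvector *ν* of the covariance matrix satisfies (signal term)·*ν* = 0. A toy check (a generic full-rank `Ht` stands in for \(\widetilde{\boldsymbol{H}}\) and identity `Ru` for \(\boldsymbol{R}_{\widetilde u}\); the real system has the structured, rank-deficient versions described above):

```python
import numpy as np

rng = np.random.default_rng(4)
rows, cols = 10, 4    # stand-ins for 2*M*Nr and the rank of the signal term
sigma2 = 0.05

Ht = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
Ru = np.eye(cols)
Ry = Ht @ Ru @ Ht.conj().T + sigma2 * np.eye(rows)

# Eigenvectors of the p smallest eigenvalues span the noise subspace
eigvals, eigvecs = np.linalg.eigh(Ry)   # ascending eigenvalues
p = rows - cols
noise_basis = eigvecs[:, :p]            # the nu_i of Eq. (24)

# Each nu_i^H (Ht Ru Ht^H) = 0: the noise subspace annihilates the signal term
residual = noise_basis.conj().T @ (Ht @ Ru @ Ht.conj().T)
assert np.allclose(residual, 0, atol=1e-9)
```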

From (24), we conclude that the vectors *ν*_{i} span the left null space of \( \widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\). For convenience, *U* is written as a block of four 2*NN*_{t}×2*NN*_{t} matrices:

$$ \boldsymbol{U} = \left(\begin{array}{cccccc} \boldsymbol{U}_{1} & \boldsymbol{U}_{2} \\ \boldsymbol{U}_{3} & \boldsymbol{U}_{4} \end{array} \right), $$

(25)

where the columns of \([\boldsymbol {U}_{1}^{T},~\boldsymbol {U}_{3}^{T}]^{T}\) are the eigenvectors of \(\boldsymbol {R}_{\widetilde u}\) corresponding to the eigenvalue zero and the columns of \([\boldsymbol {U}_{2}^{T},~\boldsymbol {U}_{4}^{T}]^{T}\) are the remaining eigenvectors. Then, taking into account the eigenvalue decomposition of \(\boldsymbol {R}_{\widetilde u}\), the set of equations in (24) is equivalent to

$$\begin{array}{@{}rcl@{}} \boldsymbol{\nu}_{i}^{H} \left(\begin{array}{c} \boldsymbol{H} \boldsymbol{U}_{2}\\ \boldsymbol{H}^{*} \boldsymbol{U}_{4} \end{array} \right) = \mathbf{0},~i=1,~2,\dots,~p. \end{array} $$

(26)

By dividing *ν*_{i} into two *MN*_{r}×1 vectors, i.e., \(\boldsymbol {\nu }_{i} = [\boldsymbol {\nu }_{i,1}^{T},~\boldsymbol {\nu }_{i,2}^{T}]^{T}\), (26) is rewritten as

$$ \boldsymbol{\nu}_{i,1}^{H} \boldsymbol{H} \boldsymbol{U}_{2} + \boldsymbol{\nu}_{i,2}^{H} \boldsymbol{H}^{*} \boldsymbol{U}_{4} = \mathbf{0}, $$

(27)

for *i*=1, 2,…, *p*. The matrix *H* is completely defined by the set of matrices *H*(*l*), for *l*=0, 1,…, *L*. Therefore, the specific structure of *H* should be taken into account when solving the equations in (27) to obtain a more accurate estimate of the channels. To that end, we partition the two vectors *ν*_{i,1} and *ν*_{i,2} as follows:

$$ \begin{aligned} \boldsymbol{\nu}_{i,j} &= \left[\boldsymbol{\nu}_{i,j}^{T}(M),~\boldsymbol{\nu}_{i,j}^{T}(M-1),\dots,~\boldsymbol{\nu}_{i,j}^{T}(1)\right]^{T},\\ j&=1,~2,~i=1,~2,\dots,~p, \end{aligned} $$

(28)

where each *ν*_{i,j}(*n*), for *n*=1, 2,…, *M*, is an *N*_{r}×1 vector. From (13) and (28), each term \(\boldsymbol {\nu }_{i,1}^{H} \boldsymbol {H}\) in (27) is rewritten as

$$ \begin{aligned} &\sum_{l=0}^{L} \boldsymbol{\nu}_{i,1}^{H}(n+L-l) \boldsymbol{H}(l) + \sum_{l=n}^{L} \boldsymbol{\nu}_{i,1}^{H}(M-l+n) \boldsymbol{H}(l),\\&\quad\text{for}~n=1,~\dots,~L,\\ &\sum_{l=0}^{L} \boldsymbol{\nu}_{i,1}^{H}(n+L-l) \boldsymbol{H}(l), \text{for}~n=L+1,\dots,~M, \end{aligned} $$

(29)

and \(\boldsymbol {\nu }_{i,2}^{H} \boldsymbol {H}^{*}\) can be partitioned in the same manner. By introducing \(\boldsymbol {\check {h}}(l) = \text {vect}(\boldsymbol {H}(l))\) and \(\boldsymbol {V}_{i,j}(n) = \boldsymbol {I}_{2N_{t}} \otimes \boldsymbol {\nu }_{i,j}^{H}(n)\), for *i*=1,…, *p* and *j*=1, 2, it is easy to verify that \(\boldsymbol {\nu }_{i,j}^{H}(n) \boldsymbol {H}(l) = \boldsymbol {\check h}^{T}(l) \boldsymbol {V}_{i,j}^{T}(n)\). Let us define the 2*NN*_{t}×2*N*_{t}*N*_{r}(*L*+1) matrices *V*_{i,j}, for *j*=1, 2, as

$$ { \begin{aligned} \boldsymbol{V}_{i,j} &= \left(\begin{array}{llll} \boldsymbol{V}_{i,j}(L+1) & \boldsymbol{V}_{i,j}(L) & \ldots & \boldsymbol{V}_{i,j}(1) \\ \boldsymbol{V}_{i,j}(L+2) & \boldsymbol{V}_{i,j}(L+1) & \ldots & \boldsymbol{V}_{i,j}(2) \\ \boldsymbol{V}_{i,j}(L+3) & \boldsymbol{V}_{i,j}(L+2) & \ldots & \boldsymbol{V}_{i,j}(3) \\ \vdots & \vdots & \vdots & \vdots \\ \boldsymbol{V}_{i,j}(N+L) & \boldsymbol{V}_{i,j}(N+L-1)& \ldots & \boldsymbol{V}_{i,j}(N) \\ \end{array}\right)\\ &\quad+ \left(\begin{array}{llll} \mathbf{0} & \boldsymbol{V}_{i,j}(N+L) & \ldots & \boldsymbol{V}_{i,j}(N+1) \\ & & \ddots & \vdots \\ \vdots& & & \boldsymbol{V}_{i,j}(N+L) \\ \vdots& & & \mathbf{0} \\ & & & \vdots \\ \mathbf{0} & & & \mathbf{0} \\ \end{array}\right), \end{aligned}} $$

(30)

and \(\boldsymbol {\check {h}} = [\boldsymbol {\check h}^{T}(0),~\boldsymbol {\check h}^{T}(1),\dots,~\boldsymbol {\check h}^{T}(L)]^{T}\). Then, using the previous notations, (27) is rearranged to obtain

$$\begin{array}{@{}rcl@{}} \boldsymbol{\check h}^{T} \boldsymbol{V}_{i,1}^{T} \boldsymbol{U}_{2} + \boldsymbol{\check h}^{H} \boldsymbol{V}_{i,2}^{T} \boldsymbol{U}_{4} = \mathbf{0}, \end{array} $$

(31)

or, by taking the transpose of the previous equation:

$$\begin{array}{@{}rcl@{}} \boldsymbol{U}_{2}^{T} \boldsymbol{V}_{i,1} \boldsymbol{\check h} + \boldsymbol{U}_{4}^{T} \boldsymbol{V}_{i,2} \boldsymbol{\check h}^{*} = \mathbf{0}, \end{array} $$

(32)

for *i*=1, 2,…, *p*. Note that the difference between (27) and (32) is that (32) takes into account the block-Toeplitz structure of *H*. Now, collecting all the previous equations, we obtain

$$ \boldsymbol{\Theta}_{1} \boldsymbol{\check h} +\boldsymbol{\Theta}_{2} \boldsymbol{\check h}^{*} = \mathbf{0}, $$

(33)

where

$$ { \begin{aligned} \boldsymbol{\Theta}_{1} & = \left[\left(\boldsymbol{U}_{2}^{T} \boldsymbol{V}_{1,1}\right)^{T},~\left(\boldsymbol{U}_{2}^{T} \boldsymbol{V}_{2,1}\right)^{T},\dots,~\left(\boldsymbol{U}_{2}^{T} \boldsymbol{V}_{p,1}\right)^{T}\right]^{T},\\ \boldsymbol{\Theta}_{2} & = \left[\left(\boldsymbol{U}_{4}^{T} \boldsymbol{V}_{1,2}\right)^{T},~\left(\boldsymbol{U}_{4}^{T} \boldsymbol{V}_{2,2}\right)^{T},\dots,~\left(\boldsymbol{U}_{4}^{T} \boldsymbol{V}_{p,2}\right)^{T}\right]^{T}. \end{aligned}} $$

(34)

Separating the real and imaginary parts of (33), we have

$$\begin{array}{*{20}l} \underbrace{\left(\begin{array}{ll} \Re(\boldsymbol{\Theta}_{1}+\boldsymbol{\Theta}_{2}) & \Im(-\boldsymbol{\Theta}_{1}+\boldsymbol{\Theta}_{2})\\ \Im(\boldsymbol{\Theta}_{1}+\boldsymbol{\Theta}_{2}) & \Re(\boldsymbol{\Theta}_{1}-\boldsymbol{\Theta}_{2}) \end{array}\right)}_{\boldsymbol{\overline \Theta}} \underbrace{\left(\begin{array}{l} \Re(\boldsymbol{\check h}) \\ \Im(\boldsymbol{\check h}) \end{array}\right)}_{\boldsymbol{\overline h}} = \mathbf{0}. \end{array} $$

(35)
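The split into real and imaginary parts in (35) can be exercised on a toy instance. The sizes below and the construction of a compatible Θ_{2} are purely illustrative; the point is that any *h* satisfying (33) yields a stacked real vector in the right null space of \(\boldsymbol{\overline \Theta}\), recoverable by SVD:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 12, 4   # toy sizes, stand-ins for the dimensions in the text

# Build a toy instance of Eq. (33), Theta1 h + Theta2 h* = 0, with a known h
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Theta1 = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
Theta2 = -Theta1 @ np.diag(h / h.conj())   # illustrative construction so h solves (33)
assert np.allclose(Theta1 @ h + Theta2 @ h.conj(), 0)

# Real/imaginary split, exactly the block pattern of Eq. (35)
Tbar = np.block([
    [(Theta1 + Theta2).real, (-Theta1 + Theta2).imag],
    [(Theta1 + Theta2).imag, (Theta1 - Theta2).real],
])
hbar = np.concatenate([h.real, h.imag])
assert np.allclose(Tbar @ hbar, 0, atol=1e-9)

# hbar lies in the right null space of Tbar: take the near-zero singular directions
_, s, Vt = np.linalg.svd(Tbar)
null_dim = int(np.sum(s < 1e-9 * s[0]))
basis = Vt[2 * n - null_dim:].T            # orthonormal null-space basis
proj = basis @ (basis.T @ hbar)            # projection onto the null space
assert np.allclose(proj, hbar, atol=1e-8)  # hbar is fully inside the null space
```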

From (35), the vector \(\boldsymbol {\overline h}\) belongs to the right null space of \(\boldsymbol {\overline \Theta }\). In practice, \(\boldsymbol {\overline h}\) is a linear combination of the 4*N*_{t}*N*_{r} right singular vectors of the matrix \(\boldsymbol {\overline \Theta }\), denoted by *β*_{i}, which are the eigenvectors of the Gramian \(\overline {\boldsymbol {\Theta }}^{T}\overline {\boldsymbol {\Theta }}\) corresponding to the zero eigenvalue. Therefore, an estimate of \(\overline {\boldsymbol {h}}\) is given by

$$ \widehat{\overline{\boldsymbol{h}}} = \overline{\boldsymbol{\Phi}} \boldsymbol{c}, $$

(36)

where \(\overline {\boldsymbol {\Phi }}=[\boldsymbol {\beta }_{1},~\boldsymbol {\beta }_{2},\dots,~\boldsymbol {\beta }_{4N_{t}N_{r}}]\), and the 4*N*_{t}*N*_{r}×1 vector *c* represents the ambiguity term to be estimated. The complex channel vector can then be obtained as

$$ \widehat{\boldsymbol{\check h}} = \boldsymbol{\Phi} \boldsymbol{c}, $$

(37)

where *Φ* is obtained by combining the rows of \(\overline {\boldsymbol {\Phi }}\) in the following way:

$$\begin{array}{@{}rcl@{}} \overline{\boldsymbol{\Phi}} = \left(\begin{array}{l} \overline{\boldsymbol{\Phi}}_{real} \\ \overline{\boldsymbol{\Phi}}_{imag} \end{array}\right) \rightarrow \boldsymbol{\Phi} = \overline{\boldsymbol{\Phi}}_{real} + j\overline{\boldsymbol{\Phi}}_{imag}, \end{array} $$

(38)

and *j* denotes the imaginary unit, *j*^{2}=−1.
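The recombination (36)–(38) amounts to reading the top half of \(\overline{\boldsymbol{\Phi}}\) as real parts and the bottom half as imaginary parts. A minimal consistency sketch (toy sizes; `c` is taken real here, matching the real-valued system (35)):

```python
import numpy as np

rng = np.random.default_rng(6)
k, d = 6, 2   # k: length of the complex channel vector, d: ambiguity dimension (toy sizes)

# A real basis matrix stacked as [real part; imaginary part], as in Eq. (36)
Phi_bar = rng.standard_normal((2 * k, d))
Phi = Phi_bar[:k] + 1j * Phi_bar[k:]   # Eq. (38): recombine rows into a complex matrix

c = rng.standard_normal(d)             # ambiguity vector (real, for illustration)
h_bar = Phi_bar @ c                    # stacked real/imag estimate, Eq. (36)
h = Phi @ c                            # complex estimate, Eq. (37)

# Consistency: the complex estimate matches the stacked real/imag one
assert np.allclose(h.real, h_bar[:k])
assert np.allclose(h.imag, h_bar[k:])
```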

We mention that the matrices *U*_{2} and *U*_{4} do not depend on the received signal and can therefore be computed offline, prior to transmission. It is also seen that overestimating the channel order *L* does not affect the estimation process, a property shared with other subspace-based estimators [17].