# Subspace-based self-interference cancellation for full-duplex MIMO transceivers

- Ahmed Masmoudi^{1} (corresponding author)
- Tho Le-Ngoc^{1}

**2017**:55

https://doi.org/10.1186/s13638-017-0839-x

© The Author(s) 2017

**Received:** 14 July 2016; **Accepted:** 23 February 2017; **Published:** 23 March 2017

## Abstract

This paper addresses self-interference (SI) cancellation at baseband for full-duplex MIMO communication systems, taking practical transmitter imperfections into account. In particular, we develop a subspace-based algorithm to jointly estimate the SI and intended channels and the nonlinear distortions. By exploiting the covariance and pseudo-covariance of the received signal, we can increase the dimension of the received signal subspace while keeping the dimension of the signal subspace constant, and hence the proposed algorithm can be applied to most full-duplex MIMO configurations with arbitrary numbers of transmit and receive antennas. The channel coefficients are estimated, up to an ambiguity term, without any knowledge of the intended signal. A joint detection and ambiguity identification scheme is proposed. Simulation results show that the proposed algorithm can properly estimate the channel with only one pilot symbol and offers superior SI cancellation performance.

## Keywords

- Full-duplex communication
- SI suppression
- MIMO
- Parameter estimation
- Subspace method
- Second-order statistics

## 1 Introduction

Half-duplex transmission is commonly used in current communication systems, which transmit and receive over orthogonal channels. Full-duplex communication represents an attractive alternative to save channel resources or to increase the transmission efficiency. The main deterrent to employing full-duplex is the large self-interference (SI) caused by simultaneous transmission and reception over the same frequency band. The SI is usually several orders of magnitude stronger than the intended signal received from the other transmitter, because the latter travels a longer distance than the former. Recent works have shown that, using different cancellation stages, the SI can be sufficiently suppressed to properly detect the intended signal [1, 2].

The SI is first cancelled at the radio-frequency (RF) level, prior to the low-noise amplifier (LNA) and the analog-to-digital converter (ADC), to avoid overloading/saturation of these devices [1–3]. In other words, the SI should be sufficiently suppressed at RF to maintain the receiver’s limited dynamic range. Then, further SI suppression can be done after the ADC at the baseband [4, 5]. In the following, we assume that a cancellation stage at RF is available and we concentrate on the SI cancellation in the baseband.

To further reduce the SI, channel state information of the interference link should be available. Therefore, estimating the SI channel is a critical issue in full-duplex systems. In [6], the SI channel estimation is performed in the frequency domain using a least square (LS) technique. LS and minimum mean square error (MMSE) channel estimations are proposed in [7] to estimate the SI channel in the relay station. However, these approaches ignore the intended signal coming from the other transceiver and treat it as additive noise. An adaptive least mean square algorithm to estimate the SI channel is proposed in [8] where the large SI compared to the intended signal and additive noise is exploited to obtain an estimate of the SI channel. A more elaborate LS-based estimator was presented in [9] where a first estimate of the SI channel is obtained by considering the intended signal as additive noise. Then an iterative detection of the intended signal and channel estimation is performed to obtain a better estimate of the channel. On the other hand, spatial domain cancellation attempts to reduce the SI by precoding at the transmit chain and decoding at the receive chain. Spatial domain cancellation is formulated in the frequency domain [10–12]. An alternative time domain formulation was presented in [13] by precoding the transmitted SI to coincide with the null space of the SI channel. These techniques are based on the knowledge of both the SI and intended channels at the two transceivers, which further motivates the development of channel estimators for full-duplex systems. A novel cancellation method is proposed in [14] by adding a cancelling signal to the original signal.

In addition to the SI channel information for SI cancellation, intended channel knowledge is an important prerequisite for signal detection. Motivated by this fact, channel estimation has been the subject of intense research. In the case of data-aided transmissions, training-based techniques can be applied [15, 16]. However, the amount of training increases dramatically with the number of antennas and channel order. Blind approaches have been proposed as more bandwidth-efficient techniques [17, 18], among which subspace methods, initially presented in [19], have great potential. By decomposing the covariance matrix of the received signal, subspace methods exploit the orthogonality between the noise and the signal subspaces in the observation space to express the channel coefficients as a linear combination of a basis of the signal subspace. Although previous research has shown the potential of this procedure to give an accurate estimate of the channel, it remains of limited practical interest. Indeed, since the noise subspace needs to be nondegenerate, it is legitimate to wonder how this condition can be satisfied. Previous works rely on oversampling of the received signal or on using more receive antennas than transmit antennas [20, 21]. However, such solutions increase the receiver cost and need additional hardware. Moreover, they may result in correlated noise, which makes the subspace technique inappropriate. A maximum likelihood estimator was presented in [22] by exploiting the pilots in the intended signal.

In the full-duplex context, the transmitter impairments, including power amplifier (PA) nonlinearity and IQ mixer imbalance, become limiting factors and need to be reduced to properly detect the intended signal. In practice, the in-band image resulting from the IQ mixer in a mobile user device is about 28 dB lower than the direct signal [23]. When the SI is about 50 dB stronger than the intended signal, this IQ image represents additional interference for the intended signal. The effects of transceiver impairments are illustrated in detail in [3, 24]. Due to the importance of the nonlinearities, a digital cancellation procedure has been proposed in [25] to reduce the effects of the PA by estimating its nonlinear coefficients, and another algorithm has been proposed to deal with the IQ mixer imbalance [26]. However, the existing literature does not discuss the intended signal, and treating it as additive noise limits the estimation performance.
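The two figures quoted above can be combined into a simple level budget. The sketch below is only an illustration of that arithmetic; the function name `image_to_intended_db` is hypothetical, and the 50 dB and 28 dB values are the ones given in the text.

```python
def image_to_intended_db(si_over_intended_db: float, image_rejection_db: float) -> float:
    """Level of the IQ image of the SI relative to the intended signal, in dB."""
    # The image sits image_rejection_db below the direct SI,
    # and the direct SI sits si_over_intended_db above the intended signal.
    return si_over_intended_db - image_rejection_db


margin_db = image_to_intended_db(50.0, 28.0)
linear_ratio = 10 ** (margin_db / 10)  # same margin as a power ratio
print(f"IQ image is {margin_db:.0f} dB above the intended signal "
      f"(power ratio of about {linear_ratio:.0f})")
```

Under these assumptions, the image alone remains well above the intended signal, which is why it cannot be ignored in the digital cancellation stage.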

In this work, we incorporate the intended signal in the estimation process. We also take into account the transmitter impairments when modelling the SI signal. For realistic multipath propagation channels, we need to estimate the SI channel, the intended channel and the distorted SI. Since the intended signal is unknown, we propose a novel subspace method to efficiently estimate the different parameters. Since the received signal consists of the SI and intended signals, the dimension of the signal subspace in full-duplex operation is at least twice that in traditional half-duplex operation [5, 27]. Thus, an essential shortcoming of the existing subspace-based technique is that it can be applied only when the number of receive antennas is larger than the number of transmit antennas. In the following, we circumvent this condition and develop a subspace-based algorithm suitable for MIMO full-duplex systems with larger or equal numbers of transmit and receive antennas. We exploit both the covariance and pseudo-covariance matrices of the received signal to effectively increase the dimension of the observation space while keeping the dimension of the signal subspace unchanged. The joint processing of the received signal and its complex conjugate has been used in many works to improve the detection performance of various systems [28, 29]. Also, in an entirely different context, the improper property of the received signal was first exploited for channel identification in [30] to obtain a virtual SIMO model from a SISO one. Preliminary results can be found in [31] for real-valued symbols, which enable the application of widely linear processing techniques but entail a loss in spectral efficiency compared to complex-valued symbols. In this paper, we propose a method to apply widely linear processing to complex symbols by forcing the transmit signal to be improper. We justify the advocated time domain approach, compare its performance to that of a frequency domain approach, and generalize the PA model to any nonlinearity order. In practice, we cannot blindly recover the channel coefficients since an ambiguity term always appears in the final estimate [5]. This ambiguity is resolved using a sequence of pilot symbols, considerably shorter than needed in training-based techniques. In the following, we propose a joint data detection and estimation of the ambiguity term to considerably reduce the length of the pilot sequence. We show through simulation that just one pilot symbol is sufficient to perfectly estimate the channel.

The paper is organized as follows. In Section 2, the full-duplex system model is presented. The subspace-based channel estimation is described in Section 3. In Section 4, we describe the joint decoding and ambiguity removal procedure. Illustrative simulation results are given in Section 5 and Section 6 presents the conclusion.

The notation used throughout the paper is as follows. Superscripts (·)^{∗}, (·)^{T}, and (·)^{H} refer to the conjugate, transpose and conjugate transpose of matrices or vectors, respectively. For a given vector x, diag(x) returns a diagonal matrix whose diagonal elements are the entries of x. rank(M) returns the rank of a given matrix M, det(M) returns the determinant of M and vect(M) stacks the columns of M into one vector. The operator ⊗ refers to the Kronecker product of two matrices. ℜ(·) and ℑ(·) return the real and imaginary parts of complex numbers. *E*(·) denotes the mathematical expectation. ||·||_{2} returns the Euclidean norm of a vector. I_{p} refers to the *p*×*p* identity matrix and 1_{p} to the *p*×1 vector with all elements equal to 1. A term accented by a hat, \(\widehat x\), denotes an estimate of *x*.

## 2 Full-duplex MIMO system model

We consider a full-duplex MIMO transceiver with *N*_{t} transmitting antennas and *N*_{r} receiving antennas. At transmitting antenna *q*, a group of *N* data symbols X_{q} = [*X*_{q}(0),…, *X*_{q}(*N*−1)]^{T} is first modulated by the IFFT matrix to form an OFDM block, then the time domain vector x_{q} = [*x*_{q}(0),…, *x*_{q}(*N*−1)]^{T} is extended by a cyclic prefix of length^{1} *N*_{cp} and the resulting vector is sent sequentially. In transmit stream *q*, the complex signal *x*_{q}(*t*), after the digital-to-analog conversion (DAC), is passed through an imbalanced IQ mixer whose output is as follows:

$$ x_{q}^{IQ}(t) = k_{1,q}\, x_{q}(t) + k_{2,q}\, x_{q}^{*}(t). \tag{1} $$

Here, *k*_{1,q} and *k*_{2,q} are the responses of the IQ mixer at antenna *q* to the direct signal and the image, respectively. Then, the signal is amplified by a nonlinear PA. In the following, we model the PA with a Hammerstein model whose response is:

$$ x_{q}^{PA}(t) = \left( \sum_{p=0}^{P} \alpha_{2p+1,q}\, x_{q}^{IQ}(t)\, \big|x_{q}^{IQ}(t)\big|^{2p} \right) \star f(t). \tag{2} $$

Here, *α*_{2p+1,q}, for *p*=0,…,*P*, are the nonlinearity coefficients of the PA at transmit antenna *q*, *P* is the nonlinearity order and *f*(*t*) is the memory of the PA. In (2), ⋆ denotes the convolution operator. The transmitted signal is coupled to produce SI in the receiver. Considering multipath channels, the received signal at antenna *r* is as follows:

Here, *s*_{q}(*t*) is the transmitted signal from the *q*^{th} antenna of the other intended transceiver. \(h_{r,q}^{c}(t)\) is the response of the SI channel from transmitting antenna *q* to receiving antenna *r* of the same transceiver. \(h_{r,q}^{s}(t)\) is the response of the intended channel from transmitting antenna *q* of the other intended transceiver to receiving antenna *r* of the same transceiver. *w*_{th,r}(*t*) is the additive thermal noise in Rx stream *r*. To reduce the SI before the LNA and ADC, the RF cancellation stage is performed as follows:

Here, *w*_{LNA}(*t*) is the additive noise caused by the LNA and *k*_{LNA} is the gain of the LNA. Finally, the received signal is adjusted by the variable gain amplifier (VGA) to match the dynamic range of the ADC. For simplicity, we suppose that the linear gains *k*_{1,q} and *α*_{1,q} of the IQ mixer and PA are equal to 1. Combining (2), (3) and (5), the received samples are given by

Here, \(x_{q,ip,p}(n)\) is the (2*p*+1)^{th}-order nonlinearity term and *w*_{r}(*n*) collects the thermal noise, the LNA noise and the quantization noise. In (6), the global channel responses are given by

The channel order is denoted by *L*, and the channels of order lower than *L* are zero-padded so that the different channels have the same order, with *L* still satisfying *L* < *N*_{cp}. The received vector \(\boldsymbol {y}(n)= [y_{1}(n),\dots,~y_{N_{r}}(n)]^{T}\) over the *N*_{r} antennas is given by

for *l*=0, 1,…,*L*, and \(\boldsymbol {w}(n) = [w_{1}(n),~w_{2}(n),\dots, w_{N_{r}}(n)]^{T}\). For a more compact representation, we gather the transmitted signals from the *N*_{t} antennas to obtain

Here, the *N*_{r}×*N*_{t} matrices H^{(i)}(*l*) and H^{(s)}(*l*) are given by

for *l*=0,…,*L*. We combine H^{(i)}(*l*) and H^{(s)}(*l*) in one *N*_{r}×2*N*_{t} matrix H(*l*) = [H^{(i)}(*l*), H^{(s)}(*l*)] and gather all the channel coefficients in the following *N*_{r}*M*×2*N*_{t}*N* block Toeplitz matrix:

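The received vector collected over the *N*_{r} antennas is:

$$ \boldsymbol{y} = \boldsymbol{H}\boldsymbol{u} + \boldsymbol{w}. \tag{14} $$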

Here, *M* = *N* + *L*, and the 2*N*_{t}*N*×1 data vector u is given by

For multi-block transmission, the received vector in (14) is indexed by the block number *t*, i.e., y_{t}. For convenience, we omit this index, and we will later consider a given number of transmitted blocks to compute the covariance matrix of the received vector.

## 3 Subspace-based channel estimator

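Denoting by R_{u} the covariance of u, the covariance matrix R_{y} of the received vector y is given by

$$ \boldsymbol{R}_{y} = E\left(\boldsymbol{y}\boldsymbol{y}^{H}\right) = \boldsymbol{H} \boldsymbol{R}_{u} \boldsymbol{H}^{H} + \sigma^{2} \boldsymbol{I}_{N_{r}M}, $$

as long as the signal samples are uncorrelated with the noise samples^{2}.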

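Provided that there exists an *l* ∈ [0, *L*] such that H(*l*) is full rank^{3}, the matrix H is a full-rank matrix. Therefore, the dimension of the signal subspace is 2*NN*_{t}. It follows that, to obtain a nondegenerate noise subspace, its dimension *N*_{r}*M*−2*N*_{t}*N* should be larger than zero, and thus the number of receiving antennas should be larger than the number of transmitting antennas for the subspace method to work. In [5], we developed the linear subspace algorithm for this setting. In the following, we develop the subspace-based algorithm for general numbers of transmit and receive antennas. When *N*_{t} = *N*_{r}, the matrix R_{y} cannot be directly used to find the noise subspace. As an alternative approach, we consider the augmented received vector

$$ \widetilde{\boldsymbol{y}} = \left(\begin{array}{l} \boldsymbol{y} \\ \boldsymbol{y}^{*} \end{array}\right) = \left(\begin{array}{ll} \boldsymbol{H} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{H}^{*} \end{array}\right) \left(\begin{array}{l} \boldsymbol{u} \\ \boldsymbol{u}^{*} \end{array}\right) + \left(\begin{array}{l} \boldsymbol{w} \\ \boldsymbol{w}^{*} \end{array}\right) = \widetilde{\boldsymbol{H}}\, \widetilde{\boldsymbol{u}} + \widetilde{\boldsymbol{w}}. $$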

It is worth mentioning that the proper noise has a vanishing pseudo-covariance [34]. The main purpose of using the extended received signal is to increase the dimension of the received signal and thus avoid a degenerate noise subspace. Hence, the subspace identification procedure can be derived only if the signal part, given by \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\), of the covariance matrix \(\boldsymbol {R}_{\widetilde y}\) is singular. It follows that \(d_{s} = \text {rank}(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}) < 2MN_{r}\). In this case, the signal is confined to a *d*_{s}-dimensional subspace and the remaining noise subspace has dimension 2*MN*_{r}−*d*_{s}. Singularity of \(\boldsymbol {R}_{\widetilde u}\) is a necessary condition to obtain a nondegenerate noise subspace. Indeed, noting that \(\widetilde {\boldsymbol {H}}\) is full rank, a nonsingular \(\boldsymbol {R}_{\widetilde u}\) results in \(\text {rank}(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}) = 2MN_{r}\), and thus the matrix \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) spans the whole observation space. On the other hand, since the matrix \(\widetilde {\boldsymbol {H}}\) is a tall matrix, singularity of \(\boldsymbol {R}_{\widetilde u}\) is not a sufficient condition to guarantee the singularity of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\).

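The covariance matrix \(\boldsymbol {R}_{\widetilde u}\) can be expressed using the covariance matrix R_{u} = *E*(u u^{H}), the pseudo-covariance matrix C_{u} = *E*(u u^{T}) and their complex conjugates as

$$ \boldsymbol{R}_{\widetilde u} = \left(\begin{array}{ll} \boldsymbol{R}_{u} & \boldsymbol{C}_{u} \\ \boldsymbol{C}_{u}^{*} & \boldsymbol{R}_{u}^{*} \end{array}\right). $$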

In the following, we distinguish two cases of real and complex modulated symbols.

*N*×2*N* matrix M having the following form:

From (22), we note that each column of M appears exactly two times (the first column of M is the same as the (*N*+1)^{th} column, and the *i*^{th} column of M is the same as the (2*N*−*i*+2)^{th} column, for *i*=2,…,*N*). Therefore, the matrix M has exactly *N* independent columns and thus its rank is *N*. It follows that the rank of \(\boldsymbol {R}_{\widetilde u}\) is 2*NN*_{t}. In Appendix 1, we show that \(\boldsymbol {R}_{\widetilde u}\) has a zero eigenvalue with multiplicity 2*NN*_{t} and the eigenvalue 2*α*^{2}, also with multiplicity 2*NN*_{t}. Then, the matrix \(\boldsymbol {R}_{\widetilde u}\) is decomposed as UDU^{H}, where D is the 4*NN*_{t}×4*NN*_{t} diagonal matrix with zeroes in the first 2*NN*_{t} diagonal elements and 2*α*^{2} in the last 2*NN*_{t} diagonal elements, and U is an orthogonal matrix whose columns are the corresponding eigenvectors of \(\boldsymbol {R}_{\widetilde u}\).

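For complex modulated symbols, the pseudo-covariance matrix C_{u} is generally equal to the zero matrix, which makes the matrix \(\boldsymbol {R}_{\widetilde u}\) full rank. To avoid this problem, we apply a simple precoding at the input of the IFFT. It transforms the data symbol X_{q} to

$$ \widetilde{\boldsymbol{X}}_{q} = \boldsymbol{P} \boldsymbol{X}_{q} + \boldsymbol{Q} \boldsymbol{X}_{q}^{*}, $$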

where P and Q are two matrices. By combining the data symbol X_{q} and its complex conjugate, we force the pseudo-covariance matrix to be different from zero. Appendix 2 gives a detailed discussion about the choice of the matrices P and Q so that the covariance matrix \(\boldsymbol {R}_{\widetilde u}\) has rank 2*NN*_{t} and can be decomposed as UDU^{H}, with D the 4*NN*_{t}×4*NN*_{t} diagonal matrix with zeroes in the first 2*NN*_{t} diagonal elements.

The noise subspace is spanned by the *p* = 2*MN*_{r}−2*NN*_{t} eigenvectors of \(\boldsymbol {R}_{\widetilde y}\) corresponding to the smallest eigenvalue *σ*^{2}, and the columns of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) belong to the signal subspace. Due to the orthogonality between the signal and the noise subspaces, each column of \(\widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\) is orthogonal to any vector in the noise subspace. Let \(\{\boldsymbol {\nu }_{i}\}_{i=1}^{p}\) denote the *p* mutually orthogonal eigenvectors corresponding to the smallest eigenvalue of \(\boldsymbol {R}_{\widetilde y}\). Then we have the following set of equations:

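That is, the set of vectors ν_{i} spans the left null space of \( \widetilde {\boldsymbol {H}} \boldsymbol {R}_{\widetilde u} \widetilde {\boldsymbol {H}}^{H}\). For convenience, U is written as a block of four 2*NN*_{t}×2*NN*_{t} matrices:

$$ \boldsymbol{U} = \left(\begin{array}{ll} \boldsymbol{U}_{1} & \boldsymbol{U}_{2} \\ \boldsymbol{U}_{3} & \boldsymbol{U}_{4} \end{array}\right). $$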

Partitioning ν_{i} into two *MN*_{r}×1 vectors, i.e., \(\boldsymbol {\nu }_{i} = [\boldsymbol {\nu }_{i,1}^{T},~\boldsymbol {\nu }_{i,2}^{T}]^{T}\), (26) is rewritten as

for *i*=1, 2,…,*p*. The matrix H is completely defined by the set of matrices H(*l*), for *l*=0, 1,…,*L*. Therefore, the specific structure of H should be taken into consideration when solving the equations in (27) to obtain a more accurate estimate of the channels. To that end, we divide the two vectors ν_{i,1} and ν_{i,2} as follows:

Here, ν_{i,j}(*n*), for *n*=1, 2,…,*M*, is an *N*_{r}×1 vector. From (13) and (28), each term \(\boldsymbol {\nu }_{i,1}^{H} \boldsymbol {H}\) in (27) is rewritten as

For *i*=1,…,*p* and *j*=1, 2, it is easy to verify that \(\boldsymbol {\nu }_{i,j}^{H}(n) \boldsymbol {H}(l) = \boldsymbol {\check h}^{T}(l) \boldsymbol {V}_{i,j}^{T}(n)\). Let us denote the 2*NN*_{t}×2*N*_{t}*N*_{r}(*L*+1) matrices V_{i,j}, for *j*=1, 2, as

for *i*=1, 2,…,*p*. Note that the difference between (27) and (32) is that (32) takes into account the block Toeplitz structure of H. Now, collecting all the previous equations, we obtain

The solution is obtained from the 4*N*_{t}*N*_{r} right singular vectors of the matrix \(\boldsymbol {\overline \Theta }\), denoted by β_{i}, which are equal to the eigenvectors of the Gramian \(\overline {\boldsymbol {\Theta }}\overline {\boldsymbol {\Theta }}^{H}\) corresponding to the zero eigenvalue. Therefore, an estimate of \(\overline {\boldsymbol {h}}\) is given by

Here, the 4*N*_{t}*N*_{r}×1 vector c represents the ambiguity term to be estimated. The complex channel vector can also be obtained as

and *j* is the complex number satisfying *j*^{2}=−1.

We mention that the matrices U_{2} and U_{4} do not depend on the received signal and can be computed offline prior to the transmission. It is also seen that overestimating the channel order *L* does not affect the estimation process. This property is shared with other subspace-based estimators [17].

## 4 Resolving the ambiguity term

The matrix Φ is divided into two *N*_{t}*N*_{r}(*L*+1)×4*N*_{t}*N*_{r} matrices Φ_{i} and Φ_{s}, which contribute to the SI and intended channels, respectively (i.e., \(\boldsymbol {\check h}^{(i)} = \boldsymbol {\Phi }_{i} \boldsymbol {c}\) and \(\boldsymbol {\check h}^{(s)} = \boldsymbol {\Phi }_{s} \boldsymbol {c}\)). By rearranging the elements of Φ_{i} as

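where Φ_{i,q}(*l*) is an *N*_{r}×4*N*_{t}*N*_{r} matrix, \(\boldsymbol {\check H}^{(i)} = [{\boldsymbol {H}^{(i)}}^{T}(0),~{\boldsymbol {H}^{(i)}}^{T}(1),\dots,~{\boldsymbol {H}^{(i)}}^{T}(L)]^{T}\) can be written as

$$ \boldsymbol{\check H}^{(i)} = \boldsymbol{\check \Phi}_{i} \left(\boldsymbol{I}_{N_{t}} \otimes \boldsymbol{c}\right), $$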

and \(\boldsymbol {\check H}^{(s)} = [{\boldsymbol {H}^{(s)}}^{T}(0),~{\boldsymbol {H}^{(s)}}^{T}(1),\dots,~{\boldsymbol {H}^{(s)}}^{T}(L)]^{T}\) can also be written as \(\boldsymbol {\check H}^{(s)} = \boldsymbol {\check \Phi }_{s} (\boldsymbol {I}_{N_{t}} \otimes \boldsymbol {c}),\) where \(\boldsymbol {\check \Phi }_{s}\) is defined in the same way as \(\boldsymbol {\check \Phi }_{i}\). \(\boldsymbol {\check H}^{(i)}\) and \(\boldsymbol {\check \Phi }_{i}\) are used to build the matrices H^{(i)} and Ψ_{i}, respectively, which have the same block structure as H in (13).

Let K and A_{p} denote the diagonal matrices whose diagonal elements are \(\boldsymbol {k} = [k_{2,1},\dots,~k_{2,N_{t}}]^{T}\) and \(\boldsymbol {\alpha }_{p} = [\alpha _{2p+1,1},\dots,~\alpha _{2p+1,N_{t}}]^{T}\), respectively, and let \(\boldsymbol {x}_{ip,p}(n) = [ x_{1,ip,p}(n),\dots,~ x_{N_{t},ip,p}(n)]^{T}\) and \(\boldsymbol {x}_{ip,p} = [\boldsymbol {x}_{ip,p}^{T}(0),\dots,~\boldsymbol {x}_{ip,p}^{T}(N-1)]^{T}\). Using the previous notations and by developing \(\boldsymbol {x} = \boldsymbol {x}_{i} + (\boldsymbol {I}_{N} \otimes \boldsymbol {K}) \boldsymbol {x}^{*}_{i} + \sum _{p=1}^{P} (\boldsymbol {I}_{N} \otimes \boldsymbol {A}_{p}) \boldsymbol {x}_{ip,p}\) in terms of the transmitter impairments, one can express the received signal in (14) as

Here, Ψ_{s} and H^{(s)} are defined in the same way as Ψ_{i} and H^{(i)}, respectively, and s = [s^{T}(0),…, s^{T}(*N*−1)]^{T}. After some manipulations, one can easily verify that \((\boldsymbol {I}_{NN_{t}} \otimes \boldsymbol {c}) \boldsymbol {x}_{i} = (\boldsymbol {x}_{i} \otimes \boldsymbol {I}_{4N_{t}N_{r}}) \boldsymbol {c}\) and \((\boldsymbol {I}_{NN_{t}} \otimes \boldsymbol {c}) \boldsymbol {s} = (\boldsymbol {s} \otimes \boldsymbol {I}_{4N_{t}N_{r}}) \boldsymbol {c}\). Then, the received vector in (41) is rewritten as

The components (I_{N}⊗A_{p})x_{ip,p} and \((\boldsymbol {I}_{N} \otimes \boldsymbol {K}) \boldsymbol {x}_{i}^{*}\) of the SI resulting from the cascade of the IQ mixer and PA need to be estimated. We begin by writing the following cost function \(f(\boldsymbol {c},\boldsymbol {s},\boldsymbol {K},\boldsymbol {A}_{p}) = ||\boldsymbol {y} - \boldsymbol {\Psi }_{i} ((\boldsymbol {x}_{i} + (\boldsymbol {I}_{N} \otimes \boldsymbol {K})\boldsymbol {x}_{i}^{*} + \sum _{p=1}^{P}(\boldsymbol {I}_{N} \otimes \boldsymbol {A}_{p}) \boldsymbol {x}_{ip,p}) \otimes \boldsymbol {I}_{4N_{t}N_{r}}) \boldsymbol {c} - \boldsymbol {\Psi }_{s} (\boldsymbol {s} \otimes \boldsymbol {I}_{4N_{t}N_{r}}) \boldsymbol {c}||^{2}\), which depends on c, K, A_{p} (for *p*=1,…,*P*) and s. Given an initial estimate \(\widehat {\boldsymbol {c}}\) of c, the minimization of \(f(\widehat {\boldsymbol {c}},\boldsymbol {s},\boldsymbol {K},\boldsymbol {A}_{p})\) with respect to s, K and A_{p} can be recast as a least squares (LS) problem. Then, using the solutions \(\widehat {\boldsymbol {s}}\), \(\widehat {\boldsymbol {K}}\) and \(\widehat {\boldsymbol {A}}_{p}\), we minimize \(f(\boldsymbol {c},\widehat {\boldsymbol {s}},\widehat {\boldsymbol {K}},\widehat {\boldsymbol {A}}_{p})\) with respect to c. We iterate this procedure until the estimated parameters converge. An initial estimate of c is obtained using the LS criterion as

Here, (·)^{#} returns the pseudo-inverse of a given matrix. At the *k*^{th} iteration, the estimate \(\widehat {\boldsymbol {c}}_{k-1}\) obtained at the previous iteration is used to find s, K and A_{p} (or equivalently k and α_{p}) as follows:

where, for clarity, we introduce \(\boldsymbol {B} = \boldsymbol {1}_{N} \otimes \boldsymbol {I}_{N_{t}}\) and \(\widehat {\boldsymbol {C}}_{k-1} = \boldsymbol {I}_{NN_{t}} \otimes \widehat {\boldsymbol {c}}_{k-1}\), and we use the equality \(\Big (\big ((\boldsymbol {I}_{N} \otimes \boldsymbol {K})\boldsymbol {x}_{i}^{*} \big) \otimes \boldsymbol {I}_{4N_{t}N_{r}} \Big) \boldsymbol {c} = \Big (\big (\text {diag}(\boldsymbol {x}_{i}^{*})\boldsymbol {B} \big) \otimes \boldsymbol {c} \Big) \boldsymbol {k}\). Then, \(\widehat {\boldsymbol {s}}_{k}\) is transformed to the frequency domain and each element of the frequency domain vector is projected onto its closest discrete constellation point. The obtained vector is converted back to the time domain to obtain a better estimate \(\widetilde {\boldsymbol {s}}_{k}\) of s.
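The projection step described above (time domain to frequency domain, hard decision per subcarrier, back to the time domain) can be sketched as follows. This is a minimal illustration for a single stream with a unitary DFT and a unit-power 4-QAM constellation; the function names and the small test perturbation are assumptions made for the example.

```python
import cmath

QAM4 = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]


def dft(x, inverse=False):
    """Unitary DFT (or IDFT) of a list of complex samples."""
    N = len(x)
    sign = 1 if inverse else -1
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / N) for n in range(N))
            / N ** 0.5 for k in range(N)]


def project_to_constellation(s_time, constellation=QAM4):
    """Hard decision on every subcarrier, then conversion back to the time domain."""
    S = dft(s_time)
    S_hard = [min(constellation, key=lambda c: abs(c - v)) for v in S]
    return dft(S_hard, inverse=True), S_hard


# Slightly perturbed time-domain estimate of an OFDM block carrying 4-QAM symbols.
S_true = [QAM4[i % 4] for i in range(8)]
s_noisy = [v + 0.05 * (-1) ** n for n, v in enumerate(dft(S_true, inverse=True))]
s_clean, S_detected = project_to_constellation(s_noisy)
print(S_detected == S_true)
```

As long as the estimation error on each subcarrier stays inside the decision region, the projection removes it completely, which is what makes the refined estimate more useful for the next update of c.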

The estimate of c at iteration *k* is then obtained as:

When *P*_{pilot} pilot symbols are available at subcarriers indexed by \(\mathcal {P}=\{ p_{1},\dots,~p_{P_{\text {pilot}}}\}\), the intended transmit signal at antenna *q* can be represented as the sum of two signals:

Here, s^{p} and s^{d} are constructed in the same way as s and contain the pilot symbols and unknown symbols, respectively. The initial estimate of c is modified to incorporate the pilot symbols as

The estimates of s^{d}, K and A_{p} at iteration *k* are given by

and the estimate of c at iteration *k* is obtained as:

1. Compute the augmented covariance matrix \(\boldsymbol {R}_{\widetilde y}\) by time averaging over *T* received vectors as: $$ \widehat{\boldsymbol{R}}_{\widetilde y} = \frac{1}{T} \sum_{t=1}^{T} \left(\begin{array}{l} \boldsymbol{y}_{t} \\ \boldsymbol{y}^{*}_{t} \end{array}\right) \left(\begin{array}{l} \boldsymbol{y}_{t} \\ \boldsymbol{y}^{*}_{t} \end{array}\right)^{H} $$
2. Perform the eigendecomposition of \(\boldsymbol {R}_{\widetilde y}\) and take the *p* eigenvectors ν_{i} corresponding to the smallest eigenvalue of \(\boldsymbol {R}_{\widetilde y}\).
3. Construct the matrix \(\boldsymbol {\overline \Theta }\) from ν_{i} and compute the 4*N*_{t}*N*_{r} singular vectors of \(\boldsymbol {\overline \Theta }\) corresponding to the zero singular value to form \(\overline {\boldsymbol {\Phi }}\).
4. Build the matrices \(\boldsymbol {\check \Phi _{i}}\) and \(\boldsymbol {\check \Phi _{s}}\) as given in (39).
5. Estimate the ambiguity vector c by iterating between (44) and (45) if no pilot symbols are available, or between (49) and (50) if a set of pilot symbols is available from the intended transceiver.
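Step 1 of the procedure above can be sketched directly from its formula. The function name `augmented_covariance` and the three short example vectors are hypothetical; the sketch only shows the time averaging of the stacked vector [y; y^{∗}] and the block structure that results.

```python
def augmented_covariance(blocks):
    """Sample covariance of [y; conj(y)] averaged over the received vectors."""
    T = len(blocks)
    dim = 2 * len(blocks[0])
    R = [[0j] * dim for _ in range(dim)]
    for y in blocks:
        y_aug = list(y) + [v.conjugate() for v in y]   # stack y and its conjugate
        for i in range(dim):
            for j in range(dim):
                R[i][j] += y_aug[i] * y_aug[j].conjugate() / T
    return R


# Three hypothetical received vectors of length 2 (T = 3).
blocks = [[1 + 1j, 2 - 1j], [0.5j, -1 + 0j], [-1 - 2j, 1j]]
R_aug = augmented_covariance(blocks)

# Upper-left block: sample covariance; upper-right block: sample pseudo-covariance.
print(R_aug[0][0], R_aug[0][2])
```

The lower-right block is the complex conjugate of the upper-left one, so in an implementation only the covariance and pseudo-covariance blocks need to be accumulated.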

## 5 Simulation results

In this section, we provide some simulation results on the performance of the proposed estimation algorithm for a 2×2 MIMO full-duplex system. The transmitted bits are mapped to 4-QAM symbols, then passed through an OFDM modulator of length *N*=64. The wireless channel is represented as a Rayleigh multipath fading channel with five equal-variance resolvable paths. Since the exact number of paths is assumed to be unknown, the algorithm is parametrized as if there were eight paths. In the following, the SNR is defined as the average intended-signal-to-thermal-noise power ratio and the estimation mean square error (MSE) of H is \(\textrm {MSE} = E\Big (||\boldsymbol {H} - \widehat {\boldsymbol {H}}||^{2}\Big).\) To model the RF impairments, a complete transmission chain is simulated. The PA coefficients are derived from the intercept points by taking IIP3 = 20 dBm. For the IQ mixer, the ratio between the direct signal and the image is set to 28 dB, as specified in the 3GPP LTE specifications [23]. The ADC is modelled as a 14-bit quantizer to incorporate the quantization noise. Therefore, no simplifications are made regarding the different impairments. Antenna separation can attenuate the SI by 40 dB while the RF cancellation stage reduces the direct path by 30 dB [1], leaving the weaker reflections and transceiver impairments to be reduced by the proposed digital algorithm.
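The MSE defined above is evaluated in practice by averaging the squared error over independent trials. The sketch below shows one way to do this and to report the result in dB; the function name `empirical_mse_db` and the two toy channel realizations are assumptions made for the example, not values from the simulations.

```python
import math


def empirical_mse_db(true_channels, estimates):
    """Average squared error ||h - h_hat||^2 over the trials, expressed in dB."""
    total = 0.0
    for h, h_hat in zip(true_channels, estimates):
        total += sum(abs(a - b) ** 2 for a, b in zip(h, h_hat))
    return 10 * math.log10(total / len(true_channels))


# Two hypothetical trials, each with a 2-tap channel and its estimate.
h_true = [[1 + 1j, 0.5j], [1 + 1j, 0.5j]]
h_est = [[1.1 + 1j, 0.5j], [1 + 0.9j, 0.5j]]
mse_db = empirical_mse_db(h_true, h_est)
print(f"MSE = {mse_db:.1f} dB")
```

Each trial here contributes a squared error of 0.01, so the average corresponds to an MSE of about −20 dB.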

*self* signal and the pilot symbols in the intended signal. It simply considers the unknown symbols as additive noise. The ML estimate is obtained by maximizing the following cost function:

where \(\boldsymbol {R} = \alpha ^{2} {\boldsymbol {H}^{(s)}}^{H}\boldsymbol {H}^{(s)} + \sigma ^{2} \boldsymbol {I}_{N_{r}M}\). An iterative procedure to find the ML estimate was proposed in [35]. The covariance matrix is obtained by averaging over 60 OFDM blocks. Figures 2 and 3 plot the MSE vs. SNR curves for the SI and intended channel estimates, respectively. In both figures, one pilot symbol from the intended transceiver is used to resolve the ambiguity matrix. For comparison purposes, a perfect estimate of the ambiguity term \(\boldsymbol {c}\) is obtained as \(\boldsymbol {c}_{perfect} = \arg \min _{\boldsymbol {c}}||\boldsymbol {\check h} - \boldsymbol {\Phi }\boldsymbol {c}||_{2}^{2}\), and the corresponding curves are labelled "clairvoyant subspace". When one pilot symbol is used in the ML and LS estimators, the proposed subspace algorithm offers notably lower MSE over a large SNR range. We also show the performance of the ML and LS estimators when 20% of the transmit symbols are known (pilot symbols equally spaced within one OFDM symbol) while keeping one pilot symbol for the subspace method^{4}. In this case, the three algorithms give comparable performance in the low-SNR region, at the expense of lower bandwidth efficiency. As the SNR increases, the performance of the LS and ML estimators saturates due to the reduced number of pilot symbols and the presence of the unknown transmit signal from the intended transceiver, which acts as additive noise, whereas the subspace algorithm exploits the information carried by the unknown data to find the signal subspace. The ambiguity term is first resolved using the known transmit symbols; the iterative decoding and ambiguity estimation is then applied to improve the estimation performance. From Figs. 2 and 3, three to four iterations are sufficient for convergence, and the resulting performance is close to that obtained when the ambiguity term \(\boldsymbol {c}\) is perfectly known.
Note that the ML solution is also obtained iteratively; for a fair comparison, we simulate the performance of the ML estimator after four iterations. As can be expected, the estimate of the SI channel is more accurate than the estimate of the intended channel. This can be explained by the fact that the self-signal is fully known, while only one pilot symbol is known in the intended signal.
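The clairvoyant reference \(\boldsymbol {c}_{perfect}\) used above is an ordinary least-squares fit of the true channel vector onto the columns of \(\boldsymbol {\Phi }\). The following is a minimal sketch in plain Python, not the authors' implementation: the toy basis `phi`, the vector `c_true` and the helper names (`solve_ls`, `solve_square`, `matmul`, `conj_transpose`) are all assumptions introduced for illustration, and in practice a library least-squares routine would normally replace the hand-written solver.

```python
def conj_transpose(a):
    """Return the conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve_square(a, y):
    """Solve a x = y by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def solve_ls(phi, h):
    """Least-squares c minimizing ||h - phi c||^2 via the normal equations."""
    phi_h = conj_transpose(phi)
    gram = matmul(phi_h, phi)                                   # Phi^H Phi
    rhs = [row[0] for row in matmul(phi_h, [[v] for v in h])]   # Phi^H h
    return solve_square(gram, rhs)


# Toy example (hypothetical values): a 4x2 basis and a channel vector
# that lies exactly in its span, so the fit should recover c_true.
phi = [[1 + 0j, 0j], [0j, 1 + 0j], [1j, 1 + 0j], [2 + 0j, -1j]]
c_true = [0.5 - 0.2j, -1.0 + 0.3j]
h_true = [sum(phi[i][k] * c_true[k] for k in range(2)) for i in range(4)]
c_perfect = solve_ls(phi, h_true)
```

Because `h_true` is built from `phi` without noise, `c_perfect` matches `c_true` up to floating-point error; with a noisy or mismatched basis the same call would return the closest point in the span of \(\boldsymbol {\Phi }\).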

\(P_{pilot}\) combinations from \(N\) subcarriers and hence leads to an NP-hard problem, which is beyond the scope of this paper and is left for future work. It can be seen from these figures that the subspace method is not greatly affected by the number of pilot symbols, since the subspaces are obtained using the second-order statistics of the received signal and not the transmit signal itself. Clearly, the proposed algorithm outperforms the ML and LS estimators for a small number of pilots, while this tendency is reversed as the number of pilots increases. However, a system with a large number of pilot symbols is not of practical interest.

\(f_{3dB}\) for SNR = 20 dB and common oscillator at the transmitter and the receiver. The residual SI depends obviously on the quality of the oscillator represented by its \(f_{3dB}\). Higher \(f_{3dB}\) results in a fast varying process. Clearly, the proposed method still offers good cancellation performance, which is degraded as \(f_{3dB}\) increases.
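To illustrate why a larger \(f_{3dB}\) gives a faster-varying process, the sketch below assumes the common Wiener (random-walk) phase-noise model for a free-running oscillator, in which each phase increment is Gaussian with variance \(4\pi f_{3dB} T_{s}\). This model, its constant (which depends on how \(f_{3dB}\) is defined), the sampling rate and the function names are assumptions made for illustration and are not taken from this section.

```python
import math
import random


def increment_variance(f_3db_hz, sample_period_s):
    """Variance of one phase step in the assumed Wiener model: 4*pi*f_3dB*Ts."""
    return 4.0 * math.pi * f_3db_hz * sample_period_s


def simulate_phase_noise(f_3db_hz, sample_period_s, num_samples, seed=0):
    """Return a random-walk phase trajectory phi(n) in radians, with phi(0) = 0."""
    rng = random.Random(seed)
    sigma = math.sqrt(increment_variance(f_3db_hz, sample_period_s))
    phase = [0.0]
    for _ in range(num_samples - 1):
        phase.append(phase[-1] + rng.gauss(0.0, sigma))
    return phase


# Hypothetical numbers: 20 MHz sampling and two oscillator qualities.
ts = 1.0 / 20e6
slow = simulate_phase_noise(10.0, ts, 1024)
fast = simulate_phase_noise(1000.0, ts, 1024)
```

Under this assumption the per-sample variance grows linearly with \(f_{3dB}\), so the phase drifts further between two channel estimates and a larger residual is left after cancellation.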

*perfect* cancellation, the resulting SINR after cancellation would be the SNR = 20 dB. A lower IIP3 indicates higher PA distortions (or poorer PA) and hence reduces the resulting SINR after cancellation. Figure 12 shows that as the IIP3 value increases, the cancellation performance is improved. However, for a sufficiently high IIP3 (e.g., 18 dBm or higher), the PA distortions are no longer dominant and the resulting SINR after cancellation is unchanged. This can be explained by the fact that, when developing the algorithm, the third-order component of the signal \(x_{q,ip3}(n) = x_{q}^{IQ}(n)|x_{q}^{IQ}(n)|^{2}\) is approximated by \(x_{q}(n)|x_{q}(n)|^{2}\) to simplify the algorithm. This approximation only affects the algorithm performance when the nonlinear coefficients are sufficiently high.
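The size of the error introduced by replacing \(x_{q}^{IQ}(n)|x_{q}^{IQ}(n)|^{2}\) with \(x_{q}(n)|x_{q}(n)|^{2}\) can be checked numerically. The sketch below assumes a widely-linear IQ mixer model \(x^{IQ} = g_{1}x + g_{2}x^{*}\); the coefficients `g1` and `g2`, their values and the test signal are hypothetical and chosen only for illustration.

```python
import cmath
import random


def third_order(x):
    """Third-order term x * |x|^2 of a complex baseband sample."""
    return x * abs(x) ** 2


def iq_imbalance(x, g1, g2):
    """Assumed widely-linear IQ mixer model: g1 * x + g2 * conj(x)."""
    return g1 * x + g2 * x.conjugate()


def relative_error(samples, g1, g2):
    """Relative energy of the error made by using x|x|^2 for x_IQ|x_IQ|^2."""
    exact = [third_order(iq_imbalance(x, g1, g2)) for x in samples]
    approx = [third_order(x) for x in samples]
    err = sum(abs(e - a) ** 2 for e, a in zip(exact, approx))
    ref = sum(abs(e) ** 2 for e in exact)
    return err / ref


rng = random.Random(1)
# Unit-modulus samples with random phases (illustrative test signal only).
samples = [cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi)) for _ in range(256)]
mild = relative_error(samples, g1=1.0 + 0j, g2=0.01 + 0j)
severe = relative_error(samples, g1=1.0 + 0j, g2=0.10 + 0j)
```

The mismatch vanishes when `g2` is zero and grows with the image coefficient. In the received signal it is further scaled by the PA's third-order coefficient, which is consistent with the observation that the approximation only matters when the nonlinear coefficients are large.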

## 6 Conclusions

In this paper, a subspace-based algorithm has been proposed to jointly estimate the SI channel, the intended channel and the transmitter impairments for MIMO full-duplex systems. By exploiting the covariance and pseudo-covariance matrices of the received signal, an effective way has been formulated to apply the subspace method to symmetric MIMO systems. The complete characterization of the second-order statistics of the received signal avoids the need for oversampling, which is required in traditional subspace methods. The subspace that contains the channels is blindly estimated, and only a short pilot sequence is needed to extract the channel coefficients from this subspace. The proposed method dramatically reduces the number of pilot symbols needed to identify the channel coefficients. Simulation results show that one pilot symbol is enough to obtain an accurate estimate, whereas the ML and LS estimators are not able to recover the channel with so few pilots.

## 7 Endnotes

^{1} The length of the cyclic prefix \(N_{cp}\) should be larger than the delay spread of the channel to eliminate the inter-symbol interference and inter-carrier interference. Therefore, if we know the length of the channel, we can set the cyclic prefix to be sufficiently large to satisfy \(N_{cp} > L\). Since this information is in general not available, \(N_{cp}\) is chosen to guarantee \(N_{cp} > L\). For example, if the distance between the two transceivers is 1 km, a cyclic prefix of 4 μs is sufficient.

^{2} Physically, the additive noise arises from the thermal agitation of the charge carriers in an electronic device and is independent from the input. It can also contain interference from other systems whose signals are independent from the transmit signal of the considered system.

^{3} The previous condition is verified for independent channels between different antennas.

^{4} The pilot symbols are equally spaced within one OFDM symbol.
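The numerical example in endnote 1 can be checked with a few lines of arithmetic. The sketch below reads the 1 km figure as a bound on the line-of-sight propagation delay, which is an interpretation rather than something stated in the endnote; the 20 MHz sampling rate and the function names are likewise assumptions.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def propagation_delay_us(distance_m):
    """Line-of-sight propagation delay in microseconds."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S * 1e6


def min_cp_samples(delay_us, sample_rate_hz):
    """Smallest integer N_cp whose duration strictly exceeds the delay."""
    return int(delay_us * 1e-6 * sample_rate_hz) + 1


delay = propagation_delay_us(1000.0)      # about 3.34 microseconds for 1 km
n_cp = min_cp_samples(delay, 20e6)        # hypothetical 20 MHz sampling rate
```

A 1 km path delays the signal by roughly 3.34 μs, which is below the 4 μs cyclic prefix quoted in the endnote.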

## 8 Appendix 1: Eigenvalues of \(\boldsymbol {R}_{\widetilde u}\)

\(N\), then it has \(N\) strictly positive eigenvalues, \(\tau_{1}, \tau_{2}, \ldots, \tau_{N}\), and the eigenvalue 0 with multiplicity \(N\). Since the covariance matrix \(\boldsymbol {R}_{\widetilde u}\) is given by \(\alpha ^{2} \boldsymbol {M} \otimes \boldsymbol {I}_{2N_{t}}\), it follows that \(\boldsymbol {R}_{\widetilde u}\) also has \(N\) non-zero eigenvalues, \(\alpha^{2}\tau_{1}, \alpha^{2}\tau_{2}, \ldots, \alpha^{2}\tau_{N}\), each of multiplicity \(2N_{t}\), and the eigenvalue 0 with multiplicity \(2NN_{t}\). To find the non-zero eigenvalues, we solve the characteristic polynomial of \(\boldsymbol {M}\) given by

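First, if \(\tau = 1\) is an eigenvalue of \(\boldsymbol {M}\), then there exists a vector \(\boldsymbol {a} \neq \boldsymbol {0}\) such that \(\boldsymbol {M}\boldsymbol {a} - \boldsymbol {a} = \boldsymbol {0}\). It follows that \(a(1) = a(2) = \cdots = a(2N) = 0\), which contradicts \(\boldsymbol {a} \neq \boldsymbol {0}\). Therefore, 1 is not an eigenvalue of \(\boldsymbol {M}\).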

\(\tau \neq 1\), is written as

where we used the fact that \(\boldsymbol {M}_{1,2}\boldsymbol {M}_{1,2} = \boldsymbol {I}_{N}\). Then, the solutions to \(\det (\boldsymbol {M} - \tau \boldsymbol {I}_{2N}) = 0\) are 0 and 2. Therefore, all non-zero eigenvalues of \(\boldsymbol {M}\) are equal to 2 and thus all the non-zero eigenvalues of \(\boldsymbol {R}_{\widetilde u}\) are equal to \(2\alpha ^{2}\).
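For readers who want the intermediate step, the following is a hedged reconstruction of the determinant computation. It assumes that \(\boldsymbol {M}\) has the symmetric \(2 \times 2\) block form shown on the left, a structure inferred from the identity \(\boldsymbol {M}_{1,2}\boldsymbol {M}_{1,2} = \boldsymbol {I}_{N}\) and from the \(\tau = 1\) argument rather than quoted from the original equations. For \(\tau \neq 1\), the block-determinant (Schur complement) formula then gives:

```latex
\boldsymbol{M} =
\begin{pmatrix}
  \boldsymbol{I}_{N} & \boldsymbol{M}_{1,2} \\
  \boldsymbol{M}_{1,2} & \boldsymbol{I}_{N}
\end{pmatrix},
\qquad
\det(\boldsymbol{M} - \tau \boldsymbol{I}_{2N})
  = \det\big((1-\tau)\boldsymbol{I}_{N}\big)\,
    \det\Big((1-\tau)\boldsymbol{I}_{N}
      - \tfrac{1}{1-\tau}\,\boldsymbol{M}_{1,2}\boldsymbol{M}_{1,2}\Big)
  = \big((1-\tau)^{2} - 1\big)^{N}
  = \tau^{N}(\tau - 2)^{N}.
```

Under this assumption the roots are \(\tau = 0\) and \(\tau = 2\), each with multiplicity \(N\), which agrees with the rank-\(N\) statement and with the value \(2\alpha ^{2}\) obtained for the non-zero eigenvalues of \(\boldsymbol {R}_{\widetilde u}\).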

## 9 Appendix 2: Precoding for complex modulation

\(a\), \(b\), \(c\) and \(d\). Similarly to the real modulation, we have \(\boldsymbol {R}_{\widetilde u} = \boldsymbol {M} \otimes \boldsymbol {I}_{2N_{t}}\) where \(\boldsymbol {M}\) for complex modulation is given by

\(a^{2}+c^{2} = b^{2}+d^{2}\). Thus, for \(a\), \(b\), \(c\) and \(d\) satisfying \(a^{2}+c^{2} = ad+bc\) and \(b^{2}+d^{2} = ad+bc\), each row of \(\boldsymbol {M}\) is repeated twice and \(\boldsymbol {R}_{\widetilde u}\) has rank \(2NN_{t}\). As an example, we can take \(a = 0.757\), \(b = 0.5032\), \(c = 0.4935\) and \(d = 0.7506\).
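The example coefficients can be checked against the two conditions above with a short script. This is a minimal sketch; the function name `precoding_constraints` is introduced here for illustration only.

```python
def precoding_constraints(a, b, c, d):
    """Return the three quantities that Appendix 2 requires to coincide."""
    return a * a + c * c, b * b + d * d, a * d + b * c


lhs_1, lhs_2, cross = precoding_constraints(0.757, 0.5032, 0.4935, 0.7506)
# Largest mismatch among the three terms; small because the values are rounded.
max_gap = max(abs(lhs_1 - cross), abs(lhs_2 - cross), abs(lhs_1 - lhs_2))
print(f"a^2+c^2={lhs_1:.5f}, b^2+d^2={lhs_2:.5f}, ad+bc={cross:.5f}")
```

The three quantities agree to within roughly \(10^{-4}\), so the quoted values satisfy the conditions up to the rounding of the four-digit coefficients.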

## Declarations

### Acknowledgements

This work was supported in part by an R&D Contract from Huawei Technologies Canada and in part by a Grant from the Natural Sciences and Engineering Research Council of Canada.

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- JI Choi, M Jain, K Srinivasan, P Levis, S Katti, in *Proc. ACM MobiCom*. Achieving single channel, full duplex wireless communication (ACM, Chicago, 2010), pp. 1–12.
- M Duarte, A Sabharwal, V Aggarwal, R Jana, KK Ramakrishnan, CW Rice, NK Shankaranarayanan, Design and characterization of a full-duplex multiantenna system for WiFi networks. IEEE Trans. Veh. Technol. **63**(3), 1160–1177 (2014).
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Global Telecommun. Conf*. Self-interference cancellation limits in full-duplex communication systems (IEEE, Washington DC, 2016).
- MA Khojastepour, S Rangarajan, in *Proc. ASILOMAR Signals, Syst., Comput*. Wideband digital cancellation for full-duplex communications (IEEE, Pacific Grove, 2012), pp. 1300–1304.
- A Masmoudi, T Le-Ngoc, Channel estimation and self-interference cancellation in full-duplex communication systems. IEEE Trans. Veh. Technol. **66**(1), 321–334 (2017).
- M Duarte, C Dick, A Sabharwal, Experiment-driven characterization of full-duplex wireless systems. IEEE Trans. Wireless Comm. **11**(12), 4296–4307 (2012).
- J Ma, GY Li, J Zhang, T Kuze, H Iura, in *Proc. IEEE Global Telecommun. Conf*. A new coupling channel estimator for cross-talk cancellation at wireless relay stations (Honolulu, 2009).
- JR Krier, IF Akyildiz, in *Proc. IEEE Pers. Indoor and Mobile Radio Commun*. Active self-interference cancellation of passband signals using gradient descent (IEEE, London, 2013).
- S Li, RD Murch, in *Proc. IEEE Global Telecommun. Conf*. Full-duplex wireless communication using transmitter output based echo cancellation (2011), pp. 1–5.
- D Bliss, P Parker, A Margetts, in *Proc. IEEE Statistical Signal Processing*. Simultaneous transmission and reception for improved wireless network performance (2007), pp. 478–482.
- BP Day, AR Margetts, DW Bliss, P Schniter, Full-duplex bidirectional MIMO: achievable rates under limited dynamic range. IEEE Trans. Signal Process. **60**(7), 3702–3713 (2012).
- AC Cirik, J Zhang, M Haardt, Y Hua, in *IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC)*. Sum-rate maximization for bi-directional full-duplex MIMO systems under multiple linear constraints (2014), pp. 389–393.
- Y Hua, P Liang, Y Ma, AC Cirik, Q Gao, A method for broadband full-duplex MIMO radio. IEEE Signal Process. Lett. **19**(12), 793–796 (2012).
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Veh. Technol. Conf*. Self-interference mitigation using active signal injection full-duplex MIMO-OFDM systems (IEEE, Montreal, 2016).
- J-J Van de Beek, O Edfors, M Sandell, SK Wilson, P Ola Borjesson, in *Proc. IEEE Veh. Technol. Conf*. On channel estimation in OFDM systems (1995), pp. 815–819.
- H Minn, N Al-Dhahir, Optimal training signals for MIMO OFDM channel estimation. IEEE Trans. Wireless Comm. **5**(5), 1158–1168 (2006).
- F Gao, Y Zeng, A Nallanathan, T-S Ng, Robust subspace blind channel estimation for cyclic prefixed MIMO OFDM systems: algorithm, identifiability and performance analysis. IEEE J. Select. Areas Comm. **26**(2), 378–388 (2008).
- C-C Tu, B Champagne, Subspace-based blind channel estimation for MIMO-OFDM systems with reduced time averaging. IEEE Trans. Veh. Technol. **59**(3), 1539–1544 (2010).
- E Moulines, P Duhamel, J-F Cardoso, S Mayrargue, Subspace methods for the blind identification of multichannel FIR filters. IEEE Trans. Signal Process. **43**(2), 516–525 (1995).
- Y Zeng, T-S Ng, A semi-blind channel estimation method for multiuser multiantenna OFDM systems. IEEE Trans. Signal Process. **52**(5), 1419–1429 (2004).
- E de Carvalho, DT Slock, Blind and semi-blind FIR multichannel estimation: (global) identifiability conditions. IEEE Trans. Signal Process. **52**(4), 1053–1064 (2004).
- A Masmoudi, T Le-Ngoc, A maximum-likelihood channel estimator for self-interference cancellation in full-duplex systems. IEEE Trans. Veh. Technol. **65**(7), 5122–5132 (2016).
- LTE; evolved universal terrestrial radio access (E-UTRA); user equipment (UE) radio transmission and reception (3GPP TS 36.101 version 11.2.0 release 11). ETSI, Sophia Antipolis Cedex, France (2012).
- DW Bliss, TM Hancock, P Schniter, in *Proc. ASILOMAR Signals, Syst., Comput*. Hardware phenomenological effects on cochannel full-duplex MIMO relay performance (IEEE, Pacific Grove, 2012).
- E Ahmed, A Eltawil, A Sabharwal, in *Proc. ASILOMAR Signals, Syst., Comput*. Self-interference cancellation with nonlinear distortion suppression for full-duplex systems (IEEE, Pacific Grove, 2013).
- D Korpi, L Anttila, V Syrjala, M Valkama, Widely linear digital self-interference cancellation in direct-conversion full-duplex transceiver. IEEE J. Selected Areas Commun. **32**(9), 1674–1687 (2014).
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Wireless Commun. and Netw. Conf*. Self-interference cancellation for full-duplex MIMO transceivers (IEEE, New Orleans, 2015).
- WH Gerstacker, R Schober, A Lampe, Receivers with widely linear processing for frequency-selective channels. IEEE Trans. Commun. **51**(9), 1512–1523 (2003).
- R Schober, WH Gerstacker, L-J Lampe, Data-aided and blind stochastic gradient algorithms for widely linear MMSE MAI suppression for DS-CDMA. IEEE Trans. Signal Process. **52**(3), 746–756 (2004).
- M Kristensson, B Ottersten, D Slock, in *Proc. ASILOMAR Signals, Syst., Comput*. Blind subspace identification of a BPSK communication channel (IEEE, Pacific Grove, 1996).
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Int. Conf. Commun*. A digital subspace-based self-interference cancellation in full-duplex MIMO transceivers (IEEE, London, 2015), pp. 4954–4959.
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Int. Conf. Commun*. Residual self-interference after cancellation in full-duplex systems (IEEE, Sydney, 2014).
- JG McMichael, KE Kolodziej, in *50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)*. Optimal tuning of analog self-interference cancellers for full-duplex wireless communication (2012), pp. 246–251.
- FD Neeser, JL Massey, Proper complex random processes with applications to information theory. IEEE Trans. Inf. Theory. **39**(4), 1293–1302 (1993).
- A Masmoudi, T Le-Ngoc, in *Proc. IEEE Veh. Technol. Conf*. A maximum-likelihood channel estimator in MIMO full-duplex systems (IEEE, Vancouver, 2014).