Nonlinear self-interference cancellation in MIMO full-duplex transceivers under crosstalk

This paper presents a novel digital self-interference canceller for an inband multiple-input-multiple-output (MIMO) full-duplex radio. The signal model utilized by the canceller is capable of modeling the in-phase quadrature (IQ) imbalance, the nonlinearity of the transmitter power amplifier, and the crosstalk between the transmitters, thereby being the most comprehensive signal model presented thus far within the full-duplex literature. Furthermore, it is also shown to be valid for various different radio frequency (RF) cancellation solutions. In addition to this, a novel complexity reduction scheme for the digital canceller is also presented. It is based on the widely known principal component analysis, which is used to generate a transformation matrix for controlling the number of parameters in the canceller. Extensive waveform simulations are then carried out, and the obtained results confirm the high performance of the proposed digital canceller under various circuit imperfections. The complexity reduction scheme is also shown to be capable of removing up to 65% of the parameters in the digital canceller, thereby significantly reducing its computational requirements.


Introduction
Inband full-duplex communications is a promising candidate technology for further improving the spectral efficiency of next-generation wireless systems, such as 5G networks [1-11]. The basic idea is to transmit and receive at the same time on the same center frequency, thereby in principle doubling the spectral efficiency. The drawback of such inband full-duplex operation is that the device's own transmit signal couples to the receiver and becomes an extremely powerful source of self-interference (SI). The most significant challenge in implementing inband full-duplex radios in practice is therefore the development of SI cancellation solutions that are capable of removing the SI in the receiver. Various demonstrator implementations have already been reported that achieve relatively high SI cancellation performance, thereby allowing for true inband full-duplex operation [1-3, 6, 7, 11-14].
Moreover, in order to meet the high throughput requirements of the future wireless networks, it is inevitable that digitally generated cancellation signals do not include any of the transmitter-induced impairments, which thereby remain unaffected by this type of an RF cancellation solution [1]. Another possible solution for decreasing the complexity of RF cancellation in the context of very large transmit antenna arrays is to use beamforming to form nulls in the receive antennas [4,21], which might even allow for completely omitting RF cancellation. In typical MIMO devices, however, the increase in the RF cancellation complexity is more or less inevitable.
Also, the complexity of digital SI cancellation is somewhat increased under MIMO operation, but it is obviously more straightforward to process several SI signals in the digital domain. In particular, more computational resources are needed to estimate all the channel responses between the several transmitters and receivers, but no additional RF hardware is required. However, having several transmit chains on a single chip introduces another issue from the perspective of the digital canceller: the crosstalk between the transmitters, which occurs both before and after the power amplifiers (PAs) [22-28]. This phenomenon is illustrated in Fig. 1 for an example case of three transmitters. What makes this an especially cumbersome issue is the fact that typically the PAs introduce significant nonlinear distortion [3,29]. This, on the other hand, means that nonlinear modeling of the SI is required in the digital canceller, which is very challenging if the PA input is in fact a linear combination of all the original transmit signals, as is the case under crosstalk [26]. Nevertheless, it is still necessary to model the crosstalk, since otherwise the accuracy of the regenerated SI signal is not sufficiently high. This is especially crucial for the emerging massive MIMO transceivers, where the large amount of transmit chains calls for a high level of integration, which results in more leakage between the transmission paths [28]. Hence, the increase in computational complexity caused by the crosstalk modeling must be tolerated in order to obtain sufficient levels of SI cancellation also under MIMO operation.
Fig. 1 An illustration of the crosstalk phenomenon in a three-antenna MIMO transmitter, where crosstalk occurs both before and after the PAs. The former is typically referred to as nonlinear crosstalk, while the latter is called linear crosstalk
In this article, we present a general signal model for the observed SI in the digital domain under a scenario where there is crosstalk between the transmit chains before and after nonlinear PAs. Moreover, it is shown that the signal model can be applied to various different RF cancellation solutions. The presented comprehensive signal model, which shows the effect of the crosstalk in terms of the original transmit signals, is then used as a basis for a high-performance digital SI canceller. The IQ imbalance occurring both in the transmitters and in the receivers is also included in the signal model, since it is typically one of the dominant sources of distortion in a practical transceiver, alongside with the PA-induced nonlinearities [30].
Furthermore, to address the increase in the computational complexity due to the MIMO operation and crosstalk modeling, a novel principal component analysis (PCA)-based solution is proposed, which can be used to control the complexity of the signal model. In particular, PCA processing is used to identify the insignificant terms in the observed SI signal, which are then omitted in the further cancellation processing. This results in a significant reduction of the unknown parameters that must be estimated, which obviously decreases the computational requirements of the digital SI canceller. Moreover, since the most dominant SI terms are retained by such processing, there is no essential degradation in the cancellation performance. To the best of our knowledge, such complexity reduction schemes have not been previously proposed in the context of SI cancellation solutions.
The detailed list of novel contributions in this paper is as follows:
• We derive the most comprehensive MIMO signal model for the observed SI presented so far in the literature. It covers various RF cancellation scenarios, while also modeling the crosstalk between the transmitters under low-cost nonlinear PAs and IQ imbalance.
• We propose a novel nonlinear digital SI canceller, which utilizes the aforementioned advanced signal model.
• We propose a novel complexity reduction scheme based on PCA, which can be used to control the computational complexity of the digital canceller, while minimizing the decrease in the cancellation performance.
• We present numerical results, which illustrate various aspects of the proposed digital SI cancellation solution with realistic waveform simulations.
The rest of this article is organized as follows. In Section 2, the MIMO signal model is derived. Then, in Section 3, the actual nonlinear digital SI canceller is presented, alongside with the parameter estimation procedure and the PCA-based complexity reduction scheme. After this, in Section 4, the proposed digital SI cancellation solution is evaluated with realistic waveform simulations. Finally, the conclusions are drawn in Section 5.

Baseband equivalent signal modeling
In this section, we build a complete SI channel model for a MIMO full-duplex device, including the effects of transmitter impairments (PA nonlinearity, IQ imbalance, and transmitter crosstalk), the linear MIMO SI channel, and RF cancellation. In the forthcoming analysis, the nonlinearities produced by the digital-to-analog and analog-to-digital converters (DACs and ADCs) [31], alongside with phase noise, are omitted from the signal model for simplicity, although phase noise is still included in the reported simulation results.
An illustration of the considered full-duplex MIMO transceiver is given in Fig. 2, with two alternative RF cancellation solutions. In particular, the RF cancellation can be done either by utilizing the PA output signals, or by generating the cancellation signals in the digital domain and upconverting them with the help of auxiliary transmitters. In the forthcoming analysis, both of these options are considered. Furthermore, in Fig. 2, the transceiver is shown to have separate transmit and receive antennas only for illustrative purposes, since the same signal model can also be applied to a case where each antenna is shared between a transmitter and a receiver [32]. Hence, the forthcoming analysis is directly applicable also to a shared-antenna architecture. Note that, for notational simplicity, the actual received signals of interest and additive noise are not included in the following presentation.

Power amplifier and IQ modulator models with crosstalk
Let us denote the baseband signal of transmitter $j$ ($j = 1, 2, \ldots, N_T$) by $x_j(n)$. The output signal of a frequency-independent IQ modulator model is [33]
$$x_j^{IQM}(n) = K_{1,j}\, x_j(n) + K_{2,j}\, x_j^*(n), \qquad K_{1,j} = \tfrac{1}{2}\left(1 + g_j e^{\mathrm{j}\phi_j}\right), \quad K_{2,j} = \tfrac{1}{2}\left(1 - g_j e^{\mathrm{j}\phi_j}\right), \tag{1}$$
where $g_j$, $\phi_j$ are the gain and phase imbalance parameters of transmitter $j$. Notice that under typical circumstances $|K_{1,j}| \gg |K_{2,j}|$. The magnitude of the IQ image component, represented by the conjugated signal term in (1), can be characterized with the image rejection ratio (IRR) as $10 \log_{10}\left(|K_{1,j}|^2 / |K_{2,j}|^2\right)$. The response of the PA is approximated using the widely known parallel Hammerstein (PH) model, given for transmitter $j$ as [34]
$$x_j^{PA}(n) = \sum_{\substack{p=1 \\ p\ \mathrm{odd}}}^{P} \sum_{m=0}^{M} h_{p,j}(m)\, \psi_p\left(x_{j,in}(n-m)\right), \tag{2}$$
where $x_{j,in}(n)$ is the PA input signal, the basis functions are defined as
$$\psi_p\left(x(n)\right) = |x(n)|^{p-1} x(n) = x(n)^{\frac{p+1}{2}}\, x^*(n)^{\frac{p-1}{2}}, \tag{3}$$
and $h_{p,j}(n)$ denote the impulse responses of the PH branches for transmitter $j$, while $M$ and $P$ denote the memory depth and nonlinearity order of the PH model, respectively [34-36]. The PH nonlinearity is a widely used nonlinear model for direct as well as inverse modeling of PAs [34-37]. Due to the crosstalk occurring before each PA, referred to as nonlinear crosstalk, the input signal $x_{j,in}(n)$ can be written as
$$x_{j,in}(n) = \sum_{i=1}^{N_T} \alpha_{ij}\, x_i^{IQM}(n), \tag{4}$$
where $\alpha_{ij}$ is the crosstalk coefficient between the $i$th and $j$th transmitter chains, and $\alpha_{jj} = 1\ \forall j$. In other words, as a result of the crosstalk occurring before the PAs, each PA input signal is in fact a linear combination of all the different transmit signals. The crosstalk phenomenon is illustrated for an example case of three transmitters in Fig. 1, where both the nonlinear and linear crosstalk are shown. Inserting now (1) into (4), we can rewrite the PA input signal as
$$x_{j,in}(n) = \sum_{i=1}^{N_T} \left( \alpha_{1,ij}\, x_i(n) + \alpha_{2,ij}\, x_i^*(n) \right), \tag{5}$$
where $\alpha_{1,ij} = \alpha_{ij} K_{1,i}$ and $\alpha_{2,ij} = \alpha_{ij} K_{2,i}$.
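As an illustrative sketch of the transmitter model in (1)-(5), the cascade of IQ modulator, pre-PA crosstalk, and PH power amplifier can be written in Python with NumPy as shown below. The function names and the array layout are hypothetical choices made here, and the expressions used for the IQ imbalance coefficients (K1 = (1 + g e^{jφ})/2, K2 = (1 - g e^{jφ})/2) are the commonly used definitions, assumed rather than taken from the text above.

```python
import numpy as np

def iq_imbalance_coeffs(g, phi):
    """Assumed standard definitions of the IQ imbalance coefficients K1 and K2."""
    k1 = 0.5 * (1.0 + g * np.exp(1j * phi))
    k2 = 0.5 * (1.0 - g * np.exp(1j * phi))
    return k1, k2

def image_rejection_ratio_db(k1, k2):
    """IRR = 10 log10(|K1|^2 / |K2|^2)."""
    return 10.0 * np.log10(abs(k1) ** 2 / abs(k2) ** 2)

def pa_inputs(x, g, phi, alpha):
    """PA input signals under nonlinear (pre-PA) crosstalk.

    x     : (N_T, N) array of baseband transmit signals
    g, phi: length-N_T gain and phase imbalance parameters
    alpha : (N_T, N_T) array, alpha[i, j] couples chain i into chain j
            (alpha[j, j] = 1)
    """
    k1, k2 = iq_imbalance_coeffs(np.asarray(g), np.asarray(phi))
    x_iqm = k1[:, None] * x + k2[:, None] * np.conj(x)
    # Row j of the result is sum_i alpha[i, j] * x_iqm[i].
    return alpha.T @ x_iqm

def parallel_hammerstein(x_in, h):
    """PH model for one PA; h[b] holds the M+1 taps of the branch p = 2b + 1."""
    y = np.zeros(x_in.shape, dtype=complex)
    for b, taps in enumerate(h):
        p = 2 * b + 1
        basis = np.abs(x_in) ** (p - 1) * x_in      # psi_p(x) = |x|^(p-1) x
        y += np.convolve(basis, taps)[: x_in.size]  # causal FIR filtering
    return y
```

With ideal modulators (g = 1, φ = 0) and an identity crosstalk matrix, `pa_inputs` returns the original transmit signals, which gives a simple sanity check of the indexing convention.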
Using (5), the signal at the PA output can be written as follows:
$$x_j^{PA}(n) = \sum_{\substack{p=1 \\ p\ \mathrm{odd}}}^{P} \sum_{m=0}^{M} h_{p,j}(m)\, \psi_p\left( \sum_{i=1}^{N_T} \left( \alpha_{1,ij}\, x_i(n-m) + \alpha_{2,ij}\, x_i^*(n-m) \right) \right). \tag{6}$$
It can be further modified by expanding all the integer powers of the sum signals as shown in the Appendix, which gives where $h_{p,j,q_0,\ldots,r_{N_T-1}}(m)$ are the coefficients for the basis function of the form This signal model is of similar form as the one presented in [26], with the exception that the model in (7) also incorporates the effect of IQ imbalance and is thus more complete.
In order to simplify (7), it can be noted that, for the $j$th transmit signal and the $p$th nonlinearity order, the signal model contains in fact all the different combinations of the exponents $q_m$ and $r_n$, under the constraint that their sum is equal to $p$. This means that we can rewrite (7) as where $s_k$ is the $k$th combination of the $2N_T \times 1$ exponent vector $s$, $h_{j,p,s_k}(m)$ contains the corresponding coefficients, and $\|\cdot\|_1$ denotes the $L_1$-norm. Note that all the elements of $s$ are non-negative integers, as per the signal model. To illustrate its structure, all the variations of $s$ for $N_T = 1$ and $P = 3$ are written below:
$$s \in \left\{ [1\ 0]^T,\ [0\ 1]^T,\ [3\ 0]^T,\ [2\ 1]^T,\ [1\ 2]^T,\ [0\ 3]^T \right\}.$$
After the PAs, there is typically also some additional crosstalk between the transmitters, referred to as linear crosstalk. Taking also this phenomenon into account, the final output signal for the $j$th transmitter can be written as
$$x_j^{TX}(n) = \sum_{l=1}^{N_T} \beta_{lj}\, x_l^{PA}(n),$$
where $\beta_{lj}$ is the crosstalk coefficient between the $l$th and $j$th transmitters. It can be observed that the essential signal model remains the same as in (8), but with modified coefficients written as
Denoting the MIMO propagation channel impulse response from TX antenna $j$ to RX antenna $i$ by $c_{ij}(l)$, $l = 0, 1, \ldots, L$, the received SI signal at RX antenna $i$ ($i = 1, 2, \ldots, N_R$) can now be written as
$$z_i(n) = \sum_{j=1}^{N_T} \sum_{l=0}^{L} c_{ij}(l)\, x_j^{TX}(n-l). \tag{11}$$
Again, the signal model still remains the same as in (8), but with slightly modified coefficients, which are obtained from The new memory length of the received signal model is also increased from $M$ to $M+L$. The input signal of the $i$th receiver ($z_i(n)$) is then further processed by the RF canceller and the actual receiver chain. Note that the above signal model in (11) also applies to circulator and electrical balance duplexer-based implementations, where each transmitter and receiver pair share the same antenna [32], and hence it is generic in that respect.
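To make the structure of the exponent vectors concrete, the following short Python sketch enumerates every vector of $2N_T$ non-negative integers whose $L_1$-norm equals a given odd order $p$. The function names are hypothetical, and the ordering of the enumerated vectors is simply the one produced by `itertools.product`, which is not necessarily the ordering used in the paper.

```python
from itertools import product

def exponent_vectors(n_tx, p):
    """All length-2*n_tx tuples of non-negative integers that sum to p."""
    return [s for s in product(range(p + 1), repeat=2 * n_tx) if sum(s) == p]

def all_exponent_vectors(n_tx, p_max):
    """Exponent vectors for every odd nonlinearity order p = 1, 3, ..., p_max."""
    return [s for p in range(1, p_max + 1, 2) for s in exponent_vectors(n_tx, p)]
```

For a single transmitter and third-order nonlinearity, `all_exponent_vectors(1, 3)` yields two first-order and four third-order vectors, each pairing an exponent for the signal with an exponent for its complex conjugate.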

RF cancellation
To ensure an extensive analysis and derivation for the proposed digital cancellation algorithm, we consider three different RF cancellation solutions. The first technique is similar to what has been used, e.g, in [5,6], and it involves directly tapping the transmitter outputs to obtain the reference signals for RF cancellation. This method is based on purely analog processing, as the whole cancellation procedure is performed in the RF domain. The two other considered methods are based on auxiliary TX chains, which are used to produce the RF cancellation signal from digital baseband samples [1,38,39]. We call this latter approach hybrid RF cancellation to distinguish it from purely analog cancellation. Furthermore, we consider both linear and nonlinear preprocessing to be used with this auxiliary transmitter based RF cancellation.

RF cancellation with transmitter output signals
In this RF cancellation method, the output of each TX chain is tapped, and subtracted from each of the received signals after suitable gain, phase and delay adjustments. These RF cancellers can be either single-tap or multi-tap [9,40], for which reason we denote them with impulse responses h RF ij (l), operating on the TX output signals x TX j (n). The coefficients are obviously chosen such that they model the MIMO coupling channel coefficients in c ij (n) as accurately as possible. The RF cancellation signal for the ith receiver can thus be written as where L is the number of taps in the RF canceller. It can be easily shown that the cancellation signal is of similar form as the actual received signal in (11), with coefficients of the form and a memory length of M + L . Thus, the received SI signal of receiver i, after this type of analog RF cancellation, becomes Hence, the structure of the RF canceller output signal model is still of the same form as in (11), but with modified coefficients expressed ash i,p, This type of purely analog RF cancellation calls for N T × N R canceller circuits to be implemented in the device, one canceller from each transmitter to each receiver. The complexity may become prohibitive when the number of antennas is significantly increased and, thereby, when implementing a high order full-duplex MIMO device, alternative methods for RF cancellation might have to be considered.
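As a rough numerical illustration of this canceller structure, the sketch below filters each transmitter output with a multi-tap response and subtracts the sum from the received SI at each receiver. All names are hypothetical, the coupling channel and the canceller are both modeled as plain FIR filters, and noise and receiver impairments are ignored, so this is a simplified sketch rather than a model of any specific RF circuit.

```python
import numpy as np

def filter_and_sum(x_tx, taps):
    """Sum over transmitters of FIR-filtered TX outputs.

    x_tx: (N_T, N) transmitter output signals
    taps: (N_R, N_T, L+1) impulse responses, taps[i, j] from TX j to RX i
    Returns an (N_R, N) array.
    """
    n_rx, n_tx, _ = taps.shape
    n = x_tx.shape[1]
    out = np.zeros((n_rx, n), dtype=complex)
    for i in range(n_rx):
        for j in range(n_tx):
            out[i] += np.convolve(x_tx[j], taps[i, j])[:n]
    return out

def rf_cancel(z, x_tx, h_rf):
    """Subtract the RF cancellation signal built from the TX outputs."""
    return z - filter_and_sum(x_tx, h_rf)

def cancellation_db(before, after):
    """Power ratio between the signal before and after cancellation, in dB."""
    p_before = np.mean(np.abs(before) ** 2)
    p_after = np.mean(np.abs(after) ** 2)
    return 10.0 * np.log10(p_before / p_after)
```

In this idealized setting, a canceller whose taps differ from the true coupling channel by a 1% amplitude error leaves a residual that is 40 dB below the original SI, which indicates how accurately the analog taps would need to be tuned.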

Hybrid RF cancellation using auxiliary transmitters with linear preprocessing
One such alternative RF canceller structure is the hybrid method, which utilizes extra transmitter chains, one for each receiver, to upconvert and subtract estimated replicas of the SI signals from the received signals at RF [1,38,39]. In this case, linear MIMO filtering is already done at digital baseband on the transmit signals x j (n) with some estimated MIMO channel responses h RF ij (l). Since the transmit signals from the different antennas can now be combined already in the digital domain, the analog hardware complexity of this type of an RF cancellation scheme scales with N R instead of N T N R , and may prove to be more attractive with a high number of antennas. Note that in this subsection, we consider only linear processing for the hybrid RF canceller, and thereby IQ modulator imbalance or PA nonlinearity are not explicitly dealt with at this stage. The RF cancellation signal can in this case be written as which is a special case of the signal model in (11) with P = 1 and coefficientsȟ RF i,1,s k (m) consisting of h RF ij (l) with proper s k . The signal after RF cancellation is again obtained as shown in (13), and with the final coefficients as Also this model is essentially of the same form as (11), with the coefficients of the linear SI terms being affected by the hybrid RF cancellation procedure, while the other terms remain unchanged. This means that the observed SI signal in the receiver digital domain can still be modeled with the same signal model as in the case of pure analog RF cancellation (or no RF cancellation at all). Thus, from the perspective of the digital cancellation algorithm, it makes no difference whether RF cancellation is performed by tapping the transmitter output or by using auxiliary TX chains with linear preprocessing, although the RF cancellation performance itself might obviously be different for the considered methods.

Hybrid RF cancellation using auxiliary transmitters with nonlinear preprocessing
Yet another alternative RF cancellation technique utilizes auxiliary transmitters, but with nonlinear preprocessing, instead of purely linear processing. The estimated MIMO channel responses of the different nonlinear SI terms are now denoted by h RF ij,p (l). In the forthcoming analysis, it is assumed that the auxiliary TX chains are linear. This is a relatively feasible assumption, since no PA is required due to the lower output power requirements. Now, the cancellation signal obtained with this RF cancellation procedure can be expressed as where P is the nonlinearity order of the RF cancellation signals. Note that this signal model neglects IQ imbalance and crosstalk, since the RF canceller must only attenuate the SI such that the receiver is not saturated. Also this RF cancellation signal can be easily represented with a signal model of the same form as in (11). The coefficientš h RF i,p,s k (m) of the signal model now consist of h RF ij,p (l) with the parameters p and s k that correspond to the basis func- , and other coefficients are set to zero. Similar to the other RF cancellation schemes, after subtracting the cancellation signal from the received signal, as in (13), the signal model remains the same and its coefficients . Now, also some of the nonlinear SI terms are attenuated by RF cancellation, as they are modeled in the preprocessing stage.
Overall, it can be concluded that the essential structure of the observed SI signal in the digital domain is independent of the chosen method for RF cancellation. This means that, in the forthcoming analysis, the same digital cancellation algorithm can be applied in all the situations since the only difference between the three alternative RF cancellation schemes are the relative power levels of the various SI terms. However, as already mentioned, the RF cancellation performance is likely to differ between these techniques, and also the hardware and computational requirements are different for each RF canceller structure.
In the forthcoming analysis, we will refer to the parameters of the signal model in all cases byh i,p,s k (m), similar to the above derivations, even though the exact values of the different coefficients vary for different RF cancellation techniques. This notation will simplify the equations and make them more straightforward and illustrative. Hence, the signal after RF cancellation, which is then processed by the digital canceller, can be written as Note that this signal model implicitly incorporates also the IQ imbalance occurring in the receiver, even though it is omitted in the derivations for brevity [15].

Total number of basis functions in the overall model
In general, with the above cascaded modeling approach for IQ modulator and PA impairments with crosstalk between the transmitters, it can easily be shown that the total number of basis functions in (16) becomes
$$\sum_{\substack{p=1 \\ p\ \mathrm{odd}}}^{P} \binom{p + 2N_T - 1}{2N_T - 1}. \tag{17}$$
Figure 3 illustrates the number of basis functions for different nonlinearity orders and numbers of transmit antennas for the full signal model and also for the crosstalk-free signal model discussed below in Section 2.4. It is immediately obvious that with higher order MIMO systems, or with heavily nonlinear PAs, the number of basis functions becomes unacceptably high when utilizing the full signal model with crosstalk. Thus, it is necessary to determine methods that will decrease the number of basis functions, and thereby facilitate the estimation of the parameters of this signal model also in practice.
Luckily, many of the terms arising from the cascade of the impairments are so insignificant that they can be neglected with very little effect on the overall modeling accuracy. This will reduce the computational cost of such modeling and the corresponding cancellation procedure. In this work, we propose a specific preprocessing stage which can be used to decrease the dimensionality of the full signal model in (16). This is elaborated in more details in Section 3.2.

Nonlinear signal model without crosstalk
Another simple way to decrease the number of basis functions is to neglect the crosstalk effect between the transmitters. Then, the cross terms between the different transmit signals will be removed, which obviously results in a significant decrease in the number of unknown parameters. Modifying (16) accordingly, we can write the signal model now as whereh i,j,p,q (m) represents now the coupling channel corresponding to the considered SI signal terms propagating from the jth transmitter to the ith receiver. This signal model is also derived in [15], where it is briefly discussed and analyzed. For this reason, the detailed derivation process of (18) is omitted in this article.
Since now all the cross-terms are neglected from the signal model, the number of basis functions can be expressed as
$$N_T \sum_{\substack{p=1 \\ p\ \mathrm{odd}}}^{P} (p + 1) = N_T\, \frac{P+1}{2} \left( \frac{P+1}{2} + 1 \right). \tag{19}$$
When investigating Fig. 3, it can be seen that this signal model results in a significant reduction of basis functions, when compared to the full signal model with crosstalk. With moderate crosstalk levels, it is therefore likely that using this signal model will provide a very favorable tradeoff between cancellation performance and computational complexity. However, as already discussed, in highly integrated transceivers explicit modeling of the crosstalk between the transmitters is likely required in order to ensure sufficient cancellation performance [28].
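The difference in model size between the two signal models can be checked with a few lines of Python. The sketch below counts the basis functions of the full model as the number of non-negative integer exponent vectors of length $2N_T$ summing to each odd order, and those of the crosstalk-free model as $p + 1$ terms per transmitter and odd order. The function names are hypothetical, and memory taps are deliberately not included in either count.

```python
from math import comb

def n_basis_full(n_tx, p_max):
    """Basis functions of the full model with crosstalk (odd orders only)."""
    return sum(comb(p + 2 * n_tx - 1, 2 * n_tx - 1)
               for p in range(1, p_max + 1, 2))

def n_basis_crosstalk_free(n_tx, p_max):
    """Basis functions when all cross-terms between transmitters are dropped."""
    return n_tx * sum(p + 1 for p in range(1, p_max + 1, 2))

if __name__ == "__main__":
    for n_tx in (1, 2, 3, 4):
        for p_max in (3, 5, 7):
            print(n_tx, p_max,
                  n_basis_full(n_tx, p_max),
                  n_basis_crosstalk_free(n_tx, p_max))
```

For a single transmitter the two counts coincide, since there are no cross-terms to drop; the gap opens quickly as the number of transmitters or the nonlinearity order grows.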

Self-interference parameter estimation and digital cancellation
In this section, building on the previous modeling in, e.g., [15,29], we will describe the proposed digital cancellation algorithm that models both IQ imbalance and PA nonlinearity in a MIMO full-duplex transceiver with crosstalk between the transmitters. In general, there are two possible approaches for nonlinear digital SI cancellation: (i) construct a linear-in-parameters model of the observed SI signal in the digital domain, including the different impairments, the MIMO propagation channel, and RF cancellation, estimate the unknown parameters of the model, and finally recreate and cancel the SI from the received signals; (ii) have separate models for the MIMO propagation channel and the transmitter impairments, estimate the unknown model parameters sequentially, and recreate and cancel the SI from the received signals. Typically the latter approach is computationally less demanding, but it requires a more elaborate estimation procedure.
In this article, we consider the former approach, while the latter is left for future work.

Linear-in-parameters model
Having already derived a linear-in-parameters signal model in Section 2, presented in (16), the next step is to estimate its parameters inh i,p,s k (m). After this, the estimated parameters are used to regenerate the SI signals, which are then subtracted from the received signals at digital baseband to obtain cancellation. Figure 4 shows the whole digital cancellation procedure on a fundamental level.
Denoting the desired signal of interest and additive noise at the $i$th receiver by $s_i(n)$ and $w_i(n)$, respectively, the overall received signal at digital baseband can be expressed as
$$y_i(n) = r_i(n) + s_i(n) + w_i(n). \tag{20}$$
The corresponding output of the digital SI canceller is then $y_i(n) - \hat{r}_i(n)$, where $\hat{r}_i(n)$ denotes the SI estimate obtained using the signal model in (16) with estimated parameters, written as Here, $P$ is the nonlinearity order of the digital canceller, $M_1$ is the number of pre-cursor taps, $M_2$ is the number of post-cursor taps, and $\hat{h}_{i,p,s_k}(m)$ contains the estimated parameters of the signal model. The pre-cursor taps are introduced to model all the memory effects produced by the transmitter and RF cancellation circuitry.

Least-squares-based estimator
In this work, the actual parameter learning is performed with the widely used least squares (LS) estimation. For brevity, the parameter learning and digital cancellation procedure is here outlined only for the $i$th receiver, since the procedure is identical for all the receivers. In practice, calculating the LS estimate requires knowledge of (i) the original transmitted data signal, (ii) the predetermined signal model in (16), and (iii) the observed received signal $y_i(n)$. In the considered MIMO full-duplex device, all of these are obviously known by the digital canceller. Since the LS estimation is performed using a block of data, the vector/matrix representations of the relevant signals with $N$ observed samples are first defined as and $\mathbf{r}_i$, $\mathbf{s}_i$, $\mathbf{w}_i$ are defined in the same manner as $\mathbf{y}_i$. The error vector is then defined as
$$\mathbf{e}_i = \mathbf{y}_i - \hat{\mathbf{r}}_i,$$
where the nonlinear SI estimate is
$$\hat{\mathbf{r}}_i = \boldsymbol{\Psi} \hat{\mathbf{h}}_i.$$
Here, $\boldsymbol{\Psi}$ is a horizontal concatenation of the convolution matrices defined as follows: with $p = 1, 3, \ldots, P$, and $s_k$ is each combination for which $\|s_k\|_1 = p$, similar to the sum limits shown in (16). Overall, the number of concatenated matrices is given by the total number of basis functions in (17), since this is the amount of different combinations of $s_k$ for all the nonlinearity orders.
Alternatively, in the crosstalk-free model, $\boldsymbol{\Psi}$ consists of the concatenation of the matrices defined as follows: with $j = 1, 2, \ldots, N_T$, $p = 1, 3, \ldots, P$, and $q = 0, 1, \ldots, p$. An estimate of the parameter vector $\mathbf{h}_i$, denoted by $\hat{\mathbf{h}}_i$, is a vertical concatenation of the vectors In the crosstalk-free model, the parameter vector consists of the concatenation of vectors The LS estimate of the parameter vector $\mathbf{h}_i$ is then found as the solution which minimizes the power of the error vector $\mathbf{e}_i$, as
$$\hat{\mathbf{h}}_i = \arg\min_{\mathbf{h}_i} \|\mathbf{e}_i\|^2 = \left( \boldsymbol{\Psi}^H \boldsymbol{\Psi} \right)^{-1} \boldsymbol{\Psi}^H \mathbf{y}_i, \tag{28}$$
assuming full column rank in $\boldsymbol{\Psi}$.
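The block-wise LS procedure can be sketched in Python as follows. This is a minimal illustration under several assumptions made here: the exponent vector is taken to interleave the exponents of each transmit signal and its conjugate, the data matrix is built from samples for which all pre-cursor and post-cursor taps are available, and `numpy.linalg.lstsq` is used instead of forming the normal equations explicitly, since it is numerically more robust. All function names are hypothetical.

```python
import numpy as np
from itertools import product

def basis_signals(x, p_max):
    """Basis sequences prod_j x_j^a * conj(x_j)^b for all exponent vectors."""
    n_tx, n = x.shape
    out = []
    for p in range(1, p_max + 1, 2):
        for s in product(range(p + 1), repeat=2 * n_tx):
            if sum(s) != p:
                continue
            b = np.ones(n, dtype=complex)
            for j in range(n_tx):
                # Assumed ordering: s[2j] for x_j, s[2j+1] for conj(x_j).
                b *= x[j] ** s[2 * j] * np.conj(x[j]) ** s[2 * j + 1]
            out.append(b)
    return out

def convolution_matrix(b, m1, m2):
    """Rows n = m2 .. N-m1-1, columns b(n+m1), ..., b(n), ..., b(n-m2)."""
    rows = np.arange(m2, b.size - m1)
    lags = np.arange(-m1, m2 + 1)
    return b[rows[:, None] - lags[None, :]]

def data_matrix(x, p_max, m1, m2):
    """Horizontal concatenation of the convolution matrices."""
    return np.hstack([convolution_matrix(b, m1, m2)
                      for b in basis_signals(x, p_max)])

def ls_cancel(x, y, p_max, m1, m2):
    """LS parameter estimate and the resulting canceller output."""
    psi = data_matrix(x, p_max, m1, m2)
    y_obs = y[m2 : y.size - m1]           # samples aligned with the rows of psi
    h_hat, *_ = np.linalg.lstsq(psi, y_obs, rcond=None)
    return h_hat, y_obs - psi @ h_hat
```

In a practical canceller the parameters would typically be estimated from one block of data and then applied to subsequent blocks; here both steps use the same block purely to keep the sketch short.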

Computationally efficient estimation with principal component analysis
Another approach to simplify the estimation procedure is to retain the cross-terms, and instead determine which of them are actually significant in terms of the cancellation performance. In this analysis, principal component analysis (PCA) [41] is used to decrease the number of parameters to be estimated. The idea behind the PCA is to determine which of the terms have the highest variance, providing valuable information regarding the significance of the different basis functions. In practice, PCA results in a transformation matrix, with which the original data matrix is multiplied. The size of the transformation matrix can be chosen to provide the desired number of parameters for the final estimation procedure. There are also various alternative solutions for model complexity reduction, such as compressed sampling (CS) based techniques. Nevertheless, in this work, we choose to use the PCA since it is a straight-forward method for the complexity reduction of the proposed signal model, while also providing nearly the same performance as CS when high modeling accuracy is required [42]. Experimenting with different complexity reduction methods is an important future work item for us. The first step in obtaining the desired PCA transformation matrix is to determine the least squares channel estimate given in (28) using all the basis functions. This estimate should be calculated with the highest possible transmit power, since the nonlinear SI terms that are negligible with the highest power will also be negligible with any lower transmit power. Hence, this reveals the terms, which can be omitted under the whole considered transmit power range. If the transceiver in question has more than one receiver chain, the channel estimation can be done individually for all of them, after which the mean value of the estimates is calculated. This is done to avoid having separate transformation matrices for each receiver, resulting in a decreased amount of required data storage. 
The hereby obtained coefficient vector, which is denoted by $\hat{\mathbf{h}}_0$, is used as an initial channel estimate for the full set of basis functions. The next step is to determine the relative strengths of the different terms present in the SI signal. Using the initial channel estimate, this can be done by multiplying the original data matrix with the obtained estimate. Then, we get
$$\boldsymbol{\Psi}_0 = \boldsymbol{\Psi} \times \left( \mathbf{1} \hat{\mathbf{h}}_0^T \right),$$
where $\mathbf{1}$ is a column vector consisting of 1s, and $\times$ denotes element-wise multiplication between two matrices. The matrix $\boldsymbol{\Psi}_0$ now contains all the SI terms in its columns, each multiplied with the corresponding coefficient of the initial channel estimate. As a starting point for the PCA, the singular value decomposition of the normalized data matrix can be expressed as
$$\boldsymbol{\Psi}_0 = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^H,$$
where $\mathbf{U}$ and $\mathbf{V}$ are the matrices containing the left and right singular vectors, respectively, while $\boldsymbol{\Sigma}$ is a diagonal matrix consisting of the corresponding singular values. In this analysis, it is assumed that the singular values are in decreasing order. To minimize the possible numerical issues upon the PCA transformation, the actual transformation matrix is obtained in its normalized form, which is given by
$$\mathbf{W} = \mathrm{diag}\left( \hat{\mathbf{h}}_0 \right) \mathbf{V} \boldsymbol{\Sigma}^{-1}.$$
To control the number of parameters, part of the columns of the obtained matrix $\mathbf{W}$ can then be omitted. Based on the earlier assumption regarding the ordering of the singular values, the columns of the transformation matrix represent the different parameters in the descending order of their significance. Thus, by starting to remove the columns from the right, the number of parameters can be decreased with minimal effect on the modeling accuracy. Thus, denoting the number of chosen parameters with $u$, we can write the final transformation matrix as
$$\mathbf{W}_u = \left[ \mathbf{w}_1\ \mathbf{w}_2\ \cdots\ \mathbf{w}_u \right],$$
where $\mathbf{w}_i$ is the $i$th column of the matrix $\mathbf{W}$. Finally, the reduced data matrix can be calculated as
$$\boldsymbol{\Psi}_u = \boldsymbol{\Psi} \mathbf{W}_u.$$
The hereby obtained data matrix is then used in the least squares estimation as a replacement for the original data matrix $\boldsymbol{\Psi}$.
It should also be noted that when generating the actual digital cancellation signal, the cancellation data matrix must be transformed with the same matrix W, as the SI channel estimate is only valid in this transformed space.
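The PCA-based reduction described above can be sketched in Python as follows. The sketch weights each column of the data matrix by the initial coefficient estimate, takes the singular value decomposition of the weighted matrix, and forms a transformation matrix whose leading columns are kept. The specific normalization used here, W = diag(h0) V Σ^{-1}, is an assumption made for this illustration; it is chosen because it makes the transformed data matrix have orthonormal columns. Function and variable names are likewise hypothetical.

```python
import numpy as np

def pca_transform(psi, h0):
    """Normalized transformation matrix from the coefficient-weighted data.

    psi: (N, K) full data matrix
    h0 : (K,) initial LS estimate obtained at the highest transmit power
    Returns W of shape (K, K) and the singular values in decreasing order.
    """
    psi0 = psi * h0[None, :]                      # scale column k by h0[k]
    _, sing, vh = np.linalg.svd(psi0, full_matrices=False)
    v = vh.conj().T
    # Assumed normalization: W = diag(h0) V Sigma^{-1}, so that psi @ W = U.
    w = (h0[:, None] * v) / sing[None, :]
    return w, sing

def reduce_and_estimate(psi, y, w, u):
    """Keep the u most significant columns of W and re-estimate by LS."""
    psi_u = psi @ w[:, :u]
    g_hat, *_ = np.linalg.lstsq(psi_u, y, rcond=None)
    return psi_u, g_hat
```

When the cancellation signal is later regenerated from new transmit data, the new data matrix has to be multiplied by the same `w[:, :u]` before applying `g_hat`, since the estimate is only meaningful in the transformed space. The sketch also assumes that no singular value is exactly zero; a practical implementation would need to guard against that case.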
An important aspect to point out is that the transformation matrix W is calculated only once with the highest transmit power, after which it can be used with all transmit powers to reduce the number of basis functions. Namely, since the strengths of the nonlinearities are directly proportional to the transmit power, the SI terms that are negligibly weak with the highest transmit power are at least as weak with the lower transmit powers, which means that the same SI terms can be omitted also then. This is also proven by the waveform simulations, the results of which will be discussed in Section 4. However, should the SI channel change drastically at any point, then the matrix W must be recalculated to ensure that no significant memory taps are neglected.
In general, perhaps the most crucial design problem in the context of the PCA is to determine the optimal number of parameters to be included in the final model. This can be most easily determined experimentally by reducing the number of parameters until the obtained cancellation performance starts to drop. Also, the singular values in $\boldsymbol{\Sigma}$ can be used to calculate the percentage of the variance accounted for by the included basis functions. We will address this issue more closely with the help of waveform simulations in Section 4.
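One simple way to turn the singular values into a model-size choice is sketched below in Python. It uses the standard PCA relation that the variance captured by the first components is proportional to the sum of their squared singular values, and it returns the smallest number of parameters that reaches a given target fraction. The function names and the idea of a fixed target fraction are assumptions for illustration; the experimental approach of monitoring the cancellation performance remains the more direct criterion.

```python
import numpy as np

def explained_variance(sing):
    """Cumulative fraction of variance captured by the leading components."""
    var = np.asarray(sing, dtype=float) ** 2
    return np.cumsum(var) / np.sum(var)

def smallest_u(sing, target):
    """Smallest number of components whose cumulative variance >= target."""
    cum = explained_variance(sing)
    u = int(np.searchsorted(cum, target)) + 1
    return min(u, cum.size)   # guard against rounding when target is 1.0
```

For example, with singular values 2, 1, 1 the first component alone captures two thirds of the variance, so a target of 0.8 requires two components.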

Performance simulations and analysis
The evaluation of the proposed scheme is now done with realistic waveform simulations, utilizing a comprehensive inband full-duplex transceiver model. It incorporates all the relevant impairments, and thereby the SI waveform represents a real-world scenario rather well. Below, we describe the waveform simulator in detail, after which the results are shown. As an important future work item, we aim to evaluate the proposed scheme also with actual RF measurements to confirm the results obtained here with the simulations.

Simulation setup and parameters
The waveform simulations are performed with Matlab, where all the relevant aspects of the full-duplex transceiver are modeled. These include the nonlinearity of the PAs, the crosstalk between the transmitters (both before and after the PA), the multipath SI channel, the imperfect RF cancellation, the nonlinearity of the receiver, IQ imbalance, phase noise, and the quantization upon analog-to-digital conversion. The DAC/ADC nonlinearities, on the other hand, are omitted from the simulator model as well, since we have not observed them to be a significant factor in our earlier RF measurements [3,43]. This means that the simulator model is rather comprehensive and can be expected to provide realistic results, although they must still be confirmed with real-life measurements. Note that, since the focus of this work is on SI cancellation, the signal of interest is not present in any of the simulations. The RF cancellation is performed in all the cases using the transmitter output signal, since the essential signal model is not affected by the RF cancellation procedure, as shown in Section 2.2. The used waveform is a 20 MHz LTE downlink signal, which utilizes OFDM with a 4-QAM constellation. When modeling the phase noise, a common local oscillator for all the transmitters and receivers is assumed, which is a feasible assumption for an inband full-duplex device. All the relevant parameters of the waveform simulator are listed in Table 1, while the used phase noise characteristics are shown in Fig. 5.
In the forthcoming results, five different digital cancellers are considered, and they are as follows:
• Digital canceller with the full signal model in (22), including PCA processing to decrease the dimensionality and computational complexity
• Digital canceller with the full signal model in (22), but without any dimensionality reduction
• Digital canceller utilizing the N-input memory model from [26], which considers the nonlinearity of the PA and both linear and nonlinear crosstalk
• Digital canceller with the crosstalk-free signal model in (18), from [15], where both the nonlinearity of the PA and the IQ imbalance are modeled
• Digital canceller with a traditional linear signal model, where P = 1
In all the cases, the same parameter estimation sample size is used for the different cancellers with M 1 = 10 and M 2 = 20 to ensure a fair comparison. The PCA matrix is calculated using 10 000 samples in the initial channel estimation stage. Furthermore, to avoid overfitting when estimating and cancelling the SI, separate portions of the signal are used for calculating the SI channel estimate and evaluating the actual SI cancellation performance.

Results
First, the signal spectra after the different digital cancellers are shown in Fig. 6 using the default parameters, alongside the spectra of the RF cancelled signal and the receiver noise floor. It can be observed that only the digital cancellers utilizing the full signal model can obtain sufficient levels of SI cancellation. In particular, the digital cancellers utilizing the linear signal model and the nonlinear crosstalk signal model from [26] perform very poorly, since in this case IQ imbalance is the dominant source of distortion. The signal model from [15], on the other hand, has insufficient modeling accuracy since it does not take into account the crosstalk. Thereby, it is necessary to model both the IQ imbalance and the crosstalk, together with the nonlinearity of the PA, to obtain sufficient levels of digital cancellation. Furthermore, based on Fig. 6, the number of basis functions can be reduced to 35% of the original without any reduction in the cancellation performance when using the full signal model.
Fig. 6 The signal spectra after the different digital cancellers, alongside the spectra of the RF cancelled signal and the receiver noise floor
Note that in this case the phase noise has no significant effect on the residual SI power since a common local oscillator between the transmitters and receivers is assumed. This results in a certain level of self-cancellation of the phase noise upon downconversion, considerably reducing its significance [44].
Figure 7 then shows the increase in the effective noise floor due to the residual SI for the different digital cancellers, with respect to the total transmit power. In other words, the closer to 0 dB a canceller gets, the better its overall SI cancellation performance. As expected, the linear canceller is not capable of efficient cancellation even with the lowest transmit powers, whereas the nonlinear cancellers with IQ imbalance modeling suppress the SI nearly perfectly up to transmit powers of 20 dBm. Moreover, the digital canceller utilizing the nonlinear crosstalk signal model from [26] performs very poorly over the whole transmit power range since it does not model the IQ imbalance, as already discussed.
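To make the noise floor metric concrete, the following sketch computes the increase in the effective noise floor for given power levels. It assumes that the effective noise floor is the sum of the receiver noise power and the residual SI power; this definition, the function name, and the example power levels are assumptions made for illustration only.

```python
import numpy as np

def noise_floor_increase_db(residual_si_dbm, noise_floor_dbm):
    """Increase in the effective noise floor (dB) caused by residual SI,
    assuming the noise and residual SI powers add up in linear scale."""
    residual_mw = 10 ** (residual_si_dbm / 10)
    noise_mw = 10 ** (noise_floor_dbm / 10)
    return 10 * np.log10((noise_mw + residual_mw) / noise_mw)

# Hypothetical levels: residual SI equal to the noise floor gives about 3 dB,
# while residual SI 20 dB below the noise floor gives an increase close to 0 dB
increase_equal = noise_floor_increase_db(-90.0, -90.0)
increase_small = noise_floor_increase_db(-110.0, -90.0)
print(increase_equal, increase_small)
```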
With transmit powers beyond 20 dBm, the crosstalk effects begin to decrease the accuracy of the crosstalk-free nonlinear signal model from [15] as well. On the other hand, the full signal models perform relatively well even with the highest transmit powers, resulting in only a very minor increase in the noise floor. Furthermore, as observed earlier, retaining only 35% of the terms after the PCA processing does not seem to decrease the accuracy of the signal model when compared to the full signal model with all the terms included. In fact, the performance of the digital canceller with the lower transmit powers is slightly improved by the dimensionality reduction since the smaller number of parameters results in a more accurate parameter vector estimate, and hence in more efficient cancellation.
Fig. 7 The increase in the noise floor due to residual SI, with respect to the total transmit power
To investigate the PCA-based dimensionality reduction in greater detail, Fig. 8 shows the increase in the noise floor with respect to the percentage of the terms included after the PCA, when using the full signal model in (22). The performance of the case without any PCA processing is also shown for reference. It can be observed from the figure that there is a wide range of values for the percentage of included terms that provide the same cancellation performance. However, if the percentage of included terms goes significantly below 35%, the performance of the PCA-based canceller is rather poor. This is caused by the decreased accuracy of the signal model due to excluding some of the significant terms. Also note that when 50-80% of the terms are included, the PCA-based solution achieves slightly higher levels of SI cancellation than the canceller without PCA processing. The reason for this is the decreased variance of the parameter estimate, thanks to the smaller number of terms.
Fig. 8 The increase in the noise floor due to residual SI, with respect to the percentage of included terms
In order to minimize the computational complexity of the cancellation procedure, the number of included terms must obviously be minimized. Hence, the smallest number of terms that still provides the required performance is in this sense the optimal choice. Figure 8 indicates that, with the parameters considered in these simulations, the optimal percentage of included terms is roughly 35%, which corresponds to 840 coefficients with the considered nonlinearity order and number of memory taps.
Since the level of the crosstalk occurring before the transmitter PAs is obviously the most significant aspect in determining whether the full signal model is actually necessary, Fig. 9 shows the performance of the different digital cancellers with different crosstalk levels. It can be observed that, with the considered transmit power of 25 dBm, the crosstalk has a rather significant effect already at the level of −20 dB, since using the nonlinear signal model without any crosstalk modeling from [15] results in a 3 dB higher noise floor than when using the full signal models. With higher crosstalk levels, the performance difference is obviously further emphasized. Furthermore, similar to the earlier observations, the signal models that do not model the IQ imbalance perform very poorly since it is the dominant source of distortion.
Fig. 9 The increase in the noise floor due to residual SI, with respect to the level of the crosstalk before the PAs
It can also be observed from Fig. 9 that a larger number of terms is required with the very high crosstalk levels. In particular, having only 35% of the terms retained results in a somewhat higher residual SI power than retaining all of the terms. This is explained by the fact that higher crosstalk levels also result in a larger number of significantly powerful SI terms. Nevertheless, the cancellation performance differences between the full signal models, with or without PCA processing, are still relatively small with these reasonable crosstalk levels.
In order to further investigate the differences in the computational complexity of the different digital cancellers, Fig. 10 shows their performance for different parameter estimation sample sizes (N). It can be observed that the signal models without sufficient modeling accuracy are not bottlenecked by the amount of available learning data, since their performance is largely unaffected by the value of N. The benefits of the PCA-based dimensionality reduction for the full signal model are also clearly apparent, since the case with 35% of the terms retained performs relatively well even with very small parameter estimation sample sizes. As opposed to this, without any dimensionality reduction, roughly N = 24 000 is required to obtain a sufficiently accurate estimate of the parameters. Overall, it is hence clear that the PCA processing helps in significantly reducing the computational complexity of the digital SI cancellation procedure when utilizing the full signal model.

Conclusions
Fig. 10 The increase in the noise floor due to residual SI, with respect to the parameter estimation sample size (N)
In this paper, a novel digital self-interference canceller for a nonlinear MIMO inband full-duplex transceiver was presented. The canceller is based on a comprehensive signal model for the SI observed in the digital domain, which includes the effect of crosstalk occurring between the transmit chains, while also incorporating the most significant RF imperfections. Furthermore, it was also shown that the signal model is valid for various different RF cancellers. To control the complexity of the cancellation procedure, a novel principal component analysis based scheme was then proposed, which can be used to control the number of parameters in the signal model. With the help of waveform simulations, the proposed digital canceller was shown to cancel the SI nearly perfectly, even when its computational complexity was significantly reduced using principal component analysis.

Appendix: Power amplifier output signal under crosstalk
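Let us define a signal $y(n)$ as follows:

$$y(n) = \sum_{i=1}^{N} \alpha_i x_i(n)$$

where $\alpha_i$ is a scaling constant and $x_i(n)$ are known signals. To express an arbitrary integer power $p$ of the signal $y(n)$ in terms of the signals $x_i(n)$, let us expand the corresponding equation accordingly:

$$y(n)^p = \left(\sum_{i=1}^{N} \alpha_i x_i(n)\right)^{p} = \left(\alpha_1 x_1(n) + \sum_{i=2}^{N} \alpha_i x_i(n)\right)^{p}$$

Applying now the binomial theorem to the above expression, we obtain

$$y(n)^p = \sum_{k_1=0}^{p} \binom{p}{k_1} \alpha_1^{k_1} x_1(n)^{k_1} \left(\sum_{i=2}^{N} \alpha_i x_i(n)\right)^{p-k_1}$$

and continuing in a similar manner, we finally obtain the following equation:

$$y(n)^p = \sum_{k_1=0}^{p} \sum_{k_2=0}^{p-k_1} \cdots \sum_{k_{N-1}=0}^{p-k_1-\cdots-k_{N-2}} A_{k_1,\ldots,k_{N-1}} \, x_1(n)^{k_1} x_2(n)^{k_2} \cdots x_{N-1}(n)^{k_{N-1}} x_N(n)^{p-k_1-\cdots-k_{N-1}}$$

where $A_{k_1,\ldots,k_{N-1}}$ is a constant.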
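The structure of this expansion can be checked numerically with a short sketch. It assumes that y(n) is a weighted sum of N signals and that the constant of each term is the multinomial coefficient multiplied by the corresponding powers of the scaling constants; the function name and the example values are hypothetical.

```python
from itertools import product
from math import factorial

import numpy as np

def expand_power(alphas, xs, p):
    """Evaluate (sum_i alpha_i x_i)^p term by term, using N-1 free exponents
    k_1..k_{N-1}; the exponent of the last signal is whatever remains of p."""
    n_signals = len(alphas)
    total = 0.0
    for ks in product(range(p + 1), repeat=n_signals - 1):
        k_last = p - sum(ks)
        if k_last < 0:
            continue  # exponents must add up to p
        exps = ks + (k_last,)
        coeff = factorial(p)
        for k in exps:
            coeff //= factorial(k)  # multinomial coefficient, exact at each step
        # Constant of this term: multinomial coefficient times alpha powers
        A = coeff * np.prod([a ** k for a, k in zip(alphas, exps)])
        total += A * np.prod([x ** k for x, k in zip(xs, exps)])
    return total

# Hypothetical example with N = 3 signals (one time instant) and p = 3
alphas = [0.5, -1.2, 2.0]
xs = [1 + 2j, 0.3 - 0.1j, -0.7 + 0.4j]
direct = sum(a * x for a, x in zip(alphas, xs)) ** 3
expanded = expand_power(alphas, xs, 3)
print(np.isclose(direct, expanded))
```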