This section presents ILA and DLA in detail and compares the two architectures through simulations.

### 2.1 Indirect learning architecture

Figure 1 shows the block diagram of ILA. First, the postdistorter is identified. The postdistorter is usually modeled as an MP [4], and its output can be expressed as

$$ z_{p}(n)= \sum\limits_{k=0}^{K} \sum\limits_{l=0}^{L}a_{kl}\Phi_{kl}[z(n)] $$

(1)

where \(z(n)=\frac{y(n)}{G_{0}}\), \(\Phi_{kl}[z(n)]=z(n-k)|z(n-k)|^{2l}\), \(G_{0}\) is the desired gain, \(a_{kl}\) (\(k=0,\cdots,K\) and \(l=0,\cdots,L\)) are the complex-valued coefficients, \(K\) is the memory depth, and \(2L+1\) is the highest nonlinearity order. In a perfect system, \(z_{p}(n)=x(n)\) is required. The signals \(x(n)\) and \(y(n)\) are the measured input and output of the PA, respectively. Assuming that the total number of samples is \(N\), we get

$$ \mathbf{z}_{\mathbf{p}}= \mathbf{Z} \mathbf{a} $$

(2)

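where \(\mathbf{z}_{\mathbf{p}}=[z_{p}(1),\cdots,z_{p}(N)]^{T}\), \(\mathbf{Z}=[\mathbf{u}_{00},\cdots,\mathbf{u}_{0L},\cdots,\mathbf{u}_{kl},\cdots,\mathbf{u}_{K0},\cdots,\mathbf{u}_{KL}]_{N\times (K+1)(L+1)}\), \(\mathbf{u}_{kl}=[\Phi_{kl}[z(1)],\cdots,\Phi_{kl}[z(N)]]^{T}\), and \(\mathbf{a}=[a_{00},\cdots,a_{0L},\cdots,a_{kl},\cdots,a_{K0},\cdots,a_{KL}]^{T}\). With \(\mathbf{z}_{\mathbf{p}}\) set to the measured PA input samples \([x(1),\cdots,x(N)]^{T}\), as the condition \(z_{p}(n)=x(n)\) requires, the classical LS method can be used to extract the coefficients: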

$$ \hat{\mathbf{a}} = \left(\mathbf{Z}^{H}\mathbf{Z}\right)^{-1}\mathbf{Z}^{H}\mathbf{z}_{\mathbf{p}} $$

(3)

where \((\cdot)^{H}\) denotes the complex conjugate transpose. After the postdistorter has been identified, a copy of it is placed directly in front of the PA as the predistorter.
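As a minimal sketch of (1) to (3), the following NumPy code builds \(\mathbf{Z}\) column by column and solves the LS problem. The function names `mp_basis` and `fit_postdistorter` are illustrative and do not come from the cited works. Samples before the start of the record are assumed to be zero. `np.linalg.lstsq` is used instead of the explicit inverse in (3) because it solves the same least-squares problem with better numerical behavior.

```python
import numpy as np

def mp_basis(z, K, L):
    """Build Z = [u_00, ..., u_0L, ..., u_K0, ..., u_KL] from the samples z(n)."""
    z = np.asarray(z, dtype=complex)
    N = len(z)
    cols = []
    for k in range(K + 1):
        # z(n-k), assuming zero samples before the start of the record
        zk = np.concatenate([np.zeros(k, dtype=complex), z[:N - k]])
        for l in range(L + 1):
            cols.append(zk * np.abs(zk) ** (2 * l))   # Phi_kl[z(n)]
    return np.column_stack(cols)

def fit_postdistorter(x, y, G0, K, L):
    """LS estimate of a such that Z a ~= x, with Z built from z(n) = y(n)/G0."""
    Z = mp_basis(np.asarray(y) / G0, K, L)
    a, *_ = np.linalg.lstsq(Z, np.asarray(x, dtype=complex), rcond=None)
    return a

# Synthetic check: generate x from known coefficients and recover them.
rng = np.random.default_rng(0)
y = rng.standard_normal(500) + 1j * rng.standard_normal(500)
a_true = np.array([1.0, -0.1, 0.2j, 0.05, -0.03, 0.01j])   # (K+1)(L+1) = 6 values
x = mp_basis(y / 2.0, K=2, L=1) @ a_true
a_hat = fit_postdistorter(x, y, G0=2.0, K=2, L=1)
```

The column order (index \(k\) outer, \(l\) inner) follows the definition of \(\mathbf{Z}\) given above, so `a_hat` is ordered like \(\mathbf{a}\).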

### 2.2 Direct learning architecture

The block diagram of DLA is illustrated in Fig. 2. As mentioned before, determining the PD takes two steps in DLA. First, a model of the PA's behavior must be defined. Here, the MP is adopted as the PA model, which is defined as

$$ y(n)= \sum\limits_{p=0}^{P}\sum\limits_{q=0}^{Q}c_{pq}\Phi_{pq}[x(n)] $$

(4)

where \(\Phi_{pq}[x(n)]=x(n-p)|x(n-p)|^{2q}\), \(P\) and \(2Q+1\) are the memory depth and the highest nonlinearity order, respectively, and \(c_{pq}\) are the model's coefficients. The signals \(x(n)\) and \(y(n)\) are the input and output of the PA, respectively. The coefficients \(c_{pq}\) are extracted by the LS method, as described above. Second, the identified PA model is inverted to determine the PD. In the following, the method in [10] for determining the PD with DLA is described.

The ideal cascaded PD-PA system can be expressed as

$$ y(n)=\sum\limits_{p=0}^{P}\sum\limits_{q=0}^{Q}c_{pq}\Phi_{pq}[x(n)]=G_{0}u(n) $$

(5)

where *u*(*n*) denotes the input of the PD-PA system and *x*(*n*) the predistorted signal (also the input of PA). According to the memory behavior of PA, the output *y*(*n*) of PA can be represented as the sum of two parts: the static part *s*(*n*) depending only on the current input sample (*p*=0), and the dynamic part *d*(*n*) which depends on the previous input samples (*p* from 1 to *P*).

$$ \left\{ \begin{array}{l} y(n)= s(n)+d(n) \\ s(n)= \sum_{q=0}^{Q}c_{0q}\Phi_{0q}[x(n)] \\ d(n)= \sum_{p=1}^{P}\sum_{q=0}^{Q}c_{pq}\Phi_{pq}[x(n)] \end{array} \right. $$

(6)

From (5) and (6), *s*(*n*) can be rewritten as

$$ s(n)=e^{j\angle x(n)}\sum\limits_{q=0}^{Q}c_{0q}|x(n)|^{2q+1}=G_{0}u(n)-d(n) $$

(7)

where \(\angle x(n)\) and \(|x(n)|\) are the phase and amplitude of the predistorted signal \(x(n)\), respectively. Since \(d(n)\) depends by definition only on the previous samples, the right-hand side of (7) and the coefficients \(c_{0q}\) are known at instant \(n\). Taking the absolute value of both sides of (7) gives an equation for the amplitude \(|x(n)|\) of the predistorted signal:

$$ \left| \sum\limits_{q=0}^{Q}c_{0q}|x(n)|^{2q+1}\right|=\left| G_{0}u(n)-d(n)\right| $$

(8)

Equation (8) is a high-order nonlinear equation in \(|x(n)|\), so the amplitude of the predistorted signal can be found by a root-finding procedure [10]. The corresponding phase \(\angle x(n)\) is then calculated by

$$ \angle x(n)=\arg\left\{\frac{G_{0}u(n)-d(n) }{\sum_{q=0}^{Q}c_{0q}|x(n)|^{2q+1}}\right\}. $$

(9)

Finally, the predistorted signal *x*(*n*) is given by

$$ x(n)=|x(n)|e^{j\angle x(n)}. $$

(10)
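The sample-by-sample procedure of (7) to (10) can be sketched as follows. This is an illustration under stated assumptions, not the implementation of [10]. The root-finding step squares (8) into a real polynomial in \(|x(n)|\) and solves it with `np.roots`, then keeps the smallest positive real root. The names `mp_output` and `dla_predistort`, the zero initial state, and the toy coefficients are all hypothetical.

```python
import numpy as np

def mp_output(x, c):
    """PA model (4) with zero initial state; c has shape (P+1, Q+1)."""
    x = np.asarray(x, dtype=complex)
    y = np.zeros(len(x), dtype=complex)
    for p in range(c.shape[0]):
        xp = np.concatenate([np.zeros(p, dtype=complex), x[:len(x) - p]])
        for q in range(c.shape[1]):
            y += c[p, q] * xp * np.abs(xp) ** (2 * q)
    return y

def dla_predistort(u, c, G0, tol=1e-8):
    """Solve (7)-(10) sample by sample for the predistorted signal x(n)."""
    c = np.asarray(c, dtype=complex)
    P, Q = c.shape[0] - 1, c.shape[1] - 1
    powers = 2 * np.arange(Q + 1)
    # f(r) = sum_q c_0q r^(2q+1), coefficients in ascending powers of r
    f = np.zeros(2 * Q + 2, dtype=complex)
    f[1::2] = c[0]
    g = np.convolve(f, np.conj(f)).real       # coefficients of |f(r)|^2 for real r
    x = np.zeros(len(u), dtype=complex)
    for n in range(len(u)):
        # dynamic part d(n): uses only already-computed samples x(n-p), p >= 1
        d = sum(np.sum(c[p] * x[n - p] * np.abs(x[n - p]) ** powers)
                for p in range(1, min(P, n) + 1))
        t = G0 * u[n] - d                      # right-hand side of (7)
        if t == 0:
            continue                           # x(n) = 0 already satisfies (7)
        poly = g.copy()
        poly[0] -= abs(t) ** 2                 # (8) squared: |f(r)|^2 - |t|^2 = 0
        roots = np.roots(poly[::-1])           # np.roots expects descending powers
        ok = (np.abs(roots.imag) < tol * (1 + np.abs(roots))) & (roots.real > 0)
        if not ok.any():
            raise ValueError(f"no positive real amplitude solves (8) at n={n}")
        r = roots[ok].real.min()               # smallest amplitude solving (8)
        fr = np.sum(c[0] * r ** (powers + 1))
        x[n] = r * np.exp(1j * np.angle(t / fr))   # phase (9), then (10)
    return x

# Hypothetical PA coefficients (P = 1, Q = 1) and a small test signal.
c = np.array([[10.0, -1.0 + 0.5j], [0.5, -0.05]])
rng = np.random.default_rng(1)
u = 0.2 * rng.uniform(0.1, 1.0, 200) * np.exp(2j * np.pi * rng.uniform(size=200))
x = dla_predistort(u, c, G0=10.0)
y = mp_output(x, c)                            # should reproduce G0 * u(n), as in (5)
```

When the requested output amplitude exceeds what the static part of the model can deliver, (8) has no positive real root. The sketch raises an error in that case rather than guessing a fallback.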

### 2.3 Comparison of ILA and DLA

Two different DPD architectures (ILA and DLA) were presented in the preceding subsections. Both need to identify a pre-assumed model (the MP model) from the measured input and output of the PA, and the LS method is used for this identification. From (3), it can be seen that the LS algorithm requires the inverse of \(\mathbf{Z}^{H}\mathbf{Z}\). When solving such an inverse problem, the condition number is commonly used to measure how sensitive the solution is to changes or errors in the input. A problem with a low condition number is said to be well-conditioned, while a problem with a high condition number is said to be ill-conditioned.

In this subsection, the condition number of \(\mathbf{Z}^{H}\mathbf{Z}\) is used to assess the stability of the two DPD architectures. In ILA, \(\mathbf{Z}\) is built from the output signal of the PA, whereas in DLA it is built from the input signal of the PA. A simulation is carried out to compare the two architectures. In this simulation, a Wiener model is used as the PA model. It is implemented as a three-tap FIR filter with coefficients [0.7692, 0.1538, 0.0769] [25], followed by a Saleh model. The Saleh model is described as

$$ y(n)=\frac{\alpha_{a}|v(n)|}{1+\beta_{a}|v(n)|^{2}}e^{j\left[\angle v(n)+\frac{\alpha_{p}|v(n)|^{2}}{1+\beta_{p}|v(n)|^{2}}\right]} $$

(11)

where \(y(n)\) and \(v(n)\) are the output and input of the Saleh model, respectively, and \(\alpha_{a}=20\), \(\beta_{a}=2.2\), \(\alpha_{p}=2\), \(\beta_{p}=1\) [26]. The input signal of the PA is a WCDMA signal with 3.84 MHz bandwidth. The measured sequence has 1000 symbols (8000 samples). The ideal gain of this PA model is about 26 dB. The input power at the 1 dB compression point is about −1 dBm.
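A minimal sketch of this Wiener PA model is given below. It assumes that the exponent in (11) is the phase of \(v(n)\) plus the amplitude-dependent phase shift, that the filter starts from a zero state, and that the signal is expressed in linear amplitude units (the mapping from dBm to amplitude is not specified here). The function name `wiener_saleh_pa` is illustrative.

```python
import numpy as np

def wiener_saleh_pa(x, h=(0.7692, 0.1538, 0.0769),
                    alpha_a=20.0, beta_a=2.2, alpha_p=2.0, beta_p=1.0):
    """Three-tap FIR filter followed by the Saleh nonlinearity of (11)."""
    x = np.asarray(x, dtype=complex)
    v = np.convolve(x, h)[:len(x)]                           # FIR output, zero initial state
    r = np.abs(v)
    amplitude = alpha_a * r / (1 + beta_a * r ** 2)          # AM/AM characteristic
    phase_shift = alpha_p * r ** 2 / (1 + beta_p * r ** 2)   # AM/PM characteristic, radians
    return amplitude * np.exp(1j * (np.angle(v) + phase_shift))

# Small-signal gain once the filter has settled; close to the 26 dB quoted above.
x_small = np.full(16, 1e-4 + 0j)
gain_db = 20 * np.log10(np.abs(wiener_saleh_pa(x_small)[-1]) / 1e-4)
```

For small inputs the nonlinearity reduces to a gain of \(\alpha_{a}=20\), and the filter taps sum to almost 1, which is where the ideal gain of about 26 dB comes from.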

The condition number of \(\mathbf{Z}^{H}\mathbf{Z}\) is determined by \(\mathbf{Z}\), which depends on the memory depth, the nonlinearity order, and the value of each element (which varies with the input signal power). In the first simulation, we assume that the memory depth is 3 (\(K=P=3\)) and the nonlinearity order is 5 (\(L=Q=2\)). The input signal power varies from −20 to 5 dBm. The condition numbers of \(\mathbf{Z}^{H}\mathbf{Z}\) for the two DPD architectures are compared, and the results are shown in Fig. 3.

In the second simulation, we assume that the input signal power is 0 dBm. The nonlinearity order varies from 3 to 15. The memory depth varies from 0 to 3. The comparison of two DPD architectures is shown in Fig. 4.
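The quantity compared in both simulations can be computed as sketched below. The helper names are hypothetical, a zero initial state is assumed, and the test signal is a complex Gaussian surrogate passed through a toy memoryless compression instead of the WCDMA signal and Wiener model used here. The numbers it produces are therefore not those of Figs. 3 and 4.

```python
import numpy as np

def mp_basis(s, depth, order_index):
    """Columns s(n-m)|s(n-m)|^(2r) for m = 0..depth, r = 0..order_index."""
    s = np.asarray(s, dtype=complex)
    cols = []
    for m in range(depth + 1):
        sm = np.concatenate([np.zeros(m, dtype=complex), s[:len(s) - m]])
        cols += [sm * np.abs(sm) ** (2 * r) for r in range(order_index + 1)]
    return np.column_stack(cols)

def gram_condition_numbers(x, y, G0, depth, order_index):
    """Return the 2-norm condition number of Z^H Z for (ILA, DLA)."""
    Z_ila = mp_basis(np.asarray(y) / G0, depth, order_index)   # built from the PA output
    Z_dla = mp_basis(x, depth, order_index)                    # built from the PA input
    cond = lambda Z: np.linalg.cond(Z.conj().T @ Z)
    return cond(Z_ila), cond(Z_dla)

rng = np.random.default_rng(0)
x = 0.3 * (rng.standard_normal(4000) + 1j * rng.standard_normal(4000))
y = 20 * x / (1 + 2.2 * np.abs(x) ** 2)       # toy compression, not the Wiener model
cond_ila, cond_dla = gram_condition_numbers(x, y, G0=20.0, depth=3, order_index=2)
```

Sweeping the scale of `x`, or `depth` and `order_index`, reproduces the structure of the first and second simulations, respectively.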

From Figs. 3 and 4, it can be seen that the condition number of \(\mathbf{Z}^{H}\mathbf{Z}\) in DLA is lower than that in ILA in every simulated case. This indicates that DLA is more robust than ILA with respect to perturbations. To further validate this result, the influence of the noise and of the ADC quantization error on the modeling performance is analyzed.

In practical measurements, an ADC is required for signal acquisition. The number of ADC bits affects the accuracy of the acquisition and hence the performance of the model identification. In addition, the measurement at the output of the PA is noisy in practice, and this noise also affects the model identification. In the third simulation, the two DPD architectures are compared in terms of the number of ADC bits used in signal acquisition and the noise power at the output of the PA.

To analyze the influence of the noise at the output of the PA, complex white Gaussian noise is assumed to be present at the output of the PA, as shown in Figs. 1 and 2. An original WCDMA signal \(x^{\prime}(n)\) (bandwidth 3.84 MHz, power 0 dBm, sequence length 8000 samples) is applied as the PA input, and the PA output \(y^{\prime}(n)\) is obtained from the Wiener model. White Gaussian noise is added to \(y^{\prime}(n)\) to obtain the noisy output \(y^{\prime}_{noise}(n)\). The input \(x^{\prime}(n)\) and the output \(y^{\prime}_{noise}(n)\) of the PA are used for the model identification of (1) and (4). In this process, we take \(K=P=3\) (memory depth = 3) and \(L=Q=2\) (nonlinearity order = 5).

After the model identification, another WCDMA signal \(x^{\prime\prime}(n)\) is used to test the identified model. For the input \(x^{\prime\prime}(n)\), the corresponding PA output \(y^{\prime\prime}(n)\) is obtained. For ILA, the signal \(y^{\prime\prime}(n)\) is the input of the postdistorter, and its output \(z^{\prime \prime }_{p}(n)\) is obtained from the identified model (1). The normalized mean squared error (NMSE) between \(z^{\prime \prime }_{p}(n)\) and \(x^{\prime\prime}(n)\) is taken as the criterion of ILA modeling accuracy. For DLA, the signal \(x^{\prime\prime}(n)\) is the input of the identified model (4), and the corresponding output \(y^{\prime \prime }_{DLA}(n)\) is obtained. The NMSE between \(y^{\prime \prime }_{DLA}(n)\) and \(y^{\prime\prime}(n)\) is taken as the criterion of DLA modeling accuracy. Figure 5 shows the performance of ILA and DLA in terms of NMSE when noise is present. The modeling accuracy of both ILA and DLA improves as the noise power decreases. When the noise power is high, DLA performs better than ILA. When the noise power is low (the nearly ideal case), the two perform similarly. In practical applications the noise cannot be ignored, so DLA is more robust than ILA when noise is present.
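The noise injection and the NMSE criterion can be sketched as follows. The NMSE is written here in its common form, error energy divided by reference energy and expressed in dB, which is assumed rather than taken from the text. The noise level is set through an SNR parameter for convenience. The function names are illustrative.

```python
import numpy as np

def add_noise(y, snr_db, rng):
    """Add complex white Gaussian noise at the given SNR (signal power / noise power)."""
    y = np.asarray(y, dtype=complex)
    noise_power = np.mean(np.abs(y) ** 2) / 10 ** (snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal(len(y))
                                        + 1j * rng.standard_normal(len(y)))
    return y + noise

def nmse_db(estimate, reference):
    """NMSE in dB: sum |estimate - reference|^2 / sum |reference|^2."""
    estimate = np.asarray(estimate, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    err = np.sum(np.abs(estimate - reference) ** 2)
    return 10 * np.log10(err / np.sum(np.abs(reference) ** 2))

rng = np.random.default_rng(0)
y_clean = rng.standard_normal(20000) + 1j * rng.standard_normal(20000)
y_noisy = add_noise(y_clean, snr_db=30.0, rng=rng)
```

For ILA the function would be called as `nmse_db(z_p, x)` and for DLA as `nmse_db(y_dla, y)`, with the arrays holding the test-signal quantities defined above.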

To study the influence of the number of ADC bits, an ADC is assumed for the signal acquisition. The signals \(x^{\prime}(n)\) and \(y^{\prime}(n)\) pass through the ADC and are denoted by \(x^{\prime}_{ADC}(n)\) and \(y^{\prime}_{ADC}(n)\), respectively. These two signals are used for the model identification of (1) and (4). The same signal \(x^{\prime\prime}(n)\) is used to test the identified model. The comparison principle is the same as in the noise analysis. Figure 6 shows the performance of ILA and DLA in terms of NMSE for a varying number of ADC bits. The modeling accuracy of both ILA and DLA improves as the number of ADC bits increases, and the modeling accuracy of DLA is better than that of ILA for every number of bits tested.
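The ADC model is not specified in detail here. As one possible stand-in, the sketch below uses a uniform mid-rise quantizer applied separately to the real and imaginary parts, with the full scale set to the largest component magnitude of the signal. Both choices, and the name `quantize`, are assumptions.

```python
import numpy as np

def quantize(x, bits, full_scale=None):
    """Uniform mid-rise quantizer with 2**bits levels per component (I and Q)."""
    x = np.asarray(x, dtype=complex)
    if full_scale is None:
        # Assumption: the ADC range just covers the signal (must be nonzero)
        full_scale = max(np.abs(x.real).max(), np.abs(x.imag).max())
    step = 2 * full_scale / 2 ** bits
    def q(v):
        idx = np.clip(np.floor(v / step), -2 ** (bits - 1), 2 ** (bits - 1) - 1)
        return (idx + 0.5) * step          # reconstruct at the center of each cell
    return q(x.real) + 1j * q(x.imag)

rng = np.random.default_rng(0)
x = rng.standard_normal(8000) + 1j * rng.standard_normal(8000)
x_adc = quantize(x, bits=8)
```

With this quantizer the error in each component never exceeds half a step, so each additional bit halves the worst-case acquisition error.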

The above analysis further verifies that DLA is more robust than ILA with respect to perturbations. Finally, two DPDs based on the different learning architectures are compared for validation. The DPD proposed in [7] is taken as the example of ILA, and the DPD proposed in [10] as the example of DLA. In this simulation, both the number of ADC bits and the noise are taken into account in the identification process. We assume that the ADC has 8 bits and that the SNR at the output of the PA is 35 dB. The average power of the input signal varies from −13 to −3 dBm. At an average power of −3 dBm, many samples of the input signal enter the nonlinear region of the PA model. Figure 7 illustrates the distribution of the input samples when the average power is −3 dBm. It shows that about 40 % of the input samples have a power higher than the input power at the 1 dB compression point of the PA model (−1 dBm).

The original input \(x^{\prime}(n)\) and output \(y^{\prime}(n)\) of the PA first pass through the ADC, and the noise is then added to the PA output signal. The processed input and output signals are used for the model identification of (1) and (4). For ILA, the identified postdistorter model is placed directly in front of the PA as the predistorter [7]. For DLA, the predistorter is determined by inverting the identified PA model [10].

The linearized outputs of the two learning architectures are obtained with the DPD solutions described above. The adjacent channel power ratio (ACPR) and the EVM of the linearized outputs are calculated for both architectures. Figure 8 shows the ACPR of the linearized outputs as a function of the input power, and Fig. 9 shows the corresponding EVM. The linearization performance of DLA is much better than that of ILA when the number of ADC bits and the noise at the output of the PA are taken into account. Consequently, DPD based on DLA is adopted in this paper.
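How ACPR and EVM are computed is not detailed here, so the sketch below is only one plausible way to evaluate them. The ACPR is estimated from a windowed periodogram, with a 5 MHz adjacent-channel offset and a sampling rate supplied by the caller, both of which are assumptions. The EVM is approximated at the sample level as the RMS error between the gain-normalized output and the reference, whereas a symbol-level EVM would require demodulation.

```python
import numpy as np

def acpr_db(y, fs, bandwidth=3.84e6, offset=5e6):
    """Worst-side adjacent-channel power relative to main-channel power, in dB."""
    y = np.asarray(y, dtype=complex)
    spectrum = np.fft.fft(y * np.hanning(len(y)))
    freqs = np.fft.fftfreq(len(y), d=1 / fs)
    psd = np.abs(spectrum) ** 2
    def band_power(fc):
        return psd[np.abs(freqs - fc) <= bandwidth / 2].sum()
    adjacent = max(band_power(-offset), band_power(offset))
    return 10 * np.log10(adjacent / band_power(0.0))

def evm_percent(y, reference, G0):
    """Sample-level EVM in percent between y/G0 and the reference signal."""
    y = np.asarray(y, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    err = np.sum(np.abs(y / G0 - reference) ** 2)
    return 100 * np.sqrt(err / np.sum(np.abs(reference) ** 2))

# Two-tone sanity check: a unit tone in the main channel and a tone 20 dB
# weaker in the upper adjacent channel (fs chosen as 8 x 3.84 MHz).
fs = 30.72e6
n = np.arange(3072)
y_test = 1.0 + 0.1 * np.exp(2j * np.pi * 5e6 * n / fs)
acpr_test = acpr_db(y_test, fs)
```

The sampling rate must be high enough for both adjacent channels to lie inside the sampled band, which holds for the value used in the check.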