
A general neuro-space mapping technique for microwave device modeling

Abstract

Accurate modeling of nonlinear microwave devices is critical for the reliable design of microwave circuits and systems. In this paper, a more general neuro-space mapping (Neuro-SM) method is proposed to meet the needs of increasing modeling complexity. The proposed technique retains the capability of the existing dynamic Neuro-SM to modify the dynamic voltage relationship between the coarse model and the desired model. In addition to voltage mapping, the proposed Neuro-SM also considers dynamic current mapping. In this way, the proposed Neuro-SM generalizes the previously published Neuro-SM methods and has the potential to produce a more accurate model of microwave devices with more dynamics and nonlinearity. A new formulation and a new sensitivity analysis technique are derived to train the general Neuro-SM with DC, small-signal, and large-signal data. A new gradient-based training algorithm is also proposed to speed up the training. The validity and efficiency of the general Neuro-SM method are demonstrated through a modeling example of a real 2 × 50 μm GaAs pseudomorphic high-electron mobility transistor (pHEMT). The proposed general Neuro-SM model can be conveniently implemented in circuit simulators.

1 Introduction

Microwave transistors are key components in next-generation wireless communication systems [1,2,3,4], such as cognitive multiple-input multiple-output (MIMO) systems [5,6,7] and cognitive relay networks [8, 9]. With the increasing complexity of communication circuits and system structures, designers rely more heavily on computer-aided design (CAD) software to achieve efficient design. Microwave device models are essential to CAD software, and their accuracy can determine whether a communication circuit and system design succeeds. Due to rapid technology development in the semiconductor industry, new microwave devices constantly arrive. Models suitable for previous devices may not fit new devices well, so there is an ongoing need for new accurate models.

In recent years, the neuro-space mapping (Neuro-SM) technique [10], which combines artificial neural networks [11] with space mapping [12], has been recognized in microwave device modeling for its good efficiency and accuracy. In Neuro-SM, neural networks are used to automatically map and modify an existing equivalent circuit model, also called the coarse model, to a desired/accurate model through a process named training. To meet the needs of increasing modeling complexity and the industry's demand for tighter accuracy, several improvements on the basis of [10] were subsequently studied to enhance modeling accuracy and efficiency, such as Neuro-SM with the output mapping [13], dynamic Neuro-SM [14], and analytical Neuro-SM with sensitivity analysis [15]. Neuro-SM with the output mapping [13] incorporates a new output/current mapping for modeling microwave devices. Compared to the Neuro-SM presented in [10], it is more suitable for modeling devices with stronger nonlinearity because the output mapping neural network provides additional useful degrees of freedom. Dynamic Neuro-SM [14] was introduced to accurately model nonlinear devices that have higher order dynamic effects (e.g., capacitive or non-quasi-static effects) than the coarse model. However, when the device being modeled has both stronger nonlinearity and higher order dynamics, the match between the trained Neuro-SM models and the device data may still not be good enough, even when the existing Neuro-SM methods [13, 14] are used to map the coarse model towards the device data. More effective Neuro-SM methods need to be investigated to overcome the accuracy limit of the Neuro-SM presented in [13, 14].

In this paper, we propose a more general Neuro-SM approach that, for the first time, includes both static and dynamic mapping and considers both voltage mapping and current mapping. This paper is a further expansion of the work in [13, 14]. Compared to [13], where only static mapping is used, the proposed technique includes dynamic mapping and is therefore more suitable for modeling nonlinear devices with higher order dynamic effects and non-quasi-static effects that may be missing in the coarse model. Compared to [14], the general Neuro-SM considers not only input voltage mapping but also output current mapping, further refining the existing coarse model. In this way, a well-trained general Neuro-SM model can represent the dynamic behavior and large-signal nonlinearity of microwave devices more accurately than the coarse model, the Neuro-SM model with the output mapping [13], and the dynamic Neuro-SM model [14]. The modeling results of a real 2 × 50 μm GaAs pseudomorphic high-electron mobility transistor (pHEMT) demonstrate the correctness and validity of the proposed general Neuro-SM technique.

2 Concept of the general Neuro-SM model

Suppose the existing equivalent circuit model is a rough approximation of the behavior of the microwave device. We call this existing model the coarse model, and we call the desired model that accurately matches the device data the fine model. Taking field effect transistor (FET) modeling as an example, let the gate and drain voltages and currents of the coarse model be defined as \( {\boldsymbol{v}}_c={\left[{v}_{c1},{v}_{c2}\right]}^T \) and \( {\boldsymbol{i}}_c={\left[{i}_{c1},{i}_{c2}\right]}^T \), respectively. Let the terminal voltages and currents of the fine model be \( {\boldsymbol{v}}_f={\left[{v}_{f1},{v}_{f2}\right]}^T \) and \( {\boldsymbol{i}}_f={\left[{i}_{f1},{i}_{f2}\right]}^T \), respectively.

Let the number of voltage delay buffers at the gate and at the drain be the same, both equal to \( {N}_v \). Let τ be the time delay parameter. To represent time-domain behavior, the time parameter t is introduced. Figure 1 illustrates the signal flow of the general Neuro-SM model. First, the present voltages of the fine model \( {\boldsymbol{v}}_f(t) \) and their history \( {\boldsymbol{v}}_f\left(t-\tau \right) \), \( {\boldsymbol{v}}_f\left(t-2\tau \right) \), …, \( {\boldsymbol{v}}_f\left(t-{N}_v\tau \right) \) are mapped into the coarse model voltages \( {\boldsymbol{v}}_c(t) \). Because the formula of the mapping is unknown and usually nonlinear, a neural network is used to learn and represent the mapping. While the Neuro-SM presented in [10] uses a static neural network such as a multilayer perceptron (MLP), we propose to use a time delay neural network (TDNN) to map the coarse model to the fine model. In functional form, \( {\boldsymbol{v}}_c(t) \) can be described as

$$ {\boldsymbol{v}}_c(t)={\boldsymbol{f}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right),{N}_v\ge 0\kern0.2em $$
(1)

where \( {\boldsymbol{f}}_{\mathrm{ANN}} \) represents the input/voltage mapping neural network, and \( {\boldsymbol{w}}_1 \) is a vector containing all the weights of the input mapping neural network. As seen from Eq. (1), the voltages at the gate and drain of the coarse model depend not only on the present voltages of the fine model but also on their history, which makes the proposed technique more suitable for modeling the dynamic behavior of nonlinear devices. After the coarse model computation, the coarse model currents \( {\boldsymbol{i}}_c(t) \) are obtained. Let the number of current delay buffers at the gate and at the drain be the same, both equal to \( {N}_c \). Finally, \( {\boldsymbol{i}}_c(t) \) and their history \( {\boldsymbol{i}}_c\left(t-\tau \right) \), …, \( {\boldsymbol{i}}_c\left(t-{N}_c\tau \right) \), together with the present voltages of the fine model \( {\boldsymbol{v}}_f(t) \), are mapped by another TDNN to the external currents as

$$ {\boldsymbol{i}}_f(t)={\boldsymbol{h}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right),{N}_c\ge 0\kern0.1em $$
(2)

where hANN represents the output/current neural network, and vector w2 contains all the output mapping neural network weights. Compared to [14], the new output neural network mapping further refines the coarse model current signals to produce the fine model outputs. The combined dynamic voltage mapping neural network, coarse model, and dynamic current mapping neural network is called the general Neuro-SM model.
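The signal flow of Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `neuro_sm_response`, the choice of a sampling step equal to τ, the clamping of the history at the first sample, and the toy mappings and coarse model in the usage lines are all assumptions made for illustration.

```python
import numpy as np

def neuro_sm_response(vf, coarse, f_ann, h_ann, n_v, n_c):
    """Evaluate i_f over a sampled waveform whose sampling step is assumed to equal tau.

    vf     : (T, 2) array of fine-model gate/drain voltages
    coarse : callable v_c -> i_c (the existing coarse model)
    f_ann  : callable taking [v_f(t), v_f(t - tau), ..., v_f(t - n_v*tau)] stacked
    h_ann  : callable taking [v_f(t), i_c(t), ..., i_c(t - n_c*tau)] stacked
    """
    n_samples = len(vf)
    ic = np.zeros((n_samples, 2))
    i_f = np.zeros((n_samples, 2))
    for n in range(n_samples):
        # Eq. (1): present and delayed fine voltages -> coarse voltages.
        # History before the first sample is clamped to the first sample (assumption).
        v_taps = np.concatenate([vf[max(n - k, 0)] for k in range(n_v + 1)])
        vc = f_ann(v_taps)
        ic[n] = coarse(vc)
        # Eq. (2): fine voltages plus present and delayed coarse currents -> fine currents.
        i_taps = np.concatenate([ic[max(n - l, 0)] for l in range(n_c + 1)])
        i_f[n] = h_ann(np.concatenate([vf[n], i_taps]))
    return i_f

# Usage with hypothetical stand-ins: unit mappings reproduce the coarse model.
toy_coarse = lambda vc: np.array([1e-3 * vc[0], 0.05 * np.tanh(vc[1])])
unit_f = lambda taps: taps[:2]   # v_c(t) = v_f(t)
unit_h = lambda z: z[2:4]        # i_f(t) = i_c(t)
```

With unit mappings the response reduces to the coarse model itself, which is a convenient starting point before training changes the weights.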

Fig. 1 Signal flow of the general Neuro-SM model

The proposed general Neuro-SM is more general than the Neuro-SM techniques presented in [10, 13, 14]. When \( {N}_v=0 \), the general Neuro-SM model without the output mapping is the static Neuro-SM model [10]. When \( {N}_v=0 \) and \( {N}_c=0 \), the general Neuro-SM model reduces to the Neuro-SM model with the output mapping [13]. When \( {N}_v>0 \), the general Neuro-SM model without the output mapping is the dynamic Neuro-SM model [14]. In this way, the proposed general Neuro-SM generalizes the previously published Neuro-SM techniques. Furthermore, when \( {N}_v>0 \) and \( {N}_c>0 \), a new Neuro-SM technique is obtained that is presented here for the first time. Compared to the Neuro-SM introduced in [10, 13, 14], the new Neuro-SM is more suitable for modeling microwave devices with high order dynamics and nonlinearity because it includes dynamic mapping as well as current mapping.
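As a quick check of the static special case, substituting \( {N}_v=0 \) and \( {N}_c=0 \) into Eqs. (1) and (2) removes every delayed argument, so the two mappings depend only on present-time signals:

$$ {\boldsymbol{v}}_c(t)={\boldsymbol{f}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f(t),{\boldsymbol{w}}_1\right),\kern1em {\boldsymbol{i}}_f(t)={\boldsymbol{h}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{w}}_2\right) $$

Removing the output mapping, i.e., taking \( {\boldsymbol{i}}_f(t)={\boldsymbol{i}}_c(t) \), leaves only the first relation, which is the input-mapping-only form.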

3 Proposed analytical formulation of the general Neuro-SM model for training

The general Neuro-SM model will not be accurate unless the dynamic voltage and dynamic current mapping neural networks are trained suitably. To train the general Neuro-SM efficiently with typical types of transistor modeling data, the relationship between the dynamic voltage and current mapping neural networks and those data, such as DC, bias-dependent S-parameter, and large-signal harmonic data, needs to be derived.

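In the DC case, the present voltage signals of the fine model \( {\boldsymbol{v}}_f(t) \) and their history, i.e., \( {\boldsymbol{v}}_f\left(t-\tau \right) \), …, \( {\boldsymbol{v}}_f\left(t-{N}_v\tau \right) \), are all equal and defined as \( {\boldsymbol{V}}_{f,\mathrm{DC}} \). Similarly, the present current signals of the coarse model \( {\boldsymbol{i}}_c(t) \) and their history, i.e., \( {\boldsymbol{i}}_c\left(t-\tau \right) \), …, \( {\boldsymbol{i}}_c\left(t-{N}_c\tau \right) \), are all equal and defined as \( {\boldsymbol{I}}_{c,\mathrm{DC}} \). The response of the general Neuro-SM model at \( {\boldsymbol{V}}_{f,\mathrm{DC}} \) can be generally described as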

$$ {\boldsymbol{I}}_{f,\mathrm{DC}}={\boldsymbol{I}}_f\left({\boldsymbol{V}}_{f,\mathrm{DC}}\right)={\boldsymbol{h}}_{\mathrm{ANN}}\left({\boldsymbol{V}}_{f,\mathrm{DC}},\overset{N_c+1}{\overbrace{{\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},{\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},\dots, {\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}}}},{\boldsymbol{w}}_2\right)\kern0.1em $$
(3)

where

$$ {\boldsymbol{V}}_{c,\mathrm{DC}}={\boldsymbol{f}}_{\mathrm{ANN}}\left(\overset{N_v+1}{\overbrace{{\boldsymbol{V}}_{f,\mathrm{DC}},{\boldsymbol{V}}_{f,\mathrm{DC}},\dots, {\boldsymbol{V}}_{f,\mathrm{DC}}}},{\boldsymbol{w}}_1\right)\kern0.1em $$
(4)

The small-signal S parameter of the general Neuro-SM model can be calculated by transforming its Y parameters Y f , which can be obtained by mapping Y parameters of the coarse model Y c . In functional form, Y f can be described as

$$ {\displaystyle \begin{array}{l}{\boldsymbol{Y}}_f\left(\omega \right)\\ {}={\left({\left.\sum \limits_{l=0}^{N_c}\operatorname{}{e}^{- j\omega l\tau}\cdot \frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t- l\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.\operatorname{}{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\cdots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\\ {}\kern0.5em \cdot \operatorname{}{\boldsymbol{Y}}_c\left(\omega \right){\left|{}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\cdot \Big(\sum \limits_{k=0}^{N_v}{e}^{- j\omega k\tau}\cdot \operatorname{}\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t- k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\cdots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\Big){}^T\\ {}\kern0.5em +{\left({\left.\operatorname{}\frac{\partial {\boldsymbol{h}}_{\boldsymbol{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{v}}_f(t)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.\operatorname{}{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\cdots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\kern7.299995em \end{array}} $$
(5)

where

$$ {\boldsymbol{V}}_{c,\mathrm{Bias}}={\boldsymbol{f}}_{\mathrm{ANN}}\left(\overset{N_v+1}{\overbrace{{\boldsymbol{V}}_{f,\mathrm{Bias}},{\boldsymbol{V}}_{f,\mathrm{Bias}},\dots, {\boldsymbol{V}}_{f,\mathrm{Bias}}}},{\boldsymbol{w}}_1\right)\kern0.1em $$
(6)

where the first-order derivatives of \( {\boldsymbol{f}}_{\mathrm{ANN}} \) and \( {\boldsymbol{h}}_{\mathrm{ANN}} \) can be obtained at the bias \( {\boldsymbol{V}}_{f,\mathrm{Bias}} \) using the adjoint neural network method [15]. The indices k and l denote the voltage and current delay buffers, respectively. Equation (5) consists of two parts. The first part is a product of three matrices: the output/current Y-mapping matrix, i.e., the sum of products of \( {e}^{-j\omega l\tau} \) and \( \partial {\boldsymbol{h}}_{\mathrm{ANN}}/\partial {\boldsymbol{i}}_c\left(t-l\tau \right) \); the Y-parameter matrix of the coarse model \( {\boldsymbol{Y}}_c \); and the input/voltage Y-mapping matrix, i.e., the sum of products of \( {e}^{-j\omega k\tau} \) and \( \partial {\boldsymbol{f}}_{\mathrm{ANN}}/\partial {\boldsymbol{v}}_f\left(t-k\tau \right) \). The second part is the sensitivity matrix of \( {\boldsymbol{h}}_{\mathrm{ANN}} \) with respect to \( {\boldsymbol{v}}_f(t) \). Equation (5) is more general than the small-signal Y-parameter formulas of the Neuro-SM models in [10, 13, 14] because it accounts for the new effects of current mapping and dynamic mapping.

For the large-signal case, we need to derive the relationship between harmonic balance (HB) computation and the dynamic voltage and current mapping neural networks so that model training can be performed with harmonic data. Let the harmonic currents of the general Neuro-SM model and the coarse model at a generic harmonic frequency \( {\omega}_k \) be \( {\boldsymbol{I}}_f\left({\omega}_k\right) \) and \( {\boldsymbol{I}}_c\left({\omega}_k\right) \), respectively. Then \( {\boldsymbol{I}}_f\left({\omega}_k\right) \) can be evaluated as

$$ {\displaystyle \begin{array}{l}{\boldsymbol{I}}_f\left({\omega}_k\right)\\ {}{\left.=\frac{1}{N_T}\sum \limits_{n=0}^{N_T-1}{\boldsymbol{h}}_{\mathrm{ANN}}\Big({\boldsymbol{v}}_f\left({t}_n\right),{\boldsymbol{i}}_c\left({t}_n\right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{i}}_c\left({t}_n-\tau \right){\left|{}_{{\boldsymbol{v}}_c\left({t}_n\right)},\dots, {\boldsymbol{i}}_c\left({t}_n-{N}_c\tau \right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{w}}_2\Big)\cdot {W}_N\left(n,k\right)\end{array}} $$
(7)

where

$$ {\boldsymbol{v}}_c\left({t}_n\right)={\boldsymbol{f}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f\left({t}_n\right),{\boldsymbol{v}}_f\left({t}_n-\tau \right),\dots, {\boldsymbol{v}}_f\left({t}_n-{N}_v\tau \right),{\boldsymbol{w}}_1\right) $$
(8)
$$ {\boldsymbol{v}}_f\left({t}_n- m\tau \right)=\sum \limits_{k=0}^{N_H}{\boldsymbol{V}}_f\left({\omega}_k\right)\cdot {e}^{- jm{\omega}_k\tau}\cdot {W}_N^{\ast}\left(n,k\right),m=0,1,\dots, {N}_v $$
(9)

where the subscript k represents the index of the harmonic frequency, k = 0, 1, 2, …, \( {N}_H \), and \( {N}_H \) is the number of harmonics considered in the HB simulation. \( {N}_T \) is the number of time sampling points, \( {W}_N\left(n,k\right) \) is the Fourier coefficient for the n-th time sample and the k-th harmonic, the superscript * denotes the complex conjugate, and m is the index of the voltage delay buffers, m = 0, 1, …, \( {N}_v \). As seen from Eqs. (7) to (9), apart from changing the nonlinearity of the coarse model, the dynamic voltage and current neural network mappings can also change the dynamic order, so the proposed general Neuro-SM has the potential to model microwave devices with high order dynamics and nonlinearity.
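The two Fourier operations in Eqs. (7) and (9) can be sketched in NumPy as follows. The sketch makes several assumptions that the text does not state: harmonics are multiples of a fundamental `omega0`, the Fourier coefficient is taken as \( {W}_N\left(n,k\right)={e}^{-j2\pi nk/{N}_T} \), and real waveforms are recovered as the real part of a single-sided phasor sum. The function names are hypothetical.

```python
import numpy as np

def delayed_waveform(V, omega0, tau, m, t):
    """Eq. (9)-style reconstruction of v_f(t - m*tau) from harmonic phasors V[k].

    Assumes omega_k = k * omega0 and the single-sided convention
    v(t) = Re( sum_k V[k] * exp(j * omega_k * t) ).
    """
    k = np.arange(len(V))
    delay = np.exp(-1j * m * k * omega0 * tau)      # factor e^{-j m omega_k tau}
    basis = np.exp(1j * np.outer(t, k * omega0))    # plays the role of W_N^*(n, k)
    return np.real(basis @ (V * delay))

def harmonic_component(i_t, k):
    """Eq. (7)-style Fourier sum: (1/N_T) * sum_n i(t_n) * W_N(n, k)."""
    n_t = len(i_t)
    n = np.arange(n_t)
    return np.sum(i_t * np.exp(-2j * np.pi * n * k / n_t)) / n_t
```

In a full HB evaluation, the delayed voltages from `delayed_waveform` would feed the voltage mapping at each time sample, and `harmonic_component` would be applied to the mapped current waveform. Under the two-sided normalization used here, a harmonic with k ≥ 1 returns half of the single-sided phasor.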

4 Sensitivity analysis of the general Neuro-SM model with respect to mapping neural network weights

Let the number of hidden neurons of the dynamic voltage and current mapping neural networks be \( {N}_{hv} \) and \( {N}_{hc} \), respectively. Let the generic symbols \( {w}_{1,i} \) (i = 1, 2, …, \( {N}_{hv} \)) and \( {w}_{2,i} \) (i = 1, 2, …, \( {N}_{hc} \)) be internal weights of the voltage and current mapping neural networks, respectively, where \( {w}_{1,i} \) and \( {w}_{2,i} \) are the i-th components of the vectors \( {\boldsymbol{w}}_1 \) and \( {\boldsymbol{w}}_2 \). To train the general Neuro-SM efficiently, gradient information provided by the sensitivities of the model with respect to \( {w}_{1,i} \) and \( {w}_{2,i} \) is needed [16].

(1) DC sensitivity: let the DC output at gate and drain of the general Neuro-SM model be If, DC. The sensitivities of If, DC with respect to w1, i and w2, i are described in functional form as

$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{I}}_{f,\mathrm{DC}}}{\partial {w}_{1,i}}={\left(\frac{\partial {\boldsymbol{I}}_{f,\mathrm{DC}}^T}{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}}\right)}^T\cdot {\left(\frac{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}^T}{\partial {\boldsymbol{V}}_{c,\mathrm{DC}}}\right)}^T\cdot \frac{\partial {\boldsymbol{V}}_{c,\mathrm{DC}}}{\partial {w}_{1,i}}\\ {}={\left(\frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{V}}_{f,\mathrm{DC}},\overset{N_c+1}{\overbrace{{\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},{\boldsymbol{I}}_{c,\mathrm{DC}}{\left|{}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},\dots, {\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}}}},{\boldsymbol{w}}_2\right)\kern0.6em }{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}}\right)}^T\kern0.3em \cdot {\boldsymbol{G}}_c\\ {}\kern3.399999em \cdot \frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}\left(\overset{N_v+1}{\overbrace{{\boldsymbol{V}}_{f,\mathrm{DC}},{\boldsymbol{V}}_{f,\boldsymbol{DC}},\dots, {\boldsymbol{V}}_{f,\mathrm{DC}}}},{\boldsymbol{w}}_1\right)\kern0.1em }{\partial {w}_{1,i}}\kern11.60001em \end{array}} $$
(10)
$$ \frac{\partial {\boldsymbol{I}}_{f,\mathrm{DC}}}{\partial {w}_{2,i}}=\frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}\left({\boldsymbol{V}}_{f,\mathrm{DC}},\overset{N_c+1}{\overbrace{{\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},{\boldsymbol{I}}_{c,\mathrm{DC}}{\left|{}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},\dots, {\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}}}},{\boldsymbol{w}}_2\right)\kern0.1em }{\partial {w}_{2,i}} $$
(11)

where \( {\boldsymbol{G}}_c={\left(\partial {\boldsymbol{I}}_{c,\mathrm{DC}}^T/\partial {\boldsymbol{V}}_{c,\mathrm{DC}}\right)}^T \) is the DC conductance matrix of the existing coarse model, and the first-order derivatives \( \partial {\boldsymbol{f}}_{\mathrm{ANN}}/\partial {w}_{1,i} \) and \( \partial {\boldsymbol{h}}_{\mathrm{ANN}}/\partial {w}_{2,i} \) can be calculated by neural network backpropagation [17].
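The chain rule in Eq. (10) can be checked numerically against finite differences. The following NumPy sketch does this for a deliberately small setup. Everything in it is an assumption made for illustration: the toy coarse model, the single-layer tanh mappings with weight matrices `W1` and `W2`, and the buffer counts are not the networks or the device model used in this paper.

```python
import numpy as np

N_V, N_C = 2, 1  # hypothetical numbers of voltage/current delay buffers

def coarse_dc(vc):
    """Toy coarse model (assumption): DC gate/drain currents at mapped voltages."""
    vg, vd = vc
    return np.array([1e-3 * vg, 0.05 * np.tanh(vd) * (1.0 + np.tanh(vg))])

def coarse_gc(vc):
    """Analytic DC conductance matrix G_c = dI_c/dV_c of the toy coarse model."""
    vg, vd = vc
    sech2 = lambda x: 1.0 / np.cosh(x) ** 2
    return np.array([[1e-3, 0.0],
                     [0.05 * np.tanh(vd) * sech2(vg),
                      0.05 * sech2(vd) * (1.0 + np.tanh(vg))]])

def f_ann(x, W1):
    """Voltage mapping stand-in: unit mapping plus a tanh correction."""
    return x[:2] + np.tanh(W1 @ x)

def h_ann(u, W2):
    """Current mapping stand-in: passes i_c(t) through plus a tanh correction."""
    return u[2:4] + np.tanh(W2 @ u)

def dc_response(vf, W1, W2):
    x = np.tile(vf, N_V + 1)                         # Eq. (4): all voltage taps equal V_f,DC
    vc = f_ann(x, W1)
    ic = coarse_dc(vc)
    u = np.concatenate([vf, np.tile(ic, N_C + 1)])   # Eq. (3): all current taps equal I_c,DC
    return h_ann(u, W2), x, vc, u

def dc_sens_w1(vf, W1, W2):
    """Eq. (10): dI_f,DC/dW1[a, b] = H . G_c . df_ANN/dW1[a, b]."""
    _, x, vc, u = dc_response(vf, W1, W2)
    dh = 1.0 - np.tanh(W2 @ u) ** 2                  # tanh' at the current-map neurons
    # H = dI_f/dI_c,DC sums the Jacobian blocks of the N_C+1 identical current taps.
    H = np.eye(2) + dh[:, None] * sum(W2[:, 2 + 2 * l: 4 + 2 * l] for l in range(N_C + 1))
    df = 1.0 - np.tanh(W1 @ x) ** 2                  # tanh' at the voltage-map neurons
    HG = H @ coarse_gc(vc)
    sens = np.zeros((2,) + W1.shape)
    for a in range(W1.shape[0]):
        for b in range(W1.shape[1]):
            dvc = np.zeros(2)
            dvc[a] = df[a] * x[b]                    # dV_c/dW1[a, b]
            sens[:, a, b] = HG @ dvc
    return sens

def fd_sens_w1(vf, W1, W2, eps=1e-6):
    """Central finite-difference reference for the same sensitivities."""
    sens = np.zeros((2,) + W1.shape)
    for a in range(W1.shape[0]):
        for b in range(W1.shape[1]):
            Wp, Wm = W1.copy(), W1.copy()
            Wp[a, b] += eps
            Wm[a, b] -= eps
            sens[:, a, b] = (dc_response(vf, Wp, W2)[0] - dc_response(vf, Wm, W2)[0]) / (2 * eps)
    return sens
```

The sketch shows that in the DC case the derivative with respect to \( {\boldsymbol{I}}_{c,\mathrm{DC}} \) collects contributions from all \( {N}_c+1 \) current inputs, because they carry the same value.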

(2) S parameter sensitivity: S parameter sensitivity can be obtained by converting its Y parameter sensitivity. The small-signal Y parameter sensitivities of the general Neuro-SM model with respect to \( {w}_{1,i} \) and \( {w}_{2,i} \) are shown in Eqs. (12) and (13), respectively. These two equations can be obtained by differentiating (5) with respect to \( {w}_{1,i} \) and \( {w}_{2,i} \), respectively.

$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{Y}}_f\left(\omega \right)}{\partial {w}_{1,i}}\\ {}=\sum \limits_{r=1,2}\sum \limits_{m=0}^{N_c}\left(\begin{array}{l}{\left(\sum \limits_{l=0}^{N_c}{\left.{e}^{- j\omega l\tau}\cdot \frac{\partial^2{\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t- l\tau \right)\partial {i}_{cr}\left(t- m\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f,\mathrm{Bias}}\\ {}{\left.{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\\ {}{\left.\cdot {e}^{- j\omega m\tau}\cdot \sum \limits_{p=1,2}{Y}_{c, rp}\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\cdot {\left.\frac{\partial {f}_{\mathrm{ANN}p}\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {w}_{1,i}}\right|}_{{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}}\kern0.1em \end{array}\right)\\ {}\kern0.5em \cdot {\left.{\boldsymbol{Y}}_c\left(\omega \right)\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{- j\omega k\tau}\cdot {\left.\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t- k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T\\ {}+{\left(\sum \limits_{l=0}^{N_c}{\left.{e}^{- j\omega l\tau}\cdot \frac{\partial 
{\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t- l\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\cdot {\left.{\boldsymbol{Y}}_c\left(\omega \right)\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\kern0.1em \\ {}\kern0.3em \cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{- j\omega k\tau}\cdot {\left.\frac{\partial^2{\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t- k\tau \right)\partial {w}_{1,i}}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T\\ {}+{\left(\sum \limits_{l=0}^{N_c}{\left.{e}^{- j\omega l\tau}\cdot \frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t- l\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\\ {}\cdot \left(\sum \limits_{r=1,2}{\left.\frac{\partial {\boldsymbol{Y}}_c}{\partial {v}_{cr}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\cdot {\left.\frac{\partial 
{f}_{\mathrm{ANN}r}\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {w}_{1,i}}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)\\ {}\kern0.6em \cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{- j\omega k\tau}\cdot {\left.\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t- k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T\kern3.199999em \\ {}\kern1em \end{array}} $$
(12)
$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{Y}}_f\left(\omega \right)}{\partial {w}_{2,i}}\\ {}={\left(\sum \limits_{l=0}^{N_c}{\left.{e}^{- j\omega l\tau}\cdot \frac{\partial^2{\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t- l\tau \right)\partial {w}_{2,i}}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\cdot {\left.{\boldsymbol{Y}}_c\left(\omega \right)\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\\ {}\kern0.6em \cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{- j\omega k\tau}\cdot {\left.\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t- k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T\\ {}\kern0.5em +{\left({\left.\frac{\partial^2{\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{v}}_f(t)\partial {w}_{2,i}}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f.\mathrm{Bias}}\\ {}{\left.{\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\end{array}} $$
(13)

where the second-order derivatives of the dynamic voltage and current mapping neural networks \( {\boldsymbol{f}}_{\mathrm{ANN}} \) and \( {\boldsymbol{h}}_{\mathrm{ANN}} \), i.e., the derivatives of the Jacobian matrices \( \partial {\boldsymbol{f}}_{\mathrm{ANN}}^T/\partial {\boldsymbol{v}}_f\left(t- k\tau \right) \) and \( \partial {\boldsymbol{h}}_{\mathrm{ANN}}^T/\partial {\boldsymbol{i}}_c\left(t- l\tau \right) \) with respect to \( {w}_{1,i} \) and \( {w}_{2,i} \), respectively, can be obtained by the adjoint neural network back-propagation [17].
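Since Eqs. (12) and (13) are obtained by differentiating Eq. (5), it can help to have Eq. (5) itself in executable form, for example as a reference for finite-difference checks. The NumPy sketch below composes \( {\boldsymbol{Y}}_f\left(\omega \right) \) from Jacobians that are assumed to be already evaluated at the bias point. The function name and argument layout are hypothetical, and the Jacobians are assumed to be stored with rows indexed by outputs.

```python
import numpy as np

def y_fine(omega, tau, Hc, Yc, Fv, Hv):
    """Compose Eq. (5) from bias-point Jacobians.

    Hc : list of N_c+1 (2x2) Jacobians dh_ANN/di_c(t - l*tau), l = 0..N_c
    Yc : (2x2) complex Y-parameter matrix of the coarse model at V_c,Bias
    Fv : list of N_v+1 (2x2) Jacobians df_ANN/dv_f(t - k*tau), k = 0..N_v
    Hv : (2x2) Jacobian dh_ANN/dv_f(t)
    """
    # Output/current Y-mapping matrix: delayed taps weighted by e^{-j omega l tau}
    out_map = sum(np.exp(-1j * omega * l * tau) * H for l, H in enumerate(Hc))
    # Input/voltage Y-mapping matrix: delayed taps weighted by e^{-j omega k tau}
    in_map = sum(np.exp(-1j * omega * k * tau) * F for k, F in enumerate(Fv))
    return out_map @ Yc @ in_map + Hv
```

Two limiting cases are easy to reason about. With unit mappings and no delays the function returns the coarse model matrix unchanged. With a voltage mapping that only passes the first delayed tap, the result is the coarse model matrix multiplied by \( {e}^{-j\omega \tau} \), which illustrates how the delay buffers add frequency-dependent dynamics.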

(3) HB sensitivity: the sensitivities of the large-signal harmonic current of the general Neuro-SM model with respect to w1, i and w2, i at a generic harmonic frequency ω k , k = 0, 1, 2, …, N H can be described in functional form as

$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{I}}_f\left({\omega}_k\right)}{\partial {w}_{1,i}}\\ {}=\frac{1}{N_T}\sum \limits_{n=0}^{N_T-1}\left(\begin{array}{l}\sum \limits_{m=0}^{N_c}\frac{{\left.\operatorname{}\partial {\boldsymbol{h}}_{\mathrm{ANN}}\Big({\boldsymbol{v}}_f\left({t}_n\right),{\boldsymbol{i}}_c\left({t}_n\right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{i}}_c\left({t}_n-\tau \right){\left|{}_{{\boldsymbol{v}}_c\left({t}_n\right)},\dots, {\boldsymbol{i}}_c\left({t}_n-{N}_c\tau \right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{w}}_2\Big)}{\partial {\boldsymbol{i}}_c\left({t}_n- m\tau \right)}\cdot {e}^{- j\omega m\tau}\\ {}{\left.\operatorname{}\cdot {\boldsymbol{G}}_c\left({t}_n\right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)}\cdot \frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f\left({t}_n\right),{\boldsymbol{v}}_f\left({t}_n-\tau \right),\dots, {\boldsymbol{v}}_f\left({t}_n-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {w}_{1,i}}\cdot {W}_N\left(n,k\right)\end{array}\right)\kern0.1em \end{array}} $$
(14)
$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{I}}_f\left({\omega}_k\right)}{\partial {w}_{2,i}}\\ {}=\frac{1}{N_T}\sum \limits_{n=0}^{N_T-1}\frac{{\left.\partial {\boldsymbol{h}}_{\mathrm{ANN}}\Big({\boldsymbol{v}}_f\left({t}_n\right),{\boldsymbol{i}}_c\left({t}_n\right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{i}}_c\left({t}_n-\tau \right){\left|{}_{{\boldsymbol{v}}_c\left({t}_n\right)},\dots, {\boldsymbol{i}}_c\left({t}_n-{N}_c\tau \right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{w}}_2\Big)}{\partial {w}_{2,i}}\cdot {W}_N\left(n,k\right)\end{array}} $$
(15)

where G c (t n ) at the mapped voltage of coarse model vc(t n ) is the nonlinear conductance matrix of the existing coarse model at time point tn.

5 Sensitivity analysis of the general Neuro-SM model with respect to coarse model parameters

Let x be a generic variable in the coarse model. In case the coarse model parameter needs to be treated as a variable in circuit optimization, it is useful to obtain the sensitivity for DC, bias-dependent S parameter, and large-signal HB responses of the general Neuro-SM model due to changes in the generic optimization variable x.

(1) DC sensitivity: the sensitivity of If, DC with respect to x is derived as

$$ {\displaystyle \begin{array}{l}\frac{\partial {\boldsymbol{I}}_{f,\mathrm{DC}}}{\partial x}={\left(\frac{\partial {\boldsymbol{I}}_{f,\mathrm{DC}}^T}{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}}\right)}^T\cdot \frac{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}^T}{\partial x}\\ {}={\left(\frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{V}}_{f,\mathrm{DC}},\overset{N_c+1}{\overbrace{{\left.{\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},{\boldsymbol{I}}_{c,\mathrm{DC}}{\left|{}_{{\boldsymbol{V}}_{c,\mathrm{DC}}},\dots, {\boldsymbol{I}}_{c,\mathrm{DC}}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}}}},{\boldsymbol{w}}_2\right)\kern0.6em }{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}}\right)}^T\cdot {\left.\frac{\partial {\boldsymbol{I}}_{c,\mathrm{DC}}^T}{\partial x}\right|}_{{\boldsymbol{V}}_{c,\mathrm{DC}}}\end{array}} $$
(16)

where \( \partial {\boldsymbol{I}}_{c,\mathrm{DC}}^T/\partial x \) is the DC current response due to changes in coarse model variable x evaluated at the mapped bias Vc, DC.

(2) S parameter sensitivity: S parameter sensitivity with respect to coarse model variable x can also be calculated by converting its Y parameter sensitivity. The Y parameter sensitivity is shown as

$$ {\displaystyle \begin{array}{l}
\frac{\partial {\boldsymbol{Y}}_f\left(\omega \right)}{\partial x}\\
{}=\sum \limits_{r=1,2}\sum \limits_{m=0}^{N_c}\left({\left(\sum \limits_{l=0}^{N_c}{e}^{-j\omega l\tau}\cdot {\left.\frac{\partial^2{\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t-l\tau \right)\partial {i}_{cr}\left(t-m\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f,\mathrm{Bias}}\\ {\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\left.{\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\cdot {e}^{-j\omega m\tau}\cdot \frac{\partial {i}_{cr}}{\partial x}\right)\\
{}\kern0.7em \cdot {\left.{\boldsymbol{Y}}_c\left(\omega \right)\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{-j\omega k\tau}\cdot {\left.\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t-k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T\\
{}+{\left(\sum \limits_{l=0}^{N_c}{e}^{-j\omega l\tau}\cdot {\left.\frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{i}}_c(t),{\boldsymbol{i}}_c\left(t-\tau \right),\dots, {\boldsymbol{i}}_c\left(t-{N}_c\tau \right),{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left(t-l\tau \right)}\right|}_{\begin{array}{l}{\boldsymbol{v}}_f={\boldsymbol{V}}_{f,\mathrm{Bias}}\\ {\boldsymbol{i}}_c(t)={\boldsymbol{i}}_c\left(t-\tau \right)=\dots ={\boldsymbol{i}}_c\left(t-{N}_c\tau \right)={\left.{\boldsymbol{I}}_c\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\end{array}}\right)}^T\cdot {\left.\frac{\partial {\boldsymbol{Y}}_c\left(\omega \right)}{\partial x}\right|}_{{\boldsymbol{V}}_{c,\mathrm{Bias}}}\\
{}\kern0.7em \cdot {\left(\sum \limits_{k=0}^{N_v}{e}^{-j\omega k\tau}\cdot {\left.\frac{\partial {\boldsymbol{f}}_{\mathrm{ANN}}^T\left({\boldsymbol{v}}_f(t),{\boldsymbol{v}}_f\left(t-\tau \right),\dots, {\boldsymbol{v}}_f\left(t-{N}_v\tau \right),{\boldsymbol{w}}_1\right)}{\partial {\boldsymbol{v}}_f\left(t-k\tau \right)}\right|}_{{\boldsymbol{v}}_f(t)={\boldsymbol{v}}_f\left(t-\tau \right)=\dots ={\boldsymbol{v}}_f\left(t-{N}_v\tau \right)={\boldsymbol{V}}_{f,\mathrm{Bias}}}\right)}^T
\end{array}} $$
(17)

where ∂Yc/∂x is the sensitivity of the Y parameters of the coarse model with respect to x, and ∂icr/∂x, r = 1, 2, is the derivative of the coarse model current with respect to x, which can be calculated by coarse model sensitivity analysis.
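The conversion from Y parameter sensitivity to S parameter sensitivity mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 2-port with the same real reference impedance z0 at both ports (the paper does not state the reference impedance), and the helper names `matmul2`, `inv2`, `identity_plus`, `y_to_s`, and `s_sensitivity` are hypothetical. Under that assumption S = (I − z0·Y)(I + z0·Y)⁻¹, and differentiating gives ∂S/∂x = −2·z0·(I + z0·Y)⁻¹·(∂Y/∂x)·(I + z0·Y)⁻¹.

```python
# Sketch: convert a 2-port Y matrix and its sensitivity dY/dx into S and dS/dx.
# Assumption (not from the paper): both ports share a real reference impedance z0.

def matmul2(a, b):
    """Product of two 2x2 (complex) matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def identity_plus(y, scale):
    """Return I + scale * y for a 2x2 matrix y."""
    return [[(1.0 if i == j else 0.0) + scale * y[i][j] for j in range(2)]
            for i in range(2)]

def y_to_s(y, z0=50.0):
    """S = (I - z0*Y) (I + z0*Y)^-1."""
    return matmul2(identity_plus(y, -z0), inv2(identity_plus(y, z0)))

def s_sensitivity(y, dy_dx, z0=50.0):
    """dS/dx = -2*z0 * (I + z0*Y)^-1 * dY/dx * (I + z0*Y)^-1."""
    p_inv = inv2(identity_plus(y, z0))
    core = matmul2(matmul2(p_inv, dy_dx), p_inv)
    return [[-2.0 * z0 * core[i][j] for j in range(2)] for i in range(2)]
```

In this sketch, `dy_dx` would be the matrix produced by Eq. (17) at one frequency and bias point, so the S parameter sensitivity needs no further circuit simulation once the Y parameter sensitivity is available.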

(3) HB sensitivity: the sensitivity of the harmonic current of the general Neuro-SM model with respect to x at a generic harmonic frequency ωk, k = 0, 1, …, NH, is shown in Eq. (18), where ∂ic(tn)/∂x is the sensitivity of the nonlinear current of the coarse model with respect to x at time sample tn.

$$ {\displaystyle \begin{array}{l}
\frac{\partial {\boldsymbol{I}}_f\left({\omega}_k\right)}{\partial x}\\
{}=\frac{1}{N_T}\sum \limits_{n=0}^{N_T-1}\left(\begin{array}{l}\sum \limits_{m=0}^{N_c}\frac{\partial {\boldsymbol{h}}_{\mathrm{ANN}}\left({\boldsymbol{v}}_f\left({t}_n\right),{\left.{\boldsymbol{i}}_c\left({t}_n\right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\left.{\boldsymbol{i}}_c\left({t}_n-\tau \right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},\dots, {\left.{\boldsymbol{i}}_c\left({t}_n-{N}_c\tau \right)\right|}_{{\boldsymbol{v}}_c\left({t}_n\right)},{\boldsymbol{w}}_2\right)}{\partial {\boldsymbol{i}}_c\left({t}_n-m\tau \right)}\cdot {e}^{-j\omega m\tau}\\ {}\cdot \frac{\partial {\boldsymbol{i}}_c\left({t}_n\right)}{\partial x}\cdot {W}_N\left(n,k\right)\end{array}\right)
\end{array}} $$
(18)
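The outer sum of Eq. (18) is a discrete Fourier sum over the NT time samples. The sketch below illustrates only that final step. It assumes that W_N(n, k) denotes the usual Fourier kernel exp(−j2πnk/NT), which is an assumption here, and that the per-sample terms inside the parentheses of Eq. (18) have already been evaluated. The function name `harmonic_component` is hypothetical.

```python
import cmath

def harmonic_component(time_samples, k):
    """Return (1/N_T) * sum_n x(t_n) * W_N(n, k).

    Assumption: W_N(n, k) = exp(-j*2*pi*n*k/N_T), the standard DFT kernel.
    time_samples holds the time-domain quantity x(t_n), n = 0..N_T-1.
    """
    n_t = len(time_samples)
    total = 0j
    for n, x_n in enumerate(time_samples):
        total += x_n * cmath.exp(-2j * cmath.pi * n * k / n_t)
    return total / n_t
```

For instance, passing one period of a sampled cosine returns a component of about 0.5 at k = 1 and about 0 at the other low harmonics, which is the expected one-sided spectrum under this normalization.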

6 Proposed training algorithm for the general Neuro-SM model

Training is the key step in determining the general Neuro-SM model. The model development process has two phases: initial training and formal training.

A. Initial training

Before the nonlinear device data from simulation or measurement are used for formal training, the general Neuro-SM model is first initialized to be equal to the original coarse model. In this case, the dynamic voltage and current neural networks are initialized to learn unit mappings, i.e., to learn the relationships vc1(t) = vf1(t), vc2(t) = vf2(t), ic1(t) = if1(t), and ic2(t) = if2(t) over the entire operating range of the nonlinear device.
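The unit-mapping initialization can be pictured as a supervised data set in which the target always equals the undelayed input, whatever the delayed inputs are. The sketch below builds such pairs for a single scalar signal. It is an illustration under stated assumptions, not the authors' procedure: the waveform is assumed to be sampled with a time step equal to the delay τ, the actual mapping networks take vector inputs, and the name `unit_mapping_samples` is hypothetical.

```python
def unit_mapping_samples(waveform, n_delay):
    """Build (inputs, target) pairs for initializing one mapping network.

    inputs = [x(t), x(t - tau), ..., x(t - n_delay*tau)], target = x(t).
    Assumption: consecutive waveform samples are spaced by exactly tau.
    """
    samples = []
    for n in range(n_delay, len(waveform)):
        # Current sample first, followed by progressively older samples.
        inputs = [waveform[n - d] for d in range(n_delay + 1)]
        samples.append((inputs, waveform[n]))
    return samples
```

A network fitted to these pairs reproduces its undelayed input, so the overall model initially behaves like the coarse model, which is the starting point the initial training phase aims for.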

B. Formal training

In this phase, the weights of the dynamic voltage and current mapping neural networks, i.e., w1 and w2, are trained such that the overall training error of the general Neuro-SM model is reduced to satisfy the specifications. The overall training error for combined DC, small-signal S parameter, and large-signal HB training is defined as the total difference between the nonlinear device data and the responses of the general Neuro-SM model:

$$ {\displaystyle \begin{array}{l}
E\left({\boldsymbol{w}}_1,{\boldsymbol{w}}_2\right)=\frac{1}{2}\sum \limits_{k=1}^{N_{V_{f2}}}\sum \limits_{l=1}^{N_{V_{f1}}}{\left\Vert \boldsymbol{A}\cdot \left(\boldsymbol{I}\left({V}_{Gl},{V}_{Dk},{\boldsymbol{w}}_1,{\boldsymbol{w}}_2\right)-{\boldsymbol{I}}_{Dl}^k\right)\right\Vert}^2\\
{}\kern1.6em +\frac{1}{2}\sum \limits_{k=1}^{N_{V_{f2}}}\sum \limits_{l=1}^{N_{V_{f1}}}\sum \limits_{j=1}^{N_{\mathrm{freq}}}{\left\Vert \boldsymbol{B}\cdot \left(\boldsymbol{S}\left({V}_{Gl},{V}_{Dk},{\omega}_j,{\boldsymbol{w}}_1,{\boldsymbol{w}}_2\right)-{\boldsymbol{S}}_{Dlj}^k\right)\right\Vert}^2\\
{}\kern1.6em +\frac{1}{2}\sum \limits_{k=1}^{N_{V_{f2}}}\sum \limits_{l=1}^{N_{V_{f1}}}\sum \limits_{m=1}^{N_H}\sum \limits_{n=1}^{N_P}{\left\Vert \boldsymbol{C}\cdot \left(\boldsymbol{HB}\left({V}_{Gl},{V}_{Dk},{\omega}_m,{P}_n,{\boldsymbol{w}}_1,{\boldsymbol{w}}_2\right)-{\boldsymbol{HB}}_{Dkl}^{mn}\right)\right\Vert}^2
\end{array}} $$
(19)

where I(.), S(.), and HB(.) are the DC, bias-dependent S parameter, and HB responses of the general Neuro-SM model, respectively. Taking FET modeling as an example, vector I(.) contains the gate and drain currents If1 and If2, which can be computed by Eq. (3). Vector S(.) is obtained from the Y matrix defined by Eq. (5). The HB responses of the general Neuro-SM model, i.e., HB(.), can be calculated by Eq. (7). ID, SD, and HBD represent the DC current, small-signal S parameter, and large-signal HB data of the device being modeled, respectively. The subscripts k \( \left(k=1,2,\dots, {N}_{V_{f2}}\right) \), l \( \left(l=1,2,\dots, {N}_{V_{f1}}\right) \), j (j = 1, 2, …, Nfreq), m (m = 1, 2, …, NH), and n (n = 1, 2, …, NP) denote the indices of Vf2, Vf1, frequency, harmonic frequency, and input power level, respectively. \( {N}_{V_{f1}} \), \( {N}_{V_{f2}} \), Nfreq, NH, and NP are the total numbers of Vf1, Vf2, frequency, harmonic frequency, and input power level points, respectively. Diagonal matrices A, B, and C contain the scaling factors, which are defined as the inverse of the minimum-to-maximum range of the ID, SD, and HBD data, respectively. The training error calculation of the general Neuro-SM model is further illustrated in Fig. 2: Fig. 2a shows the error calculation for combined dc and small-signal S parameter training, and Fig. 2b shows the error calculation for large-signal HB training.

Fig. 2 Block diagram for error calculation of the general Neuro-SM model. a Error calculation for combined dc and small-signal S parameter training. b Error calculation for large-signal HB training

The objective of the model training is to minimize the error E defined in Eq. (19) by optimizing w1 and w2. In general, a gradient-based training algorithm is used. After training, the general Neuro-SM model with appropriate numbers of hidden neurons and delay buffers can accurately represent the nonlinear behavior of the device being modeled.
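The structure of the training error in Eq. (19) can be sketched as below. This is a simplified illustration rather than the authors' code: every response is assumed to be stored as a row of real numbers (complex S parameter or HB data would be split into real and imaginary columns), each data column is assumed to have a nonzero range, and the function names `range_scaling`, `scaled_error`, and `combined_error` are hypothetical.

```python
def range_scaling(data_rows):
    """Diagonal scaling factors: inverse of the min-to-max range of each data column."""
    columns = list(zip(*data_rows))
    return [1.0 / (max(col) - min(col)) for col in columns]

def scaled_error(model_rows, data_rows, scale):
    """0.5 * sum over samples of || diag(scale) * (model - data) ||^2."""
    total = 0.0
    for model, data in zip(model_rows, data_rows):
        total += sum((s * (m - d)) ** 2 for s, m, d in zip(scale, model, data))
    return 0.5 * total

def combined_error(dc, s_par, hb):
    """Sum the DC, S parameter, and HB terms of the training error.

    Each argument is a (model_rows, data_rows) pair; the scaling of each
    term is derived from the range of its own data, as for A, B, and C.
    """
    return sum(scaled_error(m, d, range_scaling(d)) for m, d in (dc, s_par, hb))
```

Scaling each term by the range of its own data keeps the DC, S parameter, and HB contributions comparable in magnitude, so no single data type dominates the gradient during combined training.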

7 Discussion

The proposed Neuro-SM model, after being trained for a specific range, accurately represents the nonlinear behavior of the microwave device within the training region. However, when the model is used over a wider range than the training range, inappropriate derivative information of the model outside the training range may mislead the iterative process into slow convergence or even divergence during large-signal simulation. One possible way to solve the divergence problem is to use an appropriate extrapolation technique. For the general Neuro-SM technique, a simple and effective extrapolation technique is used to improve the convergence of the model [18].

For simplicity, the proposed general Neuro-SM technique is formulated for 2-port field-effect transistor (FET) modeling. This approach can be further extended to an n-port network, where all the notations and equations are extended accordingly. After this generalization, the proposed general Neuro-SM technique has the potential to be used for developing models of microwave devices with trapping effects.

The general Neuro-SM model presented so far maps the voltage input signals between the coarse and fine models. Hence, the approach presented so far is applicable to modeling voltage-controlled devices, such as FETs and HEMTs. It is possible to extend the method to a mixed input mapping case, where the dynamic input mappings are for a mixture of port voltage and current signals. In that way, our approach can be extended to modeling current-controlled devices, such as HBTs.

The frequency limit of the proposed general Neuro-SM model depends on the frequency limit of the training data. For example, if the frequency in the training data extends to millimeter-wave bands, the proposed general mapping becomes even more important because the model must capture capacitive effects, non-quasi-static effects, and nonlinear effects. In this case, more hidden neurons and time delay buffers may be needed to guarantee the accuracy of the proposed general Neuro-SM model.

8 pHEMT modeling using the proposed general Neuro-SM method

This example illustrates the use of the general Neuro-SM for modeling a real 2 × 50 μm GaAs pHEMT device. The training and test data are obtained from measurement. The enhanced Angelov model proposed in [19], which includes a thermal subcircuit to model the self-heating effect of the device, is used as the existing coarse model. Even though the parameters of the enhanced Angelov model are extracted as accurately as possible, there are still distinct differences between the model and the measured data. Thus, Neuro-SM is used to bridge the gap between the coarse model and the measured data. We first apply previously published Neuro-SM techniques, namely Neuro-SM with the output mapping [13] and dynamic Neuro-SM [14], to get more accurate models. After training, the accuracy of these two Neuro-SM models is clearly improved compared to that of the coarse model, as shown in Fig. 3. However, the previous Neuro-SM techniques, even at their best, are still insufficient to achieve the desired accuracy. The proposed general Neuro-SM is then used to get a more accurate model.

Fig. 3 Comparison between the pHEMT device data, coarse model, and three Neuro-SM models. a dc. b–e S parameter at two test bias points (0.7 V, 2.4 V) and (0.3 V, 5.2 V). f HB at different input power levels (−10 to 3 dBm)

Training was first performed in NeuroModelerPlus [20] using DC and bias-dependent S parameter data for 400 iterations. Then, training refinement was performed using combined DC, bias-dependent S parameter, and HB data at 189 different biases for 3600 iterations. The harmonic data used for HB training were measured at a fundamental frequency of 7.5 GHz and different input power levels (−10 to 3 dBm). Time delay parameters are both 0.008 ns. The number of hidden neurons for both voltage and current mapping neural networks is 30.

9 Results

After training, we compared the DC, bias-dependent S parameter, and large-signal HB responses of the pHEMT device with those computed from the coarse model, Neuro-SM with the output mapping [13], dynamic Neuro-SM [14] with 5 delay buffers and 30 hidden neurons, and the proposed general Neuro-SM model with 5 delay buffers and 30 hidden neurons for both the dynamic voltage and current mapping neural networks, as shown in Fig. 3. Figure 3a, b–e, and f show the comparisons of dc, S parameters at two test bias points (0.7 V, 2.4 V) and (0.3 V, 5.2 V), and HB responses at different input power levels (−10 to 3 dBm), respectively. As observed from Fig. 3, the responses computed from the proposed general Neuro-SM are the closest to the data among the four models in this comparison. The general Neuro-SM technique yields further improvement in model accuracy because of the additional and useful degrees of freedom provided by the new dynamic current mappings at the gate and the drain. The increased accuracy of the general Neuro-SM model helps to improve the accuracy of circuit and system simulation, such as simulation to predict the power performance and linearity of high-frequency PA designs.

Two important factors affect the accuracy of the dynamic Neuro-SM model and the proposed general Neuro-SM model: the number of hidden neurons and the number of delay buffers. To examine this further, we compared the training and test errors of the dynamic Neuro-SM and general Neuro-SM with different numbers of delay buffers and hidden neurons, as shown in Table 1. As seen in Table 1, the general Neuro-SM with 30 hidden neurons and 5 delay buffers for both the dynamic voltage and current mapping neural networks is suitable for this example.

Table 1 Training and test error comparison of coarse model, dynamic Neuro-SM model, and the proposed general Neuro-SM model after combined DC, S parameter, and HB training

The proposed general Neuro-SM model can be conveniently implemented in existing circuit simulators such as Keysight ADS for high-level circuit and system design. Figure 4 shows the proposed general Neuro-SM model structure in ADS. The time delay parameter is 0.08 ns. In this figure, the dynamic voltage mapping neural networks are embedded as functions in two 7-port symbolically defined devices (SDDs), i.e., SDD7P1 and SDD7P2. Similarly, the dynamic current mapping neural networks are embedded as functions in two 9-port SDDs, i.e., SDD9P1 and SDD9P2. Time-delayed voltage and current signals are obtained using voltage-controlled voltage sources with delay parameters, i.e., SRC1~SRC8. After implementing the general Neuro-SM model in ADS, we also compared the simulation speed of the coarse model, dynamic Neuro-SM, and the proposed general Neuro-SM model on an Intel i5-3230M 2.6 GHz computer, as shown in Table 2. The simulation was performed by Monte Carlo analysis of 200 HB simulations. As seen in Table 2, the simulation time is 48.32 s using the coarse model, compared to 57.17 s using the general Neuro-SM (about 18% longer), showing that the simulation speed of the proposed general Neuro-SM is acceptable in view of its good accuracy.

Fig. 4 Structure of the general Neuro-SM model with two delay buffers in ADS

Table 2 Model simulation time comparison between coarse model, dynamic Neuro-SM, and the general Neuro-SM model

10 Conclusions

This paper has presented a general Neuro-SM technique for nonlinear device modeling. By modifying the dynamic current and dynamic voltage relationships in the existing coarse model, the proposed general Neuro-SM model can exceed the accuracy limits of the coarse model, the Neuro-SM model with the output mapping, and the dynamic Neuro-SM model. The pHEMT modeling example demonstrates that the proposed general Neuro-SM is more accurate than previously published Neuro-SM methods. The general Neuro-SM model can be applied to microwave circuit and system design.

References

  1. Z Li, Y Chen, H Shi, et al., NDN-GSM-R: a novel high-speed railway communication system via named data networking. EURASIP J. Wirel. Commun. Netw. 48(1), 1–5 (2016)

  2. H Shi, Z Li, D Liu, et al., Efficient method of two-dimensional DOA estimation for coherent signals. EURASIP J. Wirel. Commun. Netw. 60, 1–10 (2017)

  3. F Zhao, L Wei, H Chen, Optimal time allocation for wireless information and power transfer in wireless powered communication systems. IEEE Trans. Veh. Technol. 65(3), 1830–1835 (2016)

  4. D Liu, Z Li, X Guo, et al., DOA estimation for wideband chip with a few snapshots. EURASIP J. Wirel. Commun. Netw. 28, 1–7 (2017)

  5. F Zhao, B Li, H Chen, et al., Joint beamforming and power allocation for cognitive MIMO systems under imperfect CSI based on game theory. Wirel. Pers. Commun. 73(3), 679–694 (2013)

  6. F Zhao, W Wang, H Chen, et al., Interference alignment and game-theoretic power allocation in MIMO heterogeneous sensor networks communications. Signal Process. 126(9), 173–179 (2016)

  7. Z Li, L Song, H Shi, Approaching the capacity of K-user MIMO interference channel with interference counteraction scheme. Ad Hoc Netw. 58(4), 286–291 (2017)

  8. F Zhao, H Nie, H Chen, Group buying spectrum auction algorithm for fractional frequency reuses cognitive cellular systems. Ad Hoc Netw. 58(4), 239–246 (2017)

  9. F Zhao, X Sun, H Chen, et al., Outage performance of relay-assisted primary and secondary transmissions in cognitive relay networks. EURASIP J. Wirel. Commun. Netw. 60(1) (2014)

  10. L Zhang, J Xu, M Yagoub, et al., Neuro-space mapping technique for nonlinear device modeling and large-signal simulation (IEEE MTT-S Int. Microwave Symp, Philadelphia, PA, 2003), pp. 173–176

  11. Q Zhang, K Gupta, V Devabhaktuni, Artificial neural networks for RF and microwave design: From theory to practice. IEEE Transactions on Microwave Theory and Techniques 51(4), 1339–1350 (2003)

  12. J Bandler, Q Cheng, S Dakroury, et al., Space mapping: the state of the art. IEEE Transactions on Microwave Theory and Techniques 52(1), 337–361 (2004)

  13. L Zhu, K Liu, Q Zhang, et al., An enhanced analytical neuro-space mapping method for large-signal microwave device modeling (IEEE MTT-S Int. Microwave Symp. Dig, Montreal, QC, 2012), pp. 1–3

  14. L Zhu, Q Zhang, K Liu, et al., A novel dynamic neuro-space mapping approach for nonlinear microwave device modeling. IEEE Microwave and Wireless Components Letters 26(2), 131–133 (2016)

  15. L Zhang, J Xu, MC Yagoub, et al., Efficient analytical formulation and sensitivity analysis of neuro-space mapping for nonlinear microwave device modeling. IEEE Transactions on Microwave Theory and Techniques 53(9), 2752–2767 (2005)

  16. Q Song, J Spall, Y Soh, et al., Robust neural network tracking controller using simultaneous perturbation stochastic approximation. IEEE Transactions on Neural Networks 19(5), 817–835 (2008)

  17. Q Zhang, K Gupta, Neural networks for RF and microwave design (Artech House, Boston, MA, 2000)

  18. L Zhang, QJ Zhang, Simple and effective extrapolation technique for neural-based microwave modeling. IEEE Microwave and Wireless Components Letters 20(6), 301–303 (2010)

  19. L Liu, J Ma, G Ng, Electrothermal large-signal model of III-V FETs including frequency dispersion and charge conservation. IEEE Transactions on Microwave Theory and Techniques 57(12), 3106–3117 (2009)

  20. NeuroModelerPlus_V2.1E, Q. J. Zhang, Dept. of Electronics, Carleton University, Ottawa, ON, Canada.

Acknowledgements

The authors would like to thank Prof. Q. J. Zhang at Carleton University, Ottawa, ON, Canada, for valuable discussions and insights throughout this work.

Funding

This work is supported by the Fundamental Research Funds for Universities in Tianjin (No. 2016CJ13), partly supported by the Key project of Tianjin Natural Science Foundation (No. 16JCZDJC38600), National Natural Science Foundation of China (No. 61601494, 61602346), and the Research Forums Cooperation Project of ZTE Corporation (2016ZTE04-09).

Availability of data and materials

The training and test data of the microwave transistor were obtained from measurement and can be shared if necessary.

Author information

Contributions

The authors have contributed jointly to all parts of the preparation of this manuscript. LZ (first author) and JZ contributed to the structure and sensitivity analysis of the general Neuro-SM model. LZ (third author), WL and LP contributed to the training algorithm development. HW and DL contributed to the analysis of simulation results. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lin Zhu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Zhu, L., Zhao, J., Li, Z. et al. A general neuro-space mapping technique for microwave device modeling. J Wireless Com Network 2018, 37 (2018). https://doi.org/10.1186/s13638-018-1034-4

  • DOI: https://doi.org/10.1186/s13638-018-1034-4

Keywords