### Feature extraction model

Feature extraction begins by estimating the noise environment. Two covariance matrices are obtained by splitting and recombining the noise signal matrix, once in sequence (O-DAR) and once at intervals (I-DAR). The feature extraction model, based on decomposition and recombination and on information geometry, is shown in Fig. 2. To estimate the noise environment accurately, a sufficient number of noise signal matrices are collected, and each is processed with O-DAR and I-DAR and then converted to a covariance matrix (as shown in the box in Fig. 2). The Riemann mean of each set of covariance matrices is then computed. The signal matrix to be sensed undergoes the same two kinds of splitting and recombination and is likewise converted to covariance matrices. Finally, the distance from each covariance matrix of the environment to be sensed to the corresponding Riemann mean is calculated, and these distances are used as the statistical features of the signal.

### Information geometry overview

According to the matrix **X**, the corresponding covariance matrix can be calculated as shown in Eq. 5.

$$\begin{array}{@{}rcl@{}} {\mathbf{R}} = \frac{1}{N}{\mathbf{X}}{{\mathbf{X}}^{T}} \end{array} $$

(5)

From the theory of information geometry, consider a family of probability density functions *p*(*x*|*θ*), where *x* is an *n*-dimensional sample of a random variable taking values in *Ω*, *x*∈*Ω*⊆*C*^{n}, and *θ* is an *m*-dimensional parameter vector, *θ*∈*Θ*⊆*C*^{m}. The probability distribution space can therefore be described by the parameter set *Θ*. The family of probability distributions *S* is given in Eq. 6.

$$\begin{array}{@{}rcl@{}} S = \left\{{p(x|{\theta})|{\theta} \in \Theta \subseteq {C^{m}}} \right\} \end{array} $$

(6)

Under a certain topological structure, *S* forms a differentiable manifold, called a statistical manifold, and *θ* is the coordinate of the manifold. From the perspective of information geometry, the probability density function can be parameterized by the corresponding covariance matrix. Under the two hypotheses *H*_{0} and *H*_{1} of spectrum sensing, the signal is mapped to the point **R**_{w} or **R**_{s}+**R**_{w} on the manifold, where **R**_{w} and **R**_{s}+**R**_{w} are the covariance matrices calculated from the noise matrix and the signal matrix, respectively. In particular, both **R**_{w} and **R**_{s}+**R**_{w} are Toeplitz Hermitian positive definite matrices [17]. Therefore, the space of symmetric positive definite (SPD) matrices formed by such covariance matrices can be defined as an SPD manifold.

### Decomposition and recombination

In this section, the signal matrix of the SUs is split and recombined to logically increase the number of cooperating SUs. DAR comes in two forms, O-DAR and I-DAR, and both are applied to the signal vectors sensed by the SUs. The specific algorithm is as follows [14]:

In O-DAR, *x*_{i} is split sequentially into *q* (*q*>0) sub-signal vectors, each of length *s*=*N*/*q*. The result of splitting *x*_{i} is as follows:

$$ {{} \begin{aligned} {x_{i}}\left\{\begin{array}{l} {x_{i1}} = \left[ {{x_{i}}(1),{x_{i}}(2),\ldots,{x_{i}}(s)} \right]\\ {x_{i2}} \,=\, \left[ {{x_{i}}(s + 1),{x_{i}}(s + 2),\ldots,{x_{i}}(2s)} \right]\\ \vdots \\ {x_{iq}} = \left[ {{x_{i}}((q - 1)s + 1),{x_{i}}((q - 1)s + 2),\ldots,{x_{i}}(qs)} \right] \end{array}\right. \end{aligned}} $$

(7)

The signal vector in Eq. 4 is split according to Eq. 7, and then, the split sub-signal vector is recombined to obtain a *q**M*×*s* dimensional signal matrix **Y**_{O−DAR}.

$$ {\begin{aligned} {{\mathbf{Y}}_{O - DAR}} = \left[ \begin{array}{l} {x_{11}}\\ \vdots \\ {x_{1q}}\\ \vdots \\ {x_{im}}\\ \vdots \\ {x_{Mq}} \end{array} \right] = \left[ {\begin{array}{cccc} {{x_{1}}(1)}&{{x_{1}}(2)}& \cdots &{{x_{1}}(s)}\\ \vdots &{}&{}&{}\\ {{x_{1}}((q - 1)s + 1)}&{{x_{1}}((q - 1)s + 2)}& \cdots &{{x_{1}}(qs)}\\ \vdots &{}&{}&{}\\ {{x_{i}}((m - 1)s + 1)}&{{x_{i}}((m - 1)s + 2)}& \cdots &{{x_{i}}(ms)}\\ \vdots &{}&{}&{}\\ {{x_{M}}((q - 1)s + 1)}&{{x_{M}}((q - 1)s + 2)}& \cdots &{{x_{M}}(qs)} \end{array}} \right] \end{aligned}} $$

(8)
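The sequential split and stacking of Eqs. 7 and 8 can be sketched in a few lines, assuming NumPy and a signal matrix stored with one SU per row. The function name `o_dar` and the toy data are hypothetical and only serve as an illustration.

```python
import numpy as np

def o_dar(X, q):
    """Sequential split and recombination (Eqs. 7 and 8).

    X is an (M, N) array with one row per SU; q must divide N.
    Returns a (q*M, s) array with s = N // q.
    """
    M, N = X.shape
    if N % q != 0:
        raise ValueError("q must divide N")
    s = N // q
    # Each row is cut into q consecutive segments of length s. A row-major
    # reshape stacks them in the order x_11, ..., x_1q, ..., x_M1, ..., x_Mq.
    return X.reshape(M * q, s)

# Toy data: M = 2 SUs, N = 6 samples, values chosen so the ordering is visible.
X = np.arange(1, 13).reshape(2, 6)
Y_o = o_dar(X, q=3)
print(Y_o)
```

With this toy input, the first SU's samples 1 to 6 become the rows [1, 2], [3, 4], [5, 6], followed by the three segments of the second SU.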

In I-DAR, every *q*-th sample of the sampled data is selected, so that consecutive selected samples are separated by *q*−1 samples, and the signal matrix **X** is recombined from the reselected samples. In this way, *x*_{i} is split into *q* (*q*>0) sub-signal vectors, each of length *s*=*N*/*q*. The result of splitting *x*_{i} is as follows:

$$\begin{array}{@{}rcl@{}} {x_{i}}\left\{\begin{array}{l} {x_{i1}} = \left[ {{x_{i}}(1),{x_{i}}(q + 1),\ldots,{x_{i}}((s - 1)q + 1)} \right]\\ {x_{i2}} = \left[ {{x_{i}}(2),{x_{i}}(q + 2),\ldots,{x_{i}}((s - 1)q + 2)} \right]\\ \vdots \\ {x_{iq}} = \left[ {{x_{i}}(q),{x_{i}}(q + q),\ldots,{x_{i}}((s - 1)q + q)} \right] \end{array} \right. \end{array} $$

(9)

The signal vector in Eq. 4 is split according to Eq. 9, and then, the split sub-signal vector is recombined to obtain a *q**M*×*s* dimensional signal matrix **Y**_{I−DAR}.

$$ {\begin{aligned} {{\mathbf{Y}}_{I - DAR}} = \left[ \begin{array}{l} {x_{11}}\\ \vdots \\ {x_{1q}}\\ \vdots \\ {x_{im}}\\ \vdots \\ {x_{Mq}} \end{array} \right] = \left[ {\begin{array}{llll} {{x_{1}}(1)}&{{x_{1}}(q + 1)}& \cdots &{{x_{1}}((s - 1)q + 1)}\\ \vdots &{}&{}&{}\\ {{x_{1}}(q)}&{{x_{1}}(q + q)}& \cdots &{{x_{1}}((s - 1)q + q)}\\ \vdots &{}&{}&{}\\ {{x_{i}}(m)}&{{x_{i}}(q + m)}& \cdots &{{x_{i}}((s - 1)q + m)}\\ \vdots &{}&{}&{}\\ {{x_{M}}(q)}&{{x_{M}}(q + q)}& \cdots &{{x_{M}}((s - 1)q + q)} \end{array}} \right] \end{aligned}} $$

(10)
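The interval split of Eqs. 9 and 10 can be sketched in the same way, again assuming NumPy and one SU per row. The function name `i_dar` and the toy data are hypothetical.

```python
import numpy as np

def i_dar(X, q):
    """Interval split and recombination (Eqs. 9 and 10).

    X is an (M, N) array with one row per SU; q must divide N.
    Returns a (q*M, s) array with s = N // q.
    """
    M, N = X.shape
    if N % q != 0:
        raise ValueError("q must divide N")
    s = N // q
    # View each row as an (s, q) grid: column j then holds
    # x_i(j), x_i(q + j), ..., x_i((s - 1)q + j), which is x_ij in Eq. 9.
    # Swapping the last two axes turns those columns into rows.
    return X.reshape(M, s, q).transpose(0, 2, 1).reshape(M * q, s)

# Toy data: M = 2 SUs, N = 6 samples.
X = np.arange(1, 13).reshape(2, 6)
Y_i = i_dar(X, q=3)
print(Y_i)
```

For the first SU with samples 1 to 6 and *q* = 3, the sub-signal vectors are [1, 4], [2, 5], and [3, 6], that is, every third sample starting from offsets 1, 2, and 3.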

According to **Y**_{O−DAR} and **Y**_{I−DAR}, the corresponding covariance matrices **R**^{O} and **R**^{I} can be calculated.

$$\begin{array}{@{}rcl@{}} {{\mathbf{R}}^{O}} = \frac{1}{s}{{\mathbf{Y}}_{O - DAR}}{{\mathbf{Y}}_{O - DAR}}^{T} \end{array} $$

(11)

$$\begin{array}{@{}rcl@{}} {{\mathbf{R}}^{I}} = \frac{1}{s}{{\mathbf{Y}}_{I - DAR}}{{\mathbf{Y}}_{I - DAR}}^{T} \end{array} $$

(12)
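Eqs. 11 and 12 amount to a single matrix product. The sketch below assumes NumPy; the name `dar_covariance` is hypothetical, and the use of the conjugate transpose is an assumption made so that the same code also covers complex samples (for real data it coincides with the transpose in Eqs. 11 and 12).

```python
import numpy as np

def dar_covariance(Y):
    """Sample covariance of a (q*M, s) recombined matrix, as in Eqs. 11 and 12."""
    s = Y.shape[1]
    # Conjugate transpose; identical to Y @ Y.T / s for real-valued samples.
    return Y @ Y.conj().T / s

rng = np.random.default_rng(0)
Y = rng.standard_normal((4, 500))   # toy stand-in for Y_O-DAR or Y_I-DAR
R = dar_covariance(Y)

# Smallest eigenvalue: must be positive for R to lie on the SPD manifold.
print(np.linalg.eigvalsh(R).min())
```

Because the rank of the product is at most min(*qM*, *s*), the resulting *qM*×*qM* matrix can only be positive definite when *s* ≥ *qM*, which constrains the choice of *q*.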

### Riemann mean

First, the SUs collect *P* environmental noise matrices. Each noise matrix is processed with O-DAR and I-DAR, and the corresponding covariance matrices are calculated, which gives the matrices \({\mathbf {R}}_{k}^{O}(k = 1,2,\ldots,P)\) and \({\mathbf {R}}_{k}^{I}(k = 1,2,\ldots,P)\). The objective functions of their Riemann means are given in Eqs. 13 and 14, respectively.

$$\begin{array}{@{}rcl@{}} \Phi \left({\overline {\mathbf{R}}^{O}}\right) = \frac{1}{P}\sum\limits_{k = 1}^{P} {\mathrm{D}^{2}} \left({\mathbf{R}}_{k}^{O},{\overline {\mathbf{R}}^{O}}\right)\end{array} $$

(13)

$$\begin{array}{@{}rcl@{}} \Phi \left({\overline {\mathbf{R}}^{I}}\right) = \frac{1}{P}\sum\limits_{k = 1}^{P} {\mathrm{D}^{2}} \left({\mathbf{R}}_{k}^{I},{\overline {\mathbf{R}}^{I}}\right)\end{array} $$

(14)

\({\overline {\mathbf {R}}^{O}}\) and \({\overline {\mathbf {R}}^{I}}\) are the matrices at which *Φ*(∙) attains its minimum, where D(∙,∙) is the geodesic distance between two points on the manifold, described below.

$$\begin{array}{@{}rcl@{}} {\overline {\mathbf{R}}^{O}} = {\arg\min}\ \Phi \left({\overline {\mathbf{R}}^{O}}\right) \end{array} $$

(15)

$$\begin{array}{@{}rcl@{}} {\overline {\mathbf{R}}^{I}} = {\arg\min}\ \Phi \left({\overline {\mathbf{R}}^{I}}\right) \end{array} $$

(16)

When there are only two points **R**_{1} and **R**_{2} on the matrix manifold, \(\overline {\mathbf {R}}\) is the midpoint of the geodesic connecting **R**_{1} and **R**_{2} on the manifold. Its expression is given in Eq. 17.

$$\begin{array}{@{}rcl@{}} \overline {\mathbf{R}} = {\mathbf{R}}_{1}^{1/2}{\left({\mathbf{R}}_{1}^{- 1/2}{{\mathbf{R}}_{2}}{\mathbf{R}}_{1}^{- 1/2}\right)^{1/2}}{\mathbf{R}}_{1}^{1/2} \end{array} $$

(17)
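The two-point case can be checked numerically. The sketch below assumes NumPy and evaluates the midpoint in its symmetric form, with **R**_{1}^{1/2} on both sides of the inner square root, which is the standard expression for the geodesic midpoint of two SPD matrices. The helper names `spd_power` and `geodesic_midpoint` are hypothetical.

```python
import numpy as np

def spd_power(R, p):
    """R**p for a Hermitian positive definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(R)
    return (V * w**p) @ V.conj().T

def geodesic_midpoint(R1, R2):
    """Midpoint of the geodesic joining two SPD matrices."""
    R1_half = spd_power(R1, 0.5)
    R1_inv_half = spd_power(R1, -0.5)
    inner = spd_power(R1_inv_half @ R2 @ R1_inv_half, 0.5)
    return R1_half @ inner @ R1_half

# Commuting (diagonal) example: the midpoint reduces to the
# entrywise geometric mean sqrt(1 * 9) = 3 and sqrt(4 * 16) = 8.
R1 = np.diag([1.0, 4.0])
R2 = np.diag([9.0, 16.0])
mid = geodesic_midpoint(R1, R2)
print(mid)
```

The midpoint does not depend on the order of its two arguments, which is a quick sanity check for any implementation.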

If *P*>2, the Riemann mean is difficult to calculate in closed form. References [28, 29] give a method of computing \(\overline {\mathbf {R}}\) iteratively with the gradient descent algorithm, which yields the update formula for the Riemann mean shown in Eq. 18.

$$\begin{array}{@{}rcl@{}} {\overline {\mathbf{R}}_{l + 1}} \!= \overline {\mathbf{R}}_{l}^{1/2}{e^{\frac{\tau}{P}{\sum}_{k = 1}^{P} {\log \left(\overline {\mathbf{R}}_{l}^{- 1/2}{{\mathbf{R}}_{k}}\overline {\mathbf{R}}_{l}^{- 1/2}\right)}}}\overline {\mathbf{R}}_{l}^{1/2},\;\;\;0 \!\le\! \tau \!\le\! 1 \end{array} $$

(18)

where *τ* is the step size of the iteration and *l* is the iteration index. We therefore use the gradient descent algorithm to calculate the Riemann means \({\overline {\mathbf {R}}^{O}}\) and \({\overline {\mathbf {R}}^{I}}\).
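A minimal sketch of the iterative computation, assuming NumPy. It follows the structure of Eq. 18 and takes the exponent with the sign that moves the iterate toward the samples, that is, exp(+(*τ*/*P*) Σ log(∙)), which is the usual convention for this gradient flow. The starting point (arithmetic mean), the stopping tolerance, the iteration cap, and all function names are assumptions of this sketch rather than details given in the text.

```python
import numpy as np

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def riemann_mean(mats, tau=1.0, tol=1e-10, max_iter=100):
    """Iterative Riemann mean of SPD matrices, with step size 0 <= tau <= 1."""
    P = len(mats)
    R_bar = sum(mats) / P                      # assumed starting point
    for _ in range(max_iter):
        half = herm_fun(R_bar, np.sqrt)
        inv_half = herm_fun(R_bar, lambda w: 1.0 / np.sqrt(w))
        # Average of the matrix logarithms of the whitened samples.
        T = sum(herm_fun(inv_half @ R @ inv_half, np.log) for R in mats) / P
        R_bar = half @ herm_fun(tau * T, np.exp) @ half
        if np.linalg.norm(T) < tol:            # gradient has (nearly) vanished
            break
    return R_bar

# Commuting (diagonal) example: the Riemann mean is the entrywise
# geometric mean, (1*4*16)**(1/3) = 4 and (1*9*81)**(1/3) = 9.
mats = [np.diag([1.0, 1.0]), np.diag([4.0, 9.0]), np.diag([16.0, 81.0])]
R_bar = riemann_mean(mats)
print(R_bar)
```

The iteration stops when the averaged logarithm is close to the zero matrix, which is the stationarity condition of the objective function. A smaller *τ* trades speed for robustness when the matrices are far apart.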

### Geodesic distance

The study of a geometric structure mainly concerns properties such as distance, tangent, and curvature on the structure. There are many ways to measure the distance between two probability distributions on a statistical manifold; the most common is the geodesic distance.

Assuming *θ* is a point on the manifold, the metric on the statistical manifold is defined by the Fisher information matrix G(*θ*), whose (*i*, *j*)th entry is given by the following equation.

$$\begin{array}{@{}rcl@{}} {\text{G}({\theta}) = \text{E}} \left[ {\frac{{\partial \ln p(x|{\theta})}}{{\partial {{\theta}_{i}}}} \cdot \frac{{\partial \ln p(x|{\theta})}}{{\partial {{\theta}_{j}}}}} \right] \end{array} $$

(19)

Because the manifold is curved, the distance between two points is defined through the length of a curve connecting them on the manifold. Consider an arbitrary curve *θ*(*t*)(*t*_{1}≤*t*≤*t*_{2}) between two points *θ*_{1} and *θ*_{2} on the manifold, where *θ*(*t*_{1})=*θ*_{1}, *θ*(*t*_{2})=*θ*_{2}. Then, the distance between *θ*_{1} and *θ*_{2} along the curve *θ*(*t*) is given by Eq. 20 [30].

$$\begin{array}{@{}rcl@{}} {\text{D}({{\theta}_{1}}{{,}}{{\theta}_{2}})} \buildrel \Delta \over = \int_{{t_{1}}}^{{t_{2}}} {\sqrt {{{\left({\frac{{d{\theta}}}{{dt}}} \right)}^{T}}{\mathrm{G}({\theta})}\left({\frac{{d{\theta}}}{{dt}}} \right)}}\,dt \end{array} $$

(20)

It can be seen that the distance between *θ*_{1} and *θ*_{2} depends on the choice of the curve *θ*(*t*). The curve that minimizes the distance in Eq. 20 is called the geodesic, and the corresponding distance is called the geodesic distance.

For an arbitrary probability distribution, the geodesic distance is complicated to calculate, which limits its application. For the family of multivariate Gaussian distributions with the same mean but different covariance matrices, consider two members with covariance matrices **R**_{1} and **R**_{2}. The geodesic distance between them is given in Eq. 21 [31].

$$ {\begin{aligned} {\mathrm{D}}({{\mathbf{R}}_{1}},{{\mathbf{R}}_{2}}) &\buildrel \Delta \over = \sqrt {\frac{1}{2}tr{{\log}^{2}}\left({{\mathbf{R}}_{1}^{- 1/2}{{\mathbf{R}}_{2}}{\mathbf{R}}_{1}^{- 1/2}} \right)} \\[-4pt]&= \sqrt {\frac{1}{2}\sum\limits_{i = 1}^{n} {{{\log}^{2}}{\eta_{i}}}} \end{aligned}} $$

(21)

where *η*_{i} is the *i*th eigenvalue of the matrix \({\mathbf {R}}_{1}^{- 1/2}{{\mathbf {R}}_{2}}{\mathbf {R}}_{1}^{- 1/2}\).
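The eigenvalue form of Eq. 21 translates directly into code. The sketch below assumes NumPy; the function name `geodesic_distance` is hypothetical.

```python
import numpy as np

def geodesic_distance(R1, R2):
    """Geodesic distance of Eq. 21 between two SPD matrices."""
    w, V = np.linalg.eigh(R1)
    R1_inv_half = (V / np.sqrt(w)) @ V.conj().T
    # Eigenvalues eta_i of R1^{-1/2} R2 R1^{-1/2}; all positive for SPD inputs.
    eta = np.linalg.eigvalsh(R1_inv_half @ R2 @ R1_inv_half)
    return np.sqrt(0.5 * np.sum(np.log(eta) ** 2))

# For R1 = I and R2 = e * I in two dimensions, both eigenvalues equal e,
# so the distance is sqrt(0.5 * (1 + 1)) = 1.
d = geodesic_distance(np.eye(2), np.e * np.eye(2))
print(d)
```

Because the eigenvalues of the whitened matrix are replaced by their reciprocals when the arguments are swapped, and the logarithm is squared, the distance is symmetric in **R**_{1} and **R**_{2}.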

According to the feature extraction process and the above analysis, the signal matrix to be sensed is split and recombined in sequence and at intervals, and the results are converted to the covariance matrices **R**^{O} and **R**^{I}. Then, we use Eq. 21 to calculate the corresponding geodesic distances.

$$ {{} \begin{aligned} {d_{1}}={\mathrm{D}}\left({{\mathbf{R}}^{O}}{\mathrm{,}}{\overline {\mathbf{R}}^{O}}\right) &\buildrel \Delta \over = \sqrt {\frac{1}{2}tr{{\log}^{2}}\left({{{\left({{\mathbf{R}}^{O}}\right)}^{- 1/2}}{{\overline {\mathbf{R}}}^{O}}{{\left({{\mathbf{R}}^{O}}\right)}^{- 1/2}}} \right)} \\&= \sqrt {\frac{1}{2}\sum\limits_{i = 1}^{qM} {{{\log}^{2}}{\eta_{i}}}} \end{aligned}} $$

(22)

$$ {\begin{aligned} {d_{2}}&={\mathrm{D}}\left({{\mathbf{R}}^{I}}{\mathrm{,}}{\overline {\mathbf{R}}^{I}}\right) \buildrel \Delta \over = \sqrt {\frac{1}{2}tr{{\log}^{2}}\left({{{\left({{\mathbf{R}}^{I}}\right)}^{- 1/2}}{{\overline {\mathbf{R}}}^{I}}{{\left({{\mathbf{R}}^{I}}\right)}^{- 1/2}}} \right)} \\&= \sqrt {\frac{1}{2}\sum\limits_{i = 1}^{qM} {{{\log}^{2}}{\eta_{i}}}} \end{aligned}} $$

(23)

From the geodesic distances *d*_{1} and *d*_{2}, a two-dimensional feature vector **D**=[*d*_{1},*d*_{2}] is formed to represent the signal sensed by the SU. Finally, the feature vector **D** is used for spectrum sensing.
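The whole feature extraction chain can be put together in one short sketch, assuming NumPy. Everything specific here is an assumption for illustration: the function names, the parameter values (*M* = 2, *N* = 1000, *q* = 2, *P* = 20), white Gaussian noise, a sinusoidal test signal, and the sign and starting point used in the mean iteration.

```python
import numpy as np

def dar(X, q, interval=False):
    """O-DAR (interval=False) or I-DAR (interval=True) of an (M, N) matrix."""
    M, N = X.shape
    s = N // q
    if interval:
        return X.reshape(M, s, q).transpose(0, 2, 1).reshape(M * q, s)
    return X.reshape(M * q, s)

def cov(Y):
    """Sample covariance of a recombined matrix (Eqs. 11 and 12)."""
    return Y @ Y.conj().T / Y.shape[1]

def herm_fun(A, f):
    """Apply a scalar function to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def riemann_mean(mats, tau=1.0, tol=1e-10, max_iter=100):
    """Iterative Riemann mean; starts from the arithmetic mean (assumption)."""
    R_bar = sum(mats) / len(mats)
    for _ in range(max_iter):
        half = herm_fun(R_bar, np.sqrt)
        inv_half = herm_fun(R_bar, lambda w: 1.0 / np.sqrt(w))
        T = sum(herm_fun(inv_half @ R @ inv_half, np.log) for R in mats) / len(mats)
        R_bar = half @ herm_fun(tau * T, np.exp) @ half
        if np.linalg.norm(T) < tol:
            break
    return R_bar

def geodesic_distance(R1, R2):
    """Geodesic distance of Eq. 21."""
    inv_half = herm_fun(R1, lambda w: 1.0 / np.sqrt(w))
    eta = np.linalg.eigvalsh(inv_half @ R2 @ inv_half)
    return np.sqrt(0.5 * np.sum(np.log(eta) ** 2))

def fit_noise_means(noise_mats, q):
    """Riemann means of the O-DAR and I-DAR noise covariance matrices."""
    R_bar_O = riemann_mean([cov(dar(W, q)) for W in noise_mats])
    R_bar_I = riemann_mean([cov(dar(W, q, interval=True)) for W in noise_mats])
    return R_bar_O, R_bar_I

def feature_vector(X, q, R_bar_O, R_bar_I):
    """Feature vector D = [d1, d2] of Eqs. 22 and 23."""
    d1 = geodesic_distance(cov(dar(X, q)), R_bar_O)
    d2 = geodesic_distance(cov(dar(X, q, interval=True)), R_bar_I)
    return np.array([d1, d2])

rng = np.random.default_rng(0)
M, N, q, P = 2, 1000, 2, 20                      # hypothetical parameters
noise_mats = [rng.standard_normal((M, N)) for _ in range(P)]
R_bar_O, R_bar_I = fit_noise_means(noise_mats, q)

# Unit-power sinusoid seen by every SU, plus unit-power noise (toy H1 case).
tone = np.sqrt(2.0) * np.sin(2 * np.pi * 0.05 * np.arange(N))
D_h0 = feature_vector(rng.standard_normal((M, N)), q, R_bar_O, R_bar_I)
D_h1 = feature_vector(tone + rng.standard_normal((M, N)), q, R_bar_O, R_bar_I)
print("noise only:", D_h0)
print("signal + noise:", D_h1)
```

In this toy setting, a noise-only matrix stays close to both Riemann means, while a matrix containing the signal lies farther away in both coordinates, which is the separation the feature vector **D** is meant to expose.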