System model
To reduce the error probability of decision fusion and improve the performance of spectrum sensing, this paper proposes a cluster-based cooperative spectrum sensing scheme. It is assumed that the channel state between a cognitive radio and the fusion center is known to the cognitive radio [17]; the SU therefore estimates the channel state before sending its sensing data in each interval. In addition, in the clustering structure, the nearest cognitive users are selected as member nodes of the same cluster, so the channels between them can be approximated as ideal [18, 19]. The fusion result of each cluster is finally sent to the fusion center (FC) by the cluster head (CH), and the FC combines the cluster decisions with the OR fusion rule [20, 21].
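As a minimal illustration of the OR rule applied at the FC, the sketch below (the function name and the 0/1 hard-decision encoding are our own assumptions, not from the paper) declares the PU present whenever at least one cluster head reports \(H_{1}\):

```python
# OR fusion rule at the FC: PU declared present if ANY cluster head reports H1.
# Hypothetical helper; decisions are encoded as 0 (H0) and 1 (H1).
def or_fusion(cluster_decisions):
    """cluster_decisions: list of 0/1 hard decisions, one per cluster head."""
    return int(any(d == 1 for d in cluster_decisions))
```

For example, `or_fusion([0, 0, 1])` yields `1` (PU present), while `or_fusion([0, 0, 0])` yields `0`.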
Consider a FC (or base station) and \(N\) cognitive users participating in cooperative spectrum sensing. The SUs are organized into \(K\) clusters, with \(K_{c}\) cognitive users in the \(c\)th cluster. Energy detection is applied, and the sensing sample of the \(i\)th SU in the \(c\)th cluster at the \(m\)th sampling slot can be expressed as [22, 23]:
$$r_{ci} (m) = \left\{ {\begin{array}{*{20}l} {n_{ci} (m)} \hfill & {H_{0} } \hfill \\ {h_{ci} s_{ci} (m) + n_{ci} (m)} \hfill & {H_{1} } \hfill \\ \end{array} } \right.$$
(1)
where \(s_{ci} (m)\) is the sampling value of the PU’s signal received by the SU, and \(h_{ci}\) and \(n_{ci} (m)\) represent the channel gain and channel noise from the PU to the SU, respectively. The noise is assumed to be additive, white and Gaussian (AWGN) with zero mean and known variance \(\sigma_{n,ci}^{2}\), i.e., \(n_{ci} (m)\sim{\mathcal{N}}\left( {0,\sigma_{n,ci}^{2} } \right)\).
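The sample model of Eq. (1) can be sketched as follows; the function name, the Gaussian model for the PU signal, and all parameter values are illustrative assumptions rather than specifications from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def received_samples(M, sigma_n2, h=1.0, pu_active=False, signal_power=1.0):
    """Return M samples r_ci(m) per Eq. (1): noise only under H0,
    faded PU signal plus noise under H1."""
    noise = rng.normal(0.0, np.sqrt(sigma_n2), M)      # n_ci(m) ~ N(0, sigma_n^2)
    if not pu_active:                                  # hypothesis H0
        return noise
    s = rng.normal(0.0, np.sqrt(signal_power), M)      # PU signal samples s_ci(m)
    return h * s + noise                               # hypothesis H1
```

Under \(H_{1}\), the per-sample SNR of this sketch is \(\gamma = h^{2}\,P_{s}/\sigma_{n}^{2}\).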
Cooperative spectrum sensing
Let \(\tau\) be the sensing time of the SU and \(f_{{\text{s}}}\) the sampling frequency. After summing over \(M = \tau f_{{\text{s}}}\) samples, the test statistic of the \(i\)th SU in the \(c\)th cluster can be expressed as:
$$R_{ci} = \sum\limits_{m = 1}^{M} {\left| {r_{ci} (m)} \right|^{2} }$$
(2)
Under hypothesis \(H_{0}\), \(R_{ci}\) follows a central chi-square distribution with \(2M\) degrees of freedom; under hypothesis \(H_{1}\), it follows a non-central chi-square distribution with \(2M\) degrees of freedom.
When \(M\) is large enough, the test statistic can be approximated as Gaussian. By the central limit theorem [24, 25], its distribution is:
$$R_{ci} \sim\left\{ {\begin{array}{*{20}l} {{\mathcal{N}}\left( {M\sigma_{n,ci}^{2} ,2M\sigma_{n,ci}^{4} } \right),} \hfill & {H_{0} } \hfill \\ {{\mathcal{N}}\left( {M\left( {1 + \gamma_{i} } \right)\sigma_{n,ci}^{2} ,2M\left( {1 + 2\gamma_{i} } \right)\sigma_{n,ci}^{4} } \right),} \hfill & {H_{1} } \hfill \\ \end{array} } \right.$$
(3)
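A quick numerical check of the \(H_{0}\) branch of Eq. (3) can be sketched as follows; the sample size, noise variance, and trial count are arbitrary choices for illustration:

```python
import numpy as np

# Monte Carlo check of the H0 Gaussian approximation in Eq. (3).
rng = np.random.default_rng(1)
M, sigma_n2, trials = 500, 1.0, 2000

# Noise-only samples for `trials` independent sensing periods.
noise = rng.normal(0.0, np.sqrt(sigma_n2), (trials, M))
R = np.sum(np.abs(noise) ** 2, axis=1)      # test statistic of Eq. (2), one per trial

mean_theory = M * sigma_n2                  # H0 mean:     M * sigma_n^2
var_theory = 2 * M * sigma_n2 ** 2          # H0 variance: 2 * M * sigma_n^4
```

The empirical mean and variance of `R` should land close to `mean_theory` and `var_theory`, confirming the approximation for this \(M\).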
All member nodes send their observations to the CH of the corresponding cluster [26, 27]. Since the nodes in a cluster are geographically close to the cluster head, the noise between a cognitive user and the cluster head is ignored. Suppose the cluster head of the \(c\)th cluster assigns a different weight to each member node’s observation. The weight vector of the cluster can then be expressed as \(W_{c} = [w_{1} ,w_{2} , \ldots ,w_{{K_{c} }} ]^{T}\), and the weighted sum of the test statistics of all member nodes in the cluster also obeys a normal distribution:
$$R_{c} \sim\left\{ {\begin{array}{*{20}l} {{\mathcal{N}}\left( {\sum\limits_{i = 1}^{{K_{c} }} {w_{i} M\sigma_{n,ci}^{2} } ,\sum\limits_{i = 1}^{{K_{c} }} {2w_{i}^{2} M\sigma_{n,ci}^{4} } } \right),} \hfill & {H_{0} } \hfill \\ {{\mathcal{N}}\left( {\sum\limits_{i = 1}^{{K_{c} }} {w_{i} M\left( {1 + \gamma_{i} } \right)\sigma_{n,ci}^{2} } ,\sum\limits_{i = 1}^{{K_{c} }} {2w_{i}^{2} M\left( {1 + 2\gamma_{i} } \right)\sigma_{n,ci}^{4} } } \right),} \hfill & {H_{1} } \hfill \\ \end{array} } \right.$$
(4)
The weight vector reflects the contribution of each individual SU to the final fusion result, and two factors are taken into account: the SNR and the error rate of each member node [28, 29]. If an SU’s SNR is high, it should be assigned a larger weight for its better channel quality. In contrast, for an SU suffering deep fading or shadowing, the weight should be reduced so as to limit its negative effect on the final decision [30, 31].
In addition, the historical error rate of member nodes should also be considered [32]. Suppose that, over the previous \(t\) rounds, the sensing result of the \(i\)th SU was consistent with the actual PU state \(u_{ci}(t)\) times and inconsistent \(v_{ci}(t)\) times. The error rate factor can then be defined as:
$$g_{ci} (t) = \exp \left[ {\frac{{v_{ci} (t)}}{{u_{ci} (t) + v_{ci} (t)}}} \right]$$
(5)
By considering the above factors, the weighting coefficient is defined as:
$$w_{i} = \frac{{g_{ci} (t)\gamma_{i} }}{{\sqrt {\sum\limits_{i = 1}^{{K_{c} }} {\left( {g_{ci} (t)\gamma_{i} } \right)^{2} } } }}$$
(6)
where \(\gamma_{i}\) represents the signal-to-noise ratio of the \(i\)th SU.
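Eqs. (5)–(6) can be sketched numerically as follows; the consistency counts and SNR values below are invented purely for illustration:

```python
import numpy as np

# Toy inputs for three member nodes of one cluster (illustrative values).
u = np.array([90, 70, 95])        # results consistent with the actual PU state
v = np.array([10, 30, 5])         # inconsistent results
snr = np.array([2.0, 0.5, 3.0])   # per-node SNR gamma_i

g = np.exp(v / (u + v))                  # error-rate factor of Eq. (5)
w = g * snr / np.linalg.norm(g * snr)    # Eq. (6): weights normalized to unit norm
```

By construction \(\sum_{i} w_i^2 = 1\), and the high-SNR third node receives the largest weight.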
Assuming that the energy detection threshold of the \(c\)th cluster is \(\lambda_{c}\) and substituting it into the distribution of the test statistic, the false alarm probability and detection probability of the \(c\)th cluster can be obtained as follows:
$$P_{f,c} = Q\left( {\frac{{\lambda_{c} - \sum\limits_{i = 1}^{{K_{c} }} {Mw_{i} \sigma_{n,ci}^{2} } }}{{\sqrt {\sum\limits_{i = 1}^{{K_{c} }} {2M\sigma_{n,ci}^{4} w_{i}^{2} } } }}} \right)$$
(7)
$$P_{d,c} = Q\left( {\frac{{Q^{ - 1} \left( {P_{f,c} } \right)\sqrt {\sum\limits_{i = 1}^{{K_{c} }} {\left( {2M\sigma_{n,ci}^{2} + \gamma_{i}^{2} } \right)w_{i}^{2} } } - \sum\limits_{i = 1}^{{K_{c} }} {M\gamma_{i} \sigma_{n,ci}^{2} w_{i} } }}{{\sqrt {\sum\limits_{i = 1}^{{K_{c} }} {2M\left( {1 + 2\gamma_{i} } \right)\sigma_{n,ci}^{4} w_{i}^{2} + \gamma_{i}^{2} w_{i}^{2} } } }}} \right)$$
(8)
where \(Q(x) = \frac{1}{{\sqrt {2\pi } }}\int_{x}^{\infty } {\exp \left( { - \frac{{t^{2} }}{2}} \right)dt}\).
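Eq. (7) can be evaluated directly with the standard Gaussian tail function; in the sketch below the weights, sample count and noise variances are illustrative, and `p_false_alarm` is our own helper name:

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def p_false_alarm(lam, w, M, sigma_n2):
    """False-alarm probability of a cluster per Eq. (7).
    lam: threshold lambda_c; w: weight vector; sigma_n2: per-node noise variances."""
    mean0 = sum(wi * M * s2 for wi, s2 in zip(w, sigma_n2))            # H0 mean
    std0 = math.sqrt(sum(2 * M * s2 ** 2 * wi ** 2                     # H0 std dev
                         for wi, s2 in zip(w, sigma_n2)))
    return Q((lam - mean0) / std0)
```

Setting \(\lambda_{c}\) equal to the \(H_{0}\) mean gives \(P_{f,c} = Q(0) = 0.5\), and raising the threshold monotonically lowers the false alarm probability, as expected.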
Clustering formation
During cluster formation, \(C\) CHs must first be selected from the \(N\) SUs. A candidate CH should satisfy two requirements: it should be close to the FC, and it should also be close to the other SUs. The remaining SUs are then divided evenly among the \(C\) clusters according to the cluster-formation process. If the cooperative SUs within a cluster are far apart, the number of members per cluster will be relatively small, which degrades the cooperative sensing performance of the cluster and may make its decision inaccurate. The main idea of clustering is therefore to organize adjacent SUs into the same cluster.
The cluster-based cooperative spectrum sensing can be divided into two parts: spectrum sensing and intra-cluster data fusion [33, 34]. All SUs in each cluster sense the PU’s signal independently [35, 36]. The CH then receives the sensing observations from all member nodes in the cluster and decides the licensed user’s state. Compared with typical cooperative spectrum sensing, the cluster-based scheme makes better use of the spatial diversity of nodes at different geographical locations and reduces, as much as possible, the errors in the decision information sent by the SUs to the FC. For simplicity, we define the Euclidean distance \({\text{dis}}\left( {s_{i} ,s_{j} } \right)\) between the \(i\)th and \(j\)th nodes, and assume that the number of nodes in each cluster is an integer. The specific steps of the clustering process are as follows:

Step 1: The distances from all SUs to the FC are calculated, and the \(2C\) SUs with the shortest distances are selected as candidate CHs;

Step 2: The distance between each candidate CH and the centroid of all cooperative SUs is estimated. The \(C\) optimal nodes with the smallest distances are added to the CH set \(\{ {\text{CH}}_{1} ,{\text{CH}}_{2} , \ldots ,{\text{CH}}_{C} \}\);

Step 3: Initialize the member-node set of each cluster, the cluster center \(\hat{m}_{c}\), and the number of nodes in the cluster \(K_{c}\). The total number of residual SUs is denoted as \(N_{{{\text{res}}}} = N - C\).

Step 4: Calculate the distance between each SU in the residual-node set and the cluster centroids. If an SU satisfies \(c = \arg \min \{ {\text{dis}}(s_{i} ,{\text{CH}}_{c} )\}\), it joins the \(c\)th cluster and the cluster centroid is updated. The number of member nodes of the \(c\)th cluster is then incremented, i.e., \(K_{c} = K_{c} + 1\), and the total number of residual SUs is decreased, \(N_{{{\text{res}}}} = N_{{{\text{res}}}} - 1\);

Step 5: If \(K_{c} = \frac{N - C}{C}\), the \(c\)th cluster is full, and subsequent nodes are no longer admitted into it;

Step 6: If \(N_{{{\text{res}}}} > 0\), return to Step 4 and continue;

Step 7: The distances from all SUs in each cluster to the FC are calculated, and the nearest SU is selected as the CH. The CH assigns an ID to each member node, and the cluster formation ends.
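The steps above can be sketched as follows, assuming 2-D node coordinates and that \(N-C\) is divisible by \(C\) (so clusters fill evenly); all function names and the toy geometry are illustrative, not part of the paper:

```python
import math

def dis(a, b):
    """Euclidean distance between two 2-D points."""
    return math.dist(a, b)

def form_clusters(sus, fc, C):
    """sus: list of (x, y) SU positions; fc: FC position; C: number of clusters.
    Returns {cluster index: [node indices]}, first entry of each list is its CH."""
    N = len(sus)
    # Steps 1-2: shortlist the 2C SUs nearest the FC, then keep the C candidates
    # closest to the centroid of all cooperative SUs.
    shortlist = sorted(range(N), key=lambda i: dis(sus[i], fc))[:2 * C]
    centroid_all = (sum(p[0] for p in sus) / N, sum(p[1] for p in sus) / N)
    heads = sorted(shortlist, key=lambda i: dis(sus[i], centroid_all))[:C]
    # Step 3: initialise clusters, centroids, and the per-cluster member quota.
    clusters = {c: [h] for c, h in enumerate(heads)}
    centroids = {c: sus[h] for c, h in enumerate(heads)}
    capacity = (N - C) // C                       # K_c limit of Step 5
    residual = [i for i in range(N) if i not in heads]
    # Steps 4-6: attach each residual SU to the nearest non-full cluster
    # centroid, updating that centroid after every join.
    for i in residual:
        open_cs = [c for c in clusters if len(clusters[c]) - 1 < capacity]
        c = min(open_cs, key=lambda k: dis(sus[i], centroids[k]))
        clusters[c].append(i)
        members = clusters[c]
        centroids[c] = (sum(sus[j][0] for j in members) / len(members),
                        sum(sus[j][1] for j in members) / len(members))
    return clusters
```

Step 7 (re-selecting the member nearest the FC as the final CH and assigning IDs) is omitted here for brevity; it is a single pass over each cluster's member list.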