An intrusion detection method for internet of things based on suppressed fuzzy clustering
EURASIP Journal on Wireless Communications and Networking volume 2018, Article number: 113 (2018)
Abstract
To improve the effectiveness of intrusion detection, an intrusion detection method for the Internet of Things (IoT) is proposed based on the suppressed fuzzy clustering (SFC) algorithm and the principal component analysis (PCA) algorithm. In this method, the data are first classified into high-risk data and low-risk data, which are inspected at high frequency and low frequency, respectively. At the same time, the detection frequency is self-adjusted according to the suppressed fuzzy clustering algorithm and the principal component analysis algorithm. Finally, the key factors influencing the algorithm are analyzed by simulation experiments. The results show that, compared with traditional methods, this method has better adaptability.
Introduction
Owing to the rapid development and wide application of Internet of Things (IoT) techniques, the security of IoT has attracted increasing attention. IoT is a sensor network consisting of various sensor nodes, which are readily exposed to attacks as they are usually located at unmonitored sites [1, 2]. To make matters worse, attacks on IoT may cause more widespread damage than attacks on computer networks. Hence, security risks in all aspects of IoT and the corresponding strategies should be analyzed as a whole, and simplifying the end security setup is of great significance for IoT [3]. Detection systems should be further optimized based on analysis of the risk categories and security structures of IoT [4].
Intrusion detection methods judge attacks based on data collected at multiple collection points in a computer network [5, 6]. Intrusion detection is an active protection technique that can intercept and respond to intrusions before they reach the network. However, the huge network data traffic is a major challenge to intrusion detection systems, as it places high requirements on detection efficiency so that attacks can be detected in real time. A restricted Boltzmann machine (RBM) network was trained using the greedy algorithm, and the low-dimensional representations output by the RBM network were classified using the back-propagation algorithm [7]. The results indicated that the proposed model improves the accuracy of intrusion detection and is thus suitable for information extraction in high-dimensional space. To address data compression and low clustering efficiency, a modified self-adjustment clustering method was established based on direct correlation to samples close to the cluster center [8]. This method effectively reduced the clustering sample size and the time and space consumed by clustering, and improved the effectiveness of intrusion detection. For feature optimization and selection in intrusion detection, a support vector machine (SVM) based two-stage feature selection method was proposed based on feature evaluation using the ratio of detection rate to false alarm rate [9]. In this method, noise and irrelevant features were filtered out in the filtering mode using Fisher classification and information gain, respectively, to obtain overlapping feature subsets and effectively reduce modeling and detection time. For intrusion detection of internal nodes in wireless sensor networks (WSN), a layer-clustered intrusion detection method for the trust value of a node, based on the Beta distribution theory and an outlier factor, was proposed [10]. This method identifies abnormal nodes based on the Mahalanobis distance and exhibits low false alarm rates.
Based on classification methods in data mining, optimized solutions were identified by direction calculations of relevant matrices, and multi-category network attacks were analyzed by a multi-objective mathematical programming model [11]. This method exhibits advantages such as low complexity, effective detection of multi-category attacks, and low false alarm rates. A fuzzy clustering intrusion model based on a genetic algorithm and a hierarchical algorithm was proposed [12]. Herein, the feature volume was determined by deleting data set features using the Youden index, the susceptibility to initial cluster centers was relieved, and the local optimization issues in iteration were overcome. Experimental results demonstrated excellent detection performance of the proposed model for network attacks. To address kernel restrictions in the minimum enclosing ball algorithm, an intrusion detection method based on the minimum enclosing ball with extensive kernels was proposed [13]. This method obtains the minimum enclosing ball of the sample set by updating the center and the radius of the sphere, and categorizes network intrusions according to the distribution of support vectors. To enhance the accuracy of intrusion detection, an intrusion detection method combining artificial immunity and rough sets was proposed for vaccine injections [14]. This method can achieve real-time detection of unknown attacks with improved effectiveness and efficiency. Also, a network intrusion detection model was proposed and an alarm system combining multiple proof techniques was established to filter false alarms [15]. An integrated intrusion detection method based on an improved multi-objective genetic algorithm has been proposed [16]. This algorithm can effectively solve feature selection issues in intrusion detection, and the method based on it exhibited excellent detection accuracy and wide applicability to different categories of attacks.
Although intrusion detection technology has been widely used, many problems remain, such as a large number of alerts, high false alarm rates, poor generality, and false reports. In this article, an intrusion detection method for IoT is proposed based on the suppressed fuzzy clustering (SFC) algorithm [17, 18] and the principal component analysis (PCA) algorithm [19, 20]. Simulations demonstrated high detection efficiency and significantly reduced detection time for the proposed method. Section 2 describes the objective prejudgment model of intrusion detection, Section 3 proposes the intrusion detection solution, Section 4 presents simulations, and Section 5 gives the conclusion.
Objective prejudgment model
IoT consists of multiple sensor nodes with low communication traffic and short communication range. All nodes are identical and abnormal data packets can be detected by monitoring of wireless ports at all nodes. An intrusion detection system (IDS) for IoT is a solid system consisting of six closely related parts, including data packet monitoring, boundary identification, key management, local detection, voting, and local responses [21].
As the sensor nodes in IoT are readily exposed to attacks, an IDS proxy was designed for each node in IoT to realize network monitoring, group decision, and other operations. However, current algorithms are limited by drawbacks such as late alarms, high false alarm rates, and low detection efficiency [22]. This study focuses on optimizing the efficiency and effectiveness of intrusion detection. Owing to the rapidly increasing volume of data transmitted in IoT, feature extraction can be extremely time consuming, and its efficiency has a severe effect on detection. To guarantee good efficiency and effectiveness of intrusion detection, the feature vectors of the collected data were extracted by the PCA algorithm. In this way, the efficiency and effectiveness of the intrusion detection algorithm were significantly improved.
For a given sample, p variables were monitored over n groups of data (e.g., abnormal request data packets, missing data packets). The sample monitoring data matrix X can be described by:
where X is a matrix consisting of p column vectors (one per monitored variable), n is the number of monitored observations, and each observation is a point in p-dimensional space. The monitoring data of the original samples were standardized:
Then, X was projected to the low-dimensional space (T), as follows:
where W refers to the projection matrix, which is an orthogonal matrix whose columns are eigenvectors of the covariance matrix (V(x_{ j })); it can be calculated by:
where \( \overrightarrow{{\mathrm{W}}_{\mathrm{i}}} \), the ith column of W, is the eigenvector corresponding to one specific eigenvalue (λ_{ i }) of V. To achieve rapid detection, features with ambiguous values were eliminated to enhance the detection efficiency, and the eigenvectors with relatively large eigenvalues were selected. Herein, an objective prejudgment-based intrusion detection frequency self-adjustment algorithm for IoT is proposed. In this algorithm, the huge data flow is integrated and analyzed. More specifically, the data are classified using the clustering algorithm: the data sent to potential attack targets are classified as high-risk data, while other data are classified as low-risk data. The high-risk data and the low-risk data are inspected at high frequency and low frequency, respectively.
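As a reference sketch of the PCA steps described above, assuming the standard textbook form of PCA: the symbols \( \bar{x}_j \), \( s_j \), \( \tilde{X} \), and the number of retained components k are notation assumed here, and the authors' own equations may differ in detail.

\[ \tilde{x}_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad \bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \qquad s_j^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right)^2 \]

\[ V = \frac{1}{n-1}\,\tilde{X}^{T}\tilde{X}, \qquad V\,\overrightarrow{W_i} = \lambda_i\,\overrightarrow{W_i}, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0 \]

\[ T = \tilde{X}\,W_k, \qquad W_k = \left[ \overrightarrow{W_1}, \overrightarrow{W_2}, \dots, \overrightarrow{W_k} \right], \qquad k \le p \]

Under these assumptions, T is an n × k matrix: each monitored observation is represented by its coordinates along the k eigenvectors with the largest eigenvalues.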
Define the data set to be clustered as X = {x_{1}, x_{2}, …, x_{ n }}, where each sample x_{ k } (k = 1, 2, …, n) has several features, including data transmission rate, average length of data packets, and intervals of data emission. x_{k} = (x_{k1}, x_{k2}, …, x_{kj})^{T} ∈ R^{j} is the corresponding feature vector, which represents a point in the data feature space, and x_{ kj } is its component in the jth dimension. In this way, the data in X are categorized as high-risk data or low-risk data. The result is denoted as a c × n matrix (U = [u_{ij}]_{c*n}), which satisfies
The weighted distances from the samples to the cluster centers define the objective function, which can be calculated by
where U = [u_{ij}]_{2*n} is the fuzzy classification matrix (u_{ij} ∈ [0, 1]), m is the weighting exponent, m ∈ [1, ∞), and d_{ ij } is the Euclidean distance between x_{ j } and the ith cluster center v_{ i } (i = 1, 2, …, c). d_{ij} can be calculated by Eq. (11).
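The objective function described above can be illustrated with a minimal Python sketch. It assumes the standard fuzzy c-means form, J = Σ_i Σ_j u_ij^m d_ij², and the usual constraint that the memberships of each sample sum to 1; the authors' own Eqs. (9) and (10) may differ. The function names and the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def memberships_are_valid(U, tol=1e-9):
    """Assumed fuzzy-partition constraint: every column of U sums to 1."""
    n = len(U[0])
    return all(abs(sum(row[j] for row in U) - 1.0) <= tol for j in range(n))


def fcm_objective(X, U, V, m=2.0):
    """Assumed standard form: J = sum_i sum_j u_ij^m * d_ij^2."""
    return sum(
        (U[i][j] ** m) * euclidean(X[j], V[i]) ** 2
        for i in range(len(V))
        for j in range(len(X))
    )


# Toy data (hypothetical): three samples, c = 2 cluster centers.
X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
V = [(0.0, 1.0), (10.0, 0.0)]
U = [[1.0, 1.0, 0.0],   # memberships in cluster 0
     [0.0, 0.0, 1.0]]   # memberships in cluster 1

print(memberships_are_valid(U), fcm_objective(X, U, V))  # True 2.0
```

In this toy case the first two samples each lie at distance 1 from the first center and the third sample coincides with the second center, so the objective is 1 + 1 + 0 = 2.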
Intrusion detection method
To ensure rapid and effective detection of attacks, an objective prejudgment-based intrusion detection frequency self-adjustment method for IoT is proposed. Without distorting the original data, the PCA algorithm can reduce the number of variables and eliminate features with low discrimination. The dimension-reduced data are divided by the SFC algorithm into high-risk and low-risk data, which are inspected at different frequencies to achieve enhanced detection efficiency and accuracy. The procedure is as follows:

(a) Data initialization: set the number of fuzzy clusters to c = 2 and the optimized classification threshold to ε = 0.3, and initialize the number of detection machines as n, the minimum detection frequency of low-risk data as l_{ min }, the maximum detection frequency of high-risk data as l_{ max }, and the time interval as Δt.

(b) Data preprocessing: randomly initialize the membership (affiliation) matrix U = [u_{ij}]_{c*n} (u_{ij} ∈ [0,1]) such that Eq. (9) is satisfied. Determine the cluster centers v_{ i } (i = 1, 2, …, c) and the Euclidean distance (d_{ ij }) between the jth sample and the ith cluster center. d_{ ij } can be calculated by
The objective function is calculated using Eq. (10). If its value is no lower than the given optimized classification threshold, the process is repeated; if its value is lower than the threshold and no cluster changes are observed, the process ends.
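The iteration in step (b) can be sketched in Python as follows. This is the conventional (unsuppressed) fuzzy c-means loop, not the authors' suppressed variant, and it stops when the memberships change by less than a tolerance `eps`, which is a common convention assumed here rather than the threshold test on the objective function described above. All names and the sample data are hypothetical.

```python
import math
import random


def fcm(X, c=2, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Conventional fuzzy c-means (requires m > 1); returns (U, V)."""
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    # Random memberships, normalized so every column sums to 1.
    U = [[rng.random() + 1e-3 for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(U[i][j] for i in range(c))
        for i in range(c):
            U[i][j] /= s
    V = []
    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the samples.
        V = []
        for i in range(c):
            w = [U[i][j] ** m for j in range(n)]
            total = sum(w)
            V.append([sum(w[j] * X[j][k] for j in range(n)) / total
                      for k in range(dim)])
        # Memberships recomputed from distances to the new centers.
        new_U = [[0.0] * n for _ in range(c)]
        for j in range(n):
            d = [max(math.sqrt(sum((a - b) ** 2 for a, b in zip(X[j], V[i]))),
                     1e-12) for i in range(c)]
            for i in range(c):
                new_U[i][j] = 1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                        for k in range(c))
        change = max(abs(new_U[i][j] - U[i][j])
                     for i in range(c) for j in range(n))
        U = new_U
        if change < eps:  # assumed stopping rule
            break
    return U, V


def hard_labels(U):
    """Assign each sample to the cluster with the largest membership."""
    c, n = len(U), len(U[0])
    return [max(range(c), key=lambda i: U[i][j]) for j in range(n)]


# Hypothetical 2-D feature vectors forming two well-separated groups.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
           (9.9, 10.0), (10.1, 9.8), (10.0, 10.2)]
U, V = fcm(samples)
print(hard_labels(U))
```

With c = 2, one of the two resulting clusters would play the role of the high-risk class and the other the low-risk class; which index corresponds to which class depends on the random initialization.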
Because conventional fuzzy clustering algorithms converge slowly, a suppressor λ_{ ij } (0 < λ_{ij} < 1) is applied to d_{ ij } as a correction. Define the corrected distance as \( d_{ij}' \)
It satisfies

(c) First, the classified data are analyzed using the PCA algorithm, and features with low discrimination are eliminated to accelerate the detection process.

(d) Then, the data are inspected with self-adjustment of the detection frequency according to the detection results. Assume that the quantity of objective machines that can be effectively detected by objective prejudgment-based methods is
where n is the quantity of monitored objective machines, n′ is the quantity of monitored objective machines before application of objective prejudgment, and Δn is the quantity of monitored objective machines after application of objective prejudgment.
The total number (N) of data packets that can be detected per second can be obtained by:
where l_{min} refers to the minimum detection frequency of low-risk data, l_{max} refers to the maximum detection frequency of high-risk data, and t (0 ≤ t ≤ n) refers to the total number of objective data under attack. When the IoT system is not under attack, the detection frequency of low-risk data (P′) for each objective, with detection effectiveness guaranteed, is defined as
The detection frequency should be neither over-adjusted (high-risk data is identified as low-risk data) nor under-adjusted (high-risk data cannot be detected), so an appropriate frequency difference is of great significance. With detection effectiveness and accuracy guaranteed, the adjustable data detection frequency difference is defined as
In this way, the detection frequencies of high-risk data (P^{i}_{max}) and low-risk data (P^{i}_{low}) of the ith objective machine by the network intrusion detection system (NIDS) can be obtained
where η_{ i } is the ratio of abnormal data sent to the ith objective machine in Δt, A_{ i } refers to the detected abnormal data sent to the ith potential target in Δt, and C_{ i } refers to all detected abnormal data sent to the ith potential target in Δt. During detection, the detection frequency of the objective is adjusted in real time according to η_{ i } to optimize the detection accuracy.
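The idea of adjusting the detection frequency from η_{ i } can be illustrated with a small Python sketch. It assumes η_i = A_i / C_i and uses a purely hypothetical linear rule that maps η_i onto the range [l_min, l_max]; this is not the authors' formula for P^{i}_{max} and P^{i}_{low}, and the function names are illustrative only.

```python
def abnormal_ratio(a_i, c_i):
    """Assumed eta_i = A_i / C_i for one target in the interval; 0 if C_i is 0."""
    return a_i / c_i if c_i > 0 else 0.0


def adjust_frequency(eta, l_min, l_max):
    """Hypothetical linear rule: map eta in [0, 1] onto [l_min, l_max]."""
    eta = min(max(eta, 0.0), 1.0)  # clamp to the valid range
    return l_min + eta * (l_max - l_min)


# Example with made-up numbers: a quarter of the counted data is abnormal.
eta_i = abnormal_ratio(5, 20)
print(adjust_frequency(eta_i, l_min=10.0, l_max=100.0))  # 32.5
```

A rule of this shape keeps targets with no observed abnormal traffic at the minimum frequency and raises the frequency as the abnormal share grows, which matches the qualitative behavior described in step (d).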
Simulations
In this study, the objective prejudgment-based intrusion detection system for IoT was employed. The data were preprocessed using the SFC algorithm and the PCA algorithm and then inspected with frequency self-adjustment. The results indicated that the proposed algorithm can not only enhance the detection efficiency but also reduce the false alarm rate. Accuracy is the most important performance index of an intrusion detection system, and its value depends on the sample set and the test environment used in the test. Detection duration, accuracy rate, and false alarm rate are commonly used as evaluation indexes. Therefore, in this study, detection duration (T), accuracy (P), and false alarm rate (F) were employed as evaluation parameters:
where N_{ t } is the number of attack samples that are accurately detected, N_{ f } is the number of false alarms, N′ is the total number of attacks identified by the system, and N is the total number of real attacks.
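A minimal Python sketch of these evaluation parameters is given below. It assumes P = N_t / N (correctly detected attacks over all real attacks) and F = N_f / N′ (false alarms over all attacks flagged by the system), which is one reading consistent with the definitions above; the authors' exact formulas may differ. The function names and counts are hypothetical.

```python
def detection_accuracy(n_t, n_real):
    """Assumed P = N_t / N: correctly detected attacks over all real attacks."""
    return n_t / n_real if n_real > 0 else 0.0


def false_alarm_rate(n_f, n_flagged):
    """Assumed F = N_f / N': false alarms over all attacks flagged by the system."""
    return n_f / n_flagged if n_flagged > 0 else 0.0


# Made-up counts: 100 real attacks, 90 detected correctly, 10 false alarms.
n_t, n_f, n_real = 90, 10, 100
n_flagged = n_t + n_f  # everything the system reported as an attack
print(detection_accuracy(n_t, n_real), false_alarm_rate(n_f, n_flagged))  # 0.9 0.1
```

The detection duration T would simply be the elapsed wall-clock time of a detection run and is omitted from the sketch.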
Table 1 summarizes detection results of the proposed algorithm, the neural network algorithm, and the Bayesian algorithm. As observed, the accuracy of the proposed algorithm is higher than those of the other two algorithms. This can be attributed to data dimension reduction by PCA algorithm, which eliminates interferences by irrelevant factors. Additionally, the false alarm rate of the proposed algorithm is lower than those of the other two algorithms. Hence, it can be concluded that the proposed algorithm is viable.
Figure 1 summarizes the effects of the overall data size on the detection efficiency using different algorithms. As observed, the detection efficiencies of the three algorithms decreased as the data size increased, while the rate of decrease for the proposed algorithm is lower than those of the other two algorithms. The detection efficiency of the proposed algorithm is higher than those of the other two algorithms, regardless of the overall data size. Therefore, it can be concluded that the proposed algorithm can enhance detection efficiency.
Table 2 summarizes the detection results of the proposed algorithm. As observed, the detection rate of abnormal data increased over time. This can be attributed to the continuous self-adjustment of the detection frequency according to the responses received.
The accuracy is a key indicator for intrusion detections. The effects of overall data size on the detection accuracy were investigated using different algorithm models, and the results are shown in Fig. 2. As observed, the detection accuracy degraded as the data size increased in all three cases, as the increasing data size leads to decreasing detection efficiency. However, the decreasing rate of the proposed algorithm model was lower than those of the other two algorithms and the accuracy of the proposed algorithm model was higher than those of the other two algorithms in all cases. Therefore, the proposed algorithm model is viable.
Results
With the rapid development of IoT technology, security problems are becoming increasingly serious. All sensor nodes in IoT are vulnerable to external attacks. The huge network data traffic is a major challenge to intrusion detection systems, as it places high requirements on detection efficiency so that attacks can be detected in real time. However, current algorithms are limited by drawbacks such as late alarms, high false alarm rates, and low detection efficiency. To address the low effectiveness of intrusion detection for IoT, an intrusion detection method for IoT based on the SFC algorithm and the PCA algorithm is proposed. In this method, the data obtained are classified by objective prejudgment into high-risk data and low-risk data, which are inspected at high frequency and low frequency, respectively. Meanwhile, the self-adjustment of the detection frequency is achieved by employing the SFC algorithm and the PCA algorithm. Experimental results revealed improved applicability of the proposed method compared with conventional methods (e.g., neural network algorithm, Bayesian algorithm). The innovation of this paper is an objective prejudgment-based intrusion detection frequency self-adjustment method for IoT. Without distorting the original data, the PCA algorithm reduces the number of variables and eliminates features with low discrimination. The dimension-reduced data are divided into high-risk and low-risk data, which are then inspected at different frequencies to achieve enhanced detection efficiency and accuracy. This method can improve the effectiveness and accuracy of intrusion detection.
Discussion
In this study, the number of samples and the detection time are important factors that affect the efficiency of the algorithm. The key factors affecting this algorithm are analyzed by simulations. Experimental results show that, as the data volume increases, the efficiency and accuracy of the intrusion detection algorithm gradually decrease. Compared with the Bayesian algorithm and the neural network algorithm, the new algorithm in this paper still has better detection efficiency.
With the continuous development of related research, more and more new intrusion detection models for IoT will appear, and the set of evaluation indexes will continue to expand. In subsequent research, other indicators and new features of IoT can be combined to improve the intrusion detection model.
Abbreviations
IDS: Intrusion detection system
IoT: Internet of things
PCA: Principal component analysis
RBM: Restricted Boltzmann machine
SFC: Suppressed fuzzy clustering
SVM: Support vector machine
WSN: Wireless sensor networks
References
1. XD Hu, ZM Jia, A method of lightweight intrusion detection for the Internet of things. J. Chongqing Univ. Posts Telecommun. 2(27), 255–259 (2015)
2. GY Tang, Network intrusion detection method based on constraint fuzzy clustering thought. Natl. Sci. J. Xiangtan Univ. 3(39), 61–64 (2017)
3. F Jia, LZ Kong, Intrusion detection algorithm based on convolutional neural network. Trans. Beijing Inst. Technol. 12(37), 1271–1275 (2017)
4. XC Liu, SF Lu, W Zhao, et al., A cloud computing intrusion detection with objective function optimization based on fuzzy C-means clustering algorithm. J. Cent. South Univ. 7(47), 2320–2325 (2016)
5. GD Li, JP Hu, KW Xia, Intrusion detection using relevance vector machine based on cloud particle swarm optimization. Control Decision. 30(4), 698–702 (2015)
6. YH Yang, HZ Huang, QN Shen, et al., Research on intrusion detection based on incremental GHSOM. Chin. J. Comput. 37(5), 1216–1224 (2014)
7. N Gao, L Gao, YY He, Deep belief nets model oriented to intrusion detection system. Syst. Eng. Electron. 38(9), 2201–2207 (2016)
8. J Jiang, ZF Wang, TM Chen, et al., Adaptive AP clustering algorithm and its application on intrusion detection. J. Commun. 36(11), 118–126 (2015)
9. XN Wu, XJ Peng, YY Yang, et al., Two-level feature selection method based on SVM for intrusion detection. J. Commun. 36(4), 20151271–20151278 (2015)
10. WM Tong, JQ Liang, L Lu, et al., Intrusion detection scheme based node trust value in WSNs. Syst. Eng. Electron. 37(7), 1644–1649 (2015)
11. B Wang, XW Nie, Multicriteria mathematical programming based method on network intrusion detection. J. Comput. Res. Dev. 52(10), 2239–2246 (2015)
12. CH Tang, PC Liu, SS Tang, et al., Anomaly intrusion behavior detection based on fuzzy clustering and features selection. J. Comput. Res. Dev. 52(3), 718–728 (2015)
13. QA Wang, B Chen, Intrusion detection system using CVM algorithm with extensive kernel methods. J. Comput. Res. Dev. 49(5), 974–982 (2012)
14. L Zhang, ZY Bai, SS Luo, et al., Integrated intrusion detection model based on rough set and artificial immune. J. Commun. 34(9), 166–176 (2013)
15. ZH Tian, BL Wang, WZ Zhang, et al., Network intrusion detection model based on context verification. J. Comput. Res. Dev. 50(3), 498–508 (2013)
16. Y Yu, H Huang, An ensemble approach to intrusion detection based on improved multi-objective genetic algorithm. J. Softw. 18(6), 1369–1378 (2007)
17. B Liu, SX Xia, Y Zhou, et al., A sample-weighted possibilistic fuzzy clustering algorithm. Acta Electron. Sin. 40(2), 371–375 (2012)
18. AG Chen, ST Wang, Fuzzy clustering algorithm based on multiple medoids for large-scale data. Control Decision. 31(12), 2122–2130 (2016)
19. ZZ Liang, Y Li, SX Xia, et al., Principal component analysis based on L1-norm maximization with Lp-norm constraints. PR AI 26(2), 211–217 (2013)
20. Y Ruan, HW Chen, ZH Liu, et al., Quantum principal component analysis algorithm. Chin. J. Comput. 37(3), 666–676 (2014)
21. M Ahmed, AN Mahmood, J Hu, A survey of network anomaly detection techniques. J. Netw. Comput. Appl. 60, 19–31 (2016)
22. Y Chen, An efficient feature selection algorithm toward building lightweight intrusion detection system. Chin. J. Comput. 30(8), 1398–1408 (2015)
Acknowledgements
The research presented in this paper was supported by the Funds of Science & Technology Research of Guangdong Province and the Natural Science Foundation of Guangdong Province, China.
Funding
The authors acknowledge the Funds of Science & Technology Research of Guangdong Province, China (Grants: 2015B010128015 and 2017A040403070) and the Natural Science Foundation of Guangdong Province, China (Grant: 2017A030307027).
Author information
Affiliations
Contributions
LL is the main writer of this paper. She proposed the main idea, deduced the performance of the algorithm detection, completed the simulation, and analyzed the result. BX introduced the suppressed fuzzy clustering algorithm and principal component analysis algorithm and analyzed the data of the simulation experiment. XZ analyzed the key factors influencing the algorithm. XW gave some important suggestions for intrusion detection. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Bing Xu.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Internet of things
 Intrusion detection
 Suppressed fuzzy clustering algorithm
 Principal component analysis algorithm