 Research
 Open Access
GAAdaBoostSVM classifier empowered wireless network diagnosis
EURASIP Journal on Wireless Communications and Networking volume 2018, Article number: 77 (2018)
Abstract
Self-healing is one of the most important functions of a self-organizing mobile communication network. It detects declines in service quality, identifies the cause of network anomalies, and repairs them with a high degree of automation. Diagnosis, which identifies the fault cause of problematic cells or regions, is a particularly important task. To perform the diagnosis, this paper presents two modified ensemble classifiers that use Support Vector Machines (SVMs) with different kernels, i.e., SVM with the radial basis function (RBF) kernel (RBFSVM in short) and SVM with the linear kernel (LSVM in short), as component classifiers in Adaptive Boosting (AdaBoost). We call the two ensemble classifiers Adaptive Boosting based on RBFSVM (AdaBoostRBFSVM in short) and Adaptive Boosting based on linear kernel (AdaBoostLSVM in short). Unlike previous AdaBoostSVM classifiers, which use weak component classifiers, the classifiers in this paper are adaptively improved by using moderately accurate SVM classifiers (training error below 50%). To resolve the accuracy/diversity dilemma in AdaBoost and obtain good classification performance, the training error threshold is regulated to adjust the diversity of the classifier, and the SVM parameters (regularization parameter C and Gaussian width σ) are changed to control its accuracy. A reasonable parameter adjustment strategy balances accuracy and diversity. Results show that the proposed approaches outperform individual SVM approaches and generalize well. The AdaBoostLSVM classifier has higher accuracy and stability than the LSVM classifier. Compared with RBFSVM, the undetected rate and diagnosis error rate of AdaBoostRBFSVM decrease slightly, but the false positive rate drops substantially. This means that the AdaBoostRBFSVM classifier is usable in practice and can greatly reduce the number of normal-class samples that are wrongly classified. Therefore, the two ensemble classifiers based on SVM component classifiers can improve generalization performance when their parameters are adjusted reasonably. To set the parameter values of the component classifiers more reasonably and effectively, a genetic algorithm is introduced to find the set of parameter values that gives the best classification accuracy of AdaBoostSVM. The resulting ensemble classifier is called AdaBoostSVM based on genetic algorithm (GAAdaBoostSVM in short) and covers both AdaBoostLSVM based on genetic algorithm and AdaBoostRBFSVM based on genetic algorithm. Results show that GAAdaBoostSVM classifiers have a lower overall error than AdaBoostSVM classifiers, so the genetic algorithm helps the ensemble classifiers reach better performance.
Introduction
Over the past few years, wireless networks have undergone great changes. The coexistence of 2G, 3G, LTE/LTE-A, and HetNet architectures makes the wireless network increasingly complex. A sharp increase in traffic demand has forced operators to increase CAPital EXpenditures (CAPEX) and OPerational EXpenditures (OPEX). To reduce operating and maintenance costs, 3GPP introduced the self-organizing network (SON) [1]. SONs, a set of principles and concepts for increasing the automation of mobile networks, automatically choose network parameters to improve the key performance indicators (KPIs). SON functions fall into three categories: self-configuration, self-optimization, and self-healing [2]. Self-configuration includes automatic planning and deployment of the network, such as self-establishing base stations and automatic management during base station operation. Self-optimization refers to adaptively adjusting the parameters of network equipment according to its own operating conditions in order to optimize network performance. Self-healing, the ability to automatically recover from failures, includes detection, diagnosis, and recovery. This work centers on diagnosis, which identifies the fault cause of problematic cells or regions.
Recently, some research on network diagnosis has been published [3,4,5,6]. However, the number of papers on self-healing is limited for two major reasons. First, the fault causes and the corresponding KPIs are often not recorded when a fault occurs. Second, historical fault data from mobile networks is usually held by operators and is hard for the scientific community to obtain. To work around these problems, some scholars use simulators to generate faults and the corresponding network KPIs, but simulated data differs considerably from real network settings [7]. Despite this, many significant projects have been developed, such as the UniverSelf Project [8], the COMMUNE Project [9], and the SELFNET Project [10]. Most existing studies on network diagnosis apply concepts and techniques such as data mining [3, 11], self-organizing maps [4], genetic algorithms [5], fuzzy logic [6], and Bayesian networks [12, 13] to diagnose faults in communication networks. However, there is little research on network diagnosis based on Machine Learning [14,15,16,17,18,19,20]. In this paper, several supervised Machine Learning (ML) techniques, i.e., Support Vector Machine (SVM), Adaptive Boosting based on SVM (AdaBoostSVM), and AdaBoostSVM based on genetic algorithm (GAAdaBoostSVM), are used for network diagnosis.
Support Vector Machine (SVM) evolved from the optimal classification of linearly separable cases. The optimal classification surface must not only separate the two classes correctly (a training error rate of 0) but also maximize the margin between them. To obtain good classification results, kernel functions are usually used to map the training samples to a high-dimensional feature space. Many kernel functions, such as the linear kernel, the radial basis function (RBF) kernel, and the polynomial kernel, are commonly used in SVM. Two popular choices are the linear and RBF kernels: SVM with the linear kernel has one parameter, the regularization parameter C, while SVM with the RBF kernel has two, C and the Gaussian width σ. These parameters control the model complexity and the training error.
Adaptive Boosting (AdaBoost) [21] is an ensemble learning algorithm that improves the performance of the ensemble classifier by boosting the accuracy of weak classifiers. The weights of all training samples are set to the same value before the iterations start. After each iteration, the sample weights are adaptively adjusted according to the classification results: the weights of misclassified samples are increased, and the weights of correctly classified samples are decreased. Many studies have investigated Decision Trees [22], Neural Networks [23], or RBFSVM [17] as component classifiers in AdaBoost. To the best of our knowledge, few studies use the linear kernel for the component classifiers in AdaBoostSVM. It is well known that AdaBoost faces an accuracy/diversity dilemma: the more accurate two component classifiers are, the less they disagree with each other. AdaBoost demonstrates excellent generalization performance only if accuracy and diversity are well balanced. Therefore, how can we balance the accuracy/diversity dilemma in AdaBoostSVM?
In this paper, we try our best to find solutions to the following problems: Can we use the component classifiers based on linear kernel or RBF kernel to get better generalization performance in AdaBoostSVM? If we can, which classifier based on the different kernels could get better performance and why? How could we balance the accuracy/diversity dilemma in AdaBoostSVM? How could we set the parameter values of component classifier in a reasonable and effective way?
As mentioned above, two parameters, σ and C, in Adaptive Boosting based on RBFSVM (AdaBoostRBFSVM) and one parameter, C, in Adaptive Boosting based on linear kernel (AdaBoostLSVM) have to be set beforehand. According to the performance analysis of RBFSVM [17], σ is a more important parameter than C: within a proper range of C, the performance of RBFSVM depends mainly on the value of σ. As shown in [17], setting all RBFSVM component classifiers to a single σ makes the AdaBoost process unsuccessful because over-weak or over-strong component classifiers may appear. So, in this paper, the proposed AdaBoostRBFSVM method adaptively adjusts the value of σ in the RBFSVM component classifiers to obtain a set of moderately accurate RBFSVMs for AdaBoost. Similarly, the C values in the LSVM component classifiers are adaptively adjusted. In other words, we adjust the accuracy of the classifier by changing the values of the parameters C and σ. Furthermore, to address the accuracy/diversity dilemma in AdaBoost, we increase the diversity of the classifier by increasing the training error threshold: the greater the training error threshold, the more weak classifiers satisfy the condition, and the better the diversity of the classifier.
The performance of the ensemble classifier depends on the parameter values of each component classifier, so setting these values in a reasonable and effective way is a very important issue. A genetic algorithm is a method of searching for the optimal solution that is widely used in search and optimization problems. In this paper, a genetic algorithm is used to find the set of parameter values that gives the optimal performance of the ensemble classifiers.
In this paper, two modified ensemble classifiers, i.e., AdaBoostRBFSVM and AdaBoostLSVM, are employed for root cause analysis in the network, using the cases from [6]. The cases for training and validation were generated by a real LTE network. Each case includes information on Cause-KPI (key performance indicator) relations, which is used to train and validate the models generated by AdaBoostSVM with different kernels. By using a genetic algorithm to optimize the parameters and by controlling the training error threshold, a good balance between accuracy and diversity is achieved. Since SVM and AdaBoost were originally designed for binary classification, the OAO (One Against One) approach is used to build a multiclassifier from the binary classifiers. The results show that the two proposed algorithms based on AdaBoostRBFSVM and AdaBoostLSVM can automatically diagnose different classes of network anomalies with high accuracy, low diagnosis error rate, low false positive rate, and low undetected rate. The genetic algorithm is used to find the set of parameter values that gives the optimal accuracy of the ensemble classifier. GAAdaBoostSVM classifiers outperform AdaBoostSVM classifiers with a lower overall error. Therefore, the genetic algorithm helps the AdaBoostSVM classifiers approach their optimal performance.
The main contributions of this paper are as follows:

1. Proposed two modified ensemble learning algorithms using LSVM and RBFSVM as component classifiers, i.e., AdaBoostLSVM and AdaBoostRBFSVM, to improve wireless network troubleshooting performance.
2. Proposed a new method to solve the accuracy/diversity dilemma in AdaBoost to obtain the optimal performance of AdaBoostSVM.
3. A genetic algorithm is used to get the best classification accuracy of AdaBoostSVM.
4. The diversity of AdaBoostSVM is regulated by changing the training error threshold.
Problem formulation
There are three main tasks in the process of troubleshooting: detection, diagnosis, and recovery. This work centers on diagnosis, using the cases provided in [6]. The following sections provide the knowledge necessary to understand the diagnosis system, such as performance metrics, fault causes, and related KPIs.
KPIs
A KPI is an indicator that reflects network performance. Statistics on KPI values that fall below or rise above a certain threshold reflect the network performance of a cell or part of a region. In this paper, seven common KPIs, namely Retainability, Handover Success Rate (HOSR), Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Signal to Interference plus Noise Ratio (SINR), Distance, and Average throughput, are calculated for later diagnosis.
Fault causes
A network failure can cause abnormal KPI values. The causes of mobile network failure can generally be divided into three categories: coverage, mobility, and interference. The data set used in this paper is provided in [6] and was generated by a real LTE network. Six fault causes were selected: excessive uptilt (EU), excessive downtilt (ED), reduction in cell power (RP), coverage hole (CH), mobility, and inter-system interference (II). For more details about the KPIs and fault causes in this paper, please refer to [6].
Performance metrics
The accuracy of the case diagnosis is utilized to assess the diagnostic performance of the system. The higher the correct rate is, the better the system performance will be. Seven metrics were calculated to evaluate the performance of the diagnosis system.

(a) Diagnosis error rate (\(E_d\)): The proportion of misdiagnosed cases in the total number of cases. It shows the accuracy of the classifier.

(b) Undetected rate (\(E_u\)): The proportion of fault cases diagnosed as normal cases in the total number of fault cases. It shows the reliability of the classifier.

(c) False positive rate (\(E_{fp}\)): The ratio of normal cases diagnosed as fault cases to the total number of normal cases. It shows the availability of the classifier.

(d) Total error rate (\(E_p\)): The sum of the diagnosis error rate (DER) and the undetected rate (UDR). It is given by \(E_p = E_d + E_u\).

(e) Overall error (\(E\)): The probability that misdiagnosis occurs. It is given by \(E = P_n \cdot E_{fp} + P_p \cdot E_p\), where \(P_n\) and \(P_p\) are the percentages of normal and fault cases in the validation set, respectively.

(f) Complementary of the Positive Predictive Value (\(P_{fp}\)): The probability that a given positive diagnosis is a false positive, which indicates the importance of a low false positive rate. A high \(P_{fp}\) makes the system unreliable because too many of the fault cases that are diagnosed are not real. It is given by \( P_{fp} = \frac{P_n \times E_{fp}}{P_n \times E_{fp} + P_p \times \left(1 - E_u\right)} \).

(g) Confusion matrix: The confusion matrix is used to compare the mapping probabilities between the classification result and the true value. Each column of the confusion matrix represents a prediction category of data, and each row represents the true category of data.
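As a rough illustration of how metrics (a) to (f) relate to one another, the following minimal sketch computes them from lists of true and predicted labels. The function name `diagnosis_metrics` and the label `"normal"` are hypothetical. One reading is assumed and not confirmed by the text: the diagnosis error rate is counted here as fault cases assigned to a wrong fault cause, divided by the number of fault cases, so that the total error rate is the fraction of fault cases handled incorrectly.

```python
def diagnosis_metrics(y_true, y_pred, normal="normal"):
    """Compute E_d, E_u, E_fp, E_p, E and P_fp from label lists.

    Assumption: E_d is normalised by the number of fault cases, so that
    E_p = E_d + E_u is the share of fault cases diagnosed incorrectly.
    """
    pairs = list(zip(y_true, y_pred))
    n_total = len(pairs)
    n_normal = sum(1 for t, _ in pairs if t == normal)
    n_fault = n_total - n_normal

    # Fault cases diagnosed as a different fault cause.
    wrong_fault = sum(1 for t, p in pairs if t != normal and p != normal and p != t)
    # Fault cases diagnosed as normal (missed faults).
    missed = sum(1 for t, p in pairs if t != normal and p == normal)
    # Normal cases diagnosed as some fault (false alarms).
    false_alarm = sum(1 for t, p in pairs if t == normal and p != normal)

    e_d = wrong_fault / n_fault
    e_u = missed / n_fault
    e_fp = false_alarm / n_normal
    p_n = n_normal / n_total
    p_p = n_fault / n_total

    e_p = e_d + e_u
    e = p_n * e_fp + p_p * e_p
    denom = p_n * e_fp + p_p * (1 - e_u)
    p_fp = p_n * e_fp / denom if denom else 0.0
    return {"E_d": e_d, "E_u": e_u, "E_fp": e_fp, "E_p": e_p, "E": e, "P_fp": p_fp}


# Toy validation set: 4 normal cases and 6 fault cases (illustrative only).
y_true = ["normal"] * 4 + ["EU"] * 4 + ["ED"] * 2
y_pred = ["normal", "normal", "normal", "EU", "EU", "EU", "ED", "normal", "ED", "ED"]
metrics = diagnosis_metrics(y_true, y_pred)
```

In this toy set three of the ten cases are handled incorrectly (one false alarm, one wrong fault cause, one missed fault), and under the assumed normalisation the overall error matches that share.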
Fault management based on Machine Learning
Support Vector Machine
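SVM is a binary classification model whose main idea is to find the separating hyperplane that correctly classifies the training set and maximizes the geometric margin. The decision function of SVM can be expressed as \( f(\mathbf{x}) = \langle \mathbf{w}, \phi(\mathbf{x}) \rangle + b \), where \( \phi(\mathbf{x}) \) maps the input sample \( \mathbf{x} \) to a high-dimensional space [20] and \( \langle \cdot, \cdot \rangle \) denotes the dot product in the feature space. The optimal \( \mathbf{w} \) and \( b \) are obtained by solving the standard soft-margin problem:
\[ \min_{\mathbf{w}, b, \boldsymbol{\xi}} \ \frac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.} \quad y_i\left(\langle \mathbf{w}, \phi(\mathbf{x}_i) \rangle + b\right) \ge 1 - \xi_i, \ \ \xi_i \ge 0, \ \ i = 1, \dots, N \]
where \( y_i \in \{-1, +1\} \) is the label of training sample \( \mathbf{x}_i \), \( N \) is the number of training samples, \( \xi_i \) is the ith slack variable, and \( C \) is the regularization parameter. According to the Wolfe dual form, the above minimization problem can be written as:
\[ \max_{\boldsymbol{\alpha}} \ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j k(\mathbf{x}_i, \mathbf{x}_j) \quad \text{s.t.} \quad \sum_{i=1}^{N}\alpha_i y_i = 0, \ \ 0 \le \alpha_i \le C \]
where \( \alpha_i \) is the Lagrange multiplier corresponding to the sample \( \mathbf{x}_i \) and \( k(\cdot, \cdot) \) is a kernel function that implicitly maps the input vectors into an appropriate feature space, \( k(\mathbf{x}_i, \mathbf{x}_j) = \langle \phi(\mathbf{x}_i), \phi(\mathbf{x}_j) \rangle \). The linear kernel function is expressed as \( k\left({\mathbf{x}}_i,{\mathbf{x}}_j\right)={\mathbf{x}}_i^T{\mathbf{x}}_j \), and the RBF kernel function is expressed as \( k(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\|\mathbf{x}_i - \mathbf{x}_j\|^{2} / 2\sigma^{2}\right) \). By applying the kernel function, the samples are mapped to the high-dimensional feature space, and the optimal separating hyperplane is constructed in this space via SVM. Platt's sequential minimal optimization (SMO) [19] has been widely used for solving the SVM problem. SMO is a fast iterative algorithm that decomposes a large QP (quadratic programming) problem into a series of QP subproblems of the minimum size. Each QP subproblem has only two variables and can therefore be solved analytically, which speeds up training.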
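To make the two kernels concrete, the following minimal sketch evaluates them and the resulting kernel decision function \( f(\mathbf{x}) = \sum_i \alpha_i y_i k(\mathbf{x}_i, \mathbf{x}) + b \), which is the usual consequence of the dual form rather than a formula stated in the text. The support vectors, multipliers, and bias below are hand-picked illustrative values, not the output of a trained model.

```python
import math


def linear_kernel(a, b):
    """Linear kernel: the plain dot product of the two vectors."""
    return sum(p * q for p, q in zip(a, b))


def rbf_kernel(a, b, sigma):
    """RBF kernel exp(-||a - b||^2 / (2 * sigma^2)) with Gaussian width sigma."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def decision(x, support, alphas, labels, bias, kernel):
    """Kernel decision value: sum_i alpha_i * y_i * k(x_i, x) + bias."""
    return sum(a * y * kernel(sv, x) for sv, a, y in zip(support, alphas, labels)) + bias


# Hand-picked toy solution for two points on the x-axis (illustrative values):
# the multipliers satisfy sum(alpha_i * y_i) = 0, as the dual requires.
support = [(0.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]
labels = [-1, 1]
bias = -1.0

f_pos = decision((2.0, 0.0), support, alphas, labels, bias, linear_kernel)
f_neg = decision((0.0, 0.0), support, alphas, labels, bias, linear_kernel)
```

With the linear kernel both points sit exactly on the margin, one at decision value +1 and the other at -1. Swapping in `rbf_kernel` through a small wrapper that fixes `sigma` changes only the similarity measure, not the form of the decision function.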
AdaBoost
AdaBoost is an ensemble learning algorithm that improves the performance of the ensemble classifier by boosting the accuracy of weak classifiers. After each iteration, the weights of the training samples are changed according to the classification results: if a sample is misclassified, its weight is increased; otherwise, its weight is reduced. Each component classifier also receives a weight: the bigger its training error, the smaller its weight. Finally, all the classifiers are linearly combined to form the final classifier.
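The weight updates described above can be sketched for a single boosting round. The text does not spell out the update formulas, so this sketch assumes the standard discrete AdaBoost rules: classifier weight \( \alpha = \frac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon} \) for weighted training error \( \varepsilon \), and sample weights multiplied by \( \exp(-\alpha\, y_i h(\mathbf{x}_i)) \) and then renormalised. The function name `adaboost_round` is hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One discrete AdaBoost round for labels in {-1, +1}.

    Returns the classifier weight alpha and the renormalised sample weights.
    Assumes the standard AdaBoost update rules, not a formula from the paper.
    """
    # Weighted training error of the component classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Guard against log(0) or division by zero for perfect or useless classifiers.
    err = min(max(err, 1e-10), 1 - 1e-10)
    # Larger error -> smaller classifier weight.
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p = -1) grow, correct ones (t * p = +1) shrink.
    new_w = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]


# Four equally weighted samples; the last one is misclassified.
alpha, new_weights = adaboost_round([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1])
```

After this round the single misclassified sample carries half of the total weight, which forces the next component classifier to pay attention to it.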
Genetic algorithm
Genetic algorithm (GA) is a computational model that simulates the biological evolutionary process of natural selection and genetics of Darwin’s biological evolution [24, 25]. It is a method of searching for the optimal solution by simulating the natural evolutionary process. According to the principle of survival of the fittest, the genetic algorithm first generated an initial population of potential solution sets and then evolved generation after generation to get better and better approximate solution. At each generation, individuals were selected based on the fitness of individuals in the problem domain. Crossover and mutation were used to generate individuals that represented new potential solutions. The flow chart of GA is shown in Fig. 1.
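The generate, select, crossover, and mutate loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's configuration: it searches a single real-valued parameter, uses tournament selection, blend crossover, Gaussian mutation, and elitism, and all names and default settings (`genetic_search`, `pop_size`, `mutation_rate`, and so on) are hypothetical. The fitness function is a toy stand-in; in the setting of this paper it would be a classification accuracy.

```python
import random


def genetic_search(fitness, low, high, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Maximise fitness(x) for a single real parameter x in [low, high]."""
    rng = random.Random(seed)
    # Initial population of candidate solutions.
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0]]  # elitism: the best individual always survives
        while len(next_pop) < pop_size:
            # Tournament selection: the fittest of three random individuals.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Blend crossover: the child lies between its two parents.
            mix = rng.random()
            child = mix * p1 + (1 - mix) * p2
            # Gaussian mutation keeps some diversity in the population.
            if rng.random() < mutation_rate:
                child += rng.gauss(0, 0.1 * (high - low))
            next_pop.append(min(max(child, low), high))
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness with its maximum at 3.0 (a stand-in for classifier accuracy).
best, history = genetic_search(lambda c: -(c - 3.0) ** 2, 0.0, 10.0)
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases from one generation to the next.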
Proposed algorithm: AdaBoostRBFSVM
In this part, the RBFSVM classifier is employed as the component classifier in AdaBoost. The most important problem before the AdaBoost iterations is how to set the σ and C values of the RBFSVM component classifiers. According to the RBFSVM performance analysis in [26], σ affects the performance of a classifier more than C does: given a roughly suitable C, the performance of the RBFSVM classifier is largely determined by σ. Setting σ too large yields an RBFSVM component classifier that is too weak. Conversely, setting σ too small makes the RBFSVM component classifier too strong to be boosted. As shown in [17], giving all RBFSVM component classifiers a single σ value makes the boosting process unsuccessful. Therefore, in this paper, the σ value is adaptively adjusted to obtain a set of moderately accurate RBFSVM component classifiers. AdaBoostRBFSVM can be described as follows (Algorithm 1):
Firstly, weak RBFSVM classifiers are generated by setting a large σ value, and the weights of training samples are initialized to the same value.
Then, the weak RBFSVM classifier with the current σ value is trained on the weighted training set, and its training error is calculated. If the training error exceeds the threshold \(\varepsilon_{th}\), the σ value is decreased slightly by \(\sigma_{step}\) and the procedure goes back to step 3 of Algorithm 1. Otherwise, the weight of the RBFSVM classifier is set, and the weights of the training samples are updated so that the training error can be calculated in the next iteration. Decreasing the σ value only slightly prevents the new RBFSVM from being too strong for the current weighted training samples. Unlike the AdaBoostSVM in [17], which uses a fixed training error threshold (\(\varepsilon_{th} = 0.5\)), in this paper we adjust the diversity of the classifier by changing the training error threshold. The greater the training error threshold, the more weak classifiers satisfy the condition, and the better the diversity of the classifier. Therefore, by reasonably adjusting the values of σ and \(\varepsilon_{th}\), the accuracy/diversity dilemma can be balanced and the optimal parameter configuration of the classifier is obtained.
Furthermore, the weights of the component classifiers are set according to the classification results, i.e., component classifiers with lower training errors gain greater weights, and component classifiers with higher training errors get smaller weights. This process finishes when σ falls below the given minimal value.
Finally, AdaBoost makes a linear combination of all component classifiers into a single final hypothesis f.
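The control flow of the steps above can be sketched as follows. This is a sketch under several assumptions and is not Algorithm 1 itself: the component learner `train_stub` is a weighted RBF kernel vote standing in for a weighted RBFSVM (it is not an SVM), the weight updates follow standard discrete AdaBoost, and the `max_rounds` cap is added here only to guarantee termination. All function and parameter names are hypothetical.

```python
import math


def rbf(a, b, sigma):
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def train_stub(X, y, w, sigma):
    """Stand-in learner (NOT an SVM): weighted RBF kernel vote.

    Large sigma gives a weak, nearly majority-vote classifier;
    small sigma gives a strong, nearest-neighbour-like classifier.
    """
    def predict(x):
        score = sum(wi * yi * rbf(x, xi, sigma) for xi, yi, wi in zip(X, y, w))
        return 1 if score >= 0 else -1
    return predict


def adaboost_rbf(X, y, sigma_ini, sigma_min, sigma_step,
                 eps_th=0.5, max_rounds=50, train=train_stub):
    n = len(X)
    w = [1.0 / n] * n              # equal initial sample weights
    sigma = sigma_ini              # start from a large sigma (weak learner)
    ensemble = []                  # (alpha, sigma, classifier) triples
    while sigma > sigma_min and len(ensemble) < max_rounds:
        h = train(X, y, w, sigma)
        preds = [h(x) for x in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err > eps_th:
            sigma -= sigma_step    # too weak: strengthen slightly and retrain
            continue
        err = max(err, 1e-10)      # avoid log of infinity for a perfect learner
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, sigma, h))
        # Standard AdaBoost sample re-weighting, then renormalisation.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]

    def final(x):
        # Linear combination of all accepted component classifiers.
        return 1 if sum(a * h(x) for a, _, h in ensemble) >= 0 else -1
    return final, ensemble


# Toy 1-D data: one negative sample surrounded by positives (illustrative).
X = [(0.0,), (1.0,), (2.0,), (3.0,), (1.5,)]
y = [1, 1, 1, 1, -1]
f, ensemble = adaboost_rbf(X, y, sigma_ini=2.0, sigma_min=0.1,
                           sigma_step=0.1, eps_th=0.1, max_rounds=5)
```

On this toy set the wide kernels misclassify the isolated negative sample, so σ is stepped down until the training error falls below the threshold; only then are component classifiers accepted into the ensemble. A stricter (smaller) `eps_th` therefore admits fewer, stronger classifiers, which is the accuracy/diversity trade-off discussed above.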
Proposed algorithm: AdaBoostLSVM
This section employs LSVM as the component classifier in AdaBoost. As with AdaBoostRBFSVM, it is important to set the C value of the LSVM component classifiers during the AdaBoost iterations. The value of C represents the importance of outliers to the classifier: a larger C means that more attention is paid to the outliers, so they cannot be easily ignored. Increasing C pushes the classifier toward classifying every training sample correctly, but this leads to overfitting and poor generalization performance. Conversely, continuously decreasing C results in underfitting. Consequently, if all LSVM component classifiers were set to a single C value, the boosting process would be unsuccessful. Therefore, in this paper, a set of moderately accurate LSVM component classifiers is obtained by adaptively adjusting the C value. AdaBoostLSVM can be described as follows (Algorithm 2):
Firstly, weak LSVM classifiers are generated by setting a small C value, which means that the LSVM classifiers have weak learning ability, and the weights of training samples are initialized to the same value.
Then, LSVM with this C is trained for as many cycles as its training error stays below the threshold \(\varepsilon_{th}\). Once the training error reaches the threshold, the C value is increased slightly to strengthen the learning capability of LSVM so that its training error falls below \(\varepsilon_{th}\) again. As in AdaBoostRBFSVM, we adjust the diversity of the classifier by changing the training error threshold. By reasonably regulating the values of C and \(\varepsilon_{th}\), we can balance the accuracy/diversity dilemma and obtain the optimal parameter configuration of the classifier.
Furthermore, the weights of training samples will be adaptively adjusted by the classified results. This process continues until the C is increased to the given maximal value.
Finally, AdaBoost makes a linear combination of all component classifiers into a single final hypothesis f.
Proposed algorithm: GAAdaBoostSVM
The principle of AdaBoost is to linearly combine multiple component classifiers into an ensemble classifier. The values of the parameters (C and σ) play a big role in the performance of the component classifiers during the AdaBoost iterations. Different parameter values yield different component classifiers and therefore different performance of the ensemble classifier. Although the AdaBoostSVM algorithm can achieve good classification performance, it needs the values of \(C_{ini}\), \(C_{step}\), \(\sigma_{ini}\), and \(\sigma_{step}\) to be set in advance. Therefore, how to set the parameter values of the component classifiers in a reasonable and effective way is a very important issue. A genetic algorithm is a method of searching for the optimal solution that is widely used in search and optimization problems. In this paper, a genetic algorithm is used to find the optimal set of parameter values of the ensemble classifier. The GAAdaBoostSVM algorithm is shown in Algorithm 3. Depending on the kernel function, GAAdaBoostSVM is abbreviated as GAAdaBoostLSVM or GAAdaBoostRBFSVM.
Multiclassifier based on binary classification
Since SVM and AdaBoost were originally designed for binary problems, several methods were proposed to extend binary classifier to solve multiclassification problems. One approach is to decompose the multiclassification problem into multiple binary classification problems, and then, the classification result of each binary classifier is combined to obtain the final classification result. There are several commonly used multiclassification methods based on binary classifier, such as OAA (One Against All), OAO (One Against one), and DAG (directed acyclic graph) [27,28,29].
The OAA method is to classify the samples of one category into one class, and the rest of the samples are classified as another one. In this way, samples of k categories construct k classifiers. The classification result is to classify the unknown sample into the class with the maximum value of the classification function. The advantage of this method is that for the k classification problem, only k binary classifiers need to be trained, so the number of the classification functions (k) obtained is less, and the classification speed is relatively fast. The disadvantage is that it will cause imbalances in the categories, which greatly affect classification accuracy. Therefore, it is not very practical.
The OAO method designs a classifier between every pair of classes, so k(k − 1)/2 classifiers are needed for samples of k classes. The classification result is to classify the unknown sample into the class with the maximum value of the classification function. The advantage of this method is that the training accuracy is relatively high; the disadvantages are that classification is slow and the computational cost is high.
Similar to the OAO method, the DAG method also needs to construct k(k − 1)/2 binary classifiers and obtain the corresponding decision functions of these classifiers. However, in classification, the DAG method is classified by constructing a “binary directed acyclic graph” with a root node. The “binary directed acyclic graph” has k(k − 1)/2 internal nodes and k leaf nodes. Each internal node corresponds to a binary classifier, and each leaf node corresponds to a class.
For the OAO and DAG methods, OAO is generally considered slightly more accurate than DAG for the same training time, while the testing time of DAG is slightly lower or the same. The most commonly used multiclassification methods are OAO and OAA, and OAO is more suitable for practical applications. Therefore, in this paper, the OAO approach is used to combine the binary classifiers into the multiclassification model. The algorithm of the multiclassifier based on binary classification is shown in Algorithm 4.
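The OAO decomposition can be sketched as follows. This is not Algorithm 4: the binary learner `train_binary` is a nearest-centroid stand-in for the AdaBoostSVM binary classifier, and the pairwise results are combined by majority voting, which is a common OAO rule assumed here rather than taken from the text. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def train_binary(X_first, X_second):
    """Stand-in binary learner (nearest centroid, not AdaBoostSVM).

    Returns +1 if x is closer to the first class, otherwise -1.
    """
    c1, c2 = centroid(X_first), centroid(X_second)

    def predict(x):
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        d2 = sum((a - b) ** 2 for a, b in zip(x, c2))
        return 1 if d1 <= d2 else -1
    return predict


def train_oao(X, y, train=train_binary):
    """Train one binary classifier per pair of classes: k(k - 1)/2 in total."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        X_a = [x for x, t in zip(X, y) if t == a]
        X_b = [x for x, t in zip(X, y) if t == b]
        models[(a, b)] = train(X_a, X_b)
    return models


def predict_oao(models, x):
    """Each pairwise classifier casts one vote; the most voted class wins."""
    votes = Counter()
    for (a, b), h in models.items():
        votes[a if h(x) == 1 else b] += 1
    return votes.most_common(1)[0][0]


# Toy data with three fault labels (coordinates are illustrative only).
X = [(0, 0), (1, 0), (10, 0), (9, 1), (0, 10), (1, 9)]
y = ["CH", "CH", "ED", "ED", "EU", "EU"]
models = train_oao(X, y)
label_a = predict_oao(models, (8, 1))
label_b = predict_oao(models, (1, 8))
```

With three classes the sketch trains three pairwise classifiers. Passing a different `train` callable would plug in another binary learner without changing the voting logic.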
Evaluation
Case study
In this work, the training and validation cases are provided in [6]. The training data is a set of cases following the format described in Fig. 2. A case is a vector consisting of multiple KPIs and corresponding fault cause.
Experimental design
There are 550 cases in the training set and 4009 cases in the validation set. The distribution of fault causes in the training set is shown in Fig. 3, and the sample distribution of the validation set is comparable to that of the training set. All the algorithms were first trained with the training cases and then tested with the validation cases, and five performance metrics (\(E_d\), \(E_u\), \(E_{fp}\), \(E\), and \(P_{fp}\)) were calculated. To compare the generalization performance of the proposed algorithms with that of other algorithms, the same performance metrics were calculated for the other algorithms on the same training and validation sets. To make the classification results more reliable, each test on the validation set is repeated 100 times and the results are averaged. The main parameters of the algorithm for testing and evaluation are shown in Table 8 (Appendix).
Results and discussion
Evaluation based on LSVM
The parameter C has a great influence on the performance of the LSVM classifier. If C is too small, the classifier underfits; if C is too large, it overfits. We therefore tested the classifier with different C values to find the optimal one. The diagnosis error rate (DER), undetected rate (UDR), false positive rate (FPR), overall error (OE), and complementary of the Positive Predictive Value (PFP) for different C values are shown in Fig. 4. As the C value increases, the five metrics first decrease, then increase, and finally stay within a relatively stable range. The minimum OE and PFP values are obtained at C = 0.5.
Evaluation based on RBFSVM
The parameters C and σ have a great influence on the performance of the RBFSVM classifier. Given a roughly suitable C, the performance of the RBFSVM classifier is largely determined by the σ value, which also influences the complexity of the classifier. With a larger σ, the complexity of the classifier usually decreases and classification performance becomes poor. Conversely, with a small σ, the complexity of the classifier increases and good classification performance can be achieved. Several performance metrics for different σ values are shown in Fig. 5, where C is set to 1. The minimum OE is obtained when σ = 3. Compared with LSVM, the UDR of RBFSVM is significantly higher, which means that the LSVM classifier can decrease the number of minority class samples that are misclassified.
Evaluation based on AdaBoostLSVM
Good generalization performance cannot be obtained with a value of C that is too large or too small. Simply applying a single C to all LSVM component classifiers does not lead to a successful AdaBoost because of the overfitting or underfitting encountered in the boosting process. So, in this section, the proposed AdaBoostLSVM approach adaptively adjusts the C value in the LSVM component classifiers to obtain a set of moderately accurate LSVMs for AdaBoost. To increase the diversity of the classifier, we varied the training error threshold from 0.01 to 0.5 and calculated several performance metrics for different \(\varepsilon_{th}\) and C. Table 1 shows several performance metrics for different \(\varepsilon_{th}\) at the optimal C values. It can be seen that the optimal parameter configuration of the classifier is obtained by reasonably regulating the values of C and \(\varepsilon_{th}\). Figure 6 shows the best performance metrics with the optimal C values and training error threshold \(\varepsilon_{th}\). Compared with LSVM, all performance metrics of AdaBoostLSVM improve significantly. The overall error is reduced to 6.8%, and the complementary of the Positive Predictive Value is reduced to 4.1%. This means that the AdaBoostLSVM classifier has higher accuracy and stability than LSVM. Therefore, the ensemble classifier based on the LSVM component classifier can boost generalization performance when the values of C and \(\varepsilon_{th}\) are regulated reasonably.
Table 2 shows the normalized confusion matrix of the AdaBoostLSVM method with optimal parameter values. The diagonal of the matrix represents the diagnosis success rate of each problem in the system. An accuracy above 50% is obtained for each problem, which illustrates that the diagnosis system is indeed usable. The normal cases have the highest diagnosis success rate, which demonstrates that the diagnostic system has high availability with a low FPR. However, the diagnostic accuracy of II, CH, and TLHO is relatively low, which corresponds to a high DER, and the probability that each fault is misdiagnosed as normal is higher than the probability that it is misdiagnosed as another fault, which corresponds to a high UDR.
Evaluation based on AdaBoostRBFSVM
According to the previous analysis, σ is a more important parameter than C: the performance of the classifier is largely determined by σ. For comparison with RBFSVM, we tested several performance metrics for different σ values with C set to 1. To increase the diversity of the classifier, we varied the training error threshold ε_{ th } from 0.01 to 0.5 and calculated several performance metrics for different values of ε_{ th } and σ. Table 3 shows these performance metrics for different ε_{ th } at the optimal σ values. It can be seen that the optimal parameter configuration of the classifier can be obtained by suitably regulating σ and ε_{ th }. Figure 7 shows the best performance metrics obtained with the optimal σ values and training error threshold ε_{ th }.
The false positive rate (FPR) reflects the ability to filter out normal cases. A high FPR indicates that a diagnostic system is not usable because of the high probability of false positives. Compared with RBFSVM, the UDR and DER of AdaBoostRBFSVM are only slightly reduced, but the FPR is reduced considerably. This means that the AdaBoostRBFSVM classifier is indeed usable and can largely reduce the number of normal cases that are misclassified. Furthermore, the OE and PFP also decrease, indicating that the AdaBoostRBFSVM classifier has higher accuracy and reliability than RBFSVM. Compared with AdaBoostLSVM, AdaBoostRBFSVM shows a significantly higher UDR. This means that the AdaBoostLSVM classifier, like LSVM, can also reduce the number of minority class samples that are misclassified.
Table 4 shows the normalized confusion matrix of the AdaBoostRBFSVM method with optimal parameter values. Compared with Table 2, the diagnosis success rate of normal cases increases, but the diagnostic accuracy of the other problems decreases considerably. The diagnostic accuracy of CH is reduced to 35.92%, which indicates poor diagnosis performance. Furthermore, with a high UDR and DER, the probability that each fault is misdiagnosed as normal or as another fault increases significantly. Overall, the performance of AdaBoostRBFSVM is worse than that of AdaBoostLSVM.
Evaluation based on GAAdaBoostSVM
The difference between this algorithm and the traditional genetic algorithm is that the best parameter sought in this paper is a set of numerical values rather than a single numerical value. Therefore, in the initial population stage, the individuals in multiple populations are randomly assigned different numerical values. The set of individuals in each population represents a set of potentially optimal solutions. At each generation, the fitness value of each population is calculated and used to select the populations for the next generation by the roulette wheel selection method. To prevent the solution set from falling into a local optimum, crossover and mutation are used to generate populations that represent new sets of solutions. The basic process of the genetic algorithm is summarized as follows:

1. Initial population: Each population is a possible solution to the problem. Each individual of a population is randomly selected and coded as binary bits. For the GAAdaBoostRBFSVM classifier, both the C and σ values are coded as binary bits; for GAAdaBoostLSVM, only the C value is coded. Multiple populations are introduced to obtain sets of parameter values that optimize the objective function, and each population contains the same number of individuals, which represent different parameter values. In this paper, 100 populations are initially generated, and each population includes 15 randomly generated individuals.
2. Evaluation: The fitness value of each population, computed by the fitness function, determines whether the population will survive and reproduce in future generations. In this paper, the overall error (OE) is used as the fitness function.
3. Selection: A population with better fitness has a greater probability of being selected for the set of populations in the next generation. Roulette wheel selection is used to choose the population sets for the next generation in this paper.
4. Crossover: Crossover refers to the operation of generating a new individual by replacing and reorganizing parts of two parental individuals. Crossover greatly increases the search power of genetic algorithms. A single-point crossover operator is used in this paper, and the crossover rate is set to 0.8.
5. Mutation: Mutation refers to the variation of certain gene values of individual strings to increase the population diversity. The mutation rate is set to 0.1.
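The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each candidate is a set of 15 binary-coded C values, the 8-bit encoding and the search range for C are assumed, `toy_overall_error` is a hypothetical stand-in for the overall error of a trained ensemble, selection weights are assumed to be 1 − OE, and mutation is assumed to flip one random bit per mutated candidate. The set size, number of sets, crossover rate, and mutation rate follow the values given in the text.

```python
import random

BITS = 8                     # bits per encoded parameter (assumed)
SET_SIZE = 15                # parameter values per candidate set (from the text)
N_SETS = 100                 # candidate sets per generation (from the text)
P_CROSS, P_MUT = 0.8, 0.1    # crossover and mutation rates (from the text)
C_MIN, C_MAX = 0.01, 100.0   # search range for C (assumed)

def decode(bits):
    """Map a flat bit list to SET_SIZE values in [C_MIN, C_MAX]."""
    values = []
    for k in range(0, len(bits), BITS):
        n = int("".join(map(str, bits[k:k + BITS])), 2)
        values.append(C_MIN + (C_MAX - C_MIN) * n / (2 ** BITS - 1))
    return values

def toy_overall_error(values):
    """Hypothetical stand-in for the ensemble's OE, bounded in [0, 1]."""
    return sum(abs(v - 10.0) for v in values) / (len(values) * (C_MAX - C_MIN))

def roulette(pop, errors, rng):
    """Roulette wheel: selection probability proportional to 1 - OE."""
    weights = [1.0 - e + 1e-9 for e in errors]
    return rng.choices(pop, weights=weights, k=1)[0]

def evolve(generations=50, seed=0):
    rng = random.Random(seed)
    length = BITS * SET_SIZE
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(N_SETS)]
    best_bits, best_err, history = None, float("inf"), []
    for _ in range(generations):
        errors = [toy_overall_error(decode(c)) for c in pop]
        for cand, err in zip(pop, errors):
            if err < best_err:
                best_bits, best_err = cand[:], err
        history.append(best_err)  # best-so-far OE per generation
        nxt = []
        while len(nxt) < N_SETS:
            a = roulette(pop, errors, rng)[:]
            b = roulette(pop, errors, rng)[:]
            if rng.random() < P_CROSS:       # single-point crossover
                cut = rng.randrange(1, length)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                if rng.random() < P_MUT:     # flip one random bit
                    child[rng.randrange(length)] ^= 1
                nxt.append(child)
        pop = nxt[:N_SETS]
    return decode(best_bits), best_err, history
```

In a real setting, `toy_overall_error` would be replaced by a routine that trains the AdaBoost ensemble with the decoded parameter set and returns its overall error on validation data.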
The parameters of the genetic algorithm are listed in Table 5. Figures 8 and 9 show the variation of the overall error of the GAAdaBoostLSVM and GAAdaBoostRBFSVM classifiers, respectively, at different training error thresholds. As can be seen, once the number of iterations exceeds 120, the overall error no longer changes.
Figure 10 shows the minimum overall error of the different classifiers at different training error thresholds. Compared with AdaBoostSVM, GAAdaBoostSVM has better classification performance and achieves a lower overall error regardless of the threshold. The GAAdaBoostLSVM and GAAdaBoostRBFSVM classifiers reach their minimum OE when ε_{ th } is set to 0.05 and 0.3, respectively, which further supports our earlier point: the accuracy/diversity dilemma can be resolved through a reasonable parameter adjustment strategy, and using a genetic algorithm makes this adjustment more reasonable and effective.
Tables 6 and 7 show the normalized confusion matrices of the GAAdaBoostLSVM and GAAdaBoostRBFSVM methods, respectively, with optimal parameter values. Compared with Tables 2 and 4, the diagnosis success rate of each fault cause in Tables 6 and 7 increases significantly at the expense of a slight drop in FPR. Moreover, the probability that each fault is misdiagnosed as normal or as another fault decreases significantly, and a low DER and UDR are obtained. Comparing Tables 6 and 7, we can see that the diagnosis success rate of each fault cause in Table 6 is higher than that in Table 7, with a slight decrease in FPR. This demonstrates that the GAAdaBoostLSVM classifier, with a low UDR and DER and almost the same FPR, has better classification performance than the GAAdaBoostRBFSVM classifier on this sample set.
Conclusions
In conclusion, two multi-classification diagnosis systems based on AdaBoostRBFSVM and AdaBoostLSVM have been presented for mobile network self-diagnosis. Both diagnosis systems can automatically detect and diagnose different classes of network anomalies with good performance. Before the proposed approaches were evaluated, the performance of individual LSVM and RBFSVM classifiers was first tested to find a suitable range of parameters. Then, the AdaBoostRBFSVM and AdaBoostLSVM approaches were employed to perform the diagnosis. The results show that the two proposed approaches outperform individual SVM approaches and show good generalization performance. The AdaBoostLSVM classifier has higher accuracy and stability than the LSVM classifier. Compared with RBFSVM, the UDR and DER of AdaBoostRBFSVM are only slightly reduced, but the FPR is reduced considerably. This means that the AdaBoostRBFSVM classifier is indeed usable and can largely reduce the number of normal class samples that are misclassified. Through parameter-adjusting strategies, the distributions of accuracy and diversity over the component classifiers can be tuned to achieve a good balance. Therefore, an ensemble classifier based on SVM component classifiers can boost generalization performance when its parameters are regulated suitably. To obtain a more accurate and effective classifier, a genetic algorithm is used to make more reasonable adjustments to the classifier parameters.
In this paper, we did not consider the effect of imbalanced data on the classifier performance. So, in the next step, we will consider some data balancing methods [22], such as random oversampling, undersampling, and synthetic minority oversampling technique (SMOTE), to reduce the impact of data imbalance on diagnostic performance.
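For reference, the simplest of the balancing methods mentioned above, random oversampling, can be sketched as follows. This is a generic illustration and not part of the experiments in this paper; the function name `random_oversample` and the choice to raise every class to the size of the largest class are assumptions.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes
    have as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y
```

Oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the test set.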
Abbreviations
3GPP: Third Generation Partnership Project
AdaBoost: Adaptive Boosting
AdaBoostLSVM: Adaptive Boosting based on linear kernel
AdaBoostRBFSVM: Adaptive Boosting based on RBFSVM
AdaBoostSVM: Adaptive Boosting based on SVM
CAPEX: Capital Expenditures
CH: Coverage hole
DAG: Directed acyclic graph
DER: Diagnosis error rate
ED: Excessive downtilt
EU: Excessive uptilt
FPR: False positive rate
GA: Genetic algorithm
GAAdaboostLSVM: AdaboostLSVM based on genetic algorithm
GAAdaboostRBFSVM: AdaboostRBFSVM based on genetic algorithm
GAAdaboostSVM: AdaboostSVM based on genetic algorithm
HetNet: Heterogeneous network
HOSR: Handover Success Rate
II: Intersystem interference
KPIs: Key performance indicators
LSVM: SVM with the linear kernel
LTE: Long Term Evolution
LTEA: LTE-Advanced
ML: Machine Learning
OAA: One Against All
OAO: One Against one
OE: Overall error
OPEX: Operational Expenditures
PFP: Complementary of the Positive Predictive Value
RBF: Radial basis function
RBFSVM: SVM with the RBF kernel
RP: Reduction in cell power
RSRP: Reference Signal Received Power
RSRQ: Reference Signal Received Quality
SINR: Signal to Interference Noise Ratio
SMOTE: Synthetic minority oversampling technique
SONs: Self-organizing networks
SVM: Support Vector Machine
UDR: Undetected rate
References
1. 3GPP. (2012). Telecommunication Management; Self-Organizing Networks (SON); Concepts and Requirements. Next Generation Mobile Networks (NGMN) Alliance, ts 32.500 edn
2. Self-Organizing Networks (SON), Concepts and Requirements Version 12.1.0, 3GPP TS, vol 32 (2014), p. 500
3. EJ Khatib, R Barco, P Muñoz, et al, Knowledge Acquisition for Fault Management in LTE Networks[J]. Wirel. Pers. Commun. 95, 1–20 (2017)
4. A Gómez-Andrades, P Muñoz, I Serrano, et al., Automatic root cause analysis for LTE networks based on unsupervised techniques[J]. IEEE Trans. Veh. Technol. 65(4), 2369–2386 (2016)
5. EJ Khatib, R Barco, A Gómez-Andrades, et al., Diagnosis based on genetic fuzzy algorithms for LTE self-healing[J]. IEEE Trans. Veh. Technol. 65(3), 1639–1651 (2016)
6. A Gómez-Andrades, P Muñoz, EJ Khatib, et al., Methodology for the design and evaluation of self-healing LTE networks[J]. IEEE Trans. Veh. Technol. 65(8), 6468–6486 (2016)
7. Rezaei S, Radmanesh H, Alavizadeh P, et al. Automatic fault detection and diagnosis in cellular networks using operations support systems data[C]// NOMS 2016–2016 IEEE/IFIP Network Operations and Management Symposium. IEEE, 2016:468–473.
8. Univerself. (2012). Univerself project. http://www.universelfproject.eu/.
9. COMMUNE. (2012). Commune (Cognitive Network Management Under Uncertainty).
10. SELFNET. (2015). Selfnet Project. https://selfnet5g.eu/.
11. EJ Khatib, R Barco, A Gómez-Andrades, et al., Data mining for fuzzy diagnosis systems in LTE networks[J]. Expert Syst. Appl. 42(21), 7549–7559 (2015)
12. Iacoboaiea O, Sayrac B, Jemaa S B, et al. SON conflict diagnosis in heterogeneous networks[C]//Personal, Indoor, and Mobile Radio Communications (PIMRC), 2015 IEEE 26th Annual International Symposium on. IEEE, 2015: 1459–1463.
13. R Barco, V Wille, L Díez, M Toril, Learning of model parameters for fault diagnosis in wireless networks. Wireless Netw. 16(1), 255–271 (2010)
14. J Moysen, L Giupponi, A Reinforcement Learning Based Solution for Self-Healing in LTE Networks[C]//Vehicular Technology Conference. IEEE, 2014:1–6
15. L Flores-Martos, A Gomez-Andrades, R Barco, et al, Unsupervised System for Diagnosis in LTE Networks Using Bayesian Networks[C]//Vehicular Technology Conference. IEEE, 2015:1–5
16. P Casas, A D'Alconzo, P Fiadino, et al, Detecting and Diagnosing Anomalies in Cellular Networks Using Random Neural Networks[C]//Wireless Communications and Mobile Computing Conference. IEEE, 2016:351–356
17. X Li, L Wang, E Sung, AdaBoost with SVM-based component classifiers[J]. Eng. Appl. Artif. Intell. 21(5), 785–795 (2008)
18. H He, EA Garcia, Learning from imbalanced data[J]. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)
19. Platt J. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines[J]. 1998.
20. Xuewen Liu, Gang Chuai, Weidong Gao, Yifang Ren and Kaisa Zhang. Diagnosis Based on Machine Learning for LTE Self-Healing[M]// The Proceedings of the Sixth International Conference on Communications, Signal Processing, and Systems. Springer International Publishing (Accepted).
21. RE Schapire, Y Singer, Improved boosting algorithms using confidence-rated predictions. Mach. Learn. 37(3), 297–336 (1999)
22. TG Dietterich, An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization[J]. Mach. Learn. 40(2), 139–157 (2000)
23. H Schwenk, Y Bengio, Boosting neural networks[J]. Neural Comput. 12(8), 1869–1887 (2000)
24. D Yang, Z Liu, T Shu, et al., An improved genetic algorithm for multiobjective optimization of helical coil electromagnetic launchers[J]. IEEE Transactions on Plasma Science PP(99), 1–7 (2017)
25. CZ Cooley, MW Haskell, SF Cauley, et al., Design of sparse Halbach magnet arrays for portable MRI using a genetic algorithm[J]. IEEE Trans. Magn. PP(99), 1–12 (2017)
26. G Valentini, TG Dietterich, Bias-variance analysis of support vector machines for the development of SVM-based ensemble methods[J]. J. Mach. Learn. Res. 5(Jul), 725–775 (2004)
27. C.Wei Hsu, C.Jen Lin, A comparison of methods for multiclass support vector machines Neural Networks, IEEE Trans. on, vol. 13, no. 2, pp. 415‑425, 2002.
28. G Madzarov, D Gjorgjevikj, Multiclass classification using support vector machines in decision tree architecture. EUROCON 2009, 288–295 (2009)
29. HJ Rong, GB Huang, YS Ong, Extreme learning machine for multicategories classification applications[C]// IEEE International Joint Conference on Neural Networks. IEEE, 2016:1709–1713
Acknowledgements
This work was funded by the National Science and Technology Major Project: No. 2018ZX03001029004.
Availability of data and materials
The datasets supporting the conclusions of this article were collected from reference [11].
Author information
Contributions
WDG conceived and designed the study. XWL performed the simulation experiments. KSZ wrote the paper. GC reviewed and edited the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Xuewen Liu.
Ethics declarations
Authors’ information
Xuewen Liu is currently working toward a Ph.D. degree with the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Liu, X., Chuai, G., Gao, W. et al. GAAdaBoostSVM classifier empowered wireless network diagnosis. J Wireless Com Network 2018, 77 (2018) doi:10.1186/s1363801810785
Keywords
 Diagnosis
 AdaBoostSVM
 GAAdaboostSVM
 Selforganizing networks (SONs)