CSIT: channel state and idle time predictor using a neural network for cognitive LTE-Advanced network

Cognitive radio (CR) is a novel methodology that facilitates unlicensed users to share a licensed spectrum without interfering with licensed users. This intriguing approach is exploited in the Long Term Evolution-Advanced (LTE-A) network for performance improvement. Although LTE-A is the foremost mobile communication standard, future underutilization of the spectrum needs to be addressed. Therefore, dynamic spectrum access is explored in this study. The performance of CR in LTE-A can significantly be enhanced by employing predictive modeling. The neural network-based channel state and idle time (CSIT) predictor is proposed in this article as a learning scheme for CR in LTE-A. This predictive-based learning is helpful in two ways: sensing only those channels that are predicted to be idle and selecting the channels for CR transmission that have the largest predicted idle time. The performance gains by exploiting CSIT prediction in CR LTE-A network are evaluated in terms of spectrum utilization, sensing energy, channel switching rate, packet loss ratio, and average instantaneous throughput. The results illustrate that significant performance is achieved by employing CSIT prediction in LTE-A network.


Introduction
Long Term Evolution-Advanced (LTE-A) is an evolving next generation mobile network that guarantees high data rates up to 1 Gbps for low mobility and 100 Mbps for high mobility [1]. International Telecommunication Union (ITU) issued requirements in terms of International Mobile Telecommunication-Advanced (IMT-Advanced). In order to achieve the requirements, various optimized wireless techniques proposed in literature for LTE Rel 8. Among them, carrier aggregation (CA) is the technique that is concerned with the escalation of bandwidth of the user equipment (UE) from 20 to 100 MHz [2]. Hence, by inculcating the concept of CA in LTE, the peak-supported transmission rate will be 1 Gbps in downlink and 500 Mbps in uplink, which is in accordance with the limits imposed by the ITU. However, despite the potential advantages of CA in terms of throughput, a major concern with the spectrum underutilization due to increased UEs in the future has to be addressed.
The Federal Communication Commission regulates the reservation of spectrum resources on a long-term basis to the primary (license) network. However, the fixed spectrum assignment policy results in the underutilization of the spectrum in the primary network that leads to unutilized portions [3]. These unutilized portions are termed white spaces or spectrum holes (SH) in the literature. This underutilization forces the development of dynamic spectrum access (DSA) in the primary network for efficient spectrum utilization. The DSA can be employed by inculcating cognitive radio (CR) (secondary) users in the primary network to enhance the capacity and improve spectrum utilization [4,5].
CR operation is illustrated as a simple 'cognitive cycle' as in [4], which consists of three tightly interconnected phases. The first phase is the sensing, during which CR senses the primary network channels and discovers the statistic of sensed ones. The statistics includes channel occupancy, idle time, power level, modulation schemes, etc. These statistics help in making the operation of CR possible, i.e., deciding where to operate, how much power is needed, and which modulation scheme is needed, etc. Therefore, the sensing task is the most critical part in the overall cognitive cycle. Various spectrum sensing mechanisms have been proposed in the literature such as energy detection, matched filter detection, cyclo-stationary feature detection, and cooperative spectrum sensing [3]. The optimized sensing method that gives best sensing results, e.g., cooperative spectrum sensing, would be adopted because erroneous sensing results contribute to wrong decisions in the third phase, i.e., decision. This erroneous information will also lead to drastic interference to the licensed network. During the second phase, predictive modeling, the previously sensed information is stored in the database, and based upon the past history, future status about the channels is predicted and forwarded to the decision phase block. The third and the last phase is the decision, which concerns with setting the transmission parameters, such as channel selection, transmit power, and modulation scheme. The concern of the implication of predictive modeling in the second phase is to have optimum decisions in the third phase.
CR is termed as software-controlled radio that can adjust its parameters on run time with respect to the changing environment. These software-based capabilities are provided by the cognitive engine (CE) [6]. The CE forces the software-defined radio to adjust the parameters based upon the knowledge base. The information in the knowledge base is not only the currently observed information but also the past observations. The actual intelligence in CR is possible by learning via the information in the knowledge base. Based upon the above discussion, it is concluded that integration of CE with cognitive cycle is important for providing actual intelligence for CR. Various learning mechanisms are available for use in the CR environment ranging from simple lookup search to machine learning models, like the neural network (NN) and hidden Markov model.
Most existing works exploring DSA in LTE-A focus on the components used for inculcating CR concepts. The main contributions of this study are categorically elaborated as follows: 3. The primary user (PU) traffic is modeled in statistical domain, i.e., by Poisson and Pareto stochastic processes, as described in Section 6. The concern of employing the aforementioned stochastic processes is that they closely approximate the behavior of PU in mobile communication environment. 4. Finally, the proposed CSIT predictor is compared with nonpredictive CR in LTE-A in terms of five performance measures, i.e., spectrum utilization, sensing energy, channel switching rate, packet loss ratio (PLR), and average instantaneous throughput. In addition, the limitation of the study is also presented.
The rest of this paper is structured as follows. Related work is presented in Section 2, and the system model is outlined in Section 3. A review of NN is projected in Section 4, and the proposed MLP-based CSIT prediction is elaborated in Section 5. The simulation and analysis in terms of the accuracy of the proposed predictor and performance improvement are listed in Section 6. Finally, Section 7 concludes the article.

Related work
In the literature [7][8][9], the authors surveyed the machine learning algorithms applicable to each phase of the cognitive cycle in the CR network. The closest work relating to our study that involves prediction via NN in CR domain is given in [10][11][12][13][14]. Moreover, incorporating CR concepts in LTE-A is presented in [15,16].
In [10] the authors exploited the Elman recurrent neural network for spectrum prediction based on the multivariate time series. Elman recurrent is a class of NN, and the difference with feedforward NN lies in the inclusion of recurrent layer(s) within it. In their proposed model, time series is estimated using the cyclostationary feature detection and then the spectrum is predicted by applying that series to Elman recurrent NN. The authors in [11] exploited the MLP for spectrum prediction in CR. The concern of their study is to reduce the sensing time for CR, i.e., predicting the status of the channels just before brutally applying the sensing. Another approach towards reducing the sensing time and hence increasing the transmission rate is projected in [12]. In their proposed model, the sensing task is supervised by the network, i.e., by giving the intelligent sensing matrix to the CR. This supervised learning results in significant sensing time reduction and, hence, increased throughput. The NN-based throughput learner for CR is proposed in [13] where authors exploited NN for predicting the appropriate transmission rate that can be used for a concerned channel. They presented the basic and extended throughput prediction model using NN. The basic throughput prediction is a simple prediction whereas the extended one also takes into account the geographical/time location of users. Another traffic prediction for opportunistic spectrum access is presented in [14]. They proposed a selective opportunistic spectrum access in which the probability of channels appearing to be idle is computed on the statistical basis. On the basis of the predicted idle time, the best sensing order is formulated. They evaluated their results in terms of packet loss ratio and throughput.
According to the best of our knowledge, our proposed CSIT MLP prediction model is unique in the CR domain and specifically for LTE-A. However, a nominal work done in inculcating CR into LTE-A is presented here after. The DSA framework for LTE-A network is presented in [15,16]. The authors in [15] highlighted the existing blocks in LTE-A network for supporting CR concepts. According to them, DSA can be indulged just by introducing a few blocks within the existing network. The authors in [16] utilize the framework presented in [15] and introduce the opportunistic spectrum access in LTE-A network. Their work illustrates the adoption of a geo-location database within LTE-A network that gathers the cooperative information from the CR users with its surrounding environment.

Introducing prediction-based CR in LTE-A network
The system model comprises of DSA overlay in LTE-A network as shown in Figure 1. The DSA is supported by the introduction of cognitive user equipment (CU sense ). We are projecting a new prediction-based CR user equipment, i.e., CU predict . The difference between the two is that the former one is the normal CR whose job is to sense the primary channels and discover the opportunities in terms of SHs, whereas the latter one is the predictive CR which senses only those channels that are predicted to be idle and/or has larger predicted idle time. This predictive modeling results in significant performance improvement of the existing LTE-A network. The prediction regarding future channel statistics is carried out by having the past history on each predictive CR in the secondary network. Therefore, each CU predict houses the primary channel's past observation. Information about the past primary channels statistics are repeatedly acquired on the control channel. Furthermore, we also assume that the projected predictive secondary network nodes operate in a docitive fashion and forward their collaborative predicted results to the primary base station (eNB) where docitive network is an emerging paradigm for DSA in which the UEs teach each other that leads to reduced complexity and fine-tuned decisions [17,18].
As far as the interference to the primary network is concerned, CU predict applies sensing to the slot(s) that are predicted to be more idle, in terms of idle time slot (s), before using them for transmission. Therefore, there is a less chance of interference being inducted into the primary network due to the implication of docitive and predictive modeling approach. However, if the interference experienced by the PUs due to secondary users (S e U) exceeded beyond the specified threshold, then eNB alerts the concerned CU predict to change the slot(s) for transmission to avoid the interference.

System architecture of the proposed CU predict
The proposed system architecture of CU predict is illustrated in Figure 2. The basic idea of our study is to predict the CSIT of the primary channels in the LTE-A network without just relying on the current sensed observations. The CSIT prediction is done by employing NN as a learning of CU predict . The brief operation of CU predict for future prediction using NN as learner is as follows: 1. Firstly, the NN in CU predict is trained properly with the appropriate data set and once it is trained, its complexity is significantly reduced.    Here, we are considering the past observations of primary channels on a per resource block (RB) basis, where RB is the basic unit of the spectrum in LTE-A. The predicted channels statistics are grouped in to either SH or PU, where each one consists of multiple RBs, as shown in Figures 2 and 3 PU1={RB1, RB2}, SH1={RB3, RB4}, etc. However, we discussed throughout this paper in terms of the sensed/scanned slot(s). This is due to the fact that we consider the basic unit of the spectrum in LTE-A in the time-slotted fashion as shown in Figure 4a. Each CR in the LTE-A network, either CU sense or CU predict , can sense slot(s) in a particular sensing time slot. The difference lies here that CU sense has to sense all the RBs, whereas CU predict applies sensing to only those slot(s) that are predicted to be idle. Furthermore, the difference between the sensing time slot and available slot (s) to CR is that the former is in accordance to measurement gap repetition period (MGRP) defined in the standard and the latter is just to have different interpretations of the RBs set in a time-slotted fashion. Noticeably, the sensing time slot is directly associated with the gap pattern defined in standard for UE in RRC_CONNECTED state, i.e., measurement gap length (MGL) and MGRP. 
According to the standard, MGL is reserved for capturing the samples from a certain bandwidth and is fixed at 6 ms whereas MGRP is the gap duration during which MGL (samples extraction) and then data transmission is carried out. Moreover, MGRP is configurable in multiples in frame length (i.e., 10 ms) [19]. The relationship between MGL and MGRP is also illustrated in Figure 4b where MGRP is selected to be 40 ms. Within the gap period (sensing time slot), the scheduler does not allocate channels/slot(s) to UE. Also, the channels and slot(s) are used interchangeably as licensed and primary users in this study do.

Notation and assumptions
The notations presented in Table 1 are used throughout the rest of this paper.

Neural network
The human brain consists of real biological neurons that are interconnected in a sense to provide stunning  functionality. The functionality includes classification, optimization, decision making, etc. Although there is a significant development in information technology, still the intelligence provided by the advancement can never cross the real boundaries of human brain. On the other hand, NN are composed of artificial neurons that imitate the behavior of biological neurons in human brain. These artificial neurons are interconnected in a software structure and behave the same as biological neurons in the human brain. Therefore, the NN has been extensively applied to the areas that require cognitive tasks, such as predictive modeling and pattern classifications.
The generalized structure of NN consists of the three types of neuron layers, i.e., input layer, hidden layer, and output layer. Generally, the input layer acquires the data from the outside world and the output layer returns the data after passing through the hidden layer(s). NN structure can either be of feedforward or recurrent [20]. In feedforward structure, data flow from the input layer to the output layer without having any feedback connection to the backward layer(s). The classical example of feedforward structure is the perceptron listed in [21]. However in recurrent NN, there are connections between the output neuron to the input neuron of the same or the backward layer. The example being the Elman's recurrent neural network [22] in which the concept of recurrent NN is employed. In this study, we employ MLP, a feedforward structure of NN, which has been used widely in time series prediction [23] and binary prediction [24].
In the generalized NN, each neuron communicates with one another by having a direct connection between them. Every connection is specified by the weight v βα which deliberates the influence of α neuron on β neuron. Each neuron computes the output using the specified activation function. As shown in Figure 5, the β neuron has α inputs from the previous layer neurons with their specified weights, i.e., v βα and a bias input b β . The  This corresponds to how fast the a SU can find the idle slot(s) among total of N s activation function of the β neuron used to compute the output I′ β is given by In any case, NN has to be configured in a manner so that it produces the desired outputs for any given set of inputs. The desired output can be achieved by adjusting the connection weights v βα of all pairs of neurons (α, β). This procedure is termed as learning or training. NN should be learned in a manner so the error between seen and unseen data output is minimized, where unseen data are the data for which the NN has not been trained for. Furthermore, the learning process can be classified as either supervised or unsupervised learning. In the supervised learning mechanism, NN is given the teaching data base and on the basis of this, it adjusts connection weights to produce the desired output. This weight-changing procedure of the connection is termed as the backpropagation learning rule. This learning mechanism works by adjusting the connection link weights so that the error between the actual and the desired output is minimized. When this is done, NN is ready to give the output for any type of unseen data with least possible error. The other learning scheme, the unsupervised learning, is worked in the manner that does not require any real physical data set for the training but rather exploits the statistical representation of the input data for training NN. In this study, we use the supervised learning scheme for training the proposed MLP predictor.

CSIT prediction using MLP
The design specifications of CSIT predictor start by providing two types of inputs to MLP, i.e., primary channels/slot(s) state history and idle time history in time slot(s). The slot(s) state history is modeled as the bipolar binary series of length τ, i.e., s τ = {s 1 , s 2 , s 3 ,…, s τ }. The slot(s) can either be in one of the two states, i.e., idle or busy, where the idle state is represented by −1 and the busy state is represented by 1. The idle time history is modeled as an integral time series of length τ, i.e., i τ = {i 1 , i 2 , i 3 ,…, i τ }. Here we confined our analysis by assuming that the idle time slot(s) of the SH is to be in the integral range from {1 to 5}. Also, only the idle slot(s) in the vector s has an idle time slot(s) reference value in the corresponding vector i. The busy slot has 0 corresponding value in the idle time history vector i. This assumption is due to the fact that we are predicting the idle slot(s) in the RBs set and their idle duration in time slot(s). By training MLP using the above-mentioned past history vectors, i.e., s and i, it can predict the channels/ slot(s) state and their idle duration in time slot(s). As far as the complexity of the proposed predictor is concerned, once the NN is trained properly, the computational complexity is significantly minimized.

MLP-based CSIT predictor design
MLP is the multilayer feedforward structure consisting of input layer, some hidden layers, and output layer. We have selected two sets of neurons in the input layer with each length τ for our proposed predictor, one set for capturing the slot(s) status, and the other one for the idle time slot(s) history. The overfitting problem in the predictor is avoided by fine tuning the neuron count in each layer, where the overfitting is a malfunction of MLP for the unseen data. The number of hidden layers and the neurons count in them depends upon the application. The best results are achieved by having two hidden layers, where the first one having 15 neurons and the other one having 20 neurons. The same specifications in terms of hidden layers and neuron count are used in [11], where the application under consideration was channel status prediction. The output layer consists of just two neurons, one for slot(s) state prediction and the other for the idle time slot(s) prediction. As far as the activation function is concerned, we have selected the logsigmoid function for the hidden layers and pure linear function for the output layer because we want one output neuron in the range from {−1 to 1} and the other one in the range from {1 to 5}. The output y k β from the neuron β in the kth layer is represented by the generalized activation function as where The parameter y k−1 α corresponds to the links from α neuron in the back (k − 1)th layer.

MLP-based CSIT training
The training process of the proposed predictor is illustrated in Figure 6. The training data are organized into two streams each of length T time slot(s). Furthermore, both the streams are chopped into chunks of τ, i.e., s τ and i τ and their corresponding desired outputs, i.e., s τ + 1 and i τ + 1 . Remember that s τ incorporates slot(s) state values, whereas i τ houses the idle time slot(s) values. Also, we are acquiring the history on a per-RB basis. This means that the chuck houses information regarding each RB. For each chuck, the output is computed by the activation functions at each hidden layer and the output layer. The achieved output from the predictor is expressed as s τþ1 and i τþ1 . This step of NN is termed as forward pass in which the data flow from input to output by lingering with the neurons in the hidden layer. The difference between the desired output and the actual output for the two neurons of our proposed predictor is expressed as We combined the two errors {e = e s + e i } and fed that to the backpropagation block as shown in Figure 6. The concern of employing backpropagation for CSIT training is that we are using supervised learning, where supervised learning refers to the learning of NN in CR LTE-A in which the NN is trained with every possible data set. The job the backpropagation algorithm is to update the connection link weights, i.e., v k βα of each kth layer. This step is termed as the backpropagation. Noticeably, we are using log-sigmoid for hidden layers and pure linear for output layer as activation functions. The concern of employing pure linear function as activation function is that we want one output neuron for slot(s) state prediction having value in the range from {−1 to 1} and the other one for idle time slot(s) prediction in the range from {1 to 5}. As far as the error minimization is concerned, it is easier to minimize the mean square error E as in (6) rather than the actual combined error, i.e., e. 
According to the backpropagation algorithm in [20], the connection weights are updated according to (7) and (8): where v ¼ v k βα t−1 and this corresponds to the connection weight of neuron β of the kth layer, and the subscript (t − 1) classifies the previous instant weights. The partial derivative ∂E ∂v in (8) is computed successively for each neuron from the output to the input layer and is expressed in chain rule in terms of e, y k β , z k β , and v k βα as follows: The above expression's chain rule components are represented by (2), (3), and (6) as follows: Therefore, based upon the above partial expressions values, (9) can be represented as To compute the rate of change of mean square error with respect to the desired output from neuron, replace k with o in the above expression, where o is the delimiter that corresponds to the output layer. Then (10) can be written as However, if we want to compute the partial derivative for neurons in hidden layers whose desired outputs are not known, then the partial derivative ∂E ∂v is calculated in terms of local gradient and is denoted by δ k β . The local gradient ∂E ∂z k β is also represented by the chain rule as follows: The above mathematical modeling is the training representation of our proposed predictor. We train the MLP with the slot(s) state and idle time slot(s) history. The training of the proposed predictor is done by changing the weights according to (7) with the aim of minimizing the mean square error, i.e., E in (6). We have repeated the above weight updating procedure until the threshold in terms of the required mean square error is achieved, where the threshold mean square error is the tolerable prediction error.
Once the CSIT predictor training has been completed, we apply the unseen data for checking the performance of the trained predictor. We predict the slot(s) state by using the binary symbol decision boundary as in (12a). The predicted idle time slot(s) is represented by taking into account the round function to the nearest integral value as in (12b) because, as already mentioned, the idle time slot(s) is represented in the integral range from {1 to 5}: The complexity of the proposed MLP-based CSIT predictor actually lies in the training period. Remember the training period is classified as the interval in which the NN is trained for every possible data set. Once the training period has been completed, the computational complexity is significantly reduced. As mentioned in Section 6.1, once the proposed MLP-based CSIT predictor is trained with 8,000 slot(s), there is just a prediction error of about 0.09 by validating the data set of 30,000 slot(s). However, if the NN is trained with lesser slot(s), then there will be a bit larger prediction error. As far as the network operation is considered, the primary network would not be effected within the training period because the interfering cCU predict is alerted by the eNB to change the transmission slot(s), as mentioned in Section 3.1. Finally, the delay involved in the training of the MLP-based CSIT predictor increases the information exchange among eNB and cCU predict that guarantees the interference minimization to the primary network. However, this information exchange is reduced as the training of the predictor is done progressively. This is due to the fact that training reduces the prediction error and so does the interference to the primary network.

Simulation and analysis
In this section, we first elaborate the accuracy of our proposed MLP-based CSIT predictor, then the performance improvement by employing it in cognitive LTE-A network is illustrated. For the sake of CSIT prediction, the PU traffic is modeled here in statistical domain. We use two types of stochastic processes, i.e., Poisson and Pareto random process. The former is used for the arrival of PU and the vacancy of the channels/slot(s) in terms of time slot(s) is modeled by the latter one. We train our predictor using the statistical model of PU because we are doing analysis on the offline data. Therefore, we have used MLP for predicting future opportunities. However, MLP is applicable to any type of stochastic traffic model. The parameter, PU activity, as used later in this section, is the percentile activity of the PU on primary channels. This corresponds to the increased arrivals of PU and hence increased busy time slot(s) and is changed by adjusting the parameters of the Pareto distribution. All the simulations are carried out in the MATLAB R2010 using the Neural Network Tool Box (Mathworks, Natick, MA, USA).

Accuracy of MLP-based CSIT predictor
The accuracy of our proposed predictor is presented here by two means: slot(s) state and idle time slot(s) prediction. The training and validation of the proposed MLP are carried out for both slot(s) state and idle time slot(s). Generally speaking, the training refers to the phase of NN in which the nonlinear mapping between the inputs and the desired outputs is done by adjusting the link weights. The weights are adaptively adjusted by minimizing the mean square error between the desired and the actual MLP output. The validation refers to confirm the accuracy of the trained NN by giving the unseen data, where unseen data are the data for which the NN is not trained.

Training and validation of slot(s) states
The training of the MLP-based CSIT predictor in terms of slot(s) state is depicted in Figure 7. We train the predictor with a total of 8,000 slot(s) or 1,000 data points, where one data point is equivalent to the eight slot(s) that contain the slot(s) state information, either 1 or −1. Each data point actually represents a single RB which means that we are indirectly training the MLP for history of 1,000 RBs. However, due to limited space we just show 100 data points for better visual illustration.
As shown in Figure 7, there is just an error of about 0.07 for the training of 100 data points or 800 slot(s). In addition, the error is about 0.04 for the training of 1,000 data points or 8,000 slot(s). This malfunction in terms of mismatch between the proposed predictor and the targeted output is inherently induced by training the predictor not with the original input data but rather with some erroneous data, where the erroneous data are generated by minimally randomizing the original input data and the original input data correspond to the PU traffic that is modeled in the statistical domain. The concern of employing erroneous data for training is twofold. Firstly, in real world problems, it is impossible to have exact training data sets [25]; therefore, erroneous data are exploited here for close approximation to real situation. Secondly, binding of erroneous input and desired output contributes to better convergence in terms of reduced prediction error when using the NN for unseen data. The training with original input data leads to the prediction error of 0.10, whereas with the erroneous data, this error is reduced to 0.09 for the validation of 30,000 slot(s). Figure 8 illustrates the validating data results of our proposed predictor. We have tested our predictor for the unseen data of 30,000 slot(s) and it is evaluated that there is just an error of about 0.09. However, the prediction error in the validation of 100 data points or 800 slot(s) is about 0.08 as shown in Figure 8. This reduced error is due to the fact that we have trained the NN with some erroneous data.

Training and validation of idle time slot(s)
The training of the MLP-based CSIT predictor for idle time slot(s) prediction is illustrated in Figure 9. The training and validation in terms of idle time slot(s) are carried out here in the same way for the slot(s) states. Although we used the same structure for the slot(s) and their corresponding data points as in the previous subsection, here we used the integral idle time slot(s) history for training and validation. The training is also carried out here with erroneous data, where the same reason holds as for the slot(s) state training. There is a negligible error between the targeted output and the predictor output, i.e., about 0.07, among the training of 100 data points or 800 slot(s) as shown in Figure 9. Moreover, the error is about 0.047 for the training of 1,000 data points or 8,000 slot(s). The validation or the interaction with the unseen data of our proposed predictor is shown in Figure 10. It can be illustrated from the Figure 10 that there is an error of about 0.014, for validation of 100 data points or 800 slot(s). In addition, the error is about 0.09 for the validation of 30,000/8 data points or 30,000 slot(s). It can be concluded from the above explanation that by training the predictor with 8,000 slot(s) or 1,000 data points, there is a prediction error of 0.09 for both slot(s) state and idle time slot(s) between the desired and the predicted output. Furthermore, it can also be deduced that once the NN is trained for prediction with appropriate data sets, the error between the desired and actual predictor output is significantly reduced. We are not referring to the real prediction error here that resulted from using NN. However, the proposed CSIT MLP approximates the prediction error in a real environment. This is due to the fact; we are training the NN with the data set that closely approximates the behavior of PU activity in a real mobile communication environment. 
The data set in the form of PU traffic is explicitly described in the results in Section 6.

Performance improvement by predictive cognitive LTE-A
The advantage of exploiting CSIT prediction for cognitive LTE-A is illustrated here in terms of five performance measures: spectrum utilization, sensing energy, channel switching rate, PLR, and average instantaneous throughput. The comparison is carried out between predictive and nonpredictive CR, i.e., CU predict and CU sense. Although employing CU sense in LTE-A already yields significant performance improvement, e.g., in spectrum utilization and throughput, the performance can be further enhanced by incorporating CU predict alongside CU sense in the LTE-A network. The task of CU sense is simply to scan the whole available band and discover opportunities in terms of SHs, whereas CU predict works slightly differently in that it predicts the CSIT before actually sensing the slot(s). The benefits of employing predictive CR concepts in LTE-A are illustrated below.

Spectrum utilization
The performance improvement from CSIT prediction in terms of spectrum utilization (SU) is illustrated in Table 2. SU is defined as the ratio of the number of idle slot(s) sensed, N_i, to the total number of idle slot(s) in the system, N_is, over a limited duration in terms of time slot(s):

SU = N_i / N_is.

The SU improvement (SU imp) can then be expressed from the above expression, evaluated for CU predict and CU sense, as

SU imp = (SU_CUpredict − SU_CUsense) / SU_CUsense × 100%.

The numerical results of SU imp are presented in Table 2. The analysis is carried out for different numbers of primary slot(s) in the system, N_T, and for two PU activity levels, 50% and 60%. It can be seen that increasing N_T increases SU imp regardless of the PU activity level. This is because the increased number of PU slot(s) results in more idle slot(s) being discovered by CU predict. In contrast, the number of slot(s) sensed by CU sense, N_i, does not change much with increasing N_T, since CU sense cannot discover idle slot(s) beforehand. The increased PU activity level also has a significant impact on SU imp, i.e., it lessens SU imp, as illustrated by the difference in the last two columns of Table 2.
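The spectrum utilization ratio and its percentage improvement can be computed directly from the idle-slot counts. The sketch below uses hypothetical counts, not values from Table 2:

```python
def spectrum_utilization(idle_sensed, idle_total):
    """SU = N_i / N_is: fraction of the system's idle slots actually sensed."""
    return idle_sensed / idle_total

def su_improvement(su_predict, su_sense):
    """Percentage improvement of CU_predict's SU over CU_sense's SU."""
    return (su_predict - su_sense) / su_sense * 100.0

# Hypothetical counts: 80 idle slots in the system; CU_predict discovers 60
# of them, CU_sense only 30.
su_p = spectrum_utilization(60, 80)   # 0.75
su_s = spectrum_utilization(30, 80)   # 0.375
print(f"SU_imp = {su_improvement(su_p, su_s):.1f}%")  # SU_imp = 100.0%
```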

Sensing energy
The sensing energy reduction achieved by employing CU predict within the LTE-A network is illustrated in two ways. First, the sensing energy consumption of CU sense and CU predict under different PU activity levels is illustrated numerically. Second, the sensing energy reduction is analyzed mathematically.
The energy consumption of CU sense and CU predict with varying PU activity is illustrated in Figure 11, where the energy required for sensing one slot is assumed to be 100 J. The energy consumption of CU sense rises with an increasing number of available primary slot(s), irrespective of the PU activity, because CU sense has no a priori information about the slot(s) statistics. CU predict, however, achieves a significant reduction in energy consumption because it applies sensing only to those slot(s) that are predicted to be idle; this trend is also illustrated in Figure 11. Furthermore, it is notable that the energy consumption of CU predict decreases further as PU activity increases, since higher PU activity corresponds to fewer transmission opportunities in terms of idle slot(s), leaving CU predict with fewer slot(s) on which to apply sensing.
Here is a simple mathematical interpretation of the sensing energy reduction. Let the total sensing energy required for classifying the occupancy of one slot be X J. Then the total energy required by CU sense for sensing Z slot(s) is given in (16), where Z is the total number of slot(s) sensed by CU sense in a finite duration of time slot(s):

E_CUsense = Z × X,   (16)

and the total energy required by CU predict for applying sensing on Y slot(s) is

E_CUpredict = Y × X,   (17)

where Y is the number of slot(s) predicted to be idle among the total N_T slot(s). It follows from (16) and (17) that the sensing energy of CU sense is far greater than that of CU predict, as also illustrated in Figure 11, which compares CU sense without prediction against CU predict with 60% and 50% PU activity. The reason is that the former applies sensing blindly, whereas the latter exploits slot(s) statistics through prediction. Furthermore, the percentile reduction in sensing energy from (16) and (17) is given in (18):

Energy reduction = (Z − Y) / Z × 100%.   (18)

Hence, by exploiting predictive modeling in CU predict, the sensing energy can be significantly reduced.
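Equations (16)–(18) translate directly into code. The slot counts below are hypothetical, chosen only to exercise the formulas:

```python
def sensing_energy(num_slots_sensed, energy_per_slot):
    """Total sensing energy: (slots sensed) x (energy per slot), as in (16)/(17)."""
    return num_slots_sensed * energy_per_slot

def energy_reduction_pct(z_sensed, y_predicted_idle):
    """Percentile sensing-energy reduction of CU_predict over CU_sense, as in (18)."""
    return (z_sensed - y_predicted_idle) / z_sensed * 100.0

X = 100.0  # energy per sensed slot in J, as assumed in the Figure 11 setup
Z = 50     # slots sensed by CU_sense (hypothetical)
Y = 20     # slots predicted idle, hence sensed by CU_predict (hypothetical)

print(sensing_energy(Z, X))        # 5000.0 J for CU_sense
print(sensing_energy(Y, X))        # 2000.0 J for CU_predict
print(energy_reduction_pct(Z, Y))  # 60.0 % reduction
```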

Channel switching rate
The channel switching rate is defined as the number of slot(s) switches per unit transmission time performed by the secondary user (SeU) as a result of PU arrival on the same slot(s), where the SeU can be either CU sense or CU predict. The SeU in the LTE-A network has to switch to another idle slot(s) whenever the PU reclaims the slot(s) temporarily occupied by the SeU. The channel switching rate is an important performance bottleneck for the practical exploitation of DSA in the LTE-A network. The proposed MLP-based CSIT predictor in the cognitive LTE-A network yields a significant reduction in the channel switching rate. The benefit of predicting the idle time slot(s) is most evident here: the predictive SeU operates on those groups of RBs that are predicted to remain idle for the largest number of time slot(s).
The comparison of CU sense and CU predict in terms of slot(s) selection is carried out in the LTE-A network over different numbers of primary slot(s) with varying PU traffic activity. Slot(s) selection is carried out randomly in CU sense, whereas predictive modeling is employed in CU predict. Figure 12 shows that idle time slot(s) prediction is clearly beneficial for the SeU channel switching rate: CU predict with idle time slot(s) prediction achieves a reduced channel switching rate compared to CU sense with random slot(s) selection. Moreover, the channel switching rate is further reduced by increasing the number of available primary slot(s), specifically for CU predict, since more primary slot(s) raise the probability of finding longer predicted idle time slot(s). The effect of PU activity is also depicted: the channel switching rate is minimized by reducing the PU activity from 60% to 50%. In contrast, the channel switching rate of CU sense with random slot(s) selection always remains high compared to the others, owing to the lack of predictive modeling, which leads to improper slot(s) selection. Hence, the channel switching rate of CU sense remains constant irrespective of the number of primary slot(s).
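The effect of idle-time-aware slot selection on switching can be sketched with a toy simulation: a secondary user that always moves to the channel with the longest predicted idle run switches less often than one that picks randomly. Everything below, including the channel model and run lengths, is an illustrative assumption.

```python
import random

def count_switches(idle_runs, chooser, num_transmissions, seed=7):
    """Count channel switches needed to complete a number of transmission slots.

    idle_runs[c] is the idle-run length of channel c; a stay on channel c
    lasts min(run, remaining) slots before a PU arrival forces a switch.
    `chooser` picks the next channel index.
    """
    rng = random.Random(seed)
    switches, remaining = 0, num_transmissions
    while remaining > 0:
        c = chooser(idle_runs, rng)
        remaining -= max(1, min(idle_runs[c], remaining))
        switches += 1
    return switches

runs = [1, 2, 1, 8, 1, 2]  # hypothetical predicted idle-run lengths per channel

random_pick = lambda r, rng: rng.randrange(len(r))               # CU_sense-like
best_pick = lambda r, rng: max(range(len(r)), key=lambda c: r[c])  # CU_predict-like

print(count_switches(runs, random_pick, 40))  # random selection: more switches
print(count_switches(runs, best_pick, 40))    # longest-idle-first: 40/8 = 5 switches
```

Always choosing the channel with the 8-slot idle run needs only 5 stays to cover 40 slots, whereas random selection averages far shorter stays and therefore more switches.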

Packet loss ratio
The PLR is defined by the event in which the SeU is unable to find an idle time slot(s) in the primary pool of slot(s). We illustrate the impact of employing CSIT prediction by CU predict in the LTE-A network in terms of reduced PLR by both theoretical and simulated means. The comparison of predictive with nonpredictive modeling in terms of PLR is illustrated in Figure 13.
The theoretical modeling of the average PLR for both predictive and nonpredictive schemes is given here. The average PLR for CU sense, i.e., without prediction, can be expressed as

PLR_CUsense = ∏_{m=1}^{C} (1 − p_m),   (19)

where C is the threshold in terms of sensed slot(s) for the PLR event to occur, p_m is the probability of the m-th slot(s) appearing to be idle, and Z is the total number of slot(s) sensed by CU sense in the LTE-A network.
With CSIT prediction employed by CU predict in the LTE-A network, (19) can be rewritten as

PLR_CUpredict = ∏_{m=1}^{C} (1 − (1 − P_e) p_(m)),   (20)

where p_(m) denotes the idle probabilities taken in the predicted (descending) sensing order. Here P_e, with P_e > 0, is defined as the average probability that CU predict predicts the state of a primary slot(s) incorrectly; its purpose is to give a theoretical representation for PLR_CUpredict. The prediction characterized by the term P_e thus contributes to a reduced theoretical PLR, as shown in Figure 13.
The performance gap in average PLR in Figure 13 highlights the value of employing CSIT prediction. The PLR for CU sense does not improve with an increasing number of primary slot(s): although more slot(s) provide more transmission opportunities in terms of idle slot(s), the sensing order of CU sense cannot be improved, so its average PLR remains the same regardless of the number of available slot(s). CU predict, however, optimizes the sensing order by incorporating CSIT prediction, and therefore achieves a higher probability of finding idle time slot(s) and hence a reduced average PLR. Both the simulated and theoretical curves of CU predict show this reduction. Increasing the number of primary slot(s) not only increases the transmission opportunities but also reduces the PLR by improving the sensing order. There remains a small difference between the simulated and theoretical curves for both CU sense and CU predict, caused by the prediction error of the trained NN, which slightly increases the simulated PLR values in each case.
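The effect of sensing order on the average PLR in (19) can be sketched as follows: packet loss occurs when none of the first C sensed slot(s) turns out to be idle, so sensing the most-likely-idle slot(s) first, as CU predict effectively does, lowers the product of miss probabilities. The per-slot probabilities below are hypothetical.

```python
from functools import reduce

def plr(idle_probs, C):
    """Average PLR as in (19): probability that none of the first C
    sensed slot(s) turns out to be idle."""
    return reduce(lambda acc, p: acc * (1.0 - p), idle_probs[:C], 1.0)

# Hypothetical per-slot idle probabilities in the order CU_sense scans them.
probs = [0.1, 0.2, 0.7, 0.1, 0.8, 0.3]
C = 3

plr_sense = plr(probs, C)                          # unordered scan
plr_predict = plr(sorted(probs, reverse=True), C)  # predicted-best slots first
print(f"{plr_sense:.4f} vs {plr_predict:.4f}")     # 0.2160 vs 0.0420
```

Reordering alone cuts the loss probability by a factor of five in this toy case, which mirrors why the sensing order, and not just the number of slots, drives the PLR gap in Figure 13.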

Average instantaneous throughput
The analysis of employing CU predict in the LTE-A network is carried out here in terms of average instantaneous throughput. The comparison between CU sense and CU predict is carried out with respect to average instantaneous throughput for a varying number of primary slot(s). For the simulation, we formulated a theoretical model of the instantaneous throughput per slot(s); the model for the nonpredictive case is built to explain the simulation curves in Figure 14. The theoretical average instantaneous throughput model assumes C_0 as the reference transmission capacity of the SeU and Δ_s as the time slot duration. We define a new independent parameter η_Ki, which corresponds to how fast the SeU can find the idle time slot(s), where 1 ≤ η_Ki ≤ N_s. The throughput R(i) in time slot i for the nonpredictive case follows from these parameters. We simulated the average instantaneous throughput of CU sense and CU predict with respect to the number of primary slot(s), as illustrated in Figure 14, where a significant performance gap is achieved by employing CSIT prediction. Moreover, the instantaneous throughput for CU sense remains the same, since CU sense has no information about the statistics of the primary slot(s). CU predict, however, achieves a slightly increased throughput as the number of primary slot(s) grows, because more slot(s) result in more available opportunities.
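Since the exact form of R(i) is not reproduced here, the sketch below uses one plausible instantiation consistent with the definitions above: the faster the SeU finds an idle slot (smaller η_Ki), the larger the fraction of the reference capacity it attains. Both the functional form and the numbers are assumptions, not the paper's model.

```python
def instantaneous_throughput(c0, delta_s, eta, n_s):
    """One plausible model: R(i) = C0 * delta_s * (N_s - eta + 1) / N_s,
    where eta in [1, N_s] is how many attempts the SeU needs to find an
    idle slot. eta = 1 yields the full reference capacity C0 * delta_s.
    (Assumed form; the paper's exact expression is not reproduced here.)
    """
    assert 1 <= eta <= n_s
    return c0 * delta_s * (n_s - eta + 1) / n_s

C0, DELTA_S, N_S = 1.0, 1.0, 10

fast = instantaneous_throughput(C0, DELTA_S, eta=1, n_s=N_S)   # finds idle slot at once
slow = instantaneous_throughput(C0, DELTA_S, eta=10, n_s=N_S)  # exhausts the search
print(fast, slow)
```

Under this assumed form, a predictive SeU with a small effective η_Ki approaches the reference capacity, while a nonpredictive SeU with a larger average η_Ki stays below it, consistent with the performance gap in Figure 14.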

Conclusions
In this study we investigated DSA in the LTE-A network and proposed improvements through predictive modeling. Although LTE-A is the foremost mobile communication technology aiming to meet the IMT-Advanced requirements, the incorporation of CR concepts into it still needs to be addressed. We therefore investigated the improvement obtained by employing predictive CR concepts. The predictive modeling is realized through CSIT prediction using an MLP NN. We train the NN with a statistical model of PU traffic in a mobile communication environment. Once the NN has been trained, it can predict the CSIT based on the sensed history on the control channel(s); the proposed CU predict then operates on these predicted CSIT values. The improvement obtained by employing CSIT prediction using the NN is illustrated in terms of five performance measures: spectrum utilization, sensing energy, channel switching rate, PLR, and average instantaneous throughput. The proposed CSIT predictor yields significant performance gains in all of these measures. As for the complexity of the proposed predictor, once the NN is trained with an appropriate data set, the computational complexity is significantly reduced. This study can be extended horizontally by incorporating other prediction parameters, such as throughput, and vertically by analyzing other prediction models and comparing their complexity with that of the proposed one.