A cross-layer adaptive channel selection mechanism for IEEE 802.11P suite

Recent advances in the automotive industry enabled us to build fast, reliable, and comfortable vehicles with lots of safety features. Also, roads are designed and made safer than ever before. However, traffic accidents remain one of the major causes of death. Intelligent transport systems are expected to reduce if not prevent accidents with interconnected vehicles and infrastructures. These vehicular ad hoc networks are highly dynamic and fragile. Although the standardization efforts are mature enough, the non-emergency/service channel selection mechanisms are not explicitly defined. In this paper, a novel cross-layer prediction-based algorithm is proposed to select the best possible service channel to decrease collisions beforehand. Theoretical analysis regarding the mean squared error prediction performance is established. It is shown that the proposed method outperforms the general Markovian-based prediction schemes under various traffic load scenarios.


Introduction
Transportation is by far one of the most important features in mankind's entire history. Especially in today's busy world, transportation is not only a necessity for personal travel, but it is also necessary for the shipment of goods across the globe. All these needs require infrastructure, planning, and optimization to run safely and smoothly.
*Correspondence: aboyaci@ticaret.edu.tr. 1 Department of Electrical and Electronics Engineering, Istanbul Ticaret University, Küçükyalı E5 Kavşagı İnönü Cad. No: 4, Küçükyalı, 34840 Istanbul, Turkey. Full list of author information is available at the end of the article.
Recent advances in the automotive industry have enabled us to build fast, reliable, and comfortable vehicles with numerous safety features. At the same time, roads are designed and built safer than ever before. However, traffic accidents remain one of the major causes of death [1,2]. The current situation implies that further improvements in technology for each individual vehicle will not provide extra safety measures for transportation on the whole. Thus, deployment of cooperative and coordinated designs seems to be inevitable in order to extend contemporary safety measures. Even though preventing accidents entirely is a difficult task, cooperative and coordinative approaches provide a promising solution to make transportation safer. Motivated by these, intelligent transport systems (ITS) envisage a network topology consisting of interconnected vehicles and infrastructures [3,4]. In this regard, vehicles that are aware of each other and of their surrounding environment will form an intelligent, dynamic subnetwork which can be used in several ways ranging from safety to infotainment. Early warning, intersection collision avoidance, and adaptive cruise control are just a few of the potentially life-saving applications available, which are depicted in Fig. 1 [5]. Advertisement, web surfing, gaming, and video streaming are applications falling into the infotainment category. One should keep in mind that potential benefits of ITS are not limited to these two categories. It is believed that ITS will pave the road to green transportation as well [6,7]. Vehicular networks are considered to be a special form of ad hoc networks because of their highly dynamic topologies.
From sparse and fast highways to congested and slow downtown traffic, these network structures are naturally desired to operate in the optimum way. However, the extreme conditions caused by high mobility make vehicular ad hoc networks (VANETs) short-lived by nature. Despite their inherent constraints, VANETs are also a promising heterogeneous network type, since they are planned to consist of contemporary cellular networks and next-generation wireless networks, along with vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) link support as well as device-to-device communications options [8]. In order to encompass all of the aforementioned tasks and constraints, the IEEE 802.11p/1609 work group strives to establish a standard known as wireless access in vehicular environments (WAVE) [9].
Although the primary intent for V2V communications systems is safety, like cellular phones and PDAs, entertainment is the primary driving force for this kind of communications technology. Infotainment applications make driving more enjoyable and bring new opportunities to the market [10,11]. With specialized entertainment products to enhance the driver or the passenger experience, these applications such as video streaming and voice over IP (VoIP) require a high bandwidth connection to create a seamless customer experience.
One should keep in mind that V2V communications take place in the wireless spectrum which is a very valuable, finite, and public resource. This nature of the wireless spectrum brings about multi-access interference (MAI) issues. Interference causes poor signal reception, drastic decrease in system capacity, frequent hand-offs, and service interruptions. Unlike centralized network structures, with ad hoc vehicle networks, there are multiple interference scenarios present simultaneously such as cochannel interference (CCI), adjacent channel interference (ACI), narrowband interference (NBI), wideband interference (WBI), and so on [12]. In the literature, there are mainly three strategies to combat interference: interference avoidance, mitigation, and cancellation. Of these three strategies, it is obvious that interference avoidance is the simplest and most effective one, since avoidance strategy will pave the path for interference-free communications. However, avoidance relies on identifying unused resources. Thus, any avoidance strategy always includes some sort of identification procedure to sense the use of resources, then decides based on the data and acts on it accordingly [12]. Note that even such an effort might not be sufficient to prevent MAI problems such as collision. Even though there are numerous collision avoidance strategies in the literature, they are all based on the assumption of having a channel selected already at hand. It is evident that a successful collision avoidance strategy should definitely be preceded by identification of statistically least occupied channel in advance.
In light of the discussion above, the Federal Communications Commission (FCC) allocates the 5.850-5.925-GHz portion of the radio spectrum as dedicated short-range communications (DSRC) for vehicular networks [13]. The DSRC standard defines seven 10-MHz channels in this range: one control, two reserved, and four non-safety application channels, as shown in Fig. 2. However, a selection mechanism for these channels is not defined explicitly.
It is obvious that selecting a channel randomly at medium access control (MAC) layer is the simplest and widely adopted solution. However, a channel selection mechanism solely relying on MAC layer functionalities is not sufficient to provide satisfactory performance and needs novel data dissemination strategies benefiting from advanced techniques such as space time network coding [14]. Therefore, a cross-layer design including both MAC and physical layer (PHY) operations is required. In this regard, from the PHY layer perspective, a prediction mechanism is required in order for the MAC layer to take appropriate actions in advance. Failure to achieve these cross-layer operations may cause the system to select an interfering or busy channel and drastically lose bandwidth and performance. Once the channel is selected, it will be used until the next cycle and a busy or interfering medium is subject to wait in collision avoidance state. Also, note that the collision avoidance mechanism in IEEE 802.11p uses carrier sense multiple access/collision avoidance (CSMA/CA) and can only detect known type of signals [9]. In the literature, there are mainly four categories in selecting a service channel for VANET systems: pre-allocation-based [15][16][17][18][19], randomized rotation-based [20], minimum duration-based [21], and predictive-based schemes [22]. Pre-allocation-based schemes employ a static database and then select a channel based on that database. Despite its simplicity, such an approach evidently yields a poor performance under dynamic access. Furthermore, the database itself should be updated frequently. The randomized rotation-based algorithms take advantage of a different version of frequency hopping. Therefore, they inherently adopt both advantages and disadvantages of frequency hopping approaches such as improved fairness and lack of knowledge regarding the state of the medium. 
Minimum duration-based methods need to store the occupancy durations of each channel so that the least used can be identified. However, the performance of these methods converges with that of pure random channel selection schemes. Predictive algorithms seem to fit best for the channel selection task, but they lack a cross-layer and adaptive design, leading to poor and unsatisfactory performance [23]. On the other hand, studies that focus on predictions taking advantage of the Markovian assumption rely heavily on selecting the appropriate length of both prediction period and history. In the absence of historical data (past observations), it is reported that prediction accuracy degrades dramatically [22]. Studies which follow decision-making processes based on a posteriori probabilities of the occupancy state of a channel could also be considered in this manner. Nevertheless, such attempts require some sort of collaboration and coordination [24]. It is worth mentioning at this point that there are also studies in the literature which aim to be aware of spectrum conditions on future positions along the path of any vehicle. Yet, such a strategy mandates obtaining the location of each vehicle via an already-existing reference such as the Global Positioning System (GPS) and an infrastructure which allows the vehicles to share the digital maps across the network [25,26]. Therefore, in this study, a novel cross-layer, adaptive channel selection mechanism is proposed. Even though several predictive channel selection mechanisms presently exist, to the best of the knowledge of the authors, this study is pioneering the incorporation of a cross-layer architecture with a predictive strategy into vehicular networks.
The rest of this paper is organized as follows: Section 2 describes the system and signal model along with the theoretical analysis. Section 3 covers the proposed scheme in detail. In Section 4, the results are presented along with relevant discussions. Finally, conclusions and future directions are outlined in Section 5.

System and signal model
In this study, a cross-layer adaptive channel selection mechanism is proposed for IEEE 802.11p suite. Therefore, it is first desirable to discuss a complete characterization of the proposed system in conjunction with layered structure. Channel selection mechanism in V2V networks relies on IEEE 802.11 suite including both PHY and MAC layers [27,28]. The PHY layer is responsible for various air interface functionalities such as synchronization, channel estimation, and equalization. The MAC layer takes care of channel access and scheduling operations. Note that IEEE 802.11p and IEEE 1609.x are jointly known as WAVE and specifically IEEE 1609.4 part extends MAC to multichannel-operations-enable mode. Peculiar to V2V networks, WAVE multichannel operations are carried out over two types of channels, namely control channel and service channel. It is important to keep in mind that there is a single control channel, whereas there are multiple service channels in WAVE. Because the safety and control signals/messages are sent over the control channel and other types of application signals/messages are transmitted over the service channels, a coordination mechanism is necessary. It is obvious that a successful coordination mechanism relies on the joint performance of both PHY and MAC layers, which directly points out a cross-layer design.

The received signal at baseband, which includes the ambient noise and possibly an unknown signal, is given as input to the radiometer. The radiometer needs to decide whether the unknown signal is present or not in the received signal, which is expressed as
$$ r(t) = \begin{cases} n(t), & H_0 \\ x(t) + n(t), & H_1 \end{cases} \tag{1} $$
where $n(t)$ is complex additive white Gaussian noise (AWGN) with $\mathcal{CN}(0, \sigma_N^2)$ in the form of $n(t) = n_I(t) + j n_Q(t)$, with both $n_I(t)$ and $n_Q(t)$ being $\mathcal{N}(0, \sigma_N^2/2)$ and $j = \sqrt{-1}$; $x(t)$ is the complex baseband equivalent of the unknown signal; and $H_0$ denotes the hypothesis corresponding to the absence of the unknown signal, whereas $H_1$ is the hypothesis corresponding to the presence of it. Hence, the statement of the problem can be expressed as deciding whether an unknown signal $x(t)$ is present by examining the statistical characteristics of the received signal $r(t)$ in the presence of noise $n(t)$.
The unknown signal, $x(t)$, can be decomposed into the following form under the narrowband channel assumption [29]:
$$ x(t) = m(t)\, s(t)\, a(t) \tag{2} $$
where $m(t)$, $s(t)$, and $a(t)$ represent the complex fading channel process, the slow-fading process, and the unknown baseband signal, respectively. In addition, the unknown baseband signal is assumed to be digitally modulated as in [30], with $E$, $\theta$, and $p(\cdot)$ denoting the energy, phase, and complex-valued pulse shaping waveform, respectively, for the $k$-th digital symbol with $k = 0, 1, \ldots, M - 1$ in an $M$-ary scheme. Note that all three processes in (2) are independent of each other and of $n(t)$.
For modeling the fading processes, the physical radio propagation environment should be examined. Transmit signals reach the receiver antenna as multiple rays or paths. Due to constructive and destructive superposition, the received signal power level fluctuates drastically, leading to the phenomenon known as fading. The resulting signal can be represented with a complex fading channel process as
$$ m(t) = h(t)\, e^{j\phi(t)} $$
where $h(t)$ and $\phi(t)$ refer to the amplitude and phase of the complex channel process, respectively. In case there is a sufficiently large number of independent paths superposing at the receiver antenna in the absence of a specular component (such as the one present in line-of-sight (LOS) environments), $h(t) = |m(t)|$ follows the Rayleigh distribution in accordance with the central limit theorem (CLT). In addition, mobility gives rise to correlation in the fading channel process [31]. If the angle of arrival (AoA) of the paths at an omni-directional antenna is assumed to be uniformly distributed over $[-\pi, \pi)$ on a 2-D plane, then Jakes' Doppler spectrum occurs as a special case. Correlation in the temporal domain for this special case is given by
$$ R_m(\tau) = J_0(2\pi f_D \tau) $$
where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind, $f_D = f_c v / c$ is the maximum Doppler frequency, $f_c$ is the operating frequency, $v$ is the mobile speed, and $c$ is the speed of light ($c = 3 \times 10^8$ m/s).
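As a rough numerical illustration of the Jakes special case described above, the following sketch computes the maximum Doppler frequency and the temporal correlation of the fading process. It assumes the correlation is normalized to unity at zero lag; the function names and the 5.9 GHz / 30 m/s example values are hypothetical choices for illustration, not parameters taken from this study. The Bessel function is evaluated from its integral representation so that only the standard library is needed.

```python
import math

C = 3.0e8  # speed of light in m/s, as stated in the text


def max_doppler_hz(fc_hz: float, speed_mps: float) -> float:
    """Maximum Doppler frequency f_D = f_c * v / c."""
    return fc_hz * speed_mps / C


def bessel_j0(x: float, steps: int = 2000) -> float:
    """Zeroth-order Bessel function of the first kind.

    Uses J_0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, evaluated
    with the midpoint rule.
    """
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi


def jakes_correlation(tau_s: float, fc_hz: float, speed_mps: float) -> float:
    """Normalized temporal correlation J_0(2 * pi * f_D * tau)."""
    f_d = max_doppler_hz(fc_hz, speed_mps)
    return bessel_j0(2.0 * math.pi * f_d * tau_s)


if __name__ == "__main__":
    fc, v = 5.9e9, 30.0  # hypothetical carrier frequency and vehicle speed
    print("f_D [Hz]:", max_doppler_hz(fc, v))
    for tau_ms in (0.0, 0.1, 0.5, 1.0):
        print(tau_ms, "ms ->", jakes_correlation(tau_ms * 1e-3, fc, v))
```

The correlation decays and oscillates as the lag grows, which is why decision statistics taken far apart in time are nearly uncorrelated at high vehicle speeds.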
Besides the fast-fading process, transmitter-receiver separation and the obstacles present in between affect the received signal power as well. Loss in the received signal power due to the transmitter-receiver separation is known as distance-dependent path loss. The received power decreases monotonically as a function of the relative distance between transmitter and receiver. Obstacles along the propagation paths between transmitter and receiver cause drastic fluctuations in the power level of the received signal too. This phenomenon is known as shadow fading. Measurement data reveal that the first-order statistics of the slow-fading phenomenon can be approximated by a log-normal distribution. Therefore, the joint effect of path loss and shadow fading could be modeled by a single process of the form given below [29]:
$$ s(t) = \exp\left( \tfrac{1}{2} \left( \mu(t) + \sigma_G\, g(t) \right) \right) \tag{5} $$
where $\mu(t)/2$ represents the mean, $\sigma_G/2$ is the standard deviation of log-normal fading, and $g(t)$ is a real-valued unit normal process $\mathcal{N}(0, 1)$. It is not difficult to infer from (5) that $\mu(t)$ represents the impact of distance-dependent path loss varying over relatively longer periods of time.
As in the complex fading process, experimental results also report that $g(\cdot)$ exhibits correlation of an exponentially decaying form [32]:
$$ E\{ g(t)\, g(t + \tau) \} = \exp\left( -\frac{v |\tau|}{d_\rho} \right) \tag{6} $$
where $E\{\cdot\}$ is the statistical expectation and $d_\rho$ refers to the decorrelation distance. Field measurements show that various environments have different decorrelation distances. For example, in [32], $d_\rho$ is calculated to be 5.75 and 350 m for urban and suburban environmental classes, respectively. It is crucial to state that both (5) and (6) correspond to simplified theoretical approximations which are consistent to some extent with experimental results available in the literature. However, there are some other studies available in the literature related to shadowing models such as static and dynamic shadowing [33].
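The slow-fading model above can be sketched in a few lines. The sketch assumes that the log-normal process takes the form exp((mu + sigma_G * g) / 2) and that the correlation of g decays as exp(-v|tau|/d_rho); under these assumptions, a first-order autoregressive recursion reproduces the exponential correlation at the sampling instants. The function name, the constant mean `mu`, and all example values other than the decorrelation distances quoted from [32] are hypothetical.

```python
import math
import random


def shadowing_samples(n, speed_mps, sample_period_s, d_rho_m,
                      mu, sigma_g, seed=0):
    """Generate n samples of a correlated log-normal shadowing process.

    g[k] is a unit-variance Gaussian AR(1) sequence whose lag-one
    correlation is rho = exp(-v * T / d_rho), so that its correlation
    at lag k equals rho**k = exp(-v * k * T / d_rho).
    """
    rng = random.Random(seed)
    rho = math.exp(-speed_mps * sample_period_s / d_rho_m)
    g = rng.gauss(0.0, 1.0)  # start in the stationary distribution
    samples = []
    for _ in range(n):
        samples.append(math.exp(0.5 * (mu + sigma_g * g)))
        # Innovation is scaled so that g keeps unit variance.
        g = rho * g + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return samples


if __name__ == "__main__":
    # Urban decorrelation distance quoted in the text; other values are illustrative.
    urban = shadowing_samples(5, speed_mps=15.0, sample_period_s=0.01,
                              d_rho_m=5.75, mu=0.0, sigma_g=1.0)
    print(urban)
```

A short decorrelation distance (urban) gives a small `rho` and rapidly changing shadowing, whereas a long one (suburban) keeps consecutive samples nearly identical.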

Energy detection
The energy detector (or radiometer) is a simple, first-order receiver which accumulates the energy of the received signal for a specific time interval. The collected energy, which is called the decision statistic, is then sent to a decision device. The decision device compares the instantaneous decision statistic with a pre-defined threshold to come up with a binary conclusion regarding the absence/presence of an unknown signal. In the discrete domain, assuming that the detector has a sufficiently high sampling rate, the output of the detector is calculated by
$$ d[n] = \sum_{i=0}^{N-1} \left| r[n - i] \right|^2 $$
where $N$ is an arbitrary number of samples taken into consideration and $r[\cdot]$ denotes the discrete counterpart of the received signal, $r(\cdot)$. For $H_0$, the AWGN assumption yields a central $\chi^2_N$ distribution, whereas $H_1$ provides a non-central $\chi^2_N$ distribution with an additional shape parameter. If $N$ is selected to be large enough, the decision statistic is assumed to be asymptotically normally distributed with certain mean and variance values in accordance with the CLT. Before proceeding with the performance analysis of the radiometer, it is worth mentioning that the radiometer is a non-coherent receiver and is known to be the "optimum" detector in the absence of a priori knowledge about the received signal [34,35]. Yet, it has very critical drawbacks. First of all, noise-plus-interference uncertainty degrades the performance of the radiometer drastically [36]. In addition, the low signal-to-noise ratio (SNR) regime leads to unsatisfactory results [37]. Combining these two issues, one could easily conclude that the radiometer performs poorly in detecting spread-spectrum signals [38]. Furthermore, it is clear that the performance of the radiometer is heavily dependent on the SNR/signal-to-interference-plus-noise ratio (SINR). For the sake of completeness, it is desirable to show how the performance of the radiometer is related to SNR.
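The radiometer described above can be sketched as a sliding-window sum of squared magnitudes followed by a threshold comparison. This is a minimal sketch: the windowing convention (the most recent N samples), the function names, and the example noise power and threshold are assumptions made for illustration only.

```python
import random


def energy_detector(samples, window, threshold):
    """Sliding-window energy detector.

    For every index n with a full window available, accumulate the
    energy of the last `window` complex samples and compare it with
    `threshold`. Returns (decision_statistics, decisions), where a
    decision of True means "signal present".
    """
    stats, decisions = [], []
    for n in range(window - 1, len(samples)):
        d = sum(abs(samples[n - i]) ** 2 for i in range(window))
        stats.append(d)
        decisions.append(d > threshold)
    return stats, decisions


def complex_awgn(n, sigma2, rng):
    """n samples of complex Gaussian noise with total variance sigma2."""
    s = (sigma2 / 2.0) ** 0.5  # per-component standard deviation
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    noise = complex_awgn(64, sigma2=1.0, rng=rng)
    # Hypothetical burst: a constant-envelope signal on the second half.
    received = [w + (2.0 if k >= 32 else 0.0) for k, w in enumerate(noise)]
    stats, decisions = energy_detector(received, window=8, threshold=20.0)
    print(decisions)
```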
As stated earlier, the decision statistic $d[n]$ under the AWGN assumption has a central $\chi^2$ distribution with $N$ degrees of freedom ($\chi^2_N$) for $H_0$, whereas it has $\chi^2_N(m)$ for $H_1$, where $m$ denotes the shape (non-centrality) parameter [34]. Based on these assumptions, one could easily obtain the probability density function (PDF) for the $H_0$ scenario as
$$ f_{\chi^2_N}(x) = \frac{1}{2^{N/2}\, \Gamma(N/2)}\, x^{N/2 - 1} e^{-x/2} $$
where $\Gamma(\cdot)$ denotes the gamma function. Similarly, the PDF for the $H_1$ scenario is given by
$$ f_{\chi^2_N(m)}(x) = \frac{1}{2}\, e^{-(x + m)/2} \left( \frac{x}{m} \right)^{N/4 - 1/2} J_{N/2 - 1}\left( \sqrt{m x} \right) $$
where $J_k(\cdot)$ denotes the $k$-th-order modified Bessel function of the first kind. Based on both $f_{\chi^2_N}(x)$ and $f_{\chi^2_N(m)}(x)$, the receiver operating characteristic (ROC) could theoretically be derived by seeking $\Pr(\chi^2_N(m) > \lambda \mid H_1)$ and $\Pr(\chi^2_N > \lambda \mid H_0)$, which correspond to the probability of detection and the probability of false alarm, respectively. Here, $\Pr(\chi^2_N(m) > \lambda \mid H_1)$ is given by $Q_g(\sqrt{2\gamma}, \sqrt{\lambda})$, where $\lambda$ denotes the decision threshold, $Q_g(\cdot)$ is the generalized Marcum Q function, and $\gamma$ is the instantaneous SNR. Note that the optimum threshold selection mandates one to have knowledge about the SNR ([34], Eq. (35)). It is not difficult to see that the same conclusion (with more sophisticated calculations) holds also for the SINR in case there are unknown activities within the spectrum of interest apart from that of the primary user [39].
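To make the link between the threshold and the false alarm probability concrete, the sketch below evaluates the tail of the central chi-square distribution in closed form. It assumes an even number of degrees of freedom and a decision statistic normalized so that it is exactly chi-square distributed under the noise-only hypothesis; the function names and the bisection search for the threshold are illustrative choices, not the procedure used in this study.

```python
import math


def false_alarm_probability(threshold: float, dof: int) -> float:
    """Pr(chi2_dof > threshold) for an even number of degrees of freedom.

    For dof = 2u, the tail equals exp(-t/2) * sum_{j<u} (t/2)^j / j!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = threshold / 2.0
    term, total = 1.0, 0.0
    for j in range(dof // 2):
        total += term
        term *= half / (j + 1)  # next term (t/2)^(j+1) / (j+1)!
    return math.exp(-half) * total


def threshold_for_pfa(target_pfa: float, dof: int, upper: float = 1.0e4) -> float:
    """Bisection search for the threshold that meets a target false alarm rate."""
    lower = 0.0
    for _ in range(200):
        mid = 0.5 * (lower + upper)
        if false_alarm_probability(mid, dof) > target_pfa:
            lower = mid  # tail still too heavy, raise the threshold
        else:
            upper = mid
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, threshold_for_pfa(0.05, n))
```

Setting the threshold this way fixes the false alarm rate only; the resulting detection probability still depends on the SNR, which is the dependence noted in the text.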

Channel selection
The cross-layer channel selection mechanism relies on both PHY and MAC operations, as outlined in Fig. 3. In order for the MAC layer to select a specific channel, a short-term history of the time-domain statistics of the candidate channels is stored. Next, a short-term/next-step prediction is evaluated based on the collected data. Evaluation is followed by a selection step in which the node chooses the channel that has the maximum value of a pre-defined metric.
The proposed channel selection metric is defined to be the first-order difference of variance estimates of the predicted value of the energy detection operation output. As will be shown subsequently, variance estimates are scaled appropriately in order not to violate Kolmogorov's axioms for the probability metric. Before proceeding further, the prediction of the energy detector output should be given. Assuming that the predicted value of the energy detection operation, namely $\hat{X}^k_{n+1}$, is a wide-sense stationary (WSS) process, then
$$ \hat{X}^k_{n+1} = \mu_{X^k} + \sum_{i=0}^{L-1} \alpha_i X^k_{n-i} \tag{10} $$
expresses the weighted moving average prediction, where $\hat{X}^k_{n+1}$ denotes the predicted value for the $(n+1)$-th step based on the past $L$ historical data of the $k$-th channel; the $\alpha_i$ denote the weighting coefficients; and $\mu_{X^k}$ represents the intercept of the prediction. Once $L$ past energy detector observations are at hand, the channel selection mechanism then proceeds with calculating the variance of the past observations as:
$$ \sigma^2_{X^k}[n] = \frac{1}{L} \sum_{i=0}^{L-1} \left( X^k_{n-i} - E\{ X^k_{n-i} \} \right)^2 \tag{11} $$
where $E\{ X^k_{n-i} \}$ denotes the sample mean of the past $L$ observations. In order for the channel selection mechanism to quantify whether a state transition takes place at the prediction stage/step, (11) needs to be scaled so that, as mentioned above, Kolmogorov's axioms for the probability metric are not violated:
$$ \psi_{X^k}[n] = c_k\, \sigma^2_{X^k}[n] \tag{12} $$
where $c_k$ is a constant for the $k$-th channel which satisfies $0 < \psi_{X^k}[n] \le 1$. Finally, the channel selection mechanism applies the first-order difference operator to (12) and reaches the channel selection metric as
$$ \Delta_{X^k}[n] = D^{-1}\left( \psi_{X^k}[n] \right) = \psi_{X^k}[n] - \psi_{X^k}[n-1] \tag{13} $$
where $D^{-1}(\cdot)$ denotes the first-order difference operator being applied to its input. Now, the channel selection mechanism is ready to provide the hard decision based on (13) as follows:
$$ \hat{d}^k_n = \begin{cases} \text{occupied}, & \Delta_{X^k}[n] < \zeta \\ \text{idle}, & \text{otherwise} \end{cases} \tag{14} $$
where $\hat{d}^k_n$ denotes the predicted binary state of the $k$-th channel at the $n$-th step for the next $(n+1)$-th step and $\zeta$ is an adaptive threshold below which the channel is predicted to be occupied.
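The sequence of steps described above (weighted moving average prediction, variance of the past L observations, scaling, first-order difference, thresholding) can be sketched as follows. This is only an illustrative reading of the procedure: the population-variance normalization, the multiplicative scaling constant `scale`, the string labels for the two states, and all function names are assumptions, and the caller is assumed to pick `scale` so that the scaled variance stays within (0, 1].

```python
from statistics import pvariance


def predict_next(history, weights, intercept=0.0):
    """Weighted moving average prediction of the next detector output.

    `history` holds the past L observations, most recent last;
    weights[i] multiplies the observation that is i steps old.
    """
    return intercept + sum(a * x for a, x in zip(weights, reversed(history)))


def scaled_variance(history, scale):
    """Variance of the given observations, scaled by a per-channel constant."""
    return scale * pvariance(history)


def selection_metric(observations, depth, scale):
    """First-order difference of the scaled variance estimates.

    Needs at least depth + 1 observations: one window ending at the
    current step and one ending at the previous step.
    """
    current = scaled_variance(observations[-depth:], scale)
    previous = scaled_variance(observations[-depth - 1:-1], scale)
    return current - previous


def predict_state(observations, depth, scale, zeta):
    """Hard decision: 'occupied' when the metric falls below zeta."""
    metric = selection_metric(observations, depth, scale)
    return "occupied" if metric < zeta else "idle"


def select_channel(per_channel_observations, depth, scale):
    """Pick the channel whose metric is largest."""
    return max(per_channel_observations,
               key=lambda ch: selection_metric(per_channel_observations[ch],
                                               depth, scale))


if __name__ == "__main__":
    channels = {"SCH-A": [1.0, 1.0, 1.0, 5.0], "SCH-B": [1.0, 1.0, 1.0, 1.0]}
    print(select_channel(channels, depth=2, scale=0.1))
    print(predict_next([2.0, 4.0], weights=[0.5, 0.5]))
```

The channel labels in the usage example are placeholders; any hashable channel identifier works.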
There are some critical points in establishing the steps (10) through (14). First and foremost, the depth of the history, $L$, should be specified. Consequently, the weighting coefficients $\alpha_i$ in (10) need to be determined in an effective way as well. Second, the scaling constant in (12) should be decided. Because (10) is a linear model in essence, any minimization approach based on the 2-norm ($\| \cdot \|_2$), such as mean squared error (MSE), could be adopted in determining the weighting coefficients, as shown subsequently. On the other hand, determining the depth of the history, $L$, is actually nothing but deciding the order of the linear model adopted in (10) based on a specific set of criteria. Mean magnitude residual statistics, sum of squares of Pearson residuals, and the Akaike information criterion (AIC) are prominent model order selection strategies present in the literature [40]. However, model order selection is outside the scope of this study.
In what follows, it will be shown that the proposed channel selection mechanism has a prediction error which cannot be reduced further and the aforementioned issues will all be clarified.

Proposition 1 (Irreducible prediction error). The proposed channel selection mechanism has an irreducible prediction error variance, $\sigma_e^2$, which is a function of the threshold that is used by the energy detector at the decision stage.
Proof. See Appendix.
Based on the analysis carried out in the Appendix, it is seen that the channel selection mechanism relies heavily on the threshold for the decision device operating on an energy detector. There are several implications of Proposition 1. First of all, it states that the prediction error variance reaches its lower bound with the first threshold when there are multiple thresholds. For instance, in [40], a two-level threshold strategy is applied where the first stage clips the original time series and the other is used at the prediction stage. Similar multi-stage scenarios take place in decision fusion schemes as well [41]. Hence, with the aid of Proposition 1, it is known in advance that the lower bound for the prediction error variance could only be achieved in case the decision is obtained directly from the original time series. Second, due to the positive semi-definite structure of the error variance, multiple-threshold strategies cannot outperform single-threshold strategies in terms of prediction error variance, given that both single- and multiple-threshold strategies are of the same mathematical structure, such as both being linear or non-linear and so on. Performance results and irreducible prediction error will be investigated in the subsequent sections. A sketch of the proposed channel selection mechanism is given in Fig. 4.

Numerical results
The performance of the proposed method is investigated in various traffic conditions and scenarios. Also, a performance comparison is established between the proposed method and a Markovian-based prediction scheme. One might be interested in why specifically a Markovian-based scheme is selected for comparison purposes. The reason is threefold: (i) Peculiar to the spectrum sensing problem, spectrum occupancy could best be represented with a Bernoulli process, which is a special case of a Markov chain. Therefore, Markovian-based structures provide a high level of flexibility in both modeling and analyzing the spectrum sensing problem. (ii) Markovian-based schemes are widely employed prediction strategies in the literature. This stems from the fact that they have a simple design and a tractable statistical analysis. Furthermore, depending on the problem formulation, several adaptations and extensions such as multi-layer Markovian or hidden Markov model (HMM)-based approaches could easily be obtained with minor modifications to the original model organization. (iii) Finally, Markovian-based models are so versatile that they could be exploited to tackle spectrum sensing problems in different parts of the open systems interconnection (OSI) reference model. For instance, they could be used both in PHY and MAC layers to shed light on different aspects of the spectrum sensing problem.
The simulation setup considers several levels of traffic load, ranging from light occupancy to a densely occupied scenario, based on the parameters reported in [42]. At this point, it is worth mentioning why parameter selection is based on the findings reported in [42]. (I) First and foremost, the proposed method requires both theoretical and empirical analyses. Therefore, studies which focus on both theoretical and empirical findings should be evaluated. From this perspective, [42] is a prominent work in the literature. (II) Also, as proposed, the channel selection mechanism could be shared by both PHY and MAC layers in the sense of a cross-layer approach. In [42], the same strategy is followed and both PHY and MAC layers are bound together. (III) Finally, as discussed above, a generic Markovian-based structure for the industrial, scientific, and medical (ISM) band depending on several important occupancy parameters is provided in [42], such that it could be used for comparing other findings with each other.
In simulating channel access, both busy and idle channel statistics are taken into account. According to the empirical observation-based model given in [42], busy and idle channel state durations can be well approximated by the exponential distribution with a factor that changes with the traffic load. For the proposed method, the energy detector is first used to obtain the decision statistics. Next, the decision statistics are fed to the prediction stage. The prediction is obtained by equal gain combining ($\alpha_i = \alpha_j = 1/L,\ \forall i, j$). Along with the past measurement results, the prediction is quantified with the metric defined in Section 3. Finally, a binary decision is reached so that the MSE is obtained for the performance evaluations. It should be stated at this point that equal gain combining might not be the optimal prediction strategy for many scenarios. However, it is simple to implement from the practical point of view because it skips the stage at which the weighting coefficients are estimated. Therefore, equal gain combining is adopted in the numerical results.
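A simulation loop of the kind described above could be sketched as follows: busy and idle periods are drawn from exponential distributions, discretized into sensing slots, and an equal gain (plain average) predictor is scored by its MSE. The slot duration, mean durations, and function names are hypothetical placeholders and do not reproduce the parameters of [42] or Table 1.

```python
import random


def simulate_occupancy(n_slots, slot_s, mean_busy_s, mean_idle_s, seed=0):
    """Binary occupancy sequence (1 = busy, 0 = idle).

    Busy and idle durations are exponentially distributed and rounded
    to a whole number of sensing slots (at least one slot each).
    """
    rng = random.Random(seed)
    states, busy = [], False
    while len(states) < n_slots:
        mean = mean_busy_s if busy else mean_idle_s
        duration = rng.expovariate(1.0 / mean)
        slots = max(1, round(duration / slot_s))
        states.extend([1 if busy else 0] * slots)
        busy = not busy
    return states[:n_slots]


def equal_gain_prediction(history):
    """Equal gain combining: every past observation is weighted by 1/L."""
    return sum(history) / len(history)


def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


if __name__ == "__main__":
    depth = 2
    seq = simulate_occupancy(1000, slot_s=1e-3, mean_busy_s=5e-3, mean_idle_s=5e-3)
    preds = [equal_gain_prediction(seq[n - depth:n]) for n in range(depth, len(seq))]
    print("load:", sum(seq) / len(seq))
    print("MSE :", mean_squared_error(preds, seq[depth:]))
```

Changing the ratio of `mean_busy_s` to `mean_idle_s` is one simple way to sweep the traffic load in such a sketch.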
For Markovian-based prediction scheme, the first output of a Bernoulli stochastic variable is used in order to determine the initial state of the channel. Next, simulated channel statistics are compared with a threshold whose value can be adjusted according to several levels of traffic load. Once busy/idle periods are shaped, prediction stage is initiated. For the Markovian-based scheme, a specific portion of the simulated data is fed to the predictor in order to estimate the transition probability matrix. Then, predictions are calculated based on the initial state (last training data sample) and the transition probability matrix estimate. Finally, MSE values are obtained for performance evaluations and comparisons. General parameters used in the simulations are given in Table 1.
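The Markovian baseline described above can be sketched for a two-state (idle/busy) chain: the transition probability matrix is estimated from a training portion by counting transitions, and predictions are propagated from the last training sample. Scoring the predicted busy probability against the true binary state is an assumption of this sketch, as are the function names and the fallback used for states never observed in training.

```python
def estimate_transition_matrix(states):
    """Estimate a 2x2 transition matrix from a 0/1 sequence by counting."""
    counts = [[0, 0], [0, 0]]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row if a state was never left in training.
        matrix.append([c / total for c in row] if total else [0.5, 0.5])
    return matrix


def predict_busy_probability(matrix, initial_state, steps):
    """Probability of the busy state after `steps` transitions."""
    dist = [0.0, 0.0]
    dist[initial_state] = 1.0
    for _ in range(steps):
        dist = [dist[0] * matrix[0][j] + dist[1] * matrix[1][j] for j in range(2)]
    return dist[1]


def markov_mse(states, train_fraction=0.1):
    """Train on the first portion, then score multi-step predictions by MSE."""
    split = max(2, int(len(states) * train_fraction))
    train, test = states[:split], states[split:]
    matrix = estimate_transition_matrix(train)
    dist = [0.0, 0.0]
    dist[train[-1]] = 1.0  # initial state is the last training sample
    total = 0.0
    for actual in test:
        dist = [dist[0] * matrix[0][j] + dist[1] * matrix[1][j] for j in range(2)]
        total += (dist[1] - actual) ** 2
    return total / len(test)


if __name__ == "__main__":
    sequence = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0] * 20
    print(estimate_transition_matrix(sequence[:20]))
    print(markov_mse(sequence, train_fraction=0.1))
```

Because the multi-step prediction converges to the stationary distribution of the estimated chain, adding more training data mostly refines that stationary value rather than the ability to track individual transitions.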
Performance results for the Markovian-based scheme are plotted in Fig. 5. It is seen in Fig. 5 that the performance of the Markovian-based prediction method improves with increasing traffic load. This mainly stems from the fact that the underlying stochastic channel access becomes denser with the increased traffic load; therefore, transitions between busy and idle states become less frequent. Since it is known that the steady-state behavior always converges to a biased binary estimate given that the transition probability matrix is not symmetric, the prediction step always yields a biased output in favor of the busy state. Another important observation in Fig. 5 is that the performance of the Markovian-based prediction scheme cannot be improved any further by increasing the amount of history at the training stage. As can be verified from Fig. 5, 10% of the entire data set is sufficient under each level of traffic load to reach the same performance level in training the Markovian-based scheme. The performance of the proposed method for $L = 2$ can be seen in Fig. 5 as well under a generic scenario for comparison purposes. Here, $L = 2$ is adopted, since in conjunction with the equal gain combining strategy it provides some sort of a lower bound for the performance of the proposed predictor, as discussed above. This way, a fair comparison platform between the proposed method and the Markovian-based structures could be established. First and foremost, the proposed method yields a stationary MSE in contrast to the Markovian-based scheme. This is caused by the combined impact of both the energy detector and the equal gain combining. Recall that the channel selection mechanism first collects the energy of the received signal at a sufficiently high sampling rate (generally satisfying the Nyquist criterion) for a very short period of time.
This implies that especially switching from idle state to busy state can be captured in a few consecutive decision statistics produced by the energy detector. Once consecutive, high-resolution, and very low-latency decision statistics are fed to the prediction procedure, dramatic power fluctuations in the decision statistics due to fading are smoothed out to some extent since equal gain combining is nothing but applying a low-pass filtering operation to its input. This way, the proposed method firstly can capture the transitions very effectively, and secondly, false alarms due to drastic power fluctuations caused by fading are eliminated automatically. Hence, a stationary output for the proposed method is established as observed in Fig. 5.
The discussion above could be investigated in a better way by examining Fig. 6. In Fig. 6, a single snapshot of the prediction stage is plotted for approximately 20 ms. As can be seen from the figure, there are two plots corresponding to the output of the energy detector (decision statistic) and the output of the predictor. Based on the values given in Table 1, several bursts are generated and passed through the energy detector. The decision statistics, that is, the outputs of the energy detector, are stored. Next, a specific portion of the data stored (10% as discussed above) is used for estimating the weighting coefficients of the predictor for $L = 2$. Then, the predictor runs. Note that the prediction lags one sample behind its input, as expected. Nevertheless, by looking at the time scale of the bursts, one could conclude that the prediction reacts very rapidly. Yet, it is also observed in Fig. 6 that the predictor fails to track the actual observations when dramatic fluctuations occur in the received signal power. This is not surprising since the prediction could only take into account a temporal window whose duration is $L \times T_A$. Therefore, any burst whose duration, say $T_B$, satisfies $T_B < L \times T_A$ will yield a dramatic fluctuation in the decision statistic. Thus, the prediction will not be able to keep track of these sudden changes in the received signal power. It is desirable also to see the impact of traffic load on the performance of the proposed method. Average error rate performances for the proposed method under various levels of traffic load are given in Fig. 7. Note that the traffic load does not change the average error rate performance drastically. However, it is observed from Fig. 7 that the traffic load might influence the performance of the proposed method by around 2% on average between the best- and worst-case scenarios. Moreover, it is seen that the proposed method exhibits its worst performance under the average traffic load scenario.
This is not surprising since the entropy of the channel occupancy statistics reaches its maximum when the load is 0.5 [43].
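The entropy argument can be checked numerically with a short sketch, assuming the channel occupancy is modeled as a two-state (idle/busy) variable that is busy with probability equal to the traffic load. This Bernoulli model and the names `binary_entropy` and `loads` are illustrative assumptions; the occupancy model actually used in [43] may be richer.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-state (idle/busy) channel that is busy with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a channel that is always idle or always busy carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


# Hypothetical traffic loads spanning light to dense occupancy.
loads = [0.1, 0.3, 0.5, 0.7, 0.9]
entropies = {load: binary_entropy(load) for load in loads}
most_uncertain_load = max(entropies, key=entropies.get)
```

Under this assumption, the uncertainty is symmetric around a load of 0.5 and peaks there at 1 bit, so the next channel state is hardest to predict at the average load.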
In conjunction with the aforementioned discussion, it is important to check the collision rate performance of the proposed channel selection mechanism. In order to evaluate this, a single-channel access scheme is assumed. As in the previous cases, channel access is shaped by the parameters reported in [42], with a traffic load ranging from light to dense occupancy. Based on the predictor output, it is decided whether the channel is accessed or not. Cases where "no access" takes place are omitted, since a no-access decision can never yield a collision in any single-channel access scheme. Hence, the collision rate is calculated for the predictor output when channel access is allowed. The results are plotted in Fig. 8 under various traffic loads. In parallel with the results given in Fig. 7, the collision rate reaches its maximum when the traffic load is 0.5.
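The collision rate bookkeeping described above can be sketched as follows. The access rule used here (access the channel when the predicted statistic falls below a threshold) is an assumption made for illustration, and the function name `collision_rate` and all sample values are hypothetical rather than taken from the reported simulations.

```python
def collision_rate(predictions, actual_busy, threshold):
    """Fraction of allowed accesses that hit a busy channel.

    Slots where the predictor denies access are skipped, because a
    no-access decision cannot produce a collision.
    """
    accesses = 0
    collisions = 0
    for predicted, busy in zip(predictions, actual_busy):
        if predicted >= threshold:
            continue  # predicted busy: no access, slot is omitted
        accesses += 1
        if busy:
            collisions += 1  # accessed a channel that was actually occupied
    return collisions / accesses if accesses else 0.0


# Hypothetical predictor outputs and true channel states for six slots.
predictions = [0.2, 0.9, 0.3, 0.4, 0.8, 0.1]
actual_busy = [False, True, True, False, True, False]
rate = collision_rate(predictions, actual_busy, threshold=0.5)
```

In this toy example, access is allowed in four slots and one of them collides, so the collision rate is 0.25. Sweeping `threshold` over several values gives a simple way to compare threshold choices on the same data.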
Next, it is important to investigate how the proposed mechanism behaves under various threshold values. Recall that the proposed channel selection mechanism relies heavily on selecting an optimal threshold value, ζ, in (14). Three different threshold values and the corresponding collision rates are given in Fig. 9. To better understand how the proposed mechanism behaves under various threshold values, Fig. 9 can be examined in conjunction with Fig. 8, since Fig. 8 shows the results for optimal threshold selection. As can be seen from both Figs. 8 and 9, the proposed mechanism performs poorly when a random threshold selection strategy is employed. On the contrary, Fig. 8 implies that the optimum threshold gives rise to the minimum collision rate for all traffic loads. From a practical point of view, obtaining the optimum threshold value might be difficult; therefore, cases where the optimum threshold is not available should be examined as well. In Fig. 9, two different suboptimal threshold values and the corresponding collision rates are given. As can be seen from the figure, depending on the traffic load, one of the thresholds slightly outperforms the other over one part of the load range, whereas the situation is reversed over the other part, as expected.
It is worth considering whether the theoretical lower bound given by (23) in the Appendix exists. As Proposition 1 implies, the lower bound could be reached if the behavior of the decision device could be characterized statistically, since (23) is a function of the decision threshold, ζ. For the energy detector, it is known that an optimal decision threshold theoretically exists, as expressed in ([44], § III.B).
Before concluding this part, one might want to contemplate the worst-case scenario regarding the prediction error variance. Selecting the weighting coefficients α_i in (10) plays an important role in the behavior of the error variance. In this regard, any approach ignoring the actual observations, X^k_{n−i}, in (10) would lead to the no-information case and, therefore, yield the maximum entropy. This automatically implies that the maximum entropy could only be reached by setting α_i = 0, ∀i. In (10), such a setting gives rise to a prediction, X̂^k_{n+1}, that reduces to the AWGN term of the autoregressive (AR) structure, with a certain mean and variance pair. Theoretically speaking, AWGN could take any real value; therefore, the prediction error variance diverges for the worst-case scenario.
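The role of the weighting coefficients can be illustrated with a minimal sketch, assuming that (10) has the usual one-step AR form in which the prediction is a weighted sum of the L most recent observations plus a noise term. The exact indexing of (10) is not reproduced here, so the function `predict_next`, its argument order, and the sample values are hypothetical.

```python
def predict_next(history, alphas, noise=0.0):
    """One-step linear prediction from the last len(alphas) observations.

    Assumed form: prediction = sum_i alphas[i] * X_{n-i} + noise,
    where alphas[0] weights the newest observation X_n.
    """
    order = len(alphas)
    if len(history) < order:
        raise ValueError("history must contain at least len(alphas) observations")
    recent = history[-order:][::-1]  # recent[0] is the newest observation
    return sum(a * x for a, x in zip(alphas, recent)) + noise


# Hypothetical decision statistics around an idle-to-busy transition, with L = 2.
history = [1.0, 1.2, 4.8, 5.0]
informed = predict_next(history, alphas=[0.6, 0.4])

# Worst case: all weights are zero, so the observations are ignored entirely
# and the prediction is just the noise sample, whatever the channel did.
blind_a = predict_next([1.0, 2.0], alphas=[0.0, 0.0], noise=0.37)
blind_b = predict_next([100.0, 200.0], alphas=[0.0, 0.0], noise=0.37)
```

With nonzero weights the prediction follows the recent observations, whereas with all weights set to zero the two very different histories produce the same output, which mirrors the no-information case discussed above.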

Conclusions
In this paper, a cross-layer predictive channel selection mechanism is proposed to increase utilization and performance in the WAVE multi-channel architecture. With this selection scheme, the channel history is used to predict the next state of the wireless channel. By taking advantage of the cross-layer design, the MAC layer is fed with energy detector statistics from the PHY layer in order to find the most appropriate non-emergency/service channel. The proposed method implies a channel access protocol which improves the effectiveness of the collision avoidance procedure. Furthermore, the proposed architecture could be adapted to various scenarios and conditions by appropriately selecting the model parameters. The analysis, along with the results, shows that the proposed method outperforms the widely deployed Markovian-based prediction method; therefore, it is a promising candidate for the WAVE standard.
The cross-layer architecture allows the proposed method to be extended further in various ways. For instance, both traffic and geographical information could be incorporated into the architecture as a pseudo-layer appended to the PHY, MAC, and network layers. Moreover, the predictive strategy could be enhanced further by taking into account network traffic type statistics such as web access, VoIP, video streaming, and gaming. Considering the fact that different network traffic types exhibit distinct statistical behaviors, enhanced predictive strategies could be devised based on these statistical models and parameters reflecting the stochastic nature of each traffic type.
Finally, collaborative, cooperative, and coordinated access schemes could be incorporated into the proposed model. This way, relays, horizontal and vertical handover algorithms, and V2I link support can be optimized, since VANETs are considered to be an integral part of next-generation wireless networks (NGWNs).