Decision making for cognitive radio equipment: analysis of the first 10 years of exploration
© Jouini et al; licensee Springer. 2012
Received: 23 May 2011
Accepted: 25 January 2012
Published: 25 January 2012
This article offers a general retrospective on the first 10 years of cognitive radio (CR). More specifically, we explore decision making and learning for CR from an equipment perspective. The article presents the main decision making problems addressed by the community as general dynamic configuration adaptation (DCA) problems and discusses the solutions proposed in the literature to tackle them. Within this framework, dynamic spectrum management is briefly introduced as a specific instance of DCA problems. Our analysis identifies three dimensions of constraints: those related to the environment, to the equipment, and to the user. Moreover, we define and use the notion of a priori knowledge to show that the decision making challenges tackled by the radio community during the first 10 years of CR often share the same design space but differ in the a priori knowledge they assume to be available. Consequently, we suggest a priori knowledge as a classification criterion to discriminate among the main techniques proposed in the literature to solve configuration adaptation decision making problems. Finally, as a prospective analysis, we discuss the impact of sensing errors on the decision making process.
Keywords: cognitive radio; decision making problems; dynamic configuration adaptation; design space; a priori knowledge
1. Introduction
The increase in computational capacity, together with (rather) cheap and flexible hardware technologies (such as programmable logic devices, digital signal processors and central processing units), offers a glimpse into new ways of designing and managing future non-military communication systems (note a). As a matter of fact, in 1991 Joseph Mitola III argued that within a few years, at least in theory, software design of communication systems should be possible. The term coined by Joseph Mitola for such technologies is software defined radio (SDR). For illustration purposes, today's radio devices need a specific dedicated electronic chain for each standard, switching from one standard to another when needed (known as the Velcro approach). With the growing number of standards (GSM, EDGE, Wi-Fi, Bluetooth, LTE, etc.) in a single piece of equipment, the design and development of these radio devices has become a real challenge, and the practical need for more flexibility has become urgent. Recent hardware advances have made it possible to design, at least partially, software solutions to problems that in the past required hardware signal processing devices: a step closer to SDR systems.
Specifically, several definitions of SDR systems exist and are still a matter of debate in the community. For consistency, we briefly describe software related radio concepts as agreed on by the SDR Forum. This matter is further discussed in . The SDR Forum defines SDR as a radio in which some or all of the physical layer functions are software defined, where the terms physical layer and software defined are respectively described as:
Physical layer: The layer within the wireless protocol in which processing of radio frequency, intermediate frequency, or baseband signals including channel coding occurs. It is the lowest layer of the ISO seven-layer model as adapted for wireless transmission and reception.
Software defined: Software defined refers to the use of software processing within the radio system or device to implement operating (but not control) functions.
Thus, SDR systems are defined only from the design and implementation perspectives. Consequently, they appear as a simple evolution of the usual hardwired radio systems. However, with the added software layer, it is technically possible with current technology to control a large set of parameters in order to adapt radio equipment on the fly to its communication environment (e.g., bandwidth, modulation, protocol, and power level adaptation, to name a few). Nevertheless, the control and optimization of reconfigurable radio devices require the definition of optimization criteria related to the equipment's hardware capabilities, the users' needs, and the regulators' rules. Introducing autonomous optimization capabilities in radio terminals and networks is the basis of cognitive radio (CR), a term also coined by Joseph Mitola III [5, 6]. In his description, a CR should be able to:
Detect user communication needs as a function of use context, and
Provide radio resources and wireless services most appropriate to these needs.
Thus, the purpose of this new concept is to autonomously meet the user's expectations, i.e., maximizing his profit (in terms of QoS, throughput or power efficiency to name a few) without compromising the efficiency of the network. Hence, the needed intelligence to operate efficiently must be distributed in both the network and the radio device.
In this article, we provide a brief discussion of the decision making problems seen from the CR equipment's perspective and discussed in the literature, as well as the main solutions suggested to tackle them. For that purpose, we revisit in Section 2 the rise of the CR paradigm, from which we derive a basic definition. Then, in order to objectively compare the techniques introduced to address CR related decision making problems, we describe a conceptual object referred to as the design space in Section 3. This conceptual object was introduced in the literature to suggest that the CR design problem, from the decision making perspective, is better defined by a set of constraints than by a set of degrees of freedom. This section thus recalls the three dimensions of constraints considered, viz., the environment's constraints, the equipment's limits, and the user's needs. Moreover, in Section 4, we define and use the notion of a priori knowledge to show that the configuration adaptation decision making problems tackled by the radio community often share the same design space but differ in the a priori knowledge they assume to be available on this design space. Consequently, in Section 4, we suggest a priori knowledge as a classification criterion to discriminate among the main techniques proposed in the literature to solve configuration adaptation decision making problems. Section 5 extends this classification by adding the impact of observation accuracy and the benefit of learning techniques in such contexts. Section 6 concludes this analysis.
2. Cognitive radio
2.1. The rise of CR
As illustrated in Figure 1, a full cognitive cycle (note b) requires five steps at every iteration: observe, orient, plan, decide, and act. The observe step deals with internal as well as external metrics. It aims at capturing the characteristics of the environment of the communication device (e.g., channel state, interference level, or battery level, to name a few). This information is then processed by the three following steps (orient, plan, and decide), where priorities are set, schedules are planned according to the system's constraints, and decisions are made. Finally, an appropriate action is taken during the act step (such as sending a message, reconfiguring, or modifying the power level, to name a few). To complete the cognitive cycle, a final step is needed to enhance the decision making engine of the communication device: the learn step. Learning abilities enable communication equipment to evaluate the quality of its past actions. The decision making engine thus learns from its past successes and failures to tune its parameters and adapt its decision rules to its specific environment. Learning can consequently help the decision making engine improve the quality of future decisions.
In this article, although we cannot avoid mentioning CR applications from a spectrum management perspective, we focus on the decision making and learning mechanisms designed to deal with broader frameworks, i.e., configuration adaptation problems. Spectrum management problems are, from the equipment's point of view, but a subset of configuration adaptation problems.
2.2. Basic cognitive cycle
Since the original definition suggested by Joseph Mitola III, several other definitions have been proposed to delimit the boundaries of CR [4, 8–10, 15–17]. However, defining cognition is, in general, a difficult task. In the context of CR, two basic cognitive abilities are considered:
environment perception (or observation)
and reasoning (or analysis/decision).
Based on these cognitive abilities, a CR needs to take appropriate actions to adapt itself to its surrounding environment.
Observation: Through its sensors, the CR gathers information on its environment. Raw data and preprocessed information help the agent to build a knowledge base. In this context, the term environment is used in a broad sense, referring to any source of information that could improve the CR's behavior (internal state, interference level, regulators' rules and enforcement policies, to name a few).
Analysis/decision: This macro-step, presented as a black box in this case, includes all the operations needed before giving specific orders to the actuators (i.e., before reconfiguration in CR contexts). Depending on the level of sophistication, this step can deal with metric analysis, performance optimization, scheduling, and learning.
Action: Mainly parameter reconfiguration and waveform transmission. A reconfiguration management architecture needs to be implemented to ensure efficient and quick reconfigurations.
This definition is quite general. It can incorporate simple designs as well as complex ones. Most published articles, however, deal with a restricted problem: spectrum management. In such a context, the term environment takes on more specific definitions, such as the following, to name a few: Environment:
Thus, depending on the considered environment, specific sensors are to be designed [4, 31, 32]. The metrics captured (and/or computed) by the sensors are then processed by the decision making engine. The kind of processing depends heavily on the quality of the metrics (the level of uncertainty on the captured numerical value, for instance) as well as on the global information held by the CR. Finally, the decisions made are translated into appropriate bandwidth occupation and power allocation actions.
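As a purely illustrative sketch, the three steps of the basic cognitive cycle (observation, analysis/decision, action) can be written as a short Python loop. The sensor name interference_dbm, the -80 dBm threshold, and the channel and power values are hypothetical and are not taken from the cited literature; the decision step is deliberately kept as a trivial placeholder.

```python
from typing import Callable, Dict


def observe(sensors: Dict[str, Callable[[], float]]) -> Dict[str, float]:
    """Observation: read every sensor into a dictionary of metrics."""
    return {name: read() for name, read in sensors.items()}


def decide(metrics: Dict[str, float]) -> Dict[str, object]:
    """Analysis/decision: placeholder rule mapping metrics to a configuration.

    The threshold and the configurations are hypothetical values.
    """
    if metrics["interference_dbm"] > -80.0:
        return {"channel": "backup", "tx_power_dbm": 10}
    return {"channel": "primary", "tx_power_dbm": 20}


def act(radio_state: Dict[str, object], configuration: Dict[str, object]) -> None:
    """Action: apply the chosen configuration (here, update a state dictionary)."""
    radio_state.update(configuration)


def cognitive_cycle(sensors, radio_state, iterations: int = 1):
    """Run observe -> decide -> act for a fixed number of iterations."""
    for _ in range(iterations):
        metrics = observe(sensors)
        configuration = decide(metrics)
        act(radio_state, configuration)
    return radio_state


if __name__ == "__main__":
    sensors = {"interference_dbm": lambda: -70.0}  # simulated sensor reading
    print(cognitive_cycle(sensors, {}))
```

In a more sophisticated design, the decide function is where metric analysis, optimization, scheduling, and learning would take place.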
3. Decision making problems for CR
Within the basic cognitive cycle, we focus in this section on the analysis step, and more specifically on learning and decision making. Two main approaches can be found in the literature. On the one hand, some articles focus on implementing smart behavior in radio devices to enable configurations that are better adapted to their environment than those imposed by radio standards. As a matter of fact, standard configurations are usually overdimensioned to meet the requirements of various critical communication scenarios. This approach mainly focuses on a single piece of equipment, ignoring the rest of the network. We refer to the problem related to this first approach as the dynamic configuration adaptation (DCA) problem. On the other hand, due to a more pressing matter, most CR related articles focus on spectrum management. These articles aim at enabling a more efficient use of frequency resources because of their scarcity. This second problem is usually referred to as the dynamic spectrum access (DSA) problem.
3.1. Design space and DCA problem
In this section, we discuss some of the limits related to the idealized CR concept before introducing the so-called DCA problem. Several questions arise when designing a CR engine. We summarize our conceptual approach, presented in article , to dimension the decision making and learning abilities of a cognitive engine. We introduce the notion of design space as a conceptual object that defines a set of CR decision making problems by their constraints rather than by their degrees of freedom. Our analysis identifies three dimensions of constraints: those related to the environment, to the equipment, and to the user.
Ideally speaking, the CR concept, supported by an SDR platform, opens the way to infinite possibilities. With a device that is autonomous and aware of its surrounding environment as well as of its own behavior (and thus of its own abilities), any part of the radio chain could be probed and tested to evaluate its impact on the device's performance. This, however, implies that the equipment is also able, in its reasoning process, to validate its own choices. Namely, it must self-reference its cognition components. Unfortunately, this class of reasoning is well known in the theory of computing to be a potential black hole for computational resources. Specifically, any Turing-capable (TC) computational entity that reasons about itself can enter a Gödel-Turing (note c) loop from which it cannot recover.
To mitigate this paradox, time-limited reasoning has been suggested by Mitola. As a matter of fact, radio systems need to observe, decide, and act within a limited amount of time: The timer and related computationally indivisible control construct is equivalent to the computer-theoretic construct of a step-counting function over "finite minimalization." It has been proved that computations that are limited with reliable watchdog timers can avoid the Gödel-Turing paradox to the reliability of the timer. This proof is a fundamental theorem for practical self-modifying systems.
Realistic CR frameworks need to take into account a large set of possible configurations. However, as illustrated above by the Gödel-Turing paradox, the decision making engine also needs to be constrained in order to prevent the system from crashing. We argue in the rest of this paragraph that, in general, CR decision making problems are better defined by their constraints than by their degrees of freedom.
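The idea of bounding reasoning time can be illustrated with a minimal Python sketch, assuming a decision engine that scans candidate configurations and must answer within a fixed time budget. The function name decide_with_deadline and the toy scoring function are assumptions made for illustration; this is not Mitola's construct, only a simple watchdog-style loop that returns even when fed an unbounded stream of candidates.

```python
import itertools
import time
from typing import Callable, Iterable, Optional, Tuple


def decide_with_deadline(
    candidates: Iterable[object],
    evaluate: Callable[[object], float],
    budget_s: float,
) -> Tuple[Optional[object], float]:
    """Return the best candidate found before the time budget expires.

    The deadline is checked between evaluations, so a single call to
    `evaluate` is assumed to be short compared with the budget.
    """
    deadline = time.monotonic() + budget_s
    best, best_score = None, float("-inf")
    for candidate in candidates:
        if time.monotonic() >= deadline:
            break  # watchdog fired: stop reasoning and keep the best so far
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    # An endless candidate stream: without the deadline this would never stop.
    endless = itertools.count()
    print(decide_with_deadline(endless, lambda x: -abs(x - 3), budget_s=0.05))
```

The sketch only bounds the search loop; a real equipment would also need to bound each individual evaluation, which is why the reliability of the timer matters in the argument above.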
When designing such CR equipment, the main challenge is to find an appropriate way to correctly dimension its cognitive abilities according to its environment as well as to its purpose (i.e., providing a certain service to the user). Several articles in the literature have already addressed this matter; however, their description of the problem usually remained fuzzy (e.g., [6, 14, 34–36]). We summarize their analysis by defining three "constraints" on which the design of CR equipment depends: first, the constraints imposed by the surrounding environment; then, the constraints related to the user's expectations; and finally, the constraints inherent to the equipment. We argue that these constraints help dimension the CR decision making engine. Consequently, an a priori formulation of these elements helps the designer implement the right tools in order to obtain a flexible and adequate CR.
The environment constraints: since a CR is a wireless device that operates in a surrounding communicating environment, it must respect that environment's rules: those imposed by regulation (e.g., allocated frequency bands, tolerated interference, etc.) as well as its physical reality (propagation, multi-path and fading, to name a few) and network conditions (channel load or surrounding users' activities, for instance). Thus, the behavior of CR equipment is strongly conditioned by the constraints imposed by the environment. As a matter of fact, if the environment allows the equipment no degree of freedom, the latter has no choice but to obey and thus loses all cognitive behavior. On the other hand, if no constraints are imposed by the environment, the CR will still be constrained by its own operational abilities and the expectations of the user.
User's expectations: when using a wireless device for a particular application (voice communication, data, streaming and so on), the user expects a certain quality of service. Depending on the expected quality of service, the CR can identify several criteria to optimize, such as minimizing the bit error rate, minimizing energy consumption, maximizing spectral efficiency, etc. If the user is too greedy and imposes too many objectives, the design problem might become intractable because of the constraints imposed by the surrounding environment and the platform of the CR. However, if the user expects nothing, then again there is no need for a flexible CR. Usually, it is assumed that the user is reasonable in the sense that he accepts the best he can get at a minimum cost, as long as the quality of service provided is above a certain level (note d).
Equipment's operational abilities: These limitations are perhaps the most obvious, since one cannot ask the CR equipment to adapt itself beyond what it can perform (sense and/or act). It is usually assumed in the CR literature that the equipment is an ideal software radio and thus has all the flexibility needed for the designed framework. In a real application, the efficiency of CR equipment depends of course on the degrees of freedom (or, equivalently, the constraints) inherent to the wireless platform used to communicate. Commonly analyzed degrees of freedom include modulation, pulse shape, symbol rate, transmit power, and equalization, to name a few. In all cases, a CR is designed to target and support given scenarios. We do not consider that a CR can be designed to answer all scenarios or concepts.
In Figure 4, we represent two sub-spaces referred to as the actual design space and the virtual design space. On the one hand, the virtual design space refers to the upper bound support of the design space, where all three dimensions are considered independently of one another. Its volume can be interpreted as the largest space of decision problems one could define from the three dimensions. On the other hand, the actual design space is included in the virtual design space. It results from the reduction of the design space when taking into account the correlation between the different constraints imposed by every dimension of the design space. For instance, a constraint on the environment such as "imposed fixed waveform" might limit objectives such as "find a waveform that maximizes the spectral efficiency".
To define a specific decision making problem, one needs to introduce a last, possibly implicit, function. It represents a functional relationship between all three dimensions, more specifically the correlation between the different constraints as illustrated by the design space. Thus, it models the interdependence of all three constraints. A simple representation of this interdependence is an explicit objective function whose numerical value is computed as a function of the equipment parameters, the environment's conditions, and the values of other objective functions. Unfortunately, such functions are not always available and might remain implicit. In such scenarios, optimization might prove problematic without appropriate learning tools.
Finally, based on the analysis presented above, all configuration adaptation problems seem to have the same roots. However, to define a specific problem among the set of possibilities in the design space, prior knowledge is important. This notion is further detailed in Section 4, where a classification of decision making tools as a function of prior knowledge is suggested. Nevertheless, the general DCA problem can be described as the most general decision making design space, which we can state as follows:
Within this framework, we assume that the environment constrains the CR by allowing only K possible configurations. This condition characterizes the environment and the equipment. Moreover, we assume that there exist M ≥ 1 objectives that evaluate how well the equipment performs in meeting the user's expectations.
To conclude, we usually observe in the literature that these constraint-based characterizations are made implicitly. As a result, the assumptions introduced to define the decision making framework are, unfortunately, rarely made explicit. These assumptions concern what we refer to as the "a priori model knowledge". In Section 4, we introduce and explain the notion of a priori knowledge and present a brief state of the art on decision making for CR configuration adaptation using the DCA design space. We show that, although the design space is the same, different approaches are suggested by the community to tackle the defined decision making problems, depending on the a priori model knowledge.
The following section briefly describes, for the sake of consistency, an important case of DCA known as DSA.
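To make the two ingredients of this framework concrete (K allowed configurations and M ≥ 1 objectives), the following Python sketch represents a DCA problem as a small data structure. It is an illustrative encoding under stated assumptions, not the formal statement of the problem: the class name DCAProblem, the two example configurations, and the toy throughput and energy objectives are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Configuration = Dict[str, object]
Environment = Dict[str, float]
Objective = Callable[[Configuration, Environment], float]


@dataclass
class DCAProblem:
    configurations: Sequence[Configuration]  # the K allowed configurations
    objectives: Sequence[Objective]          # the M >= 1 objectives

    def __post_init__(self) -> None:
        if len(self.configurations) < 1:
            raise ValueError("at least one configuration is required")
        if len(self.objectives) < 1:
            raise ValueError("at least one objective is required (M >= 1)")

    @property
    def K(self) -> int:
        return len(self.configurations)

    @property
    def M(self) -> int:
        return len(self.objectives)

    def evaluate(self, k: int, environment: Environment) -> List[float]:
        """Return the M objective values of configuration k in this environment."""
        config = self.configurations[k]
        return [objective(config, environment) for objective in self.objectives]


# Hypothetical objectives (higher is better for both).
def throughput(config: Configuration, env: Environment) -> float:
    usable = env["snr_db"] >= float(config["min_snr_db"])
    return float(config["bits_per_symbol"]) if usable else 0.0


def energy(config: Configuration, env: Environment) -> float:
    return -float(config["tx_power_dbm"])


CONFIGS = [
    {"modulation": "QPSK", "bits_per_symbol": 2, "min_snr_db": 10.0, "tx_power_dbm": 10.0},
    {"modulation": "16QAM", "bits_per_symbol": 4, "min_snr_db": 17.0, "tx_power_dbm": 10.0},
]

if __name__ == "__main__":
    problem = DCAProblem(CONFIGS, [throughput, energy])
    print(problem.K, problem.M, problem.evaluate(0, {"snr_db": 12.0}))
```

Here the objectives are given as explicit functions. When they are only implicit, the evaluate call would be replaced by a measurement obtained after actually trying the configuration.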
3.2. Spectrum scarcity and dynamic spectrum access
Since the early 1990s, the radio community has recognized the potential industrial and economic opportunities that could emerge from better usage of frequency resources, as noted in 2004 in article : A trend that has the potential to change the current industrial structure is the emergence of alternative spectrum management regimes, such as the introduction of so called "unlicensed bands", where new technologies can be introduced if they fulfil some very simple and relaxed "spectrum etiquette" rules to avoid excessive interference on existing systems. The most notable initiative in this area is the one of the Federal Communications Commission (FCC, the regulator in USA) in the early 90s driving the development of short range wireless communication systems and wireless local area networks (WLANs).
Opening portions of the spectrum to unlicensed usage was a first step toward introducing alternative frequency management schemes. Rethinking the main regulatory frameworks imposed for decades is the next step. As a matter of fact, during the last century, most of the meaningful spectrum resources were licensed to emerging wireless applications, and the static frequency allocation policy combined with a growing number of spectrum-demanding services led to spectrum scarcity. However, several measurement campaigns conducted first in the United States and then in numerous other countries [8, 23–27] showed a chronic underutilization of the frequency band resources, revealing substantial communication opportunities.
With the advent of SDR technology, it became possible, at least theoretically, to design agile systems capable of switching from one frequency band to another depending on given communication constraints. Thus, during 2002 and 2003, several task forces and researchers suggested new frequency management policies and regulatory frameworks to enable efficient use of the spectrum resource [8, 38–43]. The consequences of this new framework are that the spectrum management model of today is abolished for large parts of the spectrum. Instead, "free" (note e) spectrum trading becomes the preferred mechanism and technical systems that allow for the dynamic use and re-use of spectrum become a necessity.
Dynamic exclusive use model: the spectrum is basically allocated exclusively to specific services or operators. However, the spectrum property rights framework allows opening a secondary market where licensed users can sell and trade portions of their spectrum, whereas the dynamic spectrum allocation framework aims at providing a better allocation of the spectrum to exclusive services by adapting the spectrum allocation to space and time network load information.
Open sharing model (spectrum commons model): aims at generalizing the success encountered by WLAN technologies within the ISM band. In other words, it mainly suggests opening portions of the spectrum to unlicensed users.
Hierarchical access model: this framework introduces a secondary network that aims at exploiting resources left vacant by the incumbent users [usually referred to as primary users (PUs)]. Secondary users (SUs) are able to communicate as long as they do not cause harmful interference to PUs. In this article, we do not subdivide this framework. As a matter of fact, there are as many subsets as there are communication opportunities to exploit: power control, ultra-wide band communication under the PUs' noise level, spectrum hole detection and exploitation, and directional communications, to name a few. In general, it is referred to as opportunistic spectrum access (OSA).
Since the seminal article of Haykin in 2005, the OSA research community has been, to the best of the authors' knowledge, the most active in the field of DSA. With several network models based on game theory, Markov chains, or multi-armed bandits (MAB) (and machine learning in general) [44–50], to name a few, and relying on the concept of CR, the community has tackled several challenges encountered when dealing with OSA, such as (non-exhaustively): dynamic power allocation, optimal band selection (with or without prior knowledge of the occupancy pattern of the spectrum bands by PUs), and cooperation among SUs, centralized or decentralized, with or without observation errors.
In Section 5.2, an OSA scenario based on a MAB model, described in article , is summarized to illustrate the impact of observation errors on decision making for CR. In the following section, however, we introduce prior knowledge as a classification criterion among the main learning and decision making tools suggested in CR articles.
4. Decision making tools for DCA
4.1. Expert approach
The expert approach relies on the substantial amount of knowledge collected by telecommunication engineers and researchers. This knowledge is based on theoretical considerations and on practical measurements of the environment and radio communication parameters. It was first suggested by Mitola in his Ph.D. dissertation on CR. Through intensive off-line simulations, expert systems are provided with a set of inference rules. These rules are then used on-line to adapt the equipment depending on the context it faces. Thus, the more knowledge is available, the better the equipment can adapt itself to its surrounding dynamic environment. However, this knowledge is useful only if the CR can represent it in a way that makes it possible to exploit it and to react to the environment through adequate adaptations of the operating configuration. For that purpose, Mitola suggested representing the knowledge of CR equipment using a new language dedicated to radio communication: the "radio knowledge representation language" (RKRL) [6, 33]. This representation of knowledge uses semantic web languages such as XML (eXtensible Markup Language), RDF (Resource Description Framework), and OWL (Web Ontology Language). The expert knowledge based approach has had great success, especially thanks to the XG (neXt Generation) project supported by DARPA (e.g.,  and for spectrum sharing: ). As a matter of fact, if the knowledge is well represented and provided to the equipment as a set of rules, the decision making process becomes very simple. However, this approach has a few drawbacks:
The behavior of the designed system is not tuned to a particular user but to all users and to a set of probable environments. Moreover in order to acquaint the CR decision making engine with valuable and large knowledge, an important amount of effort is needed from the designer.
Expert knowledge is mainly based on models. Thus the system might behave in a poor way when it is facing unexpected dynamics in the environment.
The techniques based on expert systems can, however, be supported by several other tools (some are discussed later) that help them acquire new knowledge about the environment or avoid conflicts between different configuration adaptation rules. A similar approach, based on an ontology to model the knowledge of the decision making engine, was recently suggested [55–58]: a common language for radio devices is defined based on an ontology, expressed in OWL, and implemented on the USRP card using GNU Radio.
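The on-line use of expert rules described in this section can be sketched in a few lines of Python. This is only an assumed, minimal rule engine: the rule names, thresholds, and configurations are hypothetical, the rules are written directly as Python functions rather than in RKRL or OWL, and conflicts between rules are resolved with a simple priority field, which is one possible convention among many.

```python
from typing import Callable, Dict, List, NamedTuple

Metrics = Dict[str, float]
Configuration = Dict[str, object]


class Rule(NamedTuple):
    name: str
    priority: int                        # higher priority wins on conflict
    condition: Callable[[Metrics], bool]
    action: Configuration


def infer(rules: List[Rule], metrics: Metrics, default: Configuration) -> Configuration:
    """Return the action of the highest-priority rule whose condition holds."""
    matching = [rule for rule in rules if rule.condition(metrics)]
    if not matching:
        return dict(default)  # no expert rule covers this context
    best = max(matching, key=lambda rule: rule.priority)
    return dict(best.action)


# Hypothetical rule base, as it could be produced by an off-line study.
RULES = [
    Rule("low_battery", 2, lambda m: m["battery"] < 0.2,
         {"modulation": "QPSK", "tx_power_dbm": 5}),
    Rule("good_channel", 1, lambda m: m["snr_db"] > 18.0,
         {"modulation": "16QAM", "tx_power_dbm": 15}),
]
DEFAULT = {"modulation": "BPSK", "tx_power_dbm": 15}

if __name__ == "__main__":
    print(infer(RULES, {"battery": 0.1, "snr_db": 20.0}, DEFAULT))
```

The sketch also makes the drawbacks listed above visible: the engine can only react to contexts that a designer anticipated when writing the rule base, and it falls back to a default configuration otherwise.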
4.2. Exploration based decision making
There exists no universal definition of optimality in this case. Thus, the solutions to this problem are satisfactory (or not) with respect to a certain function, usually named fitness, that evaluates how well the criteria are satisfied.
Thus usually a large space of possible "good" configurations can be available.
The criteria are correlated and can be in conflict (e.g., Figure 7).
If we assume that the previously mentioned off-line expert rule extraction phase has not been accomplished (or has only partially been accomplished), an exploration of the space of possible configurations is needed.
There exist various algorithms to explore a large set of potential candidates. The most obvious one is probably "exhaustive search", where all possible candidates are computed and evaluated in order to find the best solution. However, when the number of candidates grows large, such approaches can become computationally burdensome and miss the imposed decision making deadlines. In such contexts, heuristics are usually preferred. In the context of CR, finding the best solution might not be necessary. Instead, the cognitive engine would rather find a satisfactory solution within the limited amount of time imposed.
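A minimal Python sketch of exhaustive search is given below, assuming that an explicit fitness function is available. The parameter grid and the toy fitness (spectral efficiency minus a power penalty) are hypothetical; the point is only to show that the number of candidates is the product of the sizes of all parameter ranges, which is why the approach quickly becomes burdensome.

```python
import itertools
from typing import Callable, Dict, List, Optional, Tuple

Configuration = Dict[str, float]


def exhaustive_search(
    parameters: Dict[str, List[float]],
    fitness: Callable[[Configuration], float],
) -> Tuple[Optional[Configuration], float, int]:
    """Evaluate every combination of parameter values and keep the best one."""
    names = list(parameters)
    best_config, best_fitness, evaluated = None, float("-inf"), 0
    for values in itertools.product(*(parameters[name] for name in names)):
        config = dict(zip(names, values))
        score = fitness(config)
        evaluated += 1
        if score > best_fitness:
            best_config, best_fitness = config, score
    return best_config, best_fitness, evaluated


# Hypothetical parameter grid: 4 x 3 x 2 = 24 candidate configurations.
PARAMETERS = {
    "bits_per_symbol": [1, 2, 4, 6],
    "tx_power_dbm": [0, 10, 20],
    "coding_rate": [0.5, 0.75],
}


def toy_fitness(config: Configuration) -> float:
    """Toy fitness: reward spectral efficiency, penalize transmit power."""
    efficiency = config["bits_per_symbol"] * config["coding_rate"]
    return efficiency - 0.1 * config["tx_power_dbm"]


if __name__ == "__main__":
    print(exhaustive_search(PARAMETERS, toy_fitness))
```

With only three parameters the grid has 24 points; adding a few more parameters with a dozen values each would already make a full scan incompatible with tight decision deadlines.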
Consequently, if the following criteria are met:
Available a priori knowledge on the complex relationships existing between the metrics observed, the parameters to adapt, and the criteria to satisfy.
Possible heavy parallel computing.
Then a large set of decision making tools becomes possible, such as simulated annealing, genetic algorithms (GAs), and swarm algorithms, to name a few. Notice that such approaches were applied to radio technologies well before CR. In 1993, article  already suggested simulated annealing as a possible solution to channel assignment for cellular networks (note g).
This CR decision making framework was first analyzed by Rieser and Rondeau. They suggested the use of GAs to tackle it [14, 34, 61]. GAs were first designed to mimic Darwin's evolutionary theory and are well known for their capacity to adapt themselves to a changing environment. Without using our formalism, their study showed that under what we define as the design space, and with the described a priori knowledge, GAs provide cognitive radios with an efficient and flexible decision making engine. However, their model cannot be considered general enough to cover all CR use cases, so other solutions have to be considered as well. Further details on the different versions suggested and implemented by Virginia Tech can be found in a recent survey (note i).
Notice that, once again, prior knowledge can substantially enhance the behavior of these algorithms. An interesting illustration can be found in article  in the case of GA-based decision making engines.
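For illustration, a very small genetic algorithm over a discrete configuration space could look as follows in Python. This is a generic textbook-style sketch (tournament selection, one-point crossover, mutation, elitism) and does not reproduce the engines of the cited works; the gene table, the toy fitness, and all numerical settings are assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical genes: bits per symbol, transmit power (dBm), coding rate.
GENES = [[1, 2, 4, 6], [0, 10, 20], [0.5, 0.75]]


def fitness(chromosome: Sequence[int]) -> float:
    """Toy fitness: spectral efficiency minus a transmit power penalty."""
    bits, power, rate = (GENES[i][g] for i, g in enumerate(chromosome))
    return bits * rate - 0.1 * power


def genetic_search(
    fitness: Callable[[Sequence[int]], float],
    genes: List[list],
    population_size: int = 8,
    generations: int = 20,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Tuple[List[int], List[float]]:
    """Return the best chromosome found and the best fitness per generation."""
    rng = random.Random(seed)

    def random_chromosome() -> List[int]:
        return [rng.randrange(len(values)) for values in genes]

    def tournament(population: List[List[int]]) -> List[int]:
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [random_chromosome() for _ in range(population_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = [list(elite)]  # elitism: the best individual always survives
        while len(children) < population_size:
            mother, father = tournament(population), tournament(population)
            cut = rng.randrange(1, len(genes))  # one-point crossover
            child = mother[:cut] + father[cut:]
            for i, values in enumerate(genes):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(values))  # random mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = genetic_search(fitness, GENES)
    print(best, history[-1])
```

Because of elitism, the best fitness never decreases from one generation to the next, so the search can be stopped at any deadline and still return the best configuration seen so far.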
4.3. Learning approaches: exploration and exploitation
As we argued in the previous sections and as several other authors [36, 68] noticed, "Many CR proposals, such as[61, 69, 70], rely on a priori characterization of these performance metrics which are often derived from analytical models. Unfortunately, [...], this approach is not always practical due to e.g., limiting modeling assumption, non-ideal behaviors in real-life scenarios, and poor scalability" . To avoid these limitations and in order to tackle more realistic scenarios, many methods based on learning techniques were suggested: artificial neuronal networks (ANN), evolving connectionist systems (ECS) [71, 72], statistical learning , regression models and so on. All of these approaches have their cons and pros, however they all have in common that they mainly rely on trials conducted within a real environment to try and infer from it decision making rules for CR equipments. Since this learning tools aim at representing the functional relationship between the environment (through the sensed metrics), the systems parameters and the criteria to satisfy, they need a direct interaction with the environment in order to build a posteriori knowledge on their environment. In this study we sub-classify these methods depending on the way they learn and exploit their rules. On the one hand (i), we find a set of techniques that separates exploration and exploitation phases. On the other hand (ii), we find other techniques more flexible that combine both processes.
In the first case (i), we find several tools, such as ANN or statistical learning, that are already used in other domains requiring some cognitive abilities (robotics, video games, etc.). These methods have two phases: a phase of pure "exploration", during which the CR decision making engine learns and infers (explicitly or implicitly) decision making rules, followed by a second phase in which it uses this a posteriori knowledge to make decisions. Since these techniques rely on an initial learning phase, a large amount of data and computational power is needed to extract reliable knowledge. This difficulty is already known for ANN, for instance, and it also holds for statistical learning. As noticed by Weingart in article , the provided techniques are still computationally prohibitive and not yet ready to be used in real equipment. However, if the first phase is carried out well, the second phase is usually very simple and does not require much time or energy . In the second case (ii), we find promising techniques that were recently introduced to the community and that still need further investigation [17, 36] in the case of configuration adaptation.[j] These techniques try to provide the CR with a flexible and incremental learning decision making engine. In the case of an ECS-based decision making engine, Colson suggested the use of an evolving neural network [71, 72]. Unlike the usual ANN, the ECS-NN can change its structure without "forgetting" already learned knowledge. New rules can thus be learned by adding new neurons to the neural structure. To be efficient, the architecture proposed in  needs some expert advice (a priori knowledge) on the available configurations. This added information ranks the different configurations according to some criteria (robustness, spectral efficiency, etc.), but without knowing a priori which one is the most adequate when facing a given environment.
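The two-phase scheme of case (i) can be sketched as follows. This is only a minimal illustration, assuming that each configuration succeeds or fails with a fixed but unknown probability, which is a simplification that is not taken from the cited works. The function name `explore_then_exploit` and its parameters are hypothetical.

```python
import random


def explore_then_exploit(success_probs, explore_per_config, horizon, seed=0):
    """Two-phase decision making: pure exploration, then pure exploitation.

    success_probs      -- true success probability of each configuration
                          (unknown to the decision engine, used only to simulate)
    explore_per_config -- number of trials spent on each configuration in phase 1
    horizon            -- total number of trials available
    """
    num_configs = len(success_probs)
    if horizon < num_configs * explore_per_config:
        raise ValueError("horizon too short for the exploration phase")
    rng = random.Random(seed)

    # Phase 1: exploration. Every configuration is tried equally often and
    # its empirical success rate is recorded (the a posteriori knowledge).
    estimates = []
    for k in range(num_configs):
        wins = sum(rng.random() < success_probs[k] for _ in range(explore_per_config))
        estimates.append(wins / explore_per_config)

    # Phase 2: exploitation. The configuration with the best estimate is
    # used for all remaining trials, with no further learning.
    best = max(range(num_configs), key=estimates.__getitem__)
    remaining = horizon - num_configs * explore_per_config
    reward = sum(rng.random() < success_probs[best] for _ in range(remaining))
    return best, estimates, reward
```

The sketch makes the cost of this family of methods visible: the quality of the second phase depends entirely on how many trials are spent in the first one, and the engine never revises its choice if the environment changes afterwards.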
More recently, article  assumes, in contrast, that no a priori knowledge is provided and that the performance of the equipment can only be estimated by trying a specific configuration. The associated tools are based on the so-called MAB framework. One advantage here is that learning takes place while operating, even if the cognitive engine is facing a completely new environment. Of course, performance increases as the learning process progresses. Note that this approach has also proven accurate in the OSA context .
To conclude this section, we would like to emphasize that the classification proposed in this article shows that a CR equipment cannot depend on a single core decision making tool, but rather on a pool of techniques. Every time it faces an environment, the equipment needs an estimate of its a priori knowledge and of the reliability of that knowledge. To tackle a particular context, the general process can be summarized through three questions: What can't I do (design space)? What do I already know (a priori knowledge)? And what technique should I select to solve the decision making problem?
In the following section, we extend the analysis to the specific and practical context of imperfect sensing. As a matter of fact, sensing errors can have a significant impact on decision making techniques. Unfortunately, very few studies seem to tackle this specific problem within CR contexts. Hence, we discuss this matter further hereafter.
5. Decision making in the context of sensing errors
As illustrated through the notion of the basic cognitive cycle, decision making and learning rely on prior observations of the environment. Consequently, the performance of the implemented decision making tools depends heavily on the quality of these observations. Unfortunately, we could not find substantial quantitative material evaluating the impact of sensing errors on decision making and learning tools. We therefore discuss qualitatively,[k] in this section, the impact of sensing errors on the previously discussed decision making tools for CR. To illustrate this discussion, we rely on a specific problem borrowed from the OSA[l] community, in which the problem of decision making in the context of sensing errors is clearly formalized and the impact of such errors on the performance of the considered learning algorithm is quantified.
5.1. An example of learning approach
Opportunistic spectrum access is a particularly interesting framework that illustrates the challenge faced when learning under uncertainty. When tackling the general DCA problem described above with K channels to probe, the problem of maximizing the user's cumulated throughput over the number of transmission trials is consistent with the MAB paradigm [74, 75]. In a nutshell, by analogy with the one-armed bandit (also known as a slot machine), the MAB models a gambler who sequentially pulls one of several levers on a gambling machine. Every time a lever[m] is pulled, it provides the gambler with a random income, usually referred to as a reward. Although the gambler is assumed to have no a priori information on the rewards' stochastic distributions, he aims at maximizing his cumulated income through iterative pulls. In the OSA framework, the SU is modeled as the gambler, while the frequency bands represent the levers. At each trial, the gambler faces a trade-off between pulling the lever with the highest estimated payoff (exploitation) and pulling another lever to acquire information about its expected payoff (exploration). This trade-off is usually referred to as the exploration-exploitation dilemma. If the problem is modeled within the MAB framework, an interesting way to tackle it is to use the class of so-called upper confidence bound (UCB) algorithms[n] [17, 47, 48, 50, 76]. The main advantage of UCB methods for CR is that they balance exploration and exploitation without interrupting the communication process, i.e., while providing a certain service to the user . Namely, a CR based on UCB can jointly communicate and learn. It thus avoids the instantiation of two separate steps: a learning step during which the user has to wait, and a communication step whose quality depends on how well the first step performed.
It is worth noting that the illustration suggested in this article is based on the so-called UCB1 algorithm. It has been selected for its rather low computational complexity compared with other techniques in the literature.
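To make the index rule concrete, the following Python sketch implements UCB1 in its standard textbook form, where the index of a band is its empirical mean reward plus an exploration bonus sqrt(2 ln t / n_k), with t the current round and n_k the number of times band k has been tried. It assumes Bernoulli rewards (1 if the band is free, 0 otherwise) and perfect sensing. The function names `ucb1_select` and `run_ucb1`, as well as the simulation loop, are illustrative assumptions and not the implementation used in the cited works.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the index of the band to try at round t (t starts at 1)."""
    # Initialization: every band is tried once before the index is used.
    for k, n in enumerate(counts):
        if n == 0:
            return k
    # UCB1 index: empirical mean plus an exploration bonus that shrinks
    # as a band is tried more often and grows slowly with time.
    indices = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(indices)), key=indices.__getitem__)


def run_ucb1(free_probs, horizon, seed=0):
    """Simulate a secondary user choosing among bands with UCB1.

    free_probs -- probability that each band is free (unknown to the learner,
                  used only to draw the simulated rewards)
    """
    rng = random.Random(seed)
    num_bands = len(free_probs)
    counts = [0] * num_bands   # number of times each band was tried
    means = [0.0] * num_bands  # empirical availability of each band
    total_reward = 0
    for t in range(1, horizon + 1):
        k = ucb1_select(counts, means, t)
        reward = 1 if rng.random() < free_probs[k] else 0
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]  # incremental mean update
        total_reward += reward
    return counts, means, total_reward
```

For instance, `run_ucb1([0.1, 0.9], horizon=2000)` returns the number of trials spent on each band, the estimated availabilities and the number of successful transmissions. Every trial is both a transmission attempt and a learning step, which is the property emphasized above.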
For illustration purposes, we use the following decision model for OSA: a SU has the choice between ten frequency bands, each one used by PUs with a different probability that is usually unknown to the CR decision making engine. A complete model is provided in . Only one band can be sensed and tried at each iteration, in order to keep the system's complexity reasonable. Consequently, the cognitive engine only has partial information on the environment at each iteration and must derive the probability of availability of the bands from its previous trials. It computes a confidence bound for every band and selects, for the next iteration, the band most likely to be free. Communication can be performed if the band is detected as free; otherwise, the SU backs off. However, the SU can make errors due to the imperfect accuracy of its sensing detector. More specifically, the detector might report the presence of a PU while the band is in fact free, and vice versa. As a consequence, the SU either does not transmit during an iteration in which it could have, or transmits when it should not, causing interference to the incumbent users. We usually speak of a false alarm in the former case and of a missed detection in the latter.
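The observation model described above can be sketched in Python as follows. This is a hedged illustration only: the false alarm probability `p_false_alarm` and the missed detection probability `p_missed_detection` are assumed to be fixed and independent of the band, and the function names and outcome labels are hypothetical and do not come from the cited model.

```python
import random


def sense(band_is_free, p_false_alarm, p_missed_detection, rng):
    """Return True if the detector reports the band as free."""
    if band_is_free:
        # False alarm: a free band is reported as occupied.
        return rng.random() >= p_false_alarm
    # Missed detection: an occupied band is reported as free.
    return rng.random() < p_missed_detection


def classify(band_is_free, detected_free):
    """Name the outcome of one iteration from the SU's point of view."""
    if detected_free and band_is_free:
        return "transmit_ok"
    if detected_free and not band_is_free:
        return "interference"        # consequence of a missed detection
    if band_is_free:
        return "missed_opportunity"  # consequence of a false alarm
    return "correct_backoff"


def simulate(p_free, p_false_alarm, p_missed_detection, trials, seed=0):
    """Count the four possible outcomes over repeated trials on one band."""
    rng = random.Random(seed)
    tally = {
        "transmit_ok": 0,
        "interference": 0,
        "missed_opportunity": 0,
        "correct_backoff": 0,
    }
    for _ in range(trials):
        band_is_free = rng.random() < p_free
        detected_free = sense(band_is_free, p_false_alarm, p_missed_detection, rng)
        tally[classify(band_is_free, detected_free)] += 1
    return tally
```

Under these assumptions, the fraction of iterations in which the band is reported free tends to p_free (1 - p_false_alarm) + (1 - p_free) p_missed_detection rather than to p_free itself, so a learner that only sees the detector output estimates a distorted availability.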
5.2. The impact of observation error and uncertainty on decision making
Analyzing the impact of uncertainty and sensing errors on the performance of a CR decision making engine is very difficult. However, given the importance of this problem to the community, we offer, as a closing point of this article, a brief and intuitive insight into this matter. Within this framework, we consider that the sensing information captured from the environment may contain errors. We then describe the potential consequences of such errors on the performance of the classes of algorithms classified previously.
Taking into account the uncertainty of environment sensing, we may assert that learning-oriented techniques are more efficient. This is emphasized by the proposed classification, which is based on the criterion of a priori knowledge about the environment. Hence, we believe that such approaches should be particularly addressed by the CR community in the second decade of the CR decision making era.
In this article, we tackled decision making as a mono-equipment problem. In a multi-equipment context, a higher level of decision (rule, policy, etc.) should specify whether and how equipments cooperate. This is out of the scope of this article. However, at the level of each equipment, decision making comes back to what has been stated in this article.
In this article, we presented a brief yet original retrospective view of the first 10 years of CR, and more specifically of the different challenges faced by the CR decision making community and the solutions suggested to answer them. We state that most of these decision making models share the same design space but differ in the a priori knowledge they assume to be available. Consequently, we suggested "a priori knowledge" as a classification criterion to discriminate between the main techniques proposed in the literature to solve configuration adaptation decision making problems. Moreover, as a qualitative and prospective analysis, we depicted through a toy example the impact of observation errors and uncertainty on a CR decision making engine.
We believe that this analysis of the first 10 years of exploration of decision making for CR may help gain perspective on the topic and thus help address this research domain over the next 10 years.
[a] Both US and European military have been working on such flexible and inter-operable defense systems since the late 1970s.
[b] It is called full CR to oppose it to other simplified versions suggested in the literature .
[c] A specific example of such paradox can be illustrated by the following sentence: 'This sentence is false!'  as suggested by Mitola during a recent seminar at Supélec, http://www.rennes.supelec.fr/ren/rd/scee/seminaire.html.
[d] Notice that this assumption introduces the notion of satisfactory behavior. We oppose it to rational thinking, where the decision making engine always aims at the most rewarding option. Thus, when the decision making engine needs to learn in an uncertain environment, satisfaction-based reasoning can be introduced to accelerate the convergence rate of learning algorithms, for instance.
[e] [...] "Trade, lease, and rent of licenses were possible without incurring excessive administrative procedures and overhead costs" .
[f] A different, more detailed and more exhaustive DSA taxonomy can be found in article .
[g] It is indeed a very restrictive case of DCA and DSA, where a centralized entity, seen as the cognitive agent (CA), assigns frequency channels to its users depending on the channel conditions.
[h] To the best of the authors' knowledge, swarm algorithms have only been exploited in the case of resource allocation. No complex configuration adaptation decision making engine based on such techniques was found in the literature.
[i] This document is presented as a survey of the various suggested decision making architectures for CR. We notice, however, that except for the one designed by Mitola during the DARPA xG Program and those designed and implemented by Virginia Tech, the community around this topic seems thin and advances slowly toward efficient architectures. Other suggested architectures, relying mostly on bio-inspired techniques, tackle problems related to spectrum resource allocation.
[j] These same techniques, based on a MAB model, prove to be efficient in tackling some DSA related problems, as already discussed in Section 3.2.
[k] To the best of the authors' knowledge, such studies have only been conducted when dealing with OSA problems. Consequently, the presented results are exploratory and need further investigation to be fully confirmed. We find, however, the overall discussion interesting for capturing CR related decision making challenges.
[l] As mentioned earlier, we consider in this article that OSA problems are but specific instantiations of DCA problems.
[m] From a DCA problem perspective, a lever is a specific configuration to be tested. Thus, in OSA, it refers to a band to probe, for instance.
[n] UCB algorithms are given here as an example of a learning-oriented approach, but the philosophy and conclusions of this section would match other learning techniques.
- Mitola J III: The software radio architecture. IEEE Commun Mag 1995, 33: 26-38.
- Gul ST: Optimization of multi-standards software defined radio equipments: a common operators approach. In PhD Thesis. University of Rennes 1, Rennes, France; 2009.
- Forum SDR: Sdrf cognitive radio definitions. 2007. [http://data.memberclicks.com/site/sdf/SDRF-06-R-0011-V1_0_0.pdf]
- Palicot J: Radio Engineering: From Software Radio to Cognitive Radio. Wiley, UK; 2011.
- Mitola J, Maguire GQ: Cognitive radio: making software radios more personal. Pers Commun IEEE 1999, 6: 13-18. 10.1109/98.788210
- Mitola J: Cognitive radio: An integrated agent architecture for software defined radio. PhD Thesis, Royal Inst of Technology (KTH) 2000.
- Jouini W, Moy C, Palicot J: On decision making for dynamic configuration adaptation problem in cognitive radio equipments: a multi-armed bandit based approach. In 6th Karlsruhe Workshop on Software Radios, WSR'10, pp. 21-30. Karlsruhe, Germany; 2010.
- Federal Communications Commission, Spectrum policy task force report [http://www.fcc.gov/sptf/files/SEWGFinalReport_1.pdf]
- Facilitating opportunities for flexible, efficient and reliable spectrum use employing cognitive radio technologies. In Federal Communications Commission Spectrum Policy Task Force, FCC. Washington, DC; 2005:03-108.
- Haykin S: Cognitive radio: brain-empowered wireless communications. IEEE J Sel Areas Commun 2005, 23(2):201-220.
- Yucek T, Arslan H: A survey of spectrum sensing algorithms for cognitive radio applications. IEEE Commun Surv Tutor 2009, 11(1):116-130.
- Akyildiz IF, Lo BF, Balakrishnan R: Cooperative spectrum sensing in cognitive radio networks: a survey. Phys Commun 2011, 4(1):40-62. 10.1016/j.phycom.2010.12.003
- Wang B, Wu Y, Liu KJR: Game theory for cognitive radio networks: an overview. Comput Netw 2010, 54: 2537-2561. 10.1016/j.comnet.2010.04.004
- Rieser CJ: Biologically Inspired Cognitive Radio Engine Model Utilizing Distributed Genetic Algorithms for Secure and Robust Wireless Communications and Networking. PhD thesis. Virginia Tech; 2004.
- Jondral FK: Software-defined radio - basics and evolution to cognitive radio. EURASIP J Wirel Commun Netw 2005, 2005(3):9.
- International Telecommunication Union, Definitions of software defined radio (sdr) and cognitive radio system (crs). Report ITU-R SM.2152, SM Series, Spectrum management 2009.
- Jouini W, Ernst D, Moy C, Palicot J: Multi-armed bandit based policies for cognitive radio's decision making issues. In Proceedings of the 3rd international conference on Signals, Circuits and Systems (SCS). Djerba, Tunisia; 2009:1-6.
- Moy C: High-level design approach for the specification of cognitive radio equipments management APIs. Journal of Network and System Management. Spec Issue Manage Funct Cogn Wirel Netw Syst 2010, 18(1):64-96.
- Gibson M: Tv white space geolocation database. In Workshop, IEEE 802 Plenary meeting. San Diego; 2010.
- Karimi HR, Ofcom: Geolocation databases for white space devices in the uhf tv bands: Specification of maximum permitted emission levels. In IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). Aachen, Germany; 2011:231-241.
- Motorola, Tv white space position paper. Fixed tv white space solutions for wireless isp network operators [http://www.techrepublic.com/whitepapers/tv-white-space-position-paper-fixed-tv-white-space-solutions-for-wireless-isp-network-operators/1115419]
- Office of Communication (Ofcom), Implementing geolocation [http://stakeholders.ofcom.org.uk/binaries/consultations/geolocation/statement/statement.pdf]
- McHenry MA, et al.: Spectrum occupancy measurements, Technical report, Shared Spectrum Company, Jan 2004-Aug 2005. [http://www.sharedspectrum.com]
- Chiang RIC, Rowe GB, Sowerby KW: A quantitative analysis of spectral occupancy measurements for cognitive radio. In Proceedings of the IEEE 65th Vehicular Technology Conference (VTC 2007 Spring). Singapore; 2007:3016-3020.
- Wellens M, Wu J, Mähönen P: Evaluation of spectrum occupancy in indoor and outdoor scenario in the context of cognitive radio. In Proceedings of the Second International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CrownCom 2007). Netherlands; 2007:1-8.
- Islam MH, et al.: Spectrum survey in Singapore: Occupancy measurements and analyses. In Proceedings of the 3rd International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CrownCom 2008). Singapore; 2008:1-7.
- López-Benítez M, Casadevall F, Umbert A, Pérez-Romero J, Palicot J, Moy C, Hachemani R: Spectral occupation measurements and blind standard recognition sensor for cognitive radio networks. In Proceedings of the 4th International Conference on Cognitive Radio Oriented Wireless Networks and Communications (CrownCom 2009). Hannover; 2009:1-9.
- Sonnenschein A, Fishman PM: Radiometric detection of spread-spectrum signals in noise of uncertain power. IEEE Trans Aerosp Electron Syst 1992, 28: 654-660. 10.1109/7.256287
- Tandra R, Sahai A: SNR walls for signal detection. IEEE J Sel Top Signal Process 2008, 2(1):4-17.
- Jouini W: Energy detection limits under log-normal approximated noise uncertainty. IEEE Signal Process Lett 2011, 18(7):423-426.
- Hachemani R, Palicot J, Moy C: The "sensorial radio bubble" for cognitive radio terminals. In Proceeding in URSI, The XXIX General Assembly of the International Union of Radio Science. Chicago, USA; 2008:351-368.
- Palicot J, Moy C, Hachemani R: Multilayer sensors for the sensorial radio bubble. Phys Commun 2009, 2: 151-165. 10.1016/j.phycom.2009.03.003
- Mitola J: Cognitive Radio Architecture - The Engineering Foundations of Radio Xml. Wiley-Blackwell, New York; 2006.
- Rondeau TW: Application of Artificial Intelligence to Wireless Communications. PhD thesis, Virginia Tech; 2006.
- QinetiQ, Ofcom: Cognitive radio technology a study for ofcom, 1.2007. [http://stakeholders.ofcom.org.uk/market-data-research/technology-research/research/emerging-tech/cograd/]
- Colson N, Kountouris A, Wautier A, Husson L: Cognitive decision making process supervising the radio dynamic reconfiguration. In Proceedings of Cognitive Radio Oriented Wireless Networks and Communications. Singapore; 2008:1-7.
- Berggren F, Queseth O, Zander J, Asp B, Jönsson C, Stenumgaard P, Kviselius NZ, Thorngren B, Landmark U, Wessel J: Dynamic spectrum access. 2004. [http://www.wireless.kth.se/projects/DSA/DSA_report_phase1.pdf]
- FCC: Spectrum policy task force, report of the spectrum efficiency working group. 2002.
- Reed DP: How wireless networks scale: the illusion of spectrum scarcity. 2002. [http://www.its.bldrdoc.gov/meetings/art/art02/slides02/speakers02.html]
- Kolodzy P: Spectrum policy, technology leading to new directions? National Spectrum Managers Association 2002.
- Reed DP: Bits aren't bites: Constructing a "communications ether" that can grow and adapt. 2003.
- FCC: Promoting efficient use of spectrum through elimination of barriers to the development of secondary markets. Report WT Docket No. 00-230. 2003.
- Kolodzy P: Spectrum policy task force, findings and recommendations. In International Symposium on Advanced Radio Technologies (ISART 2003). Boulder, Colo, USA; 2003:168-239.
- Zhao Q, Sadler BM: A survey of dynamic spectrum access: signal processing, networking, and regulatory policy. IEEE Signal Process Mag 2007, 24(3):79-89.
- Zhao Q, Tong L, Swami A: Decentralized cognitive mac for dynamic spectrum access. In IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). Baltimore, MD; 2005:601-606.
- Liu K, Zhao Q: Channel probing for opportunistic access with multi-channel sensing. In IEEE Asilomar Conference on Signals, Systems, and Computers. Pacific Grove, CA; 2008:516-532.
- Jouini W, Ernst D, Moy C, Palicot J: Upper confidence bound based decision making strategies and dynamic spectrum access. In Proceedings of the 2010 IEEE International Conference on Communications (ICC). Cape Town, South Africa; 2010:1-5.
- Jouini W, Moy C, Palicot J: Upper confidence bound algorithm for opportunistic spectrum access with sensing errors. In 6th International ICST Conference on Cognitive Radio Oriented Wireless Networks and Communications. Osaka, Japan; 2011:17.
- Anandkumar A, Michael N, Tang A: Opportunistic spectrum access with multiple users: Learning under competition. In Proc of IEEE INFOCOM. San Diego, USA; 2010:803-811.
- Liu K, Zhao Q, Krishnamachari B: Distributed learning under imperfect sensing in cognitive radio networks. In ASILOMAR. Asilomar Conference Grounds Pacific Grove, CA, USA; 2010:671-675.
- Moy C: Bio-inspired cognitive phones based on human nervous system. In 3rd International Symposium on Applied Sciences in Biomedical and Communication Technologies (ISABEL). Rome; 2010:1-5.
- Di W, Feng W, Shengyao Y: Cognitive radio decision engine based on priori knowledge. In 3rd International Symposium on Parallel Architectures, Algorithms and Programming. China; 2011:225-259.
- DARPA XG Working Group: The XG vision. request for comments. BBN Technologies Cambridge MA, USA; 2004, 1-17. Tech. Rep. Version 2.0, [http://www.ir.bbn.com/_ramanath/pdf/rfcvision.pdf]
- Berlemann L, Mangold S, Walke BH: Policy-based reasoning for spectrum sharing in radio networks. In Proceedings of IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks (DySPAN). Baltimore, MD, USA; 2005:1-10.
- Fette B, Kokar MM, Cummings M: Next-generation design issues in communications. Portable Design Magazine 2008, 3: 20-24.
- Li S, Kokar MM, Brady D: Developing an ontology for the cognitive radio: issues and decisions. PSDR Forum Technical Conference 2009.
- Li S, Kokar MM, Brady D, Moskal J: Collaborative adaptation of cognitive radio parameters using ontology and policy approach. Software Defined Radio Technical Conference, SDR' 10. SDRF 2010.
- WIF Forum MLM Working Group: Description of cognitive radio ontology v.1.0. 2010.
- Blossom E: Exploring GNU Radio. [http://www.gnu.org/software/gnuradio/doc/exploring-gnuradio.html]
- Rondeau TW, Maldonado D, Scaperoth D, Bostian CW: Cognitive radio formulation and implementation. In IEEE Proceedings CROWNCOM. Mykonos, Greece; 2006:1-10.
- Russell SJ, Norvig P: Artificial Intelligence: A Modern Approach. 2nd edition. Prentice Hall, Upper Saddle River, New Jersey; 2003.
- Duque-Anton M, Kunz D, Ruber B: Channel assignment for cellular radio using simulated annealing. IEEE Trans Veh Technol 1993, 42(1):14-21. 10.1109/25.192382
- Chen T, Zhang H, Zhou Z: Swarm intelligence based dynamic control channel assignment in cogmesh. In Proc the IEEE ICC 2008 (IEEE CoCoNet'08 Workshop). Beijing, China; 2008:168-178.
- Di Lorenzo P, Barbarossa S: Distributed resource allocation in cognitive radio systems based on social foraging swarms. In The 11th IEEE International Workshop on signal processing advances in wireless communications. Marrakech, Morocco; 2010:1-5.
- Atakan B, Akan OB: Biologically-inspired spectrum sharing in cognitive radio networks. In Proc IEEE Wireless Communications and Networking Conference, WCNC. Hong Kong, China; 2007:43-48.
- Amanna A, Reed JH: Survey of cognitive radio architectures. In Proceedings of IEEE SoutheastCon. Charlotte, NC, USA; 2010:292-297.
- Baldo N, Zorzi M: Learning and adaptation in cognitive radios using neural networks. 5th IEEE Consumer Communications and Networking Conference, CCNC 2008, 998-1003.
- Baldo N, Zorzi M: Fuzzy logic for cross-layer optimization in cognitive radio networks. In IEEE Consumer Communications and Networking Conference. Las Vegas, Nevada, USA; 2007:1074-1079.
- Clancy Charles, Hecker Joe, Stuntebeck Erich: Applications of machine learning to cognitive radio networks. IEEE Wirel Commun Mag 2007, 14(4):47-52.
- Kasabov N, ECOS: Evolving connectionist systems and the eco learning paradigm. In International Conference on Neural Information Processing. Kitakyushu, Japan; 1998:1222-1235.
- Kasabov N: Evolving Connectionist Systems: The Knowledge Engineering Approach. 2nd edition. Springer, New York; 2007.
- Weingart T, Sicker D, Grunwald D: A statistical method for reconfiguration of cognitive radios. IEEE Wirel Commun Mag 2007, 14(4):34-40.
- Lai TL, Robbins H: Asymptotically efficient adaptive allocation rules. Adv Appl Math 1985, 6: 4-22. 10.1016/0196-8858(85)90002-8
- Agrawal R: Sample mean based index policies with O(log(n)) regret for the multi-armed bandit problem. Adv Appl Prob 1995, 27: 1054-1078. 10.2307/1427934
- Liu K, Zhao Q: Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players. In Proc of ICASSP. San Diego; 2010:1-10.
- Auer P, Cesa-Bianchi N, Fischer P: Finite time analysis of multi-armed bandit problems. Mach learn 2002, 47(2/3):235-256. 10.1023/A:1013689704352
- Audibert J-Y, Munos R, Szepesvári C: Tuning bandit algorithms in stochastic environments. In Proceedings of the 18th international conference on Algorithmic Learning Theory. Sendai, Japan; 2007:863-889.
- Mitola J: The future of cognitive radio. 2011. [http://www.rennes.supelec.fr/ren/rd/scee/ftp/seminaire/seminaire_mitola_12mai2011.pdf]
- Buddhikot MM: Understanding dynamic spectrum access: Models, taxonomy and challenges. In IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). Dublin, Ireland; 2007:462-471.
- Jouini W: Seminar on learning for opportunistic spectrum access: a multi-armed bandit framework. 2011. [http://www.rennes.supelec.fr/ren/rd/scee/seminaire.html]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.