An overview of learning mechanisms for cognitive systems

Cognitive systems were first introduced by Mitola, and in the last decade they have proved beneficial for the self-management functionalities of future generation networks. This article analyses the advantages of cognitive systems and the way networks benefit from them. Moreover, since such systems are closely related to machine learning, the article also focuses on machine learning techniques applied on both the network side and the user device side. In particular, celebrating 10 years of cognitive systems, this survey-oriented article presents an extended state of the art of machine learning applied to cognitive systems, as drawn from recent research, and an overview of three different learning capabilities of the network and the user device.


Introduction
The success of mobile networks has been driven by the services offered, i.e. voice in second generation and multimedia services in third generation (3G) networks. Similarly, a key issue for the success of future generation networks is considered to be the provision of enhanced, always available, personalised services. In addition to communication and entertainment, a wide range of other life sectors can benefit from evolving multimedia applications, including healthcare, environmental monitoring, transportation and public safety. In this respect, it is necessary to develop mechanisms that will enhance the end-user experience in terms of quality of service (QoS), availability and reliability. At the same time, the complexity and heterogeneity of the infrastructure of mobile network operators increase as radio access technologies (RATs) continue to evolve and new ones emerge. In summary, the fundamental requirements for the success of future networks are service personalisation, always-best connectivity, ubiquitous service provision and efficient handling of the complexity of the underlying infrastructure. All these call for self-management and learning capabilities in future generation network systems. Self-management enables a system to identify opportunities for improving its performance and to configure or adapt its operation accordingly without the need for human intervention [1]. Learning mechanisms are essential for increasing the reliability of decision making. They also provide the ground for proactive handling of problematic situations, i.e. identifying and handling issues that could undermine the performance of the system before these actually occur.
In this respect, cognitive, reconfigurable systems [2][3][4], encompassing self-management and learning capabilities, have been devised as a solution to the key issues identified above. More specifically, cognitive systems determine their behaviour in a self-managed way. This is achieved reactively or proactively [5][6][7], based on goals, policies, knowledge and experience obtained through learning. In this direction, this article provides an overview of two network-centric applications based on two different learning techniques for identifying network capabilities in terms of available QoS, expressed in bitrate.
Moreover, as the mobile phone becomes more and more an indispensable tool in daily activities, learning functionality is also required on the user device in order to truly enhance the experience of all users, even technology-agnostic ones. In this direction, the article also focuses on user-centric learning capabilities, exploiting a learning technique that identifies user preferences so that the device connects to the network that will increase the quality of experience (QoE) for the user.
In more detail, the article is structured as follows. Section 2 presents an extended review of related work on functionalities built upon learning techniques, and Section 3 provides two problem statements, one network centric and one user centric. The two problems showcase the way that the techniques can and/or should be used. Section 4 continues with the approaches followed for the problems stated in Section 3, overviewing the learning-based mechanisms for acquiring learning capabilities in both the network and the user's equipment. Finally, the article concludes in Section 5.

Related study
Achieving the targets analysed above requires learning capabilities in both the network and the user equipment. From the side of the network management systems, learning capabilities can enhance the system by providing knowledge regarding the capabilities of the network and by facilitating the decision-making mechanisms. In the applications presented in this article, network capabilities are expressed in terms of QoS, and more particularly in terms of achieved bitrate. On the other hand, learning capabilities in user devices facilitate the building of knowledge regarding the user's preferences and thus improve QoE for the user.
Relevant past studies include research in both directions. In particular, regarding network capabilities, a large body of research has been recorded using a variety of learning techniques. To begin with, the study in [8] describes fuzzy logic schemes for representing knowledge about cross-layer information, followed by fuzzy control theory, which implements cross-layer optimization strategies. In the same direction, the authors of [9] suggest fuzzy logic-based schemes which exploit past history and shared knowledge of the service quality experienced by active connections to process cross-layer communication quality metrics and thereby estimate the expected transport layer performance.
Moving to bio-inspired techniques, genetic algorithms (GAs) have also been proposed for similar reasons. More precisely, the authors of [10] propose a GA for achieving optimal transmission with respect to QoS goals (minimization of the bit-error rate, minimization of power consumption, maximization of the throughput, etc.). For this purpose, the GA scores a subset of parameters and evolves them until the optimal value is reached for a given goal. Furthermore, neural networks (NNs) have also been used for treating similar problems. A few examples from the recent literature that use NN-based techniques are [11][12][13]. The authors of [11] propose NN-based learning schemes with the aim of predicting the data rate of a candidate radio configuration, which is to be evaluated by a cognitive radio system (CRS). Several NN-based schemes have also been tested in [12] for similar purposes. Therein, data rate is studied with respect to the quality of the link and the signal strength of the wireless transceiver, and scenarios are included that test the possibility of predicting the actually achieved throughput on a short-term basis in rapidly changing environments. Learning and predicting performance is also the target of the cognitive controller built using a multilayer feed-forward neural network in [13]. The controller performs this task for different channels in IEEE 802.11 wireless networks based on experimental measurements and the environmental conditions, and eventually selects the optimal channel.
Finally, Bayesian statistics and self-organizing maps (SOMs) have also been applied as techniques that can facilitate the estimation of network capacity. Articles that report such applications include [14][15][16]. These specific approaches are selected for further analysis in the next sections.
From the user preferences side, effort has been put into developing context awareness techniques [17][18][19][20], recording user preferences [21,22] and learning capabilities [11,14], and exploiting these to influence the configuration selection [23][24][25]. Additionally, relevant work includes the use of Bayesian networks in support of user modelling, as a method for evaluating, in a qualitative and quantitative manner, elements of the user behaviour and consequently updating the user profile. In this direction, diverse research efforts have utilised concepts of Bayesian statistics for various applications such as recommendation systems [26,27], negotiations [28] and calendar scheduling [29]. Issues that arise in achieving user-intent ascription through dynamic user model construction with Bayesian networks are addressed in [30].
The work presented in [31] focuses especially on the application of Bayesian statistics concepts for learning user preferences regarding the provision of services in mobile and wireless networks, such as voice, video streaming, web browsing, etc. In general, in the scope of mobile networks and ubiquitous computing, similar schemes have been developed. However, these focus on different aspects of user preferences and not on user preferences regarding the obtained QoS when using a certain service/application. For example, in [32] the targeted user preferences are modifications of the ringer volume or vibrate alarm and the acceptance or rejection of incoming calls. In [33], where the design for a context-aware collaborative filtering system is presented, the focus is on user preferences regarding activities in certain contextual situations. The challenges in progressing from modelling human behaviour to inferring human intent in context-aware applications are addressed in [34], where the focus is more on ubiquitous virtual reality applications.
In summary, while a great number of research efforts have focused on approaches for acquiring, learning and exploiting information on user preferences, the targeted user preferences, as well as the objectives, vary between the different approaches. The scheme for learning user preferences in [31] concentrates on preferences regarding service provisioning, in terms of QoS levels, for various services available in mobile and wireless networks. As the aim of [31] was to dynamically estimate user preferences and exploit these estimations in the selection of the most appropriate device configuration, so as to achieve the "always-best" connectivity concept and subsequently provide an enhanced experience to the user, it was selected to be presented in more detail hereafter (see Section 4).
The main innovation of the study presented in this article lies in the fact that it presents an approach for dynamically learning both context information and user preferences, the combination of which could be exploited at a later stage for the selection of the most appropriate network configuration. It is important to clarify here that the selection itself is out of the scope of this article.

Learning network capabilities
The aim of this problem is to estimate network capabilities. The term "network capabilities" refers to what the network is capable of, i.e. the main features of a network such as the QoS, its range, its location, its type (GSM, UMTS), etc. In this article, the term refers explicitly to the QoS that the network may offer. QoS, in turn, may refer to more than one parameter, such as the bitrate, the jitter, the delay, the bit error rate and the throughput of the network. In this case, QoS is expressed in terms of achievable bitrate. In summary, the scope of this case is to estimate network capabilities in terms of QoS, expressed in bitrate, based on current network measurements and context. Network measurements here are measurements of parameters that hold information related to the network identity, its RAT, its configuration, its Received Signal Strength Indicator (RSSI) and its traffic, in terms of packets or bytes. Context refers to those parameters that hold information such as time, location and the environmental conditions.

Learning user preferences
This problem targets the dynamic learning of user preferences regarding the perceived QoS level per service/application and, potentially, the maximum acceptable price per service/application [31]. The aim is to estimate the most likely user preferences/satisfaction for a specific service, QoS level, location and time zone.
The user profile has been modelled as a collection of parameters that can be classified in two main groups: observable and output parameters. Observable parameters include the currently running services/applications, the corresponding QoS levels and associated QoS parameters, the location, the time zone and the provided user feedback.
User feedback is obtained in the following manner. The user initiates a specific service. At the initial stages it is considered that the user does not have any particular preferences; in other words, the user is initially considered to be indifferent between service provision choices. Every time the user obtains a service, a rating facility, embedded in the learning mechanism, allows the user to rate how much he/she liked the particular service provision. A Likert scale [35] is used for the rating, i.e. five different rating options are provided. In this way even users who are not technology experts can provide the system with feedback on their preferences. The user is also given the choice to decline providing a rating.
Output parameters depend on the value of the observable parameters, and their value is dynamically updated over time. Output parameters represent the most appropriate configuration for the specific user in a certain context (user role/profile, which encompasses certain location and time zone aspects). For the sake of simplicity the focus here is on one output parameter, namely the utility value. Other output parameters include, for example, the maximum acceptable price that the user is willing to pay in order to be provided with a certain service at a specific QoS level. The utility value, a concept used in decision-making theory and microeconomics, is used to represent user preferences for QoS levels when making use of a certain service. In other words, the utility value provides a ranking, by order of preference, of service and QoS combinations. User preferences may vary depending on the contextual situation and may change over time. Therefore, the utility value is assumed to depend on a range of context-related parameters, as mentioned above. More specifically, the utility value, apart from the service and QoS level, may be related to the location of the user, the time zone and the feedback obtained from the user. The utility value for a QoS level may also implicitly correspond to a set of weights per QoS parameter (such as bit rate, delay, jitter, etc.).
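The user profile described above can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name Observation, the function indifferent_profile, the field names and the neutral starting utility of 0.5 are hypothetical and are not taken from [31].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Observable parameters recorded for one service session (hypothetical layout)."""
    service: str
    qos_level: str
    location: str
    time_zone: str
    rating: Optional[int] = None  # Likert rating 1..5, None if the user declined

    def __post_init__(self):
        # Reject ratings outside the five options of the Likert scale.
        if self.rating is not None and not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5, or None")

    @property
    def context(self):
        """Key under which an output parameter for this situation could be stored."""
        return (self.service, self.qos_level, self.location, self.time_zone)


def indifferent_profile(services, qos_levels, neutral_utility=0.5):
    """Initial output parameters: the same utility for every service/QoS pair.

    The value 0.5 is an assumed neutral starting point, chosen for illustration.
    """
    return {(s, q): neutral_utility for s in services for q in qos_levels}
```

Storing the rating as an optional field mirrors the user's choice to decline a rating, and the uniform starting profile mirrors the initial indifference between service provision choices.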

Network capabilities using SOM
An approach adopted in recent literature learns network capabilities using the unsupervised learning technique known as SOMs. The inputs used for discovering the QoS in this approach are parameters that are obtained for a given configuration, e.g. RSSI, number of input bytes, etc. As already stated, the parameter that expresses the QoS is the bit rate.
According to the technique applied here, SOM is used for mapping multidimensional data onto a 2D map. To do so, SOM requires a training process in which the data are converted into data samples and, finally, into vectors that are mapped with respect to their resemblance. Each vector fed to the training process updates the vector of the map that is closest to it in terms of Euclidean distance, so that similar data samples come closer to each other. As a result, similar vectors are mapped to the same cluster, i.e. group of vectors. Thus, the created map depicts the clustering of the data and the pattern of their relationship.
At this point, it is essential to clarify that the term "data sample" differs from the term "data" in that a data sample consists of more than one data item. In fact, each data sample is a combination of values, each of which refers to a different observed parameter.
Continuing the analysis of the technique, once the map has been trained and the pattern has been recognised from the resemblance of the training vectors, a new data sample can be mapped onto it by finding the vector of the map that is closest to it in terms of Euclidean distance. Moreover, according to SOM theory, data samples that belong to the same cluster are expected to be similar to each other. In our case, this means that the bit rate observed together with the parameters that form one data sample of the cluster is expected to be the same as the respective bit rate of the other data samples of the same cluster. Thus, to infer the network capacity, all that remains is to identify the cluster to which the new entry belongs. These last features of the SOM technique also constitute the basis of the learning technique. Further information about the technique can be found in [15].
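The training and mapping steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of [15]: the map size, the linearly decaying learning rate, the Gaussian neighbourhood, the labelling of each map unit with the mean bit rate of its training samples, and all function names are assumptions made for the example. Input values are assumed to be normalised beforehand.

```python
import math
import random


def nearest_unit(weights, sample, candidates=None):
    """Index of the map unit whose vector is closest to `sample` (Euclidean distance)."""
    indices = range(len(weights)) if candidates is None else candidates
    return min(indices, key=lambda i: math.dist(weights[i], sample))


def train_som(samples, rows=3, cols=3, epochs=50, lr0=0.5, radius0=1.5, seed=0):
    """Train a small rows x cols map and return the list of unit vectors."""
    rng = random.Random(seed)
    # Each unit starts from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(rows * cols)]
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks, floor 0.5
        for sample in rng.sample(samples, len(samples)):
            bmu = nearest_unit(weights, sample)     # best matching unit
            for i, pos in enumerate(grid):
                d = math.dist(pos, grid[bmu])       # distance on the 2D map
                h = math.exp(-(d * d) / (2.0 * radius * radius))
                weights[i] = [w + lr * h * (x - w) for w, x in zip(weights[i], sample)]
    return weights


def label_units(weights, samples, bit_rates):
    """Attach to each unit the mean bit rate of the training samples mapped to it."""
    sums = {}
    for sample, bit_rate in zip(samples, bit_rates):
        unit = nearest_unit(weights, sample)
        total, count = sums.get(unit, (0.0, 0))
        sums[unit] = (total + bit_rate, count + 1)
    return {unit: total / count for unit, (total, count) in sums.items()}


def estimate_bit_rate(weights, unit_bit_rate, sample):
    """Map a new sample to its closest labelled unit and return that unit's bit rate."""
    unit = nearest_unit(weights, sample, candidates=list(unit_bit_rate))
    return unit_bit_rate[unit]
```

In use, train_som would be called on vectors of observed parameters (for example normalised RSSI and input bytes), label_units on the same vectors together with the bit rates observed alongside them, and estimate_bit_rate on each new data sample. The bit rate is deliberately kept out of the training vectors here so that it can be read off the cluster afterwards, which is one possible reading of the description above.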

Network capabilities using Bayesian statistics
This approach also concerns learning capabilities that facilitate the estimation of the network capabilities. Here, the mechanism is based on the correlation of candidate transmitter configurations with the QoS, in terms of bit rate, that is offered by the network given each configuration. In particular, the learning mechanism exploits knowledge and past experience by reinforcing them with Bayesian statistics techniques suitable for reasoning about probabilistic relationships [36][37][38].
More specifically, since the goal is to associate different configurations of a transmitter with the bitrate, the probability of obtaining a specific network capacity BR_i given the configuration CFG_i is calculated. This calculation and its frequent update constitute the basis of this technique. The update of these probabilities relies on approaches suggested in [36][38][39][40].
To begin with, using the Shannon theorem and gathering the necessary information for each configuration makes it possible to calculate the available bit rate for each configuration. Furthermore, using all possible combinations of configurations and bit rates, conditional probabilities of the form Pr_kj[BR_k | CFG_j] can be calculated. These probabilities are then organised in conditional probability tables (CPTs) of the form of Table 1. In these CPTs each column represents a different configuration while each row corresponds to a different reference value of bit rate. These reference bit rates comprise the set of available BRs, which in this case was selected to be discrete [41].
Finally, using the CPTs, it is easy to identify the most probable bit rate given a configuration. In fact, it will be the one associated with the highest conditional probability in the respective configuration column. Further information about the results and the mathematical background of the technique can also be found in [14].
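The steps above (bit rate from the Shannon theorem, discrete reference bit rates, a CPT with one column per configuration, and selection of the most probable bit rate) can be sketched as follows. The sketch rests on several assumptions: the reference set of bit rates is hypothetical, probabilities are estimated as simple relative frequencies rather than with the update rule of [14], which is not reproduced here, and all names are chosen for illustration.

```python
import math
from collections import Counter

REFERENCE_BIT_RATES_MBPS = (6.0, 12.0, 24.0, 54.0)  # hypothetical discrete set


def shannon_capacity_mbps(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + SNR), converted to Mbit/s."""
    return bandwidth_hz * math.log2(1.0 + snr_linear) / 1e6


def to_reference(bit_rate_mbps, references=REFERENCE_BIT_RATES_MBPS):
    """Largest reference value not exceeding the bit rate (smallest one as fallback)."""
    eligible = [r for r in references if r <= bit_rate_mbps]
    return max(eligible) if eligible else min(references)


class ConditionalProbabilityTable:
    """Columns are configurations, rows are reference bit rates."""

    def __init__(self, references=REFERENCE_BIT_RATES_MBPS):
        self.references = tuple(references)
        self.counts = {}  # configuration -> Counter of reference bit rates

    def observe(self, cfg, bit_rate_mbps):
        """Record one (configuration, bit rate) observation."""
        ref = to_reference(bit_rate_mbps, self.references)
        self.counts.setdefault(cfg, Counter())[ref] += 1

    def probability(self, ref, cfg):
        """Relative-frequency estimate of Pr[BR = ref | CFG = cfg]."""
        column = self.counts.get(cfg)
        if not column:
            # No evidence yet for this configuration: assume a uniform column.
            return 1.0 / len(self.references)
        return column[ref] / sum(column.values())

    def most_probable_bit_rate(self, cfg):
        """Reference bit rate with the highest probability in the cfg column."""
        return max(self.references, key=lambda ref: self.probability(ref, cfg))
```

Because only counts per configuration are kept, each new observation updates the table in place without storing the observation itself.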

User preferences using Bayesian statistics
The functionality for learning user preferences is realised with the use of Bayesian statistics concepts [39]. The aim is to estimate the probability of the level of user satisfaction for a specific service and perceived QoS level, given a certain location and time zone. In other words, conditional probabilities for the utility value are calculated. More specifically, a method has been implemented according to which instantaneous estimations are updated by taking into account existing (historical) information on the user. The process of developing knowledge on user preferences can be roughly divided into two phases. The first phase collects information on user preferences. The second phase deals with the estimation of future user preferences based on the information collected. More specifically, it is assumed that values for the observable parameters are recorded for various instances (phases). These values constitute the "observable parameters evidence". Based on the observable parameters evidence, the instantaneous (conditional) probabilities for the utility value are calculated. The next step of this procedure is the calculation of adapted (conditional) probabilities through Equation (1), where n is the current instance, P_adapted,n denotes the adapted probability estimation for instance n, P_adapted,n-1 denotes the adapted estimation for the previous instance, which carries the historical information, and the parameters w_hist and w_instant reflect the weights attributed to the historical and the current instantaneous estimation, respectively, with values in the interval (0,1). The weights also comply with Equation (2).
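Equations (1) and (2) are not reproduced in this text. A form consistent with the description above, offered here as an assumption rather than as the formulation of [31], is a weighted combination of the previous adapted estimation and the current instantaneous one, with weights that sum to one. The symbol P_instant,n for the current instantaneous estimation is introduced here for illustration.

```latex
\begin{align}
P_{\mathrm{adapted},n} &= w_{\mathrm{hist}}\, P_{\mathrm{adapted},n-1}
                        + w_{\mathrm{instant}}\, P_{\mathrm{instant},n} \tag{1} \\
w_{\mathrm{hist}} + w_{\mathrm{instant}} &= 1 \tag{2}
\end{align}
```

Under this reading, a larger w_hist makes the estimation more stable against occasional atypical ratings, while a larger w_instant makes it follow changes in user preferences more quickly.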
The calculated probabilities yield the most likely user preferences per QoS level, which in turn can be used as input to a decision-making mechanism for deciding, for instance, on the most appropriate configuration of the user's device. An overview of the process for learning user preferences is depicted in Figure 1. More information regarding the mathematical formulation and the algorithm for the functionality for dynamically learning user preferences, as well as the data structures and elements it utilises, can be found in [31].
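The update of the estimations can be illustrated with a short Python sketch. It assumes that the adapted estimation is a weighted sum of the previous adapted estimation and the current instantaneous one, with the two weights summing to one, which is consistent with the description above but is not taken verbatim from [31]. It further assumes, for illustration only, that the utility value takes the five Likert ratings as its possible values and that the instantaneous estimation is the relative frequency of the ratings collected in the current instance. All function names are hypothetical.

```python
LIKERT_LEVELS = (1, 2, 3, 4, 5)


def uniform_estimate(levels=LIKERT_LEVELS):
    """Initial estimation: the user is indifferent between all levels."""
    return {level: 1.0 / len(levels) for level in levels}


def instantaneous_estimate(ratings, levels=LIKERT_LEVELS):
    """Relative frequency of each rating; declined ratings (None) are skipped."""
    given = [r for r in ratings if r is not None]
    if not given:
        # No feedback in this instance: fall back to indifference.
        return uniform_estimate(levels)
    return {level: given.count(level) / len(given) for level in levels}


def adapt(previous_adapted, instantaneous, w_hist=0.7):
    """Weighted combination of historical and instantaneous estimations."""
    if not 0.0 < w_hist < 1.0:
        raise ValueError("w_hist must lie in the open interval (0, 1)")
    w_instant = 1.0 - w_hist  # assumed constraint: the weights sum to one
    return {
        level: w_hist * previous_adapted[level] + w_instant * instantaneous[level]
        for level in previous_adapted
    }


def most_likely(estimate):
    """Level with the highest adapted probability."""
    return max(estimate, key=estimate.get)
```

One estimation of this kind would be kept per combination of service, QoS level, location and time zone, and most_likely would supply the input to the decision-making mechanism.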

Appraisal
Comparing the two approaches presented for learning network capabilities, Bayesian statistics also offers the possibility of online training. Online training removes the need to store past observations explicitly, a feature that SOM does not offer. As a result, the approach minimises the required memory capacity and could thus also be applied on the user device. In that case the device would, for example, become capable of verifying that the offered network capability indeed reaches the value claimed by the network.
In addition, the two problems identified and overviewed above are found to be interrelated. Their interrelation lies in the fact that the combination of dynamically learned context information and user preferences could be exploited at a later stage for the selection of the most appropriate network configuration. It is worth mentioning that this selection mechanism is out of the scope of this article.

Conclusions
The article presented an overview of applications of machine learning for building knowledge on environment characteristics (context) and user preferences. In particular, the article provided an extended overview of related studies in the field of learning in cognitive systems and focused on two problems from recent literature, followed by an overview of the applied approaches. More specifically, learning mechanisms were presented for building knowledge on network capabilities in terms of QoS, expressed as bitrate, and on user preferences, in terms of user satisfaction with the achieved QoS, given the service, the time and the location of the user.

Table 1
Example of CPT