A game-theoretic learning approach to QoE-driven resource allocation scheme in 5G-enabled IoT
EURASIP Journal on Wireless Communications and Networking volume 2019, Article number: 55 (2019)
Abstract
To significantly promote Internet of Things (IoT) development, the 5G network is enabled to support IoT communications without the limitation of distance and location. This paper investigates the channel allocation problem for IoT uplink communications in the 5G network, with the aim of improving the quality of experience (QoE) of smart objects (SOs). To begin with, we define a mean opinion score (MOS) function of transmission delay to measure the QoE of each SO. We then leverage a game-theoretic learning approach to solve the sum-MOS maximization problem. Specifically, the original optimization problem is equivalently transformed into a tractable form. Then, we formulate the converted problem as a game-theoretic framework and define a potential function whose maximum is a near-optimum of the optimization objective. To optimize the potential function, a distributed channel allocation algorithm is proposed that converges to the best Nash equilibrium solution, which is the global optimum of the potential function. Finally, numerical results verify the effectiveness of the proposed scheme.
1 Introduction
The Internet of Things (IoT) is a system of human-to-object or object-to-object connections in which sensors, controllers, mechanical and digital machines, objects, animals, or people are interrelated and transfer data over a network by using information technology [1, 2]. In IoT, a thing can be a person with wearable devices, an autonomous vehicle with sensors, a farm animal with biochip transponders, or any other smart object (SO) provided with the ability to transfer data over a network [3, 4]. The concept of IoT was first mentioned by Kevin Ashton in a presentation he made to Procter & Gamble in 1999. At that time, computers relied on human beings to gather the data they used [5]. However, people have very limited time, attention, and accuracy, so they are not very good at capturing data about things in the real world. The enormous potential demand for connecting things drives the rapid development of IoT. IoT SOs are of different types: some are sensitive to delay, some need high reliability, and some are low-power and low-cost. Moreover, most IoT traffic is in the uplink, and the messages of IoT SOs are typically small in size and sparse in time. These characteristics make the network access of IoT SOs different from that of classical users, which poses a great challenge to the network [6]. Therefore, providing satisfactory service for IoT applications with differentiated demands is an important field, and the requirement for ultra-reliable low-latency communications of IoT SOs is greatly emphasized.
5G heterogeneous networks are envisioned to play a key role in providing a promising infrastructure for the massive proliferation of IoT SOs and the corresponding services [7–11]. IoT SOs with very limited computing and storage capabilities are associated with access points of the 5G network for cloud services and communications [12, 13]. To handle the massive connectivity and satisfy the requirements of ultra-reliable low-latency communications, a 5G network supporting IoT communications requires huge spectrum resources or improved spectral efficiency [14]. Moreover, interference management is one of the key challenges in 5G-enabled IoT, since 5G networks generally adopt the co-channel model, in which small cell base stations (SCBSs) overlaid on the coverage area of a macrocell base station (MBS) share the same frequency band [15]. Resource allocation strategies are usually optimized to overcome the interference problem [16]. In particular, because ultra-reliable low-latency communication is so strongly emphasized for IoT SOs, enhancing the quality of experience (QoE) of IoT is a challenging and attractive research area. Motivated by the goal of real-time and reliable IoT transmission, a QoE-driven resource allocation scheme is proposed in this paper.
Next, we give a brief review of the works related to our research. Related works on efficient IoT support in 5G can be found in [17–19]. Yerrapragada and Kelley [17] investigated a perfect interference alignment scheme for multiple-input multiple-output systems and applied it to a 5G-enabled IoT architecture. Since inter-cell interference significantly degrades the performance of IoT communications, Dao et al. [18] proposed a novel algorithm for finding the most appropriate pair of IoT terminal or its associated BS to provide relay-assisted communication for an IoT terminal with poor signals in the inter-cell interference area. For IoT in cognitive 5G networks, a multiband cooperative spectrum sensing and resource allocation framework was presented in [19]. IoT communications in the 5G network are expected to provide flexible delivery of broad services with a high QoE. Recently, research on improving the QoE of IoT SOs has attracted increasing attention [20–22]. Aminjavaheri et al. [20] presented an underlay control signaling method for ultra-reliable low-latency communication applications in an LTE network and analyzed its performance. Since satisfying QoE has become the major challenge in content-centric IoT, the authors in [21] analyzed several factors, e.g., content popularity and weight factor, that impact resource allocation and, in turn, the QoE. As a cloud resource, fog computing is used for the delay-sensitive services of IoT by minimizing resource underutilization and enhancing QoE [22].
Resource allocation in IoT has been investigated in many works by using game theory [23–27]. Huang et al. [23] employed a cooperative game to model and analyze device-to-device communication for achieving high-performance data transportation in the new cloud-centric IoT paradigm. Device-to-device communication underlaying cellular networks was investigated in [24] to improve spectral efficiency, and a game-theoretic resource allocation scheme was designed by exploring the inherent competition for spectrum resources among users. The authors in [25] proposed a Stackelberg game and many-to-many matching to solve the multistage problems of pairing, resource pricing, and purchasing in three-tier IoT fog networks. Although the proposed framework can achieve high performance, the optimal solution is ambiguous. In addition, matching theory was also used in [26] to find a stable IoT node pairing. In [27], the problem of efficiently and effectively securing IoT networks was investigated by carefully allocating security tools.
2 Methodology
Bearing the above in mind, we leverage a game-theoretic learning algorithm to solve the resource allocation problem in the 5G-enabled IoT network. In this paper, we assume that a number of SOs access the 5G network, with the aim of achieving effective IoT deployment without the limitation of distance and location. This setting poses several challenges. SOs are usually sensitive to latency, which raises a higher demand for data transmission. However, the interference arising from the reuse of radio resources greatly affects the QoE of SOs in the 5G network. The non-convex, integer-valued optimization problem makes a rational allocation of resources difficult to achieve. Moreover, a distributed algorithm is desired for various SOs with different service requirements. In this paper, we study the channel allocation problem by applying game theory to analyze the distributed decisions made by SOs, and we use a learning algorithm to maximize the sum-MOS of SOs in the 5G network. The main contributions of our work are summarized as follows:

We consider the QoE of all SOs in the 5G-enabled IoT network as the objective function. A MOS standard in terms of the data transmission delay is proposed to measure the QoE of various services. Then, an equivalent form is derived to replace the original optimization objective.

We use the game-theoretic model to formulate the modified optimization problem, in which the designed potential function is an approximation of the optimization objective. Then, we prove it to be an exact potential game, whose best Nash equilibrium (NE) point is a near-optimal solution of the original optimization problem.

To find the best NE point, we design a distributed learning algorithm that asymptotically converges, with arbitrarily high probability, to the global optimal solution that maximizes the potential function.
The rest of this paper is organized as follows. In Section 3, the system model and the QoE metric are presented. Then, the proposed channel allocation problem is equivalently converted into a tractable problem. According to the converted optimization problem, Section 4 establishes a game framework and then investigates the properties of the equilibrium. In Section 5, we propose an algorithm and the asymptotical optimality is verified. Finally, numerical results and discussion are presented in Section 6, and Section 7 concludes this paper.
3 System model and problem formulation
3.1 System model
We consider an uplink 5G-enabled IoT network consisting of B BSs, K diverse SOs (e.g., smart phones, smart meters, wearable devices, and monitoring devices), and N orthogonal channels, as illustrated in Fig. 1. The set of all BSs is denoted by \(\mathcal {B}=\{1,2,\dots,B\}\) and the set of all SOs is represented by \(\mathcal {K}=\{1,2,\dots,K\}\). Suppose that the associations between BSs and SOs have been predetermined, and let \(b_{k}\in \mathcal {B}\) be the BS serving SO \(k\in \mathcal {K}\). Moreover, suppose that each SO chooses a channel for data transmission and that all orthogonal channels have the same bandwidth. We denote the set of channels by \(\mathcal {N}=\{1,2,\dots,N\}\). Let \(a_{k}\) be the channel allocation strategy of SO \(k\in \mathcal {K}\) and \(\mathcal {A}_{k}\) be the set of all possible selections for k. Thus, \(\mathbf{a}=(a_{1},a_{2},\dots,a_{K})\) is the channel selection profile for all SOs and \(\mathcal {A}=\mathcal {A}_{1}\times \mathcal {A}_{2}\times \dots \times \mathcal {A}_{K}\) is the space of all possible selections for all SOs.
The channel from SO \(k\in \mathcal {K}\) to BS \(b_{k}\in \mathcal {B}\) is assumed to be flat fading, and the channel gain is denoted by \(h_{k,b_{k}}\). Let \(p_{k}\) be the transmit power of SO k. Then, the received signal-to-interference-plus-noise ratio (SINR) of BS \(b_{k}\) from SO k is given by:
\[\gamma_{k} = \frac{p_{k} h_{k,b_{k}}}{\sum_{l\in \mathcal{K}\setminus\{k\}} p_{l} h_{l,b_{k}} \mathbb{1}_{\{a_{l}=a_{k}\}} + \sigma_{k}^{2}}, \tag{1}\]
where \({\sigma _{k}^{2}}\) is the power of the additive white Gaussian noise at the BS associated with SO k, and the indicator variable \(\mathbb{1}_{\{a_{l}=a_{k}\}}\in \{0,1\}\) denotes whether the channel allocated to SO k is also occupied by SO l, i.e., \(\mathbb{1}_{\{a_{l}=a_{k}\}}=1\), or not, i.e., \(\mathbb{1}_{\{a_{l}=a_{k}\}}=0\).
In IoT, different SOs perform different applications, e.g., picture/video collection, gaming, file upload, and control information transmission. When performing different applications, SOs need to transfer different amounts of data to obtain the same user experience in the same period of time. Mathematically, the set of service types required by SOs is represented as \(\mathcal {S}=\{1,2,\ldots,S\}\), and \(s_{k}\in \mathcal {S}\) denotes the service type performed by SO k. Let \(C_{s_{k}}\) be the amount of data required from SO k during a given period of time. Hence, the uplink transmission time of SO k is:
\[T_{k} = \frac{C_{s_{k}}}{R_{k}}, \tag{2}\]
where \(R_{k}=B\log_{2}(1+\gamma_{k})\) is the achievable rate of BS \(b_{k}\) from SO k and B is the bandwidth of each channel.
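The SINR, rate, and transmission-time definitions of this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names, the nested-list layout `h[l][b]` for the gain from SO l to BS b, and all numeric values are hypothetical.

```python
import math

def sinr(k, a, p, h, bs, noise):
    """SINR at the serving BS of SO k under channel profile a."""
    # Only SOs that picked the same channel as SO k interfere with it.
    interference = sum(
        p[l] * h[l][bs[k]]
        for l in range(len(a))
        if l != k and a[l] == a[k]
    )
    return p[k] * h[k][bs[k]] / (interference + noise[k])

def transmission_time(k, a, p, h, bs, noise, bandwidth, data_bits):
    """Uplink time T_k = C_{s_k} / R_k with R_k = B * log2(1 + SINR_k)."""
    rate = bandwidth * math.log2(1.0 + sinr(k, a, p, h, bs, noise))
    return data_bits[k] / rate

# Toy instance: 3 SOs, 2 BSs, 2 channels (all values are illustrative).
p = [1.0, 1.0, 1.0]                        # transmit powers
h = [[1.0, 0.1], [0.2, 1.0], [0.5, 0.5]]   # h[l][b]: gain from SO l to BS b
bs = [0, 1, 0]                             # serving BS b_k of each SO
noise = [0.1, 0.1, 0.1]                    # noise power seen by each SO's BS
a = [0, 0, 1]                              # SOs 0 and 1 share channel 0
```

In this toy instance SO 2 is alone on channel 1, so its SINR is limited by noise only, while SOs 0 and 1 interfere with each other at their respective BSs.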
3.2 QoE metric
To measure QoE of various services, we propose a mean opinion score (MOS) standard, ranging from 1 to 5, in terms of the data transmission delay. Letting \(\tau _{1,s_{k}}\) and \(\tau _{2,s_{k}}\) be respectively the most satisfied delay and the maximal tolerable delay based on the different service types, the MOS is defined as follows.
where \(\alpha =\frac {4}{\ln \tau _{2,s_{k}} - \ln \tau _{1,s_{k}}}\) and \(\beta =\tau _{1,s_{k}} \left (\frac {\tau _{1,s_{k}}} {\tau _{2,s_{k}}} \right)^{\frac {1}{4}}\). Figure 2 shows how the MOS varies with delay. The MOS values range from 1 to 5, where MOS=1 represents an unacceptable QoE for SOs and MOS=5 reflects an excellent user experience. In general, SOs have different tolerances for delay depending on the service. The application characteristics of SOs are determined by \(\tau _{1,s_{k}}\) and \(\tau _{2,s_{k}}\).
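As a rough illustration of a delay-to-MOS mapping with the properties described here, the sketch below assumes a score of 5 up to the most satisfied delay, 1 beyond the maximal tolerable delay, and a curve that is linear in the logarithm of the delay in between. The constants are derived from these two endpoints rather than taken from the paper's \(\alpha\) and \(\beta\), so the function `mos` and its exact shape are assumptions, not the authors' definition.

```python
import math

def mos(delay, tau1, tau2):
    """Map a transmission delay to a MOS value in [1, 5].

    Assumed shape: 5 up to tau1, 1 from tau2 onward, and linear in
    ln(delay) in between, with slope -4 / (ln tau2 - ln tau1).
    """
    if delay <= tau1:
        return 5.0
    if delay >= tau2:
        return 1.0
    slope = 4.0 / (math.log(tau2) - math.log(tau1))
    return 5.0 - slope * (math.log(delay) - math.log(tau1))
```

With this shape, the geometric mean of the two thresholds maps to the midpoint score of 3, which is one way to see that the score degrades with the ratio of delays rather than with their difference.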
3.3 Problem formulation and transformation
In this paper, to improve the overall transmission performance of the 5G-enabled IoT network by optimizing the channel allocation, we consider the sum-MOS maximization problem, which is mathematically expressed as:
Problem P is a non-convex and discrete optimization problem, so finding its solution is expected to be very challenging. In what follows, we convert it into a tractable form. For notational convenience, we first define \(U_{k}=p_{k}h_{k,b_{k}}\) and \(I_{k}(a_{k},\mathbf {a}_{-k}) = \sum _{l\in \mathcal {K} \setminus \{k\}} p_{l} h_{l,b_{k}} \mathbb{1}_{\{a_{l}=a_{k}\}} + {\sigma _{k}^{2}}\), where \(\mathbf{a}_{-k}\) is the channel selection profile of all the SOs except SO k. Then, we have
\[T_{k}(\mathbf{a}) = \frac{C_{s_{k}}}{B\log_{2}\left(1+\frac{U_{k}}{I_{k}(a_{k},\mathbf{a}_{-k})}\right)}. \tag{5}\]
By using the first-order Taylor approximation at the point \(\mathbf {a}^{'}\), \(T_{k}\) is expanded as \(\tilde {T}_{k}\), namely,
where \(\Delta _{1,k} = \frac {BC_{s_{k}}U_{k}} {\ln 2{T_{k}^{2}} \left (\mathbf {a}^{'}\right) \left ({I_{k}^{2}} \left (\mathbf {a}^{'} \right) + U_{k}I_{k} \left (\mathbf {a}^{'}\right) \right)}\) and \(\Delta _{2,k}=T_{k}\left (\mathbf {a}^{'}\right)\Delta _{1,k}I_{k}\left (\mathbf {a}^{'}\right)\).
According to (6), (3) is expanded at the point \(\mathbf {a}^{'}\), namely,
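To make the linearization step concrete, the following sketch expands the transmission time to first order in the interference term around a reference value. The slope is obtained here by differentiating \(T = C/(B\log_{2}(1+U/I))\) directly and is not taken from the \(\Delta\) coefficients in the text; the function names and argument order are hypothetical.

```python
import math

def tx_time(interference, useful, bandwidth, data_bits):
    """T = C / (B * log2(1 + U / I))."""
    return data_bits / (bandwidth * math.log2(1.0 + useful / interference))

def tx_time_slope(interference, useful, bandwidth, data_bits):
    """dT/dI, obtained by differentiating tx_time with respect to I."""
    log_term = math.log2(1.0 + useful / interference)
    return (data_bits * useful) / (
        bandwidth * math.log(2.0) * log_term ** 2
        * (interference ** 2 + useful * interference)
    )

def tx_time_linear(interference, ref, useful, bandwidth, data_bits):
    """First-order approximation of T around the reference interference."""
    t0 = tx_time(ref, useful, bandwidth, data_bits)
    slope = tx_time_slope(ref, useful, bandwidth, data_bits)
    return t0 + slope * (interference - ref)
```

The approximation is exact at the reference point and degrades as the interference moves away from it, which is why the expansion point matters for how closely the transformed problem tracks the original one.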
where \(\tilde {\tau }_{1,s_{k}}=\frac {\tau _{1,s_{k}}\Delta _{2,k}\left (\mathbf {a}^{'}\right)}{\Delta _{1,k}\left (\mathbf {a}^{'}\right)}\), \(\tilde {\tau }_{2,s_{k}}=\frac {\tau _{2,s_{k}}\Delta _{2,k}\left (\mathbf {a}^{'}\right)}{\Delta _{1,k}\left (\mathbf {a}^{'}\right)}\), \(\Delta _{4,k}=\alpha \ln \frac {\tau _{1,s_{k}}+\tau _{2,s_{k}}T_{k}\left (\mathbf {a}^{'}\right)}{\beta }\Delta _{3,k}I_{k}\left (\mathbf {a}^{'}\right)\), and \(\Delta _{3,k}=\frac {\alpha \beta }{T_{k}\left (\mathbf {a}^{'}\right)\left (\tau _{1,s_{k}}+\tau _{2,s_{k}}\right)}\).
By comparing (6) with (3), it is noted that (4) and \(\sum _{k\in \mathcal {K}} \mathrm {\widetilde {MOS}}_{k} \left (\mathbf {a}^{'},\mathbf {a}\right)\) have the same solution when \(\mathbf {a}^{'} = \mathbf {a}^{\ast }\), where \(\mathbf{a}^{\ast}\) is the optimal solution to (4). Therefore, the original problem (4) is equivalently transformed into the following optimization problem.
According to the above definition, the MOS of each SO depends not only on its own channel selection strategy but also on the strategies of the other SOs. If too many SOs occupy the same channel to transmit data, the transmission rates are relatively low, and the resulting low MOSs lead to inefficient data processing and put pressure on data storage. Because the SOs are interdependent and interact with each other, we adopt game theory to model and analyze the channel allocation strategies of SOs in \(\tilde {P}\). Furthermore, it is difficult for each SO to obtain information about other SOs of different types, which motivates us to propose a distributed learning algorithm for reaching the equilibrium solution of the game modeled from the channel access problem.
4 Gametheoretic analysis
In this section, we study the distributed optimization of the channel access problem by using game theory. Every SO is regarded as a player in the game, and the channel access game is defined as \(\mathcal {G}_{\mathbf {a}^{'}} = \{\mathcal {K}, \{\mathcal {A}_{k}\}_{k\in \mathcal {K}}, \{u_{k}\}_{k\in \mathcal {K}}\}\), where \(\mathcal {K}\) is the player (SO) set, \(\mathcal {A}_{k}\) is the action space of player k, and \(u_{k}\) is the utility function of player k. The action space of each player is exactly the available channel set. To build a bridge between \(\mathcal {G}_{\mathbf {a}^{'}}\) and problem (8), we define the utility function of player \(k\in \mathcal {K}\) as follows.
Then, we investigate the properties of \(\mathcal {G}_{\mathbf {a}^{'}}\).
Theorem 1
If the variable \(\mathbf {a}^{'}\) is predetermined and the potential function is \(\phi (\mathbf {a}) = \sum _{l\in \mathcal {K}} \left (\Delta _{3,l}I_{l} (\mathbf {a}) + \Delta _{4,l} \right)\), then \(\mathcal {G}_{\mathbf {a}^{'}}\) is an exact potential game that has at least one NE point \(\mathbf{a}^{\ast}\). Moreover, the near-optimal solution to the proposed channel access problem (4) is a pure strategy NE of \(\mathcal {G}_{\mathbf {a}^{\ast }}\).
Proof
The potential function of \(\mathcal {G}_{\mathbf {a}^{'}}\) is defined as follows:
\[\phi(\mathbf{a}) = \sum_{l\in \mathcal{K}} \left(\Delta_{3,l} I_{l}(\mathbf{a}) + \Delta_{4,l}\right). \tag{10}\]
Then, (10) is rewritten as (11).
If an arbitrary player k unilaterally changes its strategy from \(a_{k}\) to \(\bar {a}_{k}\), we obtain the following equation based on (11):
The equation above shows that the change in any single player's utility function due to a unilateral strategy deviation results in exactly the same amount of change in the potential function. Therefore, according to Definition 2.2 in [24, 28], \(\mathcal {G}_{\mathbf {a}^{'}}\) is an exact potential game with potential function \(\phi(\mathbf{a})\). As a potential game, \(\mathcal {G}_{\mathbf {a}^{'}}\) has some desirable properties, one of which is that it has at least one NE point.
Although each player in \(\mathcal {G}_{\mathbf {a}^{'}}\) focuses on maximizing its own utility value, we characterize the achievable performance of the NE points of \(\mathcal {G}_{\mathbf {a}^{'}}\) by exploiting the inherent structure of the exact potential game.
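Denote \(\mathbf{a}^{\text{opt}}\) as an optimal channel allocation profile that maximizes the potential function \(\phi\), i.e.:
\[\mathbf{a}^{\text{opt}} \in \arg\max_{\mathbf{a}\in \mathcal{A}} \phi(\mathbf{a}). \tag{13}\]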
It has been proved that all NEs are maximizers of the potential function \(\phi\), either locally or globally, for any exact potential game [28]. The best equilibrium point is \(\mathbf{a}^{\text{opt}}\). Obviously, \(\mathbf{a}^{\text{opt}}\) is a near-optimal solution of (4) when \(\mathbf {a}^{'}=\mathbf {a}^{\ast }\).
Hence, Theorem 1 is proved. □
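The exact-potential property can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions: the utility used here (a player's own weighted interference plus the weighted interference it causes to co-channel SOs) is an assumed form chosen so that unilateral deviations change utility and potential by the same amount; the weights `w` are hypothetical stand-ins for the per-SO coefficients, and the constant offsets of the potential are omitted because they cancel in differences.

```python
import itertools
import random

def interference(k, a, p, h, bs, noise):
    """I_k(a): co-channel interference plus noise at the BS serving SO k."""
    return noise[k] + sum(
        p[l] * h[l][bs[k]] for l in range(len(a)) if l != k and a[l] == a[k]
    )

def potential(a, w, p, h, bs, noise):
    """phi(a) = sum_l w_l * I_l(a); w_l is a hypothetical per-SO weight."""
    return sum(w[l] * interference(l, a, p, h, bs, noise) for l in range(len(a)))

def utility(k, a, w, p, h, bs, noise):
    """Assumed utility: own weighted interference plus the weighted
    interference that SO k causes to every co-channel SO."""
    caused = sum(
        w[l] * p[k] * h[k][bs[l]] for l in range(len(a)) if l != k and a[l] == a[k]
    )
    return w[k] * interference(k, a, p, h, bs, noise) + caused

def max_potential_gap(K, N, w, p, h, bs, noise):
    """Largest |delta u_k - delta phi| over all unilateral deviations."""
    gap = 0.0
    for a in itertools.product(range(N), repeat=K):
        for k in range(K):
            for new in range(N):
                b = a[:k] + (new,) + a[k + 1:]
                du = utility(k, b, w, p, h, bs, noise) - utility(k, a, w, p, h, bs, noise)
                dphi = potential(b, w, p, h, bs, noise) - potential(a, w, p, h, bs, noise)
                gap = max(gap, abs(du - dphi))
    return gap

rng = random.Random(0)
K, N, B = 4, 2, 2                                  # SOs, channels, BSs (toy sizes)
p = [1.0] * K
h = [[rng.uniform(0.1, 1.0) for _ in range(B)] for _ in range(K)]
bs = [rng.randrange(B) for _ in range(K)]
noise = [0.1] * K
w = [-rng.uniform(0.5, 2.0) for _ in range(K)]     # negative: interference lowers utility
```

Calling `max_potential_gap(K, N, w, p, h, bs, noise)` enumerates every profile and every single-player deviation; a result at floating-point rounding level indicates that the toy game is an exact potential game even with asymmetric gains.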
5 Decentralized algorithm for achieving the best NE
According to the above theoretical analysis of \(\mathcal {G}_{\mathbf {a}^{'}}\), an approach for achieving the best NE of \(\mathcal {G}_{\mathbf {a}^{'}}\) is also an approach for obtaining a near-optimal solution of problem (4). In this section, we propose a distributed channel allocation learning algorithm to solve (4) in a distributed manner.
5.1 Algorithm description
Taking into account the above analysis, we give a detailed procedure for solving the channel allocation problem, labeled as Algorithm 1. Algorithm 1 consists of two nested loops, an inner loop and an outer loop, in which the variable \(\mathbf {a}^{'}\) is updated until the near-optimal solution is achieved in the inner loop. The two stages of the inner loop are as follows: (a) In step 1, one player is randomly selected from the set of updatable players to update its strategy. Then, the selected player chooses an action and gets feedback in the form of the resulting state and an associated reward. (b) In step 2, the selected player updates its alternative action selection based on (14). In Algorithm 1, the stopping criterion is that the change in the potential function becomes negligible.
The proposed algorithm is not easily trapped in an undesirable NE when the game has multiple NE points because of two favorable properties: (a) it is an uncoupled algorithm, namely, each player only needs to acquire the information of channel selection actions; (b) it can achieve the best NE, which is the global optimum of the potential function.
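The inner loop can be illustrated with a small stochastic learning sketch. This is not a reproduction of Algorithm 1: the update rule is assumed to be a Metropolis-style acceptance step, in which a trial channel is accepted with probability \((1+\lambda)^{\gamma u_{k}(\text{trial})}\) divided by the larger of the trial and current terms, and the toy utility (minus the number of co-channel SOs) replaces the paper's utility. The outer loop that updates \(\mathbf{a}^{'}\) is omitted, and all names and parameter values are hypothetical.

```python
import random

def collisions(k, a):
    """Number of other SOs sharing the channel of SO k."""
    return sum(1 for l in range(len(a)) if l != k and a[l] == a[k])

def utility(k, a):
    """Toy utility: fewer co-channel SOs is better."""
    return -collisions(k, a)

def potential(a):
    """Potential of the toy game: minus the number of co-channel pairs."""
    return -sum(collisions(k, a) for k in range(len(a))) / 2

def learn(K, N, gamma, lam, iterations, seed=0):
    """Run the assumed learning rule and return the final channel profile."""
    rng = random.Random(seed)
    a = [rng.randrange(N) for _ in range(K)]
    for _ in range(iterations):
        k = rng.randrange(K)            # step 1: one SO is selected at random
        trial = a.copy()
        trial[k] = rng.randrange(N)     # it explores a randomly chosen channel
        gain = utility(k, trial) - utility(k, a)
        # Step 2: always accept an improvement; accept a worse channel
        # with probability (1 + lam) ** (gamma * gain) < 1.
        if gain >= 0 or rng.random() < (1.0 + lam) ** (gamma * gain):
            a = trial
    return a

profile = learn(K=6, N=3, gamma=20.0, lam=0.5, iterations=2000)
```

Because worse channels are still accepted with a small probability, the process can leave a poor equilibrium, which is the intuition behind avoiding undesirable NE points; a larger `gamma` makes such exploratory moves rarer.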
5.2 Convergence and optimality analysis
In order to investigate the actual performance of Algorithm 1, Theorems 2 and 3 characterize its convergence and optimality.
Theorem 2
If all players perform the proposed distributed channel allocation learning algorithm with fixed \(\mathbf {a}^{'}\), the network converges to a unique stationary distribution of the players' strategy profile, which is given by:
Proof
Let z(t) be the state of channel allocations at the t-th iteration of Algorithm 1 with fixed \(\mathbf {a}^{'}\). Obviously, z(t) is an irreducible and aperiodic Markov process. Then, we verify that the process determined by the distribution (14) is reversible. That is, for all \(\mathbf {a},\bar {\mathbf {a}}\in \mathcal {A}\), we have:
where \(P(\bar {\mathbf {a}}\mid \mathbf {a})\) is the transition probability from state \(\mathbf{a}\) to \(\bar {\mathbf {a}}\).
When \(\mathbf {a}=\bar {\mathbf {a}}\), (15) clearly holds. When \(\mathbf {a}\neq \bar {\mathbf {a}}\), one player, say k, changes its working channel, so that exactly one element of the network state changes, i.e., \(\mathbf{a}=(a_{k},\mathbf{a}_{-k})\) and \(\bar {\mathbf {a}}=(\bar {a}_{k},\mathbf {a}_{-k})\). It is easy to check that:
where \(c=c_{1}c_{2}\), \(c_{1} = \frac {1} {|\mathcal {K}| \sum _{\tilde {\mathbf {a}} \in \mathcal {A}}(1+\lambda)^{\gamma \phi (\tilde {\mathbf {a}})}}\), and \(c_{2} = \frac {1}{\max \left \{ (1+\lambda)^{\gamma u_{k} (\mathbf {a})}, (1+\lambda)^{\gamma u_{k} \left (\bar {a}_{k},\mathbf {a}_{-k}\right)} \right \}}\).
According to the symmetry, we have:
By substituting (12) into (16), we can obtain:
Thus, we can derive that:
which is the balance equation of the Markov process.
Hence, Theorem 2 is proved. □
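The reversibility argument can be checked exhaustively on a tiny game. The sketch below assumes a Metropolis-style transition rule (uniform choice of player and trial channel, acceptance probability \(\min\{1,(1+\lambda)^{\gamma \Delta u_{k}}\}\)) and a candidate stationary distribution proportional to \((1+\lambda)^{\gamma \phi(\mathbf{a})}\), which is consistent with the normalizing constants in the proof but is an assumption about the missing formulas. The toy utility and potential are hypothetical.

```python
import itertools

def collisions(k, a):
    """Number of other SOs sharing the channel of SO k."""
    return sum(1 for l in range(len(a)) if l != k and a[l] == a[k])

def utility(k, a):
    return -collisions(k, a)

def potential(a):
    return -sum(collisions(k, a) for k in range(len(a))) / 2

def transition_matrix(K, N, gamma, lam):
    """Full transition matrix of the assumed learning rule."""
    states = list(itertools.product(range(N), repeat=K))
    index = {s: i for i, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        for k in range(K):
            for ch in range(N):
                t = s[:k] + (ch,) + s[k + 1:]
                gain = utility(k, t) - utility(k, s)
                accept = min(1.0, (1.0 + lam) ** (gamma * gain))
                prob = accept / (K * N)
                P[index[s]][index[t]] += prob
                # A rejected proposal leaves the state unchanged.
                P[index[s]][index[s]] += 1.0 / (K * N) - prob
    return states, P

def gibbs(states, gamma, lam):
    """Candidate stationary distribution, proportional to (1+lam)^(gamma*phi)."""
    weights = [(1.0 + lam) ** (gamma * potential(s)) for s in states]
    total = sum(weights)
    return [w / total for w in weights]

def detailed_balance_gap(pi, P):
    """Largest violation of pi(a) P(b|a) = pi(b) P(a|b)."""
    n = len(pi)
    return max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
               for i in range(n) for j in range(n))

states, P = transition_matrix(K=3, N=2, gamma=2.0, lam=0.5)
pi = gibbs(states, gamma=2.0, lam=0.5)
```

A `detailed_balance_gap(pi, P)` at rounding level confirms that, for this toy chain, the candidate distribution satisfies the balance equation and is therefore stationary.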
Theorem 3
If the variable \(\mathbf {a}^{'}\) is fixed, the inner loop of Algorithm 1 converges to the best NE point of \(\mathcal {G}_{\mathbf {a}^{'}}\) with an arbitrarily high probability when \(\gamma\) is sufficiently large. Therefore, the MOS level of the IoT network is approximately maximized when \(\mathbf {a}^{'}\) is the NE point.
Proof
It is noted from Theorem 1 that \(\mathbf{a}^{\text{opt}}\) is an optimal channel allocation profile that maximizes the potential function \(\phi\), and it is also the best NE of \(\mathcal {G}_{\mathbf {a}^{'}}\).
According to Theorem 2, the proposed algorithm converges to a unique stationary distribution. When \(\gamma\) is sufficiently large, \((1+\lambda)^{\gamma \phi \left (\mathbf {a}^{\text {opt}} \right)} \gg (1+\lambda)^{\gamma \phi (\mathbf {a})}, \forall \mathbf {a}\in \mathcal {A}\setminus \{\mathbf {a}^{\text{opt}}\}\). According to (14), the unique stationary distribution of the players' strategy profile is (0,…,0,1,0,…,0), where 1 denotes the probability of the optimal channel allocation solution and the probabilities of the other, non-optimal solutions are all 0. Thus:
which means that the proposed learning algorithm converges to the best NE of \(\mathcal {G}_{\mathbf {a}^{'}}\) with an arbitrarily high probability. When \(\mathbf {a}^{'}=\mathbf {a}^{\text {opt}}\), the NE obtained by the proposed algorithm is a near-optimal solution to problem (8). Then, the MOS level of the IoT network is approximately maximized when \(\mathbf {a}^{'}\) is the NE point.
Thus, the proof is completed. □
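The concentration effect used in this proof can be seen numerically. The sketch below assumes a stationary distribution proportional to \((1+\lambda)^{\gamma \phi(\mathbf{a})}\) and evaluates the probability mass on the maximizer for increasing \(\gamma\); the potential values and the value of `lam` are purely illustrative, and the list has a unique maximum by construction.

```python
import math

def gibbs_distribution(potentials, gamma, lam):
    """pi(a) proportional to (1 + lam) ** (gamma * phi(a)), computed stably."""
    logits = [gamma * math.log1p(lam) * phi for phi in potentials]
    top = max(logits)                       # subtract the max to avoid overflow
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

potentials = [0.0, 1.0, 2.5, 3.0]           # illustrative phi values, best at index 3
mass_on_best = [
    gibbs_distribution(potentials, gamma, lam=0.5)[3] for gamma in (1, 10, 100)
]
```

The entries of `mass_on_best` increase with `gamma` and approach 1. If several profiles tied for the maximum potential, the mass would instead be shared equally among them.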
6 Simulation results and discussion
In this section, numerical simulations are performed in MATLAB to validate the efficiency and performance of our proposed algorithm for solving the channel allocation problem of IoT uplink communications over cellular networks.
6.1 Scenario setup
We consider one MBS with a hexagonal coverage area in which 2 SCBSs are randomly deployed. For convenience, we assume that 3 SOs are randomly located in each SCBS and another 10 SOs are located in the coverage area of the MBS. Each SO has the same uplink transmission power, which is set to 23 dBm. The total 5 MHz bandwidth of this heterogeneous network is divided into N=10 channels, each with the same bandwidth of 487.5 kHz. Each SO chooses 1 channel for transmission. The Rayleigh fading model is considered in the simulation, and \(h_{b_{c}}\) is the link gain from SO d to BS \(b_{c}\), which is expressed as \(h_{b_{c}}=\xi _{b_{c}}\left (L_{b_{c}}\right)^{-\theta }\), where \(L_{b_{c}}\) is the distance between SO d and BS \(b_{c}\), \(\xi _{b_{c}}\) denotes the channel fading component, and \(\theta\) is the path loss exponent. The noise power spectral density is set to \(\sigma^{2}=-174\) dBm/Hz. The simulation results are obtained from 400 independent trials, and the parameters involved are optimized by experiments.
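The unit conversions implied by these parameters are easy to get wrong, so a short worked calculation may help. The sketch below converts the 23 dBm transmit power to watts and integrates the -174 dBm/Hz noise density over one 487.5 kHz channel; the helper names are hypothetical, and the computation assumes a flat noise density with no receiver noise figure.

```python
import math

def dbm_to_watt(dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (dbm / 10.0) / 1000.0

def noise_power_dbm(psd_dbm_per_hz, bandwidth_hz):
    """Noise power over a band, given a flat noise density in dBm/Hz."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

tx_power_w = dbm_to_watt(23.0)                   # about 0.2 W
noise_dbm = noise_power_dbm(-174.0, 487.5e3)     # about -117.1 dBm per channel
noise_w = dbm_to_watt(noise_dbm)
occupied_hz = 10 * 487.5e3                       # 4.875 MHz of the 5 MHz band
```

The ten channels occupy 4.875 MHz, so 125 kHz of the 5 MHz band is not assigned to any channel in this setup.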
6.2 Convergence behavior and optimality of this algorithm
In this subsection, we first compare the convergence behavior of Algorithm 1 with that of best response dynamic (BRD). Figure 3 shows that Algorithm 1 and BRD each converge to a stable point as the number of iterations increases. Compared with BRD, Algorithm 1 converges faster and achieves a better solution. This is consistent with the proven fact that BRD can only converge to some NE of the potential game, which may not be the best NE. In contrast, our algorithm can find the best NE, which is also the optimal solution that maximizes the potential function. Therefore, the proposed algorithm is distributed and obtains better convergence performance. Figure 4 plots the optimization objectives of problems P and \(\tilde {P}\) versus the number of iterations when Algorithm 1 is performed with fixed \(\mathbf {a}^{'}\). Figure 4 shows that the sum-MOS in problem P gradually increases and eventually converges as the number of iterations grows, which is consistent with the trend of the sum-\(\widetilde {\text {MOS}}\) in problem \(\tilde {P}\). This indicates that increasing the sum-\(\widetilde {\text {MOS}}\) by selecting a better channel allocation strategy profile also improves the sum-MOS. Although the sum-\(\widetilde {\text {MOS}}\) continues to increase in the later iterations, the value of the sum-MOS no longer changes. Since MOS is a piecewise function, a better solution for \(\tilde {P}\) cannot further enhance the performance of P, which implies that multiple optimal solutions exist.
In the following, we evaluate the MOS performance of each SO when performing Algorithm 1. Figure 5 shows that Algorithm 1 maintains better fairness among SOs with respect to MOS performance by taking into account the impact of the interference generated by each SO on the entire network. Algorithm 1 is proposed to find the best NE of the channel access game \(\mathcal {G}_{\mathbf {a}^{'}}\), which achieves an approximately equal utility value for each SO, as shown in Fig. 5. Moreover, the fairness among SOs with respect to MOS or \(\widetilde {\text {MOS}}\) performance is guaranteed. Figure 6 plots the sum-MOS versus the number of iterations when performing Algorithm 1. It shows that Algorithm 1 can improve the QoE of SOs. However, our proposed algorithm cannot guarantee convergence to the global optimal solution of \(\tilde {P}\), since the potential function in \(\mathcal {G}_{\mathbf {a}^{'}}\) is different from the optimization objective in \(\tilde {P}\), where \(\widetilde {\text {MOS}}\) is a piecewise function. The best NE \(\mathbf{a}^{\ast}\) of \(\mathcal {G}_{\mathbf {a}^{\ast }}\), i.e., the maximum of the potential function, is obtained by performing Algorithm 1 and is only a near-optimum of P. Figure 6 shows that the sum-MOS in P eventually converges to a fixed point as the number of iterations increases and is close to the maximum value of the sum-MOS. This indicates that our approach provides high performance for solving this difficult non-convex optimization problem.
7 Conclusion
In this paper, we investigated the channel allocation problem in 5G-enabled IoT by using a game-theoretic learning algorithm to improve the QoE of SOs. To measure the QoE of SOs in IoT, we first defined a MOS function. Then, we formulated the optimization problem as an exact potential game, in which the potential function was designed by approximately converting the objective function into a tractable form. We proved that the exact potential game has a best NE, which is a near-optimal solution of the channel allocation problem. For the proposed game, we designed a distributed learning algorithm and proved that it converges to the best NE with arbitrarily high probability.
Abbreviations
BRD: Best response dynamic
IoT: Internet of things
MBS: Macrocell base station
MOS: Mean opinion score
NE: Nash equilibrium
QoE: Quality of experience
SCBSs: Small cell base stations
SINR: Signal-to-interference-plus-noise ratio
SOs: Smart objects
References
M. A. M. Albreem, A. A. ElSaleh, M. Isa, W. Salah, M. Jusoh, M. M. Azizan, A. Ali, in 2017 IEEE 4th International Conference on Smart Instrumentation, Measurement and Application (ICSIMA). Green Internet of things (IoT): an overview, (2017), pp. 1â€“6.
W. Tan, M. Matthaiou, S. Jin, X. Li, Spectral efficiency of dftbased processing hybrid architectures in massive mimo. IEEE Wirel. Commun. Lett.6(5), 586â€“589 (2017).
A. Musaddiq, Y. B. Zikria, O. Hahm, H. Yu, A. K. Bashir, S. W. Kim, A survey on resource management in IoT operating systems. IEEE Access. 6:, 8459â€“8482 (2018).
W. Tan, S. Jin, C. K. Wen, T. Jiang, Spectral efficiency of multiuser millimeter wave systems under single path with uniform rectangular arrays. EURASIP Wirel, J., Commun. Netw.2017(1), 181 (2017).
X. Sun, N. Ansari, Dynamic resource caching in the IoT application layer for smart cities. IEEE Internet Things J.5(2), 606â€“613 (2018).
Z. Qin, J. A. McCann, in 2017 IEEE Global Communications Conference. Resource efficiency in lowpower widearea networks for IoT applications, (2017), pp. 1â€“7.
B. Khalfi, B. Hamdaoui, M. Guizani, Extracting and exploiting inherent sparsity for efficient IoT support in 5G: Challenges and potential solutions. IEEE Wirel. Commun.24(5), 68â€“73 (2017).
C. Li, Y. Li, K. Song, L. Yang, Energy efficient design for multiuser downlink energy and uplink information transfer in 5G. Sci. China Inf. Sci.59(2), 1â€“8 (2016).
J. Yuan, S. Jin, W. Xu, W. Tan, M. Matthaiou, K. K. Wong, Usercentric networking for dense CRANs: highSNR capacity analysis and antenna selection. IEEE Trans. Commun.65(11), 5067â€“5080 (2017).
M. Zhang, W. Tan, J. Gao, S. Jin, Spectral efficiency and power allocation for mixedADC massive MIMO system. China Commun.15(3), 112â€“127 (2018).
W. Tan, D. Xie, J. Xia, W. Tan, L. Fan, S. Jin, Spectral and energy efficiency of massive MIMO for hybrid architectures based on phase shifters. IEEE Access. 6:, 11751â€“11759 (2018).
H. Ji, S. Park, J. Yeo, Introduction to ultra reliable and low latency communications in 5G (2017). arXiv preprint arXiv:1704.05565.
C. Li, K. Song, D. Wang, F. Zheng, L. Yang, Optimal remote radio head selection for cloud radio access networks. Sci. China Inf. Sci.59(10), 102315:1â€“102315:12 (2016).
S. Li, Q. Ni, Y. Sun, G. Min, S. AlRubaye, Energyefficient resource allocation for industrial cyberphysical IoT systems in 5G era. IEEE Trans. Ind. Inform.4(6), 2618â€“ 28 (2018).
C. Li, S. Zhang, P. Liu, F. Sun, J. M. Cioffi, L. Yang, Overhearing protocol design exploiting intercell interference in cooperative green networks. IEEE Trans. Veh. Technol. 65(1), 441–446 (2016).
H. Dai, Y. Huang, J. Wang, L. Yang, Resource optimization in heterogeneous cloud radio access networks. IEEE Commun. Lett. 22(3), 494–497 (2018).
A. K. Yerrapragada, B. Kelley, in 2017 12th System of Systems Engineering Conference (SoSE). An IoT self organizing network for 5G dense network interference alignment, (2017), pp. 1–6.
N. N. Dao, M. Park, J. Kim, Resource-aware relay selection for intercell interference avoidance in 5G heterogeneous network for internet of things systems. Futur. Gener. Comput. Syst. (2018).
W. Ejaz, M. Ibnkahla, Multiband spectrum sensing and resource allocation for IoT in cognitive 5G networks. IEEE Internet Things J. 5(1), 150–163 (2018).
A. Aminjavaheri, A. RezazadehReyhani, R. Khalona, Underlay control signaling for ultra-reliable low-latency IoT communications (2018). arXiv preprint arXiv:1711.02248.
X. He, K. Wang, H. Huang, T. Miyazaki, Y. Wang, S. Guo, Green resource allocation based on deep reinforcement learning in content-centric IoT. IEEE Trans. Emerg. Top. Comput. 9(3), 1–15 (2018).
M. Aazam, M. St-Hilaire, C. H. Lung, I. Lambadaris, in 2016 23rd International Conference on Telecommunications (ICT). MeFoRE: QoE based resource estimation at Fog to enhance QoS in IoT, (2016), pp. 1–5.
J. Huang, Y. Yin, Q. Duan, H. Yan, in 2015 3rd International Conference on Future Internet of Things and Cloud. A game-theoretic analysis on context-aware resource allocation for device-to-device communications in cloud-centric internet of things, (2015), pp. 80–86.
H. Dai, Y. Huang, R. Zhao, J. Wang, L. Yang, Resource optimization for device-to-device and small cell uplink communications underlaying cellular networks. IEEE Trans. Veh. Technol. 67(2), 1187–1201 (2018).
H. Zhang, Y. Xiao, S. Bu, D. Niyato, F. R. Yu, Z. Han, Computing resource allocation in three-tier IoT Fog networks: a joint optimization approach combining stackelberg game and matching. IEEE Internet Things J. 4(5), 1204–1215 (2017).
S. F. Abedin, M. G. R. Alam, N. H. Tran, C. S. Hong, in 2015 17th Asia-Pacific Network Operations and Management Symposium (APNOMS). A Fog based system model for cooperative IoT node pairing using matching theory, (2015), pp. 309–314.
A. Rullo, D. Midi, E. Serra, E. Bertino, in 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS). Strategic security resource allocation for internet of things, (2016), pp. 737–738.
H. Dai, Y. Huang, C. Li, S. Li, L. Yang, Energy-efficient resource allocation for device-to-device communication with WPT. IET Commun. 11(3), 326–334 (2017).
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61801243), by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 18KJB510026), and by the Foundation of Nanjing University of Posts and Telecommunications (Grant No. NY218124).
Availability of data and materials
The material for this paper was drawn mainly from the journals listed in the references. The proposed scheme was simulated in MATLAB.
Author information
Contributions
All authors contributed significantly to the research work presented in this paper. HD had a leading role in the formulation and solution of the considered optimization problem, and performed a detailed evaluation and analysis of the developed algorithm through an extensive set of simulations. HD and HZ completed the writing and formatting of the paper. WW did the experiments and simulations. BW helped in finalizing the solution and amending the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Dai, H., Zhang, H., Wu, W. et al. A game-theoretic learning approach to QoE-driven resource allocation scheme in 5G-enabled IoT. J Wireless Com Network 2019, 55 (2019). https://doi.org/10.1186/s13638-019-1359-7
DOI: https://doi.org/10.1186/s13638-019-1359-7