
Behavioral learning game for socio-physical IoT connections


Game theory is an innovative approach to understanding human behavior, from economics to political science. However, due to the bounded rationality of game players, game theory alone cannot fully explain human behavior and should complement other key concepts championed by the behavioral disciplines. This paper provides a foundation for the decision-making process from the viewpoint of game theory and behavioral science. First, we develop a new behavioral learning game model to examine how bounded rationality is exhibited in the game player’s cognitive capabilities. Second, we apply the developed game model to operate the socio-physical Internet of Things (IoT) system. Finally, we study how to negotiate effectively between players who, though interested in their own welfare, are also willing to consider other players in the IoT system. The main contribution of our work is that it sheds new light on the interplay between the game player’s selfishness and the public interest. We believe that our approach will open a new door to exploring the impact of social behavior on networking.

1 Introduction

The rapid development of Internet of Things (IoT) technology makes it possible to connect various smart objects through the Internet and provides more data interoperability methods for application purposes. Typically, IoT is expected to offer advanced connectivity of devices, systems, and services that goes beyond machine-to-machine communications and covers a variety of protocols, domains, and applications. The interconnection of these smart devices is expected to usher in automation in nearly all fields while reducing the need for human intervention. The future vision of IoT has evolved due to a convergence of multiple technologies, ranging from wireless communication to micro-electromechanical systems. This means that all traditional technologies contribute to enabling the IoT [1–3].

Information cascades over IoT systems can deeply influence the patterns of social behavior. Indeed, we have become increasingly aware of the fundamental role of the coupled socio-physical network as a medium for the spread of information. The socio-physical approach jointly considers the interaction and integration of the social and physical views of IoT systems and yields meaningful qualitative and quantitative differences when compared with approaches that focus on the social and physical views in isolation. During IoT system operations, this collaborative approach provides services to both people and technical systems while realizing the vision of future pervasive computing environments [4, 5].

In IoT systems, individual devices locally make control decisions to maximize their profits. This situation can be seen as a game theory problem. Game theory models a decision-making process between independent players as they attempt to reach a joint decision that is acceptable to all participants. The standard application of game theory requires each player to form a utility function that quantifies the benefit that accrues to it as a consequence of the actions that it and all other players may take. In traditional game theory, a solution concept is a rule that defines what it means for a decision vector to be acceptable to all players in light of the conflict/cooperation environment [6].

Usually, game theory implicitly assumes that players have complete information about the game situation throughout the time period. Therefore, many game models assume that players are perfectly rational and have sufficient ability to act according to their preferences. However, this assumption is not satisfied in real-world environments; experiments have shown that players do not always act intelligently [7]. In addition, traditional game methodologies attempt to optimize each individual player’s profit. However, the players’ group interest is generally not optimized, and perhaps not even well served, if each individual player optimizes his own behavior. This is because optimization is an individual activity based on the doctrine that each individual is committed to maximizing its own satisfaction without concern for the welfare of others [8].

To overcome the limitations of traditional game theory, behavioral science offers a new research approach. It is based on the idea that, in decision-making, the rationality of individuals is limited by the information they have, the cognitive limitations of their minds, and the finite amount of time they have to make a decision. In real-world situations, decision-makers lack the ability and resources to arrive at the optimal solution. Therefore, decision-makers seek a satisfactory solution rather than the optimal one. This is an alternative to traditional optimal decision-making, which views decision-making as a fully rational process of finding an optimal choice given the information available. As a way to understand complex social behavior, behavioral science has become very popular and is of growing interest in the social sciences [9].

In this work, we develop a new behavioral learning game model and also propose a novel solution concept. To develop the new game model, we take into consideration the bounded rationality of decision-makers in our solution searching process. In addition, we draw on the concept of trade-offs between selfishness and social welfare. To strike an appropriate performance balance between contradictory requirements, the proposed game model employs a learning perspective and investigates some of the reasons and probable lines for justifying players’ behaviors. Therefore, the key feature of our model is its real-world practicality. In contrast, traditional game assumptions of perfectly optimal decisions are often not feasible in practice.

The newly developed behavioral learning game builds on the idea that players adopt reasonable strategies that may lead to suboptimal decisions. Players therefore map the decision strategies they use in order to increase the effectiveness of the decision-making process. During the behavioral learning game operations, players iteratively negotiate with each other and adaptively modify their strategy selections in an attempt to reach a mutually acceptable decision vector. As a new solution concept of the behavioral learning game, the observed agreement with behavioral development is introduced. Such an agreement in decision-making, if reached, is called a behavioral learning equilibrium. During game operations, game players may not realize how their learning experience would have been different if they had chosen to behave differently. Therefore, they may adjust their behavior when experience contradicts their beliefs. The novel solution concept of behavioral learning equilibrium presents a dynamic learning interpretation to justify their behaviors.

Recently, IoT technology has been widely explored in many fields. IoT is the interconnection of uniquely identifiable network agents or smart objects within the existing Internet infrastructure. Rapid IoT development makes it possible to connect various smart objects through device-to-device communications and provides more data interoperability methods for application purposes [10]. However, there is much room left to exploit diverse social interactive relationships among agents for networking optimization, and it is of great interest to explore the continuum space [11–13]. Motivated by the above discussion, in this paper we design a new IoT communication control scheme based on the behavioral learning game model. Considering the socio-physical relationship, we take into account both the agents’ social relationships and their physical coupling.

In the proposed scheme, a key observation is that network agents are coupled not only in the physical domain due to the physical relationship but also in the social domain due to the social ties among agents. Therefore, under socially connected dynamic IoT environments, we can formulate the agents’ payoff by the combination of a personal utility and social group utility functions and make agents select their strategies by considering the interaction of selfishness and social welfare tradeoffs. Based on the autonomous agent behaviors, the proposed algorithm is implemented as a behavioral learning game to approximate the behavioral learning equilibrium status.

The rest of this paper is organized as follows. In Section 2, we review the related work. In Section 3, we familiarize the reader with the basics of behavioral learning game model and define the solution concept of behavioral learning equilibrium. In Section 4, we explain in detail the developed IoT system management algorithm based on the behavioral learning game model. We present the experimental results in Section 5 and compare the performance to other existing schemes [5, 11]. Finally, we give our conclusion and future work in Section 6.

2 Related work

Over the years, a lot of state-of-the-art research work on the IoT system operation has been conducted. The Utility-Based Socio-Physical Interactions (USPI) scheme [5] is a framework for addressing the interplay between online social networks and communications by exploiting principles from the theory of utility-based engineering and elements from social network analysis. To improve the IoT system performance, the USPI scheme aims at a holistic design model to allow the joint development of improved resource management mechanisms. This scheme provides a promising direction for improving the performance and dealing with various problems emerging in both the social and communication parts of networks, thus strengthening the emerging paradigm shifts in the design of future IoT systems and applications [5].

The Social Group Utility Maximization (SGUM) scheme [11] is developed for cooperative networking that takes into account both social relationships and physical coupling among network users. Instead of maximizing its individual utility or the overall network utility, each user aims to maximize its social group utility, which hinges heavily on its social ties with other users in IoT systems. The SGUM scheme provides rich modeling flexibility and spans the continuum space between non-cooperative game and network utility maximization [11].

All the earlier work has attracted a lot of attention and introduced unique challenges in efficiently handling the IoT communication control problem. However, these existing schemes are one-sided protocols and cannot adaptively respond to the current IoT system conditions. Therefore, they do not provide suitable solutions under different practical constraints. Compared to these schemes [5, 11], the proposed scheme attains better performance during IoT system operations.

3 Behavioral learning game and behavioral learning equilibrium

This paper presents a behavioral learning game based on a mathematically precise notion of negotiation between selfishness and social welfare. Therefore, our game model is fundamentally different from, and not an approximation to, individual optimization. A key attribute of our approach is that we naturally accommodate sophisticated social behaviors under practical assumptions and approximate the ideal system status.

3.1 Behavioral learning game model

To model strategic interactive situations involving learning process, we develop a new behavioral learning game. In the developed game model, players seek to choose their strategy practically. Based on their social relationship, different players can receive different payoffs. We make the following definition.

Definition 1. A behavioral learning game model constitutes a 5-tuple \( \mathbb{G} = \left(N,{S}_{k,1\le k\le n},{U}_{k,1\le k\le n},{\boldsymbol{\varphi}}_{k,1\le k\le n},{\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}\right) \), where

  (i) N is the set of game players, \( \boldsymbol{N}=\left\{{p}_1,\dots, {p}_n\right\} \).

  (ii) \( {\boldsymbol{S}}_k=\left\{{s}_1^k,{s}_2^k,\dots, {s}_m^k\right\} \) is the non-empty finite set of all pure strategies of player k, where m is the number of possible strategies.

  (iii) \( {\boldsymbol{U}}_k=\left\{{u}_1^k,{u}_2^k,\dots, {u}_m^k\right\} \) is the utility set of all payoffs of player k’s strategies; it is defined as the satisfaction level of the player. \( \boldsymbol{U}:{\boldsymbol{S}}_1\times {\boldsymbol{S}}_2\times \cdots \times {\boldsymbol{S}}_n\ \to\ \mathfrak{N} \) is the utility function, where \( \mathfrak{N} \) represents the set of real numbers.

  (iv) \( {\boldsymbol{\varphi}}_k=\left\{{\varphi}_{k1},\dots, {\varphi}_{kn}\right\} \) is the set of player k’s social tie strengths with the other players. We assume that each player’s social tie strength to itself is normalized as 1, i.e., \( {\varphi}_{kk}=1 \) and \( 0\le {\varphi}_{kj}\le 1 \).

  (v) \( {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}=\left\{{\varepsilon}_1,\dots, {\varepsilon}_n\right\} \) is the set of each player’s cooperative degree for the total system performance. It is an indicator of non-cooperative actions for each player.
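As a rough illustration of Definition 1, the 5-tuple can be held in a small data structure. The sketch below is only one possible encoding: the class name, the field names, and the `validate` helper are hypothetical and are not part of the original model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BehavioralLearningGame:
    """Illustrative container for the 5-tuple of Definition 1 (names are assumptions)."""
    players: List[str]              # N: player identifiers
    strategies: List[List[float]]   # S_k: finite pure-strategy set per player
    utilities: List[List[float]]    # U_k: one payoff per strategy of player k
    social_ties: List[List[float]]  # phi[k][j]: tie strength from player k to player j
    coop_degrees: List[float]       # Lambda^eps: cooperative degree per player

    def validate(self) -> None:
        """Raise ValueError if the tuple violates the constraints stated in Definition 1."""
        n = len(self.players)
        if not (len(self.strategies) == len(self.utilities)
                == len(self.social_ties) == len(self.coop_degrees) == n):
            raise ValueError("every component needs one entry per player")
        for k in range(n):
            if not self.strategies[k]:
                raise ValueError("strategy sets must be non-empty")
            if len(self.utilities[k]) != len(self.strategies[k]):
                raise ValueError("one payoff per strategy is required")
            if len(self.social_ties[k]) != n:
                raise ValueError("one tie strength per player is required")
            if any(not 0.0 <= t <= 1.0 for t in self.social_ties[k]):
                raise ValueError("tie strengths must lie in [0, 1]")
            if self.social_ties[k][k] != 1.0:
                raise ValueError("the self-tie is normalized to 1")
```

Calling `validate()` on an instance checks only the structural constraints (non-empty finite strategy sets, tie strengths in [0, 1], self-tie equal to 1); it says nothing about how payoffs are computed.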

The behavioral learning game is built on the basic concept of “good enough” decisions. Game players search for the optimal solution but terminate the search when an option is deemed good enough, meaning that the outcome of the selected strategy meets or exceeds the player’s individual aspiration level. We therefore replace profit maximization, the hallmark of selfish rationality, with a concept of adequacy as the realistic approach. In real-world situations, this approach is a practical decision-making mechanism under informational and computational constraints.

Furthermore, the behavioral learning game accounts simultaneously for cooperative and non-cooperative interests during the multi-agent decision-making process. According to social utility theory, game players can be assumed to have two personas [8]. The non-cooperative persona views the strategies exclusively in terms of maximizing his payoff, while the cooperative persona views the strategies exclusively by considering the social relationship and public interests. Based on these two personas, our behavioral learning game formulates two important aspects of sophisticated behavior, namely, selfishness and social cooperation. This is a more natural approach to expressing sophisticated behaviors and realistically defining utility functions. Therefore, the behavioral learning game model may be synthesized according to a systematic concept of bounded rational behavior that involves social utilities, that is, utilities that account for the interests of others as well as of the self.

An example of a behavioral learning game is an electricity market model. The electricity market faces many uncertainties, which come from the behavior of power suppliers with different levels of rationality. Usually, the market power released depends on a supplier’s level of rationality, which is related not only to his information processing manner but also to his utility function. The objective of the power demander is to minimize the purchasing cost while facing the diversity of the market. It is hard to claim that a traditional game model with hyper-rationality can predict the power market correctly. Therefore, it is more reasonable to take the behavioral game approach as a solution for the electricity market model with variable levels of rationality [14].

3.2 Behavioral learning equilibrium

In non-cooperative game models, Nash equilibrium is a traditional solution concept. However, the main weak point of Nash equilibrium is its unrealistic assumptions. In the Nash equilibrium scenario, the players are assumed to be perfectly rational, which requires complete information and a well-defined, static situation. In reality, this assumption rarely holds [15]. When a player faces unknown players and does not observe the individual players’ preferences, it is generally impossible to reach the Nash equilibrium.

Since the development of the Nash equilibrium concept, game theorists have proposed many related solution concepts, such as Pareto equilibrium, subgame perfect Nash equilibrium, Bayesian-Nash equilibrium, ε-equilibrium, correlated equilibrium, and Wardrop equilibrium [15]. These solutions refine the Nash equilibrium to overcome perceived flaws in the Nash concept. However, subsequent refinements and extensions of the Nash equilibrium share the main insight of Nash’s concept: all equilibrium concepts analyze what choices will be made when each player takes into account the decision-making of others. Therefore, all the various equilibria fundamentally have the same limitations: (i) there can be multiple equilibria in a game, and in many games there is no guarantee of the uniqueness of the equilibrium; (ii) game players are not perfectly rational in many circumstances, so the perfect rationality assumption is not applicable in real-world situations; (iii) equilibrium concepts have mostly been developed in a static setting, so the traditional approach cannot capture how players adapt their strategies and reach equilibrium over time; and (iv) equilibrium concepts do not take computational costs into account and can require huge computational overheads [15, 16].

In this work, we introduce a new solution concept, called Behavioral Learning Equilibrium (BLE). It is the result of a learning process over repeated play, instead of rational thinking on the player’s ability to find a static equilibrium point. Therefore, players try to maximize their satisfaction level through a repetitive learning process. In addition, the BLE solution includes the concept of mutual cooperation. If all players are satisfied while maintaining a social welfare, the BLE can be obtained. The BLE is formally defined as follows.

Definition 2. BLE is a set of strategies that can be obtained by repeating a symmetric game with feedback. When a set of strategies has been chosen by all players and all the cooperative degrees of the players are higher than a pre-defined minimum bound (Γ), this set of strategies and the corresponding payoffs constitute the BLE. This is formally formulated as

$$ \begin{array}{c}\hfill \boldsymbol{U}:{\boldsymbol{U}}_1\times {\boldsymbol{U}}_2\cdots \times {\boldsymbol{U}}_n,\mathrm{s}.\mathrm{t}.,\left\{{u}^i\left({s}^i,\ {\varepsilon}_i\right)\in {\boldsymbol{U}}_i,{s}^i\in {\boldsymbol{S}}_i\ \mathrm{and}\ {\varepsilon}_i\in {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}\Big|\underset{i,1\le i\le n}{\mathbf{min}}\ \left\{{\varepsilon}_i\Big|{\varepsilon}_i\in {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}\right\}>\varGamma \right\}\hfill \\ {}\hfill \mathrm{s}.\mathrm{t}.,\varGamma < 1\ \mathrm{and}\kern0.37em i\in \boldsymbol{N}\hfill \end{array} $$

where \( {\boldsymbol{U}}_i \) is the utility set of all consequent payoffs of player i’s strategies, \( {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}} \) is the set of each player’s cooperative degree, and n is the number of game players.

BLE is the state where all players’ current cooperative degrees are above the pre-defined minimum bound (Γ). Therefore, the BLE is a strategy profile that approximately satisfies the condition of perfect equilibrium (Γ = 1). In the BLE, players have no incentives to deviate given their beliefs about the consequences of deviating. These beliefs are consistent with the information obtained from the actual equilibrium play of all players.
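The stopping condition in Definition 2 reduces to a single comparison on the smallest cooperative degree. A minimal sketch follows; the function name `is_ble` is a hypothetical label, and the strict inequality mirrors the condition min ε_i > Γ with Γ < 1 as stated above.

```python
from typing import Sequence


def is_ble(coop_degrees: Sequence[float], gamma: float) -> bool:
    """Return True when every cooperative degree strictly exceeds the bound gamma."""
    if gamma >= 1:
        raise ValueError("Definition 2 requires gamma < 1")
    if not coop_degrees:
        raise ValueError("at least one player is required")
    # The minimum over all players must be strictly greater than the bound.
    return min(coop_degrees) > gamma
```

For instance, `is_ble([0.2, 0.3], 0.1)` holds, whereas a profile in which one player sits exactly at the bound does not qualify because the inequality is strict.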

In this work, we do not focus on trying to get an optimal solution based on the traditional approach, but instead, an adaptive online interactive model is proposed. This approach can dramatically reduce the computational complexity and overheads. Usually, the traditional optimal solutions need exponential time complexity. However, the proposed solution concept only needs polynomial time complexity. Even though the BLE solution does not guarantee the performance optimization, our BLE concept can make this equilibrium possible in real-world operations.

4 Socio-physical IoT management algorithm

Basically, the IoT architecture can be considered a ubiquitous ID architecture enhanced with concrete network mechanisms using lightweight protocols for resource-oriented applications. Usually, the IoT system is made up of a number of PANs (personal area networks), which comprise parts of the IPv6 network. In IoT systems, there are different kinds of devices (e.g., IoT agents), such as healthcare devices, home security devices, life-supporting machines, and sensors. The information of the IoT agents is considered essential and should be distributed rapidly. Based on this information, IoT agents know how they can work together in cooperation [17].

In this paper, we design a new IoT management algorithm based on the behavioral learning game. The main goal of our scheme is to ensure a relevant tradeoff between optimal and good-enough decisions. To adapt practically to the IoT system situation, IoT agents are treated as game players. Each player has a set of different packet transmission probabilities, which are its strategies. Individual players do not statically decide the best strategy. Instead, any possible strategy can be chosen based on a player’s preference. To induce selfish players to participate in cooperative behaviors, a service cost is imposed on the players. Using the dynamics of a feedback-based repeated process, the coordinating entity (i.e., the system operator) periodically monitors the current IoT system situation and adaptively intervenes so that the system converges to the BLE. Based only on local information and an interactive learning technique, the players act independently and adjust their strategies. This leads to an entirely distributed implementation of the proposed algorithm.

Even though cooperative behavior may be desired of multi-player game models, it would be rare indeed for the interests of all players to be perfectly aligned. To maximize social welfare, game players could have the capability to consider other players and compromise in all situations. Therefore, players must be able to negotiate effectively and maintain their own interests while yielding some considerations to others in the decision-making mechanism. Such behavior requires each individual to seek a social balance between its individual interests and the interests of others [8].

4.1 Data transmission algorithm in IoT systems

To implement the proposed scheme, IoT agents are grouped into a cluster with a contention-based medium access communication mechanism. If there are n communication agents, they are the game players; player k (1 ≤ k ≤ n) contends for the opportunity of data transmission in a time slot with probability \( {q}^k \) (i.e., \( {q}^k\in {\boldsymbol{S}}_k \)), where \( 0\le {q}^k\le 1 \). If a persistence mechanism is implemented, the data transmission probability is just the persistence probability. If multiple players contend in the same time slot, a collision occurs and no player gets the transmission opportunity. To define the utility function for each player, we employ three different components. From the viewpoint of self-interest, the individual utility function of each player k (\( {\Theta}^k\left(\cdot \right) \)) is defined as follows.

$$ \begin{array}{c}\hfill {\Theta}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)= \log \left({\varrho}_k\times {q}^k\times {\displaystyle \prod_{\begin{array}{c}\hfill l=1,\dots, n\hfill \\ {}\hfill l\ne k\hfill \end{array}}}\left(1-{q}^l\right)\right)-\mathcal{C}\left(k,{q}^k\right),\kern0.5em \hfill \\ {}\hfill \mathrm{s}.\mathrm{t}.,{\mathbf{q}}^{-\boldsymbol{k}}=\left({q}^1,\dots, {q}^{k-1},{q}^{k+1},\dots {q}^N\right)\hfill \end{array} $$

where \( {\varrho}_k \) represents player k’s action willingness (e.g., player k’s efficiency in utilizing the transmission opportunity). The logarithmic function is widely used in the literature for modeling the utility of network users. \( \mathcal{C}\left(k,{q}^k\right) \) is the cost function for player k with probability \( {q}^k \). Thus, the individual payoff function (\( {\Theta}^k\left(\cdot \right) \)) has a simple interpretation: the net utility is the gain from channel access reduced by the network cost.
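The individual utility above can be evaluated directly once the cost is known. The following sketch separates the logarithmic access gain from the cost term; the function names and the list-based representation of q and ϱ are assumptions made for illustration, and the cost is passed in as a plain number rather than computed.

```python
import math
from typing import Sequence


def access_gain(k: int, q: Sequence[float], rho: Sequence[float]) -> float:
    """log(rho_k * q^k * prod_{l != k} (1 - q^l)): log of the weighted success probability.

    math.log raises ValueError if the argument is 0, e.g. when q[k] == 0 or some
    other player transmits with probability 1.
    """
    success = q[k]
    for l, q_l in enumerate(q):
        if l != k:
            success *= 1.0 - q_l  # every other player must stay silent in the slot
    return math.log(rho[k] * success)


def individual_utility(k: int, q: Sequence[float], rho: Sequence[float], cost: float) -> float:
    """Theta^k = access gain minus the cost C(k, q^k) supplied by the caller."""
    return access_gain(k, q, rho) - cost
```

With two players that both transmit with probability 0.5 and ϱ = 1, the success probability is 0.25, so the access gain is log 0.25.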

In this work, the system operator monitors and coordinates explicitly by adjusting the cost. During game operations, the system operator periodically observes the players’ behaviors and the cooperative degree of each player. Every time period (unit_time), the system operator dynamically adjusts the cost based on this measured information. Therefore, the system operator can induce players to cooperate with each other as well as to take appropriate actions. In the proposed scheme, the cost is obtained based on the concepts of fairness and cooperative degree. To characterize this fairness notion, we follow Jain’s fairness index (\( \mathcal{F} \)), which has been frequently used to measure the fairness of network resource allocations [18]. Based on the \( \mathcal{F} \) and ε information, player k’s cost function \( \mathcal{C}\left(k,{q}^k\right) \) is defined as follows.

$$ \begin{array}{c}\hfill \mathcal{C}\left(k,{q}^k\right)=\left(1+\xi \right) \times {\left(\frac{\gamma_k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)}{{\displaystyle {\sum}_{i=1}^n}\ {\gamma}_i\left({q}^i,\ {\mathbf{q}}^{-\boldsymbol{i}}\right)}\right)}^{\mathrm{\mathcal{F}}} \times \log \left({\varrho}_k\times {q}^k\times {\displaystyle \prod_{\begin{array}{c}\hfill l=1,\dots, n\hfill \\ {}\hfill l\ne k\hfill \end{array}}}\left(1-{q}^l\right)\right)\hfill \\ {}\hfill \mathrm{s}.\mathrm{t}.,\;\xi = \left\{\begin{array}{c}\hfill 0, \kern0.5em if\kern0.5em {\varepsilon}_k > \varGamma \hfill \\ {}\hfill \varGamma, \kern0.5em if\kern0.5em {\varepsilon}_k\le\ \varGamma \hfill \end{array}\right.,\;\mathrm{\mathcal{F}}=\frac{{\left({\displaystyle {\sum}_{i=1}^n}\ {\gamma}_i\left({q}^i,\ {\mathbf{q}}^{-\boldsymbol{i}}\right)\right)}^2}{n\times {\displaystyle {\sum}_{i=1}^n}{\left(\ {\gamma}_i\left({q}^i,\ {\mathbf{q}}^{-\boldsymbol{i}}\right)\right)}^2}\kern0.62em \mathrm{and}\ i\in \boldsymbol{N}\hfill \end{array} $$

where \( {\gamma}_k\left({q}^k,{\mathbf{q}}^{-\boldsymbol{k}}\right) \) is the amount of actually transmitted data bits per unit_time for player k, and \( \mathcal{F} \) ranges from 0 to 1. Therefore, if a player has a higher fairness value (\( \mathcal{F} \)) and cooperative degree (ε), he can reduce the \( \mathcal{C}\left(\cdot \right) \) value.
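A minimal sketch of the cost computation is given below, following the cost formula term by term: Jain's index over the measured throughputs γ, the penalty ξ that switches on when ε_k ≤ Γ, and the same logarithmic term as in the individual utility. The function names and argument order are hypothetical, and the throughputs are assumed to be non-negative measurements supplied by the operator.

```python
import math
from typing import Sequence


def jain_index(gammas: Sequence[float]) -> float:
    """Jain's fairness index: (sum gamma_i)^2 / (n * sum gamma_i^2)."""
    total = sum(gammas)
    squares = sum(g * g for g in gammas)
    if squares == 0:
        raise ValueError("fairness is undefined when nobody transmits")
    return total * total / (len(gammas) * squares)


def transmission_cost(k: int, q: Sequence[float], rho: Sequence[float],
                      gammas: Sequence[float], eps_k: float, gamma_bound: float) -> float:
    """C(k, q^k) = (1 + xi) * (gamma_k / sum gamma)^F * log(rho_k q^k prod(1 - q^l))."""
    xi = 0.0 if eps_k > gamma_bound else gamma_bound   # penalty for low cooperation
    fairness = jain_index(gammas)
    share = gammas[k] / sum(gammas)                    # player k's share of throughput
    success = q[k] * math.prod(1.0 - q_l for l, q_l in enumerate(q) if l != k)
    return (1.0 + xi) * share ** fairness * math.log(rho[k] * success)
```

Note that the logarithm is non-positive whenever ϱ_k ≤ 1, so the sign of the returned value depends on how ϱ_k is scaled; the sketch simply evaluates the expression as written and leaves that scaling to the caller.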

In the proposed scheme, we consider that two players are connected by a directed edge if one has a social tie towards the other. Let \( {\mathbb{G}}_k \) denote the group of player k; \( {\mathbb{G}}_k \) is the set of players connected to player k. From the perspective of the group, the group utility function of \( {\mathbb{G}}_k \) (\( {\mathcal{J}}^k\left(\cdot \right) \)) is defined based on the social relationships. The strength of the social tie from player k to player i is quantified by \( {\varphi}_{ki} \), where \( 0\le {\varphi}_{ki}\le 1 \). Each player k aims to maximize its \( {\mathcal{J}}^k\left(\cdot \right) \), which is given by

$$ {\mathcal{J}}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)={\displaystyle \sum_{i=1,i\ne k}^N}\left({\varphi}_{ki}\times {\Theta}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)\right) $$

where \( {\mathbf{q}}^{-\boldsymbol{k}} \) is the set of transmission probabilities of all players except player k. From the viewpoint of social cooperation, we should consider the total performance of the participating players. Therefore, the social utility function of each player k (\( {\mathfrak{T}}^k\left(\cdot \right) \)) is defined as follows.

$$ {\mathfrak{T}}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right) = {\varepsilon}_k\times {\displaystyle \sum_{\begin{array}{c}\hfill l=1,\dots, n\hfill \\ {}\hfill l\ne k\hfill \end{array}}}{\Theta}^l\left({q}^l,\ {\mathbf{q}}^{-\boldsymbol{l}}\right) $$

By considering three different perspectives (i.e., individual utility (Θ), group utility (\( \mathcal{J} \)), and social utility (\( \mathfrak{T} \))), we can compare the selfish option with the socially cooperative option. Finally, the total utility function (\( {U}^k \)) of player k is defined as follows.

$$ {U}^k=\kern0.5em \underset{q^k\in\ {\boldsymbol{S}}_k;{\varepsilon}_i\in {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}}{ \max}\left({\Theta}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)+{\mathcal{J}}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right) + {\mathfrak{T}}^k\left({q}^k,\ {\mathbf{q}}^{-\boldsymbol{k}}\right)\right) $$

where \( {\varepsilon}_k \) is a weighting parameter for the different objectives; it measures the relative degree to which a social payoff is pursued. Under diverse preferences and uncertainty, players dynamically adjust their ε value to reflect their preference for social welfare.
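To make the combination of the three utilities concrete, the sketch below evaluates Θ + J + T for a candidate profile and then searches player k's finite strategy set for the best value. It follows the group and social utility formulas as printed above; the individual utility is passed in as a callable, and `toy_theta` (ϱ = 1, zero cost) is a hypothetical stand-in used only to keep the example self-contained. The search is over q^k alone, with ε held fixed, which is a simplifying assumption.

```python
import math
from typing import Callable, Sequence, Tuple

Theta = Callable[[int, Sequence[float]], float]


def toy_theta(k: int, q: Sequence[float]) -> float:
    """Hypothetical stand-in for Theta^k: access gain with rho_k = 1 and zero cost."""
    return math.log(q[k] * math.prod(1.0 - q_l for l, q_l in enumerate(q) if l != k))


def combined_payoff(k: int, q: Sequence[float], phi: Sequence[Sequence[float]],
                    eps: Sequence[float], theta: Theta) -> float:
    """Theta^k + J^k + T^k for the profile q, using the formulas as printed."""
    n = len(q)
    thetas = [theta(l, q) for l in range(n)]
    group = sum(phi[k][i] * thetas[k] for i in range(n) if i != k)   # group utility J^k
    social = eps[k] * sum(thetas[l] for l in range(n) if l != k)     # social utility T^k
    return thetas[k] + group + social


def best_strategy(k: int, q: Sequence[float], strategies: Sequence[float],
                  phi: Sequence[Sequence[float]], eps: Sequence[float],
                  theta: Theta) -> Tuple[float, float]:
    """Return (q^k, payoff) maximizing the combined payoff over player k's strategy set."""
    best_s, best_u = strategies[0], -math.inf
    for s in strategies:
        trial = list(q)
        trial[k] = s                      # only player k's own probability changes
        u = combined_payoff(k, trial, phi, eps, theta)
        if u > best_u:
            best_s, best_u = s, u
    return best_s, best_u
```

Because the strategy set is small and finite, an exhaustive scan is enough here; no gradient or closed-form solution is assumed.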

4.2 The main steps of proposed algorithm

Recent developments in behavioral science have forced a re-evaluation of the conventional concept of rationality used in game theory. The traditional game model assumes that game players are only involved in rational decision-making. However, developments in the areas of behavioral science have led to advancements in the modeling and identification of bounded rationality in decision-making [19]. In this study, game players individually adjust their propensity and select a good enough strategy based on the observation of past periods of interactions.

Each player chooses its action by considering its interactions and social relationships with the other players. The traditional non-cooperative game approach does not provide any guarantee of a socially balanced outcome. Achieving social balance requires the concept of BLE, which is more flexible and accommodating than optimization. Therefore, it is a relevant practical game solution. To achieve this BLE, we construct a feedback-based control mechanism based on an iterative learning technique. With this learning technique, a better strategy may be chosen without any sophisticated prediction mechanism. To obtain the expected solution, we develop the behavioral learning game procedure, in which players change their current strategy based on their preferences. The main steps of the proposed algorithm are given next and described as a flow diagram in Fig. 1.

Fig. 1 Flow diagram for the proposed algorithm

  • Step 1: At the initial time, \( {q}^k\left(\cdot \right) \) is randomly chosen from the strategy set \( {\boldsymbol{S}}_k \). This starting guess guarantees that players independently select their packet transmission probability at the beginning of the game.

  • Step 2: Control parameters Γ, ϱ, φ and ε are given from the simulation scenario (refer to Tables 1 and 2). Based on the social ties, each player has its own group \( \mathbb{G} \); it can capture complex social structures among network agents.

    Table 1 System parameters used in the simulation experiments
    Table 2 System parameters used in the simulation experiments
  • Step 3: Based on the current IoT system situation, each player estimates its payoff (U(·)) according to (2), (3), (4), (5), and (6).

  • Step 4: The cost for each player is estimated according to the fairness (\( \mathcal{F} \)) and ξ parameters. Based on the formula in (3), the \( \mathcal{F} \) and ξ values are adaptively adjusted, and the cost is obtained.

  • Step 5: During game processing, each player iteratively adjusts its strategy (q and ε) to maximize its payoff based on equation (6).

  • Step 6: To reach a mutually acceptable solution, each player repeatedly interacts with the others and adaptively learns the best solution.

  • Step 7: If all players’ ε values are above \( \varGamma \left(\mathrm{i}.\mathrm{e}.,\underset{i,1\le i\le n}{\mathbf{min}}\ \left\{{\varepsilon}_i\Big|{\varepsilon}_i\in {\boldsymbol{\Lambda}}^{\boldsymbol{\varepsilon}}\right\}>\varGamma \right) \), the system is assumed to have reached a BLE; when the IoT system reaches the BLE, the game process is temporarily stopped.

  • Step 8: The system operator constantly monitors the current IoT system; proceed to Step 3 for the next iteration.
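The control flow of Steps 1 to 8 can be sketched as a simple feedback loop. This is only a skeleton under stated assumptions: the payoff and the rule by which players adjust ε are passed in as callables because the steps above do not spell them out in closed form, the cooperative degrees are assumed to start at 0, and the function name, the `max_rounds` cap, and the `seed` argument are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Payoff = Callable[[int, List[float], List[float]], float]
EpsUpdate = Callable[[int, List[float], List[float]], float]


def run_behavioral_learning_game(n: int, strategies: Sequence[float], gamma: float,
                                 payoff: Payoff, update_eps: EpsUpdate,
                                 max_rounds: int = 100, seed: int = 0
                                 ) -> Tuple[List[float], List[float], int, bool]:
    """Iterate until min(eps) > gamma (BLE) or max_rounds is exhausted."""
    rng = random.Random(seed)
    # Step 1: each player starts from a random transmission probability.
    q = [rng.choice(list(strategies)) for _ in range(n)]
    # Step 2 (assumption): cooperative degrees start at 0.
    eps = [0.0] * n
    for round_no in range(1, max_rounds + 1):
        # Steps 3-5: each player in turn picks the strategy with the highest payoff
        # against the other players' current choices.
        for k in range(n):
            q[k] = max(strategies, key=lambda s: payoff(k, q[:k] + [s] + q[k + 1:], eps))
        # Step 6: players adapt their cooperative degree from the observed outcome.
        eps = [update_eps(k, q, eps) for k in range(n)]
        # Step 7: stop once every cooperative degree exceeds the bound.
        if min(eps) > gamma:
            return q, eps, round_no, True
        # Step 8: otherwise the operator keeps monitoring and the loop repeats.
    return q, eps, max_rounds, False
```

Any concrete payoff (for example, one built from the utility functions of Section 4.1) and any ε update rule can be plugged in; the loop itself encodes only the ordering of the steps and the BLE stopping test.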

5 Performance evaluation

In this section, we compare the performance of our scheme with other existing schemes [5, 11] and confirm the performance superiority of the proposed approach by using a simulation model. Our simulation model is a representation of the socio-physical IoT system that includes system entities and the behavior and interactions of those entities. To facilitate the development and implementation of our simulator, Tables 1 and 2 list the system parameters.

Our simulation results are achieved using MATLAB, which is widely used in academic and research institutions as well as industrial enterprises. In order to emulate a real-world scenario, the assumptions of our simulation environment are as follows.

  • The simulated system consists of ten network agents (i.e., players) for the IoT system.

  • The network agents are randomly scattered across a square area with a side length of 2000 m.

  • At each network agent, new service requests arrive according to a Poisson process with rate ρ (services/s), and the offered service load is varied from 0 to 3.0.

  • The proposed scheme and other existing schemes [5, 11] are implemented in the same simulation scenario, i.e., same network agents and offered service load.

  • For simplicity, we assume the absence of power control problems in the experiments.

  • The pre-defined minimum bound (Γ) ranges from 0.05 to 0.15; Γ ∈ {0.05, 0.075, 0.1, 0.125, 0.15}.

  • The number of strategies (m) for player k is 5, and each strategy \( \left({s}_{i,1\le i\le m}^k\right) \) is \( {s}_i^k\in \{0.1, 0.3, 0.5, 0.7, 0.9\} \).

  • The strength of the social tie (φ) for other players is randomly decided.

  • The value of cooperative degree (ε) is varied, and it is dynamically adjusted during the game operations.

  • The resource of the IoT system is bandwidth (bps), and the total resource amount is 30 Mbps.

  • Network performance measures obtained on the basis of 50 simulation runs are plotted as a function of the offered traffic load.

  • The IoT performance is estimated in terms of the resource usability, system throughput, cooperative degree, and system fairness.

  • The service size of each application is exponentially distributed with different means for different message applications.
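The assumptions listed above could be set up along the following lines. This is a minimal Python sketch rather than the MATLAB simulator used in the paper; the function names, the time horizon, and the mean service size are hypothetical values chosen only for illustration.

```python
import random

N_AGENTS = 10                 # network agents (players)
AREA_SIDE_M = 2000.0          # side length of the square area, in meters
TOTAL_BW_BPS = 30e6           # total bandwidth: 30 Mbps
GAMMAS = [0.05, 0.075, 0.1, 0.125, 0.15]   # minimum bound values
STRATEGIES = [0.1, 0.3, 0.5, 0.7, 0.9]     # m = 5 strategies per player
N_RUNS = 50                   # simulation runs per plotted point


def place_agents(rng):
    """Scatter the agents uniformly at random over the square area."""
    return [(rng.uniform(0.0, AREA_SIDE_M), rng.uniform(0.0, AREA_SIDE_M))
            for _ in range(N_AGENTS)]


def poisson_arrivals(rng, rate, horizon_s):
    """Arrival times of a Poisson process with the given rate on [0, horizon_s)."""
    times = []
    if rate <= 0:
        return times          # zero offered load: no requests
    t = 0.0
    while True:
        t += rng.expovariate(rate)   # exponential inter-arrival times
        if t >= horizon_s:
            return times
        times.append(t)


def service_sizes(rng, count, mean_bits):
    """Exponentially distributed service sizes with the given mean."""
    return [rng.expovariate(1.0 / mean_bits) for _ in range(count)]


rng = random.Random(1)
agents = place_agents(rng)
arrivals = poisson_arrivals(rng, rate=1.5, horizon_s=100.0)   # assumed horizon
sizes = service_sizes(rng, len(arrivals), mean_bits=1e6)      # assumed mean
```

Sampling exponential inter-arrival gaps is a standard way to generate a Poisson arrival stream; using a seeded `random.Random` instance keeps each of the repeated runs reproducible.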

The performance measures obtained through simulation are the normalized system throughput, the resource usability, the cooperative degree under different Γ values, and the service fairness in IoT systems. We compare the performance of the proposed scheme with that of two existing schemes: the USPI scheme [5] and the SGUM scheme [11]. Both were recently developed as effective network management algorithms.

Figure 2 compares the schemes in terms of resource utilization in IoT systems. In this work, resource utilization measures how much of the IoT system bandwidth is used; it is an important metric for maximizing network performance. During system operations, all the schemes produce broadly similar resource utilization. However, the proposed scheme achieves higher resource utilization than the other schemes from light to heavy service request rates.

Fig. 2 Resource utilization in IoT systems

Figure 3 compares the normalized system throughput. In this paper, system throughput is defined as the normalized amount of data in successfully completed services. In general, a higher throughput means that the system can achieve a higher profit in IoT operation. Owing to its mechanism for trading off selfishness against social welfare, the proposed scheme maintains a higher system throughput during IoT system operations.

Fig. 3 Normalized system throughput

The curves in Fig. 4 show the cooperative degree (ε) under different Γ values. In this work, the ε value indicates the degree of a player's cooperative behavior. According to the Γ value, players receive an adjusted cost for services and reconsider their actions. Using the feedback-based learning approach, network agents in our scheme dynamically adapt to the current situation and adaptively select their strategies. The results show that the proposed scheme is well suited to reaching the behavioral equilibrium.

Fig. 4 Cooperative degree status under different Γ values

Figure 5 shows the IoT system fairness of each scheme. This measure is a key factor in assessing how fairly the system resource is distributed. All the schemes show similar trends. However, our proposed scheme achieves a balanced resource allocation among players and therefore maintains high system fairness under various service rates. This is a highly desirable property for multi-agent system management. The simulation results in Figs. 2, 3, 4, and 5 compare the proposed scheme with the existing schemes [5, 11] and indicate that our behavioral learning game-based scheme provides attractive IoT system performance.

Fig. 5 Service fairness in IoT systems
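This section does not restate how the fairness measure is computed. As one common way to quantify how evenly a resource is shared, Jain's fairness index could be used; the Python sketch below illustrates that substituted metric with made-up allocations and is not necessarily the measure plotted in Fig. 5.

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in (0, 1].

    A value of 1 means all players receive equal shares; 1/n means a
    single player receives everything.
    """
    squares = sum(x * x for x in allocations)
    if not allocations or squares == 0:
        raise ValueError("need at least one non-zero allocation")
    total = sum(allocations)
    return total * total / (len(allocations) * squares)


# Hypothetical bandwidth shares (Mbps) for three players.
equal_split = jain_index([10.0, 10.0, 10.0])   # perfectly balanced
one_taker = jain_index([30.0, 0.0, 0.0])       # one player takes all
```

A balanced allocation yields an index of 1, while concentrating the whole 30 Mbps on one of three players yields 1/3, which is why a higher index is read as a fairer distribution.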

6 Conclusions

The IoT is now regarded as the next technological and economic wave in the global information industry after the Internet. It is an intelligent network that connects all things to the Internet so that smart objects can exchange information and communicate in accordance with agreed protocols. Recently, the IoT communication control problem has been analyzed with both non-cooperative and cooperative game models to maximize system performance. In this paper, we take a new direction and develop a socio-physical IoT connection algorithm. Based on the behavioral learning game model, we analyze the tradeoff between selfishness and public interest in the IoT system. Simulation results show that, compared with the existing schemes, the proposed scheme helps network agents adapt their actions to achieve a socially balanced outcome while converging to the behavioral learning equilibrium. Future work will be pursued in the following directions. The theoretical analysis needs to be developed further, and we plan to identify application domains for empirical studies. Furthermore, the behavioral game model can be extended to other research areas: control decisions in inter-process communication, disk and memory management, file and I/O systems, CPU scheduling, and distributed operating systems.


References

  1. O Vermesan, P Friess, Internet of Things - Global Technological and Societal Trends (River Publishers, Aalborg, Denmark, 2011)

  2. D Singh, G Tripathi, AJ Jara, A survey of Internet-of-Things: future vision, architecture, challenges and services, in IEEE World Forum on Internet of Things (WF-IoT 2014), pp. 287–292 (2014)

  3. O Mazhelis, P Tyrvainen, A framework for evaluating Internet-of-Things platforms: application provider viewpoint, in IEEE World Forum on Internet of Things (WF-IoT 2014), pp. 147–152 (2014)

  4. M Nitti, R Girau, L Atzori, Trustworthiness management in the social Internet of Things. IEEE Trans. Knowl. Data Eng. 26(5), 1253–1266 (2014)

  5. E Stai, V Karyotis, S Papavassiliou, Exploiting socio-physical network interactions via a utility-based framework for resource management in mobile social networks. IEEE Wirel. Commun. 21(1), 10–17 (2014)

  6. JK Archibald, JC Hill, FR Johnson, WC Stirling, Satisficing negotiations. IEEE Trans. Syst. Man Cybern. 36(1), 4–18 (2006)

  7. S Kim, Adaptive online power control scheme based on the evolutionary game theory. IET Commun. 5(18), 2648–2655 (2011)

  8. WC Stirling, RL Frost, Social utility functions - part II: applications. IEEE Trans. Syst. Man Cybern. 35(4), 533–543 (2005)

  9. CF Camerer, Behavioral Game Theory: Experiments in Strategic Interaction (Princeton University Press, Princeton, NJ, USA, 2003)

  10. Q Wu, G Ding, Y Xu, S Feng, Z Du, J Wang, K Long, Cognitive Internet of Things: a new paradigm beyond connection. IEEE Internet Things J. 1(2), 129–143 (2014)

  11. X Chen, X Gong, L Yang, J Zhang, A social group utility maximization framework with applications in database assisted spectrum access, in IEEE INFOCOM, pp. 1959–1967 (2014)

  12. I Jang, D Pyeon, S Kim, H Yoon, A survey on communication protocols for wireless sensor networks. JCSE 7(4), 231–241 (2013)

  13. SMK Raazi, S Lee, A survey on key management strategies for different applications of wireless sensor networks. JCSE 4(1), 23–51 (2010)

  14. K Shao, R Wang, XY Zhou, Behavioral game model and evolutionary analysis in electricity market, in IEEE Power Eng. Soc. Gen. Meet., pp. 1–7 (2006)

  15. S Kim, Game Theory Applications in Network Design (IGI Global, Pennsylvania, 2014)

  16. M Hemmati, N Sadati, M Nili, Towards a bounded-rationality model of multi-agent social learning in games, in IEEE Intelligent Systems Design and Applications (ISDA 2010), pp. 142–148 (2010)

  17. T Yashiro, S Kobayashi, N Koshizuka, K Sakamura, An Internet of Things (IoT) architecture for embedded appliances, in IEEE Humanitarian Technol. Conf., pp. 314–319 (2013)

  18. M Dianati, X Shen, S Naik, A new fairness index for radio resource allocation in wireless networks, in IEEE WCNC, pp. 712–715 (2005)

  19. K Kinoshita, K Suzuki, T Shimokawa, Evolutionary foundation of bounded rationality in a financial market. IEEE Trans. Evol. Comput. 17(4), 528–544 (2013)



This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-H8501-15-1018) supervised by the IITP (Institute for Information & Communications Technology Promotion) and was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2015R1D1A1A01060835).

Author information


Corresponding author

Correspondence to Sungwook Kim.

Additional information

Competing interests

The author declares that he has no competing interests.

Author’s contributions

Sungwook Kim is a sole author of this work and ES (i.e., participated in the design of the study and performed the statistical analysis).

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Kim, S. Behavioral learning game for socio-physical IoT connections. J Wireless Com Network 2016, 24 (2016).


  • Behavioral learning game
  • Internet of Things
  • Bounded rationality
  • Behavioral learning equilibrium
  • Game theory
  • Socio-physical connections