Open Access

Virus propagation power of the dynamic network

  • Fu Cai1,
  • Huang Qingfeng2 (corresponding author),
  • Han LanSheng1,
  • Shen Li1 and
  • Liu Xiao-Yang3
EURASIP Journal on Wireless Communications and Networking 2013, 2013:210

https://doi.org/10.1186/1687-1499-2013-210

Received: 19 February 2013

Accepted: 1 August 2013

Published: 17 August 2013

Abstract

With the development of mobile networks, the propagation characteristics of viruses and the mechanisms for defending against them have attracted increasing research attention. Current research, however, is mainly concerned with static network topology or community structure, and some studies focus on the characteristics of viruses and malware. Little attention is paid to the influence of dynamic changes in network topology on virus propagation. Meanwhile, many studies focus on the threshold infection rate (or immunization rate) for a virus outbreak. In this paper, we present a new way to assess and restrain virus propagation by proposing the concepts of propagation power and propagation structure. Three basic propagation structures are presented through which the infection risk can be quantified, and a framework is proposed to assess the impact of virus propagation in different network structures. The relationship between the speed of virus propagation and network structure is also explored. An algorithm is designed to assess the propagation power in a dynamic network without redetecting the communities after each change, which may significantly improve the efficiency of assessment in a large-scale dynamic network. This study offers a feasible approach for quantifying the risk of virus infection in a network community, which is valuable for designing and optimizing virus defense systems.

Keywords

Dynamic network; Propagation power; Virus restraint; Basic propagation structure

1. Introduction

With the increasing popularity of mobile devices, virus propagation in mobile networks has become a major security concern [1–4]. A virus poses not only a potential threat but also real problems such as leakage of personal data and remote access control [4, 5]. An even more dangerous threat is a large-scale outbreak, which can paralyze the whole network.

Current research on network virus propagation can be divided into two categories: some researchers model the propagation of viruses to find the threshold of a large-scale outbreak, while others study mechanisms for restraining viruses.

Since virus propagation in a mobile network is similar to biological virus propagation, four classic models used to describe biological virus propagation, namely susceptible-infected-susceptible (SIS), susceptible-infected-recovered (SIR), susceptible-exposed-infected-recovered (SEIR), and susceptible-infected-detected-removed (SIDR), are also used to describe network virus propagation. There are two major lines of research on mobile networks adopting this approach.

One line of studies focuses on human epidemics observed through the mobile network. Using GPS and GSM, researchers can analyze people's social activities by collecting position data from people carrying the devices. For example, researchers collected data from a high school in the USA by equipping 788 people in the school with TelosB sensor nodes. After analyzing the collected data, they concluded that immunity to a biological virus can be achieved only in areas where the vaccination rate is high [6]. Using data collected from a mobile network, Fenichel et al. [7] observed contact between people and proposed a scheme to improve resistance to biological virus infection by balancing the threat of the virus against the benefit of social networking.

Another line of studies is concerned with viruses in intelligent mobile devices. It is found that a virus will break out if the market share of intelligent mobile devices exceeds a certain threshold [8]. Several methods have been put forward to address the problem of mobile phone viruses [9–11]. However, the mobility of smart devices can greatly increase the speed of virus propagation even if most nodes in the network remain still. Therefore, given the mobility of the nodes in the mobile network, the propagation of viruses is likely to be exacerbated [12].

In the study of virus defense in the mobile network, it is found that virus propagation in a mobile network is difficult to restrain because the topology of the network changes frequently due to the mobility of devices. There are two kinds of strategies for restraining viruses in a mobile network: immunization strategies [13–22] and local strategies. In most cases, immunization strategies are implemented based on centralized distribution and a static network [13–21]. But in a large-scale dynamic network, a stochastic immunization strategy has to cover 80% of the nodes, whereas a special immunization strategy requires the whole topology to be known [13, 14]. With local strategies, once a node finds itself infected, it immediately sends an antivirus message to others so that the virus can be cleared [23–26].

Zyba et al. [23] present an ideal scheme to restrain virus propagation in which a node can detect a virus locally. Once the virus has been found, the infected node immediately cuts off communication or sends an antivirus message to adjacent nodes. However, this study is based on the unrealistic assumption that mobile devices possess superb computational power [27, 28]. Hui et al. [24, 25] propose a message dissemination mechanism through the social community, but the mechanism requires that the virus be detected locally. Jackson and Creese [26] study virus propagation from the perspective of human behavior.

In this paper, we propose a new way to study virus propagation based on the network structure and its dynamic change. We propose the concepts of propagation power and propagation structure. It is assumed that the risk of virus propagation depends on the speed of virus propagation in the network, and that the major factor influencing this speed is the propagation power, which can be quantified by analyzing the network structure based on three basic propagation structures. This approach has two distinct advantages. On the one hand, through the propagation power, we can assess the risk of virus propagation quantitatively and take virus-restraining measures in advance. On the other hand, a guideline may be developed for network construction and optimization to restrain the virus.

The rest of the paper is organized as follows. Section 2 reviews current research and points out the major concern of our study. Section 3 offers definitions of the important concepts. Section 4 presents the calculation of the fundamental propagation power in a static network, whereas the propagation power in a dynamic network is computed in Section 5. Section 6 presents the experiments and data analysis, and Section 7 offers several case studies. The paper is concluded in Section 8 along with some directions for future work.

2. Problem statement

As mentioned above, several schemes have been proposed to model and restrain virus propagation, but these schemes may have some weaknesses with regard to the current prevalent virus. We analyze some famous advanced persistent threat (APT) viruses to illustrate our research focus.

First, there are several classic virus propagation models such as SIS, SIR, SEIR, and SIDR where the threshold propagation rates λ are defined. But it is also found that the propagation rate is highly related to the size of the networks [29]. To reduce the propagation rate, it is assumed that the network size is infinite. Since in reality the network is usually finite, how can we assess the risk of virus propagation? For example, Figure 1 shows two real microblog communities. If we can accurately assess the risk of the two communities, then we can take measures in advance.
Figure 1

Two microblog communities (a, b).

Second, current research on the outbreak of viruses or the risk of propagation focuses on the threshold infection rate: when the infection rate reaches a certain value, the virus will break out. But historically, viruses have often exploited one or more 0-day vulnerabilities, and it takes a long time for a patch to be released. Before the release of the patch, the infection rate of the virus is almost 1, as shown in the cases of Shockwave, the Blaster worm, and the APT viruses of 2012. Table 1 shows some famous APT viruses, almost all of which utilize 0-day vulnerabilities. Recent research has also uncovered 0-day malware on mobile Android devices [30]. So what measures can be taken to restrain the propagation of such viruses?
Table 1

Famous viruses utilizing 0-day vulnerabilities

Name       Date of discovery    0-day count
Aurora     January 2010         1
Stuxnet    June 2010            7
RSA        March 2011           1
Flame      May 2012             1

Third, the current methods aim to remedy the situation after the discovery of the virus. What preventive measures can be taken in advance? More specifically, what can be done with respect to network design and optimization? For example, Figure 2 shows the topology of two independent networks and a number of possible ways to connect the two networks. Which is the best way to connect in order to restrain the virus propagation?
Figure 2

Optimization of network based on virus defense.

Fourth, due to the rapid development of mobile network and the mobility of mobile devices (smartphone, PDA), the network topology changes frequently. How can we assess the risk of the dynamic network efficiently? For example, Figure 3 shows three statuses of a microblog community. It is obviously inefficient to recalculate the propagation power of the entire community. Efficiency can be greatly improved if only the dynamic part of the community is calculated.
Figure 3

The changes of risk of virus propagation in a microblog community (a, b, c).

To address the issues above, it is insufficient to study only the propagation model or virus defense. The network topology has to be taken into consideration as it is the basic environment for virus propagation. If the speed of virus propagation can be reduced at the layer of network structure, more time and space will be available to restrain the virus. Therefore, we propose the propagation power to quantify the risk of virus propagation in the network. The network may be analyzed based on three basic propagation structures. We also propose a method to calculate the risk of virus propagation in a dynamic network. In short, we believe that our scheme is a new approach to restraining virus propagation that takes into account the network structure, which has been largely neglected in current research.

3. Related definitions

In order to quantify the risk of virus propagation of a complex network structure we define three basic propagation structures of a network according to the speed of propagation. And then we define the propagation power to quantify the risk of virus propagation on the network. With the two definitions, we can better understand the relationship between the risk of network and the speed of virus propagation.

3.1. Definition of SBS and speed of virus propagation

Definition 1 A spreading network is defined as a connected graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. For any $e_{ij}$ that belongs to $E$, if $e_{ij} = 1$, then $v_i$ and $v_j$ are connected.

Definition 2 The speed of virus propagation $v_0$ is defined as $v_0 = n/t$, where $n$ is the number of infected nodes in the network and $t$ is the time taken to infect the whole network.

Definition 3 The speed of virus propagation of the network $v$ is defined as $v = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0}$, where $n$ is the number of nodes in the network, $\alpha_i$ is the number of infection hops needed for a virus starting at node $i$ to infect the whole network, and $t_0$ is the time required for a successful infection.

Definition 4 The three basic propagation structures include linear propagation structure (LPS), ring propagation structure (RPS), and star propagation structure (SPS).

Definition 5 LPS is a connected graph that satisfies $1 \le d_i \le 2$, where $d_i$ is the degree of any node $i$ in the LPS. There must exist two nodes whose degrees are equal to 1.

Definition 6 RPS is a connected graph, where the degrees of all nodes are equal to 2 and the number of the nodes is greater than 2.

Definition 7 SPS is a connected graph, where the degree of one node is greater than or equal to 3 and the degree of other nodes is equal to 1 and the number of the nodes in the structure is greater than 3. All basic structures are shown in Figure 4.
Figure 4

Basic propagation structures.

3.2. Definition of propagation power

Speeds of virus propagation vary in different network structures. In this section, the propagation power is defined to quantify the effect of different structures on the speed of virus propagation. Intuitively, virus propagation is faster in SPS than in the other two structures, whereas RPS is faster than LPS. Under the same situation, the speed of virus propagation is mainly determined by the structures.

To quantify the propagation power, we could consider $E_n = \frac{1}{n}\sum_{i=1}^{n} p_i$ to assess the virus risk of a structure, where the probability $p_i = p(x_1 = x_2 = \dots = x_n = 1 \mid x_i = 1)$ and $n$ is the number of nodes in the network. This quantification is reasonable in theory because the larger the value of $E_n$ is, the higher the risk of virus propagation. However, it is impracticable to work out the probability $p_i$. Thus, we choose another way, which considers the number of hops a virus needs to spread from an original node to all nodes of the network. To simplify the calculation, it is assumed that the probability of the virus spreading from one node to an adjacent node is 1. The propagation power is defined as follows:

Definition 8 The propagation power of a network is $F = \left[\frac{t_0}{n}\sum_{i=1}^{n} disp_i\right]^{-1}$, where $disp_i$ is the number of hops a virus needs to spread from the $i$th node to all nodes of the network, $n$ is the number of nodes in the network, and $t_0$ is the time required for a successful infection.

With the definition of propagation power, the speed of virus propagation in a network can be evaluated. If the propagation power is greater, the time for virus propagation would be smaller and vice versa.
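As a small worked illustration of Definitions 3 and 8, consider an SPS with $n = 4$ (one hub and three leaves). The calculation below is only a sketch under the reading that $\alpha_i$ and $disp_i$ both denote the number of hops needed from node $i$ to reach every other node: the hub needs 1 hop and each leaf needs 2.

```latex
% SPS with n = 4: hop counts are 1 (hub) and 2, 2, 2 (leaves)
\begin{align*}
v &= \frac{1}{4}\left(\frac{4}{1\cdot t_0} + 3\cdot\frac{4}{2\,t_0}\right) = \frac{5}{2\,t_0},\\
F &= \left[\frac{t_0}{4}\,(1 + 2 + 2 + 2)\right]^{-1} = \frac{4}{7\,t_0}.
\end{align*}
```

Both values agree with the general SPS expressions $v = (n+1)/(2t_0)$ and $F = \left[\frac{2n-1}{n}\,t_0\right]^{-1}$ evaluated at $n = 4$.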

4. Calculation of fundamental propagation power

4.1. Calculation of propagation power in a large network

To calculate the propagation power, we present the algorithm according to the definition of propagation power to be applied in a large-scale network as follows.

The number of hops required for a virus to spread from an original node to all nodes of the network can be considered as the maximum value of the shortest path length which is the shortest length one needs to travel from the original node to another. Thus, the problem comes down to a task of obtaining the shortest path.

The main idea of calculating the propagation power is that a shortest path tree is first worked out for each node in the network by the Dijkstra algorithm; the height of each tree, which represents the number of hops, is then obtained, and F is calculated from these heights. The algorithm is described as follows:
  (1) Input graph G and set propagation power F = 0;
  (2) Calculate the shortest path tree $SPT_i$ ($1 \le i \le n$) by Dijkstra(G, v), the root of which is v;
  (3) Get the height of tree $SPT_i$: $disp_i$ = TreeHeight(v);
  (4) Return F, where $F = \left[\frac{t_0}{n}\sum_{i=1}^{n} disp_i\right]^{-1}$.
The time complexity of the optimized Dijkstra algorithm is O(|E| log |V|) [31]. Since a shortest path tree must be computed for every vertex, the complexity of the algorithm above is O(|V| × |E| log |V|). Figure 5 shows the algorithm.
Figure 5

The propagation power calculation algorithm.
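The four steps can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes an unweighted, connected graph given as an adjacency dictionary, and it swaps Dijkstra for a plain breadth-first search, which yields the same shortest path tree heights when every edge counts as one hop. The names `hop_counts` and `propagation_power` are hypothetical.

```python
from collections import deque


def hop_counts(adj):
    """Return {node: disp}, where disp is the height of the BFS tree rooted at node."""
    disp = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        disp[source] = max(dist.values())
    return disp


def propagation_power(adj, t0=1.0):
    """F = [(t0 / n) * sum(disp_i)]^(-1) for a connected graph with n >= 2 nodes."""
    disp = hop_counts(adj)
    n = len(adj)
    return n / (t0 * sum(disp.values()))


# Five-node ring (assumed example): every node reaches all others within 2 hops.
ring5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(propagation_power(ring5))  # 0.5
```

With breadth-first search the cost per root is O(|V| + |E|), so the whole computation takes O(|V| × (|V| + |E|)) on unweighted graphs.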

As shown in Figure 6, we present an example to show how the algorithm works:
  1. Calculate the propagation power of Figure 6a:

     $$F = \left[\frac{t_0}{5}\times(3+2+3+2+3)\right]^{-1} \approx 0.38\,t_0^{-1}$$

  2. Calculate the propagation power of Figure 6b:

     $$F = \left[\frac{t_0}{5}\times(2+2+2+2+2)\right]^{-1} = 0.5\,t_0^{-1}$$
Figure 6

Example of calculation (a, b).

In Figure 6, although the vertex numbers are the same, the propagation powers are different, which suggests that the virus risk is different.

4.2. Relation between propagation power and infection time

In this section, we present some theorems about the propagation power F and the infection time n/v.

First, we show the mathematical expressions of the propagation power and the propagation speed of the three basic propagation structures.

Relationships between the propagation power F of LPS and number of nodes n in the network and relationships between the speed v of LPS and number of nodes n in the network can be expressed as follows:
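$$v = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0} = 2\times\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{i\,t_0} + k \tag{1}$$

where

$$k = \begin{cases} 0, & \text{while } n \text{ is an even number} \\ \dfrac{1}{\lfloor n/2\rfloor\, t_0}, & \text{while } n \text{ is an odd number} \end{cases}$$

$$F = \begin{cases} \left[\dfrac{2t_0}{n}\displaystyle\sum_{i=n/2}^{n-1} i\right]^{-1} = \left[\dfrac{t_0}{n}\left(n-1+\dfrac{n}{2}\right)\times\dfrac{n}{2}\times\dfrac{1}{2}\times 2\right]^{-1} = \left[\dfrac{3n-2}{4}\,t_0\right]^{-1}, & \text{while } n \text{ is an even number} \\ \left[\dfrac{2t_0}{n}\displaystyle\sum_{i=(n+1)/2}^{n-1} i + \dfrac{n-1}{2}\times\dfrac{t_0}{n}\right]^{-1} = \left[\dfrac{(3n+1)(n-1)\,t_0}{4n}\right]^{-1}, & \text{while } n \text{ is an odd number} \end{cases} \tag{2}$$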
Relationships between propagation power F of RPS and number of nodes n in the network and relation between speed v of RPS and number of nodes n in the network can be expressed as follows:
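$$v = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0} = \frac{n}{\lfloor n/2\rfloor\, t_0} \approx \frac{2}{t_0} \tag{3}$$

$$F = \left[\frac{t_0}{n}\sum_{i=1}^{n}\left\lfloor\frac{n}{2}\right\rfloor\right]^{-1} = \left[t_0\left\lfloor\frac{n}{2}\right\rfloor\right]^{-1} \tag{4}$$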
Relationship between propagation power F of SPS and number of nodes n in the network and relationship between speed v of SPS and number of nodes n in the network can be expressed as follows:
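$$v = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0} = \frac{n-1}{2t_0} + \frac{1}{t_0} = \frac{n+1}{2t_0} \tag{5}$$

$$F = \left[\frac{t_0}{n}\left((n-1)\times 2 + 1\right)\right]^{-1} = \left[\frac{2n-1}{n}\,t_0\right]^{-1} \tag{6}$$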
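The closed forms for the three basic structures can be cross-checked numerically. The sketch below builds small LPS, RPS, and SPS graphs, obtains the hop counts by breadth-first search, and compares $F^{-1}$ and $v$ (in units of $t_0$ and $1/t_0$, respectively) with the expressions derived from the definitions. The helper names are hypothetical, exact fractions are used to avoid rounding issues, and the odd-$n$ LPS value $(3n+1)(n-1)/(4n)$ is obtained directly by summing the hop counts.

```python
from collections import deque
from fractions import Fraction


def eccentricities(adj):
    """Hops needed from each node to reach every other node (BFS tree height)."""
    result = []
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result.append(max(dist.values()))
    return result


def inverse_power(adj):
    """F^-1 in units of t0: (1 / n) * sum(disp_i)."""
    ecc = eccentricities(adj)
    return Fraction(sum(ecc), len(ecc))


def speed(adj):
    """v in units of 1 / t0: (1 / n) * sum(n / alpha_i) = sum(1 / alpha_i)."""
    return sum(Fraction(1, a) for a in eccentricities(adj))


def line(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def star(n):
    adj = {0: list(range(1, n))}          # node 0 is the hub
    adj.update({i: [0] for i in range(1, n)})
    return adj


for n in range(4, 13):
    assert inverse_power(ring(n)) == n // 2
    assert speed(ring(n)) == Fraction(n, n // 2)
    assert inverse_power(star(n)) == Fraction(2 * n - 1, n)
    assert speed(star(n)) == Fraction(n + 1, 2)
    if n % 2 == 0:
        assert inverse_power(line(n)) == Fraction(3 * n - 2, 4)
    else:
        assert inverse_power(line(n)) == Fraction((3 * n + 1) * (n - 1), 4 * n)
print("closed forms agree for n = 4..12")
```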

According to the above formulae, it may be concluded that v is the average speed of the nodes infecting the whole network, and propagation power expresses the risk of virus propagation in the network. Assuming the two variables are related, three theorems are presented as follows:

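Theorem 1 When n approaches infinity in LPS, $F^{-1} = \alpha\,(n/v)$, where $3/4 \le \alpha \le 3/2$; in RPS, $F^{-1} = \alpha\,(n/v)$, where $\alpha = 1$; when n approaches infinity in SPS, $F^{-1} = \alpha\,(n/v)$, where $\alpha = 1$.

Proof In LPS:

$$\frac{n}{v} = \frac{n}{2\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{i\,t_0} + k}$$

$$\frac{n}{2\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{i\,t_0} + k} \le \frac{n}{v} \le \frac{n}{2\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{i\,t_0}}$$

$$\frac{n}{2\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{(n/2)\,t_0} + k} \le \frac{n}{v} \le \frac{n}{2\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{n\,t_0}}$$

As n approaches infinity, the left-hand side tends to $n t_0/2$ and the right-hand side to $n t_0$, so that

$$t_0/2 \le 1/v \le t_0$$

When n approaches infinity, $F^{-1} \approx \frac{3n}{4}\,t_0$ by Equation 2. Substituting this into the above inequalities gives $3/4 \le \alpha \le 3/2$.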

We can prove the relationships in RPS and SPS in the same way.

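Theorem 2 The ratio between the propagation time (n/v) and the reciprocal of the propagation power of any network structure has upper and lower limits. The upper limit function of the two variables is f(n, α) = α × n, and the lower limit function is f(n, α) = α/n, where α = 1.

Proof Since

$$F = \left[\frac{t_0}{n}\sum_{i=1}^{n} disp_i\right]^{-1}$$

and $1 \le disp_i \le n$, we have

$$\frac{t_0}{n}\sum_{i=1}^{n} 1 \le F^{-1} \le \frac{t_0}{n}\sum_{i=1}^{n} n$$

$$t_0 \le F^{-1} \le n\,t_0$$

Similarly,

$$v = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0}$$

and because $1 \le \alpha_i \le n$,

$$\frac{1}{n}\sum_{i=1}^{n}\frac{1}{t_0} \le v \le \frac{1}{n}\sum_{i=1}^{n}\frac{n}{t_0}$$

$$1/t_0 \le v \le n/t_0$$

$$t_0 \le n/v \le n\,t_0$$

From the above, it can be concluded that $\sup(n/v) = F^{-1} \times f(n, \alpha)$, where f(n, α) = αn (n/v at its upper limit and $F^{-1}$ at its lower limit), and $\inf(n/v) = F^{-1} \times f(n, \alpha)$, where f(n, α) = α/n (n/v at its lower limit and $F^{-1}$ at its upper limit).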

Theorem 3 The speeds of virus propagation in three SBSs are different when the numbers of nodes are the same. The speed in LPS v line , the speed in RPS v ring , and the speed in SPS v star satisfy the inequality v line  < v ring  < v star .

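Proof Since

$$v_{line} = \frac{1}{n}\sum_{i=1}^{n}\frac{n}{\alpha_i t_0} = 2\times\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{i\,t_0} + k < 2\times\sum_{i=\lceil n/2\rceil}^{n-1}\frac{1}{\lfloor n/2\rfloor\, t_0} + k = \frac{n}{\lfloor n/2\rfloor\, t_0} = v_{ring}$$

according to Equations 1 and 3, $v_{line} < v_{ring}$. Since, for n > 3,

$$\frac{n}{\lfloor n/2\rfloor\, t_0} < \frac{n+1}{2t_0}$$

according to Equations 3 and 5, $v_{ring} < v_{star}$.

Therefore, $v_{line} < v_{ring} < v_{star}$.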

From Equations 1, 3, and 5, the following conclusions can be made:

  1. The speed of virus propagation in LPS decreases when the number of nodes increases.
  2. The speed of virus propagation in RPS fluctuates around $2/t_0$.
  3. The speed of virus propagation in SPS increases when the number of nodes increases. The relationship between the speed and the number of nodes is linear.

Through the above analysis, we can see that the structure of the network is the main factor in virus propagation and the most dangerous structure is the SPS. When a network has more SPS, the risk of propagation is higher.

According to Equations 1, 3, and 5, the speed of virus propagation is not only related to the scale of the network, but it also largely depends on the structure of the network. The star structure has played the most significant role in increasing the speed of virus propagation. In any given network, the more star structures there are and the bigger the propagation power F is, the faster the virus propagation would be. The influence of the star structure also increases with the increase of the number of nodes. The influence of the ring structure is smaller and that of the linear structure is the smallest. Overall, the propagation is slower in a network with more linear structures and ring structures than that in one with more star structures.
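The ordering of the three speeds and the trends noted above can be checked numerically for a range of network sizes. The following sketch evaluates the speeds in units of $1/t_0$ directly from the hop counts of each structure; the function names are hypothetical and the range n = 4..200 is an arbitrary choice for illustration.

```python
from fractions import Fraction
from math import log


def v_line(n):
    """LPS: node i on a path needs max(i, n - 1 - i) hops to reach everyone."""
    return sum(Fraction(1, max(i, n - 1 - i)) for i in range(n))


def v_ring(n):
    """RPS: every node needs floor(n / 2) hops."""
    return Fraction(n, n // 2)


def v_star(n):
    """SPS: the hub needs 1 hop, each of the n - 1 leaves needs 2."""
    return Fraction(n + 1, 2)


for n in range(4, 201):
    assert v_line(n) < v_ring(n) < v_star(n)             # ordering of the three speeds
    assert v_line(n + 1) <= v_line(n)                    # LPS speed does not increase with n
    assert abs(v_ring(n) - 2) <= Fraction(2, n - 1)      # RPS speed stays close to 2 / t0
    assert v_star(n + 1) - v_star(n) == Fraction(1, 2)   # SPS speed grows linearly in n

# For large n the LPS speed approaches 2 ln 2 (in units of 1 / t0).
print(float(v_line(200)), 2 * log(2))
```

Under these definitions the LPS speed is the same for n = 2m and n = 2m + 1, so it is non-increasing rather than strictly decreasing at every step.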

5. Calculating propagation power in dynamic network

It is complex to evaluate the risk of virus propagation in a large-scale dynamic network due to the following reasons: first, when the network scale is large, it would be too complex to compute the propagation power of the entire network; second, since the risk of virus propagation changes frequently due to the change of the network structure, a large amount of computation is required to reevaluate the propagation risk after each change. In this section, we propose a feasible algorithm to compute the propagation power of virus in large-scale dynamic networks based on dynamic community mining algorithm in [26], which can efficiently evaluate the propagation risk of virus in a community network.

Four types of changes can be observed in a dynamic network, namely adding edges, deleting edges, adding nodes, and deleting nodes. Each leads to a different derived network and to the need to recalculate the propagation power in the network. For each type of change, we describe how the algorithm calculates the propagation power in the dynamic network.

5.1. Add a new node

Adding a new node in the network can change the community structure in the following two ways for which the propagation power of the community has to be recalculated.

First, when a new node not connected to any other node in the network is added, a new community is added to the entire network. Since the propagation power of the existing communities does not change, their values remain unchanged.

Second, when a new node connected with other nodes in the network is added, the situation is more complicated since the structure of the network community may change. Because the nodes connected to the new node may belong to one or more communities, only the propagation power of the community to which the node is added needs to be recalculated. The propagation power of the entire community network is then updated accordingly. Algorithm 1 is described in detail below:

5.2. Add a new edge

Adding an edge may lead to a variety of changes in structure. We recalculate the propagation power of community after each change, which may be one of the two following cases.

When the two nodes of a new edge belong to the same community, the new edge is internal to that community. We simply need to recalculate the propagation power of that community after adding the new edge and update the value of the propagation power of the entire community network.

When the two nodes of a new edge belong to different communities, a new external edge is added, which may lead to two different circumstances. First, the original community structure may remain unchanged, in which case we keep the value of the propagation power of the whole community network unchanged. Second, when adding the two nodes to a community increases the value of the community rating function, we add the two nodes to that community, calculate the propagation power of the transformed community, and then update the value of the propagation power of the entire community network. The details of Algorithm 2 are described as follows:
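The listings of Algorithms 1 and 2 are not reproduced in this text. The following is only a minimal sketch of the incremental idea behind Sections 5.1 and 5.2, under several assumptions: community membership is supplied by the caller, each community is connected, F is computed on the subgraph induced by a community, a one-node community is given F = None, and the reassignment of nodes driven by the community rating function is left out. The class `CommunityPowerCache` and its methods are hypothetical names.

```python
from collections import deque


def propagation_power(nodes, adj, t0=1.0):
    """F of the subgraph induced by `nodes`; None for a single node (assumption)."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return None
    total = 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in nodes and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += max(dist.values())
    return len(nodes) / (t0 * total)


class CommunityPowerCache:
    """Caches F per community and recomputes only the community that changed."""

    def __init__(self, adj, communities):
        self.adj = {u: set(vs) for u, vs in adj.items()}
        self.members = {c: set(ns) for c, ns in communities.items()}
        self.community_of = {u: c for c, ns in self.members.items() for u in ns}
        self.power = {c: propagation_power(ns, self.adj) for c, ns in self.members.items()}
        self.recomputed = []  # ids of communities recomputed after construction

    def _refresh(self, c):
        self.power[c] = propagation_power(self.members[c], self.adj)
        self.recomputed.append(c)

    def add_node(self, u, neighbours=(), community=None):
        """Isolated node: new one-node community. Otherwise the node joins `community`,
        which is assumed to contain at least one of its neighbours."""
        self.adj[u] = set()
        if not neighbours:
            self.members[u] = {u}      # community id = node id (assumption)
            self.community_of[u] = u
            self.power[u] = None
            return
        self.members[community].add(u)
        self.community_of[u] = community
        for w in neighbours:
            self.adj[u].add(w)
            self.adj[w].add(u)
        self._refresh(community)

    def add_edge(self, u, w):
        self.adj[u].add(w)
        self.adj[w].add(u)
        if self.community_of[u] == self.community_of[w]:
            self._refresh(self.community_of[u])   # internal edge: one community only
        # external edge: cached values are kept (community reassignment not modelled)


# Assumed example: community "a" is a 4-node path, community "b" is a triangle.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5, 6], 5: [4, 6], 6: [4, 5]}
cache = CommunityPowerCache(adj, {"a": [0, 1, 2, 3], "b": [4, 5, 6]})
cache.add_edge(0, 3)                               # internal to "a": path becomes a ring
cache.add_edge(3, 4)                               # external: nothing is recomputed
cache.add_node(7)                                  # isolated node: new community
cache.add_node(8, neighbours=[4, 5], community="b")
print(cache.power, cache.recomputed)
```

In this example only communities "a" and "b" are recomputed once each, while the external edge and the isolated node trigger no recomputation.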

5.3. Delete an edge

Compared with the above cases, deleting an edge is less frequent. When an edge is deleted from the network, we recalculate the propagation power according to the following cases.

When the two nodes of the edge belong to different communities, the overall value of propagation power remains unchanged.

When the two nodes of the edge belong to the same community and the degree of one of the nodes is 1, this node will become an isolated node, forming a new and separate community that contains only one node. The structure of the rest of the community remains unchanged. In particular, when the degrees of both nodes are 1, the deletion of the edge will cause both nodes to become isolated nodes, and two separate communities would form. We need to recalculate the F value of the two one-node communities.

When the two nodes of the edge belong to the same community, but the degrees of the nodes are different from the above situations, the value of the function of the community ratings will decrease. The community may remain unchanged or split into two communities after an internal edge is deleted. In the latter case, the F value of the divided communities has to be recalculated. A detailed Algorithm 3 is presented as follows:

5.4. Delete a node

When a node is deleted from the network, recalculating the F value is a complex process. When the degree of the deleted node is 1, only one edge is deleted. Thus, the rest of the community structure and the F value of the original community remain unchanged. When the degree of the deleted node is more than 1, it will appreciably impact the original community and loosen the structure of the original community.

When we consider other nodes adjacent to the deleted node, the adjacent nodes connected to different communities should be added to the appropriate community so that the function value of the community ratings is higher. F values of the changed communities have to be recalculated. A detailed Algorithm 4 is described as follows:
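The listings of Algorithms 3 and 4 are likewise not reproduced in this text. The sketch below illustrates only the part of Sections 5.3 and 5.4 that can be stated without the community rating function: after a deletion, the affected community is checked for connectivity, split into its connected components if necessary, and F is recalculated for those parts alone. Moving adjacent nodes to other communities is not modelled, one-node communities are given F = None, integer node labels are assumed, and the function names and the "a.1" naming of split parts are hypothetical.

```python
from collections import deque


def bfs(start, nodes, adj):
    """Hop distances from `start` inside the subgraph induced by `nodes`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in nodes and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def propagation_power(nodes, adj, t0=1.0):
    if len(nodes) < 2:
        return None  # assumption: F is left undefined for a one-node community
    total = sum(max(bfs(s, nodes, adj).values()) for s in nodes)
    return len(nodes) / (t0 * total)


def components(nodes, adj):
    """Connected components of the subgraph induced by `nodes`."""
    remaining, parts = set(nodes), []
    while remaining:
        part = set(bfs(next(iter(remaining)), remaining, adj))
        parts.append(part)
        remaining -= part
    return parts


def refresh(members, power, c, adj):
    """Recompute community c only, splitting it if it is no longer connected."""
    nodes = members.pop(c)
    power.pop(c)
    parts = sorted(components(nodes, adj), key=lambda p: (-len(p), min(p)))
    for k, part in enumerate(parts):
        cid = c if k == 0 else f"{c}.{k}"   # hypothetical naming for split parts
        members[cid] = part
        power[cid] = propagation_power(part, adj)


def delete_edge(u, w, adj, members, power):
    adj[u].discard(w)
    adj[w].discard(u)
    cu = next(c for c, ns in members.items() if u in ns)
    if w in members[cu]:
        refresh(members, power, cu, adj)    # internal edge: one community affected
    # edge between communities: cached values are kept


def delete_node(u, adj, members, power):
    c = next(c for c, ns in members.items() if u in ns)
    for w in adj.pop(u):
        adj[w].discard(u)
    members[c].discard(u)
    refresh(members, power, c, adj)


# Assumed example: "a" is the path 0-1-2-3, "b" is a star with hub 4, joined by edge 3-4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5, 6, 7}, 5: {4}, 6: {4}, 7: {4}}
members = {"a": {0, 1, 2, 3}, "b": {4, 5, 6, 7}}
power = {c: propagation_power(ns, adj) for c, ns in members.items()}
before = dict(power)

delete_edge(3, 4, adj, members, power)   # external edge: nothing is recomputed
delete_edge(1, 2, adj, members, power)   # "a" splits into {0, 1} and {2, 3}
delete_node(4, adj, members, power)      # removing the hub leaves three isolated nodes
print(before, power)
```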

6. Experiments and analysis

6.1. Experimental environment and simulation process

In this study, the algorithms are implemented in C++ with VS2008 (Microsoft, Redmond, WA, USA) and run on an Intel Celeron 2.2 GHz processor (Intel, Santa Clara, CA, USA) with 2 GB of memory under Windows 7.

In general, three methods are used to simulate the propagation of a virus [22]: log files, the infection model, and the synthetic model. Log files reflect real users' behavior, but because their geographical coverage is incomplete, they cannot represent all behaviors. The virus infection model can be computed efficiently, but many details are neglected. The synthetic model is very flexible and can cover the entire region. In this paper, we adopt the third method for model simulation.

Figure 7 shows the whole algorithm process, where G represents the network in which the virus will spread and time[i] represents the time taken to infect the entire network, with i representing the i th node that causes infection. Infected[i] denotes whether the i th node is infected. Allnodestart denotes the number of nodes that causes infection in graph G.
Figure 7

Simulation algorithm flowchart.
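Since the flowchart itself is not reproduced here, the simulation loop can be sketched as follows. This is an assumed reading of the description above, not the authors' code: every node in turn acts as the start node, the infection spreads synchronously to all neighbours of infected nodes with probability 1, and one step takes $t_0 = 1$. The dictionaries `infected` and `time` play the roles of Infected[i] and time[i]; the function names are hypothetical.

```python
def simulate(adj, start):
    """Synchronous spread from `start`; returns the number of steps to infect all nodes."""
    infected = {u: False for u in adj}
    infected[start] = True
    steps = 0
    while not all(infected.values()):
        newly = {w for u in adj if infected[u] for w in adj[u] if not infected[w]}
        if not newly:
            raise ValueError("graph is not connected")
        for w in newly:
            infected[w] = True
        steps += 1
    return steps


def run_all(adj):
    """Run one simulation per start node and derive v and F (with t0 = 1)."""
    time = {i: simulate(adj, i) for i in adj}
    n = len(adj)
    v = sum(n / t for t in time.values()) / n   # average speed over all start nodes
    f = n / sum(time.values())                  # propagation power
    return time, v, f


# Assumed example: a star with hub 0 and four leaves.
star5 = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
time, v, f = run_all(star5)
print(time, v, f)
```

For this star the hub needs one step and each leaf needs two, which gives v = 3 and F = 5/9, in agreement with Equations 5 and 6 for n = 5.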

6.2. Relationship between propagation power and virus infection

Section 4 shows how to calculate the propagation speed of a virus. Here we analyze the relationship between propagation speed ν and propagation power F in the basic structures. We first use the three basic structures as the input graph G to obtain the corresponding propagation speed ν and then calculate the corresponding propagation power F. To simplify the simulation, the time taken for a node to infect another node is assumed to be 1 (this assumption does not affect the results).

6.2.1. Relationship between propagation power and virus infection in LPS

We select 100%, 80%, 60%, and 50% of the nodes from the network respectively to constitute a linear structure. The remaining nodes are randomly connected to the linear networks. The relationships between propagation speed ν and the number of nodes in the linear structure under different proportions are observed by experiment as shown in Figure 8a.
Figure 8

Simulation of virus propagation in LPS (a, b).

As can be seen in Figure 8a, when a network is closer to a linear structure, the propagation speed is lower. Thus, we conclude that the linear structure may help restrain the spread of the virus. Figure 8b shows that the value of propagation power F is not only related to the propagation time; it is also influenced by the number of the nodes in the linear structure.

When the number of nodes increases, the reciprocal value F−1 of the propagation power will also increase and the value of propagation power F will decrease.

The propagation time n/v (the time t0 taken for a node to infect another node is assumed to be 1) and reciprocal value F−1 of the propagation power are almost linearly related in the linear structure, which is consistent with the theoretical analysis in Section 4.

6.2.2. Relationship between propagation power and viral infection in RPS

We select 100%, 80%, 60%, and 50% of the nodes from the network respectively to constitute a ring structure. The remaining nodes are randomly connected to the network. The relationships between propagation speed ν and the number of nodes n in the ring structure under different proportions are observed by experiment as shown in Figure 9a.
Figure 9

Simulation of virus propagation in RPS (a, b).

As shown in Figure 9a, when a network is closer to a ring structure, the propagation speed is lower. Thus, we conclude that the ring structure can also help restrain the spread of the virus. Figure 9b shows that the value of propagation power F is not only related to the propagation time; it is also influenced by the number of nodes in the ring structure.

When the number of nodes increases, the reciprocal value F−1 of propagation power will also increase and the value of propagation power F will decrease.

The propagation time n/v (the time t0 taken by a node to infect another node is assumed to be 1) and reciprocal value F−1 of propagation power are almost linearly related in the ring structure, which is consistent with the theoretical analysis in Section 4.

6.2.3. Relationship between propagation power and viral infection in SPS

We respectively select 100%, 80%, 60%, and 50% of the nodes from the network to constitute a star structure. The remaining nodes are randomly connected to the networks. The relationships between propagation speed ν and the number of nodes n in the star structure are observed under different proportions by experiment as shown in Figure 10a.
Figure 10

Simulation of virus propagation in SPS (a, b).

As shown in Figure 10a, when a network is closer to a star structure, the propagation speed is higher. Thus, we conclude that the star structure may facilitate the spread of the virus. Figure 10b shows that the value of propagation power F is not only related to propagation time; it is also influenced by the number of nodes in the star structure. When the number of nodes increases, reciprocal value F−1 of the propagation power will also increase.

We can also find that propagation time n/v (the time t0 taken by a node to infect another node is assumed to be 1) and reciprocal value F−1 of propagation power are almost linearly related in the star structure.

When n approaches infinity, the ratio between n/v and reciprocal value F−1 of propagation power is gradually fixed, suggesting that the relationship between the reciprocal value F−1 of the propagation power and the propagation time n/v approaches linear in the star structure, which is consistent with the theoretical analysis in Section 4.

6.2.4. Range of changes of community propagation power

In addition to the changes in basic structures, for a change in the community, the value of propagation power changes within a certain range in a dynamic community. When a community is a complete subgraph, we find that the risk of virus propagation is highest based on our definition of propagation power. The value of propagation power F in a complete graph is $F = \left[\frac{1}{n}\sum_{i=1}^{n} 1\right]^{-1} = 1$ (with $t_0 = 1$), according to Section 4.1. When a community is transformed into a linear structure, the propagation power F is lowest and the reciprocal value of propagation power is highest, according to Section 3.2, as shown in Figure 11a.
Figure 11

Range of propagation power (a, b).

The reciprocal value $F^{-1}$ of the propagation power of any community changes within a certain range. Among all communities, the propagation speed is highest in the complete subgraph and lowest in the linear structure. The experimental results are shown in Figure 11b.

As shown in Figure 11b, the propagation time n/v of any community also changes within a certain range. The propagation time is shortest in the complete subgraph and longest in the linear structure.
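
As a rough illustration of why the complete subgraph and the linear structure bound the propagation time, the following sketch uses a simplified broadcast model. The model is an assumption and is not the simulation used in this paper: every infected node infects all of its neighbours in one time unit, so the time needed to infect a whole community from a given source equals the breadth-first-search depth from that source. All function names are hypothetical.

```python
from collections import deque


def complete_graph(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def star_graph(n):
    adj = {0: list(range(1, n))}  # node 0 is the hub
    adj.update({i: [0] for i in range(1, n)})
    return adj


def path_graph(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def full_infection_time(adj, source):
    """Time units until all nodes are infected (BFS depth from source)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def mean_infection_time(adj):
    """Average full-infection time over every possible starting node."""
    return sum(full_infection_time(adj, s) for s in adj) / len(adj)
```

Under this assumed model with n = 5, the mean time over all starting nodes is 1.0 for the complete graph, 1.8 for the star, and 3.2 for the path, so the complete and linear structures sit at the two ends of the range.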

The above results are consistent with Theorem 2, which suggests that the ratio of the reciprocal of the propagation power of any network structure to the propagation time (n/v) has upper and lower limits, i.e.,
$$\sup(n/v) = F^{-1} \times f(n, \alpha),$$
where f(n, α) is a linear function of n (when $F^{-1}$ approaches the lower limit and n/v approaches the upper limit);
$$\inf(n/v) = F^{-1} \times f(n, \alpha),$$

where f(n, α) is a decreasing function of n (when $F^{-1}$ approaches the upper limit and n/v approaches the lower limit).

7. Case studies

With the concept of propagation power F, we can better understand the risk of virus propagation. In this section, the examples in Figures 1, 2, and 3 are analyzed to illustrate how to apply this propagation power model to actual cases and how to use the proposed algorithms.

Figure 1 in Section 2 shows the topology of two microblog communities. From the figure alone, it is difficult to tell how the risk of virus propagation differs between communities a and b. Through calculation, however, we obtain the propagation power of the first community, F_a = 0.1919, and that of the second community, F_b = 0.2786. Therefore, the virus propagation risk of the second community is greater than that of the first. According to our calculation, it takes 5.0222 s to infect all nodes in the first community and 3.458 s in the second (assuming the average time for one node to infect another node is 1). A threshold value for the risk of virus propagation can therefore be set, and preventive measures can be taken whenever the calculated propagation power exceeds that threshold.
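
The threshold check described above can be sketched in a few lines. The F values and infection times are those reported for the two communities in Figure 1; the threshold `RISK_THRESHOLD = 0.25` and the function names are hypothetical and chosen only for illustration.

```python
# Values reported in the case study for the two microblog communities.
communities = {
    "a": {"F": 0.1919, "time": 5.0222},
    "b": {"F": 0.2786, "time": 3.458},
}

RISK_THRESHOLD = 0.25  # hypothetical threshold, not taken from the paper


def flag_risky(communities, threshold):
    """Return the names of communities whose propagation power exceeds the threshold."""
    return sorted(name for name, c in communities.items() if c["F"] > threshold)


def time_to_reciprocal_ratio(c):
    """Ratio of the propagation time to F^{-1}, i.e. time * F."""
    return c["time"] * c["F"]
```

With this hypothetical threshold only community b is flagged. The ratio of propagation time to $F^{-1}$ is roughly 0.96 for both communities, which is in keeping with the bounded ratio stated in Theorem 2.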

Propagation power is useful not only for analyzing the risk of virus propagation in a static network; it may also be used to optimize network connections so as to restrain virus propagation. For example, in each of the four network connection schemes in Figure 2, the networks are connected through two links. Table 2 presents the values of propagation power and the time taken to infect all nodes, obtained through simulation.

It is clear that the greater the propagation power F, the shorter the propagation time of the virus. According to Table 2, scheme 4 should be used because it has the lowest propagation power and the longest propagation time. Therefore, propagation power can be used to optimize the network structure for restraining virus propagation.
Table 2

Propagation power table

Scheme    Propagation power F    Propagation time t
1         0.22                   4.33
2         0.19                   4.97
3         0.20                   4.82
4         0.17                   5.45
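
The selection of a scheme from Table 2 can be expressed as a short sketch. The rows are the values from the table; the function names are hypothetical.

```python
# (scheme, propagation power F, propagation time t) from Table 2.
table2 = [(1, 0.22, 4.33), (2, 0.19, 4.97), (3, 0.20, 4.82), (4, 0.17, 5.45)]


def safest_scheme(rows):
    """Return the scheme with the lowest propagation power F."""
    return min(rows, key=lambda row: row[1])[0]


def time_decreases_with_power(rows):
    """Check that a larger F always comes with a shorter propagation time."""
    ordered = sorted(rows, key=lambda row: row[1])
    times = [t for _, _, t in ordered]
    return all(a > b for a, b in zip(times, times[1:]))
```

For these four rows, `safest_scheme(table2)` returns 4 and `time_decreases_with_power(table2)` returns `True`.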

With the dynamic algorithm, it is not always necessary to recalculate the value of propagation power when a real network changes. As shown in Figure 3, when an edge is added to graph a, which consists of three communities, graph a becomes graph b. Since the value of the community rating function remains unchanged, the structure of the original communities also remains unchanged, and it is not necessary to recalculate the F value of any community. When two edges are added to graph b, graph b becomes graph c. We then need to recalculate only the partial propagation power of communities II and III; it is not necessary to recalculate the propagation power of community I.
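
The idea of recomputing only the affected communities can be sketched as follows. The rule used here is a simplifying assumption: only an edge whose two endpoints lie in the same community is taken to change that community's structure. The criterion in this paper relies on the community rating function, which this sketch does not reproduce. The node labels and the function name are hypothetical and do not correspond to the actual nodes in Figure 3.

```python
def communities_to_update(membership, new_edges):
    """Return the communities whose propagation power should be recomputed.

    membership maps each node to its community label. An added edge is
    assumed to affect a community only if both endpoints belong to it.
    """
    dirty = set()
    for u, v in new_edges:
        if membership[u] == membership[v]:
            dirty.add(membership[u])
    return dirty


# Hypothetical six-node example with three communities.
membership = {1: "I", 2: "I", 3: "II", 4: "II", 5: "III", 6: "III"}
```

In this example, adding the inter-community edge (2, 3) yields an empty set, so no F value is recomputed, while adding the edges (3, 4) and (5, 6) marks only communities II and III.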

8. Conclusions

8.1. Summary of the study

This paper analyzes virus propagation and attempts to develop virus defense mechanisms by taking into account the network topology and the dynamic change of the network structure. Unlike traditional approaches to restraining computer virus propagation, we propose the concept of propagation power F to quantify the risk of virus propagation, which is influenced by the network structure. Preventive measures can then be taken in advance.

The main contributions of this paper are as follows:
  1. A quantitative framework is proposed to assess the influence of network structure on virus propagation. We present three basic propagation structures (linear, ring, and star structures) and the concept of propagation power F to determine how a structure affects the speed of virus propagation.

  2. The quantitative relationships between propagation power and network structures are found and verified. It is theoretically proven that propagation power F and the reciprocal of the propagation speed (1/v) are related in the three basic structures. The relationships are also verified through experiments. Therefore, propagation power F is a valid indicator of the risk of virus propagation.

  3. We develop an algorithm to compute the propagation power F for a dynamic community structure. Because it does not need to recalculate every community, this algorithm can compute propagation power in a large-scale network efficiently.

The feasibility of the framework and the algorithms has been verified through experiments and case studies.

8.2. Future work

To further improve the proposed approach, efforts should be made to improve the algorithm for computing propagation power so that the efficiency of calculation can increase without compromising the accuracy. In addition, the characteristics of network virus should be taken into consideration to build a more integrated mechanism for restraining virus propagation.

Declarations

Acknowledgements

The paper is supported by the National Natural Science Foundation of China (60903175, 61272405, 61272033, 61272451) and University Innovation Foundation (2013TS102, 2013TS106). Mr. Ho Simon Wang at HUST Academic Writing Center has provided tutorial assistance to improve the manuscript.

Authors’ Affiliations

(1) School of Computer, Huazhong University of Science and Technology
(2) Network and Computer Center, Huazhong University of Science and Technology
(3) Department of Computer Science and Engineering, Shanghai Jiao Tong University

References

  1. Leyden J: Mobile malware menace hits high - McAfee. 2007. . Accessed 2 Mar 2007 http://www.theregister.co.uk/2007/02/12/mobile_malware/Google Scholar
  2. Shi D-H, Lin B, Chiang H-S, Shih M-H: Security aspects of mobile phone virus: a critical survey. Ind. Manage. Data Syst. 2008, 108(4):478-494. 10.1108/02635570810868344View ArticleGoogle Scholar
  3. Kim H, Smith J, Shin Kang G: Detecting energy-greedy anomalies and mobile malware variants. Proceedings of the 6th International Conference on Mobile Systems Applications and Services, Breckenridge, 17–20 June 2008. New York: Association for Computing Machinery; 2008:239-252.Google Scholar
  4. Xie L, Song H, Jaeger T, Zhu S: A systematic approach for cell-phone worm containment. Proceedings of the 17th International World Wide Web Conference (WWW08), Beijing, 21–25 April 2008. New York: Association for Computing Machinery; 2008:1083-1084.Google Scholar
  5. Xu N, Zhang F, Luo Y, Jia W, Xuan W, Teng J: Stealthy video capturer: a new video-based spyware in 3G smartphones. Proceedings of the 2nd ACM Conference on Wireless Network Security(WiSec09), Zurich, 16–18 March 2009. New York: Association for Computing Machinery; 2009:69-78.Google Scholar
  6. Salathe M, Kazandjieva M, Lee JW, Levis P, Feldman MW: A high-resolution human contact network for infectious disease transmission. PNAS 2010, 107(51):22020-22025. 10.1073/pnas.1009094108View ArticleGoogle Scholar
  7. Fenichel EP, Castillo-Chavez C, Ceddia MG, Chowell G, Parra PA, Hickling GJ, Holloway G, Horan R, Morin B, Perrings C, Springborn M, Valazquez L, Villalobos C: Adaptive human behavior in epidemiological models. PNAS 2011, 108(15):6306-6311. 10.1073/pnas.1011250108View ArticleGoogle Scholar
  8. Wang P, Gonzalez MC, Hidalgo CA: Understanding the spreading patterns of mobile phone viruses. Science 2009, 324(5930):1071-1076. 10.1126/science.1167053View ArticleGoogle Scholar
  9. Gao C, Liu J: Modeling and restraining mobile virus propagation. T. Mobile Comput. 2012, 99: 1-3.Google Scholar
  10. Channakeshava K, Bisset K, Kumar VSA, Marathe M, Yardi S: High performance scalable and expressive modeling environment to study mobile malware in large dynamic networks, 25th IEEE International Parallel and Distributed Processing Symposium, Anchorage, 16–20 May 2011. Piscataway: IEEE; 2011:770-781.Google Scholar
  11. Yajin Z, Xuxian J: Dissecting android malware: characterization and evolution. Proceedings of the 33rd IEEE Symposium on Security and Privacy, San Francisco, 20–23 May 2012. Piscataway: IEEE; 2012:95-109.Google Scholar
  12. Gopalan A, Banerjee S, Das AK, Shakkottai S: Random mobility and the spread of infection. IEEE INFOCOM 2011, Shanghai, 10–15 April 2011. Piscataway: IEEE; 2011:999-1007.Google Scholar
  13. Pastor-Satorras R, Vespignani A: Immunization of complex networks. Physical 2002, 65(3):6104.Google Scholar
  14. Dezso Z, Barabasi AL: Halting viruses in scale-free networks. Physical 2002, 65(5):5103.Google Scholar
  15. Holme P, Kim BJ: Vertex overload breakdown in evolving networks. Physical 2002, 65(6):6109.Google Scholar
  16. Chen Y, Paul G, Havlin S, Liljeros F, Stanley HE: Finding a better immunization strategy. Physical 2008, 101(5):8701.Google Scholar
  17. Cohen R, Havlin S, Ben-Averaham D: Efficient immunization strategies for computer networks and populations. Physical 2003, 91(24):7901.Google Scholar
  18. Holme P: Efficient local strategies for vaccination and network attack. Europhys. Lett. 2004, 68(6):908-914. 10.1209/epl/i2004-10286-2View ArticleGoogle Scholar
  19. Gallos LK, Liljeros F, Argyrakis P, Bunde A, Havlin S: Improving immunization strategies. Physical 2007, 75(4):5104.Google Scholar
  20. Gomez-Gardenes J, Echenique P, Moreno Y: Immunization of real complex communication networks. Eur. Phys. 2002, 49(2):259-264.View ArticleGoogle Scholar
  21. Echenique P, Gomez-Gardenes J, Moreno Y, Vazquez A: Distance-d covering problem in scale-free networks with degree correlation. Physical 2005, 71(3):5102.Google Scholar
  22. Gao C, Liu J, Zhong N: Network immunization with distributed autonomy-oriented entities. IEEE Trans. Parallel Distrib. Syst. 2011, 22(7):1222-1229.View ArticleGoogle Scholar
  23. Zyba G, Voelker GM, Liljenstam M, Mehes A, Johansson P: Defending mobile phones from proximity malware. IEEE INFOCOM 2009, Rio de Janeiro, 19–25 April 2009. Piscataway: IEEE; 2009:1503-1511.Google Scholar
  24. Hui P, Crowcroft J, Yoneki E: BUBBLE rap: social-based forwarding in delay tolerant networks. IEEE Trans. Mob. Comput. 2011, 10(11):1576-1589.View ArticleGoogle Scholar
  25. Li F, Yang Y, Wu J: CPMC: An efficient proximity malware coping scheme in smartphone-based mobile networks. IEEE INFOCOM 2010, San Diego, 14–19 March 2010. Piscataway: IEEE; 2010:1-9.Google Scholar
  26. Jackson J, Creese S: Virus propagation in heterogeneous Bluetooth networks with human behaviors. IEEE T. Depend. Secure 2012, 9(6):930-943.View ArticleGoogle Scholar
  27. Abhijit B, Xin H, Kang S, Taejoon P: Behavioral detection of malware on mobile handsets, in MobiSys '08. Proceedings of the 6th International Conference on Mobile Systems Applications and Services, Breckenridge, 17–20 June 2008. New York: Association for Computing Machinery; 2008:225-238.Google Scholar
  28. Kim H, Smith J, Shin KG: On detecting energy-greedy anomalies. IEEE Trans. Mob. Comput. 2011, 10(7):968-981.View ArticleGoogle Scholar
  29. Romualdo P-S, Alessandro V: Epidemic dynamics in finite size scale-free networks. Phys. Rev. E 2002, 65(3):035108.View ArticleGoogle Scholar
  30. Zhou YJ, Wang Z, Zhou W, Jiang XX: Hey, you, get off of my market: detecting malicious apps in official and alternative android markets. In Proceedings of the 19th Network and Distributed System Security Symposium NDSS 2012. San Diego; 2012.Google Scholar
  31. Sedgewick R, Wayne K: Algorithms. New Jersey: Addison-Wesley Press; 2011:412-445.Google Scholar

Copyright

© Cai et al.; licensee Springer. 2013

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.