By adopting the unit disk graph (UDG) model, which is the intersection graph of equal-sized circles [16], the VANET topology can be abstracted as an undirected graph tagged with time stamps. For clarity, we first introduce the notation for the related definitions.

### 3.1 Network model

The traditional static graph model of a network is *G* = < *V*,*E* >, where *V* represents the nodes and *E* represents the edges between the nodes. However, a VANET is dynamic: its topology evolves as the vehicles move. Therefore, the topology of a VANET can be expressed by a time-stamped graph *G*(*T*) = < *V*,*E*,*T* >, where *V* represents all the vehicles, *E* represents the links between pairs of vehicles whose Euclidean distance is smaller than the wireless communication range *R*, and *T* is the set of time stamps. In real cases, obstacles such as high buildings may prevent two vehicles from establishing a link even though their distance is smaller than the wireless communication range. In this situation, every vehicle can discover its real neighbor list by exchanging beacon messages with its neighbors. Consequently, the VANET topology can be derived from the neighbor lists of all vehicles.
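As an illustration of the model above, the following minimal sketch builds one snapshot per time stamp from vehicle positions using the pure distance rule. The input format (a dictionary from time stamp to vehicle positions) and the function names are assumptions made for this example; the sketch ignores obstacles, so it yields the idealized UDG rather than the beacon-based neighbor lists.

```python
from itertools import combinations
from math import hypot


def build_snapshot(positions, comm_range):
    """Return the edge set E_t of the unit disk graph for one time stamp.

    positions: dict mapping vehicle id -> (x, y); hypothetical input format.
    comm_range: wireless communication range R, in the same unit as positions.
    """
    edges = set()
    for u, v in combinations(sorted(positions), 2):
        (x1, y1), (x2, y2) = positions[u], positions[v]
        # A link exists only if the Euclidean distance is smaller than R.
        if hypot(x1 - x2, y1 - y2) < comm_range:
            edges.add((u, v))
    return edges


def build_time_stamped_graph(trace, comm_range):
    """trace: dict t -> positions. Returns dict t -> (V_t, E_t)."""
    return {t: (set(pos), build_snapshot(pos, comm_range))
            for t, pos in trace.items()}


# Toy trace with three vehicles and two time stamps (made-up coordinates).
trace = {
    0: {"a": (0.0, 0.0), "b": (100.0, 0.0), "c": (400.0, 0.0)},
    1: {"a": (0.0, 0.0), "b": (150.0, 0.0), "c": (300.0, 0.0)},
}
graph = build_time_stamped_graph(trace, comm_range=200.0)
```

In this toy trace, vehicle c is isolated at time stamp 0 and joins the others at time stamp 1, which is the kind of topology change the time stamps are meant to capture.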

### 3.2 Routing path

Routing in a network amounts to finding a path in a given topology. We first define the routing path in a simplified static case. At a given time stamp *t*, the topology of the VANET is *G*(*t*) = < *V*_{t},*E*_{t},*t* >. There is a non-empty sub-graph of *G*(*t*), denoted as *P*(*t*) = < *V*^{′}_{t},*E*^{′}_{t},*t* >. Assume *V*^{′}_{t} ⊆ *V*, *E*^{′}_{t} ⊆ *E*, *n* = |*V*^{′}_{t}|, *V*^{′}_{t} = {*v*_{a1},*v*_{a2},…,*v*_{an}}. *P*(*t*) is called a *path* if and only if there exists *σ* : *V*^{′}_{t} → *V*^{′}_{t}; *σ*(*v*_{ai}) = *v*_{i} (*i* = 1,…,*n*), s.t. {{E}^{\prime}}_{t}=\bigcup _{j=1}^{n-1}({v}_{j},{v}_{(j+1)}). Note that *v*_{1} and *v*_{n} are called the two ends of the path at time stamp *t*. The path from *v*_{1} to *v*_{n} can be denoted as Equation 1.

\begin{array}{l}{P}_{{v}_{1}}^{{v}_{n}}\left(t\right)=\left\{{v}_{1}\stackrel{t}{\iff}{v}_{n}\right\}\end{array}

(1)

The *length* of the path is |*V*^{′}_{t}|, which is *n*. Note that there may be more than one path from *v*_{1} to *v*_{n} in *G*(*t*). Therefore, the *distance* from *v*_{1} to *v*_{n} is defined as the length of the shortest path from *v*_{1} to *v*_{n} in *G*(*t*). There might also be more than one shortest path from *v*_{1} to *v*_{n} in *G*(*t*).
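A shortest path within a single snapshot can be found with a breadth-first search, since all links have equal weight. The sketch below is one possible way to do this and is not taken from the original work; the function name and the vertex labels are illustrative. Following the definition above, the length is counted in vertices rather than in hops.

```python
from collections import deque


def shortest_path(vertices, edges, src, dst):
    """Breadth-first search on an undirected snapshot G(t).

    Returns one shortest path as a list of vertices, or None if src and dst
    are not connected at this time stamp.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to src, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted only for reproducible output
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


V = {"v1", "v2", "v3", "v4"}
E = {("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v1", "v3")}
path = shortest_path(V, E, "v1", "v4")
length = len(path)  # number of vertices on the path, i.e. n
```

When several shortest paths exist, this sketch returns only one of them; which one depends on the order in which neighbors are visited.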

In a general case, the routing path may pass through several time stamps. Without loss of generality, assuming the routing path from *v*_{1} to *v*_{n} passes through a non-descending time-stamp set {T}_{s}=\left\{{t}_{j}{|}_{j=1}^{j=m}\right\}, we then have Equation 2.

\begin{array}{l}{P}_{{v}_{1}}^{{v}_{n}}\left({T}_{s}\right)=\bigcup _{j=1}^{m}\left({P}_{{v}_{{t}_{j}1}}^{{v}_{{t}_{j}n}}\left({t}_{j}\right)\right)=\bigcup _{j=1}^{m}\left\{{v}_{{t}_{j}1}\stackrel{{t}_{j}}{\iff}{v}_{{t}_{j}n}\right\}\end{array}

(2)

where {v}_{1}={v}_{{t}_{1}1},{v}_{n}={v}_{{t}_{m}n}.

In a VANET, data packets are forwarded hop by hop along the routing path as long as each node is connected to the next one. However, if a node is not within the communication range of its next hop at a certain time stamp, the data packets are buffered at this node until a later time stamp at which the two consecutive nodes on the path are connected to each other. Therefore, Equation 2 must fulfill the following requirement.

\begin{array}{l}{v}_{{t}_{k}n}={v}_{{t}_{(k+1)}1}={{V}^{\prime}}_{{t}_{k}}\cap {{V}^{\prime}}_{{t}_{(k+1)}}\phantom{\rule{1em}{0ex}}(k=1,\dots ,m-1)\phantom{\rule{1em}{0ex}}\end{array}

(3)
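The requirement in Equation 3 can be checked mechanically. The sketch below assumes that each per-time-stamp segment of the path is given as an ordered list of vertices (a representation chosen for this example); it verifies that consecutive segments share exactly one vertex, namely the node that buffers the packets between the two time stamps.

```python
def satisfies_handover(segments):
    """Check the Equation 3 requirement on a multi-time-stamp path.

    segments: list of vertex lists, one per time stamp t_1..t_m, in order.
    Returns True if, for every consecutive pair, the last vertex of the
    earlier segment is the first vertex of the later one and is the only
    vertex the two segments have in common.
    """
    for earlier, later in zip(segments, segments[1:]):
        shared = set(earlier) & set(later)
        if earlier[-1] != later[0] or shared != {earlier[-1]}:
            return False
    return True


# v3 buffers the packets between t_1 and t_2, v4 between t_2 and t_3.
valid = satisfies_handover([["v1", "v2", "v3"], ["v3", "v4"], ["v4", "v5"]])
# The second segment does not start where the first one ended.
broken = satisfies_handover([["v1", "v2"], ["v3", "v4"]])
```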

### 3.3 Connected component

The *connected component* at time stamp *t* is a non-empty sub-graph of network *G*(*t*) in which there exists at least one path between any two vertices. That is, the *connected component* that node *v*_{i} belongs to at time stamp *t* can be denoted as Equation 4.

\begin{array}{l}\text{CC}\left(t\right)=\bigcup _{{v}_{i},{v}_{j}\in {V}_{t}}^{}\left\{{v}_{i}\stackrel{t}{\iff}{v}_{j}\right\}\end{array}

(4)

As mentioned in Section 1, packets are forwarded much faster in a multi-hop way than in a carry-and-forward style if the source node and the destination node are in the same connected component. Therefore, the performance of a routing strategy can be greatly improved if there are enough stable connected components in a VANET. We will analyze both the number and the size of the connected components in Sections 4.2 and 4.3.
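The connected components of a snapshot, and from them their number and sizes, can be extracted with a standard graph traversal. The following is a minimal sketch using an iterative depth-first search; the function name and the toy data are assumptions made for illustration.

```python
def connected_components(vertices, edges):
    """Return the connected components of an undirected snapshot G(t).

    Each component is a frozenset of vertices; an isolated vehicle forms a
    component of size one.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    seen = set()
    components = []
    for start in sorted(vertices):  # sorted only for reproducible ordering
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(frozenset(component))
    return components


V = {"a", "b", "c", "d", "e", "f"}
E = {("a", "b"), ("b", "c"), ("d", "e")}
components = connected_components(V, E)
sizes = [len(c) for c in components]
```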

### 3.4 Connected component stability

The topology of a VANET keeps changing due to the movement of the vehicles. For a given connected component CC(*t*), some vehicles might leave it while other vehicles might join it at the next consecutive time stamp (*t* + 1). A good metric for the stability of CC(*t*) should consider both phenomena. For simplicity, we use the difference between the vertex sets of the connected component at two consecutive time stamps to measure the stability. Therefore, we can define the stability factor [17] of CC(*t*) as Equation 5.

\begin{array}{l}{\zeta}_{\text{CC}\left(t\right)}=\frac{\left|{V}_{\text{CC}\left(t\right)}\bigcap {V}_{\text{CC}(t+1)}\right|}{\left|{V}_{\text{CC}\left(t\right)}\bigcup {V}_{\text{CC}(t+1)}\right|}\end{array}

(5)

It is obvious that 0 ≤ *ζ*_{CC(t)}≤ 1. CC(*t*) is more stable when the value of *ζ*_{CC(t)} is larger. The stability of the connected components will be discussed in Section 4.4.
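Equation 5 is the ratio of the intersection to the union of the two vertex sets, so it can be computed directly with set operations. The sketch below assumes the two vertex sets are already known; how CC(*t* + 1) is matched to CC(*t*) is not specified here and is left to the caller.

```python
def stability_factor(cc_now, cc_next):
    """Stability factor of Equation 5: |V(t) & V(t+1)| / |V(t) | V(t+1)|."""
    now, nxt = set(cc_now), set(cc_next)
    union = now | nxt
    if not union:
        raise ValueError("both vertex sets are empty")
    return len(now & nxt) / len(union)


# Vehicle "a" leaves and vehicle "e" joins between the two time stamps.
zeta = stability_factor({"a", "b", "c", "d"}, {"b", "c", "d", "e"})
```

In this example three vehicles stay out of five seen in total, giving a factor of 3/5; a component whose membership does not change has a factor of 1.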

### 3.5 Location dependency

Besides the topological features of the connected component, the location of the connected component is also important for network design. If the connected component is location dependent, in other words, if the vehicles always form the connected component in a specific region, we no longer need to place roadside infrastructure in this region. In order to measure the location dependency, we need the locations of all the vehicles in the connected component. For a given connected component CC(*t*), let *Λ*_{CC(t)} represent the coordinate set of all the vehicles in CC(*t*). Then we have Equation 6.

\begin{array}{l}{\Lambda}_{\text{CC}\left(t\right)}=\left\{\left({x}_{{v}_{i}},{y}_{{v}_{i}}\right)|{v}_{i}\in {V}_{\text{CC}\left(t\right)}\right\}\end{array}

(6)

Let \underline{X}=\underset{{v}_{i}\in {V}_{\text{CC}\left(t\right)}}{\text{min}}\left({x}_{{v}_{i}}\right), \underline{Y}=\underset{{v}_{i}\in {V}_{\text{CC}\left(t\right)}}{\text{min}}\left({y}_{{v}_{i}}\right), \overline{X}=\underset{{v}_{i}\in {V}_{\text{CC}\left(t\right)}}{\text{max}}\left({x}_{{v}_{i}}\right), \overline{Y}=\underset{{v}_{i}\in {V}_{\text{CC}\left(t\right)}}{\text{max}}\left({y}_{{v}_{i}}\right). The rectangle covering CC(*t*) can be defined as {\Gamma}_{\text{CC}\left(t\right)}=\left[(\underline{X},\underline{Y}),(\overline{X},\overline{Y})\right], where (\underline{X},\underline{Y}) is the bottom left coordinate of rectangle *Γ*_{CC(t)} and (\overline{X},\overline{Y}) is the top right coordinate of rectangle *Γ*_{CC(t)}. In a consecutive time-stamp set {T}_{s}=\left\{{t}_{j}{|}_{j=1}^{j=m}\right\}, we denote the region that can cover vehicles in a connected component at *any* time stamp in *T*_{s} as \Psi =\bigcup _{j=1}^{m}{\Gamma}_{\text{CC}\left({t}_{j}\right)} and denote the region that can cover vehicles in a connected component at *all* time stamps in *T*_{s} as \Omega =\bigcap _{j=1}^{m}{\Gamma}_{\text{CC}\left({t}_{j}\right)}. Assuming the function *δ*() calculates the area of a region, the location-dependency factor of CC(*T*_{s}) can be calculated by Equation 7.

\begin{array}{l}{\xi}_{\text{CC}\left({T}_{s}\right)}=\frac{\delta (\Omega )}{\delta (\Psi )}=\frac{\delta \phantom{\rule{0.3em}{0ex}}\left(\bigcap _{j=1}^{m}{\Gamma}_{\text{CC}\left({t}_{j}\right)}\right)}{\delta \phantom{\rule{0.3em}{0ex}}\left(\bigcup _{j=1}^{m}{\Gamma}_{\text{CC}\left({t}_{j}\right)}\right)}\end{array}

(7)

It is obvious that 0\le {\xi}_{\text{CC}\left({T}_{s}\right)}\le 1. CC(*T*_{s}) is more location dependent when the value of {\xi}_{\text{CC}\left({T}_{s}\right)} is larger. The location dependency of the connected components will be discussed in Section 4.6.
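The location-dependency factor of Equation 7 can be sketched as follows. The intersection of axis-aligned rectangles is itself a rectangle (possibly empty), while the area of their union is computed here by coordinate compression. This sketch assumes that *δ*(Ψ) is the area of the set union of the rectangles, not of a single rectangle enclosing them, and that each rectangle is stored as a tuple (x_min, y_min, x_max, y_max); both are assumptions of this example.

```python
def bounding_box(coords):
    """Rectangle covering a coordinate set, as (x_min, y_min, x_max, y_max)."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def intersection_area(boxes):
    """Area of the region covered by all rectangles (zero if none)."""
    x_lo = max(b[0] for b in boxes)
    y_lo = max(b[1] for b in boxes)
    x_hi = min(b[2] for b in boxes)
    y_hi = min(b[3] for b in boxes)
    return max(0.0, x_hi - x_lo) * max(0.0, y_hi - y_lo)


def union_area(boxes):
    """Area of the region covered by at least one rectangle."""
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    area = 0.0
    # Each grid cell between neighboring coordinates is either fully inside
    # or fully outside every rectangle, so it is counted at most once.
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            if any(b[0] <= x0 and x1 <= b[2] and b[1] <= y0 and y1 <= b[3]
                   for b in boxes):
                area += (x1 - x0) * (y1 - y0)
    return area


def location_dependency(coord_sets):
    """Equation 7 for a list of coordinate sets, one per time stamp in T_s."""
    boxes = [bounding_box(c) for c in coord_sets]
    total = union_area(boxes)
    if total == 0:
        raise ValueError("degenerate region: union area is zero")
    return intersection_area(boxes) / total


# Two time stamps; the covering rectangle shifts 2 units to the right.
xi = location_dependency([
    {(0.0, 0.0), (4.0, 2.0)},
    {(2.0, 0.0), (6.0, 2.0)},
])
```

Here the two rectangles overlap on an area of 4 and jointly cover an area of 12, so the factor is 1/3.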