We propose an opportunistic routing protocol called the Link State aware Geographic Opportunistic routing protocol (LSGO), which combines geographic location and link state information to select forwarders. The protocol aims to maintain a high packet delivery rate in a highly dynamic network and to improve the reliability of data transmission. It also aims to reduce the number of transmissions (including retransmissions) and the transmission delay. The protocol consists of three main parts: the estimation of link quality, the candidate node set selection mechanism, and the priority scheduling algorithm.

### 3.1. The estimation of link quality

ETX [16] selects the next hop based on the expected number of transmissions (including retransmissions); the aim is to minimize the end-to-end number of transmissions, thus saving bandwidth. The ETX of a path is the sum of the ETX values of the links on this path. Each node broadcasts probe packets periodically. After a certain time interval, two adjacent nodes calculate the probe packet delivery rates *d*_{f} and *d*_{r} in the two directions (one for probe packet transmission and the other for ACK packet transmission). The expected probability of a successful transmission is therefore *d*_{f} × *d*_{r}. Since each attempt to send a data packet can be considered a Bernoulli trial, ETX is calculated as

$$\mathrm{ETX}=\frac{1}{d_{f}\times d_{r}} \tag{1}$$

However, the ETX metric does not specifically consider the mobility in VANETs. In LSGO, we adapt ETX to highly dynamic networks. There are two major improvements: the measurement of the link transmission rate and the calculation of ETX.

In LSGO, each node broadcasts a Hello packet periodically, and we use the Hello packets to measure the link transmission rate. To calculate the ETX of a link, each node records *t*_{0}, the time at which the first Hello packet was received from a neighbor, and the number of packets it has received from that neighbor during the last *w* seconds. Then, depending on how the interval between *t*_{0} and the current time *t* compares with the window *w*, the link transmission rate *r*(*t*) is

$$r(t)=\begin{cases}\mathrm{count}(t_{0},t), & 0<t-t_{0}<1\\ \dfrac{\mathrm{count}(t_{0},t)}{(t-t_{0})/\tau}, & 1\le t-t_{0}<w\\ \dfrac{\mathrm{count}(t-w,t)}{w/\tau}, & t-t_{0}\ge w\end{cases} \tag{2}$$

In the second and third cases, the denominator is the number of Hello packets that should have been received during the interval, and *τ* is the broadcast interval of the Hello packet. count(*t*_{0}, *t*) is the number of Hello packets received between *t*_{0} and *t*. As the formula shows, there are three cases, depending on how *t* - *t*_{0} compares with the window *w*. (1) 0 < *t* - *t*_{0} < 1: the packet delivery rate is the number of Hello packets received from *t*_{0} to *t*. (2) 1 ≤ *t* - *t*_{0} < *w*: the packet delivery rate is the number of Hello packets received from *t*_{0} to *t* divided by the number expected in this period, (*t* - *t*_{0})/*τ*. (3) *t* - *t*_{0} ≥ *w*: the calculation is the same as in the ETX metric, using only the Hello packets received during the last *w* seconds.
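The three cases of Equation 2 can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `link_transmission_rate` and the list `hello_times` are hypothetical, Hello packets are assumed to be counted over a closed interval, and the result is not clamped, so it can exceed 1 when a Hello packet arrives exactly on an interval boundary.

```python
from bisect import bisect_left, bisect_right


def link_transmission_rate(hello_times, t, w, tau):
    """Estimate r(t) following the three cases of Equation 2.

    hello_times: sorted reception times (seconds) of Hello packets from one
                 neighbor; hello_times[0] plays the role of t0 (hypothetical name).
    t:   current time in seconds
    w:   window length in seconds
    tau: Hello broadcast interval in seconds
    """
    if not hello_times:
        return 0.0  # assumption: no Hello heard yet means no usable link
    t0 = hello_times[0]
    elapsed = t - t0
    if elapsed < 0:
        raise ValueError("current time precedes the first Hello packet")

    def count(start, end):
        # Hello packets received in the closed interval [start, end] (assumption).
        return bisect_right(hello_times, end) - bisect_left(hello_times, start)

    if elapsed < 1:
        # Case 1: less than one second of history, use the raw count.
        return float(count(t0, t))
    if elapsed < w:
        # Case 2: received Hellos divided by the number expected since t0.
        return count(t0, t) / (elapsed / tau)
    # Case 3: full window available, use only the last w seconds.
    return count(t - w, t) / (w / tau)


if __name__ == "__main__":
    times = [0.0, 2.0, 4.0, 6.0, 7.0, 9.0]  # made-up reception times
    for now in (0.5, 4.0, 10.0):
        print(now, link_transmission_rate(times, now, w=5.0, tau=1.0))
```

With the made-up reception times above, the three calls exercise cases 1, 2, and 3 in turn.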

In LSGO, we do not consider the asymmetry of the link and only use the one-way transmission rate to calculate the link ETX. Assuming that the one-way transmission rate is *r*(*t*), then the link ETX is

$$\mathrm{ETX}=\frac{1}{r^{2}(t)} \tag{3}$$
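Equation 3 is Equation 1 with both directional delivery rates replaced by the one-way rate *r*(*t*). A minimal Python sketch follows; the function name `lsgo_etx` is hypothetical, and returning infinity for a rate of zero is an assumption made here to avoid division by zero, not something the protocol description specifies.

```python
import math


def lsgo_etx(rate):
    """Link ETX from the one-way transmission rate r(t), as in Equation 3."""
    if rate <= 0.0:
        # Assumption: a link on which nothing is received is treated as unusable.
        return math.inf
    return 1.0 / (rate * rate)


if __name__ == "__main__":
    for r in (1.0, 0.5, 0.0):
        print(r, lsgo_etx(r))
```

A perfect link (*r*(*t*) = 1) gives the minimum ETX of 1, and halving the rate quadruples the ETX.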

### 3.2. Candidate node set selection mechanism

LSGO’s main objective is to use opportunistic routing to ensure VANET transmission reliability while reducing the number of transmissions. The candidate node set must therefore contain enough backup links to provide the required delivery rate. Using the link quality estimation above, each node can calculate the link transmission rate *r*(*t*) of the link between itself and each of its neighbors, and the sending node selects the candidate nodes based on these rates. As shown in Figure 2, *r*_{1}(*t*) and *r*_{2}(*t*) are the transmission rates of the links from the source node S to its two candidate relay nodes X and Y. Then, treating the two links as independent, the probability that S successfully sends data to the next hop is 1 - (1 - *r*_{1}(*t*))(1 - *r*_{2}(*t*)).
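As a quick illustration with made-up values (not taken from the paper), suppose *r*_{1}(*t*) = 0.6 and *r*_{2}(*t*) = 0.5. The transmission fails only if both X and Y miss the packet, so the success probability is

$$1-(1-0.6)(1-0.5)=1-0.4\times 0.5=0.8$$

which is higher than the rate of either link alone.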

Here is how the candidate node set selection mechanism works. For node S at the current time *t*, the number of neighbor nodes is *N*. *r*_{i}(*t*) (1 ≤ *i* ≤ *N*) is the transmission rate of the link from S to its neighbor node *i*, and *d*_{i}(*t*) (1 ≤ *i* ≤ *N*) is the distance from node *i* to the destination. *S*(*t*) is the distance from the current node to the destination, and *r* is the required data delivery rate of a single link. If the number of candidate nodes is *n*, then *n* should satisfy the following conditions:

$$1-\prod_{i=1}^{n}\left(1-r_{i}(t)\right)\ge r \tag{4}$$

$$d_{1}(t)<d_{2}(t)<\dots<d_{n}(t)<d_{n+1}(t)<\dots<d_{N}(t) \tag{5}$$

$$d_{n}(t)<S(t) \tag{6}$$

That is, for the current node, the candidate node set consists of the *n* neighbors nearest to the destination, and each of these *n* nodes is closer to the destination than the current node (its distance is less than *S*(*t*)). Note that if the network is sparse, there may be no set of *n* such nodes that satisfies condition (4). In that case, every neighbor whose distance to the destination is less than *S*(*t*) is selected as a candidate node.
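The selection rule can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' code: the function name `select_candidates` and the tuple layout are hypothetical, and the sketch assumes the sender picks the smallest *n* that satisfies condition (4), which matches the stated goal of keeping the packet header small.

```python
def select_candidates(neighbors, dist_self, required_rate):
    """Select the candidate node set following conditions (4) to (6).

    neighbors:     list of (node_id, r_i, d_i) tuples, where r_i is the link
                   transmission rate and d_i the distance to the destination
                   (hypothetical layout)
    dist_self:     S(t), distance from the current node to the destination
    required_rate: r, the required delivery rate
    Returns node IDs ordered by increasing distance to the destination.
    """
    # Condition (6): keep only neighbors closer to the destination than the sender.
    closer = [nb for nb in neighbors if nb[2] < dist_self]
    # Condition (5): order them by increasing distance to the destination.
    closer.sort(key=lambda nb: nb[2])

    selected = []
    miss_prob = 1.0  # probability that no selected candidate receives the packet
    for node_id, rate, _dist in closer:
        selected.append(node_id)
        miss_prob *= 1.0 - rate
        # Condition (4): stop at the smallest n that meets the required rate
        # (assumption: the smallest such n is the one chosen).
        if 1.0 - miss_prob >= required_rate:
            break
    # Sparse network: if (4) is never met, all closer neighbors remain selected.
    return selected


if __name__ == "__main__":
    nbrs = [("A", 0.5, 300.0), ("B", 0.6, 100.0), ("C", 0.7, 200.0), ("D", 0.9, 500.0)]
    print(select_candidates(nbrs, dist_self=400.0, required_rate=0.8))
    print(select_candidates(nbrs, dist_self=400.0, required_rate=0.99))
```

In the made-up example, node D is excluded because it is farther from the destination than the sender; with a required rate of 0.99 the sparse-network fallback applies and every closer neighbor is kept.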

After selecting the candidate node set, the sending node records the candidate nodes’ IDs and their priority numbers in the packet header. Since the number of candidate nodes *n* is dynamic, the size of the packet header changes with it. If the network environment is good, the links between nodes are relatively stable, so *n* and the packet header are small, which means that the overhead is small. Conversely, if *n* and the packet header are large, the overhead is large, too. The priority scheduling algorithm is introduced in the next section.

### 3.3. Priority scheduling algorithm

LSGO uses a timer-based priority scheduling algorithm, in which the highest-priority node sends the packet first. The other candidate nodes do not process the packet if they hear a higher-priority node send it; if their timer expires and no higher-priority node is transmitting, they begin to send the packet. The timer-based scheduling algorithm is simple, easy to implement, and has no additional control overhead. Its disadvantage is that it introduces waiting time, thereby increasing the end-to-end transmission delay. Another shortcoming is that it may cause duplicate packet transmissions, because the nodes in the candidate node set may not hear each other. In VANETs, however, packets travel along roads, and the road width is far less than the transmission range; in addition, the nodes chosen by the candidate node set selection mechanism are located on one side of the current node. In terms of distance, all candidate nodes can therefore hear each other, and duplicate transmissions are rare in VANETs.

An efficient scheduling algorithm should minimize the waiting time, which can be achieved in two ways. First, node priorities should be assigned correctly, so that the optimal forwarding node has the highest priority and higher-priority nodes have a better forwarding advantage; this increases the probability that a higher-priority node forwards the packet and reduces the number of failed transmissions. Second, a reasonable waiting time should be set for each node, so that a low-priority node forwards the packet immediately after a high-priority node fails, thereby reducing the waiting time between the candidate nodes.

In LSGO, when the current node assigns a priority to a candidate node, it considers the distance from the candidate node to the destination and the ETX of the link between the current node and the candidate node. There are two reasons for this. On the one hand, selecting the candidate node that is closest to the destination as the forwarding node reduces the number of transmission hops. On the other hand, a candidate node with a small ETX (the minimum is 1) has a higher probability of successful reception. For candidate node *i*, the priority is obtained by

$$\frac{D_{sd}-D_{id}}{\mathrm{ETX}_{i}^{2}} \tag{7}$$

*D*_{sd} is the distance between the current node and the destination. *D*_{id} is the distance from candidate node *i* to the destination node. ETX_{i} is the ETX of the link formed by the current node and candidate node *i*. *D*_{sd} - *D*_{id} indicates the geographic distance a packet can advance towards the destination. However, due to link loss, to be successfully forwarded to node *i*, a packet needs to be transmitted ETX_{i} times on average. Therefore, (*D*_{sd} - *D*_{id}) / ETX_{i} is the expected advance that a packet can make towards the destination through one transmission if it chooses node *i* as the next hop.

Passing through a link with a low transmission rate increases the probability of data transmission failure, so we divide by the square of ETX in Equation 7. If candidate node *i* does not receive the data correctly, another candidate node with a lower priority than *i* will transmit the data, which introduces additional waiting time. If two nodes have the same expected advance per transmission towards the destination, the node with the smaller ETX should be given the higher priority.

As soon as it has selected all the candidate nodes, the sending node calculates each candidate node’s value according to Equation 7 and assigns priorities to the candidate nodes according to the results. The node with the largest value is assigned the highest priority, and the node with the smallest value is assigned the lowest priority. The highest-priority node sends a packet immediately when it receives the packet, while the lower-priority nodes need to set a timer. If the timer expires and no higher-priority node is transmitting, they begin to send the packet. Only a reasonable timeout value can both reduce the delay and avoid duplicate transmissions.
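The ranking step can be sketched in Python as below. The function name `assign_priorities` and the tuple layout are hypothetical, and the input values are made up for illustration; the sketch only shows how the value from Equation 7 orders the candidates.

```python
def assign_priorities(candidates, dist_self):
    """Rank candidate nodes by the value from Equation 7.

    candidates: list of (node_id, dist_to_dest, etx) tuples (hypothetical layout)
    dist_self:  D_sd, distance from the current node to the destination
    Returns node IDs ordered from highest priority (index 0) to lowest.
    """
    def metric(candidate):
        _node_id, dist_to_dest, etx = candidate
        # Geographic advance divided by the squared link ETX (Equation 7).
        return (dist_self - dist_to_dest) / (etx ** 2)

    ranked = sorted(candidates, key=metric, reverse=True)
    return [node_id for node_id, _dist, _etx in ranked]


if __name__ == "__main__":
    cands = [("B", 100.0, 2.0), ("C", 200.0, 1.0), ("A", 300.0, 1.0)]
    print(assign_priorities(cands, dist_self=400.0))
```

In this example, node B offers the largest geographic advance (300 m) but has an ETX of 2, so its value of 75 falls below those of C (200) and A (100), and it receives the lowest priority.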

The network delay is defined as the time from when a node receives a packet until it has sent it completely, and it consists of four parts: the processing delay, queuing delay, transmission delay, and propagation delay. Since we do not consider the network load, and therefore ignore the queuing delay, the network delay consists of three parts. Let *T* be the total time of these three parts. If the node priority is *i* (with *i* = 1 the highest priority), the timer should be set to (*i* - 1)*T*. In our simulation, the packet size is set to 512 bytes. The MAC layer protocol is 802.11, with a channel rate of 2 Mbps. So, the transmission delay is 512 × 8 bits / 2 Mbps = 0.002048 s. The radio wave propagation velocity in air is approximately the speed of light, namely, 3 × 10^{8} m/s, and the distance between two vehicles that can communicate with each other directly is less than 250 m. So, the propagation delay is at most 250 m / (3 × 10^{8} m/s) = 0.83 × 10^{-6} s, which can be ignored. By running multiple simulations and analyzing the trace files, we find that the processing delay is approximately 0.001 ~ 0.002 s. Therefore, based on the above analysis, we conclude that *T* is about 0.004 s.
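The delay budget and the resulting timers can be reproduced with a few lines of Python. The constants are the simulation values quoted above; the function and constant names are hypothetical, and priority 1 is assumed to be the highest, consistent with the highest-priority node sending without delay.

```python
PACKET_SIZE_BITS = 512 * 8       # 512-byte packets
CHANNEL_RATE_BPS = 2_000_000     # 2 Mbps channel rate
MAX_RANGE_M = 250.0              # upper bound on one-hop distance
PROPAGATION_SPEED_MPS = 3e8      # radio propagation speed in air
SLOT_T = 0.004                   # T, rounded as in the text


def transmission_delay():
    """Time to put one packet on the channel, in seconds."""
    return PACKET_SIZE_BITS / CHANNEL_RATE_BPS


def propagation_delay():
    """Upper bound on the one-hop propagation delay, in seconds."""
    return MAX_RANGE_M / PROPAGATION_SPEED_MPS


def forwarding_timer(priority, slot=SLOT_T):
    """Waiting time (i - 1) * T for a candidate with priority i (1 = highest)."""
    if priority < 1:
        raise ValueError("priority must be at least 1")
    return (priority - 1) * slot


if __name__ == "__main__":
    print("transmission delay:", transmission_delay())
    print("propagation delay:", propagation_delay())
    for i in (1, 2, 3):
        print("priority", i, "timer:", forwarding_timer(i))
```

Adding the upper end of the measured processing delay (0.002 s) to the transmission delay gives about 0.00405 s, consistent with *T* ≈ 0.004 s; the propagation delay is three orders of magnitude smaller.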