MMIR: a microscopic mechanism for street selection based on intersection records in urban VANET routing

In urban vehicular ad hoc networks (VANETs), intersection-based routing has shown greater applicability and better efficiency in adapting to high but constrained mobility. Making an accurate street-selection decision is challenging because of the rapid topology changes in VANETs. In this paper, we propose a microscopic mechanism based on intersection records (MMIR), in which the vehicle nodes at an intersection maintain and update a records table holding the individual information of every passing vehicle. By analyzing and processing these entries, we estimate the vehicles' current positions and then compute the connectivity probability or the estimated delivery delay of every candidate street to support street selection. First, in contrast to approaches that rely on statistical, macroscopic information, we use individual, microscopic data to improve the accuracy of the estimates. Furthermore, according to their number and the time intervals between them, we classify vehicles into two categories, individual vehicles and queue vehicles, which effectively reduces the complexity of position estimation. Lastly, since MMIR generates no dedicated control packets, the network overhead is low. Simulation results show that MMIR outperforms existing street-selection approaches in terms of the accuracy of the computed connectivity probability and the estimated delay.


Introduction
With the advance of wireless network technology in recent years, each vehicle running in urban streets can exchange data with nearby vehicles through vehicle-to-vehicle (V2V) communications [1,2] or with roadside units (RSUs) via vehicle-to-infrastructure (V2I) communications [3]. Vehicular ad hoc networks (VANETs) have attracted extensive attention from both academic and commercial communities. VANETs play an important role in safety-related applications (collision avoidance, cooperative driving, etc.), information services (real-time traffic, weather information, etc.), and infotainment (multiplayer games, multimedia sharing, etc.) for drivers and passengers. As a particular type of mobile ad hoc network (MANET), VANETs have some unique features, especially in the urban scenario. First, due to the high mobility of vehicles, the topology of the vehicular network changes rapidly and end-to-end connections are frequently broken. Second, vehicle nodes move only along existing streets and change streets only at intersections, so multi-hop delivery routes between vehicle nodes have to follow the urban street map. Third, the network connectivity of a street between two adjacent intersections depends not only on the vehicle density, which is mainly related to the location of the street and the time of day, but also on how evenly the vehicles are distributed, which is frequently affected by traffic lights, accidents, differences in vehicle speeds, etc. Because of these characteristics of urban VANETs, the classical topology-based routing protocols [4][5][6] and the traditional position-based routing protocols [7][8][9] for MANETs are not suitable.
To adapt to urban VANETs with high but constrained mobility, the intersection-based routing scheme [10] has shown greater applicability and better efficiency. It works as follows: the geographical greedy forwarding strategy or one of its improved versions (e.g., [11][12][13][14][15]) is still used to transmit packets within a street, but when a packet reaches an intersection, the next street (i.e., direction) for forwarding is selected according to which one can provide a higher delivery rate and lower network delay over the entire multi-hop route. The routing path from source to destination is thus divided into a series of streets connected one after another, so that this divide-and-conquer policy copes better with the rapidly changing topology in VANETs. Moreover, as intersections are the only places where routing decisions are made, the scheme adapts to the constrained mobility of vehicle nodes and effectively avoids the local optimum problem caused by street layouts and tall obstacles in the urban environment.
Making an accurate street-selection decision is one of the key issues in intersection-based VANET routing. Depending on the application scenario, different metrics are used for street selection in VANET routing, such as distance to destination, connectivity probability, delivery delay, and delivery ratio. No matter which metric is used, its calculation relies on some information, e.g., the street length, the vehicle density, and the number of neighboring vehicles. The acquired data should be accurate and timely in order to support the right decision. However, several factors interfere with street selection. For instance, owing to traffic lights at intersections, the network in each street may be partitioned into several segments, which impairs network connectivity [16]; and because drivers have diverse habits, vehicle nodes differ not only in driving velocity but also in how their velocity varies under a given condition (e.g., high density). These factors make it more difficult to capture the real, current network state from macroscopic statistical data such as the average vehicle velocity, traffic flow, and vehicle density over a past period of time.
Most existing mechanisms and models of connectivity estimation were designed at the macroscopic level and are therefore not well suited to a network environment with high mobility and rapidly changing topology. In this paper, we focus on designing a more accurate, low-overhead strategy for real-time street selection from the microscopic point of view. To this end, we propose a microscopic mechanism based on intersection records (MMIR), in which the vehicle nodes at an intersection maintain and update a records table with every passing vehicle's information. By analyzing and processing these entries, we can estimate these vehicles' current positions and then compute the connectivity probability or estimate the delivery delay of every candidate street connected to this intersection. First, in contrast to statistical, macroscopic data or unreliable topology information, MMIR uses the concrete, microscopic information recorded at the intersection when each vehicle passed, which improves the accuracy of the estimates. In this way, MMIR can detect the actual status of network connectivity, for example a network that is disconnected even though the density is high. Furthermore, according to the number of vehicles and the intervals between them, we classify the recorded vehicles into two categories that are handled separately, i.e., individual nodes and queue nodes. The nodes in a queue are treated as a whole, which effectively reduces the complexity of position estimation; this complexity arises because each vehicle's velocity varies differently with the relatively high density and with its own location in the queue. Lastly, since no dedicated control packet is generated (i.e., the information for intersection recording is carried in the periodic beacon messages), there is barely any extra network overhead. MMIR is suitable for intersection-based VANET routing in urban environments, especially in signalized arterial street networks.
Simulation results show that the proposed MMIR outperforms existing approaches of street selection in terms of the accuracy of computed connectivity probability and estimated delay.
The remainder of this paper is organized as follows: Section 2 summarizes the related work. Section 3 describes the detail of MMIR including free velocity, intersection recording, connectivity calculation, and delivery delay estimation. Section 4 determines the key parameters of MMIR. Section 5 evaluates the performance of MMIR by simulations. Finally, Section 6 concludes the paper.

Related work
Selection of an optimal street for delivering data packets is the critical issue in designing an intersection-based VANET routing protocol. Generally speaking, the performance of a street selection strategy greatly depends on what information it uses. The greedy perimeter coordinator routing (GPCR) [17] is a classical intersection-based VANET protocol which was proposed to solve the local optimum problem in the greedy perimeter stateless routing (GPSR). In GPCR, with the support of GPS and static street maps, the street on the shortest path to the destination is selected by using Dijkstra's algorithm. Yang et al. [18] proposed an adaptive connectivity aware routing (ACAR) and indicated that the width of a street can be used to assess the candidate street: a wider street implies a higher probability of high vehicle density and, consequently, of network connectivity. However, relying on static information alone is obviously insufficient. The delay model in the vehicle-assisted data delivery (VADD) protocol proposed by Zhao and Cao [19] combines static data (Euclidean distance of the street) from digital maps and statistical data (average velocity and vehicle density) from third-party services to estimate the packet delivery delay of every adjacent outgoing street and to select, at the current intersection, the one with the shortest delay towards the destination. Jeong et al. [20] proposed another link delay model for one-way vehicular traffic, given the vehicle arrival rate, the average vehicle speed, and the length of the street. The model separates the street length into two parts: forwarding distance and carrying distance. Ignoring the small communication delay for forwarding packets, the delivery delay along a street is the carry delay over the carrying distance, which is calculated analytically from a probability function.
However, low traffic density may leave some broken links that are longer than the transmission range, splitting the network into multiple clusters. To improve the connectivity probability, Panichpapiboon et al. [21] took advantage of the opposing vehicles on a two-way street and proposed a connectivity model that applies the bidirectional statistics of the street. Furthermore, it is well known that vehicle speeds are not constant but normally distributed in the free-flow traffic state [22,23]. Yousefi et al. [24] proposed an analytical model for connectivity probability and connectivity distance that treats vehicle speeds not as constant but as normally distributed (with a mean and a variance). Al-Mayouf et al. and Ding et al. [25,26] also used the average speed of vehicles to calculate the connectivity for next-street selection. Apart from the above macroscopic models, which adopt static data and traffic statistics, some other studies focused on applying real-time control information exchanged with neighboring vehicles to estimate network connectivity or delivery delay in the street. In the landmark overlays for urban vehicular routing environments (LOUVRE) [27] and the virtual vertex routing (VVR) [28], a node obtains the number of its current neighbors from received beacon messages and adds this information to its next beacon broadcast. Thus, all vehicle nodes, including those located at the intersection, can collect the density and topology information of the street to calculate the network connectivity for routing selection in real time.
Zhang et al. [29] considered the phenomenon of link correlation, which represents the influence of different link combinations in the network topology on packet transmission, and designed an opportunistic routing metric called the expected transmission cost over a multi-hop path (ETCoP) to guide the selection of the relaying node within a street and of the next street at an intersection. Likewise, the topology information used to calculate ETCoP is obtained via beacon packets. The link-delay update (LDU) module in the static-node-assisted adaptive data dissemination protocol for vehicular networks (SADV) proposed by Ding [30] measures the transfer delay of each street in real time and propagates the up-to-date estimate among the static nodes deployed at intersections, so that each static node obtains a more complete delay matrix and can make a more accurate street-selection decision. Nzouonta et al. [31] proposed two road-based (equivalent to intersection-based) routing protocols using vehicular traffic (RBVT): a reactive protocol, RBVT-R, and a proactive protocol, RBVT-P. In RBVT-P in particular, periodic connectivity packets (CPs) are generated to visit connected streets and store the graph that they form. Through dissemination, all nodes in the network can maintain the information of the entire topology and calculate the shortest connected paths to the destination. In consideration of network overhead and freshness of information, another routing protocol, diagonal-intersection-based routing (DIR) [32], gathers topology information only within the range of three successive streets; moreover, it takes into account the probability of a green light at intersections for delay estimation. Because of the effects of traffic lights, differences in vehicle speeds, etc., the network connectivity of a street depends not only on the average density but also on the real-time distribution of the vehicles.
In the improved greedy traffic-aware routing (GyTAR) protocol proposed by Jerbi et al. [33], each street is dissected into small fixed-area cells in advance depending on the transmission range of vehicles. By acquiring the number of vehicles within every cell of the street in real time, the intersection vehicle nodes consider traffic density information included in the cell data packet (CDP) and the curve metric distance to the destination extracted from digital maps, then calculate a score for every candidate street and select the one with the highest score for forwarding packets. Furthermore, with the development of sensor technology and intelligent transportation system, the real-time status of traffic lights was also considered a deciding factor for selecting streets, e.g., in [12] and [34].

Method
Apart from the distance to destination, the connectivity of the network in the candidate street is also a crucial element for street selection in VANET routing. As is well known, it mainly depends on the density and distribution of the vehicle nodes in the street. In this paper, we study connectivity from the microscopic point of view, describing the traffic flow by tracking individual vehicles rather than on an aggregated basis [35]. The proposed MMIR aims to give an accurate real-time estimate with low overhead. It is organized into three parts: (1) free-velocity analysis (the definition of free velocity in this paper), (2) recording at the intersection, and (3) connectivity calculation and delivery delay estimation.
MMIR assumes that each vehicle is equipped with a Global Positioning System (GPS) receiver and a street-level digital map, so that it can easily acquire its own position, velocity, moving direction, etc. Neighboring vehicles can also obtain this information through the periodic beacon messages they exchange. Furthermore, a source node knows the current geographical position of the destination, which can be obtained from a location service. Such a service could be supported by a low power wide area (LPWA) network [36] such as LoRa [37] or Narrowband Internet of Things (NB-IoT) [38]. In addition, all vehicles are assumed to be time-synchronized by GPS.

Free-velocity analysis
MMIR's main approach to calculating street connectivity is to use the data of individual vehicles to estimate their positions at a given time. To this end, each running vehicle needs to gather and calculate accurate and valid driving data to support the connectivity calculation, among which the free velocity is the most important.
In this paper, free velocity refers to the driver's desired velocity in the free-flow traffic state, in which there is little influence from other vehicles and no traffic incident nearby. Following this definition, free-velocity samples must be gathered in real time under the following conditions and principles. First, within a certain range ahead, the number of vehicles moving in the same direction must be too small to make the driver adjust the velocity. For instance, the threshold for this number can be set to N_ln − 1, where N_ln is the number of lanes in one direction; the vehicle then still has a free lane in which to move at its desired velocity without being affected by slower vehicles in front of it. Second, free velocity cannot be collected in the vicinity of intersections, where vehicles are forced to decelerate, wait, and accelerate because of traffic lights and safety considerations. Third, free velocity can be collected only after the above conditions have held for a certain time, which ensures that the driver has had enough time to change from the previous state to the desired velocity. Finally, we classify the streets into several classes in advance according to conditions such as the number of lanes, the lane width, and the speed limit, and each vehicle gathers and calculates its free velocity for each class separately, which improves the accuracy of the collected information.
The free velocity may also be affected by other factors that are hard to quantify, such as the weather and the driver's mental state or mood. However, generally speaking, the free-velocity samples gathered under the above conditions follow a Gaussian distribution [23,39]. For the later steps, each vehicle therefore needs to calculate the average (1) and the variance (2) of its free velocity in real time.
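As an illustration of how a vehicle could keep these two statistics up to date without storing past samples, the sketch below uses Welford's running update. It is only a minimal sketch under stated assumptions: the class `FreeVelocityStats`, the helper `may_sample`, the population (1/n) normalisation of the variance, and the `min_stable_s` hold time are hypothetical choices for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FreeVelocityStats:
    """Running mean/variance of free-velocity samples for one street class."""
    n: int = 0
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the mean

    def add(self, v: float) -> None:
        # Welford's update: numerically stable, O(1) memory per street class.
        self.n += 1
        delta = v - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (v - self.mean)

    @property
    def variance(self) -> float:
        # Population variance (1/n); this normalisation is an assumption.
        return self._m2 / self.n if self.n else 0.0


def may_sample(vehicles_ahead: int, lanes: int, near_intersection: bool,
               stable_for_s: float, min_stable_s: float = 5.0) -> bool:
    """Check the sampling conditions listed in the text.

    At most N_ln - 1 vehicles ahead, not near an intersection, and the
    conditions have held for at least min_stable_s (hypothetical value).
    """
    return (vehicles_ahead <= lanes - 1
            and not near_intersection
            and stable_for_s >= min_stable_s)
```

A vehicle would keep one `FreeVelocityStats` instance per street class and call `add` only when `may_sample` returns true.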

Recording at the intersection
The street intersection plays an important role in urban VANETs as the junction between different streets. At an intersection, vehicles leave their last street and enter a new one by going straight or turning. Correspondingly, the forwarding direction of a packet may also change, depending on the destination location and the network connectivity of the candidate streets. In most intersection-based routing protocols, the vehicle nodes at the intersection act as the decision makers for street selection. In MMIR, one or more of these vehicle nodes serve as intersection-server nodes (ISNs), chosen according to their current locations and other features. The ISNs are in charge of receiving and storing the records of all vehicles that have recently passed the intersection. In general, the vehicle closest to the intersection center (optimal position) and with the slowest velocity (longest stay) is the best candidate for ISN election. The mechanism for selecting and replacing the server could draw on related ideas from location services, e.g., the quorum-based location service protocol in [40]. As it is not the main subject of this paper, the details are not given here.
In practice, every vehicle that has passed the center zone of the intersection and entered a new street packs its information and attaches it to its next beacon message. After a delay that is usually less than one beacon interval, once the ISN receives the modified beacon message, it extracts the information and creates a new entry in its intersection records table for the sending vehicle. In other words, only the ISNs maintain the records table. Additionally, when an ISN leaves the intersection, it gives up its server status, creates a new entry for itself, and then sends the whole records table in a beacon message to the other ISNs or to the optimal vehicle node, which will be elected as a new ISN. Note that because all the information of passing vehicles is carried in the periodic and mandatory beacon messages, the recording process at the intersection introduces little additional network overhead. As illustrated in Table 1, an intersection record includes the vehicle ID, last street ID, new street ID, transmission time (t_rec), current position in the new street (pos_rec), current velocity (v_rec), free velocity (average and variance), normal acceleration (acc), transmission range (R), and time-to-live (TTL), from which we can estimate the position of the vehicle at a later time.
Note that the TTL field is added in order to reduce the data volume and the amount of calculation at the ISN. It is the estimated time that the vehicle will take to pass through the new street. However, a vehicle sometimes cannot run at its free velocity along the whole street because of the vehicles around it. In practice, the valid time should therefore be set larger than the estimated time. The details are discussed in the following section.
Privacy protection is a critical issue for the drivers and passengers in the vehicles [41]. To make sure that a vehicle's trajectory cannot easily be traced by others, the vehicle ID in an intersection record is a temporary, unique character string rather than the vehicle's real ID in the network. From the perspective of intersection recording, this unique string is needed only to avoid duplicate records for the same vehicle in the records table; the ISN does not need to know which vehicle it is in the whole network for other uses such as the location service.
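The record fields and the role of the TTL can be made concrete with a small data-structure sketch. The field names follow the list in the text, but the class `IntersectionRecord`, the helpers `estimate_ttl` and `purge_expired`, and the safety factor `margin` are hypothetical names and values introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class IntersectionRecord:
    vehicle_id: str      # temporary pseudonym, not the real network ID
    last_street: str
    new_street: str
    t_rec: float         # transmission time of the beacon (s)
    pos_rec: float       # position in the new street at t_rec (m from entrance)
    v_rec: float         # velocity at t_rec (m/s)
    v_free_mean: float   # average free velocity (m/s)
    v_free_var: float    # variance of free velocity ((m/s)^2)
    acc: float           # normal acceleration (m/s^2)
    tx_range: float      # transmission range R (m)
    ttl: float           # validity of the record (s)


def estimate_ttl(street_len: float, pos_rec: float, v_free_mean: float,
                 margin: float = 1.5) -> float:
    """Time to traverse the rest of the street at free velocity, enlarged by
    a safety factor (margin is an assumed value) because the vehicle may not
    always reach its free velocity."""
    return margin * (street_len - pos_rec) / v_free_mean


def purge_expired(table: Dict[str, IntersectionRecord],
                  now: float) -> Dict[str, IntersectionRecord]:
    """Keep only records whose TTL has not run out at time `now`."""
    return {vid: r for vid, r in table.items() if now - r.t_rec <= r.ttl}
```

Keying the table by the temporary vehicle ID means a repeated beacon from the same vehicle overwrites its earlier entry instead of duplicating it.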

Connectivity calculation and delay estimation for street selection
Having described intersection recording, we now introduce connectivity calculation and delay estimation in detail.

Connectivity probability in light traffic
In the urban environment, especially on arterial streets, driving conditions are relatively comfortable: there are commonly three or four lanes in each direction, wider lanes, a smaller proportion of curved sections, no crosswalks, few parking points, etc. Vehicles in such a situation, and in the free-flow traffic state (i.e., when the density is sufficiently low), can run at their free velocity all the way. Thus, the position of a vehicle (pos_est, i.e., its distance from the street entrance) at a certain time (t_cur) can be calculated from its record at the last intersection, where t_int is the interval between the current time and the transmission time in the record (t_int = t_cur − t_rec), and t_acc is the time that the vehicle needs to accelerate to its free velocity. At an intersection with traffic lights, vehicles that start moving when the light turns from red to green, and vehicles that have slowed down to pass the intersection, need to accelerate to attain their free velocity after entering the new street. In most cases, t_acc is less than t_int, because the transmission time t_rec is that of the last beacon message received by the ISN while the vehicle is still within communication range (around 250 m), and over this distance the vehicle has normally completed its acceleration.
Since the free velocity is not constant but follows a Gaussian distribution, the estimated position correspondingly follows a Gaussian distribution when t_int ≥ t_acc. Let N be the total number of vehicles whose estimated positions are still in the current street or in the next intersection area. We can then calculate and sort the vehicles' estimated positions as μ_pos,1 < μ_pos,2 < ⋯ < μ_pos,N, where μ_pos,i is the mean estimated position of the ith vehicle.
As the free velocities of the vehicles are independent and identically distributed (i.i.d.), by the properties of the Gaussian distribution the distance (dis_i) between vehicle i and vehicle i + 1 also follows a Gaussian distribution (8). The probability of connectivity (P_i) between them can be calculated as (9).
The distance between any two consecutive vehicles must be smaller than the transmission range R to ensure that the network from the first vehicle to the last is connected. Thus, it is required that dis_i ≤ R for i = 1, 2, …, N − 1. Note that even in multi-lane streets, the connections between vehicles mainly depend on the distance along the street (in the parallel direction), and the distance in the transverse direction is comparatively negligible. In other words, VANETs in urban streets are considered a one-dimensional network.
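The position estimate can be illustrated with a short sketch. It assumes constant acceleration acc from v_rec up to the free velocity and constant free velocity afterwards; this kinematic form and the function name `estimate_position` are assumptions made for illustration and may differ from the paper's exact expression.

```python
def estimate_position(pos_rec: float, v_rec: float, v_free: float,
                      acc: float, t_int: float) -> float:
    """Estimated distance from the street entrance after t_int seconds.

    Assumed model: accelerate at `acc` from v_rec to v_free, then cruise.
    """
    if v_rec >= v_free or acc <= 0.0:
        # Already at (or above) free velocity: simplified to pure cruising.
        return pos_rec + v_free * t_int
    t_acc = (v_free - v_rec) / acc  # time needed to reach free velocity
    if t_int >= t_acc:
        accel_dist = v_rec * t_acc + 0.5 * acc * t_acc ** 2
        return pos_rec + accel_dist + v_free * (t_int - t_acc)
    # Acceleration is still in progress at the query time.
    return pos_rec + v_rec * t_int + 0.5 * acc * t_int ** 2
```

For example, with pos_rec = 50 m, v_rec = 10 m/s, a free velocity of 15 m/s and acc = 2.5 m/s², the vehicle needs t_acc = 2 s and covers 25 m while accelerating, so after t_int = 10 s the estimate is 50 + 25 + 15 × 8 = 195 m.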
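The pairwise connectivity step can be sketched as follows. The sketch assumes that the two estimated positions are independent Gaussians, so that their difference is Gaussian with the difference of the means and the sum of the variances, and it takes P_i as the probability that this distance does not exceed R. The function names are hypothetical, and the exact form of Eqs. (8) and (9) in the paper may differ.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a Gaussian N(mu, sigma^2) evaluated at x."""
    if sigma <= 0.0:
        return 1.0 if x >= mu else 0.0  # degenerate (deterministic) case
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def pair_connectivity(mu_rear: float, sigma_rear: float,
                      mu_front: float, sigma_front: float,
                      tx_range: float) -> float:
    """P(dis_i <= R) for two consecutive vehicles with Gaussian positions."""
    mu_dis = mu_front - mu_rear                      # mean gap
    sigma_dis = math.hypot(sigma_rear, sigma_front)  # sqrt(s1^2 + s2^2)
    return normal_cdf(tx_range, mu_dis, sigma_dis)
```

Under this model, a mean gap equal to R gives a probability of exactly 0.5, which is the natural cut-off between "probably connected" and "probably not".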
Furthermore, the street connectivity considered in MMIR is that there is an end-to-end connection from the last intersection area to the next one.
Accordingly, at least one vehicle is needed in the next intersection area. We can calculate, in sequence, the probability that each vehicle is in the next intersection area (P_next), where R_int is the range of the intersection area, set to half the transmission range so that any two vehicles in the area can communicate with each other over a one-hop link. If the P_next of the nth vehicle is greater than the threshold we set (such as 0.8) and, for i = 1, 2, …, n − 1, the P_next of the ith vehicle is less than it, we can then calculate the connectivity probability of the street (P_c).
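One possible reading of this procedure is sketched below. It assumes that the next intersection area begins at street_len − R_int, that P_next is the Gaussian probability of a vehicle being beyond that point, and that P_c is the product of the pairwise probabilities P_i along the chain up to the first vehicle whose P_next exceeds the threshold. These choices, and all function names, are assumptions for illustration rather than the paper's equations.

```python
import math
from typing import List, Tuple


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    if sigma <= 0.0:
        return 1.0 if x >= mu else 0.0
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def p_next(mu_pos: float, sigma_pos: float,
           street_len: float, r_int: float) -> float:
    """Probability that the vehicle is inside the next intersection area."""
    return 1.0 - normal_cdf(street_len - r_int, mu_pos, sigma_pos)


def street_connectivity(vehicles: List[Tuple[float, float]],
                        street_len: float, tx_range: float,
                        threshold: float = 0.8) -> float:
    """vehicles: (mu_pos, sigma_pos) pairs; returns the assumed P_c."""
    ordered = sorted(vehicles)      # ascending mean position
    r_int = tx_range / 2.0          # half the transmission range
    n = None
    for idx, (mu, sigma) in enumerate(ordered):
        if p_next(mu, sigma, street_len, r_int) > threshold:
            n = idx                 # first vehicle deemed to be in the area
            break
    if n is None:
        return 0.0                  # nobody at the next intersection
    p_c = 1.0
    for (mu_a, s_a), (mu_b, s_b) in zip(ordered[:n], ordered[1:n + 1]):
        # P(dis_i <= R) for each consecutive pair up to vehicle n
        p_c *= normal_cdf(tx_range, mu_b - mu_a, math.hypot(s_a, s_b))
    return p_c
```

The sketch does not model the link between the current intersection and the first vehicle; a complete implementation would need to decide how to treat it.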

Queues and individuals
Until now, we have described the calculation of connectivity probability in the street based on the free-flow traffic state. However, in many cases, a vehicle cannot be driven at its free velocity all the way because of its interaction with the crowd of vehicles around it. It has to accelerate, decelerate, and change lanes frequently, and these events may interrupt the network connection.
To deal with these disturbances to our calculation of connectivity probability, in MMIR, we classify the vehicles into two categories to handle respectively, i.e., individual vehicle and queue vehicle.
As the most significant source of regular interruptions, the traffic lights at intersections periodically halt the vehicle flow of each movement, which is permitted on a given set of lanes only during the green light, and thereby partition the flow in the street into several clusters. Because we treat the network as one-dimensional, we call these clusters queues in this paper. Fig. 1a illustrates a typical urban intersection in which each entrance direction has two dedicated straight lanes, one straight lane shared with right turns, and one dedicated left-turn lane. Vehicles A and B have arrived at the intersection and want to turn left into street-L. On the other side, some vehicles (group C), numerous enough to affect one another's velocity when they start moving together, have stopped opposite street-L and are waiting for the straight-ahead signal to turn green. At the next moment, in Fig. 1b, vehicles A and B have entered street-L and moved some distance; behind them, group C has received the green signal and also entered street-L. As mentioned above, the individual vehicles A and B can run at their respective free velocities. In contrast, the vehicles in group C form a queue and their velocities may affect one another. Furthermore, after group C passes the intersection, there is a queue-discharging process, during which the vehicles gradually return to their free velocities until the queue fully dissipates. Let us analyze the discharging queue in terms of the following points:
Connectivity in the queue. The influence of a queue on an urban street is very likely to be negligible when the street length exceeds 2 mi (3.22 km) [42]. However, in an urban environment, the length of the street between adjacent intersections is generally less than 2 mi. In other words, the queue generated at the last traffic lights will not disperse within the current street. Given the transmission range of about 250 m, we therefore consider the queue to be connected from its head vehicle to its tail vehicle throughout the street that it has entered.
Head vehicle and tail vehicle. In MMIR, the head vehicle of a queue is the foremost vehicle at the time the connectivity calculation is executed, rather than at the time the queue formed. The tail vehicle is defined in the same way, as the rearmost vehicle at that time. Intuitively, with fewer disturbances from other vehicles, the vehicle with the fastest free velocity at the front of the queue accelerates and is most likely to run at its free velocity without loss. On the other hand, between starting to move and the last communication for the intersection record, the rear vehicles in the queue have more time and usable distance (about the queue length plus the transmission distance) to accelerate than the others. Furthermore, the vehicle with the slowest free velocity at the rear can reach its free velocity more quickly and then run without disturbance (almost all the vehicles behind it have overtaken it). Therefore, starting from their respective recording times in the intersection records, we model both the head vehicle and the tail vehicle as accelerating (which is unnecessary if the vehicle has already reached its free velocity) and then running at free velocity without loss until reaching the range of the next intersection or catching up with the queue ahead.
Integration and overlap. Once a queue has formed and entered the new street, three situations need attention: the queue catches up with an individual (Fig. 2), an individual catches up with the queue (Fig. 3), and the queue catches up with another queue (Fig. 4). In the first case, once the head vehicle overtakes the individual vehicle, the individual is integrated into the queue and is no longer considered independently. The second case is similar: after the individual vehicle overtakes the tail vehicle of the queue, it is no longer considered independently. Note that an individual vehicle can hardly overtake, or be overtaken by, all the vehicles of a queue within the length of a usual urban street; moreover, its velocity is unlikely to be faster or slower than that of all the vehicles in the queue. In the last case, the two queues overlap and are integrated into a new queue. The head vehicle of the front queue then becomes the new head vehicle, and the tail vehicle of the rear queue becomes the new tail vehicle.
As discussed above, in a queue generated by the traffic lights at an intersection, only the head vehicle and the tail vehicle need to be considered, and under common circumstances there is still a connected link between them. Furthermore, at a given time, if the estimated position of an individual vehicle lies between the head and tail vehicles of a queue, or even strides over the queue, the individual can be ignored. The pseudo code of the calculation of connectivity probability is shown below.
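The integration rules for queues and individuals can be sketched in Python as follows. This is not the paper's algorithm but a minimal illustration under stated assumptions: every queue is represented only by the Gaussian position estimates of its tail and head vehicles, an individual is a segment whose tail and head coincide, overlap is decided on mean positions, and a queue is treated as internally connected. The names `Segment`, `individual`, `merge_segments` and `chain_connectivity` are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    tail_mu: float      # mean position of the tail vehicle (m)
    tail_sigma: float
    head_mu: float      # mean position of the head vehicle (m)
    head_sigma: float


def individual(mu: float, sigma: float) -> Segment:
    """An individual vehicle is a zero-length segment."""
    return Segment(mu, sigma, mu, sigma)


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Absorb individuals lying inside a queue and fuse overlapping queues."""
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.tail_mu):
        if merged and seg.tail_mu <= merged[-1].head_mu:
            prev = merged[-1]
            if seg.head_mu > prev.head_mu:
                # Overlapping queues: keep the rear tail, take the front head.
                merged[-1] = prev._replace(head_mu=seg.head_mu,
                                           head_sigma=seg.head_sigma)
            # Otherwise the segment lies inside the queue and is ignored.
        else:
            merged.append(seg)
    return merged


def chain_connectivity(segments: List[Segment],
                       tx_range: float) -> Tuple[List[Segment], float]:
    """Product of P(gap <= R) over consecutive merged segments."""
    merged = merge_segments(segments)
    p = 1.0
    for rear, front in zip(merged, merged[1:]):
        mu = front.tail_mu - rear.head_mu
        sigma = math.hypot(rear.head_sigma, front.tail_sigma)
        if sigma <= 0.0:
            p *= 1.0 if mu <= tx_range else 0.0
        else:
            p *= 0.5 * (1.0 + math.erf((tx_range - mu) / (sigma * math.sqrt(2.0))))
    return merged, p
```

Because only head and tail vehicles are tracked, the number of position estimates per queue stays at two regardless of how many vehicles the queue contains.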

Delivery delay
In a sparse traffic circumstance, sometimes there is probably not an existing connection link in the street. However for delay-tolerant applications, the carry-andforward approach can be adopted, where the vehicle carries the packet when connection does not exist, and forwards the packet when there is an appropriate receiver that appears. The delivery delay which is taken to deliver the packet through the street is commonly constituted by transmission delay and carrying delay. By ignoring the transmission delay which is very small relatively, we consider delivery delay mainly as the corresponding carrying delay. In MMIR, by means of the position estimation of vehicles, we can estimate the connection status over time at equal intervals and then calculate a score of delivery delay for every candidate street. Note that we just give a score to compare for street selection on our original purpose, not to precisely model the delivery delay of packet forwarding in a street, which is a complicated work especially from the microscopic point of view due to many uncertainties. Algorithm 2 describes the process of score calculation: at a certain time if the connectivity probability P i between two consecutive vehicles is greater than 0.5, it is considered the packet is sent to the front vehicle without increasing carrying delay; on the contrary, the packet is left at the current vehicle and will be judged again at the next moment (e.g., next second); when the packet arrived in the range of the next intersection, the ratio of the time spent to the expiration time is the score of delivery delay. If the expiration time ran out and the packet cannot arrive at the next intersection, the score is set as 0. In MMIR, the street which has a higher score delay is regarded as that with lower delivery delay relatively for forwarding packets.
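The time-stepped evaluation can be sketched as below. Several simplifications are assumed and are not taken from the paper: vehicles move at a constant mean free velocity, the packet starts on the rearmost vehicle, and the test P_i > 0.5 is replaced by the equivalent check that the mean gap is smaller than R (for a Gaussian gap, the probability of being within R exceeds 0.5 exactly when the mean gap is below R). The score is taken as the unused fraction of the expiration time, 1 − t/T_exp, which is one reading consistent with a score of 0 on expiry and with a higher score meaning a lower delay. The function name `delay_score` is hypothetical.

```python
from typing import List, Tuple


def delay_score(vehicles: List[Tuple[float, float]], street_len: float,
                tx_range: float, t_exp: float, step: float = 1.0) -> float:
    """vehicles: (initial position in m, mean free velocity in m/s)."""
    if not vehicles or t_exp <= 0.0:
        return 0.0
    r_int = tx_range / 2.0
    # Assumption: the packet starts on the rearmost vehicle.
    holder = min(range(len(vehicles)), key=lambda i: vehicles[i][0])
    t = 0.0
    while t <= t_exp:
        pos = [p0 + v * t for p0, v in vehicles]
        order = sorted(range(len(vehicles)), key=pos.__getitem__)
        k = order.index(holder)
        # Forward hop by hop while the next vehicle ahead is within range.
        while k + 1 < len(order) and pos[order[k + 1]] - pos[order[k]] < tx_range:
            k += 1
        holder = order[k]
        if pos[holder] >= street_len - r_int:
            return 1.0 - t / t_exp   # packet reached the next intersection
        t += step                    # otherwise carry and re-check later
    return 0.0                       # expired before arrival
```

With three vehicles at 0, 200 and 400 m all moving at 10 m/s on a 1000 m street (R = 250 m, expiration 100 s), the packet hops to the front vehicle immediately and is then carried until that vehicle passes 875 m at t = 48 s, giving a score of 0.52.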
In Section 3.3, we discussed the estimation of connectivity and delay for a street in MMIR. Thus, like [33], combining the distance to the destination, we obtain the total score for street selection, Eqs. (12) and (13), which focus on connectivity probability and delivery delay, respectively. Here d_i is the curve metric distance from the next intersection of the candidate street to the destination, and it should be less than d_cur, the distance from the current intersection to the destination. α_1, α_2, β_1, and β_2 are the weighting factors for the distance and for the connectivity or delay, respectively, with α_1 + β_1 = 1 and α_2 + β_2 = 1. The candidate street with the higher score is preferred. Note that if the destination of the packet lies in a candidate street, that street is chosen without calculation.
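To make the weighting concrete, the sketch below combines distance progress with a street metric. It assumes a GyTAR-style weighted sum, score = α(1 − d_i/d_cur) + β·m, where m is either the connectivity probability or the delay score. This specific form is an assumption for illustration and is not necessarily identical to Eqs. (12) and (13); the function and field names are hypothetical.

```python
from typing import List, Optional, Tuple


def select_street(candidates: List[Tuple[str, float, float]],
                  d_cur: float,
                  alpha: float,
                  beta: float) -> Optional[str]:
    """Pick the candidate street with the highest total score.

    candidates: (name, d_i, metric) where d_i is the distance from the
    street's next intersection to the destination and metric is in [0, 1]
    (connectivity probability or delay score).
    """
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("weights must satisfy alpha + beta = 1")
    best_name, best_score = None, float("-inf")
    for name, d_i, metric in candidates:
        if d_i >= d_cur:
            continue  # the text requires d_i < d_cur (street must make progress)
        # Assumed score form: distance progress plus weighted street metric.
        score = alpha * (1.0 - d_i / d_cur) + beta * metric
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    streets = [("A", 800.0, 0.9), ("B", 500.0, 0.4), ("C", 1200.0, 1.0)]
    print(select_street(streets, d_cur=1000.0, alpha=0.5, beta=0.5))
```

Setting alpha to 0 and beta to 1 reduces the choice to the street metric alone, which matches the setting used later for two streets with identical conditions.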

Improvement and adjustment for connectivity calculation
Forwarding packets among vehicles running in the same direction enhances the stability of the network connection. However, in a two-way street, packets can sometimes be relayed by opposing vehicles to improve connectivity, which can also be considered in MMIR. In advance, we obtain the average spatial density λ_opp of the opposing direction over a recent period of time from the intersection records at the other end of the candidate two-way street. In the connectivity calculation between two consecutive vehicles, if the distance (d = μ_pos,i+1 − μ_pos,i) is greater than the transmission range R, we can make use of the opposing vehicles to repair the connectivity. We get the minimum number of opposing vehicles needed, n = d/R, and then recalculate the connectivity probability with the opposing vehicles added.

So far, most street selection strategies based on monitoring traffic density are subject to some social disturbance factors. For instance, suppose there is a large residential community beside the street. On working-day mornings, many vehicle nodes appear in the street as they start their trips to work; after work, they pass the intersection connecting the street, return to the community, and disappear. If the monitoring point is at the upstream intersection (as in MMIR), the disappeared vehicles decrease the actual connectivity in the street. Conversely, if the connectivity evaluation is based on information gathered at the downstream intersection, the calculated connectivity probability will be larger than the real value because of the appeared vehicles. In MMIR, we introduce two solutions to this problem.

The first is to utilize the vehicles' trajectory information as in [20]: each vehicle's destination position is acquired at the intersection if its destination is in the candidate street, and we correct our calculation accordingly. The second is to set a variation factor by means of statistical information: the ratios of appeared (n_app) and disappeared (n_dis) vehicles to the recorded vehicles (n_rec) at the intersection, in terms of the street location and the time of day, are used, and the variation factor can be set in the form (pos/l) × (1 + α' ⋅ n_app/n_rec − β' ⋅ n_dis/n_rec), where pos is the position (relative to the current intersection) of the disturbance point, such as a large community or factory, and l is the length of the street. Owing to space limitations, the details of this study are not given here.
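As a worked example of the variation factor, the short sketch below evaluates the expression (pos/l) × (1 + α' ⋅ n_app/n_rec − β' ⋅ n_dis/n_rec) for given counts. The function name, argument names, and the sample numbers are hypothetical; how the factor is then applied to the connectivity calculation is not specified in the text and is not modeled here.

```python
def variation_factor(pos: float, length: float,
                     n_app: int, n_dis: int, n_rec: int,
                     alpha_p: float, beta_p: float) -> float:
    """Evaluate (pos/l) * (1 + alpha' * n_app/n_rec - beta' * n_dis/n_rec)."""
    if length <= 0 or n_rec <= 0:
        raise ValueError("street length and n_rec must be positive")
    return (pos / length) * (1.0 + alpha_p * n_app / n_rec - beta_p * n_dis / n_rec)


if __name__ == "__main__":
    # Illustrative numbers only: a disturbance point halfway along a 1-km
    # street, with 10 appeared and 20 disappeared vehicles per 100 recorded.
    print(variation_factor(pos=500.0, length=1000.0,
                           n_app=10, n_dis=20, n_rec=100,
                           alpha_p=1.0, beta_p=1.0))
```

With these illustrative inputs the factor is 0.5 × (1 + 0.1 − 0.2), i.e., about 0.45.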

Parameter setting
In this section, we determine the threshold values, namely the arrival time interval and the number of vehicles, that define a queue in MMIR. As explained in Section 3.3.2, a queue in MMIR must satisfy two conditions: the arrival time interval between every two consecutive vehicles in the queue is short enough to avoid a communication break (i.e., the distance between two consecutive vehicles becoming larger than the transmission range after driving for a while), and the number of queue vehicles is large enough to distinguish them from individual vehicles. To determine these thresholds, we simulate a scenario using SUMO (Simulation of Urban Mobility) [43], an open source program that generates realistic vehicular mobility; the parameters are shown in Table 2. The street is 1 km long with 4 lanes in a single direction, which is a common configuration for urban arterial streets. The free velocities of the vehicles differ and follow a normal distribution N(70, 10.5) km/h. The SUMO parameter sigma, which describes the random influence of driver imperfection (i.e., uncertain factors) on velocity, is set to 0.5 to achieve realistic vehicle behavior. The transmission range of each vehicle is set to 250 m. In the simulation, the vehicles are injected from one end and travel through the street with different combinations of average arrival time interval and total number. For every combination, we repeat the simulation enough times to compute two quantities: the mean broken rate, which is the proportion of disconnected time (during which the multi-hop connection from the head vehicle to the last vehicle is broken) to the queue existence time (from the time the last vehicle enters the street to the time the head vehicle leaves it), and the average lost time of all vehicles caused by driving slower than their desired velocities.
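The broken rate defined above can be computed from position traces roughly as follows. This is a minimal sketch under two assumptions: the queue vehicles' positions are sampled at equal time intervals during the queue existence time, and a link counts as broken when the gap between consecutive vehicles exceeds the transmission range. The function names are hypothetical and the trace format is not SUMO's own output format.

```python
from typing import List, Sequence


def chain_connected(positions: Sequence[float], tx_range: float = 250.0) -> bool:
    """True if every gap between consecutive vehicles is within tx_range."""
    ordered = sorted(positions)
    return all(b - a <= tx_range for a, b in zip(ordered, ordered[1:]))


def broken_rate(snapshots: List[Sequence[float]], tx_range: float = 250.0) -> float:
    """Share of equally spaced snapshots in which the head-to-tail chain is broken."""
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    broken = sum(1 for s in snapshots if not chain_connected(s, tx_range))
    return broken / len(snapshots)


if __name__ == "__main__":
    # Four snapshots of a three-vehicle queue (positions in metres).
    trace = [[0, 100, 200], [0, 100, 400], [50, 300, 500], [0, 260, 300]]
    print(broken_rate(trace))
```

In the toy trace, the second and fourth snapshots contain a gap larger than 250 m, so half of the sampled time is counted as broken.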
As expected, Fig. 5 shows that the broken rate in the queue increases with the arrival time interval. When the interval is 2 s or less, the broken rate stays below 1% whether the number of vehicles is 12, 30, or 50. However, when it is 2.5 s or more, the broken rate increases markedly. Fig. 6 shows that the average lost time increases with the number of queue vehicles: the more vehicles there are in a queue, the more they affect each other's velocity and the slower they drive. When the number is 8, the average lost time is less than 1 s, which means that 8 vehicles need not be considered a queue and can be treated as individuals. To classify the vehicles (individual and queue) appropriately and avoid excessive queues, and taking into account the actual features of the street (e.g., an urban arterial street), we set qint, the maximum time interval allowed between consecutive vehicles in a queue, to 2 s and qnum, the minimum number of vehicles allowed in a queue, to 12. We use these parameters in the following simulations. Note that different street conditions (e.g., the number of lanes or the velocity limit) call for different values of qint and qnum.
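The two thresholds can be applied to the recorded arrival times along the following lines. This is only a sketch of one plausible grouping rule, assuming that a queue is a maximal run of vehicles whose consecutive arrival intervals are at most qint and whose size is at least qnum; the exact procedure of Section 3.3.2 is not reproduced here, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def classify_arrivals(arrival_times: Sequence[float],
                      qint: float = 2.0,
                      qnum: int = 12) -> Tuple[List[List[float]], List[float]]:
    """Split arrival times (s) into queues and individual vehicles."""
    queues: List[List[float]] = []
    individuals: List[float] = []
    run: List[float] = []

    def flush() -> None:
        # A run becomes a queue only if it is long enough.
        if len(run) >= qnum:
            queues.append(list(run))
        else:
            individuals.extend(run)
        run.clear()

    for t in sorted(arrival_times):
        if run and t - run[-1] > qint:
            flush()  # gap too large: the current run ends here
        run.append(t)
    flush()
    return queues, individuals


if __name__ == "__main__":
    times = [0.0] + [10 + 1.5 * i for i in range(12)] + [40.0, 41.0]
    q, ind = classify_arrivals(times)
    print(len(q), ind)
```

Here the twelve vehicles arriving 1.5 s apart form one queue, while the early vehicle and the closely spaced pair at the end stay individual because their runs are shorter than qnum.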

Results and discussion
In this section, we evaluate the performance of MMIR. For connectivity probability, we run simulations to verify its accuracy. Then, in an intuitive simulation test of street selection for routing, the estimated delivery delay in MMIR is evaluated and compared with two classical methods: one based on traffic statistics, and GyTAR. The traffic simulations are conducted with SUMO, and the trace files are fed into OMNeT++ [44] for analysis.

Accuracy of connectivity probability
To evaluate our connectivity probability algorithm, we simulate two intersections and a street connecting them. We reuse some of the parameters in Table 2 (e.g., street length, vehicle velocity, and sigma), and the vehicles are deployed at different levels of traffic flow (50/100/150/200/250/300 vehicles per lane per hour) to test and verify the accuracy of the connectivity probability. A test packet is generated every 10 s and sent to the other intersection, relayed by the vehicle nodes in a single direction.
In the meantime, we calculate the connectivity probability P_c for the street. If the packet reaches the destination without carrying delay, which means that every distance between two consecutive vehicles is smaller than the transmission range, the multi-hop network between the two intersections is connected at that moment and we set the real connectivity to 1; otherwise, we set it to 0. As the evaluation indicator, we adopt the probability deviation, which equals |real connectivity − P_c|. A smaller probability deviation means a more accurate prediction of the real-time street connectivity, and vice versa. The total simulation time is 3000 s, and we gather data from 500 s onward to ensure that the traffic state has stabilized. For MMIR, we perform a free-flow traffic test in advance, simulating a light traffic environment, to collect the data for the free-velocity calculation. To investigate the impact of traffic lights on the connectivity calculation, the simulations are performed in two scenarios: without traffic lights and with traffic lights whose period is 150 s. The threshold in Algorithm 1 is set to 0.8, and n_head and n_tail are set to 8.

In Fig. 7, there are no traffic lights at the intersection. The probability deviation increases with the traffic flow, ranging from 0 to 0.13 as the traffic flow increases from 50 to 300. The reason is that the more vehicles there are in the street, the more they interact, which adds uncertainty to the connectivity. Moreover, when the traffic flow reaches the critical value (about 150) that just supports the end-to-end connection through the street statistically (i.e., the real connectivity value swings between 0 and 1 frequently), the probability deviation is about 0.09 and does not fluctuate. This demonstrates the accuracy and real-time performance of MMIR.
Figure 8 illustrates the variation when traffic lights exist at the intersection. In this case, the traffic flow is separated by the lights, so even with more vehicles in the street to support the network connectivity, breaks still occur. Nevertheless, MMIR shows results similar to those in the scenario without traffic lights: its probability deviation ranges from 0 to 0.12. The reason is that MMIR directly records every vehicle's entrance time (i.e., when it passed the intersection and the traffic lights) instead of only the statistical traffic flow. Considering the many uncertain factors that influence real-time connectivity (sigma in SUMO is set to 0.5), the result of MMIR is accurate enough.
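The probability deviation used in this subsection can be computed as in the sketch below. It assumes that each sample pairs the observed real connectivity (0 or 1) with the P_c computed at the same moment, and that real connectivity is derived from the gaps between consecutive vehicle positions. The function names are hypothetical; whether the intersection nodes are included among the positions is left to the caller.

```python
from typing import Iterable, Sequence, Tuple


def real_connectivity(positions: Sequence[float], tx_range: float = 250.0) -> int:
    """1 if every gap between consecutive nodes is smaller than tx_range, else 0."""
    ordered = sorted(positions)
    return 1 if all(b - a < tx_range for a, b in zip(ordered, ordered[1:])) else 0


def mean_probability_deviation(samples: Iterable[Tuple[int, float]]) -> float:
    """Average of |real connectivity - P_c| over all (real, P_c) samples."""
    deviations = [abs(real - p_c) for real, p_c in samples]
    if not deviations:
        raise ValueError("at least one sample is required")
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    print(real_connectivity([0, 200, 400, 600, 800, 1000]))
    print(mean_probability_deviation([(1, 0.9), (0, 0.2), (1, 0.7)]))
```

A prediction of 0.9 for a connected street contributes a deviation of 0.1, while a prediction of 0.2 for a disconnected street contributes 0.2, so both confident and correct estimates keep the average low.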

Analysis of estimated delay
To evaluate the estimated delivery delay in MMIR directly, we built a simple and intuitive scenario. It includes three intersections: I1, I2, and I3. I2 and I3 are connected to I1 by streets A and B, respectively, which have the same conditions (e.g., the number of lanes and the street length). Traffic lights are set at these intersections. A static sender node is placed at the center of I1, and two static receiver nodes are deployed at I2 and I3, respectively. At regular intervals, the sender transmits a test packet containing its id and transmission time to the receivers. The TTL (time-to-live) of the packet is set to 100 s. When the packet arrives at the receiver nodes at I2 and I3, the arrival time is recorded to determine which street has the shorter delivery delay and is therefore the better routing choice. Before transmitting the test packet, we use the estimated delivery delay of MMIR to choose the better street and mark the corresponding packet. If the marked packet arrives at its receiver node earlier than the other one, the selection is correct. Note that, because the two streets have the same conditions, the optimal selection between streets A and B depends mainly on their respective connectivity during the whole packet transmission time (i.e., α_2 = 0 and β_2 = 1 in Eq. 13). The average traffic flows in streets A and B are set to both equal and different levels to evaluate the performance in each case. The detailed values of the simulation parameters are shown in Table 3. For comparison, we introduce two classical street selection methods from intersection-based routing protocols. The first is a classical connectivity model [21], which uses statistical traffic information such as traffic flow and average velocity to calculate the network connectivity in each candidate street and then chooses the best one to forward the packet. For every calculation in our simulation, we use the statistical data of the 300 s preceding that moment.
We call this method the statistical model in this paper. A similar method for street selection is also adopted in VADD. The second is GyTAR, in which the forwarding node at the intersection assigns a score to each candidate street considering the traffic density and the curve metric distance to the destination. The street with the highest score is selected to forward the packet. The information about traffic density in the street is gathered by dedicated control packets, CDPs (cell density packets), which are generated regularly by moving vehicles at the next intersection and traverse the street to the current intersection. Note that, as with MMIR, because the streets have the same conditions, the optimal selection between streets A and B in GyTAR depends mainly on their respective connectivity (i.e., traffic density and vehicle distribution). We can therefore compare the street selection accuracy of these methods with that of MMIR directly using a connectivity-related metric.

As shown in Fig. 9, we deploy five combinations of average traffic flows in streets A&B: 200&150, 200&100, 250&100, 250&50, and 300&50 (the differences between A and B are 50, 100, 150, 200, and 250, respectively) to observe the performance of the three street selection methods. The accuracy of the statistical model increases as the difference increases, because it effectively selects the street with the higher flow based on the traffic information of the past period, and a higher average traffic flow probabilistically means a lower delivery delay. In contrast, GyTAR and MMIR do not display a similar trend. They use actually measured information to reflect the current status as far as possible, so their decisions are rarely influenced by non-real-time statistics.
MMIR achieves a higher accuracy, with all results greater than 80%, thanks to its ability to estimate the end-to-end delay directly using the individual information in the intersection records. GyTAR does not perform as well as MMIR because of the relatively low update rate of its real-time traffic state, with an update interval of about 250 m (transmission range)/average velocity (for details, please refer to [33]). Sometimes, it cannot follow the rapid changes in vehicle distribution caused by the differences in vehicles' velocities and the switching of traffic lights.
In the next set of tests, the same level of traffic flow is deployed in streets A and B: 50, 100, 150, 200, 250, and 300. Besides the selection accuracy in Fig. 10, we provide the delivery delay in Fig. 11 to show the performance differences among the three methods more precisely. The statistical model has the longest delivery delay and does not perform as well as it did in the previous set. Because streets A and B have the same average traffic flow, the macroscopic statistical data can no longer help it make the correct choice between them. GyTAR displays similar results when the traffic flow is relatively large. But when the flow is 50 or 100, its accuracy is lower and its delivery delay is relatively longer, because in those cases there are often too few vehicles to form an end-to-end connection through the whole street. Moreover, the traffic lights further increase the probability of disconnection. Inevitably, the CDPs do not reach the current intersection regularly and cannot update the traffic information in time. With stale data, it is difficult for GyTAR to make an accurate decision. As can be seen, MMIR still achieves a high accuracy (greater than 80%) and outperforms the others in delivery delay at every level of traffic flow.
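The selection accuracy reported in these tests can be tallied as in the following sketch. It assumes that each trial records the street marked before transmission together with the measured delays on streets A and B, that a packet whose TTL expired is recorded as None, and that trials in which neither street is strictly better are skipped. The tie handling and all names are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

Trial = Tuple[str, Optional[float], Optional[float]]  # (chosen, delay_a, delay_b)


def selection_accuracy(trials: Iterable[Trial]) -> float:
    """Fraction of decisive trials in which the marked street was the faster one."""
    correct = 0
    decisive = 0
    for chosen, delay_a, delay_b in trials:
        # A packet that never arrived is treated as infinitely late.
        a = float("inf") if delay_a is None else delay_a
        b = float("inf") if delay_b is None else delay_b
        if a == b:
            continue  # assumed: ties carry no information about the choice
        decisive += 1
        best = "A" if a < b else "B"
        if chosen == best:
            correct += 1
    if decisive == 0:
        raise ValueError("no decisive trial to evaluate")
    return correct / decisive


if __name__ == "__main__":
    log = [("A", 10.0, 20.0), ("B", 15.0, 12.0), ("A", 30.0, None),
           ("A", 25.0, 5.0), ("B", None, None)]
    print(selection_accuracy(log))
```

In the sample log, three of the four decisive trials marked the faster street, and the trial in which both packets expired is ignored.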

Conclusions
In this paper, we study street selection in intersection-based routing for urban vehicular ad hoc networks. We show that existing methods and models that rely on macroscopic information are not well suited to VANETs with high mobility and rapidly changing topology. In summary, macroscopic data (e.g., traffic density, average velocity) support good decisions only for the general condition, not for every concrete condition and moment. To address this problem, we proposed a microscopic mechanism based on intersection records (MMIR), which makes use of the vehicles' individual information recorded at the intersection to estimate their current positions and calculate the connectivity probability or estimated delay for candidate streets. The simulation results show that, in terms of connectivity probability and delivery delay, MMIR provides an accurate estimation and outperforms existing schemes. In the future, building on the microscopic mechanism, we will improve our method (e.g., by taking more microscopic individual factors into consideration) and support more metrics that satisfy the quality of service requirements of urban VANET routing.