We improve the voting procedure of MR-ARP by incorporating computational puzzles [16–18] to mitigate ARP poisoning-based MITM attacks in wired or wireless LAN, while overcoming the limitation of MR-ARP [13]. As explained at the end of Section 2, the transmission rates of wireless nodes may not be uniform, and thus, the fair voting may not be guaranteed by the voting mechanism of existing MR-ARP. The key idea of the new voting mechanism to resolve the fairness issue in the wireless LAN can described as follows. If the CPU processing power of different nodes is not significantly different, then they might spend a similar amount of time solving the puzzles of the same difficulty. Thus, if we accept the votes only from the nodes that solved the puzzle correctly, then fair voting might be realized in the wireless environment, too. Furthermore, voting reply traffic can be reduced compared to the original MR-ARP scheme since we no longer need to receive multiple votes from each neighbor node. In addition to the fairness issue, MR-ARP could be vulnerable to ARP cache-poisoning problem, although not the MITM problem, especially when a new machine with a new IP address is added. We also explain how this issue is resolved through the new initialization stage after the detailed description of the new voting mechanism. The proposed enhanced version of MR-ARP, which is called EMR-ARP, follows a similar approach to the original MR-ARP, except in the voting and initialization procedures. Thus, EMR-ARP is also based on the following idea [13]. When Node A knows the correct IP/MAC address mapping for Node B, if Node A retains the mapping while Node B is alive, then ARP poisoning and the MITM attack between A and B are not possible.
EMR-ARP retains the long-term (IP, MAC) mapping table, in addition to the normal ARP cache, which is also referred to as the short-term table in this article, to manage the (IP, MAC) mapping for all alive machines in the subnet. Three fields, IP, MAC, and Timer T
L
, are allocated for each IP address registered in the long-term table. The default value of the timer in the long-term table is 60 minutes. In order to avoid losing the mapping of (IPa, MACa) for an alive host after 60 minutes, the EMR-ARP node sends new ARP request messages for IPa only to MACa through unicasting and checks whether MACa is alive and is still using IPa just before the timer expires. In this case, 10 ARP messages are sent at the random intervals of 50~100 ms. If at least one ARP reply returns, then the mapping is registered in the short-term ARP cache and the corresponding long-term table timer is set to 60 min again. If there is no ARP reply, then the mapping between IPa and MACa is considered invalid and the corresponding entry is deleted from the long-term table. Thus, the (IP, MAC) mapping can be retained in the long-term table until the binding is released.
EMR-ARP inspects the mapping between the sender protocol (IP) address and the sender hardware (MAC) address of every incoming ARP request message to track the IP/MAC mapping for all alive machines in the same subnet, since each alive machine tends to send ARP requests to find the MAC address of the gateway router repeatedly due to the periodic timer expiry in the ARP cache. However, IP/MAC mapping of every ARP request cannot be directly reflected in the long-term table due to the possibility of ARP cache poisoning attempts. Thus, the short-term cache, i.e., ARP cache, and long-term table need to be updated carefully on arrival of ARP request or reply packets. Figure 1 shows the detailed short-term cache and long-term table management policy. By the rule for the case (A) in Figure 1, each IP/MAC mapping registered in short-term cache is frozen until it expires, e.g., for 2 minutes for Windows XP, to prevent too frequent cache updates by ARP request sniffing.
In case (B) of Figure 1, MAC conflict occurs, because the newly received MAC address MACa for IPa differs from that is already associated with IPa. This conflict is resolved by giving a priority to only if it is alive. As shown in Figure 1, the activity of a host is examined by sending 10 ARP request packets and counting the ARP replies. Multiple ARP requests are sent to cope with unexpected packet losses, including the case of DoS attack on . Even though the packet loss probability is as high as 80% by DoS attack, at least one ARP reply will be returned with a probability of 89%(= 1 - 0.810). Let us consider the case where the mapping of (IPa, MACz) expires, and the IP address IPa is allocated to a new machine with the MAC address of MACa through DHCP. In this case, the new mapping between IPa and MACa can be chosen by the rule corresponding to the case (B) in Figure 1, even though the mapping (IPa, MACz) exists in the long-term table. When a first ARP request from MACa arrives, there would be conflict on IPa in the long-term table, and the node which observes the conflict will send ten ARP requests to MACz. However, the machine MACz would not reply and the new mapping can be registered, since the previous mapping has expired.
3.1 Computational puzzle-based voting scheme
Thus far, we investigated how to prevent MITM attacks for the IP addresses whose corresponding MAC addresses are known already. However, if Node A receives an ARP request from a new IP address, then Node A cannot easily know the correctness of the source IP/MAC mapping contained in the ARP request. For example, when a new machine is added into a LAN with empty short-term cache and long-term table and the machine sends an ARP request for the gateway router, if an adversary's ARP reply advertising a fake MAC address arrives first, then this wrong MAC address may be registered and the late true MAC address may be dropped. We use computational puzzle-based voting, which corresponds to the case (C) in Figure 1, to solve the poisoning problem for this particular case.
The basic idea of the voting-based resolution mechanism can be explained as follows. When Node A is turned on with empty ARP cache and long-term table, if the neighbor nodes that know the true MAC address of the gateway router provide that information to Node A, then Node A can make a correct decision on the MAC address of the gateway router based on the majority of the votes. Figure 2 describes a simplified outline of the voting procedure with this example scenario. Since Node A wants to know the true MAC address of the gateway router, it first generates a new puzzle and broadcasts the voting request message containing the new puzzle to neighbor EMR-ARP nodes. Then, the neighbor nodes solve the puzzle using the message transformation function, and notify the voting request node of the puzzle solution and the MAC address of the gateway router that they know through voting reply messages. After collecting the voting reply messages from the neighbor nodes, Node A verifies the puzzle solution using the message recovery function, and makes a decision based on the votes from the nodes providing the correct puzzle solution. The message transformation function and the message recovery function will be described shortly in this section.
Preserving the fairness among different nodes is one of the most important issues in the voting scheme. In the previous version of the voting scheme used in MR-ARP [13], the fairness was provided by having all the neighbor MR-ARP nodes send multiple voting reply packets at the maximum speed of link rate in the Ethernet. This idea works, because the LAN cards usually have sufficient capability to send traffic at the link rate regardless of the vendor. However, the maximum transmission rates of different nodes may not be the same in the wireless LAN if the transmission rate changes by ARF depending on SNR. Thus, the voting scheme needs to be refined to provide the fairness under any combination of wireless and wired nodes. We attempt to provide the fairness by having the wired or wireless nodes spend a similar amount of time solving the computational puzzles [18] of the same difficulty and making the address resolution decision only based on the votes from the nodes that solved the puzzle correctly. Thus, the puzzle used in the proposed ARP scheme needs to satisfy the following requirements.
-
The puzzle must be easy to make.
-
The puzzle must be difficult to solve, but it should be easier to verify the puzzle solution.
-
Puzzle receiver nodes cannot predict the puzzle easily.
-
Different nodes should solve different problems to avoid the case whereby a malicious node pretends to be many other nodes by sending multiple replies with a spoofed MAC address after solving only one problem.
-
The complexity of the different puzzles should be similar to provide fairness between different nodes.
Conventional computational puzzles [16–18] based on hash functions might satisfy most of the above requirements. However, we found very large variance in the puzzle solving time for hash function-based puzzles. As an example, let us consider the puzzle that finds the value of x that satisfies the following relation:
where H' is a uniform random hash function, r is a positive integer, y is an integer given by the puzzle presenter, and [z]
r
means the least significant r bits of z. Then, the number of iterations required to find a solution of the above puzzle can be modeled as a random variable X that has a geometric distribution with a success probability p, where p = 1/(2r). The mean (μ) and the standard deviation (σ) of X are 1/p and , respectively. Thus, σ can be large, even close to μ for small values of p. When this type of puzzle is used, if the attacker can luckily solve many puzzles, even for different MAC addresses, while most other normal machines solve only one puzzle, then the attacker might win the voting by increasing the ratio of votes supporting the fake MAC address and spoof the ARP cache of the voting request node. The variance in the puzzle solving time needs to be minimized to avoid such a problem.
In order to satisfy the above requirements, while minimizing the variance in the puzzle-solving time, we design the puzzle based on the public key cryptography, especially RSA algorithm [19, 20], so that the number of arithmetic operations required to solve the puzzle cannot differ depending on the machines. The proposed puzzle can be described as follows. We first consider two keys e and d that are usually termed the public key and the private key in the public key cryptography. In the proposed scheme, the value of e is fixed to 3, a popular value for the public key [20]. We calculate the value of d in the following way. We choose two different large primes p and q, that are 128 bits long and satisfy the condition that (p - 1) and (q-1) are relatively prime to e. Then, d is an integer that satisfies the condition:
ed = 1 mod (p - 1)(q - 1). Such a value of d can be easily found with Euclid's algorithm [20]. N is defined as N = pq, and N is 256 bits long. When e = 3, if the message x is smaller than , then the encrypted message x3 can be decrypted simply by taking a cube root. This problem can be avoided by padding each message with a random number before encryption, so that the padded message is sufficiently large. We define the message transformation functions f
i
(i = 1,..., m), and F
m
as follows.
(1)
If f
i
is simply defined as modular exponentiation, e.g., as the decryption operation in the public key cryptography, without any additional operations of addition and subtraction, then the multiple encryption required to evaluate F
m
(x) can be simplified into a single exponentiation by Euler's theorem [20]. In order to avoid such a simplification, we incorporated the additional operations of addition and subtraction in the definition of the message transformation functions as shown above.
We also define the message recovery functions g
i
(i = 1,..., m), and G
m
as follows.
(2)
Then, from (1) and (2) we can obtain the following relation.
If we put y = (xdmod N)-1, then y belongs to [0, N - 2] and the above relation can be changed into
where the last equality is valid by the RSA algorithm. Similarly, we can show that
(3)
The proposed puzzle is to find the value of y that satisfies the following relation for a given integer x,
Thus, if the values of e, d, N, and m are fixed, then the puzzle complexity in terms of the number of operations is also fixed. In the proposed scheme, the values of e, d, and N are fixed as described below, and the complexity of the puzzle is controlled only by m.
In EMR-ARP, we use two types of puzzles. The first type of puzzle should be solved by the voting request node itself to prevent a DoS attack that attempts to exhaust the processing power of neighbor EMR-ARP nodes by inducing continuous puzzle computation through too frequent transmission of voting request packets. The second type of puzzle needs to be solved by the EMR-ARP nodes receiving the voting request packets. This puzzle is used to provide the fairness among different nodes in voting.
In the first-type puzzle, the voting request node first makes a message m
s
by concatenating its own MAC address (MACA) and the local time T
s
measured in seconds when the voting request node starts to calculate the puzzle, i.e., m
s
= MACA||T
s
, and transform m
s
into c
s
by the message transformation function F
m
. Then, the voting request node sends T
s
, m, and the transformed message c
s
to the neighbor nodes. e, d, and N need not be delivered since they are fixed and known to EMR-ARP nodes in advance. The neighbor nodes apply the message recovery function G
m
to the received message c
s
. If the recovered message agrees with MACA||T
s
, then it confirms that the sender solved the first-type puzzle correctly and the receiver proceeds to solve the second-type puzzle to send a valid vote in return.
The second puzzle can be described as follows. The voting request node with the MAC address of MACA generates the key parameter for the second puzzle P by
(4)
where H is a publicly-known cryptographic hash function, and MD5 is used for H in the current version. K is the secret that is known only to the voting request node. If the value of P is delivered to the neighbor nodes additionally, then each receiver node generates a message m
r
by m
r
= MACA||MACR||P, where MACR is the MAC address of the receiver node. We make each EMR-ARP node solve a puzzle that differs from those of other nodes by including the MAC address of the receiver inside the puzzle to prevent the reuse of the solution. m
r
is transformed by F
m
into c
r
, and only the transformed message c
r
is sent back to the voting request node. The voting request node applies the recovery function G
m
to the received message c
r
, and compares the recovered message with m
r
, whose components are known to voting request node already. If c
r
is correctly calculated, i.e., c
r
= F
m
(m
r
), then the following relation should hold by (3):
Thus, by comparing G
m
(c
r
) and m
r
, we can verify the correctness of each answer. We can easily find that the second-type puzzle reflects all the requirements stated above.
The implementation-related details are as follows. Two more ARP packet types, voting request and voting reply packets, are defined in the EMR-ARP in a similar way to MR-ARP. The operation field is set to 20 and 21 for voting request and reply packets, respectively. Figure 3 shows the packet formats for voting request and reply messages. As shown in the figure, the voting request packet is defined by adding 4 more fields, (T
s
, m, c
s
, P), to the normal ARP request/response packet format, and the voting reply packet is defined by adding just one more field of c
r
.
Each EMR-ARP node maintains a table of (MAC address of voting request node, T
s
) pairs, called puzzle history table, to prevent the flooding of the identical puzzles. T
s
is the puzzle generation time at the sender node as described above. If an EMR-ARP machine receives a voting request packet, then it first examines the value of the T
s
field. If the value of received T
s
is less than or equal to the value stored in the puzzle history table, then the received voting request message is neglected since the puzzle is highly likely to be a repetition of an old one. After verifying the validity of T
s
to avoid the flooding of identical puzzles, the EMR-ARP node investigates the correctness of the solution contained in the field of c
s
. If the solution is correct, the received voting request is considered valid, and the received timestamp T
s
is registered in the puzzle history table, i.e., in the entry corresponding to the MAC address of the current voting request node. Then, the EMR-ARP node proceeds to solve the second-type puzzle to provide a feedback for the voting request.
After solving the second-type puzzle as described above, the EMR-ARP node sends the MAC address corresponding to the queried IP address and the puzzle solution to the voting request node through the fields of target hardware address and c
r
, respectively, in the voting reply packet of Figure 3b.
Then, the voting request node receives and buffers the incoming voting reply packets up to 1 s from the voting request transmission time. The voting request node finds the earliest voting reply packet, calculates the response time t1, i.e., the period from the voting request transmission time to the arrival time of the earliest voting reply packet, and discards the voting reply packets that arrive later than 2t1 since those late replies might come from the malicious nodes actively solving the puzzles multiple times even for different nodes to increase the ratio of the votes supporting the fake IP/MAC mapping. Then, the voting request node verifies the correctness of the puzzle solution only for the not-too-late replies and accepts the received IP/MAC mapping only when the solution for the second-type puzzle is correct. The voting request node calculates the polling score for each candidate MAC address. If a MAC exists that received over 50% of valid votes, then that MAC address is accepted for the queried IP address.
3.2 Initialization stage
The original MR-ARP scheme might be vulnerable especially when a new MR-ARP node is attached to a LAN where an attacker resides. The detailed problem is described with an example below. In this section, we introduce an initialization stage for the proposed EMR-ARP to overcome the shortcoming of the original MR-ARP scheme especially for this case.
Figure 4 shows an example network topology to describe the vulnerability of the original MR-ARP scheme. In the figure, Nodes A, B, C, and D are MR-ARP enabled nodes. Among them, D is a router connecting the given subnet to other subnets. We assume that there is only one attacker, i.e., Node E, and Node F is newly attached to the LAN. Let us assume that Node F is also MR-ARP enabled and uses a new IP address not known to other MR-ARP nodes. In this case, if the new node sends an ARP request message querying the MAC address of the router, then Node F might receive a false IP/MAC mapping from Node E. However, the correct MAC address can be resolved through the voting procedure since the number of MR-ARP nodes is larger than that of the attackers.
In this scenario, existing MR-ARP nodes do not know the IP address of Node F, and thus, each of them will try to find the correct MAC address of Node F through voting after waiting a random time between 0 and 100 ms [13]. If we assume Node D is the first node that queries the MAC address of Node F, Node D will send a voting request for the IP address of Node F, IPF. There are one attacker node, i.e., Node E, and only one MR-ARP node that knows the MAC address for IPF, i.e., Node F itself. If those two nodes receive the voting request for IPF, then Node F will reply with 50 voting reply packets containing the correct MAC address for IPF. The attacking node needs not follow the original MR-ARP protocol, and it can send more than 50 voting reply packets with a false IP/MAC mapping. Thus, the attacker can win the voting in this case.
MITM attack cannot be successful between Nodes D and F even in this case since the ARP cache and the long term table of Node F are protected with the help of other MR-ARP nodes. However, we further investigate how to prevent ARP cache poisoning-based one-way eavesdropping by Node E for the packets going from the router Node D to the new Node F.
We employ additional initialization steps for EMR-ARP, which are described hereafter, to resolve this problem. The initialization procedure is performed only once during the lifetime of a given machine. The lifetime means the time interval from the time when the machine is turned on to the time when the machine is turned off. If a given EMR-ARP machine receives a new IP address through DHCP, then it performs the initialization steps, immediately after the IP address is configured. If the IP address is manually configured for a given EMR-ARP machine, then the initialization steps are performed just after the network card is activated. We need to note that all the ARP packets irrelevant to the initialization steps are discarded until the initialization steps are completed.
The initialization stage consists of two steps. In the first step, the rebooted or newly attached node advertises its own IP/MAC mapping through a gratuitous ARP request packet [21]. Each EMR-ARP node that receives the gratuitous ARP packet checks whether it knows the source IP address of the received packet. If it knows the received IP address, then it resolves the conflict in the usual way, by asking the previous owner of that IP address and giving it a priority. If the EMR-ARP node does not know the received IP address, then it just neglects the received gratuitous packet. Figure 5 summarizes the task that is performed by the EMR-ARP node receiving a gratuitous ARP request packet. This packet is sent first to update the IP/MAC mapping for the given IP easily, without the burden of voting especially when the owner of a given IP address changes legitimately through DHCP.
The second step starts after waiting 1 s from the transmission of the gratuitous ARP request packet. In the second step, the rebooted or newly attached node sends a special voting request packet for its own IP address to know if its current IP address is used by other machines. If there is no voting reply packet, or the MAC address of voting node polls over 50% of votes, then the new node can use the current IP address without any problem. If there is at least one reply packet and the MAC address of voting request node polls less than 50%, then the node gives up the use of the current IP address and requests a new IP address through DHCP or manual configuration with the help of a network administrator.
If other existing EMR-ARP nodes retain an entry corresponding to the IP address of the voting request node in the long term table, then they first verify the correctness of the solution to the first-type puzzle and send voting reply packets after solving the second-type puzzle, only if the first puzzle solution is correct. Although the voting reply packets for a normal voting request message are sent only to the voting request node through unicasting, the voting reply packets for a special voting request message are broadcasted, so that other neighbor EMR-ARP nodes can sniff those responses.
If an existing EMR-ARP node does not know the IP address of the voting request node, then the node temporarily stores the sender protocol and sender hardware address mapping of the received ARP voting request packet. The neighbor EMR-ARP node ignorant of the IP address of the node sending a special voting request packet monitors the reply packets from other EMR-ARP nodes for an interval of 1 sec. If there is no ARP voting reply during that time interval, then the neighbor EMR-ARP node accepts the sender protocol and sender hardware address mapping of the special voting request packet into the ARP cache and the long term table. However, the mapping is not accepted, if there is any voting reply packet. Figure 6 summarizes the way in which EMR-ARP nodes process a special voting request message that is querying the MAC address for the IP address of the voting request node.
Thus, according to the initialization procedure described above, when an EMR-ARP node sends the first special voting request packet after reboot or initial deployment, if an attacker replies with a false MAC address for the queried IP address, then the new node will try to avoid the use of the IP address in conflict. Thus, the voting replies from the attackers advertising false IP/MAC mapping cannot help one-way eavesdropping for the packets going to the new EMR-ARP node. Conversely, if there is no reply for the first special voting request, then all the normal EMR-ARP nodes will accept the mapping between the sender protocol address and the sender hardware address of the special voting request packet by the algorithm in Figure 6. Thus, the proposed initialization procedure can effectively prevent one-way eavesdropping of the packets destined to the newly attached nodes.