Secure Network Coding against Wiretapping and Byzantine Attacks



Introduction
Network coding is a packet-level coding technique that generalizes the classical routing paradigm [1]. Based on linear superposition of incoming packets at intermediate nodes, linear network coding achieves the multicast capacity in single-source wired networks [2]. Recently, wireless network coding has gained much attention as a technique for enhancing the overall throughput of a wireless multihop network that supports multiple communication flows [1][2][3][4]. As in wired networks, the basic idea is that a relay node can combine several incoming packets. The wireless communication medium has inherent particularities, such as its broadcast nature, high error rates, and unpredictable signal strength, which create opportunities for attackers. Thus, secure network coding [5][6][7][8] is an active research topic in wireless networks.
In [9], Cai and Yeung proposed a model that incorporates network coding and information security. This was the first time that network coding was used for secure transmission. Later, Jain [5] widened the sufficient condition in [9] and generalized these results to wireless networks.
In general, secure network coding is designed against two kinds of attacks: wiretapping and Byzantine modification. We call the corresponding schemes type I and type II secure network coding, respectively.
Type I Secure Network Coding. In a wiretapping attack, an adversary eavesdrops on some communication signals, out of curiosity or in order to recover the messages. In traditional transmission, packets are generally encrypted against wiretapping. However, Cai and Yeung [9] found that network coding allows a message to be transmitted securely without cryptographic approaches. With a similar model for wireless networks, Jain [5] widened the sufficient condition in [9] so that it also becomes necessary. At a transmission rate of one unit, the sender can send a message to the receiver without leaking any information to a wiretapper. In [10], Feldman et al. showed that if a small amount of overall capacity is given up, then a random code achieves security using a much smaller base field than that in [9]. Furthermore, they pointed out that a large field size may sometimes be required to achieve security without giving up any capacity. In [11], Bhattad and Narayanan generalized the model in [9] and gave a new information-theoretic model for security that accommodates many more practical security requirements.
In general, previous secure network coding schemes based on information-theoretic approaches have to restrict the eavesdropping set or give up some capacity. To address this problem, in this paper we present a new secure network coding scheme that combines information-theoretic and cryptographic approaches. In our scheme, we do not restrict the eavesdropping set; that is, the wiretapper can eavesdrop on any communication signals. Moreover, we do not give up any capacity. Because of these advantages, our scheme is more suitable for wireless multicast than the previous schemes [5,9,10].
Type II Secure Network Coding. In a Byzantine attack, attackers may modify the coded packets. Since important packets modified by an adversary will mislead the receivers and may cause them to make wrong decisions, detecting modification of the transmitted packets is also very important, alongside modification correction. How to detect modification is an active topic in network coding theory. With cryptographic approaches, Charles et al. [6] proposed a signature scheme for network coding based on the Weil pairing on elliptic curves. Later, the authors of [7,8] proposed signature schemes that exploit the linearity of the packets in a coded system. In these schemes, modifications are detected at intermediate nodes, so they are computationally expensive. With information-theoretic approaches, Ho et al. [12] presented a scheme in which Byzantine modification detection is done at the sink nodes. This scheme uses random network coding and incorporates a polynomial hash value in each packet. In this way, the computational complexity is much lower.
In this paper, to reduce the capacity loss and computational complexity, we propose a new scheme with Byzantine modification detection. Our scheme needs only one hash symbol, far fewer than the scheme in [12], and can achieve a higher detection probability. Moreover, its computational complexity is lower than that in [12]. Furthermore, by combining cryptographic and information-theoretic approaches, we present secure network coding against both wiretapping and Byzantine attacks.
The rest of this paper is organized as follows. In Section 2, we give the necessary notation and definitions. In Section 3, we present secure network coding schemes against wiretapping and for detecting Byzantine modification; an example is given at the end of that section. Conclusions are presented in Section 4.

Preliminaries
In this section, we present the necessary notation and definitions, including the network model, the descriptions of a linear network code, and the all-or-nothing transform. We denote matrices and linear spaces with bold uppercase letters and vectors with bold lowercase letters. All vectors are column vectors unless otherwise stated.
2.1. Network Model. Network coding has been leveraged as a generic technique in several types of wireless networks, such as vehicular ad hoc networks [13], wireless sensor networks [14], and mesh networks [15]. In this paper, our focus is on secure network coding for acyclic wired networks and acyclic wireless networks, which include parts of vehicular ad hoc networks, wireless sensor networks, and mesh networks. In detail, because of the broadcast nature of the wireless interface, each node is possibly connected to several other nodes, where node $u$ connects to node $v$ if $v$ is within the coverage of $u$'s signal. In this way, we obtain a directed graph $G$. Both acyclic wired networks and acyclic wireless networks can be represented by a directed acyclic network $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges. The source node is denoted by $s$, and edges are denoted by round brackets $e = (u, v) \in E$, in which $v = \mathrm{head}(e)$ and $u = \mathrm{tail}(e)$. Let $\mathrm{In}(v)$ ($\mathrm{Out}(v)$) be the set of edges that end (start) at a vertex $v \in V$.

Descriptions of a Linear Network Code.
Now we give two descriptions of a linear network code.
Definition 1 (see [16] (Local Description of a Linear Network Code)). An $\omega$-dimensional linear network code on an acyclic network over a base field $F_q$ consists of a scalar $k_{d,e}$, called the local encoding coefficient, for every adjacent pair of channels $(d, e)$ in the network. The $|\mathrm{In}(t)| \times |\mathrm{Out}(t)|$ matrix $K_t = [k_{d,e}]_{d \in \mathrm{In}(t),\, e \in \mathrm{Out}(t)}$ is called the local encoding kernel at node $t$.
Definition 2 (see [16] (Global Description of a Linear Network Code)). An $\omega$-dimensional linear network code on an acyclic network over a base field $F_q$ consists of a scalar $k_{d,e}$ for every adjacent pair of channels $(d, e)$ in the network as well as a column $\omega$-vector $f_e$ for every channel $e$ such that (1) $f_e = \sum_{d \in \mathrm{In}(t)} k_{d,e} f_d$ for $e \in \mathrm{Out}(t)$; (2) the vectors $f_e$ for the $\omega$ imaginary channels $e \in \mathrm{In}(s)$ form the standard basis of $F_q^{\omega}$.
The vector $f_e$ is called the global encoding kernel for channel $e$.
For convenience of decoding, the global encoding kernels are carried in the headers of the packets during transmission.
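The recursion in Definition 2 can be illustrated with a minimal Python sketch. The toy network, the channel names, the coefficient values, and the restriction to a prime field are all assumptions made for illustration; they are not taken from the paper.

```python
# Sketch: compute global encoding kernels f_e = sum_{d in In(t)} k_{d,e} f_d
# over a prime field F_q (q prime is an assumption made for simplicity).
q = 5
omega = 2

# Hypothetical toy network. "i1" and "i2" are the imaginary channels entering
# the source. Each real channel maps its upstream channels d to k_{d,e}.
local_coeffs = {
    "s->t": {"i1": 1, "i2": 0},
    "s->u": {"i1": 0, "i2": 1},
    "t->x": {"s->t": 1},
    "u->x": {"s->u": 1},
    "x->y": {"t->x": 1, "u->x": 1},
}

def global_kernels(local_coeffs, omega, q):
    # Imaginary channels carry the standard basis of F_q^omega.
    f = {f"i{j + 1}": [1 if r == j else 0 for r in range(omega)]
         for j in range(omega)}
    # The dict is listed in upstream-to-downstream order, so one pass suffices.
    for e, preds in local_coeffs.items():
        f[e] = [sum(k * f[d][r] for d, k in preds.items()) % q
                for r in range(omega)]
    return f

kernels = global_kernels(local_coeffs, omega, q)
```

In this toy network the channel `x->y` carries the sum of both source symbols, so its global encoding kernel combines the two standard basis vectors.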

All-Or-Nothing Transform.
In [17], Rivest presented a mode of encryption for block ciphers called the all-or-nothing transform (AONT for short). In [18], the AONT is defined for information-theoretic security. In detail, let $F_q$ be a finite field and $F_q^n$ be the $n$-dimensional space over $F_q$. Suppose that $\phi : F_q^n \to F_q^n$. Then $\phi$ is called an $(n, q)$-AONT if it satisfies the following properties: (1) $\phi$ is a bijection; (2) if any $n-1$ of the $n$ output values $y_1, y_2, \ldots, y_n$ are fixed, then the value of any one input value $x_i$ ($1 \le i \le n$) is completely undetermined. From this definition, for input vectors $v_1, \ldots, v_n$ (a basis of the space $F_q^n$) and the corresponding output vectors $u_1, \ldots, u_n$, we have the following result.
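Theorem 1. For any $(n, q)$-AONT $\phi : F_q^n \to F_q^n$ applied to input vectors $v_1, \ldots, v_n$ with output vectors $u_1, \ldots, u_n$, if any $n-1$ of the $n$ output vectors are fixed, then any input vector $v_j$ is completely undetermined.
Proof. Write $v_j^T = (v_{j,1}, v_{j,2}, \ldots)$ and $u_j^T = (u_{j,1}, u_{j,2}, \ldots)$, where $v^T$ denotes the transpose of vector $v$, so that $\phi$ maps the $i$th components $(v_{1,i}, \ldots, v_{n,i})$ to $(u_{1,i}, \ldots, u_{n,i})$. For any $i$, by the definition of AONT, if any $n-1$ of the $n$ outputs $u_{1,i}, u_{2,i}, \ldots, u_{n,i}$ are fixed, then the value of any input $v_{1,i}, v_{2,i}, \ldots, v_{n,i}$ is completely undetermined. Therefore, when any $n-1$ of the $n$ output vectors $u_1, \ldots, u_n$ are fixed, any input vector $v_j$ is completely undetermined.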
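If an $(n, q)$-AONT $\phi : F_q^n \to F_q^n$ is also $F_q$-linear, $\phi$ is called a linear all-or-nothing transform. The linear AONT is very useful for constructing secure linear network coding because of its low computational complexity and its convenience for decoding. In [18], Stinson proved that for any prime power $q > 2$ and positive integer $n$ there exists a linear $(n, q)$-AONT. Moreover, he constructed the following linear AONT. Let $q = p^k$, where $p$ is prime and $k$ is a positive integer, and let $\lambda \in F_q$ be such that $\lambda \notin \{n-1 \bmod p,\ n-2 \bmod p\}$. Then the linear function $\phi : F_q^n \to F_q^n$ defined by $\phi(x) = x T_S$, with
$$T_S = \begin{pmatrix} 1 & 0 & \cdots & 0 & 1 \\ 0 & 1 & \cdots & 0 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 1 \\ 1 & 1 & \cdots & 1 & \lambda \end{pmatrix},$$
is a linear $(n, q)$-AONT. We call $T_S$ an $(n, q)$-AONT matrix. This transform (and the inverse transform) can be implemented very efficiently. The inverse is
$$T_S^{-1} = \begin{pmatrix} 1-\gamma & -\gamma & \cdots & -\gamma & \gamma \\ -\gamma & 1-\gamma & \cdots & -\gamma & \gamma \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ -\gamma & -\gamma & \cdots & 1-\gamma & \gamma \\ \gamma & \gamma & \cdots & \gamma & -\gamma \end{pmatrix}, \quad (3)$$
where $\gamma = (n-1-\lambda)^{-1}$.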
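The construction can be sketched in Python for the special case where $q = p$ is prime. The matrix layout used here (identity block, last row and column of ones, $\lambda$ in the corner) is inferred from the condition on $\lambda$ and from the pseudopackets $u_1 = v_1 + v_2$, $u_2 = v_1 + 3v_2$ in Figure 3, so it should be read as an assumption. The function names are hypothetical.

```python
def aont_matrix(n, lam, p):
    """Assumed sparse AONT matrix T_S over F_p: identity block,
    ones in the last row and column, lam in the corner."""
    assert lam % p not in {(n - 1) % p, (n - 2) % p}
    T = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        T[i][n - 1] = 1
        T[n - 1][i] = 1
    T[n - 1][n - 1] = lam % p
    return T

def aont_inverse(n, lam, p):
    """Closed-form inverse with gamma = (n - 1 - lam)^(-1) mod p."""
    gamma = pow((n - 1 - lam) % p, -1, p)  # modular inverse, Python 3.8+
    inv = [[(-gamma) % p for _ in range(n)] for _ in range(n)]
    for i in range(n - 1):
        inv[i][i] = (1 - gamma) % p
        inv[i][n - 1] = gamma
        inv[n - 1][i] = gamma
    return inv

def matmul(A, B, p):
    """Matrix product modulo p."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

T = aont_matrix(3, 3, 5)
T_inv = aont_inverse(3, 3, 5)
product = matmul(T, T_inv, 5)
```

Every entry of the inverse is nonzero exactly when $\gamma \notin \{0, 1\}$, which is what the condition on $\lambda$ guarantees. This is why each input symbol depends on every output symbol.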

Main Schemes
In this section, we present schemes that achieve different levels of security. Suppose that $\omega$ is the source rate. Each packet is represented by a vector in a linear space over $F_q$. The output packets of an AONT are called pseudopackets. In our schemes, the AONT, the hash function, and the cryptosystem are public. The only shared secret is the encryption key when a symmetric cryptosystem is used.

Against Wiretapping Attack.
In wireless networks, because of the broadcast nature of the wireless interface, we cannot determine which edges can be eavesdropped. So we cannot obtain the same secure communication on wireless networks if we use the scheme in [9] against wiretapping attacks. For example, consider the wireless network shown in Figure 1(a). From the wireless network model in Section 2.1, we obtain its equivalent graph model shown in Figure 1(b).
For this wireless network, the scheme in [9] is neither efficient nor secure enough against wiretapping. In detail, the scheme in [9] for the network in Figure 1 is shown in Figure 2(a); it is secure only against a prescribed collection of sets of wiretap edges. In reality, however, the wiretapper may be able to eavesdrop on all the network links because of the broadcast nature of the wireless interface. Then the previous schemes are not secure enough in practical applications, and cryptographic approaches are required to address this problem. In fact, by combining an AONT with symmetric cryptography, we can construct secure network coding in the sense of cryptographic security, without constraints on the wiretapping sets. That means the wiretapper cannot obtain any message if he does not have the secret key. Our secure network coding is presented as follows.
Scheme 1. Let $v_1, \ldots, v_\omega$ be $\omega$ packets, where $v_i \in F_q^{\omega}$. An $(\omega, q)$-AONT matrix $T_S$ is the local encoding kernel of the source node $s$.
Step 1. Let $(u_1, \ldots, u_\omega) = (v_1, \ldots, v_\omega) T_S$. The source node encrypts $u_\omega$ using the AES cryptosystem (the source can also choose another high-speed asymmetric cryptosystem; the only secret in this scheme is the private key owned by the sender and receiver) and sends out $u_1, \ldots, u_{\omega-1}, c$, where $c = E_{\mathrm{AES}}(u_\omega)$.
Step 2. Based on Jaggi's construction of network coding [19] for wired networks and Rajawat's [20] for wireless networks, we can construct the codes for the intermediate nodes in wired networks and wireless networks, respectively.
Step 3. Each sink node first decodes the received packets to get $u_1, \ldots, u_{\omega-1}, c$, then decrypts $c$ to obtain $u_\omega$. By applying the inverse of $T_S$, it recovers the original packets $v_1, \ldots, v_\omega$.
Time Complexity Analysis. Since $T_S$ and its inverse are both of order $\omega$, the time complexity of multiplying by $T_S$ or $T_S^{-1}$ is at most $O(\omega^3)$. In addition, there are two further operations, encryption and decryption. So the extra time complexity of this construction over those of Jaggi's and Rajawat's is $O(\omega^3)$ plus the time for encryption and decryption.
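The source and sink sides of Scheme 1 can be sketched in Python for $\omega = 2$ over $F_5$. Several things here are assumptions made only to keep the sketch self-contained: the matrix $T_S$ is taken to match $u_1 = v_1 + v_2$, $u_2 = v_1 + 3v_2$ from Figure 3, the network itself is omitted (the sink receives exactly what the source sends), and a SHA-256 keystream stands in for AES. The stand-in cipher is a toy and not a secure replacement.

```python
import hashlib

P = 5                       # toy prime field F_5 (assumption)
T_S = [[1, 1], [1, 3]]      # assumed AONT matrix for n = 2, lambda = 3
T_S_INV = [[4, 2], [2, 3]]  # its inverse modulo 5

def apply_matrix(packets, M, p=P):
    """(w_1..w_n) = (packets_1..packets_n) M, i.e. w_j = sum_i M[i][j] * packets_i."""
    n, length = len(packets), len(packets[0])
    return [[sum(M[i][j] * packets[i][s] for i in range(n)) % p
             for s in range(length)] for j in range(n)]

def keystream(key, length, p=P):
    """Hypothetical stand-in for AES: hashed counter blocks reduced mod p."""
    out, counter = [], 0
    while len(out) < length:
        block = hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        out.extend(b % p for b in block)
        counter += 1
    return out[:length]

def encrypt(packet, key, p=P):
    return [(x + k) % p for x, k in zip(packet, keystream(key, len(packet), p))]

def decrypt(cipher, key, p=P):
    return [(x - k) % p for x, k in zip(cipher, keystream(key, len(cipher), p))]

def source_send(packets, key):
    # Step 1: apply the AONT, then encrypt only the last pseudopacket.
    u = apply_matrix(packets, T_S)
    return u[:-1] + [encrypt(u[-1], key)]

def sink_receive(received, key):
    # Step 3: decrypt the last packet, then invert the AONT.
    u = received[:-1] + [decrypt(received[-1], key)]
    return apply_matrix(u, T_S_INV)

key = b"shared secret"
original = [[1, 2], [3, 4]]
sent = source_send(original, key)
recovered = sink_receive(sent, key)
```

Only one of the $\omega$ pseudopackets passes through the cipher, which is the source of the time savings claimed for the scheme.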
Security Analysis. Since the network coding in this paper is linear, all of the coding operations in the network are linear, and the packets in the network are linear combinations of $u_1, \ldots, u_{\omega-1}, c$. On the one hand, if the rank of the packets that an adversary eavesdrops is less than $\omega$, he can get only some (not all) of the packets $u_1, \ldots, u_{\omega-1}, c$. By the definition of AONT, he cannot obtain any original packet $v_i$. On the other hand, even if the rank of the eavesdropped packets is equal to $\omega$, the wiretapper cannot get the pseudopacket $u_\omega$ without the private key. So, by Theorem 1, he cannot obtain any original packet either.
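The AONT property used in this argument can be checked exhaustively on a small instance. The sketch below assumes $n = 3$, $\lambda = 3$, and the field $F_5$, with the inverse matrix of the assumed sparse $T_S$ hard-coded. It verifies that whenever any $n-1$ output symbols are fixed, every input symbol still ranges over all of $F_5$ as the remaining output varies.

```python
from itertools import product

p, n = 5, 3
# Inverse of the assumed T_S = [[1,0,1],[0,1,1],[1,1,3]] modulo 5.
T_INV = [[2, 1, 4], [1, 2, 4], [4, 4, 1]]

def inputs_from_outputs(y):
    """Recover x from y via x = y * T_INV (row-vector convention)."""
    return [sum(y[j] * T_INV[j][i] for j in range(n)) % p for i in range(n)]

def inputs_undetermined():
    """True if fixing any n-1 outputs leaves every input fully undetermined."""
    for free in range(n):                           # index of the unknown output
        for fixed in product(range(p), repeat=n - 1):
            seen = [set() for _ in range(n)]
            for val in range(p):                    # try each value of the unknown
                y = list(fixed[:free]) + [val] + list(fixed[free:])
                for i, x in enumerate(inputs_from_outputs(y)):
                    seen[i].add(x)
            if any(len(s) != p for s in seen):
                return False
    return True

result = inputs_undetermined()
```

The check succeeds because every entry of the inverse matrix is nonzero, so the single unknown output shifts each input through all field values.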
In this model, we do not need to encrypt all the transmitted packets. (In [18], all pseudopackets are encrypted, because by the definition of AONT an adversary must then decrypt all the blocks of ciphertext to determine any block of plaintext. The attack is thus slowed down without any change in the size of the secret key, so the AONT affords a certain amount of additional security for a block cipher.) Encrypting only one pseudopacket is enough when combined with the AONT. By Theorem 1, each original packet depends on all the pseudopackets. When we encrypt one of the pseudopackets, the wiretapper cannot get all of the pseudopackets without the private key, so he cannot obtain any original packet. For example, in Figure 2(b), we only need to encrypt $u_1$ or $u_2$; then the wiretapper cannot get any meaningful information about $v_1$ and $v_2$. The security here combines information-theoretic security with cryptographic security. However, by the wooden barrel principle (a system is only as strong as its weakest part), the overall security of this scheme reduces to cryptographic security.
Now we show the advantages of using an AONT as the local encoding kernel of the source node. First, from the information-theoretic point of view, we not only increase the achievable throughput but also obtain secure transmission. Second, from the cryptographic point of view, we need to encrypt only one packet leaving the source node instead of encrypting all the packets sent to the sink nodes. Moreover, we save a great deal of time, as shown in Table 1, where $\omega$ denotes the source rate, the length of each packet is 2 bytes, and "clk" abbreviates clock.

Byzantine Modification Detection.
Since important packets modified by an adversary will mislead the receivers and may cause them to make wrong decisions, detecting modification of the transmitted packets is also very important, alongside modification correction, in both wireless and wired networks. In this subsection, we present a scheme to detect Byzantine modification by combining an AONT with a simple polynomial hash function.
By the definition of AONT, if one of the pseudopackets from the source node is damaged, then it is likely that every recovered packet will be damaged. This is the error-propagation property of AONT. So we can append a suitable block of redundancy to each packet before applying the AONT. This redundancy can be used to verify the integrity of the packets and can be removed after decoding.
Suppose the source $s$ multicasts $\omega$ vector packets $v_1, v_2, \ldots, v_\omega$ to the sink nodes. For convenience, each packet in the network is represented by a column vector of $d+1$ ($d \ge \omega$) symbols over a finite field $F_q$, where the first $d$ entries are data symbols and the last one is a redundant hash symbol. The hash symbol in each augmented packet is given by a hash function $\psi : F_q^d \to F_q$ of the data symbols. In principle, any nonlinear hash function can be chosen. In fact, we find that security can be ensured by a simple nonlinear function, which we take as the secure hash function in this scheme.
Let $\psi : F_q^d \to F_q$ be this function, mapping the data symbols $(x_1, \ldots, x_d)$ to a single hash symbol. Denote the augmented packets by $\tilde{v}_i = v_i \,\|\, h_i$, $1 \le i \le \omega$, where $x \,\|\, y$ denotes the concatenation of two vectors $x$ and $y$, and $h_i$ is the hash symbol satisfying $h_i = \psi(v_i)$. Now we give a brief description of our scheme.
Scheme 2. Initialization: For each original packet $v_i$, $1 \le i \le \omega$, the source $s$ calculates the hash value $h_i$ and obtains the augmented packet $\tilde{v}_i$ by concatenating $h_i$ to $v_i$.
Step 1. The source $s$ takes the AONT matrix $T_S$ as its local encoding kernel and computes $(\tilde{u}_1, \ldots, \tilde{u}_\omega) = (\tilde{v}_1, \ldots, \tilde{v}_\omega) T_S$.
Step 2. Based on Jaggi's construction of network coding for wired networks and Rajawat's for wireless networks, we can construct the codes for the intermediate nodes.
Step 3. Each sink node first decodes the received packets and gets the (possibly modified) augmented packets $\hat{v}_1, \ldots, \hat{v}_\omega$. (Since this scheme considers not wiretapping but the integrity of the packets, no pseudopacket from the source needs to be encrypted. Note that $T_S$ is the local encoding kernel of the source $s$, so the sink can decode directly and get these packets.) Then it verifies whether the hash symbol of each $\hat{v}_i$ equals $\psi$ applied to its data symbols. If this holds for all $i$, then no modification occurred during transmission and $\hat{v}_i = \tilde{v}_i$, $1 \le i \le \omega$. Finally, the sink removes the hash values and obtains the original packets $v_1, \ldots, v_\omega$.
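The three steps of Scheme 2 can be sketched in Python for $\omega = 2$ and $d = 2$ over $F_5$. The hash function used here, the sum of squares of the data symbols, is a hypothetical choice of a simple nonlinear $\psi$ and is not claimed to be the function used in the paper. The matrix $T_S$ is assumed to match Figure 3, and the network between source and sink is omitted.

```python
P = 5                       # toy prime field F_5 (assumption)
T_S = [[1, 1], [1, 3]]      # assumed AONT matrix for n = 2, lambda = 3
T_S_INV = [[4, 2], [2, 3]]  # its inverse modulo 5

def psi(data, p=P):
    """Hypothetical nonlinear hash: sum of squares of the data symbols mod p."""
    return sum(x * x for x in data) % p

def apply_matrix(packets, M, p=P):
    """w_j = sum_i M[i][j] * packets_i, computed symbol by symbol mod p."""
    n, length = len(packets), len(packets[0])
    return [[sum(M[i][j] * packets[i][s] for i in range(n)) % p
             for s in range(length)] for j in range(n)]

def augment(packets):
    # Initialization: append the hash symbol to each original packet.
    return [v + [psi(v)] for v in packets]

def verify(augmented):
    # Step 3: each hash symbol must match psi of the data symbols.
    return all(psi(v[:-1]) == v[-1] for v in augmented)

original = [[1, 1], [2, 0]]
pseudo = apply_matrix(augment(original), T_S)    # Step 1 at the source
clean = apply_matrix(pseudo, T_S_INV)            # sink, no attack

tampered = [row[:] for row in pseudo]
tampered[0][0] = (tampered[0][0] + 1) % P        # adversary alters one data symbol
damaged = apply_matrix(tampered, T_S_INV)        # sink, after the attack
```

In this run a single altered symbol in one pseudopacket corrupts both recovered packets, which illustrates the error-propagation property, and the hash check fails for both.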
Time Complexity Analysis. This scheme is similar to Scheme 1. The differences are the additional calculation of the hash symbols in Step 1 and their verification in Step 3. The time complexity of these two operations is $O(\omega^2)$. So the time complexity of Steps 1 and 3 is polynomial in the length $\omega$ of the packet vector and equal to $O(\omega^3)$. So the total time complexity of this secure network coding construction is only $O(\omega^3)$ more than that of Jaggi's for wired networks and Rajawat's for wireless networks.
Security Analysis. In the model above, an adversary successfully modifies a packet only if he can construct a consistent hash symbol after modifying the data symbols (in fact, what he modifies are the pseudopackets). The following theorem shows that an adversary can construct such a hash symbol only with very low probability.
Theorem 2. In Scheme 2, the probability of not detecting an error is at most $(1/q)^{\omega}$, where $\omega$ is the source rate.
To prove this theorem, we first prove the following two lemmas.
Lemma 3. Given a nonzero vector $a \in F_q^n$ and a scalar $c \in F_q$, the probability that a randomly chosen vector $v \in F_q^n$ satisfies $v \cdot a = c$ is $1/q$.
Proof. The number of points on the hyperplane $\{v \in F_q^n \mid v \cdot a = c\}$ is $q^{n-1}$, and the cardinality of the space $F_q^n$ is $q^n$. So the probability of choosing a vector $v$ such that $v \cdot a = c$ is $q^{n-1}/q^n = 1/q$.
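The counting argument in this proof can be confirmed by brute force on a small case. The sketch below assumes a prime field size and a nonzero vector $a$; the function name and the sample values are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def hyperplane_probability(a, c, q):
    """Fraction of vectors v in F_q^n with v . a = c (q prime, a nonzero)."""
    n = len(a)
    hits = sum(1 for v in product(range(q), repeat=n)
               if sum(x * y for x, y in zip(v, a)) % q == c % q)
    return Fraction(hits, q ** n)

prob = hyperplane_probability([1, 2, 3], 4, 5)
```

Enumerating all $5^3$ vectors gives exactly $5^2$ solutions, in agreement with the $q^{n-1}$ count used in the proof.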

Lemma 4. The probability of randomly choosing a vector $v \in F_q^n$ such that
$$v \cdot a_i = c_i, \quad 1 \le i \le n, \quad (8)$$
is at most $(1/q)^n$, where the vectors $a_i$ and scalar values $c_i$, $1 \le i \le n$, are fixed and independent.
Proof. By Lemma 3, a randomly chosen vector $v \in F_q^n$ satisfies $v \cdot a_i = c_i$ with probability $1/q$ for each $i$. Then the probability of choosing a vector $v \in F_q^n$ that satisfies (8) is at most $(1/q)^n$.
Now we prove Theorem 2. Let $\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_\omega$ be the $\omega$ packet vectors transmitted, each of which is a column vector of $d+1$ symbols from a finite field $F_q$. The first $d$ entries are data symbols and the last one is the redundant hash symbol. Each packet can be represented as $\tilde{v}_i = v_i \,\|\, h_i$, where $v_i$ denotes the data and $h_i = \psi(v_i)$ is the hash symbol. The matrix $T_S$ is the local encoding kernel of the source node, and let $(\tilde{u}_1, \ldots, \tilde{u}_\omega) = (\tilde{v}_1, \ldots, \tilde{v}_\omega) T_S$. So the hash symbols satisfy $(h_{u_1}, \ldots, h_{u_\omega}) = (h_1, \ldots, h_\omega) T_S$, where $h_{u_i}$ denotes the hash symbol of $\tilde{u}_i$. Therefore, each $h_{u_i}$ is a linear combination of $h_1, \ldots, h_\omega$ determined by $T_S$. Notice that the hash function is not a linear function; that is, $h_{u_i} \ne h_{v_i + v_\omega}$ in general. When the adversary modifies some pseudopacket $u_i$, he has to modify the hash symbol $h_{u_i}$ so that the sink nodes cannot detect the modification. The proof is completed in two steps.
Step 1. Suppose that only the first pseudopacket $u_1$ is modified, and denote the new pseudopacket by $u'_1$. Let $u'_1 = u_1 + \Delta u_1$ and $h'_{u_1} = h_{u_1} + \Delta h_{u_1}$, where $\Delta u_1$ is known to the adversary.
Thirdly, when the adversary modifies more data symbols of $u_1$, a similar argument shows that the probability of constructing consistent hash symbols is at most $(1/q)^{\omega}$.
Step 2. When the adversary modifies several pseudopackets $u_i$ at the same time, a similar argument shows that the probability of constructing consistent hash symbols is no more than $(1/q)^{\omega}$.
From the proof of Theorem 2, we have the following two corollaries.
Corollary 1. The probability of not detecting an error depends neither on the number of modified packets nor on the number of modified symbols in a pseudopacket, but only on the cardinality of $F_q$ and the source rate.
Corollary 2. If the redundant hash symbol in the packet is a constant or a linear function of the data symbols, then the scheme cannot defend against Byzantine modification.
Proof. First, suppose the hash symbol is a constant. The adversary modifies only the data symbols and keeps the hash symbols unchanged. Then the receivers cannot detect the modification.
Second, suppose the hash symbol is a linear function of the data symbols. Then each change $\Delta h_i$ in a hash symbol is the same linear function of the corresponding change $\Delta v_i$ in the data symbols. The changes $\Delta u_j$ are known to the attackers. So, by the relationship between $\Delta v_i$ and $\Delta u_j$, the $\Delta h_i$ are easily calculated, and this scheme cannot defend against Byzantine modification.
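The second case of this proof can be made concrete with a small Python sketch. It assumes $\omega = 2$, the field $F_5$, a matrix $T_S$ matching Figure 3, and a hypothetical linear hash (the sum of the data symbols). The adversary adds a known change to the data symbols of one pseudopacket and the matching linear change to its hash symbol.

```python
P = 5                       # toy prime field F_5 (assumption)
T_S = [[1, 1], [1, 3]]      # assumed AONT matrix for n = 2, lambda = 3
T_S_INV = [[4, 2], [2, 3]]  # its inverse modulo 5

def linear_hash(data, p=P):
    """Hypothetical linear hash: plain sum of the data symbols mod p."""
    return sum(data) % p

def apply_matrix(packets, M, p=P):
    """w_j = sum_i M[i][j] * packets_i, computed symbol by symbol mod p."""
    n, length = len(packets), len(packets[0])
    return [[sum(M[i][j] * packets[i][s] for i in range(n)) % p
             for s in range(length)] for j in range(n)]

def verify(augmented):
    return all(linear_hash(v[:-1]) == v[-1] for v in augmented)

original = [[1, 1], [2, 0]]
augmented = [v + [linear_hash(v)] for v in original]
pseudo = apply_matrix(augmented, T_S)

# The adversary picks a change to the data symbols of the first pseudopacket
# and adds the hash of that change to its hash symbol.
delta = [1, 0]
forged = [row[:] for row in pseudo]
forged[0] = [(x + dx) % P for x, dx in zip(pseudo[0][:-1], delta)] \
            + [(pseudo[0][-1] + linear_hash(delta)) % P]

received = apply_matrix(forged, T_S_INV)
undetected = verify(received)
```

Because the transform and the hash are both linear, the forged packets stay consistent after the inverse transform, so the sink accepts data that differs from what was sent.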

Against Wiretapping and Byzantine Attacks.
Further, by combining it with Scheme 1, we can extend Scheme 2 to resist wiretapping attacks as well. Before sending out $\tilde{u}_i$, $i = 1, \ldots, \omega$, the source encrypts the last packet $\tilde{u}_\omega$ and denotes the encrypted packet by $c$. The aim is to prevent the wiretapper from recovering any original packet.
Scheme 3 provides not only security but also authenticity.
Scheme 3. Initialization: For each packet $v_i$, $1 \le i \le \omega$, the source calculates the hash value $h_i$ and obtains the augmented packet $\tilde{v}_i$ by concatenating $h_i$ to the original packet $v_i$.
Step 1. The source takes $T_S$ as its local encoding kernel, computes $(\tilde{u}_1, \ldots, \tilde{u}_\omega) = (\tilde{v}_1, \ldots, \tilde{v}_\omega) T_S$, and encrypts $\tilde{u}_\omega$ using the AES cryptosystem to get $c = E_{\mathrm{AES}}(\tilde{u}_\omega)$. (Here we use a symmetric cryptosystem because, with an asymmetric cryptosystem, an adversary who controls $\omega$ edge-disjoint paths could use the public key to modify all the pseudopackets at the same time.) Then it sends out $\tilde{u}_1, \ldots, \tilde{u}_{\omega-1}, c$.
Step 2. Based on Jaggi's construction of network coding for wired networks and Rajawat's for wireless networks, we can construct the codes for the intermediate nodes.
Step 3. Each sink node first decodes the received packets and gets $\hat{u}_1, \ldots, \hat{u}_{\omega-1}, \hat{c}$, then gets $\hat{u}_\omega = D_{\mathrm{AES}}(\hat{c})$ by decrypting $\hat{c}$. It then verifies the hash symbols; if the check holds for all $i$, no modification occurred during transmission. The sink nodes obtain the original packets $v_1, \ldots, v_\omega$ by applying $T_S^{-1}$.
These three schemes are based on Jaggi's construction for wired networks and Rajawat's construction for wireless networks. We can also use Ho's random network coding [21]. In Scheme 2, the only change is to choose the local encoding kernels randomly from a large finite field. In Schemes 1 and 3, in addition to this change, each packet from the source is appended with an $\omega$-dimensional unit vector, its global encoding kernel, before being sent out. However, random network coding for wireless networks requires a large alphabet size to make the network robust to link failures.
Example 5. We construct a secure network code on the wireless network in Figure 1(a) to detect Byzantine modifications. Suppose the base field is $F_5$. Let $\lambda = 3 \notin \{0, 1 \bmod 5\}$. Then $\gamma = (2 - 1 - 3)^{-1} = 2$. So the AONT matrix $T_S$, which is used to encode the two original packets at the source node $s$, is
$$T_S = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}.$$
Suppose the packets $v_1$ and $v_2$ will be sent to the sink nodes $y$ and $z$. The two encoded packets (pseudopackets) from the source are $u_1 = v_1 + v_2$ and $u_2 = v_1 + 3v_2$. Then the pseudopackets transmitted on the edges are shown in Figure 3. When an adversary wiretaps any one of the sets in $A = \{\{(t, y)\}, \{(t, x)\}, \{(u, x)\}, \{(x, y), (x, z)\}, \{(u, z)\}\}$, he cannot get any meaningful information about packet $v_1$ or $v_2$. If we encrypt the packet $u_2$, then the adversary gets nothing even when he wiretaps all the channels.
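The arithmetic of this example can be checked by hand. The following worked computation is a sketch: it takes $T_S$ to be the matrix for which $u_1 = v_1 + v_2$ and $u_2 = v_1 + 3v_2$, as in Figure 3, and assumes that for $n = 2$ the inverse has entries $1-\gamma$, $\gamma$, $\gamma$, $-\gamma$. That form is derived here rather than quoted from the paper.

```latex
\[
\begin{aligned}
\gamma &= (2-1-3)^{-1} = (-2)^{-1} = 3^{-1} = 2 \pmod 5, \\
T_S^{-1} &= \begin{pmatrix} 1-\gamma & \gamma \\ \gamma & -\gamma \end{pmatrix}
          = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix} \pmod 5, \\
T_S\,T_S^{-1} &= \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}
                 \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}
               = \begin{pmatrix} 6 & 5 \\ 10 & 11 \end{pmatrix}
               \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod 5, \\
v_1 &= 4u_1 + 2u_2, \qquad v_2 = 2u_1 + 3u_2 .
\end{aligned}
\]
```

Both coefficients in each recovery formula are nonzero, so a sink (or a wiretapper) needs both pseudopackets to recover either original packet.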

Conclusion
For secure transmission, if only the information-theoretic approach is used, some bandwidth has to be given up or a high computational complexity is necessary. With the cryptographic approach, all the packets have to be encrypted against wiretapping. Even if the data is hashed and appended with its hash value, modifications may go undetected when the adversary modifies the data and its hash value simultaneously. To address these problems, we combine the information-theoretic approach with the cryptographic approach to design secure network coding. On the one hand, we do not give up any network capacity to achieve the same security as that of Cai and Yeung. More importantly, unlike the scheme of Cai and Yeung, our Scheme 1 does not require any restrictions on the wiretapping sets. This means that our secure network coding is suitable for both wired and wireless networks. On the other hand, we reduce the resource consumption of encryption and decryption. Furthermore, based on a simple hash function, our Scheme 2 is designed to detect Byzantine modification. It achieves a high detection probability with only one hash symbol and low computational complexity. Finally, combining the two schemes above, we propose Scheme 3, which provides not only security but also authenticity.

Figure 1: (a) Source $s$ wants to send two messages to sink nodes $y$ and $z$. Nodes $t$ and $u$ are within the coverage of source $s$, $x$ is within the coverage of $t$ and $u$, and terminals $y$ and $z$ are within the coverage of $x$. The region bounded by the dashed lines denotes the signal coverage of the broadcast node. (b) The equivalent graph model of (a).

Figure 2: (a) The source node $s$ sends $v$ to sink nodes $y$ and $z$. For security, Cai et al. add an independent random packet $k$ to $v$. (b) At source node $s$ we transform the packets $(v_1, v_2)$ into $(u_1, u_2)$ by a linear AONT $T_S$ and then send the pseudopackets $u_1$ and $u_2$ to the sink nodes.

Figure 3: The source node s transforms the packets v 1 and v 2 by T S , and sends the outputs u 1 = v 1 + v 2 and u 2 = v 1 + 3v 2 to sink nodes y and z.

Table 1: The time consumptions of different encryption models.