Research on a correction algorithm for propagation errors in wireless sensor network coding

Error correction in random network coding is very difficult, especially when the number of errors exceeds the min-cut of the network. In this paper, we combine a small finite field with rank-metric codes to address this problem. With a small finite field, the original errors are compressed into propagated errors whose number is smaller than the min-cut. Rank-metric codes are introduced to correct the propagated errors, while the minimum rank distance of the rank-metric code is hardly affected by the small field. To the best of our knowledge, ours is the first method to correct more errors than the min-cut in network coding by using a small field. This new error-correcting algorithm is well suited to environments, such as wireless sensor networks, where network coding can be applied.

List decoding of rank-metric codes [6] can correct nearly C errors when the transmission rate R is close to zero. In order to correct more errors, Guo et al. [7] introduced a nonlinear operation into network coding and proved that the transmission rate can exceed (C − t) with their method. However, we note that no concrete construction or corresponding decoding algorithm was given in that paper, and the operations in the method have exponential complexity. So far, few practical methods in the related literature can correct more than C errors. In real communication environments, however, the number of original errors is often larger than the min-cut C, since this number is directly proportional to the total number of links in the network.
In random network coding, the transport process can be described by Y = T·G·u + T_{Z→Y}·Z, where u is the original message to be transmitted. At the source node of a multicast network, u is encoded into G·u by a code Ω with generator matrix G. Then G·u is sent into the network, where it encounters a transfer matrix T that represents the effect of network coding on G·u. Simultaneously, the errors Z occurring on the links encounter their own transfer matrix T_{Z→Y}, so that T_{Z→Y}·Z is injected into the messages Y received at the sink node. In the existing NEC models, enough packets must be collected to guarantee that T is a full-rank matrix. By multiplying both sides of the equation by T^{-1} and then decoding with the code Ω and its generator matrix G, we can recover the original message u. We denote the original error by Z; under the effect of network coding, it appears in several transformed forms, such as T^{-1}·T_{Z→Y}·Z and T_{Z→Y}·Z, which are called "propagated errors." In essence, the existing NEC methods aim to compress the number of propagated errors. For deterministic networks [2-4], the Hamming weight of the propagated error T^{-1}·T_{Z→Y}·Z is compressed to the Hamming weight of the original error Z; that is, the spreading effect of the original error Z is compressed by the NEC method. For a random network, however, it is difficult to compress the spread of the original error from the perspective of the Hamming metric. From the viewpoint of the rank metric, in contrast, the rank of the propagated error T^{-1}·T_{Z→Y}·Z is no more than the rank of the original error Z, so the spread of errors in network coding is compressed naturally [5]. Once list decoding is introduced, the range of correctable numbers of original errors extends from the interval (0, C/2) to (C/2, C); see [5] or [6] for a reference. To guarantee that T is a full-rank matrix, the size of the finite field must be big enough [6]. The rank of the propagated error T^{-1}·T_{Z→Y}·Z cannot be compressed below C when the rank of the original error Z is bigger than C. Even though list decoding of rank-metric codes [6] has a strong decoding ability, by the basic principles of coding theory it still cannot correct the propagated errors when the rank of T^{-1}·T_{Z→Y}·Z is not smaller than C.
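As a sanity check on this rank-compression property, here is a minimal Python sketch (all parameters hypothetical) that draws a random original error Z over F_{2^m}, viewed as a t × m binary matrix, and verifies that the rank of the propagated error T_{Z→Y}·Z over F_2 never exceeds the rank of Z, no matter how many non-zero components Z has:

```python
import numpy as np

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M = M.copy()
    r, (rows, cols) = 0, M.shape
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]
        r += 1
    return r

rng = np.random.default_rng(1)
C, t, m = 8, 160, 8        # hypothetical min-cut, number of error links, extension degree
Z = rng.integers(0, 2, (t, m))      # original error over F_{2^m}, as a t x m binary matrix
T_zy = rng.integers(0, 2, (C, t))   # transfer matrix T_{Z->Y} of the error links, over F_2
prop = (T_zy @ Z) % 2               # propagated error T_{Z->Y}.Z, a C x m matrix
# rank(A.B) <= min(rank(A), rank(B)), so the spread of Z is compressed:
print(gf2_rank(prop), "<=", min(gf2_rank(T_zy), gf2_rank(Z)))
```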
In this paper, we propose a new method that can correct more than C errors. In our experiments, we verify that if the size of the finite field used for network coding is smaller than a threshold, the rank of T_{Z→Y}·Z can be smaller than C. Unfortunately, T is then usually not full rank, so we cannot use the decoding algorithm of the code Ω with generator matrix G to decode u from the corresponding decoding equation. Since T is not full rank over a small field, we instead decode u directly from the equation Y = T·G·u + T_{Z→Y}·Z, using the new code Ω' with generator matrix T·G in place of G. Here, the rank of T_{Z→Y}·Z is smaller than C. The only remaining question is how much the minimum rank distance of the code Ω' declines compared with the minimum rank distance of the code Ω. If the minimum distance of Ω' declines too much, we still cannot correct the propagated error T_{Z→Y}·Z with the list-decoding algorithm of rank-metric codes, just as in [7]. Fortunately, our experiments in Section 5 show that the minimum distance of T·G is very close to the minimum distance of G even when T is not full rank over a small field. In the traditional research paradigm of network coding, a big field is used to ensure that T is full rank. In our work, we instead use a small field to compress the spread of the original error with respect to the rank metric. Even if the number of non-zero components of the original error Z is far larger than C, the rank of the propagated error T_{Z→Y}·Z can still be compressed below C as long as the field is small enough. Our method can not only correct more than C errors in random network coding, but also correct numbers of errors that matter in practical applications of network coding.
The rest of this paper is organized as follows. Section 2 briefly introduces related work, including rank-metric codes and list decoding. Section 3 presents the claims on which our method rests, and Section 4 describes the method itself. Section 5 gives the experimental results together with a theoretical analysis based on combinatorial probability and simulation. Finally, Section 6 concludes the paper.

Related works
In this section, as the background of our work, we first introduce rank-metric codes and the list-decoding technique. Then, we explain a model of NEC in which list decoding of rank-metric codes is involved.

Rank-metric codes
Denote the set of all n × m matrices over F_q by F_q^{n×m}, and suppose a rank-metric code Ω is a subset of F_q^{n×m}. Define the distance between X ∈ F_q^{n×m} and Y ∈ F_q^{n×m} as rank(X − Y). Obviously, we can also regard the matrix X ∈ F_q^{n×m} as a vector x ∈ (F_{q^m})^n over the field F_{q^m}; this means there is a bijection between the vector set (F_{q^m})^n and the matrix set F_q^{n×m}. We further define rank(x) = rank(X).
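This vector/matrix correspondence and the rank distance can be made concrete with a short sketch (toy values, q = 2): each symbol of F_{2^m} is expanded into a length-m bit row, and the distance between two words is the GF(2) rank of the XOR of their matrix forms.

```python
import numpy as np

def to_matrix(x, m):
    """Bijection: a vector over F_{2^m} (ints below 2**m) -> an n x m GF(2) matrix."""
    return np.array([[(s >> j) & 1 for j in range(m)] for s in x])

def gf2_rank(M):
    M = M.copy()
    r, (rows, cols) = 0, M.shape
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]
        r += 1
    return r

def rank_dist(x, y, m):
    """rank(X - Y) over GF(2); in characteristic 2, subtraction is XOR."""
    return gf2_rank(to_matrix([a ^ b for a, b in zip(x, y)], m))

m = 4
x = [0b0011, 0b0101, 0b0110]
y = [0b0001, 0b0001, 0b0000]
# The third difference row is the XOR of the first two, so the distance is 2.
print(rank_dist(x, y, m))
```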

List decoding of rank-metric codes
In the traditional decoding of linear block codes, the solution is unique, and the number of correctable errors is less than half the minimum distance of the code. With list decoding, more errors can be corrected, but the solution of the list-decoding method is not unique. Based on rank codes, Guruswami [6] proposed list decoding of rank-metric codes and pushed the number of correctable errors toward the codeword length as the transmission rate approaches zero. We can use this method, with its strong error-correcting ability, to correct the propagated errors in network coding, where the codeword length is exactly the min-cut C.
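The following brute-force sketch only illustrates the idea of list decoding in the rank metric: it enumerates a hypothetical, randomly drawn codebook and returns every codeword within a given rank radius of the received word. The algorithm of [6] achieves this radius without exhaustive search; this is purely a conceptual illustration.

```python
import numpy as np

def gf2_rank(M):
    M = M.copy()
    r, (rows, cols) = 0, M.shape
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]
        r += 1
    return r

def list_decode(Y, codebook, tau):
    """Return the indices of every codeword within rank distance tau of Y."""
    return [k for k, Cw in enumerate(codebook) if gf2_rank(Y ^ Cw) <= tau]

rng = np.random.default_rng(7)
n, m, K = 6, 6, 16
codebook = [rng.integers(0, 2, (n, m)) for _ in range(K)]   # toy random codebook
sent = codebook[3]
# Inject an error of rank at most 1 (an outer product of two 0/1 vectors).
error = np.outer(rng.integers(0, 2, n), rng.integers(0, 2, m))
Y = sent ^ error
print(list_decode(Y, codebook, tau=2))   # the list must contain index 3
```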

Graph model about network coding
Consider an acyclic directed graph G = {V, ℰ}, where V is a node set and ℰ is an edge set whose elements represent network channels. A channel e = (i, j) is a directed edge starting from node i and ending at node j, i.e., tail(e) = i and head(e) = j. For a node i, the collection of incoming channels is In(i) = {e : e ∈ ℰ, head(e) = i} and the collection of outgoing channels is Out(i) = {e : e ∈ ℰ, tail(e) = i}. Each channel has a unit capacity in the network. NEC is specified by the encoder at the source, the encoders at intermediate nodes, and the decoders at sink nodes. The coding and decoding operations for messages are performed over the field F_{q^m}, and the network coding is performed over F_q.
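For concreteness, the In/Out notation can be expressed in a few lines of Python over a hypothetical toy topology:

```python
# Channels are directed edges (tail, head); a toy topology for illustration.
edges = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

def In(node):
    """Incoming channels of a node: In(i) = {e : head(e) = i}."""
    return [e for e in edges if e[1] == node]

def Out(node):
    """Outgoing channels of a node: Out(i) = {e : tail(e) = i}."""
    return [e for e in edges if e[0] == node]

print(In("c"))    # [('a', 'c'), ('b', 'c')]
print(Out("s"))   # [('s', 'a'), ('s', 'b')]
```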

Some claims based on numerical experiments
In this section, we present three claims that summarize a set of numerical experiment results used in our method; the experimental details are given in Section 5. The three claims establish the basis of the method presented in Section 4. Our claims are based on extensive experiments rather than rigorous mathematical deduction. However, we also give auxiliary mathematical analysis of the claims in Section 5.
Claim 1: If the size of the network coding field F_q is small (|F_q| ∈ {2, 3, 4, 5}), the rank of the propagated error T_{Z→Y}·(Z − L·G·u) remains smaller than the min-cut C, even when the number t of original errors is far larger than C; when |F_q| is bigger than 5, the rank is no longer smaller than C.
Claim 2: For a rank-metric code Ω with the generator matrix G and minimum rank distance d_min, the code Ω' with the generator matrix T·G has a minimum rank distance d_min' whose decline below d_min is no more than the decline of the rank of T below C.
Claim 3: Over a small field, the rank of the transfer matrix T ∈ (F_q)^{C×C} declines from C by only a small amount: in most cases by 0 or 1, and in a few cases by 2.

The proposed method
In this section, we formally propose our method. The method is specified by its encoder at the source, encoders at intermediate nodes, and decoders at sink nodes. The network coding operation is performed over the field F_q.

Source
A source message u ∈ (F_{q^m})^R is encoded with a rank-metric code Ω equipped with the generator matrix G ∈ (F_{q^m})^{C×R}, where m ≥ C and 0 < R < C. The coded message is the vector x = G·u ∈ (F_{q^m})^C, which is then sent into the network from the source. The minimum rank distance of Ω is d_min.
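A minimal encoding sketch may be helpful. It implements multiplication in a toy extension field GF(2^4) (primitive polynomial x^4 + x + 1, a hypothetical choice) and computes x = G·u for an arbitrary generator matrix; the actual code Ω would use a carefully designed G with a large minimum rank distance, which this sketch does not attempt.

```python
M_DEG, POLY = 4, 0b10011    # GF(2^4) with primitive polynomial x^4 + x + 1

def gf_mul(a, b):
    """Multiplication in GF(2^M_DEG): shift-and-add with modular reduction."""
    r = 0
    for _ in range(M_DEG):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M_DEG):
            a ^= POLY
    return r

def encode(G, u):
    """x = G.u over GF(2^M_DEG): '+' is XOR, '*' is gf_mul."""
    x = []
    for row in G:
        s = 0
        for g, ui in zip(row, u):
            s ^= gf_mul(g, ui)
        x.append(s)
    return x

C, R = 4, 2                          # hypothetical min-cut and message length
G = [[1, 2], [3, 1], [5, 7], [9, 4]] # an arbitrary C x R generator over GF(16)
u = [6, 11]                          # source message in (F_{2^m})^R
print(encode(G, u))                  # coded vector x in (F_{2^m})^C
```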

Coding at intermediate nodes
Every intermediate node combines the packets received on its incoming edges using its own local coding kernel over the field F_q, creates a new packet, and sends it to its successors via the outgoing edges. At a more macroscopic level, the network coding causes the messages to encounter a so-called transfer matrix [8]. In our method, the messages x = G·u ∈ (F_{q^m})^C encounter a transfer matrix T ∈ (F_q)^{C×C}, and the error Z ∈ (F_{q^m})^t encounters a corresponding transfer matrix T_{Z→Y} ∈ (F_q)^{C×t}, where t is the number of edges on which errors occur. Roughly speaking, the messages Y received at the sink can be expressed as Y = T·G·u + T_{Z→Y}·Z if T is not polluted by errors. T is obtained from the global coding kernels carried in the packets. However, if errors occur, the global coding kernels may also be polluted; in that case T is polluted as well and is unknown. Although T is unknown, its polluted version, denoted by T̂, is known.
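The following sketch (hypothetical topology, |F_q| = 2) shows how the transfer matrix T arises from the coding kernels just described: the source emits packets carrying identity global kernels, each interior node XORs a random F_2-combination of its inputs onto every outgoing edge, and the kernels arriving at the sink form the rows of T.

```python
import numpy as np

rng = np.random.default_rng(3)
C = 3
# Toy DAG, listed so that every edge's tail precedes its head.
edges = [("s", "a"), ("s", "b"), ("s", "b"), ("a", "d"), ("b", "d"), ("b", "d")]
kernels = {}   # edge index -> global coding kernel in (F_2)^C

# The source 's' emits C packets with identity global coding kernels.
for k, idx in enumerate([i for i, e in enumerate(edges) if e[0] == "s"]):
    kernels[idx] = np.eye(C, dtype=int)[k]

for node in ("a", "b"):                     # interior nodes in topological order
    ins = [i for i, e in enumerate(edges) if e[1] == node]
    for j, e in enumerate(edges):
        if e[0] == node:
            local = rng.integers(0, 2, len(ins))        # local coding kernel
            kernels[j] = sum(c * kernels[i] for c, i in zip(local, ins)) % 2

sink_in = [i for i, e in enumerate(edges) if e[1] == "d"]
T = np.array([kernels[i] for i in sink_in])  # kernels at the sink form T
print(T)    # over F_2, T is full rank only with moderate probability
```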
The transmission procedure can be expressed by the more delicate equation Y = T̂·G·u + T_{Z→Y}·(Z − L·G·u). Here, L ∈ (F_q)^{t×C} is a matrix formed by grouping the global coding kernel vectors together.

Decoding in the sink
Let T̂·G be the generator matrix of the newly formed rank-metric code Ω' with minimum rank distance d_min', and let Y be the received messages; the error term is then T_{Z→Y}·(Z − L·G·u). We utilize the list-decoding method of rank-metric codes [6] to perform decoding and obtain u, as long as rank(T_{Z→Y}·(Z − L·G·u)) < C and d_min' is not much smaller than d_min. The decoder outputs the candidates u_d corresponding to Y_d = arg min_{Y'∈Ω'} rank(Y − Y'), and u_d is the solution of the decoding algorithm for the original message. In list decoding, u_d is not unique.
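A brute-force version of this decoding rule is sketched below for toy parameters (the field, generator matrix, and transfer matrix are all hypothetical): it enumerates every message u and keeps those with rank(Y − T̂·G·u) within a radius, which is what the efficient algorithm of [6] accomplishes without exhaustive search. Here the channel is error-free for simplicity, so the true message must appear in the list.

```python
import numpy as np
from itertools import product

M_DEG, POLY = 4, 0b10011        # toy field GF(2^4); the paper assumes m >= C

def gf_mul(a, b):
    r = 0
    for _ in range(M_DEG):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M_DEG):
            a ^= POLY
    return r

def to_matrix(x):
    return np.array([[(s >> j) & 1 for j in range(M_DEG)] for s in x])

def gf2_rank(M):
    M = M.copy()
    r, (rows, cols) = 0, M.shape
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i, c]), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        for i in range(rows):
            if i != r and M[i, c]:
                M[i] ^= M[r]
        r += 1
    return r

def apply_TG(T_hat, G, u):
    """Codeword of Omega': T_hat.G.u (T_hat over F_2; G, u over GF(2^M_DEG))."""
    x = [0] * len(G)
    for i, row in enumerate(G):
        for g, ui in zip(row, u):
            x[i] ^= gf_mul(g, ui)
    return [int(np.bitwise_xor.reduce([x[j] for j in range(len(x)) if T_hat[i, j]] or [0]))
            for i in range(T_hat.shape[0])]

def list_decode(Y, T_hat, G, R, tau):
    """All u with rank(Y - T_hat.G.u) <= tau, by exhaustive search over messages.
    The algorithm of [6] reaches the same radius without this enumeration."""
    return [u for u in product(range(1 << M_DEG), repeat=R)
            if gf2_rank(to_matrix(Y) ^ to_matrix(apply_TG(T_hat, G, u))) <= tau]

rng = np.random.default_rng(5)
C, R = 4, 1
G = [[3], [7], [12], [1]]            # arbitrary toy generator over GF(16)
T_hat = rng.integers(0, 2, (C, C))   # possibly rank-deficient over F_2
u_true = (9,)
Y = apply_TG(T_hat, G, u_true)       # error-free channel for simplicity
print(u_true in set(list_decode(Y, T_hat, G, R, tau=0)))   # True
```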

The feasibility of the decoding
We discuss the feasibility of our method here; it depends on claims 1 to 3. As mentioned in Section 2, the model in [6] corrects the errors based on the equation T̂^{-1}·Y = G·u + T̂^{-1}·T_{Z→Y}·(Z − L·G·u), using the decoding algorithm of the code Ω with its generator matrix G to perform list decoding. The list-decoding procedure works correctly as long as rank(T̂^{-1}·T_{Z→Y}·(Z − L·G·u)) < C. Choosing between a bigger field size and a smaller field size is therefore a dilemma in the context of random network coding. For T̂ to be invertible, the size of the field F_q must be big enough, usually |F_q| ≥ 256 [9]. On the other hand, claim 1 shows that rank(T̂^{-1}·T_{Z→Y}·(Z − L·G·u)) is not smaller than C if |F_q| is bigger than 5. In this case, decoding certainly fails when the list-decoding method of [6] is used.
As discussed above, we use the code Ω' with the generator matrix T̂·G to perform list decoding, where the corresponding error is T_{Z→Y}·(Z − L·G·u). The two preconditions for successful decoding are that rank(T_{Z→Y}·(Z − L·G·u)) < C and that d_min' is not much smaller than d_min. The first condition, rank(T_{Z→Y}·(Z − L·G·u)) < C, can be naturally satisfied, and it is necessary according to the inherent nature of linear block codes. For the second condition, if d_min' is not much smaller than d_min, the code Ω' has nearly the same error-correcting ability as the code Ω. When rank(T_{Z→Y}·(Z − L·G·u)) < C, we perform list decoding of the rank-metric code just as in [6]. If d_min' is much smaller than d_min, the error-correcting ability of Ω' declines sharply compared with Ω; in that case the method becomes meaningless in practice, even if Ω' can correct nearly C errors in theory. Our theoretical analysis shows that the two preconditions can be met if |F_q| ∈ {2, 3, 4, 5}; to obtain better results, we set |F_q| = 2 in this paper. Based on claim 1, rank(T_{Z→Y}·(Z − L·G·u)) < C. Based on claims 2 and 3, it is guaranteed that d_min' is not much smaller than d_min: in most cases d_min' = d_min or d_min' = d_min − 1, and in a few cases d_min' = d_min − 2. The specific outcome depends on the sizes of C and m; the details are given in the experiment section.
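The second precondition can be probed experimentally at toy scale. The sketch below exhaustively computes the minimum rank distance of Ω (generator G) and of Ω' (generator T·G) for a small arbitrary G over GF(2^4) and a random T over F_2; all parameters are hypothetical, and at such tiny sizes the decline can be larger than the 0 to 2 reported for realistic C (d_min' can even reach 0 when T is badly rank-deficient).

```python
import numpy as np
from itertools import product

M_DEG, POLY = 4, 0b10011     # toy field GF(2^4); here m = C = 4

def gf_mul(a, b):
    r = 0
    for _ in range(M_DEG):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M_DEG):
            a ^= POLY
    return r

def gf2_rank(Mt):
    Mt = Mt.copy()
    r, (rows, cols) = 0, Mt.shape
    for c in range(cols):
        piv = next((i for i in range(r, rows) if Mt[i, c]), None)
        if piv is None:
            continue
        Mt[[r, piv]] = Mt[[piv, r]]
        for i in range(rows):
            if i != r and Mt[i, c]:
                Mt[i] ^= Mt[r]
        r += 1
    return r

def to_matrix(x):
    return np.array([[(s >> j) & 1 for j in range(M_DEG)] for s in x])

def min_rank_distance(G, R):
    """For a linear code, d_min equals the minimum rank of a non-zero codeword."""
    best = M_DEG
    for u in product(range(1 << M_DEG), repeat=R):
        if any(u):
            x = [0] * len(G)
            for i, row in enumerate(G):
                for g, ui in zip(row, u):
                    x[i] ^= gf_mul(g, ui)
            best = min(best, gf2_rank(to_matrix(x)))
    return best

def mul_T(T, G):
    """T.G with T over F_2: each output row is an XOR of rows of G."""
    return [[int(np.bitwise_xor.reduce([G[j][k] for j in range(len(G)) if T[i, j]] or [0]))
             for k in range(len(G[0]))] for i in range(T.shape[0])]

rng = np.random.default_rng(11)
C, R = 4, 2
G = [[1, 2], [4, 8], [3, 12], [7, 5]]    # arbitrary toy generator over GF(16)
T = rng.integers(0, 2, (C, C))
print("rank(T) =", gf2_rank(T))
print("d_min   =", min_rank_distance(G, R))
print("d_min'  =", min_rank_distance(mul_T(T, G), R))
```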

Advantages and disadvantages
The advantages of our method are as follows: (1) More than C errors can be corrected with list decoding of rank-metric codes in random network coding when we adopt a small network coding field, for example, |F_q| = 2. (2) Our method can correct original errors numbering more than the min-cut C, which is very important in real applications of network coding. (3) The small field reduces the computational burden. In rank-metric codes for network coding, the extension field F_{q^m} is made big enough to obtain a bigger d_min, and m is usually bigger than the min-cut C [10]; in [9], |F_q| is set to more than 256 to guarantee that T̂ is invertible. A bigger F_{q^m} obviously leads to heavy computation. In our approach, F_{q^m} can be smaller because |F_q| is very small, and a small F_{q^m} leads to a lighter computational burden. The disadvantage of our approach is that the transmission rate is a little smaller than the rate in [6], because d_min' is 1 or 2 smaller than d_min. This problem is alleviated as C becomes bigger, since a decline of 1 or 2 is a small fraction of C; in that case, d_min' is still sufficient for successful decoding. On the other hand, the more non-zero components the original error Z has (i.e., the larger t is), the bigger the rank of the propagated error T_{Z→Y}·(Z − L·G·u) becomes, and in that case the transmission rate is usually low.

Experimental results and discussion
In this section, we give a set of experiments to support claims 1 to 3 in Section 3, together with the corresponding theoretical analysis. Because many mathematical operations are performed in finite fields, we implemented the finite-field operations in MATLAB, such as inverting a matrix, computing the inverse of an element, and computing the rank of a matrix over a finite field. In the experiments, we use these newly implemented finite-field operations to verify claims 1 to 3.
In Fig. 1, different numbers of original errors are illustrated by different curves. We find that, if |F_q| = 2, the rank of the propagated error T_{Z→Y}·(Z − L·G·u) can be compressed to 0.7C even if t = 20C, where t is the number of original errors in Z. In this case, the list decoding of rank-metric codes can correct the propagated error easily. This property is good for correcting dense errors in random network coding.
Because it is difficult to program finite-field arithmetic in MATLAB when the size of the finite field is even, we use the C programming language for that case. Like Fig. 1, which covers odd field sizes, Fig. 2 shows how the errors propagate when the size of the finite field is even.
We now analyze the experimental results theoretically and find a satisfactory agreement between theory and experiment. The transfer matrix T_{Z→Y} ∈ (F_q)^{C×t} is known in advance, where t is the number of edges on which errors occur, and (Z − L·G·u) takes its value in (F_{q^m})^t. Based on the theory of extension fields and base fields, we can regard the vector (Z − L·G·u) over F_{q^m} as a t × m matrix over F_q [10]. The product T_{Z→Y}·(Z − L·G·u) is then a multiplication of two matrices over F_q with dimensions (C × t)·(t × m), so rank(T_{Z→Y}·(Z − L·G·u)) ≤ min(rank(T_{Z→Y}), rank(Z − L·G·u)) ≤ min(C, t, m). This is why a small field F_q usually makes rank(T_{Z→Y}·(Z − L·G·u)) ≤ C, no matter how big the value of t is. If the Hamming metric is adopted, no matter how small the size of F_q is, we cannot compress the Hamming weight of T_{Z→Y}·(Z − L·G·u) below C; naturally, we then cannot correct the propagated errors even with list decoding based on the Hamming metric.
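The rank bound invoked here is easy to confirm numerically. The sketch below draws uniform random stand-ins for T_{Z→Y} and the matrix form of (Z − L·G·u) over small prime fields (a simplifying assumption; real transfer matrices are determined by the topology) and checks rank(A·B) ≤ min(C, t, m) in every trial:

```python
import numpy as np

def gfp_rank(M, p):
    """Rank over GF(p), p prime, via Gaussian elimination with modular inverses."""
    M = M.copy() % p
    r = 0
    for c in range(M.shape[1]):
        piv = next((i for i in range(r, M.shape[0]) if M[i, c] % p), None)
        if piv is None:
            continue
        M[[r, piv]] = M[[piv, r]]
        inv = pow(int(M[r, c]), p - 2, p)      # Fermat inverse
        M[r] = M[r] * inv % p
        for i in range(M.shape[0]):
            if i != r and M[i, c]:
                M[i] = (M[i] - M[i, c] * M[r]) % p
        r += 1
    return r

rng = np.random.default_rng(2)
C, t, m = 8, 64, 8
for p in (2, 3, 5):
    ok = True
    for _ in range(200):
        A = rng.integers(0, p, (C, t))   # stand-in for T_{Z->Y}
        B = rng.integers(0, p, (t, m))   # stand-in for (Z - L.G.u) in matrix form
        ok &= gfp_rank(A @ B, p) <= min(gfp_rank(A, p), gfp_rank(B, p), C, t, m)
    print(f"q={p}: rank(A.B) <= min(C, t, m) held in all trials: {ok}")
```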

Claim 2
According to all the experiments we have done, claim 2 holds for every parameter combination; so far, no counterexample to claim 2 has been found in our work.

Claim 3
In Table 1, we show several examples of the decline of the rank of T; the decline is 0, 1, or 2. In Fig. 3, we illustrate the ratio of the decline of the rank of T in a small field to the full rank. The ratio is very low, which means d_min' is very close to d_min when the size of the finite field is odd.
As before, because it is difficult to program finite-field arithmetic in MATLAB when the size of the finite field is even, we use the C programming language for that case. Like Fig. 3, which covers odd field sizes, Fig. 4 shows how the rank declines when the size of the finite field is even (Table 1).
Whether T is full rank depends on the size of the finite field. Figure 5 shows the probability that T ∈ (F_2)^{C×C} is full rank over the field F_2.

Discussion
Consider the rank of a square matrix T over the finite field F_q. The probability that T is full rank can be computed as follows [9]. For the first selected row of T, the probability that the row is a non-zero vector is 1 − |F_q|^{−C}. For the second selected row, the probability that it is linearly independent of the first is 1 − |F_q|^{−(C−1)}. The remaining rows can be treated in the same way. When |F_q| = 2 and C approaches infinity, the probability that T is full rank is about 0.289 [9]. By the same argument, the probability that the rank of T declines by no more than 2 is (1 − 2^{−C})(1 − 2^{−(C−1)})⋯(1 − 2^{−3}), which is about 0.77. In Fig. 4, when |F_q| = 2 and C = 50, the normalized decline of the rank is very low. So the rank of T does not decline much when |F_q| = 2, which means d_min' is very close to d_min.

Table 1: Several examples of the decline of the rank of T. For example, the rank of T is 3 (= C − 2) when C = 5 and the field size is 2; in this case, the decline of the rank of T is 2.

Fig. 3 (legend): Normalized rank decline (claim 3) when the size of the finite field is odd. The decline of the rank of T is denoted ∂, and the vertical axis shows ∂/C; T is a C × C matrix of rank C − ∂.

Fig. 4 (legend): Normalized rank decline (claim 3) when the size of the finite field is even. The decline of the rank of T is denoted ∂, and the vertical axis shows ∂/C; T is a C × C matrix of rank C − ∂.
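The two probability products in the discussion above are easy to evaluate numerically (here for C = 50):

```python
from math import prod

C = 50
p_full = prod(1 - 2.0 ** -k for k in range(1, C + 1))   # probability T is full rank
p_le2  = prod(1 - 2.0 ** -k for k in range(3, C + 1))   # rank declines by at most 2
print(f"P(full rank)     ~ {p_full:.4f}")   # ~ 0.2888
print(f"P(decline <= 2)  ~ {p_le2:.4f}")    # ~ 0.7701
```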
This work mainly depends on information-theoretic methods; if we adopt methods from the machine learning field [11-14], we may be able to correct more errors. The methods in [15-18] are also worth studying for network coding error correction.

Conclusions and future work
With a small field, |F_q| = 2, the original errors can be compressed into propagated errors whose rank is less than the min-cut of the network, even when the number of original errors is far more than the min-cut. The minimum distance of the newly formed rank-metric code over the small field also does not decline sharply. So the propagated errors can be corrected by the list-decoding method of the newly formed rank-metric code, and our scheme can correct more than min-cut errors in network coding. A small field is also useful for reducing the computational burden compared with bigger fields.
In future research, we will try to optimize the scheme in real scenarios. We will also combine deep learning methods with network coding to improve the effectiveness of the method.