To enhance the reliability of digital transmissions, error-correcting codes are used in every digital communication system. To meet new constraints on data rate or reliability, new coding schemes are continually being developed. Digital communication systems are therefore in perpetual evolution, and it is becoming very difficult to remain compatible with all the standards in use. A cognitive radio system seems to provide an interesting solution to this problem: the design of an intelligent receiver able to adapt itself to a specific transmission context. This article presents a new algorithm dedicated to the blind recognition of convolutional encoders in the general rate k/n case. After a brief review of convolutional code and dual code properties, a new iterative method dedicated to the blind estimation of convolutional encoders in a noisy context is developed. Finally, case studies are presented to illustrate the performance of our blind identification method.

1 Introduction

In a digital communication system, the use of an error correcting code is mandatory. This error correcting code provides good immunity against channel impairments. Nevertheless, the transmission rate is decreased by the redundancy that the code introduces. To enhance the correction capabilities and to reduce the impact of this redundancy, new correcting codes are always under development. This means that communication systems are in perpetual evolution. Indeed, it is becoming more and more difficult for users to follow all the changes and to have an electronic communication device that is always compatible with every standard in use around the world. In such contexts, cognitive radio systems provide a natural solution to these problems. A cognitive radio receiver is an intelligent receiver able to adapt itself to a specific transmission context and to blindly estimate the transmitter parameters for self-reconfiguration purposes, with knowledge of the received data stream only. As convolutional codes are among the most widely used error-correcting codes, it seemed to us worth gaining more insight into the blind recovery of such codes.

In this article, a complete method dedicated to the blind identification of parameters and generator matrices of convolutional encoders in a noisy environment is treated. In a noiseless environment, the first approach to identify a rate 1/n convolutional encoder was proposed in [1]. In [2, 3] this method was extended to the case of a rate k/n convolutional encoder. In [4], we developed a method for blind recovery of a rate k/n convolutional encoder in turbocode configuration. Among the available methods, few of them are dedicated to the blind identification of convolutional encoders in a noisy environment. An approach allowing one to estimate a dual code basis was proposed in [5], and then in [6] a comparison of this technique with the method proposed in [7] was given. In [8], an iterative method for the blind recognition of a rate (n-1)/n convolutional encoder was proposed in a noisy environment. This method allows the identification of parameters and generator matrix of a convolutional encoder. It relies on algebraic properties of convolutional codes [9, 10] and dual code [11], and is extended here to the case of rate k/n convolutional encoders.

This article is organized as follows. Section 2 presents some properties of convolutional encoders and dual codes. Then, an iterative method for the blind identification of convolutional encoders is described in Section 3. Finally, the performances of the method are discussed in Section 4. Some conclusions and prospects are drawn in Section 5.

2 Convolutional encoders and dual code

Before explaining our blind identification method, let us recall the properties of convolutional encoders that it relies on.

2.1 Principle and mathematical model

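Let C be an (n, k, K) convolutional code, where n is the number of outputs, k is the number of inputs, K is the constraint length, and C^{⊥} be a dual code of C. Let us also denote by G(D) a polynomial generator matrix of rank k defined by:

G\left(D\right)=\left[\begin{array}{ccc}{g}_{1,1}\left(D\right)& \cdots & {g}_{1,n}\left(D\right)\\ \vdots & & \vdots \\ {g}_{k,1}\left(D\right)& \cdots & {g}_{k,n}\left(D\right)\end{array}\right]

(1)

where g_{i,j}(D), ∀i = 1,..., k, ∀j = 1,..., n, are generator polynomials and D represents the delay operator. Let μ_{i} be the memory of the i th input:

{\mu}_{i}=\underset{j=1,...,n}{max}\;deg\;{g}_{i,j}\left(D\right)

(2)

where deg g_{i,j}(D) is the degree of g_{i,j}(D). The overall memory of the convolutional code, denoted μ, is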

\mu =\underset{i=1,...,k}{max}{\mu}_{i}=K-1

(3)

If the input sequence is denoted by m(D) and the output sequence by c(D), the encoding process can be described by

c\left(D\right)=m\left(D\right).G\left(D\right)

(4)
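As an illustration of (4), the following minimal sketch encodes a binary message with a polynomial generator matrix over GF(2). Polynomials are stored as Python integers whose bit i is the coefficient of D^i. The function names and the rate 1/2 generator pair (1 + D + D^2, 1 + D^2) are assumptions chosen for illustration; they are not taken from this article.

```python
def gf2_polymul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i <-> coefficient of D^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition over GF(2) is XOR
        a <<= 1              # multiply a by D
        b >>= 1
    return result


def encode(message_polys, generator):
    """Return the n output polynomials c_j(D) = sum_i m_i(D).g_{i,j}(D) over GF(2).

    message_polys: list of k ints, one polynomial per encoder input.
    generator: k x n list of ints holding the entries g_{i,j}(D) of G(D).
    """
    n = len(generator[0])
    outputs = [0] * n
    for m_i, row in zip(message_polys, generator):
        for j, g_ij in enumerate(row):
            outputs[j] ^= gf2_polymul(m_i, g_ij)
    return outputs


# Hypothetical rate 1/2 example: G(D) = [1 + D + D^2, 1 + D^2].
G = [[0b111, 0b101]]
m = [0b1011]                 # m(D) = 1 + D + D^3
c = encode(m, G)
print([bin(c_j) for c_j in c])   # ['0b110001', '0b100111']
```

Here c_1(D) = 1 + D^4 + D^5 and c_2(D) = 1 + D + D^2 + D^5, which can be checked by hand from the products m(D).g_{1,1}(D) and m(D).g_{1,2}(D).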

In practice, the encoder used is usually an optimal encoder. An encoder is optimal [10] if it has the maximum possible free distance among all codes with the same parameters (n, k, and K). Such encoders are preferred because a larger free distance gives a higher error correction capability. Furthermore, their good algebraic properties [9, 10] can be judiciously exploited for blind identification.

To model the errors generated by the transmission system, let us consider the binary symmetric channel (BSC) with the error probability, P_{e}, and denote by e(D) the error pattern and by y(D) the received sequence so that:

y\left(D\right)=c\left(D\right)+e\left(D\right)

(5)

Let us also denote by e(i) the i th bit of e(D) so that: Pr(e(i) = 1) = P_{e} and Pr(e(i) = 0) = 1 - P_{e}. The errors are assumed to be independent.
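The channel model (5) can be simulated with a few lines of code. The sketch below is only an illustration: the function name `bsc`, the seed, and the test sequence are assumptions, and each bit is flipped independently with probability P_{e}, as stated above.

```python
import random


def bsc(bits, p_e, seed=None):
    """Pass a 0/1 sequence through a binary symmetric channel.

    Each bit is flipped independently with probability p_e.
    Returns (received, error) so that received[i] = bits[i] XOR error[i].
    """
    rng = random.Random(seed)
    error = [1 if rng.random() < p_e else 0 for _ in bits]
    received = [b ^ e for b, e in zip(bits, error)]
    return received, error


codeword = [1, 0, 1, 1, 0, 0, 1, 0] * 1000      # arbitrary 8000-bit sequence
y, e = bsc(codeword, p_e=0.02, seed=1)
print(sum(e) / len(e))                          # empirical error rate, near 0.02
```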

In this article, the noise is modeled by a BSC. This BSC can be used to model an AWGN channel in the context of a hard decision decoding algorithm. Indeed, the BSC can be seen as an equivalent model of the chain made up of the modulator, the true channel (AWGN, for example), and the demodulator (matched filter or correlator followed by a decision rule). Furthermore, in mobile communications, channels are subject to multipath fading, which leads to burst errors in the received bit stream. A convolutional encoder alone is not efficient in this case, so an interleaver is generally used to limit the effect of these burst errors. In this context, after the deinterleaving process on the receiver side, the errors (and thus the equivalent channel including the deinterleaver) can also be modeled by a BSC.

2.2 The dual code of convolutional encoders

The dual code generator matrix of a convolutional encoder, termed a parity check matrix, can also be used to describe a convolutional code. This ((n - k) × n) polynomial matrix verifies the following property:

Theorem 1 Let G(D) be a generator matrix of C. If an ((n - k) × n) polynomial matrix, H(D), is a parity check matrix of C, then:

G\left(D\right).{H}^{T}\left(D\right)=0

(6)

where .^{T} is the transpose operator.

Corollary 1 Let H(D) be a parity check matrix of C. The output sequence c(D) is a codeword sequence of C if and only if:

c\left(D\right).{H}^{T}\left(D\right)=0

(7)
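Corollary 1 can be checked numerically on a small case. The sketch below assumes a hypothetical rate 1/2 code with G(D) = [g_1(D), g_2(D)], for which H(D) = [g_2(D), g_1(D)] is a parity check matrix, since g_1.g_2 + g_2.g_1 = 0 over GF(2). The generator polynomials and the message are illustrative values, not taken from this article.

```python
def gf2_polymul(a, b):
    """Multiply two GF(2) polynomials stored as ints (bit i <-> coefficient of D^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


g1, g2 = 0b111, 0b101            # hypothetical generators 1 + D + D^2 and 1 + D^2
m = 0b1101011                    # arbitrary message polynomial
c1, c2 = gf2_polymul(m, g1), gf2_polymul(m, g2)


def syndrome(y1, y2):
    """Evaluate y(D).H^T(D) for H(D) = [g2(D), g1(D)]."""
    return gf2_polymul(y1, g2) ^ gf2_polymul(y2, g1)


clean = syndrome(c1, c2)                 # codeword: the syndrome vanishes
corrupted = syndrome(c1 ^ 0b100, c2)     # one flipped bit (D^2) on the first output
print(clean, bin(corrupted))             # 0 0b10100
```

The non-zero syndrome equals D^2.g_2(D) = D^2 + D^4, the contribution of the single error, which is why a corrupted sequence no longer satisfies (7).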

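The parity check matrix is an ((n - k) × n) matrix such that:

H\left(D\right)=\left[\begin{array}{cccccc}{h}_{1,1}\left(D\right)& \cdots & {h}_{1,k}\left(D\right)& {h}_{0}\left(D\right)& & \\ \vdots & & \vdots & & \ddots & \\ {h}_{n-k,1}\left(D\right)& \cdots & {h}_{n-k,k}\left(D\right)& & & {h}_{0}\left(D\right)\end{array}\right]

(8)

where h_{0}(D) and h_{i,j}(D) are the generator polynomials of H(D), ∀i = 1,..., n - k and ∀j = 1,..., k.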

Let us denote by μ^{⊥} the memory of the dual code. According to the properties of a dual code and convolutional encoders [9, 11], this memory is defined by

{\mu}^{\perp}=\sum _{i=1}^{k}{\mu}_{i}

(9)

The polynomial, f\left(D\right)={\sum}_{i=0}^{\infty}f\left(i\right).{D}^{i}, is a delayfree polynomial if f(0) = 1. According to [12], if the polynomial h_{0}(D) is a delayfree polynomial, then the convolutional encoder is realizable. It follows that the generator polynomial, h_{0}(D), is such that

The parity check matrix (11) is composed of shifted versions of the same (n - k) vectors. These vectors, of size n.(μ^{⊥} + 1) and denoted by h_{j} (∀j = 1,..., n - k), are defined by

In the case of a rate k/n convolutional encoder, each vector h_{j} (13) is composed of (n - k - 1).(μ^{⊥} + 1) zeros. In this configuration, the system given in (7) is split into (n - k) systems:

Let us denote by S the size of these parity checks of the code (16) such that

S=\left(k+1\right).\left({\mu}^{\perp}+1\right)

(18)

It follows from (16) and (10) that the (n - k) parity checks, h_{s}, are vectors of degree (S - 1).

3 Blind recovery of convolutional code

This section deals with the principle of the proposed blind identification method in the case where the intercepted sequence is corrupted. Only a few methods are available for blind identification in a noisy environment: for example, a Euclidean algorithm-based approach was developed and applied to the case of a rate 1/2 convolutional encoder [13]. At nearly the same time, a probabilistic algorithm based on the Expectation Maximization (EM) algorithm was proposed in [14] to identify a rate 1/n convolutional encoder. Having earlier developed a method of blind recovery for a convolutional encoder of rate (n - 1)/n [8], we found it worth extending here to the case of a rate k/n convolutional encoder. Before describing the iterative method in use, which is based on algebraic properties of an optimal convolutional encoder [9, 10] and dual code [11], let us briefly recall the principle of our blind identification method when the intercepted sequence is corrupted.

3.1 Blind identification of a convolutional code: principle

This method allows one to identify the parameters (n, k, and K) of an encoder, the parity check matrix, and the generator matrix of an optimal encoder. Its principle is to reshape columnwise the intercepted data bit stream, y, under matrix form. This matrix, denoted R_{l}, is computed for different values of l, where l is the number of columns. The number of rows in each matrix is equal to L. If the received sequence length is L', then the number of rows of R_{l} is L=\lfloor \frac{{L}^{\prime}}{l}\rfloor, where ⌊.⌋ stands for the integer part. This construction is illustrated in Figure 1.
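The construction of R_{l} can be sketched as follows. Since Figure 1 is not reproduced here, the sketch assumes that each row of R_{l} holds l consecutive received bits and that the incomplete last block is discarded. The function name `build_matrix` and the bit sequence are illustrative.

```python
def build_matrix(bits, l):
    """Reshape a bit sequence into R_l: L = floor(L'/l) rows of l consecutive bits.

    Bits left over after the last complete row are dropped (assumption).
    """
    n_rows = len(bits) // l
    return [bits[r * l:(r + 1) * l] for r in range(n_rows)]


y = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]    # L' = 11 received bits (arbitrary)
R_4 = build_matrix(y, 4)
print(R_4)                               # [[1, 0, 1, 1], [0, 0, 1, 0]]
```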

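If the received sequence is not corrupted (y = c ⇒ e = 0), for α∈ℕ, we have shown in [8] that the rank in Galois Field, GF(2), of each matrix R_{l} has two possible values:

rank\left({R}_{l}\right)=l\quad \text{if}\ l\ne \alpha .n\ \text{or}\ l<{n}_{a}

(19)

rank\left({R}_{l}\right)=l.\frac{k}{n}+{\mu}^{\perp}<l\quad \text{if}\ l=\alpha .n\ \text{and}\ l\ge {n}_{a}

(20)

where n_{a} is a key-parameter which corresponds to the first matrix R_{l} with a rank deficiency. Indeed, in [8], for a rate (n - 1)/n convolutional encoder, this parameter proved to be such that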

{n}_{a}=n.\left({\mu}^{\perp}+1\right)

(21)

In this configuration, n_{a} is equal to the size of the parity check (S). But, what is its value in general for a rate k/n convolutional encoder?

For a rate k/n convolutional encoder, we show in Appendix A that the size of the first matrix which exhibits a rank deficiency, n_{a}, is equal to

{n}_{a}=n.\lfloor \frac{{\mu}^{\perp}}{n-k}+1\rfloor

(22)
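To make the difference between n_{a} and S concrete, the short sketch below evaluates (22) and (18) for two of the codes studied in Section 4. For k = 1, relations (3) and (9) give μ^{⊥} = K - 1, so μ^{⊥} = 6 for C(2, 1, 7) and μ^{⊥} = 3 for C(3, 1, 4). The function names are illustrative.

```python
def first_deficient_size(n, k, mu_dual):
    """n_a = n * floor(mu_dual / (n - k) + 1), following (22)."""
    return n * (mu_dual // (n - k) + 1)


def parity_check_size(k, mu_dual):
    """S = (k + 1) * (mu_dual + 1), following (18)."""
    return (k + 1) * (mu_dual + 1)


for n, k, mu_dual in [(2, 1, 6), (3, 1, 3)]:
    print((n, k, mu_dual),
          first_deficient_size(n, k, mu_dual),
          parity_check_size(k, mu_dual))
# (2, 1, 6) 14 14   -> rate (n - 1)/n: n_a equals S, as in (21)
# (3, 1, 3) 6 8     -> k != n - 1: n_a is smaller than S
```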

From (22), it is clear that the parameter n_{a} is, in general, not equal to the size S of the (n - k) parity checks (16) of the code. Appendix B discusses the value of the rank deficiency of the matrix {R}_{{n}_{a}}.
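The rank behaviour described in this section can be observed on a noiseless toy example. The sketch below is an illustration under stated assumptions: it uses a hypothetical rate 1/2 code with generators 1 + D + D^2 and 1 + D^2 (n = 2, k = 1, μ^{⊥} = 2, so that (22) gives n_{a} = 6), a random message with an arbitrary seed, and rows of R_{l} made of l consecutive bits. None of these values come from the simulations of this article.

```python
import random


def encode_rate_half(message, g1, g2):
    """Interleaved output of a rate 1/2 feedforward encoder (taps as 0/1 lists)."""
    memory = len(g1) - 1
    state = [0] * memory
    out = []
    for bit in message:
        window = [bit] + state                     # window[j] = m(t - j)
        out.append(sum(w & g for w, g in zip(window, g1)) % 2)
        out.append(sum(w & g for w, g in zip(window, g2)) % 2)
        state = window[:memory]
    return out


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    packed = [int("".join(map(str, r)), 2) for r in rows]
    rank = 0
    while packed:
        pivot = packed.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot                   # lowest set bit of the pivot row
            packed = [p ^ pivot if p & low else p for p in packed]
    return rank


rng = random.Random(0)                             # arbitrary fixed seed
message = [rng.randint(0, 1) for _ in range(600)]
c = encode_rate_half(message, [1, 1, 1], [1, 0, 1])

ranks = {}
for l in range(2, 11):
    R_l = [c[r * l:(r + 1) * l] for r in range(len(c) // l)]
    ranks[l] = gf2_rank(R_l)
    print(l, ranks[l])
```

For this code, every row of R_{6}, R_{8}, and R_{10} satisfies at least one parity check relation, so these matrices cannot have full rank, whereas no such relation fits inside the rows of R_{2} or R_{4}.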

3.2 Blind identification of convolutional code: method

A prerequisite to the extension of the method applied in [8] to the case of a rate k/n convolutional encoder is the identification of the parameter, n. Then, a basis of dual code has to be built to further deduce the value of n_{a} that corresponds to the size of the parity check with the smallest degree. Using both this parameter and (22), one can assume different values for k and μ^{⊥}. Then, the (n - k) parity checks (16) and a generator matrix of the code can be estimated.
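The hypotheses on k and μ^{⊥} that are compatible with (22) can be enumerated directly. The sketch below is not a transcription of Algorithm 1; it only lists the pairs (k, μ^{⊥}) for which n.⌊μ^{⊥}/(n - k) + 1⌋ equals a given n_{a}. The function name and the restriction μ^{⊥} ≥ 1 are assumptions.

```python
def candidate_parameters(n, n_a):
    """List the (k, mu_dual) pairs for which n * floor(mu_dual / (n - k) + 1) == n_a."""
    if n_a % n:
        raise ValueError("n_a must be a multiple of n")
    alpha = n_a // n
    candidates = []
    for k in range(1, n):
        # floor(mu_dual / (n - k)) == alpha - 1 bounds mu_dual on both sides.
        low, high = (alpha - 1) * (n - k), alpha * (n - k) - 1
        for mu_dual in range(max(low, 1), high + 1):   # mu_dual >= 1 is assumed
            candidates.append((k, mu_dual))
    return candidates


print(candidate_parameters(3, 6))    # [(1, 2), (1, 3), (2, 1)]
print(candidate_parameters(2, 14))   # [(1, 6)]
```

With n = 3 and n_{a} = 6, three hypotheses remain, among them (k, μ^{⊥}) = (1, 3), which corresponds to a C(3, 1, 4) encoder. Each hypothesis then has to be tested by estimating the corresponding parity checks.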

To identify the number of outputs, n, let us evaluate the likely-dependent columns of R_{l}. Then, the values of l at which R_{l} matrices seem to be of degenerated rank are detected by converting each R_{l} matrix into a lower triangular matrix (G_{l}) through use of the Gauss Jordan Elimination Through Pivoting adapted to GF(2):

{G}_{l}={A}_{l}.{R}_{l}.{B}_{l}

(23)

where A_{l} is a row-permutation matrix of size (L × L) and B_{l} is a matrix of size (l × l) that describes the column combination. Let N_{l}(i) be the number of ones in the lower part of the i th column in the matrix, G_{l}. In [15, 16], this number was used to estimate an optimal threshold (γ_{opt}), which allows us to decide whether the i th column of the matrix R_{l} is dependent on the other columns. This optimal threshold is such that the sum of the missing probabilities is as small as possible. The numbers of detected dependent columns, denoted as Z(l), are such that

where Card{x} is the cardinal of x. So, the gap between two non-zero cardinals, Z(l), is equal to the estimated codeword size (\widehat{n}). Let \mathcal{I} be a set of l-values where the cardinal is non-zero. From the matrix, {B}_{i},\forall i\in \mathcal{I}, one can build a dual code basis. Let \mathcal{I} be a ((L - i) × i) matrix composed of the last (L - i) rows of R_{i}. If b_{j}, ∀j = 1,..., i, represents the j th column of B_{i}, b_{j} is considered as a linear form close to the dual code on condition that:

where d(x) is the Hamming weight of x. Let us denote a set of all linear forms by \mathcal{D}. Within the set of detected linear forms, the one with the smallest degree is taken and denoted, here, by ĥ, and its size by {\widehat{n}}_{a}. From (22), one can make different hypotheses about k and μ^{⊥} values. This algorithm is summed up in Algorithm 1.
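The elimination step (23) and the detection of dependent columns can be sketched as follows. This is a simplified illustration, not the exact procedure of [15, 16]: the function names are hypothetical, the parameter `threshold` is a stand-in for the test based on γ_{opt}, and the matrix is a made-up noiseless example in which the fourth column is the sum of the first two.

```python
def swap_bits(value, a, b):
    """Swap bits a and b of an int."""
    if ((value >> a) ^ (value >> b)) & 1:
        value ^= (1 << a) | (1 << b)
    return value


def column_reduce(rows):
    """Bring a 0/1 matrix to lower triangular form with GF(2) column operations.

    Returns (cols, comb). cols[j] is column j of the reduced matrix packed into an
    int (bit r <-> row r). comb[j] has bit i set when original column i contributes
    to reduced column j, so comb plays the role of the columns of B_l.
    """
    n_rows, n_cols = len(rows), len(rows[0])
    cols = [sum(rows[r][j] << r for r in range(n_rows)) for j in range(n_cols)]
    comb = [1 << j for j in range(n_cols)]
    for i in range(min(n_rows, n_cols)):
        pivot = next((j for j in range(i, n_cols) if (cols[j] >> i) & 1), None)
        if pivot is None:
            # No 1 on row i: bring up a lower row that has one (row permutation A_l).
            row = next((r for r in range(i + 1, n_rows)
                        if any((cols[j] >> r) & 1 for j in range(i, n_cols))), None)
            if row is None:
                break                              # columns i.. are entirely zero
            cols = [swap_bits(col, i, row) for col in cols]
            pivot = next(j for j in range(i, n_cols) if (cols[j] >> i) & 1)
        cols[i], cols[pivot] = cols[pivot], cols[i]
        comb[i], comb[pivot] = comb[pivot], comb[i]
        for j in range(i + 1, n_cols):
            if (cols[j] >> i) & 1:                 # cancel the 1 on row i
                cols[j] ^= cols[i]
                comb[j] ^= comb[i]
    return cols, comb


def dependent_columns(rows, threshold=0):
    """Return (j, comb_j) for columns whose lower part has at most `threshold` ones."""
    l = len(rows[0])
    cols, comb = column_reduce(rows)
    weights = [bin(col >> l).count("1") for col in cols]   # N_l(j): rows l..L-1
    return [(j, comb[j]) for j, w in enumerate(weights) if w <= threshold]


R = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 0], [1, 1, 0, 0],
     [1, 0, 1, 1], [0, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 1]]
print(dependent_columns(R))     # [(3, 11)]: 11 = 0b1011, columns 1, 2 and 4 sum to zero
```

The combination returned for the detected column is a candidate linear form: in the noiseless case it is orthogonal to every row of the matrix, and in the noisy case it would still have to pass a test such as the Hamming weight condition stated above.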

For a rate (n - 1)/n convolutional encoder with ĥ as parity check, solving the system described in Property 1 (see Section 2) enables one to identify the generator matrix. One should, however, note that with a rate k/n convolutional code, a prerequisite to the identification of the generator matrix, G(D), is the identification of the (n - k) parity checks, h_{j}, of size S (see (16) and (18)).

Algorithm 1: Estimation of k and μ^{⊥}

Input: Value of \widehat{n} and {\widehat{n}}_{a}

Output: Value of \widehat{k} and {\widehat{\mu}}^{\perp}

\forall s=1,...,\left(\widehat{n}-\widehat{k}\right). For each vector, x_{s}, a matrix, {R}_{l}^{s}, is built as previously done for R_{l}. Then, for each matrix {R}_{l}^{s}, a linear form of size S has to be estimated. This algorithm is summed up in Algorithm 2 where ĥ_{s} refers to the identified (\widehat{n}-\widehat{k}) parity checks.

Identification of the generator matrix from both these (\widehat{n}-\widehat{k}) parity checks and the whole set of the code parameters can be realized by solving the system described in Property 1.

In [15, 17], a similar approach, based on a rank calculation, is used to identify the size of an interleaver. In this article, an iterative process is proposed to increase the probability of estimating the correct interleaver size. The principle of this iterative process is to perform permutations on the R_{l} matrix rows to obtain a new virtual realization of the received sequence. These permutations increase the probability of obtaining non-erroneous pivots during the Gauss elimination process (23). Our earlier identification of a convolutional encoder relied on a similar approach [8]. Indeed, at the output of our algorithm, either (i) the true encoder, or an optimal encoder, is identified or (ii) no optimal code is identified. In case (ii), a new iteration of the algorithm increases the probability of detecting an optimal convolutional encoder.
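The row permutations used by the iterative process can be sketched as below. This is only an illustration of how a new virtual realization may be drawn. The function name, the choice of keeping the original row order for the first iteration, and the use of a uniformly random shuffle are assumptions.

```python
import random


def row_permuted_realizations(rows, iterations, seed=None):
    """Yield `iterations` versions of R_l: the original, then row-shuffled copies."""
    rng = random.Random(seed)
    for iteration in range(iterations):
        realization = list(rows)
        if iteration:                      # first pass keeps the received order
            rng.shuffle(realization)
        yield realization


R = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]    # toy matrix
realizations = list(row_permuted_realizations(R, iterations=3, seed=0))
for realization in realizations:
    print(realization)
```

Each realization contains the same rows, so the column dependencies are unchanged; only the rows that end up serving as pivots in (23) differ from one iteration to the next.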

The average complexity of one iteration of the process dedicated to the blind identification of a convolutional encoder is \mathcal{O}\left({l}_{max}^{4}\right). Indeed, our blind identification method is divided into three steps: (i) identification of n, (ii) identification of a dual code basis, and (iii) identification of parity checks and a generator matrix. Each step consists of at most (l_{max} - 1) Gaussian eliminations on R_{l} matrices of size (L × l).

Algorithm 2: Estimation of (\widehat{n}-\widehat{k}) parity check.

Input: y, \widehat{n}, \widehat{k} and {\widehat{\mu}}^{\perp}

Furthermore, in the literature, the parameters of the convolutional encoders in use typically take small values. Indeed, the maximum parameters are such that

A minimum value of l_{max} is given in Table 1 for three optimal encoders used in the following section dedicated to the analysis and performances study of our blind identification method.

4 Analysis and performances

In order to gain more insight into the performances of our blind identification technique, let us consider three convolutional encoders, C(3,1, 4), C(3, 2, 3), and C(2, 1, 7).

Let R_{l} be a matrix built from 20,000 received bits with l = 2, ..., 100 and L = 200. It is very important to take into account the amount of data required, to show that our algorithm is well suited to implementation in a realistic context. An amount of 20,000 bits is quite low compared with the data rates of current standards. For example, in the case of mobile communications delivered by UMTS at a data rate of up to 2 Mbps, only 10 ms are needed to receive 20,000 bits. Furthermore, the rates reached by future standards will be higher.

For each simulation, 1000 Monte Carlo trials were run, and the focus was on

the impact of the number of iterations upon the probability of detection;

the global performances in terms of probability of detection.

In this article, the detection means complete identification of the encoders (parameters and generator matrix).

4.1 The detection gain produced by the iterative process

The number of iterations to be made is a compromise between the detection performances and the processing delay introduced in the reception chain (see [8]). To evaluate this number of iterations, let P_{det}(i) be the probability of detecting the true encoder at the i th iteration.

The probability of detecting the true encoder, P_{det}, is called probability of detection.

C(3, 2, 3) convolutional encoder:

Figure 2 shows the probability of detecting the true encoder (P_{det}) compared with P_{e} for 1, 10, and 50 iterations. It shows that, for the C(3, 2, 3) convolutional encoder, 10 iterations of the algorithm result in the best performances: indeed, there is no advantage in performing 50 iterations rather than 10. On the other hand, the gain between 1 and 10 iterations is huge.

C(3,1,4) convolutional encoder:

Figure 3 illustrates the evolution of P_{det} compared with P_{e} for 1, 10, and 50 iterations in the case of the C(3, 1, 4) convolutional encoder. It shows that the gain between the 1st and the 50th iterations is nearly nil.

For a rate k/n convolutional code where k ≠ n - 1, the algorithm presented in Figure 2 requires several iterations to estimate the (n - k) parity checks (16). Consequently, for such codes (k ≠ n - 1) there is no need to carry out this iterative process: the gain it provides is not significant. For a rate (n - 1)/n convolutional encoder, however, it is clear that the algorithm performance is enhanced by iterations. Moreover, it is important to note that the detection of a convolutional code depends on the parameters of the code, the channel error probability, and the correction capability of the code. Thus, the number of iterations needed to get the best performance is code dependent. For such a code, it is also worth assessing the impact of the amount of data required. To this end, for the C(2, 1, 7) convolutional encoder, the detection gain produced by the iterative process is compared for several values of L.

C(2,1,7) convolutional encoder:

Figure 4 depicts P_{det} compared with P_{e}, for 1, 5, and 50 iterations and for L = 200. For 1, 10, 40, and 50 iterations, Figure 5 illustrates the evolution of P_{det} compared with P_{e} for L = 500. It shows that, for L = 200, 5 iterations permit us to identify the true encoder, whereas, for L = 500, the identification of the true encoder requires 40 iterations. For L = 200, after 5 iterations, P_{det} is close to 1 for P_{e} ≤ 0.02, but after 40 iterations and L = 500, P_{det} is close to 1 for P_{e} ≤ 0.03. It is clear that the number of received bits is an important parameter of our method. Indeed, by increasing the size of the matrices R_{l}, the probability of obtaining non-erroneous pivots during the iterative process increases. Thus, it is possible to carry out more iterations of our algorithm to improve detection performance. For implementation in a realistic context, however, the amount of data required has to be taken into account. In the next section, we will show that the algorithm performance is very good when L = 200.

4.2 Probability of detection

To analyze the method performances, three probabilities were defined as follows:

1. probability of detection (P_{det}) is the probability of identifying the true encoder;

2. probability of false-alarm (P_{fa}) is the probability of identifying an optimal encoder but not the true one;

3. probability of miss (P_{m}) is the probability of identifying no optimal encoder.
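These three probabilities can be estimated from the outcomes of the Monte Carlo trials by simple counting. The sketch below assumes that each trial has been labelled with one of three hypothetical outcome strings. The tallies in the example are made-up numbers used only to exercise the function; they are not results from this article.

```python
from collections import Counter


def detection_statistics(outcomes):
    """Estimate (P_det, P_fa, P_m) from a list of per-trial outcome labels.

    Labels (assumed): "true" = true encoder identified, "other_optimal" = an
    optimal encoder other than the true one, "none" = no optimal encoder.
    """
    counts = Counter(outcomes)
    total = len(outcomes)
    return tuple(counts[label] / total
                 for label in ("true", "other_optimal", "none"))


# Made-up tallies for 1000 trials (illustration only).
trials = ["true"] * 940 + ["other_optimal"] * 25 + ["none"] * 35
p_det, p_fa, p_m = detection_statistics(trials)
print(p_det, p_fa, p_m)
```

Since every trial falls into exactly one of the three cases, the estimates always sum to 1.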

In order to assess the relevance of our results through a comparison of the different probabilities to the code correction capability, let us denote by BER_{r} the theoretical residual bit error rate obtained after decoding of the corrupted data stream with a hard decision [12]. Here, to be acceptable, BER_{r} must be close to 10^{-5}.

Figures 6, 7, and 8 show the different probabilities compared with P_{e} after 10 iterations and the limit of the 10^{-5} acceptable BER_{r} for C(3, 2, 3), C(3, 1, 4), and C(2, 1, 7) convolutional encoders, respectively. One should note that the probability of identifying the true encoder is close to 1 for any P_{e} with a post-decoding BER_{r} less than 10^{-5}. Indeed, the algorithm performances are excellent: P_{det} is close to 1 when P_{e} corresponds to either BER_{r} < 2 × 10^{-4} for the C(3, 2, 3) convolutional encoder or BER_{r} < 0.67 × 10^{-4} for the C(3, 1, 4) encoder.

5 Conclusion

This article dealt with the development of a new algorithm dedicated to the reconstruction of convolutional code from received noisy data streams. The iterative method is based on algebraic properties of both optimal convolutional encoders and their dual code. This algorithm allows the identification of parameters and generator matrix of a rate k/n convolutional encoder. The performances were analyzed and proved to be very good. Indeed, the probability to detect the true encoder proved to be close to 1 for a channel error probability that generates a post-decoding BER_{r} that is less than 10^{-5}. Moreover, this algorithm requires a very small amount of received bit stream.

In most digital communication systems, a simple technique, called puncturing, is used to increase the code rate. The blind identification of the punctured code is divided into two parts: (i) identification of the equivalent encoder and (ii) identification of the mother code and puncturing pattern. Our method, dedicated to the blind identification of k/n convolutional encoders, also allows the blind identification of the equivalent encoder of the punctured code. Thus, our future work will aim to identify the mother code and the puncturing pattern from the knowledge of this equivalent encoder only.

A The key-parameter n_{a}

According to (20), the rank of the matrix, R_{α.n}, is:

rank\left({R}_{\alpha .n}\right)=\alpha .n.\frac{k}{n}+{\mu}^{\perp}

(31)

Let us seek n_{a}, when n_{a} = α.n, which corresponds to the first matrix, {R}_{{n}_{a}}, with a rank deficiency. This corresponds to seeking the minimum value of α for which this rank is smaller than α.n:

\alpha .n\left(1-\frac{k}{n}\right)>{\mu}^{\perp}

(32)

\alpha .n>\frac{n}{n-k}.{\mu}^{\perp}

(33)

\alpha >\frac{{\mu}^{\perp}}{n-k}

(34)
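As a worked illustration of this last step, the smallest integer α that is strictly greater than μ^{⊥}/(n - k) can be written explicitly. The sketch below only uses (34) and the definition n_{a} = α.n; the numerical example takes the C(3, 1, 4) values μ^{⊥} = 3 and n - k = 2, deduced from (3) and (9) for k = 1.

```latex
\alpha_{min} = \left\lfloor \frac{\mu^{\perp}}{n-k} \right\rfloor + 1
\quad\Rightarrow\quad
n_a = n.\alpha_{min} = n.\left\lfloor \frac{\mu^{\perp}}{n-k} + 1 \right\rfloor

% Example, C(3,1,4): \mu^{\perp} = 3, n - k = 2
\alpha > \frac{3}{2}
\quad\Rightarrow\quad \alpha_{min} = 2
\quad\Rightarrow\quad n_a = 3 \times 2 = 6
```

This agrees with the expression of n_{a} given in (22).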

So, the minimum value of α, denoted α_{min}, is such that

where Z(n_{a}) ∈ℕ. Therefore, the rank deficiency of the matrix, {R}_{{n}_{a}}, is such that

1\le Z\left({n}_{a}\right)\le \left(n-k\right)

(46)

References

Rice B: Determining the parameters of a rate 1/n convolutional encoder over GF(q). In Proceedings of the 3rd International Conference on Finite Fields and Applications. Glasgow; 1995.

Filiol E: Reconstruction of convolutional encoders over GF(p). In Proceedings of the 6th IMA Conference on Cryptography and Coding. Volume 1355. Springer Verlag; 1997:100-110.

Barbier J: Reconstruction of turbo-code encoders. In Proc SPIE Security and Defense Space Communication Technologies Symposium. Volume 5819. Orlando, FL, USA; 2005:463-473.

Marazin M, Gautier R, Burel G: Blind recovery of the second convolutional encoder of a turbo-code when its systematic outputs are punctured. MTA Rev 2009, XIX(2):213-232.

Barbier J, Sicot G, Houcke S: Algebraic approach for the reconstruction of linear and convolutional error correcting codes. Int J Appl Math Comput Sci 2006, 2(3):113-118.

Côte M, Sendrier N: Reconstruction of convolutional codes from noisy observation. In Proceedings of the IEEE International Symposium on Information Theory ISIT 09. Seoul, Korea; 2009:546-550.

Marazin M, Gautier R, Burel G: Dual code method for blind identification of convolutional encoder for cognitive radio receiver design. In Proceedings of the 5th IEEE Broadband Wireless Access Workshop, IEEE GLOBECOM 2009. Honolulu, Hawaii, USA; 2009.

Wang F, Huang Z, Zhou Y: A method for blind recognition of convolution code based on Euclidean algorithm. In Proceedings of the International Conference on Wireless Communications, Networking and Mobile Computing 2007, 1414-1417.

Dingel J, Hagenauer J: Parameter estimation of a convolutional encoder from noisy observations. In Proceedings of the IEEE International Symposium on Information Theory, ISIT 07. Nice, France; 2007:1776-1780.

Open Access
This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Marazin, M., Gautier, R. & Burel, G. Blind recovery of k/n rate convolutional encoders in a noisy environment.
J Wireless Com Network 2011, 168 (2011). https://doi.org/10.1186/1687-1499-2011-168