Blind identification of code word length for non-binary error-correcting codes in noisy transmission

In a cognitive radio context, the parameters of coding schemes are unknown at the receiver. The design of an intelligent receiver is then essential to blindly identify these parameters from the received data. The blind identification of code word length has already been extensively studied in the case of binary error-correcting codes. Here, we are interested in non-binary codes where a noisy transmission environment is considered. To deal with the blind identification problem of code word length, we propose a technique based on the Gauss-Jordan elimination in GF(q) (Galois field), with $q = 2^m$, where m is the number of bits per symbol. The elimination is applied to matrices built from the received symbols, and the technique relies on the information provided by the arithmetic mean of the number of zeros in each column of these matrices. The robustness of our technique is studied for different code parameters and over different Galois fields.


Introduction
Error-correcting codes are frequently used in modern digital transmission systems in order to improve the communication quality. These codes are designed to achieve a good immunity against channel impairments by introducing redundancy in the informative data. Due to the complexity of both encoding and especially decoding procedures, the majority of research and practical implementations of real-time embedded systems were often restricted to encoders manipulating binary data, i.e., elements of the Galois field GF(2). Over the last decade, low-density parity check (LDPC) codes and turbo codes over GF(2) have attracted considerable interest from many researchers due to their excellent error correction capability. They have been generalized to finite fields GF(q) [1,2], where $q = 2^m$, and are among the most widely used error-correcting codes in wireless communication standards. It has been shown in [1] that non-binary LDPC codes generally perform better than binary LDPC codes and turbo codes. However, the major drawback of these codes is their decoding complexity for a large Galois field order q [3,4]. Low complexity decoding algorithms have recently been proposed [5,6], thus allowing the use of non-binary LDPC codes in practical implementations. Our main research interests are focused on non-binary error-correcting codes in order to blindly identify their parameters. This topic is part of a non-cooperative context such as military interception or cognitive radio applications. In this case, the receiver has no knowledge of the parameters used to encode the information at the transmitter. The solution is to design an intelligent receiver which is able to blindly identify the encoder parameters from the sole knowledge of the received data stream. This blind identification function of the receiver makes it possible to increase the transmission data rate, since it becomes unnecessary to transmit supplementary information about the encoder parameters with the useful data.
Such an intelligent receiver is able to adapt itself automatically to the development of new high-performance coding schemes and the fast evolution of new communication standards without equipment change. In this work, we are only interested in blindly identifying the code word length of linear non-binary block codes. In the case of interception, this parameter cannot be transmitted. Likewise, if we want to change the encoder or go beyond a predefined list of possible encoders, the code word length is not transmitted.
In this context, the published research results have been restricted so far to the blind recognition of the code word length of binary codes. To the best of our knowledge, this paper introduces, for the first time, an approach to blindly identify the code word length of non-binary codes in noisy conditions. In this work, the aim is to blindly identify the code word length from the only knowledge of received data. The authors in [7] proposed a technique of identification of non-binary LDPC parameters, but the identification is not blind because it is based on using a predefined candidate set of encoders which is known by both the transmitter and the receiver. Furthermore, this technique only works with LDPC codes unlike our proposed technique, which is general and suitable for all block codes. In our paper, the proposed blind identification technique is based on a generalization of an existing method used for binary codes. The principle of this generalization will be explained in this paper without specifying in details its detection performances. So, we present here state-of-the-art techniques to identify the code word length of binary linear block codes. The idea of these techniques is to find a basis of a dual code composed of parity check relations. For this purpose, an approach based on finding code words of small Hamming weight [8,9] was improved by Valembois [10] by using statistical hypothesis tests and recently by Cluzeau [11,12] and Côte [13]. A second approach based on linear algebra theory was introduced in [14] for noiseless channel. This approach permits to recover the length of code words by studying behaviors of the rank of matrices composed of received bits. However, the rank criterion was exploited without providing an algebraic and theoretical justification of such behavior. In [15], the use of this criterion was justified. 
In [16], the rank criterion approach was generalized to convolutional codes over GF(q), where q > 2, assuming a noiseless transmission, but it was shown that this generalized technique can be also performed to non-binary linear block codes. In noisy transmissions, a technique based on the Gauss elimination in GF(2) was applied in [17][18][19] to matrices composed of noisy received bits in order to find the number of almost dependent columns permitting the identification of the code word length in the case of binary error-correcting codes. Indeed, an almost dependent column of a matrix composed of noisy received symbols corresponds to a column which may be a linear combination of some preceding columns without the presence of erroneous symbols and which leads to a column that contains more zero elements after the Gauss elimination.
Compared to previous works, we demonstrate here that it is possible to generalize the blind identification technique proposed in [17,18] to non-binary block codes provided that the Galois field parameters (the cardinality and the primitive polynomial) are known by the receiver. To identify the primitive polynomial, an identification algorithm was proposed in [20]. To achieve our purpose, it is necessary to identify the number of almost dependent columns in the matrices composed of noisy symbols of GF(q) by studying the probability of detection of these columns, denoted as $P_i$. In fact, the computation of $P_i$ is essential in order to determine an optimal detection threshold. Assuming a transmission over a q-ary symmetric channel with an error probability $p_e$, the techniques based on finding a basis of a dual code [18,19] for binary codes require the knowledge of $p_e$, where a hard decision demodulation is considered. For this reason, we propose here an approach which is more robust because it allows the blind identification of the code word length of non-binary and binary block codes without using the error probability $p_e$. This approach is based on analyzing the behavior of the arithmetic mean of the number of zeros in the columns of the matrices constructed by the Gauss elimination in GF(q). The proposed method is general and can be applied to all non-binary block codes, even though most examples of codes given here are non-binary LDPC codes. For this reason, the properties of LDPC codes are not exploited by our method.
This paper is organized as follows. In the 'Technical background' section, we present the encoding process of non-binary error-correcting codes. Then, the principle of the blind identification of code parameters in the noiseless case is described. The channel model used in this study is also defined and justified in this section. In the 'Blind identification of code word length in the noisy case' section, the blind identification method of the code word length in noisy environment is described. A comparison in terms of error probability and detection performances is shown in the 'Analysis and performances' section. Finally, some conclusions are drawn in the 'Conclusions' section and planned future work is pointed out.

Non-binary error-correcting codes
The use of an efficient coding scheme at the transmitter, such as an error-correcting code, is essential in order to combat the disturbances present on the transmission channel. For a long time, cyclic codes such as BCH codes [21,22] and Reed-Solomon codes [23] have been the most commonly used codes based on finite fields since they are characterized by large minimum distances for hard decision decoding. The non-binary LDPC codes described by a sparse parity check matrix with elements in GF(q) were developed by Davey and MacKay in 1998 [1]. Significant work on the design and the decoding complexity reduction of these codes has shown that they have a great potential to replace Reed-Solomon codes in some communication applications, such as space communications [24], and storage systems [25,26]. In this paper, we focus on the blind identification of code word length for non-binary block codes, but the proposed method can also be applied to convolutional codes and concatenated codes.
Let us present the encoding process of these codes over GF(q). The principle of a transmission chain is to send digital information from a source to one or more receivers. The information yielded by the source is binary data {0, 1} = GF(2). Each block of m information bits is combined to generate a symbol of GF(q). Then, the generated non-binary information, denoted as d, is encoded by one of the block codes over GF(q) listed above. For most block error-correcting codes, a code word, denoted as c, composed of non-binary symbols is obtained by the multiplication of the information d and a non-binary generator matrix G:
$$c = d \cdot G \quad (1)$$
In the case of LDPC codes, the encoding process needs the use of the parity check matrix, which is always sparse compared to the other codes.
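As an illustration of this encoding step, the following minimal Python sketch maps blocks of m bits to symbols of GF(2^m) and computes the product of d and G. The primitive polynomial x^2 + x + 1 for GF(4), the MSB-first bit mapping, the toy generator matrix, and all function names are assumptions chosen for the example; they are not taken from the paper.

```python
def gf_mul(a, b, m=2, poly=0b111):
    """Multiply a and b in GF(2^m), reducing modulo the primitive polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # addition in GF(2^m) is a bitwise XOR
        b >>= 1
        a <<= 1
        if a >> m:          # degree m reached: reduce by XOR-ing the polynomial
            a ^= poly
    return r


def bits_to_symbols(bits, m=2):
    """Group the bit stream into blocks of m bits (MSB first, an assumption)."""
    return [int("".join(map(str, bits[i:i + m])), 2)
            for i in range(0, len(bits) - m + 1, m)]


def encode(d, G, m=2, poly=0b111):
    """Code word c = d . G over GF(2^m); G has k rows and n columns."""
    n = len(G[0])
    c = [0] * n
    for i, di in enumerate(d):
        for j in range(n):
            c[j] ^= gf_mul(di, G[i][j], m, poly)
    return c


# Toy systematic (n = 3, k = 2) code over GF(4): the third symbol is a parity symbol.
G = [[1, 0, 1],
     [0, 1, 1]]
d = bits_to_symbols([1, 0, 1, 1])   # two symbols of GF(4): 2 and 3
c = encode(d, G)                    # [2, 3, 1], since 2 XOR 3 = 1
```

The systematic form of G makes the first k symbols of c equal to d, which matches the systematic encoding used in most standards.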
In most of the standards, such as long-term evolution (LTE) standard [27], the encoding is performed in a systematic form in order to facilitate the decoding process without degrading performances of the error correction. For this reason, in the case of block codes, the required parameters to perform the decoding operation are the number of inputs, denoted as k, the code word length, denoted as n, and a parity check matrix, denoted as H. Indeed, the matrix H will be used by the decoder to detect or/and to correct the errors. The recovered information will be the first k symbols of the recovered code word due to the systematic form used in the encoding. Our aim in this research work is to blindly identify the parameter n from non-binary received symbols which are affected by noisy transmissions. In the noiseless context, we have already demonstrated in [16] that we can identify this parameter with the only knowledge of the received data, provided that the Galois field parameters are known. The principle of blind identification of the code parameter n in the noiseless case is recalled in the following subsection.

Principle of blind identification method of code word length in the noiseless case
In this part, we assume that the channel introduces no error. In [16], we have adapted the method proposed in [28] to identify the parameters of convolutional codes over GF(q), where $q = 2^m$. We have shown that our method for the noiseless case can be applied to block codes. This method reshapes row-wise the received symbols, denoted as r, under a matrix form, denoted as $R_l$, of size (M × l). Indeed, $R_l$ is filled by received symbols from the top left corner to the bottom right as illustrated in Figure 1.
The number of columns l varies between 1 and $l_{max}$, and the number of rows M, which depends on l, is given by the integer part $M = \lfloor L/l \rfloor$, where L is the length of the received symbol stream. Then, the rank over GF(q) is calculated for each matrix $R_l$. When all matrices $R_l$ have full rank, it is impossible to detect the existence of a code. Nevertheless, the redundancy introduced by the code leads to rank deficiencies in some matrices $R_l$. Henceforth, the rank behaviors of $R_l$ allow us to detect the code and to identify its parameters, in particular the code word length. As demonstrated in [15] and studied in [16], there are two possible rank behaviors according to the number of columns l. If l is a multiple of n (i.e., $l = \alpha \cdot n$, $\alpha \in \mathbb{N}$), the ranks of the matrices $R_l$ are proportional to the code rate k/n (i.e., $\mathrm{rank}(R_l) = l \cdot k/n$). Otherwise (i.e., $l \neq \alpha \cdot n$), $R_l$ has full rank (i.e., $\mathrm{rank}(R_l) = l$). Thus, the value of the rank deficiency depends on the code parameters (k and n). Indeed, only two consecutive rank deficiencies are necessary to determine all code parameters. The code word length n can be determined by the difference between two values of l corresponding to two consecutive rank deficiencies of $R_l$. As shown in [16], the rank method gives good results in a noiseless environment. A theoretical and algebraic study of the behavior of the rank criterion, as well as particular cases which can occur for specific parameters of codes, were presented in [15]. It was demonstrated that most matrices $R_l$ have full rank when l is not a multiple of n, except for some particular cases which depend on the code (generator matrix). In a noisy environment, the rank method cannot be used, since all the matrices $R_l$ have full rank in this case.
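The reshaping step and the use of two consecutive rank deficiencies can be sketched as follows in Python. The helper names are hypothetical, the ranks are assumed to have been computed over GF(q) by some elimination routine not shown here, and the example rank values are the ideal noiseless ones for a toy (n = 3, k = 2) code.

```python
def reshape(r, l):
    """Fill R_l row-wise with the received symbols; M = floor(L / l) rows."""
    M = len(r) // l
    return [r[j * l:(j + 1) * l] for j in range(M)]


def params_from_ranks(ranks):
    """ranks maps l -> rank(R_l) over GF(q).

    Returns (n, k) deduced from the first two rank-deficient sizes,
    or None when fewer than two deficiencies are observed.
    """
    deficient = sorted(l for l, rk in ranks.items() if rk < l)
    if len(deficient) < 2:
        return None
    l1, l2 = deficient[0], deficient[1]
    n = l2 - l1                      # gap between consecutive deficiencies
    k = ranks[l1] * n // l1          # from rank(R_l) = l * k / n at l = alpha * n
    return n, k


# Ideal noiseless ranks for a toy (n = 3, k = 2) code, l = 1..7 (illustrative values).
ranks = {1: 1, 2: 2, 3: 2, 4: 4, 5: 5, 6: 4, 7: 7}
n_k = params_from_ranks(ranks)       # (3, 2)
R_3 = reshape(list(range(7)), 3)     # [[0, 1, 2], [3, 4, 5]]; the 7th symbol is dropped
```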

Non-binary channel
In order to evaluate our blind identification algorithm, we assume that the encoded sequences are transmitted through a q-ary (non-binary, for $q = 2^m > 2$) symmetric channel (QSC), which is the simplest channel. However, our proposed algorithms can work for every type of channel provided that the error probability $p_e$ computed at the output of the demodulator is known. Indeed, we consider that the blocks of the transmission chain, the modulator, the transmission channel, and the demodulator can be modeled by a non-binary channel, where a hard decision demodulation is considered. In a cognitive radio context, a multipath fading channel is used. This realistic channel leads to burst errors which can be corrected by using an interleaver and error-correcting codes. In this context, the errors at the output of a deinterleaver at the receiver side can be modeled by a QSC when a decoding process with hard decision is used. The problem of a blind identification of the interleaver period, as well as a blind synchronization with the interleaver blocks, was handled in [14,18].
Let us define the q-ary symmetric channel, which is the generalization of the binary symmetric channel (BSC). It is a discrete memoryless channel with an error probability $p_e$ whose non-binary inputs and outputs belong to GF(q), where $q = 2^m$. The symbols at the input of the channel are independent and uniformly distributed with a probability equal to 1/q. A symbol $\delta \in$ GF(q) at the channel input is received as a given different symbol $\beta \neq \delta$ of GF(q) with a probability $p_e/(q - 1)$ [29], so that the total probability of an erroneous reception is $p_e$. The probability of correctly receiving a symbol is equal to $1 - p_e$. The QSC is characterized by the conditional probabilities:
$$\Pr(\tilde{r}_i = \beta \mid r_i = \delta) = \begin{cases} 1 - p_e & \text{if } \beta = \delta \\ \dfrac{p_e}{q - 1} & \text{if } \beta \neq \delta \end{cases} \quad (2)$$
where the transmitted symbol is denoted $r_i$, i.e., $r_i = c_i$, for $i \in \{1, \cdots, L\}$, and the noisy received symbol is denoted $\tilde{r}_i$ such that $\tilde{r}_i = r_i + e_i$, with $e_i$ the transmission error introduced in the symbol $r_i$. An example of a non-binary symmetric channel for $q = 2^2$ is depicted in Figure 2.
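A q-ary symmetric channel of this kind can be simulated with a few lines of Python. This is a minimal sketch, assuming symbols of GF(2^m) are stored as integers in [0, q) so that adding an error symbol is a bitwise XOR; the function name `qsc` and the use of Python's `random` module are choices made for the example.

```python
import random


def qsc(symbols, q, p_e, rng=None):
    """Pass symbols of GF(q), q = 2^m, through a q-ary symmetric channel.

    Each symbol is left unchanged with probability 1 - p_e; otherwise a
    non-zero error e, uniform over the q - 1 possibilities, is added (XOR),
    so that each wrong symbol is received with probability p_e / (q - 1).
    """
    rng = rng or random.Random()
    out = []
    for s in symbols:
        if rng.random() < p_e:
            e = rng.randrange(1, q)   # non-zero error symbol
            out.append(s ^ e)         # r~_i = r_i + e_i in GF(2^m)
        else:
            out.append(s)
    return out


sent = [0, 1, 2, 3] * 50
received = qsc(sent, q=4, p_e=0.05, rng=random.Random(2015))
```

Passing a seeded `random.Random` instance keeps a simulation reproducible across Monte Carlo trials.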
In the following section, we present the blind identification method of the parameter n in a noisy framework.

Blind identification of code word length in the noisy case
In this part, we present the implementation method which allows us to identify the code word length of a non-binary code in a noisy environment. This method is based on the concept of finding the rank-deficient matrices among $R_l$, $\forall l \in \{1, \ldots, l_{max}\}$, corresponding to matrices having at least one almost dependent column. Indeed, the matrices $\tilde{R}_l$ are reshaped in the same way as $R_l$ using the noisy received symbols $\tilde{r}_i$. In [19], a method devoted to determining these matrices in the case of binary codes was presented. However, this method requires the knowledge of the error probability $p_e$. In order to avoid this constraint, we propose a method based on the arithmetic mean criterion in order to detect the rank-deficient matrices which have some almost dependent columns without the need for the error probability $p_e$.

Principle
In a noiseless case, the rank criterion is used to find the maximum number of linearly independent columns in the matrices $R_l$. This allows us to derive the number of linearly dependent columns in $R_l$ (columns which are linear combinations of other columns). The finite-field Gauss elimination method [30] has to be used to reduce those linearly dependent columns to zero. In noisy transmissions, all matrices $\tilde{R}_l$ have full rank. A matrix $\tilde{R}_l$ can be expressed according to $R_l$ by:
$$\tilde{R}_l = R_l + E_l \quad (3)$$
where $E_l$ is the error matrix of size (M × l) constructed in the same way as $R_l$ using the errors induced by the channel. Therefore, the dependence of the columns is disturbed by the presence of errors in some received symbols.
In such a context, the authors in [17,18] proposed to look for the number of almost dependent columns in the matrices composed of noisy received bits by using the Gauss elimination over GF(2). Inspired by this idea, it is sufficient, in the case of non-binary error-correcting codes, to apply the finite-field Gauss elimination in GF(q) to $\tilde{R}_l$ in order to obtain a new matrix $\tilde{T}_l$ of size (M × l). This algorithm also outputs a matrix of size (l × l), denoted $\tilde{A}_l$, that describes the combination operations performed on the columns of the matrix $\tilde{R}_l$ in order to obtain the transformed matrix $\tilde{T}_l$. A recall of the finite-field Gauss elimination over GF(q) is presented in Algorithm 1. To describe this algorithm, we denote $I_l$ the identity matrix of size (l × l), $x_i^{(l)}$ the i-th column of a given matrix $X_l$, and $x_i^{(l)}(j)$ the coefficient of a matrix $X_l$ placed in the i-th column and in the j-th row.

Algorithm 1
The finite-field Gauss elimination over GF(q).
Require: $\tilde{R}_l$
Ensure: $\tilde{T}_l$ and $\tilde{A}_l$
Initialization:
Apply the following operations to the columns of the matrix $\tilde{A}_l$:
end for
end for
By means of this algorithm, the linearly dependent columns in the matrix are reduced to zero columns. The whole matrix is considered in our proposed method instead of only the lower part of the matrix $\tilde{R}_l$ as mentioned in [17]. This is more accurate than assuming that errors do not occur in the upper part of the matrix, an assumption which does not hold in practice.
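Since the column operations of this elimination drive the rest of the method, a minimal Python sketch of a column-wise Gauss elimination over GF(2^m) may help. It is an illustration under stated assumptions rather than the authors' exact procedure: pivots are searched in the first l rows, only the columns to the right of each pivot are eliminated, GF(4) is built with the primitive polynomial x^2 + x + 1, and the function names are hypothetical.

```python
def gf_mul(a, b, m=2, poly=0b111):
    """Multiply a and b in GF(2^m), reducing modulo the primitive polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:
            a ^= poly
    return r


def gf_inv(a, m=2, poly=0b111):
    """Inverse of a non-zero element, found by exhaustive search (small fields only)."""
    return next(x for x in range(1, 1 << m) if gf_mul(a, x, m, poly) == 1)


def column_gauss(R, m=2, poly=0b111):
    """Return (T, A) such that R . A = T over GF(2^m)."""
    M, l = len(R), len(R[0])
    T = [row[:] for row in R]
    A = [[int(i == j) for j in range(l)] for i in range(l)]   # identity I_l
    p = 0                                    # index of the next pivot column
    for j in range(min(M, l)):               # pivot rows: the first l rows
        c = next((c for c in range(p, l) if T[j][c]), None)
        if c is None:
            continue                         # no pivot available in this row
        for X in (T, A):                     # swap columns p and c in T and A
            for row in X:
                row[p], row[c] = row[c], row[p]
        inv = gf_inv(T[j][p], m, poly)
        for k in range(p + 1, l):            # cancel row j in the columns to the right
            f = gf_mul(T[j][k], inv, m, poly)
            if f:
                for X in (T, A):             # same column operation on T and A
                    for row in X:
                        row[k] ^= gf_mul(f, row[p], m, poly)
        p += 1
    return T, A


def zeros_per_column(T):
    """Number of zeros in each column of T."""
    return [sum(row[c] == 0 for row in T) for c in range(len(T[0]))]


# Four noiseless code words of a toy code over GF(4) with c3 = c1 + c2.
R = [[1, 0, 1], [0, 1, 1], [2, 3, 1], [3, 3, 0]]
T, A = column_gauss(R)
B = zeros_per_column(T)      # [1, 1, 4]: the third column is entirely zero
```

In this noiseless toy case, the third column of A is (1, 1, 1), which is the parity check relation c1 + c2 + c3 = 0 of the code.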
We can note that the finite-field Gauss elimination over GF(q) can be defined by a linear application given by:
$$\tilde{R}_l \cdot \tilde{A}_l = \tilde{T}_l \quad (4)$$
In noiseless transmissions, the number of dependent columns in $R_l$, for $l = \alpha \cdot n$, $\alpha \in \mathbb{N}$, corresponds to the number of zero columns in the matrix $T_l$, which is the result of the transformation of $R_l$ by the finite-field Gauss elimination in GF(q) ($R_l \cdot A_l = T_l$). The matrix form of $T_l$ is described in Figure 3.
In fact, identifying the dimension of the vector space generated by a code $C$ is equivalent to finding the dimension of the vector space generated by its dual code $C^{\perp}$. For any vector h belonging to $C^{\perp}$ and for any code word r of $C$, the relation between both is defined by $r \cdot h^T = 0$. In noiseless conditions, the matrix $R_n$, for l = n, which is composed of M code words of length n, should satisfy:
$$R_n \cdot h^T = 0 \quad (5)$$
We can note that h belongs to the kernel of $R_n$, denoted as $\ker(R_n)$. So, we have $C^{\perp} \subset \ker(R_n)$. Since the dependent columns in $R_l$ multiplied by the columns $a_i^{(l)}$ yield the zero columns in the matrix $T_l$, the corresponding columns $a_i^{(n)}$ belong to $\ker(R_n)$, in which the dual code $C^{\perp}$ is contained. Therefore, finding the dependent columns in $R_l$ is equivalent to finding the columns $a_i^{(l)}$ which belong to the dual code $C^{\perp}$.
Due to the presence of errors induced by the channel in $\tilde{R}_l$, for $l = \alpha \cdot n$, the columns of $\tilde{T}_l$ corresponding to the almost dependent columns in $\tilde{R}_l$ will contain some non-zero symbols. Assuming that the first l rows and the pivots of the matrix $\tilde{T}_l$ do not contain transmission errors, using (3) and (4) allows us to write the matrix $\tilde{T}_l$ as:
$$\tilde{T}_l = (R_l + E_l) \cdot \tilde{A}_l = R_l \cdot \tilde{A}_l + E_l \cdot \tilde{A}_l \quad (6)$$
In this case, a vector h is a parity check relation (i.e., $h \in C^{\perp}$) with high probability if the relation $\tilde{R}_l \cdot h^T$ has a low Hamming weight [11]. However, the opposite is not necessarily true. We can conclude that $\tilde{a}_i$ has a small Hamming weight. In GF(q), the Hamming weight of a vector is the number of non-zero elements in this vector. So, our aim is to determine the columns $\tilde{t}_i^{(l)}$ which have a high number of zeros. The idea is to study the number of zeros in the columns of $\tilde{T}_l$ in order to detect the almost dependent columns in $\tilde{R}_l$.

Behaviors of the number of zeros in the columns of $\tilde{T}_l$
Let $B_l(i)$ be the number of zeros in the i-th column of $\tilde{T}_l$, $\tilde{t}_i^{(l)}$. The variable $B_l(i)$ has two behaviors depending on whether the column $\tilde{a}_i^{(l)}$ belongs to the dual code $C^{\perp}$ or not. This variable will be studied as a function of $a_i^{(l)}$ assuming that the bits that represent an element of GF(q), where $q = 2^m$, are uniformly distributed and independent from each other.
• If the column $\tilde{a}_i^{(l)}$ does not belong to the dual code $C^{\perp}$, the variable $B_l(i)$, for all $i \in [\![1, l]\!]$, will follow a binomial distribution of parameters M and 1/q with a mean equal to M/q, denoted as $\mathcal{B}(M, 1/q)$.
• If the column $\tilde{a}_i^{(l)}$ belongs to the dual code $C^{\perp}$, the variable $B_l(i)$ will follow a binomial distribution with parameters M and $P_i$, denoted as $\mathcal{B}(M, P_i)$. The parameter $P_i$ corresponds to the probability that a coefficient $\tilde{t}_i^{(l)}(j)$ of the column $\tilde{t}_i^{(l)}$ is equal to 0, i.e., $P_i = \Pr(\tilde{t}_i^{(l)}(j) = 0)$.
It is possible to separate the two behaviors of the variable $B_l(i)$ by computing an optimal threshold $\tilde{\eta}_{opt}$ such that the column $\tilde{a}_i^{(l)}$ is declared to belong to $C^{\perp}$ when $B_l(i)$ exceeds $\tilde{\eta}_{opt}$, and not to belong to it otherwise, where $\tilde{\eta}_{opt} = \frac{M}{q} \cdot \eta_{opt}$ is a real in the interval [0, M]. The optimal threshold $\eta_{opt}$ minimizes the probability of wrong detection of a column $\tilde{a}_i^{(l)} \in C^{\perp}$, denoted as $P_{wd}$, which corresponds to the sum of the false alarm probability, denoted as $P_{fa}$, and the probability of not detecting a theoretical dependent column, denoted as $P_{nd}$. The optimal threshold is determined by:
$$\eta_{opt} = \arg\min_{\eta} (P_{wd}) = \arg\min_{\eta} (P_{nd} + P_{fa})$$
The normal distribution can be used to approximate the binomial probabilities of $B_l(i)$ when M is large:
$$B_l(i) \sim \begin{cases} \mathcal{N}(\mu_0, \sigma_0^2) & \text{if } \tilde{a}_i^{(l)} \in C^{\perp} \\ \mathcal{N}(\mu_1, \sigma_1^2) & \text{otherwise} \end{cases}$$
where $\mathcal{N}(\mu_0, \sigma_0^2)$ is the normal distribution of parameters $\mu_0 = M \cdot P_i$ and $\sigma_0^2 = M \cdot P_i \cdot (1 - P_i)$, and $\mathcal{N}(\mu_1, \sigma_1^2)$ corresponds to the normal distribution of parameters $\mu_1 = M/q$ and $\sigma_1^2 = M \cdot (q - 1)/q^2$. Henceforth, the optimal value of the threshold $\tilde{\eta}$ minimizing the probability of wrong detection $P_{wd}$ can be computed by:
$$\tilde{\eta}_{opt} = \arg\min_{\tilde{\eta}} \left[ \phi\!\left(\frac{\tilde{\eta} - \mu_0}{\sigma_0}\right) + 1 - \phi\!\left(\frac{\tilde{\eta} - \mu_1}{\sigma_1}\right) \right]$$
where $\phi(x)$ is the cumulative distribution function of the standard normal distribution:
$$\phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$$
We can note that the optimal threshold $\tilde{\eta}_{opt}$ depends on the parameters M, q, and $P_i$. So, in order to delimit the two behaviors of the variable $B_l(i)$, it is necessary to compute the probability $P_i$.
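The trade-off between missed detections and false alarms can be explored numerically. The Python sketch below evaluates the wrong detection probability under the normal approximation, with the means and variances given above, and finds the minimizing threshold by a simple grid search over integer values. The grid search, the function names, and the numerical values of M, q, and P_i are assumptions made for the illustration; they are not the closed-form computation of the paper.

```python
import math


def phi(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_wd(eta, M, q, p_i):
    """Wrong detection probability P_nd + P_fa for a threshold eta in [0, M]."""
    mu0, s0 = M * p_i, math.sqrt(M * p_i * (1.0 - p_i))   # column in the dual code
    mu1, s1 = M / q, math.sqrt(M * (q - 1)) / q           # column not in the dual code
    p_nd = phi((eta - mu0) / s0)          # dependent column falls below the threshold
    p_fa = 1.0 - phi((eta - mu1) / s1)    # independent column exceeds the threshold
    return p_nd + p_fa


def optimal_threshold(M, q, p_i):
    """Integer threshold in [0, M] minimizing P_wd (grid search, an assumption)."""
    return min(range(M + 1), key=lambda eta: p_wd(eta, M, q, p_i))


# Illustrative values only: M = 2000 rows, GF(8), and P_i = 0.2.
eta_opt = optimal_threshold(2000, 8, 0.2)
p_min = p_wd(eta_opt, 2000, 8, 0.2)
```

The minimizing threshold lies between the two means M/q and M · P_i, and the minimum grows as P_i approaches 1/q, which is what happens when the error probability increases.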

Computation of the probability P i
In the case of binary codes, the probability $P_i$ has been calculated in [11]. However, it has never been studied in the general case of codes over GF($2^m$). In fact, the computation of the parameter $P_i$ is essential in order to detect the almost dependent columns in $\tilde{R}_l$ by delimiting the two behaviors of the variable $B_l(i)$. Our aim is to investigate this probability in the case of non-binary codes. In the following, the theoretical study of $P_i$ is presented.
For l = n and i a position of a columnã i can be obtained, using (6), by: where t (l) i (j) = 0 in the case of noiseless transmissions as explained previously. Indeed, the sum n k=1 a (l) k (j) = 0, ∀k ∈ {1, · · · , n}, and ∀j ∈ {1, · · · , M}. However, in the case of noisy transmissions, the coefficients e where X is a random variable of the erroneous positions number among N i (l). Indeed, we show in Appendix that the probability P i of havingt (l) i (j) = 0 can be determined by: In the case of GF(2) (i.e., q = 2), this probability can be written as: This expression corresponds to that used in [11].
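To make the dependence of P_i on the error probability and on the Hamming weight w of the column concrete, the following Python sketch compares two computations of the probability that a sum of w independent QSC errors is zero in GF(2^m). The closed form used here, 1/q + ((q − 1)/q) · (1 − q · p_e/(q − 1))^w, is a standard result for uniform symmetric errors and is stated as an assumption of this sketch, not as a quotation of the Appendix; for q = 2 it reduces to (1 + (1 − 2 p_e)^w)/2. Multiplying an error by a non-zero coefficient only permutes the non-zero symbols, so the coefficients of the column do not change the error distribution.

```python
def prob_zero_closed(q, p_e, w):
    """Assumed closed form for Pr(sum of w independent QSC errors = 0)."""
    return 1.0 / q + (q - 1.0) / q * (1.0 - q * p_e / (q - 1.0)) ** w


def prob_zero_exact(q, p_e, w):
    """Same probability by explicit convolution; addition in GF(2^m) is XOR."""
    one = [1.0 - p_e] + [p_e / (q - 1)] * (q - 1)   # distribution of one error
    dist = [1.0] + [0.0] * (q - 1)                  # an empty sum is zero
    for _ in range(w):
        new = [0.0] * q
        for s, ps in enumerate(dist):
            for e, pe_sym in enumerate(one):
                new[s ^ e] += ps * pe_sym
        dist = new
    return dist[0]


# Illustrative values: GF(8), p_e = 0.01, weight w = 20.
p_closed = prob_zero_closed(8, 0.01, 20)
p_exact = prob_zero_exact(8, 0.01, 20)
```

Both values agree up to rounding, and both tend to 1/q as w or p_e grows, which is why columns of large weight become hard to distinguish from independent ones.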
In Figure 4, we represent the wrong detection probability $P_{wd}$ as a function of $\tilde{\eta}/M$ and $p_e$ assuming $q = 2^3$, $w(\tilde{a}_i^{(l)}) = 20$, and M = 2,000. For each value of $p_e$, the optimal threshold $\tilde{\eta}_{opt}$ corresponding to a root of (11) is computed. From Figure 4, we can deduce that the threshold interval satisfying $P_{wd} \approx 0$ shrinks when the value of $p_e$ increases.
We can conclude that studying the behaviors of $B_l(i)$ in order to identify n is based on the calculation of the optimal threshold $\tilde{\eta}_{opt}$. However, this threshold depends on the value of the error probability $p_e$, which is unknown to the receiver. So, the need to estimate this parameter is a blocking step in the almost dependent columns method and also leads to a lack of robustness.
In order to address these problems, we propose a new iterative method based on the arithmetic mean of the variable $B_l(i)$, which does not depend on $p_e$ and where the iterative process improves the detection probability.

New iterative method based on the arithmetic mean of the variable $B_l(i)$
In this part, the proposed method based on the arithmetic mean of the number of zeros in the columns of the matrix $\tilde{T}_l$ is described. We recall that the Gauss elimination described in Algorithm 1 should be applied in order to obtain $\tilde{T}_l$. We show here that the identification of the parameter n by our proposed method does not depend on the error probability $p_e$. In this method, in order to improve the detection probability of n, an iterative process is introduced. We adopt the iterative process proposed in [18,19]. Its principle is to perform random permutations of the rows of the matrix $\tilde{R}_l$ in order to obtain a new virtual realization of the received data. These permutations increase the probability of obtaining non-erroneous pivots during the Gauss elimination. The arithmetic mean of the variables $B_l(i)$, $\forall i \in [\![1, l]\!]$, denoted $E_l$, is defined by:
$$E_l = \frac{1}{l} \sum_{i=1}^{l} B_l(i)$$
Property 1. If $X_1, X_2, \cdots, X_m$ are independent random variables following normal distributions, then their arithmetic mean follows a normal distribution whose mean is the arithmetic mean of the individual means and whose variance is the sum of the individual variances divided by $m^2$.
We recall that the variable $B_l(i)$, which is the number of zeros in the i-th column of the matrix $\tilde{T}_l$, has two possible behaviors depending on l:
• If $l \neq \alpha \cdot n$, for $\alpha \in \mathbb{N}$, the variable $B_l(i)$ follows a normal distribution $\mathcal{N}(\mu_1, \sigma_1^2)$ for all columns i of $\tilde{T}_l$. In this case, using Property 1, the mean $E_l$ will follow $\mathcal{N}(\mu_1, \sigma_1^2 / l)$. We can note that the mean $E_l$ will be close to M/q.
• If $l = \alpha \cdot n$, for $\alpha \in \mathbb{N}$:
- If the i-th column is an almost dependent column, the variable $B_l(i)$ will follow the normal distribution $\mathcal{N}(\mu_0, \sigma_0^2)$.
- If the i-th column is not an almost dependent column, the variable $B_l(i)$ will follow the normal distribution $\mathcal{N}(\mu_1, \sigma_1^2)$.
Thereby, using Property 1, the mean $E_l$ follows a normal distribution centered on:
$$\frac{1}{l} \left( \sum_{i \,:\, \tilde{a}_i^{(l)} \in C^{\perp}} M \cdot P_i + k_l \cdot \frac{M}{q} \right)$$
where the sum contains Q(l) terms, Q(l) being the number of almost dependent columns in the matrix $\tilde{R}_l$ such that:
$$Q(l) = \mathrm{Card}\left( \{ i \in [\![1, l]\!] : \tilde{a}_i^{(l)} \in C^{\perp} \} \right)$$
where Card(x) is the cardinal function which returns the set size, and $k_l = l - Q(l)$ is the number of independent columns in the same matrix. In the noiseless environment, the mean $E_l$ is stable at:
$$E_l = \frac{1}{l} \left( Q(l) \cdot M + k_l \cdot \frac{M}{q} \right) \quad (23)$$
We note two behaviors of $E_l$ with respect to $l = \alpha \cdot n$ or $l \neq \alpha \cdot n$:
$$E_l - \frac{M}{q} > 0 \ \text{ if } l = \alpha \cdot n, \qquad E_l - \frac{M}{q} \approx 0 \ \text{ if } l \neq \alpha \cdot n$$
The gap between these behaviors allows us to find the matrices whose number of columns is $l = \alpha \cdot n$.
Let $J$ be the set of l-values for which the gap $E_l - \frac{M}{q} > 0$:
$$J = \left\{ l \in \{1, \cdots, l_{max}\} : E_l - \frac{M}{q} > 0 \right\} \quad (25)$$
Thereby, the identified length of the code words will be such that:
$$\tilde{n} = \mathrm{mode}(\mathrm{diff}(J)) \quad (26)$$
where the functions diff(x) and mode(x) are defined by:
• Function diff(x): for a vector x of size s, the output of this function is a vector of size s − 1 which corresponds to the differences between consecutive elements of x.
• Function mode(x): this operation provides the value which has the highest occurrence in the vector x.
The proposed iterative method of code word length identification is summarized in Algorithm 2.
As an example, consider the Reed-Solomon code RS(15,11) over GF($2^4$), which is defined by n = 15 and k = 11. The mean $E_l$ normalized by M, which is set to 1,000, is represented in Figures 5 and 6. In Figure 5, a zero probability of error (i.e., $p_e = 0$) is considered. For $l \neq \alpha \cdot n$, we can verify that the mean $E_l$ normalized by M is stable at 1/q = 0.0625. For $l = \alpha \cdot n$, the mean $E_l$ meets (23): a fraction (n − k)/n = 4/15 of the columns is entirely zero, which gives $\frac{E_l}{M} - \frac{1}{q} = \frac{4}{15} \cdot \left(1 - \frac{1}{16}\right) = 0.25$. So, the matrices of size $l = \alpha \cdot n$ exhibit peaks with $\frac{E_l}{M} - \frac{1}{q} = 0.25 > 0$. In Figure 6, the gap $\frac{E_l}{M} - \frac{1}{q}$ is represented with respect to l when $p_e = 0.01$ for one iteration of our algorithm. According to (25), the set $J$ is shown in Table 1. Henceforth, using (26), the identified length of the code words is $\tilde{n} = 15$.
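The final decision step, from the means E_l to the identified length, can be sketched in Python as follows. The function names are hypothetical, the input values are the ideal noiseless ones for RS(15,11) over GF(16) with M = 1,000 (not simulation results), and the gap formula (1 − k/n)(1 − 1/q) assumes that a fraction (n − k)/n of the columns is entirely zero when l is a multiple of n.

```python
from collections import Counter


def identify_length(E, M, q):
    """E maps l -> arithmetic mean E_l. Returns mode(diff(J)), or None."""
    J = sorted(l for l, e in E.items() if e - M / q > 0)    # positive gap
    diffs = [b - a for a, b in zip(J, J[1:])]               # diff(J)
    if not diffs:
        return None
    return Counter(diffs).most_common(1)[0][0]              # mode


def noiseless_gap(n, k, q):
    """Ideal value of E_l / M - 1 / q when l is a multiple of n and p_e = 0."""
    return (1.0 - k / n) * (1.0 - 1.0 / q)


# Ideal noiseless means for RS(15,11) over GF(16), M = 1000, l = 2..50.
M, q = 1000, 16
E = {l: (312.5 if l % 15 == 0 else 62.5) for l in range(2, 51)}
n_hat = identify_length(E, M, q)      # J = [15, 30, 45], diff = [15, 15]
gap = noiseless_gap(15, 11, 16)       # (4/15) * (15/16)
```

With noisy data the gap at non-multiples of n fluctuates around zero, so using the mode of the differences rather than the first element of J limits the impact of isolated false detections.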

Analysis and performances
The aim of our proposed algorithm is to blindly identify the length of non-binary code words in a noisy environment. This can be achieved with an average complexity of $O(M \cdot l_{max}^3 \cdot it_{max})$. Indeed, the proposed algorithm performs $(l_{max} - 1) \cdot it_{max}$ Gaussian eliminations, each of which has an average complexity of $O(M \cdot l^2)$, where $l = 2, \cdots, l_{max}$. So, the average complexity is such that:
$$it_{max} \cdot \sum_{l=2}^{l_{max}} O(M \cdot l^2) = O(M \cdot l_{max}^3 \cdot it_{max})$$
Table 1. The sizes of matrices for which $\frac{E_l}{M} - \frac{1}{q} > 0$, the set $J$, and the set diff($J$) are given for $p_e = 0.01$ in the case of RS(15,11) over GF($2^4$).
In order to analyze the performances of our blind identification method, the probability of correct detection of the code word length n is chosen as a performance criterion. In the simulations, our method is applied to non-binary LDPC codes, which have become candidates for future communication systems. For each simulation, 2,000 Monte Carlo trials are run where the data symbols are randomly chosen at each trial. In this part, we focus on:
• the gain of the iteration process on the detection probability of n
• the performance comparison in the case of different channels
• the impact of increasing the Galois field dimension q on the detection probabilities of n
• the impact of increasing the code word length n on the detection probabilities for a given q

Gain of the iterative process
In our simulations, we consider an LDPC (n = 6, k = 3) over GF(4). Figure 7 shows the probability of detecting n according to $p_e$ for one, three, five, and ten iterations. We can see that the gain between the first and the tenth iteration is significant. Indeed, for $p_e = 0.07$, the detection probability is equal to 0.76 with one iteration and reaches 0.99 after 10 iterations. We can deduce that the iterative process significantly improves the detection performances of the blind identification method based on the mean calculation.

Performance comparison in the case of different channels
Let us illustrate the detection obtained by the proposed method for an LDPC (n = 16, k = 8) over GF(8) when an AWGN channel (the first channel) and a multipath Rayleigh channel associated with an AWGN channel (the second channel) are considered. In order to compensate and reduce the inter-symbol interference (ISI) caused by the multipath propagation, a linear mean square error (MSE) equalizer of length 20 was used.
Figure 7. For an LDPC (n = 6, k = 3) over GF(4), the probability of detecting n is depicted versus the error probability $p_e$ for one, three, five, and ten iterations.
We evaluate the performances of our method when a QAM or PAM modulation of order 8 (8-QAM and 8-PAM) is used to transmit the symbols coded by the LDPC (n = 16, k = 8) over GF(8). In Figures 8 and 9, a comparison of the performances of our blind identification method using 8-PAM or 8-QAM modulations in the case of an AWGN channel and a multipath channel with path number $L_{path} = 4$ and $it_{max} = 1$ is presented. In Figure 8, a comparison of the detection performances of our method in the case of the AWGN channel is depicted. We can see that the proposed method gives better performances for 8-QAM modulation than for 8-PAM modulation when SNR < 18 dB. The gain between both is equal to 5 dB. However, for SNR > 18 dB, the performances are similar and the detection probability is equal to 1. To obtain the detection probabilities presented in Figure 9, the symbols modulated by 8-PAM or 8-QAM are transmitted through a quasi-static Rayleigh fading multipath channel with path number $L_{path} = 4$, then the received symbols are processed by the linear MSE equalizer of length 20. We can observe that, in the case of 8-QAM, our proposed method provides better performances than for 8-PAM. A gain equal to 5 dB is exhibited. We have chosen to include the worst case of 8-PAM modulation because our aim was to show that our method still performs well even with PAM modulation.
In the following, the performance study of the impact of n and q on the proposed method is presented.

Impact of increasing q
Let us consider an LDPC code (n = 6, k = 3) constructed in the Galois field GF(q), where q = 4, 8, 16. The matrices R_l are reshaped from L = 30,000 received symbols with l = 2, ..., 30 and M = 1,000. For each value of q, the method based on the mean calculation is applied to blindly identify the code word length of the LDPC code (n = 6, k = 3) over GF(q) when it_max = 1. Figure 10 depicts the probability of detecting the correct n by our blind identification method as a function of the error probability p_e for GF(4), GF(8), and GF(16). The curves are nearly identical for q = 4, 8, 16. We can deduce that the method based on the mean calculation is only slightly sensitive to an increase of the Galois field order q.
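The reshaping step used in these simulations can be sketched as follows. This is a minimal illustration and not the authors' code: the function name reshape_stream is hypothetical, the stream is a placeholder rather than coded data, and filling the matrix row by row with l consecutive symbols per row is an assumption.

```python
def reshape_stream(symbols, l, M):
    """Arrange the first M*l symbols row by row into an M x l matrix (list of rows).

    Row-wise filling is an assumption made for this sketch.
    """
    if len(symbols) < M * l:
        raise ValueError("need at least M * l received symbols")
    return [list(symbols[r * l:(r + 1) * l]) for r in range(M)]


# Placeholder stream: the integers 0..29999 stand in for L = 30,000 received symbols.
stream = list(range(30_000))

# One candidate matrix per tested length l = 2, ..., 30, each with M = 1,000 rows.
matrices = {l: reshape_stream(stream, l, 1_000) for l in range(2, 31)}
print(len(matrices[30]), len(matrices[30][0]))  # prints: 1000 30
```

With L = 30,000 and M = 1,000, the largest testable length is l = 30, since M · l must not exceed L.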

Impact of increasing n
To evaluate the detection performance of our blind identification method, the impact of increasing the code word length should be studied. In our simulations, we consider two LDPC codes over GF(8): an LDPC code (n = 6, k = 3) and an LDPC code (n = 16, k = 8). The matrices R_l are reshaped from L = 64,000 received symbols with l = 2, ..., 64 and M = 1,000. For each code, the method based on the mean calculation is applied to blindly identify the code word length n when it_max = 1. Figure 11 shows the resulting detection probabilities of n. We can note that increasing the code word length lowers the detection performance of the proposed method. For p_e = 0.01, the detection probability is equal to 1 for both codes. For p_e = 0.02, however, it decreases from 0.99 for the shorter code to 0.94 for the longer one.
In order to show that our method works for codes of a reasonable code word length, we computed the detection probability for the Reed-Solomon code RS (n = 31, k = 25) over GF(32), which corresponds to an equivalent code over GF(2) of length m · n = 5 · 31 = 155. For an error probability p_e = 0.01 and 1,000 Monte Carlo trials, we obtained a detection probability of 0.87 for it_max = 50. This probability can be improved by increasing the number of iterations of our algorithm: for it_max = 100, we obtained a detection probability of 0.95.

Conclusions
In this paper, we have introduced an algorithm devoted to the blind identification of the code word length of a non-binary code in a noisy transmission environment. With this algorithm, the code word length is identified by calculating the arithmetic mean of the number of zeros that occur in the columns of the matrix obtained by the Gauss elimination. We have shown that the proposed algorithm is robust: it does not require an estimate of the error probability, it is only slightly sensitive to the order of the Galois field, and it achieves good detection performance for most modulation types. Furthermore, the detection performance improves when an iterative process is used to increase the probability of obtaining non-erroneous pivots during the Gauss elimination.
Our future work will focus on identifying the remaining parameters of the non-binary code as well as a parity check matrix, which will allow a generic decoder to be implemented in a noisy environment. Furthermore, a method based on soft information that improves the performance of the blind identification algorithms will be published soon [31].

... $e_k^{(l)}(j) = 0$, such that these two probabilities are independent. Henceforth, (15) becomes:
Assuming that the errors are independent of each other and uniformly distributed in GF(q)\{0}, the variable X follows a binomial distribution with parameters N_i(l) and p_e. Thereby, the probability P_1(s) is determined by:
The probability P_2(s) is the probability of having $\sum_{k=1}^{s} a_i^{(l)}(k) \cdot e_k^{(l)}(j) = 0$ with $e_k^{(l)}(j) \in$ GF(q)\{0}. We demonstrate by mathematical induction that the probability P_2(s) can be expressed by:
We have P_2(0) = 1 because there are no erroneous positions. In the case of a single erroneous position, we have P_2(1) = 0. Considering the example of GF(2^2), the probability P_2(s = 2) can be obtained from the matrix M whose row and column indexes correspond to the non-zero elements of this field. The coefficients of this matrix correspond to the sum over GF(2^2) of the indexes of a row and a column. Three of the nine coefficients of M are equal to zero, so the probability of having a zero sum for two erroneous positions is P_2(2) = 3/9 = 1/3. The computed probability verifies (38).
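The GF(2^2) example can be checked by direct enumeration. The sketch below is an illustration and not part of the original derivation. It assumes that the elements of GF(2^m) are represented as the integers 0, ..., q − 1 with addition implemented as a bitwise XOR, and that the coefficients $a_i^{(l)}(k)$ are non-zero, so that multiplying by them only permutes the non-zero field elements. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def sum_table(m):
    """Addition table of the non-zero elements of GF(2^m).

    Elements are the integers 1..q-1; addition in GF(2^m) is a bitwise XOR.
    """
    nonzero = range(1, 1 << m)
    return [[a ^ b for b in nonzero] for a in nonzero]


def p2_bruteforce(m, s):
    """Fraction of s-tuples of non-zero GF(2^m) elements whose sum is zero."""
    q = 1 << m
    zero_sums = 0
    for errors in product(range(1, q), repeat=s):
        total = 0
        for e in errors:
            total ^= e  # accumulate the sum in GF(2^m)
        if total == 0:
            zero_sums += 1
    return Fraction(zero_sums, (q - 1) ** s)


M = sum_table(2)
zeros_in_M = sum(row.count(0) for row in M)
print(M, zeros_in_M)
print([p2_bruteforce(2, s) for s in range(3)])
```

For m = 2, the table has three zero entries out of nine (the diagonal, since x + x = 0 in characteristic 2), and the enumeration returns the values 1, 0, and 1/3 for s = 0, 1, 2.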
We assume that (38) is verified for s, and we demonstrate it for s + 1. If we have $\sum_{k=1}^{s+1} a_i^{(l)}(k) \cdot e_k^{(l)}(j) = 0$, we will have $e_k^{(l)}(j)$ that belongs to GF(q)* with a probability equal to 1/(q − 1). Therefore, the probability P_2(s + 1) is determined by:
In order to simplify the expression of P_2(s), a change of variable is done by considering ϕ(s) = (q − 1)^{s−1} · P_2(s). When P_2(s) is replaced by ϕ(s), the expression (38) becomes:
Denoting ρ(s) = (−1)^s · ϕ(s), the expression (41) can be written as:
However, the sum $\sum_{i=0}^{s-1} (1 - q)^i$ is a geometric series of common ratio 1 − q, so it can be written as:
The computation of ρ(1) gives ρ(1) = 0. Therefore, using (43) and (41), the simplified expression of P_2(s) is written as:
Using (37) and (44), the overall probability P_i is given by:
In order to simplify this equation, Newton's binomial formula can be applied:
Thus, the probability of having an element of the i-th column of T_l equal to 0 is determined by:
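As a sketch of this derivation, the expressions below are one set of formulas consistent with the statements in the text: the binomial law of X, the boundary values P_2(0) = 1, P_2(1) = 0, and P_2(2) = 1/3 over GF(2^2), the probability 1/(q − 1) used in the induction step, and the use of Newton's binomial formula. They are a reconstruction under these assumptions and may differ in form and numbering from the original equations. The shorthand N = N_i(l) and p = p_e, and the assumption that P_i is the sum of P_1(s) · P_2(s) over s, are introduced here for illustration. The align* environment requires the amsmath package.

```latex
% Reconstruction sketch, with N = N_i(l) and p = p_e (shorthand introduced here).
\begin{align*}
P_1(s)   &= \binom{N}{s}\, p^{s}\,(1-p)^{N-s}, \\
P_2(s+1) &= \frac{1 - P_2(s)}{q-1}, \qquad P_2(0) = 1, \\
P_2(s)   &= \frac{1}{q}\left[\,1 + \frac{(-1)^{s}}{(q-1)^{s-1}}\right], \\
P_i      &= \sum_{s=0}^{N} P_1(s)\,P_2(s)
          = \frac{1}{q}\left[\,1 + (q-1)\left(1 - \frac{q\,p}{q-1}\right)^{N}\right].
\end{align*}
```

As a sanity check, the last expression gives P_i = 1 for p = 0 (no errors) and P_i = 1/q for p = (q − 1)/q (received symbols uniformly distributed over GF(q)). For q = 2 it reduces to (1 + (1 − 2p)^N)/2.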