Improved progressive edge-growth algorithm for fast encodable LDPC codes

The progressive edge-growth (PEG) algorithm is known to construct low-density parity-check (LDPC) codes with large girths at finite code lengths by establishing edges between symbol and check nodes in an edge-by-edge manner. The linear-encoding PEG (LPEG) algorithm, a simple variation of the PEG algorithm, can be applied to generate linear-time encodable LDPC codes whose m parity bits $p_1, p_2, \ldots, p_m$ are computed recursively in m steps. In this article, we propose modifications of the LPEG algorithm to construct LDPC codes whose number of encoding steps is independent of the code length. Let $d_s^{\max}$ denote the maximum degree of the symbol nodes in the Tanner graph. The m parity bits of the proposed LDPC codes are divided into $d_s^{\max}$ subgroups and can be computed in only $d_s^{\max}$ steps. Since $d_s^{\max} \ll m$, the number of encoding steps is significantly reduced. We also prove that the PEG codes and the codes proposed in this article have similar lower bounds on girth. Simulation results show that the proposed codes perform very well over the AWGN channel with iterative decoding.


Introduction
Low-density parity-check (LDPC) codes, which were first proposed in the early 1960s [1] and rediscovered in 1996 [2], have recently attracted much attention due to their capacity-approaching performance and low decoding complexity. Since their rediscovery, many decoding methods have been proposed, such as message-passing decoding and linear-programming decoding [3,4], along with many construction methods [5-8]. Among the existing methods, the most successful approach to the construction of LDPC codes is the progressive edge-growth (PEG) algorithm [5,6]. The PEG construction builds up a Tanner graph, equivalent to a parity-check matrix, for an LDPC code in an edge-by-edge manner and greedily maximizes the local girth at the symbol nodes. It is simple and flexible in that it can be applied to construct codes of arbitrary length and rate. In addition, the PEG algorithm can be modified to construct linear-time encodable LDPC codes. In this article, this modified algorithm is referred to as the linear-encoding PEG (LPEG) algorithm. The parity-check matrix was initially used only in the decoding process of LDPC codes, but it can also be used for encoding [9]. LDPC codes with m parity bits constructed by the LPEG algorithm can be encoded with the parity-check matrix in m recursive steps [6]. Therefore, for a given code rate R, the number of encoding steps of LPEG codes grows linearly with the code length n = m/(1 - R).
The objective of this article is to reduce the number of encoding steps of the LPEG codes with negligible performance loss. To this end, we modify the LPEG algorithm to obtain two new algorithms, referred to as the fast-encoding PEG (FPEG) algorithm and the modified FPEG (MFPEG) algorithm, respectively. Let $d_s^{\max}$ denote the maximum degree of the symbol nodes. The number of encoding steps of the FPEG codes grows linearly with $d_s^{\max}$. Since $d_s^{\max}$ is much smaller than m, the number of encoding steps of FPEG codes or MFPEG codes is lower than that of the LPEG codes. Moreover, to ensure that the performance loss is negligible, we prove that the PEG codes and our codes have similar lower bounds on girth. This is confirmed by examples of the proposed codes, whose performance is compared with that of the multiple serially concatenated multiple parity-check (M-SC-MPC) code, a class of efficiently encodable LDPC codes [10].
The remainder of this article is organized as follows. Section 2 reviews the PEG algorithm and the LPEG algorithm. Section 3 proposes the FPEG algorithm and the MFPEG algorithm. A lower bound on the girth of FPEG Tanner graphs is also derived in this section. Section 4 presents two examples of FPEG codes and MFPEG codes and demonstrates their performance. Finally, Section 5 concludes the article.

Progressive edge-growth algorithm
An LDPC code is a linear block code defined by a sparse parity-check matrix H of dimension m × n. A bipartite graph with m check nodes in one class and n symbol nodes in the other can be created using H as the integer-valued incidence matrix for the two classes. Such a graph is also called a Tanner graph [11]. Let $V_c = \{c_0, c_1, \ldots, c_{m-1}\}$ denote the set of check nodes and $V_s = \{s_0, s_1, \ldots, s_{n-1}\}$ denote the set of symbol nodes. E is the set of edges such that

$$E \subseteq V_c \times V_s, \qquad (c_i, s_j) \in E \iff h_{i,j} \neq 0,$$

where $h_{i,j}$ denotes the entry of H at the ith row and jth column, $0 \le i \le m-1$, $0 \le j \le n-1$.

The PEG algorithm for constructing a Tanner graph with n symbol nodes and m check nodes is described in Algorithm 1. In this algorithm, both symbol nodes and check nodes are ordered according to their degrees in nondecreasing order. $d_{s_j}$ is the degree of symbol node $s_j$, and $N^l_{s_j}$ and $\bar{N}^l_{s_j}$ denote the set of all check nodes reached by a tree spreading from symbol node $s_j$ within depth l, and its complement, respectively [6]. It was proved that [4] the PEG algorithm constructs Tanner graphs having a large girth, and the lower bound on the girth g was proved to be

$$g \ge 2(\lfloor t \rfloor + 2), \qquad t = \frac{\log\left(m d_c^{\max} - \dfrac{m d_c^{\max}}{d_s^{\max}} - m + 1\right)}{\log\left[(d_s^{\max} - 1)(d_c^{\max} - 1)\right]} - 1, \qquad (1)$$

where $d_c^{\max}$ and $d_s^{\max}$ are the maximum degrees of the check nodes and symbol nodes, respectively.

Algorithm 1. PEG algorithm
1: for j = 0 to n - 1 do
2:   for k = 0 to $d_{s_j} - 1$ do
3:     if k = 0 then $E^0_{s_j} \leftarrow$ edge$(c_i, s_j)$, where $E^0_{s_j}$ is the first edge incident to $s_j$ and $c_i$ is a check node that has the lowest check-node degree under the current graph setting $E_{s_0} \cup E_{s_1} \cup \cdots \cup E_{s_{j-1}}$.
4:     else expand a subgraph from $s_j$ up to depth l under the current graph setting such that the cardinality of $N^l_{s_j}$ stops increasing but is less than m, or $\bar{N}^l_{s_j} \neq \emptyset$ but $\bar{N}^{l+1}_{s_j} = \emptyset$; then $E^k_{s_j} \leftarrow$ edge$(c_i, s_j)$, where $E^k_{s_j}$ is the kth edge incident to $s_j$ and $c_i$ is a check node picked from $\bar{N}^l_{s_j}$ having the lowest check-node degree.
5:     end if
6:   end for
7: end for
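The edge-selection rule of the PEG algorithm can be sketched in a few lines of Python. This is a minimal illustration rather than the construction used for the simulations: the function name `peg_tanner_graph`, the adjacency-set representation, and the tie-breaking rule (lowest check-node degree, then lowest index) are assumptions made for the example, and every symbol-node degree is assumed to be smaller than m.

```python
def peg_tanner_graph(symbol_degrees, m):
    """Greedy PEG sketch: returns, per symbol node, the set of its check nodes.

    symbol_degrees is assumed to be sorted in nondecreasing order, and each
    degree is assumed to be smaller than m.
    """
    n = len(symbol_degrees)
    sym_adj = [set() for _ in range(n)]  # check neighbours of each symbol node
    chk_adj = [set() for _ in range(m)]  # symbol neighbours of each check node

    def candidate_checks(j):
        # Breadth-first expansion from s_j; `reached` plays the role of N^l.
        reached = set(sym_adj[j])
        frontier = set(sym_adj[j])
        seen_symbols = {j}
        while True:
            new_symbols = {s for c in frontier for s in chk_adj[c]} - seen_symbols
            seen_symbols |= new_symbols
            new_checks = {c for s in new_symbols for c in sym_adj[s]} - reached
            # Stop when N^l no longer grows, or when the next depth would
            # cover all check nodes (complement nonempty now, empty next).
            if not new_checks or len(reached) + len(new_checks) == m:
                return set(range(m)) - reached
            reached |= new_checks
            frontier = new_checks

    for j in range(n):
        for k in range(symbol_degrees[j]):
            pool = range(m) if k == 0 else candidate_checks(j)
            # Assumption: ties are broken by the smallest check-node index.
            i = min(pool, key=lambda c: (len(chk_adj[c]), c))
            sym_adj[j].add(i)
            chk_adj[i].add(j)
    return sym_adj


# A small regular example: 12 symbol nodes of degree 2 and 6 check nodes.
graph = peg_tanner_graph([2] * 12, m=6)
```

In [6] ties among equally good check nodes may be resolved differently; the deterministic rule above only keeps the sketch reproducible.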

Linear-encoding PEG algorithm
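It is stated in [12] that an LDPC code can be encoded in linear time when its parity-check matrix can be written as $H = [H_p \; H_d]$, where $H_p$ is an m × m upper triangular matrix

$$H_p = \begin{bmatrix} 1 & h^p_{1,2} & \cdots & h^p_{1,m} \\ 0 & 1 & \cdots & h^p_{2,m} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

in which $h^p_{i,j} = 1$ for i = j. Hence, the parity bits $p = [p_1, p_2, \ldots, p_m]$ can be computed according to

$$p_i = \bigoplus_{j=i+1}^{m} h^p_{i,j}\, p_j \;\oplus\; \bigoplus_{j=1}^{n-m} h^d_{i,j}\, d_j, \qquad i = m, m-1, \ldots, 1, \qquad (4)$$

where $d = \{d_i\}$ is the systematic part of the code, $H_d = \{h^d_{i,j}\}$ is the m × (n - m) component of the partitioned parity-check matrix H, and ⊕ represents summation over the binary field, i.e., an XOR operation. From Equation (4), the m parity bits can be computed serially from $p_m$ to $p_1$ in m steps. Therefore, the number of encoding steps is

$$T_L = m. \qquad (5)$$

Accordingly, the symbol node set $V_s$ in the Tanner graph is partitioned into the redundant subset $V_s^p$ and the information subset $V_s^d$, which contain the first m symbol nodes and the other n - m symbol nodes, respectively. The edges of the symbol nodes are then established by means of the LPEG algorithm, which constructs an upper triangular pattern. As the procedure for establishing the edges of the n - m information bits follows the construction of the edges of $V_s^p$ and is exactly the same as the PEG algorithm described in Algorithm 1, only the LPEG algorithm for constructing the edges of $V_s^p$ is shown in Algorithm 2.

Algorithm 2. LPEG algorithm for $V_s^p$
1: for j = 0 to m - 1 do
2:   for k = 0 to $d_{s_j} - 1$ do
3:     if k = 0 then $E^0_{s_j} \leftarrow$ edge$(c_j, s_j)$, where $E^0_{s_j}$ is the first edge incident to $s_j$. This edge corresponds to the '1' on the diagonal of $H_p$.
4:     else expand a subgraph from $s_j$ up to depth l under the current graph setting such that $\bar{N}^l_{s_j} \cap \{c_0, c_1, \ldots, c_{j-1}\} \neq \emptyset$ but $\bar{N}^{l+1}_{s_j} \cap \{c_0, c_1, \ldots, c_{j-1}\} = \emptyset$, or the cardinality of $N^l_{s_j}$ stops increasing; then $E^k_{s_j} \leftarrow$ edge$(c_i, s_j)$, where $E^k_{s_j}$ is the kth edge incident to $s_j$ and $c_i$ is a check node picked from $\bar{N}^l_{s_j} \cap \{c_0, c_1, \ldots, c_{j-1}\}$ having the lowest check-node degree.
5:     end if
6:   end for
7: end for

Note that the first column of $H_p$ corresponds to a degree-1 symbol node, so the fraction of degree-1 symbol nodes is 1/n. It was proved that the Tanner graph of an upper or lower triangular parity-check matrix can be equivalently transformed into a pseudo-tree and that the corresponding LDPC codes can also be encoded in a linear number of steps by the label-and-decide algorithm [12].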
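The recursive computation of the parity bits from the last one to the first can be illustrated with a short Python sketch. It assumes that H is given as a list of 0/1 rows with the m × m upper triangular part (unit diagonal) in the first m columns; the function name `encode_upper_triangular` and the 3 × 6 example matrix are hypothetical and chosen only for illustration.

```python
def encode_upper_triangular(H, d):
    """Back-substitution encoder for H = [H_p | H_d] with H_p upper triangular.

    Returns the codeword [p_1, ..., p_m, d_1, ..., d_{n-m}].
    """
    m, n = len(H), len(H[0])
    c = [0] * m + list(d)
    # One parity bit per step, from the last row up to the first: m steps.
    for i in range(m - 1, -1, -1):
        acc = 0
        for j in range(i + 1, n):  # later parity bits and all information bits
            acc ^= H[i][j] & c[j]
        c[i] = acc
    return c


# Hypothetical 3 x 6 example (not taken from the article).
H_small = [
    [1, 1, 0, 1, 0, 1],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 1],
]
codeword = encode_upper_triangular(H_small, [1, 0, 1])
```

Each pass of the outer loop corresponds to one encoding step, which is why the step count of this scheme equals m.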

M-SC-MPC codes
Let $n_i$, $k_i$, and $r_i$ denote the code length, information length, and parity-check length of the ith component MPC code, respectively. The encoder of each component MPC code can be implemented as $r_i$ SPC encoders, as shown in Figure 1 [10]. The matrix cells of the ith encoder are filled in column-wise order from top left to bottom right. The first $r_i - s_i$ cells, with $s_i = k \bmod r_i$, are unused. When the jth row is filled, $j = 1, \ldots, r_i$, the parity bit $p_j$ is calculated by XORing the elements of the row, and its value is stored in the last column of the same row. Each component MPC code can be seen as a shortened version of a binary cyclic code with length $N_i = r_i \lceil n_i / r_i \rceil$. We obtain a valid parity-check matrix $H_i$ for the ith component code, consisting of a row of $\lceil n_i / r_i \rceil$ identity matrices of size $r_i \times r_i$. As the ith component code has length $n_i$, the cyclic code must be shortened, which implies eliminating the first $N_i - n_i$ columns of $H_i$. $H_i$ forms a block-row of the parity-check matrix H of the serially concatenated code. Consequently, H is in lower triangular form, consisting of identity matrices and zero matrices. The serially concatenated code has information length k, parity-check length $m = \sum_{i=1}^{M} r_i$, and code length n = k + m.
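The two descriptions of a component MPC code (column-wise filling of an array with r rows, and a shortened row of identity matrices) can be cross-checked with a small Python sketch. It treats a single component code in isolation, so k below is that component's information length; the function names `mpc_encode` and `shortened_parity_check` are hypothetical, and skipping (r - s) mod r cells is an assumption that only differs from r - s when s = 0.

```python
def mpc_encode(info_bits, r):
    """Fill an r-row array column-wise and XOR each row into its parity bit."""
    k = len(info_bits)
    s = k % r
    unused = (r - s) % r  # leading cells left empty (assumption for s = 0)
    parity = [0] * r
    for t, bit in enumerate(info_bits):
        row = (unused + t) % r  # column-wise filling: the row index cycles
        parity[row] ^= bit
    return list(info_bits) + parity


def shortened_parity_check(n, r):
    """Row of ceil(n / r) identity blocks with the first N - n columns removed."""
    blocks = -(-n // r)  # ceiling division
    N = blocks * r
    return [[1 if col % r == row else 0 for col in range(N - n, N)]
            for row in range(r)]


info = [1, 1, 0, 1, 0, 1, 1]
cw = mpc_encode(info, r=3)
H_i = shortened_parity_check(len(cw), r=3)
```

With these definitions every row of `H_i` is orthogonal to `cw`, which is the sense in which the array encoder and the shortened matrix describe the same code.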

FPEG algorithm and MFPEG algorithm
In general, an ensemble of Tanner graphs is defined through degree distribution pairs. In the case of the symbol nodes, the degree distribution from the edge perspective is given by

$$\psi(x) = \sum_{i=1}^{d_s^{\max}} \psi_i x^{i-1},$$

where $\psi_i$ is the fraction of Tanner graph edges which emanate from degree-i symbol nodes. The fraction of degree-i symbol nodes, from the node perspective, is given by

$$\frac{\psi_i / i}{\sum_{j=1}^{d_s^{\max}} \psi_j / j}. \qquad (7)$$

Similarly, in the case of the check nodes, the degree distribution from the edge perspective is given by

$$\phi(x) = \sum_{i=1}^{d_c^{\max}} \phi_i x^{i-1},$$

where $\phi_i$ is the fraction of Tanner graph edges which emanate from degree-i check nodes. The fraction of degree-i check nodes, from the node perspective, is given by

$$\frac{\phi_i / i}{\sum_{j=1}^{d_c^{\max}} \phi_j / j}.$$

In the following, we introduce the FPEG algorithm and the MFPEG algorithm, which are used to construct an upper triangular parity-check matrix.
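The conversion from the edge perspective to the node perspective is a simple normalization, sketched below in Python with exact fractions. The function name `edge_to_node_perspective` and the dictionary representation (degree mapped to fraction) are assumptions for illustration, and the example distribution is hypothetical.

```python
from fractions import Fraction


def edge_to_node_perspective(edge_fractions):
    """Map {degree i: fraction of edges on degree-i nodes} to node fractions.

    Each node of degree i carries i edges, so the node fraction is
    proportional to (edge fraction) / i.
    """
    norm = sum(f / i for i, f in edge_fractions.items())
    return {i: (f / i) / norm for i, f in edge_fractions.items()}


# Hypothetical example: half of the edges on degree-2 nodes, half on degree-3.
nodes = edge_to_node_perspective({2: Fraction(1, 2), 3: Fraction(1, 2)})
```

In this example three degree-2 nodes and two degree-3 nodes each contribute six edges, so the node fractions are 3/5 and 2/5.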

FPEG algorithm
Example 1: Consider the parity-check matrix H in (10). This is a rate R = 0.5, length n = 12 parity-check matrix that corresponds to an LDPC code $C = [p_1, p_2, \ldots, p_6, d_1, d_2, \ldots, d_6]$, where $d_1, d_2, \ldots, d_6$ are the information bits and $p_1, p_2, \ldots, p_6$ are the parity bits. This is by no means a good LDPC code, but it serves as an illustration. The parity-check matrix H in (10) can be divided into three 2 × 12 submatrices $H_1$, $H_2$, and $H_3$. In each submatrix $H_i$, the rows do not have '1's in the same column. The encoding of C then consists of the following three steps:

Step 1: Given the submatrix $H_3$ and the information bits $d_1, d_2, \ldots, d_6$, compute the parity bits $p_5, p_6$ by the parity-check equations of $H_3$.
Step 2: Given the submatrix $H_2$, the information bits $d_1, d_2, \ldots, d_6$, and the parity bits $p_5, p_6$, compute the parity bits $p_3, p_4$ by the parity-check equations of $H_2$.
Step 3: Given the submatrix $H_1$, the information bits $d_1, d_2, \ldots, d_6$, and the parity bits $p_3, p_4, p_5, p_6$, compute the parity bits $p_1, p_2$ by the parity-check equations of $H_1$.

It is easy to see that, in general, the number of encoding steps equals the number of submatrices, M. The parity-check matrix H should satisfy the following three conditions:
(A) The parity-check matrix H should contain an upper triangular pattern.
(B) The $r_i$ rows in each submatrix $H_i$ should not have '1's in the same column.
(C) The number of submatrices should not be smaller than the maximum symbol-node degree ($M \ge d_s^{\max}$).

Condition (A) guarantees that the corresponding codes are linear-time encodable [12]. Condition (B) guarantees that the $r_i$ parity-check equations in submatrix $H_i$ can be used to generate $r_i$ parity bits simultaneously [10,13,14], while condition (C) is a necessary condition for condition (B). Observing these three conditions, the check nodes of the Tanner graph are divided into M groups $G_1, G_2, \ldots, G_M$, where $G_i$ contains the $r_i$ check nodes corresponding to the rows of $H_i$, and $r_1, r_2, \ldots, r_M$ are M positive integers such that

$$\sum_{i=1}^{M} r_i = m.$$

Given a symbol-node-degree distribution, the FPEG algorithm for establishing the edges of $V_s^p$ and $V_s^d$ is given in Algorithm 3. The symbol $SC \setminus G_i$ denotes the check nodes contained in the set of selectable check (SC) nodes but not in the check node group $G_i$, where SC denotes the set of check nodes available for the next round of spreading. The number of encoding steps of the FPEG codes, which equals the number of submatrices $H_i$, is

$$T_F = M. \qquad (16)$$

Algorithm 3. FPEG algorithm
...
8:     if k = 0 then
9:       if j < m then $E^0_{s_j} \leftarrow$ edge$(c_j, s_j)$, where $E^0_{s_j}$ is the first edge incident to $s_j$. This edge corresponds to the '1' on the diagonal of matrix $H_p$.
10:      else $E^0_{s_j} \leftarrow$ edge$(c_i, s_j)$, where $E^0_{s_j}$ is the first edge incident to $s_j$ and $c_i$ is a check node that has the lowest check-node degree under the current graph setting $E_{s_0} \cup E_{s_1} \cup \cdots \cup E_{s_{j-1}}$.
11:      end if
12:    else expand a subgraph from $s_j$ up to depth l under the current graph setting such that $\bar{N}^l_{s_j} \cap SC \neq \emptyset$ but $\bar{N}^{l+1}_{s_j} \cap SC = \emptyset$, or the cardinality of $N^l_{s_j}$ stops increasing; then $E^k_{s_j} \leftarrow$ edge$(c_i, s_j)$, where $E^k_{s_j}$ is the kth edge incident to $s_j$ and $c_i$ is a check node picked from the set $\bar{N}^l_{s_j} \cap SC$ having the lowest check-node degree.
13:    end if
14:    Find out which check node group $G_i$ includes $c_i$. $SC \leftarrow SC \setminus G_i$.
15:   end for
16: end for

Similar to the PEG algorithm, the FPEG algorithm makes the check-node degrees as uniform as possible. Notice that the FPEG algorithm is not valid for every symbol-node-degree distribution. The column weight of each $H_i$ is at most one, and the columns of H are ordered according to their weights in nondecreasing order. Hence the weight of the first $r_1$ columns of $H_p$ is at most one, that of the next $r_2$ columns is at most two, and likewise the weight of the last $r_M$ columns is at most M. In other words, the number of columns with weight less than or equal to i should be larger than or equal to $\sum_{j=1}^{i} r_j$. Therefore, the FPEG algorithm is valid if and only if

$$\left|\{ s_j \in V_s : d_{s_j} \le i \}\right| \;\ge\; \sum_{j=1}^{i} r_j \qquad (17)$$

for $i \in \{1, 2, \ldots, M\}$. When condition (17) is satisfied, it was proved that, given a symbol-node-degree distribution, for large code lengths, the probability of failing to construct an approximate upper or lower triangular parity-check matrix was negligible [15] (Theorem 1).
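The group-wise encoding that conditions (A) and (B) make possible can be sketched as follows. The 6 × 12 matrix below is a hypothetical stand-in with three groups of two rows each; it is not the matrix (10) of Example 1, and the function names `satisfies_condition_b` and `grouped_encode` are assumptions for illustration.

```python
def satisfies_condition_b(H, group_sizes):
    """Condition (B): within each row group, no column holds more than one '1'."""
    start = 0
    for r in group_sizes:
        rows = H[start:start + r]
        if any(sum(col) > 1 for col in zip(*rows)):
            return False
        start += r
    return True


def grouped_encode(H, d, group_sizes):
    """Encode with one step per row group, from the last group to the first."""
    m, n = len(H), len(H[0])
    c = [0] * m + list(d)
    end, steps = m, 0
    for r in reversed(group_sizes):
        group = range(end - r, end)
        # All parity bits of the group are computed from already-known bits
        # only, so these r computations could run simultaneously.
        new_bits = [sum(H[i][j] & c[j] for j in range(n) if j != i) % 2
                    for i in group]
        for i, bit in zip(group, new_bits):
            c[i] = bit
        end -= r
        steps += 1
    return c, steps


# Hypothetical matrix: positions of the '1's in each of the six rows.
ONES = [[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11], [2, 5, 6, 9, 10],
        [3, 4, 7, 8, 11], [4, 6, 9, 11], [5, 7, 8, 10]]
H_example = [[1 if j in row else 0 for j in range(12)] for row in ONES]
codeword, steps = grouped_encode(H_example, [1, 0, 1, 1, 0, 1], [2, 2, 2])
```

The list comprehension deliberately computes all bits of a group before any of them is written back, which mirrors the simultaneous evaluation of the $r_i$ parity-check equations within one step.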
Proof: The proof of (18) is an adaptation of the proof of equation (1) reported in [6]. For a given symbol node $s_j$, define its neighborhood in $G_k$ within depth l, $G^l_{k,s_j}$, as the set of check nodes in $G_k$ reached by a subgraph spreading from symbol node $s_j$. Its complementary set, $\bar{G}^l_{k,s_j}$, is defined as $G_k \setminus G^l_{k,s_j}$. Consider a depth-l subgraph of an irregular Tanner graph which spreads from any symbol node $s_j \in V_s$ such that $G^l_{k,s_j} \subset G_k$ and $G^{l+1}_{k,s_j} = G_k$. By definition the depth-0 subgraph contains at most $d^{\max}$ ... From (19), it is easy to see that ... which can be simplified to ... Let t be the solution of the equation ... where $J = r_{\min}(d^{\max} \ldots$ Then $l \ge l' = \lfloor t \rfloor$ and ... The proof is completed.

Figure 2 depicts the lower bounds on PEG Tanner graphs and the lower bounds on FPEG Tanner graphs for regular $d^{\max}$ ... which can be simplified to ... that is ... Assume $r_{\min} = m / d_s^{\max}$, then ...

Hereafter in this article, for a given m, we assume $M = d_s^{\max}$ for two reasons. First, from (16) it can be seen that the number of encoding steps is smallest when $M = d_s^{\max}$. Second, from (18) it is easy to see that J grows linearly with $r_{\min}$, and a larger J leads to a larger lower bound on $g_F$. It is also easy to see that the maximum of $r_{\min}$ is achieved when $M = d_s^{\max}$. Therefore, the maximum $g_F$ is achieved when $M = d_s^{\max}$.

Note that there are degree-1 symbol nodes in the corresponding FPEG Tanner graph, and the fraction of degree-1 symbol nodes is $r_1 / n$. The existence of degree-1 symbol nodes is a necessary condition for a linear-encoding algorithm such as the label-and-decide algorithm and the LPEG algorithm [5,12]. However, it was stated that the outbound extrinsic messages of degree-1 nodes are not updated during the iterative decoding process [16,17]. Consequently, the degree-1 symbol nodes cause problems such as mismatching of extrinsic information transfer (EXIT) functions and halting of the mutual information evolution.
In the following section, we introduce a modified FPEG algorithm, which constructs LDPC codes with only one degree-1 symbol node.
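For the PEG side of the comparison, the girth lower bound is easy to evaluate numerically. The sketch below assumes the bound of [6] in the form $g \ge 2(\lfloor t \rfloor + 2)$ with $t = \log(m d_c^{\max} - m d_c^{\max}/d_s^{\max} - m + 1) / \log[(d_s^{\max} - 1)(d_c^{\max} - 1)] - 1$; the function name `peg_girth_lower_bound` and the regular (3, 6) example are assumptions for illustration.

```python
import math


def peg_girth_lower_bound(m, ds_max, dc_max):
    """Evaluate 2 * (floor(t) + 2) for the PEG girth bound (assumed form)."""
    numerator = math.log(m * dc_max - m * dc_max / ds_max - m + 1)
    denominator = math.log((ds_max - 1) * (dc_max - 1))
    t = numerator / denominator - 1
    return 2 * (math.floor(t) + 2)


# Regular example with m = 500 check nodes, symbol degree 3, check degree 6.
bound = peg_girth_lower_bound(500, 3, 6)
```

Because the bound depends on m only through a logarithm, doubling the number of check nodes in this example leaves it unchanged.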

MFPEG algorithm
A simple modification of the FPEG algorithm can be applied to construct LDPC codes which have only one degree-1 symbol node. This modified algorithm is called the MFPEG algorithm. The number of encoding steps of the MFPEG codes is

$$T_M = r_1 + M - 1 \;\ge\; r_{\min} + M - 1, \qquad (31)$$

where the equality is achieved when $r_1$ is the minimum of $\{r_1, r_2, \ldots, r_M\}$. In fact, the MFPEG algorithm loosens condition (A) and is a combination of the LPEG algorithm and the FPEG algorithm. Therefore, for a given degree distribution pair, we have the following equalities: ... where $g_M$ is the lower bound on the girth of the MFPEG Tanner graph.
Note that, since both LPEG codes and MFPEG codes have only one degree-1 symbol node (FPEG codes have $r_1$ degree-1 symbol nodes), they may have the same symbol-node-degree distribution. As shown in Figure 3, the encoder of the proposed FPEG and MFPEG codes can be implemented as $r_i$ SPC encoders and one quasi-random interleaver, which increases the complexity compared with the MPC encoder.

Examples and simulation results
In this section, we provide two examples of the proposed codes and compare them with the M-SC-MPC codes [10] and LPEG codes. In this article, we denote LPEG codes, FPEG codes, and MFPEG codes with M submatrices by M-LPEG codes, M-FPEG codes, and M-MFPEG codes, respectively. The first example shows that FPEG codes have better error-correcting performance than the M-SC-MPC codes, which have the same number of encoding steps. The second example shows that, for a given symbol-node-degree distribution, FPEG codes and MFPEG codes have error-correcting performance similar to that of LPEG codes but require fewer encoding steps. The error-correcting performance is measured in terms of the bit error rate (BER), assuming BPSK transmission over the AWGN channel. The decoding algorithm used here is the log-likelihood sum-product algorithm, and the maximum number of iterations is set to 50.
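The channel side of such a simulation can be sketched as follows. This is only the front end that produces the log-likelihood ratios fed to a sum-product decoder, not the decoder itself; the mapping of bit 0 to +1 and bit 1 to -1, unit symbol energy, and the function names `noise_variance` and `channel_llrs` are assumptions for illustration.

```python
import math
import random


def noise_variance(ebn0_db, rate):
    """Noise variance per real dimension for unit-energy BPSK at a given Eb/N0."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 1.0 / (2.0 * rate * ebn0)


def channel_llrs(codeword, ebn0_db, rate, rng):
    """Send a codeword with BPSK over AWGN and return the channel LLRs."""
    sigma2 = noise_variance(ebn0_db, rate)
    sigma = math.sqrt(sigma2)
    llrs = []
    for bit in codeword:
        x = 1.0 - 2.0 * bit             # assumed mapping: 0 -> +1, 1 -> -1
        y = x + rng.gauss(0.0, sigma)   # additive white Gaussian noise
        llrs.append(2.0 * y / sigma2)   # log P(y | bit 0) / P(y | bit 1)
    return llrs


# Example: rate-1/2 transmission at Eb/N0 = 2 dB with a fixed seed.
llrs = channel_llrs([0, 1, 1, 0, 1, 0], ebn0_db=2.0, rate=0.5,
                    rng=random.Random(2012))
```

A positive LLR favors bit 0 under this mapping; the decoder would then iterate on these values for at most 50 iterations.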
Example 2: An M-SC-MPC code consists of M MPC encoders and offers a flexible code rate and code length with low encoding complexity [10]. The encoding process of an M-SC-MPC code includes M steps, where M is also the maximum symbol-node degree, $d_s^{\max}$. Clearly, for the same maximum symbol-node degree, the number of encoding steps of M-SC-MPC codes is the same as that of FPEG codes. Therefore, we compare the performance of FPEG codes and M-SC-MPC codes under the condition that the number of encoding steps and the symbol-node-degree distributions are the same.
The symbol-node and check-node degree distributions of these codes from the node perspective are shown in Tables 1 and 2, respectively. Using the same symbol-node-degree distribution, we construct the M-FPEG codes with the FPEG algorithm for M = 4, 5, 6. The BER performance comparisons of these codes are given in Figure 4.

The second example considers M-LPEG, M-FPEG, and M-MFPEG codes (M = 4, 5, 6). The code length is n = 1000, the parity-check length is m = 500, and the code rate is R = 0.5. For the purpose of performance comparison, the optimal degree distributions given in [18, Table I] are used in this example. The fraction of Tanner graph edges which emanate from degree-1 symbol nodes of M-LPEG codes, M-FPEG codes, and M-MFPEG codes can be obtained from that of the degree-i symbol nodes by the following formula: ... where $\psi_i$ is the original fraction of Tanner graph edges which emanate from degree-i symbol nodes and $\psi^*_i$ is the new fraction used in this example. Note that if the number of degree-2 symbol nodes of the Tanner graph of H is larger than or equal to n - m, any combination of n - m degree-2 columns forms cycles. Thus we can improve the error floor performance by reducing the fraction of degree-2 symbol nodes if it is higher than the optimal value given in [19]. Therefore, for the original degree distributions $\psi_i$ given in [18, Table I], we obtained the degree-1 symbol nodes from the degree-2 symbol nodes by ... The original node degree distributions are given in Table 3, and the degree distributions from the edge perspective used in the simulation of this example are given in Table 4.
Note that the symbol-node-degree distributions from the node perspective for the LPEG, FPEG, and MFPEG algorithms can be calculated from the edge-perspective distributions with formula (7). The check-node-degree distribution is not needed, as the check-node degrees are made as uniform as possible by the LPEG algorithm [6] and the proposed FPEG and MFPEG algorithms. From (1) and (18), it is clear that the maximum check-node degree $d_c^{\max}$ is necessary to derive the lower bound on girth. However, the value of $d_c^{\max}$ can be obtained easily in the ... The girth lower bounds, girths, and average cycle lengths of the constructed codes are given in Table 5, and their numbers of encoding steps are given in Table 6. From Equations (5), (16), (31), and Table 6, it can be seen that $T_F < T_M < T_L$ for these given symbol-node degree distributions. The simulation results are shown in Figure 5. Compared to the LPEG codes, the FPEG codes perform better in the waterfall region but worse in the error floor region, whereas the MFPEG codes perform similarly to the LPEG codes.
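The ordering of the three step counts can be reproduced from the expressions stated in this article: m steps for LPEG codes, $d_s^{\max}$ steps for FPEG codes with $M = d_s^{\max}$, and $d_s^{\max} + m/d_s^{\max} - 1$ steps for MFPEG codes. The sketch below is only an illustration of these expressions: the function name `encoding_steps` is hypothetical, rounding $m/d_s^{\max}$ down to an integer is an assumption, and the resulting numbers are not taken from Table 6.

```python
def encoding_steps(m, ds_max):
    """Return (T_L, T_F, T_M) for m parity bits and M = ds_max row groups."""
    t_lpeg = m                              # one parity bit per step
    t_fpeg = ds_max                         # one row group per step
    t_mfpeg = ds_max + m // ds_max - 1      # assumption: floor of m / ds_max
    return t_lpeg, t_fpeg, t_mfpeg


# m = 500 parity bits, as in the second example, for M = 4, 5, 6.
for M in (4, 5, 6):
    t_l, t_f, t_m = encoding_steps(500, M)
    print(M, t_l, t_f, t_m)
```

For each of these values of M the FPEG count is the smallest and the LPEG count the largest, in line with the ordering $T_F < T_M < T_L$.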

Conclusion
In this article, we introduced the FPEG algorithm and the MFPEG algorithm for generating fast encodable LDPC codes. The number of encoding steps of the FPEG codes grows linearly with $d_s^{\max}$, not with the code length n. The number of encoding steps of the MFPEG codes grows linearly with $d_s^{\max} + m/d_s^{\max} - 1$. Moreover, we derived a lower bound on the girth of the FPEG codes, which is shown to be similar to that of PEG codes. Examples and simulations show that, compared to the M-SC-MPC codes, the FPEG codes have the same number of encoding steps but better error-correcting performance, and that, compared to the LPEG codes, the FPEG codes have similar error-correcting performance but fewer encoding steps. Variants of M-SC-MPC codes, in which the degree of freedom is exploited, have been proposed in [20,21]. Considering these variants of M-SC-MPC codes in the design of fast-encodable LDPC codes is an open issue for future research.

Table 5 Girth lower bound, girth and average cycle length for m = 500, n = 1000, and R = 0.5
Table 6 Number