Decoding LDPC Convolutional Codes on Markov Channels

This paper describes a pipelined iterative technique for joint decoding and channel state estimation of LDPC convolutional codes over Markov channels. Example designs are presented for the Gilbert-Elliott discrete channel model. We also compare the performance and complexity of our algorithm against joint decoding and state estimation of conventional LDPC block codes. Complexity analysis reveals that our pipelined algorithm reduces the number of operations per time step compared to LDPC block codes, at the expense of increased memory and latency. This tradeoff is favorable for low-power applications.


INTRODUCTION
LDPC convolutional codes (LDPC-CCs) are the convolutional counterparts of LDPC block codes (LDPC-BCs) and were first presented in 1999 by Feltström and Zigangirov [1]. Algorithms have been studied for decoding LDPC convolutional codes over memoryless channels, such as the additive white Gaussian noise (AWGN) channel. LDPC-CC decoding over channels with memory has not been addressed to date. Wireless channels typically have memory, and are often approximated using discrete Markov channel models, of which the simplest is the well-known Gilbert-Elliott model.
LDPC-CCs are attractive for three main reasons. First, they support arbitrary frame lengths, which is useful in packet-switched networks. Second, they achieve performance comparable to conventional LDPC codes. Finally, LDPC-CC decoders can be implemented using a pipelined architecture that significantly reduces the number of active operations required per iteration.
Markov models are widely used to represent time-varying communication channels [2]. If the channel is subject to significant variation within a transmitted frame, then performance can be significantly improved if decoding and channel state estimation are performed jointly using iterative algorithms.
This paper introduces a new algorithm for LDPC-CC decoding over Markov channels. The new algorithm applies the general principles of joint inference [3,4] to the specialized problem of efficient LDPC-CC decoding on channels with memory. We demonstrate that joint decoding and channel state estimation can be performed by adding a few steps to the pipeline decoding algorithm. This means that the cost of adding joint channel state estimation to an existing LDPC-CC decoder is extremely small. We show that when complexity is measured as arithmetic operations per iteration, joint state estimation and decoding is much less complex for LDPC-CCs than for traditional LDPC block codes. The tradeoff is that LDPC-CC decoders require considerably more memory, but in most cases this is a favorable trade in terms of power, complexity, and circuit area.
As a proof-of-concept demonstration, we apply our algorithm to joint decoding and state estimation on the Gilbert-Elliott channel. The Gilbert-Elliott channel [5] represents a burst-error channel which toggles between "good" and "bad" binary symmetric channel states. Interleaving methods were previously used to remove the memory from a Gilbert-Elliott channel, but this technique substantially reduced the channel's effective capacity [6]. More recently, it was shown that joint state estimation and decoding of turbo codes on Gilbert-Elliott channels allows transmission at rates closer to the Gilbert-Elliott channel's native capacity [7]. Our algorithm provides a power-efficient adaptation of these approaches for the LDPC-CC case.
The paper is organized as follows. Section 2 reviews LDPC-CCs and their pipelined decoding algorithms for memoryless channels. Section 3 reviews methods for joint decoding and channel state estimation for Markov channels. Section 3.4 presents the new pipelined estimation-decoding algorithm for LDPC-CCs. Section 4 presents performance and complexity analysis for example LDPC-BC and LDPC-CC codes over the Gilbert-Elliott channel. Section 5 offers conclusions.
The paper uses the following notation. Random variables and their quantities are indicated by lower-case Latin letters. If X and Y are random variables, then Pr{x | y} should be taken to mean the probability that X = x, given that Y = y. Sets are indicated by upper-case Latin letters. Sequences are indicated by lower-case bold letters, and matrices by upper-case bold letters. Lower-case Greek letters are used to indicate probability messages in the decoding algorithm.

Code structure
LDPC-CCs are a family of time-varying convolutional codes with characteristics similar to conventional LDPC block codes (LDPC-BCs). LDPC-CCs are defined by a periodic low-density parity-check matrix:

$$
\mathbf{H}^T =
\begin{bmatrix}
\ddots & & \ddots & & \\
\mathbf{H}_0^T(0) & \cdots & \mathbf{H}_M^T(M) & & \\
 & \ddots & & \ddots & \\
 & \mathbf{H}_0^T(t) & \cdots & \mathbf{H}_M^T(t+M) & \\
 & & \ddots & & \ddots
\end{bmatrix},
\tag{1}
$$

where M is the memory and T is the period of the LDPC-CC, and $t \in \mathbb{Z}$ is the time index. The submatrices $\mathbf{H}_i^T(t+i)$, $i = 0, \ldots, M$, are $c \times (c - b)$ binary matrices, where b is the number of information bits that enter the encoder, and c is the number of coded bits that exit the encoder at a given time index. The rate of the code is R = b/c. The memory is equal to the largest i such that $\mathbf{H}_i^T(t+i)$ is a nonzero matrix, and T = M(c − b). An example parity-check matrix for an M = 7 rate-1/2 LDPC-CC is shown in Figure 1.
Similar to the Tanner graph representation of LDPC-BCs, LDPC-CCs can also be represented graphically [8]. Figure 2 shows a Tanner graph representation corresponding to the parity-check matrix in Figure 1. The Tanner graph exhibits a pattern that repeats itself (for time-invariant LDPC-CCs) every M time indices or, equivalently, every Mc symbol nodes. Edges are present between symbol and check nodes when a "1" is present in the corresponding location of the parity-check matrix.

Decoding LDPC-CCs on memoryless channels
The seminal paper on LDPC-CCs [1] presented an iterative decoding strategy for memoryless channels. In this subsection, we briefly review the LDPC-CC sliding window decoding algorithm using factor graphs as in [8].
The LDPC-CC decoder is a chain of sliding window processors performing sum-product algorithm (SPA) [9] calculations on symbols within the window. While an LDPC-BC decoder performs parallel operations on symbol and parity-check nodes, iterations in the LDPC-CC decoder happen sequentially. As a symbol node shifts through the sliding window, SPA operations are performed on some of its edges. SPA operations are split into two phases per time index. During the vertical phase, SPA operations are performed on b parity-check nodes. During the horizontal phase, SPA operations are performed on c symbol nodes. The LDPC-CC H matrix is designed to guarantee that all necessary SPA operations are performed on a node after M + 1 time shifts, just before the symbol node exits the window [1]. For a rate-b/c (J, K) LDPC-CC decoder, the vertical and horizontal phases require Kb and Jc SPA operations, respectively.
To carry out multiple iterations, abutting sliding window processors are used. As a node exits one processor, it enters the next processor for an additional iteration. Therefore, for I iterations, the decoder has a latency of I(M + 1) time units. Hard decisions are made based on the a posteriori probability value of a symbol as it exits the last processor in the chain.
Figure 3 represents the LDPC-CC decoder sliding window as a series of processors. It shows the decoder window in operation over a rate-1/2 code, sliding from right to left.
The sliding window abstraction is very useful from a hardware design perspective. A designer only needs to implement one of these processors as a hardware block. The complete decoder is then constructed by tiling I copies of the processor block. As a point of reference, a comparison of the BER plots for rate-1/2 LDPC-CC and LDPC-BC codes on the AWGN channel [10] reveals that an M = 128 LDPC-CC has roughly the same performance as an N = 1024 LDPC-BC. In Section 4, we demonstrate that this rough equivalence also holds for joint decoding and state estimation on Gilbert-Elliott channels. Figure 4 summarizes AWGN decoding results obtained in [10].

JOINT ESTIMATION DECODING
In this section, we present an iterative decoding algorithm for LDPC-CCs over channels with memory. Memory in channel states is modeled as a Markov chain, hence the name Markov channel. We review the Gilbert-Elliott channel and LDPC-BC decoding over Markov channels before presenting the LDPC-CC decoding algorithm. All of these algorithms employ the SPA on factor graphs of the decoder. The general procedure followed in deriving these algorithms is as follows, summarized from [11].
(1) Derive the factor graph (or, equivalently, the Tanner graph) of the code constraints. The factor graph represents the probability mass function (PMF) of the code itself, as elaborated in [9]. In the typical case where all codewords are equiprobable, the code's PMF is determined by the characteristic function:

$$
\Upsilon(\mathbf{x}) =
\begin{cases}
1, & \text{if } \mathbf{x} \text{ is a codeword}, \\
0, & \text{otherwise}.
\end{cases}
\tag{2}
$$

(2) Derive the conditional joint PMF of the received vector y and the channel states s, that is, Pr(y, s | x). For discrete channels, the PMF takes the form of a Markov state-transition model, which has a well-known factor graph structure.
(3) The joint PMF describing the code and channel, Pr(y, s, x), is the product of the above derived PMFs, Υ(x) and Pr(y, s | x). The corresponding decoder factor graph is constructed by joining together the channel graph and the code graph at their intersecting symbol nodes.

Markov channels
A Markov channel is characterized by a set of states, S, which models the channel's memory. At any time index i, the channel is in some state $s_i \in S$. At each time index, the channel's state can undergo a random transition governed by the state transition probability matrix P. Let $\rho_i$ be the PMF vector for the channel's state at time i. Then $\rho_{i+1} = \mathbf{P} \rho_i$. For a general Markov channel, the PMF of the channel state process can be written as

$$
\Pr(\mathbf{s}) = \Pr(s_1) \prod_{i} \Pr(s_{i+1} \mid s_i),
\tag{3}
$$

where $\Pr(s_{i+1} \mid s_i)$ are obtained from P, the state transition probability matrix of a general Markov channel. The channel state affects the channel output probability through a conditional PDF $f_{y_i}(y_i \mid x_i, s_i)$ for each $s_i \in S$. Let $\gamma_i$ be the channel information which is conditional upon the channel state, defined as

$$
\gamma_i(x_i, s_i, s_{i+1}) = \Pr(s_{i+1} \mid s_i) \, f_{y_i}(y_i \mid x_i, s_i).
\tag{4}
$$

In the sequel, the arguments to $\gamma_i$ are omitted, but $\gamma_i$ should always be understood to have the functional dependence as expressed by (4).
Then the conditional joint PMF of the received symbols and channel states is

$$
\Pr(\mathbf{y}, \mathbf{s} \mid \mathbf{x}) = \Pr(s_1) \prod_{i} \gamma_i,
\tag{5}
$$

where y is the received symbol sequence and x is the transmitted sequence.
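The state-PMF recursion described above can be illustrated with a minimal sketch. The function name `propagate_state_pmf` and the numerical values of the two-state matrix are hypothetical and chosen only for illustration. The sketch assumes the column convention P[j][k] = Pr(s_{i+1} = j | s_i = k), so that each column of P sums to one.

```python
def propagate_state_pmf(P, rho, steps=1):
    """Apply the recursion rho_{i+1} = P * rho_i `steps` times.

    P[j][k] is assumed to be Pr(s_{i+1} = j | s_i = k), so each
    column of P sums to one and rho stays a valid PMF.
    """
    for _ in range(steps):
        rho = [sum(P[j][k] * rho[k] for k in range(len(rho)))
               for j in range(len(P))]
    return rho


# Hypothetical two-state chain; the numbers are illustrative only.
P = [[0.9, 0.2],
     [0.1, 0.8]]

rho_next = propagate_state_pmf(P, [1.0, 0.0])             # one transition
rho_long = propagate_state_pmf(P, [1.0, 0.0], steps=200)  # near steady state
```

After many steps the PMF approaches the stationary distribution of the chain, which is the natural choice for Pr(s_1) when nothing is known about the initial channel state.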

Gilbert-Elliott channel
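The Gilbert-Elliott (GE) channel [5] is a binary input-output Markov channel. It is a binary symmetric channel (BSC) whose inversion probability is modulated by a Markov chain. It can be described as

$$
\mathbf{y} = \mathbf{x} \oplus \mathbf{z},
\tag{6}
$$

where x, y, and z are the channel input, the channel output, and the error sequence, respectively. The GE channel has two states, good and bad, as indicated in the channel model [5]. Each state contains a BSC with its own inversion probability. In the good state, the inversion probability is lower than in the bad state. The error sequence z is a random sequence. At any given time, the error probability is

$$
\Pr(z_i = 1 \mid s_i) =
\begin{cases}
\eta_G, & \text{if } s_i \text{ is the good state}, \\
\eta_B, & \text{if } s_i \text{ is the bad state},
\end{cases}
\tag{7}
$$

where $\eta_B$ and $\eta_G$ are the inversion probabilities in the bad and good channel states, respectively.
If the good and bad states are assigned to numerical values 1 and 0, respectively, then the state transition matrix P is

$$
\mathbf{P} =
\begin{bmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{bmatrix}
=
\begin{bmatrix} 1 - g & b \\ g & 1 - b \end{bmatrix},
\tag{8}
$$

where b and g are the 1→0 and 0→1 transition probabilities, respectively. The matrix elements are $p_{jk} = \Pr(s_{i+1} = j \mid s_i = k)$. The last term needed to compute $\gamma_i$ is the channel PDF:

$$
f_{y_i}(y_i \mid x_i, s_i) =
\begin{cases}
1 - \eta_{s_i}, & y_i = x_i, \\
\eta_{s_i}, & y_i \neq x_i,
\end{cases}
\tag{9}
$$

where $\eta_{s_i}$ denotes $\eta_G$ or $\eta_B$ according to the state $s_i$.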
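The channel model of this subsection can be exercised with a short simulation. This is a sketch under stated assumptions, not a reference implementation. The function names `simulate_ge_channel` and `average_error_rate`, the `seed` and `start_good` arguments, and the choice to draw the error bit before the state transition are all illustrative assumptions. States follow the labeling used above (1 = good, 0 = bad), with b the good-to-bad and g the bad-to-good transition probability.

```python
import random


def simulate_ge_channel(x, b, g, eta_g, eta_b, seed=None, start_good=True):
    """Pass the bit sequence x through a Gilbert-Elliott channel.

    Returns (y, states), where y[i] = x[i] XOR z[i] and states[i] is the
    channel state (1 = good, 0 = bad) in effect for symbol i.
    """
    rng = random.Random(seed)
    state = 1 if start_good else 0
    y, states = [], []
    for bit in x:
        eta = eta_g if state == 1 else eta_b   # inversion probability of this state
        z = 1 if rng.random() < eta else 0     # error bit z_i
        y.append(bit ^ z)
        states.append(state)
        # State transition: good -> bad w.p. b, bad -> good w.p. g.
        if state == 1:
            if rng.random() < b:
                state = 0
        elif rng.random() < g:
            state = 1
    return y, states


def average_error_rate(b, g, eta_g, eta_b):
    """Long-run inversion probability under the stationary state distribution."""
    pi_good = g / (b + g)
    pi_bad = b / (b + g)
    return pi_good * eta_g + pi_bad * eta_b
```

For example, `simulate_ge_channel([0, 1, 1, 0], b=0.01, g=0.01, eta_g=0.01, eta_b=0.1, seed=7)` produces one realization of the received bits together with the hidden state sequence, which is convenient for checking a state estimator against ground truth.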

LDPC-BC decoding on Markov channels
Iteratively decoded codes allow for channel estimation during the decoding process, and these estimates can further assist in decoding and vice versa. This is known as joint estimation decoding. In this subsection, we review an estimation-decoding algorithm first presented in [6] and analyzed for LDPC-BCs in [3,4,11,12]. In Section 3.4, these concepts are adapted for use with LDPC-CCs.
To characterize the PMF of the channel, it is convenient to decompose the characteristic function, Υ(x), into component functions $h_j(\mathbf{x})$ representing the individual rows of the parity-check matrix:

$$
\Upsilon(\mathbf{x}) = \prod_{j=1}^{M} h_j(\mathbf{x}).
\tag{10}
$$

The joint PMF of the transmitted codeword, received symbol sequence, and channel states is then given by

$$
\Pr(\mathbf{y}, \mathbf{s}, \mathbf{x}) = \xi \Pr(s_1) \prod_{i=1}^{N} \gamma_i \prod_{j=1}^{M} h_j(\mathbf{x}),
\tag{11}
$$

where ξ is the normalization constant for Υ(x), and N and M are the dimensions of the parity-check matrix.
Figure 5: Joint factor graph for decoding and channel state estimation of an LDPC-BC on a Markov channel [11].
The factor graph corresponding to (11) is shown in Figure 5. The channel state and transition nodes comprising the Markov chain form the Markov subgraph. The symbol and check nodes of the classic LDPC graph form the LDPC subgraph. Joint state estimation and decoding are then performed by applying the SPA using a flooding schedule. In the LDPC subgraph, the ζ messages are substituted in place of the channel information. The χ messages are the a posteriori probabilities computed using the usual LDPC decoding computations.
In the Markov subgraph, BCJR operations are performed with the χ messages applied as a priori information. At the start of each iteration, every node in the factor graph receives messages from adjacent nodes in the graph. Each message is a local conditional PMF. To complete the iteration, the SPA is applied to update the outgoing messages at each node in the graph. The α, β, and ζ messages have the usual definition, based on the BCJR algorithm, and are updated according to the standard rules:

$$
\begin{aligned}
\alpha_{i+1} &= \sum_{s_i} \sum_{x_i} \alpha_i \, \gamma_i \, \chi_i, \\
\beta_i &= \sum_{s_{i+1}} \sum_{x_i} \beta_{i+1} \, \gamma_i \, \chi_i, \\
\zeta_i &= \sum_{s_i} \sum_{s_{i+1}} \alpha_i \, \beta_{i+1} \, \gamma_i,
\end{aligned}
\tag{12}
$$

where $\alpha_i$ and $\beta_i$ are functions of $s_i$, $\chi_i$ and $\zeta_i$ are functions of $x_i$, and the function arguments are again omitted.
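A minimal sketch of the Markov-subgraph update for a binary-input channel is given below. It assumes the standard forward-backward form of the rules, with γ_i(x_i, s_i, s_{i+1}) = Pr(s_{i+1} | s_i) f(y_i | x_i, s_i), α summed over the previous state and symbol, β summed over the next state and symbol, and ζ summed over both adjacent states. The function names, the zero-based indexing, the per-message normalization, and the example parameter values are illustrative assumptions rather than details taken from the decoder described here.

```python
def normalize(v):
    """Scale a message so that its entries sum to one."""
    total = sum(v)
    return [e / total for e in v]


def forward_backward(y, chi, P, eta, prior):
    """One pass of alpha, beta, and zeta updates on the Markov subgraph.

    y:     received bits
    chi:   chi[i][x], decoder belief that symbol i equals x
    P:     P[j][k] = Pr(s_{i+1} = j | s_i = k)
    eta:   eta[s], BSC inversion probability in state s
    prior: PMF of the initial channel state
    """
    n, S = len(y), len(prior)

    def gamma(i, x, s, s_next):
        f = eta[s] if y[i] != x else 1.0 - eta[s]   # channel PDF f(y_i | x_i, s_i)
        return P[s_next][s] * f

    alpha = [list(prior)] + [None] * n
    beta = [None] * n + [[1.0] * S]
    for i in range(n):                               # forward recursion
        alpha[i + 1] = normalize([
            sum(alpha[i][s] * chi[i][x] * gamma(i, x, s, t)
                for s in range(S) for x in (0, 1))
            for t in range(S)])
    for i in reversed(range(n)):                     # backward recursion
        beta[i] = normalize([
            sum(beta[i + 1][t] * chi[i][x] * gamma(i, x, s, t)
                for t in range(S) for x in (0, 1))
            for s in range(S)])
    zeta = [normalize([
                sum(alpha[i][s] * beta[i + 1][t] * gamma(i, x, s, t)
                    for s in range(S) for t in range(S))
                for x in (0, 1)])
            for i in range(n)]
    return alpha, beta, zeta


# GE example with b = g = 0.01; state 0 = bad, state 1 = good.
P = [[0.99, 0.01],
     [0.01, 0.99]]
eta = [0.1, 0.01]            # eta[0] = eta_B, eta[1] = eta_G (illustrative values)
y = [0, 1, 1, 0]
chi = [[0.5, 0.5]] * len(y)  # uninformative decoder beliefs on the first iteration
alpha, beta, zeta = forward_backward(y, chi, P, eta, prior=[0.5, 0.5])
```

The returned `zeta[i]` is the extrinsic channel message that replaces the channel information at symbol node i in the LDPC subgraph.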

LDPC convolutional decoding on Markov channels
In this subsection, we present an algorithm for sequential decoding of LDPC convolutional codes on general Markov channels. The LDPC convolutional decoder for the Gilbert-Elliott channel is presented as an example. We will show that a flooding schedule of message-passing between the channel model and decoder fits conveniently into the pipelined LDPC-CC decoding algorithm.
To derive the decoder graph, we combine the Markov channel factor graph with the graph of an LDPC convolutional code by joining the two graphs at the shared symbol nodes, $x_i$. The derivation of the LDPC convolutional code characteristic function, and hence its factor graph, is followed by the derivation of the decoder PMF, and hence the decoder graph, in the next few paragraphs.
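The sequence $\boldsymbol{\upsilon} = (\ldots, \upsilon_0, \upsilon_1, \upsilon_2, \ldots)$, $\upsilon_t \in \mathbb{F}_2^c$, forms a convolutional code frame (equivalent to a block codeword) if and only if the constraint imposed by the syndrome former of the convolutional code is fulfilled, that is,

$$
\upsilon_{t-M} \mathbf{H}_M^T(t) + \upsilon_{t-M+1} \mathbf{H}_{M-1}^T(t) + \cdots + \upsilon_t \mathbf{H}_0^T(t) = \mathbf{0}.
\tag{13}
$$

The characteristic function of the convolutional code is

$$
\Upsilon(\boldsymbol{\upsilon}) =
\begin{cases}
1, & \text{if (13) is satisfied for all } t \in \mathbb{Z}, \\
0, & \text{otherwise}.
\end{cases}
\tag{14}
$$

The convolutional characteristic function can be decomposed as a product of component functions over time. Let $\mathbf{x}_j$ be defined as the sequence $(\upsilon_j, \upsilon_{j-1}, \ldots, \upsilon_{j-M})$, corresponding to the symbols in the encoder's memory. Then at the jth time instant, the characteristic component function is

$$
\Upsilon_j(\mathbf{x}_j) =
\begin{cases}
1, & \text{if (13) is satisfied for } t = j, \\
0, & \text{otherwise}.
\end{cases}
\tag{15}
$$

The convolutional characteristic function is then the product of the component functions over time:

$$
\Upsilon(\boldsymbol{\upsilon}) = \prod_{j=1}^{t_{\mathrm{end}}} \Upsilon_j(\mathbf{x}_j),
\tag{16}
$$

where $t_{\mathrm{end}}$ is the time at which the convolutional sequence terminates.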
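The decomposition into per-time component functions can be made concrete with a small sketch. It assumes that the syndrome former constraint at time t takes the usual form υ_t H_0^T(t) + υ_{t−1} H_1^T(t) + ... + υ_{t−M} H_M^T(t) = 0 over the binary field, and that symbols before time 0 are zero. The function names and the toy submatrices (a memory-1, rate-1/2, time-invariant code that is not low-density) are hypothetical and serve only to illustrate the check. The sketch also evaluates only the time indices covered by the supplied symbols, so it does not model termination.

```python
def component_check(t, v, H, M):
    """Return 1 if the syndrome constraint holds at time t, else 0.

    v: list of c-bit tuples, v[t] being the code symbols at time t
       (symbols before time 0 are assumed to be zero).
    H: function H(i, t) returning the c x (c - b) binary submatrix
       H_i^T(t) as a list of rows.
    """
    width = len(H(0, t)[0])
    syndrome = [0] * width
    for i in range(M + 1):
        if t - i < 0:
            continue
        block = H(i, t)
        for row, bit in enumerate(v[t - i]):
            if bit:
                for col in range(width):
                    syndrome[col] ^= block[row][col]   # addition over GF(2)
    return 0 if any(syndrome) else 1


def characteristic(v, H, M):
    """Product of the component functions over the available time indices."""
    return int(all(component_check(t, v, H, M) for t in range(len(v))))


def toy_H(i, t):
    """Hypothetical time-invariant submatrices: parity_t = u_t XOR u_{t-1}."""
    return [[[1], [1]],    # H_0^T
            [[1], [0]]][i] # H_1^T


# Code symbols (u_t, u_t XOR u_{t-1}) for information bits u = 1, 0, 1, 1.
frame = [(1, 1), (0, 1), (1, 1), (1, 0)]
valid = characteristic(frame, toy_H, M=1)
```

Flipping any single bit of `frame` violates at least one component function, which drives the product, and hence the characteristic function, to zero.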
To construct the joint PMF of the convolutional code and the Markov channel model, we note that there are c channel symbols per time index in our notation, and therefore also c Markov channel states per time index. Let n be the total number of channel symbols transmitted, that is, $n = t_{\mathrm{end}} \times c$. Then, with $\Upsilon_j(\mathbf{x}_j)$ denoting the characteristic component function at time j,

$$
\Pr(\mathbf{y}, \mathbf{s}, \mathbf{x}) = \xi \Pr(s_1) \prod_{i=1}^{n} \gamma_i \prod_{j=1}^{t_{\mathrm{end}}} \Upsilon_j(\mathbf{x}_j).
\tag{17}
$$

The factor graph for the LDPC convolutional decoder is shown in Figure 6. Notice that the Markov channel graph of Figure 5 has been wrapped around the convolutional graph of Figure 3, resulting in a graph suitable for sequential operations. As the decoding window slides across this graph, it completes SPA operations on c channel state variable nodes along with c symbol nodes and (c − b) check nodes at each time index.
Joint decoding and estimation is performed using a flooding schedule as follows. Whenever a symbol node is updated in the decoder, the adjacent Markov state variables are also updated. At each time index in the LDPC-CC pipeline, the χ messages are updated for c symbol nodes using the usual decoding operations. To perform the Markov state update, the α, β, and ζ messages are subsequently computed for each updated symbol node.
Whenever processing is completed for a node $x_i$, the resulting $\chi_i$ message is used to update $\alpha_i$, $\beta_i$, and $\zeta_i$. The entire column of updated messages is then passed to the next processor, which performs the subsequent iteration. Messages on the Markov subgraph are updated directly alongside the $x_i$ symbol nodes within the pipeline.
Figure 7 shows the memory-based architecture for a typical LDPC-CC processor, augmented to support joint channel state estimation [10,13,14]. Each column in the memory grid represents a symbol node. Each row represents a message passed along an edge in the code's factor graph. Each symbol node is connected to the channel and to (up to) J parity-check nodes. Due to the structure of the LDPC-CC factor graph, a symbol's influence vanishes after M × c time steps, so this is the number of columns that needs to be stored in the memory.
The processor requires one row to store channel information for each symbol and J rows to store messages between variable and check nodes in the LDPC-CC subgraph. For a Markov channel with S states, 2S + 2 extra rows must be added to support channel state estimation. In the particular case of the GE channel, where S = 2, six extra rows are needed. SPA operations are completed for c symbol nodes per time index per processor, so each processor needs to update 4c additional messages per time index to perform joint state estimation. As these messages pass through the pipeline of processors, updates propagate across the Markov subgraph.

Alternative message passing schedules
The flooding schedule produces a very efficient architecture for joint estimation and decoding in LDPC-CCs. Other message passing schedules are not expected to coexist well with the pipelined structure of LDPC-CC decoders. Turbo estimation/decoding, for example, is known to provide improved performance for LDPC block codes in some cases, but cannot obviously be applied to the LDPC-CC case.
In the turbo schedule, Markov state estimation is carried out using the BCJR algorithm on a sliding window. All α, β, and ζ messages are computed while the χ messages are held constant. Meanwhile, the decoder computes new χ messages by performing one iteration with the ζ messages held constant. This schedule requires that blocks of symbol messages be exchanged between the estimator and the decoder. Because the pipelined decoder computes only c symbol messages per time index, it is not possible to accumulate a large frame of messages without interrupting the pipeline.

PERFORMANCE RESULTS
Figure 8 shows the BER plot for rate-1/2, (3, 6) LDPC convolutional codes with memories 128, 256, and 2048 with channel parameters $(b, g, \eta_G) = (0.01, 0.01, 0.01)$, where $\eta_B$ is swept from 0.04 to 0.18. As expected, decoding gain increases with memory. As with joint estimation and decoding on LDPC-BCs, LDPC-CCs are observed to have low error floors. Figure 9 plots the BERs for an M = 128 code, with iterations ranging from 10 to 60. Statistically significant improvements are not observed for pipelines longer than 50 processors.

Comparison of LDPC-BC and LDPC-CC decoders
Figure 10 shows the comparative performance of an LDPC-BC and an LDPC-CC performing joint decoding and state estimation over the GE channel. The performance of an N = 1024 LDPC-BC is roughly comparable to that of an M = 128 LDPC-CC. The LDPC-CC decoder uses 50 processors, and the LDPC-BC decoder uses 50 iterations.
The complexity of these decoders is estimated by counting the number of edges that require SPA updates at each time step in the decoder.

(1) Processor complexity
During a time index, each LDPC-CC processor completes an iteration on c symbol nodes, each of which requires J SPA updates during the horizontal phase and K SPA updates during the vertical phase. In addition, there are 3 edges to be updated per symbol node in the Markov subgraph, and two of those edges carry S messages each. For the GE channel, for example, the forward and backward messages each have values for the good and bad states of the channel. Hence (2S + 1)c SPA updates are needed (the APP edge does not need to be computed until the very end). Therefore, we have I * [c * (J + 2S + 1) + K] SPA updates during any time index in the LDPC-CC decoder.
In the LDPC-BC case, there is one time index per iteration, in which all messages are updated throughout the code's graph. There are J edges per symbol node in the LDPC subgraph and two SPA updates per edge. The N symbol nodes result in a total of 2N * (J + 2S + 1) SPA updates per time index.

(2) Hardware memory requirements
In the LDPC-CC case, for each symbol node, we need to store (2S + 1) messages for the Markov subgraph, 1 channel message, and J messages in the LDPC subgraph. There are (M + 1)c symbol nodes per processor, and I processors, for a total of I * [(M + 1)(J + 2S + 2)c] messages.
The LDPC-BC decoder needs memory for each SPA update, and memory to store the channel information, resulting in N * [2 * (J + 2S + 1) + 1] messages to be stored.

(3) Decoder delay
The LDPC-CC latency is I * (M + 1) time steps, the time it takes for symbols to traverse the entire pipeline. The latency of the LDPC-BC decoder is the time taken to decode the first frame. If the decoder uses I iterations, then the latency is simply I.
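The expressions for processor complexity, memory, and latency given in items (1) to (3) can be evaluated directly. The sketch below simply transcribes those expressions. The function names and dictionary keys are hypothetical, and the parameter values are those of the rate-1/2, (3, 6) codes compared in this section (c = 2, J = 3, K = 6, S = 2 for the GE channel, M = 128, N = 1024, I = 50).

```python
def ldpc_cc_cost(I, M, c, J, K, S):
    """Cost expressions for the pipelined LDPC-CC decoder (I processors)."""
    return {
        "spa_updates_per_time_index": I * (c * (J + 2 * S + 1) + K),
        "stored_messages": I * (M + 1) * (J + 2 * S + 2) * c,
        "latency_time_steps": I * (M + 1),
    }


def ldpc_bc_cost(I, N, J, S):
    """Cost expressions for the LDPC-BC decoder (I iterations)."""
    return {
        "spa_updates_per_time_index": 2 * N * (J + 2 * S + 1),
        "stored_messages": N * (2 * (J + 2 * S + 1) + 1),
        "latency_time_steps": I,
    }


cc = ldpc_cc_cost(I=50, M=128, c=2, J=3, K=6, S=2)
bc = ldpc_bc_cost(I=50, N=1024, J=3, S=2)
```

With these parameters the expressions give 1100 SPA updates per time index, 116100 stored messages, and a latency of 6450 time steps for the LDPC-CC decoder, against 16384 SPA updates per time index, 17408 stored messages, and a latency of 50 for the LDPC-BC decoder. Note that a time index covers c symbols in the LDPC-CC case and a full iteration over N symbols in the LDPC-BC case.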
For block and convolutional codes of similar performance, LDPC-CCs need a simpler processing unit. That is, the area of an LDPC-CC decoder chip involved in active computations is much smaller than in the LDPC-BC case, which brings down the power usage, but the LDPC-CC decoder needs more memory. On general purpose architectures with multiple memory banks, the LDPC-CC decoder realizations can be very efficient. Apart from the initial latency, the pipelined LDPC-CC decoder architecture is well suited for high-throughput, low-power hardware implementations.

CONCLUSIONS
This article described a serial, pipelined algorithm for joint decoding and channel-state estimation of LDPC convolutional codes over Markov channels. While the general principles of joint channel state estimation and decoding are widely known, they were previously applied only to LDPC block codes. Our work is the first investigation of joint estimation and decoding for LDPC convolutional codes. We found that the conventional pipelined LDPC-CC decoder requires only a few extra operations to implement joint state estimation, which is a much smaller overhead than what is required for joint state estimation in known LDPC block decoding algorithms. Our LDPC-CC algorithm requires considerably fewer active operations than LDPC block decoders, and hence is better suited for low-power implementation of high-performance error control over Markov channels.

Figure 1: An example parity-check matrix, $\mathbf{H}^T$, for a rate-1/2 LDPC-CC. The row indices correspond to symbols (i.e., channel bits), and the column indices correspond to parity-check equations.

Figure 4: LDPC-CC decoder performance over the AWGN channel, based on codes and algorithms from [10].

Figure 6: Joint factor graph for an LDPC-CC decoder over a Markov channel. The Markov model wraps around the LDPC-CC factor graph.

Figure 7: The memory-based LDPC-CC decoding architecture for a rate-1/2 code, highlighting the extra memory resources and additional operations needed to implement joint channel state estimation.

Figure 9: Number of iterations versus performance for an M = 128 LDPC-CC decoder.
The comparative processor complexity, memory requirements, and latency are reported in Table 1.