Karhunen-Loève-Based Reduced-Complexity Representation of the Mixed-Density Messages in SPA on Factor Graph and Its Impact on BER
© P. Prochazka and J. Sykora. 2010
Received: 26 March 2010
Accepted: 30 December 2010
Published: 3 January 2011
The sum-product algorithm on factor graphs (FG/SPA) is a widely used tool for solving problems in a broad range of fields. Generally shaped, continuously valued messages in the FG/SPA are commonly handled by a suitable parameterization of the messages. Obtaining such a parameterization is, however, a crucial problem in general. The paper introduces a systematic procedure for obtaining a scalar message representation with a well-defined fidelity criterion in a general FG/SPA. The procedure exploits the stochastic nature of the messages as they evolve during the FG/SPA processing. A Karhunen-Loève transform (KLT) is used to find a generic canonical message representation that exploits the stochastic behavior of the messages under a mean square error (MSE) fidelity criterion. We demonstrate the procedure on a range of scenarios including mixture messages (a digital modulation in a phase-parametric channel). The proposed systematic procedure achieves the same results as the Fourier parameterization developed specifically for this particular class of scenarios.
Factor graph (FG) based techniques provide a unifying approach to a wide variety of problems in communications, signal processing, and general inference algorithms [1, 2]. FG-based algorithms (e.g., the sum-product algorithm (SPA), typically used in Bayesian decision and estimation problems) operate with messages representing the stochastic description of the variable at a given node. A direct, exact canonical form of the SPA operates with a probability density function (PDF) or a probability mass function (PMF) for continuous or discrete variables, respectively. The messages for discrete variables of finite cardinality (most notably binary variables) can be easily parameterized by a number of numerically convenient representations (e.g., the log-likelihood ratio) [1, 2], which allow an easy implementation.
Practical communication and signal processing scenarios, however, frequently lead to an FG representation containing a mixture of continuous and discrete parameters, for example discrete data and continuous-valued channel state parameters (e.g., phase). The FG-based SPA solving this mixed-variable problem inevitably involves messages with complicated, generally shaped mixture PDFs. This is a direct consequence of the marginalization operation at a factor node (FN) with both PDF and PMF inputs. A strict implementation of the FG/SPA would require passing messages in the form of complicated PDFs, which would make a practical implementation infeasible. A number of solutions to this problem have appeared (see the detailed discussion later). All of them are based on approximating the message PDF by a suitable parameterization over a small set of canonical messages. Then, instead of the PDF itself, the set of parameters is used to represent the message.
The problem of finding a suitable set of canonical messages with a proper parameterization is, however, a crucial one. All previous attempts in the literature have chosen the canonical basis in an ad hoc manner or by inferring a function shape for a particular scenario context. A wrong choice easily leads to a large number of the parameters needed to represent the message with a given fidelity or to a high computational load when processing FN update equations. An obvious goal is to have a canonical representation with the smallest possible number of parameters and an easy enough FN update evaluation for the given approximation fidelity.
This paper introduces a systematic procedure for obtaining such a set of canonical messages with well-defined fidelity criterion. We base our method on a stochastic nature of the messages as they evolve during FG/SPA processing. A Karhunen-Loève transform (KLT) is used to exploit the message stochastic behavior with well-defined fidelity (mean square error (MSE)) criterion.
2. Background, Related Work and Contributions
This section summarizes the background and the related work available in the current literature. We have structured the related work according to various aspects of the FG-based processing and the message representation.
2.1. General FG-Based Processing
The FG/SPA is unambiguously given by the FG structure, update rules and scheduling algorithm [1, 2]. To enable the processing, it is necessary to store the messages in iterations between updates. The implementation of the FG/SPA gives exact results for an arbitrary cycle-free FG with exact evaluation of the update rules, exact message representation and with an arbitrary scheduling algorithm, which considers all messages in the FG. When all of these assumptions are fulfilled, the FG/SPA provides an elegant optimal evaluation algorithm. As an example, the forward/backward Bahl-Cocke-Jelinek-Raviv algorithm  might be mentioned.
The FG/SPA, however, often works well even when the mentioned conditions are violated. First of all, the FG might contain loops. In such a case, the FG/SPA works only approximately. Many iterative algorithms, such as iterative decoding, can be formulated as a looped FG/SPA. Several works have focused on the convergence criteria for the looped FG/SPA [3, 4]. The role of the scheduling algorithm becomes important for the looped FG/SPA. A number of results were proposed in .
In contrast to these principal difficulties, the representation of the messages (and of the corresponding update rules) is an implementation-related problem.
2.2. FG Processing with Mixed Continuous and Discrete Variables
The most straightforward representation is a discretization (sampling) of the continuous message. The message is represented by a piecewise-constant function. An exact (we mean exact with respect to the definition of the SPA) evaluation of the update rules is approximated by the numerical integration with the rectangular rule [6–8]. The discretization of the continuously valued message is straightforward but highly inefficient in terms of the number of coefficients required to obtain a given fidelity goal. This representation was adopted as a reference model in this paper (Section 3.4).
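The sample representation and the rectangular-rule update described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the grid size, the wrapped-Gaussian factor, and its width `sigma` are all assumptions chosen only to make the example runnable.

```python
import numpy as np

def rectangle_rule_update(factor, msg_in, grid):
    """Approximate out(y) = integral of factor(x, y) * msg_in(x) dx on a uniform grid."""
    step = grid[1] - grid[0]
    # kernel[j, k] = factor(x_k, y_j): rows index the output variable y
    kernel = factor(grid[None, :], grid[:, None])
    out = kernel @ msg_in * step          # rectangular rule: sum over x_k times the step
    return out / (out.sum() * step)       # SPA messages are defined up to scale; normalize

def wrapped_gaussian_factor(x, y, sigma=0.3):
    """Hypothetical factor: Gaussian in the phase difference, wrapped to (-pi, pi]."""
    d = np.angle(np.exp(1j * (y - x)))
    return np.exp(-d**2 / (2.0 * sigma**2))

N = 256                                    # number of samples (assumed)
grid = np.arange(N) * 2.0 * np.pi / N      # uniform grid on [0, 2*pi)
msg_in = np.exp(4.0 * np.cos(grid - np.pi))            # synthetic input message
msg_in /= msg_in.sum() * (grid[1] - grid[0])
msg_out = rectangle_rule_update(wrapped_gaussian_factor, msg_in, grid)
```

Each update of this kind costs on the order of N squared operations, which illustrates why the sample representation is simple but inefficient when a high fidelity is required.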
The continuously valued message in FG/SPA stands for a PDF up to a scale factor. Thus the message can be easily described by its moments. The main interest is focused on the Gaussian message, which is fully described by its mean and variance. The Gaussian representation is extremely suitable for all linear models (only superposition and scaling factor nodes are allowed). The update rules are then closed-form operations on the Gaussian messages. See  for details.
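As an illustration of why the Gaussian representation suits linear models, the following sketch shows the standard closed-form operations on (mean, variance) pairs. The class and function names are hypothetical, and the three rules are the textbook results for scalar Gaussians, not code taken from the cited work.

```python
from dataclasses import dataclass

@dataclass
class GaussianMessage:
    mean: float
    var: float

def multiply(a, b):
    """Variable (equality) node: pointwise product of two Gaussian messages, up to scale."""
    precision = 1.0 / a.var + 1.0 / b.var            # precisions add
    mean = (a.mean / a.var + b.mean / b.var) / precision
    return GaussianMessage(mean, 1.0 / precision)

def add(a, b):
    """Superposition node z = x + y: the messages are convolved."""
    return GaussianMessage(a.mean + b.mean, a.var + b.var)

def scale(a, c):
    """Scaling node y = c * x."""
    return GaussianMessage(c * a.mean, c * c * a.var)
```

In all three cases the output is again fully described by two numbers, so no integration is needed in the update rules.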
Nevertheless, the use of the Gaussian representation might bring good results also in nonlinear models (e.g., joint phase estimation and data detection ). A mixture Gaussian message (the message given by superposition of several Gaussian kernels) might be also used as a message representation. A common problem of the Gaussian mixtures is the increasing number of mixtures in the update rules. A mixture number reducing approach based on the approximation of the resulting PDF was considered, for example, in [10, 11].
2.3. Canonical Representation of Mixture Densities
A unified design framework based on the canonical distribution was proposed in . This design consists of a set of kernel functions and related parameters describing the message. The sets of parameters are passed through the FG/SPA instead of the continuous messages. Following this framework, iterative decoding algorithms based on the Fourier and Tikhonov parameterizations were proposed in . The parameterizations are suited to channels affected by strong phase noise. The Fourier and Tikhonov parameterizations are, however, chosen only by inferring a suitable shape from the particular application scenario. No systematic general procedure is developed.
2.4. Goals of this Paper and Contributions
We develop a systematic procedure for finding canonical message kernels.
The procedure is based on the stochastic nature of the messages as they evolve in iterations of FG/SPA with random system excitations.
We use a KLT-based procedure, which provides an easy connection between the message description complexity and the fidelity criterion.
The resulting orthonormality of the kernels allows a relatively simple implementation of the update rules.
We demonstrate the procedure on a number of example applications.
3. KLT Message Representation
3.1. Core Principle
This section summarizes the core principle in a "barebones" manner. The details follow in the sections below.
A particular form of this expansion should provide an efficient (minimal dimensionality) representation of the message with well-defined fidelity criterion. The KLT can serve for this efficient representation. It provides orthonormal kernels based on the second-order message statistics. The resulting coefficients are uncorrelated. The second-order moments of the coefficients are also directly related to the residual MSE of the approximation.
The second-order statistics (the correlation function) of the true messages can be easily numerically approximated (by simulation) by an empirical correlation function. A reduced complexity approximation of the message is obtained by truncating the dimensionality of the original vector . Due to the orthonormality of the basis, the residual MSE will be purely additive as a function of the truncation length. Significantly contributing kernels are easily identified by the second moment of the corresponding coefficient. This gives an easy and direct relation between the description complexity and the approximation fidelity.
3.2. KLT Message Representation Details
provides the eigenfunctions as a canonical basis of the message. We index the sorted eigenvalues such as for all : and the eigenfunction is indexed if and only if the pair forms eigenpair, that is, it solves (3).
These coefficients jointly with the set of eigenfunctions describe the message by (1).
The complexity is reduced by omitting several components. We neglect the components with index , where stands for the number of used components (dimensionality of the message). Then we can easily control the MSE of the approximated message by the term .
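For orientation, the standard KLT relations that this subsection relies on can be written as follows. The notation is generic and assumed for this sketch (a real-valued message $\mu(t)$, correlation function $r(s,t)$, eigenpairs $(\lambda_i, \varphi_i)$ with $\lambda_1 \geq \lambda_2 \geq \dots$, coefficients $\chi_i$, and $D$ retained components); it may differ from the symbols and equation numbering used in the paper.

```latex
% Generic KLT notation (assumed; the paper's own symbols may differ)
\begin{align}
  \mu(t) &= \sum_{i=1}^{\infty} \chi_i\,\varphi_i(t),
  & \chi_i &= \int \mu(t)\,\varphi_i(t)\,\mathrm{d}t, \\
  \int r(s,t)\,\varphi_i(s)\,\mathrm{d}s &= \lambda_i\,\varphi_i(t),
  & r(s,t) &= \mathrm{E}\bigl[\mu(s)\,\mu(t)\bigr], \\
  \hat{\mu}_D(t) &= \sum_{i=1}^{D} \chi_i\,\varphi_i(t),
  & \mathrm{E}\Bigl[\int \bigl(\mu(t)-\hat{\mu}_D(t)\bigr)^2\,\mathrm{d}t\Bigr] &= \sum_{i>D} \lambda_i .
\end{align}
```

The last relation follows from the orthonormality of the eigenfunctions together with $\mathrm{E}[\chi_i \chi_j] = \lambda_i \delta_{ij}$, which is why the residual MSE is additive in the discarded eigenvalues.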
Note that, as a result of the KLT approximation, the message might become negative at some points, that is, there may exist such a number that . This violates the assumptions of almost all FN update algorithms and must be rectified by a proper translation.
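One possible reading of such a translation is sketched below: the sampled message is shifted upward by a constant until its minimum is non-negative and is then renormalized. The function name, the `floor` parameter, and the renormalization step are assumptions of this sketch; the source does not specify the exact form of the rectification.

```python
import numpy as np

def rectify_by_translation(msg, step, floor=1e-12):
    """Shift a sampled message by a constant so that its minimum is at least `floor`,
    then renormalize it to unit area (rectangular rule with sample spacing `step`)."""
    shift = max(0.0, floor - msg.min())   # no shift if the message is already positive
    shifted = msg + shift
    return shifted / (shifted.sum() * step)
```

A constant shift preserves the locations of the maxima of the message, but it raises the tails, so it changes the approximation error.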
3.3. Empirical Correlation Function Measurement
Of course, the correlation evaluation requires small discretization steps. But since this operation is done only off-line during the system design phase, its complexity is not an issue at all.
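The off-line procedure (empirical correlation, eigendecomposition, truncation) can be sketched as follows. This is a minimal illustration under stated assumptions: the message realizations are synthetic von Mises-shaped bumps rather than messages collected from an FG/SPA run, the grid size, the number of realizations, and the choice D = 5 are arbitrary, and unit sample weighting is used instead of the integration step.

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 64, 2000                                   # grid points and realizations (assumed)
grid = np.arange(N) * 2.0 * np.pi / N

# Synthetic stand-in for sampled message realizations (one per row).
centres = rng.uniform(0.0, 2.0 * np.pi, size=R)
kappas = rng.uniform(1.0, 8.0, size=R)
msgs = np.exp(kappas[:, None] * (np.cos(grid[None, :] - centres[:, None]) - 1.0))
msgs /= msgs.sum(axis=1, keepdims=True)

# Empirical (non-centred) correlation matrix of the sampled messages.
corr = msgs.T @ msgs / R

# Eigendecomposition; eigh returns ascending eigenvalues, so reorder to descending.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the D most significant kernels and compare the residual MSE
# with the sum of the discarded eigenvalues.
D = 5
kernels = eigvecs[:, :D]                          # orthonormal columns
approx = msgs @ kernels @ kernels.T               # project every realization
empirical_mse = np.mean(np.sum((msgs - approx) ** 2, axis=1))
predicted_mse = eigvals[D:].sum()
```

Because the kernels are eigenvectors of the same empirical correlation matrix that defines the average, `empirical_mse` and `predicted_mse` agree up to rounding, which is the additive relation between truncation length and residual MSE used in Section 3.1.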
3.4. Reference Message Representation Models
Our goal is to compare the capabilities of the message types (KLT against others) to represent the reference message as exactly as possible. We assume that we use a reference model without any implementation issues affecting the message representation and the update rules. Thus we are not interested in the update rules for particular representations. This is an important difference in contrast to other works (e.g., [6, 14]), where the authors try to obtain the message representation jointly with the update rules.
For our analysis, we need only an unambiguous relation between the reference (possibly highly complex) message and its approximation. In each iteration, the relation is used for the evaluation of the approximated message which is then inserted into the run of the reference model instead of the original reference message during the analysis. The results of this analysis might be interpreted as the ideal behavior of the particular message representation with an exact implementation of the update rules.
All considered representations in this section are only based on a deterministic description. Thus we might lighten the notation a little bit. The reference message is denoted by . We suppose the following representations.
3.4.1. Sample Representation
The discretization of the continuous message is a straightforward method of the practically feasible representation as it was discussed in Section 2.2 or [6, 12]. An arbitrary precision might be achieved using this representation (of course at the expense of the high complexity). Nevertheless, the representation offers a direct way to the empirical evaluation of the autocorrelation function (see Section 3.3). Thus it is a suitable option for our reference model.
The sample representation is considered in two cases. The first is the reference model, where we use enough samples that the approximation error of the message can be neglected. The second is a low-dimensional sample representation, which is compared with the proposed KLT message for a given dimensionality.
3.4.2. Fourier Representation
where and . Dimensionality is given by . In , the authors have derived update rules for the Fourier coefficients in a special case of random-walk phase model.
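A Fourier representation of a sampled phase message can be sketched as a projection onto an orthonormal harmonic basis on the interval from 0 to 2π. The function names, the basis normalization, and the parameter `n_harmonics` are assumptions of this sketch and are not taken from the cited update rules.

```python
import numpy as np

def fourier_basis(grid, n_harmonics):
    """Rows: constant, cos(k t), sin(k t) for k = 1..n_harmonics, orthonormal on [0, 2*pi)."""
    rows = [np.ones_like(grid) / np.sqrt(2.0 * np.pi)]
    for k in range(1, n_harmonics + 1):
        rows.append(np.cos(k * grid) / np.sqrt(np.pi))
        rows.append(np.sin(k * grid) / np.sqrt(np.pi))
    return np.array(rows)

def fourier_approx(msg, grid, n_harmonics):
    """Return the coefficients and the truncated reconstruction of a sampled message."""
    step = grid[1] - grid[0]
    basis = fourier_basis(grid, n_harmonics)
    coeffs = basis @ msg * step          # inner products by the rectangular rule
    return coeffs, coeffs @ basis
```

In this sketch a message with `n_harmonics` harmonics is described by `2 * n_harmonics + 1` real coefficients.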
3.4.3. Dirac-Delta Representation
3.4.4. Gaussian Representation
4. Application Examples and Discussion of Results
The properties of the proposed method are demonstrated using different models. First, the models are introduced; then the FG of the models, including the reference case, is discussed. Finally, the numerical results obtained from the models are presented and discussed.
4.1. System Model
4.1.1. Signal Space Models
We assume a binary i.i.d. data vector of length as an input. The coded symbols are given by . The modulated signal vector is given by , where is a signal space mapper. The channel is selected to be the AWGN channel with phase shift modeled by the random walk (RW) phase model. The phase as the function of time samples is described by , where is a zero mean real Gaussian random value (RV) with variance . Thus the received signal is , where stands for the vector of complex zero mean Gaussian RV with variance . The model is depicted in Figure 1.
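The signal space model with the random-walk phase can be simulated along the following lines. This is an uncoded sketch under assumptions: the Gray-mapped unit-energy QPSK mapper, the uniformly distributed initial phase, the convention that `sigma_n**2` is the variance per complex noise sample, and all names are illustrative choices rather than parameters given in the source.

```python
import numpy as np

def simulate_rw_phase_channel(n_symbols, sigma_w, sigma_n, rng):
    """Uncoded QPSK over an AWGN channel with a random-walk (RW) phase shift."""
    bits = rng.integers(0, 2, size=(n_symbols, 2))
    # Assumed mapper: Gray-mapped QPSK with unit symbol energy.
    symbols = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2.0)
    # RW phase: each sample equals the previous one plus a zero-mean Gaussian increment.
    theta0 = rng.uniform(0.0, 2.0 * np.pi)            # assumed initial phase distribution
    increments = rng.normal(0.0, sigma_w, size=n_symbols - 1)
    phase = np.mod(theta0 + np.concatenate(([0.0], np.cumsum(increments))), 2.0 * np.pi)
    # Complex AWGN with variance sigma_n**2 per complex sample (assumed convention).
    noise = (rng.normal(0.0, sigma_n, n_symbols)
             + 1j * rng.normal(0.0, sigma_n, n_symbols)) / np.sqrt(2.0)
    received = symbols * np.exp(1j * phase) + noise
    return bits, symbols, phase, received
```

Running the function repeatedly with fresh random seeds is one way to generate the random system excitations needed to collect message realizations for the stochastic analysis.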
4.1.2. Phase Space Model
We again assume the vector as an input into the minimum-shift keying (MSK) modulator. The modulator is modeled by the canonical form, that is, by the continuous phase encoder (CPE) and nonlinear memoryless modulator (NMM) as shown in . The modulator is implemented in the discrete time with two samples per symbol. The phase of the MSK signal is given by , where is the -th sample of the phase function, is the -th state of the CPE and the sampled modulated signal is given by . The communication channel is selected to be the AWGN channel with constant phase shift, that is, , where stands for the received vector, is the constant phase shift of the channel and is the AWGN vector. The nonlinear limiter phase discriminator (LPD) captures the phase of the vector , that is, . The whole system is shown in Figure 2.
4.2. Factor Graph of the System at Hand
One can see that for both models. The FG is depicted in Figure 3. We briefly describe the factor nodes and message types present in the FG.
4.2.1. Factor Nodes
Factor Nodes in the Signal Space Models
Phase Shift (PS):
Factor Nodes in the Phase Space Model
Phase Shift (PS):
Factor Nodes Common for Both Signal and Phase Space
Random Walk (RW):
Other Factor Nodes
Other factor nodes such as the coder, CPE, and signal space mappers FN are situated in the discrete part of the model and their description is obvious from the definition of the related components (see, e.g.,  for an example of such a description).
4.2.2. Message Types Presented in the FG/SPA
The FG contains both discrete and continuous messages. The discrete messages are present in the coder. There is no need to investigate their representation, because they are exactly represented by a PMF. Several parameterizable continuous messages can be represented exactly using a straightforward parameterization (e.g., the Gaussian message). These messages are present in the AWGN channel model. The rest of the messages are mixed continuous and discrete messages. These mixture messages are continuously valued messages with no obvious representation. They are situated in the phase model.
4.3. The FG/SPA Reference Model
The empirical stochastic analysis requires a sample of message realizations. Thus we ideally need a perfect implementation of the FG/SPA for each model. We call this perfect (or, better said, almost perfect) FG/SPA implementation the reference model. Note that even if the implementation of the FG/SPA is perfect, the convergence of the FG/SPA is still not guaranteed in the looped cases. By a perfect implementation we mean a model that does not suffer from implementation-related issues such as the update rule design and the message representation. The flooding-schedule message passing algorithm is assumed. The reference model might suffer (and our models do) from high numerical complexity and is therefore unsuitable for a direct implementation.
4.3.1. Discrete Type Messages
They are situated in the discrete part of the FG/SPA. As we have already said, their representation by PMF and the exact evaluation of the update rules according to the definition  are straightforward.
4.3.2. Unimportant Messages
The messages from PS factor node to the observation ( , , , and ) lead to the open branch and neither an update nor a representation of them is required, because these messages cannot affect the estimation of the target parameters (data, phase).
4.3.3. Parameterizable Continuous Messages
The messages and are representable by a number meaning , and are representable by the pair meaning , , respectively. One can easily find the slightly modified update rules derived from the standard update rules. Examples of those may be seen in .
4.3.4. Mixture Messages
The representation of the remaining messages, that is, , , , , , and , is not obvious. These messages are thus discretized and the marginalization is performed using numerical integration with the rectangle rule [8, 12] in the update rules. The number of samples is chosen sufficiently large so that the impact of the approximation can be neglected. The mentioned mixture messages are real valued one-dimensional functions for all considered models.
We specify four scenarios for the purpose of the analysis. All of the scenarios can be seen as special cases of the system model described in Section 4.1. All scenarios except the first one use an FG/SPA containing loops.
4.4.1. Uncoded QPSK Modulation
The QPSK modulation is used over the AWGN channel with the RW model of the phase shift. The length of the frame is data symbols, the length of the preamble is 4 symbols and the preamble is situated at the beginning of the frame. The variance of the phase noise equals . This scenario is cycle-free and thus only inaccuracies caused by the imperfect implementation are present. The information needed to resolve the phase ambiguity is contained in the preamble and, by a proper selection of the analyzed message, we can maximize the impact of the approximation on key metrics such as the BER or the MSE of the phase estimation. We thus select the message to be analyzed.
4.4.2. Coded 8PSK Modulation
In addition to the previous scenario, a convolutional coder is present. The length of the frame is data symbols, the length of the preamble is 2 symbols. The same message is selected to be analyzed .
4.4.3. MSK Modulation with Constant-Phase Model of the Phase Shift
The length of the frame is data symbols. The analyzed message is . These messages are nearly equal for all possible PS factor nodes (e.g., ).
4.4.4. Bit-Interleaved Coded Modulation
The model employs a bit-interleaved coded modulation (BICM) with convolutional code and QPSK signal-space mapper. The phase is modeled by the RW model with . The length of the frame is data symbols and 150 of those are pilot symbols. This model slightly changes our concept. Instead of the investigation of the single message, we analyze all and messages jointly. It means that all of the analyzed messages are approximated in the simulations and the stochastic analysis is performed over all investigated messages.
4.5. Eigensystem Analysis
The first objective is to investigate the eigensystem of the mixture messages. We demonstrate the analysis by numerical evaluation of the eigenvalues and eigenvectors for various scenarios mentioned before.
4.6. Relation of the MSE of the Approximated Message with the Target Criteria Metrics
The KLT-approximated message provides the best approximation in the MSE sense. Minimizing the MSE of the approximated message, however, does not guarantee the minimization of target metrics such as the MSE of the phase estimation or the BER of the data decoding. We have therefore performed several numerical simulations to inspect the behavior of the KLT-approximated message. We also include the message types described in Section 3.4 in our simulations.
A few notes are in order before discussing the results. The MSE of the phase estimation is computed as an average over the MSEs of all phase estimates in the model. The measurement of the MSE is limited by the granularity of the reference model. The simulations for the stochastic analysis are numerically intensive, and we are limited by the available computing resources. The BER simulations might suffer from this, especially at small error rates. The threshold of the detectable error rate is about for the uncoded QPSK model and for the BICM model.
4.6.1. Simulation Results for the Uncoded QPSK Modulation
Another interesting point can be seen in Figure 9. When the sixth component is added to the KLT (and also the Fourier) canonical representation, the performance is slightly worse than with the five-component approximated message. This means that the proportional relationship between the MSE of the approximated message and the BER does not hold, at least in this case.
The representation by samples does not seem to work well. This is probably caused by the relatively high SNR: a few samples can hardly cover the narrow shape of the message. The Gaussian message is limited by its inability to describe the phase in the vicinity of 0 and . A relatively good result is achieved using the Dirac-Delta message.
4.6.2. Simulation Results for the BICM
The last measurement was performed with the BICM model for SNR=8 dB. As it was mentioned, the randomness of the message is given not only by the iteration and the observation vector, but also by the position in the FG (of course only the messages and ).
Furthermore, we can observe a good behavior of the Dirac-Delta message in the BER measurement case. The MSE of the phase estimation, however, does not give such good results for the Dirac-Delta representation.
We have proposed a methodical way of obtaining a canonical message representation based on the KLT. The method itself is not restricted to a particular scenario. It is sufficient to have a stochastic description of the message or at least a sufficient number of message realizations. The method, as presented, is restricted to real-valued one-dimensional messages in the FG/SPA.
We presented several example implementations of the method for several particular scenarios. The investigated message describes the phase shift of the communication channel for all models. The results of the simulations show that the KLT analysis of the message leads to the harmonic functions (or functions very similar) for all considered models and parameters. One might offer a conclusion that the KLT-basis is given only by the variable described by the analyzed message (the phase shift in our case).
The next point is also a consequence of the observation that the KLT analysis of the message leads to harmonic functions. A linear basis built from harmonic functions therefore optimizes the MSE of the approximated messages for the considered models.
We also evaluated some crucial performance metrics (BER and MSE of the phase estimation) for differently corrupted messages. The corruption consists in the incompleteness of the message (the number of canonical basis functions). We compared the KLT-approximated message with several message types presented in the literature. We compare only the message representations; the update rules are performed "ideally" by numerical integration in the simulations. The Fourier representation presented in  seems to offer the best complexity/fidelity trade-off for the considered models. The KLT approximation gives the same results as the Fourier representation in the model where a relatively good stochastic description is available. In the second model, the Fourier representation slightly outperforms the KLT representation, but this may be caused by an insufficient stochastic analysis of the message. The Dirac-Delta representation offers an interesting complexity/fidelity trade-off for the BER evaluation. The results of the Gaussian representation are limited by its inability to describe the phase in the vicinity of 0 and .
Finally, we have found a case where an increase in the approximation dimensionality negatively affects the performance of both the Fourier and the KLT message representations. It shows that the relation between the MSE of the approximated message and both the BER and the MSE of the phase estimation is not generally proportional, as one might expect.
This work was supported by the European Science Foundation through COST Action 2100, FP7-ICT SAPHYRE project, the Grant Agency of the Czech Republic, Grant 102/09/1624, and the Ministry of Education, Youth and Sports of the Czech Republic, prog. MSM6840770014, Grant OC188 and by the Grant Agency of the Czech Technical University in Prague, Grant no. SGS10/287/OHK3/3T/13.
- Kschischang FR, Frey BJ, Loeliger HA: Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 2001, 47(2):498-519. 10.1109/18.910572
- Loeliger HA: An introduction to factor graphs. IEEE Signal Processing Magazine 2004, 21(1):28-41. 10.1109/MSP.2004.1267047
- Mooij JM, Kappen HJ: Sufficient conditions for convergence of the sum-product algorithm. IEEE Transactions on Information Theory 2007, 53(12):4422-4437.
- Yedidia JS, Freeman WT, Weiss Y: Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Transactions on Information Theory 2005, 51(7):2282-2312. 10.1109/TIT.2005.850085
- Elidan G, McGraw I, Koller D: Residual belief propagation: informed scheduling for asynchronous message passing. Proceedings of the 22nd Conference on Uncertainty in AI (UAI '06), July 2006, Boston, Mass, USA
- Dauwels J, Loeliger HA: Phase estimation by message passing. Proceedings of the IEEE International Conference on Communications, June 2004 523-527.
- Loeliger HA: Some remarks on factor graphs. Proceedings of the 3rd International Symposium on Turbo Codes and Related Topics, 2003 111-115.
- Loeliger HA, Dauwels J, Hu J, Korl S, Ping L, Kschischang FR: The factor graph approach to model-based signal processing. Proceedings of the IEEE 2007, 95(6):1295-1322.
- Sykora J, Prochazka P: Error rate performance of the factor graph phase space CPM iterative decoder with modulo mean canonical messages. COST 2100 MCM, June 2008, Trondheim, Norway 1-7.
- Simoens F, Moeneclaey M: Code-aided estimation and detection on time-varying correlated MIMO channels: a factor graph approach. EURASIP Journal on Applied Signal Processing 2006, 2006:-11.
- Kurkoski B, Dauwels J: Message-passing decoding of lattices using Gaussian mixtures. Proceedings of the IEEE International Symposium on Information Theory (ISIT '08), July 2008 2489-2493.
- Dauwels JHG: On graphical models for communications and machine learning: algorithms, bounds, and analog implementation, Ph.D. dissertation. Swiss Federal Institute of Technology, Zürich, Switzerland; May 2006.
- Worthen AP, Stark WE: Unified design of iterative receivers using factor graphs. IEEE Transactions on Information Theory 2001, 47(2):843-849. 10.1109/18.910595
- Colavolpe G, Barbieri A, Caire G: Algorithms for iterative decoding in the presence of strong phase noise. IEEE Journal on Selected Areas in Communications 2005, 23(9):1748-1757.
- Rimoldi BE: A decomposition approach to CPM. IEEE Transactions on Information Theory 1988, 34(2):260-270. 10.1109/18.2634
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.