Iterative Fusion of Distributed Decisions over the Gaussian Multiple-Access Channel Using Concatenated BCH-LDGM Codes



Introduction
In recent years, the scientific community has shown an ever-growing research interest in Sensor Networks (SN) as a means to efficiently monitor physical or environmental conditions without incurring expensive deployment and/or operational costs. Generally speaking, these communication networks consist of a large number of nodes deployed over a certain geographical area and with a high degree of autonomy. Such autonomy is usually attained by means of advanced battery designs, an efficient exploitation of the available radio resources, and/or cooperative communication schemes and protocols. In fact, cooperation between nearby sensors permits the network to operate as a global entity and execute actions in a computationally cheap albeit reliable fashion. Unfortunately, the capacity of SNs to achieve a high energy efficiency is strongly determined by the scalability of these sensor meshes. In this context, a large number of challenging paradigms have been tackled with the aim of minimizing the power consumption and improving the battery lifetime of densely populated networks. Among them, it is worth mentioning distributed compression [1,2], transmission and/or cluster scheduling [3,4], data aggregation [5][6][7], multihop cooperative processing [8,9], in-network data storage [10], and power harvesting [11,12]. This work focuses on one of such paradigms: the centralized data fusion scenario (see Figure 1), where N nodes monitor a given information source S (representing, for instance, temperature, pressure, or any other physical phenomenon) and transmit their sensed data to a common receiver. This receiver combines the data from the sensors so as to obtain a reliable estimation of the information from the original source S.
When the monitoring procedure at each node is subject to a nonzero probability of sensing error, one can intuitively infer that the more sensors are added to this setup, the higher the accuracy of the estimation will be with respect to the case of a single sensor. Therefore, the challenge in this specific scenario lies in how to optimally fuse the information from all sources while taking into account the aforementioned probability of sensing error, especially when dealing with practical communication channels. One of the first contributions in this area was made by Lauer et al. in [13], who extended classical results from decision theory to the case of distributed correlated signals. Subsequently, Ekchian and Tenney [14] formulated the distributed detection problem for several network topologies. Later, in [15] Chair and Varshney derived an optimum data fusion rule which combines individually performed decisions on the data sensed at every sensor. This data fusion rule was shown to minimize the end-to-end probability of error of the overall system. More recently, several contributions have tackled the data fusion problem in diverse uncoded communication scenarios, for example, multihop networks subject to fading [16][17][18] and delays [19], parallel channels subject to fading [20][21][22], and asynchronous multiple-access channels [23,24], among others.
On the other hand, when dealing with coded scenarios over noisy channels, it is important to point out that the data fusion problem can be regarded as a particular case of the so-called distributed joint source-channel coding of correlated sources, since the nonzero probability of sensing error imposes a spatial correlation among the data registered by the sensors. In the last decade, intense research effort has been devoted to the design of practical iteratively decodable (i.e., Turbo-like) joint source-channel coding schemes for the transmission of spatially and temporally correlated sources over diverse communication channels; for example, see [25][26][27][28][29][30][31] and references therein. However, these contributions address the reliable transmission of the information generated by a set of correlated sensors, whereas the encoded data fusion paradigm focuses on the reliable communication of an information source S read by a set of N sensors subject to a nonzero probability of sensing error; consequently, a certain error tolerance can be permitted when detecting the data registered by a given sensor. In this encoded data fusion setup, different Turbo-like codes have been proposed for iterative decoding and data fusion of multiple-sensor scenarios for the simplistic case of parallel AWGN channels, for example, Low Density Generator Matrix (LDGM) [32], Irregular Repeat-Accumulate (IRA) [33], and concatenated Zigzag [34] codes. In such references, it was shown that an iterative joint decoding and data fusion strategy performs better than a sequential scheme where decoding and data fusion are separately executed.
Following this research trend, this paper considers the data fusion scenario where the data sensed by N nodes is transmitted to a common receiver over a Gaussian Multiple-Access Channel (MAC). In this scenario, it is well known that the spatial correlation between the data registered by the sensors should be preserved between the transmitted signals so as to maximize the effective signal-to-noise ratio (SNR) at the receiver. To this end, correlation-preserving LDGM codes have been extensively studied for the problem of joint source-channel coding of correlated sensors over the MAC [35][36][37][38]. In these references, it was shown that concatenated LDGM schemes drastically reduce the error floor inherent to LDGM codes. Inspired by this previous work, in this paper we take a step further by analyzing the performance of concatenated BCH-LDGM codes for encoded data fusion over the Gaussian MAC. Specifically, our contribution is twofold: on one hand, we design an iterative receiver that jointly performs LDGM decoding and data fusion based on factor graphs and the Sum-Product Algorithm. On the other hand, we show that for the particular data fusion scenario under consideration, the error statistics in the decoded information from the sensors allow for the concatenation of BCH codes [39,40] in order to decrease the aforementioned intrinsic error floor of single LDGM codes. Extensive Monte Carlo simulations verify that the proposed concatenated BCH-LDGM codes not only vastly outperform the suboptimum limit assuming separation between distributed source and channel coding, but also reach the theoretical residual error bound derived by assuming errorless detection and decoding of the sensor data.
The rest of the paper is organized as follows: Section 2 delves into the system model of the considered encoded data fusion scenario, whereas Section 3 elaborates on the design of the iterative decoding and data fusion procedure. Next, Section 4 discusses Monte Carlo simulation results and, finally, Section 5 ends the paper by drawing some concluding remarks.

System Model
The overall coding rate is $R_c = R_{\text{out}}\, d_c/(d_c + d_v)$, where $R_{\text{out}}$ is the rate of the outer BCH code. Notice that due to the low-density nature of LDGM matrices, correlation is preserved not only in the systematic bits but also in the coded bits. Therefore, in order to exploit this correlation, the generator matrices are set exactly the same for all sensors. The output sequence of the concatenated encoder at every sensor, $\{c_l^n\}_{l=1}^{L}$, is composed of a first set of $K$ bits corresponding to the systematic bits and a final set of $L - L_{\text{out}}$ LDGM parity bits $\{p_l^n\}_{l=L_{\text{out}}+1}^{L}$. These encoded sequences are then BPSK (Binary Phase Shift Keying) modulated and transmitted to a common receiver over a Gaussian Multiple-Access Channel.
The signal at the receiver is expressed as
$$y_l = \sum_{n=1}^{N} h_l^n\, \phi(c_l^n) + n_l, \qquad l \in \{1, \ldots, L\},$$
where $\phi : \{0, 1\} \to \{-\sqrt{E_c}, +\sqrt{E_c}\}$ stands for the BPSK modulation mapping, and $E_c$ represents the average energy per channel symbol and sensor. The Gaussian MAC considered in this work assumes $h_l^n = 1$ for all $l \in \{1, \ldots, L\}$ and for all $n \in \{1, \ldots, N\}$, whereas $\{n_l\}_{l=1}^{L}$ are i.i.d. circularly symmetric complex Gaussian random variables with zero mean and variance per dimension $\sigma^2$. Nevertheless, the explanations hereafter make no assumptions on the value of the MAC coefficients. The joint receiver must estimate the original information based on the received sequence $\{y_l\}_{l=1}^{L}$. This is done by applying the message-passing Sum-Product Algorithm (SPA, see [41] and references therein) over the whole factor graph describing the statistical dependence between $\{y_l\}_{l=1}^{L}$ and $\{x_k^S\}_{k=1}^{K}$, as will be explained in the next section.
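The channel model above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' simulator: the function names, the bit-to-symbol direction (0 to $-\sqrt{E_c}$, 1 to $+\sqrt{E_c}$), and the default unit coefficients are assumptions made for illustration.

```python
import numpy as np

def bpsk(bits, e_c=1.0):
    """Map bit 0 -> -sqrt(E_c) and bit 1 -> +sqrt(E_c) (assumed direction)."""
    return np.sqrt(e_c) * (2.0 * np.asarray(bits, dtype=float) - 1.0)

def gaussian_mac(codewords, sigma, e_c=1.0, h=None, rng=None):
    """Superimpose the BPSK symbols of N sensors and add complex Gaussian noise.

    codewords: (N, L) array of bits, one row per sensor.
    sigma:     noise standard deviation per real dimension.
    h:         optional (N, L) array of MAC coefficients (all ones if omitted).
    """
    rng = np.random.default_rng() if rng is None else rng
    codewords = np.asarray(codewords)
    h = np.ones(codewords.shape) if h is None else np.asarray(h, dtype=float)
    # Noiseless superposition of the N transmitted symbols at each time index l.
    signal = np.sum(h * bpsk(codewords, e_c), axis=0)
    # Circularly symmetric complex noise with variance sigma^2 per dimension.
    noise = sigma * (rng.standard_normal(signal.shape)
                     + 1j * rng.standard_normal(signal.shape))
    return signal + noise
```

With two sensors sending identical bits, the noiseless output takes the values $\pm 2\sqrt{E_c}$, which is the SNR gain that correlation-preserving codes aim to retain.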

Iterative Joint Decoding and Data Fusion
In order to estimate the aforementioned original information sequence $\{x_k^S\}_{k=1}^{K}$, the optimum joint receiver would apply the symbolwise Maximum A Posteriori (MAP) decision criterion, that is,
$$\hat{x}_k^S = \arg\max_{x \in \{0,1\}} P\left(x_k^S = x \mid \{y_l\}_{l=1}^{L}\right),$$
where $P(\cdot \mid \cdot)$ denotes conditional probability. To efficiently perform the above decision criterion, a suboptimum practical scheme would first compute the conditional probabilities of the encoded symbol $c_l^n$ given the received symbol $y_l$, which are given, for $l \in \{1, \ldots, L\}$ and $n \in \{1, \ldots, N\}$, as
$$P(c_l^n \mid y_l) \propto \sum_{\sim c_l^n} p\left(y_l \mid c_l^1, \ldots, c_l^N\right),$$
where the proportionality constant ensures that $P(0 \mid y_l) + P(1 \mid y_l) = 1$ for all $l \in \{1, \ldots, L\}$, and $\sim c_l^n$ denotes that all binary variables are included in the sum except $c_l^n$, that is, the sum is evaluated over all the $2^{N-1}$ possible combinations of the set $\{c_l^m\}_{m \neq n}$. Based on these probabilities, the sequence of each sensor would then be estimated through (1) iterative LDGM decoding based on $\{P(c_l^n \mid y_l)\}_{l=1}^{L}$, carried out independently of the LDGM decoding procedures of the other $N - 1$ sensors, and (2) an outer BCH decoding based on the hard-decoded sequence at the output of the LDGM decoder. Finally, the $N$ recovered sensor sequences would be fused by symbolwise majority voting over the estimated $N$ sensor sequences. Notice that this practical scheme sequentially performs channel detection, LDGM decoding, BCH decoding, and fusion of the decoded data. However, the above separate approach can be easily outperformed if one notices that, since we assume $0 < p_n < 0.5$ for all $n \in \{1, \ldots, N\}$ (see Section 2), the sensor sequences $\{x_k^n\}_{k=1}^{K}$ and $\{x_k^m\}_{k=1}^{K}$ are correlated for $n \neq m$. As widely evidenced in the literature related to the transmission of correlated information sources (see references in Section 1), this correlation should be exploited at the receiver in order to enhance the reliability of the fused sequence $\{\hat{x}_k^S\}_{k=1}^{K}$. In other words, the considered scenario should take advantage of this correlation, not only by means of an enhanced effective SNR at the receiver thanks to the correlation-preserving properties of LDGM codes, but also through the exploitation of the statistical relation between the sequences $\{x_k^n\}_{k=1}^{K}$ corresponding to different sensors $n \in \{1, \ldots, N\}$. The latter dependence between $\{x_k^n\}_{n=1}^{N}$ and $x_k^S$ can be efficiently capitalized on by (1) describing the joint probability distribution of all the variables involved in the system by means of factor graphs and (2) marginalizing for $x_k^S$ via the message-passing Sum-Product Algorithm (SPA). This methodology decreases the computational complexity with respect to a direct marginalization based on exhaustive evaluation of the entire joint probability distribution. In particular, the statistical relation between sensor sequences is exploited in one of the constituent factor subgraphs of the receiver, as will be detailed later.
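The separate (sequential) baseline described above starts from per-symbol channel posteriors and ends with majority voting. The sketch below illustrates these two steps only (LDGM and BCH decoding are omitted). It assumes equiprobable code symbols, a Gaussian likelihood proportional to $\exp(-|y_l - \sum_m h_l^m \phi(c_l^m)|^2 / 2\sigma^2)$, and a tie-break towards 0 in the vote; all function names are hypothetical.

```python
import itertools
import numpy as np

def symbol_posteriors(y_l, n_sensors, sigma, e_c=1.0, h=None):
    """Return an (N, 2) array whose entry [n, c] approximates P(c_l^n = c | y_l).

    The sum runs over all 2^N symbol combinations, so that each entry [n, c]
    accumulates the 2^(N-1) combinations of the remaining sensors.
    """
    h = np.ones(n_sensors) if h is None else np.asarray(h, dtype=float)
    post = np.zeros((n_sensors, 2))
    for bits in itertools.product((0, 1), repeat=n_sensors):
        symbols = np.sqrt(e_c) * (2.0 * np.array(bits) - 1.0)
        # Gaussian likelihood of y_l given this combination (assumed form).
        likelihood = np.exp(-abs(y_l - np.dot(h, symbols)) ** 2
                            / (2.0 * sigma ** 2))
        for n, c in enumerate(bits):
            post[n, c] += likelihood
    return post / post.sum(axis=1, keepdims=True)

def majority_vote(sensor_bits):
    """Fuse an (N, K) array of hard decisions; ties resolve to 0 (assumption)."""
    sensor_bits = np.asarray(sensor_bits)
    return (2 * sensor_bits.sum(axis=0) > sensor_bits.shape[0]).astype(int)
```

Note that for $N = 2$ and a received value near zero both posteriors are close to 0.5: the channel output cannot tell which sensor sent which symbol. This is the MAC ambiguity that the joint scheme resolves through the code and the sensing correlation.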
This factor graph is exemplified in Figure 3(a), where the graph structure of the joint detector, decoder, and data fusion scheme is depicted for $N = 4$ sensors. As shown in this plot, the graph is built by interconnecting different subgraphs: the graph modeling the statistical dependence between $x_k^S$ and $\{x_k^n\}_{n=1}^{N}$ for all $k \in \{1, \ldots, K\}$ (labeled as SENSING), the factor graph that relates the sensor sequence $\{x_k^n\}_{k=1}^{K}$ to the codeword $\{c_l^n\}_{l=1}^{L}$ through the LDGM parity check matrix $H$ and the BCH code (to be detailed later), and the relationship between the received sequence $\{y_l\}_{l=1}^{L}$ and the $N$ codewords $\{c_l^n\}_{l=1}^{L}$, with $n \in \{1, \ldots, N\}$ (labeled as MAC). Observe that the interconnection between subgraphs is done via the variable nodes corresponding to $c_l^n$ and $x_k^n$. In this context, since the concatenation of the LDGM and BCH codes is systematic, variable nodes $\{c_l^n\}_{l=1}^{K}$ and $\{x_k^n\}_{k=1}^{K}$ collapse into a single node for all $n \in \{1, \ldots, N\}$, which has not been shown in the plots for the sake of clarity. Before delving into each subgraph, it is also important to note that this interconnected set of subgraphs embodies an overall cyclic factor graph over which the SPA iterates, for a fixed number of iterations $I$, in an order that starts at the MAC subgraph. Let us start by analyzing the MAC subgraph, which is represented in Figure 3(b). In this subgraph, the variable node $b_l$ takes values $\wp$ in a finite set $\mathcal{B}$ and carries the channel information $\zeta_l(\wp)$ derived from the received symbol $y_l$, where the value of the constant $\Theta_l$ scaling $\zeta_l(\wp)$ is selected so as to satisfy $\sum_{\wp \in \mathcal{B}} \zeta_l(\wp) = 1$ for all $l \in \{1, \ldots, L\}$. On the other hand, the function associated with the check node connecting $\{c_l^n\}_{n=1}^{N}$ to $b_l$ is an indicator function, which equals 1 only when the value of $b_l$ is consistent with the symbols $\{c_l^n\}_{n=1}^{N}$ and 0 otherwise. In regard to Figure 3(b), observe that a set of switches controlled by binary variables $\mu_1$ and $\mu_2$ drives the connection/disconnection of systematic ($l \in \{1, \ldots, K\}$) and parity ($l \in \{K + 1, \ldots, L\}$) variable nodes from the MAC subgraph. The reason is that, as detailed later in Section 4, the degradation of the iterative SPA due to short-length cycles in the underlying factor graph can be minimized by properly setting these switches.
The analysis follows by considering Figure 3(c), where the block integrating the BCH decoder is depicted in detail. At this point it is worth mentioning that the rationale behind concatenating the BCH code with the LDGM code lies in the statistics of the errors per simulated block, as the simulation results in Section 4 will clearly show. Based on these statistics, it is concluded that the error floor is due to most of the simulated blocks having a low number of symbols in error, rather than to a few blocks with errors in most of their constituent symbols. Consequently, a BCH code capable of correcting up to $t$ errors can be applied to detect and correct such few errors per block at a small loss in performance. Having said this, the integration of the BCH decoder in the proposed iterative receiver requires some preliminary definitions.
(i) $\delta_{k,j}^n(x)$: a posteriori soft information for the value $x \in \{0, 1\}$ of the node $x_k^n$, which is computed, at iteration $j$ and for $k \in \{1, \ldots, K\}$, as the product of the a posteriori soft information rendered by the SPA when applied to the MAC and LDGM subgraphs.
(ii) $\delta_{l,j}^n(c)$: similar to the previously defined $\delta_{k,j}^n(x)$, this notation refers to the a posteriori information for the value $c \in \{0, 1\}$ of node $c_l^n$, which is calculated, at iteration $j$ and for $l \in \{K + 1, \ldots, L_{\text{out}}\}$, as the product of the corresponding a posteriori information produced at both the MAC and LDGM subgraphs.
(iii) $\xi_{k,j}^n(x)$: extrinsic soft information for $x_k^n = x \in \{0, 1\}$ built upon the information provided by the rest of the sensors at iteration $j$ and time tick $k \in \{1, \ldots, K\}$.
(iv) $\tilde{\delta}_{k,j}^n(x)$: refined a posteriori soft information of node $x_k^n$ for the value $x \in \{0, 1\}$, which is produced as a consequence of the processing stage in Figure 3(c).
Under the above definitions, the processing scheme depicted in Figure 3(c) aims at refining the input soft information coming from the MAC and LDGM subgraphs by first performing a hard decision (HD) on the BCH encoded sequence based on $\{\delta_{k,j}^n(x)\}_{k=1}^{K}$, $\{\delta_{l,j}^n(c)\}_{l=K+1}^{L_{\text{out}}}$, and the information output from the SENSING subgraph in the previous iteration, that is, $\{\xi_{k,j-1}^n(x)\}_{k=1}^{K}$. This is done for all $n \in \{1, \ldots, N\}$ within the current iteration $j$. Once the binary estimated sequence $\{\hat{c}_{l,j}^n\}_{l=1}^{L_{\text{out}}}$ corresponding to the BCH encoded block at the $n$th sensor is obtained and decoded, the binary output $\{\hat{x}_{k,j}^n\}_{k=1}^{K}$ is utilized for adaptively refining the a posteriori soft information $\{\delta_{k,j}^n(x)\}_{k=1}^{K}$ as $\{\tilde{\delta}_{k,j}^n(x)\}_{k=1}^{K}$ under a flipping rule performed for $k \in \{1, \ldots, K\}$: the soft values are kept wherever the BCH decoder confirms the hard decision $\hat{c}_{k,j}^n$, and the soft values for $x = 0$ and $x = 1$ are exchanged wherever the decoder output $\hat{x}_{k,j}^n$ differs from it. It is interesting to observe that, under this rule, all those indices in error detected by the BCH decoder consequently drive a flip in the soft information fed to the SENSING subgraph. Finally, we consider Figure 3(d), corresponding to the SENSING subgraph, where the refined soft information from all sensors is fused to provide an estimation of $x_k^S$ as $\hat{x}_k^S$. Let $\chi_{k,j}^n(x)$ denote the soft information on $x_k^S$ (for the value $x \in \{0, 1\}$ and computed for $k \in \{1, \ldots, K\}$) contributed by sensor $S_n$ at iteration $j$. The SPA applied to this subgraph renders (see [41, equations (5) and (6)])
$$\chi_{k,j}^n(x) = \Gamma_{k,j}^n \left[ (1 - p_n)\, \tilde{\delta}_{k,j}^n(x) + p_n\, \tilde{\delta}_{k,j}^n(1 - x) \right],$$
where $p_n$ denotes the sensing error probability, which in turn establishes the amount of correlation between sensors. Factors $\Gamma_{k,j}^n$ account for the normalization of each pair of messages, that is, $\chi_{k,j}^n(0) + \chi_{k,j}^n(1) = 1$ for all $k, n, j$. The estimation $\hat{x}_k^S(j)$ of $x_k^S$ at iteration $j$ is then given by
$$\hat{x}_k^S(j) = \arg\max_{x \in \{0,1\}} \prod_{n=1}^{N} \chi_{k,j}^n(x),$$
that is, by the product of all messages arriving at variable node $x_k^S$ at iteration $j$. The iteration ends by computing the soft information fed back from the SENSING subgraph directly to the corresponding LDGM decoder, namely,
$$\xi_{k,j}^n(x) = \Upsilon_{k,j}^n \left[ (1 - p_n) \prod_{m \neq n} \chi_{k,j}^m(x) + p_n \prod_{m \neq n} \chi_{k,j}^m(1 - x) \right],$$
where, as before, $\Upsilon_{k,j}^n$ represents a normalization factor for each message pair.
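The refinement and fusion steps above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes that each sensing error acts as a binary symmetric channel with crossover probability $p_n$ (the standard Sum-Product update for such a factor), that the source bit has a uniform prior, and that "flipping" means exchanging the soft values of 0 and 1. Function and variable names are hypothetical.

```python
import numpy as np

def flip_soft_info(delta, hard_in, bch_out):
    """Exchange the soft values at positions that the BCH decoder corrected.

    delta:   (K, 2) soft information, columns indexed by the bit value.
    hard_in: (K,) hard decisions fed to the BCH decoder.
    bch_out: (K,) systematic bits returned by the BCH decoder.
    """
    delta = np.asarray(delta, dtype=float)
    corrected = np.asarray(hard_in) != np.asarray(bch_out)
    refined = delta.copy()
    refined[corrected] = delta[corrected][:, ::-1]
    return refined

def sensing_update(delta_refined, p):
    """One Sum-Product pass over the SENSING subgraph for a single index k.

    delta_refined: (N, 2) refined soft information, one row per sensor.
    p:             (N,) sensing error probabilities.
    Returns (x_hat, xi): the fused bit and the (N, 2) extrinsic information
    fed back towards each sensor's LDGM decoder.
    """
    d = np.asarray(delta_refined, dtype=float)
    p = np.asarray(p, dtype=float)[:, None]
    # Message from each sensor towards the source node (BSC-type factor).
    chi = (1.0 - p) * d + p * d[:, ::-1]
    chi /= chi.sum(axis=1, keepdims=True)
    # Source estimate: product of all incoming messages.
    x_hat = int(np.argmax(np.prod(chi, axis=0)))
    # Extrinsic message to sensor n: product over all other sensors m != n.
    xi = np.empty_like(chi)
    for n in range(chi.shape[0]):
        others = np.prod(np.delete(chi, n, axis=0), axis=0)
        xi[n] = (1.0 - p[n]) * others + p[n] * others[::-1]
    return x_hat, xi / xi.sum(axis=1, keepdims=True)
```

For example, when two of three sensors favour bit 0 and the third favours bit 1, the extrinsic message returned to the third sensor leans towards 0, which is how the sensing correlation helps its decoder in the next iteration.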

Simulation Results
To verify the performance of the proposed system, extensive Monte Carlo simulations have been performed for $N \in \{2, 4, 6\}$ sensors and a sensing error probability set, without loss of generality, to $p_n = p = 5 \cdot 10^{-3}$ for all sensors. The experiments have been divided into two different sets so as to shed light on the aforementioned statistics of the number of errors per block. Accordingly, the first set does not consider any outer BCH coding, and only identical LDGM codes of rate 1/3 (input symbols per coded symbol), with variable and check degree distributions $[d_v\ d_c]$ that include $[12\ 6]$, and input blocklength $K = 10000$ are utilized at every sensor. The number of iterations for the proposed iterative receiver has been set to $I = 50$. The metric adopted for the performance evaluation is the End-to-End Bit Error Rate (BER) between $x_k^S$ and $\hat{x}_k^S$, which is averaged over 2000 different information sequences per simulated point and plotted versus the $E_b/N_0$ ratio per sensor (energy per bit to noise power spectral density ratio). A Gaussian MAC is considered in all simulations by imposing $h_l^n = 1$ for all $l, n$. Before presenting the obtained simulation results, two different performance limits can be derived for each simulated case. On one hand, it can be easily shown that the aforementioned BER metric can be lower bounded by the probability of erroneously detecting $x_k^S$ provided that all sensor symbols $\{x_k^n\}_{n=1}^{N}$ are perfectly recovered, which can be computed, for even $N$, as given in expression (12), that is, as the probability of having more than $N/2$ sensors in error. On the other hand, the minimum $E_b/N_0$ per sensor required for reliable transmission of all sensors can be computed by combining the Slepian-Wolf Theorem [42] for distributed compression of correlated sources and Shannon's Separation Theorem. It can be theoretically proven that this Separation Theorem does not hold for the MAC under consideration. However, this limit may serve as a theoretical reference against which to compare the obtained performance results. This suboptimum limit $E_b^*/N_0$ is given by expression (13).
Figure 4 depicts the End-to-End BER versus the difference between the simulated $E_b/N_0$ and the corresponding $E_b^*/N_0$ limit from expression (13). Also depicted are horizontal limits corresponding to the BER lower bound from expression (12). First observe that since the aforementioned difference is negative, the simulated $E_b/N_0$ is lower than $E_b^*/N_0$, which verifies in practice the suboptimality of the computed separation-based bound. On the other hand, notice that all the BER curves for $N = 2$ coincide with the lower bound in expression (12) (horizontal dashed lines), while the waterfall region of such curves degrades as $[d_v\ d_c]$ increases. However, for $N \in \{4, 6\}$, the error floor (due to the MAC ambiguity of the received sequence about which transmitted symbol corresponds to each sender) is higher than the lower BER bound. By increasing $[d_v\ d_c]$, this error floor diminishes at the cost of degrading the BER waterfall performance.
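The BER lower bound discussed above can be evaluated numerically. The sketch below is a hedged illustration rather than a transcription of expression (12): it computes the probability that more than $N/2$ of the perfectly decoded sensors are in error and, for even $N$, adds a tie term weighted by a hypothetical parameter `tie_weight`. Whether and how the original expression accounts for ties is an assumption here.

```python
from math import comb

def fusion_error_floor(n_sensors, p, tie_weight=0.5):
    """Error probability of majority voting over perfectly decoded sensors.

    tie_weight is the probability of a wrong decision when exactly N/2
    sensors are in error (only relevant for even N); 0.5 models a random
    tie-break, and 0.0 ignores ties altogether (both are assumptions).
    """
    # More than N/2 sensors in error: the vote is certainly wrong.
    prob = sum(comb(n_sensors, n) * p ** n * (1.0 - p) ** (n_sensors - n)
               for n in range(n_sensors // 2 + 1, n_sensors + 1))
    if n_sensors % 2 == 0:
        half = n_sensors // 2
        prob += tie_weight * comb(n_sensors, half) * (p * (1.0 - p)) ** half
    return prob
```

With a random tie-break and $N = 2$, the result equals $p$ itself, which reflects that two equally reliable sensors cannot improve on a single one.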
It is also important to remark that the results plotted in Figure 4 have been obtained by setting the variables controlling the switches from Figure 3(b) to $\mu_1 = \mu_2 = 1$ during the first iteration, while for the remaining $I - 1$ iterations $\mu_1 = \mu_2 = 0$ (i.e., the MAC subgraph is disconnected and does not participate in the message-passing procedure). The rationale behind this setup lies in the length-4 loop connecting variable nodes $x_k^n$, $x_k^m$ ($m \neq n$), $x_k^S$, and $b_k$ for $k \in \{1, \ldots, K\}$, which significantly degrades the performance of the message-passing SPA. Further simulations have been carried out to assess this degradation, which are omitted for the sake of clarity in the present discussion. Based on this result, all simulations henceforth utilize the same switch schedule as the one used for this first set of simulations.
To better understand the error behavior of the proposed scheme in the error floor region, it is useful to analyze the distribution of the number of errors per block at the output of the LDGM decoders. To this end, let CDF($\lambda$) denote the Cumulative Distribution Function of the number of errors per LDGM-decoded block $\lambda$ at iteration $I$, which can be empirically estimated based on the results obtained for the first set of simulations. This function CDF($\lambda$) is depicted for $N = 4$ and $[d_v\ d_c] = [10\ 5]$ (Figure 5(a)) and for $N = 6$ and $[d_v\ d_c] = [12\ 6]$ (Figure 5(b)). In these plots, the distribution function is depicted for every simulated $E_b/N_0$ point and for every constituent LDGM decoder. Observe that over the whole considered $E_b/N_0$ range, the behavior of the CDF is similar for all sensors. Furthermore, when $E_b/N_0$ increases (i.e., when the system operates in the error floor region), the resulting CDF($\lambda$) indicates that most of the decoded blocks contain a relatively small number of errors with respect to the blocksize $K = 10^4$. This conclusion also holds for Figure 5(b) and for the other cases addressed in the first set of simulations.
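An empirical CDF of this kind can be obtained directly from the per-block error counts collected at the decoder output. The following is a minimal sketch with hypothetical function names. The second helper reports the fraction of blocks that a code correcting up to $t$ errors could repair, under the simplifying assumption that the counts refer to the whole BCH-protected block.

```python
import numpy as np

def empirical_cdf(errors_per_block):
    """Return (values, cdf), where cdf[i] is the fraction of blocks
    with at most values[i] errors."""
    errors = np.asarray(errors_per_block)
    values, counts = np.unique(errors, return_counts=True)
    return values, np.cumsum(counts) / errors.size

def fraction_correctable(errors_per_block, t):
    """Fraction of blocks whose error count does not exceed t."""
    return float(np.mean(np.asarray(errors_per_block) <= t))
```

A CDF that rises to nearly 1 at small $\lambda$ indicates that a modest $t$ would already clear most of the residual errors.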
This statistical behavior of the number of errors per decoded block $\lambda$ motivates the inclusion of an outer systematic BCH code whose error correction capability $t$ is adjusted so as to correct the residual errors obtained in the error floor region. However, note that the application of an outer code involves a penalty in energy. Specifically, the $E_b/N_0$ ratio is increased by $10 \log_{10}(1/R_{\text{out}})$ dB, where $R_{\text{out}}$ decreases as the error correction capability $t$ of the BCH code increases. Consequently, a tradeoff between $t$ and its associated rate loss must be met. In this context, Figures 6 and 7 report the End-to-End BER obtained with outer BCH codes for several values of $t$. Observe that in all cases the error floor has been suppressed by virtue of the error correcting capability of the outer BCH code, and consequently the lower bound for the BER metric in expression (12) is reached. At the same time, due to the relatively small value of $t$ with respect to $K$, the energy increase incurred by concatenating an outer BCH code is less than 0.5 dB. In summary, the proposed iterative scheme can be regarded as an efficient and practical approach for encoded data fusion over the MAC, which is shown to outperform the suboptimum separation-based limit while reaching, at the same time, the lower bound for the End-to-End BER.
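The energy penalty of the outer code follows directly from its rate. The snippet below is a small worked sketch of the $10 \log_{10}(1/R_{\text{out}})$ term. The BCH parameters in the usage note are hypothetical and are not the codes simulated in the paper.

```python
import math

def bch_rate(k, l_out):
    """Rate of an outer (l_out, k) block code."""
    return k / l_out

def ebn0_penalty_db(r_out):
    """E_b/N_0 increase, in dB, incurred by an outer code of rate r_out."""
    return 10.0 * math.log10(1.0 / r_out)
```

As an illustration only, consider a binary BCH code built over GF($2^{14}$) and shortened to $K = 10000$ information bits, which needs at most $14t$ parity bits. With $t = 10$ this gives $L_{\text{out}} \le 10140$, and `ebn0_penalty_db(bch_rate(10000, 10140))` evaluates to well under 0.5 dB, in line with the small penalty reported above.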

Concluding Remarks
In this paper, we have investigated the performance of concatenated BCH-LDGM codes for iterative data fusion of distributed decisions over the Gaussian MAC. The use of LDGM codes makes it possible to efficiently exploit the intrinsic spatial correlation between the information registered by the sensors, whereas BCH codes are selected to lower the error floor due to the MAC ambiguity about the transmitted symbols. Specifically, we have designed an iterative receiver comprising channel detection, BCH-LDGM decoding, and data fusion, which has been thoroughly detailed by means of factor graphs and the Sum-Product Algorithm. Furthermore, a specially tailored soft information flipping technique based on the output of the BCH decoding stage has also been included in the proposed iterative receiver. Extensive computer simulation results obtained for a varying number of sensors, LDGM codes, and BCH codes have revealed that (1) our scheme significantly outperforms the suboptimum limit assuming separation between distributed source and capacity-achieving channel coding and (2) the

Figure 1: Generic data fusion scenario where N nodes sense a certain physical parameter S, and transmit the sensed information to a joint receiver.
The encoded sequence at the output of the BCH encoder is next processed through an inner LDGM code, that is, a linear code with low-density generator matrix $G = [I\ P]$. The parity check matrix of LDGM codes is expressed as $H = [P^T\ I]$, where $I$ denotes the identity matrix and $P$ is an $L_{\text{out}} \times (L - L_{\text{out}})$ sparse binary matrix. Variable and check degree distributions are denoted as $[d_v\ d_c]$ (in other words, the parity matrix $P$ of a $(d_v, d_c)$ LDGM code has exactly $d_v$ nonzero entries per row and $d_c$ nonzero entries per column); the overall coding rate is thus given by $R_c = R_{\text{out}}\, d_c/(d_c + d_v)$.
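To make the structure of $G = [I\ P]$ concrete, the sketch below builds a regular sparse matrix $P$ with exactly $d_v$ ones per row and $d_c$ ones per column and uses it for systematic encoding. The construction (a structured edge assignment followed by random row and column permutations) is an assumption chosen for simplicity, not the code design used in the paper, and a dense array is used only because the example is small.

```python
import numpy as np

def regular_ldgm_parity(n_rows, d_v, d_c, rng=None):
    """Build an (n_rows x n_cols) binary matrix with d_v ones per row and
    d_c ones per column, where n_cols = n_rows * d_v / d_c."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_edges = n_rows * d_v
    if n_edges % d_c != 0:
        raise ValueError("n_rows * d_v must be a multiple of d_c")
    n_cols = n_edges // d_c
    if d_v > n_cols:
        raise ValueError("d_v cannot exceed the number of columns")
    # Edge e joins row e // d_v to column e % n_cols: every row receives d_v
    # distinct columns and every column receives exactly d_c edges.
    edges = np.arange(n_edges)
    P = np.zeros((n_rows, n_cols), dtype=np.uint8)
    P[edges // d_v, edges % n_cols] = 1
    # Random permutations break the regular pattern without changing degrees.
    return P[rng.permutation(n_rows)][:, rng.permutation(n_cols)]

def ldgm_encode(u, P):
    """Systematic LDGM encoding: codeword = [u, u P mod 2]."""
    u = np.asarray(u, dtype=np.uint8)
    parity = (u.astype(np.int64) @ P.astype(np.int64)) % 2
    return np.concatenate([u, parity.astype(np.uint8)])
```

With $[d_v\ d_c] = [12\ 6]$ the matrix has twice as many columns as rows, so the inner code has rate $d_c/(d_c + d_v) = 1/3$. For block lengths such as $K = 10^4$, a sparse representation would be preferable to the dense array used here.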

Figure 2: Block diagram of the considered scenario.

Figure 3: (a) Block diagram of the overall factor graph corresponding to the proposed iterative receiver; (b) MAC factor subgraph; (c) adaptive flipping of the exchanged soft information between the LDGM and SENSING subgraphs based on the output of the BCH decoder; (d) SENSING factor subgraph.

where $R_c = R_{\text{out}}\, d_c/(d_c + d_v)$ and the joint binary entropy of the sensors $H(S_1, \ldots, S_N)$ is given by
$$H(S_1, \ldots, S_N) = -\sum_{n=0}^{N} \binom{N}{n} \Pr\{n\ \text{0's}\} \log_2 \Pr\{n\ \text{0's}\}, \qquad (14)$$
with $\Pr\{n\ \text{0's}\} = 0.5\left(p^n (1 - p)^{N-n} + (1 - p)^n p^{N-n}\right)$ denoting the probability of having a given sequence with exactly $n$ zero symbols. In this first simulation set, no outer BCH code is used, hence $R_c = d_c/(d_c + d_v) = 1/3$.
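The joint entropy in expression (14) is straightforward to evaluate numerically. The sketch below assumes, as the definition of $\Pr\{n\ \text{0's}\}$ suggests, that this quantity is the probability of one specific length-$N$ pattern with $n$ zeros, so that each term is weighted by the $\binom{N}{n}$ patterns sharing that count. The function name is hypothetical.

```python
from math import comb, log2

def joint_entropy(n_sensors, p):
    """Joint entropy (in bits) of N binary sensor readings of a uniform
    binary source, each flipped independently with probability p."""
    h = 0.0
    for n in range(n_sensors + 1):
        # Probability of one specific pattern containing exactly n zeros.
        pr = 0.5 * (p ** n * (1 - p) ** (n_sensors - n)
                    + (1 - p) ** n * p ** (n_sensors - n))
        if pr > 0.0:
            h -= comb(n_sensors, n) * pr * log2(pr)
    return h
```

Two sanity checks follow from the model: with $p = 0.5$ the readings are independent and the entropy is $N$ bits, while with $p = 0$ all sensors agree and the entropy is 1 bit.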
Figures 6 and 7 represent the End-to-End BER versus the gap to the separation limit $E_b/N_0 - E_b^*/N_0$ for $N = 4$ (Figures 6(a) and 6(b)), $N = 6$ (Figures 7(a) and 7(b)), and a number of BCH codes with distinct values of the error-correcting parameter $t$.
$\in \{1, \ldots, N\}$. The sensed sequence at each sensor is then encoded through an outer systematic BCH code $(L_{\text{out}}, K, t)$, where $L_{\text{out}}$ and $t$ denote the output sequence length and error correction capability of the code, respectively. (We hereafter adopt this nomenclature, which differs from the standard notation $(L_{\text{out}}, K, d)$, with $d$ denoting the minimum distance of the BCH code.)