A Receiver for Differential Space-Time π/2-Shifted BPSK Modulation Based on Scalar-MSDD and the EM Algorithm

In this paper, we consider the issue of blind detection of Alamouti-type differential space-time (ST) modulation in static Rayleigh fading channels. We focus our attention on a π/2-shifted BPSK constellation, introducing a novel transformation to the received signal such that this binary ST modulation, which has a second-order transmit diversity, is equivalent to QPSK modulation with second-order receive diversity. This equivalent representation allows us to apply a low-complexity detection technique specifically designed for receive diversity, namely, scalar multiple-symbol differential detection (MSDD). To further increase receiver performance, we apply an iterative expectation-maximization (EM) algorithm which performs joint channel estimation and sequence detection. This algorithm uses minimum mean square estimation to obtain channel estimates and the maximum-likelihood principle to detect the transmitted sequence, followed by differential decoding. With receiver complexity proportional to the observation window length, our receiver can achieve the performance of a coherent maximal ratio combining receiver (with differential decoding) in as few as a single EM receiver iteration, provided that the window size of the initial MSDD is sufficiently long. To further demonstrate that the MSDD is a vital part of this receiver setup, we show that an initial ST conventional differential detector would lead to a strange convergence behavior in the EM algorithm.


INTRODUCTION
Differential detection of a differentially encoded phase-shift keying (DPSK) signal is a technique commonly used to recover the transmitted data in a communication system when channel information (on both the amplitude and phase) is absent at the receiver. The performance of DPSK in traditional wireless communication systems employing one transmit antenna and one or more receive antennas is well documented in the literature. In recent years, this encoding-detection concept has been extended to cover the scenario where there is more than one transmit antenna. This leads to differential space-time block codes (STBCs), an extension of the STBCs originally proposed in [1]. Like conventional DPSK, differential STBCs enable us to decode the received signal without knowledge of channel information, provided that the channel remains relatively constant during the observation interval [2,3,4,5,6]. Another similarity between conventional DPSK and differential STBCs is that both suffer a loss in performance when compared to their respective ideal coherent receiver.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
For conventional DPSK, one approach often used to improve receiver performance is to make decisions based on multiple symbols, that is, multiple-symbol differential detection (MSDD). Previous research has demonstrated that when there is only a single channel, that is, only one transmit antenna and one receive antenna, the performance of MSDD can approach that of the ideal coherent detector when N, the observation window length in number of symbol intervals, is sufficiently large [7,8]. This observation is true for both the additive white Gaussian noise (AWGN) channel and the Rayleigh fading channel. Moreover, the computational complexity of MSDD is only on the order of N log N, provided that the channel is constant over the observation window of the detector and that the implementation procedure developed by Mackenthun is employed [9]. For receive-diversity only systems, Simon and Alouini demonstrated again that the performance of an MSDD combiner approaches that of a coherent maximal ratio combining (MRC) receiver with differential decoding, when N is sufficiently large [10]. The application of the MSDD concept to detect differentially encoded STBCs has been considered by a number of authors [11,12,13,14,15]. Their results indicate that space-time MSDD (ST-MSDD) can provide substantial performance improvement over the standard space-time (ST) differential detector in [2]. Unfortunately, for both the MSDD combiner and the ST-MSDD, there is no known efficient algorithm for the optimal implementation of these receivers. The complexity of both optimal receivers is exponential in N. In this paper, we will use the term scalar-MSDD to refer to the optimal MSDD for the single channel case [7,9], and the term vector-MSDD to refer to either an MSDD combiner [10] or an ST-MSDD [11].
In light of the exponential complexity of the optimal vector-MSDD, several suboptimal, reduced-complexity variants have been proposed for detecting differential STBC. For example, Lampe et al. implemented a code-dependent technique with a complexity that is essentially independent of the observation window length of the detector [12,13]. The concept of decision feedback was employed by Schober and Lampe in their MSDD for a system employing both transmit and receive diversity [6]. Similar ideas were also employed by Tarasak and Bhargava in a transmit-diversity only scenario [14], and by Lao and Haimovich in an interference suppression and receive-diversity setting [15]. In addition, Tarasak and Bhargava investigated reducing receiver complexity using a reduced search detection approach [14].
In this paper, we propose an iterative receiver for differential STBC employing a π/2-shifted BPSK constellation, two transmit antennas, and an Alamouti-type code structure [16]. By applying a novel transformation to the received signal, it is shown that this STBC is equivalent to conventional differential QPSK modulation with second-order receive diversity. As a result, selection diversity and scalar-MSDD can be employed in the first pass of our iterative receiver. Due to the low complexity of the scalar-MSDD, a very large window size N (e.g., 64) can be employed to provide the receiver with very accurate initial estimates of the transmitted symbols. Successive iterations of the receiver operations are then based on the expectation-maximization (EM) algorithm [17] for joint channel estimation and sequence detection. Our results show that the iterative receiver we introduce can essentially achieve the performance of the ideal coherent MRC receiver, with differential encoding, in as few as a single EM iteration (i.e., a total of two passes).
This paper is organized as follows. Section 2 presents the STBC adopted in this investigation, the channel model, and the transformation employed to convert this second-order transmit-diversity system into an equivalent second-order receive-diversity system. Details of the receiver operations, including that of the EM algorithm, which performs joint channel estimation and sequence detection, are described in Section 3. The bit error performance of the proposed receiver is given in Section 4, while conclusions of this investigation are made in Section 5.

SYSTEM MODEL
We consider a wireless communications system operating over a slow, flat Rayleigh fading channel, in which space-time block-coded symbols are sent from two transmit antennas and received by a single receive antenna. The space-time block code employed falls into the class of the popular two-branch transmission-diversity scheme introduced by Alamouti [16]. Specifically, if $c_1[k]$ and $c_2[k]$ are, respectively, the complex symbols transmitted by the first and second antennas in the first subinterval of the kth coded interval, then the symbols transmitted in the second subinterval by the same two antennas are, respectively, $-c_2^*[k]$ and $c_1^*[k]$. Note that throughout this paper, the notations $(\cdot)^*$ and $(\cdot)^\dagger$ are used to represent the complex conjugate of a complex number and the conjugate (Hermitian) transpose of a complex vector/matrix. The various coded symbols are taken from the π/2-shifted BPSK constellation $S = \{+1, -1, +j, -j\}$, where the subsets $S_1 = \{+1, -1\}$ and $S_2 = \{+j, -j\}$ are used alternately in successive subintervals at each transmit antenna. This alternation between $S_1$ and $S_2$ not only reduces envelope fluctuation, but it also enables us to transform the proposed second-order transmit-diversity BPSK system into an equivalent second-order receive-diversity QPSK system. Assuming that $c_1[k]$ is chosen from $S_1$, it follows that $c_2[k]$ must be chosen from $S_2$. Then, the transmitted code matrix in the kth coded interval becomes
\[ C[k] = \begin{pmatrix} c_1[k] & c_2[k] \\ -c_2^*[k] & c_1^*[k] \end{pmatrix} \in V = \{V_1, V_2, V_3, V_4\}, \tag{1} \]
where the four matrices $V_1, \ldots, V_4$ in (2) are those obtained as $c_1[k]$ ranges over $S_1$ and $c_2[k]$ ranges over $S_2$. Note that the columns of $C[k]$ correspond to the two transmit antennas, while the rows of $C[k]$ correspond to the coded subintervals.
Since we will be using MSDD in the first pass of our iterative receiver, it is necessary for the $C[k]$'s to be differentially encoded ST symbols. The $C[k]$'s are related to the actual data symbols, the $D[k]$'s, according to
\[ C[k] = D[k]\, C[k-1], \qquad D[k] \in U = \{U_1, U_2, U_3, U_4\}, \tag{3} \]
where the matrices $U_n$ are those defined in (4). Without loss of generality, the initial transmitted symbol $C[0]$, which carries no information and serves only as an initial reference, is chosen to be $V_1$. It can be easily verified that the $U_n$'s are unitary matrices, and that for any $V_m$ in the set $V$ and any $U_n$ in the set $U$, the product $U_n V_m$ is a member of the set $V$. The relations between $C[k-1]$, $D[k]$, and $C[k]$, which arise from the differential encoding rule, are explicitly depicted in Table 1.
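The closure and unitarity properties above can be checked numerically. The following is a minimal sketch, assuming that V contains the four Alamouti-type matrices built from $c_1 \in \{+1,-1\}$, $c_2 \in \{+j,-j\}$ and that U consists of plus or minus the identity and plus or minus j times the exchange matrix (the only unitary matrices mapping this V onto itself). The ordering of both lists and the helper names `code_matrix`, `index_in`, and `differential_encode` are illustrative assumptions, not the indexing of Table 1.

```python
import itertools
import numpy as np

def code_matrix(c1, c2):
    """Alamouti-type code matrix: rows are subintervals, columns are antennas."""
    return np.array([[c1, c2], [-np.conj(c2), np.conj(c1)]], dtype=complex)

# Assumed code set V (ordering is illustrative only).
V = [code_matrix(c1, c2) for c1, c2 in [(1, 1j), (-1, 1j), (-1, -1j), (1, -1j)]]

# Assumed data set U: +/- identity and +/- j times the exchange matrix.
I2 = np.eye(2, dtype=complex)
P2 = np.array([[0, 1], [1, 0]], dtype=complex)
U = [I2, 1j * P2, -I2, -1j * P2]

def index_in(M, family):
    """Return the index of matrix M in family, or -1 if it is not a member."""
    for i, F in enumerate(family):
        if np.allclose(M, F):
            return i
    return -1

def differential_encode(data_indices, start=0):
    """Apply C[k] = D[k] C[k-1] with C[0] = V[start]; returns indices into V."""
    idx = [start]
    for n in data_indices:
        idx.append(index_in(U[n] @ V[idx[-1]], V))
    return idx

# Every product U_n V_m stays inside V, and every U_n is unitary.
closure = all(index_in(Un @ Vm, V) >= 0 for Un, Vm in itertools.product(U, V))
unitary = all(np.allclose(Un.conj().T @ Un, I2) for Un in U)
```

With this assumed ordering, repeatedly applying the second element of U steps cyclically through V, which is the matrix counterpart of multiplying a QPSK symbol by j.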
The transmitted symbols at each transmit antenna will be pulse-shaped by a square-root raised cosine (SQRC) pulse, and then transmitted over a wireless link to the receiver. Each link introduces fading to the associated transmitted signal, and the receiver's front end introduces AWGN. The composite received signal from the two links is matched-filtered and sampled, twice per encoded interval, to provide the receiver with sufficient statistics to detect the transmitted data. Assuming the channel gains in the two links, $f_1$ and $f_2$, are constant within the observation window of the data detector, the two received samples in the kth interval can be modeled as
\[ R[k] = \left(r_1[k],\, r_2[k]\right)^T = C[k]\,F + N[k], \tag{5} \]
where $F = (f_1, f_2)^T$ is the vector of complex channel gains, $N[k] = (n_1[k], n_2[k])^T$ is a noise vector containing the two complex Gaussian noise terms $n_1[k]$ and $n_2[k]$, and $(\cdot)^T$ denotes the transpose of a matrix. The channel fading gains are assumed to be independent and identically distributed (i.i.d.) zero-mean complex Gaussian random variables, with unit variance. In addition, these channel gains are assumed to be constant over the observation window of N symbol intervals. The static fading channel has been frequently considered when investigating systems with transmit and receive diversity [10,18,19,20,21,22,23]. On the other hand, the sequence of noise samples, $\{\ldots, n_1[k], n_2[k], n_1[k+1], n_2[k+1], \ldots\}$, is a complex, zero-mean white Gaussian process, with a variance of $N_0$. It should be pointed out that the fading gains and the noise samples are statistically independent.
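The received-signal model can be sketched in a few lines for simulation purposes. This is only an illustration under stated assumptions: "variance" is taken to mean $E\{|x|^2\}$ for a circular complex Gaussian variable, pulse shaping and matched filtering are assumed ideal and therefore omitted, and the function names `complex_gaussian` and `received_block` are hypothetical.

```python
import numpy as np

def complex_gaussian(rng, size, variance):
    """Zero-mean circular complex Gaussian samples with E|x|^2 = variance (assumed convention)."""
    s = np.sqrt(variance / 2.0)
    return s * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

def received_block(c1, c2, n0, rng):
    """Simulate one static-fading block of received sample pairs.

    c1, c2 : symbols sent by antennas 1 and 2 in the first subinterval
             of each coded interval (c1 in {+1,-1}, c2 in {+j,-j}).
    Returns (R, F), where row k of R holds (r1[k], r2[k]) and F holds the
    two fading gains, held constant over the whole block.
    """
    c1 = np.asarray(c1, dtype=complex)
    c2 = np.asarray(c2, dtype=complex)
    F = complex_gaussian(rng, 2, 1.0)              # unit-variance gains
    noise = complex_gaussian(rng, (len(c1), 2), n0)
    # First subinterval: antennas send c1 and c2.
    r1 = c1 * F[0] + c2 * F[1] + noise[:, 0]
    # Second subinterval: antennas send -conj(c2) and conj(c1).
    r2 = -np.conj(c2) * F[0] + np.conj(c1) * F[1] + noise[:, 1]
    return np.stack([r1, r2], axis=1), F
```

Setting `n0` to zero gives a noiseless block, which is convenient for checking a detector against known symbols before running error-rate simulations.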
To recover the data contained in the $R[k]$'s, the receiver can employ the ST differential detector in [2]. The metric I adopted by this simple detector is evaluated for each hypothesis on $D[k]$; in its expression, $|\cdot|$ denotes the magnitude of a complex vector. Since I is actually independent of $C[k-1]$, the hypothesis on $D[k]$ that maximizes the metric I is chosen as the most likely transmitted data symbol. Though simple, this detector was shown to exhibit a 3 dB loss in power efficiency when compared to the ideal coherent receiver. To narrow this performance gap, a vector-MSDD can be used instead [11]. This detector organizes the $R[k]$'s into overlapping blocks of size N, with the last vector in the previous block being the first vector in the current block. For the block starting at time zero, the decoding metric is denoted by J. Like the metric I, this vector-MSDD metric is independent of $C[0]$. Consequently, the detector selects the hypothesis $(\hat{D}[1], \hat{D}[2], \ldots, \hat{D}[N-1])$ that maximizes J as the most likely transmitted pattern in this interval. It is clear from the expression of J that there are altogether $4^{N-1}$ hypotheses to consider. So far, no algorithm is known that performs this search in an efficient and yet optimal fashion.
The approach we adopt to mitigate the complexity issue in the vector-MSDD is to first transform the received signal vector in (5) into one that we would encounter in a receive-diversity only system. Although the optimal vector-MSDD in this latter case still has an exponential complexity [10], we now have the option of using selection combining in conjunction with a scalar-MSDD [18]. Although there is still a substantial gap between selection combining MSDD and the MRC, this gap can be closed by employing additional processing based on the iterative EM algorithm described in the next section. In this case, the decisions made by the selection combining MSDD are used to initialize the EM processing unit. The following subsection provides details about the transformation required to turn our second-order transmit-diversity system into an equivalent second-order receive-diversity system.

From transmit diversity to receive diversity
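To assist in the development of the transformation, we first expand (5) to obtain
\[ r_1[k] = c_1[k] f_1 + c_2[k] f_2 + n_1[k], \qquad r_2[k] = -c_2^*[k] f_1 + c_1^*[k] f_2 + n_2[k]. \]
This equation clearly illustrates the structure of the received signal samples. Moreover, we can deduce from the equation that the average SNR in the received sample $r_1[k]$ is
\[ \gamma = \frac{E\{|c_1[k] f_1 + c_2[k] f_2|^2\}}{E\{|n_1[k]|^2\}} = \frac{2}{N_0}, \]
where $E\{\cdot\}$ is the expectation operator. The same SNR also appears in the received sample $r_2[k]$.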
Next, we introduce the new variables $p_1[k]$ and $p_2[k]$ in (10), where $g_1$ and $g_2$ are two new fading gains, $b[k]$ is an equivalent transmitted symbol, and $w_1[k]$ and $w_2[k]$ are two new noise terms. In comparing (2) with (14), we can quickly see that $x_i$ is simply the row (or column) sum of $V_i$. Furthermore, for all $V_n = U_m V_k$, $x_n = y_m x_k$, where $y_m$ is the row (or column) sum of the unitary matrix $U_m$ in (4). This latter property implies that differential encoding of ST π/2-shifted BPSK symbols is equivalent to differential encoding of scalar QPSK symbols. The respective QPSK encoding rule is $b[k] = a[k]\, b[k-1]$, where $a[k] \in \{1, j, -1, -j\}$ is the equivalent data symbol and $b[k] \in \{\pm 1 \pm j\}$ is the equivalent transmitted symbol. Note that $x_n$, the row/column sum of $V_n$, satisfies $\mathbf{1} V_n = x_n \mathbf{1}$, where $\mathbf{1} = [1\ 1]$ is an all-one row vector of length two. Applying the same argument to $U_m$ gives $\mathbf{1} U_m = y_m \mathbf{1}$, so that $\mathbf{1} U_m V_k = y_m x_k \mathbf{1}$. Table 2 shows this equivalent differential encoding rule. By comparing Table 1 and Table 2, it is evident that the indexings of the respective symbols are identical. The advantage of transforming the original STBC into an equivalent second-order receive-diversity QPSK system will be clearly demonstrated in the next section.
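To make the equivalence concrete, here is a small numerical sketch. It assumes one transformation that is consistent with the properties stated in this section, namely $p_1[k] = r_1[k] + r_2[k]$ and $p_2[k] = (r_1[k] - r_2[k])^*$, with $g_1 = f_1 + f_2$, $g_2 = (f_1 - f_2)^*$, and $b[k] = c_1[k] + c_2[k]$. Other sign conventions are equally possible, so this specific form and the function name `to_receive_diversity` should be read as assumptions rather than as the definition used in (10).

```python
import numpy as np

def to_receive_diversity(r1, r2):
    """Assumed transformation: p1 = r1 + r2 and p2 = conj(r1 - r2)."""
    r1 = np.asarray(r1, dtype=complex)
    r2 = np.asarray(r2, dtype=complex)
    return r1 + r2, np.conj(r1 - r2)

# Noiseless example with arbitrary fixed gains.
f1, f2 = 0.3 - 0.8j, -1.1 + 0.4j
c1 = np.array([1, -1, -1, 1], dtype=complex)       # symbols from S1
c2 = np.array([1j, 1j, -1j, -1j], dtype=complex)   # symbols from S2

# Received samples in the two subintervals (Alamouti-type structure).
r1 = c1 * f1 + c2 * f2
r2 = -np.conj(c2) * f1 + np.conj(c1) * f2

p1, p2 = to_receive_diversity(r1, r2)

# Equivalent receive-diversity quantities under the assumed convention.
b = c1 + c2                        # QPSK symbol shared by both branches
g1, g2 = f1 + f2, np.conj(f1 - f2)

same_symbol_on_both_branches = np.allclose(p1, g1 * b) and np.allclose(p2, g2 * b)
```

The check relies on two facts that hold for this constellation: $c_1$ is real, so $c_1^* = c_1$, and $c_2$ is imaginary, so $-c_2^* = c_2$ and $(c_1 - c_2)^* = c_1 + c_2$.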

THE MSDD-AIDED EM-BASED ITERATIVE RECEIVER
The previous section demonstrated how an STBC π/2-shifted BPSK system can be transformed into an equivalent receive-diversity system. This section describes how an iterative receiver based on selection diversity, scalar-MSDD, and the EM algorithm [17] processes the equivalent received signal and attains performance equivalent to that of an ideal coherent receiver (with differential decoding). Figure 1 provides a quick overview of this proposed receiver.

First pass: selection diversity and scalar-MSDD
Given the new received variables in (10), we can use, in principle, an MSDD combiner [10] to detect the transmitted data. The decoding metric K of this receiver, given in (15), is a function of the equivalent received vectors $P_1$ and $P_2$ and of a hypothesis $\hat{B}$ of the equivalent transmitted pattern $B$. Here N is the window width of the MSDD combiner, and each received vector is the sum of the transmitted pattern $B$, scaled by the corresponding fading gain, and an equivalent noise pattern. The MSDD combiner searches through all possible hypotheses; the hypothesis which maximizes K is declared the most likely transmitted pattern. This most likely hypothesis is then differentially decoded to obtain the data symbols. This operation therefore makes the decision independent of the first symbol in $\hat{B}$. Consequently, we can simply assume all hypotheses start with the symbol $x_1$ in (14). Thus, as with the case of the vector-MSDD, there are $4^{N-1}$ candidates to consider. This exponential complexity prevents the use of a large N in (15). However, for suboptimal implementation, we can use selection diversity followed by scalar-MSDD [18], an option which is unavailable in vector-MSDD. It will be shown in the next section that an EM-based iterative receiver initiated by selection diversity scalar-MSDD has better performance and convergence properties than those initiated by conventional space-time differential detection (ST-DD).
A selection-diversity scalar-MSDD receiver obtains an estimate $\hat{B}^{(0)}$ of the equivalent transmitted pattern $B$ according to (19), where the search is carried out over the collection of all possible length-N equivalent QPSK sequences, using the received vector of the selected diversity branch. The solution to (19) is easily found using the algorithm developed by Mackenthun [9], as the channel is constant over the observation interval. It is important to stress that this algorithm has a complexity of only N log N.
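The N log N search can be illustrated with a short sketch in the spirit of Mackenthun's algorithm. It assumes the scalar-MSDD metric has the usual form $|\sum_k \hat{s}_k^* p_k|$ for a static channel, works with the unit-energy alphabet $\{1, j, -1, -j\}$ (a common rotation and scaling of $\{\pm 1 \pm j\}$ that does not affect the maximizer), and uses a hypothetical energy-based branch selection rule, since the exact rule is not recoverable from the text. All function names are illustrative.

```python
import itertools
import numpy as np

QPSK = np.array([1, 1j, -1, -1j])     # unit-energy QPSK alphabet
DEROT = np.array([1, -1j, -1, 1j])    # DEROT[q] = conj(QPSK[q])

def scalar_msdd(p):
    """Maximize |sum_k conj(s_k) p_k| over s_k in QPSK with a sort-based search.

    Returns (s, metric). The sequence s is defined only up to a common
    rotation, which differential decoding removes.
    """
    p = np.asarray(p, dtype=complex)
    n = len(p)
    # Quadrant index q_k such that z_k = p_k * conj(QPSK[q_k]) has phase in [0, pi/2).
    q = np.floor(np.mod(np.angle(p), 2 * np.pi) / (np.pi / 2)).astype(int) % 4
    z = p * DEROT[q]
    order = np.argsort(np.angle(z))   # the sort dominates the cost: O(N log N)
    # Candidate i rotates the i smallest-phase terms by +pi/2. Only these N
    # candidates need to be examined, and their sums are updated recursively.
    sums = np.empty(n, dtype=complex)
    sums[0] = z.sum()
    for i in range(1, n):
        sums[i] = sums[i - 1] + z[order[i - 1]] * (1j - 1)
    best = int(np.argmax(np.abs(sums)))
    q[order[:best]] -= 1              # rotating z_k by +pi/2 lowers its symbol index
    return QPSK[q % 4], float(np.abs(sums[best]))

def brute_force_metric(p):
    """Exhaustive reference over 4**(N-1) hypotheses (first symbol fixed)."""
    p = np.asarray(p, dtype=complex)
    best = 0.0
    for tail in itertools.product(QPSK, repeat=len(p) - 1):
        s = np.array((1,) + tail)
        best = max(best, abs(np.vdot(s, p)))
    return best

def select_branch(p1, p2):
    """Hypothetical selection rule: keep the branch with the larger energy."""
    return p1 if np.linalg.norm(p1) >= np.linalg.norm(p2) else p2
```

To obtain a sequence in $\{\pm 1 \pm j\}$ that starts with a fixed reference symbol, the returned `s` can be multiplied by `np.conj(s[0])` and then by the chosen reference, for example `1 + 1j`. For small N, `brute_force_metric` gives a convenient cross-check of the sort-based result.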
The decision $\hat{B}^{(0)}$ in (19) is used to initialize the EM algorithm described in the next section. This algorithm performs iterative channel estimation and data detection, by passing information back and forth between the channel estimator and the data detector. At this point, we want to point out that other options for initializing the EM algorithm include using pilot symbols to acquire a channel fading estimate [19,20], or using differential detection to acquire a transmitted signal estimate [21]. Although using pilot symbols provides a reliable reference to estimate the channel gains, it results in a power loss, and even after several iterations, the performance of coherent detection may not be reached [19,20]. In the case of initializing the EM algorithm with a differentially detected sequence [21], it was determined that the transmitted sequence estimate reconstructed from a vector-MSDD information sequence estimate does not yield good channel estimates due to differential reencoding. Hence, there was a consistent performance loss when compared to a coherent receiver.

Successive passes: joint estimation and detection using the EM algorithm
It was shown in [18] that with a large N (e.g., 64), the selection-diversity scalar-MSDD receiver described in Section 3.1 experiences a 1.5 dB degradation in power efficiency when compared to MRC. To narrow this performance gap, we propose to adopt the EM algorithm to further process the initial estimate $\hat{B}^{(0)}$ provided by the selection-diversity scalar-MSDD receiver.
The EM algorithm was first introduced by Dempster et al. [17]. It is suited for problems where there are random variables other than a desired component contributing to the observable data. The complete set of data consists of the desired data and the nuisance data. In the context of the problem at hand, the complete set of data is the (equivalent) transmitted pattern $B$ and the channel gains $g_1$ and $g_2$; the sequence $B$ is the desired data, and the channel gains are the nuisance parameters. To initialize the EM algorithm, it is necessary to provide an estimate of either component of the complete set. In our case, this will be the decision $\hat{B}^{(0)}$ in (19). The accuracy of this initial estimate often determines the effectiveness of the EM algorithm and the average number of iterations necessary for convergence. An excellent description of the algorithm and the breadth of its applications can be found in [24]. A detailed application of the EM algorithm to joint channel estimation and sequence detection situations can be found in [25]. The scope of the description given below is restricted to our joint channel estimation and sequence detection problem.
The EM algorithm consists of two steps per iteration: an expectation step (E-step) and a maximization step (M-step). At the kth E-step, the algorithm estimates the fading gains by computing their means when conditioned on the received data $P_1$ and $P_2$, and the most recent estimate $\hat{B}^{(k-1)}$ of the equivalent QPSK symbols. Using the minimum mean square estimation (MMSE) principle, these conditional means can be expressed as in (21) [19,20]. Immediately following the kth E-step is the kth M-step. Here the algorithm assumes the fading gain estimates in (21) are perfect and performs MRC and data detection according to (22), in which $\mathrm{Re}\{\cdot\}$ denotes the real-part operator. In other words, the M-step updates the decision on $B$ according to the most recent estimates of the fading gains. It should be pointed out that (22) can easily be solved on a symbol-by-symbol basis. Furthermore, the estimated symbols in $\hat{B}^{(k)}$ are then differentially decoded to obtain estimates of the information symbols. If it is desired to perform another EM iteration, the channel will be reestimated using (21), and hence another sequence estimate will be obtained using (22). The iterations cease when the sequence estimate does not change during two successive iterations, or after a prespecified number of iterations have occurred. A maximum of 10 iterations are considered in this research.
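The two steps can be sketched as follows. This is an illustration under assumptions, not a transcription of (21) and (22): the E-step uses the standard linear MMSE estimate for a gain of variance 2 observed in noise of variance $2N_0$, which reduces to $\hat{g}_i = \hat{B}^\dagger P_i / (\hat{B}^\dagger \hat{B} + N_0)$, and the M-step maps the MRC statistic of each symbol to the nearest point of $\{\pm 1 \pm j\}$. The function names are hypothetical.

```python
import numpy as np

def e_step(b_hat, p, n0):
    """Assumed MMSE gain estimate: b_hat^H p / (b_hat^H b_hat + n0)."""
    return np.vdot(b_hat, p) / (np.vdot(b_hat, b_hat).real + n0)

def m_step(g1, g2, p1, p2):
    """MRC with the current gain estimates, then symbol-by-symbol QPSK decisions."""
    t = np.conj(g1) * p1 + np.conj(g2) * p2
    return np.where(t.real >= 0, 1.0, -1.0) + 1j * np.where(t.imag >= 0, 1.0, -1.0)

def em_receiver(b_init, p1, p2, n0, max_iter=10):
    """Iterate E- and M-steps until the sequence estimate stops changing."""
    b_hat = np.asarray(b_init, dtype=complex)
    for _ in range(max_iter):
        g1, g2 = e_step(b_hat, p1, n0), e_step(b_hat, p2, n0)
        b_new = m_step(g1, g2, p1, p2)
        if np.array_equal(b_new, b_hat):
            break
        b_hat = b_new
    return b_hat

def differential_decode(b_hat):
    """Recover a[k] = b[k] conj(b[k-1]) / 2 for symbols in {+/-1 +/- j}."""
    return b_hat[1:] * np.conj(b_hat[:-1]) / 2
```

Each iteration costs a fixed number of operations per symbol, which matches the linear-in-N complexity discussed for the E- and M-steps. A common rotation of the whole sequence estimate is harmless, because `differential_decode` only uses the ratio of successive symbols.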
As the E-step is essentially an average of N variables, and the M-step maps each derotated statistic to the nearest QPSK signal, the complexity of each iteration is linearly proportional to N. We note that while it is possible to use conventional ST-DD to initialize the EM algorithm, our results in the next section show that it is not an effective option.

RESULTS
This section details the results obtained via simulation of our system. MSDD window lengths of N = 16, 32, 64, and 128 are considered. The results are shown in Figures 2, 3, 4, and 5, along with the performance of conventional ST-DD, equivalent to conventional equal gain combining (EGC), and the coherent detection lower bound (i.e., MRC with differential encoding). In these figures, the integer n in the notation EM-n refers to the number of EM iterations. When n = 0, we simply have a selection-diversity scalar-MSDD receiver. Note that SNR denotes the average signal-to-noise ratio per bit. Lastly, we remind the reader that simulations were performed using a complex Gaussian, static fading channel, as outlined in Section 2.1.
The results in Figures 2, 3, 4, and 5 indicate that there is a significant improvement in performance from the initial selection-diversity sequence estimate to the first estimate provided by the EM algorithm. Although they are not included, the performance curves of the EM-2 to EM-9 receivers lie, in order, between the curves for the EM-1 and EM-10 receivers. For N equal to 128, the first iteration of the EM receiver essentially meets the lower bound given by coherent reception. Further simulation results not included here indicate that the EM receiver is able to meet the lower bound within a single EM iteration for all N greater than 128.
The authors stress that the success of this receiver depends strongly on the initial sequence estimate provided by (19), which in turn provides an excellent channel estimate using (21). To elaborate, note in Figures 2, 3, 4, and 5 that the performance of the conventional differential detector is comparable to that of the standard selection-diversity receiver. One might suppose an EM-based receiver using an initial conventional ST-DD sequence estimate (obtained without using selection diversity or MSDD) could yield the same performance results as those shown here; however, this is not the case. The performance curves for an EM-based receiver initialized using a conventional ST-DD sequence estimate are shown in Figure 6. Clearly, the performance of the first iteration is substantially inferior to that of the conventional ST-DD initialization. In this case, the observation window for the conventional detector is only 2 symbol intervals, and the frame length from which the channel estimates are constructed is much larger (i.e., 64 symbol intervals). The inferior performance can be explained by noting that the transmitted sequence must be regenerated before the channel estimates are made. Due to the differential encoding, a single information symbol error may result in a significant number of incorrect transmitted symbols and hence a poor transmitted sequence estimate [21]. As the number of iterations increases, the performance improves; however, it takes many iterations to approach that of a coherent receiver, and there is still a 0.25 dB performance gap after 10 iterations. This explains why using a conventional differentially detected sequence as an initialization to the EM-based receiver does not yield such good results. When the selection-diversity MSDD sequence estimate is used as an initialization to the EM-based receiver, the sequence decision rule is based on the entire received sequence, and received statistics are derotated together in an optimal fashion (19). Hence, propagated errors in the regenerated transmitted sequence do not occur.
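The error propagation caused by differential reencoding is easy to reproduce. The sketch below assumes the scalar QPSK encoding rule $b[k] = a[k]\,b[k-1]$ with an illustrative reference symbol $b[0] = 1 + j$, a frame of 64 transmitted symbols, and a single information symbol error placed at an arbitrary position; the name `reencode` is hypothetical.

```python
import numpy as np

A = np.array([1, 1j, -1, -1j])   # equivalent data symbols a[k]

def reencode(a, b0=1 + 1j):
    """Regenerate the transmitted sequence via b[k] = a[k] * b[k-1]."""
    b = np.empty(len(a) + 1, dtype=complex)
    b[0] = b0
    for k, ak in enumerate(a, start=1):
        b[k] = ak * b[k - 1]
    return b

rng = np.random.default_rng(0)
a = A[rng.integers(0, 4, size=63)]   # 63 data symbols -> 64 transmitted symbols

a_err = a.copy()
a_err[9] *= 1j                       # a single information-symbol error

b, b_err = reencode(a), reencode(a_err)
n_wrong = int(np.sum(b != b_err))    # every symbol from index 10 onward differs
```

Here one wrong data symbol corrupts 54 of the 64 regenerated transmitted symbols, because every later symbol inherits the same rotation. A channel estimate formed from such a sequence is correspondingly unreliable.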
An assumption we have made is that the channel is constant (static) over N symbol intervals. In the more general situation of a time-varying channel, the methodology proposed here can still be considered, with minor modification to the receiver structure. Firstly, the appropriate, straightforward adjustments must be made to the channel estimation (21) and MRC detection (22) units in the iterative section of the receiver. Secondly, as the Mackenthun algorithm can only be applied to static channels, the scalar-MSDD component would need to be replaced. An appropriate replacement would be a low-complexity, suboptimal MSDD, suited for a time-varying channel [26,27]. Compared to the optimal MSDD for time-varying channels in [8], these suboptimal detectors have much lower computational complexity. Although there is a small SNR penalty (in the neighborhood of 1 to 2 dB), these detectors exhibit no irreducible error floor, even when the fading rate is as high as a few percent of the symbol rate. Consequently, the initial sequence decision provided by these detectors will be of reasonable quality, and we expect good convergence properties in subsequent EM iterations, similar to that seen in the static fading case.
Finally, we would like to draw some qualitative comparisons between the proposed iterative receiver and those based on pilot symbols [19,20]. From a bandwidth efficiency point of view, our pilotless (noncoherent) receiver is more attractive as there is no need to transmit any pilot symbols for channel sounding purposes. Although the gain in bandwidth efficiency is minimal for the static fading environment, it can be significant for a time-varying channel. As mentioned in the previous paragraph, the proposed receiver methodology can also be used in a fast fading environment, provided that a suitable MSDD replaces the Mackenthun MSDD. From a power efficiency point of view, we believe our noncoherent receiver and a pilot-aided receiver [19] will have similar performance in the steady state (i.e., after a sufficient number of iterations). We notice a performance gap, in the neighborhood of 1.5 dB, between the receiver for a coded system in [19] and the respective ideal coherent bound without differential encoding. Conversely, our noncoherent receiver can attain the performance indicated by the coherent bound with differential encoding. Recall that there is a 1.5 dB difference between the two coherent bounds for a second-order diversity system. The last performance measure is the computational complexity. We note that the initial pass of our noncoherent EM receiver requires approximately the same amount of signal processing as a pilot-symbol-based system, and the successive iterations require an identical amount of computational resources. However, it may take many iterations to reach the steady-state performance for a pilot-aided system [19,20], while the noncoherent EM receiver can meet the coherent detection (with differential encoding) lower bound in a single iteration. Thus it appears that the proposed receiver requires less computation, due to its better convergence behavior arising from block detection.

CONCLUSION
In summary, we present a novel transformation on a specific Alamouti-type space-time modulation, and obtain a scalar, receive-diversity equivalent. With this transformation, it is simple to apply low-complexity, high-performance, receive-diversity techniques. The results show that when using the sequence estimate from selection-diversity scalar-MSDD as an initialization to an iterative channel and sequence estimator, it is possible to achieve the performance of coherent detection.
Using STBC-MSDD to attain the lower bound set by coherent detection would require implementing an algorithm with complexity $4^{N-1}$, where 4 is the cardinality of the transmission symbol set and N is a large number of transmitted space-time symbols. For the system discussed in this paper, the coherent detection lower bound is achieved using a receiver with complexity of essentially N log N, given by the complexity of the scalar-MSDD [9] used to initialize the EM algorithm. Clearly, the scalar equivalent system using the EM algorithm employed in this paper offers a low-complexity method to achieve the performance of coherent detection.
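A quick calculation puts the two complexity figures side by side. The sketch below simply evaluates the hypothesis count $4^{N-1}$ and the quantity $N \log_2 N$; the choice of base-2 logarithm and the function names are assumptions made for illustration, and constant factors are ignored.

```python
import math

def vector_msdd_hypotheses(n):
    """Number of candidate sequences in an exhaustive vector-MSDD search."""
    return 4 ** (n - 1)

def scalar_msdd_cost(n):
    """Order-of-magnitude cost of the sort-based scalar-MSDD (base-2 log assumed)."""
    return n * math.log2(n)

# Window lengths considered in the simulations.
comparison = {n: (vector_msdd_hypotheses(n), scalar_msdd_cost(n))
              for n in (16, 32, 64, 128)}
```

For N = 64, the exhaustive search would face $4^{63} = 2^{126}$ hypotheses, whereas $N \log_2 N$ is 384.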

Figure 1: Block diagram of transmitter, channel model, and EM-based receiver performing joint channel estimation and sequence detection. Note that the matrix multiplication and addition operations are indexed by time.

Table 1: Logic table showing the ST differential encoding rule for C

Table 2: Logic table showing the equivalent QPSK differential encoding rule for b

It can be shown that the new fading gains $g_1$ and $g_2$ in (10) are independent Gaussian random variables, with a variance of 2. Similarly, it can also be shown that the two new noise terms $w_1[k]$ and $w_2[k]$ are independent and have variance $2N_0$. These results mean that the SNR in the samples $p_1[k]$ and $p_2[k]$ is also $\gamma$; in other words, the original SNR is preserved. Of foremost interest, note that the new symbol $b[k]$ is shared by $p_1[k]$ and $p_2[k]$. Consequently, (10) corresponds to the received signal encountered in a second-order receive-diversity system. Furthermore, $b[k]$ belongs to the QPSK signal set $X = \{x_1, x_2, x_3, x_4\}$ defined in (14), whose elements are the four points $\pm 1 \pm j$.
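The variance and SNR statements can be verified with a short calculation. The lines below are a sketch that assumes the specific forms $g_1 = f_1 + f_2$, $g_2 = (f_1 - f_2)^*$, $w_1[k] = n_1[k] + n_2[k]$, and $w_2[k] = (n_1[k] - n_2[k])^*$, which are consistent with the properties stated here but are not taken from the text. They also assume $E\{|f_i|^2\} = 1$ and $E\{|n_i[k]|^2\} = N_0$.

```latex
\begin{align*}
E\{|g_1|^2\} &= E\{|f_1|^2\} + E\{|f_2|^2\} = 2, \qquad
E\{|g_2|^2\} = E\{|f_1|^2\} + E\{|f_2|^2\} = 2, \\
E\{(f_1 + f_2)(f_1 - f_2)^*\} &= E\{|f_1|^2\} - E\{|f_2|^2\} = 0, \\
E\{|w_i[k]|^2\} &= E\{|n_1[k]|^2\} + E\{|n_2[k]|^2\} = 2N_0, \\
\frac{E\{|g_i\, b[k]|^2\}}{E\{|w_i[k]|^2\}} &= \frac{2 \cdot 2}{2N_0} = \frac{2}{N_0} = \gamma .
\end{align*}
```

The second line shows that the sum and difference of the two original gains are uncorrelated; since they are jointly Gaussian, they are independent, and conjugation does not change this. The last line uses $|b[k]|^2 = 2$ for every point of $\{\pm 1 \pm j\}$.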