An Efficient Differential MIMO-OFDM Scheme with Coordinate Interleaving

We propose a concatenated trellis code (TC) and coordinate interleaved differential space-time block code (STBC) for OFDM. The coordinate interleaver provides signal space diversity and improves the codeword error rate (CER) performance of the system in wideband channels. Coordinate interleaved differential space-time block codes are proposed and used in the concatenated scheme, TC design criteria are derived, and the CER performance of the proposed system is compared with that of the existing concatenated TC and differential STBC. The comparison shows that the proposed scheme has superior diversity gain and improved CER performance.


INTRODUCTION
In recent years, code design for multiple-input multiple-output (MIMO) channels with orthogonal frequency division multiplexing (OFDM) modulation has gained much attention in wireless communications. Space-time block codes (STBC), first proposed by Alamouti [1], provide full spatial diversity in wireless channels with simple linear maximum likelihood (ML) decoders. An efficient scheme of concatenated trellis code and STBC (TC-STBC), which provides additional diversity and coding gain, was proposed by Gong and Letaief [2]. Tarasak and Bhargava [3] applied the constant modulus (CM) differential encoding scheme of Tarokh and Jafarkhani [4] to the TC-STBC system [2]. Differential encoding has the advantage of avoiding channel estimation and the transmission of pilot symbols. Further improvement of TC-STBC performance is possible by using a coordinate interleaver [5]. Coordinate interleaved signal sets provide signal space diversity and hence improve the symbol error performance of communication systems in fast fading channels. Rao et al. [6] recently applied coordinate interleaving to MIMO-OFDM and showed that this technique provides considerable diversity gain without a significant increase in encoding and decoding complexity. The single-symbol decodability of the coordinate interleaved orthogonal design (CIOD) [7] is an important feature that ensures low decoding complexity. The joint use of CIOD and OFDM provides spatial and multipath diversity, and the further concatenation of TC and CIOD (TC-CIOD) [5] consequently gives much better performance than CIOD OFDM [6], linear constellation precoded (LCP)-CIOD OFDM [6], and TC-STBC OFDM [2].
In this paper, we apply the nonconstant modulus (non-CM) differential space-time block (STB) encoding scheme proposed by Hwang et al. [8] to CIOD and use it in the TC-CIOD scheme [5]. The proposed differential scheme achieves full spatial and multipath diversity and provides a considerable coding gain advantage without channel state information (CSI). We derive the design criteria for differential TC-CIOD and find that, under some approximation, they are the same as in the TC-CIOD case. The new differential scheme provides the same diversity gain as the TC-CIOD scheme, and its diversity is four times greater than that of both the TC-STBC system introduced by Gong and Letaief [2] and its differential counterpart proposed by Tarasak and Bhargava [3]. To clarify the effect of interleaver selection on the diversity gain of TC-STBC, we extend the results given in [2, 3], where a two-symbol interleaver is considered between TC and STBC, to the symbol interleaver case.

PRELIMINARIES
In this section, we summarize the encoding and decoding of the non-CM differential STBC, which is used in the differential TC-CIOD system in the following sections. Note that CM differential encoding techniques cannot be used with CIOD because the coordinate interleaved signal constellation has a nonconstant modulus.
Let us assume a quasistatic fading channel with two transmit antennas and one receive antenna, and denote the channel gains corresponding to the two transmit antennas by $h_1$ and $h_2$, respectively. Let the dummy symbols to be transmitted during the first two transmission periods be $a_1$ and $a_2$. Then $a_1$ and $-a_2^*$ are transmitted from the first transmit antenna, and $a_2$ and $a_1^*$ are transmitted from the second transmit antenna, during the first and second transmission periods, respectively. The differential STBC encodes the first data symbol pair $(x_1, x_2)$ into the next symbol pair $(a_3, a_4)$ according to (1) [8]. The non-CM differential STBC differs from the CM differential STBC [4] in the scaling coefficient, which ensures that the total transmission energy of the two antennas remains equal to one. The transmission of the space-time-block (STB) encoded dummy symbols $a_1$ and $a_2$ results in the reception of
$$r_1 = h_1 a_1 + h_2 a_2 + n_1, \qquad r_2 = -h_1 a_2^* + h_2 a_1^* + n_2, \tag{2}$$
where $n_1$ and $n_2$ are complex additive white Gaussian noise terms. Similarly, the transmission of the STB-encoded $a_3$ and $a_4$, which carry the non-CM symbols $x_1$ and $x_2$, results in the reception of
$$r_3 = h_1 a_3 + h_2 a_4 + n_3, \qquad r_4 = -h_1 a_4^* + h_2 a_3^* + n_4. \tag{3}$$
The differential decoder uses the received symbols $r_1$, $r_2$, $r_3$, and $r_4$ to find the estimates of the transmitted non-CM symbols by using (4). As seen from (4), to find the transmitted non-CM symbol estimates $\hat{x}_1$ and $\hat{x}_2$, the receiver should know, or at least estimate, the channel power $(|h_1|^2 + |h_2|^2)$ and the signal power of the previously transmitted symbols. A simple estimate $\hat{p}$ of the channel power $p = |h_1|^2 + |h_2|^2$ is obtained by evaluating the expected value of $|r_t|^2$ as in (5), where $M$ is the number of received symbols included in the expected value calculation and $R^H$ is the Hermitian of $R$. The computational complexity of (5) can be reduced by using the recursion (6), where $t$ is the recursion index.
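The encode, transmit, and decode chain described above can be sketched as follows. This is a minimal noise-free illustration, not the authors' implementation. The forms used for the encoding rule (1) and the decoding rule (4) are assumptions, chosen to be consistent with the transmission pattern of (2) and (3) and with the statement that the decoder needs the channel power and the previous signal power. The function names and the normalization of the data symbols are hypothetical.

```python
import math


def diff_encode(prev_pair, data_pair):
    """Map data (x1, x2) onto the previous pair (a1, a2); assumed form of (1)."""
    a1, a2 = prev_pair
    x1, x2 = data_pair
    # Scaling by the previous pair's amplitude (non-CM differential encoding).
    scale = math.sqrt(abs(a1) ** 2 + abs(a2) ** 2)
    a3 = (x1 * a1 - x2 * a2.conjugate()) / scale
    a4 = (x1 * a2 + x2 * a1.conjugate()) / scale
    return a3, a4


def stbc_receive(pair, gains):
    """Noise-free reception of one STB-encoded pair over gains (h1, h2)."""
    s1, s2 = pair
    h1, h2 = gains
    first = h1 * s1 + h2 * s2                            # first period
    second = -h1 * s2.conjugate() + h2 * s1.conjugate()  # second period
    return first, second


def diff_decode(r_prev, r_curr, channel_power, prev_signal_power):
    """Estimate (x1, x2) without h1, h2; assumed form of (4)."""
    r1, r2 = r_prev
    r3, r4 = r_curr
    norm = channel_power * math.sqrt(prev_signal_power)
    x1_hat = (r3 * r1.conjugate() + r4.conjugate() * r2) / norm
    x2_hat = (r3 * r2.conjugate() - r4.conjugate() * r1) / norm
    return x1_hat, x2_hat
```

In the absence of noise, decoding with the exact channel power $|h_1|^2 + |h_2|^2$ and the exact previous signal power $|a_1|^2 + |a_2|^2$ returns the data pair unchanged, because the cross terms cancel through the orthogonality of the STB code.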
There are two simple methods to estimate the signal power of the previously transmitted symbols. The first is to use the previous decoder output. The second is to use (2) to obtain (7), in which $n_r$ is the Gaussian noise term. From (7), the estimate of the signal power of the previously transmitted symbols follows as (8).
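The two power estimates discussed above can be illustrated with a short sketch. Equations (5) to (8) are not reproduced here, so the forms below are assumptions. The window mean of $|r_t|^2$ and its sliding update stand in for (5) and (6), without any normalization constant that the original expressions may contain. The ratio of received pair power to channel power stands in for (7) and (8) with the noise term ignored. All function names are hypothetical.

```python
def window_power(received, m):
    """Mean of |r|^2 over the last m received symbols (assumed form of (5))."""
    window = received[-m:]
    return sum(abs(r) ** 2 for r in window) / len(window)


def recursive_power(previous, newest, oldest, m):
    """Sliding-window update: add the newest term and drop the oldest one."""
    return previous + (abs(newest) ** 2 - abs(oldest) ** 2) / m


def previous_signal_power(r1, r2, channel_power):
    """Estimate |a1|^2 + |a2|^2 from the previous received pair (noise ignored)."""
    return (abs(r1) ** 2 + abs(r2) ** 2) / channel_power
```

The recursive update needs one addition and one subtraction per new symbol instead of a full sum over $M$ terms, which is the kind of complexity reduction the text attributes to (6).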

SYSTEM MODEL
In this section, we describe the proposed differential TC-CIOD OFDM system, and its encoding and decoding operations.

Differential encoder
The encoder block diagram of the proposed differential TC-CIOD OFDM for two transmit antennas is shown in Figure 1, where the source bits are trellis encoded at rate 2/3 and mapped to the 8-PSK signal constellation. Each 8-PSK symbol is rotated by $\theta$, and the vector of rotated symbols is then coordinate interleaved by $\pi$. To achieve maximum diversity, a proper coordinate interleaver should be used. Let the vector defined in (9) be the $t$th rotated trellis codeword of length $2K$, whose symbols are obtained by rotating the symbols $x_k^t$ of the $t$th trellis codeword $X^t$ by $\theta$ as in (10). The coordinate interleaver $\pi$, which has a great impact on the overall system performance, performs the assignments in (11) for $k = 0, \ldots, K-1$, and the coordinate interleaved symbols form the vector in (12). In (11), the operators $(\cdot)_I$ and $(\cdot)_Q$ represent the real and imaginary parts of a complex symbol, respectively, and the operator $(\cdot)_{2K}$ takes the operand modulo $2K$. The coordinate interleaved vector enters the differential encoder, which produces a vector $A^{t+1}$ with elements $a_k^{t+1}$ obtained from (13) for $k = 0, \ldots, K-1$, similar to (1). The differentially encoded symbol pairs $a_{2k}^{t+1}$ and $a_{2k+1}^{t+1}$ are STB encoded into the matrix $Y_k^{t+1}$ of (14) and transmitted on the $\alpha_k$th OFDM subcarrier. There is a one-to-one mapping between $k$ and the OFDM subcarriers, denoted by $\alpha_k$, which corresponds to the channel interleaver $\alpha$. The rows of $Y_k^{t+1}$ are transmitted in the $(2t+2)$th and $(2t+3)$th OFDM frames, respectively, and the columns of $Y_k^{t+1}$ are transmitted from the first and second transmit antennas, respectively.
The differential transmitter starts encoding at $t = 0$ by using an initial dummy vector $A^0$ with nonzero elements selected from the considered signal constellation. The transmission starts with the STB encoding of the arbitrary vector $A^0$, which does not convey any information and is sent in the first two OFDM frames. The transmitter subsequently encodes the data in an inductive manner.
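The rotation and coordinate interleaving steps of the encoder can be sketched as follows. Since (11) is not reproduced here, the pairing rule is an assumption: the in-phase component of symbol $k$ is combined with the quadrature component of symbol $(k + K)$ modulo $2K$, which matches the grouping of indices $\xi$ and $\xi + K$ that appears later in the decoding metric. The function names are hypothetical.

```python
import cmath
import math


def rotate(symbols, theta_deg):
    """Rotate every constellation symbol by theta (in degrees)."""
    phase = cmath.exp(1j * math.radians(theta_deg))
    return [s * phase for s in symbols]


def coordinate_interleave(symbols):
    """Pair Re of symbol k with Im of symbol (k + K) mod 2K (assumed rule)."""
    n = len(symbols)  # n = 2K, assumed even
    half = n // 2
    return [complex(symbols[k].real, symbols[(k + half) % n].imag)
            for k in range(n)]


def coordinate_deinterleave(symbols):
    """Undo coordinate_interleave by moving each Im part back by K positions."""
    n = len(symbols)
    half = n // 2
    return [complex(symbols[k].real, symbols[(k - half) % n].imag)
            for k in range(n)]
```

With this rule, the two coordinates of every rotated symbol travel in different interleaved symbols, so a deep fade on one subcarrier affects at most one coordinate of a given trellis symbol.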

Channel model
Multipaths between transmit and receive antenna pairs in wireless communication channels cause intersymbol interference (ISI) in the received signals. The baseband impulse response of the MIMO channel with $L$ paths between the $\mu$th transmit ($1 \le \mu \le n_T$) and $\nu$th receive ($1 \le \nu \le n_R$) antennas is given in (15) [9]. In (15), $h_{\mu\nu}(t, l)$ is the time-dependent channel tap weight, $\delta(\cdot)$ is the Dirac function, and $\tau_l$ is the propagation delay of the $l$th path ($0 \le l \le L-1$). OFDM modulation with cyclic prefix (CP) addition at the transmitter and removal at the receiver transforms the frequency-selective channel into $K$ frequency-nonselective subchannels without ISI. Assuming that the channel weights remain constant during an OFDM frame, the channel response becomes independent of the time variable $t$ within a single OFDM symbol period, and the signal received by the $\nu$th antenna at the $t$th symbol interval for the $k$th subcarrier ($0 \le k \le K-1$) can be expressed as
$$r_\nu^t(k) = \sum_{\mu=1}^{n_T} H_{\mu\nu}^t(k)\, y_\mu^t(k) + n_\nu^t(k), \tag{16}$$
where $y_\mu^t(k)$ is the symbol transmitted on the $k$th subcarrier during the $t$th symbol interval from the $\mu$th transmit antenna, the samples $n_\nu^t(k)$ are zero-mean complex Gaussian r.v. with variance $N_0/2$ per dimension, and $H_{\mu\nu}^t(k)$, given by (17), is the frequency-domain complex subchannel gain between the $\mu$th transmit and $\nu$th receive antennas for the $k$th subchannel during the $t$th symbol interval. In (17), $T_s$ is the effective OFDM symbol interval length and $h_{\mu\nu}^t(l)$ is the channel tap weight. For simplicity, we drop the receive antenna index $\nu$ in the following derivations; however, the proposed system structure is easily extendable to more than one receive antenna. If we assume a quasistatic channel, we may also drop the time index $t$ from the subcarrier transmission gains. Let the transmission of $Y_k^{t+1}$ be affected by the subcarrier transmission gains $H_1(\alpha_k)$ and $H_2(\alpha_k)$ corresponding to the first and second transmit antennas, respectively. For simplicity, we denote $H_\mu(\alpha_k)$ by $H_k^\mu$ for $\mu = 1, 2$.
Let $r_k^{t+1}$ be the symbol received from the $\alpha_k$th subcarrier of the $(t+1)$th OFDM symbol, and let $n_k^{t+1}$, $k = 0, 1, \ldots, K-1$, be the subchannel noise variables, which are independent and identically distributed zero-mean complex Gaussian r.v. with variance $N_0/2$ per dimension. Then, the MIMO-OFDM transmission can be modeled by (18), and the corresponding additive Gaussian noise affecting $R^{t+1}$ can be expressed as in (19).
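The relation between the channel taps and the subcarrier gains can be illustrated numerically. The sketch below assumes that (17) has the usual form $H(k) = \sum_{l} h(l)\, e^{-j 2\pi k \tau_l / T_s}$, which is an assumption since the expression is not reproduced here. The function name and argument names are hypothetical.

```python
import cmath
import math


def subcarrier_gains(taps, delays, symbol_interval, num_subcarriers):
    """Frequency-domain gain of each subcarrier for one antenna pair.

    taps: complex tap weights h(l); delays: path delays tau_l in seconds;
    symbol_interval: effective OFDM symbol interval T_s in seconds.
    """
    gains = []
    for k in range(num_subcarriers):
        gain = sum(
            h * cmath.exp(-2j * math.pi * k * tau / symbol_interval)
            for h, tau in zip(taps, delays)
        )
        gains.append(gain)
    return gains
```

A single path with zero delay gives the same gain on every subcarrier (a flat channel), while several delayed paths make the gain vary with $k$, which is the frequency selectivity that the channel interleaver $\alpha$ exploits.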

Differential decoder
When the receiver does not have any CSI, the decoding metric for the trellis codeword is given by (20). To perform maximum likelihood (ML) decoding, the decoder should determine the $t$th trellis codeword $X^t$ that minimizes (20), where the differential CIOD decoding metric is defined in (21). The CIOD decoding metric $m_k^t$ used in (20) can be written as in (22) for $k = 0, \ldots, (K/2)-1$, where the STB symbol metric (23) for $\xi = 2k$, $2k+1$, $2k+K$, and $2k+K+1$ is derived similarly to [10, page 453]. In (23), the scaling coefficient $S_k^t$, which can be estimated by the methods described at the end of Section 2, is given by (24), and the coordinate interleaved symbol estimates (25) for $k = 0, \ldots, K-1$ are similar to (4). The scaling coefficient in (24) can be estimated by using the subchannel power estimate as in (6) and the signal power estimate of the previously transmitted symbols as in (8). Similar to (6) and (8), the estimate of $S_k^t$ can be expressed as in (26), where the subchannel power estimate $\hat{p}_k$ is calculated recursively from (27). The resulting metrics for $k = 0, \ldots, (K/2)-1$ can be used by the Viterbi decoder to estimate the source bits.

TRELLIS CODE DESIGN
To achieve full diversity and high coding gain with the proposed differential TC-CIOD OFDM, we obtain an upper bound on the pairwise error probability (PEP), which is the probability that the decoder chooses an erroneous sequence $Z$ instead of the transmitted sequence $X$, defined in (30). In (30), we substitute $m(R^{t+1}, R^t, X^t)$ with the metrics in (20), (22), and (23), and use the corresponding metrics for $m(R^{t+1}, R^t, Z^t)$. Assuming that the previous codeword symbols are $a_k^t = (1 + j)/2$ and that the subchannel noise variables $n_k^t$ are i.i.d. zero-mean complex Gaussian r.v. with variance $N_0/2$ per dimension, and dropping the time index $t$ for simplicity, we obtain (31), where $Q(\cdot)$ is the Gaussian error function defined in (32) and the symbol energy involved in the STBC is given by (33) for $\xi = k$ and $k + (K/2)$. If we further assume that $E_\xi^2 = 1$, the pairwise error probability given by (31) simplifies to (34), which is the same expression as that given in [5], except that $2N_0$ is replaced by $4N_0$, corresponding to a 3 dB performance loss of the differential TC-CIOD scheme. Using the inequality (35) and ignoring the multiplier 1/2 for simplicity, we may upper bound (34) as in (36), where the modified Euclidean distance between a pair of trellis codewords $X$ and $Z$ is given by (37). The rotated trellis codewords corresponding to $X$ and $Z$ are denoted by $\tilde{X}$ and $\tilde{Z}$, respectively.
Let $X$ and $Z$ differ only during a short part of length $\kappa$. In this case, (37) can be rewritten in terms of $\eta = \{s+1, s+2, \ldots, s+\kappa\}$, $f(k) = \lfloor \pi_I(k)/2 \rfloor$, and $g(k) = \lfloor \pi_Q(k)/2 \rfloor$, where $\lfloor \cdot \rfloor$ takes the integer part of the operand. The coordinate interleaver $\pi$ can be represented by a pair of permutations for the real and imaginary parts of the input vector, denoted by $\pi_I(k)$ and $\pi_Q(k)$, respectively, which are used in the definition of $f(k)$ and $g(k)$ and follow from (11). In general, $\theta$ can be selected such that, for $x_k \neq z_k$, the real and imaginary components of $\tilde{x}_k$ and $\tilde{z}_k$ do not both differ. Hence, we should consider two different sets of $k$ values, $\eta_I$ and $\eta_Q$, for which the real and imaginary components of the rotated trellis codeword symbols $\tilde{x}_k$ and $\tilde{z}_k$ differ, respectively. In this case, at high signal-to-noise ratios (SNR), (41) can be expressed as (42), where $|\eta_I|$ and $|\eta_Q|$ represent the cardinalities of the sets $\eta_I$ and $\eta_Q$, respectively. It is clear from (42) that, under the assumption of perfect coordinate and channel interleaving, the achievable diversity of the system is given by (43) and the differential TC-CIOD coding gain is given by (44). The codeword error probability can be written in terms of the pairwise error probability as in (45), where $P(X)$ is the probability of the codeword $X$ being generated by the trellis encoder and the PEP $P(X, Z)$ is upper bounded by (42).
The trellis code and $\theta$ can be selected to minimize the codeword error probability upper bound obtained by substituting (42) in (45). The trellis code search is performed over all possible trellis generator polynomials based on the representation given in [11]. We selected $\theta$ values ranging from $0.5^\circ$ to $22.5^\circ$ in $2^\circ$ steps and $E_s/N_0 = 17$ dB during an exhaustive computer-based search of 4-, 8-, 16-, and 32-state 8-PSK $R = 2/3$ trellis codes, minimizing the codeword error probability upper bound calculated over all possible trellis codeword pairs $X$ and $Z$ of length $\kappa = 3$ that start and end at common trellis states. Figure 2 shows the codeword error probability ($P_e$) upper bound of the best trellis codes found for different values of $\theta$ for the considered 4-, 8-, 16-, and 32-state trellises. It is clear from Figure 2 that the codeword error probability upper bounds for the best trellis codes decrease with $\theta$ and achieve their minimum value for $\theta = 22.5^\circ$. Note that using rotation angles greater than $22.5^\circ$ gives the same $P_e$ upper bound values due to the considered 8-PSK constellation. The 4-state trellis code in Table 1 is found by minimizing the codeword error probability upper bound for $E_s/N_0 = 21$ dB and $\kappa = 4$. Similarly, the 8-, 16-, and 32-state trellis codes are found for $E_s/N_0 = 17$ dB and $\kappa = 6$. The $\kappa$ value used during the search is selected larger for trellises with a larger number of states in order to cover the critical codeword pairs that have a considerable effect on the system CER performance. The $E_s/N_0$ values used during the search were selected to find the optimum trellis codes for a CER of $10^{-2}$, which is a typical operating region for the system.
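The bounding step used in the PEP derivation above can be checked numerically. The sketch assumes that the inequality referred to as (35) is the standard Chernoff-type bound $Q(x) \le \tfrac{1}{2} e^{-x^2/2}$ for $x \ge 0$, which is consistent with the remark that a multiplier of 1/2 is ignored afterwards. The function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def chernoff_bound(x):
    """Upper bound 0.5 * exp(-x^2 / 2), valid for x >= 0."""
    return 0.5 * math.exp(-x * x / 2.0)
```

The bound is tight at $x = 0$ and becomes looser as $x$ grows, but it keeps the exponential dependence on the squared distance, which is what determines the diversity order and the coding gain.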

NUMERICAL RESULTS
In this section, we give the simulation results for the proposed system and evaluate the effect of interleaver selection on the performance of the concatenated schemes. We use two-symbol [3], symbol, and coordinate interleavers and consider the performance of both differential and nondifferential TC-STBCs. Figure 3 shows the codeword error rate (CER) of the systems with an efficiency of 2 bps/Hz, when trellis code termination and the OFDM cyclic prefix are excluded. The channel model used during the simulations is given in (18), where the $H_k^\mu$ are independent and identically distributed Gaussian random variables with variance 1/2 per dimension. To obtain the mean CER performance of the differential systems, the $H_k^\mu$ values are randomly reassigned after every 10 codeword transmissions, and each reassignment is followed by a dummy frame transmission that initializes the differential decoder for the new channel realization. Hence, this model corresponds to a very slowly varying fading channel. A perfectly interleaved multipath channel (that is, independent $H_k^\mu$), 48 OFDM subcarriers, and perfect knowledge of the scaling coefficients $S_k^t$ were assumed during the simulations. The proposed scheme outperforms the differential two-symbol interleaved TC-STBC proposed by Tarasak and Bhargava [3] by 8.5 dB in SNR at a CER of $10^{-3}$. Note that the symbol interleaver doubles the multipath diversity achieved by TC-STBC compared to the two-symbol interleaver considered in [2, 3], and outperforms the two-symbol interleaved case by 6.5 dB in SNR at a CER of $10^{-3}$. During the simulations, we employed a 2 × 48 block interleaver between TC and STBC as the symbol interleaver. When a symbol interleaver is used, the set size $\omega$, defined in [2], becomes equal to the effective length (time diversity) of the trellis code. Hence, the maximum achievable diversity of TC-STBC doubles. All of the codes employ the rate-2/3 8-PSK 4-state trellis used in [2], except the one denoted by T2, which uses the optimized 4-state trellis code given in Table 1. For TC-CIOD, the rotation angle $\theta$ is set to $22.5^\circ$, which is found to be optimum for $R = 2/3$ 8-PSK trellis codes with 4, 8, 16, and 32 states. The T2 trellis optimized for TC-CIOD improves the performance of differential TC-CIOD by 0.4 dB. For the sake of comparison, the CER performances of the nondifferential TC-STBC and TC-CIOD systems are also shown in Figure 3. As expected, the nondifferential schemes have an approximately 3 dB coding gain advantage over their differential counterparts. In Figure 4, the CER performances of the optimum differential TC-CIOD with the trellis codes given in Table 1 are compared with those of the 8-, 16-, and 32-state differential TC-STBC with the optimum trellis codes proposed in [3, Table I]. A perfectly interleaved multipath channel, 256 OFDM subcarriers, and perfect knowledge of the scaling coefficients $S_k^t$ were assumed during the simulations. As seen from Figure 4, the proposed scheme considerably outperforms the differential two-symbol interleaved TC-STBC given in [3]. Using TC-CIOD instead of TC-STBC with the aforementioned 8-, 16-, and 32-state trellis codes provides approximately 9.5 dB, 4 dB, and 3.5 dB SNR gain, respectively, at a CER of $10^{-3}$.
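The 2 × 48 block symbol interleaver mentioned above can be sketched as follows. The write and read order is an assumption, since the text only gives the interleaver dimensions: symbols are written row by row into a 2 × 48 array and read column by column. The function names are hypothetical.

```python
def block_interleave(symbols, rows, cols):
    """Write row by row into a rows x cols array, read column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("block size must equal rows * cols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(symbols, rows, cols):
    """Inverse of block_interleave for the same dimensions."""
    if len(symbols) != rows * cols:
        raise ValueError("block size must equal rows * cols")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]
```

Under this assumed ordering, each pair of consecutive output symbols contains trellis symbols that were 48 positions apart, so the two symbols that share one STB block come from distant trellis branches rather than from the same branch pair.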
Figure 5 shows the simulation results of the proposed differential TC-CIOD and the reference two-symbol interleaved differential TC-STBC [3] with the same bandwidth efficiency over the COST 207 12-ray typical urban (TU) channel model [12]. The TC-CIOD and TC-STBC employ the 4-state 8-PSK $R = 2/3$ trellis codes from Table 1 and [3], respectively. $K = 256$ OFDM subcarriers and an OFDM symbol duration of $T_s = 128\ \mu$s were selected for the simulations. The CER performances with perfect knowledge (PK) of the scaling coefficients $S_k^t$ were simulated for normalized Doppler frequencies $f_{D,n} = 0.001$ and $f_{D,n} = 0.01$, which, for an OFDM symbol period of $T_s = 128\ \mu$s and a carrier frequency of $f_c = 900$ MHz, correspond to mobile terminal speeds of $v = 9.37$ km/h and $v = 93.69$ km/h, respectively. Figure 5 shows that high mobile terminal speeds cause an error floor due to the rapid change of the channel weights. The simulations performed by estimating the scaling coefficients $S_k^t$ at the receiver by using (26) and (27) are indicated by the subchannel power estimation length $M$ in Figure 5. $M = 10$ and $M = 4$ were found to be optimum by exhaustive computer simulations for $f_{D,n} = 0.001$ and $f_{D,n} = 0.01$, respectively, under the considered channel conditions. When perfect channel interleaving is not assumed, the selection of the channel interleaver $\alpha$ considerably affects the CER performances of the TC-CIOD and TC-STBC systems. We performed the simulations for all possible block-type channel interleavers $\alpha$ and found that the performance of both systems improves when a 2 × 128 block-type channel interleaver is employed. Hence, all of the results given in Figure 5 are for the 2 × 128 block channel interleaver. Figure 5 shows that PK of the scaling coefficients $S_k^t$ provides approximately 2 dB and 4 dB SNR gain at a CER of $10^{-2}$ when $f_{D,n} = 0.001$ ($M = 10$) and $f_{D,n} = 0.01$ ($M = 4$), respectively. Note that we also simulated the TC-CIOD performance when the scaling coefficients $S_k^t$ are estimated by using the previous decoder output in (13) to find $(|a_{2k}^t|^2 + |a_{2k+1}^t|^2)$, which is then used in (24). However, this method does not provide useful results due to error propagation. Figure 5 also shows that the proposed TC-CIOD scheme outperforms the reference TC-STBC scheme [3] by 4 dB at a CER of $10^{-2}$ and by 6 dB at a CER of $10^{-3}$ when $f_{D,n} = 0.001$. Additionally, the proposed scheme has a much lower error floor when the channel weights are rapidly changing ($f_{D,n} = 0.01$).
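The terminal speeds quoted above follow from the normalized Doppler frequencies by a short calculation. The sketch assumes that the normalized Doppler frequency is defined as $f_{D,n} = f_D T_s$ and that the maximum Doppler shift is $f_D = v f_c / c$. The function and constant names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def terminal_speed_kmh(normalized_doppler, symbol_period_s, carrier_hz):
    """Speed in km/h for a given normalized Doppler frequency f_D * T_s."""
    doppler_hz = normalized_doppler / symbol_period_s  # f_D in Hz
    speed_ms = doppler_hz * SPEED_OF_LIGHT / carrier_hz
    return speed_ms * 3.6  # convert m/s to km/h
```

For $T_s = 128\ \mu$s and $f_c = 900$ MHz, `terminal_speed_kmh(0.001, 128e-6, 900e6)` gives about 9.37 km/h, and the value for $f_{D,n} = 0.01$ is ten times larger, in agreement with the speeds stated in the text.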
Figure 6 shows the CER performances of the proposed differential TC-CIOD and the reference two-symbol interleaved differential TC-STBC [3] with the 8-state 8-PSK $R = 2/3$ trellis codes from Table 1 and [3], respectively. The 2 × 128 block-type channel interleaver $\alpha$ is employed in all systems. Figure 6 shows that PK of the scaling coefficients $S_k^t$ provides approximately 2 dB and 3 dB SNR gain at a CER of $10^{-2}$ when $f_{D,n} = 0.001$ and $f_{D,n} = 0.01$, respectively. Figure 6 also shows that the proposed 8-state TC-CIOD outperforms the reference 8-state TC-STBC [3] by 4 dB at a CER of $10^{-2}$ and by 6 dB at a CER of $10^{-3}$ when $f_{D,n} = 0.001$. Additionally, the proposed scheme has a 10 times lower error floor when the channel weights are rapidly changing ($f_{D,n} = 0.01$).

CONCLUSIONS
A robust differential TC-CIOD OFDM system, which provides a high diversity gain and achieves a considerable CER performance improvement compared to existing schemes, has been proposed. The new space-time coding scheme employs a coordinate interleaver and a trellis code to boost the MIMO-OFDM performance, and has the advantage of avoiding pilot symbol transmission for CSI recovery. We have derived the Viterbi branch metrics for differential decoding and investigated the design criteria for trellis codes. The optimized 4-, 8-, 16-, and 32-state $R = 2/3$ 8-PSK trellis codes for TC-CIOD have been found by exhaustive computer-based search. The computer simulation results have shown that the new differential scheme considerably outperforms the existing scheme.

EURASIP Journal on Wireless Communications and Networking
The generator polynomials in octal form for the trellis codes optimizing (45), obtained by exhaustive computer-based code search, are given in Table 1, where the optimum $\theta = 22.5^\circ$ is used. Table 1 also shows the achievable diversity gain $G_d$ and the coding gain $G_c$ values obtained from (43) and (44), respectively, for $\theta = 22.5^\circ$.