Achieving low-complexity maximum-likelihood detection for the 3D MIMO code
 Ming Liu^{1},
 Matthieu Crussière^{1},
 Maryline Hélard^{1} and
 Jean-François Hélard^{1}
https://doi.org/10.1186/16871499201420
© Liu et al.; licensee Springer. 2014
Received: 15 July 2013
Accepted: 9 January 2014
Published: 1 February 2014
Abstract
The 3D multiple-input multiple-output (MIMO) code is a robust and efficient space-time block code (STBC) for distributed MIMO broadcasting, but it suffers from high maximum-likelihood (ML) decoding complexity. In this paper, we first analyze some properties of the 3D MIMO code to show that it is fast decodable. It is proven that the ML decoding performance can be achieved with a complexity of O(M^{4.5}) instead of O(M^{8}) in quasi-static channels with M-ary square QAM modulations. We then propose a simplified ML decoder exploiting the unique properties of the 3D MIMO code. Simulation results show that the proposed simplified ML decoder achieves a much lower processing time latency than the classical sphere decoder with Schnorr-Euchner enumeration.
1 Introduction
Multiple-input multiple-output (MIMO) is a promising technique that can bring significant improvements to wireless communication systems. In combination with space-time block codes (STBCs), it provides higher spectral efficiency with better communication reliability [1]. In the last decades, MIMO has been widely adopted in wireless communication standards such as IEEE 802.11n, 3GPP Long Term Evolution (LTE), WiMAX, and Digital Video Broadcasting-Next Generation Handheld (DVB-NGH). It is also seen as the key technology for future digital terrestrial TV broadcasting standards [2].
A so-called space-time-space (3D) MIMO code [3] was proposed for future TV broadcasting systems, in which the services are delivered by MIMO transmission in a single-frequency network (SFN). Specifically, it is proposed for a distributed MIMO broadcasting scenario, where TV programs are transmitted by two geographically separated transmission sites, each site being equipped with two transmit antennas. Each receiver has two receive antennas, forming a 4×2 MIMO transmission. The 3D MIMO code has been shown to be robust and efficient in distributed MIMO broadcasting scenarios where there exist strong received signal power imbalances [4]. Hence, it is a promising candidate for the MIMO profile of future broadcasting standards. However, the 3D MIMO code suffers from a high computational complexity when maximum-likelihood (ML) decoding is adopted: the decoding complexity is as high as O(M^{8}) when an M-QAM constellation is used. To date, no study on decoding complexity reduction for the 3D MIMO code has been reported in the literature.
Recently, considerable effort has been devoted to STBC designs that obtain both a high code rate and a low decoding complexity [5–11]. The decoding complexity reduction is commonly achieved by exploiting the orthogonality embedded in the STBC codeword. When groupwise orthogonality exists in the codeword, the joint detection of many information symbols is converted into independent, groupwise detections [6, 10], yielding low decoding complexity. In other cases, such as the DjABBA code [12], the Biglieri-Hong-Viterbo (BHV) code [7], the Srinath-Rajan code [8], and the Ismail-Fiorina-Sari (IFS) code [11], the orthogonality only exists among a part of the information symbols, and some symbols can be detected in a groupwise manner once they are conditioned on the other symbols. Such STBCs are referred to as fast decodable STBCs because they achieve ML decoding performance with a reduced order of complexity. However, most of the fast decodable STBCs are not optimized for distributed MIMO broadcasting scenarios, and they are not robust under received signal power imbalance conditions [4].
A partial interference cancellation (PIC) group decoding scheme has been presented, aiming at reducing the decoding complexity of the STBCs containing groupwise orthogonalities in the codewords [13, 14]. A number of STBCs that are optimized for this decoding scheme have also been proposed [14, 15]. This scheme actually uses a linear equalization to convert the joint detection of a large number of symbols to several groups of ML decodings for few symbols. However, the overall performance of this decoding scheme cannot achieve the ML optimality.
Some alternatives with reduced decoding complexity have been presented for the distributed MIMO broadcasting. Polonen and Koivunen described a STBC with less decoding complexity based on orthogonal basis [16]. However, such a code does not achieve full diversity or full rate for 4×2 MIMO transmissions and therefore performs worse than the 3D MIMO code. A ‘punctured version’ of the 3D MIMO code that possesses full rate with low decoding complexity has also been proposed [17]. However, it does not achieve full diversity and is hence less robust in harsh channel conditions.
In this paper, we propose a reduced-complexity ML decoder for the 3D MIMO code which exploits the embedded orthogonality in the codeword. The main contributions are as follows:

We propose to modify the original 3D MIMO codeword through some permutations of information symbols, which lead to an ML decoding algorithm with reduced complexity without affecting any of the desirable properties of the 3D MIMO code.

We prove that the 3D MIMO code is fast decodable. Moreover, we show that the worst-case decoding complexity is O(M^{4.5}) for M-ary square QAM modulations, which is the lowest among all square full-rate STBCs for 4×2 MIMO transmission.

Based on the unique properties of the new form of the 3D MIMO codeword, we propose a novel implementation of the simplified decoder that achieves a lower average complexity in terms of time latency without losing the ML optimality. The proposed implementation is also applicable to other fast decodable STBCs.
The remainder of the paper is organized as follows. Some fundamentals of MIMO detection are presented in Section 2. In Section 3, the 3D MIMO code is first recalled. Then, a modification of the codeword is proposed to facilitate the decoding process, and three important properties of the new codeword are revealed. With this knowledge, the ML decoder with a worst-case decoding complexity of O(M^{4.5}) is derived in Section 4. Then, in Section 5, a new implementation of the reduced-complexity ML decoder is described. Section 6 presents the symbol error and complexity performance of the new decoder. Conclusions are drawn in Section 7.
1.1 Notations
When the $(\check{\cdot})$ operator is applied to a matrix $\mathbf{X}\in {\mathbb{C}}^{m\times n}$, the operation in (1) is performed for all elements x_{j,k} in the matrix, i.e., the (j,k)th 2×2 submatrix of $\check{\mathbf{X}}$ is ${\check{x}}_{j,k}$. For a complex vector $\mathbf{x}={[{x}_{1},{x}_{2},\dots ,{x}_{n}]}^{\mathrm{T}}\in {\mathbb{C}}^{n}$, the operator $(\tilde{\cdot})$ separates the real and imaginary parts of the given vector, i.e., $\tilde{\mathbf{x}}\triangleq {[{x}_{1}^{\mathrm{R}},{x}_{1}^{\mathrm{I}},\dots ,{x}_{n}^{\mathrm{R}},{x}_{n}^{\mathrm{I}}]}^{\mathrm{T}}$. For a matrix X=[x_{1},x_{2},…,x_{n}], where x_{j} is the jth column of X, the operator vec(X) stacks the columns of X to form one column vector, i.e., $\text{vec}\left(\mathbf{X}\right)\triangleq {[{\mathbf{x}}_{1}^{\mathrm{T}},{\mathbf{x}}_{2}^{\mathrm{T}},\dots ,{\mathbf{x}}_{n}^{\mathrm{T}}]}^{\mathrm{T}}$. $\widetilde{\text{vec}\left(\mathbf{X}\right)}$ denotes vectorizing the matrix X followed by the real/imaginary part separation. The inner product of two real-valued vectors x and y is denoted by 〈x,y〉=x^{T}y. The n×n identity matrix is denoted by I_{n}. The operator ⊗ denotes the Kronecker product. Finally, i represents $\sqrt{-1}$.
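As a small illustration of this notation, the following Python sketch applies the column-stacking operator and the real/imaginary separation to a 2×2 complex matrix. The function names `vec` and `tilde` and the example matrix are chosen here for illustration only and do not appear in the paper.

```python
def vec(X):
    """Stack the columns of a matrix (given as a list of rows) into one list."""
    rows, cols = len(X), len(X[0])
    return [X[r][c] for c in range(cols) for r in range(rows)]


def tilde(x):
    """Separate real and imaginary parts: [x1R, x1I, ..., xnR, xnI]."""
    out = []
    for value in x:
        out.extend([value.real, value.imag])
    return out


# Illustrative 2x2 complex matrix (hypothetical values).
X = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 0j]]

stacked = vec(X)          # columns of X, one after the other
separated = tilde(stacked)  # real-valued vector of twice the length
print(stacked)
print(separated)
```

A complex vector of length n therefore becomes a real vector of length 2n, which is why the real-valued tree searches discussed later operate on 2κ levels for κ complex symbols.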
2 System model
2.1 MIMO system model
The received signal $\mathbf{Y}\in {\mathbb{C}}^{{N}_{\mathrm{r}}\times T}$ is modeled as $\mathbf{Y}=\mathbf{H}\mathbf{X}+\mathbf{W}$, where $\mathbf{X}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times T}$ is the STBC codeword matrix which is transmitted over T channel uses, $\mathbf{W}\in {\mathbb{C}}^{{N}_{\mathrm{r}}\times T}$ is a complex-valued additive white Gaussian noise (AWGN) component, and $\mathbf{H}\in {\mathbb{C}}^{{N}_{\mathrm{r}}\times {N}_{\mathrm{t}}}$ is the channel matrix whose (j,k)th element h_{j,k} denotes the channel coefficient of the link between the kth transmit antenna and the jth receive antenna. The channel is assumed to be quasi-static. That is, the channel coefficients remain constant over the duration of one STBC codeword but change from one codeword to another. Moreover, the h_{j,k}’s are assumed to be independent of each other.
where ${\mathcal{A}}_{j}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times T}$ and ${\mathcal{B}}_{j}\in {\mathbb{C}}^{{N}_{\mathrm{t}}\times T}$ are the complex weight matrices representing the contribution of the real and imaginary parts of the jth information symbol s_{j} in the final codeword matrix.
Note that the real-valued expression of the signal can be obtained from the complex-valued form via a linear transform. Hence, we will use both the real-valued and complex-valued forms in the sequel.
2.2 ML decoding of MIMO signals
where Θ is the set of the constellation symbols. Equation (7) indicates that the ML solution is found by jointly determining κ independent information symbols. In other words, when the modulation of these symbols is M-QAM, the ML decoder must exhaustively check all M^{κ} combinations. The search complexity grows dramatically with a higher modulation order or a larger number of information symbols in one codeword. Hence, ML decoding is computationally demanding.
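To make this cost concrete, an exhaustive search can be sketched as follows. The sketch assumes the usual minimum-Euclidean-distance ML metric over a linear model y = H s; the function name `ml_detect`, the toy 2×2 channel, and the 4-QAM alphabet are illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def ml_detect(y, H, constellation):
    """Exhaustive ML search: test every symbol vector, keep the closest one."""
    kappa = len(H[0])                      # number of information symbols
    best, best_dist, visited = None, float("inf"), 0
    for s in product(constellation, repeat=kappa):
        visited += 1
        dist = 0.0
        for row, y_r in zip(H, y):
            err = y_r - sum(h * s_k for h, s_k in zip(row, s))
            dist += abs(err) ** 2          # squared Euclidean distance
        if dist < best_dist:
            best, best_dist = s, dist
    return best, best_dist, visited


# Hypothetical noise-free toy example with 4-QAM and two symbols.
qam4 = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
H = [[1.0 + 0.5j, 0.3 - 0.2j],
     [0.1 + 0.4j, 0.9 - 0.3j]]
s_true = (1 - 1j, -1 + 1j)
y = [sum(h * s for h, s in zip(row, s_true)) for row in H]

s_hat, dist, visited = ml_detect(y, H, qam4)
print(s_hat, dist, visited)
```

The loop visits M^{κ} candidates (16 in this toy case), so the same approach with κ=8 symbols and 16-QAM would have to examine 16^{8} candidates per codeword.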
2.3 Fast ML decoding of MIMO signals
where $\tilde{\mathbf{z}}={\mathbf{Q}}^{\mathrm{T}}\tilde{\mathbf{y}}$ is a linear transformation of the received signal with $\tilde{\mathbf{z}}\in {\mathbb{R}}^{2\kappa}$, and the search region is a hypersphere centered on the received signal. Only the codewords inside the hypersphere are checked during the search in order to reduce the search complexity. The size of the hypersphere is represented by its radius. The decoding process is turned into a bounded search over a κ-level tree with complex-valued nodes. Hence, the worst-case decoding complexity is O(M^{κ}).
Moreover, according to the properties of the QR decomposition, some information symbols can be decoded independently of the others if some elements of R are equal to zero. This suggests that the joint search in a high dimension can be converted into a set of parallel, independent searches in low dimensions, which results in a significant reduction of the worst-case decoding complexity [7, 8, 19].
3 3D MIMO code
In this section, we propose a new 3D MIMO codeword that enables low sphere decoding complexity via exchanging the positions of information symbols in the original 3D MIMO codeword. The basic idea behind this modification comes from the facts that the orthogonality embedded in the information symbols essentially enables independent detections and the sphere decoding complexity is mainly determined by the orthogonality among the first several symbols. Hence, exploiting the underlying orthogonality in the codeword and carefully choosing the sequence of information symbols can bring benefits in terms of decoding complexity.
3.1 A new proposal of the 3D MIMO codeword
where $\theta =\frac{1+\sqrt{5}}{2}$, $\bar{\theta}=1-\theta $, α=1+i(1−θ), and $\bar{\alpha}=1+i(1-\bar{\theta})$. It is constructed in a hierarchical manner: eight information symbols (κ=8) are first encoded into two golden codewords [20], i.e., X_{golden,1} and X_{golden,2}, which are subsequently arranged in an Alamouti manner [21] over four channel uses (T=4)^{b}. This results in a code rate of 2, which is full rate for the 4×2 MIMO transmission. Previous studies show that the 3D MIMO code achieves efficient and robust performance. However, since eight information symbols are stacked in one codeword, the ML decoding complexity is up to O(M^{8}).
Since we only change the sequence of the information symbols in the codeword (the third and fourth information symbols become the fifth and sixth, respectively, and vice versa) and the information symbols are independent from each other, the new codeword preserves all the good attributes of the original 3D MIMO code in distributed MIMO scenarios. More importantly, this modification is based on the embedded orthogonalities in the 3D MIMO codeword and yields an interesting codeword structure which will be exploited to achieve lower decoding complexity. The advantages brought by the new codeword structure will be highlighted in the following sections.
3.2 Key properties of the proposed 3D MIMO codeword
where the R_{jk}’s are 4×4 submatrices containing the inner products 〈q_{m},h_{n}〉 with m=4(j−1)+1,…,4j and n=4(k−1)+1,…,4k.
Based on the new codeword in (10) and taking into account (6), (3), and (4), we obtain a few interesting properties of R that can be made use of to achieve a low decoding complexity.
Theorem 1. R_{11} is an upper triangular matrix with 〈q_{1},h_{2}〉=〈q_{1},h_{4}〉=〈q_{2},h_{3}〉=〈q_{3},h_{4}〉=0. □
Theorem 2. R_{12} is a null matrix when the channel is quasi-static, i.e., 〈q_{j},h_{k}〉=0, ∀j=1,2,3,4, and k=5,6,7,8. □
Corollary 1. R_{22} is an upper triangular matrix with a structure similar to that of R_{11}, i.e., 〈q_{5},h_{6}〉=〈q_{5},h_{8}〉=〈q_{6},h_{7}〉=〈q_{7},h_{8}〉=0. □
Remark 1. Theorem 1 and Corollary 1 suggest the independence between the real and imaginary parts of the information symbols. For instance, 〈q_{1},h_{2}〉=〈q_{1},h_{4}〉=〈q_{3},h_{4}〉=0 means that the real parts of the first and second received symbols, namely $\tilde{\mathbf{z}}\left(1\right)$ and $\tilde{\mathbf{z}}\left(3\right)$, do not contain any contribution from ${s}_{1}^{\mathrm{I}}$ and ${s}_{2}^{\mathrm{I}}$. Similarly, 〈q_{2},h_{3}〉=0 means that their imaginary parts, namely $\tilde{\mathbf{z}}\left(2\right)$ and $\tilde{\mathbf{z}}\left(4\right)$, do not contain any contribution from ${s}_{1}^{\mathrm{R}}$ and ${s}_{2}^{\mathrm{R}}$ either. As we will show later, this real/imaginary independence leads to independent and parallel detections for the real and imaginary parts.
The real/imaginary part independence comes from the underlying golden and Alamouti structures. It has been revealed that the complex-valued R matrix of the golden code has a real upper left submatrix [19], which coincides with the structure presented in Theorem 1. It shows the real/imaginary part independence of the golden code in its 2×2 codeword matrix. The Alamouti-like arrangement of the two golden codewords, on the other hand, helps create this independence in the 4×4 codeword matrix of the 3D MIMO code.
Remark 2. Theorem 2 indicates that some of the information symbols are uncorrelated with others in the received symbols. More precisely, the first two received complex symbols, or equivalently $\tilde{\mathbf{z}}\left(1\right)$, $\tilde{\mathbf{z}}\left(2\right)$, $\tilde{\mathbf{z}}\left(3\right)$, and $\tilde{\mathbf{z}}\left(4\right)$, do not contain any contribution from the information symbols s_{3} and s_{4}. Hence, a group of six information symbols s_{1}, s_{2}, s_{5}, s_{6}, s_{7}, and s_{8} can be jointly determined, regardless of the values of s_{3} and s_{4}. This means that ML decoding can be achieved by joint searches over six, instead of eight, information symbols. In other words, the ML decoding complexity is expected to be O(M^{6}) instead of O(M^{8}). Therefore, the 3D MIMO code is fast decodable.
It should be noted that Theorem 2 is partially enabled by the embedded Alamouti structure in the codeword. The channel coefficients must be constant within the duration of one codeword for the orthogonalities in the Alamouti structure to hold. Hence, Theorem 2 is only valid in quasi-static channels^{c}.
3.3 Comparison with the original 3D MIMO codeword
It should be emphasized that the new codeword only changes the sequence of the information symbols in the codeword to facilitate the decoding process. It does not affect any of the good properties of the 3D MIMO code.
4 Proposed ML decoder with low complexity
In this section, a low-complexity ML decoding algorithm exploiting the unique properties highlighted in the previous section is proposed for the 3D MIMO code. Generally speaking, the complexity reduction is achieved in two steps. First, based on Theorem 2, the joint detection of eight information symbols is converted into two partially independent detections of six information symbols. This step reduces the worst-case decoding complexity from O(M^{8}) to O(M^{6}). Second, using Theorem 1 and Corollary 1, the detections of complex information symbols are converted into independent detections of real and imaginary parts, which further reduces the worst-case complexity to O(M^{4.5}).
4.1 Groupwise parallel detections
From (12) and (13), it can be seen that the contributions from the information symbol groups a and b are uncorrelated in the received symbol. For instance, z_{12} does not contain any information from b, and z_{34} is irrelevant to a, either. This enables us to use groupwise conditional detections to retrieve the ML solutions [23].
with v_{12}=z_{12}−R_{13}c−R_{14}d and v_{34}=z_{34}−R_{23}c−R_{24}d. The outer search is carried out over the combinations of [c,d]. For a given [c,d], the search of a and the search of b are performed in parallel. The concatenation of the outer and inner searches (either a or b) results in a joint search of six information symbols. Therefore, the worst-case decoding complexity is reduced from O(M^{8}) to O(M^{6}). We note that this complexity reduction does not rely on the constellation adopted by the information symbols. In other words, the 3D MIMO code requires a worst-case decoding complexity of O(M^{6}) for arbitrary modulation.
4.2 Independent detections of real and imaginary parts
${\hat{s}}_{1}^{\mathrm{I}}$ is computed by applying the solution ${\hat{s}}_{2}^{\mathrm{R}}$ in (24).
Using the same technique, the best solutions of b in (17) can also be converted into independent detections of b^{R} and b^{I}. Substituting R_{22}, v_{3}, v_{4}, s_{3}, and s_{4} for R_{11}, v_{1}, v_{2}, s_{1}, and s_{2}, respectively, in (21), (22), (23), and (24) yields the detections for s_{3} and s_{4}. In general, for a given [c,d], the search of two complex symbols [a,b] is turned into four independent searches of $\sqrt{M}$-PAM symbols. The resulting overall complexity to decode a whole codeword is O(M^{4.5}).
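The three complexity orders quoted in this section can be compared numerically with a short Python sketch. It only evaluates the leading-order terms M^{8}, M^{6}, and M^{4}·√M with all constant factors dropped, so the numbers are orders of magnitude rather than exact node counts; the function name `worst_case_orders` is a hypothetical helper.

```python
import math


def worst_case_orders(M):
    """Leading-order worst-case search sizes for M-ary square QAM."""
    root = math.isqrt(M)
    if root * root != M:
        raise ValueError("M must be a perfect square (square QAM)")
    exhaustive = M ** 8        # joint search over eight complex symbols
    groupwise = M ** 6         # outer search over [c,d], inner search over a or b
    simplified = M ** 4 * root  # outer search over [c,d], inner sqrt(M)-PAM searches
    return exhaustive, groupwise, simplified


for M in (4, 16, 64):
    print(M, worst_case_orders(M))
```

For 16-QAM, for instance, the last step alone shrinks the leading term by a factor of M^{1.5}=64 relative to the groupwise search.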
Comparison of ML decoding complexities of STBCs for 4 × 2 MIMO transmission
5 Proposed implementation of the simplified ML decoder
In the previous sections, we have illustrated the fast decodability of the 3D MIMO code in theory. With this knowledge, we propose an implementation of the simplified ML decoder that can be used in practice. Using the two-stage tree search structure and leveraging the symmetric structure of the codeword, the proposed implementation requires a low average complexity in practice. Moreover, various performance-complexity trade-offs can be easily achieved by replacing the sphere decoder with other suboptimal tree search algorithms such as the K-best algorithm [25] and the fixed-complexity sphere decoder [26].
5.1 Two-stage decoding structure
Algorithm 1 Simple ML decoder for 3D MIMO code
Algorithm 2 Simple ML decoder SimpML
Algorithm 3 Parallel decision algorithm ParaDec
Algorithm 4 Column switch algorithm ColSwt
5.1.1 Four-level tree search phase
The joint detection of [c,d] is realized by a complex sphere decoder with Schnorr-Euchner enumeration, which is visualized by the search over a four-level tree as shown in Figure 3. The nodes of the same level represent all the solutions of a complex information symbol. Each path from the root to a leaf node represents a possible combination of [c,d].
The details of the tree search are presented in Algorithm 2. The search starts from the root node and traverses the nodes of lower levels in a depth-first manner. An adaptive search radius is used to speed up the convergence of the algorithm by limiting the search to a hypersphere. For the node being checked, the partial distance resulting from the current path is compared with the radius. If the partial distance is smaller than the radius, the search moves on to the child nodes on the next level. Otherwise, the search jumps to another sibling node on the current level. When all the nodes of the level have been checked, the search goes back to the upper level. The radius is initially set to infinity and is adaptively decreased according to the best solution already found in the search. Specifically, the radius is updated taking into account the best combination of [c,d] and [a,b] (line 16 of Algorithm 2). The latter is obtained from the parallel decision phase. The tree search is terminated when all the nodes within the hypersphere have been checked. The best solution is the ML solution.
The sequence in which the sibling nodes are visited is determined by their partial distances in ascending order. This guarantees that the promising candidates are visited first in order to reduce the search complexity. This ordering process is referred to as the Schnorr-Euchner enumeration [18, 27, 28]. It can simply be implemented by a lookup table [29, 30] (line 4 in Algorithm 1), and its complexity is merely the computation of the linear estimation ${\hat{s}}_{\text{ZF}}$.
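The depth-first search with an adaptive radius and Schnorr-Euchner ordering can be sketched in Python as follows. This is a deliberately simplified, real-valued illustration and not the paper's Algorithm 2: it assumes an upper triangular real matrix R with nonzero diagonal, PAM symbols, and a sort-based ordering instead of a lookup table, and it omits the parallel decision phase. The name `sphere_decode` and the toy values are hypothetical.

```python
def sphere_decode(R, z, points):
    """Depth-first sphere decoder for z = R s + noise with upper triangular R."""
    n = len(z)
    best = {"s": None, "radius": float("inf")}  # squared radius, shrinks over time
    s = [0] * n

    def search(level, partial):
        if level < 0:                       # leaf reached: better full solution
            best["s"], best["radius"] = list(s), partial
            return
        # Remove the interference of the symbols already fixed on this path.
        interf = sum(R[level][k] * s[k] for k in range(level + 1, n))
        target = (z[level] - interf) / R[level][level]
        # Schnorr-Euchner ordering: closest candidates to the estimate first.
        for p in sorted(points, key=lambda c: abs(c - target)):
            dist = partial + (R[level][level] * (p - target)) ** 2
            if dist >= best["radius"]:
                break                       # remaining siblings are even farther
            s[level] = p
            search(level - 1, dist)

    search(n - 1, 0.0)                      # the last symbol is the tree root
    return best["s"], best["radius"]


# Hypothetical toy system with 4-PAM symbols.
pam4 = [-3, -1, 1, 3]
R = [[2.0, 1.0],
     [0.0, 1.0]]
z_noisy = [-0.7, -2.2]                      # noisy version of R @ [1, -3]
print(sphere_decode(R, z_noisy, pam4))
```

Because siblings are visited in ascending distance, the first sibling that violates the radius allows the whole level to be abandoned, which is the pruning effect the enumeration is meant to provide.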
5.1.2 Parallel decision phase
Once a leaf node is reached in the tree search, a candidate solution of [c,d] is found. The tree search process is then suspended, and the new [c,d] is used to trigger the parallel detections of the remaining symbols.
Moreover, we propose a mechanism that terminates the search in each branch based not only on its own results but also on the results from the other branches. In particular, once the best solution of the jth branch is found ahead of the others, the resulting branch distance d_{j} is recorded and shared with the other branches to speed up the overall search process.
Take the search of the first branch as an example. The most promising PAM symbol in the unchecked symbol list is assigned to ${\overline{s}}_{2}^{\mathrm{R}}$ (line 8 in Algorithm 3). The partial distance τ_{1} is calculated (line 9 in Algorithm 3). The search is terminated in two cases: (a) if this partial distance is greater than the current minimum branch distance (τ_{1}>p_{1}) and (b) if the overall distance is beyond the current radius of the sphere decoder in the tree search phase ((τ_{1}+d_{2}+d_{3}+d_{4}+d)>radius).
Once the searches on all the branches are terminated, the solution [a,b] and the resulting distance d_{p} are returned to the tree search phase. The tree search process is resumed. The overall distance is compared with the current radius (line 14 in Algorithm 2) to determine whether the current solution is a better one. If a better solution is found, the radius is updated accordingly (line 16 in Algorithm 2). The tree search process is moved on to the next unchecked node.
5.2 Column switch based on ZF estimation
In the proposed algorithm, the search of eight symbols is divided into a tree search for four symbols and parallel detections for the other four symbols. Due to the symmetric structure of the codeword matrix (10), some parts of the codeword can be exchanged without changing the properties of the 3D MIMO code. For instance, the properties illustrated in Section 3 still hold after exchanging the positions of [s_{1},s_{2},s_{3},s_{4}] with [s_{5},s_{6},s_{7},s_{8}]. Similarly, if we exchange [s_{1},s_{2}] with [s_{3},s_{4}] and [s_{5},s_{6}] with [s_{7},s_{8}] simultaneously, the structure of the R matrix is maintained as well. That is to say, besides the original symbol sequence, the proposed low-complexity decoding algorithm is also valid with three other permuted symbol sequences, i.e., [s_{5},s_{6},s_{7},s_{8},s_{1},s_{2},s_{3},s_{4}], [s_{3},s_{4},s_{1},s_{2},s_{7},s_{8},s_{5},s_{6}], and [s_{7},s_{8},s_{5},s_{6},s_{3},s_{4},s_{1},s_{2}].
The exchanging of the symbol sequences can be achieved by permuting the corresponding columns in the equivalent channel matrix H_{eq}. Note that the aforementioned column permutations do not affect the decoding performance. This permits us to choose the symbols that will be determined by the tree search and the ones that will be decoded in the parallel detections.
The proposed column switch method is presented in Algorithm 4. The basic idea is to use the tree search to determine the more difficult half part and use the parallel detections to find the easier half part. The reason behind this idea is that the parallel decoding is more efficient to decode the reliable symbols separately. The more accurate the linear estimation, the faster is the convergence speed for each individual detection branch. On the other hand, the tree search phase is a joint serial detection in nature which is more suitable to decode those unreliable symbols.
The next question is how to properly choose the unreliable symbols. In the literature, Barbero and Thompson proposed to sort the decoding sequence based on the norm of the subchannels in the fixed-complexity sphere decoder [26]. However, this is not applicable here because the 3D MIMO code achieves full diversity, and the equivalent subchannels have similar norm values. In addition, as we have to maintain the structure of the R matrix, the unconstrained subchannel sorting proposed in [29] is not applicable either.
where ${\mathbf{s}}_{\text{ZF}}={\mathbf{H}}_{\text{eq}}^{\ddagger}\mathbf{y}$ is the unconstrained estimation of the information symbols, in which ${\mathbf{H}}_{\text{eq}}^{\ddagger}$ represents the inverse of the equivalent channel matrix, and ${\hat{\mathbf{s}}}_{\text{ZF}}=\mathrm{Q}\left({\mathbf{s}}_{\text{ZF}}\right)$ is the constellation point that is closest to s_{ZF}. The metric is the distance between the estimated information symbols and the nearest constellation points, i.e., an indicator of the estimation accuracy.
Using (25), the decoding sequence can be determined in two levels. We first compare the aggregate errors of the first half and the second half of the symbols (line 2 in Algorithm 4). The half with worse accuracy is assigned to the tree search (put in the latter part of the decoding sequence). Then, within this half, the errors of the first two symbols and the last two symbols are compared. The two symbols with worse accuracy are put closer to the root of the tree. If this two-symbol-by-two-symbol exchange takes place in the half of the symbols that is to be decoded using the tree search, the same exchange must be done in the other half in order to maintain the structure of the R matrix. If only the symbol exchange between the two halves is carried out, it is referred to as ‘4-by-4 column switch’. If the exchange within each half is also performed, it is called ‘2-by-2 column switch’. The advantage of the column switch will be shown in the next section.
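The two-level ordering rule can be sketched in Python as follows. This is only an illustration of the decision logic described in the text, not a transcription of Algorithm 4: it assumes that the per-symbol ZF errors are already available, that the aggregate error of a group is the plain sum of its per-symbol errors, and that the tree root corresponds to the last positions of the decoding sequence. The function name `column_switch` and the error values are hypothetical.

```python
def column_switch(err):
    """Return the decoding order (0-based symbol indices) from eight ZF errors."""
    if len(err) != 8:
        raise ValueError("expected one error value per information symbol")
    first, second = [0, 1, 2, 3], [4, 5, 6, 7]

    # 4-by-4 switch: the less reliable half goes last, i.e. to the tree search.
    if sum(err[i] for i in first) > sum(err[i] for i in second):
        first, second = second, first

    # 2-by-2 switch: inside the tree-search half, the less reliable pair goes
    # last (assumed closest to the root); mirror the swap in the other half
    # so that the structure of the R matrix is preserved.
    if sum(err[i] for i in second[:2]) > sum(err[i] for i in second[2:]):
        second = second[2:] + second[:2]
        first = first[2:] + first[:2]

    return first + second


# Hypothetical error profiles.
print(column_switch([0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3]))
print(column_switch([0.3, 0.3, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]))
```

By construction, the function can only return the original sequence or one of the three permuted sequences listed in Section 5.2, so the structure of R exploited by the decoder is never broken.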
6 Simulation results
6.1 BER performance
6.2 Computational complexity
In general, we can see the different trade-offs achieved by the different decoders. The proposed decoder achieves ML performance with less time latency and fewer divisions than the Guo-Nilsson decoder. On the other hand, the Guo-Nilsson decoder needs fewer multiplications, at the cost of some performance loss with 16-QAM.
7 Conclusion
The 3D MIMO code has been shown to be efficient and robust in distributed MIMO scenarios. Yet, it suffers from high ML decoding complexity. In this paper, we first proposed a new form of the 3D MIMO codeword and investigated some important properties of the new codeword. With these properties, the 3D MIMO code is proven to be fast decodable. We then proposed a reduced-complexity ML decoder for the 3D MIMO code which offers the same performance as that of the ML decoder. The simulation results demonstrate that the novel low-complexity decoder yields much less processing time latency than the classical Guo-Nilsson sphere decoder with Schnorr-Euchner enumeration. Moreover, the proposed 2-by-2 column switch technique can significantly reduce the average decoding complexity, especially with 16-QAM modulation.
Endnotes
^{a} We assume that the receiver has perfect knowledge of the channel in our work. In practice, the channel coefficients should be estimated using some channel estimation techniques.
^{b} Note that this construction is different from those of the quasi-orthogonal code [5] and the EAST code [9].
^{c} The fast decodability of other STBCs such as the DjABBA, BHV, Srinath-Rajan, and IFS codes also requires the quasi-static channel assumption.
Appendix
Definition of the QR decomposition
where r_{1}=h_{1}, ${\mathbf{r}}_{j}={\mathbf{h}}_{j}-\sum _{k=1}^{j-1}\langle {\mathbf{q}}_{k},{\mathbf{h}}_{j}\rangle {\mathbf{q}}_{k}$, q_{j}=r_{j}/∥r_{j}∥, j=1,…,2κ.
Proof of Theorem 1
Based on H_{eq}, after some straightforward computation, it yields 〈h_{1},h_{2}〉=〈h_{1},h_{4}〉=〈h_{2},h_{3}〉=〈h_{3},h_{4}〉=0. According to the definition of QR decomposition, q_{1}=h_{1}/∥h_{1}∥. Hence, 〈q_{1},h_{2}〉=〈q_{1},h_{4}〉=0.
In addition, r_{2}=h_{2}−〈q_{1},h_{2}〉q_{1}=h_{2}, q_{2}=r_{2}/∥r_{2}∥=h_{2}/∥h_{2}∥. Taking into account that 〈h_{2},h_{3}〉=0, it yields 〈q_{2},h_{3}〉=0.
Moreover, ${\mathbf{r}}_{3}={\mathbf{h}}_{3}-\sum _{j=1}^{2}\langle {\mathbf{q}}_{j},{\mathbf{h}}_{3}\rangle {\mathbf{q}}_{j}={\mathbf{h}}_{3}-\langle {\mathbf{q}}_{1},{\mathbf{h}}_{3}\rangle {\mathbf{q}}_{1}$ and q_{3}=r_{3}/∥r_{3}∥=(h_{3}−〈q_{1},h_{3}〉q_{1})/∥r_{3}∥. Therefore, 〈q_{3},h_{4}〉=(〈h_{3},h_{4}〉−〈q_{1},h_{3}〉〈q_{1},h_{4}〉)/∥r_{3}∥=0.
This completes the proof of Theorem 1.
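The argument can be checked numerically with a small Python sketch. It implements the Gram-Schmidt recursion from the definition above (assuming, as is standard, that the diagonal of R holds ∥r_{j}∥) and applies it to four hypothetical real vectors that merely share the pairwise orthogonality pattern 〈h_{1},h_{2}〉=〈h_{1},h_{4}〉=〈h_{2},h_{3}〉=〈h_{3},h_{4}〉=0. These vectors are not the columns of the actual H_{eq}; they only reproduce the inner-product relations used in the proof.

```python
import math


def inner(x, y):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(cols):
    """Gram-Schmidt QR: returns orthonormal q_j and R with R[k][j] = <q_k, h_j>."""
    n = len(cols)
    q = []
    R = [[0.0] * n for _ in range(n)]
    for j, h in enumerate(cols):
        r = list(h)
        for k in range(j):
            R[k][j] = inner(q[k], h)
            r = [ri - R[k][j] * qi for ri, qi in zip(r, q[k])]
        R[j][j] = math.sqrt(inner(r, r))    # norm of the residual r_j
        q.append([ri / R[j][j] for ri in r])
    return q, R


# Hypothetical columns h1..h4 with the orthogonality pattern of the proof.
h = [[1, 1, 0, 0],
     [1, -1, -1, 0],
     [1, 0, 1, 0],
     [1, -1, -1, 2]]

q, R = gram_schmidt(h)
structural_zeros = [R[0][1], R[0][3], R[1][2], R[2][3]]
print(structural_zeros)
print(R[0][2], R[1][3])   # entries that are not forced to zero
```

The four entries named in Theorem 1 vanish up to rounding error, while 〈q_{1},h_{3}〉 and 〈q_{2},h_{4}〉 remain nonzero, which is the sparsity pattern exploited for the separate real and imaginary detections.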
Proof of Theorem 2
Based on H_{eq}, after some straightforward computation, it yields 〈h_{ j },h_{ k }〉=0, ∀j=1,2,3,4, and k=5,6,7,8. Using q_{1}=h_{1}/∥h_{1}∥ and q_{2}=h_{2}/∥h_{2}∥ which have been proven in the proof of Theorem 1, it yields 〈q_{ j },h_{ k }〉=0, ∀j=1,2, and k=5,6,7,8.
Using q_{3}=(h_{3}−〈q_{1},h_{3}〉q_{1})/∥r_{3}∥ which has been proven in the proof of Theorem 1, it yields 〈q_{3},h_{ k }〉=(〈h_{3},h_{ k }〉−〈q_{1},h_{3}〉〈q_{1},h_{ k }〉)/∥r_{3}∥=0, ∀k=5,6,7,8. Similarly, since q_{4}=(h_{4}−〈q_{2},h_{4}〉q_{2})/∥r_{4}∥, it yields 〈q_{4},h_{ k }〉=(〈h_{4},h_{ k }〉−〈q_{2},h_{4}〉〈q_{2},h_{ k }〉)/∥r_{4}∥=0, ∀k=5,6,7,8.
This completes the proof of Theorem 2.
Proof of Corollary 1
Using a method similar to that in the proof of Theorem 1, it can be computed from the definition of H_{eq} that 〈h_{5},h_{6}〉=〈h_{5},h_{8}〉=〈h_{6},h_{7}〉=〈h_{7},h_{8}〉=0. In addition, using Theorem 2, it can be obtained that q_{5}=h_{5}/∥h_{5}∥. Hence, 〈q_{5},h_{6}〉=〈q_{5},h_{8}〉=0.
Using 〈q_{5},h_{6}〉=0 and Theorem 2, it yields q_{6}=h_{6}/∥h_{6}∥. Hence, 〈q_{6},h_{7}〉=0.
Finally, using 〈q_{6},h_{7}〉=0 and Theorem 2, it yields r_{7}=h_{7}−〈q_{5},h_{7}〉q_{5} and q_{7}=r_{7}/∥r_{7}∥=(h_{7}−〈q_{5},h_{7}〉q_{5})/∥r_{7}∥. Therefore, 〈q_{7},h_{8}〉=(〈h_{7},h_{8}〉−〈q_{5},h_{7}〉〈q_{5},h_{8}〉)/∥r_{7}∥=0.
This completes the proof of Corollary 1.
Declarations
Acknowledgements
This work has been supported by French ANR ‘Mobile MultiMedia (M3)’ project and ‘Pôle Images & Réseaux’.
Authors’ Affiliations
References
1. Tarokh V, Seshadri N, Calderbank A: Space-time codes for high data rate wireless communication: performance criterion and code construction. IEEE Trans. Inf. Theory 1998, 44(2):744–765. 10.1109/18.661517
2. DVB: TM-MIMO. http://www.dvb.org/groups/TMMIMO. Accessed 29 Jan 2014
3. Nasser Y, Hélard JF, Crussière M: 3D MIMO scheme for broadcasting future digital TV in single-frequency networks. Electron. Lett 2008, 44(13):829–830. 10.1049/el:20080061
4. Liu M, Crussière M, Hélard M, Hélard JF: Distributed MIMO schemes for the future digital video broadcasting. Paper presented at the 20th international conference on telecommunications (ICT). Casablanca, Morocco; 6–8 May 2013.
5. Sharma N, Papadias CB: Full-rate full-diversity linear quasi-orthogonal space-time codes for any number of transmit antennas. EURASIP J. Appl. Signal Process 2004, 2004: 1246–1256. 10.1155/S1110865704402339
6. Dao DN, Yuen C, Tellambura C, Guan YL, Tjhung TT: Four-group decodable space–time block codes. IEEE Trans. Signal Process 2008, 56: 424–430.
7. Biglieri E, Hong Y, Viterbo E: On fast-decodable space-time block codes. IEEE Trans. Inf. Theory 2009, 55(2):524–530.
8. Srinath K, Rajan B: Low ML-decoding complexity, large coding gain, full-rate, full-diversity STBCs for 2×2 and 4×2 MIMO systems. IEEE J. Sel. Topics Signal Process 2009, 3(6):916–927.
9. Sinnokrot MO, Barry JR, Madisetti VK: Embedded Alamouti space-time codes for high rate and low decoding complexity. Paper presented at the 42nd Asilomar conference on signals, systems and computers. Pacific Grove, CA, USA; 3–6 Nov 2008:1749–1753.
10. Ren TP, Guan YL, Yuen C, Shen RJ: Fast-group-decodable space-time block code. Paper presented at the IEEE information theory workshop (ITW). Cairo, Egypt; 6–8 Jan 2010.
11. Ismail A, Fiorina J, Sari H: A new family of low-complexity STBCs for four transmit antennas. IEEE Trans. Wireless Commun 2013, 12(3):1208–1219.
12. Hottinen A, Tirkkonen O, Wichman R: Multi-antenna Transceiver Techniques for 3G and Beyond. West Sussex, England: Wiley; 2003.
13. Dai L, Sfar S, Letaief K: An efficient detector for combined space-time coding and layered processing. IEEE Trans. Commun 2005, 53(9):1438–1442. 10.1109/TCOMM.2005.855016
14. Guo X, Xia XG: On full diversity space-time block codes with partial interference cancellation group decoding. IEEE Trans. Inf. Theory 2009, 55(10):4366–4385.
15. Zhang W, Xu T, Xia XG: Two designs of space-time block codes achieving full diversity with partial interference cancellation group decoding. IEEE Trans. Inf. Theory 2012, 58(2):747–764.
16. Polonen K, Koivunen V: Reduced complexity space-time coding in single-frequency networks. Paper presented at the IEEE wireless communications and networking conference (WCNC). Cancun, Mexico; 28–31 March 2011.
17. Liu M, Hélard M, Crussière M, Hélard JF: Distributed MIMO coding scheme with low decoding complexity for future mobile TV broadcasting. Electron. Lett 2012, 48(17):1079–1081. 10.1049/el.2012.1778
18. Agrell E, Eriksson T, Vardy A, Zeger K: Closest point search in lattices. IEEE Trans. Inf. Theory 2002, 48(8):2201–2214. 10.1109/TIT.2002.800499
19. Sinnokrot M, Barry J: Fast maximum-likelihood decoding of the golden code. IEEE Trans. Wireless Commun 2010, 9: 26–31.
20. Belfiore J, Rekaya G, Viterbo E: The golden code: a 2×2 full-rate space-time code with non-vanishing determinants. IEEE Trans. Inf. Theory 2005, 51(4):1432–1436. 10.1109/TIT.2005.844069
21. Alamouti S: A simple transmit diversity technique for wireless communications. IEEE J. Sel. Areas Commun 1998, 16(8):1451–1458. 10.1109/49.730453
22. Jithamithra G, Rajan BS: Minimizing the complexity of fast sphere decoding of STBCs. Paper presented at the IEEE international symposium on information theory (ISIT). St. Petersburg, Russia; 31 Jul–05 Aug 2011.
23. Sirianunpiboon S, Wu Y, Calderbank A, Howard S: Fast optimal decoding of multiplexed orthogonal designs by conditional optimization. IEEE Trans. Inf. Theory 2010, 56(3):1106–1113.
24. Oggier F, Rekaya G, Belfiore JC, Viterbo E: Perfect space–time block codes. IEEE Trans. Inf. Theory 2006, 52(9):3885–3902.
25. Wong K, Tsui C, Cheng R, Mow W: A VLSI architecture of a K-best lattice decoding algorithm for MIMO channels. Paper presented at the IEEE international symposium on circuits and systems (ISCAS). Scottsdale, AZ, USA; 26–29 May 2002.
26. Barbero L, Thompson J: Fixing the complexity of the sphere decoder for MIMO detection. IEEE Trans. Wireless Commun 2008, 7(6):2131–2142.
27. Schnorr C, Euchner M: Math Program. 1994, 66: 181–191. 10.1007/BF01581144
28. Guo Z, Nilsson P: Reduced complexity Schnorr-Euchner decoding algorithms for MIMO systems. IEEE Commun. Lett 2004, 8(5):286–288. 10.1109/LCOMM.2004.827376
29. Wiesel A, Mestre X, Pages A, Fonollosa J: Efficient implementation of sphere demodulation. Paper presented at the 4th IEEE workshop on signal processing advances in wireless communications (SPAWC). Rome, Italy; 15–18 June 2003.
30. Tsai P, Chen W, Lin X, Huang M: A 4×4 64-QAM reduced-complexity K-best MIMO detector up to 1.5 Gbps. Paper presented at the IEEE international symposium on circuits and systems (ISCAS). Paris, France; 30 May–2 June 2010:3953–3956.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.