Computing tight upper bounds for Bhattacharyya parameters of binary polar code kernels with arbitrary dimension
EURASIP Journal on Wireless Communications and Networking volume 2021, Article number: 76 (2021)
Abstract
Multi-kernel polar codes have recently received considerable attention since they can provide more flexible code lengths than the original polar codes. Their construction can be simplified by obtaining the Bhattacharyya parameter bounds of the kernels employed. However, there is currently no generic method for finding such bounds. In this paper, therefore, we focus on the upper Bhattacharyya parameter bounds of standard binary polar code kernels with an arbitrary dimension \(l\ge 2\). A calculation process consisting of four steps (the common column binary tree construction for the channel inputs, the common factor extraction, the calculation feasibility testing, and the upper bound calculation based on pattern matching) is formulated with a computational complexity of \(O(2^l)\). It is theoretically proved that the upper bounds obtained by the proposed method are tight, which lays the foundation for comparing the reliability of the synthesized channels in polar codes.
1 Introduction
Polar codes, pioneered by Arıkan [1] in 2008, achieve the Shannon capacity with low encoding and decoding complexity and have been adopted as the coding scheme for the control channel of 5G wireless communication systems [2].
Arıkan [1] employed the n-fold Kronecker power of the polarizing kernel matrix \(G_2= \left[ \begin{array}{cc} 1 & 0 \\ 1 & 1 \end{array}\right]\) to perform a linear transformation on an input block of \(N(=2^n)\) bits. By recursively combining and splitting N copies of a Binary-Input Discrete Memoryless Channel (B-DMC), N polarized sub-channels are obtained. The reliability of some of them tends to one, while that of the others tends to zero. However, the code lengths of such polar codes are constrained to \(2^n\), which makes them difficult to apply in narrow-band, low-rate, and real-time communication fields that require flexible medium and short code lengths, such as real-time voice communication. Korada et al. [3] generalized the polar code kernel to an \(l\times l\) (\(l\ge 2\)) invertible matrix denoted by \(G_l\), none of whose column permutations is an upper triangular matrix. Benammar et al. [4] proved that the channel polarization condition still holds for such multi-kernel polar codes. Thus, flexible code lengths can be obtained by applying the Kronecker product
\[G_N=B_N\cdot \left( G_{l_0}\otimes G_{l_1}\otimes \cdots \otimes G_{l_{n-1}}\right) \tag{1}\]
of kernels with various dimensions as a generator matrix in the construction of the polar codes, where \(B_N\) is a permutation matrix and \(N=l_0 l_1\cdots l_{n-1}\). Following this, both the principles and the design of multi-kernel polar codes have become a significant research area in recent years [5,6,7].
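To make the role of the Kronecker product concrete, the following minimal sketch builds a length-6 transformation from a 2×2 and a 3×3 kernel over GF(2). It is only an illustration under stated assumptions: the 3×3 kernel `G3_EXAMPLE` is a hypothetical lower triangular placeholder rather than a kernel taken from the paper, the helper name `kron_gf2` is ours, and the permutation \(B_N\) is omitted.

```python
def kron_gf2(a, b):
    """Kronecker product of two binary matrices (lists of rows) over GF(2)."""
    return [
        [a_ij & b_rs for a_ij in a_row for b_rs in b_row]
        for a_row in a
        for b_row in b
    ]

G2 = [[1, 0],
      [1, 1]]

# Hypothetical 3x3 lower triangular kernel, used only for illustration.
G3_EXAMPLE = [[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1]]

G6 = kron_gf2(G2, G3_EXAMPLE)   # 6 x 6, a length not of the form 2^n
for row in G6:
    print(row)
```

Mixing kernel sizes in this way is what allows code lengths such as 6, 24 or 100 that the pure \(G_2\) construction cannot reach.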
In the construction of polar codes, it is crucial to select the most reliable channels from the synthesized channels, whose reliability can be measured by the Bhattacharyya parameters [1]. However, closed-form expressions for the Bhattacharyya parameters of the synthesized channels are usually unavailable [8]. Generally, for polar code kernels with \(l=2\), the reliability of the synthesized channels can be obtained by Bhattacharyya parameter calculation [1], Monte-Carlo simulation [1], density evolution [9, 10], Gaussian approximation [11], or an approximation based on degrading and upgrading transformations, known as the Tal-Vardy method [12]. For \(l>2\), on the other hand, the Bhattacharyya parameter calculation and Monte-Carlo simulation methods are employed [13]. Among these methods, the Bhattacharyya parameter calculation is simple but only applicable to Binary Erasure Channels (BECs), whereas the Gaussian approximation is applicable to Additive White Gaussian Noise (AWGN) channels. More generally, the Tal-Vardy method and Monte-Carlo simulation can be applied to arbitrary binary discrete memoryless channels. However, all of the constructions based on these five methods depend on the transmission channel conditions, which makes it necessary to construct codes separately for different Signal-to-Noise Ratios (SNRs) [14].
To resolve these problems, general construction methods that are independent of the transmission channel conditions have gained significant attention in recent years. Schürch et al. [15] and Wu et al. [16] proposed the partial order theory of the polarized channels, and He et al. [17] proposed the Polarization Weight (PW) for \(G_2\). Based on the comparison of the kernel channels indicated by the Bhattacharyya parameter bounds, these two theories pointed out that there are unambiguous relationships between some of the synthesized channels, which can be utilized to select more reliable channels in the construction of the polar codes. Investigating the bounds of the Bhattacharyya parameters for kernels with dimension \(l>2\) would allow these theories to be applied to the construction of large-sized and multi-kernel polar codes. Accordingly, Hanif et al. [8] suggested that the kernels’ Bhattacharyya parameter bounds could simplify the construction of the polar codes.
Regarding the Bhattacharyya parameter bounds of the polar code kernels, Arıkan [1] presented the bounds for \(l=2\); however, the lower bound was not tight enough. To address this issue, Korada [18] proposed a much tighter lower bound. As for \(l>2\), Korada et al. [3] studied the relationship between the Bhattacharyya parameter bounds and the partial distances of the kernels and proposed a concise formula for calculating the bounds of the Bhattacharyya parameters. However, neither the lower nor the upper bounds are tight enough. Zhang et al. [19] examined the upper bounds for \(l=3\), but not the lower bounds. Cheng et al. [20] presented a formula to calculate the Bhattacharyya parameters for a given polar code kernel under BECs. However, evaluating this formula becomes very complicated for polar code kernels with large dimensions.
The studies mentioned above reveal that there is not yet a general method for finding tight bounds on the Bhattacharyya parameters of polar code kernels with an arbitrary dimension. Thus, investigating the bounds on the kernels’ Bhattacharyya parameters is of great importance for simplifying the construction of the polar codes, and it can also play a significant role in validating and evaluating the asymptotic speed of the polarization [21]. This paper, therefore, examines the upper Bhattacharyya parameter bounds of polar code kernels with dimension \(l\ge 2\).
The main contributions of this paper are summarized as follows. We show that any k-order sub-matrix of the inputs must have some common columns, which leads to the construction of a common column binary tree. We then propose a process to compute the upper bound of \(Z_l^{(i)}\) utilizing iterative pattern matching and present a computationally feasible criterion to test whether the proposed method can be applied to a given polar code kernel.
The rest of this paper is organized as follows. In Sect. 2, we present notations and definitions. In Sect. 3, we derive the upper bounds of the Bhattacharyya parameter. In Sect. 4, we demonstrate the computation procedure of the upper bound on channel 2 of a polar code kernel with a dimension of five. Section 5 provides a detailed discussion of the findings. Finally, Sect. 6 summarizes the paper and lists some potential directions for future research. The proofs of the properties, lemmas and theorems are provided in the “Appendix”.
1.1 Methods/experimental
This paper consists mainly of theoretical derivation and analysis; no experiments were carried out.
2 Preliminaries
In this section, we first give the symbols and definitions to be employed throughout the paper. Then, we introduce the Bhattacharyya parameter of the polar code kernels.
2.1 Symbols and definitions
Following [1], we denote random variables by upper case letters, e.g., X, Y, and their realizations by the corresponding lower case letters, e.g., x, y, and use \(X^-\), \(Y^-\) to denote their upper bounds. We employ the notation \(a_0^{l-1}\) as the shorthand to denote a row vector \((a_0,a_1,\ldots ,a_{l-1})\) and use \(a_i^j\) to denote its sub-vector \((a_i,a_{i+1},\ldots ,a_j)\), where \(a_i^j\) is void when \(i>j\). Later, we will abbreviate \(a_0^{l-1}\) as the corresponding bold character \({\varvec{a}}\).
For a matrix \({\varvec{T}}\), we use \(T_{i,j}\) to denote the element in row i and column j, \({\varvec{T}}_{i,:}\) and \({\varvec{T}}_{:,j}\) to denote, respectively, the ith row vector and the jth column vector, and \({\varvec{T}}_{i::m,:}\) to denote the sub-matrix composed of the row vectors of \({\varvec{T}}\) with starting index i and interval m. We use \(\left[ {\varvec{T}}^{(0)}:{\varvec{T}}^{(1)} \right]\) to denote the matrix obtained by combining \({\varvec{T}}^{(0)}\) and \({\varvec{T}}^{(1)}\), which have the same column size, in the row direction.
For an integer k, we employ \((b_0^k,b_1^k,\ldots ,b_{l-1}^k)\), denoted by b(k, l), to represent its l-bit binary expansion with the most significant bit on the left.
Furthermore, \(W: X\rightarrow Y\) represents a symmetric B-DMC transmission channel with input alphabet \(X=\{0,1\}\), output alphabet Y, and transition probabilities denoted by \(W(y|x), x \in X, y \in Y\).
All vectors, matrices, and their operations are defined in GF(2).
Definition 1
For an integer k, the reverse order shuffle operation reverses the order of the bits in its l-bit binary expansion and is denoted by r(k, l). For example, \(r(22,6)=26\) \((010110 \rightarrow 011010)\).
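Definition 1 can be sketched in a few lines of Python. The function name `reverse_order` is our own choice, not notation from the paper.

```python
def reverse_order(k: int, l: int) -> int:
    """Return r(k, l): the integer whose l-bit expansion is that of k reversed."""
    if not 0 <= k < 2 ** l:
        raise ValueError("k must fit in l bits")
    bits = format(k, f"0{l}b")    # l-bit expansion, most significant bit first
    return int(bits[::-1], 2)     # read the reversed bit string as an integer

print(reverse_order(22, 6))       # 010110 -> 011010, i.e. 26
```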
Definition 2
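For a non-negative integer x and a positive integer vector \(b_0^{n-1}\), we employ \((m_{n-1}m_{n-2}\ldots m_{0})|_{b_0^{n-1}}\) to denote the n-digit mixed-nary representation of x where \(b_0^{n-1}\) is the base vector, \(m_i=Q_i \bmod b_{n-i-1}\) and
\[Q_i=\left\{ \begin{array}{ll} x, &amp; i=0\\ \left\lfloor Q_{i-1}/b_{n-i}\right\rfloor , &amp; 1\le i\le n-1 \end{array}\right. \tag{2}\]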
The operation of calculating the mixed-nary representation under the base vector \(b_0^{n-1}\) for x is denoted as M(x,\(b_0^{n-1}\)).
Taking the decimal number 37 as an example, its three-digit mixed-nary representation under base vector [2, 3, 8] is M(37,[2,3,8])=\((115)|_{[2,3,8]}\). The base vector indicates that the digit 1 on the left of the mixed-nary representation is in binary, the middle digit 1 is in ternary, and the right digit 5 is in octonary.
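The mixed-nary representation of Definition 2 amounts to repeated division by the base digits, starting from the rightmost one. A minimal sketch follows; the function name `mixed_nary` is ours, and the digits are returned in the written order \(m_{n-1}\ldots m_0\).

```python
def mixed_nary(x, base):
    """Return [m_{n-1}, ..., m_0], the mixed-nary digits of x under `base`."""
    n = len(base)
    digits = []
    q = x
    for i in range(n):
        b = base[n - i - 1]       # digit m_i uses base entry b_{n-i-1}
        digits.append(q % b)
        q //= b
    if q != 0:
        raise ValueError("x does not fit in the given base vector")
    return digits[::-1]           # most significant digit first

print(mixed_nary(37, [2, 3, 8]))  # [1, 1, 5]
```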
Definition 3
For an \(n\times l\) binary matrix \({\varvec{T}}\) and a certain operation \(f(\cdot )\), we define a boolean vector \(c_0^{l-1}\) as the valid column indicator, abbreviated as VCI, of \({\varvec{T}}\) for \(f(\cdot )\) with \(c_j=1\) to indicate that the elements in column j of \({\varvec{T}}\) are involved in \(f(\cdot )\) for \(\forall j \in [0,l)\). Correspondingly, \({\varvec{c}}_0^{l-1}\) is called the VCI of each row vector of \({\varvec{T}}_{k,:}(0 \le k<n)\) for \(f(\cdot )\).
Definition 4
For an \(n\times l\) binary matrix \({\varvec{T}}\) with a VCI of \({\varvec{c}}\), we define a boolean vector \(\lambda _0^{l-1}\) as the common column indicator, abbreviated as CCI, of \({\varvec{T}}\), where \(\lambda _j=1\) indicates that all the elements in column j of \({\varvec{T}}\) are the same and \(c_j=1\), for \(\forall j \in [0,l)\). We define \(\lambda _0^{l-1}=g({\varvec{T}},{\varvec{c}})\) to denote the operation of calculating the CCI of \({\varvec{T}}\) under the VCI of \({\varvec{c}}\). Correspondingly, \({\varvec{\gamma }}_0^{l-1}={\varvec{\lambda }}_0^{l-1}\wedge {\varvec{T}}_{0,:}\) is referred to as the common column vector, abbreviated as CCV, of \({\varvec{T}}\) with the VCI of \({\varvec{c}}\).
Definition 5
For two vectors \({\varvec{x}}^{(0)}\) and \({\varvec{x}}^{(1)}\) with the same VCI of \({\varvec{c}}\) for an operation \(f(\cdot )\), \(({\varvec{x}}^{(0)}\), \({\varvec{x}}^{(1)})\) is defined as a mutually different vector pair under \({\varvec{c}}\) for \(f(\cdot )\) if \({\varvec{x}}^{(0)}\oplus {\varvec{x}}^{(1)}={\varvec{c}}\).
Definition 6
For a \(2^m\times l\) binary matrix \({\varvec{T}}\), \({\varvec{T}}_{i::2^k,:}\) (\(0\le k \le m\), \(0\le i \le 2^{k}-1\)) is defined as its ith k-order sub-matrix.
It is easy to infer that the 0-order sub-matrix of \({\varvec{T}}\) is \({\varvec{T}}\) itself, and each m-order sub-matrix of \({\varvec{T}}\) consists of a single row.
Definition 7
An \(l\times l\) binary invertible lower triangular matrix \(G_l\), all of whose diagonal elements are 1, is defined as a standard binary polar code kernel with dimension l [3].
Definition 8
For an input vector \(U_0^{l-1}\), which is randomly and uniformly distributed in \(\{0,1\}_0^{l-1}\), the linear transformation sequence of its polarized kernel \(G_l\) is defined by
\[X_0^{l-1}=U_0^{l-1}\cdot G_l. \tag{4}\]
Here, \(\{W_l:X^l\rightarrow Y^l\}\) is defined as a combined channel under polar code kernel \(G_l\) when \(X_0^{l-1}\) is sequentially transmitted through the channel \(W:X\rightarrow Y\). Thus, the transition probability of \(W_l\) is
\[W_l\left( y_0^{l-1}|x_0^{l-1}\right) =\prod _{j=0}^{l-1}W(y_j|x_j). \tag{5}\]
Definition 9
For \(X_0^{l-1}\) defined in (4) and its output \(Y_0^{l-1}\) of a combined channel \(\{W_l:X^l\rightarrow Y^l\}\), the virtual channel \(\{W_l^{(i)}:X\rightarrow Y^l\times X^{i},0\le i<l\}\) formed by the channel splitting under the Successive Cancellation (SC) decoder in [1] is defined as a polar code kernel channel of \(G_l\).
The transition probability of \(W_l^{(i)}\) with input \(u_i\) and output \((y_0^{l-1},u_0^{i-1})\) is defined in [3] as
\[W_l^{(i)}\left( y_0^{l-1},u_0^{i-1}|u_i\right) =\frac{1}{2^{l-1}}\sum _{u_{i+1}^{l-1}}W_l\left( y_0^{l-1}|u_0^{l-1}\cdot G_l\right) , \tag{6}\]
where the values of \(u_0^{i-1}\) are evaluated sequentially from 0 to \(i-1\) prior to \(u_i\).
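Considering \(u_{i+1}^{l-1} \in \{0,1\}^{n}\), we construct an input matrix \({\varvec{v}}^{(i,u_i)}\) with a size of \(N \times l\), whose kth row is
\[{\varvec{v}}_{k,:}^{(i,u_i)}=\left( u_0^{i-1},u_i,b(k,n)\right) , \tag{7}\]
where \(n=l-i-1\), \(N=2^{n}\), \(u_0^{i-1}=0_0^{i-1}\) and \(0 \le k < N.\) The variables n and N will be used throughout the paper.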
The linear transformation matrix is constructed from \({\varvec{v}}^{(i,u_i)}\) utilizing \(G_l\) with \({\varvec{x}}^{(i,u_i)}={\varvec{v}}^{(i,u_i)}\cdot G_l\) according to (4). The elements in \({\varvec{x}}^{(i,u_i)}\) possess the following property.
Property 1
For \(\forall k\in [1,n)\) and \(\forall s,t\in [0,2^k)\), \(g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right) =g\left( {\varvec{x}}_{t::2^k,:}^{(i,u_i)},1_0^{l-1}\right) \ne 0_0^{l-1}\) holds. This means that every k-order sub-matrix of \({\varvec{x}}^{(i,u_i)}\) must have some common columns, and the CCIs of all sub-matrices of the same order are the same.
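Property 1 can be checked numerically on a small case. The sketch below builds \({\varvec{x}}^{(i,u_i)}\) for channel 2 of the \(5\times 5\) kernel used in Sect. 4 and compares the CCIs of its 1-order sub-matrices. Some assumptions are made: the three kernel rows are read off from the matrices \({\varvec{x}}^{(2,0)}\) and \({\varvec{x}}^{(2,1)}\) quoted in Sect. 4 rather than from the kernel definition itself, n is taken as the number of free bits \(u_{i+1}^{l-1}\), and the helper names `build_x` and `cci` are ours.

```python
def build_x(rows_from_i, u_i):
    """Rows of x^(i,u_i) = v^(i,u_i) * G_l over GF(2), with u_0..u_{i-1} = 0.

    rows_from_i = [G_{i,:}, G_{i+1,:}, ..., G_{l-1,:}]. Row k of the result uses
    the n-bit expansion of k (MSB first) as the free bits (u_{i+1}, ..., u_{l-1}).
    """
    g_i, later = rows_from_i[0], rows_from_i[1:]
    n = len(later)
    x = []
    for k in range(2 ** n):
        bits = [(k >> (n - 1 - t)) & 1 for t in range(n)]
        row = list(g_i) if u_i else [0] * len(g_i)
        for bit, g in zip(bits, later):
            if bit:
                row = [a ^ b for a, b in zip(row, g)]
        x.append(row)
    return x

def cci(T, c):
    """Common column indicator: 1 where c_j = 1 and column j of T is constant."""
    return [int(c[j] == 1 and len({r[j] for r in T}) == 1) for j in range(len(c))]

# Rows 2, 3 and 4 of the 5x5 kernel, inferred from the matrices in Sect. 4.
ROWS = [[1, 0, 1, 0, 0], [1, 0, 0, 1, 0], [1, 1, 1, 0, 1]]
ALL_ONES = [1] * 5

for u in (0, 1):
    x = build_x(ROWS, u)
    k = 1                                   # the only order in [1, n) for n = 2
    ccis = [cci(x[s::2 ** k], ALL_ONES) for s in range(2 ** k)]
    assert all(c == ccis[0] for c in ccis) and any(ccis[0])
    print(u, ccis[0])
```

For both values of \(u_2\), the two 1-order sub-matrices share the CCI [0, 1, 1, 0, 1], in line with the property.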
2.2 The Bhattacharyya parameters of the polar code kernels
According to [1], the Bhattacharyya parameter of a transmission channel \(W:X\rightarrow Y\) is defined by
\[Z(W)=\sum _{y\in Y}\sqrt{W(y|0)W(y|1)}. \tag{8}\]
Similarly, the Bhattacharyya parameter of \(W_l^{(i)}\) shown in (6), referred to as the ith Bhattacharyya parameter of the polar code kernel \(G_l\), can be denoted by
\[Z_l^{(i)}=\sum _{y_0^{l-1},u_0^{i-1}}\sqrt{W_l^{(i)}\left( y_0^{l-1},u_0^{i-1}|0\right) W_l^{(i)}\left( y_0^{l-1},u_0^{i-1}|1\right) }. \tag{9}\]
Since \(U_0^{l-1}\) is distributed uniformly in \(\{0,1\}_0^{l-1}\), for any function \(\varphi (\cdot )\) on \(u_k(k<i)\) where i is the index of \(W_l^{(i)}\) shown in (9), \(\sum _{u_k}\varphi (u_k)=2\varphi (0)=2\varphi (1)\) holds. Thus, according to (4), the expression of \(Z_l^{(i)}\) shown in (9), where \(u_0^{i-1}\) is set to \(0_0^{i-1}\), could be rewritten as
where \(f(\cdot )\) is defined by
and \(c_0^{l-1}\) is the VCI of \(x_0^{l-1}\) for \(f(\cdot )\). In the calculation process of (10), the initial VCI of \({\varvec{x}}_0^{(i,u_i)}\) for \(f(\cdot )\) is \(1_0^{l-1}\).
According to (11), we can easily derive that for two VCIs denoted by \({\varvec{c}}^{(0)}\) and \({\varvec{c}}^{(1)}\) of \(x_0^{l-1}\), if \({\varvec{c}}^{(0)}\wedge {\varvec{c}}^{(1)} = 0_0^{l-1}\), then
Considering the Bhattacharyya parameter of the last channel \(W_l^{(l-1)}\) for a polar code kernel \(G_l\), we attain \({\varvec{x}}^{\left( l-1,u_{l-1}=0 \right) }=0_0^{l-1}\) and \({\varvec{x}}^{\left( l-1,u_{l-1}=1 \right) }=G_{l-1,:}\) with \({\varvec{u}}_0^{l-2}\) being set to zeros according to (4) and (7). Furthermore, we can obtain \(Z_l^{(l-1)}=\sum _{y_0^{l-1}}\sqrt{f \left( 0_0^{l-1},1_0^{l-1} \right) \cdot f \left( G_{l-1,:},1_0^{l-1} \right) }\) according to (10). Since \(\sum _{y_j\in Y}\sqrt{W(y_j |0)W(y_j |1)}=Z(W)\) for every position j in which \(G_{l-1,:}\) is 1, and \(\sum _{y_j\in Y}W(y_j|0)=1\) for every position in which it is 0, we can get
\[Z_l^{(l-1)}=Z^{\sum G_{l-1,:}}(W), \tag{13}\]
where \(\sum G_{l-1,:}\) is the number of ones in the last row of \(G_l\).
It can be seen from (10) that as l increases, the composition of \(Z_l^{(i)}\) may become much more complicated, which makes it difficult to calculate the bounds of \(Z_l^{(i)}\). Since the value of \(Z_l^{(l-1)}\) can be directly calculated by (13), we mainly research the upper bounds of \(Z_l^{(i)}\) for \(0\le i\le l-2\) in this paper.
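As a sanity check of the last-channel case, the sum \(\sum _{y_0^{l-1}}\sqrt{f(0_0^{l-1},1_0^{l-1})\cdot f(G_{l-1,:},1_0^{l-1})}\) given above can be evaluated by brute force and compared with \(Z(W)\) raised to the number of ones in the last kernel row. The sketch assumes a binary symmetric channel with crossover probability p, for which \(Z(W)=2\sqrt{p(1-p)}\), and uses the row 11101, which is the last row of the \(5\times 5\) kernel as inferred from the matrices in Sect. 4. The function names are ours.

```python
from itertools import product
from math import isclose, prod, sqrt

def bsc(y, x, p):
    """Transition probability W(y|x) of a BSC with crossover probability p."""
    return p if y != x else 1.0 - p

def last_channel_z(last_row, p):
    """Brute-force sum over all outputs y of sqrt(f(0, 1) * f(last_row, 1))."""
    total = 0.0
    for y in product((0, 1), repeat=len(last_row)):
        f0 = prod(bsc(yj, 0, p) for yj in y)
        f1 = prod(bsc(yj, gj, p) for yj, gj in zip(y, last_row))
        total += sqrt(f0 * f1)
    return total

LAST_ROW = (1, 1, 1, 0, 1)        # assumed last row, weight 4
p = 0.1
z_w = 2.0 * sqrt(p * (1.0 - p))   # Bhattacharyya parameter of the BSC
print(last_channel_z(LAST_ROW, p), z_w ** sum(LAST_ROW))
```

The two printed values agree up to floating-point rounding because the sum factorizes position by position.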
3 The upper Bhattacharyya parameter bound
In this section, we first construct a k-order sub-matrix common column binary tree for the polar code kernel channel inputs. Then, we propose a process to calculate the upper bound of \(Z_l^{(i)}\) utilizing iterative pattern matching.
Lemma 1
If a, b, c, and d are non-negative real numbers, then the following inequality from [1] holds:
Lemma 2
For two mutually different vector pairs denoted by \(({\varvec{x}}_{0,:},{\varvec{x}}_{1,:})\) and \(({\varvec{x}}_{2,:},{\varvec{x}}_{3,:})\) with a VCI of \({\varvec{c}}\), the following inequality
holds, where \({\varvec{\lambda }}={\varvec{x}}_{0,:}\oplus {\varvec{x}}_{2,:}\).
3.1 The common column binary tree
According to Definition 4 and Property 1, we can extract a CCV for each k-order sub-matrix of \({\varvec{x}}^{(i,u_i)}\), and the extracting process can be divided into (\(n+1\)) stages ranging from 0 to n. For any stage \(k \in [0,n]\), all the \(2^k\) sub-matrices of \({\varvec{x}}^{(i,u_i)}\) have the same VCI \({\varvec{c}}_k\) and CCI \({\varvec{\lambda }}_k\). The \(\varvec{\lambda }_k\), \({\varvec{c}}_k\), and the CCV \({\varvec{\gamma }}_{k,j}^{(i,u_i)}\) of each sub-matrix can be calculated by
By doing so, a common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) can be constructed as shown in Fig. 1.
Suppose that
if there are equal paths from the child nodes of \({\varvec{\gamma }}_{k,j1}^{(i,e1)}\) and \({\varvec{\gamma }}_{k,j2}^{(i,e2)}\) to their corresponding leaf nodes in the tree shown in Fig. 1, then \(s\left( {\varvec{\gamma }}_{k,j1}^{(i,e1)}\right) =s\left( {\varvec{\gamma }}_{k,j2}^{(i,e2)}\right)\) where e1 and e2 are the instantiated values of \(u_i\).
According to (19), the expression of \(Z_l^{(i)}\) in (10) can be transformed into
Considering that \(\sum _{y\in Y} {W(y_i|x_i)}=1\) and (8), we express \({\varvec{h}} = {\varvec{\gamma }}_{0,0}^{(i,0)} \oplus {\varvec{\gamma }}_{0,0}^{(i,1)} \wedge {\varvec{c}}_0\). It can be derived that \(\sum _{y_0^{l-1}} \sqrt{f\left( {\varvec{\gamma }}_{0,0}^{(i,0)},{\varvec{c}}_0 \right) \cdot f\left( {\varvec{\gamma }}_{0,0}^{(i,1)},{\varvec{c}}_0 \right) }=Z^{\sum {\varvec{h}}}(W)\). Thus, the expression of \(Z_l^{(i)}\) in (20) can be further transformed into:
The common column binary tree of \({\varvec{x}}^{(i,u_i)}\) shown in Fig. 1 has the following properties.
Property 2
If \(s\left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) =s\left( {\varvec{\gamma }}_{1,1}^{(i,0)}\right)\), then \(Z_l^{(i)}\) in (21) has a common factor defined by
that matches (14).
Property 3
If \(s\left( {\varvec{\gamma }}_{2,0}^{(i,0)}\right) =s\left( {\varvec{\gamma }}_{2,1}^{(i,0)}\right)\) and \({\varvec{\gamma }}_{2,0}^{(i,0)}+{\varvec{\gamma }}_{2,1}^{(i,0)}={\varvec{\gamma }}_{2,2}^{(i,0)}+{\varvec{\gamma }}_{2,3}^{(i,0)}\), then \(Z_l^{(i)}\) in (21) has a common factor defined by
that matches (14).
Since \(s\left( {\varvec{\gamma }}_{2,0}^{(i,0)}\right)\) and \(s\left( {\varvec{\gamma }}_{2,1}^{(i,0)}\right)\) are mutually different vectors with the VCI of \({\varvec{c}}_2\), CM1 and CM2 cannot coexist. Whichever of them exists can be extracted from \(Z_l^{(i)}\) shown in (21) so that its upper bound is calculated separately. After the extraction, \({\varvec{\gamma }}^{(i,u_i)}\) must be reconstructed for the remaining items of \(Z_l^{(i)}\) shown in (21), which will be explained in detail later.
Theorem 1
For the reconstructed \({\varvec{\gamma }}^{(i,u_i)}\), if \(\exists j \in \{0,1\}\) makes \(s \left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) =s \left( {\varvec{\gamma }}_{1,j}^{(i,1)} \right)\) hold, then \(\sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{0,0}^{(i,0)}\right) \cdot s \left( {\varvec{\gamma }}_{0,0}^{(i,1)}\right) }\le Z_0\), where \(Z_0\) can be iteratively calculated by
and \(\hat{{\varvec{ \lambda }}}={\varvec{\gamma }}_{n,0}^{(i,0)}\oplus {\varvec{\gamma }}_{n,2}^{(i,0)}\), \({\varvec{\lambda }}^*={\varvec{\gamma }}_{1,0}^{(i,0)}\oplus {\varvec{ \gamma }}_{1,1-j}^{(i,1)}\wedge {\varvec{\lambda }}_{1,:}\), and \({\varvec{\lambda '}}={\varvec{\lambda }}_{1,:}\oplus {\varvec{ \lambda }}^*\).
3.2 The calculation of the upper bound
According to Property 2, Property 3 and Theorem 1, we construct the following process to calculate the upper bound of \(Z_l^{(i)}\) for a general polar code kernel \(G_l\).
3.2.1 The common column binary tree construction
The k-order common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) of \({\varvec{x}}^{(i,u_i)}\) can be constructed stage by stage according to (16), (17) and (18). As shown in Fig. 1, the item \({\varvec{\gamma }}_{n,j}^{(i,u_i)}\) in the last stage corresponds to \({\varvec{x}}_{r(j),:}^{(i,u_i)}\); thus, \({\varvec{\gamma }}^{(i,u_i)}\) can be constructed rapidly from right to left.
3.2.2 The common factor extraction
According to Property 2 and Property 3, we construct Algorithm 1 to extract the common factor of \(Z_l^{(i)}\), namely the CM1 or the CM2. The input parameters include the common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) and its CCI matrix \({\varvec{\lambda }}\). The output is a tuple \(({\varvec{c}}^*,{\varvec{r}}_0,{\varvec{r}}_1,{\varvec{r}}_2,{\varvec{r}}_3)\), where \({\varvec{r}}_i\) corresponds to \({\varvec{x}}_{i,:}\) in (15), and \({\varvec{c}}^*\) denotes the VCI of \({\varvec{r}}_i\). If neither CM1 nor CM2 exists, the return value of \({\varvec{c}}^*\) is \(0_0^{l-1}\). In Step 5, \({\varvec{\gamma }}^{(i,u_i)}\) is reconstructed due to the common factor extraction.

3.2.3 The calculation feasibility testing
According to the conditions of Theorem 1, for the reconstructed \({\varvec{\gamma }}^{(i,e)}\) after conducting the common factor extraction mentioned above, if \(\exists j\) leads to \(s( {\varvec{\gamma }}_{1,0}^{(i,0)})=s({\varvec{\gamma }}_{1,j}^{(i,1)})\), it is feasible to employ the proposed method to calculate the upper Bhattacharyya parameter bound. Otherwise, the upper bound cannot be calculated with the proposed method.
3.2.4 The upper bound calculation based on pattern matching
The upper Bhattacharyya parameter bound consists of two parts: (1) the bounds of the CM1 and the CM2 and (2) the bounds of the remaining part of the reconstructed \({\varvec{\gamma }}^{(i,u_i)}\). The upper bounds of these two parts can be calculated by matching (14) and Theorem 1, respectively.
Thus, the upper bound of \(Z_l^{(i)}\) can be calculated as
where \(Z^{'}\) denotes the upper bound part contributed by both CM1 and CM2, which can be calculated according to (14) where \(Z^{'}=2\cdot Z^{\sum {\varvec{\lambda }}^*}(W)+2\cdot Z^{\sum {\varvec{c}}^*-\sum {\varvec{ \lambda }}^*}(W)-2\cdot Z^{\sum {\varvec{c}}^*}(W)\) if CM1 or CM2 exists, otherwise \(Z'\) is set to 1.
4 Illustrative examples
In this section, we utilize the following \(5\times 5\) polar code kernel as an illustrative example to demonstrate the computation of the upper bound of \(Z_5^{(2)}\):
Before the upper bound is computed, the parameters are initialized as \(n=l-i-1=2\), \(N=2^n=4\), \({\varvec{x}}^{(2,0)}=[00000, 11101, 10010, 01111]\), and \({\varvec{x}}^{(2,1)}=[10100, 01001, 00110, 11011]\). The upper bound computation then follows the four main steps below:
1. The common column binary tree construction. We construct the common column binary tree of \({\varvec{x}}^{(2,u_2)}\) shown in Fig. 2 according to (16), (17) and (18). The value of \({\varvec{h}}\) in (21) is calculated as [00000].
2. The common factor extraction. As calculated by Algorithm 1, \({\varvec{x}}^{(2,u_2)}\) has no common factor, i.e., \({\varvec{c}}^*=\)[00000].
3. The calculation feasibility testing. Since \(s\left( {\varvec{\gamma }}_{1,0}^{(2,0)} \right) =s \left( {\varvec{\gamma }}_{1,1}^{(2,1)} \right)\), it is feasible to calculate the upper bound with \(j=1\).
4. The upper bound calculation by pattern matching. Since \({\varvec{x}}^{(2,u_2)}\) has no common factor, \(Z'\) in (25) is set to one. With \(i=2\) and \(j=1\), we calculate \({\varvec{\lambda }}^*={\varvec{\gamma }}_{1,0}^{(i,0)}\oplus {\varvec{\gamma }}_{1,1-j}^{(i,1)} \wedge {\varvec{ \lambda }}_{1,:} =[00100]\) and \({\varvec{\lambda '}}={\varvec{\lambda }}_{1,:}\oplus {\varvec{\lambda }}^*=[01001]\). According to (24), we compute \(Z_1=2\cdot Z(W)+2\cdot Z^2 (W)-2\cdot Z^3 (W)\) and \(Z_0=4\cdot Z^5 (W)-8\cdot Z^4 (W)-4\cdot Z^3 (W)+12\cdot Z^2 (W)\).
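According to (25), with \(Z'=1\), \({\varvec{h}}=[00000]\) and \(2^n=4\), the upper bound of \(Z_5^{(2)}\) is computed by
\[Z_5^{(2)}\le \frac{1}{4}\cdot Z_0=Z^5(W)-2\cdot Z^4(W)-Z^3(W)+3\cdot Z^2(W). \tag{27}\]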
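The result of this example can be cross-checked on a BEC, where the Bhattacharyya parameter of a kernel channel equals its erasure probability and, as noted later in this section, the parameters attain their upper bounds. The sketch below enumerates all erasure patterns for channel 2 and compares the exact erasure probability with the polynomial obtained by dividing the \(Z_0\) quoted above by \(2^n=4\). Two assumptions are made: the bound reduces to \(Z_0/2^n\) here because \(Z'=1\) and \({\varvec{h}}\) is all-zero, and rows 2 to 4 of the kernel are those inferred from \({\varvec{x}}^{(2,0)}\) and \({\varvec{x}}^{(2,1)}\). All function and constant names are ours.

```python
from itertools import product

def bec_erasure_probability(g_i, later_rows, eps):
    """Exact erasure probability of u_i over a BEC(eps) with u_0..u_{i-1} known.

    u_i cannot be recovered iff some word g_i + (combination of later rows)
    is zero on every unerased position.
    """
    l = len(g_i)
    coset = []
    for coeffs in product((0, 1), repeat=len(later_rows)):
        word = list(g_i)
        for coeff, row in zip(coeffs, later_rows):
            if coeff:
                word = [a ^ b for a, b in zip(word, row)]
        coset.append(word)
    total = 0.0
    for erased in product((0, 1), repeat=l):          # 1 marks an erased position
        if any(all(e or not w for e, w in zip(erased, word)) for word in coset):
            k = sum(erased)
            total += eps ** k * (1 - eps) ** (l - k)
    return total

def quoted_bound(z):
    """Z_0 / 4 with Z_0 as quoted in step 4 of the example."""
    return z ** 5 - 2 * z ** 4 - z ** 3 + 3 * z ** 2

G5_ROW_2 = (1, 0, 1, 0, 0)                            # inferred from Sect. 4
G5_ROWS_3_4 = [(1, 0, 0, 1, 0), (1, 1, 1, 0, 1)]      # inferred from Sect. 4

for eps in (0.1, 0.25, 0.5, 0.9):
    exact = bec_erasure_probability(G5_ROW_2, G5_ROWS_3_4, eps)
    print(eps, round(exact, 6), round(quoted_bound(eps), 6))
```

The two columns coincide for every erasure probability tried, which is consistent with the bound being attained on BECs.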
Similarly, the upper Bhattacharyya parameter bounds of \(G_5\)’s other channels listed in “Appendix 6” are illustrated in Fig. 3. It could be seen from the figure that the reliability of all channels except for channel 0 is significantly improved compared to the transmission channel W when \(Z(W)<0.23\).
In the “Appendix,” we provide the Bhattacharyya parameter bounds of the polar code kernels with a dimension varying from 2 to 6 listed in [22]. According to [1], when W is a BEC channel, all the polar code kernels’ Bhattacharyya parameters take their upper limits and satisfy the equality defined by
Since all the upper Bhattacharyya parameter bounds of the polar code kernels with dimension \(l\in [2,6]\) listed in “Appendix 6” satisfy (28), the results generated by the proposed method are corroborated.
5 Results and discussion
In this section, we first summarize the results of this article and then discuss the computational complexity, the application scope of the proposed method and the polarization effects of some multi-kernel polar codes. Finally, we make some comparisons between the proposed method and the schemes in [3, 20] and demonstrate the possible application of the results of this paper in the construction of multi-kernel polar codes.
5.1 The results and computational complexity
No experiments were carried out since the paper consists mainly of theoretical derivation and analysis.
As a result of theoretical reasoning, we gave a computation process based on the construction of a common column binary tree and pattern matching, and the resulting upper bounds are tight.
For the calculation of the upper Bhattacharyya parameter bounds for a polar code kernel \(G_l\) with dimension l, the main part is to construct the sub-matrix common column binary tree of \({\varvec{x}}^{(i,u_i)}\), which requires traversing a total of \(2^{l-i-1}+2^{l-i-2}+\cdots +2^0=2^{l-i}-1\) nodes. Thus, the computational complexity is \(O(2^l)\).
5.2 The scope of application
The computation of the upper Bhattacharyya parameter bounds, however, needs to meet certain conditions, which are checked by the calculation feasibility testing in this paper. It is pointed out in [22] that there is more than one form of polar kernel of dimension \(l(>2)\) with the same exponent. As an illustrative example, consider two \(6\times 6\) polar kernels,
\(G_6^{(0)}=\left[ \begin{array}{ccccccccccc} 1&{}0&{}0&{}0&{}0&{}0\\ 1&{}1&{}0&{}0&{}0&{}0\\ 0&{}1&{}1&{}0&{}0&{}0\\ 1&{}0&{}0&{}1&{}0&{}0\\ 1&{}1&{}0&{}1&{}1&{}0\\ 0&{}1&{}1&{}0&{}1&{}1 \end{array}\right]\) in “Appendix 4” and \(G_6^{(1)}=\left[ \begin{array}{ccccccccccc} 1&{}0&{}0&{}0&{}0&{}0\\ 1&{}1&{}0&{}0&{}0&{}0\\ 1&{}0&{}1&{}0&{}0&{}0\\ 1&{}0&{}0&{}1&{}0&{}0\\ 1&{}1&{}1&{}0&{}1&{}0\\ 1&{}1&{}0&{}1&{}0&{}1 \end{array}\right]\) in [22], both of which have an exponent of 0.451328. The upper Bhattacharyya parameter bound of each channel of \(G_6^{(0)}\) can be calculated by the method proposed in this paper. However, the calculation for channels 1, 2 and 3 of \(G_6^{(1)}\) is not feasible according to Theorem 1. Hence, among kernels with the same exponent, one can search for those whose upper Bhattacharyya parameter bounds can be computed by the proposed method.
5.3 Polarization effect
According to [1], the polarization effect of the original polar codes improves as the code lengths increase. In order to examine such effects of multi-kernel polar codes, \(G_2\) and \(G_5\) in “Appendix 6” are employed as illustrative examples to construct multi-kernel polar codes with length \(N=\)100, 500 and 1000.
The polarization effect of these multi-kernel polar codes and of the original polar codes with code length N=128, 512 and 1024 is illustrated in Fig. 4 for the case in which W is a BEC with erasure probability \(\epsilon\)=0.5. The symmetric capacity values are computed according to [1] based on the upper Bhattacharyya parameter bounds of \(G_2\) and \(G_5\) in “Appendix 6”.
The result in Fig. 4 shows that the multi-kernel polar codes composed of \(G_2\) and \(G_5\) have a polarization effect similar to that of the original polar codes.
5.4 Comparisons and analyses
The upper Bhattacharyya parameter bounds of a kernel are pertinent to its partial distance [3]. By utilizing \(G_5\) in (26) as an illustrative example, we compare the upper bounds computed by the proposed method and those by [3], where the partial distance of \(G_5\) is (1, 2, 2, 2, 4) in [3]. Table 1 shows the upper Bhattacharyya parameter bounds of \(G_5\) computed by the two methods.
As shown in Table 1, for \(Z\in [0,1]\), the upper bounds for each channel, except for channel 4, of \(G_5\) provided by the method proposed in this paper are tighter than those of [3].
Compared with [20], the upper bounds of the kernels with dimensions of three and four calculated by the proposed method are the same as the Bhattacharyya parameter expressions under BECs given in [20]. However, for kernels with dimension \(l\ge 5\), it is difficult to compute the upper bounds by the method in [20] since it is essentially an exhaustive computational method.
5.5 Illustrative application in construction of multi-kernel polar codes
The multi-kernel polar codes allow for more flexibility in terms of the code length than the original polar codes [7] for the enhanced mobile broadband (eMBB) control channel for the 5th generation (5G) of wireless communications. Here we consider the application of upper Bhattacharyya parameter bounds in the construction of multi-kernel polar codes for BECs, which requires selecting the most reliable ones from all polarized channels to transmit information [1].
Taking the instantiated expression
of \(G_N\) in (1) as an example, the Tanner graph shown in Fig. 5 of the multi-kernel polar code with the generator matrix of \(G_{24}\) can be constructed as in [7]. As shown in Fig. 5 for \(G_{24}\), the Tanner graph of \(G_N\) can be divided into n stages indexed from right to left.
For a multi-kernel polar code P with a generator matrix shown in (1), the following property and theorems can be established.
Property 1
For P’s one channel indexed by i, whose mixed-nary representation under base vector \(l_0^{n-1}\) is \((m_{n-1}m_{n-2}\ldots m_{0})|_{l_0^{n-1}}\), the digit \(m_k\) (\(0 \le k \le n-1\)) corresponds to the subchannel \(m_k\) of the polar code kernel \(G_{l_k}\) at stage \(l_k\).
Since Property 1 is simply derived from the Tanner graph of the polar code as shown in Fig. 5, the proof is omitted.
Theorem 2
For P’s two polarized channels indexed by i and j, whose mixed-nary representations are \([M(p,l_{s+1}^{n-1} ),a,M(q,l_0^{s-1} )]\) and \([M(p,l_{s+1}^{n-1} ),b,M(q,l_0^{s-1} )],\) respectively, if the Bhattacharyya parameters of kernel \(G_{l_s}\) satisfy \(Z_{l_s}^{(a)} \le Z_{l_s}^{(b)}\), then \(Z_N^{(i)} \le Z_N^{(j)}\) holds.
Theorem 3
For P’s two polarized channels indexed by i and j, and \(G_{l_s}\) and \(G_{l_t}\) (\(0 \le s < t \le n-1\)) are both equal to \(G_2\) listed in “Appendix 6”, if the mixed-nary representations for i and j can be expressed as \([M(p,l_{t+1}^{n-1} ),1,M(q,l_{s+1}^{t-1} ),0,M(r,l_0^{s-1} )]\) and \([M(p,l_{t+1}^{n-1} ),0,M(q,l_{s+1}^{t-1} ),1,M(r,l_0^{s-1} )],\) respectively, then \(Z_N^{(i)} \le Z_N^{(j)}\) holds.
It can be deduced that if two polarized channels satisfy Theorem 2 or 3, then one of them is always more reliable than the other. This ordering is independent of the transmission channel W and can be employed to simplify the construction of multi-kernel polar codes as in [15] and [16].
Taking \(G_2\) and \(G_3\) in “Appendix 6” as the kernels of \(G_{24}\) in (29) as an example, it follows from the upper Bhattacharyya parameter bound expressions listed in “Appendix 6” that \(Z_2^{(0)-} \ge Z_2^{(1)}\) and \(Z_3^{(0)-} \ge Z_3^{(1)-} \ge Z_3^{(2)}\). Since the transmission channel W is a BEC, \(W_2^{(0)} \le W_2^{(1)}\) and \(W_3^{(0)} \le W_3^{(1)} \le W_3^{(2)}\) hold in terms of reliability according to [1]. For channels 17, 22 and 23 of \(G_{24}\), the mixed-nary representations under the base vector [2, 2, 3, 2] are \((1021)|_{[2,2,3,2]}\), \((1120)|_{[2,2,3,2]}\) and \((1121)|_{[2,2,3,2]}\), respectively. Therefore, channel 23 is superior to channel 22 in reliability according to Theorem 2, and channel 22 is superior to channel 17 according to Theorem 3. Applying Theorems 2 and 3 similarly to the remaining polarized channels, a partial order graph of \(G_{24}\) can be constructed as shown in Fig. 6, where \(A \rightarrow B\) denotes that channel A is superior to channel B in terms of reliability. The reliability relationship between polarized channels on the same level in Fig. 6 remains undetermined and can be resolved by other methods such as the distance principle in [7].
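The comparison of channels 17, 22 and 23 can be reproduced with a short script. The following is a minimal sketch rather than the authors' implementation: the function names are hypothetical, the base vector [2, 2, 3, 2] is assumed to be listed in the same left-to-right order as the digits (most significant first), and the Theorem 2 test assumes kernels such as \(G_2\) and \(G_3\) of “Appendix 6” on a BEC, for which a larger subchannel index means a smaller Bhattacharyya parameter.

```python
from typing import List, Sequence


def to_mixed_radix(index: int, bases: Sequence[int]) -> List[int]:
    """Return the mixed-radix digits of `index`, most significant first.

    Assumption: bases[k] is the radix of the k-th digit from the left.
    """
    digits = []
    for base in reversed(bases):
        index, digit = divmod(index, base)
        digits.append(digit)
    if index != 0:
        raise ValueError("index is out of range for the given bases")
    return digits[::-1]


def _differing_positions(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Positions (left to right) at which the two digit strings differ."""
    return [k for k, (x, y) in enumerate(zip(a, b)) if x != y]


def theorem2_applies(i_digits: Sequence[int], j_digits: Sequence[int]) -> bool:
    """Check whether Z_N^(i) <= Z_N^(j) follows from Theorem 2.

    Assumption: for every kernel used, a larger subchannel index has a
    smaller Bhattacharyya parameter (true for G_2 and G_3 on a BEC).
    """
    diff = _differing_positions(i_digits, j_digits)
    return len(diff) == 1 and i_digits[diff[0]] > j_digits[diff[0]]


def theorem3_applies(i_digits: Sequence[int], j_digits: Sequence[int],
                     bases: Sequence[int]) -> bool:
    """Check whether Z_N^(i) <= Z_N^(j) follows from Theorem 3.

    The digits must differ in exactly two G_2 stages, with i holding (1, 0)
    and j holding (0, 1) at the more and less significant stage, respectively.
    """
    diff = _differing_positions(i_digits, j_digits)
    if len(diff) != 2:
        return False
    high, low = diff  # smaller list position = more significant digit
    return (bases[high] == 2 and bases[low] == 2
            and (i_digits[high], i_digits[low]) == (1, 0)
            and (j_digits[high], j_digits[low]) == (0, 1))
```

Calling `to_mixed_radix` on 17, 22 and 23 with the bases `[2, 2, 3, 2]` yields the digit strings quoted above, and the two predicates confirm the orderings 23 over 22 (Theorem 2) and 22 over 17 (Theorem 3).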
With these partial order results, comparing the reliability of all polarized channels under the transmission channel W is no longer computationally heavy, which simplifies the construction of multi-kernel polar codes.
It should be noted that the example only applies to BECs. For B-DMCs, the lower Bhattacharyya parameter bounds of the used polar code kernels should be investigated at the same time.
6 Conclusions
In this paper, we proposed a novel method to compute the tight upper Bhattacharyya parameter bounds of polar code kernels of any dimension, providing a theoretical basis for the reliability comparison of the polarized channels in the construction of the polar codes. The computation of the upper Bhattacharyya parameter bounds can be applied to some standard polarization kernels utilizing the construction of the sub-matrix common column tree of the channel inputs. Future studies should focus on searching for the standard polar code kernels that are suitable for the upper bound computation method of this paper or devising an improved method that is suitable for any standard polar code kernels.
Availability of data and materials
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Abbreviations
- AWGN: Additive White Gaussian Noise
- BEC: Binary Erasure Channel
- B-DMC: Binary Input Discrete Memoryless Channel
- CCI: Common Column Indicator
- CCV: Common Column Vector
- eMBB: Enhanced Mobile Broadband
- PW: Polarization Weight
- SC: Successive Cancellation
- SNR: Signal-to-Noise Ratio
- VCI: Valid Column Indicator
References
E. Arikan, Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55(7), 3051–3073 (2009). https://doi.org/10.1109/TIT.2009.2021379
3GPP: 5G; NR; Multiplexing and channel coding. Technical Specification (TS) 38.212, 3rd Generation Partnership Project (3GPP). Version 15.1.0 (2018)
S.B. Korada, E. Şaşoğlu, R. Urbanke, Polar codes: characterization of exponent, bounds, and constructions. IEEE Trans. Inf. Theory 56(12), 6253–6264 (2010). https://doi.org/10.1109/TIT.2010.2080990
M. Benammar, V. Bioglio, F. Gabry, I. Land, Multi-kernel polar codes: Proof of polarization and error exponents, in 2017 IEEE Information Theory Workshop (ITW), pp. 101–105 (2017). https://doi.org/10.1109/itw.2017.8277949
N. Presman, O. Shapira, S. Litsyn, Mixed-kernels constructions of polar codes. IEEE J. Sel. Areas Commun. 34(2), 239–253 (2016). https://doi.org/10.1109/JSAC.2015.2504278
F. Gabry, V. Bioglio, I. Land, J. Belfiore, Multi-kernel construction of polar codes, in 2017 IEEE International Conference on Communications Workshops (ICC Workshops), pp. 761–765 (2017). https://doi.org/10.1109/ICCW.2017.7962750
V. Bioglio, F. Gabry, I. Land, J. Belfiore, Multi-kernel polar codes: concept and design principles. IEEE Trans. Commun. (2020). https://doi.org/10.1109/TCOMM.2020.3006212
M. Hanif, M. Ardakani, Polar codes: bounds on Bhattacharyya parameters and their applications. IEEE Trans. Commun. 66(12), 5927–5937 (2018). https://doi.org/10.1109/TCOMM.2018.2867475
R. Mori, T. Tanaka, Performance and construction of polar codes on symmetric binary-input memoryless channels, in 2009 IEEE International Symposium on Information Theory, pp. 1496–1500 (2009). https://doi.org/10.1109/ISIT.2009.5205857
R. Mori, T. Tanaka, Performance of polar codes with the construction using density evolution. IEEE Commun. Lett. 13(7), 519–521 (2009). https://doi.org/10.1109/LCOMM.2009.090428
P. Trifonov, Efficient design and decoding of polar codes. IEEE Trans. Commun. 60(11), 3221–3227 (2012). https://doi.org/10.1109/TCOMM.2012.081512.110872
I. Tal, A. Vardy, How to construct polar codes. IEEE Trans. Inf. Theory 59(10), 6562–6582 (2013). https://doi.org/10.1109/TIT.2013.2272694
P. Trifonov, On construction of polar subcodes with large kernels, in 2019 IEEE International Symposium on Information Theory (ISIT), pp. 1932–1936 (2019). https://doi.org/10.1109/ISIT.2019.8849672
H. Vangala, E. Viterbo, Y. Hong, A comparative study of polar code constructions for the AWGN channel. arXiv: InformationTheory (2015)
C. Schurch, A partial order for the synthesized channels of a polar code, in 2016 IEEE International Symposium on Information Theory (ISIT), pp. 220–224 (2016). https://doi.org/10.1109/ISIT.2016.7541293
W. Wu, P.H. Siegel, Generalized partial orders for polar code bit-channels. IEEE Trans. Inf. Theory 65(11), 7114–7130 (2019). https://doi.org/10.1109/TIT.2019.2930292
G. He, J. Belfiore, I. Land, G. Yang, X. Liu, Y. Chen, R. Li, J. Wang, Y. Ge, R. Zhang, et al., Beta-expansion: a theoretical framework for fast and recursive construction of polar codes, in 2017 IEEE Global Communications Conference, pp. 1–6 (2017). https://doi.org/10.1109/GLOCOM.2017.8254146
S.B. Korada, Polar codes for channel and source coding. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne(Switzerland) (2009). https://doi.org/10.5075/epfl-thesis-4461
L. Zhang, Z. Zhang, X. Wang, Polar code with block-length N = 3^n, in 2012 International Conference on Wireless Communications and Signal Processing (WCSP), pp. 1–6 (2012). https://doi.org/10.1109/WCSP.2012.6542982
L. Cheng, L. Zhang, Q. Sun, Classification of polarizing matrices based on Bhattacharyya parameters, in 2018 12th IEEE International Conference on Anti-counterfeiting, Security, and Identification (ASID), pp. 159–163 (2018). https://doi.org/10.1109/ICASID.2018.8693130
R. Mori, T. Tanaka, Source and channel polarization over finite fields and Reed-Solomon matrices. IEEE Trans. Inf. Theory 60(5), 2720–2736 (2014). https://doi.org/10.1109/TIT.2014.2312181
H. Lin, S. Lin, K. Abdelghaffar, Linear and nonlinear binary kernels of polar codes of small dimensions with maximum exponents. IEEE Trans. Inf. Theory 61(10), 5253–5270 (2015). https://doi.org/10.1109/TIT.2015.2469298
Funding
This work is supported by the Research Fund for the Doctoral Program (JY2019B162), and in part by Research Fund for the Doctoral Program (JSY2018029).
Author information
Contributions
TZ carried out the tight upper Bhattacharyya parameter calculation method and drafted the manuscript. SL helped to improve the calculation method and participated in drafting the manuscript. BY helped revise and improve the whole paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Appendix
1.1 1. Proof of Property 1
\(\forall k\in [1,n)\) and \(\forall s,t\in [0,2^k)\), then
where \({\varvec{v}}_{m\cdot 2^k+s,:}^{(i,u_i)}=\left( 0_0^{i-1},u_i,b(m,n-k),b(s,k)\right)\).
Assume that \({\varvec{\lambda }}_k^{(s)}=g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right)\) and \({\varvec{\lambda }}_k^{(t)}=g\left( {\varvec{x}}_{t::2^k,:}^{(i,u_i)},1_0^{l-1}\right)\). According to (3), the elements of the last k columns of \({\varvec{x}}_{s::2^k,:}^{(i,e)}\) are only related to s and k, i.e., \({\varvec{\lambda }}_{k,l-k:l-1}^{(s)}=1_0^{k-1}\). Similarly, \({\varvec{\lambda }}_{k,l-k:l-1}^{(t)}=1_0^{k-1}\). Therefore, \(g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right) \ne 0_0^{l-1}\) holds.
For \(j\in [0,l-k)\), suppose that \(\lambda _{k,j}^{(s)}=1\). Then, for \(\forall m\in [0,2^{n-k})\),
Since \({\varvec{\lambda }}_{k,l-k:l-1}^{(s)}={\varvec{\lambda }}_{k,l-k:l-1}^{(t)}=1_0^{k-1}\), then,
holds, i.e., \(\lambda _{k,j}^{(t)}=1\). Therefore, \({\varvec{\lambda }}_k^{(s)}=\varvec{\lambda }_k^{(t)}\).
1.2 2. Proof of Lemma 2
Let \({\varvec{\lambda }}_0={\varvec{x}}_{0,:}\oplus {\varvec{x}}_{2,:}\) and \({\varvec{\lambda }}_1=\overline{{\varvec{\lambda }}_0}\wedge {\varvec{\lambda }}\). Since \({\varvec{x}}_{0,:}\oplus {\varvec{x}}_{1,:}={\varvec{x}}_{2,:}\oplus {\varvec{x}}_{3,:}={\varvec{c}}\), we can derive that \(f({\varvec{x}}_{i,:},{\varvec{c}})=f({\varvec{x}}_{i,:},{\varvec{\lambda }}_0)\cdot f({\varvec{x}}_{i,:},\varvec{\lambda }_1)\), \(f({\varvec{x}}_{0,:},{\varvec{\lambda }}_0)=f({\varvec{x}}_{3,:},{\varvec{\lambda }}_0)\), \(f({\varvec{x}}_{1,:},{\varvec{\lambda }}_0)=f(\varvec{x}_{2,:},{\varvec{\lambda }}_0)\), \(f({\varvec{x}}_{0,:},{\varvec{\lambda }}_1)=f({\varvec{x}}_{2,:},{\varvec{\lambda }}_1)\), and \(f({\varvec{x}}_{1,:},{\varvec{\lambda }}_1)=f({\varvec{x}}_{3,:},{\varvec{\lambda }}_1)\).
Let \(\psi =\sqrt{\left[ f({\varvec{x}}_{0,:},{\varvec{c}})+f({\varvec{x}}_{1,:},{\varvec{c}})\right] \cdot \left[ f({\varvec{x}}_{2,:},{\varvec{c}})+f({\varvec{x}}_{3,:},{\varvec{c}})\right] }\). According to Lemma 1, we can derive that
Since \(\sum _{y_0^{l-1}}\left[ f({\varvec{x}}_{0,:},{\varvec{\lambda }}_i)+f({\varvec{x}}_{1,:},{\varvec{\lambda }}_i) \right] =2\) and \(\sum _{y_0^{l-1}}\sqrt{f({\varvec{x}}_{0,:},{\varvec{\lambda }}_i)\cdot f({\varvec{x}}_{1,:},{\varvec{\lambda }}_i)}=Z^{\sum {\varvec{\lambda }}_i}(W)\), the conclusion in Lemma 2 holds.
1.3 3. Proof of Theorem 1
The proof is divided into three cases according to the value of k.
(1) \(k=0\).
Let \(Z^*=\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{0,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{0,0}^{(i,1)}\right) }\). According to (16) and (18), both \(({\varvec{\gamma }}_{1,0}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,0)})\) and \(({\varvec{\gamma }}_{1,1}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,1)})\) are mutually different vector pairs with a VCI of \({\varvec{c}}_1\). It can be derived that \({\varvec{\lambda }}^*\) and \({\varvec{\lambda }}'\) shown in Theorem 1 denote the CCIs \({\varvec{\gamma }}_{1,0}^{(i,0)}:{\varvec{\gamma }}_{1,j}^{(i,1)}\) and \({\varvec{\gamma }}_{1,0}^{(i,0)}:{\varvec{\gamma }}_{1,1-j}^{(i,1)}\) with the same VCI of \({\varvec{c}}_1\), respectively. Then, both \(f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) =f \left( {\varvec{\gamma }}_{1,j}^{(i,1)},{\varvec{\lambda }}^* \right)\) and \(f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) =f \left( {\varvec{\gamma }}_{1,1-j}^{(i,1)},{\varvec{\lambda }}' \right)\) hold.
According to (19) and (12), \(Z^*\) can be transformed into:
which matches (14). Then, we can calculate the upper bound of \(Z^*\) defined by
Since \(({\varvec{\gamma }}_{1,0}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,0)})\) and \(({\varvec{\gamma }}_{1,1}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,1)})\) are mutually different vector pairs with a VCI of \({\varvec{c}}_1\), we obtain \(\sum _{y_0^{l-1}}\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) +f\left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) \right] =2\), \(\sum _{y_0^{l-1}}\sqrt{f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) \cdot f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) }=Z^{\sum {\varvec{\lambda }}'}(W)\), and \(\sum _{y_0^{l-1}}\sqrt{f\left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^*\right) \cdot f \left( {\varvec{\gamma }}_{1,0}^{(i,1)},{\varvec{\lambda }}^* \right) }=Z^{\sum {\varvec{\lambda }}^*}(W)\). Furthermore, since \({\varvec{\gamma }}_{1,0}^{(i,0)}\) has \(2^{n-k}\) leaf nodes, we can derive that \(\sum _{y_0^{l-1}}\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) \right] =2^n\).
Therefore, \(Z^*\le 2\cdot Z^{\sum {\varvec{\lambda }}^*}(W)\cdot (1-Z^{\sum {\varvec{\lambda }}'}(W))\cdot \sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) }+2^n\cdot Z^{\sum {\varvec{\lambda }}'}(W)\).
Suppose that the upper bound of \(\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) }\) is \(Z_1\). Then, we can obtain
(2) \(0<k<n-1\).
Generalizing the case of \(k=0\) to \(0<k<n-1\) and letting \(Z^*=\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{k,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{k,1}^{(i,0)}\right) }\), we can derive that
Since \(\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}, {\varvec{\gamma }}_{k+1,1}^{(i,0)} \right)\) is a mutually different vector pair with the VCI of \({\varvec{c}}_{k+1}\) and the CCI of \({\varvec{\lambda }}_{k+1}\), both \(\sum _{y_0^{l-1}}\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}+{\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) =2\) and \(\sum _{y_0^{l-1}}\sqrt{{\varvec{\gamma }}_{k+1,0}^{(i,0)}\cdot {\varvec{\gamma }}_{k+1,1}^{(i,0)}}=Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\) hold. Since \({\varvec{\gamma }}_{k+1,0}^{(i,0)}\) has \(2^{n-k}\) leaf nodes, we obtain \(\sum _{y_0^{l-1}}\left[ s \left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) +s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) \right] =2^{n-k+1}\).
Therefore, \(Z^*\le \left( 2-2\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\right) \cdot \sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }+2^{n-k+1}\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\).
Suppose that the upper bound of \(\sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }\) is \(Z_{k+1}\), then we obtain
(3) \(k=n-1\).
According to (16) and (18), \(\left( {\varvec{\gamma }}_{n,0}^{(i,0)}, {\varvec{\gamma }}_{n,1}^{(i,0)} \right)\) and \(\left( {\varvec{\gamma }}_{n,2}^{(i,0)}, {\varvec{\gamma }}_{n,3}^{(i,0)}\right)\) are two mutually different vector pairs under the VCI of \({\varvec{c}}_n\). Thus,
matches Lemma 2 and the expression of \(Z_k\) in (24) can be derived for \(k=n-1\).
Therefore, the conclusion of Theorem 1 holds.
1.4 4. Proof of Theorem 2
Since the mixed-nary representations of i and j differ only in the sth digit, it can be directly concluded according to Property 1 that the inputs \(u_i\) and \(u_j\) are processed by channels with the same index of the same polar code kernel in each stage before and after stage s. Therefore, the influence of the polar kernels on the reliability of channels i and j is the same in all stages except the sth stage.
In stage s, since the Bhattacharyya parameters of the polar code kernel satisfy \(Z_{l_s}^{(a)} \le Z_{l_s}^{(b)}\), it follows that \(Z_N^{(i)} \le Z_N^{(j)}\).
1.5 5. Proof of Theorem 3
Since the mixed-nary representations of i and j differ only in the sth and tth digits, it can be deduced according to Property 1, as in the proof of Theorem 2, that the influence of the polar kernels on the reliability of channels i and j is the same in all stages except stages s and t.
Therefore, the reliability comparison of channels i and j can be attributed to the comparison between channels 1 and 2, whose binary representations are (01) and (10), respectively, of the polar code with a generator matrix of \(G_4=G_2\otimes G_2\).
According to [16], \(Z_4^{(2)} \le Z_4^{(1)}\). Therefore, \(Z_N^{(i)} \le Z_N^{(j)}\) holds.
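As a sanity check on this inequality, the two \(G_4\) expressions listed in “Appendix 6” can be compared directly. The factorization below is only a sketch: it compares \(Z_4^{(1)-}\) and \(Z_4^{(2)-}\), read here as the upper bounds for \(G_4=G_2\otimes G_2\), and it settles the order of the parameters themselves only under the assumption that both bounds are attained, as is the case for a BEC.

\[
Z_4^{(1)-}-Z_4^{(2)-}=\left( 4Z^2-4Z^3+Z^4\right) -\left( 2Z^2-Z^4\right) =2Z^2-4Z^3+2Z^4=2Z^2(1-Z)^2\ge 0 .
\]

Hence \(Z_4^{(2)-}\le Z_4^{(1)-}\) for every \(Z\in [0,1]\), with equality only at \(Z=0\) and \(Z=1\).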
1.6 6. The Bhattacharyya parameter upper bounds of the polar code kernels with dimensions ranging from 2 to 6
For some standard polar code kernels with dimension of \(l(\in [2,6])\), the upper Bhattacharyya parameter bounds computed by the proposed method are as follows:
For \(G_2=\begin{bmatrix} 1 &{} 0 \\ 1 &{} 1 \end{bmatrix}\), \(\begin{array}{ll} Z_2^{(0)-}=2Z-Z^2\\ Z_2^{(1)}=Z^2\\ \end{array}\) .
For \(G_3=\begin{bmatrix} 1 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 \\ 1 &{} 0 &{} 1\end{bmatrix}\), \(\begin{array}{ll} Z_3^{(0)-}=3Z-3Z^2+Z^3 \\ Z_3^{(1)-}=2Z^2-Z^3\\ Z_3^{(2)}=Z^2\\ \end{array}\) .
For \(G_4=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{}0 \\ 1 &{} 0 &{} 1 &{} 0\\ 1 &{} 1 &{} 1 &{}1\end{bmatrix}\), \(\begin{array}{ll} Z_4^{(0)-}=4Z-6Z^2+4Z^3-Z^4\\ Z_4^{(1)-}=4Z^2-4Z^3+Z^4\\ Z_4^{(2)-}=2 Z^2-Z^4\\ Z_4^{(3)}=Z^4\\ \end{array}\) .
For \(G_5=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 &{}0\\ 1 &{} 1 &{} 0 &{}0 &{}0 \\ 1 &{} 0 &{} 1 &{}0 &{} 0\\ 1 &{} 0 &{} 0 &{}1 &{} 0\\ 1 &{} 1 &{} 1 &{} 0 &{}1 \end{bmatrix}\), \(\begin{array}{ll} Z_5^{(0)-}=5Z-10Z^2+10Z^3-5Z^4+Z^5 \\ Z_5^{(1)-}=6Z^2-9Z^3+5Z^4-Z^5 \\ Z_5^{(2)-}=3Z^2-Z^3-2Z^4+Z^5 \\ Z_5^{(3)-}=Z^2+Z^4-Z^5 \\ Z_5^{(4)}=Z^4 \\ \end{array}\) .
For \(G_6=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 &{}0 &{} 0\\ 1 &{} 1 &{} 0 &{}0 &{}0 &{} 0\\ 1 &{} 0 &{} 1 &{}0 &{} 0 &{} 0\\ 1 &{} 1 &{} 1 &{}1 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0 &{}1 &{} 0\\ 0 &{} 0 &{} 1 &{} 1 &{}1 &{}1\end{bmatrix}\), \(\begin{array}{ll} Z_6^{(0)-}=6Z-15Z^2+20Z^3-15Z^4+6Z^5-Z^6\\ Z_6^{(1)-}=9Z^2-18Z^3+15Z^4-6Z^5+Z^6\\ Z_6^{(2)-}=4Z^2-2Z^3-4Z^4+4Z^5-Z^6\\ Z_6^{(3)-}=4Z^4-4Z^5+Z^6\\ Z_6^{(4)-}=2Z^2-Z^4\\ Z_6^{(5)}=Z^4\\ \end{array}\), where Z is the abbreviation of the Bhattacharyya parameter Z(W) of transmission channel W.
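The orderings \(Z_2^{(0)-} \ge Z_2^{(1)}\) and \(Z_3^{(0)-} \ge Z_3^{(1)-} \ge Z_3^{(2)}\) used in the construction example can be checked numerically from the polynomials above. The following is a minimal sketch: the table `BOUNDS`, the helper names, and the sampling grid are assumptions made for illustration, and only the \(G_2\) and \(G_3\) expressions are transcribed.

```python
from typing import Dict, List

# Coefficient lists [c1, c2, ...] encode c1*Z + c2*Z^2 + ... for each
# subchannel, transcribed from the G_2 and G_3 expressions listed above.
BOUNDS: Dict[int, List[List[int]]] = {
    2: [[2, -1],        # Z_2^(0)- = 2Z - Z^2
        [0, 1]],        # Z_2^(1)  = Z^2
    3: [[3, -3, 1],     # Z_3^(0)- = 3Z - 3Z^2 + Z^3
        [0, 2, -1],     # Z_3^(1)- = 2Z^2 - Z^3
        [0, 1, 0]],     # Z_3^(2)  = Z^2
}


def evaluate(coeffs: List[int], z: float) -> float:
    """Evaluate c1*z + c2*z^2 + ... for the given coefficient list."""
    return sum(c * z ** (power + 1) for power, c in enumerate(coeffs))


def bounds_decrease_with_index(dimension: int, samples: int = 101) -> bool:
    """Check on a grid over [0, 1] that the bounds of the kernel with the
    given dimension do not increase as the subchannel index grows."""
    for n in range(samples):
        z = n / (samples - 1)
        values = [evaluate(c, z) for c in BOUNDS[dimension]]
        for earlier, later in zip(values, values[1:]):
            if earlier < later - 1e-12:  # small tolerance for rounding
                return False
    return True
```

A grid check of this kind is only a numerical plausibility test, not a proof. For these two kernels the inequalities also follow algebraically, for example \(Z_3^{(0)-}-Z_3^{(1)-}=Z(1-Z)(3-2Z)\ge 0\) on \([0,1]\).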
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, T., Li, S. & Yu, B. Computing tight upper bounds for Bhattacharyya parameters of binary polar code kernels with arbitrary dimension. J Wireless Com Network 2021, 76 (2021). https://doi.org/10.1186/s13638-021-01954-y
DOI: https://doi.org/10.1186/s13638-021-01954-y
Keywords
- Bhattacharyya parameter
- Tight bounds
- Upper
- Binary tree
- Polar codes