
Computing tight upper bounds for Bhattacharyya parameters of binary polar code kernels with arbitrary dimension

Abstract

Multi-kernel polar codes have recently received considerable attention since they can provide more flexible code lengths than the original polar codes. Their construction can be simplified by obtaining the Bhattacharyya parameter bounds of the kernels employed. However, there is currently no generic method for finding such bounds. In this paper, therefore, we focus on the upper Bhattacharyya parameter bounds of the standard binary polar code kernels with an arbitrary dimension of \(l\ge 2\). A calculation process consisting of four steps, namely the common column binary tree construction for the channel inputs, the common factor extraction, the calculation feasibility testing, and the upper bound calculation based on pattern matching, is formulated with a computational complexity of \(O(2^l)\). It is theoretically proved that the upper bounds obtained by the proposed method are tight, which lays the foundation for comparing the reliability of the synthesized channels in polar codes.

1 Introduction

Polar codes, pioneered by Arıkan [1] in 2008, are capable of achieving the Shannon capacity with low encoding and decoding complexity, and have been adopted as the coding scheme for the control channel of 5G wireless communication systems [2].

Arıkan [1] employed the n-fold Kronecker power of the polarized kernel matrix denoted by \(G_2= \left[ \begin{array}{cc} 1 &{} 0 \\ 1 &{} 1 \end{array}\right]\) to perform a linear transformation on an input block of \(N(=2^n)\) bits. By repeatedly combining and splitting N copies of a Binary-Input Discrete Memoryless Channel (B-DMC), N polarized sub-channels are obtained. The reliability of some of them tends to one, whereas that of the others tends to zero. However, the code lengths of such polar codes are constrained to \(2^n\), which makes it difficult to apply them to narrow-band, low-rate, and real-time communication fields that require flexible medium and short code lengths, such as real-time voice communication. Korada et al. [3] generalized the polar code kernel as an \(l\times l (l\ge 2)\) invertible matrix denoted by \(G_l\), none of whose column permutations is an upper triangular matrix. Benammar et al. [4] proved that the channel polarization condition still holds for such multi-kernel polar codes. Thus, flexible code lengths can be obtained by applying the Kronecker product

$$\begin{aligned} G_N=B_N \cdot G_{l_0}\otimes G_{l_1}\otimes \cdots \otimes G_{l_{n-1}} \end{aligned}$$
(1)

of kernels with various dimensions as a generator matrix in the construction of the polar codes, where \(B_N\) is a permutation matrix. Following this, both the principles and the design of multi-kernel polar codes have become a significant research area in recent years [5,6,7].

When the construction of polar codes is a concern, it is crucial to select the most reliable channels from the synthesized channels, whose reliability can be measured by the Bhattacharyya parameters [1]. However, closed-form expressions for the Bhattacharyya parameters of the synthesized channels are usually unavailable [8]. Generally, for the polar code kernels with \(l=2\), the reliability of the synthesized channels can be acquired by Bhattacharyya parameter calculation [1], Monte-Carlo simulation [1], density evolution [9, 10], Gaussian approximation [11], or an approximation by degrading and upgrading transformations, known as the Tal-Vardy method [12]. For \(l>2\), on the other hand, the Bhattacharyya parameter calculation and Monte-Carlo simulation methods are employed [13]. Among these methods, the Bhattacharyya parameter calculation is simple but only applicable to Binary Erasure Channels (BECs), whereas the Gaussian approximation is applicable to Additive White Gaussian Noise (AWGN) channels. More generally, the Tal-Vardy method and Monte-Carlo simulation can be applied to arbitrary binary discrete memoryless channels. However, all of the constructions based on these five methods depend on the transmission channel conditions, which makes it necessary to construct codes separately for different Signal-to-Noise Ratios (SNRs) [14].

To resolve the above-mentioned problems in the construction of the polar codes, general construction methods independent of transmission channel conditions have gained significant attention in recent years. Schürch et al. [15] and Wu et al. [16] proposed the partial order theory of the polarized channels, and He et al. [17] proposed the Polarization Weight (PW) for \(G_2\). Based on the comparison of the kernel channels indicated by the Bhattacharyya parameter bounds, these two theories pointed out that there are unambiguous relationships between some of the synthesized channels, which can be utilized to select more reliable channels in the construction of the polar codes. Investigating the bounds of the Bhattacharyya parameters with dimension \(l>2\) would allow these theories to be applied to the construction of large-sized and multi-kernel polar codes. Accordingly, Hanif et al. [8] suggested that the kernels’ Bhattacharyya parameter bounds could simplify the construction of the polar codes.

In the research on the Bhattacharyya parameter bounds of the polar code kernels, Arıkan [1] presented the bounds for \(l=2\); however, the lower bound was not tight enough. To address this issue, Korada [18] proposed a much tighter lower bound. As for \(l>2\), Korada et al. [3] studied the relationship between the Bhattacharyya parameter bounds and the partial distances of the kernels and proposed a concise formula for calculating the bounds of the Bhattacharyya parameters. However, neither the lower nor the upper bounds are tight enough. Zhang et al. [19] examined the upper bounds for \(l=3\), but not the lower bounds. Cheng et al. [20] presented a formula to calculate the Bhattacharyya parameters for a given polar code kernel under BECs. However, evaluating the proposed formula would be very complicated for polar code kernels with large dimensions.

The studies mentioned above reveal that there is not yet a general method for finding tight bounds on the Bhattacharyya parameters of the polar code kernels with an arbitrary dimension. Investigating the bounds on the kernels’ Bhattacharyya parameters is therefore of great importance for simplifying the construction of the polar codes, and can also play a significant role in validating and evaluating the asymptotic speed of the polarization [21]. This paper, therefore, examines the upper Bhattacharyya parameter bounds of polar code kernels with dimension \(l\ge 2\).

The main contributions of this paper are summarized as follows: We show that any k-order sub-matrix of the inputs must have some common columns, which leads to the construction of a common column binary tree. Then, we propose a process to compute the upper bound of \(Z_l^{(i)}\) utilizing iterative pattern matching and present a computationally feasible criterion to test whether the proposed method can be applied to a certain polar code kernel.

The rest of this paper is organized as follows. In Sect. 2, we present notations and definitions. In Sect. 3, we derive the upper bounds of the Bhattacharyya parameter. In Sect. 4, we demonstrate the computation procedure of the upper bound on channel 2 of a polar code kernel with a dimension of five. Section 5 provides a detailed discussion of the findings. Finally, Sect. 6 summarizes the paper and lists some potential directions for future research. The proofs of the properties, lemmas and theorems are provided in the “Appendix”.

1.1 Methods/experimental

This paper consists mainly of theoretical derivation and analysis; no experiments are carried out.

2 Preliminaries

In this section, we first give the symbols and definitions to be employed throughout the paper. Then, we introduce the Bhattacharyya parameter of the polar code kernels.

2.1 Symbols and definitions

Following [1], we denote random variables by upper case letters, e.g., X, Y, and their realizations by the corresponding lower case letters, e.g., x, y, and use \(X^-\), \(Y^-\) to denote their upper bounds. We employ the notation \(a_0^{l-1}\) as the shorthand to denote a row vector \((a_0,a_1,\ldots ,a_{l-1})\) and use \(a_i^j\) to denote its sub-vector \((a_i,a_{i+1},\ldots ,a_j)\), where \(a_i^j\) is void when \(i>j\). Later, we will abbreviate \(a_0^{l-1}\) as the corresponding bold character \({\varvec{a}}\).

For a matrix \({\varvec{T}}\), we use \(T_{i,j}\) to denote the element in row i and column j, \({\varvec{T}}_{i,:}\) and \({\varvec{T}}_{:,j}\) to, respectively, denote the ith row vector and the jth column vector, and employ \({\varvec{T}}_{i::m,:}\) to denote the sub-matrix composed of the row vectors of \({\varvec{T}}\) with starting index i and interval m. We use \(\left[ {\varvec{T}}^{(0)}:{\varvec{T}}^{(1)} \right]\) to denote the matrix obtained by combining \({\varvec{T}}^{(0)}\) and \({\varvec{T}}^{(1)}\), which have the same column size, in the row direction.

For an integer k, we employ \((b_0^k,b_1^k,\ldots ,b_{l-1}^k)\), denoted by b(k, l), to represent its l-bit binary expansion with the most significant bit on the left.

Furthermore, \(W: X\rightarrow Y\) represents a symmetric B-DMC transmission channel with input alphabet \(X (X=\{0,1\})\), output alphabet Y, and transition probabilities denoted by \(W(y|x), x \in X, y \in Y\).

All vectors, matrices, and their operations are defined in GF(2).

Definition 1

For an integer k, the reverse order shuffle operation on its l-bit binary expansion is defined as a reverse order operator, which is denoted by r(k, l). For example, the result of r(22, 6) is \(26\ (010110 \rightarrow 011010)\).
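The binary expansion and the reverse order operator can be illustrated with a minimal Python sketch. The function names `binary_expansion` and `reverse_order` are hypothetical and introduced here only for illustration; the sketch simply reproduces the example r(22, 6) = 26 given above.

```python
def binary_expansion(k: int, l: int) -> tuple:
    """b(k, l): l-bit binary expansion of k, most significant bit on the left."""
    if not 0 <= k < 2 ** l:
        raise ValueError("k must fit in l bits")
    return tuple((k >> (l - 1 - j)) & 1 for j in range(l))


def reverse_order(k: int, l: int) -> int:
    """r(k, l): the integer whose l-bit expansion is that of k read backwards."""
    bits = binary_expansion(k, l)
    # bits[0] is the original most significant bit; it becomes the least significant one.
    return sum(bit << j for j, bit in enumerate(bits))


if __name__ == "__main__":
    print(binary_expansion(22, 6))  # expansion 010110
    print(reverse_order(22, 6))     # reversed expansion 011010
```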

Definition 2

For a non-negative integer x and a positive integer vector \(b_0^{n-1}\), we employ \((m_{n-1}m_{n-2}\ldots m_{0})|_{b_0^{n-1}}\) to denote the n-digit mixed-nary representation of x where \(b_0^{n-1}\) is the base vector, \(m_i=Q_i\) mod \(b_{n-i-1}\) and

$$\begin{aligned} Q_i=\left\{ \begin{array}{lr} x, i=0, &{}\\ \lfloor Q_{i-1} /b_{n-i} \rfloor , 0 < i \le n-1. &{}\\ \end{array}\right. \end{aligned}$$
(2)

The operation of calculating the mixed-nary representation under the base vector \(b_0^{n-1}\) for x is denoted as M(x,\(b_0^{n-1}\)).

Taking the decimal number 37 as an example, its three-digit mixed-nary representation under base vector [2, 3, 8] is M(37,[2,3,8])=\((115)|_{[2,3,8]}\), since \(37=1\cdot (3\cdot 8)+1\cdot 8+5\). The base vector indicates that the digit 1 on the left of the mixed-nary representation is in binary, the middle digit 1 is in ternary, and the right digit 5 is in octal.
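The mixed-nary representation can be computed digit by digit from the right. The following Python sketch is one possible reading of Definition 2 (the name `mixed_nary` is hypothetical): it extracts each digit with the base of its own position and then divides by that same base before moving one position to the left, which reproduces the example M(37,[2,3,8]).

```python
def mixed_nary(x: int, base: list) -> list:
    """M(x, base): digits (m_{n-1}, ..., m_0); the leftmost digit uses base[0]."""
    if x < 0 or any(b <= 0 for b in base):
        raise ValueError("x must be non-negative and all bases positive")
    digits = []
    q = x
    for b in reversed(base):   # the rightmost digit uses the last base entry
        digits.append(q % b)   # current digit
        q //= b                # carry the quotient to the next digit on the left
    return digits[::-1]


if __name__ == "__main__":
    print(mixed_nary(37, [2, 3, 8]))
```

With an all-twos base vector the result coincides with the ordinary binary expansion, which gives a quick sanity check of the sketch.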

Definition 3

For an \(n\times l\) binary matrix \({\varvec{T}}\) and a certain operation \(f(\cdot )\), we define a boolean vector \(c_0^{l-1}\) as the valid column indicator, abbreviated as VCI, of \({\varvec{T}}\) for \(f(\cdot )\), where \(c_j=1\) indicates that the elements in column j of \({\varvec{T}}\) are involved in \(f(\cdot )\) for \(\forall j \in [0,l)\). Correspondingly, \({\varvec{c}}_0^{l-1}\) is called the VCI of each row vector \({\varvec{T}}_{k,:}(0 \le k<n)\) for \(f(\cdot )\).

Definition 4

For an \(n\times l\) binary matrix \({\varvec{T}}\) with a VCI of \({\varvec{c}}\), we define a boolean vector \(\lambda _0^{l-1}\) as the common column indicator, abbreviated as CCI, of \({\varvec{T}}\), where \(\lambda _j=1\) marks that all the elements in column j of \({\varvec{T}}\) are the same and \(c_j=1\) for \(\forall j \in [0,l)\). We define \(\lambda _0^{l-1}=g({\varvec{T}},{\varvec{c}})\) to denote the operation of calculating the CCI of \({\varvec{T}}\) under the VCI of \({\varvec{c}}\). Correspondingly, \({\varvec{\gamma }}_0^{l-1}={\varvec{\lambda }}_0^{l-1}\wedge {\varvec{T}}_{0,:}\) is referred to as the common column vector, abbreviated as CCV, of \({\varvec{T}}\) with the VCI of \({\varvec{c}}\).

Definition 5

For two vectors \({\varvec{x}}^{(0)}\) and \({\varvec{x}}^{(1)}\) with the same VCI of \({\varvec{c}}\) for an operation \(f(\cdot )\), \(({\varvec{x}}^{(0)}\), \({\varvec{x}}^{(1)})\) is defined as a mutually different vector pair under \({\varvec{c}}\) for \(f(\cdot )\) if \({\varvec{x}}^{(0)}\oplus {\varvec{x}}^{(1)}={\varvec{c}}\).

Definition 6

For a \(2^m\times l\) binary matrix \({\varvec{T}}\), \({\varvec{T}}_{i::2^k,:}\) (\(0\le k \le m\), \(0\le i \le 2^{k}-1\)) is defined as its ith k-order sub-matrix.

It is easy to infer that the 0-order sub-matrix of \({\varvec{T}}\) is \({\varvec{T}}\) itself, and each m-order sub-matrix of \({\varvec{T}}\) contains only one row.

Definition 7

For an \(l\times l\) binary invertible lower triangular matrix denoted by

$$\begin{aligned} G_l= \left[ \begin{array}{cccccc} 1 &{} 0 &{} \cdots &{} 0 \\ \cdots &{} 1 &{} \cdots &{} 0 \\ \cdots &{} \cdots &{} 1 &{} \vdots \\ \cdots &{} \cdots &{} \cdots &{} 1 \end{array}\right] , \end{aligned}$$
(3)

where all of its diagonal elements are 1, \(G_l\) is defined as a standard binary polar code kernel with dimension l [3].

Definition 8

For an input vector \(U_0^{l-1}\), which is randomly and uniformly distributed in \(\{0,1\}_0^{l-1}\), the linear transformation sequence of its polarized kernel \(G_l\) is defined by

$$\begin{aligned} X_0^{l-1}=U_0^{l-1}\cdot G_l. \end{aligned}$$
(4)

Here, \(\{W_l:X^l\rightarrow Y^l\}\) is defined as a combined channel under polar code kernel \(G_l\) when \(X_0^{l-1}\) is sequentially transmitted through the channel \(W:X\rightarrow Y\). Thus, the transition probability of \(W_l\) is

$$\begin{aligned} W_l (y_0^{l-1} |u_0^{l-1})=\prod _{i=0}^{l-1}W(y_i |x_i). \end{aligned}$$
(5)

Definition 9

For \(X_0^{l-1}\) defined in (4) and its output \(Y_0^{l-1}\) of a combined channel \(\{W_l:X^l\rightarrow Y^l\}\), the virtual channel \(\{W_l^{(i)}:X\rightarrow Y^l\times X^{i-1},0\le i<l\}\) formed by the channel splitting under Successive Cancellation (SC) decoder in [1] is defined as a polar code kernel channel of \(G_l\).

The transition probability of \(W_l^{(i)}\) with input \(u_i\) and output \((y_0^{l-1},u_0^{i-1})\) is defined in [3] as

$$\begin{aligned} W_l^{(i)}(y_0^{l-1},u_0^{i-1}|u_i)= & {} \frac{1}{2^{l-1}}\sum _{u_{i+1}^{l-1}}W_l \left( y_0^{l-1}|u_0^{l-1}\right) \\= & {} \frac{1}{2^{l-1}}\sum _{u_{i+1}^{l-1}}\prod _{k=0}^{l-1}W\left( y_k|(u_0^{l-1}\cdot G_l)_k\right) , \end{aligned}$$
(6)

where the values of \(u_0^{i-1}\) are evaluated sequentially from 0 to \(i-1\) prior to \(u_i\).

Considering \(u_{i+1}^{l-1} \in \{0,1\}^{l-i-1}\), we construct an input matrix with a size of \(N \times l\) as

$$\begin{aligned} \{{\varvec{v}}^{(i,u_i)}: {\varvec{v}}_{k,:}^{(i,u_i)}=\left( 0_0^{i-1},u_i,b(k,n)\right) , 0\le k<N\}, \end{aligned}$$
(7)

where \(n=l-i-1\), \(N=2^{n}\), \(u_0^{i-1}=0_0^{i-1}\) and \(0 \le k < N.\) The variables n and N will be used throughout the paper.

The linear transformation matrix is constructed from \({\varvec{v}}^{(i,u_i)}\) utilizing \(G_l\) with \({\varvec{x}}^{(i,u_i)}={\varvec{v}}^{(i,u_i)}\cdot G_l\) according to (4). The elements in \({\varvec{x}}^{(i,u_i)}\) possess the following property.

Property 1

For \(\forall k\in [1,n)\) and \(\forall s,t\in [0,2^k)\), \(g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right) =g\left( {\varvec{x}}_{t::2^k,:}^{(i,u_i)},1_0^{l-1}\right)\) \(\ne 0_0^{l-1}\) holds. This means that any k-order sub-matrix of \({\varvec{x}}^{(i,u_i)}\) must have some common columns, and the CCIs of all sub-matrices of the same order are the same.
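The construction in (7), the transformation in (4) and Property 1 can be checked numerically on a small kernel. The Python sketch below is an illustration only: the helper names (`input_matrix`, `transform`, `cci`, `check_property_1`) are hypothetical, the kernel is the \(5\times 5\) example of (26) used in Sect. 4, and n is taken as \(l-i-1\) so that each input row has length l.

```python
G5 = [
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
]


def binary_expansion(k, n):
    """b(k, n): n-bit expansion of k, most significant bit first."""
    return [(k >> (n - 1 - j)) & 1 for j in range(n)]


def input_matrix(l, i, u_i):
    """Rows (0,...,0, u_i, b(k, n)) for k = 0..2^n - 1, with n = l - i - 1 (assumed)."""
    n = l - i - 1
    return [[0] * i + [u_i] + binary_expansion(k, n) for k in range(2 ** n)]


def transform(V, G):
    """Row-wise product V * G over GF(2)."""
    l = len(G)
    return [[sum(v[r] * G[r][c] for r in range(l)) % 2 for c in range(l)] for v in V]


def cci(T, c):
    """g(T, c): 1 where column j is valid (c_j = 1) and constant over all rows of T."""
    return [int(c[j] == 1 and len({row[j] for row in T}) == 1) for j in range(len(c))]


def check_property_1(X):
    """All k-order sub-matrices (1 <= k < n) share one non-zero CCI under the all-ones VCI."""
    ones = [1] * len(X[0])
    n = len(X).bit_length() - 1
    for k in range(1, n):
        indicators = [cci(X[s::2 ** k], ones) for s in range(2 ** k)]
        if any(ind != indicators[0] for ind in indicators) or not any(indicators[0]):
            return False
    return True


if __name__ == "__main__":
    for u in (0, 1):
        X = transform(input_matrix(5, 2, u), G5)
        print(u, X, check_property_1(X))
```

For channel 2 of this kernel, the sketch yields the same rows as the matrices \({\varvec{x}}^{(2,0)}\) and \({\varvec{x}}^{(2,1)}\) listed in Sect. 4.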

2.2 The Bhattacharyya parameters of the polar code kernels

According to [1], the Bhattacharyya parameter of a transmission channel \(W:X\rightarrow Y\) is defined by

$$\begin{aligned} Z(W)=\sum _{y\in Y} \sqrt{W(y|0)W(y|1)}. \end{aligned}$$
(8)

Similarly, the Bhattacharyya parameter of \(W_l^{(i)}\) shown in (6), referred to as the ith Bhattacharyya parameter of the polar code kernel \(G_l\), can be denoted by

$$\begin{aligned} Z_l^{(i)}= & {} \sum _{y_0^{l-1},u_{0}^{i-1}} \sqrt{W_l^{(i)} \left( y_0^{l-1},u_0^{i-1}|u_i=0 \right) \cdot W_l^{(i)} \left( y_0^{l-1},u_0^{i-1}|u_i=1 \right) } \\= & {} \frac{1}{2^{l-1}}\cdot \sum _{y_0^{l-1},u_{0}^{i-1}} \sqrt{\begin{matrix} \sum _{u_{i+1}^{l-1}}\prod _{k=0}^{l-1}W \left( y_k|((u_0^{i-1},0,u_{i+1}^{l-1})\cdot G_l)_k \right) \cdot \\ \sum _{u_{i+1}^{l-1}}\prod _{k=0}^{l-1}W \left( y_k|((u_0^{i-1},1,u_{i+1}^{l-1})\cdot G_l)_k \right) \end{matrix} }. \end{aligned}$$
(9)

Since \(U_0^{l-1}\) is distributed uniformly in \(\{0,1\}_0^{l-1}\), for any function \(\varphi (\cdot )\) on \(u_k(k<i)\) where i is the index of \(W_l^{(i)}\) shown in (9), \(\sum _{u_k}\varphi (u_k)=2\varphi (0)=2\varphi (1)\) holds. Thus, according to (4), the expression of \(Z_l^{(i)}\) shown in (9), where \(u_0^{i-1}\) is set to \(0_0^{i-1}\), can be rewritten as

$$\begin{aligned} Z_l^{(i)}= & {} \frac{1}{2^{l-i-1}}\cdot \sum _{y_0^{l-1}}\sqrt{\sum _{p=0}^{N-1}\prod _{k=0}^{l-1}W \left( y_k|x_{p,k}^{(i,u_i=0)}\right) \cdot \sum _{q=0}^{N-1}\prod _{k=0}^{l-1}W \left( y_k|x_{q,k}^{(i,u_i=1)}\right) } \\= & {} \frac{1}{2^{n}}\cdot \sum _{y_0^{l-1}}\sqrt{\sum _{p=0}^{N-1}f \left( {\varvec{x}}_{p,:}^{(i,0)},1_0^{l-1} \right) \cdot \sum _{q=0}^{N-1}f \left( {\varvec{x}}_{q,:}^{(i,1)},1_0^{l-1} \right) }, \end{aligned}$$
(10)

where \(f(\cdot )\) is defined by

$$\begin{aligned} f\left( x_0^{l-1},c_0^{l-1} \right) =\prod _{k=0}^{l-1}W^{c_k}\left( y_k|x_k \right) , \end{aligned}$$
(11)

and \(c_0^{l-1}\) is the VCI of \(x_0^{l-1}\) for \(f(\cdot )\). In the calculation process of (10), the initial VCI of \({\varvec{x}}_0^{(i,u_i)}\) for \(f(\cdot )\) is \(1_0^{l-1}\).

According to (11), we can easily derive that for two VCIs denoted by \({\varvec{c}}^{(0)}\) and \({\varvec{c}}^{(1)}\) of \(x_0^{l-1}\), if \({\varvec{c}}^{(0)}\wedge {\varvec{c}}^{(1)} = 0_0^{l-1}\), then

$$\begin{aligned} f\left( x_0^{l-1},{\varvec{c}}^{(0)}\right) \cdot f\left( x_0^{l-1},{\varvec{c}}^{(1)}\right) = f\left( x_0^{l-1},{\varvec{c}}^{(0)}\oplus {\varvec{c}}^{(1)}\right) . \end{aligned}$$
(12)

Considering the Bhattacharyya parameter of the last channel \(W_l^{(l-1)}\) for a polar code kernel \(G_l\), we obtain \({\varvec{x}}^{\left( l-1,u_{l-1}=0 \right) }=0_0^{l-1}\) and \({\varvec{x}}^{\left( l-1,u_{l-1}=1 \right) }=G_{l-1,:}\) with \({\varvec{u}}_0^{l-2}\) being set to zeros according to (4) and (7). Furthermore, we can obtain \(Z_l^{(l-1)}=\sum _{y_0^{l-1}}\sqrt{f \left( 0_0^{l-1},1_0^{l-1} \right) \cdot f \left( G_{l-1,:},1_0^{l-1} \right) }\) according to (10). Since \(\sum _{y_i\in Y}\sqrt{W(y_i |0)W(y_i |1)}=Z(W)\) and \(\sum _{y_i\in Y}W(y_i |0)=1\), each position k with \(G_{l-1,k}=1\) contributes a factor of Z(W) and each position with \(G_{l-1,k}=0\) contributes a factor of one, so we get

$$\begin{aligned} Z_l^{(l-1)}=Z^{\sum G_{l-1,:}}(W). \end{aligned}$$
(13)
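The closed form (13) can be checked by direct enumeration for a concrete channel. The Python sketch below assumes a binary symmetric channel with crossover probability p, for which (8) gives \(Z(W)=2\sqrt{p(1-p)}\); the choice of channel and the function names are assumptions made for illustration and are not part of the derivation. The last row of the \(5\times 5\) kernel in (26) is used as input.

```python
import itertools
import math


def bsc(y, x, p):
    """Transition probability W(y|x) of a binary symmetric channel (assumed channel)."""
    return 1 - p if y == x else p


def bhattacharyya_bsc(p):
    """Z(W) of the BSC according to (8)."""
    return 2 * math.sqrt(p * (1 - p))


def last_channel_bhattacharyya(last_row, p):
    """Sum over all outputs y of sqrt(f(0, 1) * f(G_{l-1,:}, 1)), as in the text before (13)."""
    total = 0.0
    for y in itertools.product((0, 1), repeat=len(last_row)):
        f_zero = math.prod(bsc(yk, 0, p) for yk in y)
        f_row = math.prod(bsc(yk, xk, p) for yk, xk in zip(y, last_row))
        total += math.sqrt(f_zero * f_row)
    return total


if __name__ == "__main__":
    row = [1, 1, 1, 0, 1]  # last row of the kernel in (26); its weight is 4
    p = 0.1
    print(last_channel_bhattacharyya(row, p), bhattacharyya_bsc(p) ** sum(row))
```

The two printed values should agree up to floating-point rounding, since only the positions where the last row equals one contribute a factor of Z(W).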

It can be seen from (10) that as l increases, the composition of \(Z_l^{(i)}\) may become much more complicated, which makes it difficult to calculate the bounds of \(Z_l^{(i)}\). Since the value of \(Z_l^{(l-1)}\) can be directly calculated by (13), we mainly research the upper bounds of \(Z_l^{(i)}\) for \(0\le i\le l-2\) in this paper.

3 The upper Bhattacharyya parameter bound

In this section, we first construct a k-order sub-matrix common column binary tree for the polar code kernel channel inputs. Then, we propose a process to calculate the upper bound of \(Z_l^{(i)}\) utilizing iterative pattern matching.

Lemma 1

If a, b, c, and d are non-negative real numbers, then the following inequality given in [1] holds:

$$\begin{aligned} \sqrt{(ab+cd)(ac+bd)}\le \left( \sqrt{ab}+\sqrt{cd}\right) \left( \sqrt{ac}+\sqrt{bd} \right) -2\sqrt{abcd}. \end{aligned}$$
(14)
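Inequality (14) is easy to probe numerically. The Python sketch below (the name `lemma1_gap` is hypothetical) evaluates the right-hand side minus the left-hand side of (14); a random spot check of this kind is only a plausibility test and does not replace the proof in [1].

```python
import math
import random


def lemma1_gap(a, b, c, d):
    """Right-hand side of (14) minus its left-hand side; non-negative if (14) holds."""
    lhs = math.sqrt((a * b + c * d) * (a * c + b * d))
    rhs = (math.sqrt(a * b) + math.sqrt(c * d)) * (math.sqrt(a * c) + math.sqrt(b * d)) \
        - 2 * math.sqrt(a * b * c * d)
    return rhs - lhs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that the spot check is repeatable
    worst = min(lemma1_gap(*(rng.random() for _ in range(4))) for _ in range(10000))
    print("smallest gap found:", worst)
```

For a = b = c = d both sides coincide, so the bound is attained with equality in that case.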

Lemma 2

For two mutually different vector pairs denoted by \(({\varvec{x}}_{0,:},{\varvec{x}}_{1,:})\) and \(({\varvec{x}}_{2,:},{\varvec{x}}_{3,:})\) with a VCI of \({\varvec{c}}\), the following inequality

$$\begin{aligned} \sum _{y_0^{l-1}}\sqrt{\left[ f \left( {\varvec{x}}_{0,:},{\varvec{c}} \right) +f \left( {\varvec{x}}_{1,:},{\varvec{c}} \right) \right] \cdot \left[ f \left( {\varvec{x}}_{2,:},{\varvec{c}} \right) +f \left( {\varvec{x}}_{3,:},{\varvec{c}} \right) \right] } \le \\ 2\cdot Z^{\sum {\varvec{\lambda }}} (W)+2\cdot Z^{\sum {\varvec{c}}-\sum {\varvec{\lambda }}}(W)-2\cdot Z^{\sum {\varvec{c}}}(W) \end{aligned}$$
(15)

holds, where \({\varvec{\lambda }}={\varvec{x}}_{0,:}\oplus {\varvec{x}}_{2,:}\).

3.1 The common column binary tree

According to Definition 4 and Property 1, we can extract a CCV for each k-order sub-matrix of \({\varvec{x}}^{(i,u_i)}\), and the extracting process can be divided into (\(n+1\)) stages ranging from 0 to n. For any stage \(k \in [0,n]\), all the \(2^k\) sub-matrices of \({\varvec{x}}^{(i,u_i)}\) have the same VCI \({\varvec{c}}_k\) and CCI \({\varvec{\lambda }}_k\). The \(\varvec{\lambda }_k\), \({\varvec{c}}_k\), and the CCV \({\varvec{\gamma }}_{k,j}^{(i,u_i)}\) of each sub-matrix can be calculated by

$$\begin{aligned}&{\varvec{\lambda }}_{k,:}&= g\left( {\varvec{x}}_{0::2^k,:}^{(i,u_i=0)} : {\varvec{x}}_{0::2^k,:}^{(i,u_i=1)},{\varvec{c}}_k \right) , 0\le k \le n , \end{aligned}$$
(16)
$$\begin{aligned}&{\varvec{c}}_k&=\left\{ \begin{array}{lr} 1_0^{l-1}, k=0 ,&{}\\ {\varvec{c}}_{k-1}\oplus {\varvec{\lambda }}_{k-1,:}, 0<k\le n, &{} \end{array}\right. \end{aligned}$$
(17)
$$\begin{aligned}&{\varvec{\gamma }}_{k,j}^{(i,u_i)}&={\varvec{\lambda }}_{k,:}\wedge {\varvec{x}}_{j,:}^{(i,u_i)}, 0 \le j < 2^{k+1}. \end{aligned}$$
(18)

By doing so, a common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) can be constructed as shown in Fig. 1.

Fig. 1 The common column binary tree of \({\varvec{x}}^{(i,u_i)}\): \({\varvec{\lambda }}_k\) denotes the CCI for stage k, and \(\gamma\) represents the CCV of \({\varvec{x}}_{j::2^k,:}^{(i,u_i)}\)

Fig. 2 The common column binary tree of \({\varvec{x}}^{(2,0)}\) and \({\varvec{x}}^{(2,1)}\): the left part is for \({\varvec{x}}^{(2,0)}\) and the right part is for \({\varvec{x}}^{(2,1)}\)

Fig. 3 The upper Bhattacharyya parameter bounds of \(G_5\): Z(W) denotes the Bhattacharyya parameter of the transmission channel W

Suppose that

$$\begin{aligned} s \left( {\varvec{\gamma }}_{k,j}^{(i,u_i)}\right) =\left\{ \begin{array}{lr} f\left( {\varvec{\gamma }}_{k+1,2j}^{(i,u_i)},{\varvec{c}}_{k+1}\right) \cdot s \left( {\varvec{\gamma }}_{k+1,2j}^{(i,u_i)}\right) + f\left( {\varvec{\gamma }}_{k+1,2j+1}^{(i,u_i)},{\varvec{c}}_{k+1}\right) \cdot s \left( {\varvec{\gamma }}_{k+1,2j+1}^{(i,u_i)}\right) , k <n, &{}\\ f \left( {\varvec{\gamma }}_{k,j}^{(i,u_i)},{\varvec{c}}_k \right) , k=n, &{} \end{array}\right. \end{aligned}$$
(19)

if there are equal paths from the child nodes of \({\varvec{\gamma }}_{k,j1}^{(i,e1)}\) and \({\varvec{\gamma }}_{k,j2}^{(i,e2)}\) to their corresponding leaf nodes in the tree shown in Fig. 1, then \(s\left( {\varvec{\gamma }}_{k,j1}^{(i,e1)}\right) =s\left( {\varvec{\gamma }}_{k,j2}^{(i,e2)}\right)\) where e1 and e2 are the instantiated values of \(u_i\).

According to (19), the expression of \(Z_l^{(i)}\) in (10) can be transformed into

$$\begin{aligned} Z_l^{(i)}= \frac{1}{2^{n}}\cdot \sum _{y_0^{l-1}}\sqrt{f\left( {\varvec{\gamma }}_{0,0}^{(i,0)},{\varvec{c}}_0 \right) \cdot s\left( {\varvec{\gamma }}_{0,0}^{(i,0)} \right) \cdot f\left( {\varvec{\gamma }}_{0,0}^{(i,1)},{\varvec{c}}_0 \right) \cdot s\left( {\varvec{\gamma }}_{0,0}^{(i,1)} \right) }. \end{aligned}$$
(20)

Considering (8) and that \(\sum _{y_i\in Y} {W(y_i|x_i)}=1\), we define \({\varvec{h}} = {\varvec{\gamma }}_{0,0}^{(i,0)} \oplus {\varvec{\gamma }}_{0,0}^{(i,1)} \wedge {\varvec{c}}_0\). It can be derived that \(\sum _{y_0^{l-1}} \sqrt{f\left( {\varvec{\gamma }}_{0,0}^{(i,0)},{\varvec{c}}_0 \right) \cdot f\left( {\varvec{\gamma }}_{0,0}^{(i,1)},{\varvec{c}}_0 \right) }=Z^{\sum {\varvec{h}}}(W)\). Thus, the expression of \(Z_l^{(i)}\) in (20) can be further transformed into:

$$\begin{aligned} Z_l^{(i)}= \frac{1}{2^{n}}\cdot Z^{\sum {\varvec{h}}}(W) \cdot \sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{0,0}^{(i,0)} \right) \cdot s\left( {\varvec{\gamma }}_{0,0}^{(i,1)} \right) }. \end{aligned}$$
(21)

The common column binary tree of \({\varvec{x}}^{(i,u_i)}\) shown in Fig. 1 has the following properties.

Property 2

If \(s\left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) =s\left( {\varvec{\gamma }}_{1,1}^{(i,0)}\right)\), then \(Z_l^{(i)}\) in (21) has a common factor defined by

$$\begin{aligned} CM1=\sum _{y_0^{l-1}}\sqrt{\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{c}}_1 \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{c}}_1\right) \right] \cdot \left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,1)},{\varvec{c}}_1 \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,1)},{\varvec{c}}_1 \right) \right] } \end{aligned}$$
(22)

that matches (14).

Property 3

If \(s\left( {\varvec{\gamma }}_{2,0}^{(i,0)}\right) =s\left( {\varvec{\gamma }}_{2,1}^{(i,0)}\right)\) and \({\varvec{\gamma }}_{2,0}^{(i,0)}+{\varvec{\gamma }}_{2,1}^{(i,0)}={\varvec{\gamma }}_{2,2}^{(i,0)}+{\varvec{\gamma }}_{2,3}^{(i,0)}\), then \(Z_l^{(i)}\) in (21) has a common factor defined by

$$\begin{aligned} CM2=\sum _{y_0^{l-1}}\sqrt{\left[ f \left( {\varvec{\gamma }}_{2,0}^{(i,0)},{\varvec{c}}_2 \right) +f \left( {\varvec{\gamma }}_{2,1}^{(i,0)},{\varvec{c}}_2 \right) \right] \cdot \left[ f \left( {\varvec{\gamma }}_{2,0}^{(i,1)},{\varvec{c}}_2 \right) +f\left( {\varvec{\gamma }}_{2,1}^{(i,1)},{\varvec{c}}_2\right) \right] } \end{aligned}$$
(23)

that matches (14).

Since \(s\left( {\varvec{\gamma }}_{2,0}^{(i,0)}\right)\) and \(s\left( {\varvec{\gamma }}_{2,1}^{(i,0)}\right)\) are mutually different vectors with the VCI of \({\varvec{c}}_2\), CM1 and CM2 cannot coexist. Whichever of them exists can be extracted from \(Z_l^{(i)}\) shown in (21) so that its upper bound is calculated separately. After the extraction, \({\varvec{\gamma }}^{(i,u_i)}\) should be reconstructed for the remaining terms of \(Z_l^{(i)}\) shown in (21), as explained in detail later.

Theorem 1

For the reconstructed \({\varvec{\gamma }}^{(i,u_i)}\), if there exists \(j \in \{0,1\}\) such that \(s \left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) =s \left( {\varvec{\gamma }}_{1,j}^{(i,1)} \right)\) holds, then \(\sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{0,0}^{(i,0)}\right) \cdot s \left( {\varvec{\gamma }}_{0,0}^{(i,1)}\right) }\le Z_0\), where \(Z_0\) can be iteratively calculated by

$$\begin{aligned} Z_k=\left\{ \begin{array}{lr} 2\cdot Z^{\sum {\varvec{\lambda }}^*}(W)\cdot \left( 1-Z^{\sum {\varvec{\lambda }}'}(W) \right) \cdot Z_1+2^n \cdot Z^{\sum {\varvec{ \lambda }}'}(W), k=0, &{}\\ 2\cdot \left( 1-Z^{\sum {\varvec{\lambda }}_{k,:}}(W) \right) \cdot Z_{k+1}+2^{n-k+1} \cdot Z^{\sum {\varvec{\lambda }}_{k,:}} (W), k\in (0,n-1),&{}\\ 2\cdot Z^{ \sum \hat{{\varvec{\lambda }}}}(W)+2\cdot Z^{\sum {\varvec{ \lambda }}_{k+1}-\sum \hat{{\varvec{ \lambda }}}}(W)-2\cdot Z^{\sum {\varvec{ \lambda }}_{k+1}}(W), k=n-1, &{} \end{array}\right. \end{aligned}$$
(24)

and \(\hat{{\varvec{ \lambda }}}={\varvec{\gamma }}_{n,0}^{(i,0)}\oplus {\varvec{\gamma }}_{n,2}^{(i,0)}\), \({\varvec{\lambda }}^*={\varvec{\gamma }}_{1,0}^{(i,0)}\oplus {\varvec{ \gamma }}_{1,1-j}^{(i,1)}\wedge {\varvec{\lambda }}_{1,:}\), and \({\varvec{\lambda '}}={\varvec{\lambda }}_{1,:}\oplus {\varvec{ \lambda }}^*\).

3.2 The calculation of the upper bound

According to Property 2, Property 3 and Theorem 1, we construct the following process to calculate the upper bound of \(Z_l^{(i)}\) for a general polar code kernel \(G_l\).

3.2.1 The common column binary tree construction

The k-order common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) of \({\varvec{x}}^{(i,u_i)}\) can be constructed stage by stage according to (16), (17) and (18). As shown in Fig. 1, the item \({\varvec{\gamma }}_{n,j}^{(i,u_i)}\) in the last stage corresponds to \({\varvec{x}}_{r(j),:}^{(i,u_i)}\); thus, \({\varvec{\gamma }}^{(i,u_i)}\) can be constructed rapidly from right to left.

3.2.2 The common factor extraction

According to Property 2 and Property 3, we construct Algorithm 1 to extract the common factor of \(Z_l^{(i)}\), namely the CM1 or the CM2. The input parameters include the common column binary tree \({\varvec{\gamma }}^{(i,u_i)}\) and its CCI matrix \({\varvec{\lambda }}\). The output is a vector \(({\varvec{c}}^*,{\varvec{r}}_0,{\varvec{r}}_1,{\varvec{r}}_2,{\varvec{r}}_3)\), where \({\varvec{r}}_i\) corresponds to \({\varvec{x}}_{i,:}\) in (15), and \({\varvec{c}}^*\) denotes the VCI of \({\varvec{r}}_i\). If neither CM1 nor CM2 exists, the return value of \({\varvec{c}}^*\) is \(0_0^{l-1}\). In Step 5, \({\varvec{\gamma }}^{(i,u_i)}\) is reconstructed due to the common factor extraction.

Algorithm 1 The common factor extraction (figure a)

3.2.3 The calculation feasibility testing

According to the conditions of Theorem 1, for the reconstructed \({\varvec{\gamma }}^{(i,e)}\) after conducting the common factor extraction mentioned above, if there exists j such that \(s( {\varvec{\gamma }}_{1,0}^{(i,0)})=s({\varvec{\gamma }}_{1,j}^{(i,1)})\), it is feasible to employ the proposed method to calculate the upper Bhattacharyya parameter bound. Otherwise, the upper bound cannot be calculated with the proposed method.

3.2.4 The upper bound calculation based on pattern matching

The upper Bhattacharyya parameter bound consists of two parts: (1) the bounds of the CM1 and the CM2 and (2) the bounds of the remaining part of the reconstructed \({\varvec{\gamma }}^{(i,u_i)}\). The upper bounds of these two parts can be calculated by matching (14) and Theorem 1, respectively.

Thus, the upper bound of \(Z_l^{(i)}\) can be calculated as

$$\begin{aligned} Z_l^{(i)-}=2^{-n}\cdot Z^{\sum {\varvec{h}}} \left( W \right) \cdot Z^{'}\cdot Z_0, \end{aligned}$$
(25)

where \(Z^{'}\) denotes the upper bound part contributed by both CM1 and CM2, which can be calculated according to (14) where \(Z^{'}=2\cdot Z^{\sum {\varvec{\lambda }}^*}(W)+2\cdot Z^{\sum {\varvec{c}}^*-\sum {\varvec{ \lambda }}^*}(W)-2\cdot Z^{\sum {\varvec{c}}^*}(W)\) if CM1 or CM2 exists, otherwise \(Z'\) is set to 1.

4 Illustrative examples

In this section, we utilize the following \(5\times 5\) polar code kernel as an illustrative example to demonstrate the computation of the upper bound of \(Z_5^{(2)}\):

$$\begin{aligned} G_5=\left[ \begin{array}{cccccccc} 1&{}0&{}0&{}0&{}0\\ 1&{}1&{}0&{}0&{}0\\ 1&{}0&{}1&{}0&{}0\\ 1&{}0&{}0&{}1&{}0\\ 1&{}1&{}1&{}0&{}1 \end{array}\right] . \end{aligned}$$
(26)

Prior to conducting the upper bound computations, data initialization is performed for parameters such as \(n=l-i-1=2\), \(N=2^n=4\), \({\varvec{x}}^{(2,0)}=[00000, 11101, 10010, 01111]\), and \({\varvec{x}}^{(2,1)}=[10100, 01001, 00110, 11011]\). The upper bound computation is performed by following the four main steps provided below:

  1. The common column binary tree construction. We construct the common column binary tree of \({\varvec{x}}^{(2,u_2)}\) shown in Fig. 2 according to (16), (17) and (18). The value of \({\varvec{h}}\) in (21) is calculated as [00000].

  2. The common factor extraction. As calculated by Algorithm 1, \({\varvec{x}}^{(2,u_i)}\) has no common factor, i.e., \({\varvec{c}}^*=\)[00000].

  3. The calculation feasibility testing. Since \(s\left( {\varvec{\gamma }}_{1,0}^{(2,0)} \right) =s \left( {\varvec{\gamma }}_{1,1}^{(2,1)} \right)\), it is feasible to calculate the upper bound with \(j=1\).

  4. The upper bound calculation by pattern matching. Since \({\varvec{x}}^{(2,u_i)}\) has no common factor, \(Z'\) in (25) is assigned to one.

With \(j=1\), we calculate \({\varvec{\lambda }}^*={\varvec{\gamma }}_{1,0}^{(i,0)}\oplus {\varvec{\gamma }}_{1,1-j}^{(i,1)} \wedge {\varvec{ \lambda }}_{1,:} =[00100]\) and \({\varvec{\lambda '}}={\varvec{\lambda }}_{1,:}\oplus {\varvec{\lambda }}^*=[01001]\).

According to (24), we compute \(Z_1=2\cdot Z(W)+2\cdot Z^2 (W)-2\cdot Z^3 (W)\) and \(Z_0=4\cdot Z^5 (W)-8\cdot Z^4 (W)-4\cdot Z^3 (W)+12\cdot Z^2 (W)\).

According to (25), the upper bound of \(Z_5^{(2)}\) is computed by

$$\begin{aligned} Z_5^{(2)-}=2^{-2}\cdot Z^{\sum {\varvec{h}}}(W)\cdot Z' \cdot Z_0=Z^5 (W)-2\cdot Z^4 (W)-Z^3 (W)+3\cdot Z^2 (W). \end{aligned}$$
(27)
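The initialization data of this example can be reproduced with a few lines of code. The following is a minimal sketch, assuming that the rows of \({\varvec{x}}^{(i,u_i)}\) are obtained by multiplying the input vectors \((0,\ldots ,0,u_i,u_{i+1},\ldots ,u_{l-1})\) by \(G_5\) over GF(2), with the free bits enumerated in increasing binary order; the function name `channel_inputs` is hypothetical and not part of the paper. The last lines only re-apply the scaling \(2^{-2}\) of (27) to the coefficients of \(Z_0\) quoted above.

```python
from itertools import product

# Kernel G_5 from (26), one row per list entry.
G5 = [
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
]


def channel_inputs(G, i, u_i):
    """Return the rows of x^{(i,u_i)} as bit strings (hypothetical helper).

    Assumption: the input vector is (0,...,0,u_i,u_{i+1},...,u_{l-1}) and the
    free bits u_{i+1}..u_{l-1} run through all values in binary order.
    """
    l = len(G)
    rows = []
    for tail in product([0, 1], repeat=l - i - 1):
        u = [0] * i + [u_i] + list(tail)
        # Vector-matrix product over GF(2).
        x = [sum(u[r] * G[r][c] for r in range(l)) % 2 for c in range(l)]
        rows.append("".join(map(str, x)))
    return rows


x20 = channel_inputs(G5, 2, 0)
x21 = channel_inputs(G5, 2, 1)
print(x20)
print(x21)

# Coefficients of Z_0 quoted in the text (power -> coefficient), scaled by
# 2^{-n} with n = 2 as in (27); h = [00000] and Z' = 1 contribute no factor.
Z0 = {5: 4, 4: -8, 3: -4, 2: 12}
bound = {power: coeff / 2**2 for power, coeff in Z0.items()}
print(bound)
```

Under these assumptions the two printed lists coincide with \({\varvec{x}}^{(2,0)}\) and \({\varvec{x}}^{(2,1)}\) given in the initialization, and the scaled coefficients coincide with the right-hand side of (27).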

Similarly, the upper Bhattacharyya parameter bounds of the other channels of \(G_5\), listed in “Appendix 6”, are illustrated in Fig. 3. It can be seen from the figure that the reliability of all channels except for channel 0 is significantly improved compared to the transmission channel W when \(Z(W)<0.23\).

In the “Appendix,” we provide the Bhattacharyya parameter bounds of the polar code kernels with a dimension varying from 2 to 6 listed in [22]. According to [1], when W is a BEC, all the polar code kernels’ Bhattacharyya parameters take their upper limits and satisfy the equality defined by

$$\begin{aligned} \sum _{i=0}^{l-1}Z_i=l\cdot Z(W). \end{aligned}$$
(28)

Since all the upper Bhattacharyya parameter bounds of the polar code kernels with dimension \(l(\in [2,6])\) listed in “Appendix 6” satisfy (28), the correctness of the results generated by the proposed method is verified.
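The check against (28) is easy to mechanize. The sketch below is an illustration only: it copies the bound expressions of \(G_3\), \(G_5\) and \(G_6\) from “Appendix 6” as coefficient lists and tests whether they add up to \(l\cdot Z(W)\). The names `KERNEL_BOUNDS`, `coefficient_sum` and `satisfies_eq28` are hypothetical.

```python
# Bound expressions from "Appendix 6" as coefficient lists [c_0, c_1, ..., c_l],
# where c_p multiplies Z^p. Only three kernels are transcribed here.
KERNEL_BOUNDS = {
    3: [
        [0, 3, -3, 1],
        [0, 0, 2, -1],
        [0, 0, 1, 0],
    ],
    5: [
        [0, 5, -10, 10, -5, 1],
        [0, 0, 6, -9, 5, -1],
        [0, 0, 3, -1, -2, 1],
        [0, 0, 1, 0, 1, -1],
        [0, 0, 0, 0, 1, 0],
    ],
    6: [
        [0, 6, -15, 20, -15, 6, -1],
        [0, 0, 9, -18, 15, -6, 1],
        [0, 0, 4, -2, -4, 4, -1],
        [0, 0, 0, 0, 4, -4, 1],
        [0, 0, 2, 0, -1, 0, 0],
        [0, 0, 0, 0, 1, 0, 0],
    ],
}


def coefficient_sum(polys):
    """Add the polynomials coefficient by coefficient."""
    return [sum(column) for column in zip(*polys)]


def satisfies_eq28(l, polys):
    """True if the bounds sum to l * Z, i.e. to the coefficient list [0, l, 0, ..., 0]."""
    return coefficient_sum(polys) == [0, l] + [0] * (l - 1)


for l, polys in KERNEL_BOUNDS.items():
    print(l, satisfies_eq28(l, polys))
```

Because the comparison is done on integer coefficients, no numerical tolerance is needed.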

5 Results and discussion

In this section, we first summarize the results of this article and then discuss the computational complexity, the application scope of the proposed method and the polarization effects of some multi-kernel polar codes. Finally, we make some comparisons between the proposed method and the schemes in [3, 20] and demonstrate the possible application of the results of this paper in the construction of multi-kernel polar codes.

5.1 The results and computational complexity

No experiments were carried out, since the paper consists mainly of theoretical derivation and analysis.

As a result of theoretical reasoning, we gave a computation process based on the construction of a common column binary tree and pattern matching, and proved that the resulting upper bounds are tight.

For the calculation of the upper Bhattacharyya parameter bounds for a polar code kernel \(G_l\) with dimension \(l\), the main part is to construct the sub-matrix common column binary tree of \({\varvec{x}}^{(i,u_i)}\), which needs to traverse a total of \(2^{l-i-1}+2^{l-i-2}+\cdots +2^0\) nodes. Thus, the computational complexity is \(O(2^l)\).
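The node count is a geometric series. As a short worked step (the final inequality, which uses \(i\ge 0\), is our own remark and is not stated in the derivation above):

$$\begin{aligned} \sum _{k=0}^{l-i-1}2^{k}=2^{l-i}-1\le 2^{l}-1. \end{aligned}$$

For instance, for \(l=5\) and \(i=2\) the sum is \(4+2+1=7\).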

5.2 The scope of application

The computation of the upper Bhattacharyya parameter bounds, however, needs to meet certain conditions, which are checked by the calculation feasibility test in this paper. It is pointed out in [22] that there is more than one form of polar kernels of dimension \(l(>2)\) with the same exponent. As an illustrative example, consider two \(6\times 6\) polar kernels,

\(G_6^{(0)}=\left[ \begin{array}{ccccccccccc} 1&{}0&{}0&{}0&{}0&{}0\\ 1&{}1&{}0&{}0&{}0&{}0\\ 0&{}1&{}1&{}0&{}0&{}0\\ 1&{}0&{}0&{}1&{}0&{}0\\ 1&{}1&{}0&{}1&{}1&{}0\\ 0&{}1&{}1&{}0&{}1&{}1 \end{array}\right]\) in “Appendix 4” and \(G_6^{(1)}=\left[ \begin{array}{ccccccccccc} 1&{}0&{}0&{}0&{}0&{}0\\ 1&{}1&{}0&{}0&{}0&{}0\\ 1&{}0&{}1&{}0&{}0&{}0\\ 1&{}0&{}0&{}1&{}0&{}0\\ 1&{}1&{}1&{}0&{}1&{}0\\ 1&{}1&{}0&{}1&{}0&{}1 \end{array}\right]\) in [22], both of which have an exponent of 0.451328. The upper Bhattacharyya parameter bound of each channel of \(G_6^{(0)}\) can be calculated by the method proposed in this paper. However, the calculation for channels 1, 2 and 3 of \(G_6^{(1)}\) is not feasible according to Theorem 1. Given this, one can search for the polar code kernels whose upper Bhattacharyya parameter bounds can be computed by the proposed method.

5.3 Polarization effect

According to [1], the polarization effect of the original polar codes improves as the code lengths increase. In order to examine such effects of multi-kernel polar codes, \(G_2\) and \(G_5\) in “Appendix 6” are employed as illustrative examples to construct multi-kernel polar codes with length \(N=\)100, 500 and 1000.

The polarization effect of these multi-kernel polar codes and of the original polar codes with code lengths N=128, 512 and 1024 is illustrated in Fig. 4 for the case where W is a BEC with erasure probability \(\epsilon\)=0.5. The symmetric capacity values are computed according to [1] based on the upper Bhattacharyya parameter bounds of \(G_2\) and \(G_5\) in “Appendix 6”.

The result in Fig. 4 shows that the multi-kernel polar codes composed of \(G_2\) and \(G_5\) have a polarization effect similar to that of the original polar codes.
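A computation of this kind can be sketched as follows for the BEC, where the bounds of “Appendix 6” are attained. This is only an illustration under stated assumptions: the kernel polynomials are applied stage by stage to every current value, the order of the kernels in the list `[5, 5, 2, 2]` is our assumption (a different order may change individual values), and the threshold `delta` used to call a channel polarized is arbitrary. The function names are hypothetical.

```python
# Bound expressions of G_2 and G_5 from "Appendix 6" as coefficient lists
# [c_0, c_1, ...], where c_p multiplies Z^p.
KERNEL_BOUNDS = {
    2: [[0, 2, -1], [0, 0, 1]],
    5: [
        [0, 5, -10, 10, -5, 1],
        [0, 0, 6, -9, 5, -1],
        [0, 0, 3, -1, -2, 1],
        [0, 0, 1, 0, 1, -1],
        [0, 0, 0, 0, 1, 0],
    ],
}


def evaluate(coeffs, z):
    """Evaluate a polynomial given by its coefficient list at z."""
    return sum(c * z**power for power, c in enumerate(coeffs))


def polarize(kernel_dims, z_w):
    """Apply one kernel per stage; returns the values of all synthesized channels.

    Assumption: W is a BEC, so each value is the exact Bhattacharyya parameter
    (erasure probability) and can be fed into the next stage.
    """
    values = [z_w]
    for l in kernel_dims:
        values = [evaluate(poly, z) for z in values for poly in KERNEL_BOUNDS[l]]
    return values


def polarized_fraction(values, delta=0.01):
    """Fraction of channels whose value is within delta of 0 or of 1."""
    return sum(z < delta or z > 1 - delta for z in values) / len(values)


z_100 = polarize([5, 5, 2, 2], 0.5)   # N = 100
z_128 = polarize([2] * 7, 0.5)        # N = 128
print(len(z_100), polarized_fraction(z_100))
print(len(z_128), polarized_fraction(z_128))
```

For a BEC the symmetric capacity of each synthesized channel is one minus its value, so the same list yields capacity curves like those compared in Fig. 4. By (28), the values of each code sum to \(N\cdot \epsilon\) regardless of the kernel order, which gives a simple sanity check.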

Fig. 4 Polarization effect of some polar codes: \(G_{128}=G_2^{\otimes 7}\), \(G_{512}=G_2^{\otimes 9}\), \(G_{1024}=G_2^{\otimes 10}\), \(G_{100}=G_5^{\otimes 2} \otimes G_2^{\otimes 2}\), \(G_{500}=G_5^{\otimes 3} \otimes G_2^{\otimes 2}\) and \(G_{1000}=G_5^{\otimes 3} \otimes G_2^{\otimes 3}\)

Fig. 5 Tanner graph of the multi-kernel polar code with transformation matrix \(G_{24}\): \(G_{24}=B_{24} \cdot G_2^{\otimes 2} \otimes G_3 \otimes G_2\), where the mixed-nary representations of the channel indexes are listed on the left

Fig. 6 Channel order graph of a multi-kernel polar code: \(G_{24}=B_{24} \cdot G_2^{\otimes 2} \otimes G_3 \otimes G_2\)

5.4 Comparisons and analyses

The upper Bhattacharyya parameter bounds of a kernel are pertinent to its partial distance [3]. By utilizing \(G_5\) in (26) as an illustrative example, we compare the upper bounds computed by the proposed method and those by [3], where the partial distance of \(G_5\) is (1, 2, 2, 2, 4) in [3]. Table 1 shows the upper Bhattacharyya parameter bounds of \(G_5\) computed by the two methods.

Table 1 Comparison of the upper Bhattacharyya parameter bounds of \(G_5\), where Z is the abbreviation of Z(W)

As shown in Table 1, for \(Z\in [0,1]\), the upper bounds for each channel, except for channel 4, of \(G_5\) provided by the method proposed in this paper are tighter than those of [3].

When compared to [20], the upper bounds of the kernels with dimensions three and four calculated by the proposed method are the same as the expressions given in [20] under the title of Bhattacharyya parameter expression under BECs. However, for kernels with dimension \(l\ge 5\), it is difficult to compute the upper bounds by the method in [20], since it is essentially an exhaustive computational method.

5.5 Illustrative application in construction of multi-kernel polar codes

Multi-kernel polar codes allow more flexible code lengths than the original polar codes [7] adopted for the enhanced mobile broadband (eMBB) control channel of the 5th generation (5G) of wireless communications. Here we consider the application of upper Bhattacharyya parameter bounds in the construction of multi-kernel polar codes for BECs, which requires selecting the most reliable ones from all polarized channels to transmit information [1].

Taking the instantiated expression

$$\begin{aligned} G_{24}=B_{24} \cdot G_{2}^{\otimes 2} \otimes G_{3} \otimes G_{2} \end{aligned}$$
(29)

of \(G_N\) in (1) as an example, the Tanner graph shown in Fig. 5 of the multi-kernel polar code with the generator matrix of \(G_{24}\) can be constructed as in [7]. As shown in Fig. 5 for \(G_{24}\), the Tanner graph of \(G_N\) can be divided into n stages indexed from right to left.

For a multi-kernel polar code P with a generator matrix shown in (1), the following property and theorems can be established.

Property 1

For a channel of P indexed by i, whose mixed-nary representation under base vector \(l_0^{n-1}\) is \((m_{n-1}m_{n-2}\ldots m_{0})|_{l_0^{n-1}}\), the digit \(m_k\) (\(0 \le k \le n-1\)) corresponds to the subchannel \(m_k\) of the polar code kernel \(G_{l_k}\) at stage \(l_k\).

Since Property 1 is simply derived from the Tanner graph of the polar code as shown in Fig. 5, the proof is omitted.
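The mixed-nary representation used in Property 1 is an ordinary mixed-radix expansion. The following minimal sketch assumes that the base vector is written with the most significant base first, as in \([2,2,3,2]\) for \(G_{24}\) in (29); the function name `mixed_radix_digits` is hypothetical.

```python
def mixed_radix_digits(index, bases):
    """Return (m_{n-1}, ..., m_0) for a channel index.

    Assumption: bases = [l_{n-1}, ..., l_0], most significant base first.
    """
    digits = []
    for base in reversed(bases):
        # Peel off the least significant digit first.
        index, digit = divmod(index, base)
        digits.append(digit)
    if index:
        raise ValueError("index does not fit into the given bases")
    return tuple(reversed(digits))


for channel in (17, 22, 23):
    print(channel, mixed_radix_digits(channel, [2, 2, 3, 2]))
```

With this convention, channels 17, 22 and 23 of \(G_{24}\) map to the digit tuples (1, 0, 2, 1), (1, 1, 2, 0) and (1, 1, 2, 1), respectively.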

Theorem 2

For two polarized channels of P indexed by i and j, whose mixed-nary representations are \([M(p,l_{s+1}^{n-1} ),a,M(q,l_0^{s-1} )]\) and \([M(p,l_{s+1}^{n-1} ),b,M(q,l_0^{s-1} )],\) respectively, if the Bhattacharyya parameters of kernel \(G_{l_s}\) satisfy \(Z_{l_s}^{(a)} \le Z_{l_s}^{(b)}\), then \(Z_N^{(i)} \le Z_N^{(j)}\) holds.

Theorem 3

For two polarized channels of P indexed by i and j, where \(G_{l_s}\) and \(G_{l_t}\) (\(0 \le s < t \le n-1\)) are both equal to \(G_2\) listed in “Appendix 6”, if the mixed-nary representations for i and j can be expressed as \([M(p,l_{t+1}^{n-1} ),1,M(q,l_{s+1}^{t-1} ),0,M(r,l_0^{s-1} )]\) and \([M(p,l_{t+1}^{n-1} ),0,M(q,l_{s+1}^{t-1} ),1,M(r,l_0^{s-1} )],\) respectively, then \(Z_N^{(i)} \le Z_N^{(j)}\) holds.

It can be deduced that if two polarized channels satisfy Theorem 2 or 3, then one of them is always more reliable than the other, independently of the transmission channel W. This property can be employed to simplify the construction of multi-kernel polar codes as in [15] and [16].

Employing \(G_2\) and \(G_3\) in “Appendix 6” for \(G_{24}\) in (29) as an example, it can be easily inferred from the upper Bhattacharyya parameter bound expressions listed in “Appendix 6” that \(Z_2^{(0)-} \ge Z_2^{(1)}\) and \(Z_3^{(0)-} \ge Z_3^{(1)-} \ge Z_3^{(2)}\). Since the transmission channel W is a BEC, \(W_2^{(0)} \le W_2^{(1)}\) and \(W_3^{(0)} \le W_3^{(1)} \le W_3^{(2)}\) hold in terms of reliability according to [1]. For channels 17, 22 and 23 of \(G_{24}\), the mixed-nary representations under the base vector [2, 2, 3, 2] are \((1021)|_{[2,2,3,2]}\), \((1120)|_{[2,2,3,2]}\) and \((1121)|_{[2,2,3,2]}\), respectively. Therefore, channel 23 is superior to channel 22 in reliability according to Theorem 2, and channel 22 is superior to channel 17 according to Theorem 3. Applying Theorems 2 and 3 similarly to the remaining polarized channels, a partial order graph of \(G_{24}\) can be constructed as shown in Fig. 6, where \(A \rightarrow B\) denotes that channel A is superior to channel B in terms of reliability. The relative reliability of polarized channels at the same level in Fig. 6 remains undetermined; it can be further determined by other methods such as the distance principle in [7].
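The pairwise comparisons in this example can be sketched in code. The snippet below is a hedged illustration, not the construction procedure of the paper: it assumes the base vector \([2,2,3,2]\) with the most significant base first, uses the kernel orderings stated above (within \(G_2\) and \(G_3\) a larger sub-channel index is at least as reliable), and returns only the relations that follow directly from Theorem 2 or Theorem 3, without taking a transitive closure. All function names are hypothetical.

```python
BASES = [2, 2, 3, 2]  # assumed order [l_3, l_2, l_1, l_0] for G_24 in (29)


def digits(index, bases=BASES):
    """Mixed-radix digits [m_{n-1}, ..., m_0] of a channel index."""
    out = []
    for base in reversed(bases):
        index, d = divmod(index, base)
        out.append(d)
    return out[::-1]


def at_least_as_reliable(i, j, bases=BASES):
    """True if Theorem 2 or 3 directly gives Z_N^{(i)} <= Z_N^{(j)}."""
    a, b = digits(i, bases), digits(j, bases)
    diff = [p for p in range(len(bases)) if a[p] != b[p]]
    if len(diff) == 1:
        # Theorem 2, using Z^{(0)} >= Z^{(1)} (>= Z^{(2)}) for G_2 and G_3.
        return a[diff[0]] > b[diff[0]]
    if len(diff) == 2:
        # Theorem 3: list position t is the more significant digit.
        t, s = diff
        both_g2 = bases[t] == 2 and bases[s] == 2
        return both_g2 and (a[t], a[s], b[t], b[s]) == (1, 0, 0, 1)
    return False


edges = [(i, j) for i in range(24) for j in range(24) if at_least_as_reliable(i, j)]
print((23, 22) in edges, (22, 17) in edges, (17, 22) in edges)
```

The pairs (23, 22) and (22, 17) appear among the returned relations, in line with the two comparisons made in the text, while the reverse pair (17, 22) does not.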

With these partial order results, comparing the reliability of all polarized channels under the transmission channel W is no longer computationally heavy, which simplifies the construction of multi-kernel polar codes.

It should be noted that the example only applies to BECs. For B-DMCs, the lower Bhattacharyya parameter bounds of the used polar code kernels should be investigated at the same time.

6 Conclusions

In this paper, we proposed a novel method to compute the tight upper Bhattacharyya parameter bounds of polar code kernels of any dimension, providing a theoretical basis for the reliability comparison of the polarized channels in the construction of the polar codes. The computation of the upper Bhattacharyya parameter bounds can be applied to some standard polarization kernels utilizing the construction of the sub-matrix common column tree of the channel inputs. Future studies should focus on searching for the standard polar code kernels that are suitable for the upper bound computation method of this paper or devising an improved method that is suitable for any standard polar code kernels.

Availability of data and materials

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Abbreviations

AWGN: Additive White Gaussian Noise

BEC: Binary Erasure Channel

B-DMC: Binary Input Discrete Memoryless Channel

CCI: Common Column Indicator

CCV: Common Column Vector

eMBB: Enhanced Mobile Broadband

PW: Polarization Weight

SC: Successive Cancellation

SNR: Signal-to-Noise Ratio

VCI: Valid Column Indicator

References

  1. E. Arikan, Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55(7), 3051–3073 (2009). https://doi.org/10.1109/TIT.2009.2021379

  2. 3GPP: 5G; NR; Multiplexing and channel coding. Technical Specification (TS) 38.212, 3rd Generation Partnership Project (3GPP). Version 15.1.0 (2018)

  3. S.B. Korada, E. Şaşoğlu, R. Urbanke, Polar codes: characterization of exponent, bounds, and constructions. IEEE Trans. Inf. Theory 56(12), 6253–6264 (2010). https://doi.org/10.1109/TIT.2010.2080990

  4. M. Benammar, V. Bioglio, F. Gabry, I. Land, Multi-kernel polar codes: Proof of polarization and error exponents, in 2017 IEEE Information Theory Workshop (ITW), pp. 101–105 (2017). https://doi.org/10.1109/itw.2017.8277949

  5. N. Presman, O. Shapira, S. Litsyn, Mixed-kernels constructions of polar codes. IEEE J. Sel. Areas Commun. 34(2), 239–253 (2016). https://doi.org/10.1109/JSAC.2015.2504278

  6. F. Gabry, V. Bioglio, I. Land, J. Belfiore, Multi-kernel construction of polar codes, in 2017 IEEE International Conference on Communications Workshops (ICC Workshops), pp. 761–765 (2017). https://doi.org/10.1109/ICCW.2017.7962750

  7. V. Bioglio, F. Gabry, I. Land, J. Belfiore, Multi-kernel polar codes: concept and design principles. IEEE Trans. Commun. (2020). https://doi.org/10.1109/TCOMM.2020.3006212

  8. M. Hanif, M. Ardakani, Polar codes: bounds on Bhattacharyya parameters and their applications. IEEE Trans. Commun. 66(12), 5927–5937 (2018). https://doi.org/10.1109/TCOMM.2018.2867475

  9. R. Mori, T. Tanaka, Performance and construction of polar codes on symmetric binary-input memoryless channels, in 2009 IEEE International Symposium on Information Theory, pp. 1496–1500 (2009). https://doi.org/10.1109/ISIT.2009.5205857

  10. R. Mori, T. Tanaka, Performance of polar codes with the construction using density evolution. IEEE Commun. Lett. 13(7), 519–521 (2009). https://doi.org/10.1109/LCOMM.2009.090428

  11. P. Trifonov, Efficient design and decoding of polar codes. IEEE Trans. Commun. 60(11), 3221–3227 (2012). https://doi.org/10.1109/TCOMM.2012.081512.110872

  12. I. Tal, A. Vardy, How to construct polar codes. IEEE Trans. Inf. Theory 59(10), 6562–6582 (2013). https://doi.org/10.1109/TIT.2013.2272694

  13. P. Trifonov, On construction of polar subcodes with large kernels, in 2019 IEEE International Symposium on Information Theory (ISIT), pp. 1932–1936 (2019). https://doi.org/10.1109/ISIT.2019.8849672

  14. H. Vangala, E. Viterbo, Y. Hong, A comparative study of polar code constructions for the AWGN channel. arXiv: InformationTheory (2015)

  15. C. Schurch, A partial order for the synthesized channels of a polar code, in 2016 IEEE International Symposium on Information Theory (ISIT), pp. 220–224 (2016). https://doi.org/10.1109/ISIT.2016.7541293

  16. W. Wu, P.H. Siegel, Generalized partial orders for polar code bit-channels. IEEE Trans. Inf. Theory 65(11), 7114–7130 (2019). https://doi.org/10.1109/TIT.2019.2930292

  17. G. He, J. Belfiore, I. Land, G. Yang, X. Liu, Y. Chen, R. Li, J. Wang, Y. Ge, R. Zhang, et al., Beta-expansion: a theoretical framework for fast and recursive construction of polar codes, in 2017 IEEE Global Communications Conference, pp. 1–6 (2017). https://doi.org/10.1109/GLOCOM.2017.8254146

  18. S.B. Korada, Polar codes for channel and source coding. PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne(Switzerland) (2009). https://doi.org/10.5075/epfl-thesis-4461

  19. L. Zhang, Z. Zhang, X. Wang, Polar code with block-length \(N=3^n\), in 2012 International Conference on Wireless Communications and Signal Processing (WCSP), pp. 1–6 (2012). https://doi.org/10.1109/WCSP.2012.6542982

  20. L. Cheng, L. Zhang, Q. Sun, Classification of polarizing matrices based on bhattacharyya parameters, in 2018 12th IEEE International Conference on Anti-counterfeiting, Security, and Identification (ASID), pp. 159–163 (2018). https://doi.org/10.1109/ICASID.2018.8693130

  21. R. Mori, T. Tanaka, Source and channel polarization over finite fields and Reed-Solomon matrices. IEEE Trans. Inf. Theory 60(5), 2720–2736 (2014). https://doi.org/10.1109/TIT.2014.2312181

  22. H. Lin, S. Lin, K. Abdelghaffar, Linear and nonlinear binary kernels of polar codes of small dimensions with maximum exponents. IEEE Trans. Inf. Theory 61(10), 5253–5270 (2015). https://doi.org/10.1109/TIT.2015.2469298

Funding

This work is supported by the Research Fund for the Doctoral Program (JY2019B162), and in part by Research Fund for the Doctoral Program (JSY2018029).

Author information

Contributions

TZ developed the tight upper Bhattacharyya parameter bound calculation method and drafted the manuscript. SL helped to improve the calculation method and participated in drafting the manuscript. BY helped revise and improve the whole paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sensen Li.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1. Proof of Property 1

For all \(k\in [1,n)\) and all \(s,t\in [0,2^k)\), we have

$$\begin{aligned} {\varvec{v}}_{s::2^k,:}^{(i,e)}=\{{\varvec{v}}_{m\cdot 2^k+s,:}^{(i,e)}:0\le m<2^{n-k}\}, \end{aligned}$$
(30)

where \({\varvec{v}}_{m\cdot 2^k+s,:}^{(i,u_i)}=\left( 0_0^{i-1},u_i,b(m,n-k),b(s,k)\right)\).

Assume that \({\varvec{\lambda }}_k^{(s)}=g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right)\) and \({\varvec{\lambda }}_k^{(t)}=g\left( {\varvec{x}}_{t::2^k,:}^{(i,u_i)},1_0^{l-1}\right)\). According to (3), the elements of the last k columns of \({\varvec{x}}_{s::2^k,:}^{(i,e)}\) are only related to s and k, i.e., \({\varvec{\lambda }}_{k,l-k:l-1}^{(s)}=1_0^{k-1}\). Similarly, \({\varvec{\lambda }}_{k,l-k:l-1}^{(t)}=1_0^{k-1}\). Therefore, \(g\left( {\varvec{x}}_{s::2^k,:}^{(i,u_i)},1_0^{l-1}\right) \ne 0_0^{l-1}\) holds.

For \(j\in [0,l-k)\), suppose that \(\lambda _{k,j}^{(s)}=1\). Then, for all \(m\in [0,2^{n-k})\),

$$\begin{aligned} {\varvec{x}}_{m\cdot 2^k+s,j}^{(i,u_i)}=\left( \left( 0_0^{i-1},u_i,b(m,n-k),b(s,k)\right) \cdot G_l\right) _j\equiv w,w \in \{0,1\}. \end{aligned}$$
(31)

Since \({\varvec{\lambda }}_{k,l-k:l-1}^{(s)}={\varvec{\lambda }}_{k,l-k:l-1}^{(t)}=1_0^{k-1}\), then,

$$\begin{aligned} {\varvec{x}}_{m\cdot 2^k+t,j}^{(i,u_i)}=\left( \left( 0_0^{i-1},u_i,b(m,n-k),b(t,k)\right) \cdot G_l\right) _j\equiv w',w' \in \{0,1\}, \end{aligned}$$
(32)

holds, i.e., \(\lambda _{k,j}^{(t)}=1\). Therefore, \({\varvec{\lambda }}_k^{(s)}=\varvec{\lambda }_k^{(t)}\).

2. Proof of Lemma 2

Let \({\varvec{\lambda }}_0={\varvec{x}}_{0,:}\oplus {\varvec{x}}_{2,:}\) and \({\varvec{\lambda }}_1=\overline{{\varvec{\lambda }}_0}\wedge {\varvec{\lambda }}\). Since \({\varvec{x}}_{0,:}\oplus {\varvec{x}}_{1,:}={\varvec{x}}_{2,:}\oplus {\varvec{x}}_{3,:}={\varvec{c}}\), we can derive that \(f({\varvec{x}}_{i,:},{\varvec{c}})=f({\varvec{x}}_{i,:},{\varvec{\lambda }}_0)\cdot f({\varvec{x}}_{i,:},\varvec{\lambda }_1)\), \(f({\varvec{x}}_{0,:},{\varvec{\lambda }}_0)=f({\varvec{x}}_{3,:},{\varvec{\lambda }}_0)\), \(f({\varvec{x}}_{1,:},{\varvec{\lambda }}_0)=f(\varvec{x}_{2,:},{\varvec{\lambda }}_0)\), \(f({\varvec{x}}_{0,:},{\varvec{\lambda }}_1)=f({\varvec{x}}_{2,:},{\varvec{\lambda }}_1)\), and \(f({\varvec{x}}_{1,:},{\varvec{\lambda }}_1)=f({\varvec{x}}_{3,:},{\varvec{\lambda }}_1)\).

Let \(\psi =\sqrt{\left[ f({\varvec{x}}_{0,:},{\varvec{ c}})+f({\varvec{x}}_{1,:},{\varvec{c}})\right] \cdot \left[ f({\varvec{x}}_{2,:},{\varvec{c}})+f({\varvec{x}}_{3,:},{\varvec{c}})\right] }\). According to Lemma 1, we can derive that

$$\begin{aligned} \begin{array}{lllllll} \psi &{}\le &{}[f({\varvec{x}}_{0,:},{\varvec{ \lambda }}_0)+f({\varvec{x}}_{1,:},{\varvec{ \lambda }}_0)]\sqrt{f({\varvec{x}}_{0,:},{\varvec{ \lambda }}_1)\cdot f({\varvec{x}}_{1,:},{\varvec{ \lambda }}_1)}+\\ &{}&{}[f({\varvec{x}}_{0,:},{\varvec{ \lambda }}_1)+f({\varvec{x}}_{1,:},{\varvec{ \lambda }}_1)]\sqrt{f({\varvec{x}}_{0,:},{\varvec{\lambda }}_0)\cdot f({\varvec{x}}_{1,:},{\varvec{\lambda }}_0)}\\ &{}&{}-2\sqrt{f({\varvec{x}}_{0,:},{\varvec{ \lambda }}_0)\cdot f({\varvec{x}}_{1,:},{\varvec{\lambda }}_0)\cdot f({\varvec{x}}_{0,:},{\varvec{\lambda }}_1)\cdot f({\varvec{x}}_{1,:},{\varvec{\lambda }}_1)}. \end{array} \end{aligned}$$
(33)

Since \(\sum _{y_0^{l-1}}\left[ f({\varvec{x}}_{0,:},{\varvec{\lambda }}_i)+f({\varvec{x}}_{1,:},{\varvec{\lambda }}_i) \right] =2\) and \(\sum _{y_0^{l-1}}\sqrt{f({\varvec{x}}_{0,:},{\varvec{\lambda }}_i)\cdot f({\varvec{x}}_{1,:},{\varvec{\lambda }}_i)}=Z^{\sum {\varvec{\lambda }}_i}(W)\), the conclusion in Lemma 2 holds.

3. Proof of Theorem 1

The proof is divided into three cases according to the value of k.

(1) \(k=0\).

Let \(Z^*\) = \(\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{0,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{0,0}^{(i,1)}\right) }\). According to (16) and (18), both \(({\varvec{\gamma }}_{1,0}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,0)})\) and \(({\varvec{\gamma }}_{1,1}^{(i,0)}:{\varvec{\gamma }}_{1,1}^{(i,1)})\) are mutually different vector pairs with a VCI of \({\varvec{c}}_1\). It can be derived that \({\varvec{ \lambda }}^*\) and \(\varvec{\lambda }'\) shown in Theorem 1 denote the CCIs \({\varvec{\gamma }}_{1,0}^{(i,0)}:{\varvec{ \gamma }}_{1,j}^{(i,1)}\) and \({\varvec{\gamma }}_{1,0}^{(i,0)}:{\varvec{\gamma }}_{1,1-j}^{(i,1)}\) with the same VCI of \({\varvec{c}}_1\), respectively. Then, both \(f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},\lambda ^* \right) =f \left( {\varvec{\gamma }}_{1,j}^{(i,1)},{\varvec{ \lambda }}^* \right)\) and \(f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda '}} \right) =f \left( {\varvec{\gamma }}_{1,1-j}^{(i,1)},{\varvec{\lambda '}} \right)\) hold.

According to (19) and (12), \(Z^*\) can be transformed into:

$$\begin{aligned} \begin{array}{lll} &{}\sum _{y_0^{l-1}}\sqrt{\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda '}} \right) \cdot f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s\left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda '}} \right) \cdot f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{ \lambda }}^* \right) \cdot s \left( {\varvec{ \gamma }}_{1,1}^{(i,0)} \right) \right] }\cdot \\ &{}\sqrt{\left[ f \left( {\varvec{\gamma }}_{1,1-j}^{(i,1)},{\varvec{\lambda '}} \right) \cdot f \left( {\varvec{\gamma }}_{1,1-j}^{(i,1)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,1-j}^{(i,1)} \right) +f \left( {\varvec{\gamma }}_{1,j}^{(i,1)},{\varvec{\lambda '}} \right) \cdot f \left( {\varvec{\gamma }}_{1,j}^{(i,1)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,j}^{(i,1)} \right) \right] }, \end{array} \end{aligned}$$
(34)

which matches (14). Then, we can calculate the upper bound of \(Z^*\) defined by

$$\begin{aligned} \begin{array}{lllll} Z^*&{}\le &{} \sum _{y_0^{l-1}} \left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) \right] \sqrt{f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot f \left( {\varvec{\gamma }}_{1,0}^{(i,1)},{\varvec{\lambda }}^* \right) } \sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) }\\ &{}+&{}\sum _{y_0^{l-1}}\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) \right] \sqrt{f\left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) \cdot f\left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) }\\ &{}-&{} 2\cdot \sum _{y_0^{l-1}}\sqrt{f\left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) \cdot f\left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) }\sqrt{f\left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot f\left( {\varvec{\gamma }}_{1,0}^{(i,1)},{\varvec{\lambda }}^* \right) }\cdot \sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)}\right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)}\right) }. \end{array} \end{aligned}$$
(35)

Since \(({\varvec{\gamma }}_{1,0}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,0)})\) and \(({\varvec{\gamma }}_{1,1}^{(i,0)}, {\varvec{\gamma }}_{1,1}^{(i,1)})\) are mutually different vector pairs with a VCI of \({\varvec{c}}_1\), we obtain \(\sum _{y_0^{l-1}}\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) +f\left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) \right] =2\), \(\sum _{y_0^{l-1}}\sqrt{f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}' \right) \cdot f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}' \right) }=Z^{\sum {\varvec{\lambda }}'}(W)\), and \(\sum _{y_0^{l-1}}\sqrt{f\left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^*\right) \cdot f \left( {\varvec{\gamma }}_{1,0}^{(i,1)},{\varvec{\lambda }}^* \right) }=Z^{\sum {\varvec{\lambda }}^*}(W)\). Furthermore, since \({\varvec{\gamma }}_{1,0}^{(i,0)}\) has \(2^{n-k}\) leaf nodes, we can derive that \(\sum _{y_0^{l-1}}\left[ f \left( {\varvec{\gamma }}_{1,0}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) +f \left( {\varvec{\gamma }}_{1,1}^{(i,0)},{\varvec{\lambda }}^* \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) \right] =2^n\).

Therefore, \(Z^*\le 2\cdot Z^{\sum {\varvec{\lambda }}^*}(W)\cdot (1-Z^{\sum {\varvec{\lambda }}'}(W))\cdot \sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) }+2^n\cdot Z^{\sum {\varvec{\lambda }}'}(W)\).

Suppose that the upper bound of \(\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{1,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{1,1}^{(i,0)} \right) }\) is \(Z_1\). Then, we can obtain

$$\begin{aligned} Z^*\le 2\cdot Z^{\sum \lambda ^*}(W)\cdot \left( 1-Z^{\sum \lambda '}(W) \right) \cdot Z_1+2^n\cdot Z^{\sum \lambda '}(W) = Z_0. \end{aligned}$$
(36)

(2) \(0<k<n-1\).

Generalizing the case of \(k=0\) to \(0<k<n-1\) and letting \(Z^*\) = \(\sum _{y_0^{l-1}}\sqrt{s \left( {\varvec{\gamma }}_{k,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{k,1}^{(i,0)}\right) }\), we can derive that

$$\begin{aligned} \begin{array}{lllll} Z^*\le &{} \sum _{y_0^{l-1}}\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}+{\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) \sqrt{s \left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }+\\ &{}\sum _{y_0^{l-1}}\left[ s \left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) +s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) \right] \sqrt{{\varvec{\gamma }}_{k+1,0}^{(i,0)}\cdot {\varvec{\gamma }}_{k+1,1}^{(i,0)}}-\\ &{}2\cdot \sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }\sqrt{{\varvec{\gamma }}_{k+1,0}^{(i,0)}\cdot \gamma _{k+1,1}^{(i,0)}}. \end{array} \end{aligned}$$
(37)

Since \(\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}, {\varvec{\gamma }}_{k+1,1}^{(i,0)} \right)\) is a mutually different vector pair with the VCI of \({\varvec{c}}_{k+1}\) and the CCI of \({\varvec{\lambda }}_{k+1}\), then \(\sum _{y_0^{l-1}}\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}+{\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) =2\) and \(\sum _{y_0^{l-1}}\sqrt{{\varvec{\gamma }}_{k+1,0}^{(i,0)}\cdot {\varvec{\gamma }}_{k+1,1}^{(i,0)}}=Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\) hold. Since \({\varvec{\gamma }}_{k+1,0}^{(i,0)}\) has \(2^{n-k}\) leaf nodes, we can attain \(\sum _{y_0^{l-1}}\left[ s \left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) +s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) \right] =2^{n-k+1}\).

Therefore, \(Z^*\le \left( 2-2\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\right) \cdot \sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }+2^{n-k+1}\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\).

Suppose that the upper bound of \(\sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{k+1,0}^{(i,0)}\right) \cdot s\left( {\varvec{\gamma }}_{k+1,1}^{(i,0)}\right) }\) is \(Z_{k+1}\), then we obtain

$$\begin{aligned} Z^*\le \left( 2-2\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W)\right) \cdot Z_{k+1}+2^{n-k+1}\cdot Z^{\sum {\varvec{\lambda }}_{k+1}}(W). \end{aligned}$$
(38)

(3) \(k=n-1\).

According to (16) and (18), \(\left( {\varvec{\gamma }}_{n,0}^{(i,0)}, {\varvec{\gamma }}_{n,1}^{(i,0)} \right)\) and \(\left( {\varvec{\gamma }}_{n,2}^{(i,0)}, {\varvec{\gamma }}_{n,3}^{(i,0)}\right)\) are two mutually different vector pairs under the VCI of \({\varvec{c}}_n\). Thus,

$$\begin{aligned} \begin{array}{lllll} \sum _{y_0^{l-1}}\sqrt{s\left( {\varvec{\gamma }}_{n-1,0}^{(i,0)} \right) \cdot s \left( {\varvec{\gamma }}_{n-1,1}^{(i,0)} \right) } &{}\\ = \sum _{y_0^{l-1}}\sqrt{ \left[ f\left( {\varvec{\gamma }}_{n,0}^{(i,0)}, {\varvec{\lambda }}_n \right) +f\left( {\varvec{\gamma }}_{n,1}^{(i,0)}, {\varvec{\lambda }}_n \right) \right] \cdot \left[ f\left( {\varvec{\gamma }}_{n,2}^{(i,0)}, {\varvec{\lambda }}_n \right) +f\left( {\varvec{\gamma }}_{n,3}^{(i,0)}, {\varvec{\lambda }}_n \right) \right] } &{}\\ \end{array} \end{aligned}$$
(39)

matches Lemma 2 and the expression of \(Z_k\) in (24) can be derived for \(k=n-1\).

Therefore, the conclusion of Theorem 1 holds.

4. Proof of Theorem 2

Since the mixed-nary representations of i and j differ only in the sth digit, it can be directly concluded according to Property 1 that the inputs \(u_i\) and \(u_j\) are processed by channels with the same index of the same polar code kernel in each stage before and after stage s. Therefore, the influence of the polar kernels on the reliability of channels i and j is the same in all stages except the sth stage.

In stage s, since the Bhattacharyya parameters of the polar code kernel satisfy \(Z_{l_s}^{(a)} \le Z_{l_s}^{(b)}\), there is \(Z_N^{(i)} \le Z_N^{(j)}\).

5. Proof of Theorem 3

Since the mixed-nary representations of i and j differ only in the sth and tth digits, it can be deduced according to Property 1, as in Theorem 2, that the influence of the polar kernels on the reliability of channels i and j is the same in all stages except stages s and t.

Therefore, the reliability comparison of channels i and j can be attributed to the comparison between channels 1 and 2, whose binary representations are (01) and (10), respectively, of the polar code with a generator matrix of \(G_4=G_2\otimes G_2\).

According to [16], \(Z_4^{(2)} \le Z_4^{(1)}\). Therefore, \(Z_N^{(i)} \le Z_N^{(j)}\) holds.
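For the BEC case, where the bounds listed in “Appendix 6” are attained, the inequality cited from [16] can also be checked directly from the expressions of \(Z_4^{(1)-}\) and \(Z_4^{(2)-}\) for \(G_4=G_2\otimes G_2\). The following is a short worked check under that assumption, not part of the original proof:

$$\begin{aligned} Z_4^{(1)-}-Z_4^{(2)-}=(4Z^2-4Z^3+Z^4)-(2Z^2-Z^4)=2Z^2-4Z^3+2Z^4=2Z^2(1-Z)^2\ge 0. \end{aligned}$$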

6. The Bhattacharyya parameter upper bounds of the polar code kernels with dimensions ranging from 2 to 6

For some standard polar code kernels with dimension of \(l(\in [2,6])\), the upper Bhattacharyya parameter bounds computed by the proposed method are as follows:

For \(G_2=\begin{bmatrix} 1 &{} 0 \\ 1 &{} 1 \end{bmatrix}\), \(\begin{array}{ll} Z_2^{(0)-}=2Z-Z^2\\ Z_2^{(1)}=Z^2\\ \end{array}\) .

For \(G_3=\begin{bmatrix} 1 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 \\ 1 &{} 0 &{} 1\end{bmatrix}\), \(\begin{array}{ll} Z_3^{(0)-}=3Z-3Z^2+Z^3 \\ Z_3^{(1)-}=2Z^2-Z^3\\ Z_3^{(2)}=Z^2\\ \end{array}\) .

For \(G_4=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{}0 \\ 1 &{} 0 &{} 1 &{} 0\\ 1 &{} 1 &{} 1 &{}1\end{bmatrix}\), \(\begin{array}{ll} Z_4^{(0)-}=4Z-6Z^2+4Z^3-Z^4\\ Z_4^{(1)-}=4Z^2-4Z^3+Z^4\\ Z_4^{(2)-}=2 Z^2-Z^4\\ Z_4^{(3)}=Z^4\\ \end{array}\) .

For \(G_5=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 &{}0\\ 1 &{} 1 &{} 0 &{}0 &{}0 \\ 1 &{} 0 &{} 1 &{}0 &{} 0\\ 1 &{} 0 &{} 0 &{}1 &{} 0\\ 1 &{} 1 &{} 1 &{} 0 &{}1 \end{bmatrix}\), \(\begin{array}{ll} Z_5^{(0)-}=5Z-10Z^2+10Z^3-5Z^4+Z^5 \\ Z_5^{(1)-}=6Z^2-9Z^3+5Z^4-Z^5 \\ Z_5^{(2)-}=3Z^2-Z^3-2Z^4+Z^5 \\ Z_5^{(3)-}=Z^2+Z^4-Z^5 \\ Z_5^{(4)}=Z^4 \\ \end{array}\) .

For \(G_6=\begin{bmatrix} 1 &{} 0 &{} 0 &{} 0 &{}0 &{} 0\\ 1 &{} 1 &{} 0 &{}0 &{}0 &{} 0\\ 1 &{} 0 &{} 1 &{}0 &{} 0 &{} 0\\ 1 &{} 1 &{} 1 &{}1 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0 &{}1 &{} 0\\ 0 &{} 0 &{} 1 &{} 1 &{}1 &{}1\end{bmatrix}\), \(\begin{array}{ll} Z_6^{(0)-}=6Z-15Z^2+20Z^3-15Z^4+6Z^5-Z^6\\ Z_6^{(1)-}=9Z^2-18Z^3+15Z^4-6Z^5+Z^6\\ Z_6^{(2)-}=4Z^2-2Z^3-4Z^4+4Z^5-Z^6\\ Z_6^{(3)-}=4Z^4-4Z^5+Z^6\\ Z_6^{(4)-}=2Z^2-Z^4\\ Z_6^{(5)}=Z^4\\ \end{array}\), where Z is the abbreviation of the Bhattacharyya parameter Z(W) of transmission channel W.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhang, T., Li, S. & Yu, B. Computing tight upper bounds for Bhattacharyya parameters of binary polar code kernels with arbitrary dimension. J Wireless Com Network 2021, 76 (2021). https://doi.org/10.1186/s13638-021-01954-y

  • DOI: https://doi.org/10.1186/s13638-021-01954-y

Keywords