Overview of the hardware architecture
We choose the enTTS (20, 28) scheme described in Section 2 for hardware implementation over the composite field \( GF((2^4)^2) \), where the message (hash value) is 20 B and the signature is 28 B.
We illustrate the generation of a multivariate signature in Fig. 1. As Fig. 1 shows, signature generation in the multivariate scheme proceeds through the following steps:
(1) Affine transformation \( {L}_1 \).
\( {L}_1^{-1} \) is an invertible affine transformation with the following form.
$$ \overline{y}= Ay+B. $$
A is a matrix with the size of 20 × 20.
B is a vector with the size of 20.
It can be observed that \( {L}_1^{-1} \) is performed via matrix-vector multiplications and vector additions, where A and B are parts of private keys.
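As an illustration only, the following Python sketch models such an affine map in software. It assumes the field polynomials chosen later in this section (p(x) = x^4 + x + 1 and q(x) = x^2 + x + 9), reads the constant 9 as the element x^3 + 1, and stores an element of the composite field as a pair (high, low). The function names and this encoding are our own choices and are not part of the hardware design.

```python
P_POLY = 0b10011  # assumed subfield polynomial p(x) = x^4 + x + 1


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements (ints 0..15, bit i = coefficient of x^i)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:          # degree 4 reached: reduce with p(x)
            a ^= P_POLY
    return r


def gf256_add(a, b):
    """Componentwise XOR of two (high, low) pairs."""
    return (a[0] ^ b[0], a[1] ^ b[1])


def gf256_mul(a, b):
    """Product modulo q(x) = x^2 + x + 9, using x^2 = x + 9."""
    (ah, al), (bh, bl) = a, b
    low = gf16_mul(al, bl)
    ch = gf16_mul(ah ^ al, bh ^ bl) ^ low
    cl = low ^ gf16_mul(9, gf16_mul(ah, bh))
    return (ch, cl)


def affine(matrix, offset, vec):
    """Return matrix * vec + offset, all entries in the composite field."""
    out = []
    for row, off in zip(matrix, offset):
        acc = off
        for m, v in zip(row, vec):
            acc = gf256_add(acc, gf256_mul(m, v))
        out.append(acc)
    return out
```

For the first transformation, `matrix` would hold the 20 × 20 entries of A and `offset` the 20 entries of B.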
(2) Polynomial evaluation (first part F)
First, we randomly choose the variables \( {\overline{x}}_0,{\overline{x}}_1,\dots, {\overline{x}}_7 \), i.e., the first group of variables of \( \overline{x} \).
Second, we evaluate the polynomials \( {f}_0,{f}_1,\dots, {f}_8 \), i.e., the first group of the polynomials \( {f}_i \).
This part of the polynomial evaluation is performed using additions and multiplications in a finite field.
(3) Solving systems of linear equations (LSEs) in a finite field
During signature generation, LSEs must be solved twice, each time with the same matrix size of 9 × 9.
First, we solve an LSE for the second group of variables of \( \overline{x} \), i.e., \( {\overline{x}}_8,{\overline{x}}_9,\dots, {\overline{x}}_{16} \).
Second, we solve an LSE for the fourth group of variables of \( \overline{x} \), i.e., \( {\overline{x}}_{19},{\overline{x}}_{20},\dots, {\overline{x}}_{27} \).
In this step, the LSEs are solved using a variant of Gauss-Jordan elimination in a finite field.
(4) Polynomial evaluation (second part F)
The third group of variables of \( \overline{x} \), i.e., \( {\overline{x}}_{17},{\overline{x}}_{18} \), is obtained by evaluating the second group of the polynomials \( {f}_i \), i.e., \( {f}_9,{f}_{10} \).
This part of the polynomial evaluation is performed using additions and multiplications in a finite field.
(5) Polynomial evaluation (third part F)
We evaluate the third group of the polynomials \( {f}_i \), i.e., \( {f}_{11},{f}_{12},\dots, {f}_{19} \).
This part of polynomial evaluation is performed via using additions and multiplications in a finite field.
(6) Affine transformation \( {L}_2 \).
\( {L}_2^{-1} \) is an affine transformation with the following form.
$$ x=C\overline{x}+D. $$
C is a matrix with the size of 28 × 28.
D is a vector with the size of 28.
It can be observed that \( {L}_2^{-1} \) is performed via matrix-vector multiplications and vector additions, where C and D are parts of private keys.
Our hardware architecture for signature generation in the multivariate scheme is depicted in Fig. 2. As Fig. 2 shows, the architecture consists of adders, multipliers, an inverter, a parallel Gauss-Jordan eliminator, polynomial evaluation, matrix-vector multiplication, vector addition, and processor components in a finite field, where only the first four are computing components and the others are logical components.
Performance evaluation of irreducible polynomial in composite fields
Irreducible polynomials in composite fields are involved in the additions, multiplications, and other operations during signature generation. Thus, evaluating the performance of the irreducible polynomial in the composite field \( GF((2^4)^2) \) is critical for a high-speed hardware implementation of the multivariate scheme.
Let q(x) denote the irreducible polynomial in \( GF((2^4)^2) \), which has the following form.
$$ q(x)={x}^2+{q}_1x+{q}_0. $$
\( {q}_1 \) and \( {q}_0 \) are elements in \( GF(2^4) \).
Let p(x) denote the irreducible polynomial in the subfield of \( GF((2^4)^2) \), i.e., \( GF(2^4) \), which has the following form:
$$ p(x)={x}^4+{p}_3{x}^3+{p}_2{x}^2+{p}_1x+1. $$
\( {p}_3 \), \( {p}_2 \), and \( {p}_1 \) are bits, i.e., 0 or 1.
The performance of the multiplications and inversions has been evaluated for such irreducible polynomials. \( q(x)={x}^2+x+9 \) is chosen as the irreducible polynomial in \( GF((2^4)^2) \) and \( p(x)={x}^4+x+1 \) is chosen as the irreducible polynomial in the subfield \( GF(2^4) \).
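The irreducibility of the two chosen polynomials can be checked with a few lines of Python. This is a minimal sketch under our own assumptions: elements of the subfield are encoded as integers 0..15 with bit i holding the coefficient of x^i, and the constant 9 is read as x^3 + 1. The helper names are hypothetical.

```python
P_POLY = 0b10011  # p(x) = x^4 + x + 1 as a bit mask


def poly_mod2(a, m):
    """Remainder of the binary polynomial a modulo m (both as bit masks)."""
    dm = m.bit_length() - 1
    while a and a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)
    return a


def p_is_irreducible(poly=P_POLY):
    """A quartic over GF(2) is irreducible iff it has no factor of degree 1 or 2.

    The only candidates are x, x + 1 and x^2 + x + 1.
    """
    return all(poly_mod2(poly, d) != 0 for d in (0b10, 0b11, 0b111))


def gf16_mul(a, b):
    """Shift-and-add multiplication in GF(2^4) modulo P_POLY."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= P_POLY
    return r


def q_is_irreducible(q1, q0):
    """x^2 + q1*x + q0 is irreducible over GF(2^4) iff it has no root there."""
    return all(gf16_mul(t, t) ^ gf16_mul(q1, t) ^ q0 != 0 for t in range(16))
```

Under these assumptions, `p_is_irreducible()` and `q_is_irreducible(1, 9)` both hold, whereas `q_is_irreducible(1, 1)` does not, because x^2 + x + 1 already has its roots in the subfield.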
Finite field adder
Let \( a(x)={a}_hx+{a}_l \) and \( b(x)={b}_hx+{b}_l \) be elements in \( GF((2^4)^2) \), where \( {a}_h \), \( {a}_l \), \( {b}_h \), and \( {b}_l \) are elements in \( GF(2^4) \).
Then the addition of a(x) and b(x) can be expressed as
$$ \left({a}_hx+{a}_l\right)+\left({b}_hx+{b}_l\right)= $$
$$ \left({a}_h+{b}_h\right)x+{a}_l+{b}_l. $$
Let \( {c}_h \) and \( {c}_l \) be elements in \( GF(2^4) \), computed via the following expressions:
$$ {c}_h={a}_h+{b}_h, $$
$$ {c}_l={a}_l+{b}_l. $$
Thus, \( c(x)={c}_hx+{c}_l \) is the sum of a(x) and b(x).
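Because addition in a binary field is a bitwise XOR of the coefficients, the adder reduces to two 4-bit XORs. The following Python sketch shows this, assuming an element is stored as a pair (high, low) of integers 0..15; the function name and the encoding are illustrative.

```python
def gf256_add(a, b):
    """Add (a_h, a_l) and (b_h, b_l) componentwise; addition in GF(2^4) is XOR."""
    (ah, al), (bh, bl) = a, b
    return (ah ^ bh, al ^ bl)
```

Every element is its own additive inverse, so `gf256_add(a, a)` returns `(0, 0)` for any `a`.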
Finite field multiplier
Let \( a(x)={a}_hx+{a}_l \) and \( b(x)={b}_hx+{b}_l \) be elements in \( GF((2^4)^2) \), where \( {a}_h \), \( {a}_l \), \( {b}_h \), and \( {b}_l \) are elements in \( GF(2^4) \).
Then the multiplication of a(x) and b(x) can be expressed as
$$ \left({a}_hx+{a}_l\right)\left({b}_hx+{b}_l\right)= $$
$$ \left({a}_h{b}_h{x}^2+\left({a}_h{b}_l+{a}_l{b}_h\right)x+{a}_l{b}_l\right)\operatorname{mod}q(x). $$
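We perform the polynomial multiplication and the reduction modulo the irreducible polynomial \( q(x)={x}^2+x+9 \). Since \( {x}^2\equiv x+9 \) modulo q(x), the term \( {a}_h{b}_h{x}^2 \) contributes \( {a}_h{b}_h \) to the coefficient of x and \( 9{a}_h{b}_h \) to the constant term. Let \( {c}_h \) and \( {c}_l \) be elements in \( GF(2^4) \) such that \( c(x)={c}_hx+{c}_l \) is the product; they are computed via the following expressions: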
$$ {c}_h=\left({a}_h+{a}_l\right)\left({b}_h+{b}_l\right)+{a}_l{b}_l, $$
$$ {c}_l={a}_l{b}_l+9{a}_h{b}_h. $$
It can be observed that the critical path of the multiplication of two elements in \( GF((2^4)^2) \) includes one multiplication, one constant multiplication, and one addition in \( GF(2^4) \).
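The two expressions for \( {c}_h \) and \( {c}_l \) can be written out as a short Python sketch. It assumes the subfield polynomial p(x) = x^4 + x + 1, reads the constant 9 as x^3 + 1, and stores an element as a pair (high, low); the names are hypothetical and the shift-and-add subfield multiplier is a software stand-in rather than the circuit described in this section.

```python
P_POLY = 0b10011  # p(x) = x^4 + x + 1


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements (ints 0..15, bit i = coefficient of x^i)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= P_POLY
    return r


def gf256_mul(a, b):
    """Multiply (a_h, a_l) by (b_h, b_l) modulo q(x) = x^2 + x + 9."""
    (ah, al), (bh, bl) = a, b
    low = gf16_mul(al, bl)                      # a_l * b_l, shared by both outputs
    ch = gf16_mul(ah ^ al, bh ^ bl) ^ low       # (a_h + a_l)(b_h + b_l) + a_l b_l
    cl = low ^ gf16_mul(9, gf16_mul(ah, bh))    # a_l b_l + 9 a_h b_h
    return (ch, cl)
```

The high coefficient needs only one subfield multiplication beyond the shared product \( {a}_l{b}_l \), which saves one multiplier compared with computing \( {a}_h{b}_h+{a}_h{b}_l+{a}_l{b}_h \) directly.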
p(x) is the irreducible polynomial in \( GF(2^4) \). Let \( a(x)=\sum \limits_{i=0}^3{a}_i{x}^i \) and \( b(x)=\sum \limits_{i=0}^3{b}_i{x}^i \) be elements in \( GF(2^4) \) with \( {a}_i,{b}_i\in GF(2) \), and suppose that
$$ c(x)=a(x)\times b(x)\left( \operatorname {mod}\;\left(p(x)\right)\right)=\sum \limits_{i=0}^3{c}_i{x}^i $$
is the product of the two elements, where \( {c}_i\in GF(2) \).
First, we compute \( {v}_{ij} \) for i = 0, 1, …, 6 and j = 0, 1, 2, 3 according to the following equation:
$$ {x}^i \operatorname {mod}\;p(x)=\sum \limits_{j=0}^3{v}_{ij}{x}^j. $$
Next, we compute \( {S}_i \) for i = 0, 1, …, 6 by the following equation:
$$ {S}_i=\sum \limits_{j+k=i}{a}_j{b}_k. $$
After that, we compute \( {c}_i \) for i = 0, 1, 2, 3 by the following equation:
$$ {c}_i=\sum \limits_{j=0}^6{v}_{ji}{S}_j. $$
Finally, the multiplication result is
$$ c(x)=\sum \limits_{i=0}^3{c}_i{x}^i. $$
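The three steps above can be sketched in Python as follows, assuming p(x) = x^4 + x + 1 and representing an element as a list of four bits with the coefficient of x^0 first. The function names and the list encoding are illustrative choices.

```python
P_COEFFS = [1, 1, 0, 0]  # x^4 mod p(x) = x + 1, coefficients of x^0..x^3


def reduction_table():
    """v[i][j] = coefficient of x^j in x^i mod p(x), for i = 0..6."""
    v = []
    cur = [1, 0, 0, 0]                       # x^0
    for _ in range(7):
        v.append(cur)
        carry = cur[3]                       # coefficient that would become x^4
        shifted = [0] + cur[:3]              # multiply by x
        cur = [s ^ (carry & p) for s, p in zip(shifted, P_COEFFS)]
    return v


def gf16_mul_bits(a, b):
    """Multiply bit lists a = [a_0..a_3] and b = [b_0..b_3]; returns [c_0..c_3]."""
    v = reduction_table()
    s = [0] * 7
    for j in range(4):                       # S_i = sum over j + k = i of a_j b_k
        for k in range(4):
            s[j + k] ^= a[j] & b[k]
    c = []
    for i in range(4):                       # c_i = sum over j of v_ji S_j
        bit = 0
        for j in range(7):
            bit ^= v[j][i] & s[j]
        c.append(bit)
    return c


def to_bits(n):
    """Integer 0..15 to a bit list with the lowest-degree coefficient first."""
    return [(n >> i) & 1 for i in range(4)]
```

Since the table \( {v}_{ij} \) depends only on p(x), it is a set of constants, and each \( {c}_i \) is a fixed XOR of AND terms.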
Multiplicative inverter
Let \( a(x)={a}_hx+{a}_l \) and \( b(x)={b}_hx+{b}_l \) be elements in \( GF((2^4)^2) \), where \( {a}_h \), \( {a}_l \), \( {b}_h \), and \( {b}_l \) are elements in \( GF(2^4) \).
We suppose that b(x) is the inverse of a(x). Then,
$$ {\displaystyle \begin{array}{l}{b}_h={\left({a}_h+{a}_l\right)}^{-1}{a}_h{b}_l,\\ {}{b}_l={\left({a}_l+9{a}_h^2{\left({a}_l+{a}_h\right)}^{-1}\right)}^{-1}.\end{array}} $$
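The two formulas can be exercised with the Python sketch below, which assumes p(x) = x^4 + x + 1 and q(x) = x^2 + x + 9 and stores an element as a pair (high, low). The formulas divide by \( {a}_h+{a}_l \), so the sketch treats the case \( {a}_h={a}_l \) separately by solving \( a(x)b(x)=1 \) directly; that branch, the exponentiation-based subfield inverse, and all names are our own assumptions and do not describe the hardware.

```python
P_POLY = 0b10011  # p(x) = x^4 + x + 1


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements (ints 0..15, bit i = coefficient of x^i)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= P_POLY
    return r


def gf16_inv(a):
    """Subfield inverse as a^14, valid because a^15 = 1 for a != 0."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    r = 1
    for _ in range(14):
        r = gf16_mul(r, a)
    return r


def gf256_mul(a, b):
    """Product modulo q(x) = x^2 + x + 9."""
    (ah, al), (bh, bl) = a, b
    low = gf16_mul(al, bl)
    return (gf16_mul(ah ^ al, bh ^ bl) ^ low,
            low ^ gf16_mul(9, gf16_mul(ah, bh)))


def gf256_inv(a):
    """Inverse of (a_h, a_l) following the two formulas for b_h and b_l."""
    ah, al = a
    if ah == 0 and al == 0:
        raise ZeroDivisionError("zero has no inverse")
    s = ah ^ al
    if s == 0:
        # Assumed special case a_h = a_l: the product equations give
        # b_l = 0 and b_h = (9 a_h)^-1.
        return (gf16_inv(gf16_mul(9, ah)), 0)
    s_inv = gf16_inv(s)
    bl = gf16_inv(al ^ gf16_mul(9, gf16_mul(gf16_mul(ah, ah), s_inv)))
    bh = gf16_mul(gf16_mul(s_inv, ah), bl)
    return (bh, bl)
```

A quick consistency check is that `gf256_mul(a, gf256_inv(a))` returns `(0, 1)` for every nonzero `a`.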
We use two binary trees for inversions in the subfield \( GF(2^4) \), which are organized as follows:
Each binary tree has four layers in \( GF(2^4) \);
Root nodes are on the third layer;
Each node has at most two child nodes: the left child represents the value zero and the right child represents the value one;
Each child is either a leaf or the root of another tree, and each node that is not a root node has a father node;
Each element in the finite field (except \( (0000)_2 \) and \( (0001)_2 \)) has a unique traversal from root to leaf, because \( (0000)_2 \) has no inverse and the inverse of \( (0001)_2 \) is itself;
Each leaf is linked to another leaf.
Figure 3 depicts the architecture for inversions in \( GF(2^4) \).
Example 1. As Fig. 4 shows, to invert the element \( (0100)_2 \), we search the binary tree from the root nodes to the leaf nodes; the path from \( {n}_1 \) to \( {n}_4 \) represents \( (0100)_2 \). \( {n}_4 \) is linked with \( {n}_8 \), thus the path from \( {n}_5 \) to \( {n}_8 \) represents the inverse of \( (0100)_2 \), i.e., \( (1001)_2 \).
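A software analogue of the linked leaves is a table that pairs each element with its inverse. The Python sketch below builds such a table by exhaustive search, assuming p(x) = x^4 + x + 1; the dictionary and helper names are illustrative and do not describe the tree hardware. If the bit strings of Example 1 are read with the lowest-degree coefficient first (our assumption, since the text does not state the bit order), \( (0100)_2 \) is the element x and its linked partner \( x^3+1 \) is written \( (1001)_2 \).

```python
P_POLY = 0b10011  # p(x) = x^4 + x + 1


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements (ints 0..15, bit i = coefficient of x^i)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= P_POLY
    return r


def build_inverse_links():
    """Pair every element other than 0 and 1 with its multiplicative inverse."""
    links = {}
    for a in range(2, 16):
        for b in range(2, 16):
            if gf16_mul(a, b) == 1:
                links[a] = b
    return links


def bits_lsb_first(a):
    """Bit string with the coefficient of x^0 written first (assumed order)."""
    return format(a, "04b")[::-1]


links = build_inverse_links()
x = 0b0010                                   # the element x
print(bits_lsb_first(x), "->", bits_lsb_first(links[x]))
```

The links are symmetric, `links[links[a]] == a`, which mirrors the statement that each leaf is linked to another leaf.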
Parallel Gauss-Jordan eliminator
During the central map transformation in signature generation, LSEs must be solved in a finite field twice, each time with the same matrix size of 9 × 9.
We adopt a parallel Gauss-Jordan elimination, which is depicted in Fig. 5. It can solve an LSE with a matrix size of 9 × 9. The parallel Gauss-Jordan eliminator solves a system of linear equations in 9 iterations and is enhanced in the following ways:
First, exclusive adders are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.3;
Second, exclusive multipliers are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.4;
Third, exclusive inverters are used in the parallel Gauss-Jordan elimination based on the design described in Section 3.5.
As Fig. 5 shows, I, \( {N}_l \), and \( {E}_{kl} \) are the three kinds of cells in the architecture, where k = 1, 2, …, 9 and l = 1, 2, …, 10.
The I cell is used for multiplicative inversion in a finite field, which includes an exclusive inverter described in Section 3.5.
The \( {N}_l \) cells are used for normalization of finite field elements and include the exclusive multipliers described in Section 3.4.
The \( {E}_{kl} \) cells are used for elimination of finite field elements and include the exclusive adders and multipliers described in Sections 3.3 and 3.4.
In conclusion, the architecture includes one I cell, 9 \( {N}_l \) cells, and 90 \( {E}_{kl} \) cells and solves an LSE with a matrix size of 9 × 9 within 9 clock cycles.
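The roles of the three cell types can be illustrated with a sequential Python model of Gauss-Jordan elimination over the composite field. This is a sketch under stated assumptions, not the parallel architecture itself: it uses p(x) = x^4 + x + 1 and q(x) = x^2 + x + 9, stores an element as a pair (high, low), computes inverses by exponentiation, and adds a pivot search with row swap that the text does not describe. All names are hypothetical.

```python
P_POLY = 0b10011  # p(x) = x^4 + x + 1
ZERO, ONE = (0, 0), (0, 1)


def gf16_mul(a, b):
    """Multiply two GF(2^4) elements (ints 0..15, bit i = coefficient of x^i)."""
    r = 0
    for _ in range(4):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= P_POLY
    return r


def add(a, b):
    return (a[0] ^ b[0], a[1] ^ b[1])


def mul(a, b):
    """Product modulo q(x) = x^2 + x + 9."""
    (ah, al), (bh, bl) = a, b
    low = gf16_mul(al, bl)
    return (gf16_mul(ah ^ al, bh ^ bl) ^ low,
            low ^ gf16_mul(9, gf16_mul(ah, bh)))


def inv(a):
    """Inverse as a^254; a software stand-in for the I cell."""
    if a == ZERO:
        raise ZeroDivisionError("zero has no inverse")
    r = ONE
    for _ in range(254):
        r = mul(r, a)
    return r


def gauss_jordan(aug):
    """Solve an n x (n + 1) augmented system; returns the solution vector."""
    n = len(aug)
    m = [row[:] for row in aug]
    for k in range(n):                            # one iteration per pivot column
        piv = next((r for r in range(k, n) if m[r][k] != ZERO), None)
        if piv is None:
            raise ValueError("singular system")
        m[k], m[piv] = m[piv], m[k]               # assumed pivoting step
        pinv = inv(m[k][k])                       # role of the I cell
        m[k] = [mul(pinv, e) for e in m[k]]       # role of the N cells
        for r in range(n):                        # role of the E cells
            if r != k and m[r][k] != ZERO:
                f = m[r][k]
                m[r] = [add(e, mul(f, p)) for e, p in zip(m[r], m[k])]
    return [row[n] for row in m]
```

For a 9 × 9 system the augmented matrix has 9 rows and 10 columns, which matches the index ranges of k and l, and the outer loop runs 9 times.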