Few-shot relation classification by context attention-based prototypical networks with BERT

EURASIP Journal on Wireless Communications and Networking

Table 2 Parameter settings

Parameter	Value
Max length of a sentence	64
Batch size	1
Training classes for one batch	8
Learning rate	2e-5
Train iterations	10000
Convolutional window size	3
Hidden layer dimension d_h	768
Number of attention heads	12
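
For reference, the settings in Table 2 could be collected in a single configuration object, as in the minimal sketch below. The class name and field names are hypothetical and do not come from the paper; only the values are taken from the table. The `head_dim` property additionally assumes the standard multi-head attention convention in which the hidden dimension d_h is split evenly across the heads, which the table itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2Settings:
    """Values copied from Table 2; all field names are hypothetical."""

    max_sentence_length: int = 64
    batch_size: int = 1
    train_classes_per_batch: int = 8
    learning_rate: float = 2e-5
    train_iterations: int = 10_000
    conv_window_size: int = 3
    hidden_dim: int = 768  # d_h in Table 2
    num_heads: int = 12

    @property
    def head_dim(self) -> int:
        # Assumption: d_h is divided evenly across the attention heads,
        # as in standard multi-head attention.
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads


settings = Table2Settings()
print(settings.head_dim)
```

Under that assumption, a hidden dimension of 768 with 12 heads gives 64 dimensions per head. Freezing the dataclass keeps the settings from being modified accidentally during training.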