From: Few-shot relation classification by context attention-based prototypical networks with BERT
Hyperparameter | Value |
---|---|
Max length of a sentence | 64 |
Batch size | 1 |
Training classes for one batch | 8 |
Learning rate | 2e-5 |
Train iterations | 10000 |
Convolutional window size | 3 |
Hidden layer dimension $d_h$ | 768 |
Number of attention heads | 12 |
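
For readers who want to reuse these settings, the table can be captured as a small configuration object. The following is only a sketch: the class name `FewShotConfig` and all field names are hypothetical and do not come from the paper. The derived `head_dim` assumes the hidden dimension is split evenly across the attention heads, as in standard multi-head attention; the table itself does not state this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FewShotConfig:
    """Hyperparameters from the table above (field names are illustrative)."""

    max_sentence_length: int = 64
    batch_size: int = 1
    train_classes_per_batch: int = 8
    learning_rate: float = 2e-5
    train_iterations: int = 10000
    conv_window_size: int = 3
    hidden_dim: int = 768  # d_h in the table
    num_heads: int = 12

    @property
    def head_dim(self) -> int:
        # Assumption: d_h is divided evenly among the heads.
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads


config = FewShotConfig()
print(config.head_dim)  # 768 // 12 = 64
```

Freezing the dataclass keeps the values from being changed by accident during a run; a variant setting can be created with `dataclasses.replace(config, batch_size=2)`.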