Table 1 Partial parameter settings

From: Reinforcement learning-based hybrid spectrum resource allocation scheme for the high load of URLLC services

| Parameter | Value | Description |
| --- | --- | --- |
| t | \(0.015625\;{\text{ms}}\) | The time interval for each resource allocation decision |
| Lrp | \(0.15625\;{\text{ms}}\) | The duration of the resource pool |
| \|sRB\| | \(5760\;{\text{kHz}}\) | The length of an RB in the frequency domain |
| \|pRB\| | \(9\;{\text{dBm}}\) | The length of an RB in the power domain |
| \(D_{MAX}\) | [3 ms, 1 ms] | Maximum delay constraint |
| \(\left| Q \right|\) | 100 | The length of the URLLC cache queue Q |
| NUMU | 100 | The total amount of received URLLC data |
| \(\alpha\) | \(0.001\) | Learning step size of policy gradient |
| \(N\) | 4 | The number of hidden layers |
| \(p_{i}\) | 1/4 | The initial probability of exit at the ith layer |
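
As a rough illustration of how the settings in Table 1 relate to each other, the sketch below collects them in a Python configuration object and derives two quantities that follow directly from the listed values: the number of allocation decisions that fit in one resource pool (Lrp / t) and the initial exit probabilities over the N hidden layers. The class name, field names, and helper functions are hypothetical and chosen only for this example; they are not taken from the paper. Treating \(p_{i}\) as the same value 1/4 for every one of the four layers is likewise an assumption based on the table row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Values copied from Table 1; the field names are illustrative only."""
    decision_interval_ms: float = 0.015625  # t
    resource_pool_ms: float = 0.15625       # Lrp
    rb_bandwidth_khz: float = 5760.0        # |sRB|
    rb_power_dbm: float = 9.0               # |pRB|
    max_delay_ms: tuple = (3.0, 1.0)        # D_MAX
    queue_length: int = 100                 # |Q|
    num_urllc: int = 100                    # NUMU
    learning_rate: float = 0.001            # alpha
    hidden_layers: int = 4                  # N
    initial_exit_prob: float = 0.25         # p_i


def decisions_per_pool(p: SimParams) -> int:
    """Number of decision intervals t that fit in one resource pool Lrp."""
    return round(p.resource_pool_ms / p.decision_interval_ms)


def initial_exit_probs(p: SimParams) -> list:
    """Assumed reading: every hidden layer starts with the same exit probability."""
    return [p.initial_exit_prob] * p.hidden_layers


if __name__ == "__main__":
    params = SimParams()
    print(decisions_per_pool(params))   # 10
    print(initial_exit_probs(params))   # [0.25, 0.25, 0.25, 0.25]
```

With the tabulated values, one resource pool spans ten decision intervals, and the four initial exit probabilities sum to 1, which is consistent with 1/4 being 1/N for N = 4.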