From: Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks
Weights vector (Default) | Q-learning parameters | Reward parameters | DRE parameters |
---|---|---|---|
w1 = 0.3, w2 = 0.3, w3 = 0.3, w4 = 0.1 | Learning rate (α) = 0.3, discount factor (γ) = 0.7 | Overhead (η) = 0.01, δ = 2 | r1 = 1/6, r2 = 5/6 |
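
As a rough illustration of how the Q-learning parameters in the table might be used, the sketch below plugs α and γ into the standard tabular Q-learning update. This is an assumption: the table does not give the paper's state, action, or reward definitions, so `q_update` and the constant names are hypothetical, and η, δ, r1, and r2 are stored only as values because their formulas are not shown here.

```python
import math
from fractions import Fraction

# Values copied from the table; the constant names are illustrative only.
WEIGHTS = (0.3, 0.3, 0.3, 0.1)             # w1..w4 (default weights vector)
ALPHA = 0.3                                # learning rate
GAMMA = 0.7                                # discount factor
ETA = 0.01                                 # overhead (reward parameter)
DELTA = 2                                  # reward parameter
R1, R2 = Fraction(1, 6), Fraction(5, 6)    # DRE parameters


def q_update(q_sa, reward, max_q_next, alpha=ALPHA, gamma=GAMMA):
    """One standard tabular Q-learning step (assumed form, not from the table).

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    return q_sa + alpha * (reward + gamma * max_q_next - q_sa)


if __name__ == "__main__":
    # Sanity checks on the tabulated values.
    print(math.isclose(sum(WEIGHTS), 1.0))   # default weights sum to 1
    print(R1 + R2 == 1)                      # r1 and r2 sum to 1
    # One update from Q = 0 with a hypothetical reward of 1 and no future value.
    print(q_update(0.0, 1.0, 0.0))
```

With α = 0.3, each update moves the stored Q-value 30% of the way toward the new target, and γ = 0.7 discounts the best next-state value before it enters that target.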