Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks

EURASIP Journal on Wireless Communications and Networking

Table 2 Weight parameters

Weights vector (Default)	Q-learning parameters	Reward parameters	DRE parameters
w₁ = 0.3, w₂ = 0.3, w₃ = 0.3, w₄ = 0.1	Learning rate (α) = 0.3, discount factor (γ) = 0.7	Overhead (η) = 0.01, δ = 2	r₁ = 1/6, r₂ = 5/6
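
To illustrate how the Q-learning parameters in Table 2 enter a learning step, the following is a minimal sketch of the standard tabular Q-learning update using α = 0.3 and γ = 0.7. It is not the paper's implementation: the state labels, channel names, reward value, and the function name `q_update` are hypothetical placeholders, and the reward is passed in as a plain number because the reward function that uses η, δ, and the weights vector is not defined in this excerpt.

```python
from collections import defaultdict

ALPHA = 0.3  # learning rate (alpha), taken from Table 2
GAMMA = 0.7  # discount factor (gamma), taken from Table 2


def q_update(q, state, action, reward, next_state, actions):
    """Apply one standard tabular Q-learning update in place.

    Q(s, a) <- Q(s, a) + ALPHA * (reward + GAMMA * max_a' Q(s', a') - Q(s, a))
    Returns the updated value of Q(s, a).
    """
    # Best estimated value reachable from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: two candidate channels, all Q-values start at zero.
q = defaultdict(float)
actions = ["ch1", "ch2"]  # assumed action labels, not from the paper
first = q_update(q, "s0", "ch1", reward=1.0, next_state="s1", actions=actions)

# Pretend a later step learned that "ch2" is valuable in state "s1".
q[("s1", "ch2")] = 1.0
second = q_update(q, "s0", "ch1", reward=1.0, next_state="s1", actions=actions)
```

With these defaults the first update moves Q(s0, ch1) 30% of the way from 0 toward the target of 1.0, and the second moves it 30% of the way toward the new target of 1.7, so a larger γ would weight the downstream channel value more heavily.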