Fig. 3 | EURASIP Journal on Wireless Communications and Networking

Fig. 3

From: Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks


Proposed system architecture. The proposed Q-learning scheme dynamically selects the optimal band group and channel. The reward function takes into account the user demand, the wireless environment, and the system parameters. The user demand module determines the desired data rate (DDR) of the CR ad-hoc network and measures the average utilization of the channel currently in use. The wireless environment module stores the spectrum sensing results. The system parameters module is used to set up the reward function and the Q-learning parameters. If the band of the newly selected channel differs from that of the old one, the overhead of the band group change is applied to the reward function.
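
The selection loop described in the caption can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the state definition (the currently used band and channel), the subtraction of the overhead from a base reward, and all names (`make_q_table`, `reward_with_overhead`, `q_update`, `greedy_action`) are assumptions made for illustration. The base reward, which in the paper depends on the DDR, channel utilization, and sensing results, is passed in here as a plain number.

```python
from itertools import product


def make_q_table(bands, channels_per_band):
    """Build a zero-initialized Q-table.

    Assumption: both states and actions are (band, channel) pairs,
    the state being the pair currently in use.
    """
    actions = list(product(range(bands), range(channels_per_band)))
    return {s: {a: 0.0 for a in actions} for s in actions}


def reward_with_overhead(base_reward, old_action, new_action, overhead):
    """Apply the band-change overhead to a base reward.

    Assumption: the overhead is subtracted, and only when the band
    of the new channel differs from the band of the old one.
    """
    old_band, _ = old_action
    new_band, _ = new_action
    penalty = overhead if new_band != old_band else 0.0
    return base_reward - penalty


def q_update(q, state, action, r, next_state, alpha, gamma):
    """Standard Q-learning update; returns the new Q-value."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])
    return q[state][action]


def greedy_action(q, state):
    """Pick the (band, channel) pair with the highest Q-value."""
    return max(q[state], key=q[state].get)


# Usage: two bands with two channels each, hypothetical parameter values.
q = make_q_table(bands=2, channels_per_band=2)
current = (0, 0)
candidate = (1, 0)  # switching to the other band incurs the overhead
r = reward_with_overhead(1.0, current, candidate, overhead=0.4)
q_update(q, current, candidate, r, next_state=candidate, alpha=0.5, gamma=0.9)
```

With the same base reward, a candidate in the current band receives a higher reward than one in a different band, so the learned Q-values favor staying in the band unless the other band offers enough gain to offset the switching overhead.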
