Fig. 14 | EURASIP Journal on Wireless Communications and Networking

Fig. 14

From: Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks

Rewards, states, and actions per iteration at DDR = 90 kbps. In a, the reward is stable after more than 10 iterations, although it drops temporarily at points throughout the interval because of random actions, as in Fig. 12. As shown in b, the agent mainly visits state 5. c shows that actions in band group 2 are selected most often.
