Fig. 17 | EURASIP Journal on Wireless Communications and Networking

Fig. 17

From: Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks

a Average operation time, b average transmission rate, and c reward of utilization according to weight change. The panels show the average operation time, average data rate, and reward for channel utilization as the weight assignment is varied, for a DDR of 40 kbps. Because the reward function is a weighted sum of the objective functions, adjusting the weights steers the Q-learning toward the desired objective. Accordingly, increasing the weight of the operation time increases the average operation time, and increasing the weight of the data transmission rate increases the average transmission rate. Likewise, increasing the weight of the reward for utilization increases the average reward for utilization.
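
The effect of the weights described in the caption can be illustrated with a minimal sketch. It is not the article's reward function: the objective values, the weight vectors, the channel labels "A" and "B", and the function names are all hypothetical. The objectives are assumed to be normalized to [0, 1], and the update is the standard tabular Q-learning rule.

```python
from typing import Dict, Sequence


def weighted_reward(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-objective rewards (hypothetical, normalized to [0, 1])."""
    if len(objectives) != len(weights):
        raise ValueError("objectives and weights must have the same length")
    return sum(w * r for w, r in zip(weights, objectives))


def q_update(q_old: float, reward: float, max_next_q: float,
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Standard tabular Q-learning update for a single state-action pair."""
    return q_old + alpha * (reward + gamma * max_next_q - q_old)


def best_channel(channels: Dict[str, Sequence[float]],
                 weights: Sequence[float]) -> str:
    """Return the channel whose weighted reward is largest."""
    return max(channels, key=lambda name: weighted_reward(channels[name], weights))


# Hypothetical objective values per channel, in the order:
# (operation time, transmission rate, utilization).
CHANNELS = {
    "A": (0.9, 0.2, 0.5),  # long operation time, low rate
    "B": (0.3, 0.8, 0.5),  # short operation time, high rate
}

# Hypothetical weight assignments emphasizing one objective each.
TIME_HEAVY = (0.6, 0.2, 0.2)
RATE_HEAVY = (0.2, 0.6, 0.2)

if __name__ == "__main__":
    for label, weights in (("time-heavy", TIME_HEAVY), ("rate-heavy", RATE_HEAVY)):
        print(label, "->", best_channel(CHANNELS, weights))
```

With these assumed values, shifting weight from operation time to transmission rate moves the preferred channel from A to B, which matches the trend the caption reports: the objective with the larger weight dominates the reward, and therefore the learned policy.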
