Figure 6
From: Cell range expansion using distributed Q-learning in heterogeneous networks

Convergence of average throughput over trials. The PRB ratio is 40%. The Q-learning scheme is compared with schemes that use fixed common bias values of 16 dB and 32 dB. To show the convergence, the throughput values are averaged over every 10 trials.