Figure 4
From: Sensing time and power allocation for cognitive radios using distributed Q-learning

Exploration strategy consisting of pure exploration during the first seconds of each TDMA time slot, followed by pure exploitation during the remaining seconds of the slot.
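
The two-phase schedule described in the caption can be sketched as a simple action-selection rule. This is a minimal illustration and not the authors' implementation: the names `select_action`, `t_in_slot`, and `explore_duration` are hypothetical, the length of the exploration phase is left as a free parameter because the caption does not state it, and the greedy step assumes that larger Q-values are better (if the Q-values represent costs, `max` would be replaced by `min`).

```python
import random


def select_action(q_values, t_in_slot, explore_duration, rng=random):
    """Return an action index for the current instant of a time slot.

    q_values         -- list of Q-values, one per action (assumed: higher is better)
    t_in_slot        -- time elapsed since the start of the current slot, in seconds
    explore_duration -- hypothetical length of the exploration phase, in seconds
    rng              -- source of randomness (the random module or a random.Random)
    """
    if t_in_slot < explore_duration:
        # Pure exploration: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Pure exploitation: pick the action with the best current Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Calling `select_action(q, t, explore_duration)` with `t` reset to zero at the start of every TDMA slot reproduces the alternation shown in the figure: random choices early in the slot, greedy choices for the rest of it.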