Fig. 2 | EURASIP Journal on Wireless Communications and Networking

Fig. 2

From: Dynamic handoff policy for RAN slicing by exploiting deep reinforcement learning

Deep reinforcement learning. This figure shows the architecture of the deep reinforcement learning algorithm. While the algorithm runs, the physical network supplies the agent with information such as the network status, the user actions, and the reward function. Using this information, the agent trains with the deep Q-learning algorithm given in Algorithm 1. The value function is approximated by a three-layer neural network.
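
The training loop described in the caption can be illustrated with a minimal sketch. It is not the authors' Algorithm 1: it assumes that "three-layer" means an input layer, one hidden layer, and an output layer, and it omits any experience replay or target network the article may use. The class name `ThreeLayerQNet`, the helper `epsilon_greedy`, the layer sizes, the discount factor `gamma`, and the learning rate `lr` are all hypothetical choices made for illustration.

```python
import numpy as np


class ThreeLayerQNet:
    """Input layer -> one ReLU hidden layer -> linear output (one Q-value per action).

    Illustrative only: the sizes and initialisation are assumptions, not
    values taken from the article.
    """

    def __init__(self, state_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (state_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, n_actions))
        self.b2 = np.zeros(n_actions)

    def q_values(self, state):
        hidden = np.maximum(0.0, state @ self.W1 + self.b1)
        return hidden @ self.W2 + self.b2

    def update(self, state, action, reward, next_state, done, gamma=0.9, lr=0.01):
        """One semi-gradient Q-learning step on a single transition.

        Returns the temporal-difference error measured before the update.
        """
        pre = state @ self.W1 + self.b1
        hidden = np.maximum(0.0, pre)
        q = hidden @ self.W2 + self.b2

        # Q-learning target: r if terminal, else r + gamma * max_a' Q(s', a').
        if done:
            target = reward
        else:
            target = reward + gamma * np.max(self.q_values(next_state))
        td_error = target - q[action]

        # Gradient of 0.5 * td_error**2 with the target held fixed.
        grad_q = np.zeros_like(q)
        grad_q[action] = -td_error
        grad_hidden = (self.W2 @ grad_q) * (pre > 0)

        self.W2 -= lr * np.outer(hidden, grad_q)
        self.b2 -= lr * grad_q
        self.W1 -= lr * np.outer(state, grad_hidden)
        self.b1 -= lr * grad_hidden
        return td_error


def epsilon_greedy(net, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(net.b2.shape[0]))
    return int(np.argmax(net.q_values(state)))
```

In this sketch the environment (here, the physical network) would supply `state`, `reward`, and `next_state` at each step, the agent would choose `action` with `epsilon_greedy`, and `update` would be called once per observed transition.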
