TY - JOUR
T1 - A deep reinforcement learning-based power control scheme for the 5G wireless systems
AU - Liang, Renjie
AU - Lyu, Haiyang
AU - Fan, Jiancun
N1 - Publisher Copyright: © China Communications Magazine Co., Ltd. October 2023.
PY - 2023/10/1
Y1 - 2023/10/1
N2 - In the fifth generation (5G) wireless system, a closed-loop power control (CLPC) scheme based on a deep Q-learning network (DQN) is introduced to intelligently adjust the transmit power of the base station (BS), which can bring the user equipment (UE) received signal-to-interference-plus-noise ratio (SINR) into a target threshold range. However, the power control (PC) action selected by the DQN does not accurately match the fluctuations of the wireless environment, because the experience replay characteristic of the conventional DQN scheme can leave the target deep neural network (DNN) insufficiently trained. As a result, the Q-value of a sub-optimal PC action can exceed that of the optimal one. To solve this problem, we propose an improved DQN scheme. In the proposed scheme, we add an additional DNN to the conventional DQN and set a shorter training interval to speed up the training of the DNN so that it is fully trained. In this way, the proposed scheme ensures that the Q-value of the optimal action remains the maximum. After multiple episodes of training, the proposed scheme can generate more accurate PC actions that match the fluctuations of the wireless environment. As a result, the UE received SINR reaches the target threshold range faster and remains more stable. The simulation results show that the proposed scheme outperforms the conventional schemes.
AB - In the fifth generation (5G) wireless system, a closed-loop power control (CLPC) scheme based on a deep Q-learning network (DQN) is introduced to intelligently adjust the transmit power of the base station (BS), which can bring the user equipment (UE) received signal-to-interference-plus-noise ratio (SINR) into a target threshold range. However, the power control (PC) action selected by the DQN does not accurately match the fluctuations of the wireless environment, because the experience replay characteristic of the conventional DQN scheme can leave the target deep neural network (DNN) insufficiently trained. As a result, the Q-value of a sub-optimal PC action can exceed that of the optimal one. To solve this problem, we propose an improved DQN scheme. In the proposed scheme, we add an additional DNN to the conventional DQN and set a shorter training interval to speed up the training of the DNN so that it is fully trained. In this way, the proposed scheme ensures that the Q-value of the optimal action remains the maximum. After multiple episodes of training, the proposed scheme can generate more accurate PC actions that match the fluctuations of the wireless environment. As a result, the UE received SINR reaches the target threshold range faster and remains more stable. The simulation results show that the proposed scheme outperforms the conventional schemes.
KW - closed-loop power control (CLPC)
KW - reinforcement learning
KW - signal-to-interference-plus-noise ratio (SINR)
UR - https://www.scopus.com/pages/publications/85159812201
U2 - 10.23919/JCC.ea.2021-0523.202302
DO - 10.23919/JCC.ea.2021-0523.202302
M3 - Article
AN - SCOPUS:85159812201
SN - 1673-5447
VL - 20
SP - 109
EP - 119
JO - China Communications
JF - China Communications
IS - 10
ER -