TY - JOUR
T1 - Reinforcement learning-based scheduling of multi-battery energy storage system
AU - Cheng, Guangran
AU - Dong, Lu
AU - Yuan, Xin
AU - Sun, Changyin
N1 - Publisher Copyright: © 1990-2011 Beijing Institute of Aerospace Information.
PY - 2023/2/1
Y1 - 2023/2/1
N2 - In this paper, a reinforcement learning-based multi-battery energy storage system (MBESS) scheduling policy is proposed to minimize the consumers' electricity cost. The MBESS scheduling problem is modeled as a Markov decision process (MDP) with unknown transition probability. However, the optimal value function is time-dependent and difficult to obtain because of the periodicity of the electricity price and residential load. Therefore, a series of time-independent action-value functions are proposed to describe every period of a day. To approximate every action-value function, a corresponding critic network is established, which is cascaded with other critic networks according to the time sequence. Then, the continuous management strategy is obtained from the related action network. Moreover, a two-stage learning protocol including offline and online learning stages is provided for detailed implementation in real-time battery management. Numerical experimental examples are given to demonstrate the effectiveness of the developed algorithm.
KW - data-driven
KW - multi-battery energy storage system (MBESS)
KW - periodic value iteration
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85150176179
U2 - 10.23919/JSEE.2023.000036
DO - 10.23919/JSEE.2023.000036
M3 - Article
AN - SCOPUS:85150176179
SN - 1004-4132
VL - 34
SP - 117
EP - 128
JO - Journal of Systems Engineering and Electronics
JF - Journal of Systems Engineering and Electronics
IS - 1
ER -