Abstract
An efficient energy trading strategy has been shown to play a vital role in reducing participants’ payments in power-grid energy trading, which in turn improves the grid’s operating efficiency and participants’ willingness to trade. However, as the number of participants grows, the stability and efficiency of the energy trading system are severely challenged. To address this issue, this paper proposes an actor–critic-based bidding strategy for energy trading participants. Specifically, because bidding is a sequential decision-making problem, we model it as a Markov decision process in which the state comprises three elements (total supply, total demand, and the participant’s individual supply or demand) and the action is the bidding price and volume. Because existing value-based reinforcement learning bidding strategies cannot be applied to continuous action spaces, we adopt an actor–critic architecture in which the actor learns which action to execute and the critic evaluates the long-term reward conditioned on the current state–action pair. Simulation results in energy trading scenarios with different numbers of participants indicate that the proposed method obtains a higher cumulative reward than the traditional greedy method.
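The state, action, and actor–critic roles described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the function approximators, the reward, or the update rules, so everything below is an assumption made for illustration: the linear actor and critic, the deterministic-policy-gradient-style actor step, the price bounds, and all names (`State`, `LinearActor`, `LinearCritic`) are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class State:
    total_supply: float
    total_demand: float
    own_quantity: float  # the participant's individual supply or demand

    def features(self):
        # Trailing 1.0 acts as a bias term.
        return [self.total_supply, self.total_demand, self.own_quantity, 1.0]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


class LinearActor:
    """Maps state features to a continuous action (price, volume)."""

    def __init__(self, price_bounds, seed=0):
        rng = random.Random(seed)
        self.w_price = [rng.uniform(-0.01, 0.01) for _ in range(4)]
        self.w_volume = [rng.uniform(-0.01, 0.01) for _ in range(4)]
        self.price_bounds = price_bounds  # assumed market price limits

    def act(self, state, noise_std=0.0, rng=None):
        x = state.features()
        price = dot(self.w_price, x)
        volume = dot(self.w_volume, x)
        if rng is not None and noise_std > 0.0:
            # Gaussian exploration noise on both action components.
            price += rng.gauss(0.0, noise_std)
            volume += rng.gauss(0.0, noise_std)
        lo, hi = self.price_bounds
        price = min(max(price, lo), hi)
        # A participant cannot bid more volume than it holds or needs.
        volume = min(max(volume, 0.0), state.own_quantity)
        return price, volume

    def improve(self, state, critic, lr=1e-3):
        # Move the action in the direction that raises the critic's Q value.
        # For a linear critic, dQ/d(price) and dQ/d(volume) are simply its
        # last two weights. Clipping in act() is ignored in this gradient.
        x = state.features()
        dq_dprice, dq_dvolume = critic.w[-2], critic.w[-1]
        self.w_price = [w + lr * dq_dprice * xi
                        for w, xi in zip(self.w_price, x)]
        self.w_volume = [w + lr * dq_dvolume * xi
                         for w, xi in zip(self.w_volume, x)]


class LinearCritic:
    """Estimates Q(s, a) as a linear function of state features and action."""

    def __init__(self):
        self.w = [0.0] * 6  # 4 state features + price + volume

    def q(self, state, action):
        return dot(self.w, state.features() + list(action))

    def td_update(self, s, a, reward, s_next, a_next, gamma=0.95, lr=1e-3):
        # One-step temporal-difference update toward r + gamma * Q(s', a').
        target = reward + gamma * self.q(s_next, a_next)
        td_error = target - self.q(s, a)
        x = s.features() + list(a)
        self.w = [wi + lr * td_error * xi for wi, xi in zip(self.w, x)]
        return td_error
```

In a training loop under these assumptions, each step would call `act` to produce a bid, observe a reward from the market, call `td_update` on the critic, and then call `improve` on the actor. Because the actor outputs real-valued price and volume directly, no discretization of the action space is needed, which is the limitation of value-based methods that the abstract points to.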
| Original language | English |
|---|---|
| Article number | 1017438 |
| Journal | Frontiers in Energy Research |
| Volume | 10 |
| DOIs | |
| State | Published - 16 Jan 2023 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy
Keywords
- actor–critic architecture
- continuous action space
- double-auction mechanism
- energy trading in smart grid
- reinforcement learning method
Fingerprint
Dive into the research topics of 'A deep reinforcement learning-based bidding strategy for participants in a peer-to-peer energy trading scenario'. Together they form a unique fingerprint.