
Real-Time Operation Management for Battery Swapping-Charging System via Multi-Agent Deep Reinforcement Learning

  • North China Electric Power University
  • Jinan University
  • University of Texas at Arlington

Research output: Contribution to journal › Article › peer-review

83 Scopus citations

Abstract

Battery swapping-charging systems (BSCSs) can provide better battery swapping services for electric vehicles (EVs) in large cities. In BSCSs, EV batteries can be centrally charged at battery charging stations (BCSs) and then dispatched via delivery trucks to battery swapping stations (BSSs) to support local EVs. This paper considers the real-time optimal scheduling problem in BSCSs, including battery charging, swapping and truck routing. We model this real-time scheduling problem as a decentralized partially observable Markov decision process (Dec-POMDP) and solve it using multi-agent deep reinforcement learning (MADRL) algorithms. The joint scheduling of trucks and BCSs is subject to many dynamic hard constraints coupling the two, which existing MADRL algorithms cannot handle. To this end, we combine MADRL with binary integer programming (BLP) and propose the Value Decomposition Network (VDN)-BLP algorithm to solve the problem with constraints. We also combine actor-critic architecture and local search with VDN-BLP to substantially improve computational efficiency with little performance loss. Simulation results based on historical battery swapping data in Sanya City verify the effectiveness of the proposed method.
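
As an illustration only (not the paper's VDN-BLP algorithm), the sketch below shows the general value-decomposition idea the abstract refers to: the joint value is the sum of per-agent values, so each agent could act on its own argmax, but a hard constraint that couples agents can make that joint action infeasible, and action selection then becomes a constrained 0-1 optimization. The Q-values, the three-agent setup, and the constraint `distinct_non_idle_actions` are hypothetical, and the exhaustive search is a stand-in for a binary programming solver.

```python
from itertools import product


def vdn_joint_value(q_values, joint_action):
    """Additive decomposition: joint value = sum of each agent's own Q-value."""
    return sum(q[a] for q, a in zip(q_values, joint_action))


def greedy_unconstrained(q_values):
    """With an additive joint value, each agent can simply take its own argmax."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_values)


def greedy_constrained(q_values, is_feasible):
    """Best feasible joint action by exhaustive search.

    Stand-in for a binary programming solver; only viable for tiny problems.
    """
    best_action, best_value = None, float("-inf")
    for joint_action in product(*(range(len(q)) for q in q_values)):
        if not is_feasible(joint_action):
            continue
        value = vdn_joint_value(q_values, joint_action)
        if value > best_value:
            best_action, best_value = joint_action, value
    return best_action, best_value


def distinct_non_idle_actions(joint_action):
    """Hypothetical coupling constraint: no two agents pick the same non-idle action."""
    chosen = [a for a in joint_action if a != 0]
    return len(chosen) == len(set(chosen))


# Hypothetical per-agent Q-values: 3 agents, 3 actions each (action 0 = idle).
q_values = [
    [0.0, 2.0, 1.0],
    [0.0, 3.0, 0.5],
    [0.0, 1.0, 2.5],
]

unconstrained = greedy_unconstrained(q_values)
constrained, constrained_value = greedy_constrained(q_values, distinct_non_idle_actions)

print("per-agent argmax:", unconstrained,
      "feasible:", distinct_non_idle_actions(unconstrained))
print("best feasible joint action:", constrained, "value:", constrained_value)
```

In this toy case the per-agent argmax has two agents choosing the same action, which violates the constraint, so the feasible optimum differs from the decentralized greedy choice. Exhaustive search grows exponentially with the number of agents, which is why a dedicated solver or a faster approximation is needed at realistic scale.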

Original language: English
Pages (from-to): 559-571
Number of pages: 13
Journal: IEEE Transactions on Smart Grid
Volume: 14
Issue number: 1
State: Published - 1 Jan 2023
Externally published: Yes

Keywords

  • Battery swapping
  • battery logistics
  • charging scheduling
  • deep reinforcement learning
