Abstract
As data centers continue to grow, the environmental impact of their energy consumption has become a major concern, making thermal management increasingly important. In this study, we address this challenge by adopting the Soft Actor-Critic (SAC) reinforcement learning algorithm to improve energy management efficiency. To improve adaptability to environmental changes and represent the current state more comprehensively, we introduce the Dynamic Control Interval SAC (DCI-SAC) structure and a combined-value state space. We conducted two groups of simulation experiments to evaluate the performance of SAC and its variants. The first group showed that, in a simulated data center model, SAC achieved energy savings of 32.23%, 9.86%, 10.77%, 6.95%, and 1.83% compared to PID, MPC, DQN, TRPO, and PPO, respectively, demonstrating SAC's superior algorithmic performance. The second group showed that DCI-SAC with a combined-value state space achieves up to a 6.25% reduction in energy consumption compared to SAC with the same state space, and up to a 9.48% reduction compared to SAC with a final-value state space. These results validate the effectiveness of DCI-SAC and the combined-value state space, showing that both improvements achieve superior energy efficiency and stability in the energy control of liquid-cooled data centers.
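The abstract contrasts a final-value state space with a combined-value one but does not define either. As a rough illustration only, the sketch below assumes that a final-value state keeps just the last sensor reading of a control interval, while a combined-value state also keeps summaries of the whole interval. The function names and the choice of summaries (final, mean, max) are hypothetical and are not taken from the paper.

```python
from statistics import mean


def final_value_state(readings):
    """Hypothetical final-value state: only the last reading of the interval."""
    return [readings[-1]]


def combined_value_state(readings):
    """Hypothetical combined-value state: last reading plus interval summaries.

    The choice of mean and max is an assumption made for illustration;
    the abstract does not say which values are combined.
    """
    return [readings[-1], mean(readings), max(readings)]


if __name__ == "__main__":
    # Made-up temperatures (deg C) sampled within one control interval.
    interval = [40.0, 46.0, 43.0]
    print(final_value_state(interval))
    print(combined_value_state(interval))
```

Under these assumptions, the transient peak of 46.0 in the made-up interval is invisible to the final-value state but is retained by the combined-value state, which is one plausible reading of what a more comprehensive state representation means when the control interval spans several samples.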
| Original language | English |
|---|---|
| Article number | 123815 |
| Journal | Applied Energy |
| Volume | 373 |
| State | Published - 1 Nov 2024 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy
Keywords
- Combined-value state space
- Data centers
- Dynamic control interval
- Reinforcement learning
- Soft actor-critic
- Thermal management
Fingerprint
Dive into the research topics of 'Optimal dynamic thermal management for data center via soft actor-critic algorithm with dynamic control interval and combined-value state space'. Together they form a unique fingerprint.