TY - JOUR
T1 - Unlocking the black box beyond Bayesian global optimization for materials design using reinforcement learning
AU - Xian, Yuehui
AU - Ding, Xiangdong
AU - Jiang, Xue
AU - Zhou, Yumei
AU - Sun, Jun
AU - Xue, Dezhen
AU - Lookman, Turab
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
N2 - Materials design often becomes an expensive black-box optimization problem due to limitations in balancing exploration-exploitation trade-offs in high-dimensional spaces. We propose a reinforcement learning (RL) framework that effectively navigates the complex design spaces through two complementary approaches: a model-based strategy utilizing surrogate models for sample-efficient exploration, and an on-the-fly strategy when direct experimental feedback is available. This approach demonstrates better performance in high-dimensional spaces (D ≥ 6) compared to Bayesian optimization (BO) with the Expected Improvement (EI) acquisition function through more dispersed sampling patterns and better landscape learning capabilities. Furthermore, we observe a synergistic effect when combining BO’s early-stage exploration with RL’s adaptive learning. Evaluations on both standard benchmark functions (Ackley, Rastrigin) and real-world high-entropy alloy data demonstrate statistically significant improvements (p < 0.01) over traditional BO with EI, particularly in complex, high-dimensional scenarios. This work addresses limitations of existing methods while providing practical tools for guiding experiments.
AB - Materials design often becomes an expensive black-box optimization problem due to limitations in balancing exploration-exploitation trade-offs in high-dimensional spaces. We propose a reinforcement learning (RL) framework that effectively navigates the complex design spaces through two complementary approaches: a model-based strategy utilizing surrogate models for sample-efficient exploration, and an on-the-fly strategy when direct experimental feedback is available. This approach demonstrates better performance in high-dimensional spaces (D ≥ 6) compared to Bayesian optimization (BO) with the Expected Improvement (EI) acquisition function through more dispersed sampling patterns and better landscape learning capabilities. Furthermore, we observe a synergistic effect when combining BO’s early-stage exploration with RL’s adaptive learning. Evaluations on both standard benchmark functions (Ackley, Rastrigin) and real-world high-entropy alloy data demonstrate statistically significant improvements (p < 0.01) over traditional BO with EI, particularly in complex, high-dimensional scenarios. This work addresses limitations of existing methods while providing practical tools for guiding experiments.
UR - https://www.scopus.com/pages/publications/105005592547
U2 - 10.1038/s41524-025-01639-w
DO - 10.1038/s41524-025-01639-w
M3 - Article
AN - SCOPUS:105005592547
SN - 2057-3960
VL - 11
JO - npj Computational Materials
JF - npj Computational Materials
IS - 1
M1 - 143
ER -