TY - JOUR
T1 - Self-play reinforcement learning guides protein engineering
AU - Wang, Yi
AU - Tang, Hui
AU - Huang, Lichao
AU - Pan, Lulu
AU - Yang, Lixiang
AU - Yang, Huanming
AU - Mu, Feng
AU - Yang, Meng
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2023/8
Y1 - 2023/8
N2 - Designing protein sequences towards desired properties is a fundamental goal of protein engineering, with applications in drug discovery and enzymatic engineering. Machine learning-guided directed evolution has shown success in expediting the optimization cycle and reducing experimental burden. However, efficient sampling in the vast design space remains a challenge. To address this, we propose EvoPlay, a self-play reinforcement learning framework based on the single-player version of AlphaZero. In this work, we mutate a single-site residue as an action to optimize protein sequences, analogous to playing pieces on a chessboard. A policy-value neural network reciprocally interacts with look-ahead Monte Carlo tree search to guide the optimization agent with breadth and depth. We extensively evaluate EvoPlay on a suite of in silico directed evolution tasks over full-length sequences or combinatorial sites using functional surrogates. EvoPlay also supports AlphaFold2 as a structural surrogate to design peptide binders with high affinities, validated by binding assays. Moreover, we harness EvoPlay to prospectively engineer luciferase, resulting in the discovery of variants with 7.8-fold bioluminescence improvement beyond wild type. In sum, EvoPlay holds great promise for facilitating protein design to tackle unmet academic, industrial and clinical needs.
UR - https://www.scopus.com/pages/publications/85165210189
U2 - 10.1038/s42256-023-00691-9
DO - 10.1038/s42256-023-00691-9
M3 - Article
AN - SCOPUS:85165210189
SN - 2522-5839
VL - 5
SP - 845
EP - 860
JO - Nature Machine Intelligence
JF - Nature Machine Intelligence
IS - 8
ER -