TY - JOUR
T1 - McGPCR
T2 - A Multimodal Learning Model with Improved Applicability Domain Characterization for Predicting G Protein-Coupled Receptor Affinity of Plastic Chemicals
AU - Liu, Wenjia
AU - Wang, Haobo
AU - Fu, Zhiqiang
AU - Cui, Yunhan
AU - Chen, Jingwen
PY - 2025/12/9
Y1 - 2025/12/9
N2 - A variety of chemicals in plastics may pose risks to human health, while only a limited number have been extensively studied for their toxicity. Binding to G protein-coupled receptors (GPCRs) serves as a crucial molecular initiating event in identifying chemicals that induce toxic effects in humans. Given the diversity of GPCRs and chemicals, the binding affinity remains largely elusive, necessitating high-throughput models with the functionality of integrating chemical and receptor features to enable predictions across multiple receptors. Herein, a human GPCR affinity data set was constructed, containing 96,776 records between 59,599 compounds and 109 GPCRs. A multimodal learning model, McGPCR, was built to predict the GPCR binding affinity of chemicals by integrating multimodal features of molecular graphs and receptor binding sites. The McGPCR outperformed models with chemical structures as the only predictor variables. Applicability domain (AD) characterization based on feature-activity landscape analysis was proposed, which ensures the reliability of predictions. The McGPCR, along with the AD, was employed to predict affinities of over 9000 plastic chemicals. By integration of the affinity, persistence, bioaccumulation, and production volume, 30 plastic chemicals with potentially high environmental risks were identified. The McGPCR with AD characterization can serve as a powerful tool for identifying toxic chemicals harmful to human health.
KW - binding affinity
KW - G protein-coupled receptor
KW - multimodal learning
KW - per- and polyfluoroalkyl substances
KW - plastic chemicals
UR - https://www.scopus.com/pages/publications/105024262315
U2 - 10.1021/acs.est.5c02770
DO - 10.1021/acs.est.5c02770
M3 - Article
C2 - 41251561
AN - SCOPUS:105024262315
SN - 0013-936X
VL - 59
SP - 25938
EP - 25949
JO - Environmental Science and Technology
JF - Environmental Science and Technology
IS - 48
ER -