TY - JOUR
T1 - Interpretable machine learning for predicting and evaluating hydrogen production from supercritical water gasification of coal
AU - Tian, Jianghua
AU - Dong, Runqiu
AU - Jia, Hanbing
AU - Peng, Zhiyong
AU - Liu, Zhigang
AU - Wang, Le
AU - Yi, Lei
AU - Xu, Jialing
AU - Jin, Hui
AU - Chen, Bin
AU - Guo, Liejin
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2026/1/15
Y1 - 2026/1/15
N2 - Optimizing the coal Supercritical Water Gasification (SCWG) process with Machine Learning (ML) models is a promising strategy for conserving experimental resources. However, the limited diversity of ML models in existing work, and the neglect of their interpretability, may hinder the development of coal SCWG technology. This paper systematically collected 233 experimental results (1631 data points) and used them to develop five ML models for analyzing coal SCWG: Support Vector Regression (SVR), AdaBoost Regression (ABR), Decision Tree (DT), Random Forest (RF) Regression and Gradient Boosting Regression (GBR). Of the five, DT and GBR showed the most robust predictive ability, with superior Mean Square Error (MSE), coefficient of determination (R2) and Mean Absolute Error (MAE). Analysis based on SHapley Additive exPlanations (SHAP) values identified Temperature (TEMP) and Residence Time (RT) as the main factors controlling gas production; both are significantly and positively correlated with it. The SHAP values of the GBR model interpret well how the coal SCWG parameters exert their influence. In particular, Concentration (CR) is negatively correlated with the gasification yields of H2, CO, and CO2, but positively correlated with the yield of CH4. Given its predictive ability (MSE of 0.54, R2 of 0.97, MAE of 0.19) and its mechanistic interpretability, the GBR model may be a superior tool for supporting coal SCWG technology. Error and catalyst information were added to the GBR model as input features to further enhance its robustness. Compared to the kinetic model, the GBR model improved the accuracy and generalization ability of the four-gas yield prediction by expanding the input parameters (TEMP, RT, CR, error, catalyst type and concentration). This work would be of great value in the prediction and optimization of the coal SCWG process.
KW - Coal
KW - Machine learning
KW - Optimization
KW - Prediction and interpretation
KW - Supercritical water gasification
UR - https://www.scopus.com/pages/publications/105009798541
U2 - 10.1016/j.fuel.2025.136173
DO - 10.1016/j.fuel.2025.136173
M3 - Article
AN - SCOPUS:105009798541
SN - 0016-2361
VL - 404
JO - Fuel
JF - Fuel
M1 - 136173
ER -