TY - JOUR
T1 - Probabilistic evaluation of cultural soil heritage hazards in China from extremely imbalanced site investigation data using SMOTE-Gaussian process classification
AU - Song, Chao
AU - Peng, Hongzhen
AU - Xu, Ling
AU - Zhao, Tengyuan
AU - Guo, Zhiqian
AU - Chen, Wenwu
N1 - Publisher Copyright: © 2024 Consiglio Nazionale delle Ricerche (CNR)
PY - 2024/5
Y1 - 2024/5
AB - Cultural soil heritages (CSHs) are artifacts with historical, artistic, and scientific significance; however, they are vulnerable to various hazards, such as weathering, fractures, hollowing, collapses, and gullies, especially when exposed to the outdoors. Because China has a large number of CSH sites, managing and protecting them through detailed on-site investigations is time-consuming and expensive, and evaluating the spatial distribution and degree of hazards across all of these sites is impractical. To address this issue, this paper develops a Gaussian process classification (GPC) method that predicts the spatial distribution of typical hazards (i.e., weathering, fractures, hollowing, collapses, and gullies) and the development level of each hazard from eight environmental factors (e.g., annual relative humidity and annual sunshine time) and a limited amount of investigation data. Because the investigation data for the different levels of each hazard are usually imbalanced and sparse, the study combines the synthetic minority oversampling technique (SMOTE) with GPC to form the SMOTE-GPC method. A real-world example illustrates the approach: the proposed method achieved F1 score, precision, recall, and Cohen's kappa values greater than 0.93 on both the training and testing datasets, indicating good performance.
KW - Cultural soil heritage
KW - Extremely imbalanced data
KW - Machine learning methods
KW - Non-parametric approach
KW - Spatial distribution prediction
UR - https://www.scopus.com/pages/publications/85186664034
U2 - 10.1016/j.culher.2024.02.014
DO - 10.1016/j.culher.2024.02.014
M3 - Article
AN - SCOPUS:85186664034
SN - 1296-2074
VL - 67
SP - 121
EP - 133
JO - Journal of Cultural Heritage
JF - Journal of Cultural Heritage
ER -