Probabilistic evaluation of cultural soil heritage hazards in China from extremely imbalanced site investigation data using SMOTE-Gaussian process classification

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Cultural soil heritages (CSHs) are artifacts with historical, artistic, and scientific significance; however, they are vulnerable to various hazards, such as weathering, fractures, hollowing, collapses, and gullies. This is especially true for CSHs exposed to the outdoors. Because of the large number of CSH sites within China, managing and protecting these heritages through detailed on-site investigations is time-consuming and expensive. Consequently, evaluating the spatial distribution and degree of hazards developed in all of these heritages is impractical. To address this issue, this paper developed a Gaussian process classification (GPC) method to predict the spatial distribution of typical hazards (i.e., weathering, fractures, hollowing, collapses, and gullies) and the development level of each hazard from eight environmental factors (e.g., annual relative humidity and annual sunshine time) and a limited amount of investigation data. Because the investigation data for the different levels of each hazard are usually imbalanced and sparse, this study combined the synthetic minority oversampling technique (SMOTE) with GPC to form the SMOTE-GPC method. In a real-world example, the proposed method achieved F1 score, precision, recall, and Cohen's kappa values greater than 0.93 on both the training and testing datasets, indicating good performance.
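
The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, the neighbour count `k`, the two scaled features, and all data values are hypothetical and chosen only to show how SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k minority points closest to the base point
        # (the base point itself is excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: two environmental factors scaled to [0, 1],
# e.g. (annual relative humidity, annual sunshine time).
majority = [(0.20, 0.90), (0.25, 0.85), (0.30, 0.80), (0.22, 0.88),
            (0.28, 0.92), (0.35, 0.75), (0.18, 0.95), (0.32, 0.83)]
minority = [(0.80, 0.20), (0.85, 0.15), (0.75, 0.30)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

In a SMOTE-GPC style workflow, the balanced classes would then be used to fit a Gaussian process classifier (for example, scikit-learn's `GaussianProcessClassifier`), with the evaluation metrics computed on held-out data that was not oversampled.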

Original language: English
Pages (from-to): 121-133
Number of pages: 13
Journal: Journal of Cultural Heritage
Volume: 67
State: Published - May 2024

Keywords

  • Cultural soil heritage
  • Extremely imbalanced data
  • Machine learning methods
  • Non-parametric approach
  • Spatial distribution prediction
