
Diabetic complication prediction using a similarity-enhanced latent Dirichlet allocation model

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Diabetes and its complications have been recognized worldwide as a major public health threat. Predicting diabetic complications is regarded as a highly effective technique for increasing the survival rate of diabetic patients. While many studies currently use medical images and structured medical records, very limited effort has been dedicated to applying data mining techniques to unstructured textual medical records, such as admission and discharge records. Moreover, the similarities among medical records, which existing approaches overlook, could potentially improve the accuracy of prediction models. In this paper, we propose an approach for diabetic complication prediction based on a similarity-enhanced latent Dirichlet allocation (seLDA) model. Specifically, we first estimate the similarity between textual medical records after data preprocessing, and then we perform seLDA-based diabetic complication topic mining under similarity constraints. Finally, we construct a prediction model by solving a multilabel classification problem with support vector machines (SVMs). The experimental results show that our approach outperforms the conventional LDA-based approach in similarity indices by 22.49%. Additionally, our approach shows significant improvements in prediction accuracy over four other representative seLDA-based approaches, namely those using random forests (RF), k-nearest neighbors (KNN), logistic regression (LR) and deep neural networks (DNNs).
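The first step of the pipeline described above, estimating similarity between textual medical records, can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so the bag-of-words cosine similarity below is an assumption chosen for illustration, not the paper's method. The `tokenize` and `cosine_similarity` helpers and the three sample records are hypothetical and invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(record_a, record_b):
    """Cosine similarity between the word-count vectors of two records.

    Assumption: plain term counts stand in for whatever preprocessing
    and similarity measure the paper actually applies.
    """
    counts_a = Counter(tokenize(record_a))
    counts_b = Counter(tokenize(record_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy records, not taken from the paper's data set.
records = [
    "patient admitted with blurred vision and elevated blood glucose",
    "blurred vision reported, blood glucose elevated on admission",
    "foot ulcer with numbness in lower limbs",
]

# Pairwise similarity matrix; entry [i][j] compares record i with record j.
similarity = [[cosine_similarity(a, b) for b in records] for a in records]

for row in similarity:
    print(["%.3f" % value for value in row])
```

In this toy example the first two records, which share most of their vocabulary, score higher against each other than either does against the third. A matrix of this kind is the sort of input that a similarity constraint on topic mining could draw on, although how seLDA incorporates it is not specified in the abstract.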

Original language: English
Pages (from-to): 12-24
Number of pages: 13
Journal: Information Sciences
Volume: 499
State: Published - Oct 2019
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Diabetic complication prediction
  • Latent Dirichlet allocation
  • Multilabel classification
  • Similarity enhancement
  • Topic mining
