
Simulation study on missing data imputation methods for longitudinal data in cohort studies

  • Li Yemian
  • Zhao Peng
  • Yang Yuhui
  • Wang Jingxian
  • Yan Hong
  • Chen Fangyao
  • Xi'an Jiaotong University

Research output: Contribution to journal, Article, peer-reviewed

11 Citations (Scopus)

Abstract

Objective: Missing data are an unavoidable problem in cohort studies. This paper uses a simulation study to compare the imputation performance of eight common missing data imputation methods on longitudinal data, providing a reference for handling missing data in longitudinal studies. Methods: The simulation was carried out in R, and longitudinal data with missing values were generated by the Monte Carlo method. The imputation methods were compared on average absolute deviation, average relative deviation, and the Type Ⅰ error of a subsequent regression analysis, in order to evaluate both their imputation performance on missing longitudinal data and their influence on subsequent multivariate analysis. Results: Mean imputation, k-nearest neighbor (KNN), regression imputation, and random forest had similar and stable imputation performance. Hot deck was inferior to these four methods. K-means clustering and the expectation maximization (EM) algorithm were among the worst performers and were unstable. Mean imputation, the EM algorithm, random forest, KNN, and regression imputation controlled the Type Ⅰ error, whereas multiple imputation, hot deck, and K-means clustering did not control it effectively. Conclusions: For missing data in longitudinal studies, mean imputation, KNN, regression imputation, and random forest are the better choices under the missing at random mechanism. When the missing ratio is not too large, multiple imputation and hot deck can also perform well, but K-means clustering and the EM algorithm are not recommended.
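The evaluation loop described in the abstract can be illustrated with a small sketch. It is written in Python rather than the R used in the study, and it is not the authors' code: the data-generating model, the logistic missingness rule, the exact deviation formulas, and every function name and parameter value below are assumptions chosen only for illustration. It generates complete longitudinal data, deletes follow-up values under a missing at random rule that depends on the observed baseline, applies mean imputation, and averages the absolute and relative deviations over Monte Carlo replicates.

```python
import math
import random
import statistics


def simulate_cohort(n_subjects, n_waves, rng):
    """Return a complete n_subjects x n_waves matrix of repeated measures."""
    data = []
    for _ in range(n_subjects):
        intercept = rng.gauss(50.0, 5.0)  # hypothetical subject-specific level
        row = [intercept + 2.0 * t + rng.gauss(0.0, 2.0) for t in range(n_waves)]
        data.append(row)
    return data


def impose_mar(data, base_rate, rng):
    """Delete follow-up values with a probability driven by the observed baseline."""
    baseline_mean = statistics.fmean(row[0] for row in data)
    base_logit = math.log(base_rate / (1.0 - base_rate))
    observed = []
    for row in data:
        # Assumed rule: a higher baseline value raises the chance of missingness.
        logit = base_logit + 0.1 * (row[0] - baseline_mean)
        p_missing = 1.0 / (1.0 + math.exp(-logit))
        new_row = [row[0]]  # the baseline wave is always observed
        for value in row[1:]:
            new_row.append(None if rng.random() < p_missing else value)
        observed.append(new_row)
    return observed


def mean_impute(observed):
    """Fill each missing cell with the mean of the observed values at that wave."""
    n_waves = len(observed[0])
    wave_means = []
    for t in range(n_waves):
        seen = [row[t] for row in observed if row[t] is not None]
        wave_means.append(statistics.fmean(seen))
    return [
        [wave_means[t] if value is None else value for t, value in enumerate(row)]
        for row in observed
    ]


def deviations(truth, observed, imputed):
    """Mean absolute and mean relative deviation over the imputed cells only."""
    abs_dev, rel_dev = [], []
    for true_row, obs_row, imp_row in zip(truth, observed, imputed):
        for true_v, obs_v, imp_v in zip(true_row, obs_row, imp_row):
            if obs_v is None:
                abs_dev.append(abs(imp_v - true_v))
                rel_dev.append(abs(imp_v - true_v) / abs(true_v))
    return statistics.fmean(abs_dev), statistics.fmean(rel_dev)


def run_simulation(n_reps=200, n_subjects=200, n_waves=4, base_rate=0.2, seed=1):
    """Average both deviation measures over n_reps Monte Carlo replicates."""
    rng = random.Random(seed)
    abs_results, rel_results = [], []
    for _ in range(n_reps):
        truth = simulate_cohort(n_subjects, n_waves, rng)
        observed = impose_mar(truth, base_rate, rng)
        imputed = mean_impute(observed)
        abs_dev, rel_dev = deviations(truth, observed, imputed)
        abs_results.append(abs_dev)
        rel_results.append(rel_dev)
    return statistics.fmean(abs_results), statistics.fmean(rel_results)


if __name__ == "__main__":
    mad, mrd = run_simulation()
    print(f"mean absolute deviation: {mad:.3f}")
    print(f"mean relative deviation: {mrd:.3f}")
```

Swapping `mean_impute` for another imputation function with the same input and output shape would allow a side-by-side comparison in the same loop; the Type Ⅰ error criterion mentioned in the abstract would additionally require fitting a regression model to each imputed data set, which this sketch leaves out.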

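Translated title of the contribution: Simulation study on missing data imputation methods for longitudinal data in cohort studies
Original language: Chinese (Traditional)
Pages (from-to): 1889-1894
Number of pages: 6
Journal: Chinese Journal of Endemiology
Volume: 42
Issue number: 10
DOI
Publication status: Published - 2021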

UN Sustainable Development Goals

This output contributes to the following UN Sustainable Development Goals (SDGs):

  1. SDG 3 - Good Health and Well-being

Keywords

  • Imputation
  • Longitudinal data
  • Missing data
