TY - GEN
T1 - Learning robust patient representations from multi-modal electronic health records
T2 - 2021 SIAM International Conference on Data Mining, SDM 2021
AU - Zhang, Xianli
AU - Qian, Buyue
AU - Li, Yang
AU - Liu, Yang
AU - Chen, Xi
AU - Guan, Chong
AU - Li, Chen
N1 - Publisher Copyright:
© 2021 by SIAM.
PY - 2021
Y1 - 2021
N2 - Predicting patients’ future outcomes by analyzing electronic health records (EHRs) is an active topic in machine learning. The key challenge in this area is how to transform high-dimensional, redundant, and heterogeneous EHRs into appropriate representations. In this paper, we argue for four desired properties of ideal patient representation learning: completeness, cross-modality invariance, anti-nuisance, and personality maintenance. To obtain these properties, we propose a Supervised Deep Patient Representation Learning Framework (SDPRL) that learns patient representations incorporating the complete semantics of health conditions from multi-modal EHR data. Furthermore, we propose to maximize the mutual information (MI) between each pair of modal representations while minimizing the task-specific loss function. This not only retains task-relevant semantic information in the learned representations, but also makes them relatively invariant across modalities, robust to nuisance factors, and able to maintain patient personality. With experiments on the publicly available MIMIC-III dataset for mortality prediction and length-of-stay (LOS) forecasting, we empirically demonstrate that the proposed SDPRL achieves higher prediction performance than baseline frameworks. Moreover, we demonstrate that SDPRL exhibits the desired properties we argued for: it handles missing modalities at test time well and advances the goal of personalized medicine.
AB - Predicting patients’ future outcomes by analyzing electronic health records (EHRs) is an active topic in machine learning. The key challenge in this area is how to transform high-dimensional, redundant, and heterogeneous EHRs into appropriate representations. In this paper, we argue for four desired properties of ideal patient representation learning: completeness, cross-modality invariance, anti-nuisance, and personality maintenance. To obtain these properties, we propose a Supervised Deep Patient Representation Learning Framework (SDPRL) that learns patient representations incorporating the complete semantics of health conditions from multi-modal EHR data. Furthermore, we propose to maximize the mutual information (MI) between each pair of modal representations while minimizing the task-specific loss function. This not only retains task-relevant semantic information in the learned representations, but also makes them relatively invariant across modalities, robust to nuisance factors, and able to maintain patient personality. With experiments on the publicly available MIMIC-III dataset for mortality prediction and length-of-stay (LOS) forecasting, we empirically demonstrate that the proposed SDPRL achieves higher prediction performance than baseline frameworks. Moreover, we demonstrate that SDPRL exhibits the desired properties we argued for: it handles missing modalities at test time well and advances the goal of personalized medicine.
UR - https://www.scopus.com/pages/publications/85102820615
M3 - Conference contribution
AN - SCOPUS:85102820615
T3 - SIAM International Conference on Data Mining, SDM 2021
SP - 585
EP - 593
BT - SIAM International Conference on Data Mining, SDM 2021
PB - Society for Industrial and Applied Mathematics (SIAM)
Y2 - 29 April 2021 through 1 May 2021
ER -