GRU-D-Weibull: A novel real-time individualized endpoint prediction

Xiaoyang Ruan; Liwei Wang; Charat Thongprayoon; Wisit Cheungpasitporn; Hongfang Liu

doi:10.1016/j.artmed.2023.102696

Digital Health Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Background: In the era of healthcare digital transformation, using electronic health record (EHR) data to generate endpoint estimates for active monitoring is highly desirable in chronic disease management. However, traditional predictive modeling strategies that rely on well-curated data sets can have limited real-world implementation potential because of data quality issues in EHR data.

Methods: We propose a novel predictive modeling approach, GRU-D-Weibull, which models the Weibull distribution using gated recurrent units with decay (GRU-D), for real-time individualized endpoint prediction and population-level risk management using EHR data.

Experiments: We systematically evaluated the performance and showcased the real-world implementability of the proposed approach through individual-level endpoint prediction in a cohort of patients with chronic kidney disease stage 4 (CKD4). A total of 536 features, including ICD/CPT codes, medications, lab tests, vital measurements, and demographics, were retrieved for 6879 CKD4 patients. Performance metrics, including C-index, L1-loss, Parkes' error, and predicted survival probability at time of event, were compared between GRU-D-Weibull and alternative approaches: the accelerated failure time model (AFT), XGBoost-based AFT (XGB(AFT)), random survival forest (RSF), and Nnet-survival. Both in-process and post-process calibration were tested on GRU-D-Weibull-generated survival probabilities.

Results: GRU-D-Weibull demonstrated a C-index of ~0.7 at index date, which increased to ~0.77 at 4.3 years of follow-up, comparable to that of RSF. GRU-D-Weibull achieved an absolute L1-loss of ~1.1 years (sd≈0.95) at CKD4 index date and a minimum of ~0.45 years (sd≈0.3) at 4 years of follow-up, compared with ~1.4 years (sd≈1.1) at index date and ~0.64 years (sd≈0.26) at 4 years for the second-ranked RSF. Both significantly outperformed the other competing approaches. GRU-D-Weibull constrained the predicted survival probability at time of event to a smaller and more stable range than competing models throughout follow-up. Significant correlations were observed between prediction error and the missing proportions of all major categories of input features at index date (Corr ~0.1 to ~0.3); these faded within 1 year after index date as more data became available. Through post-training recalibration, we achieved close alignment between predicted and observed survival probabilities across multiple prediction horizons at different time points during follow-up.

Conclusion: GRU-D-Weibull shows advantages over competing methods in handling the missingness commonly encountered in EHR data and in providing both probability and point estimates for diverse prediction horizons during follow-up. The experiment highlights the potential of GRU-D-Weibull as a suitable candidate for individualized endpoint risk management, using real-time clinical data to generate endpoint estimates for monitoring. Additional research is warranted to evaluate the influence of different data quality aspects on prediction performance. Furthermore, collaboration with clinicians is essential to explore the integration of this approach into clinical workflows and to evaluate its effects on decision-making processes and patient outcomes.
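
The abstract mentions both probability and point estimates derived from a Weibull distribution. As a rough illustration only, the sketch below shows how such quantities could be computed from a pair of Weibull parameters. It assumes the standard two-parameter form S(t) = exp(-(t/scale)^shape); the abstract does not state the parameterization, the choice of point estimate, or the exact metric definitions used in the paper, so the names `shape` and `scale`, the use of the median as a point estimate, and all numeric values here are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival probability S(t) = exp(-(t / scale) ** shape).

    Assumes the standard two-parameter Weibull form; `shape` and
    `scale` are hypothetical names for whatever a model would output.
    """
    return math.exp(-((t / scale) ** shape))


def weibull_median(shape, scale):
    """Median time-to-event, one possible point estimate (S(t) = 0.5)."""
    return scale * math.log(2.0) ** (1.0 / shape)


def l1_loss(predicted_time, observed_time):
    """Absolute error between predicted and observed time-to-event."""
    return abs(predicted_time - observed_time)


# Hypothetical parameters for one patient at one time point (years).
shape, scale = 1.5, 3.0
observed_event_time = 2.2  # made-up value for illustration

point_estimate = weibull_median(shape, scale)
prob_at_event = weibull_survival(observed_event_time, shape, scale)

print(f"median time-to-event: {point_estimate:.2f} years")
print(f"L1 loss: {l1_loss(point_estimate, observed_event_time):.2f} years")
print(f"predicted survival probability at time of event: {prob_at_event:.2f}")
```

Under this assumed form, the survival probability evaluated at the median is 0.5 by construction, which may help in reading a metric such as predicted survival probability at time of event.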

Original language	English (US)
Article number	102696
Journal	Artificial Intelligence in Medicine
Volume	146
DOIs	https://doi.org/10.1016/j.artmed.2023.102696
State	Published - Dec 2023

Keywords

Chronic kidney disease (CKD)
Deep learning
Electronic health record (EHR)
Gated recurrent units with decay (GRU-D)
Individualized risk management
Real-time endpoint prediction

ASJC Scopus subject areas

Medicine (miscellaneous)
Artificial Intelligence

Access to Document

10.1016/j.artmed.2023.102696

Cite this

@article{3871b12233fd4eb88e935405d24fb55f,

title = "GRU-D-Weibull: A novel real-time individualized endpoint prediction",

keywords = "Chronic kidney disease (CKD), Deep learning, Electronic health record (EHR), Gated recurrent units with decay (GRU-D), Individualized risk management, Real-time endpoint prediction",

author = "Xiaoyang Ruan and Liwei Wang and Charat Thongprayoon and Wisit Cheungpasitporn and Hongfang Liu",

note = "Publisher Copyright: {\textcopyright} 2023 The Authors",

year = "2023",

month = dec,

doi = "10.1016/j.artmed.2023.102696",

language = "English (US)",

volume = "146",

journal = "Artificial Intelligence in Medicine",

issn = "0933-3657",

publisher = "Elsevier",

}

TY - JOUR

T1 - GRU-D-Weibull

T2 - A novel real-time individualized endpoint prediction

AU - Ruan, Xiaoyang

AU - Wang, Liwei

AU - Thongprayoon, Charat

AU - Cheungpasitporn, Wisit

AU - Liu, Hongfang

PY - 2023/12

Y1 - 2023/12

KW - Chronic kidney disease (CKD)

KW - Deep learning

KW - Electronic health record (EHR)

KW - Gated recurrent units with decay (GRU-D)

KW - Individualized risk management

KW - Real-time endpoint prediction

UR - http://www.scopus.com/inward/record.url?scp=85178357960&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85178357960&partnerID=8YFLogxK

U2 - 10.1016/j.artmed.2023.102696

DO - 10.1016/j.artmed.2023.102696

M3 - Article

C2 - 38042597

AN - SCOPUS:85178357960

SN - 0933-3657

VL - 146

JO - Artificial Intelligence in Medicine

JF - Artificial Intelligence in Medicine

M1 - 102696

ER -
