TY - JOUR
T1 - A weakly supervised model for the automated detection of adverse events using clinical notes
AU - Sanyal, Josh
AU - Rubin, Daniel
AU - Banerjee, Imon
N1 - Funding Information:
Funding: This work was supported by Grant Number U01FD004979/U01FD005978 from the FDA, which supports the UCSF-Stanford Center of Excellence in Regulatory Sciences and Innovation. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the HHS or FDA. Author contributions: JS, IB and DR conceived the project and led the studies. DR provided the study data. JS and IB carried out the design of the machine learning models and performed the experiments. JS and IB wrote the manuscript, and JS, DR and IB edited the manuscript. Data and materials availability: The classification model weights and training scripts can be obtained through an MTA.
Publisher Copyright:
© 2021 Elsevier Inc.
PY - 2022/2
Y1 - 2022/2
N2 - With clinical trials unable to detect all potential adverse reactions to drugs and medical devices prior to their release into the market, accurate post-market surveillance is critical to ensure their safety and efficacy. Electronic health records (EHR) contain rich observational patient data, making them a valuable source to actively monitor the safety of drugs and devices. While structured EHR data and spontaneous reporting systems often underreport the complexities of patient encounters and outcomes, free-text clinical notes offer greater detail about a patient's status. Previous studies have proposed machine learning methods to detect adverse events from clinical notes, but suffer from manually extracted features, reliance on costly hand-labeled data, and lack of validation on external datasets. To address these challenges, we develop a weakly-supervised machine learning framework for adverse event detection from unstructured clinical notes and evaluate it on insulin pump failure as a test case. Our model accurately detected cases of pump failure with 0.842 PR AUC on the holdout test set and 0.815 PR AUC when validated on an external dataset. Our approach allowed us to leverage a large dataset with far less hand-labeled data and can be easily transferred to additional adverse events for scalable post-market surveillance.
AB - With clinical trials unable to detect all potential adverse reactions to drugs and medical devices prior to their release into the market, accurate post-market surveillance is critical to ensure their safety and efficacy. Electronic health records (EHR) contain rich observational patient data, making them a valuable source to actively monitor the safety of drugs and devices. While structured EHR data and spontaneous reporting systems often underreport the complexities of patient encounters and outcomes, free-text clinical notes offer greater detail about a patient's status. Previous studies have proposed machine learning methods to detect adverse events from clinical notes, but suffer from manually extracted features, reliance on costly hand-labeled data, and lack of validation on external datasets. To address these challenges, we develop a weakly-supervised machine learning framework for adverse event detection from unstructured clinical notes and evaluate it on insulin pump failure as a test case. Our model accurately detected cases of pump failure with 0.842 PR AUC on the holdout test set and 0.815 PR AUC when validated on an external dataset. Our approach allowed us to leverage a large dataset with far less hand-labeled data and can be easily transferred to additional adverse events for scalable post-market surveillance.
KW - Insulin pump failure
KW - Natural language processing
KW - Scalable post-market surveillance
KW - Unstructured clinical notes
UR - http://www.scopus.com/inward/record.url?scp=85122520279&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122520279&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2021.103969
DO - 10.1016/j.jbi.2021.103969
M3 - Article
C2 - 34864210
AN - SCOPUS:85122520279
SN - 1532-0464
VL - 126
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
M1 - 103969
ER -