TY - JOUR
T1 - Natural language processing of clinical notes for identification of critical limb ischemia
AU - Afzal, Naveed
AU - Mallipeddi, Vishnu Priya
AU - Sohn, Sunghwan
AU - Liu, Hongfang
AU - Chaudhry, Rajeev
AU - Scott, Christopher G.
AU - Kullo, Iftikhar J.
AU - Arruda-Olson, Adelaide M.
N1 - Funding Information:
Research reported in this publication was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health (award K01HL124045) and the NHGRI eMERGE (Electronic Records and Genomics) Network grants HG04599 and HG006379. This study was made possible using the resources of the Rochester Epidemiology Project supported by the National Institute on Aging of the National Institutes of Health (award R01AG034676) and the NLP framework established through the NIGMS award R01GM102282. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Summary points:
• Critical limb ischemia (CLI) is a complication of advanced peripheral artery disease (PAD) with diagnosis based on the presence of clinical signs and symptoms. Automated identification of cases is challenging due to absence of a single definitive International Classification of Diseases (ICD-9 or ICD-10) code for CLI.
• We developed a natural language processing (NLP)-based algorithm for ascertainment of CLI from narrative clinical notes of a community-based PAD cohort and compared the performance of the NLP algorithm with CLI-related billing codes. Both methods were compared to human abstraction as the gold standard.
• The CLI-NLP algorithm for identification of CLI had excellent positive predictive value with potential for translation to patient care for case identification and NLP-based clinical decision support tools. The CLI-NLP algorithm for automatic identification of CLI cases from clinical notes may enhance CLI research and eventually lead to improved quality of care of CLI patients.
Publisher Copyright:
© 2017 The Authors
PY - 2018/3
Y1 - 2018/3
AB - Background: Critical limb ischemia (CLI) is a complication of advanced peripheral artery disease (PAD) with diagnosis based on the presence of clinical signs and symptoms. However, automated identification of cases from electronic health records (EHRs) is challenging due to the absence of a single definitive International Classification of Diseases (ICD-9 or ICD-10) code for CLI. Methods and results: In this study, we extend a previously validated natural language processing (NLP) algorithm for PAD identification to develop and validate a subphenotyping NLP algorithm (CLI-NLP) for identification of CLI cases from clinical notes. We compared performance of the CLI-NLP algorithm with CLI-related ICD-9 billing codes. The gold standard for validation was human abstraction of clinical notes from EHRs. Compared to billing codes, the CLI-NLP algorithm had higher positive predictive value (PPV) (CLI-NLP 96%, billing codes 67%, p < 0.001), specificity (CLI-NLP 98%, billing codes 74%, p < 0.001) and F1-score (CLI-NLP 90%, billing codes 76%, p < 0.001). The sensitivity of these two methods was similar (CLI-NLP 84%; billing codes 88%; p < 0.12). Conclusions: The CLI-NLP algorithm for identification of CLI from narrative clinical notes in an EHR had excellent PPV and has potential for translation to patient care, as it will enable automated identification of CLI cases for quality projects and clinical decision support tools and will support a learning healthcare system.
KW - Critical limb ischemia
KW - Electronic health records
KW - Natural language processing
KW - Peripheral artery disease
KW - Subphenotyping
UR - http://www.scopus.com/inward/record.url?scp=85040017468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040017468&partnerID=8YFLogxK
U2 - 10.1016/j.ijmedinf.2017.12.024
DO - 10.1016/j.ijmedinf.2017.12.024
M3 - Article
C2 - 29425639
AN - SCOPUS:85040017468
SN - 1386-5056
VL - 111
SP - 83
EP - 89
JO - International Journal of Medical Informatics
JF - International Journal of Medical Informatics
ER -