Natural language processing of radiology reports for identification of skeletal site-specific fractures

Yanshan Wang, Saeed Mehrabi, Sunghwan Sohn, Elizabeth J. Atkinson, Shreyasee Amin, Hongfang Liu

Research output: Contribution to journal › Article › peer-review



Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reducing the cost of care by preventing future fractures and their corresponding complications.

Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm to identify twenty skeletal site-specific fractures from radiology reports. The algorithm was based on regular expressions developed using MedTagger, an NLP tool in the Apache Unstructured Information Management Architecture (UIMA) pipeline that facilitates information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules through verification with physicians. This study was approved by the institutional review board (IRB) for human subject research.

Results: We validated the NLP algorithm on the radiology reports of a community-based cohort at Mayo Clinic against a gold standard constructed by medical experts. The micro-averaged sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm were 0.930, 1.0, 1.0, 0.941, and 0.961, respectively. The F1-score was 1.0 for 8 fractures and above 0.9 for 17 of the 20 fractures (85%).

Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatically identifying osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be used to accurately identify patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions for those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.
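
To make the approach described above concrete, the following is a minimal sketch of regular-expression fracture detection followed by micro-averaged evaluation. It uses Python's standard `re` module rather than MedTagger or UIMA, and every pattern, site name, and function name here (`SITES`, `NEGATION`, `find_fracture_sites`, `micro_metrics`) is a hypothetical illustration, not one of the rules developed in the study. The simple negation check is likewise an assumption; the abstract does not say how negated findings were handled. Micro-averaging is taken here to mean pooling true and false positives and negatives over all sites and reports before computing each metric.

```python
import re

# Hypothetical patterns for illustration only; not the study's rules.
FRACTURE = re.compile(r"\bfractur(?:e|es|ed)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:no|without|negative for)\b[^.]*\bfractur", re.IGNORECASE)
SITES = {
    "hip": re.compile(r"\b(?:hip|femoral neck)\b", re.IGNORECASE),
    "wrist": re.compile(r"\b(?:wrist|distal radius)\b", re.IGNORECASE),
    "rib": re.compile(r"\bribs?\b", re.IGNORECASE),
}


def find_fracture_sites(report):
    """Return the set of sites with a non-negated fracture mention."""
    found = set()
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        if not FRACTURE.search(sentence) or NEGATION.search(sentence):
            continue
        for site, pattern in SITES.items():
            if pattern.search(sentence):
                found.add(site)
    return found


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def micro_metrics(predicted, gold):
    """Pool counts over every (report, site) pair, then compute metrics.

    predicted, gold: lists of sets of site names, one set per report.
    """
    tp = fp = fn = tn = 0
    for pred_sites, gold_sites in zip(predicted, gold):
        for site in SITES:
            in_pred, in_gold = site in pred_sites, site in gold_sites
            if in_pred and in_gold:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_gold:
                fn += 1
            else:
                tn += 1
    sensitivity = _ratio(tp, tp + fn)
    ppv = _ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "ppv": ppv,
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * ppv * sensitivity, ppv + sensitivity),
    }


# Made-up example reports and gold labels, not data from the study.
reports = [
    "There is an acute fracture of the left femoral neck.",
    "No fracture of the distal radius. Ribs are intact.",
    "Healing fractures of the right 5th and 6th ribs.",
]
gold = [{"hip"}, set(), {"rib", "wrist"}]
predicted = [find_fracture_sites(r) for r in reports]
metrics = micro_metrics(predicted, gold)
```

Because the counts are pooled before the ratios are taken, sites with many reports weigh more heavily in the result than rare sites, which is the usual trade-off of micro-averaging compared with averaging per-site scores.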

Original language: English (US)
Article number: 73
Journal: BMC Medical Informatics and Decision Making
State: Published - Apr 4 2019


Keywords

  • Electronic health records
  • Fracture identification
  • Natural language processing
  • Radiology reports

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics

