Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Knee Arthroplasty

Elham Sagheb, Taghi Ramazanian, Ahmad P. Tafti, Sunyang Fu, Walter K. Kremers, Daniel J. Berry, David G. Lewallen, Sunghwan Sohn, Hilal Maradit Kremers

Research output: Contribution to journal › Article › peer-review


Background: Natural language processing (NLP) methods can process clinical free text in electronic health records, decreasing the need for costly manual chart review and improving data quality. We developed rule-based NLP algorithms to automatically extract surgery-specific data elements from knee arthroplasty operative notes.

Methods: Within a cohort of 20,000 knee arthroplasty operative notes from 2000 to 2017 at a large tertiary institution, we randomly selected independent pairs of training and test sets to develop and evaluate NLP algorithms to detect five major data elements. The training and test datasets were similar in size, ranging from 420 to 1592 surgeries. Expert rules using keywords in operative notes were used to implement NLP algorithms capturing: (1) category of surgery (total knee arthroplasty, unicompartmental knee arthroplasty, patellofemoral arthroplasty), (2) laterality of surgery, (3) constraint type, (4) presence of patellar resurfacing, and (5) implant model (catalog numbers). We used institutional registry data as our gold standard to evaluate the NLP algorithms.

Results: NLP algorithms to detect the category of surgery, laterality, constraint, and patellar resurfacing achieved 98.3%, 99.5%, 99.2%, and 99.4% accuracy on test datasets, respectively. The implant model algorithm achieved an F1-score (harmonic mean of precision and recall) of 99.9%.

Conclusions: NLP algorithms are a promising alternative to costly manual chart review for automating the extraction of information embedded within knee arthroplasty operative notes. Further validation in other hospital settings will enhance widespread implementation and efficiency in data capture for research and clinical purposes.

Level of Evidence: Level III.
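The abstract does not list the expert rules themselves, so the following is only a minimal illustrative sketch of what a keyword-based rule and its evaluation against a gold standard might look like. The patterns, the function names (`detect_laterality`, `accuracy`, `f1_score`), and the sample note are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical keyword rules for laterality; the study's actual rules are not
# given in the abstract and are likely more elaborate.
LATERALITY_RULES = {
    "left": re.compile(r"\bleft\b", re.IGNORECASE),
    "right": re.compile(r"\bright\b", re.IGNORECASE),
    "bilateral": re.compile(r"\bbilateral\b", re.IGNORECASE),
}


def detect_laterality(note: str) -> str:
    """Return 'left', 'right', 'bilateral', or 'unknown' from keyword matches."""
    hits = {label for label, pattern in LATERALITY_RULES.items() if pattern.search(note)}
    # Assumption: an explicit "bilateral", or both sides mentioned, means bilateral.
    if "bilateral" in hits or {"left", "right"} <= hits:
        return "bilateral"
    if len(hits) == 1:
        return next(iter(hits))
    return "unknown"


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of predictions that agree with the gold-standard labels."""
    if not gold:
        raise ValueError("gold standard must not be empty")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    sample_note = "Procedure: Left total knee arthroplasty with patellar resurfacing."
    print(detect_laterality(sample_note))
```

In a setup like this, the predicted labels for a test set would be compared with registry values using `accuracy`, while `f1_score` suits elements such as catalog numbers, where each note can yield several extracted items.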

Original language: English (US)
Pages (from-to): 922-926
Number of pages: 5
Journal: Journal of Arthroplasty
Issue number: 3
State: Published - Mar 2021


  • artificial intelligence
  • constraint
  • electronic health records
  • natural language processing
  • patella resurfacing
  • total knee arthroplasty

ASJC Scopus subject areas

  • Orthopedics and Sports Medicine

