Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Knee Arthroplasty

Elham Sagheb; Taghi Ramazanian; Ahmad P. Tafti; Sunyang Fu; Walter K. Kremers; Daniel J. Berry; David G. Lewallen; Sunghwan Sohn; Hilal Maradit Kremers

doi:10.1016/j.arth.2020.09.029

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Natural language processing (NLP) methods can process clinical free text in electronic health records, decreasing the need for costly manual chart review and improving data quality. We developed rule-based NLP algorithms to automatically extract surgery-specific data elements from knee arthroplasty operative notes.

Methods: Within a cohort of 20,000 knee arthroplasty operative notes from 2000 to 2017 at a large tertiary institution, we randomly selected independent pairs of training and test sets to develop and evaluate NLP algorithms to detect five major data elements. The training and test datasets were similar in size, ranging from 420 to 1592 surgeries. Expert rules using keywords in operative notes were used to implement NLP algorithms capturing: (1) category of surgery (total knee arthroplasty, unicompartmental knee arthroplasty, patellofemoral arthroplasty), (2) laterality of surgery, (3) constraint type, (4) presence of patellar resurfacing, and (5) implant model (catalog numbers). We used institutional registry data as our gold standard to evaluate the NLP algorithms.

Results: NLP algorithms to detect the category of surgery, laterality, constraint, and patellar resurfacing achieved 98.3%, 99.5%, 99.2%, and 99.4% accuracy on test datasets, respectively. The implant model algorithm achieved an F1-score (harmonic mean of precision and recall) of 99.9%.

Conclusions: NLP algorithms are a promising alternative to costly manual chart review to automate the extraction of embedded information within knee arthroplasty operative notes. Further validation in other hospital settings will enhance widespread implementation and efficiency in data capture for research and clinical purposes.

Level of Evidence: Level III.
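To make the approach concrete, the following is a minimal sketch of a keyword-based rule for one data element (laterality) and of the two evaluation metrics named in the abstract. The keyword patterns, the function names, and the "bilateral" and "unknown" labels are assumptions chosen for illustration; the abstract does not list the authors' actual expert rules, which are likely to be considerably more detailed (a bare keyword such as "right" can, for instance, occur in unrelated phrases).

```python
import re

# Hypothetical keyword rules; the study's actual expert rules are not
# given in the abstract.
LATERALITY_RULES = {
    "left": re.compile(r"\bleft\b", re.IGNORECASE),
    "right": re.compile(r"\bright\b", re.IGNORECASE),
    "bilateral": re.compile(r"\bbilateral\b", re.IGNORECASE),
}


def detect_laterality(note: str) -> str:
    """Return 'left', 'right', 'bilateral', or 'unknown' for one note."""
    hits = {label for label, rule in LATERALITY_RULES.items() if rule.search(note)}
    # An explicit "bilateral", or both sides mentioned, is treated as bilateral.
    if "bilateral" in hits or {"left", "right"} <= hits:
        return "bilateral"
    if len(hits) == 1:
        return next(iter(hits))
    return "unknown"


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of notes whose predicted label matches the gold standard."""
    if not gold:
        raise ValueError("gold standard must not be empty")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In an evaluation of this kind, the predicted labels for the test notes would be compared against registry values with `accuracy`, while `f1_score` suits elements such as implant catalog numbers, where a note can yield several extracted items.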

Original language	English (US)
Pages (from-to)	922-926
Number of pages	5
Journal	Journal of Arthroplasty
Volume	36
Issue number	3
DOIs	https://doi.org/10.1016/j.arth.2020.09.029
State	Published - Mar 2021

Keywords

artificial intelligence
constraint
electronic health records
natural language processing
patella resurfacing
total knee arthroplasty

ASJC Scopus subject areas

Orthopedics and Sports Medicine

Cite this

@article{dd3d4c9cb72049e8a074bcfe0b83275f,

title = "Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Knee Arthroplasty",

abstract = "Background: Natural language processing (NLP) methods have the capability to process clinical free text in electronic health records, decreasing the need for costly manual chart review, and improving data quality. We developed rule-based NLP algorithms to automatically extract surgery specific data elements from knee arthroplasty operative notes. Methods: Within a cohort of 20,000 knee arthroplasty operative notes from 2000 to 2017 at a large tertiary institution, we randomly selected independent pairs of training and test sets to develop and evaluate NLP algorithms to detect five major data elements. The size of the training and test datasets were similar and ranged between 420 to 1592 surgeries. Expert rules using keywords in operative notes were used to implement NLP algorithms capturing: (1) category of surgery (total knee arthroplasty, unicompartmental knee arthroplasty, patellofemoral arthroplasty), (2) laterality of surgery, (3) constraint type, (4) presence of patellar resurfacing, and (5) implant model (catalog numbers). We used institutional registry data as our gold standard to evaluate the NLP algorithms. Results: NLP algorithms to detect the category of surgery, laterality, constraint, and patellar resurfacing achieved 98.3%, 99.5%, 99.2%, and 99.4% accuracy on test datasets, respectively. The implant model algorithm achieved an F1-score (harmonic mean of precision and recall) of 99.9%. Conclusions: NLP algorithms are a promising alternative to costly manual chart review to automate the extraction of embedded information within knee arthroplasty operative notes. Further validation in other hospital settings will enhance widespread implementation and efficiency in data capture for research and clinical purposes. Level of Evidence: Level III.",

keywords = "artificial intelligence, constraint, electronic health records, natural language processing, patella resurfacing, total knee arthroplasty",

author = "Sagheb, Elham and Ramazanian, Taghi and Tafti, {Ahmad P.} and Fu, Sunyang and Kremers, {Walter K.} and Berry, {Daniel J.} and Lewallen, {David G.} and Sohn, Sunghwan and {Maradit Kremers}, Hilal",

note = "Publisher Copyright: {\textcopyright} 2020 Elsevier Inc.",

year = "2021",

month = mar,

doi = "10.1016/j.arth.2020.09.029",

language = "English (US)",

volume = "36",

pages = "922--926",

journal = "Journal of Arthroplasty",

issn = "0883-5403",

publisher = "Churchill Livingstone",

number = "3",

}

TY - JOUR

T1 - Use of Natural Language Processing Algorithms to Identify Common Data Elements in Operative Notes for Knee Arthroplasty

AU - Sagheb, Elham

AU - Ramazanian, Taghi

AU - Tafti, Ahmad P.

AU - Fu, Sunyang

AU - Kremers, Walter K.

AU - Berry, Daniel J.

AU - Lewallen, David G.

AU - Sohn, Sunghwan

AU - Maradit Kremers, Hilal

PY - 2021/3

Y1 - 2021/3

AB - Background: Natural language processing (NLP) methods have the capability to process clinical free text in electronic health records, decreasing the need for costly manual chart review, and improving data quality. We developed rule-based NLP algorithms to automatically extract surgery specific data elements from knee arthroplasty operative notes. Methods: Within a cohort of 20,000 knee arthroplasty operative notes from 2000 to 2017 at a large tertiary institution, we randomly selected independent pairs of training and test sets to develop and evaluate NLP algorithms to detect five major data elements. The size of the training and test datasets were similar and ranged between 420 to 1592 surgeries. Expert rules using keywords in operative notes were used to implement NLP algorithms capturing: (1) category of surgery (total knee arthroplasty, unicompartmental knee arthroplasty, patellofemoral arthroplasty), (2) laterality of surgery, (3) constraint type, (4) presence of patellar resurfacing, and (5) implant model (catalog numbers). We used institutional registry data as our gold standard to evaluate the NLP algorithms. Results: NLP algorithms to detect the category of surgery, laterality, constraint, and patellar resurfacing achieved 98.3%, 99.5%, 99.2%, and 99.4% accuracy on test datasets, respectively. The implant model algorithm achieved an F1-score (harmonic mean of precision and recall) of 99.9%. Conclusions: NLP algorithms are a promising alternative to costly manual chart review to automate the extraction of embedded information within knee arthroplasty operative notes. Further validation in other hospital settings will enhance widespread implementation and efficiency in data capture for research and clinical purposes. Level of Evidence: Level III.

KW - artificial intelligence

KW - constraint

KW - electronic health records

KW - natural language processing

KW - patella resurfacing

KW - total knee arthroplasty

UR - http://www.scopus.com/inward/record.url?scp=85092443985&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85092443985&partnerID=8YFLogxK

U2 - 10.1016/j.arth.2020.09.029

DO - 10.1016/j.arth.2020.09.029

M3 - Article

C2 - 33051119

AN - SCOPUS:85092443985

SN - 0883-5403

VL - 36

SP - 922

EP - 926

JO - Journal of Arthroplasty

JF - Journal of Arthroplasty

IS - 3

ER -
