Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions

Jingzhi Yu; Jennifer A. Pacheco; Anika S. Ghosh; Yuan Luo; Chunhua Weng; Ning Shang; Barbara Benoit; David S. Carrell; Robert J. Carroll; Ozan Dikilitas; Robert R. Freimuth; Vivian S. Gainer; Hakon Hakonarson; George Hripcsak; Iftikhar J. Kullo; Frank Mentch; Shawn N. Murphy; Peggy L. Peissig; Andrea H. Ramirez; Nephi Walton; Wei Qi Wei; Luke V. Rasmussen

doi:10.1186/s12911-022-01759-z

Research output: Contribution to journal › Article › peer-review

Abstract

Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness.

Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm.

Results: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites.

Discussion and conclusion: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.

Original language	English (US)
Article number	23
Journal	BMC Medical Informatics and Decision Making
Volume	22
Issue number	1
DOIs	https://doi.org/10.1186/s12911-022-01759-z
State	Published - Dec 2022

Keywords

Algorithm: Natural Language
Ambiguity
Electronic Health Records (EHR)
Phenotyping
Under-Specification
Vagueness

ASJC Scopus subject areas

Health Policy
Health Informatics
Computer Science Applications

Cite this

Yu, J., Pacheco, J. A., Ghosh, A. S., Luo, Y., Weng, C., Shang, N., Benoit, B., Carrell, D. S., Carroll, R. J., Dikilitas, O., Freimuth, R. R., Gainer, V. S., Hakonarson, H., Hripcsak, G., Kullo, I. J., Mentch, F., Murphy, S. N., Peissig, P. L., Ramirez, A. H., ... Rasmussen, L. V. (2022). Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions. BMC Medical Informatics and Decision Making, 22(1), Article 23. https://doi.org/10.1186/s12911-022-01759-z

Yu, J, Pacheco, JA, Ghosh, AS, Luo, Y, Weng, C, Shang, N, Benoit, B, Carrell, DS, Carroll, RJ, Dikilitas, O, Freimuth, RR, Gainer, VS, Hakonarson, H, Hripcsak, G, Kullo, IJ, Mentch, F, Murphy, SN, Peissig, PL, Ramirez, AH, Walton, N, Wei, WQ & Rasmussen, LV 2022, 'Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions', BMC Medical Informatics and Decision Making, vol. 22, no. 1, 23. https://doi.org/10.1186/s12911-022-01759-z

@article{ba9a5b5f549a4d669d93aaa55ac87f67,
title = "Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions",
abstract = "Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. Results: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. Discussion and conclusion: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.",
keywords = "Algorithm: Natural Language, Ambiguity, Electronic Health Records (EHR), Phenotyping, Under-Specification, Vagueness",
author = "Jingzhi Yu and Pacheco, {Jennifer A.} and Ghosh, {Anika S.} and Yuan Luo and Chunhua Weng and Ning Shang and Barbara Benoit and Carrell, {David S.} and Carroll, {Robert J.} and Ozan Dikilitas and Freimuth, {Robert R.} and Gainer, {Vivian S.} and Hakon Hakonarson and George Hripcsak and Kullo, {Iftikhar J.} and Frank Mentch and Murphy, {Shawn N.} and Peissig, {Peggy L.} and Ramirez, {Andrea H.} and Nephi Walton and Wei, {Wei Qi} and Rasmussen, {Luke V.}",
note = "Publisher Copyright: {\textcopyright} 2022, The Author(s).",
year = "2022",
month = dec,
doi = "10.1186/s12911-022-01759-z",
language = "English (US)",
volume = "22",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "BioMed Central",
number = "1",
}

TY  - JOUR
T1  - Under-specification as the source of ambiguity and vagueness in narrative phenotype algorithm definitions
AU  - Yu, Jingzhi
AU  - Pacheco, Jennifer A.
AU  - Ghosh, Anika S.
AU  - Luo, Yuan
AU  - Weng, Chunhua
AU  - Shang, Ning
AU  - Benoit, Barbara
AU  - Carrell, David S.
AU  - Carroll, Robert J.
AU  - Dikilitas, Ozan
AU  - Freimuth, Robert R.
AU  - Gainer, Vivian S.
AU  - Hakonarson, Hakon
AU  - Hripcsak, George
AU  - Kullo, Iftikhar J.
AU  - Mentch, Frank
AU  - Murphy, Shawn N.
AU  - Peissig, Peggy L.
AU  - Ramirez, Andrea H.
AU  - Walton, Nephi
AU  - Wei, Wei Qi
AU  - Rasmussen, Luke V.
PY  - 2022/12
Y1  - 2022/12
N2  - Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. Results: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. Discussion and conclusion: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.
AB  - Introduction: Currently, one of the commonly used methods for disseminating electronic health record (EHR)-based phenotype algorithms is providing a narrative description of the algorithm logic, often accompanied by flowcharts. A challenge with this mode of dissemination is the potential for under-specification in the algorithm definition, which leads to ambiguity and vagueness. Methods: This study examines incidents of under-specification that occurred during the implementation of 34 narrative phenotyping algorithms in the electronic Medical Record and Genomics (eMERGE) network. We reviewed the online communication history between algorithm developers and implementers within the Phenotype Knowledge Base (PheKB) platform, where questions could be raised and answered regarding the intended implementation of a phenotype algorithm. Results: We developed a taxonomy of under-specification categories via an iterative review process between two groups of annotators. Under-specifications that lead to ambiguity and vagueness were consistently found across narrative phenotype algorithms developed by all involved eMERGE sites. Discussion and conclusion: Our findings highlight that under-specification is an impediment to the accuracy and efficiency of the implementation of current narrative phenotyping algorithms, and we propose approaches for mitigating these issues and improved methods for disseminating EHR phenotyping algorithms.
KW  - Algorithm: Natural Language
KW  - Ambiguity
KW  - Electronic Health Records (EHR)
KW  - Phenotyping
KW  - Under-Specification
KW  - Vagueness
UR  - http://www.scopus.com/inward/record.url?scp=85123877196&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85123877196&partnerID=8YFLogxK
U2  - 10.1186/s12911-022-01759-z
DO  - 10.1186/s12911-022-01759-z
M3  - Article
C2  - 35090449
AN  - SCOPUS:85123877196
SN  - 1472-6947
VL  - 22
JO  - BMC Medical Informatics and Decision Making
JF  - BMC Medical Informatics and Decision Making
IS  - 1
M1  - 23
ER  -
