Valx: A system for extracting and structuring numeric lab test comparison statements from text

Tianyong Hao; Hongfang Liu; Chunhua Weng

doi:10.3414/ME15-01-0112

Digital Health Sciences

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

Objectives: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text.

Methods: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable-numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov.

Results: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively.

Conclusions: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.
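To make the pipeline described in the Methods more concrete, the following is a minimal Python sketch of four of the seven steps: operator and numeric extraction (step 2), variable-numeric association (step 4), unit normalization (step 6), and a simple plausibility check standing in for verification (step 7). It is not the Valx implementation. The lookup tables, the `Comparison` type, the `extract_comparisons` function, and the plausibility ranges are all hypothetical stand-ins for the UMLS and Internet-derived knowledge the abstract mentions, and the glucose conversion factor of 18.0 mg/dL per mmol/L is an approximation.

```python
import re
from typing import List, NamedTuple

# Hypothetical hand-written tables; Valx itself draws on UMLS and web knowledge.
VARIABLES = {
    "fasting plasma glucose": "glucose",
    "glucose": "glucose",
    "glycosylated hemoglobin": "HbA1c",
    "hba1c": "HbA1c",
    "a1c": "HbA1c",
}
OPERATORS = {
    "greater than": ">", "less than": "<", "at least": ">=", "at most": "<=",
    "<=": "<=", ">=": ">=", "≤": "<=", "≥": ">=", "<": "<", ">": ">", "=": "=",
}
# (variable, lowercase unit) -> (normalized unit, multiplier). Assumed values.
CONVERSIONS = {
    ("HbA1c", "%"): ("%", 1.0),
    ("glucose", "mg/dl"): ("mg/dL", 1.0),
    ("glucose", "mmol/l"): ("mg/dL", 18.0),  # approximate conversion factor
}
# Assumed plausibility ranges, used as a toy version of rule-based verification.
PLAUSIBLE = {"HbA1c": (3.0, 20.0), "glucose": (20.0, 600.0)}


class Comparison(NamedTuple):
    variable: str
    operator: str
    value: float
    unit: str


def _alternation(terms) -> str:
    # Longest terms first so "fasting plasma glucose" wins over "glucose".
    return "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))


PATTERN = re.compile(
    rf"\b(?P<var>{_alternation(VARIABLES)})\s*"
    rf"(?P<op>{_alternation(OPERATORS)})\s*"
    rf"(?P<num>\d+(?:\.\d+)?)\s*"
    rf"(?P<unit>%|mg/dl|mmol/l)",
    re.IGNORECASE,
)


def extract_comparisons(text: str) -> List[Comparison]:
    """Return structured comparison statements found in free text."""
    results = []
    for m in PATTERN.finditer(text):
        variable = VARIABLES[m.group("var").lower()]
        operator = OPERATORS[m.group("op").lower()]
        key = (variable, m.group("unit").lower())
        if key not in CONVERSIONS:
            continue  # unit does not fit this variable: drop the association
        unit, factor = CONVERSIONS[key]
        value = round(float(m.group("num")) * factor, 1)
        low, high = PLAUSIBLE[variable]
        if low <= value <= high:  # discard implausible values
            results.append(Comparison(variable, operator, value, unit))
    return results


if __name__ == "__main__":
    criteria = "HbA1c >= 7.0% and fasting plasma glucose < 7.0 mmol/L"
    for comparison in extract_comparisons(criteria):
        print(comparison)
```

On the sample criterion, the sketch yields one HbA1c statement (`>=`, 7.0, `%`) and one glucose statement normalized from 7.0 mmol/L to 126.0 mg/dL. A single regular expression handles only the simplest phrasings; the context-based filtering and hybrid variable identification described in the abstract are not modeled here.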

Original language	English (US)
Pages (from-to)	266-275
Number of pages	10
Journal	Methods of Information in Medicine
Volume	55
Issue number	3
DOIs	https://doi.org/10.3414/ME15-01-0112
State	Published - 2016

Keywords

Clinical trial
Comparison statement
Medical informatics
Natural language processing
Patient selection

ASJC Scopus subject areas

Health Informatics
Advanced and Specialized Nursing
Health Information Management

Access to Document

10.3414/ME15-01-0112

Cite this

@article{c8aff6e19d3e4cbfba030dc15d56f6f4,
  title = "Valx: A system for extracting and structuring numeric lab test comparison statements from text",
  abstract = "Objectives: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. Methods: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable-numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. Results: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. Conclusions: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.",
  keywords = "Clinical trial, Comparison statement, Medical informatics, Natural language processing, Patient selection",
  author = "Tianyong Hao and Hongfang Liu and Chunhua Weng",
  note = "Publisher Copyright: {\textcopyright} Schattauer 2016.",
  year = "2016",
  doi = "10.3414/ME15-01-0112",
  language = "English (US)",
  volume = "55",
  pages = "266--275",
  journal = "Methods of Information in Medicine",
  issn = "0026-1270",
  publisher = "Schattauer GmbH",
  number = "3",
}

TY  - JOUR
T1  - Valx
T2  - A system for extracting and structuring numeric lab test comparison statements from text
AU  - Hao, Tianyong
AU  - Liu, Hongfang
AU  - Weng, Chunhua
N1  - Publisher Copyright: © Schattauer 2016.
PY  - 2016
Y1  - 2016
N2  - Objectives: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. Methods: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable-numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. Results: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. Conclusions: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.
AB  - Objectives: To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. Methods: Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable-numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. Results: The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. Conclusions: Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.
KW  - Clinical trial
KW  - Comparison statement
KW  - Medical informatics
KW  - Natural language processing
KW  - Patient selection
UR  - http://www.scopus.com/inward/record.url?scp=84968760283&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84968760283&partnerID=8YFLogxK
U2  - 10.3414/ME15-01-0112
DO  - 10.3414/ME15-01-0112
M3  - Article
C2  - 26940748
AN  - SCOPUS:84968760283
SN  - 0026-1270
VL  - 55
SP  - 266
EP  - 275
JO  - Methods of Information in Medicine
JF  - Methods of Information in Medicine
IS  - 3
ER  -
