A computational framework for converting textual clinical diagnostic criteria into the quality data model

Na Hong, Dingcheng Li, Yue Yu, Qiongying Xiu, Hongfang Liu, Guoqian Jiang

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Background: Constructing standard and computable clinical diagnostic criteria is an important but challenging research field in the clinical informatics community. The Quality Data Model (QDM) is emerging as a promising information model for standardizing clinical diagnostic criteria.

Objective: To develop and evaluate automated methods for converting textual clinical diagnostic criteria into a structured format using QDM.

Methods: We used a clinical Natural Language Processing (NLP) tool known as cTAKES to detect sentences and annotate events in diagnostic criteria. We developed a rule-based approach for assigning the QDM datatype(s) to each individual criterion, and we applied a machine learning algorithm based on Conditional Random Fields (CRFs) to annotate the attributes belonging to each particular QDM datatype. We manually developed an annotated corpus as the gold standard and used standard measures (precision, recall and f-measure) for the performance evaluation.

Results: We harvested 267 individual criteria with the datatypes of Symptom and Laboratory Test from 63 textual diagnostic criteria. We manually annotated attributes and values in 142 individual Laboratory Test criteria. The rule-based approach achieved an average precision of 0.84, recall of 0.86, and f-measure of 0.85; the CRF-based classification achieved a precision of 0.95, recall of 0.88, and f-measure of 0.91. We also implemented a web-based tool that automatically translates textual Laboratory Test criteria into the QDM XML template format. The results indicated that our approaches leveraging cTAKES and CRFs are effective in facilitating diagnostic criteria annotation and classification.

Conclusion: Our NLP-based computational framework is a feasible and useful solution for developing the representation and computerization of diagnostic criteria.
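The f-measures reported in the abstract can be sanity-checked against the reported precision and recall. The sketch below assumes the f-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` and the dictionary labels are illustrative and not taken from the paper. Because the rule-based figures are averages, the F1 of the averaged precision and recall need not equal the average of per-class F1 scores, so agreement here is only a consistency check.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# (precision, recall) pairs as reported in the abstract.
# Assumption: the reported f-measure is F1 (beta = 1).
reported = {
    "rule-based datatype assignment": (0.84, 0.86),
    "CRF-based attribute annotation": (0.95, 0.88),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f_measure(p, r):.2f}")
```

Under the F1 assumption, the computed values round to 0.85 and 0.91, matching the figures reported in the abstract.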

Original language: English (US)
Pages (from-to): 11-21
Number of pages: 11
Journal: Journal of Biomedical Informatics
State: Published - Oct 1 2016


  • Conditional random fields
  • Diagnostic criteria
  • Natural language processing
  • Quality data model
  • cTAKES

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

