Discovering associations between problem list and practice setting

Liwei Wang, Yanshan Wang, Feichen Shen, Majid Rastegar-Mojarad, Hongfang Liu

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Background: The Health Information Technology for Economic and Clinical Health (HITECH) Act has greatly accelerated the adoption of electronic health records (EHRs), with the promise of better clinical decisions and patient outcomes. One of the core criteria for "Meaningful Use" of EHRs is a problem list that shows the most important health problems faced by a patient. Implementing problem lists in EHRs has the potential to help practitioners provide customized care to patients. However, how to leverage problem lists in different practice settings to provide tailored care remains an open question; the bottleneck lies in the associations between problem list and practice setting.
Methods: In this study, using sampled clinical documents associated with a cohort of patients who received their primary care at Mayo Clinic, we investigated the associations between problem list and practice setting through natural language processing (NLP) and topic modeling techniques. Specifically, after practice settings and problem lists were normalized, the statistical χ² test, term frequency-inverse document frequency (TF-IDF), and enrichment analysis were used to choose representative concepts for each setting. Latent Dirichlet Allocation (LDA) was then used to train topic models and predict potential practice settings using similarity metrics based on the problem concepts representative of practice settings. Evaluation was conducted through 5-fold cross-validation, and Recall@k, Precision@k, and F1@k were calculated.
Results: Our method can generate prioritized and meaningful problem lists corresponding to specific practice settings. For practice setting prediction, recall increases from 0.719 (k = 2) to 0.931 (k = 10), precision increases from 0.882 (k = 2) to 0.931 (k = 10), and F1 increases from 0.790 (k = 2) to 0.931 (k = 10).
Conclusion: To the best of our knowledge, our study is the first to attempt to discover the association between problem lists and hospital practice settings. In the future, we plan to investigate how to provide more tailored care by utilizing the association between problem list and practice setting revealed in this study.
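
Illustrative sketch (not the authors' code): the Methods name TF-IDF as one of the ways representative concepts were chosen for each setting, but the abstract does not say which TF-IDF variant was used or how it was combined with the χ² test and enrichment analysis. The Python below assumes the textbook form, with term frequency normalized by the setting's total concept count and idf = log(N / df), and treats each practice setting as one "document". The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_setting(concepts_by_setting):
    """Score each problem concept within each practice setting using plain TF-IDF.

    concepts_by_setting maps a setting name to the list of normalized problem
    concepts observed in that setting (repeats allowed).
    Assumption: tf = count / total concepts in the setting, idf = log(N / df).
    """
    n_settings = len(concepts_by_setting)

    # Document frequency: the number of settings in which each concept appears.
    df = Counter()
    for concepts in concepts_by_setting.values():
        df.update(set(concepts))

    scores = {}
    for setting, concepts in concepts_by_setting.items():
        counts = Counter(concepts)
        total = sum(counts.values())
        scores[setting] = {
            concept: (n / total) * math.log(n_settings / df[concept])
            for concept, n in counts.items()
        }
    return scores


def top_concepts(scores, setting, k=3):
    """Return the k highest-scoring concepts for a setting (ties broken by name)."""
    ranked = sorted(scores[setting].items(), key=lambda kv: (-kv[1], kv[0]))
    return [concept for concept, _ in ranked[:k]]


# Hypothetical toy data, not taken from the study.
toy = {
    "cardiology": ["hypertension", "atrial fibrillation",
                   "atrial fibrillation", "chest pain"],
    "endocrinology": ["hypertension", "type 2 diabetes",
                      "type 2 diabetes", "hypothyroidism"],
    "orthopedics": ["hypertension", "knee osteoarthritis",
                    "knee osteoarthritis", "low back pain"],
}
scores = tfidf_by_setting(toy)
print(top_concepts(scores, "cardiology", k=2))
```

In this toy input, "hypertension" appears in every setting, so its idf is log(3/3) = 0 and it drops out of every setting's ranking. That is the behavior one would want from a step meant to pick setting-specific concepts.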

Original language: English (US)
Article number: 69
Journal: BMC Medical Informatics and Decision Making
State: Published - Apr 4 2019


Keywords

  • Practice setting
  • Problem list
  • Statistical χ² test
  • TF-IDF and enrichment analysis
  • Topic modeling

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics

