Semantic characteristics of NLP-extracted concepts in clinical notes vs. biomedical literature.

Stephen Wu, Hongfang Liu

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Natural language processing (NLP) has become crucial in unlocking information stored in free text, from both clinical notes and biomedical literature. Clinical notes convey clinical information related to individual patient health care, while biomedical literature communicates scientific findings. This work focuses on semantic characterization of texts at an enterprise scale, comparing and contrasting the two domains and their NLP approaches. We analyzed the empirical distributional characteristics of NLP-discovered named entities in Mayo Clinic clinical notes from 2001-2010, and in the 2011 MetaMapped Medline Baseline. We give qualitative and quantitative measures of domain similarity and point to the feasibility of transferring resources and techniques. An important by-product of this study is the development of a weighted ontology for each domain, which gives distributional semantic information that may be used to improve NLP applications.
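As a rough illustration of what comparing concept distributions across two domains can look like, the sketch below turns per-domain concept counts into relative-frequency weights and scores how far apart the two domains are. Everything in it is an assumption made for illustration: the concept identifiers and counts are invented toy values, relative frequency is only one possible weighting, and Jensen-Shannon divergence is a stand-in similarity measure, not necessarily the one used in the paper.

```python
from collections import Counter
from math import log2


def relative_frequencies(counts):
    """Turn raw concept counts into weights that sum to 1."""
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two concept distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    that share no concepts.
    """
    divergence = 0.0
    for concept in set(p) | set(q):
        pc = p.get(concept, 0.0)
        qc = q.get(concept, 0.0)
        mixture = 0.5 * (pc + qc)
        if pc > 0:
            divergence += 0.5 * pc * log2(pc / mixture)
        if qc > 0:
            divergence += 0.5 * qc * log2(qc / mixture)
    return divergence


# Hypothetical toy counts; the concept names are placeholders, not real data.
clinical_counts = Counter({"concept_a": 6, "concept_b": 3, "concept_c": 1})
literature_counts = Counter({"concept_a": 1, "concept_c": 5, "concept_d": 4})

# One weight per concept per domain (a very simple "weighted" concept list).
clinical_weights = relative_frequencies(clinical_counts)
literature_weights = relative_frequencies(literature_counts)

domain_divergence = jensen_shannon_divergence(clinical_weights, literature_weights)
print(f"Jensen-Shannon divergence between domains: {domain_divergence:.3f}")
```

A value near 0 would suggest the two domains use concepts in similar proportions, while a value near 1 would suggest little overlap. In practice the counts would come from an NLP pipeline run over each corpus.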

Original language: English (US)
Pages (from-to): 1550-1558
Number of pages: 9
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
State: Published - 2011

ASJC Scopus subject areas

  • General Medicine

