Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation

Manabu Torii; Jung Wei Fan; Daniel S. Zisook

doi:10.1109/ICHI.2015.66

Digital Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the biomedical and clinical domain, valuable information is frequently represented in free-text documents. Natural language processing (NLP) is a powerful tool that can extract structured information from these documents. Word sense disambiguation (WSD) is a critical component in an NLP pipeline that increases the accuracy of the extracted information. However, WSD is expensive to apply for all known ambiguous words. Given limited time and resources, one practical strategy is to prioritize easy-to-disambiguate words and efficiently maximize the coverage of disambiguation. To aid prioritization efforts, we studied two quantitative indicators that are associated with how easy/difficult it is to disambiguate any given word.

Original language	English (US)
Title of host publication	Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015
Editors	Wai-Tat Fu, Prabhakaran Balakrishnan, Sanda Harabagiu, Fei Wang, Jaideep Srivatsava
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	448
Number of pages	1
ISBN (Electronic)	9781467395489
DOIs	https://doi.org/10.1109/ICHI.2015.66
State	Published - Dec 8 2015
Event	3rd IEEE International Conference on Healthcare Informatics, ICHI 2015 - Dallas, United States Duration: Oct 21 2015 → Oct 23 2015

Publication series

Name	Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015

Other

Other	3rd IEEE International Conference on Healthcare Informatics, ICHI 2015
Country/Territory	United States
City	Dallas
Period	10/21/15 → 10/23/15

Keywords

Medical Informatics
Natural Language Processing
Word Sense Disambiguation

ASJC Scopus subject areas

Health Informatics

Access to Document

10.1109/ICHI.2015.66

Cite this

Torii, M., Fan, J. W., & Zisook, D. S. (2015). Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation. In W.-T. Fu, P. Balakrishnan, S. Harabagiu, F. Wang, & J. Srivatsava (Eds.), Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015 (p. 448). Article 7349727 (Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICHI.2015.66

Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation. / Torii, Manabu; Fan, Jung Wei; Zisook, Daniel S.
Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015. ed. / Wai-Tat Fu; Prabhakaran Balakrishnan; Sanda Harabagiu; Fei Wang; Jaideep Srivatsava. Institute of Electrical and Electronics Engineers Inc., 2015. p. 448 7349727 (Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015).

Torii, M, Fan, JW & Zisook, DS 2015, Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation. in W-T Fu, P Balakrishnan, S Harabagiu, F Wang & J Srivatsava (eds), Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015., 7349727, Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015, Institute of Electrical and Electronics Engineers Inc., pp. 448, 3rd IEEE International Conference on Healthcare Informatics, ICHI 2015, Dallas, United States, 10/21/15. https://doi.org/10.1109/ICHI.2015.66

Torii M, Fan JW, Zisook DS. Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation. In Fu WT, Balakrishnan P, Harabagiu S, Wang F, Srivatsava J, editors, Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 448. 7349727. (Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015). doi: 10.1109/ICHI.2015.66

Torii, Manabu ; Fan, Jung Wei ; Zisook, Daniel S. / Finding Difficult-to-Disambiguate Words : Towards an Efficient Workflow to Implement Word Sense Disambiguation. Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015. editor / Wai-Tat Fu ; Prabhakaran Balakrishnan ; Sanda Harabagiu ; Fei Wang ; Jaideep Srivatsava. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 448 (Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015).

@inproceedings{0e96dcc12b414165a04dff1c0c4af51f,

title = "Finding Difficult-to-Disambiguate Words: Towards an Efficient Workflow to Implement Word Sense Disambiguation",

abstract = "In the biomedical and clinical domain, valuable information is frequently represented in free-text documents. Natural language processing (NLP) is a powerful tool that can extract structured information from these documents. Word sense disambiguation (WSD) is a critical component in an NLP pipeline that increases the accuracy of the extracted information. However, WSD is expensive to apply for all known ambiguous words. Given limited time and resources, one practical strategy is to prioritize easy-to-disambiguate words and efficiently maximize the coverage of disambiguation. To aid prioritization efforts, we studied two quantitative indicators that are associated with how easy/difficult it is to disambiguate any given word.",

keywords = "Medical Informatics, Natural Language Processing, Word Sense Disambiguation",

author = "Manabu Torii and Fan, {Jung Wei} and Zisook, {Daniel S.}",

note = "Publisher Copyright: {\textcopyright} 2015 IEEE.; 3rd IEEE International Conference on Healthcare Informatics, ICHI 2015 ; Conference date: 21-10-2015 Through 23-10-2015",

year = "2015",

month = dec,

day = "8",

doi = "10.1109/ICHI.2015.66",

language = "English (US)",

series = "Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "448",

editor = "Wai-Tat Fu and Prabhakaran Balakrishnan and Sanda Harabagiu and Fei Wang and Jaideep Srivatsava",

booktitle = "Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015",

}

TY - GEN

T1 - Finding Difficult-to-Disambiguate Words

T2 - 3rd IEEE International Conference on Healthcare Informatics, ICHI 2015

AU - Torii, Manabu

AU - Fan, Jung Wei

AU - Zisook, Daniel S.

PY - 2015/12/8

Y1 - 2015/12/8

N2 - In the biomedical and clinical domain, valuable information is frequently represented in free-text documents. Natural language processing (NLP) is a powerful tool that can extract structured information from these documents. Word sense disambiguation (WSD) is a critical component in an NLP pipeline that increases the accuracy of the extracted information. However, WSD is expensive to apply for all known ambiguous words. Given limited time and resources, one practical strategy is to prioritize easy-to-disambiguate words and efficiently maximize the coverage of disambiguation. To aid prioritization efforts, we studied two quantitative indicators that are associated with how easy/difficult it is to disambiguate any given word.

AB - In the biomedical and clinical domain, valuable information is frequently represented in free-text documents. Natural language processing (NLP) is a powerful tool that can extract structured information from these documents. Word sense disambiguation (WSD) is a critical component in an NLP pipeline that increases the accuracy of the extracted information. However, WSD is expensive to apply for all known ambiguous words. Given limited time and resources, one practical strategy is to prioritize easy-to-disambiguate words and efficiently maximize the coverage of disambiguation. To aid prioritization efforts, we studied two quantitative indicators that are associated with how easy/difficult it is to disambiguate any given word.

KW - Medical Informatics

KW - Natural Language Processing

KW - Word Sense Disambiguation

UR - http://www.scopus.com/inward/record.url?scp=84966277133&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84966277133&partnerID=8YFLogxK

U2 - 10.1109/ICHI.2015.66

DO - 10.1109/ICHI.2015.66

M3 - Conference contribution

AN - SCOPUS:84966277133

T3 - Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015

SP - 448

BT - Proceedings - 2015 IEEE International Conference on Healthcare Informatics, ICHI 2015

A2 - Fu, Wai-Tat

A2 - Balakrishnan, Prabhakaran

A2 - Harabagiu, Sanda

A2 - Wang, Fei

A2 - Srivatsava, Jaideep

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 21 October 2015 through 23 October 2015

ER -
