TY - GEN
T1 - An initial study of full parsing of clinical text using the Stanford Parser
AU - Xu, Hua
AU - AbdelRahman, Samir
AU - Jiang, Min
AU - Fan, Jung Wei
AU - Huang, Yang
PY - 2011
Y1 - 2011
N2 - Full parsing analyzes a sentence and generates its syntactic structure (a parse tree), which is useful for many natural language processing (NLP) applications. The Stanford Parser is one of the state-of-the-art parsers in the general English domain. However, there has been no formal evaluation of its performance on clinical text, which often contains ungrammatical structures. In this study, we randomly selected 50 sentences from the clinical corpus of the 2010 i2b2 NLP challenge and manually annotated them to create a gold standard of parse trees. Our evaluation showed that the original Stanford Parser achieved a bracketing F-measure (BF) of 77% on the gold standard. Moreover, we assessed the effect of part-of-speech (POS) tags on parsing, and our results showed that using manually corrected POS tags yielded a maximum BF of 81%. Furthermore, we analyzed the errors of the Stanford Parser and provided valuable insights into large-scale parse tree annotation for clinical text.
AB - Full parsing analyzes a sentence and generates its syntactic structure (a parse tree), which is useful for many natural language processing (NLP) applications. The Stanford Parser is one of the state-of-the-art parsers in the general English domain. However, there has been no formal evaluation of its performance on clinical text, which often contains ungrammatical structures. In this study, we randomly selected 50 sentences from the clinical corpus of the 2010 i2b2 NLP challenge and manually annotated them to create a gold standard of parse trees. Our evaluation showed that the original Stanford Parser achieved a bracketing F-measure (BF) of 77% on the gold standard. Moreover, we assessed the effect of part-of-speech (POS) tags on parsing, and our results showed that using manually corrected POS tags yielded a maximum BF of 81%. Furthermore, we analyzed the errors of the Stanford Parser and provided valuable insights into large-scale parse tree annotation for clinical text.
UR - http://www.scopus.com/inward/record.url?scp=84862912300&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862912300&partnerID=8YFLogxK
U2 - 10.1109/BIBMW.2011.6112438
DO - 10.1109/BIBMW.2011.6112438
M3 - Conference contribution
AN - SCOPUS:84862912300
SN - 9781457716133
T3 - 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2011
SP - 607
EP - 614
BT - 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2011
T2 - 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2011
Y2 - 12 November 2011 through 15 November 2011
ER -