Human gene/protein synonym dictionary from WikiLinks

Kavishwar Wagholikar, Manabu Torii, Hongfang Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many genes and proteins have alternate names (synonyms) in the scientific literature, which makes it challenging to organize and exchange information effectively. To address this issue, several initiatives have collated the synonyms into dictionaries. Biothesaurus is an extensive dictionary derived from multiple authoritative sources. Despite its extensive coverage, some synonyms are still not covered by Biothesaurus. Wikipedia could be a useful source of the missing synonyms, as it has a more diverse set of contributors than the authoritative resources that constitute Biothesaurus. This paper reports a feasibility study of using WikiLinks to find synonyms that are not currently covered by Biothesaurus. Wikipedia pages containing the word gene or protein were included in this study. 121 candidate synonyms were extracted from WikiLinks referencing 7,339 (16%) human genes. This number is significant, given that Biothesaurus was previously evaluated to have a coverage of 87%. Hence, WikiLinks were found to be a useful source for collating gene synonyms that are not recorded in authoritative databases. Biothesaurus was evaluated to cover 52% of the extracted candidate synonyms not documented in NCBI. The current study will be extended in scope to cover all genes and to extract synonyms from free text in Wikipedia pages.
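The abstract does not describe the extraction procedure in detail, so the following is only a minimal sketch of the general idea: collect the anchor text of piped wiki links (`[[Target|anchor]]`) that point to gene pages, and keep the anchors that are not already in a known synonym dictionary. The function name `candidate_synonyms`, the regular expression, the use of gene symbols as page titles, and the sample data are all assumptions made for illustration; they are not the authors' implementation or data.

```python
import re
from collections import defaultdict

# Matches piped wiki links of the form [[Target|anchor text]] (optionally
# with a #section part after the target). Links without a pipe are skipped.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?\|([^\[\]|]+)\]\]")


def candidate_synonyms(wikitext, gene_pages, known_synonyms):
    """Return {gene page title: set of anchor texts not already known}.

    gene_pages     -- hypothetical set of page titles treated as gene pages
    known_synonyms -- hypothetical dict: page title -> known synonyms
    """
    candidates = defaultdict(set)
    for target, anchor in WIKILINK.findall(wikitext):
        target, anchor = target.strip(), anchor.strip()
        if target not in gene_pages:
            continue  # link does not point to a gene page
        known = {s.lower() for s in known_synonyms.get(target, ())}
        # Keep the anchor only if it differs from the title and is not known.
        if anchor.lower() != target.lower() and anchor.lower() not in known:
            candidates[target].add(anchor)
    return dict(candidates)


# Illustrative input only; not taken from the study.
sample = (
    "The [[TP53|p53 tumor suppressor]] interacts with [[MDM2|Mdm2]] "
    "and [[Cell cycle|cycle]] control."
)
result = candidate_synonyms(
    sample,
    gene_pages={"TP53", "MDM2"},
    known_synonyms={"TP53": ["p53"], "MDM2": ["Mdm2"]},
)
print(result)
```

In this sketch the comparison is case-insensitive and exact; a real pipeline would likely need further normalization (punctuation, hyphenation, plural forms) before deciding that an anchor is a new synonym.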

Original language: English (US)
Title of host publication: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011
Pages: 462-464
Number of pages: 3
State: Published - 2011
Event: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011 - Chicago, IL, United States
Duration: Aug 1 2011 to Aug 3 2011

Publication series

Name: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB 2011

Other

Other: 2011 ACM Conference on Bioinformatics, Computational Biology and Biomedicine, ACM-BCB 2011
Country/Territory: United States
City: Chicago, IL
Period: 8/1/11 to 8/3/11

Keywords

  • Encyclopedias
  • Gene/protein synonym
  • Information storage and retrieval
  • Names
  • Terminology
  • WikiLinks
  • Wikipedia

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management
