Formalizing ICD coding rules using Formal Concept Analysis

Guoqian Jiang, Jyotishman Pathak, Christopher G. Chute

Research output: Contribution to journal › Article › peer-review

21 Scopus citations


Background: With the 11th revision of the International Classification of Disease (ICD) being officially launched by the World Health Organization (WHO), the significance of a formal representation for ICD coding rules has emerged as a pragmatic concern.

Objectives: To explore the role of Formal Concept Analysis (FCA) in examining ICD10 coding rules and to develop FCA-based auditing approaches for the formalization process.

Methods: We propose a model for formalizing the ICD coding rules underlying the ICD Index using FCA. The coding rules are generated from FCA models and represented in the Semantic Web Rule Language (SWRL). Two auditing approaches were developed, focusing on the non-disjoint nodes and anonymous nodes that appear in the FCA model. The candidate domains (i.e., any three-character code with its sub-codes) of all 22 chapters of the ICD10 2006 version were analyzed using the two auditing approaches. Case studies and a preliminary evaluation were performed for validation.

Results: A total of 2044 formal contexts from the candidate domains of 22 ICD chapters were generated and audited. We identified 692 ICD codes having non-disjoint nodes in all chapters; chapters 19 and 21 contained the highest proportion of candidate domains with non-disjoint nodes (61.9% and 45.6%, respectively). We also identified 6996 anonymous nodes from 1382 candidate domains. Chapters 7, 11, 13, and 17 have the highest proportion of candidate domains having anonymous nodes (97.5%, 95.4%, 93.6%, and 93.0%, respectively), while chapters 15 and 17 have the highest proportion of anonymous nodes among all chapters (45.5% and 44.0%, respectively). Case studies and a limited evaluation demonstrate that non-disjoint nodes and anonymous nodes arising from FCA are effective mechanisms for auditing ICD10.

Conclusion: FCA-based models demonstrate a practical solution for formalizing ICD coding rules. FCA techniques can not only audit the completeness of ICD domain knowledge for a specific domain but also provide a high-level auditing profile for all ICD chapters.
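
As a rough illustration of the kind of structure the abstract refers to, the following Python sketch builds the formal concepts of a tiny formal context and flags two kinds of nodes. Everything in it is an assumption made for illustration: the code labels (`ZZ9.0` to `ZZ9.3`) and index terms are invented, the function names are hypothetical, and the working definitions (an "anonymous" node is a concept that is not the object concept of any code; a "non-disjoint" pair is two codes where one code's terms strictly contain the other's) are one plausible reading that the abstract itself does not spell out. It is not the authors' implementation.

```python
from itertools import combinations

# Toy formal context (invented data, not taken from the ICD Index):
# objects are made-up code labels, attributes are made-up index terms.
context = {
    "ZZ9.0": {"disease", "acute", "bacterial"},
    "ZZ9.1": {"disease", "acute", "viral"},
    "ZZ9.2": {"disease", "chronic"},
    "ZZ9.3": {"disease", "chronic", "viral"},
}

def extent(ctx, attrs):
    """Objects that carry every attribute in attrs."""
    return frozenset(o for o, a in ctx.items() if attrs <= a)

def intent(ctx, objs):
    """Attributes shared by all objects in objs (all attributes if objs is empty)."""
    objs = list(objs)
    if not objs:
        return frozenset(set().union(*ctx.values()))
    return frozenset(set.intersection(*(set(ctx[o]) for o in objs)))

def concepts(ctx):
    """All formal concepts (extent, intent), found by closing every subset
    of objects. Exponential in the number of objects: toy sizes only."""
    found = set()
    objs = list(ctx)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            b = intent(ctx, subset)
            found.add((extent(ctx, b), b))
    return found

def object_concept(ctx, obj):
    """The concept generated by a single object (the node it labels)."""
    b = frozenset(ctx[obj])
    return (extent(ctx, b), b)

def anonymous_nodes(ctx):
    """Assumed definition: concepts that no object labels directly."""
    labelled = {object_concept(ctx, o) for o in ctx}
    return concepts(ctx) - labelled

def non_disjoint_pairs(ctx):
    """Assumed definition: (general, specific) pairs of codes where the
    specific code's terms strictly contain the general code's terms."""
    return sorted((g, s) for g in ctx for s in ctx
                  if g != s and ctx[g] < ctx[s])

for ext, itt in sorted(anonymous_nodes(context), key=lambda c: sorted(c[1])):
    print("anonymous node:", sorted(ext), sorted(itt))
print("non-disjoint pairs:", non_disjoint_pairs(context))
```

In this toy context, an interior anonymous node such as the one grouping the two "acute" codes would suggest a grouping for which no code exists, which is the sort of signal an auditing pass might surface for review.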

Original language: English (US)
Pages (from-to): 504-517
Number of pages: 14
Journal: Journal of Biomedical Informatics
Issue number: 3
State: Published - Jun 2009


Keywords

  • Auditing approach
  • Clinical terminologies
  • Formal Concept Analysis (FCA)
  • International Classification of Disease (ICD)
  • Semantic Web Rule Language (SWRL)

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications

