Excerno: Filtering Mutations Caused by the Clinical Archival Process in Sequencing Data

Audrey Mitchell, Marco Ruiz, Soua Yang, Chen Wang, Jaime Davila

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The accurate detection of mutations in clinical samples using Next Generation Sequencing (NGS) is of great importance in the clinical treatment of cancer patients. Clinical tests use archival pathology slides, which are preserved by Formalin-Fixation Paraffin Embedding (FFPE). The FFPE process introduces spurious C>T mutations, hindering accurate cancer diagnosis. FFPE mutational artifacts occur in a well-defined pattern called a mutational signature. By quantifying the abundance of the FFPE mutational signature and applying Bayes’ formula, we developed a method to filter FFPE artifacts. We implemented this method as the excerno package in the R statistical language. We tested our method by generating mutations that follow the FFPE mutational signature and combining them with variants produced by other mutational signatures from the Catalog of Somatic Mutations in Cancer (COSMIC). First, we mixed an equal number of FFPE variants and mutations from a single COSMIC mutational signature and tested excerno across all 60 COSMIC mutational signatures. Our median sensitivity, specificity, and Area Under the Curve (AUC) were 0.89, 0.99, and 0.96, respectively. Furthermore, performance decreased as a linear function of the similarity between the COSMIC and FFPE mutational signatures (R² = 0.90). We also tested our method by mixing different proportions of mutations from the COSMIC and FFPE mutational signatures. As we increased the proportion of FFPE variants, our sensitivity increased while our specificity decreased. In conclusion, we developed and implemented excerno, an accurate method to filter FFPE artifactual mutations, and characterized its performance using simulated datasets.
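The Bayes step described in the abstract can be illustrated with a minimal sketch. This is not the excerno package's API (excerno is an R package, and its internals are not given here); the function names, the toy three-context signatures, and the 0.5 cutoff are all assumptions made for illustration. The sketch treats the estimated abundance of the FFPE signature as a prior and each signature as a probability distribution over mutation contexts.

```python
from typing import Dict, List

Signature = Dict[str, float]  # mutation context -> probability (sums to 1)


def ffpe_posterior(ffpe_sig: Signature, other_sig: Signature,
                   ffpe_fraction: float) -> Dict[str, float]:
    """Return P(FFPE | context) for every context, using Bayes' formula.

    ffpe_fraction is the assumed prior: the share of all observed
    mutations attributed to the FFPE signature.
    """
    posterior = {}
    for ctx in sorted(set(ffpe_sig) | set(other_sig)):
        # Joint probability of (source, context) for each source.
        joint_ffpe = ffpe_sig.get(ctx, 0.0) * ffpe_fraction
        joint_other = other_sig.get(ctx, 0.0) * (1.0 - ffpe_fraction)
        total = joint_ffpe + joint_other
        posterior[ctx] = joint_ffpe / total if total > 0 else 0.0
    return posterior


def filter_variants(contexts: List[str], posterior: Dict[str, float],
                    cutoff: float = 0.5) -> List[str]:
    """Keep variants whose FFPE posterior is below the (hypothetical) cutoff."""
    return [ctx for ctx in contexts if posterior.get(ctx, 0.0) < cutoff]


# Toy signatures over three contexts; the numbers are made up, not COSMIC values.
toy_ffpe = {"A[C>T]G": 0.6, "C[C>T]G": 0.3, "T[C>A]T": 0.1}
toy_other = {"A[C>T]G": 0.1, "C[C>T]G": 0.3, "T[C>A]T": 0.6}

post = ffpe_posterior(toy_ffpe, toy_other, ffpe_fraction=0.5)
kept = filter_variants(["A[C>T]G", "T[C>A]T", "T[C>A]T"], post)
```

In this toy setup, a context that is much more likely under the FFPE signature than under the other signature receives a high posterior and is filtered, while a context favored by the other signature is kept. Raising `ffpe_fraction` raises every posterior, which is consistent with the reported trend of sensitivity increasing and specificity decreasing as the proportion of FFPE variants grows.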

Original language: English (US)
Title of host publication: Computational Advances in Bio and Medical Sciences - 11th International Conference, ICCABS 2021, Revised Selected Papers
Editors: Mukul S. Bansal, Ion Măndoiu, Sanguthevar Rajasekaran, Marmar Moussa, Murray Patterson, Pavel Skums, Alexander Zelikovsky
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 29-37
Number of pages: 9
ISBN (Print): 9783031175305
State: Published - 2022
Event: 11th International Conference on Computational Advances in Bio and Medical Sciences, ICCABS 2021 - Virtual, Online
Duration: Dec 16 2021 to Dec 18 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13254 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Computational Advances in Bio and Medical Sciences, ICCABS 2021
City: Virtual, Online
Period: 12/16/21 to 12/18/21

Keywords

  • Formalin-Fixation Paraffin-Embedded (FFPE)
  • Mutational signatures
  • Next Generation Sequencing (NGS)

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
