TY - JOUR
T1 - Leveraging spatial variation in tumor purity for improved somatic variant calling of archival tumor only samples
AU - Halperin, Rebecca F.
AU - Liang, Winnie S.
AU - Kulkarni, Sidharth
AU - Tassone, Erica E.
AU - Adkins, Jonathan
AU - Enriquez, Daniel
AU - Tran, Nhan L.
AU - Hank, Nicole C.
AU - Newell, James
AU - Kodira, Chinnappa
AU - Korn, Ronald
AU - Berens, Michael E.
AU - Kim, Seungchan
AU - Byron, Sara A.
N1 - Funding Information:
NA is supported by a Grant-in-Aid for Young Scientists (Start-up) Grant Number JP17H06882, from the Japan Society for the Promotion of Science, the Mochida Memorial Foundation for Medical and Pharmaceutical Research, The Kanae Foundation for the Promotion of Medical Science, and the Ryobi Teien Memory Foundation.
Funding Information:
We would like to thank Nicholas Schork, David Craig, Jessica Aldrich, Austin Christofferson, and Jonathan Keats for helpful discussions. We also thank the Ben and Catherine Ivy Foundation and GE Global Research for funding this study. The Texas A&M System Chancellor's Research Initiative for the Center for Computational Systems Biology at Prairie View A&M University also provided funding to SeK. A preprint of this manuscript has been deposited in bioRxiv (31).
Publisher Copyright:
Copyright © 2019 Halperin, Liang, Kulkarni, Tassone, Adkins, Enriquez, Tran, Hank, Newell, Kodira, Korn, Berens, Kim and Byron.
PY - 2019
Y1 - 2019
N2 - Archival tumor samples represent a rich resource of annotated specimens for translational genomics research. However, standard variant calling approaches require a matched normal sample from the same individual, which is often not available in the retrospective setting, making it difficult to distinguish between true somatic variants and individual-specific germline variants. Archival sections often contain adjacent normal tissue, but this tissue can include infiltrating tumor cells. As existing comparative somatic variant callers are designed to exclude variants present in the normal sample, a novel approach is required to leverage adjacent normal tissue with infiltrating tumor cells for somatic variant calling. Here we present lumosVar 2.0, a software package designed to jointly analyze multiple samples from the same patient, built upon our previous single-sample, tumor-only variant caller lumosVar 1.0. The approach assumes that the allelic fractions of somatic and germline variants follow different patterns as tumor content and copy number state change. lumosVar 2.0 estimates allele-specific copy number and tumor sample fractions from the data, and uses a model to determine expected allelic fractions for somatic and germline variants and to classify variants accordingly. To evaluate the utility of lumosVar 2.0 for jointly calling somatic variants with tumor and adjacent normal samples, we used a glioblastoma dataset with matched high and low tumor content and germline whole exome sequencing data (for true somatic variants) available for each patient. Both sensitivity and positive predictive value were improved when analyzing the high tumor and low tumor samples jointly compared to analyzing the samples individually or in silico pooling of the two samples. Finally, we applied this approach to a set of breast and prostate archival tumor samples for which tumor blocks containing adjacent normal tissue were available for sequencing. Joint analysis using lumosVar 2.0 detected several variants, including known cancer hotspot mutations, that were not detected by standard somatic variant calling tools using the adjacent tissue as the presumed normal reference. Together, these results demonstrate the utility of leveraging paired tissue samples to improve somatic variant calling when a constitutional sample is not available.
AB - Archival tumor samples represent a rich resource of annotated specimens for translational genomics research. However, standard variant calling approaches require a matched normal sample from the same individual, which is often not available in the retrospective setting, making it difficult to distinguish between true somatic variants and individual-specific germline variants. Archival sections often contain adjacent normal tissue, but this tissue can include infiltrating tumor cells. As existing comparative somatic variant callers are designed to exclude variants present in the normal sample, a novel approach is required to leverage adjacent normal tissue with infiltrating tumor cells for somatic variant calling. Here we present lumosVar 2.0, a software package designed to jointly analyze multiple samples from the same patient, built upon our previous single-sample, tumor-only variant caller lumosVar 1.0. The approach assumes that the allelic fractions of somatic and germline variants follow different patterns as tumor content and copy number state change. lumosVar 2.0 estimates allele-specific copy number and tumor sample fractions from the data, and uses a model to determine expected allelic fractions for somatic and germline variants and to classify variants accordingly. To evaluate the utility of lumosVar 2.0 for jointly calling somatic variants with tumor and adjacent normal samples, we used a glioblastoma dataset with matched high and low tumor content and germline whole exome sequencing data (for true somatic variants) available for each patient. Both sensitivity and positive predictive value were improved when analyzing the high tumor and low tumor samples jointly compared to analyzing the samples individually or in silico pooling of the two samples. Finally, we applied this approach to a set of breast and prostate archival tumor samples for which tumor blocks containing adjacent normal tissue were available for sequencing. Joint analysis using lumosVar 2.0 detected several variants, including known cancer hotspot mutations, that were not detected by standard somatic variant calling tools using the adjacent tissue as the presumed normal reference. Together, these results demonstrate the utility of leveraging paired tissue samples to improve somatic variant calling when a constitutional sample is not available.
KW - Cancer genomics
KW - Cancer hotspot mutations
KW - Next generation sequencing
KW - Somatic variant calling
KW - Tumor exome sequencing
KW - Tumor-only sequencing
UR - http://www.scopus.com/inward/record.url?scp=85063346553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063346553&partnerID=8YFLogxK
U2 - 10.3389/fonc.2019.00119
DO - 10.3389/fonc.2019.00119
M3 - Article
AN - SCOPUS:85063346553
SN - 2234-943X
VL - 9
JO - Frontiers in Oncology
JF - Frontiers in Oncology
IS - MAR
M1 - 119
ER -