Covariate Adaptive False Discovery Rate Control With Applications to Omics-Wide Multiple Testing

Xianyang Zhang; Jun Chen

doi:10.1080/01621459.2020.1783273

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

Abstract

Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls is available. In this article, we introduce an FDR control procedure for large-scale inference problems that can incorporate covariate information. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity even when the underlying likelihood ratio model is misspecified and the p-values are weakly dependent (e.g., strong mixing). Extensive simulations are conducted to study the finite sample performance of the proposed method, and we demonstrate that the new approach improves over the state-of-the-art approaches by being flexible, robust, powerful, and computationally efficient. We finally apply the method to several omics datasets arising from genomics studies with the aim of identifying omics features associated with some clinical and biological phenotypes. We show that the method is overall the most powerful among competing methods, especially when the signal is sparse. The proposed covariate adaptive multiple testing procedure is implemented in the R package CAMT. Supplementary materials for this article are available online.

Original language	English (US)
Pages (from-to)	411-427
Number of pages	17
Journal	Journal of the American Statistical Association
Volume	117
Issue number	537
DOIs	https://doi.org/10.1080/01621459.2020.1783273
State	Published - 2022

Keywords

Covariates
EM-algorithm
False discovery rate
Multiple testing

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{3ce85d704154429eb05ee337d116e214,
title = "Covariate Adaptive False Discovery Rate Control With Applications to Omics-Wide Multiple Testing",
abstract = "Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls is available. In this article, we introduce an FDR control procedure for large-scale inference problems that can incorporate covariate information. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity even when the underlying likelihood ratio model is misspecified and the p-values are weakly dependent (e.g., strong mixing). Extensive simulations are conducted to study the finite sample performance of the proposed method, and we demonstrate that the new approach improves over the state-of-the-art approaches by being flexible, robust, powerful, and computationally efficient. We finally apply the method to several omics datasets arising from genomics studies with the aim of identifying omics features associated with some clinical and biological phenotypes. We show that the method is overall the most powerful among competing methods, especially when the signal is sparse. The proposed covariate adaptive multiple testing procedure is implemented in the R package CAMT. Supplementary materials for this article are available online.",
keywords = "Covariates, EM-algorithm, False discovery rate, Multiple testing",
author = "Xianyang Zhang and Jun Chen",
note = "Publisher Copyright: {\textcopyright} 2020 American Statistical Association.",
year = "2022",
doi = "10.1080/01621459.2020.1783273",
language = "English (US)",
volume = "117",
pages = "411--427",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "537",
}

TY - JOUR
T1 - Covariate Adaptive False Discovery Rate Control With Applications to Omics-Wide Multiple Testing
AU - Zhang, Xianyang
AU - Chen, Jun
PY - 2022
Y1 - 2022
N2 - Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls is available. In this article, we introduce an FDR control procedure for large-scale inference problems that can incorporate covariate information. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity even when the underlying likelihood ratio model is misspecified and the p-values are weakly dependent (e.g., strong mixing). Extensive simulations are conducted to study the finite sample performance of the proposed method, and we demonstrate that the new approach improves over the state-of-the-art approaches by being flexible, robust, powerful, and computationally efficient. We finally apply the method to several omics datasets arising from genomics studies with the aim of identifying omics features associated with some clinical and biological phenotypes. We show that the method is overall the most powerful among competing methods, especially when the signal is sparse. The proposed covariate adaptive multiple testing procedure is implemented in the R package CAMT. Supplementary materials for this article are available online.
AB - Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls is available. In this article, we introduce an FDR control procedure for large-scale inference problems that can incorporate covariate information. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity even when the underlying likelihood ratio model is misspecified and the p-values are weakly dependent (e.g., strong mixing). Extensive simulations are conducted to study the finite sample performance of the proposed method, and we demonstrate that the new approach improves over the state-of-the-art approaches by being flexible, robust, powerful, and computationally efficient. We finally apply the method to several omics datasets arising from genomics studies with the aim of identifying omics features associated with some clinical and biological phenotypes. We show that the method is overall the most powerful among competing methods, especially when the signal is sparse. The proposed covariate adaptive multiple testing procedure is implemented in the R package CAMT. Supplementary materials for this article are available online.
KW - Covariates
KW - EM-algorithm
KW - False discovery rate
KW - Multiple testing
UR - http://www.scopus.com/inward/record.url?scp=85089496722&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089496722&partnerID=8YFLogxK
U2 - 10.1080/01621459.2020.1783273
DO - 10.1080/01621459.2020.1783273
M3 - Article
AN - SCOPUS:85089496722
SN - 0162-1459
VL - 117
SP - 411
EP - 427
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 537
ER -
