Detecting genomic clustering of risk variants from sequence data: Cases versus controls

Daniel J. Schaid, Jason P. Sinnwell, Shannon K. McDonnell, Stephen N. Thibodeau

Research output: Contribution to journal › Article › peer-review

11 Scopus citations


As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in and around a gene would benefit genetic association analyses and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method, Tango's statistic, to genomic sequence data. An advantage of Tango's method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled χ² distribution, making computation of p values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test. Although our version of Tango's statistic, which we call the "Kernel Distance" statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff's scan statistic had the greatest power over a range of clustering scenarios.
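To make the idea of a Tango-type clustering statistic with a scaled χ² reference distribution concrete, the following is a minimal sketch of the classic one-sample form, C = (r − p)ᵀA(r − p), where r holds the proportion of case variant carriers at each site, p the expected proportions, and A a closeness matrix. It is not the authors' Kernel Distance statistic: the exponential kernel, the treatment of p as fixed (for example, estimated from controls), the two-moment matching of the null mean and variance, and all function and parameter names are assumptions made for illustration.

```python
import math


def closeness_matrix(positions, scale):
    """Assumed exponential-decay kernel: a_ij = exp(-|x_i - x_j| / scale)."""
    return [[math.exp(-abs(xi - xj) / scale) for xj in positions]
            for xi in positions]


def tango_statistic(case_counts, expected_props, positions, scale):
    """Quadratic form C = (r - p)' A (r - p) over variant sites."""
    n = sum(case_counts)
    diff = [c / n - p for c, p in zip(case_counts, expected_props)]
    a = closeness_matrix(positions, scale)
    m = len(diff)
    return sum(diff[i] * a[i][j] * diff[j]
               for i in range(m) for j in range(m))


def scaled_chi2_params(expected_props, positions, scale, n_cases):
    """Two-moment match of the null distribution of C to factor * chi2(df).

    Assumes multinomial sampling of n_cases observations with fixed
    expected proportions (a simplification of a case-control design).
    """
    m = len(expected_props)
    a = closeness_matrix(positions, scale)
    # Per-observation multinomial covariance: V = diag(p) - p p'
    v = [[(expected_props[i] if i == j else 0.0)
          - expected_props[i] * expected_props[j]
          for j in range(m)] for i in range(m)]
    av = [[sum(a[i][k] * v[k][j] for k in range(m)) for j in range(m)]
          for i in range(m)]
    trace_av = sum(av[i][i] for i in range(m))
    trace_av2 = sum(av[i][j] * av[j][i]
                    for i in range(m) for j in range(m))
    mean = trace_av / n_cases                 # E[C] under the null
    var = 2.0 * trace_av2 / n_cases ** 2      # normal-theory Var[C]
    return var / (2.0 * mean), 2.0 * mean ** 2 / var


# Hypothetical toy data: four sites in two tight pairs far apart.
positions = [0, 1, 100, 101]
expected = [0.25, 0.25, 0.25, 0.25]
clustered = tango_statistic([40, 40, 10, 10], expected, positions, 1.0)
alternating = tango_statistic([40, 10, 40, 10], expected, positions, 1.0)
factor, df = scaled_chi2_params(expected, positions, 1.0, n_cases=100)
```

Under these assumptions, an excess concentrated in neighboring sites yields a larger statistic than the same excess alternating between sites, and an approximate p value would be the upper tail of a χ² distribution with `df` degrees of freedom evaluated at `C / factor` (for example via `scipy.stats.chi2.sf`), with no permutations required.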

Original language: English (US)
Pages (from-to): 1301-1309
Number of pages: 9
Journal: Human Genetics
Issue number: 11
State: Published - Nov 2013

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

