Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data

Research output: Contribution to journalArticlepeer-review

7 Scopus citations


Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research.

Original languageEnglish (US)
Pages (from-to)104-113
Number of pages10
JournalGenetic epidemiology
Issue number2
StatePublished - Feb 2014


  • Association analysis
  • Exome sequencing
  • Rare variants
  • Regularization

ASJC Scopus subject areas

  • Epidemiology
  • Genetics(clinical)


Dive into the research topics of 'Regularized Rare Variant Enrichment Analysis for Case-Control Exome Sequencing Data'. Together they form a unique fingerprint.

Cite this