Diptavo Dutta (1,2,3), Peter VandeHaar (1,2), Lars G. Fritsche (1,2), Sebastian Zöllner (1,2), Michael Boehnke (1,2), Laura J. Scott (1,2), Seunggeun Lee (1,2,4,*)
(1) Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
(2) Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
(3) Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
(4) Graduate School of Data Science, Seoul National University, Seoul 08826, Republic of Korea
(*) Corresponding author
Abstract
Tests of association between a phenotype and a set of genes in a biological pathway can provide insights into the genetic architecture of complex phenotypes beyond those obtained from single-variant or single-gene association analysis. However, most existing gene set tests have limited power to detect a gene set-phenotype association when only a small fraction of the genes are associated with the phenotype, and they cannot identify the potentially “active” genes that might drive a gene set-based association. To address these issues, we have developed Gene set analysis Association Using Sparse Signals (GAUSS), a method for gene set association analysis that requires only GWAS summary statistics. For each significantly associated gene set, GAUSS identifies the subset of genes that has the maximal evidence of association and can best account for the gene set association. Because it uses a pre-computed correlation structure among test statistics from a reference panel, our p value calculation is substantially faster than permutation- or simulation-based approaches. In simulations with varying proportions of causal genes, we find that GAUSS effectively controls the type 1 error rate and has greater power than several existing methods, particularly when a small proportion of genes account for the gene set signal. Using GAUSS, we analyzed UK Biobank GWAS summary statistics for 10,679 gene sets and 1,403 binary phenotypes. We found that GAUSS is scalable, and it identified 13,466 phenotype and gene set association pairs. Within these gene sets, we identified an average of 17.2 (max = 405) genes that underlie the gene set associations.
Keywords: pathway association, summary statistics, core subset, UK Biobank, phenome-wide associations
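To make the idea of a "subset with maximal evidence of association" concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a particular form that the abstract does not spell out: gene-level p values are converted to z-scores, each subset B is scored by sum(z[B]) / sqrt(|B|), and the set-level p value is estimated by drawing null z-scores from a multivariate normal with a given gene-gene correlation matrix. The function names `best_subset` and `null_p_value`, the scoring rule, and the plain Monte Carlo null are all illustrative assumptions.

```python
from statistics import NormalDist

import numpy as np


def best_subset(z):
    """Return (score, gene indices) maximizing sum(z[B]) / sqrt(|B|).

    For a fixed subset size k, the best subset is the k largest z-scores,
    so scanning prefixes of the sorted scores covers every candidate.
    """
    z = np.asarray(z, dtype=float)
    order = np.argsort(-z)                       # strongest gene first
    sizes = np.arange(1, z.size + 1)
    scores = np.cumsum(z[order]) / np.sqrt(sizes)
    k = int(np.argmax(scores))
    return float(scores[k]), sorted(order[: k + 1].tolist())


def null_p_value(observed, corr, n_draws=20000, seed=0):
    """Monte Carlo p value under an assumed multivariate normal null."""
    corr = np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)              # corr must be positive definite
    draws = rng.standard_normal((n_draws, corr.shape[0])) @ chol.T
    sorted_desc = -np.sort(-draws, axis=1)       # each row sorted, largest first
    sizes = np.arange(1, corr.shape[0] + 1)
    null = (np.cumsum(sorted_desc, axis=1) / np.sqrt(sizes)).max(axis=1)
    # Add-one correction keeps the estimate strictly positive.
    return float((1 + np.sum(null >= observed)) / (n_draws + 1))


# Hypothetical gene-level p values for a five-gene set.
gene_p = [1e-6, 0.5, 2e-4, 0.8, 0.3]
z = [NormalDist().inv_cdf(1.0 - p) for p in gene_p]
score, core = best_subset(z)
p_set = null_p_value(score, np.eye(len(gene_p)))  # identity = independent genes
```

In this toy input, the two genes with the smallest p values form the selected subset, because adding any weaker gene lowers the normalized sum. A plain Monte Carlo estimate like this cannot resolve p values much below 1 / `n_draws`, so it illustrates the logic only and does not reflect the speed the abstract attributes to the pre-computed correlation structure.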