Posted on 2016-03-28, 00:00. Authored by Roxana Daneshjou, Nicholas P Tatonetti, Konrad J Karczewski, Hersh Sagreiya, Stephane Bourgeois, Katarzyna Drozda, James K Burmester, Tatsuhiko Tsunoda, Yusuke Nakamura, Michiaki Kubo, Matthew Tector, Nita A Limdi, Larisa H Cavallari, Minoli Perera, Julie A Johnson, Teri E Klein, Russ B Altman
Background: Many genome-wide association studies focus on associating single loci with target phenotypes.
However, in the setting of rare variation, accumulating sufficient samples to assess these associations can be
difficult. Moreover, multiple variations in a gene or a set of genes within a pathway may all contribute to the
phenotype, suggesting that the aggregation of variations found over the gene or pathway may be useful for
improving the power to detect associations.
Results: Here, we present a method for aggregating single nucleotide polymorphisms (SNPs) along biologically
relevant pathways in order to seek genetic associations with phenotypes. Our method uses all available genetic
variants and does not remove those in linkage disequilibrium (LD). Instead, it uses a novel SNP weighting scheme
to down-weight the contributions of correlated SNPs. We apply our method to three cohorts of patients taking
warfarin: two cohorts of European descent and one African American cohort. Although the clinical covariates and key
pharmacogenetic loci for warfarin have been characterized, our association metric identifies a significant association
with mutations distributed throughout the pathway of warfarin metabolism. Dose prediction improves even after
accounting for all known clinical covariates and pharmacogenetic variants in VKORC1 and CYP2C9. In particular, we find that
at least 1% of the missing heritability in warfarin dose may be due to the aggregated effects of variations in the
warfarin metabolic pathway, even though the SNPs do not individually show a significant association.
Conclusions: Our method allows researchers to study aggregative SNP effects in an unbiased manner by not
preselecting SNPs. It retains all the available information by accounting for LD structure through weighting, which
eliminates the need for LD pruning.
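
The abstract does not give the weighting formula, so the following is only an illustrative sketch of the general idea of down-weighting correlated SNPs instead of pruning them. It assumes, purely for illustration, that each SNP's weight is the inverse of its summed squared genotype correlation with every SNP in the pathway (itself included); this is a stand-in, not necessarily the scheme used in the paper. The function names `pearson`, `ld_weights`, and `pathway_scores` and the toy genotypes are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length genotype vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def ld_weights(snps):
    """Assumed scheme: weight_j = 1 / (1 + sum over k != j of r_jk^2).

    `snps` is a list of SNP vectors, each holding one allele count per individual.
    A SNP uncorrelated with all others keeps weight 1; a block of perfectly
    correlated SNPs shares a total weight of 1 instead of being pruned.
    """
    weights = []
    for j, snp_j in enumerate(snps):
        total = 1.0  # the SNP's correlation with itself
        for k, snp_k in enumerate(snps):
            if k != j:
                total += pearson(snp_j, snp_k) ** 2
        weights.append(1.0 / total)
    return weights


def pathway_scores(snps, weights):
    """Per-individual aggregate score: weighted sum of allele counts across the pathway."""
    n_individuals = len(snps[0])
    return [
        sum(w * snp[i] for w, snp in zip(weights, snps))
        for i in range(n_individuals)
    ]


# Toy data: snp_a and snp_b are in perfect LD; snp_c is uncorrelated with both.
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 1, 2, 1, 0, 2]
snp_c = [1, 0, 1, 2, 1, 1]

weights = ld_weights([snp_a, snp_b, snp_c])
scores = pathway_scores([snp_a, snp_b, snp_c], weights)
print(weights)  # [0.5, 0.5, 1.0]
print(scores)   # [1.0, 1.0, 3.0, 3.0, 1.0, 3.0]
```

In this toy example the two SNPs in perfect LD each receive half weight, so together they contribute as much as a single independent SNP, and no variant is discarded. The resulting per-individual scores could then be tested for association with a phenotype such as warfarin dose, alongside the clinical covariates.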