Nonlinear post-selection inference for genome-wide association studies

2020 ◽  
Author(s):  
Lotfi Slim ◽  
Clément Chatelain ◽  
Chloé-Agathe Azencott

Abstract Association testing in genome-wide association studies (GWAS) is often performed at either the SNP level or the gene level. The two levels can bring different insights into disease mechanisms. In the present work, we provide a novel approach based on nonlinear post-selection inference to bridge the gap between them. Our approach selects, within a gene, the SNPs or LD blocks most associated with the phenotype, before testing their combined effect. Both the selection and the association testing are conducted nonlinearly. We apply our tool to the study of BMI and its variation in the UK Biobank. In this study, our approach outperformed other gene-level association testing tools, with the unique benefit of pinpointing the causal SNPs.

Author(s):  
Jack W. O’Sullivan ◽  
John P. A. Ioannidis

Abstract With the establishment of large biobanks, discovery of single nucleotide polymorphisms (SNPs) that are associated with various phenotypes has been accelerated. An open question is whether SNPs identified with genome-wide significance in earlier genome-wide association studies (GWAS) are also replicated in later GWAS conducted in biobanks. To address this question, the authors examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier, “discovery” GWAS and a later, “replication” GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNPs (of which 6,289 had reached p<5e-8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0% and it was lower for binary than for quantitative phenotypes (58.1% versus 94.8% respectively). There was an 18.0% decrease in SNP effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNP effect size, phenotype trait (binary or quantitative), and discovery p-value, we built and validated a model that predicted SNP replication with area under the Receiver Operator Curve = 0.90. While non-replication may often reflect lack of power rather than genuine false-positive findings, these results provide insights about which discovered associations are likely to be seen again across subsequent GWAS.


2020 ◽  
Author(s):  
Dan Ju ◽  
Iain Mathieson

Abstract Skin pigmentation is a classic example of a polygenic trait that has experienced directional selection in humans. Genome-wide association studies have identified well over a hundred pigmentation-associated loci, and genomic scans in present-day and ancient populations have identified selective sweeps for a small number of light pigmentation-associated alleles in Europeans. It is unclear whether selection has operated on all the genetic variation associated with skin pigmentation as opposed to just a small number of large-effect variants. Here, we address this question using ancient DNA from 1158 individuals from West Eurasia covering a period of 40,000 years combined with genome-wide association summary statistics from the UK Biobank. We find a robust signal of directional selection in ancient West Eurasians on skin pigmentation variants ascertained in the UK Biobank, but find this signal is driven mostly by a limited number of large-effect variants. Consistent with this observation, we find that a polygenic selection test in present-day populations fails to detect selection with the full set of variants; rather, only the top five show strong evidence of selection. Our data allow us to disentangle the effects of admixture and selection. Most notably, a large-effect variant at SLC24A5 was introduced to Europe by migrations of Neolithic farming populations but continued to be under selection post-admixture. This study shows that the response to selection for light skin pigmentation in West Eurasia was driven by a relatively small proportion of the variants that are associated with present-day phenotypic variation.
Significance Some of the genes responsible for the evolution of light skin pigmentation in Europeans show signals of positive selection in present-day populations. Recently, genome-wide association studies have highlighted the highly polygenic nature of skin pigmentation. It is unclear whether selection has operated on all of these genetic variants or just a subset. By studying variation in over a thousand ancient genomes from West Eurasia covering 40,000 years we are able to study both the aggregate behavior of pigmentation-associated variants and the evolutionary history of individual variants. We find that the evolution of light skin pigmentation in Europeans was driven by frequency changes in a relatively small fraction of the genetic variants that are associated with variation in the trait today.


2021 ◽  
Author(s):  
Weihua Meng ◽  
Parminder Reel ◽  
Charvi Nangia ◽  
Aravind Rajendrakumar ◽  
Harry Hebert ◽  
...  

Headache is one of the commonest complaints that doctors need to address in clinical settings. The genetic mechanisms of different types of headache are not well understood. In this study, we performed a meta-analysis of genome-wide association studies (GWAS) on the self-reported headache phenotype from the UK Biobank cohort and the self-reported migraine phenotype from the 23andMe resource using metaUSAT for genetically correlated phenotypes (N=397,385). We identified 38 loci for headaches, of which 34 have been reported before and 4 were newly identified. The LRP1-STAT6-SDR9C7 region on chromosome 12 was the most significantly associated locus, with a leading P value of 1.24 × 10⁻⁶² for rs11172113. The ONECUT2 gene locus on chromosome 18 was the strongest signal among the 4 new loci, with a P value of 1.29 × 10⁻⁹ for rs673939. Our study demonstrated that the genetically correlated phenotypes of self-reported headache and self-reported migraine can be meta-analysed together in theory and in practice to boost study power to identify more new variants for headaches. This study has paved the way for a large GWAS meta-analysis involving cohorts with different, though genetically correlated, headache phenotypes.


2020 ◽  
Author(s):  
Lucas D. Ward ◽  
Ho-Chou Tu ◽  
Chelsea Quenneville ◽  
Alexander O. Flynn-Carroll ◽  
Margaret M. Parker ◽  
...  

Abstract To better understand molecular pathways underlying liver health and disease, we performed genome-wide association studies (GWAS) on circulating levels of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) across 408,300 subjects from four ethnic groups in the UK Biobank, focusing on variants associating with both enzymes. Of these variants, the strongest effect is a rare (MAF in White British = 0.12%) missense variant in the gene encoding manganese efflux transporter SLC30A10, Thr95Ile (rs188273166), associating with a 5.9% increase in ALT and a 4.2% increase in AST. Carriers have higher prevalence of all-cause liver disease (OR = 1.70; 95% CI = 1.24 to 2.34) and higher prevalence of extrahepatic bile duct cancer (OR = 23.8; 95% CI = 9.1 to 62.1) compared to non-carriers. Over 4% of the cases of extrahepatic cholangiocarcinoma in the UK Biobank carry SLC30A10 Thr95Ile. Unlike variants in SLC30A10 known to cause the recessive syndrome hypermanganesemia with dystonia-1 (HMNDYT1), the Thr95Ile variant has a detectable effect even in the heterozygous state. Also unlike HMNDYT1-causing variants, Thr95Ile results in a protein that is properly trafficked to the plasma membrane when expressed in HeLa cells. These results suggest that coding variation in SLC30A10 impacts liver health in more individuals than the small population of HMNDYT1 patients.
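
The odds ratios and confidence intervals quoted above can be put on a common scale with a small calculation. The sketch below assumes the reported intervals are symmetric Wald intervals on the log-odds scale (the abstract does not say how they were computed), and the function name `log_or_summary` is purely illustrative.

```python
import math


def log_or_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-OR, its standard error, z-score and two-sided p-value.

    Assumption: the 95% CI is a symmetric Wald interval on the log scale,
    so its width equals 2 * z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, se, z, p


# Values taken from the abstract above
liver = log_or_summary(1.70, 1.24, 2.34)        # all-cause liver disease
bile_duct = log_or_summary(23.8, 9.1, 62.1)     # extrahepatic bile duct cancer

for name, (log_or, se, z, p) in [("liver disease", liver), ("bile duct cancer", bile_duct)]:
    print(f"{name}: log(OR)={log_or:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.2g}")
```

The wide interval for bile duct cancer corresponds to a larger standard error on the log scale, which is what one would expect for a rare outcome in carriers of a rare variant.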


Author(s):  
Mathew Vithayathil ◽  
Paul Carter ◽  
Siddhartha Kar ◽  
Amy M. Mason ◽  
Stephen Burgess ◽  
...  

Abstract Objectives To investigate the causal role of body mass index, body fat composition and height in cancer. Design Two stage mendelian randomisation study. Setting Previous genome wide association studies and the UK Biobank. Participants Genetic instrumental variables for body mass index (BMI), fat mass index (FMI), fat free mass index (FFMI) and height from previous genome wide association studies and UK Biobank. Cancer outcomes from 367 586 participants of European descent from the UK Biobank. Main outcome measures Overall cancer risk and risk of 22 site-specific cancers for genetic instrumental variables for BMI, FMI, FFMI and height. Results Genetically predicted BMI (per 1 kg/m²) was not associated with overall cancer risk (OR 0.99; 95% confidence interval (CI) 0.98-1.00, p=0.105). Elevated BMI was associated with increased risk of stomach cancer (OR 1.15, 95% CI 1.05-1.26; p=0.003) and melanoma (OR 0.96, 95% CI 0.92-1.00; p=0.044). For sex-specific cancers, BMI was positively associated with uterine cancer (OR 1.08, 95% CI 1.01-1.14; p=0.015) but inversely associated with breast (OR 0.95, 95% CI 0.92-0.98; p=0.001), prostate (OR 0.95, 95% CI 0.92-0.99; p=0.007) and testicular cancer (OR 0.89, 95% CI 0.81-0.98; p=0.017). Elevated FMI (per 1 kg/m²) was associated with gastrointestinal cancer (stomach cancer OR 4.23, 95% CI 1.18-15.13, p=0.027; colorectal cancer OR 1.94, 95% CI 1.23-3.07; p=0.004). Increased height (per 1 standard deviation, approximately 6.5 cm) was associated with increased risk of overall cancer (OR 1.06; 95% CI 1.04-1.09; p = 2.97×10⁻⁸) and most site-specific cancers, with the strongest estimates for kidney, non-Hodgkin lymphoma, colorectal, lung, melanoma and breast cancer. Conclusions There is little evidence for BMI as a causal risk factor for cancer. BMI may have a causal role for sex-specific cancers, although with inconsistent directions of effect, and FMI for gastrointestinal malignancies. Elevated height is a risk factor for overall cancer and multiple site-specific cancers.
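
To make the idea of combining genetic instruments concrete, here is a minimal sketch of an inverse-variance weighted (IVW) Mendelian randomisation estimate from summary statistics. IVW is a commonly used estimator and is shown here only as an illustration; the abstract does not state which estimator the authors used. The function name `ivw_estimate` and every number in `toy_instruments` are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(instruments):
    """Fixed-effect IVW estimate of the exposure-outcome effect.

    instruments: iterable of (beta_exposure, beta_outcome, se_outcome),
    one tuple per genetic variant. Returns (beta, standard_error), where
    beta is on the log-odds scale when the outcome is binary.
    """
    numerator = sum(bx * by / se ** 2 for bx, by, se in instruments)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in instruments)
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical per-variant summary statistics (illustrative only)
toy_instruments = [
    (0.10, 0.006, 0.002),
    (0.20, 0.011, 0.003),
    (0.15, 0.010, 0.004),
]

beta, se = ivw_estimate(toy_instruments)
odds_ratio = math.exp(beta)
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
print(f"OR per unit exposure = {odds_ratio:.3f} (95% CI {ci[0]:.3f}-{ci[1]:.3f})")
```

With a single instrument the estimate reduces to the Wald ratio `beta_outcome / beta_exposure`, which is a quick sanity check on the weighting.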


2020 ◽  
Vol 29 (16) ◽  
pp. 2803-2811
Author(s):  
James P Cook ◽  
Anubha Mahajan ◽  
Andrew P Morris

Abstract The UK Biobank is a prospective study of more than 500 000 participants, which has aggregated data from questionnaires, physical measures, biomarkers, imaging and follow-up for a wide range of health-related outcomes, together with genome-wide genotyping supplemented with high-density imputation. Previous studies have highlighted fine-scale population structure in the UK on a North-West to South-East cline, but the impact of unmeasured geographical confounding on genome-wide association studies (GWAS) of complex human traits in the UK Biobank has not been investigated. We considered 368 325 white British individuals from the UK Biobank and performed GWAS of their birth location. We demonstrate that widely used approaches to adjust for population structure, including principal component analysis and mixed modelling with a random effect for a genetic relationship matrix, cannot fully account for the fine-scale geographical confounding in the UK Biobank. We observe significant genetic correlation of birth location with a range of lifestyle-related traits, including body-mass index and fat mass, hypertension and lung function, even after adjustment for population structure. Variants driving associations with birth location are also strongly associated with many of these lifestyle-related traits after correction for population structure, indicating that there could be environmental factors that are confounded with geography that have not been adequately accounted for. Our findings highlight the need for caution in the interpretation of lifestyle-related trait GWAS in UK Biobank, particularly in loci demonstrating strong residual association with birth location.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jack W. O’Sullivan ◽  
John P. A. Ioannidis

Abstract With the establishment of large biobanks, discovery of single nucleotide variants (SNVs, also known as single nucleotide polymorphisms (SNPs)) associated with various phenotypes has accelerated. An open question is whether genome-wide significant SNVs identified in earlier genome-wide association studies (GWAS) are replicated in later GWAS conducted in biobanks. To address this, we examined a publicly available GWAS database and identified two independent GWAS on the same phenotype (an earlier, “discovery” GWAS and a later, “replication” GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNVs (of which 6289 reached P < 5e−8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0%, although it was lower for binary than for quantitative phenotypes (58.1% versus 94.8% respectively). There was an 18.0% decrease in SNV effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNV effect size, phenotype trait (binary or quantitative), and discovery P value, we built and validated a model that predicted SNV replication with area under the Receiver Operator Curve = 0.90. While non-replication may reflect lack of power rather than genuine false-positives, these results provide insights about which discovered associations are likely to be replicated across subsequent GWAS.
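
The three replication rates quoted above are linked by simple arithmetic if the overall figure is a pooled rate across all discovery SNVs. That pooling is an assumption made here for illustration; the abstract does not say how the overall rate was computed, and the helper name `implied_share` is hypothetical. Under that assumption, the sketch solves for the share of SNVs that would have to come from binary phenotypes.

```python
def implied_share(overall, rate_a, rate_b):
    """Share s of group A such that s*rate_a + (1 - s)*rate_b == overall.

    Assumption: `overall` is a simple pooled rate over groups A and B.
    """
    if rate_a == rate_b:
        raise ValueError("group rates must differ to identify the share")
    return (overall - rate_b) / (rate_a - rate_b)


# Replication rates (percent) as reported in the abstract
binary_share = implied_share(overall=85.0, rate_a=58.1, rate_b=94.8)
print(f"implied share of SNVs from binary phenotypes: {binary_share:.1%}")
```

The result is only an implied figure under the pooling assumption, not a number reported by the authors.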


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Wei Liu ◽  
Åsa Johansson ◽  
Helge Rask-Andersen ◽  
Mathias Rask-Andersen

Abstract Background Sensorineural hearing loss is one of the most common sensory deficiencies. However, the molecular contribution to age-related hearing loss is not fully elucidated. Methods We performed genome-wide association studies (GWAS) for hearing loss-related traits in the UK Biobank (N = 362,396) and selected a high confidence set of ten hearing-associated gene products for staining in human cochlear samples: EYA4, LMX1A, PTK2/FAK, UBE3B, MMP2, SYNJ2, GRM5, TRIOBP, LMO-7, and NOX4. Results All proteins were found to be expressed in human cochlear structures. Our findings illustrate cochlear structures that mediate mechano-electric transduction of auditory stimuli, neuronal conductance, and neuronal plasticity to be involved in age-related hearing loss. Conclusions Our results suggest common genetic variation to influence structural resilience to damage as well as cochlear recovery after trauma, which protect against accumulated damage to cochlear structures and the development of hearing loss over time.

