Covariate Adaptive Family-wise Error Rate Control for Genome-Wide Association Studies

Biometrika ◽  
2020 ◽  
Author(s):  
Huijuan Zhou ◽  
Xianyang Zhang ◽  
Jun Chen

Abstract The family-wise error rate (FWER) has been widely used in genome-wide association studies. With the increasing availability of functional genomics data, it is possible to increase the detection power by leveraging these genomic functional annotations. Previous efforts to accommodate covariates in multiple testing focus on the false discovery rate control while covariate-adaptive FWER-controlling procedures remain under-developed. Here we propose a novel covariate-adaptive FWER-controlling procedure that incorporates external covariates which are potentially informative of either the statistical power or the prior null probability. An efficient algorithm is developed to implement the proposed method. We prove its asymptotic validity and obtain the rate of convergence through a perturbation-type argument. Our numerical studies show that the new procedure is more powerful than competing methods and maintains robustness across different settings. We apply the proposed approach to the UK Biobank data and analyze 27 traits with 9 million single-nucleotide polymorphisms tested for associations. Seventy-five genomic annotations are used as covariates. Our approach detects more genome-wide significant loci than other methods in 21 out of the 27 traits.
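
For orientation, the simplest way covariates can enter an FWER-controlling procedure is through weighted Bonferroni thresholds. The sketch below is not the procedure proposed in the abstract above. It assumes the per-hypothesis weights have already been derived from external covariates independently of the p-values (choosing them well is the hard part), and the function name `weighted_bonferroni` is hypothetical.

```python
def weighted_bonferroni(pvalues, weights, alpha=0.05):
    """Reject H_i when p_i <= alpha * w_i / m, with weights rescaled to average 1.

    Illustrative only: weights are assumed to be fixed in advance
    (e.g. from annotations) and not estimated from the p-values themselves.
    """
    m = len(pvalues)
    if m == 0 or len(weights) != m:
        raise ValueError("pvalues and weights must be non-empty and equal in length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    total = sum(weights)
    # Rescale so the weights average to 1; the thresholds then sum to alpha.
    scaled = [w * m / total for w in weights]
    return [p <= alpha * w / m for p, w in zip(pvalues, scaled)]


if __name__ == "__main__":
    pvals = [0.001, 0.02, 0.04]
    print(weighted_bonferroni(pvals, [1, 1, 1]))      # plain Bonferroni
    print(weighted_bonferroni(pvals, [0.5, 2, 0.5]))  # up-weight the second test
```

With equal weights this reduces to ordinary Bonferroni. Up-weighting a hypothesis loosens its threshold at the cost of tightening the others.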

2021 ◽  
Author(s):  
Robin N Beaumont ◽  
Isabelle K Mayne ◽  
Rachel M Freathy ◽  
Caroline F Wright

Abstract Birth weight is an important factor in newborn survival; both low and high birth weights are associated with adverse later-life health outcomes. Genome-wide association studies (GWAS) have identified 190 loci associated with maternal or fetal effects on birth weight. Knowledge of the underlying causal genes is crucial to understand how these loci influence birth weight and the links between infant and adult morbidity. Numerous monogenic developmental syndromes are associated with birth weights at the extreme ends of the distribution. Genes implicated in those syndromes may provide valuable information to prioritize candidate genes at the GWAS loci. We examined the proximity of genes implicated in developmental disorders (DDs) to birth weight GWAS loci using simulations to test whether they fall disproportionately close to the GWAS loci. We found birth weight GWAS single nucleotide polymorphisms (SNPs) fall closer to such genes than expected both when the DD gene is the nearest gene to the birth weight SNP and also when examining all genes within 258 kb of the SNP. This enrichment was driven by genes causing monogenic DDs with dominant modes of inheritance. We found examples of SNPs in the intron of one gene marking plausible effects via different nearby genes, highlighting that the gene closest to the SNP is not necessarily the functionally relevant gene. This is the first application of this approach to birth weight, and it has helped identify GWAS loci likely to have direct fetal effects on birth weight that could not previously be classified as fetal or maternal owing to insufficient statistical power.


Author(s):  
Ting-Hao Chen ◽  
Chen-Cheng Yang ◽  
Kuei-Hau Luo ◽  
Chia-Yen Dai ◽  
Yao-Chung Chuang ◽  
...  

Aluminum (Al) toxicity is associated with renal failure and the failure of other systems. Although some genome-wide association studies (GWAS) have been conducted in Australia and England, to our knowledge none has been conducted in Han Chinese. This study therefore used whole-genome genotypes from the Taiwan Biobank to explore the association between plasma Al concentrations and renal function. Participants, who completed questionnaire interviews, biomarker measurements, and genotyping, were drawn from the Taiwan Biobank database. We measured their plasma Al concentrations by ICP-MS in the laboratory at Kaohsiung Medical University. We used these data in genome-wide association (GWA) tests to look for candidate genes and to relate plasma Al concentration to renal function. Furthermore, we examined the path relationship among single nucleotide polymorphisms (SNPs), Al concentrations, and estimated glomerular filtration rate (eGFR) through mediation analysis with 3000 bootstrap replications. Following the principles of GWAS, we focused on three SNPs within the dipeptidyl peptidase-like protein 6 (DPP6) gene on chromosome 7: rs10224371, rs2316242, and rs10268004. The mediation analysis showed that all of the selected SNPs affected eGFR indirectly, with Al concentration acting as the mediator. Our analysis revealed an association among DPP6 SNPs, plasma Al concentrations, and eGFR. However, further longitudinal and mechanistic studies are needed. Nevertheless, this is the first study to explore the association between DPP6 SNPs and plasma Al affecting eGFR.
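
The mediation step described above can be illustrated with a product-of-coefficients estimate and a percentile bootstrap. This is a minimal sketch under stated assumptions, not the study's actual model. It assumes a single SNP `x`, a mediator `m` (plasma Al), an outcome `y` (eGFR), and no additional covariates, and the function names are hypothetical.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def indirect_effect(x, m, y):
    """Indirect effect a*b: a from m ~ x, b = coefficient of m in y ~ m + x."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx
    # Closed-form OLS solution for the mediator's coefficient with x adjusted for.
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=3000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect (resampling subjects)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[int((1 - level) / 2 * n_boot)]
    hi = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi
```

An interval that excludes zero is the usual evidence for an indirect effect. A real analysis would also adjust for covariates such as age and sex, which this sketch omits.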


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Guomin Zhang ◽  
Rongsheng Wang ◽  
Juntao Ma ◽  
Hongru Gao ◽  
Lingwei Deng ◽  
...  

Abstract Background Heilongjiang Province is a high-quality japonica rice cultivation area in China. One in ten bowls of Chinese rice is produced here. Increasing yield is one of the main aims of rice production in this area. However, yield is a complex quantitative trait composed of many factors. The purpose of this study was to determine how many genetic loci are associated with yield-related traits. Genome-wide association studies (GWAS) were performed on 450 accessions collected from northeast Asia, including Russia, Korea, Japan and Heilongjiang Province of China. These accessions consist of elite varieties and landraces introduced into Heilongjiang Province decades ago. Results After resequencing of the 450 accessions, 189,019 single nucleotide polymorphisms (SNPs) were used for association studies by two different models, a general linear model (GLM) and a mixed linear model (MLM), examining four traits: days to heading (DH), plant height (PH), panicle weight (PW) and tiller number (TI). Over 25 SNPs were found to be associated with each trait. Among them, 22 SNPs were selected to identify candidate genes, and 2, 8, 1 and 11 SNPs were found to be located in the 3′ UTR, intron, coding and intergenic regions, respectively. Conclusions All SNPs detected in this research may become candidates for further fine mapping and may be used in the molecular breeding of high-latitude rice.


2019 ◽  
Vol 116 (4) ◽  
pp. 1195-1200 ◽  
Author(s):  
Daniel J. Wilson

Analysis of “big data” frequently involves statistical comparison of millions of competing hypotheses to discover hidden processes underlying observed patterns of data, for example, in the search for genetic determinants of disease in genome-wide association studies (GWAS). Controlling the familywise error rate (FWER) is considered the strongest protection against false positives but makes it difficult to reach the multiple testing-corrected significance threshold. Here, I introduce the harmonic mean p-value (HMP), which controls the FWER while greatly improving statistical power by combining dependent tests using the generalized central limit theorem. I show that the HMP effortlessly combines information to detect statistically significant signals among groups of individually nonsignificant hypotheses in examples of a human GWAS for neuroticism and a joint human–pathogen GWAS for hepatitis C viral load. The HMP simultaneously tests all ways to group hypotheses, allowing the smallest groups of hypotheses that retain significance to be sought. The power of the HMP to detect significant hypothesis groups is greater than the power of the Benjamini–Hochberg procedure to detect significant hypotheses, although the latter only controls the weaker false discovery rate (FDR). The HMP has broad implications for the analysis of large datasets, because it enhances the potential for scientific discovery.
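
The statistic itself is a weighted harmonic mean and is simple to compute. A minimal sketch, assuming equal weights by default; the function name `harmonic_mean_p` is hypothetical. The FWER guarantee described above rests on the generalized central limit theorem, so the raw value computed here should not be read as an exactly calibrated p-value.

```python
def harmonic_mean_p(pvalues, weights=None):
    """Weighted harmonic mean of p-values: sum(w) / sum(w_i / p_i).

    Illustrative only: returns the raw statistic, without the
    calibration step needed for a formal significance test.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / n] * n  # equal weights summing to 1
    if len(weights) != n:
        raise ValueError("weights must match pvalues in length")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


if __name__ == "__main__":
    # One small p-value dominates the combination.
    print(harmonic_mean_p([0.01, 1.0]))
```

Because the harmonic mean is dominated by its smallest terms, a group containing one strong signal yields a small combined value even when the remaining tests are null.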


2020 ◽  
Vol 117 (21) ◽  
pp. 11608-11613 ◽  
Author(s):  
Marcelo Blatt ◽  
Alexander Gusev ◽  
Yuriy Polyakov ◽  
Shafi Goldwasser

Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the individual-level privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are more than one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. Our homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
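
As a rough consistency check on the two quoted timings, assuming near-ideal linear scaling across server nodes (an assumption made here, not a claim from the abstract):

```latex
\frac{5.6\ \text{h} \times 60\ \text{min/h}}{31\ \text{nodes}}
  = \frac{336\ \text{min}}{31}
  \approx 10.8\ \text{min}
```

This agrees with the reported figure of about 11 min on 31 nodes.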


2020 ◽  
Vol 116 (9) ◽  
pp. 1620-1634
Author(s):  
Charlotte Glinge ◽  
Najim Lahrouchi ◽  
Reza Jabbari ◽  
Jacob Tfelt-Hansen ◽  
Connie R Bezzina

Abstract The genetic basis of cardiac electrical phenotypes has in the last 25 years been the subject of intense investigation. While in the first years, such efforts were dominated by the study of familial arrhythmia syndromes, in recent years, large consortia of investigators have successfully pursued genome-wide association studies (GWAS) for the identification of single-nucleotide polymorphisms that govern inter-individual variability in electrocardiographic parameters in the general population. We here provide a review of GWAS conducted on cardiac electrical phenotypes in the last 14 years and discuss the implications of these discoveries for our understanding of the genetic basis of disease susceptibility and variability in disease severity. Furthermore, we review functional follow-up studies that have been conducted on GWAS loci associated with cardiac electrical phenotypes and highlight the challenges and opportunities offered by such studies.

