Robust, flexible, and scalable tests for Hardy-Weinberg Equilibrium across diverse ancestries

2020 ◽  
Author(s):  
Alan M. Kwong ◽  
Thomas W. Blackwell ◽  
Jonathon LeFaive ◽  
Mariza de Andrade ◽  
John Barnard ◽  
...  

Abstract Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in datasets comprised of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently amongst the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.

Genetics ◽  
2021 ◽  
Author(s):  
Alan M Kwong ◽  
Thomas W Blackwell ◽  
Jonathon LeFaive ◽  
Mariza de Andrade ◽  
John Barnard ◽  
...  

Abstract Traditional Hardy–Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and to evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.


2019 ◽  
Vol 19 (3) ◽  
pp. 2476-2483 ◽  
Author(s):  
Areej M Al Qahtani ◽  
Ayat B Al-Ghafari ◽  
Huda A Al Doghaither ◽  
Anas H Alzahrani ◽  
Ulfat M Omar ◽  
...  

Background: Colorectal cancer (CRC) is one of the most prevalent cancers in Saudi Arabia and is characterized by a poor survival rate and advanced metastasis. Many studies attribute this poor outcome to the expression of ABC transporters on the surface of cancer cells. Objectives: In this study, two ABCB1 variants, C3435T and T129C, were examined to evaluate their contribution to CRC risk. Methods: 125 subjects (62 CRC patients and 63 healthy controls) were involved. DNA was isolated and analyzed with PCR-RFLP to determine the different genotypes. Hardy-Weinberg equilibrium was tested to assess genotype distribution and allele frequencies. Fisher’s exact test (two-tailed) was used to compare allele frequencies between patients and control subjects. Results: For SNP C3435T, both CRC patients and controls were out of Hardy-Weinberg equilibrium (goodness-of-fit χ2 = 20, df = 1, P ≤ 0.05 for CRC patients; goodness-of-fit χ2 = 21, df = 1, P ≤ 0.05 for controls). For SNP T129C, all subjects showed the normal (TT) genotype. Conclusion: There was no significant association of the ABCB1 3435C>T and 129T>C polymorphisms with CRC risk. Keywords: Colorectal cancer, ABCB1 gene, SNP C3435T, SNP T129C, PCR-RFLP, Saudi Arabia.
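
The goodness-of-fit χ2 test with df = 1 reported in this abstract can be sketched as follows. This is a minimal illustration of the textbook Pearson test for a biallelic locus, not the authors' analysis code; the function name `hwe_chi2` and the example counts are assumptions introduced here, and the abstract does not report the underlying genotype counts.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square goodness-of-fit test for HWE at a biallelic locus.

    Takes observed genotype counts and returns (chi2, p_value) with df = 1.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # HWE proportions
    # Skip empty expected cells (monomorphic locus) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: a sample with no heterozygotes at all.
    chi2, p_value = hwe_chi2(50, 0, 50)
    print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

The degrees of freedom are 1 because three genotype classes are fitted with one estimated allele frequency. For small samples or rare alleles an exact test is usually preferred over this asymptotic approximation.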


2015 ◽  
Vol 63 (4) ◽  
pp. 275
Author(s):  
Andrea Bertram ◽  
P. Joana Dias ◽  
Sherralee Lukehurst ◽  
W. Jason Kennington ◽  
David Fairclough ◽  
...  

Bight redfish, Centroberyx gerrardi, is a demersal teleost endemic to continental shelf and upper slope waters of southern Australia. Throughout most of its range, C. gerrardi is targeted by a number of separately managed commercial and recreational fisheries across several jurisdictions. However, it is currently unknown whether stock assessments and management for this shared resource are being conducted at appropriate spatial scales, thereby requiring knowledge of population structure and connectivity. To investigate population structure and connectivity, we developed 16 new polymorphic microsatellite markers using 454 shotgun sequencing. Two to 15 alleles per locus were detected. There was no evidence of linkage disequilibrium between pairs of loci and all loci except one were in Hardy–Weinberg equilibrium. Cross-amplification trials in the congeneric C. australis and C. lineatus revealed that 11 and 16 loci are potentially useful, respectively. However, deviations from Hardy–Weinberg equilibrium and linkage disequilibrium between pairs of loci were detected at several of the 16 markers for C. australis, and therefore the number of markers useful for population genetic analyses with C. lineatus is likely considerably lower than 11.


2017 ◽  
Author(s):  
Jan Graffelman ◽  
Bruce Weir

Statistical tests for Hardy-Weinberg equilibrium are important elementary tools in genetic data analysis. X-chromosomal variants have long been tested by applying autosomal test procedures to females only, and gender is usually not considered when testing autosomal variants for equilibrium. Recently, we proposed specific X-chromosomal exact test procedures for bi-allelic variants that include the hemizygous males, as well as autosomal tests that consider gender. In this paper we present an extension of that work to variants with multiple alleles. A full enumeration algorithm is used for the exact calculations for tri-allelic variants. For variants with many alternate alleles we use a permutation test. Some empirical examples with data from the 1000 Genomes Project are discussed.


Genetics ◽  
2001 ◽  
Vol 158 (2) ◽  
pp. 875-883
Author(s):  
Luis E Montoya-Delgado ◽  
Telba Z Irony ◽  
Carlos A de B. Pereira ◽  
Martin R Whittle

Abstract Much forensic inference based upon DNA evidence is made assuming that the Hardy-Weinberg equilibrium (HWE) is valid for the genetic loci being used. Several statistical tests to detect and measure deviation from HWE have been devised, each having advantages and limitations. The limitations become more obvious when testing for deviation within multiallelic DNA loci is attempted. Here we present an exact test for HWE in the biallelic case, based on the ratio of weighted likelihoods under the null and alternative hypotheses, the Bayes factor. This test does not depend on asymptotic results and minimizes a linear combination of type I and type II errors. By ordering the sample space using the Bayes factor, we also define a significance (evidence) index, P value, using the weighted likelihood under the null hypothesis. We compare it to the conditional exact test for the case of sample size n = 10. Using the idea under the method of χ2 partition, the test is used sequentially to test equilibrium in the multiple allele case and then applied to two short tandem repeat loci, using a real Caucasian data bank, showing its usefulness.
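
The conditional exact test that this abstract uses as its comparator can be sketched for the biallelic case as below. This is the standard test that conditions on the observed allele counts, not the Bayes-factor procedure the authors propose; the function names and the n = 10 example counts are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def arrangement_weight(n_aa, n_ab, n_bb):
    """Unnormalised HWE probability of a genotype configuration.

    Equals n! / (n_aa! n_ab! n_bb!) * 2**n_ab, which is proportional to the
    conditional probability of the configuration given the allele counts.
    """
    n = n_aa + n_ab + n_bb
    multinomial = factorial(n) // (
        factorial(n_aa) * factorial(n_ab) * factorial(n_bb)
    )
    return multinomial * 2 ** n_ab


def exact_hwe_pvalue(n_aa, n_ab, n_bb):
    """Exact P value: total probability of all heterozygote counts that are
    no more likely than the observed one, given the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n - n_a      # copies of allele B
    weights = {}
    # Feasible heterozygote counts share the parity of n_a.
    for k in range(n_a % 2, min(n_a, n_b) + 1, 2):
        weights[k] = arrangement_weight((n_a - k) // 2, k, (n_b - k) // 2)
    observed = weights[n_ab]
    tail = sum(w for w in weights.values() if w <= observed)
    return Fraction(tail, sum(weights.values()))


if __name__ == "__main__":
    # Hypothetical sample of n = 10 individuals with no heterozygotes.
    print(exact_hwe_pvalue(5, 0, 5))  # prints 63/46189
```

Exact rational arithmetic keeps the ordering of configurations free of floating-point ties, which is practical at this sample size; for large samples a log-space recursion would be the more usual choice.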


2021 ◽  
Author(s):  
William S Pearman ◽  
Lara Urban ◽  
Alana Alexander

Reduced representation sequencing (RRS) is a widely used method to assay the diversity of genetic loci across the genome of an organism. The dominant class of RRS approaches assay loci associated with restriction sites within the genome (restriction site associated DNA sequencing, or RADseq). RADseq is frequently applied to non-model organisms since it enables population genetic studies without relying on well-characterized reference genomes. However, RADseq requires the use of many bioinformatic filters to ensure the quality of genotyping calls. These filters can have direct impacts on population genetic inference, and therefore require careful consideration. One widely used filtering approach is the removal of loci which do not conform to expectations of Hardy-Weinberg equilibrium (HWE). Despite being widely used, we show that this filtering approach is rarely described in sufficient detail to enable replication. Furthermore, through analyses of in silico and empirical datasets we show that some of the most widely used HWE filtering approaches dramatically impact inference of population structure. In particular, the removal of loci exhibiting departures from HWE after pooling across samples significantly reduces the degree of inferred population structure within a dataset (despite this approach being widely used). Based on these results, we provide recommendations for best practice regarding the implementation of HWE filtering for RADseq datasets.
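
One way to see why HWE filtering after pooling interacts with population structure is the Wahlund effect: subpopulations that are each in HWE but differ in allele frequency show a heterozygote deficit when pooled. The sketch below computes that deficit analytically. It is an illustration under idealised assumptions (random mating within each subpopulation, known allele frequencies), not the authors' pipeline, and `pooled_het_deficit` is a hypothetical name.

```python
def pooled_het_deficit(freqs, sizes):
    """Relative heterozygote deficit F = 1 - H_obs / H_exp for a pooled sample.

    freqs: allele-A frequency in each subpopulation (each assumed in HWE).
    sizes: number of individuals sampled from each subpopulation.
    """
    total = sum(sizes)
    weights = [s / total for s in sizes]
    # Allele frequency in the pooled sample.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Heterozygosity actually present: weighted mean of within-population 2pq.
    het_obs = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))
    # Heterozygosity HWE would predict from the pooled allele frequency.
    het_exp = 2 * p_bar * (1 - p_bar)
    if het_exp == 0:
        return 0.0  # monomorphic locus: no deficit is defined
    return 1 - het_obs / het_exp


if __name__ == "__main__":
    # A strongly differentiated locus versus an undifferentiated one.
    print(pooled_het_deficit([0.1, 0.9], [100, 100]))
    print(pooled_het_deficit([0.5, 0.5], [100, 100]))
```

Under these assumptions the deficit grows with the between-population variance in allele frequency, so the loci most likely to fail a pooled HWE test are the ones that differ most between populations, which is consistent with the reduction in inferred structure described above.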


2017 ◽  
Author(s):  
Wei Hao ◽  
John D. Storey

Abstract Testing for Hardy-Weinberg equilibrium (HWE) is an important component in almost all analyses of population genetic data. Genetic markers that violate HWE are often treated as special cases; for example, they may be flagged as possible genotyping errors or they may be investigated more closely for evolutionary signatures of interest. The presence of population structure is one reason why genetic markers may fail a test of HWE. This is problematic because almost all natural populations studied in the modern setting show some degree of structure. Therefore, it is important to be able to detect deviations from HWE for reasons other than structure. To this end, we extend statistical tests of HWE to allow for population structure, which we call a test of “structural HWE” (sHWE). Additionally, our new test allows one to automatically choose tuning parameters and identify accurate models of structure. We demonstrate our approach on several important studies, provide theoretical justification for the test, and present empirical evidence for its utility. We anticipate the proposed test will be useful in a broad range of analyses of genome-wide population genetic data.

