Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions

Author(s):  
David R. Bickel
F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 441
Author(s):  
Megan H. Murray ◽  
Jeffrey D. Blume

False discovery rates (FDR) are an essential component of statistical inference, representing the propensity for an observed result to be mistaken. FDR estimates should accompany observed results to help the user contextualize the relevance and potential impact of findings. This paper introduces a new user-friendly R package for estimating FDRs and computing adjusted p-values for FDR control. The roles of these two quantities are often confused in practice, and some software packages even report the adjusted p-values as the estimated FDRs. A key contribution of this package is that it distinguishes between these two quantities while also offering a broad array of refined algorithms for estimating them. For example, included are newly augmented methods for estimating the null proportion of findings, an important part of the FDR estimation procedure. The package is broad, encompassing a variety of adjustment methods for FDR estimation and FDR control, and includes plotting functions for easy display of results. Through extensive illustrations, we strongly encourage wider reporting of false discovery rates for observed findings.
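To make the distinction between the two quantities concrete, the following is a minimal Python sketch, not the R package described in the abstract. It assumes the Benjamini-Hochberg setting with the null proportion fixed at 1: the raw ratio `p * m / rank` is treated as a per-finding FDR estimate, while the adjusted p-value additionally enforces monotonicity with a running minimum. The function name `bh_quantities` is hypothetical and chosen for illustration only.

```python
import numpy as np


def bh_quantities(pvalues):
    """Return (raw_ratio, adjusted) for a 1-D collection of p-values.

    raw_ratio : p_i * m / rank_i, capped at 1 (illustrative FDR estimate,
                assuming a null proportion of 1).
    adjusted  : Benjamini-Hochberg adjusted p-values, i.e. the raw ratios
                made monotone by a running minimum taken from the largest
                p-value downwards.
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size

    # Rank of each p-value (1 = smallest).
    order = np.argsort(p)
    ranks = np.empty(m, dtype=int)
    ranks[order] = np.arange(1, m + 1)

    raw = np.minimum(p * m / ranks, 1.0)

    # Step-up adjustment: cumulative minimum over the sorted raw ratios,
    # scanned from the largest p-value to the smallest.
    sorted_raw = raw[order]
    adjusted_sorted = np.minimum.accumulate(sorted_raw[::-1])[::-1]

    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return raw, adjusted


raw, adjusted = bh_quantities([0.01, 0.04, 0.03, 0.5])
```

In this toy input the two outputs differ only for the p-value 0.03, whose raw ratio exceeds that of the next larger p-value and is therefore pulled down by the monotonicity step.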


Author(s):  
Peter Hettegger ◽  
Klemens Vierlinger ◽  
Andreas Weinhaeusel

Motivation: Data generated from high-throughput technologies such as sequencing, microarray and bead-chip technologies are unavoidably affected by batch effects (BEs). Large effort has been put into developing methods for correcting these effects. Often, BE correction and hypothesis testing cannot be done with one single model, but are done successively with separate models in data analysis pipelines. This potentially leads to biased P-values or false discovery rates due to the influence of BE correction on the data. Results: We present a novel approach for estimating null distributions of test statistics in data analysis pipelines where BE correction is followed by linear model analysis. The approach is based on generating simulated datasets by random rotation and thereby retains the dependence structure of genes adequately. This allows estimating null distributions of dependent test statistics, and thus the calculation of resampling-based P-values and false-discovery rates following BE correction while maintaining the alpha level. Availability: The described methods are implemented as the randRotation package on Bioconductor: https://bioconductor.org/packages/randRotation/ Supplementary information: Supplementary data are available at Bioinformatics online.


Biometrika ◽  
2011 ◽  
Vol 98 (2) ◽  
pp. 251-271 ◽  
Author(s):  
Bradley Efron ◽  
Nancy R. Zhang

Scientifica ◽  
2012 ◽  
Vol 2012 ◽  
pp. 1-9 ◽  
Author(s):  
Emily Hansen ◽  
Kathleen F. Kerr

The goal of many microarray studies is to identify genes that are differentially expressed between two classes or populations. Many data analysts choose to estimate the false discovery rate (FDR) associated with the list of genes declared differentially expressed. Estimating an FDR largely reduces to estimating π1, the proportion of differentially expressed genes among all analyzed genes. Estimating π1 is usually done through P-values, but computing P-values can be viewed as a nuisance and potentially problematic step. We evaluated methods for estimating π1 directly from test statistics, circumventing the need to compute P-values. We adapted existing methodology for estimating π1 from t- and z-statistics so that π1 could be estimated from other statistics. We compared the quality of these estimates to estimates generated by two established methods for estimating π1 from P-values. Overall, methods varied widely in bias and variability. The least biased and least variable estimates of π1, the proportion of differentially expressed genes, were produced by applying the “convest” mixture model method to P-values computed from a pooled permutation null distribution. Estimates computed directly from test statistics rather than P-values did not reliably perform well.
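As an illustration of estimating π1 from P-values, here is a minimal Python sketch. It uses a Storey-type threshold estimator, which is a swapped-in technique and not the “convest” method or any other method evaluated in the abstract. The assumption is that null P-values are uniform on [0, 1], so the fraction of P-values above a threshold `lam` estimates the null proportion π0 after rescaling, and π1 = 1 − π0. The function name `estimate_pi1` and the default `lam = 0.5` are hypothetical choices.

```python
import numpy as np


def estimate_pi1(pvalues, lam=0.5):
    """Estimate pi1, the non-null proportion, from a set of p-values.

    Assumes null p-values are Uniform(0, 1), so that roughly
    pi0 * (1 - lam) of all p-values fall above the threshold `lam`.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    p = np.asarray(pvalues, dtype=float)

    # Fraction of p-values above lam, rescaled by the width of (lam, 1].
    pi0 = np.mean(p > lam) / (1.0 - lam)
    pi0 = min(float(pi0), 1.0)  # a proportion cannot exceed 1
    return 1.0 - pi0


# Toy input: six small p-values and two large ones.
pi1_hat = estimate_pi1([0.001, 0.002, 0.003, 0.004, 0.01, 0.02, 0.6, 0.9])
```

With only a handful of P-values the estimate is very noisy; the sketch is meant to show the arithmetic, not to suggest a reliable procedure.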


Author(s):  
Balthasar Bickel

Large-scale areal patterns point to ancient population history and form a well-known confound for language universals. Despite their importance, demonstrating such patterns remains a challenge. This chapter argues that large-scale area hypotheses are better tested by modeling diachronic family biases than by controlling for genealogical relations in regression models. A case study of the Trans-Pacific area reveals that diachronic bias estimates do not depend much on the amount of phylogenetic information that is used when inferring them. After controlling for false discovery rates, about 39 variables in WALS and AUTOTYP show diachronic biases that differ significantly inside vs. outside the Trans-Pacific area. Nearly three times as many biases hold outside as inside the Trans-Pacific area, indicating that the Trans-Pacific area is not so much characterized by the spread of biases but rather by the retention of earlier diversity, in line with earlier suggestions in the literature.


PROTEOMICS ◽  
2009 ◽  
Vol 9 (5) ◽  
pp. 1220-1229 ◽  
Author(s):  
Andrew R. Jones ◽  
Jennifer A. Siepen ◽  
Simon J. Hubbard ◽  
Norman W. Paton
