Null Hypothesis Significance Testing
Recently Published Documents

TOTAL DOCUMENTS: 186 (five years: 61)
H-INDEX: 26 (five years: 3)

2021, Vol 15
Author(s): Ruslan Masharipov, Irina Knyazeva, Yaroslav Nikolaev, Alexander Korotkov, Michael Didur, ...

Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. This leads to a bias against the null hypothesis. Herein, we discuss statistical approaches to ‘null effect’ assessment, focusing on Bayesian parameter inference (BPI). Although Bayesian methods have been theoretically elaborated and implemented in common neuroimaging software packages, they are not widely used for ‘null effect’ assessment. BPI considers the posterior probability of finding the effect within or outside the region of practical equivalence to the null value. Using a single decision rule, it can identify both ‘activated/deactivated’ and ‘not activated’ voxels, or indicate that the obtained data are insufficient. It also allows the data to be evaluated as the sample size increases, so that the experiment can be stopped once the data are sufficient for a confident inference. To demonstrate the advantages of using BPI for group analysis of fMRI data, we compare it with classical null hypothesis significance testing on empirical data. We also use simulated data to show how BPI performs under different effect sizes, noise levels, noise distributions and sample sizes. Finally, we consider the problem of defining the region of practical equivalence for BPI and discuss possible applications of BPI in fMRI studies. To facilitate ‘null effect’ assessment for fMRI practitioners, we provide a Statistical Parametric Mapping 12 based toolbox for Bayesian inference.
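As a rough illustration of the kind of decision rule described above, the sketch below classifies a single effect from posterior samples using a region of practical equivalence (ROPE). The ROPE half-width, probability threshold, and labels are illustrative assumptions; this is not the authors' SPM12 toolbox.

```python
import numpy as np

def bpi_decision(posterior_samples, rope_halfwidth=0.1, threshold=0.95):
    """Classify an effect from posterior samples using a ROPE around zero.
    Returns 'activated', 'deactivated', 'not activated', or 'insufficient data'
    when the posterior mass is not concentrated enough for any decision."""
    p_above = np.mean(posterior_samples > rope_halfwidth)            # practically positive
    p_below = np.mean(posterior_samples < -rope_halfwidth)           # practically negative
    p_inside = np.mean(np.abs(posterior_samples) <= rope_halfwidth)  # practically null

    if p_above >= threshold:
        return "activated"
    if p_below >= threshold:
        return "deactivated"
    if p_inside >= threshold:
        return "not activated"
    return "insufficient data"

# Toy example: posterior for one voxel's effect (mean 0.02, sd 0.03)
rng = np.random.default_rng(0)
samples = rng.normal(0.02, 0.03, size=10_000)
print(bpi_decision(samples))  # with this ROPE, most mass lies inside: 'not activated'
```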


2021
Author(s): Erik Otarola-Castillo, Meissa G Torquato, Caitlin E. Buck

Archaeologists often use data and quantitative statistical methods to evaluate their ideas. Although there are various statistical frameworks for decision-making in archaeology and science in general, in this chapter we provide a simple explanation of Bayesian statistics. To contextualize the Bayesian statistical framework, we briefly compare it to the more widespread null hypothesis significance testing (NHST) approach. We also provide a simple example to illustrate how archaeologists use data and the Bayesian framework to compare hypotheses and evaluate their uncertainty. We then review how archaeologists have applied Bayesian statistics to solve research problems related to radiocarbon dating and chronology, as well as lithic, ceramic, zooarchaeological, bioarchaeological, and spatial analyses. Because recent work has reviewed Bayesian applications in archaeology from the 1990s up to 2017, this chapter considers the relevant literature published since 2017.
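For readers unfamiliar with the framework, a minimal sketch of Bayesian comparison of two hypotheses follows. The site, counts, and candidate proportions are invented for illustration and are not the chapter's own example.

```python
from scipy.stats import binom

# Hypothetical comparison of two hypotheses about the proportion of decorated
# sherds at a site: H1: p = 0.2 versus H2: p = 0.5, with equal prior probability.
k, n = 14, 40                      # observed decorated sherds out of n examined
lik_h1 = binom.pmf(k, n, 0.2)      # likelihood of the data under H1
lik_h2 = binom.pmf(k, n, 0.5)      # likelihood of the data under H2

prior_h1 = prior_h2 = 0.5
post_h1 = lik_h1 * prior_h1 / (lik_h1 * prior_h1 + lik_h2 * prior_h2)
print(f"P(H1 | data) = {post_h1:.3f}, P(H2 | data) = {1 - post_h1:.3f}")
print(f"Bayes factor BF12 = {lik_h1 / lik_h2:.2f}")
```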


Author(s): Andrew Gelman, Simine Vazire

For several decades, leading behavioral scientists have offered strong criticisms of the common practice of null hypothesis significance testing as producing spurious findings without strong theoretical or empirical support. But only in the past decade has this manifested as a full-scale replication crisis. We consider some possible reasons why, on or about December 2010, the behavioral sciences changed.


2021
Author(s): Edward William Legg, Benjamin George Farrar, Aleksandra Lazić, Maleen Thiele, Dora Kampis, ...

Null Hypothesis Significance Testing is a statistical procedure widely used in cognitive development research. There is widespread concern that the results of this statistical procedure are misinterpreted and lead to unsubstantiated claims about studies’ outcomes. Two particularly pertinent issues for research on cognitive development are: i) treating a non-significant result as evidence of no difference or no effect, and ii) treating a non-significant result in one group/condition and a significant result in another as evidence of a difference between groups/conditions. The current study focuses on quantifying the extent to which these two issues can be observed in the published literature on cognitive development. To this end, we will systematically search for empirical studies investigating cognitive development in 0-to-16-year-old children that have been published at two time points, namely in 1999 and 2019. For each of the two issues, we will extract information from 300 published articles, 150 per publication year.
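Issue (ii) can be made concrete with a small simulation (illustrative only, and not part of the registered protocol): two groups sampled from the same population can easily yield one significant and one non-significant test, while a direct test of the group difference shows nothing.

```python
import numpy as np
from scipy import stats

# Two groups drawn from the *same* underlying effect can produce one significant
# and one non-significant result purely through sampling noise; the direct test
# of the difference between groups is what bears on a group difference.
rng = np.random.default_rng(1)
group_a = rng.normal(0.3, 1.0, size=30)   # true effect 0.3 in both groups
group_b = rng.normal(0.3, 1.0, size=30)

p_a = stats.ttest_1samp(group_a, 0).pvalue          # may fall below .05
p_b = stats.ttest_1samp(group_b, 0).pvalue          # may not
p_diff = stats.ttest_ind(group_a, group_b).pvalue   # direct test of the difference

print(f"group A vs 0: p = {p_a:.3f}")
print(f"group B vs 0: p = {p_b:.3f}")
print(f"A vs B:       p = {p_diff:.3f}  (typically far from significant)")
```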


2021, Vol 1 (1-2)
Author(s): Julia Mossbridge, Dean Radin

Objective: We set out to gain a better understanding of human psychic or “psi” functioning by using a smartphone-based app to gather data from thousands of participants. Our expectations were that psi performance would often be revealed to be in the direction opposite to the participants’ conscious intentions (“expectation-opposing”; previously called “psi-missing”), and that gender and psi belief would be related to performance. Method: We created and launched three iOS-based tasks, available from 2017 to 2020, related to micro-psychokinesis (the ability to mentally influence a random number generator) and precognition (the ability to predict future randomly selected events). We statistically analyzed data from more than 2,613 unique logins and 995,995 contributed trials using null hypothesis significance testing as well as a pre-registered confirmatory analysis. Results: Our expectations were confirmed, and we discovered additional effects post hoc. Our key findings were: 1) significant expectation-opposing effects, with a confirmatory pre-registered replication of a clear expectation-opposing effect on a micro-pk task; 2) performance correlated with psi belief on all three tasks; 3) performance on two of the three tasks was related to gender; and 4) men and women apparently used different strategies to perform the micro-pk and precognition tasks. Conclusions: We describe our recommendations for future attempts to better understand performance on forced-choice psi tasks. The mnemonic for this strategy is SEARCH: Small effects, Early and exploratory, Accrue data, Recognize diversity in approach, Characterize rather than impose, and Hone in on big results.


eLife, 2021, Vol 10
Author(s): Robin AA Ince, Angus T Paton, Jim W Kay, Philippe G Schyns

Within neuroscience, psychology, and neuroimaging, the most frequently used statistical approach is null hypothesis significance testing (NHST) of the population mean. An alternative approach is to perform NHST within individual participants and then infer, from the proportion of participants showing an effect, the prevalence of that effect in the population. We propose a novel Bayesian method to estimate such population prevalence that offers several advantages over population mean NHST. This method provides a population-level inference that is currently missing from study designs with small participant numbers, such as in traditional psychophysics and in precision imaging. Bayesian prevalence delivers a quantitative population estimate with associated uncertainty instead of reducing an experiment to a binary inference. Bayesian prevalence is widely applicable to a broad range of studies in neuroscience, psychology, and neuroimaging. Its emphasis on detecting effects within individual participants can also help address replicability issues in these fields.
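A simplified sketch of the prevalence idea follows. It assumes, for illustration only, a uniform prior, a within-participant false-positive rate alpha, and perfect within-participant sensitivity, and it uses a grid approximation; the authors' method and toolbox handle the general case, so treat this as a didactic stand-in rather than their estimator.

```python
import numpy as np
from scipy.stats import binom

def prevalence_posterior(k, n, alpha=0.05, grid_size=2001):
    """Grid-approximate posterior for the population prevalence gamma of an
    effect, given k of n participants with a significant within-participant
    test.  Assumes a uniform prior, false-positive rate alpha, and (for
    simplicity) perfect sensitivity, so P(significant) = gamma + (1 - gamma) * alpha."""
    gamma = np.linspace(0.0, 1.0, grid_size)
    theta = gamma + (1.0 - gamma) * alpha   # probability a participant tests significant
    post = binom.pmf(k, n, theta)           # uniform prior: posterior proportional to likelihood
    dx = gamma[1] - gamma[0]
    post /= post.sum() * dx                 # normalise to a proper density
    return gamma, post

# Example: 12 of 20 participants show a within-participant effect at alpha = .05
gamma, post = prevalence_posterior(k=12, n=20)
dx = gamma[1] - gamma[0]
cdf = np.cumsum(post) * dx
map_estimate = gamma[np.argmax(post)]
lo, hi = gamma[np.searchsorted(cdf, 0.025)], gamma[np.searchsorted(cdf, 0.975)]
print(f"MAP prevalence = {map_estimate:.2f}, 95% credible interval ~ [{lo:.2f}, {hi:.2f}]")
```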


2021
Author(s): David Trafimow

In the debate about the merits or demerits of null hypothesis significance testing (NHST), authorities on both sides assume that the p value a researcher computes is based on the null hypothesis or test hypothesis. If that assumption were true, it would suggest that there are proper uses for NHST, such as distinguishing between competing directional hypotheses. And once it is admitted that there are proper uses for NHST, it makes sense to educate substantive researchers about how to use NHST properly and avoid using it improperly. From this perspective, the conclusion would be that researchers in the business and social sciences could benefit from better education pertaining to NHST. In contrast, my goal is to demonstrate that the p value a researcher computes is based not on a hypothesis, but on a model in which the hypothesis is embedded. In turn, the distinction between hypotheses and models indicates that NHST cannot soundly be used to distinguish between competing directional hypotheses, or to draw any conclusions about directional hypotheses whatsoever. Therefore, it is not clear that better education is likely to prove satisfactory. It is the temptation to draw conclusions that NHST cannot support, not the education issue, that deserves to be at the forefront of NHST discussions.
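The model-dependence of p can be shown directly: the same data and the same test hypothesis (equal means) yield different p values under different modelling assumptions. The data and choice of tests below are illustrative, not taken from the article.

```python
import numpy as np
from scipy import stats

# Same data, same test hypothesis, three different models: the p value is a
# property of the hypothesis embedded in a model, not of the hypothesis alone.
rng = np.random.default_rng(42)
a = rng.lognormal(0.0, 1.0, size=25)   # skewed data with unequal spread
b = rng.lognormal(0.3, 1.5, size=25)

p_student = stats.ttest_ind(a, b, equal_var=True).pvalue    # pooled-variance normal model
p_welch   = stats.ttest_ind(a, b, equal_var=False).pvalue   # unequal-variance normal model
p_mwu     = stats.mannwhitneyu(a, b).pvalue                 # rank-based model

print(f"Student t:      p = {p_student:.3f}")
print(f"Welch t:        p = {p_welch:.3f}")
print(f"Mann-Whitney U: p = {p_mwu:.3f}")
```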


Author(s): David McGiffin, Geoff Cumming, Paul Myles

Null hypothesis significance testing (NHST) and p-values are widespread in the cardiac surgical literature but are frequently misunderstood and misused. The purpose of this review is to discuss the major disadvantages of p-values and to suggest alternatives. We describe diagnostic tests, the prosecutor’s fallacy in the courtroom, and NHST, all of which involve inter-related conditional probabilities, to help clarify the meaning of p-values, and we discuss the enormous sampling variability, or unreliability, of p-values. Finally, we use a cardiac surgical database and simulations to explore further issues involving p-values. In clinical studies, p-values provide a poor summary of the observed treatment effect, whereas the three-number summary provided by effect estimates and confidence intervals is more informative and minimises over-interpretation of a “significant” result. P-values are an unreliable measure of strength of evidence; if used at all, they give, at best, a very rough guide to decision making. Researchers should adopt Open Science practices to improve the trustworthiness of research and, where possible, use estimation (three-number summaries) or other more informative techniques.
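A quick simulation of the kind the review describes (illustrative effect sizes and sample sizes, not the cardiac surgical database) shows both the sampling variability of p and the stability of a three-number summary.

```python
import numpy as np
from scipy import stats

# Repeatedly run the 'same' two-arm study: even with a genuine treatment effect,
# the p value bounces over orders of magnitude across replications, whereas the
# effect estimate with its 95% CI gives a stable, interpretable summary.
rng = np.random.default_rng(7)
p_values, estimates = [], []
for _ in range(1000):
    control = rng.normal(0.0, 1.0, size=50)
    treated = rng.normal(0.4, 1.0, size=50)          # true effect = 0.4 SD
    p_values.append(stats.ttest_ind(treated, control).pvalue)
    diff = treated.mean() - control.mean()
    se = np.sqrt(treated.var(ddof=1) / 50 + control.var(ddof=1) / 50)
    estimates.append((diff - 1.96 * se, diff, diff + 1.96 * se))  # three-number summary

p = np.array(p_values)
print(f"p < .05 in {np.mean(p < 0.05):.0%} of replications")
print(f"p ranges from {p.min():.2g} to {p.max():.2g}")
lo, mid, hi = estimates[0]
print(f"example three-number summary: {mid:.2f} [{lo:.2f}, {hi:.2f}]")
```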


Author(s): Riko Kelter

Hypothesis testing is a central statistical method in psychology and the cognitive sciences. The problems of null hypothesis significance testing (NHST) and p values have been debated widely, but few attractive alternatives exist. This article introduces an R package that implements the Full Bayesian Significance Test (FBST) for testing a sharp null hypothesis against its alternative via the e value. The statistical theory of the FBST was introduced more than two decades ago, and the FBST has since been shown to be a Bayesian alternative to NHST and p values with highly appealing theoretical and practical properties. The algorithm provided in the package is applicable to any Bayesian model as long as the posterior distribution can be obtained at least numerically. The core function of the package provides the Bayesian evidence against the null hypothesis, the e value. Additionally, p values based on asymptotic arguments can be computed, and rich visualizations for communication and interpretation of the results can be produced. Three examples of statistical procedures frequently used in the cognitive sciences are given in this paper, demonstrating how to apply the FBST in practice using the package. Based on the success of the FBST in statistical science, the package should be of interest to a broad range of researchers and will hopefully encourage researchers to consider the FBST as a possible alternative when testing a sharp null hypothesis.
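Since the package is not named in this listing, the sketch below only illustrates the e value itself for the simple case of a point null, a normal posterior, and a flat reference function; it is an assumption-laden stand-in, not the package's implementation.

```python
from scipy.stats import norm

def fbst_evalue_normal(post_mean, post_sd, null_value=0.0):
    """e value supporting the sharp null H0: mu = null_value when the posterior
    for mu is Normal(post_mean, post_sd) and the reference function is flat.
    The tangential set T contains all mu whose posterior density exceeds the
    density at the null value; the Bayesian evidence against H0 is the
    posterior mass of T, and ev(H0) = 1 - P(T)."""
    z = abs(post_mean - null_value) / post_sd
    evidence_against = 2 * norm.cdf(z) - 1   # posterior mass of the tangential set
    return 1 - evidence_against

# Example: the posterior for a standardized effect is Normal(0.25, 0.10);
# the small e value indicates little support for H0: mu = 0.
print(f"e value for H0: mu = 0 -> {fbst_evalue_normal(0.25, 0.10):.4f}")
```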

