Lady Justice Versus Cult of Statistical Significance

Author(s):  
Stephen T. Ziliak ◽  
Deirdre McCloskey

Economics and other sciences use null hypothesis statistical significance testing without a loss function and avoid asking “how big is a big loss or gain?” Statistical significance is not equivalent to economic significance; the mistake is evident when one reflects that the estimated payoff from a lottery is not the same as the odds of winning that lottery. Yet a widespread failure to distinguish between an estimate of human consequence and an estimate of its probability (between the meaning of an estimated average and the random variance around it) is killing people in medicine and impoverishing people in economics. The ethical problem created by a test of statistical significance is made worse by the method’s blatant illogic at the foundational level, a fact unacknowledged by most of those depending on it. Several changes to the literature and a recent Supreme Court decision could help.
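The lottery comparison in the abstract above can be made concrete with a small sketch. The two lotteries, their probabilities, prizes, and ticket prices are hypothetical numbers chosen only for illustration; they do not come from the article. The point is simply that ranking by the odds of winning and ranking by expected payoff can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lottery:
    """A hypothetical lottery; all values are illustrative assumptions."""
    name: str
    win_probability: float  # odds of winning, between 0 and 1
    prize: float            # amount paid out on a win
    ticket_price: float     # cost of entering

    def expected_payoff(self) -> float:
        # Expected net gain per ticket: probability-weighted prize minus cost.
        return self.win_probability * self.prize - self.ticket_price


lotteries = [
    Lottery("A", win_probability=0.5, prize=4.0, ticket_price=1.0),
    Lottery("B", win_probability=0.001, prize=10_000.0, ticket_price=1.0),
]

# Ranking by the odds of winning alone.
best_odds = max(lotteries, key=lambda lot: lot.win_probability)
# Ranking by the size of the expected consequence.
best_payoff = max(lotteries, key=lambda lot: lot.expected_payoff())

for lot in lotteries:
    print(f"{lot.name}: P(win)={lot.win_probability}, "
          f"expected payoff={lot.expected_payoff():.2f}")
print("Best odds:", best_odds.name)
print("Best expected payoff:", best_payoff.name)
```

Under these assumed numbers, lottery A is far more likely to pay out, yet lottery B has the larger expected payoff, so a question about probability and a question about magnitude give different answers.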

2000 ◽  
Vol 23 (2) ◽  
pp. 292-293 ◽  
Author(s):  
Brian D. Haig

Chow's endorsement of a limited role for null hypothesis significance testing is a needed corrective of research malpractice, but his decision to place this procedure in a hypothetico-deductive framework of Popperian cast is unwise. Various failures of this version of the hypothetico-deductive method have negative implications for Chow's treatment of significance testing, meta-analysis, and theory evaluation.


2019 ◽  
Vol 51 (2) ◽  
pp. 274-278 ◽  
Author(s):  
Arjen van Witteloostuijn

In this commentary, I argue why we should stop engaging in null hypothesis statistical significance testing altogether. Artificial and misleading it may be, but we know how to play the p value threshold and null hypothesis-testing game. We feel secure; we love the certainty. The fly in the ointment is that the conventions have led to questionable research practices. Wasserstein, Schirm, & Lazar (Am Stat 73(sup1):1–19, 2019. doi:10.1080/00031305.2019.1583913) explain why, in their thought-provoking editorial introducing a special issue of The American Statistician: “As ‘statistical significance’ is used less, statistical thinking will be used more.” Perhaps we empirical researchers can together find a way to work ourselves out of the straitjacket that binds us.


2019 ◽  
Vol 15 (2) ◽  
pp. 321-346 ◽  
Author(s):  
Alexander Koplenig

In the first volume of Corpus Linguistics and Linguistic Theory, Gries (2005. Null-hypothesis significance testing of word frequencies: A follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1(2). doi:10.1515/cllt.2005.1.2.277. http://www.degruyter.com/view/j/cllt.2005.1.issue-2/cllt.2005.1.2.277/cllt.2005.1.2.277.xml: 285) asked whether corpus linguists should abandon null-hypothesis significance testing. In this paper, I want to revive this discussion by defending the argument that the assumptions that allow inferences about a given population (in this case, the studied languages) based on results observed in a sample (in this case, a collection of naturally occurring language data) are not fulfilled. As a consequence, corpus linguists should indeed abandon null-hypothesis significance testing.


2018 ◽  
Author(s):  
Fiona Fidler

Compelling criticisms of statistical significance testing (or Null Hypothesis Significance Testing, NHST) can be found in virtually all areas of the social and life sciences, including economics, sociology, ecology, biology, education and psychology. Because it is the overwhelmingly dominant statistical method in these sciences, criticisms need to be taken seriously. Yet, after half a century of cogent arguments against NHST and calls to adopt alternative practices, some disciplines show little sign of change. One obvious question is ‘why?’ Why are researchers so unwilling to abandon this flawed practice? In this thesis I attempt to answer this question, and compare practice across scientific disciplines.


2021 ◽  
pp. 204589402110249
Author(s):  
David D Ivy ◽  
Damien Bonnet ◽  
Rolf MF Berger ◽  
Gisela Meyer ◽  
Simin Baygani ◽  
...  

Objective: This study evaluated the efficacy and safety of tadalafil in pediatric patients with pulmonary arterial hypertension (PAH). Methods: This phase-3, international, randomized, multicenter (24-week double-blind, placebo-controlled period; 2-year open-label extension period), add-on (to the patient’s current endothelin receptor antagonist therapy) study included pediatric patients aged <18 years with PAH. Patients received tadalafil 20 mg or 40 mg based on their weight (Heavy-weight: ≥40 kg; Middle-weight: ≥25 to <40 kg) or placebo orally QD for 24 weeks. The primary endpoint was change from baseline in 6-minute walk (6MW) distance in patients aged ≥6 years at Week 24. The sample size was amended from 134 to ≥34 patients because of serious recruitment challenges. Therefore, statistical significance testing was not performed between treatment groups. Results: Patient demographics and baseline characteristics (N=35; tadalafil=17; placebo=18) were comparable between treatment groups; median age was 14.2 years (6.2 to 17.9 years) and the majority (71.4%, n=25) of patients were in the heavy-weight cohort. The least squares mean (SE) change from baseline in 6MW distance at Week 24 was numerically greater with tadalafil versus placebo (60.48 [20.41] vs 36.60 [20.78] meters; placebo-adjusted mean difference [SD] 23.88 [29.11]). Safety of tadalafil treatment was as expected without any new safety concerns. During study period 1, two patients (1 in each group) discontinued due to investigator-reported clinical worsening, and no deaths were reported. Conclusions: Statistical significance testing was not performed between the treatment groups because of the small sample size; however, the study results show a positive trend toward improvement in non-invasive measurements commonly used by clinicians to evaluate disease status in children with PAH. Safety of tadalafil treatment was as expected without any new safety signals.

