region of practical equivalence
Recently Published Documents



2021 ◽  
Author(s):  
Guilherme D. Garcia ◽  
Natália Brambatti Guzzo

Categorical approaches to lexical stress typically assume that words have either regular or irregular stress, and imply that only the latter needs to be stored in the lexicon, while the former can be derived by rule. In this paper, we compare these two groups of words in a lexical decision task in Portuguese to examine whether the dichotomy in question affects lexical retrieval latencies in native speakers, which could indirectly reveal different processing patterns. Our results show no statistically credible effect of stress regularity on reaction times, even when lexical frequency, neighborhood density, and phonotactic probability are taken into consideration. The lack of an effect is consistent with a probabilistic approach to stress, not with a categorical (traditional) approach where syllables are either light or heavy and stress is either regular or irregular. We show that the posterior distribution of credible effect sizes of regularity is almost entirely (96.28%) within the region of practical equivalence, which provides strong evidence that no effect of regularity exists in the lexical decision data modelled. Frequency and phonotactic probability, in contrast, showed statistically credible effects given the experimental data modelled, which is consistent with the literature.
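
The "share of the posterior inside the ROPE" statistic reported above can be computed directly from posterior draws. The following is a minimal sketch, not the authors' code: the function name `rope_fraction`, the ROPE limits of ±0.1, and the simulated draws are all assumptions made for illustration, since the abstract reports only the resulting percentage.

```python
import random


def rope_fraction(samples, rope_low, rope_high):
    """Return the share of posterior draws that fall inside [rope_low, rope_high]."""
    if not samples:
        raise ValueError("samples must not be empty")
    inside = sum(1 for s in samples if rope_low <= s <= rope_high)
    return inside / len(samples)


# Hypothetical posterior draws for a standardized effect (NOT the study's data).
rng = random.Random(1)
draws = [rng.gauss(0.0, 0.04) for _ in range(10_000)]

# Hypothetical ROPE of +/-0.1 around zero.
share = rope_fraction(draws, -0.1, 0.1)
print(f"{share:.2%} of the posterior lies inside the ROPE")
```

A share close to 100% is read as evidence that the effect is practically equivalent to zero. How large the share must be before drawing that conclusion is a judgment the analyst has to justify.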


Author(s):  
Riko Kelter

Abstract Testing differences between a treatment and control group is common practice in biomedical research such as randomized controlled trials (RCTs). The standard two-sample t test relies on null hypothesis significance testing (NHST) via p values, which has several drawbacks. Bayesian alternatives were recently introduced using the Bayes factor, which has its own limitations. This paper introduces an alternative to current Bayesian two-sample t tests by interpreting the underlying model as a two-component Gaussian mixture in which the effect size is the quantity of interest, which is most relevant in clinical research. Unlike p values or the Bayes factor, the proposed method focusses on estimation under uncertainty instead of explicit hypothesis testing. Therefore, via a Gibbs sampler, the posterior of the effect size is produced, which is used subsequently for either estimation under uncertainty or explicit hypothesis testing based on the region of practical equivalence (ROPE). An illustrative example, theoretical results and a simulation study show the usefulness of the proposed method, and the test is made available in the R package . In sum, the new Bayesian two-sample t test provides a solution to the Behrens–Fisher problem based on Gaussian mixture modelling.


2021 ◽  
Vol 15 ◽  
Author(s):  
Ruslan Masharipov ◽  
Irina Knyazeva ◽  
Yaroslav Nikolaev ◽  
Alexander Korotkov ◽  
Michael Didur ◽  
...  

Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. This leads to a bias against the null hypothesis. Herein, we discuss statistical approaches to ‘null effect’ assessment, focusing on Bayesian parameter inference (BPI). Although Bayesian methods have been theoretically elaborated and implemented in common neuroimaging software packages, they are not widely used for ‘null effect’ assessment. BPI considers the posterior probability of finding the effect within or outside the region of practical equivalence to the null value. Using a single decision rule, it can identify both ‘activated/deactivated’ and ‘not activated’ voxels, or indicate that the obtained data are not sufficient. It also allows researchers to evaluate the data as the sample size increases and to stop the experiment once the obtained data are sufficient to make a confident inference. To demonstrate the advantages of using BPI for fMRI data group analysis, we compare it with classical null hypothesis significance testing on empirical data. We also use simulated data to show how BPI performs under different effect sizes, noise levels, noise distributions and sample sizes. Finally, we consider the problem of defining the region of practical equivalence for BPI and discuss possible applications of BPI in fMRI studies. To facilitate ‘null effect’ assessment for fMRI practitioners, we provide a Statistical Parametric Mapping 12-based toolbox for Bayesian inference.
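
The single decision rule described above can be sketched as a four-way classification of one voxel from posterior draws of its effect. This is an illustrative sketch only and not the toolbox's implementation: the function name `bpi_decision`, the symmetric ROPE of half-width `rope_half_width`, and the 0.95 probability threshold are assumptions.

```python
def bpi_decision(samples, rope_half_width, threshold=0.95):
    """Classify one voxel from posterior draws of its effect.

    The ROPE is assumed to be [-rope_half_width, +rope_half_width];
    the threshold is the posterior probability required for a decision.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    p_above = sum(s > rope_half_width for s in samples) / n
    p_below = sum(s < -rope_half_width for s in samples) / n
    p_inside = sum(-rope_half_width <= s <= rope_half_width for s in samples) / n

    if p_above >= threshold:
        return "activated"
    if p_below >= threshold:
        return "deactivated"
    if p_inside >= threshold:
        return "not activated"
    # None of the three regions holds enough posterior mass yet.
    return "insufficient data"
```

Because the last branch is an explicit outcome, the same rule can be re-evaluated as participants are added and data collection stopped once every voxel of interest leaves the "insufficient data" state.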


2021 ◽  
pp. 1-9
Author(s):  
Daniel Tough ◽  
Alan Batterham ◽  
Kirsti Loughran ◽  
Jonathan Robinson ◽  
John Dixon ◽  
...  

INTRODUCTION: More than one in three older adults (≥65 years) fall within a two-year period. Over one third of cancer diagnoses are among people aged ≥75 years. Falls research in the UK cancer population is limited and contradictory. The aim of this study was to explore the association between a cancer diagnosis and incidence of falls in older adults in England. METHODS: Data were extracted from the English Longitudinal Study of Ageing (an ongoing panel study) collected between 2002 and 2014, consisting of a representative cohort of older adults living in England. Baseline data were collected within two years of a cancer diagnosis. Falls data were extracted from the subsequent two-year period. The unexposed group included those with no chronic conditions. The fully adjusted logistic regression analysis model included age, sex, wealth, and education level as covariates. We defined odds ratios between 0.67 and 1.5 as the region of practical equivalence. RESULTS: A total of 139 people had a type of cancer (exposed group) (Breast = 18.7%, Colon, Rectum or Bowel = 14.4%, Melanoma or Skin = 7.2%, Lung = 4.3%, Somewhere else = 51.8%) (70.6 ± 7.1 years; 58.3% male), with 3,899 in the unexposed group (69.5 ± 7.3 years; 54.6% male). The fully adjusted odds ratio was 1.21 (95% CI: 0.81 to 1.82; P = 0.348). The probability of falling among the exposed group was 22.7% versus 19.5% for the unexposed group. CONCLUSION: The cancer and control groups were not statistically equivalent for falls incidence, and a meaningful positive association between cancer and falls cannot be ruled out. Further research is required to elucidate this relationship.
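
The conclusion follows from comparing the reported interval with the stated region of practical equivalence. The sketch below reproduces that comparison with the numbers given in the abstract. The function name and the three outcome labels are assumptions for illustration, not the authors' terminology.

```python
import math


def compare_interval_to_rope(ci_low, ci_high, rope_low, rope_high):
    """Compare an odds-ratio interval with a ROPE.

    Both are moved to the log scale, where the ROPE [0.67, 1.5] is roughly
    symmetric around 0. The log is monotone, so the classification is the
    same as on the raw odds-ratio scale.
    """
    lo, hi = math.log(ci_low), math.log(ci_high)
    r_lo, r_hi = math.log(rope_low), math.log(rope_high)
    if r_lo <= lo and hi <= r_hi:
        return "practically equivalent"   # interval entirely inside the ROPE
    if hi < r_lo or lo > r_hi:
        return "meaningfully different"   # interval entirely outside the ROPE
    return "undecided"                    # interval straddles a ROPE limit


# Values from the abstract: 95% CI 0.81 to 1.82, ROPE 0.67 to 1.5.
print(compare_interval_to_rope(0.81, 1.82, 0.67, 1.5))  # prints: undecided
```

The upper limit of 1.82 exceeds the ROPE's upper bound of 1.5, so equivalence cannot be claimed, while the lower limit of 0.81 lies inside the ROPE, so a meaningful association is not established either.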


2021 ◽  
Author(s):  
Niklas Johannes ◽  
Philipp K. Masur ◽  
Matti Vuorre ◽  
Andrew K Przybylski

The study of the relation between social media use and well-being is at a critical junction. Many researchers find small to no associations, yet policymakers and public stakeholders keep asking for more evidence. One way the field is reacting is by inspecting the variation around average relations – with the goal of describing individual social media users. Here, we argue that such an approach risks losing sight of the most important outcomes of a quantitative social science: estimates of the average relation in a large group. Our analysis begins by describing how the field got to this point. Then, we explain the problems of the current approach of studying variation. Next, we propose a principled approach to quantify, interpret, and explain variation in average relations: (1) conducting model comparisons, (2) defining a region of practical equivalence and testing the theoretical distribution of relations against that region, (3) defining a smallest effect size of interest and comparing it against the theoretical distribution. We close with recommendations to either study moderators as systematic factors that explain variation or to conduct N = 1 studies and qualitative research.
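
Step (2) above, testing a theoretical distribution of person-specific relations against a ROPE, can be illustrated with a short calculation. This sketch assumes the person-specific relations are normally distributed with a given mean and standard deviation (for example, a fixed slope and a random-slope SD from a multilevel model). That distributional assumption, the function name, and the example numbers are not taken from the abstract.

```python
from statistics import NormalDist


def share_within_rope(mean, sd, rope_low, rope_high):
    """Share of a normal distribution of relations lying inside the ROPE."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(rope_high) - dist.cdf(rope_low)


# Hypothetical values: average relation -0.05, between-person SD 0.08,
# and a ROPE of +/-0.10 on the same standardized scale.
share = share_within_rope(-0.05, 0.08, -0.10, 0.10)
print(f"{share:.1%} of person-specific relations fall inside the ROPE")
```

If most of the distribution lies inside the ROPE, the variation around the average relation is itself practically negligible, which supports focusing on the average.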


Author(s):  
Felipe Mattioni Maturana ◽  
Philipp Schellhorn ◽  
Gunnar Erz ◽  
Christof Burgstahler ◽  
Manuel Widmann ◽  
...  

Purpose: We investigated the individual cardiovascular response to 6 weeks (3×/week) of work-matched training within the severe-intensity domain (high-intensity interval training, HIIT) or the moderate-intensity domain (moderate-intensity continuous training, MICT). In addition, we analyzed the cardiovascular factors at baseline underlying the response variability. Methods: 42 healthy sedentary participants were randomly assigned to HIIT or MICT. We applied the region of practical equivalence method to identify the levels of responders for the maximal oxygen uptake (V̇O2max) response. To investigate the influence of cardiovascular markers, we trained a Bayesian machine learning model on these markers. Results: Although both HIIT and MICT induced significant increases in V̇O2max, HIIT produced greater improvements than MICT (p < 0.001). Greater variability was observed in MICT, with approximately 50% classified as “non-responder” or “undecided”. 20 “responders”, one “undecided” and no “non-responders” were observed in HIIT. The variability in the ∆V̇O2max was associated with initial cardiorespiratory fitness, arterial stiffness, left-ventricular (LV) mass and LV end-diastolic diameter in HIIT, whereas microvascular responsiveness and right-ventricular (RV) excursion velocity showed a significant association in MICT. Conclusion: Our findings highlight the critical influence of exercise-intensity domains and biological variability on the individual V̇O2max response. The incidence of “non-responders” in MICT was one third of the group, whereas no “non-responders” were observed in HIIT. The incidence of “responders” was 11 out of 21 participants in MICT and 20 out of 21 participants in HIIT. The response in HIIT showed associations with baseline fitness, arterial stiffness, and LV morphology, whereas in MICT it was associated with RV systolic function.


2021 ◽  
Author(s):  
Josue E. Rodriguez ◽  
Donald Ray Williams

We propose the Bayesian bootstrap (BB) as a generic, simple, and accessible method for sampling from the posterior distribution of various correlation coefficients that are commonly used in the social-behavioral sciences. In a series of examples, we demonstrate how the BB can be used to estimate Pearson's, Spearman's, Gaussian rank, Kendall's tau, and polychoric correlations. We also describe an approach based on a region of practical equivalence to evaluate differences and null associations among the estimated correlations. Key advantages of the proposed methods are illustrated using two psychological datasets. In addition, we have implemented the methodology in the R package BBcor.
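
The core idea of the Bayesian bootstrap for a Pearson correlation can be sketched in a few lines: draw observation weights from a flat Dirichlet distribution and recompute a weighted correlation for each draw. This is a minimal standard-library sketch and not the BBcor API. The function names and the simulated data are assumptions, and the ROPE of ±0.1 in the usage lines is hypothetical.

```python
import math
import random


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y under weights w (assumed to sum to 1)."""
    mx = sum(wi * xi for wi, xi in zip(w, x))
    my = sum(wi * yi for wi, yi in zip(w, y))
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def bayesian_bootstrap_pearson(x, y, n_draws=2000, seed=0):
    """Return posterior draws of the correlation via the Bayesian bootstrap."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_draws):
        # Normalized Exp(1) variates are a draw from Dirichlet(1, ..., 1).
        g = [rng.expovariate(1.0) for _ in range(n)]
        total = sum(g)
        w = [gi / total for gi in g]
        draws.append(weighted_pearson(x, y, w))
    return draws


# Simulated data for illustration only.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(50)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]
posterior = bayesian_bootstrap_pearson(x, y)

# Share of posterior draws inside a hypothetical ROPE of +/-0.1.
in_rope = sum(-0.1 <= r <= 0.1 for r in posterior) / len(posterior)
print(f"posterior mean r = {sum(posterior) / len(posterior):.2f}, in ROPE: {in_rope:.1%}")
```

The same weights can be reused with a different weighted statistic, which is what makes the approach generic across correlation types.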


2020 ◽  
Author(s):  
Maximilian Linde ◽  
Jorge Tendeiro ◽  
Ravi Selker ◽  
Eric-Jan Wagenmakers ◽  
Don van Ravenzwaaij

Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared to the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately 0.2 or 0.3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size.
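
Of the three approaches, the highest density interval and region of practical equivalence (HDI-ROPE) procedure is the simplest to sketch from posterior draws: find the narrowest interval holding the requested posterior mass and compare it with the ROPE. The code below is an illustrative sketch under assumptions. The function names, the 95% mass, and the outcome labels are not taken from the paper.

```python
import math


def hdi(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the sorted draws."""
    if not samples:
        raise ValueError("samples must not be empty")
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    # Slide a window of k consecutive draws and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]


def hdi_rope_decision(samples, rope_low, rope_high, mass=0.95):
    """Apply the HDI-ROPE rule to posterior draws of an effect size."""
    lo, hi = hdi(samples, mass)
    if rope_low <= lo and hi <= rope_high:
        return "accept equivalence"   # HDI entirely inside the ROPE
    if hi < rope_low or lo > rope_high:
        return "reject equivalence"   # HDI entirely outside the ROPE
    return "undecided"                # HDI overlaps a ROPE limit
```

With small samples the posterior is wide, so the HDI rarely fits inside a narrow ROPE and the rule tends to return "undecided". This is consistent with the limited discrimination at small sample sizes reported above.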


2020 ◽  
Vol 13 (1) ◽  
Author(s):  
Riko Kelter

Objectives: The data presented herein are the simulated datasets of a recently conducted larger study that investigated the behaviour of Bayesian indices of significance and effect size as alternatives to traditional p-values. The study considered the setting of Student’s and Welch’s two-sample t-tests, which are often used in medical research. It investigated the influence of the sample size, noise, and the selected prior hyperparameters, as well as the sensitivity to type I errors. The posterior indices used included the Bayes factor, the region of practical equivalence, the probability of direction, the MAP-based p-value and the e-value in the Full Bayesian Significance Test. The simulation study was conducted in the statistical programming language R. Data description: The R script files for simulation of the datasets used in the study are presented in this article. These script files can both simulate the raw datasets and run the analyses. As researchers may face effect sizes, noise levels or priors in their domain that differ from those studied in the original paper, the scripts extend the original results by allowing researchers to recreate all analyses of interest in different contexts. Therefore, they should be relevant to other researchers.
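
Among the indices listed, the probability of direction is the most direct to compute from posterior draws: it is the larger of the posterior shares above and below zero. The sketch below is in Python rather than the R used by the study, and the function name is an assumption. It is not taken from the published scripts.

```python
def probability_of_direction(samples):
    """Share of posterior draws that have the same sign as the majority."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    p_positive = sum(s > 0 for s in samples) / n
    p_negative = sum(s < 0 for s in samples) / n
    return max(p_positive, p_negative)
```

Unlike the ROPE share, this index says nothing about whether the effect is large enough to matter. It only quantifies how certain the sign is.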


2018 ◽  
Vol 10 (4) ◽  
pp. 411-415 ◽  
Author(s):  
Jeffrey N. Siegelman ◽  
Michelle Lall ◽  
Lindsay Lee ◽  
Tim P. Moran ◽  
Joshua Wallenstein ◽  
...  

Background: Gender-related disparities persist in medicine and medical education. Prior work has found differences in medical education assessments based on gender. Objective: We hypothesized that gender bias would be mitigated in a simulation-based assessment. Methods: We conducted a retrospective cohort study of emergency medicine residents at a single, urban residency program. Beginning in spring 2013, residents participated in mandatory individual simulation assessments. Twelve simulated cases were included in this study. Rating forms mapped milestone language to specific observable behaviors. A Bayesian regression was used to evaluate the effect of resident and rater gender on assessment scores. Both 95% credible intervals (CrIs) and a Region of Practical Equivalence approach were used to evaluate the results. Results: Participants included 48 faculty raters (25 men [52%]) and 102 residents (47 men [46%]). The differences in scores between male and female residents (M = −0.58, 95% CrI −3.31 to 2.11) and between male and female raters (M = 2.87, 95% CrI −0.43 to 6.30) were small, and both 95% CrIs overlapped with 0. The 95% CrI for the interaction between resident and rater gender also overlapped with 0 (M = 0.41, 95% CrI −3.71 to 4.23). Conclusions: In a scripted and controlled system of assessments, there were no differences in scores due to resident or rater gender.

