Model selection versus traditional hypothesis testing in circular statistics: a simulation study

Biology Open ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. bio049866
Author(s):  
Lukas Landler ◽  
Graeme D. Ruxton ◽  
E. Pascal Malkemper

Abstract. Many studies in biology involve data measured on a circular scale. Such data require different statistical treatment from data measured on linear scales. The most common statistical exploration of circular data involves testing the null hypothesis that the data show no aggregation and are instead uniformly distributed around the whole circle. The most common means of performing this type of investigation is the Rayleigh test. An alternative is to compare the fit of the uniform-distribution model against alternative models. Such model-fitting approaches have become a standard technique with linear data, and their wider application to circular data has recently been advocated. Here we present simulation data demonstrating that such model-based inference can offer performance very similar to the best traditional tests, but only if an adjustment is made to control the Type I error rate.
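For readers who want to experiment, here is a minimal sketch of the Rayleigh test in Python (an illustrative implementation, not the authors' simulation code). It computes the mean resultant length, the statistic Z = nR², and the standard small-sample p-value approximation; angles are assumed to be in radians.

```python
import numpy as np

def rayleigh_test(angles_rad):
    """Rayleigh test for circular uniformity.

    H0: the angles are uniformly distributed around the circle.
    Returns the mean resultant length R-bar and an approximate p-value.
    """
    n = len(angles_rad)
    # Mean resultant vector from the sample's cosine/sine components.
    C = np.mean(np.cos(angles_rad))
    S = np.mean(np.sin(angles_rad))
    r_bar = np.hypot(C, S)          # mean resultant length in [0, 1]
    Z = n * r_bar**2                # Rayleigh statistic
    # Small-sample corrected p-value approximation (as given in Zar's
    # Biostatistical Analysis).
    p = np.exp(-Z) * (1 + (2 * Z - Z**2) / (4 * n)
                      - (24 * Z - 132 * Z**2 + 76 * Z**3 - 9 * Z**4)
                      / (288 * n**2))
    return r_bar, min(max(p, 0.0), 1.0)

# Example: a sample concentrated around 0 rad should reject uniformity.
rng = np.random.default_rng(1)
sample = rng.vonmises(mu=0.0, kappa=2.0, size=40)
print(rayleigh_test(sample))
```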

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Lukas Landler ◽  
Graeme D. Ruxton ◽  
E. Pascal Malkemper

Abstract. Many biological variables are recorded on a circular scale and therefore need different statistical treatment. A common question asked of such circular data involves comparison between two groups: are the populations from which the two samples are drawn distributed differently around the circle? We compared 18 tests for such situations (by simulation) in terms of both ability to control the Type I error rate near the nominal value and statistical power. We found that only eight tests offered good control of Type I error in all our simulated situations. Of these eight, we identified Watson's U² test and a MANOVA approach, based on trigonometric functions of the data, as offering the best power in the overwhelming majority of our test circumstances. There was often little to choose between these tests in terms of power, and no situation where any of the remaining six tests offered substantially better power than either of these. Hence, we recommend the routine use of either Watson's U² test or the MANOVA approach when comparing two samples of circular data.


2021 ◽  
Author(s):  
Lukas Landler ◽  
Graeme D Ruxton ◽  
Erich Pascal Malkemper

Many biological variables, often involving timings of events or directions, are recorded on a circular rather than a linear scale, and need different statistical treatment for that reason. A common question asked of such circular data involves comparison between two groups or treatments: are the populations from which the two samples are drawn distributed differently around the circle? For example, we might ask whether the distribution of directions from which a stalking predator approaches its prey differs between sunny and cloudy conditions, or whether the time of day of mating attempts differs between lab mice subjected to one of two hormone treatments. An array of statistical approaches to these questions has been developed. We compared 18 of these (by simulation) in terms of both ability to control the Type I error rate near the nominal value and statistical power. We found that only eight tests offered good control of Type I error in all our test situations. Of these eight, we identified Watson's U² test and a MANOVA based on trigonometric functions of the data as offering the best power in the overwhelming majority of our test circumstances. There was often little to choose between these tests in terms of power, and no situation where any of the remaining six tests offered substantially better power than either of these. Hence, we recommend the routine use of either Watson's U² test or MANOVA when comparing two samples of circular data.
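The MANOVA approach mentioned here embeds each angle as its cosine and sine and compares the groups on that bivariate representation; with two groups, a one-way MANOVA reduces to Hotelling's T². The sketch below is an illustrative reconstruction under that reading (the abstract does not give the exact variant used):

```python
import numpy as np
from scipy import stats

def circular_manova_two_sample(a, b):
    """Compare two circular samples (in radians) via Hotelling's T^2 on
    the embedding (cos(theta), sin(theta)); with two groups this is
    equivalent to a one-way MANOVA on the trigonometric scores."""
    X = np.column_stack([np.cos(a), np.sin(a)])
    Y = np.column_stack([np.cos(b), np.sin(b)])
    n1, n2, p = len(X), len(Y), 2
    d = X.mean(axis=0) - Y.mean(axis=0)
    # Pooled covariance matrix of the (cos, sin) scores.
    S = ((n1 - 1) * np.cov(X, rowvar=False)
         + (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)
    T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S, d)
    # Exact F transformation of Hotelling's T^2.
    F = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
    return F, stats.f.sf(F, p, n1 + n2 - p - 1)

rng = np.random.default_rng(2)
g1 = rng.vonmises(0.0, 1.5, size=30)         # concentrated near 0
g2 = rng.vonmises(np.pi / 2, 1.5, size=30)   # concentrated near pi/2
print(circular_manova_two_sample(g1, g2))
```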


2019 ◽  
Vol 17 (1) ◽  
pp. e0701
Author(s):  
Renhe Zhang ◽  
Xiyuan Hu

Abstract. The empirical best linear unbiased prediction (eBLUP) is usually based on the assumption that the residual error variance (REV) is homogeneous. This may be unrealistic, and it therefore limits the accuracy of genotype evaluations for multi-location trials, where the REV often varies across locations. The objective of this contribution was to investigate the direct implications of the eBLUP under different assumptions about the REV, based on the mixed model for evaluation of genotype simple effects (i.e. genotype effects at individual locations). A series of 14 multi-location trials from a rapeseed-breeding program in the north of China, each using a randomized complete block design at every location, were analyzed simultaneously across 2012 to 2014. The results showed that the model with heterogeneous REV was more appropriate than the one with homogeneous REV in all of the trials according to model-fitting statistics. Whether the REV differences across locations were accounted for in the analysis influenced the variance estimates of the related random effects and the testing of the variance of genotype-location (G-L) interactions. Ignoring REV differences when using the eBLUP could result not only in inflated or deflated Type I error rates for pairwise testing but also in an inaccurate ranking of genotype simple effects in these trials. We therefore suggest that when the eBLUP is applied to evaluate genotype simple effects in multi-location trials, the heterogeneity of REV should be accounted for using mixed-model approaches with an appropriate variance-covariance structure.
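The Type I error consequence of ignoring variance heterogeneity can be illustrated without the full mixed-model machinery. The toy simulation below (an analogy at the two-sample level, not the paper's analysis) compares a pooled-variance t-test against Welch's test when two hypothetical "locations" differ strongly in both error variance and replication:

```python
import numpy as np
from scipy import stats

# When the small sample comes from the high-variance location, the pooled
# test's Type I error inflates above the nominal 5%; swapping the
# replication pattern deflates it instead. Welch's test, which models the
# heterogeneity, stays near nominal in both cases.
rng = np.random.default_rng(3)
n_sim, alpha = 20_000, 0.05
rej_pooled = rej_welch = 0
for _ in range(n_sim):
    a = rng.normal(0.0, 3.0, size=8)    # high-variance location, few replicates
    b = rng.normal(0.0, 1.0, size=24)   # low-variance location, many replicates
    rej_pooled += stats.ttest_ind(a, b, equal_var=True).pvalue < alpha
    rej_welch += stats.ttest_ind(a, b, equal_var=False).pvalue < alpha
print(f"pooled: {rej_pooled / n_sim:.3f}, welch: {rej_welch / n_sim:.3f}")
```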


2018 ◽  
Vol 79 (2) ◽  
pp. 385-398
Author(s):  
Rudolf Debelak ◽  
Carolin Strobl

M-fluctuation tests are a recently proposed method for detecting differential item functioning in Rasch models. This article discusses a generalization of this method to two additional item response theory models: the two-parameter logistic model and the three-parameter logistic model with a common guessing parameter. The Type I error rate and the power of this method were evaluated in a variety of simulation studies. The results suggest that the new method allows the detection of various forms of differential item functioning in these models, including differential discrimination and differential guessing effects. It is also robust against moderate violations of several assumptions made in item parameter estimation.
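For reference, the item response functions of the models involved are simple logistic curves. The sketch below (with illustrative, assumed parameter values) shows how differential discrimination appears as different curves for a reference and a focal group:

```python
import numpy as np

def irf_3pl(theta, a, b, c=0.0):
    """Item response function of the three-parameter logistic (3PL) model:
    P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b))).
    With c = 0 this reduces to the 2PL; fixing a as well gives the Rasch model.
    """
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Differential item functioning means these curves differ between groups,
# e.g. differential discrimination (a) or differential guessing (c).
theta = np.linspace(-3, 3, 7)
print(irf_3pl(theta, a=1.2, b=0.0, c=0.2))   # reference group
print(irf_3pl(theta, a=0.7, b=0.0, c=0.2))   # focal group: weaker discrimination
```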


2000 ◽  
Vol 14 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Joni Kettunen ◽  
Niklas Ravaja ◽  
Liisa Keltikangas-Järvinen

Abstract. We examined the use of smoothing to enhance the detection of response coupling from the activity of different response systems. Three different types of moving-average smoothers were applied to both simulated interbeat interval (IBI) and electrodermal activity (EDA) time series and to empirical IBI, EDA, and facial electromyography time series. The results indicated that progressive smoothing increased the efficiency of the detection of response coupling but did not increase the probability of Type I error. The power of the smoothing methods depended on the response characteristics. The benefits and use of the smoothing methods to extract information from psychophysiological time series are discussed.
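A minimal sketch of one such smoother (the abstract does not specify the three variants, so this is a generic centered moving average with an assumed window size):

```python
import numpy as np

def moving_average(x, window):
    """Centered moving-average smoother (one of several moving-average
    variants; the paper compares three types). Edge padding keeps the
    output the same length as the input."""
    if window % 2 == 0:
        raise ValueError("use an odd window so the smoother is centered")
    kernel = np.ones(window) / window
    padded = np.pad(x, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Example: smooth a noisy simulated interbeat-interval-like series.
rng = np.random.default_rng(4)
t = np.linspace(0, 10, 200)
ibi = 0.8 + 0.05 * np.sin(2 * np.pi * 0.1 * t) + rng.normal(0, 0.03, t.size)
smoothed = moving_average(ibi, window=9)
```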


Methodology ◽  
2012 ◽  
Vol 8 (1) ◽  
pp. 23-38 ◽  
Author(s):  
Manuel C. Voelkle ◽  
Patrick E. McKnight

The use of latent curve models (LCMs) has increased almost exponentially during the last decade. Oftentimes, researchers regard the LCM as a "new" method for analyzing change, with little attention paid to the fact that the technique was originally introduced as an "alternative to standard repeated measures ANOVA and first-order auto-regressive methods" (Meredith & Tisak, 1990, p. 107). In the first part of the paper, this close relationship is reviewed, and it is demonstrated how "traditional" methods, such as repeated measures ANOVA and MANOVA, can be formulated as LCMs. Given that latent curve modeling is essentially a large-sample technique, compared to "traditional" finite-sample approaches, the second part of the paper uses a Monte Carlo simulation to address the question of the degree to which the more flexible LCMs can actually replace some of the older tests. In addition, a structural equation modeling alternative to Mauchly's (1940) test of sphericity is explored. Although "traditional" methods may be expressed as special cases of more general LCMs, we found that the equivalence holds only asymptotically. For practical purposes, no approach always outperformed the alternatives in terms of power and Type I error, so the best method depends on the situation. We provide detailed recommendations on when to use which method.
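As an illustration of the "traditional" side of this comparison, the sketch below simulates simple repeated-measures data and fits a repeated measures ANOVA with statsmodels; the corresponding LCM would be fit with structural equation modeling software. This is illustrative scaffolding, not the authors' simulation design:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulate a small repeated-measures data set: 40 subjects measured at
# 4 time points, with a random intercept per subject plus a linear trend.
rng = np.random.default_rng(5)
n_subj, n_time = 40, 4
subj_intercept = rng.normal(0, 1, n_subj)
rows = []
for i in range(n_subj):
    for t in range(n_time):
        y = subj_intercept[i] + 0.3 * t + rng.normal(0, 1)
        rows.append({"subject": i, "time": f"t{t}", "y": y})
df = pd.DataFrame(rows)

# Traditional repeated-measures ANOVA on the within-subject factor "time".
print(AnovaRM(df, depvar="y", subject="subject", within=["time"]).fit())
```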


Methodology ◽  
2015 ◽  
Vol 11 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Jochen Ranger ◽  
Jörg-Tobias Kuhn

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982). This test can be interpreted as a test of trait stability during the measurement situation. The test statistic approximately follows a χ²-distribution; in small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study, which suggests that the test adheres well to the nominal Type I error rate, although it tends to be conservative for very short scales. The power of the test is compared to that of four alternative tests of person fit; this comparison corroborates that the power of the information matrix test is similar to the power of the alternatives. Advantages and areas of application of the information matrix test are discussed.
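The principle behind the information matrix test is that, under a correctly specified model, the outer-product-of-scores and negative-Hessian estimates of the Fisher information coincide, and the test checks whether their difference is zero. The sketch below illustrates this in a deliberately simple setting (an exponential model, not the IRT person-fit setting of the paper), using the n·R² auxiliary-regression form of the test (Lancaster, 1984):

```python
import numpy as np
from scipy import stats

def info_matrix_test_exponential(x):
    """White's (1982) information matrix test for an exponential model
    with rate lambda. The indicator d_i = s_i^2 + h_i should have mean
    zero under correct specification (s_i: score, h_i: per-observation
    Hessian, here -1/lambda^2)."""
    n = len(x)
    lam = 1.0 / np.mean(x)              # MLE of the rate parameter
    score = 1.0 / lam - x               # per-observation score
    d = score**2 - 1.0 / lam**2         # indicator: s_i^2 + hessian_i
    # Auxiliary regression of a column of ones on (score, indicator);
    # the statistic is n times the uncentered R^2, chi^2 with 1 df here.
    X = np.column_stack([score, d])
    beta, *_ = np.linalg.lstsq(X, np.ones(n), rcond=None)
    resid = np.ones(n) - X @ beta
    stat = n * (1.0 - resid @ resid / n)
    return stat, stats.chi2.sf(stat, df=1)

rng = np.random.default_rng(6)
print(info_matrix_test_exponential(rng.exponential(2.0, size=500)))     # well specified
print(info_matrix_test_exponential(rng.lognormal(0.0, 1.0, size=500)))  # misspecified
```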


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
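As one concrete example of this family of detection tools (the abstract does not list the six methods evaluated, so this may or may not be among them), Egger's regression test checks for funnel-plot asymmetry by regressing standardized effects on precision:

```python
import numpy as np
import statsmodels.api as sm

def egger_test(effects, std_errors):
    """Egger's regression test for small-study effects. Regresses the
    standardized effect (effect / SE) on precision (1 / SE); an intercept
    far from zero signals funnel-plot asymmetry, one symptom of
    publication bias."""
    z = np.asarray(effects) / np.asarray(std_errors)
    precision = 1.0 / np.asarray(std_errors)
    X = sm.add_constant(precision)
    fit = sm.OLS(z, X).fit()
    # Index 0 is the intercept term added by add_constant.
    return fit.params[0], fit.pvalues[0]

# Toy meta-analysis: 30 studies with a true effect of 0.2, where observed
# effects of small studies are inflated to mimic selective publication.
rng = np.random.default_rng(7)
se = rng.uniform(0.05, 0.5, size=30)
effect = 0.2 + rng.normal(0, se) + 0.8 * se   # bias proportional to SE
print(egger_test(effect, se))
```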

