A SMOOTH TEST FOR THE EQUALITY OF DISTRIBUTIONS

2012 ◽  
Vol 29 (2) ◽  
pp. 419-446 ◽  
Author(s):  
Anil K. Bera ◽  
Aurobindo Ghosh ◽  
Zhijie Xiao

The two-sample version of the celebrated Pearson goodness-of-fit problem has been a topic of extensive research, and several tests, such as the Kolmogorov-Smirnov and Cramér-von Mises tests, have been suggested. Although these tests perform fairly well as omnibus tests for comparing two probability density functions (PDFs), they may have poor power against specific departures such as those in location, scale, skewness, and kurtosis. We propose a new test for the equality of two PDFs based on a modified version of the Neyman smooth test using empirical distribution functions, minimizing size distortion in finite samples. The suggested test can detect specific directions of departure from the null hypothesis; in particular, it can identify deviations in the directions of mean, variance, skewness, or tail behavior. In a finite sample, the actual probability of type I error depends on the relative sizes of the two samples. We propose two different approaches to deal with this problem and show that, under appropriate conditions, the proposed tests are asymptotically distributed as chi-squared. We also study the finite-sample size and power properties of our proposed test. As an application of our procedure, we compare the age distributions of employees of small employers with group insurance in New York and Pennsylvania before and after the enactment of the “community rating” legislation in New York. Conventional wisdom holds that if community rating is enforced (so that the group health insurance premium does not depend on age or any other physical characteristics of the insured), the insurance market will collapse, since only older or less healthy individuals would prefer group insurance. We find significant changes in the age distribution of the New York population, owing mainly to a shift in location and scale.
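To make the idea concrete, the following is a minimal Python sketch of a smooth-type two-sample test: one sample is transformed through the empirical distribution function (EDF) of the other, and the transformed values are tested for uniformity with the first k normalized Legendre polynomials. It illustrates the general construction only; the authors' modified statistic and their size corrections for unequal sample sizes are not reproduced here.

```python
# A minimal sketch of a two-sample smooth-type test: under H0, the PIT of
# one sample through the other sample's EDF is approximately uniform, so a
# Neyman smooth statistic built from normalized Legendre polynomials can
# be used. Illustration only, not the paper's exact statistic.
import numpy as np
from scipy.special import eval_legendre
from scipy.stats import chi2

def smooth_two_sample(x, y, k=4):
    """Smooth-type statistic for H0: X and Y share one distribution."""
    n = len(x)
    # Probability integral transform of x through the EDF of y.
    u = np.searchsorted(np.sort(y), x, side="right") / (len(y) + 1.0)
    # Normalized Legendre polynomials on [0,1]: pi_j(u) = sqrt(2j+1) P_j(2u-1).
    stat = 0.0
    for j in range(1, k + 1):
        component = np.sqrt(2 * j + 1) * eval_legendre(j, 2 * u - 1)
        stat += component.sum() ** 2 / n
    pval = chi2.sf(stat, df=k)  # asymptotically chi-squared with k df under H0
    return stat, pval

rng = np.random.default_rng(0)
print(smooth_two_sample(rng.normal(0, 1, 300), rng.normal(0.3, 1, 300)))
```

In this construction, the individual components point to the direction of departure, with the first component sensitive to location, the second to scale, and so on, mirroring the directional diagnostics described in the abstract.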

2019 ◽  
Vol 8 (4) ◽  
pp. 8539-8546

Rain is a major component of the water cycle and deposits most of the fresh water on the earth. Determining the frequency of occurrence of extreme hydrological events is a prerequisite for the planning and execution of many water resource projects. A comprehensive statistical analysis of annual, monthly, and seasonal rainfall for Warangal District, Telangana was performed using rainfall data for 40 years (1962-2001). The investigation was conducted with the aim of determining the type of probability distribution that best fits the rainfall data of that area. Plotting-position and probabilistic methods were used to fit probability distribution functions to the rainfall data, and rainfall magnitudes were evaluated for different return periods. The rainfall pattern of the area was also studied with the help of the standard deviation and coefficient of variation. The differences among results obtained from the plotting-position methods were found to be insignificant. The chi-square test was used to measure goodness of fit for the seasonal and monthly rainfall. Gumbel's (extreme value type I) method and the normal method were found to fit the rainfall data of this region best. A detailed study was conducted on the crop planning of this region. Rainfall is decreasing gradually due to urbanization and global climatic change, with a consequent decrease in crop productivity. Despite the growth in the percentage of gross irrigated area relative to rain-fed farming, farmers are still rainfall dependent. Crop planning is done with the average effective rainfall of Warangal. An economic analysis was carried out for the crops cultivated in this region; according to the rates available in the Warangal market, farmers obtain a 23% increase in yield.
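As a concrete illustration of the Gumbel step, here is a short Python sketch using method-of-moments parameter estimates and the Weibull plotting position. The annual-maximum series is made up; the paper's 1962-2001 Warangal data are not reproduced here.

```python
# Gumbel (extreme value type I) frequency analysis on a hypothetical
# annual-maximum rainfall series, with the Weibull plotting position.
import numpy as np

rainfall = np.array([812.0, 950.3, 701.5, 1104.2, 876.9, 990.1,
                     745.8, 1032.6, 889.4, 801.7])  # hypothetical, mm

# Method-of-moments estimates for the Gumbel distribution.
beta = np.sqrt(6.0) * rainfall.std(ddof=1) / np.pi   # scale
mu = rainfall.mean() - 0.5772 * beta                 # location

# Rainfall magnitude for return period T: x_T = mu - beta * ln(-ln(1 - 1/T)).
for T in (2, 5, 10, 25, 50, 100):
    x_T = mu - beta * np.log(-np.log(1.0 - 1.0 / T))
    print(f"T = {T:3d} yr: x_T = {x_T:7.1f} mm")

# Weibull plotting position: exceedance probability P = m / (n + 1).
ranked = np.sort(rainfall)[::-1]
P = np.arange(1, len(ranked) + 1) / (len(ranked) + 1.0)
print(np.column_stack([ranked, P, 1.0 / P]))  # magnitude, P, return period
```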


Methodology ◽  
2012 ◽  
Vol 8 (1) ◽  
pp. 23-38 ◽  
Author(s):  
Manuel C. Voelkle ◽  
Patrick E. McKnight

The use of latent curve models (LCMs) has increased almost exponentially during the last decade. Oftentimes, researchers regard the LCM as a “new” method for analyzing change, with little attention paid to the fact that the technique was originally introduced as an “alternative to standard repeated measures ANOVA and first-order auto-regressive methods” (Meredith & Tisak, 1990, p. 107). In the first part of the paper, this close relationship is reviewed, and it is demonstrated how “traditional” methods, such as repeated measures ANOVA and MANOVA, can be formulated as LCMs. Given that latent curve modeling is essentially a large-sample technique, compared with “traditional” finite-sample approaches, the second part of the paper uses a Monte Carlo simulation to address the question of the degree to which the more flexible LCMs can actually replace some of the older tests. In addition, a structural equation modeling alternative to Mauchly’s (1940) test of sphericity is explored. Although “traditional” methods may be expressed as special cases of more general LCMs, we found that the equivalence holds only asymptotically. For practical purposes, however, no approach always outperformed the alternatives in terms of power and type I error, so the best method depends on the situation. We provide detailed recommendations on when to use which method.
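To reproduce the flavor of such a comparison, the sketch below simulates balanced repeated-measures data and fits the “traditional” repeated measures ANOVA arm with Python's statsmodels; the corresponding LCM would be specified in SEM software (e.g., lavaan or OpenMx), and the sample size, number of occasions, and effect sizes chosen here are arbitrary.

```python
# One Monte Carlo draw of the "traditional" arm: balanced repeated-measures
# data (random intercept + common linear trend) fit with repeated measures
# ANOVA via statsmodels' AnovaRM.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
n_subjects, n_waves = 50, 4
subject_effect = rng.normal(0, 1, n_subjects)   # random intercept
growth = 0.25 * np.arange(n_waves)              # common linear trend

rows = []
for i in range(n_subjects):
    for t in range(n_waves):
        y = subject_effect[i] + growth[t] + rng.normal(0, 1)
        rows.append({"id": i, "time": t, "y": y})
df = pd.DataFrame(rows)

res = AnovaRM(data=df, depvar="y", subject="id", within=["time"]).fit()
print(res)  # F test for the within-subject factor "time"
```

Repeating this loop many times and tallying rejections at a nominal alpha gives the empirical power and type I error rates that the paper compares across methods.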


Econometrics ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 10 ◽  
Author(s):  
Šárka Hudecová ◽  
Marie Hušková ◽  
Simos G. Meintanis

This article considers goodness-of-fit tests for bivariate INAR and bivariate Poisson autoregression models. The test statistics are based on an L2-type distance between two estimators of the probability generating function of the observations: one entirely nonparametric and the other semiparametric, computed under the corresponding null hypothesis. The asymptotic distribution of the proposed test statistics is derived under both the null hypothesis and alternatives, and consistency is proved. The case of testing bivariate generalized Poisson autoregression and extensions of the methods to dimensions higher than two are also discussed. The finite-sample performance of a parametric bootstrap version of the tests is illustrated via a series of Monte Carlo experiments. The article concludes with applications to real data sets and discussion.
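A schematic Python version of the test's main ingredient is sketched below: the nonparametric estimator of the bivariate probability generating function (PGF), g_n(u, v) = n^{-1} sum_t u^{X_t} v^{Y_t}, and a grid approximation of the L2-type distance on [0, 1]^2. The semiparametric null-model PGF is stubbed out with an independent-Poisson PGF, which is only a placeholder for the fitted INAR or Poisson-autoregression PGF used in the paper.

```python
# Nonparametric bivariate PGF estimator and a grid-based L2-type distance
# to a (placeholder) null-model PGF.
import numpy as np

def pgf_nonparametric(x, y, u, v):
    """g_n(u, v) = mean of u**X_t * v**Y_t over the sample."""
    return np.mean(u[..., None] ** x * v[..., None] ** y, axis=-1)

def pgf_indep_poisson(lam1, lam2, u, v):
    # Placeholder for the semiparametric null-model PGF.
    return np.exp(lam1 * (u - 1.0) + lam2 * (v - 1.0))

rng = np.random.default_rng(2)
x = rng.poisson(1.5, 400)
y = rng.poisson(0.8, 400)

# Grid approximation of T_n = n * integral of (g_n - g_hat)^2 over [0,1]^2.
grid = np.linspace(0.0, 1.0, 41)
U, V = np.meshgrid(grid, grid)
diff = pgf_nonparametric(x, y, U, V) - pgf_indep_poisson(x.mean(), y.mean(), U, V)
T_n = len(x) * np.mean(diff ** 2)   # mean over the grid approximates the integral
print(T_n)  # in practice calibrated with a parametric bootstrap, as in the paper
```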


Author(s):  
John A. Gallis ◽  
Fan Li ◽  
Elizabeth L. Turner

Cluster randomized trials, where clusters (for example, schools or clinics) are randomized to comparison arms but measurements are taken on individuals, are commonly used to evaluate interventions in public health, education, and the social sciences. Analysis is often conducted on individual-level outcomes, and such analysis methods must account for the fact that outcomes for members of the same cluster tend to be more similar than outcomes for members of other clusters. A popular individual-level analysis technique is generalized estimating equations (GEE). However, it is common to randomize a small number of clusters (for example, 30 or fewer), and in this case, the GEE standard errors obtained from the sandwich variance estimator will be biased, leading to inflated type I errors. Some bias-corrected standard errors have been proposed and studied to account for this finite-sample bias, but none has yet been implemented in Stata. In this article, we describe several popular bias corrections to the robust sandwich variance. We then introduce our newly created command, xtgeebcv, which will allow Stata users to easily apply finite-sample corrections to standard errors obtained from GEE models. We then provide examples to demonstrate the use of xtgeebcv. Finally, we offer suggestions about which finite-sample corrections to use in which situations and consider areas of future research that may improve xtgeebcv.
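The command itself is Stata-only, but the same idea can be cross-checked in Python, assuming statsmodels' bias_reduced covariance option (a Mancl-DeRouen-type correction) behaves as documented; the sketch below compares it with the usual robust sandwich on simulated data from 20 clusters.

```python
# GEE with exchangeable working correlation on simulated cluster-randomized
# binary data, comparing robust vs. bias-reduced sandwich standard errors.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_clusters, m = 20, 15
cluster = np.repeat(np.arange(n_clusters), m)
treat = np.repeat(rng.integers(0, 2, n_clusters), m)  # arm assigned per cluster
b = np.repeat(rng.normal(0, 0.5, n_clusters), m)      # cluster random effect
p = 1.0 / (1.0 + np.exp(-(-0.5 + 0.4 * treat + b)))
df = pd.DataFrame({"y": rng.binomial(1, p), "treat": treat, "cluster": cluster})

model = smf.gee("y ~ treat", groups="cluster", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
res = model.fit()
print(res.standard_errors(cov_type="robust"))        # usual sandwich
print(res.standard_errors(cov_type="bias_reduced"))  # finite-sample corrected
```

With few clusters, the bias-reduced standard errors are typically larger than the uncorrected robust ones, which is exactly the inflation-of-type-I-error problem the corrections address.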


Author(s):  
Reilly M. Blocho ◽  
Richard W. Smith ◽  
Mark R. Noll

The purpose of this study was to observe how the composition of organic matter (OM) and the extent of anoxia during deposition within the Marcellus Formation in New York varied with distance from the sediment source in eastern New York. Lipid biomarkers (n-alkanes and fatty acids) in the extractable organic component (bitumen) of the shale samples were analyzed, and proxies such as the average chain length (ACL), aquatic-to-terrestrial ratio (ATR), and carbon preference index (CPI) of n-alkanes were calculated. Fatty acids were relatively non-abundant due to the age of the shale bed, but n-alkane distributions revealed that the primary component of the OM was terrigenous plants. The presence of shorter n-alkane chain lengths in the samples indicated that there was also a minor component of phytoplankton- and algal- (marine-) sourced OM. Whole-rock analyses were also conducted, and cerium anomalies were calculated as a proxy for anoxia. All samples had a negative anomaly value, indicating anoxic conditions during deposition. Two samples, however, had values close to zero and thus were determined to reflect suboxic conditions. Anoxia and total organic matter (TOM) did not show any spatial trends across the basin, which may be explained by varying depths within the basin during deposition. A correlation between nickel concentrations and TOM was observed, indicating that algae were the primary source of the marine OM, which supports the lipid biomarker analysis. The kerogen type of the Marcellus Formation in New York State was determined to be type III, consistent with a methane-forming shale bed.
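For orientation, the following Python sketch computes two of the proxies named above, ACL and CPI, from a hypothetical n-alkane abundance table; the CPI uses the common odd-over-even Bray-and-Evans-style form, and the exact carbon ranges used in the study may differ.

```python
# ACL and CPI from a made-up n-alkane abundance table keyed by carbon number.
import numpy as np

abund = {25: 4.1, 26: 2.0, 27: 5.3, 28: 2.2, 29: 6.8,
         30: 2.5, 31: 5.9, 32: 1.8, 33: 3.0, 34: 1.2}  # hypothetical values

chains = np.array(sorted(abund))
c = np.array([abund[i] for i in chains])

# Average chain length: abundance-weighted mean carbon number.
acl = np.sum(chains * c) / np.sum(c)

# Carbon preference index over roughly C25-C34 (Bray-and-Evans-style form):
# CPI = 0.5 * [odd(25-33)/even(24-32) + odd(25-33)/even(26-34)].
odd = sum(abund[i] for i in range(25, 34, 2))
even_lo = sum(abund[i] for i in range(24, 33, 2) if i in abund)
even_hi = sum(abund[i] for i in range(26, 35, 2))
cpi = 0.5 * (odd / even_lo + odd / even_hi)

print(f"ACL = {acl:.2f}, CPI = {cpi:.2f}")  # CPI well above 1 is consistent
                                            # with terrigenous plant input
```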


1982 ◽  
Vol 19 (A) ◽  
pp. 359-365 ◽  
Author(s):  
David Pollard

The theory of weak convergence has developed into an extensive and useful, but technical, subject. One of its most important applications is in the study of empirical distribution functions: the explication of the asymptotic behavior of the Kolmogorov goodness-of-fit statistic is one of its greatest successes. In this article a simple method for understanding this aspect of the subject is sketched. The starting point is Doob's heuristic approach to the Kolmogorov-Smirnov theorems, and the rigorous justification of that approach offered by Donsker. The ideas can be carried over to other applications of weak convergence theory.
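The result discussed here is easy to observe numerically: by Donsker's theorem, sqrt(n) times the Kolmogorov statistic converges in distribution to the supremum of the absolute Brownian bridge, whose limiting distribution scipy exposes as kstwobign. The sketch below compares the two; the sample and replication sizes are arbitrary.

```python
# Simulate sqrt(n) * sup|F_n - F| for uniform samples and compare the
# empirical CDF of the statistic with the Kolmogorov limit distribution.
import numpy as np
from scipy.stats import kstwobign

rng = np.random.default_rng(4)
n, reps = 500, 2000
stats = np.empty(reps)
for r in range(reps):
    u = np.sort(rng.uniform(size=n))
    ecdf = np.arange(1, n + 1) / n
    # Sup over the jump points of the empirical distribution function.
    stats[r] = np.sqrt(n) * np.max(np.maximum(np.abs(ecdf - u),
                                              np.abs(ecdf - 1.0 / n - u)))

x = 1.0
print(np.mean(stats <= x), kstwobign.cdf(x))  # both approx P(sup|B(t)| <= 1)
```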


Author(s):  
Nitin Sachdeva

Innovation diffusion models have been developed by many researchers during the past few decades based on the famous Bass (1969) model. Several such diffusion models consider price, marketing effort, and similar factors; however, the significant role that customer attrition (disadoption) can play in the long-term growth of a new product or service has rarely been considered. This paper defines two types of disadoption process, Type I and Type II, representing disadopters from among innovators and imitators, respectively. We illustrate that market size grows along with the adoption of the new product, and this increase is addressed in the paper. Explicit mean value functions for the two types of disadoption process are derived. The research focuses on management education services in the Delhi/NCR region of India and the impact of disadoption on the long-term growth of such services. In order to validate the proposed modeling framework, we apply different goodness-of-fit criteria to primary data collected from an institute in Delhi/NCR.
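For context, the sketch below evaluates the classical Bass mean value function and bolts on a deliberately crude constant-rate disadoption term; it is an illustration only, and the paper's Type I / Type II split and its derived mean value functions are not reproduced. All parameter values are arbitrary.

```python
# Classical Bass cumulative adoption plus a simple constant-rate
# disadoption adjustment (a stand-in for the paper's two processes).
import numpy as np

p, q, m = 0.03, 0.38, 10_000   # innovation rate, imitation rate, market size
delta = 0.05                   # hypothetical constant disadoption rate

t = np.linspace(0.0, 15.0, 151)
# Bass cumulative adoption fraction:
# F(t) = (1 - exp(-(p+q) t)) / (1 + (q/p) exp(-(p+q) t)).
F = (1.0 - np.exp(-(p + q) * t)) / (1.0 + (q / p) * np.exp(-(p + q) * t))
adopters = m * F

# Net adopters after removing an exponentially accumulating share of
# disadopters; deliberately simpler than the paper's Type I / Type II split.
net = adopters * np.exp(-delta * t)

print(adopters[-1], net[-1])  # cumulative vs. net adopters at t = 15
```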

