inclusion probabilities
Recently Published Documents


TOTAL DOCUMENTS

83
(FIVE YEARS 21)

H-INDEX

9
(FIVE YEARS 3)

2021 ◽  
Vol 37 (4) ◽  
pp. 865-905
Author(s):  
Martín Humberto Félix-Medina

Abstract We propose Horvitz-Thompson-like and Hájek-like estimators of the total and mean of a response variable associated with the elements of a hard-to-reach population, such as drug users and sex workers. A portion of the population is assumed to be covered by a frame of venues where the members of the population tend to gather. An initial cluster sample of elements is selected from the frame, where the clusters are the venues, and the elements in the sample are asked to name their contacts who belong to the population. The sample size is increased by including in the sample the named elements who are not in the initial sample. The proposed estimators do not use design-based inclusion probabilities, but model-based inclusion probabilities which are derived from a Rasch model and are estimated by maximum likelihood estimators. The inclusion probabilities are assumed to be heterogeneous, that is, they depend on the sampled people. Variance estimates are obtained by bootstrap and are used to construct confidence intervals. The performance of the proposed estimators and confidence intervals is evaluated by two numerical studies, one of them based on real data, and the results show that their performance is acceptable.
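For orientation, the textbook forms of the two estimators named in this abstract can be sketched as follows. This is a minimal illustration that assumes the inclusion probabilities are already available as plain numbers. It does not reproduce the paper's Rasch-model estimation or its bootstrap variance step, and the function names are hypothetical.

```python
from typing import Sequence


def _check_inputs(y: Sequence[float], pi: Sequence[float]) -> None:
    """Validate that responses and inclusion probabilities line up."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    _check_inputs(y, pi)
    return sum(yi / p for yi, p in zip(y, pi))


def hajek_mean(y: Sequence[float], pi: Sequence[float]) -> float:
    """Estimate the population mean as a weighted mean with weights 1 / pi_i."""
    _check_inputs(y, pi)
    weights = [1.0 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Hypothetical sample: three responses with assumed inclusion probabilities.
sample_y = [2.0, 4.0, 6.0]
sample_pi = [0.5, 0.5, 0.25]
print(horvitz_thompson_total(sample_y, sample_pi))  # 2/0.5 + 4/0.5 + 6/0.25 = 36.0
print(hajek_mean(sample_y, sample_pi))              # 36 / (2 + 2 + 4) = 4.5
```

The Hájek form divides by the sum of the weights rather than by a known population size, which is why it remains usable when the population size is itself unknown.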


Author(s):  
Sara Franceschi ◽  
Gianni Betti ◽  
Lorenzo Fattorini ◽  
Francesca Gagliardi ◽  
Gianni Montrone

Abstract The best evaluation of the proportion of defective units in a batch of fruits and vegetables can be achieved by exhaustively checking all the boxes in the batch, which is prohibitive in most cases. Usually, only a sample of boxes is checked. In EU countries, regulations prescribe estimating the proportion of defective units in a batch by the proportion of defective units in the sample, without giving any rule for selecting boxes. Therefore, results are highly dependent on the subjective choice of boxes. In the present study, an objective design-based approach is considered to select boxes from batches, adopting balanced spatial schemes with equal inclusion probabilities. The schemes are able to select samples of boxes evenly spread throughout the batch, while also ensuring good statistical properties for the sample proportion of defective units as an estimator of the batch proportion. The performance of these strategies is evaluated by means of a simulation study performed on real and artificial batches of apples, peppers and strawberries. A case study is considered to estimate the proportion of defective units in a batch of courgettes stored in a distribution center of a supermarket chain in Central Italy.


Buildings ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 360
Author(s):  
Janusz Sobieraj ◽  
Dominik Metelski

The problem with evaluating investment projects is that there are many factors that determine the degree of their successful conclusion. Consequently, there has been an active debate for years as to which critical success factors (CSFs) contribute most to the performance of construction projects. This is because the practice of empirical research is based on two steps: first, researchers choose a particular model from the space of all possible models, and second, they act as if the chosen model is the only one that fits the data and describes the phenomenon under study. Hence, there are many CSF lists that can be found in the literature, owing to the uncertainty at the model selection stage, which is usually ignored. Alternatively, model averaging accounts for this model uncertainty. In this study, the Bayesian model averaging and data from a survey of Polish construction managers were used to investigate the potential of 28 factors describing a diverse set of characteristics in explaining the performance of construction projects in Poland. Determinants of successful completion of investment projects are categorized by their level of evidential strength, which is derived from posterior inclusion probabilities (PIPs), i.e., providing strong, medium and weak evidence.
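A posterior inclusion probability in Bayesian model averaging is the total posterior probability of the models that contain a given factor. The sketch below illustrates only that definition. The factor names and posterior model probabilities are invented for the example and are not taken from the study, and the function name is hypothetical.

```python
from typing import Dict, FrozenSet


def posterior_inclusion_probabilities(
    model_posteriors: Dict[FrozenSet[str], float],
) -> Dict[str, float]:
    """Sum posterior model probabilities over the models containing each factor."""
    total = sum(model_posteriors.values())
    if abs(total - 1.0) > 1e-8:
        raise ValueError("posterior model probabilities must sum to 1")
    pips: Dict[str, float] = {}
    for factors, probability in model_posteriors.items():
        for factor in factors:
            pips[factor] = pips.get(factor, 0.0) + probability
    return pips


# Hypothetical model space over two made-up factors.
posteriors = {
    frozenset({"planning"}): 0.5,
    frozenset({"planning", "budget"}): 0.3,
    frozenset(): 0.2,  # the null model contains no factors
}
for name, pip in sorted(posterior_inclusion_probabilities(posteriors).items()):
    print(f"{name}: {pip:.2f}")
```

Grouping factors into strong, medium and weak evidence then amounts to applying cut-offs to these values. The cut-offs are not stated in the abstract, so none are assumed here.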


Author(s):  
Raphaël Jauslin ◽  
Esther Eustache ◽  
Yves Tillé

Abstract A balanced sampling design should always be the adopted strategy if auxiliary information is available. In addition, integrating a stratified structure of the population in the sampling process can considerably reduce the variance of the estimators. We propose here a new method to handle the selection of a balanced sample in a highly stratified population. The method improves substantially the commonly used sampling designs and reduces the time-consuming problem that could arise if inclusion probabilities within strata do not sum to an integer.


Author(s):  
Roberto Benedetti ◽  
Maria Michela Dickson ◽  
Giuseppe Espa ◽  
Francesco Pantalone ◽  
Federica Piersimoni

Abstract Balanced sampling is a random method for sample selection, the use of which is preferable when auxiliary information is available for all units of a population. However, implementing balanced sampling can be a challenging task, due in part to the computational effort required and the necessity to respect balancing constraints and inclusion probabilities. In the present paper, a new algorithm for selecting balanced samples is proposed. This method is inspired by simulated annealing algorithms, as a balanced sample selection can be interpreted as an optimization problem. A set of simulation experiments and an example using real data show the efficiency and the accuracy of the proposed algorithm.


2021 ◽  
Author(s):  
Liu Liu ◽  
Ang Li ◽  
Qun Xu ◽  
Qin Wang ◽  
Feng Han ◽  
...  

Abstract Epidemiological studies have demonstrated that the concentrations of various urinary elements differ among healthy individuals and patients with prediabetes or diabetes. Many studies have explored the relationship between element concentration and fasting blood glucose (FBG), but the association between joint exposure to co-existing elements and FBG level is not well understood. This study explored the associations of joint exposure to co-existing urinary elements with FBG level in a cross-sectional design. A total of 275 retired elderly people were recruited from Beijing, China. A questionnaire survey was conducted, and biological samples were collected. A generalized linear model (GLM) and a two-phase Bayesian kernel machine regression (BKMR) model were used to analyze the association between urinary elements and FBG. The GLM analysis showed that Zn, Sr, and Cd were significantly correlated with the FBG level after controlling for potential confounding factors. The BKMR analysis indicated that 8 elements (Zn, Se, Fe, Cr, Ni, Cd, Mn, and Al) had a higher influence on FBG (posterior inclusion probabilities > 0.1). Further analysis with the BKMR model indicated that the overall estimated exposure to the 8 elements was positively correlated with the FBG level and was statistically significant when all element concentrations were at their 65th percentile. The BKMR analysis also showed that Cd and Zn each had a statistically significant association with FBG levels when the other co-existing elements were held at different levels (25th, 50th, or 75th percentile). The results of the GLM and BKMR models were inconsistent. The BKMR model can flexibly model joint exposure to co-existing elements and evaluate possible interaction effects and nonlinear correlations, yielding meaningful conclusions that are difficult to obtain with traditional methods. This study provides a methodological reference and experimental evidence for the association between joint exposure to co-existing elements and FBG in elderly people.


Author(s):  
Wilmer Prentius ◽  
Xin Zhao ◽  
Anton Grafström

Abstract New ways to combine data from multiple environmental area frame surveys of a finite population are being introduced. Environmental surveys often sample finite populations through area frames. However, to combine multiple surveys without risking bias, design components (inclusion probabilities, etc.) are needed at unit level of the finite population. We show how to derive the design components and exemplify this for three commonly used area frame sampling designs. We show how to produce an unbiased estimator using data from multiple surveys, and how to reduce the risk of introducing significant bias in linear combinations of estimators from multiple surveys. If separate estimators and variance estimators are used in linear combinations, there is a risk of introducing negative bias. By using pooled variance estimators, the bias of a linear combination estimator can be reduced. National environmental surveys often provide good estimators at national level, while being too sparse to provide sufficiently good estimators for some domains. With the proposed methods, one can plan extra sampling efforts for such domains, without discarding readily available information from the aggregate/national survey. Through simulation, we show that the proposed methods are either unbiased, or yield low variance with small bias, compared to traditionally used methods.


2020 ◽  
Vol 4 (349) ◽  
pp. 67-80
Author(s):  
Wojciech Gamrot

Design‑based estimation of finite population parameters such as totals usually relies on the knowledge of inclusion probabilities characterising the sampling design. They are directly incorporated into sampling weights and estimators. However, for some useful sampling designs, these probabilities may remain unknown. In such a case, they may often be estimated in a simulation experiment which is carried out by repeatedly generating samples using the same sampling scheme and counting occurrences of individual units. By replacing unknown inclusion probabilities with such estimates, design‑based population total estimates may be computed. The calculation of required sample replication numbers remains an important challenge in such an approach. In this paper, a new procedure is proposed that might lead to the reduction in computational complexity of simulations.
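The simulation idea described in this abstract, repeatedly drawing samples with the same scheme and counting how often each unit appears, can be sketched as follows. The sampling scheme used here (successive draws without replacement, each proportional to a size measure) is an assumed stand-in chosen only because its inclusion probabilities have no simple closed form. It is not claimed to be a design studied in the paper, and the procedure the paper proposes for reducing the number of replications is not reproduced.

```python
import random
from typing import List, Sequence


def draw_successive_pps(sizes: Sequence[float], n: int, rng: random.Random) -> List[int]:
    """Draw n distinct unit indices, each draw proportional to size among those remaining."""
    if n > len(sizes):
        raise ValueError("sample size cannot exceed population size")
    remaining = list(range(len(sizes)))
    sample: List[int] = []
    for _ in range(n):
        weights = [sizes[i] for i in remaining]
        chosen = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(chosen)
        sample.append(chosen)
    return sample


def estimate_inclusion_probabilities(
    sizes: Sequence[float], n: int, replications: int, seed: int = 0
) -> List[float]:
    """Estimate each unit's inclusion probability as its relative frequency of selection."""
    rng = random.Random(seed)
    counts = [0] * len(sizes)
    for _ in range(replications):
        for unit in draw_successive_pps(sizes, n, rng):
            counts[unit] += 1
    return [c / replications for c in counts]


# Hypothetical population of four units; the last is much larger than the rest.
estimates = estimate_inclusion_probabilities([1, 1, 1, 10], n=2, replications=2000, seed=1)
print([round(p, 3) for p in estimates])
```

The estimates can then replace the unknown probabilities in a Horvitz-Thompson-type weight. Their accuracy depends on the number of replications, which is the cost the abstract identifies as the main challenge.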


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 948
Author(s):  
Stefano Cabras

The variable selection problem in general, and specifically for the ordinary linear regression model, is considered in the setup in which the number of covariates is large enough to prevent the exploration of all possible models. In this context, Gibbs sampling is needed to perform stochastic model exploration to estimate, for instance, the model inclusion probability. We show that under a Bayesian non-parametric prior model for analyzing Gibbs-sampling output, the usual empirical estimator is just the asymptotic version of the expected posterior inclusion probability given the simulation output from Gibbs sampling. Other posterior conditional estimators of inclusion probabilities can also be considered as related to the latent probability distributions on the model space, which can be sampled given the observed Gibbs-sampling output. This paper also compares, in this large model space setup, the conventional prior approach against the non-local prior approach used to define the Bayes factors for model selection. The approach is illustrated with simulation samples and an application to modeling Travel and Tourism factors around the world.
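The "usual empirical estimator" mentioned in this abstract is, in its standard form, the fraction of Gibbs draws in which a covariate is included in the visited model. The sketch below shows only that frequency estimator on made-up indicator draws. It does not implement the Gibbs sampler or the Bayesian non-parametric estimator the paper develops, and the function name is hypothetical.

```python
from typing import List, Sequence


def empirical_inclusion_probabilities(draws: Sequence[Sequence[int]]) -> List[float]:
    """Average the 0/1 inclusion indicators of each covariate over all draws."""
    if not draws:
        raise ValueError("at least one draw is required")
    n_covariates = len(draws[0])
    if any(len(d) != n_covariates for d in draws):
        raise ValueError("all draws must have the same number of indicators")
    return [sum(d[j] for d in draws) / len(draws) for j in range(n_covariates)]


# Hypothetical sampler output: each row is one visited model,
# with 1 meaning the covariate is included and 0 meaning it is excluded.
gibbs_draws = [
    (1, 0, 1),
    (1, 1, 0),
    (1, 0, 0),
    (0, 0, 1),
]
print(empirical_inclusion_probabilities(gibbs_draws))  # [0.75, 0.25, 0.5]
```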

