estimate sample size
Recently Published Documents


2020 ◽  
Author(s):  
Kiyoshi Kubota ◽  
Masao Iwagami ◽  
Takuhiro Yamaguchi

Abstract Background: We propose and evaluate approximation formulae for the 95% confidence intervals (CIs) of the sensitivity and specificity, and a formula to estimate sample size, in a validation study with stratified sampling, where positive samples (those satisfying the outcome definition) and negative samples (those that do not) are selected with different extraction fractions. Methods: We used the delta method to derive the approximation formulae for estimating the sensitivity and specificity and their CIs. From those formulae, we derived a formula to estimate the number of negative samples required to achieve the intended precision and a formula to estimate the precision for a negative sample size chosen arbitrarily by the investigator. We conducted simulation studies in a population where 4% were outcome-definition positive, the positive predictive value (PPV) was 0.8, and the negative predictive value (NPV) was 0.96, 0.98 or 0.99. The size of the negative sample, n0, was either selected to make the 95% CI fall within ±0.1, ±0.15 or ±0.2, or set arbitrarily to 150, 300 or 600. We assumed a binomial distribution for the positive and negative samples. The coverage of the 95% CIs of the sensitivity and specificity was calculated as the proportion of CIs that included the population sensitivity and specificity, respectively. For selected studies, the coverage was also estimated by the bootstrap method. The sample size was evaluated by examining whether the observed precision was within the pre-specified value. Results: For the sensitivity, the coverage of the approximated 95% CIs was larger than 0.95 in most studies, but in only 9 of the 18 selected studies when estimated by the bootstrap method. For the specificity, the coverage of the approximated 95% CIs was approximately 0.93 in most studies, but was more than 0.95 in all 18 studies when estimated by the bootstrap method. The calculated size of negative samples yielded precisions within the pre-specified values in most of the studies. Conclusion: The approximation formulae for the 95% CIs of the sensitivity and specificity for stratified validation studies are presented. These formulae will help in conducting and analysing validation studies with stratified sampling.
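The delta-method idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulae published by the authors: the function name `stratified_sens_spec` and its parameters are hypothetical, the population counts of definition-positive (`N1`) and definition-negative (`N0`) subjects are assumed known, and the sampled PPV and NPV are treated as independent binomial proportions.

```python
import math


def stratified_sens_spec(N1, N0, n1, x1, n0, x0, z=1.96):
    """Delta-method CIs for sensitivity and specificity (illustrative sketch).

    N1, N0: population counts of definition-positive / definition-negative subjects
    n1, x1: sampled positives and how many of them are true cases
    n0, x0: sampled negatives and how many of them are true non-cases
    """
    p = x1 / n1  # estimated PPV
    q = x0 / n0  # estimated NPV

    # Estimated cells of the population 2x2 table.
    tp, fp = N1 * p, N1 * (1 - p)
    tn, fn = N0 * q, N0 * (1 - q)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("sensitivity or specificity is undefined for these counts")

    sens = tp / (tp + fn)
    spec = tn / (tn + fp)

    # Binomial variances of the two sampled proportions.
    var_p = p * (1 - p) / n1
    var_q = q * (1 - q) / n0

    # Partial derivatives of sensitivity and specificity with respect to p and q.
    d_sens_dp = N1 * fn / (tp + fn) ** 2
    d_sens_dq = N0 * tp / (tp + fn) ** 2
    d_spec_dp = N1 * tn / (tn + fp) ** 2
    d_spec_dq = N0 * fp / (tn + fp) ** 2

    se_sens = math.sqrt(d_sens_dp ** 2 * var_p + d_sens_dq ** 2 * var_q)
    se_spec = math.sqrt(d_spec_dp ** 2 * var_p + d_spec_dq ** 2 * var_q)

    return {
        "sensitivity": (sens, sens - z * se_sens, sens + z * se_sens),
        "specificity": (spec, spec - z * se_spec, spec + z * se_spec),
    }


# Hypothetical counts echoing the simulated setting: 4% positive, PPV 0.8, NPV 0.99.
result = stratified_sens_spec(N1=4000, N0=96000, n1=100, x1=80, n0=300, x0=297)
print(result)
```

The intervals here are symmetric and are not clipped to the range 0 to 1, which is one reason coverage can drift from the nominal 95% when NPV is close to 1.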


Author(s):  
Rochelle Rocha Costa ◽  
Othavio Porto Backes ◽  
Pedro Figueiredo ◽  
Flávio Antônio De Souza Castro

Quantitative monographic studies systematically use inferential statistical procedures to test hypotheses. For this purpose, sampling procedures and sample sizes need to be adequate for the proposed procedures. The aim of this study was to identify the sample selection methods, as well as the use and types of sample size calculation, adopted in theses and dissertations developed in a graduate program in the field of Physical Education. Theses and dissertations defended between 2003 and 2013 were obtained through a digital repository. Only quantitative studies were included, in which the following issues were analyzed: (1) sample selection criteria; (2) presence of a sample size calculation; (3) type of calculation used to estimate sample size. A total of 199 studies were included. Of these, 6% (n=11) used probabilistic methods for sample selection and 3% (n=6) used animal models. As for sample size calculations, 36% (n=72) of the studies reported having adopted this procedure. Of the studies that performed sample size calculations, 25% (n=18) used predictive equations, 67% (n=48) used methods based on statistical power, 3% (n=2) used confidence intervals, 4% (n=3) did not mention the method and 1% (n=1) based the calculation on the type of statistical test to be used later. Nonprobabilistic sampling methods predominate for the selection of subjects; most studies do not report calculations to estimate sample size and, among those that do, models that take statistical power as the main criterion predominate.


2018 ◽  
Author(s):  
Edlin Guerra ◽  
Nuno Simoes ◽  
Juan J Cruz-Motta ◽  
Maite Mascaró

Deciding on sample size is a key step in any study based on statistical inference. Recently, a pioneering methodology applicable in the multivariate context was proposed by Anderson & Santana-Garcon (2015, DOI: 10.1111/ele.12385). This method is based on estimating the dissimilarity-based multivariate standard error (MultSE) at different sampling efforts by double resampling the original data. However, this method has two limitations: (1) it is not possible to observe the behavior of MultSE beyond the original effort; and (2) the estimates are no longer independent when the same sampling units are used. We put forward an alternative method that overcomes both. The procedure consists of simulating a data matrix that contains the ecological properties of the community. Sampling is then executed repeatedly, which achieves the following: (1) estimation of independent MultSE for efforts greater than the original; and (2) estimation of sample size at different scales. These advantages were evaluated using four case studies.
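To make the MultSE quantity concrete, the sketch below computes it from pairwise dissimilarities and averages it over random subsamples of the original data. It is a simplified illustration, not the authors' simulation-based procedure: the helper names `multse` and `multse_by_effort` are hypothetical, Euclidean distance stands in for an ecological dissimilarity measure, and a single round of subsampling without replacement replaces the double resampling of the original method.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def multse(samples, dist=euclidean):
    """Dissimilarity-based multivariate standard error, sqrt(V / n).

    V is the pseudo-variance: the sum of squared pairwise dissimilarities
    divided by n, then divided by (n - 1).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("MultSE needs at least two sampling units")
    ss = sum(
        dist(samples[i], samples[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    ) / n
    return math.sqrt(ss / (n - 1) / n)


def multse_by_effort(samples, efforts, n_resamples=200, seed=0):
    """Mean MultSE over random subsamples (without replacement) of each size."""
    rng = random.Random(seed)
    means = {}
    for n in efforts:
        values = [multse(rng.sample(samples, n)) for _ in range(n_resamples)]
        means[n] = sum(values) / len(values)
    return means


# Tiny made-up data matrix: rows are sampling units, columns are species counts.
data = [[3, 0, 1], [2, 1, 0], [5, 2, 2], [0, 0, 4], [1, 3, 1], [4, 1, 0]]
print(multse_by_effort(data, efforts=[2, 4, 6], n_resamples=50))
```

Because `random.sample` cannot draw more units than the data contain, this sketch shares limitation (1) above: efforts larger than the original sample are out of reach without simulating new data.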


2018 ◽  
Vol 25 (7) ◽  
pp. 774-779
Author(s):  
Carlos Baladrón ◽  
Alejandro Santos-Lozano ◽  
Javier M Aguiar ◽  
Alejandro Lucia ◽  
Juan Martín-Hernández

Abstract Objective: The most widely used search engine for scientific literature, PubMed, provides tools to filter results by several fields. When searching for reports on clinical trials, sample size can be among the most important factors to consider. However, PubMed does not currently provide any means of filtering search results by sample size. Such a filtering tool would be useful in a variety of situations, including meta-analyses or state-of-the-art analyses to support experimental therapies. In this work, a tool was developed to filter articles identified by PubMed based on their reported sample sizes. Materials and Methods: A search engine was designed to send queries to PubMed, retrieve results, and compute estimates of reported sample sizes using a combination of syntactical and machine learning methods. The sample size search tool is publicly available for download at http://ihealth.uemc.es. Its accuracy was assessed against a manually annotated database of 750 random clinical trials returned by PubMed. Results: Validation tests show that the sample size search tool is able to accurately (1) estimate sample size for 70% of abstracts and (2) classify 85% of abstracts into sample size quartiles. Conclusions: The proposed tool was validated as useful for advanced PubMed searches of clinical trials when the user is interested in identifying trials of a given sample size.
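As a rough illustration of what a purely syntactical step might look like, the sketch below pulls candidate sample sizes out of abstract text with two regular expressions. Everything in it is an assumption made for illustration: the function name `guess_sample_size`, the two patterns, and the rule of taking the largest candidate are not taken from the published tool, which also relies on machine learning.

```python
import re

# Hypothetical patterns: "n = 120" and "120 patients/participants/subjects/volunteers".
_PATTERNS = [
    re.compile(r"\bn\s*=\s*(\d[\d,]*)", re.IGNORECASE),
    re.compile(
        r"\b(\d[\d,]*)\s+(?:patients|participants|subjects|volunteers)\b",
        re.IGNORECASE,
    ),
]


def guess_sample_size(abstract):
    """Return the largest number matching a sample-size pattern, or None."""
    candidates = []
    for pattern in _PATTERNS:
        for match in pattern.finditer(abstract):
            # Drop thousands separators before converting to an integer.
            candidates.append(int(match.group(1).replace(",", "")))
    return max(candidates) if candidates else None


print(guess_sample_size("We randomized 1,250 patients (n = 625 per arm)."))
```

Taking the largest candidate is a crude heuristic that favours the total enrolment over per-arm counts; it will misfire on abstracts that mention larger unrelated numbers next to these keywords.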


2018 ◽  
Author(s):  
Douglas Abrams ◽  
Parveen Kumar ◽  
R. Krishna Murthy Karuturi ◽  
Joshy George

Abstract Background: The advent of single cell RNA sequencing (scRNA-seq) enabled researchers to study transcriptomic activity within individual cells and identify inherent cell types in the sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there are no published studies or analytical packages available to guide experimental design and to devise a suitable analysis procedure for cell type identification. Results: We have developed an empirical methodology to address this important gap in single cell experimental design and analysis, and implemented it in an easy-to-use tool called SCEED (Single Cell Empirical Experimental Design and analysis). With SCEED, the user can choose among a variety of combinations of tools for analysis, conduct performance analysis of analytical procedures and choose the best procedure, and estimate the sample size (number of cells to be profiled) required for a given analytical procedure at varying levels of cell type rarity and other experimental parameters. Using SCEED, we examined 3 single cell algorithms using 48 simulated single cell datasets that were generated for varying numbers of cell types and their proportions, number of genes expressed per cell, number of marker genes and their fold change, and number of single cells successfully profiled in the experiment. Conclusions: Based on our study, we found that when marker genes are expressed at a fold change of 4 or more relative to the rest of the genes, either the Seurat or the Simlr algorithm can be used to analyze a single cell dataset for any number of single cells isolated (a minimum of 1000 single cells was tested). However, when marker genes are expected to be upregulated only up to a fold change of 2, the choice of single cell algorithm depends on the number of single cells isolated and the proportion of the rare cell type to be identified. In conclusion, our work allows the assessment of various single cell methods and also aids in examining single cell experimental design.


1998 ◽  
Vol 40 (4) ◽  
pp. 307-312 ◽  
Author(s):  
Maxia Dong ◽  
Martin R. Petersen ◽  
Mark J. Mendell
