Clarifying selection bias in cluster randomized trials

2021 ◽  
pp. 174077452110568
Fan Li ◽  
Zizhong Tian ◽  
Jennifer Bobb ◽  
Georgia Papadogeorgou ◽  
Fan Li

Background: In cluster randomized trials, patients are typically recruited after clusters are randomized, and the recruiters and patients may not be blinded to the assignment. This often leads to differential recruitment and consequently systematic differences in baseline characteristics of the recruited patients between intervention and control arms, inducing post-randomization selection bias. We aim to rigorously define causal estimands in the presence of selection bias. We elucidate the conditions under which standard covariate adjustment methods can validly estimate these estimands. We further discuss the additional data and assumptions necessary for estimating causal effects when such conditions are not met. Methods: Adopting the principal stratification framework in causal inference, we clarify that there are two average treatment effect (ATE) estimands in cluster randomized trials: one for the overall population and one for the recruited population. We derive analytical formulas for the two estimands in terms of principal-stratum-specific causal effects. Furthermore, using simulation studies, we assess the empirical performance of the multivariable regression adjustment method under different data generating processes leading to selection bias. Results: When treatment effects are heterogeneous across principal strata, the average treatment effect on the overall population generally differs from the average treatment effect on the recruited population. A naïve intention-to-treat analysis of the recruited sample leads to biased estimates of both average treatment effects. In the presence of post-randomization selection and without additional data on the non-recruited subjects, the average treatment effect on the recruited population is estimable only when the treatment effects are homogeneous between principal strata, and the average treatment effect on the overall population is generally not estimable. The extent to which covariate adjustment can remove selection bias depends on the degree of effect heterogeneity across principal strata. Conclusion: There is a need and opportunity to improve the analysis of cluster randomized trials that are subject to post-randomization selection bias. For studies prone to selection bias, it is important to explicitly specify the target population that the causal estimands are defined on and adopt design and estimation strategies accordingly. To draw valid inferences about treatment effects, investigators should (1) assess the possibility of heterogeneous treatment effects, and (2) consider collecting data on covariates that are predictive of the recruitment process, and on the non-recruited population from external sources such as electronic health records.
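
The gap between the overall-population effect and a naive comparison of recruited patients can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the three principal strata, their population shares, their mean potential outcomes, and the absence of a "recruited only under control" stratum are all hypothetical values chosen for this example, and the two functions are not the formulas derived in the paper.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """A hypothetical principal stratum defined by recruitment under each arm."""
    share: float    # share of the overall population (assumed)
    s1: int         # 1 if recruited when the cluster gets the intervention
    s0: int         # 1 if recruited when the cluster gets control
    mean_y1: float  # mean potential outcome under intervention (assumed)
    mean_y0: float  # mean potential outcome under control (assumed)


def overall_ate(strata):
    """Share-weighted average of the stratum-specific effects."""
    return sum(g.share * (g.mean_y1 - g.mean_y0) for g in strata)


def naive_recruited_contrast(strata):
    """Mean outcome of those recruited under intervention minus the mean
    outcome of those recruited under control (an unadjusted comparison)."""
    n1 = sum(g.share * g.s1 for g in strata)
    n0 = sum(g.share * g.s0 for g in strata)
    y1 = sum(g.share * g.s1 * g.mean_y1 for g in strata) / n1
    y0 = sum(g.share * g.s0 * g.mean_y0 for g in strata) / n0
    return y1 - y0


# Illustrative strata; every number here is an assumption for the sketch.
strata = [
    Stratum(0.50, 1, 1, mean_y1=3.0, mean_y0=2.0),  # always recruited, effect 1
    Stratum(0.25, 1, 0, mean_y1=6.0, mean_y0=3.0),  # recruited only under intervention, effect 3
    Stratum(0.25, 0, 0, mean_y1=1.0, mean_y0=1.0),  # never recruited, effect 0
]

print("overall ATE:", overall_ate(strata))
print("naive contrast among recruited:", naive_recruited_contrast(strata))
```

In this toy setup the naive contrast mixes two things: the heterogeneous effects across strata and the baseline difference between the patients recruited in each arm. Setting all stratum effects and control means equal makes the two quantities coincide.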

2021 ◽  
Mateus C. R. Neves ◽  
Felipe De Figueiredo Silva ◽  
Carlos Otávio Freitas

In this paper we estimate the average treatment effect from access to extension services and credit on agricultural production in selected Andean countries (Bolivia, Peru, and Colombia). More specifically, we want to identify the effect of accessibility, here represented as travel time to the nearest area with 1,500 or more inhabitants per square kilometer or at least 50,000 inhabitants, on the likelihood of accessing extension and credit. To estimate the treatment effect and identify the effect of accessibility on these variables, we use data from the Colombian and Bolivian Agricultural Censuses of 2013 and 2014, respectively; a national agricultural survey from 2017 for Peru; and geographic information on travel time. We find that the average treatment effect for extension is higher than that of credit for farms in Bolivia and Colombia, and lower for farms in Peru. The average treatment effects of extension and credit for Peruvian farms are $2,387.45 and $3,583.42, respectively. In Bolivia, the average treatment effects of extension and credit are $941.92 and $668.69, respectively, while in Colombia they are $1,365.98 and $1,192.51, respectively. We also find that accessibility and the likelihood of accessing these services are nonlinearly related. Results indicate that a higher likelihood is associated with lower travel time, especially in the analysis of credit.

2018 ◽  
Vol 42 (4) ◽  
pp. 391-422 ◽  
Donald P. Green ◽  
Winston Lin ◽  
Claudia Gerber

Background: Many place-based randomized trials and quasi-experiments use a pair of cross-section surveys, rather than panel surveys, to estimate the average treatment effect of an intervention. In these studies, a random sample of individuals in each geographic cluster is selected for a baseline (preintervention) survey, and an independent random sample is selected for an endline (postintervention) survey. Objective: This design raises the question, given a fixed budget, how should a researcher allocate resources between the baseline and endline surveys to maximize the precision of the estimated average treatment effect? Results: We formalize this allocation problem and show that although the optimal share of interviews allocated to the baseline survey is always less than one-half, it is an increasing function of the total number of interviews per cluster, the cluster-level correlation between the baseline measure and the endline outcome, and the intracluster correlation coefficient. An example using multicountry survey data from Africa illustrates how the optimal allocation formulas can be combined with data to inform decisions at the planning stage. Another example uses data from a digital political advertising experiment in Texas to explore how precision would have varied with alternative allocations.
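
The allocation question can be explored numerically with a short Python sketch. The variance model below is an assumption made for illustration and is not taken from the article: outcomes are standardized to unit variance, the intracluster correlation `icc` is the same at baseline and endline, `rho` is the cluster-level correlation between the true baseline and endline cluster means, each interview costs the same, and the analyst adjusts the endline cluster mean linearly for the baseline cluster mean. The function names are hypothetical.

```python
def residual_variance(n_base, n_total, rho, icc):
    """Per-cluster variance of the endline sample mean after linear adjustment
    for the baseline sample mean, under the assumed model described above."""
    n_end = n_total - n_base
    var_end = icc + (1 - icc) / n_end    # true cluster mean + sampling noise
    var_base = icc + (1 - icc) / n_base
    cov = rho * icc                      # only the true cluster means covary
    return var_end - cov ** 2 / var_base


def best_baseline_size(n_total, rho, icc):
    """Grid search over integer baseline sample sizes (1 .. n_total - 1)."""
    return min(
        range(1, n_total),
        key=lambda n_base: residual_variance(n_base, n_total, rho, icc),
    )


if __name__ == "__main__":
    n_total = 100
    for rho in (0.5, 0.8, 0.9):
        n_base = best_baseline_size(n_total, rho, icc=0.2)
        print(f"rho={rho}: baseline share = {n_base / n_total:.2f}")
```

Under these assumptions the search reproduces the qualitative pattern stated in the abstract: the best baseline share stays below one-half and rises with the cluster-level correlation.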

Graham K. Brown ◽  
Thanos Mergoupis

Treatment effects may vary with the observed characteristics of the treated, often with important implications. In the context of experimental data, a growing literature deals with the problem of specifying treatment interaction terms that most effectively capture this variation. Some results of this literature are now implemented in Stata. With nonexperimental (observational) data, and in particular when selection into treatment depends on unmeasured factors, treatment effects can be estimated using Stata's treatreg command. Though not originally designed for this purpose, treatreg can be used to consistently estimate treatment interaction parameters. With interactions, however, adjustments are required to generate predicted values and estimate the average treatment effect. In this article, we introduce commands that perform this adjustment for multiplicative interactions, and we show the required adjustment for more complicated interactions.

2015 ◽  
Vol 6 (1-2) ◽  
Joel A. Middleton ◽  
Peter M. Aronow

Many estimators of the average treatment effect, including the difference-in-means, may be biased when clusters of units are allocated to treatment. This bias remains even when the number of units within each cluster grows asymptotically large. In this paper, we propose simple, unbiased, location-invariant, and covariate-adjusted estimators of the average treatment effect in experiments with random allocation of clusters, along with associated variance estimators. We then analyze a cluster-randomized field experiment on voter mobilization in the US, demonstrating that the proposed estimators have precision that is comparable, if not superior, to that of existing, biased estimators of the average treatment effect.
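
The bias of the difference-in-means under cluster allocation can be checked exactly on a toy example by enumerating every possible assignment. The Python sketch below does this for four hypothetical clusters and compares the difference-in-means with a plain Horvitz-Thompson-style estimator built from cluster totals. The cluster sizes and potential outcomes are invented for illustration, and the Horvitz-Thompson form is a swapped-in textbook estimator: it is unbiased here but is not the location-invariant, covariate-adjusted estimator the authors propose.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical clusters: sizes vary, and the treated outcome grows with size.
sizes = [1, 2, 3, 6]
y1_tot = [Fraction(n * n) for n in sizes]  # cluster totals if treated (assumed)
y0_tot = [Fraction(n) for n in sizes]      # cluster totals if control (assumed)

M, m = len(sizes), 2                       # M clusters, m assigned to treatment
N = sum(sizes)                             # total number of units
true_ate = (sum(y1_tot) - sum(y0_tot)) / N


def diff_in_means(treated):
    """Unit-level mean of treated clusters minus that of control clusters."""
    control = [j for j in range(M) if j not in treated]
    mean_t = sum(y1_tot[j] for j in treated) / sum(sizes[j] for j in treated)
    mean_c = sum(y0_tot[j] for j in control) / sum(sizes[j] for j in control)
    return mean_t - mean_c


def horvitz_thompson(treated):
    """Scale observed cluster totals by inverse assignment probabilities."""
    control = [j for j in range(M) if j not in treated]
    total_t = Fraction(M, m) * sum(y1_tot[j] for j in treated)
    total_c = Fraction(M, M - m) * sum(y0_tot[j] for j in control)
    return (total_t - total_c) / N


# Exact expectations over all equally likely assignments of m clusters.
assignments = list(combinations(range(M), m))
expected_dim = sum(diff_in_means(a) for a in assignments) / len(assignments)
expected_ht = sum(horvitz_thompson(a) for a in assignments) / len(assignments)

print("true ATE:", true_ate)
print("E[difference-in-means]:", expected_dim)
print("E[Horvitz-Thompson]:", expected_ht)
```

Because the denominators of the difference-in-means (the number of units in each arm) are themselves random under cluster allocation, its expectation does not equal the true effect when cluster size is related to the outcome. Exact rational arithmetic is used so the comparison is not blurred by rounding.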

2013 ◽  
Vol 32 (19) ◽  
pp. 3357-3372 ◽  
C. Leyrat ◽  
A. Caille ◽  
A. Donner ◽  
B. Giraudeau

Biometrics ◽  
2014 ◽  
Vol 70 (4) ◽  
pp. 1014-1022 ◽  
Zhenke Wu ◽  
Constantine E. Frangakis ◽  
Thomas A. Louis ◽  
Daniel O. Scharfstein

2019 ◽  
Vol 52 (2) ◽  
pp. 187-200

The estimated average treatment effect in observational studies is biased if the assumptions of ignorability and overlap are not satisfied. To deal with this potential problem when propensity score weights are used in the estimation of the treatment effects, in this paper we propose a bootstrap bias correction estimator for the average treatment effect (ATE) obtained with the inverse propensity score (BBC-IPS) estimator. We show in simulations that the BBC-IPS performs well when the propensity score (PS) is misspecified due to: omitted variables (the ignorability property may not be satisfied), lack of overlap (imbalances in distribution between treatment and control groups), and confounding effects between observables and unobservables (endogeneity). Further refinements in bias reduction of the ATE estimates in smaller samples are attained by iterating the BBC-IPS estimator.
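
The general idea of bootstrap bias correction for an inverse-propensity-weighted ATE can be sketched in Python as follows. This is a generic illustration, not the authors' BBC-IPS procedure: it uses the standard correction (twice the original estimate minus the mean of the bootstrap estimates), estimates the propensity score as the treated share within each level of a single discrete covariate, and clips the score away from 0 and 1. The function names, the `(x, t, y)` row layout, and the clipping threshold are all assumptions made for the example.

```python
import random
from statistics import fmean


def ipw_ate(rows, clip=0.01):
    """Inverse-propensity-weighted ATE from rows of (x, t, y).

    The propensity score is the treated share within each level of the
    discrete covariate x, clipped to [clip, 1 - clip] (an assumption that
    keeps the weights finite in resamples with no overlap)."""
    by_level = {}
    for x, t, _ in rows:
        by_level.setdefault(x, []).append(t)
    score = {
        level: min(max(fmean(ts), clip), 1 - clip)
        for level, ts in by_level.items()
    }
    terms = [
        t * y / score[x] - (1 - t) * y / (1 - score[x])
        for x, t, y in rows
    ]
    return fmean(terms)


def bootstrap_bias_corrected(estimator, rows, n_boot=500, seed=0):
    """Standard bootstrap bias correction: 2 * estimate - mean(bootstrap)."""
    rng = random.Random(seed)
    theta_hat = estimator(rows)
    boot = []
    for _ in range(n_boot):
        # Resample rows with replacement, keeping the original sample size.
        resample = [rng.choice(rows) for _ in rows]
        boot.append(estimator(resample))
    return 2 * theta_hat - fmean(boot)


if __name__ == "__main__":
    # Tiny made-up dataset: (covariate level, treatment indicator, outcome).
    data = [(0, 1, 3.0), (0, 0, 1.0), (1, 1, 5.0), (1, 0, 2.0)] * 25
    print("IPW estimate:", ipw_ate(data))
    print("bias-corrected:", bootstrap_bias_corrected(ipw_ate, data))
```

The correction subtracts the bootstrap estimate of the bias, `mean(bootstrap) - estimate`, from the original estimate. Iterating the correction, as the abstract mentions, would apply the same step again to the corrected estimator and is not shown here.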
