Selection bias in instrumental variable analyses

Abstract. Participants in epidemiological and genetic studies are rarely true random samples of the populations they are intended to represent, and both known and unknown factors can influence participation in a study (known as selection into a study). The circumstances in which selection causes bias in an instrumental variable (IV) analysis are not widely understood by practitioners of IV analyses. We use directed acyclic graphs (DAGs) to depict assumptions about the selection mechanism (factors affecting selection) and show how DAGs can be used to determine when a two-stage least squares (2SLS) IV analysis is biased by different selection mechanisms. Via simulations, we show that selection can result in a biased IV estimate with substantial confidence interval undercoverage, and the level of bias can differ between instrument strengths, a linear and nonlinear exposure-instrument association, and a causal and noncausal exposure effect. We present an application from the UK Biobank study, which is known to be a selected sample of the general population. Of interest was the causal effect of education on the decision to smoke. The 2SLS exposure estimates were very different between the IV analysis ignoring selection and the IV analysis which adjusted for selection (e.g., 1.8 [95% confidence interval −1.5, 5.0] and −4.5 [−6.6, −2.4], respectively). We conclude that selection bias can have a major effect on an IV analysis and that statistical methods for estimating causal effects using data from nonrandom samples are needed.
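As a rough illustration of the kind of simulation the abstract describes, the sketch below compares a 2SLS estimate on a full simulated sample with the estimate on a selected subsample. It is not the authors' simulation design: the data-generating model, the coefficients, the helper `two_stage_least_squares`, and the selection rule (participation only when the exposure is above zero) are all assumptions chosen for illustration.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """Return the 2SLS slope for one instrument z, exposure x, outcome y."""
    # First stage: regress the exposure on the instrument (with intercept).
    instrument = np.column_stack([np.ones_like(z), z])
    first_stage, *_ = np.linalg.lstsq(instrument, x, rcond=None)
    x_hat = instrument @ first_stage
    # Second stage: regress the outcome on the fitted exposure.
    design = np.column_stack([np.ones_like(x_hat), x_hat])
    second_stage, *_ = np.linalg.lstsq(design, y, rcond=None)
    return second_stage[1]


rng = np.random.default_rng(0)
n = 200_000
true_effect = 0.5  # assumed causal effect of exposure on outcome

z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # unmeasured confounder
x = 0.5 * z + u + rng.normal(size=n)          # exposure
y = true_effect * x + u + rng.normal(size=n)  # outcome

# Assumed selection mechanism: only individuals with exposure > 0 take part.
# Conditioning on the exposure (a collider of z and u) links z to u.
selected = x > 0

estimate_full = two_stage_least_squares(z, x, y)
estimate_selected = two_stage_least_squares(z[selected], x[selected], y[selected])

print(f"true effect:              {true_effect:.2f}")
print(f"2SLS, full sample:        {estimate_full:.2f}")
print(f"2SLS, selected subsample: {estimate_selected:.2f}")
```

Under these assumed parameters the full-sample estimate should sit close to the true effect, while the selected-sample estimate should fall well away from it. Other selection rules, for example selection on the instrument alone, need not produce bias, which is the kind of distinction the DAG analysis is meant to draw.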


Do Sports Crowd Out Books? The Impact of Intercollegiate Athletic Participation on Grades

Journal of Sports Economics ◽

10.1177/1527002517716975 ◽

2017 ◽

Vol 20 (1) ◽

pp. 115-153 ◽

Cited By ~ 2

Author(s):

Michael A. Insler ◽

Jimmy Karam

Keyword(s):

Instrumental Variable ◽

Causal Effect ◽

Peer Groups ◽

Sports Participation ◽

Athletic Participation ◽

Naval Academy ◽

Intercollegiate Athletic ◽

Using Data ◽

Promotion Rates ◽

The Impact

We investigate the influence of intercollegiate athletic participation on grades using data from the U.S. Naval Academy. Athletic participation is an endogenous decision with respect to educational outcomes. To identify a causal effect, we develop an instrument via the Academy’s random assignment of students into peer groups. Instrumental variable (IV) estimates suggest that sports participation modestly reduces recruited athletes’ grades. This finding has implications beyond college, as we also show that grades, not athletic participation, are most strongly associated with postcollegiate outcomes such as military tenure and promotion rates.


Status signalling in the market for consumer goods

10.31234/osf.io/ab35e ◽

2021 ◽

Author(s):

HAIMEI YU ◽

Edward Vul

Keyword(s):

Social Status ◽

Consumer Goods ◽

Instrumental Variable ◽

Causal Effect ◽

Department Stores ◽

Purchasing Decisions ◽

Social Signals ◽

Status Signalling ◽

Goods Market ◽

Using Data

People are concerned with signalling their social status to others, and conspicuous consumption may be a prevalent means of signalling, such that purchasing decisions are motivated not only by the direct value of a product, but by the indirect value gained from what the product might communicate to others. Here we measure which products people might use as signals and ask how the signalling potential of products relates to the distribution of product offerings in the consumer goods market. In particular, we asked how the signalling potential of products influences the number, price, and dispersion of prices within and across department stores. Using data scraped from 11 department stores, we found that products with greater signalling potential are available in greater quantity, more expensive within a given store, and that more expensive stores stock more products with higher signalling potential, leading to greater global variance in prices for goods with greater signalling potential. Further, we use product visibility as an instrumental variable to estimate the causal effect of signalling potential on product offerings. Altogether, these results suggest that consumers demand to use visible goods as social signals, and being sensitive to this demand, suppliers of consumer goods position their product offerings to supply ample material for signalling via consumption.


Biases in GWAS – the dog that did not bark

10.1101/709063 ◽

2019 ◽

Cited By ~ 2

Author(s):

C M Schooling

Keyword(s):

Selection Bias ◽

Scientific Discovery ◽

Association Studies ◽

Directed Acyclic Graphs ◽

Genome Wide Association Studies ◽

Acyclic Graphs ◽

Genome Wide ◽

The UK ◽

Type 2 Error ◽

Disease Specific

Abstract. Background: Genome-wide association studies (GWAS) of specific diseases are central to scientific discovery. Bias from inevitably recruiting only survivors of genetic make-up and disease-specific competing risk has not been comprehensively considered. Methods: We identified sources of bias using directed acyclic graphs, and tested for them in the UK Biobank GWAS by making comparisons across the survival distribution, proxied by age at recruitment. Results: Associations of genetic variants with some diseases depended on their effect on survival. Variants associated with common harmful diseases had weaker or reversed associations with subsequent diseases that shared causes. Conclusion: Genetic studies of diseases that involve surviving other common diseases are open to selection bias that can generate systematic type 2 error. GWAS ignoring such selection bias are most suitable for monogenetic diseases. Genetic effects on age at recruitment may indicate potential bias in disease-specific GWAS and relevance to population health.


Use of multivariable Mendelian randomization to address biases due to competing risk before recruitment

10.1101/716621 ◽

2019 ◽

Cited By ~ 8

Author(s):

C Mary Schooling ◽

Priscilla M Lopez ◽

Zhao Yang ◽

J V Zhao ◽

SL Au Yeung ◽

...

Keyword(s):

Ischemic Stroke ◽

Selection Bias ◽

Mendelian Randomization ◽

Late Onset ◽

Later Life ◽

Directed Acyclic Graphs ◽

Competing Risk ◽

UK Biobank ◽

The UK ◽

Statin Use

Abstract. Background: Mendelian randomization (MR) provides unconfounded estimates. MR is open to selection bias, particularly when the underlying sample is selected on surviving the genetically instrumented exposure and other conditions that share etiology with the outcome (competing risk before recruitment). Few methods to address this bias exist. Methods: We use directed acyclic graphs to show this selection bias can be addressed by adjusting for common causes of survival and outcome. We use multivariable MR to obtain a corrected MR estimate, specifically, the effect of statin use on ischemic stroke, because statins affect survival and stroke typically occurs later in life than ischemic heart disease so is open to competing risk. Results: In univariable MR the genetically instrumented effect of statin use on ischemic stroke was in a harmful direction in MEGASTROKE and the UK Biobank (odds ratio (OR) 1.33, 95% confidence interval (CI) 0.80 to 2.20). In multivariable MR adjusted for major causes of survival and ischemic stroke (blood pressure, body mass index and smoking initiation), the effect of statin use on stroke in the UK Biobank was as expected (OR 0.81, 95% CI 0.68 to 0.98) with a Q-statistic indicating absence of genetic pleiotropy or selection bias, but not in MEGASTROKE. Conclusion: MR studies concerning late onset chronic conditions with shared etiology based on samples recruited in later life need to be conceptualized within a mechanistic understanding, so as to identify any potential bias due to competing risk before recruitment, and to inform the analysis and interpretation.


International Remittances and Private Healthcare in Kerala, India

MIGRATION LETTERS ◽

10.33182/ml.v17i3.778 ◽

2020 ◽

Vol 17 (3) ◽

pp. 445-460

Author(s):

Mohd Imran Khan ◽

Valatheeswaran C.

Keyword(s):

Instrumental Variable ◽

Capital Investment ◽

Healthcare Expenditure ◽

Healthcare Services ◽

Private Healthcare ◽

International Remittances ◽

Variable Approach ◽

Using Data ◽

The Impact ◽

Instrumental Variable Approach

The inflow of international remittances to Kerala has been increasing over the last three decades. It has increased the income of recipient households and enabled them to spend more on human capital investment. Using data from the Kerala Migration Survey-2010, this study analyses the impact of remittance receipts on households’ healthcare expenditure and access to private healthcare in Kerala. This study employs an instrumental variable approach to account for the endogeneity of remittance receipts. The empirical results show that remittance income has a positive and significant impact on households’ healthcare expenditure and access to private healthcare services. After disaggregating the sample into different heterogeneous groups, this study found that remittances have a greater effect on lower-income households and Other Backward Class (OBC) households but not Scheduled Caste (SC) and Scheduled Tribe (ST) households, which remain excluded from reaping the benefit of international migration and remittances.


The Longitudinal Analysis of the Performance Factors Affecting the Korean Ladies Professional Golfer's Prize Money: Using Data from 2010 to 2019

Journal of Golf Studies ◽

10.34283/ksgs.2020.14.3.14 ◽

2020 ◽

Vol 14 (3) ◽

pp. 163-172

Author(s):

Sun-Hee Chung ◽

Keyword(s):

Longitudinal Analysis ◽

Prize Money ◽

Factors Affecting ◽

Performance Factors ◽

Using Data


Response to: ‘Correspondence on ‘Variants in urate transporters, ADH1B, GCKR and MEPE genes associated with transition from asymptomatic hyperuricaemia to gout: results of the first gout versus asymptomatic hyperuricaemia GWAS in Caucasians using data from the UK Biobank’’ by Takei et al

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2021-220785 ◽

2021 ◽

pp. annrheumdis-2021-220785

Author(s):

Gabriela Sandoval-Plata ◽

Kevin Morgan ◽

Abhishek Abhishek

Keyword(s):

UK Biobank ◽

Using Data ◽

The UK


Measuring the Efficiency of Football Clubs Using Data Envelopment Analysis: Empirical Evidence From Spanish Professional Football

SAGE Open ◽

10.1177/2158244021989257 ◽

2021 ◽

Vol 11 (1) ◽

pp. 215824402198925

Author(s):

Isidoro Guzmán-Raja ◽

Manuela Guzmán-Raja

Keyword(s):

Data Envelopment Analysis ◽

High Efficiency ◽

Sport Performance ◽

Data Envelopment ◽

Professional Football ◽

Efficiency Measure ◽

Factors Affecting ◽

Football Clubs ◽

Main Factors ◽

Using Data

Professional football clubs have a special characteristic not shared by other types of companies: their sport performance (on the field) is important, in addition to their financial performance (off the field). The aim of this paper is to calculate an efficiency measure using a model that combines performance (sport and economic) based on data envelopment analysis (DEA). The main factors affecting teams’ efficiency levels are investigated using cluster analysis. For a sample of Spanish football clubs, the findings indicate that clubs achieved a relatively high efficiency level for the period studied, and that the oldest teams with the most assets had the highest efficiency scores. These results could help club managers to improve the performance of their teams.


Clinical laboratory tests and five-year incidence of major depressive disorder: a prospective cohort study of 433,890 participants from the UK Biobank

Translational Psychiatry ◽

10.1038/s41398-021-01505-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Michael Wainberg ◽

Stefan Kloiber ◽

Breno Diniz ◽

Roger S. McIntyre ◽

Daniel Felsky ◽

...

Keyword(s):

Public Health ◽

Major Depressive Disorder ◽

Cohort Study ◽

Confidence Interval ◽

Depressive Disorder ◽

Blood Cell ◽

Biological Processes ◽

UK Biobank ◽

Major Depressive ◽

The UK

Abstract. Prevention of major depressive disorder (MDD) is a public health priority. Identifying biomarkers of underlying biological processes that contribute to MDD onset may help address this public health need. This prospective cohort study encompassed 383,131 white British participants from the UK Biobank with no prior history of MDD, with replication in 50,759 participants of other ancestries. Leveraging linked inpatient and primary care records, we computed adjusted odds ratios for 5-year MDD incidence among individuals with values below or above the 95% confidence interval (<2.5th or >97.5th percentile) on each of 57 laboratory measures. Sensitivity analyses were performed across multiple percentile thresholds and in comparison to established reference ranges. We found that indicators of liver dysfunction were associated with increased 5-year MDD incidence (even after correction for alcohol use and body mass index): elevated alanine aminotransferase (AOR = 1.35, 95% confidence interval [1.16, 1.58]), aspartate aminotransferase (AOR = 1.39 [1.19, 1.62]), and gamma glutamyltransferase (AOR = 1.52 [1.31, 1.76]) as well as low albumin (AOR = 1.28 [1.09, 1.50]). Similar observations were made with respect to endocrine dysregulation, specifically low insulin-like growth factor 1 (AOR = 1.34 [1.16, 1.55]), low testosterone among males (AOR = 1.60 [1.27, 2.00]), and elevated glycated hemoglobin (HbA1C; AOR = 1.23 [1.05, 1.43]). Markers of renal impairment (i.e. elevated cystatin C, phosphate, and urea) and indicators of anemia and macrocytosis (i.e. red blood cell enlargement) were also associated with MDD incidence. While some immune markers, like elevated white blood cell and neutrophil count, were associated with MDD (AOR = 1.23 [1.07, 1.42]), others, like elevated C-reactive protein, were not (AOR = 1.04 [0.89, 1.22]). The 30 significant associations validated as a group in the multi-ancestry replication cohort (Wilcoxon p = 0.0005), with a median AOR of 1.235.
Importantly, all 30 significant associations with extreme laboratory test results were directionally consistent with an increased MDD risk. In sum, markers of liver and kidney dysfunction, growth hormone and testosterone deficiency, innate immunity, anemia, macrocytosis, and insulin resistance were associated with MDD incidence in a large community-based cohort. Our results support a contributory role of diverse biological processes to MDD onset.
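The percentile-threshold comparison described in this abstract can be sketched as follows. This is only a simplified illustration on simulated data: it computes crude (unadjusted) odds ratios, whereas the study reports adjusted odds ratios, and the function name `extreme_value_odds_ratios`, the simulated laboratory measure, and the assumed risks are hypothetical.

```python
import numpy as np


def extreme_value_odds_ratios(values, outcome, lower_pct=2.5, upper_pct=97.5):
    """Crude odds ratios for low and high values versus the middle range."""
    values = np.asarray(values, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    low_cut, high_cut = np.percentile(values, [lower_pct, upper_pct])

    def odds(mask):
        # Odds of the outcome within the group defined by mask.
        cases = np.sum(outcome & mask)
        non_cases = np.sum(~outcome & mask)
        return cases / non_cases

    # Reference group: values between the two percentile cut-offs.
    reference_odds = odds((values >= low_cut) & (values <= high_cut))
    return {
        "low": odds(values < low_cut) / reference_odds,
        "high": odds(values > high_cut) / reference_odds,
    }


rng = np.random.default_rng(1)
n = 200_000
lab_measure = rng.normal(size=n)  # simulated, standardised laboratory value

# Assumed risks: 5% baseline incidence, 10% when the value is very high.
risk = np.where(lab_measure > 1.96, 0.10, 0.05)
incident_case = rng.random(n) < risk

odds_ratios = extreme_value_odds_ratios(lab_measure, incident_case)
print(f"OR, below 2.5th percentile:  {odds_ratios['low']:.2f}")
print(f"OR, above 97.5th percentile: {odds_ratios['high']:.2f}")
```

An adjusted analysis would instead fit a regression model, for example logistic regression with covariates such as age and sex, and read the odds ratio from the coefficient of the extreme-value indicator.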


Association between Dyslipidemia and Mercury Exposure in Adults

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18020775 ◽

2021 ◽

Vol 18 (2) ◽

pp. 775

Author(s):

Purum Kang ◽

Hye Young Shin ◽

Ka Young Kim

Keyword(s):

Behavioral Interventions ◽

Dietary Habits ◽

Subjective Health ◽

Targeted Prevention ◽

Adult Males ◽

Subjective Health Status ◽

Factors Affecting ◽

Health And Nutrition ◽

Metal Levels ◽

Using Data

Background—Dyslipidemia is one of the prominent risk factors for cardiovascular disease, which is the leading cause of death worldwide. Dyslipidemia has various causes, including metabolic capacity, genetic problems, physical inactivity, and dietary habits. This study aimed to determine the association between dyslipidemia and exposure to heavy metals in adults. Methods—Using data from the seventh Korean National Health and Nutrition Examination Survey (2016–2017), 5345 participants aged ≥20 years who were tested for heavy metal levels were analyzed in this study. Multiple logistic regression was conducted to assess the factors affecting the prevalence of dyslipidemia. Results—The risks of dyslipidemia among all and male participants with mercury (Hg) levels of ≥2.75 μg/L (corresponding to the Korean average level) were 1.273 and 1.699 times higher than in those with levels of <2.75 μg/L, respectively. The factors that significantly affected the dyslipidemia risk were age, household income, body mass index, and subjective health status in both males and females. Conclusions—In adult males, exposure to Hg at higher-than-average levels was positively associated with dyslipidemia. These results provide a basis for targeted prevention strategies for dyslipidemia using lifestyle guidelines for reducing Hg exposure and healthy behavioral interventions.
