On cross-ancestry cancer polygenic risk scores

PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009670
Author(s):  
Lars G. Fritsche ◽  
Ying Ma ◽  
Daiwei Zhang ◽  
Maxwell Salvatore ◽  
Seunggeun Lee ◽  
...  

Polygenic risk scores (PRS) can provide useful information for personalized risk stratification and disease risk assessment, especially when combined with non-genetic risk factors. However, their construction depends on the availability of summary statistics from genome-wide association studies (GWAS) independent from the target sample. For best compatibility, it was reported that GWAS and the target sample should match in terms of ancestries. Yet, GWAS, especially in the field of cancer, often lack diversity and are predominated by European ancestry. This bias is a limiting factor in PRS research. By using electronic health records and genetic data from the UK Biobank, we contrast the utility of breast and prostate cancer PRS derived from external European-ancestry-based GWAS across African, East Asian, European, and South Asian ancestry groups. We highlight differences in the PRS distributions of these groups that are amplified when PRS methods condense hundreds of thousands of variants into a single score. While European-GWAS-derived PRS were not directly transferrable across ancestries on an absolute scale, we establish their predictive potential when considering them separately within each group. For example, the top 10% of the breast cancer PRS distributions within each ancestry group each revealed significant enrichments of breast cancer cases compared to the bottom 90% (odds ratio of 2.81 [95%CI: 2.69,2.93] in European, 2.88 [1.85, 4.48] in African, 2.60 [1.25, 5.40] in East Asian, and 2.33 [1.55, 3.51] in South Asian individuals). Our findings highlight a compromise solution for PRS research to compensate for the lack of diversity in well-powered European GWAS efforts while recruitment of diverse participants in the field catches up.
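The top 10% versus bottom 90% comparison reported above can be illustrated with a short sketch. It assumes an unadjusted 2×2 table and a Wald interval on the log odds ratio; the study itself may have used covariate-adjusted models, so this is not the authors' procedure. The function name `top_fraction_odds_ratio` and the toy data are hypothetical. The point it illustrates is that the score cutoff is computed inside one ancestry group at a time rather than on a pooled, absolute scale.

```python
import math


def top_fraction_odds_ratio(scores, is_case, top_fraction=0.10):
    """Odds ratio and Wald 95% CI for case status in the top PRS fraction
    versus the remainder, using a cutoff derived from this group only."""
    n = len(scores)
    if n != len(is_case) or n < 2:
        raise ValueError("scores and is_case must have equal length >= 2")

    # Rank individuals by score within the group; highest scores first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    n_top = max(1, round(n * top_fraction))
    top = set(order[:n_top])

    a = sum(1 for i in top if is_case[i])                        # cases, top
    b = n_top - a                                                # controls, top
    c = sum(1 for i in range(n) if i not in top and is_case[i])  # cases, rest
    d = (n - n_top) - c                                          # controls, rest

    # Haldane-Anscombe correction keeps the estimate finite if a cell is empty.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Toy data for one hypothetical ancestry group (not from the study).
toy_scores = list(range(100))
toy_cases = [False] * 100
for i in list(range(90, 95)) + list(range(0, 9)):
    toy_cases[i] = True

print(top_fraction_odds_ratio(toy_scores, toy_cases))
```

Running the function separately on each group's scores mirrors the within-group framing of the abstract; pooling all groups before ranking would instead compare individuals on the absolute scale that the authors report as not directly transferable.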

2021 ◽  
Author(s):  
VT Nguyen ◽  
A Braun ◽  
J Kraft ◽  
TMT Ta ◽  
GM Panagiotaropoulou ◽  
...  

Objectives: Genome-Wide Association Studies (GWAS) of Schizophrenia (SCZ) have provided new biological insights; however, most cohorts are of European ancestry. As a result, derived polygenic risk scores (PRS) show decreased predictive power when applied to populations of different ancestries. We aimed to assess the feasibility of a large-scale data collection in Hanoi, Vietnam, contribute to international efforts to diversify ancestry in SCZ genetic research, and examine the transferability of SCZ-PRS to individuals of Vietnamese Kinh ancestry. Methods: In a pilot study, 368 individuals (including 190 SCZ cases) were recruited at the Hanoi Medical University’s associated psychiatric hospitals and outpatient facilities. Data collection included sociodemographic data, baseline clinical data, clinical interviews assessing symptom severity, and genome-wide SNP genotyping. SCZ-PRS were generated using different training data sets: i) European, ii) East-Asian and iii) trans-ancestry GWAS summary statistics from the latest SCZ GWAS meta-analysis. Results: SCZ-PRS significantly predicted case status in Vietnamese individuals using mixed-ancestry (liability-scale R² = 4.9%, p = 6.83 × 10⁻⁸), East-Asian (liability-scale R² = 4.5%, p = 2.73 × 10⁻⁷) and European (liability-scale R² = 3.8%, p = 1.79 × 10⁻⁶) discovery samples. Discussion: Our results corroborate previous findings of reduced PRS predictive power across populations, highlighting the importance of ancestral diversity in GWA studies.


2020 ◽  
Author(s):  
Evan A. Winiger ◽  
Jarrod M. Ellingson ◽  
Claire L. Morrison ◽  
Robin P. Corley ◽  
Joëlle A. Pasman ◽  
...  

Study Objectives: Estimate the genetic relationship of cannabis use with sleep deficits and eveningness chronotype. Methods: We used linkage disequilibrium score regression (LDSC) to analyze genetic correlations between sleep deficits and cannabis use behaviors. Second, we generated sleep deficit polygenic risk scores (PRSs) and estimated their ability to predict cannabis use behaviors using logistic regression. Summary statistics came from existing genome-wide association studies (GWASs) of European ancestry that were focused on sleep duration, insomnia, chronotype, lifetime cannabis use, and cannabis use disorder (CUD). A target sample for PRS prediction consisted of high-risk participants and participants from twin/family community-based studies (n = 796, male = 66%; mean age = 26.81). Target data consisted of self-reported sleep (sleep duration, feeling tired, and taking naps) and cannabis use behaviors (lifetime use, number of lifetime uses, past 180-day use, age of first use, and lifetime CUD symptoms). Results: Significant genetic correlations were found between lifetime cannabis use and eveningness chronotype (rG = 0.24, p < 0.01), as well as between CUD and both short sleep duration (<7 h) (rG = 0.23, p = 0.02) and insomnia (rG = 0.20, p = 0.02). Insomnia PRS predicted earlier age of first cannabis use (β = −0.09, p = 0.02) and increased lifetime CUD symptom count (β = 0.07, p = 0.03). Conclusion: Cannabis use is genetically associated with both sleep deficits and an eveningness chronotype, suggesting that there are genes that predispose individuals to both cannabis use and sleep deficits.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Yanyu Liang ◽  
Milton Pividori ◽  
Ani Manichaikul ◽  
Abraham A. Palmer ◽  
Nancy J. Cox ◽  
...  

Background: Polygenic risk scores (PRS) are valuable to translate the results of genome-wide association studies (GWAS) into clinical practice. To date, most GWAS have been based on individuals of European ancestry, leading to poor performance in populations of non-European ancestry. Results: We introduce the polygenic transcriptome risk score (PTRS), which is based on predicted transcript levels (rather than SNPs), and explore the portability of PTRS across populations using UK Biobank data. Conclusions: We show that PTRS has significantly higher portability (Wilcoxon p = 0.013) in the African-descent samples, where the loss of performance is most acute, and yields better performance than PRS alone when the two are used in combination.


2021 ◽  
Author(s):  
Ying Wang ◽  
Shinichi Namba ◽  
Esteban Lopera ◽  
Sini Kerminen ◽  
Kristin Tsuo ◽  
...  

With the increasing availability of biobank-scale datasets that incorporate both genomic data and electronic health records, many associations between genetic variants and phenotypes of interest have been discovered. Polygenic risk scores (PRS), which are being widely explored in precision medicine, use the results of association studies to predict the genetic component of disease risk by accumulating risk alleles weighted by their effect sizes. However, few studies have thoroughly investigated best practices for PRS in global populations across different diseases. In this study, we utilize data from the Global-Biobank Meta-analysis Initiative (GBMI), which consists of individuals from diverse ancestries and across continents, to explore methodological considerations and PRS prediction performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRS using heuristic (pruning and thresholding, P+T) and Bayesian (PRS-CS) methods. We found that the genetic architecture, such as SNP-based heritability and polygenicity, varied greatly among endpoints. For both PRS construction methods, using a European ancestry LD reference panel resulted in comparable or higher prediction accuracy compared to several other non-European based panels; this is largely attributable to European descent populations still comprising the majority of GBMI participants. PRS-CS overall outperformed the classic P+T method, especially for endpoints with higher SNP-based heritability. For example, substantial improvements are observed in East-Asian ancestry (EAS) using PRS-CS compared to P+T for heart failure (HF) and chronic obstructive pulmonary disease (COPD). Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across global populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using the GBMI and highlight the importance of best practices for PRS in the biobank-scale genomics era.
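The description of a PRS as risk alleles accumulated and weighted by effect size can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the GBMI pipeline: the function names `polygenic_risk_score` and `threshold_weights` and the toy numbers are hypothetical, and only the p-value thresholding half of P+T is shown. LD pruning (clumping) needs a reference panel and is deliberately omitted, as is the Bayesian shrinkage performed by PRS-CS.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, 2, or imputed fractions)
    and per-variant effect sizes (for example, GWAS log odds ratios)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(g * w for g, w in zip(dosages, weights))


def threshold_weights(effect_sizes, p_values, p_cutoff):
    """Thresholding step only: zero the weight of every variant whose GWAS
    p-value exceeds p_cutoff. LD pruning is assumed to have happened already."""
    if len(effect_sizes) != len(p_values):
        raise ValueError("effect_sizes and p_values must have the same length")
    return [b if p <= p_cutoff else 0.0 for b, p in zip(effect_sizes, p_values)]


# Toy summary statistics and one individual's genotype (all hypothetical).
betas = [0.20, -0.10, 0.05]
pvals = [1e-9, 0.03, 0.40]
genotype = [2, 1, 0]

all_variants = polygenic_risk_score(genotype, betas)
gw_significant = polygenic_risk_score(genotype, threshold_weights(betas, pvals, 5e-8))
print(all_variants, gw_significant)
```

In practice the p-value cutoff is a tuning parameter, which is one reason a method's accuracy can differ between endpoints and ancestries.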


2019 ◽  
Author(s):  
Yan Zhang ◽  
Amber N. Wilcox ◽  
Haoyu Zhang ◽  
Parichoy Pal Choudhury ◽  
Douglas F. Easton ◽  
...  

We analyzed summary-level data from genome-wide association studies (GWAS) of European ancestry across fourteen cancer sites to estimate the number of common susceptibility variants (polygenicity) contributing to risk, as well as the distribution of their associated effect sizes. All cancers evaluated showed polygenicity, involving at a minimum thousands of independent susceptibility variants. For some malignancies, particularly chronic lymphoid leukemia (CLL) and testicular cancer, there is a larger proportion of variants with larger effect sizes than for other cancers. In contrast, most variants for lung and breast cancers have very small associated effect sizes. For different cancer sites, we estimate a wide range of GWAS sample sizes required to explain 80% of GWAS heritability, varying from 60,000 cases for CLL to over 1,000,000 cases for lung cancer. The maximum relative risk achievable for subjects at the 99th risk percentile of underlying polygenic risk scores, compared to average risk, ranges from 12 for testicular to 2.5 for ovarian cancer. We show that polygenic risk scores have substantial potential for risk stratification for relatively common cancers such as breast, prostate and colon, but limited potential for other cancer sites because of modest heritability and lower disease incidence.


2019 ◽  
Author(s):  
Thomas M. Piasecki ◽  
Ian R. Gizer ◽  
Wendy S. Slutske

Background and Aims: Twin studies indicate that disordered gambling (DG) is heritable but are silent with respect to specific genes or pathways involved. Genome-wide association studies of other psychiatric disorders permit calculation of polygenic risk scores (PRS) that reflect the aggregated effects of common genetic variants contributing to risk for the target condition. We investigated whether gambling and DG are associated with PRSs for four psychiatric conditions found to be comorbid with DG in epidemiologic surveys. Design and Setting: Data were drawn from the Wave IV assessment of the National Longitudinal Study of Adolescent to Adult Health, a representative sample of adolescents recruited in 1994-5 and followed into young adulthood. Participants: Analyses were limited to unrelated individuals classified as having European ancestry based on analysis of genetic principal components (N = 5,215). Measurements: Participants were surveyed about lifetime gambling and DG. Genotyping data were used to construct PRSs quantifying participants’ common variant genetic risk for major depression (MDD), attention-deficit hyperactivity disorder (ADHD), bipolar disorder (BD), and schizophrenia (SCZ). Findings: Most participants (78.4%) reported ever having gambled, and 1.3% of those reported lifetime DG. Polygenic risk for BD was associated with decreased odds of lifetime gambling, OR = 0.93 [0.87, 0.99], p = .045, pseudo-R2(%) = 0.12. The SCZ PRS was associated with increased odds of DG, OR = 1.54 [1.07, 2.21], p = .020, pseudo-R2 (%) = 0.85. Polygenic risk for MDD and ADHD were not related to either gambling outcome. Conclusions: Common variant risk for SCZ is associated with DG. Investigating features common to both SCZ and DG might generate valuable clues about the genetically-influenced liabilities to DG.


2021 ◽  
Author(s):  
Louise Wang ◽  
Heena Desai ◽  
Shefali S. Verma ◽  
Anh Le ◽  
Ryan Hausler ◽  
...  

Purpose: Genome-wide association studies (GWAS) have identified hundreds of single nucleotide polymorphisms (SNPs) significantly associated with several cancers, but the predictive ability of polygenic risk scores (PRS) derived from multiple variants is unclear for many cancers, especially among non-European populations. Methods: Genome wide genotype data was available for 20,079 individuals enrolled in an academic biobank. PRS were derived from significant DNA variants for 15 cancers. Logistic regression was used to determine the discriminatory accuracy of each cancer-specific PRS in patients of genetically determined African and European ancestry separately. Results: Among European individuals, four PRS were significantly associated with their respective cancers (breast, colon, melanoma, and prostate), with an OR ranging from 1.25-1.47. Among African individuals, PRS for breast, colon, and prostate were significantly associated with their respective cancers. The discriminatory ability of a model comprised of age, sex, and principal components was 0.617–0.709, and the AUC increased by 1-4% with the addition of the PRS in Europeans. AUC was overall higher in the full model including PRS (AUC 0.742-0.818) in African individuals, but the PRS increased the AUC by less than 1% in the majority of cancers in African individuals. Conclusion: PRS constructed from SNPs moderately increased discriminatory ability for cancer status over age, sex, and nonspecific genetic factors in individuals of European but not African ancestry. Further large-scale studies are needed to identify ancestry-specific genetic factors for cancer risk in non-European populations to incorporate PRS into cancer risk assessment.
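The AUC comparison reported above (a covariate-only model versus the same model with the PRS added) rests on a simple quantity: the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it by direct pairwise comparison. It assumes predicted risks are already available from some fitted model; the function name `auc` and the toy predictions are hypothetical and are not data from the study.

```python
def auc(predicted, is_case):
    """Area under the ROC curve via pairwise comparison: the fraction of
    case/control pairs in which the case has the higher prediction,
    counting ties as one half."""
    cases = [p for p, y in zip(predicted, is_case) if y]
    controls = [p for p, y in zip(predicted, is_case) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy predictions from two hypothetical models for the same four people.
labels = [0, 0, 1, 1]
base_model = [0.10, 0.40, 0.35, 0.80]   # e.g. age, sex, principal components
full_model = [0.10, 0.30, 0.35, 0.80]   # same covariates plus a PRS

gain = auc(full_model, labels) - auc(base_model, labels)
print(auc(base_model, labels), auc(full_model, labels), gain)
```

The pairwise loop is quadratic in sample size, which is fine for illustration; at biobank scale a rank-based implementation from a statistics library would be the usual choice.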


2019 ◽  
Author(s):  
R.L. Kember ◽  
A. Verma ◽  
S. Verma ◽  
A. Lucas ◽  
R. Judy ◽  
...  

Cardio-renal-metabolic (CaReMe) conditions are common and the leading cause of mortality around the world. Genome-wide association studies have shown that these diseases are polygenic and share many genetic risk factors. Identifying individuals at high genetic risk will allow us to target prevention and treatment strategies. Polygenic risk scores (PRS) are aggregate weighted counts that can demonstrate an individual’s genetic liability for disease. However, current PRS are often based on European ancestry individuals, limiting the implementation of precision medicine efforts in diverse populations. In this study, we develop PRS for six diseases and traits related to cardio-renal-metabolic disease in the Penn Medicine Biobank. We investigate their performance in both European and African ancestry individuals, and identify genetic and phenotypic overlap within these conditions. We find that genetic risk is associated with the primary phenotype in both ancestries, but this does not translate into a model of predictive value in African ancestry individuals. We conclude that future research should prioritize genetic studies in diverse ancestries in order to address this disparity.


2021 ◽  
Author(s):  
Sijia Huang ◽  
Xiao Ji ◽  
Michael Cho ◽  
Jaehyun Joo ◽  
Jason Moore

Background: COPD is a complex heterogeneous disease influenced by both environmental and genetic risk factors. Traditional genome-wide association studies (GWAS) have been successful in identifying many reproducible risk variants of moderate to small effect. Polygenic risk scores (PRS) were developed as a way to aggregate risk alleles weighted by their effect size to produce a score which could be used in clinical practice to identify individuals at high risk of disease. A limitation of both GWAS and PRS is that they make the important assumption that the effect of each allele is independent and not modified by other genetic or environmental factors. Machine learning methods such as deep learning (DL) neural networks complement the GWAS and PRS paradigm by making fewer assumptions about the nature of the genetic effects being modeled. For example, the hidden layers of a DL model have the potential to model gene-gene interactions with non-additive effects on disease risk. The goal of the present study was to develop a DL neural network approach to GWAS and PRS and to compare it to the prevailing paradigm based on modeling independent effects. We applied our DL-PRS method to genetic association data from several GWAS studies of chronic obstructive pulmonary disease (COPD). Results: We developed a DL algorithm for modeling the relationship between genetic variation from GWAS and risk of COPD in several population-based studies. We then developed a DL-PRS based on nodes and associated weights from the first and second layer of the DL neural network. Our DL-PRS framework has overall satisfactory performance in the prediction of COPD and provides a significant contribution to prediction in addition to the current PRS methods. Moreover, regarding the clinical relevance of COPD, our DL-PRS shows a consistent and closer relationship between individual deciles and lung function measures such as FEV1/FVC and predicted FEV1%. Conclusions: Not only does DL-PRS show favorable predictive performance compared with current benchmark PRS methods, but it also extends the ranges of PRS deciles in predicting different stages of COPD. Moreover, our DL-PRS results were replicated in an independent cohort. This study opens the door to the use of machine learning for developing risk scores from models developed using fewer assumptions about the nature of the genetic effects.

