Discovery of breast cancer risk genes and establishment of a prediction model based on estrogen metabolism regulation
Abstract Background Multiple common variants identified by genome-wide association studies have shown limited evidence of the risk of breast cancer in Chinese individuals. In this study, we aimed to uncover the relationship between estrogen levels and the genetic polymorphism of estrogen metabolism-related enzymes in breast cancer (BC) and establish a risk prediction model composed of estrogen-metabolizing enzyme genes and GWAS-identified breast cancer-related genes based on a polygenic risk score. Methods Unrelated BC patients and healthy subjects were recruited for analysis of estrogen levels and single nucleotide polymorphisms (SNPs) in genes encoding estrogen metabolism-related enzymes. The polygenic risk score (PRS) was used to explore the combined effect of multiple genes, which was calculated using a Bayesian approach. An independent sample t-test was used to evaluate the differences between PRS scores of BC and healthy subjects. The discriminatory accuracy of the models was compared using the area under the receiver operating characteristic (ROC) curve. Results The estrogen homeostasis profile was disturbed in BC patients, with parent estrogens (E1, E2) and carcinogenic catechol estrogens (2/4-OHE1, 2-OHE2, 4-OHE2) significantly accumulating in the serum of BC patients. We then established a PRS model to evaluate the role of SNPs in multiple genes. PRS model 1 (M1) was established from SNPs in 6 GWAS-identified high risk genes. On the basis of M1, we added SNPs from 7 estrogen metabolism enzyme genes to establish PRS model 2 (M2). The independent sample t-test results showed that there was no difference between BC and healthy subjects in M1 (P = 0.17); however, there was a significant difference between BC and healthy subjects in M2 (P = 4.9*10− 5). The ROC curve results showed that the accuracy of M2 (AUC = 62.18%) in breast cancer risk identification was better than that of M1 (AUC = 54.56%). Conclusion Estrogen and related metabolic enzyme gene polymorphisms are closely related to BC. The model constructed by adding estrogen metabolic enzyme gene SNPs has a good predictive ability for breast cancer risk, and the accuracy is greatly improved compared with that of the PRS model that only includes GWAS-identified gene SNPs.