Robust Performance of Potentially Functional SNPs in Machine Learning Models for the Prediction of Atorvastatin-Induced Myalgia
Background: Statins can cause muscle symptoms, resulting in poor adherence to therapy and increased cardiovascular risk. We hypothesize that combinations of potentially functional single nucleotide polymorphisms (pfSNPs), rather than individual SNPs, better predict myalgia in patients on atorvastatin. This study assesses the value of pfSNPs and employs six machine learning algorithms to identify the combination of SNPs that best predicts myalgia.
Methods: Whole genome sequencing of 183 Chinese, Malay and Indian patients from Singapore was conducted to identify genetic variants associated with atorvastatin-induced myalgia. To adjust for confounding factors, demographic and clinical characteristics were also examined for their association with myalgia. The top factor, sex, was then used as a covariate in the whole genome association analyses. Variants that were highly associated with myalgia in this and previous studies were extracted, assessed for potential functionality (pfSNPs) and incorporated into six machine learning models. The predictive performance of different combinations of models and inputs was compared using the average cross-validation area under the ROC curve (AUC). We then determined the minimum combination of SNPs that predicts atorvastatin-induced myalgia with maximum sensitivity and specificity, as determined by AUC, in most, if not all, of the six machine learning models.
Results: In whole genome association analyses using sex as a covariate, a larger proportion of pfSNPs than of non-pfSNPs was found to be highly associated with myalgia. Although none of the individual SNPs achieved genome-wide significance in univariate analyses, machine learning models identified a combination of 15 SNPs that predicts myalgia with good predictive performance (AUC >0.9). SNPs within genes identified in this study significantly outperformed SNPs within genes previously reported to be associated with myalgia. pfSNPs were found to be more robust in predicting myalgia, outperforming non-pfSNPs in the majority of machine learning models tested.
Conclusion: Combinations of pfSNPs that were consistently identified by different machine learning models as having high predictive performance have good potential to be clinically useful for predicting atorvastatin-induced myalgia, once validated in an independent cohort of patients.
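
As an illustration only, and not the pipeline used in this study, the comparison metric named in the Methods (the average cross-validation AUC) could be computed along the following lines. Everything specific here is an assumption made for the sketch: the 0/1/2 genotype coding, the toy data, the interleaved fold scheme, and the simple weighted-allele scorer `fit_allele_score`, which merely stands in for any one of the six machine learning models.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a random case scores higher than a
    random control, with ties counted as 0.5 (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_allele_score(X_train, y_train):
    """Hypothetical stand-in model: weight each SNP by the difference in
    mean genotype between cases and controls in the training fold."""
    cases = [x for x, y in zip(X_train, y_train) if y == 1]
    controls = [x for x, y in zip(X_train, y_train) if y == 0]
    weights = [
        mean([row[j] for row in cases]) - mean([row[j] for row in controls])
        for j in range(len(X_train[0]))
    ]
    return lambda x: sum(w * g for w, g in zip(weights, x))


def mean_cv_auc(fit, X, y, k=5):
    """Average held-out AUC over k interleaved folds."""
    n = len(y)
    aucs = []
    for fold in range(k):
        test_idx = list(range(fold, n, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_scores = [model(X[i]) for i in test_idx]
        aucs.append(roc_auc([y[i] for i in test_idx], fold_scores))
    return mean(aucs)


# Toy data (assumed): 12 patients, 2 SNPs coded as minor-allele counts.
# SNP 0 tracks case status exactly; SNP 1 is unrelated to it.
y = [1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
X = [[2 * label, i % 2] for i, label in enumerate(y)]

print(mean_cv_auc(fit_allele_score, X, y, k=3))
```

Under this scheme, each candidate model and SNP subset would be passed through the same `mean_cv_auc` call, so that all combinations are ranked on identical held-out folds.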