Breast Cancer Surgery 10-Year Survival Prediction by Machine Learning: A Large Prospective Cohort Study
Machine learning algorithms have proven to be effective for predicting survival after surgery, but their use for predicting 10-year survival after breast cancer surgery has not yet been discussed. This study compares the accuracy of predicting 10-year survival after breast cancer surgery in the following five models: a deep neural network (DNN), K nearest neighbor (KNN), support vector machine (SVM), naive Bayes classifier (NBC) and Cox regression (COX), and to optimize the weighting of significant predictors. The subjects recruited for this study were breast cancer patients who had received breast cancer surgery (ICD-9 cm 174–174.9) at one of three southern Taiwan medical centers during the 3-year period from June 2007, to June 2010. The registry data for the patients were randomly allocated to three datasets, one for training (n = 824), one for testing (n = 177), and one for validation (n = 177). Prediction performance comparisons revealed that all performance indices for the DNN model were significantly (p < 0.001) higher than in the other forecasting models. Notably, the best predictor of 10-year survival after breast cancer surgery was the preoperative Physical Component Summary score on the SF-36. The next best predictors were the preoperative Mental Component Summary score on the SF-36, postoperative recurrence, and tumor stage. The deep-learning DNN model is the most clinically useful method to predict and to identify risk factors for 10-year survival after breast cancer surgery. Future research should explore designs for two-level or multi-level models that provide information on the contextual effects of the risk factors on breast cancer survival.