Efficient method for breast cancer classification based on ensemble Hoeffding tree and naïve Bayes
<p><span>Breast cancer is the most dangerous type of cancer among women over 35 years of age. Breast cancer datasets are typically characterized by missing data, high dimensionality, non-normal distributions, class imbalance, noise, and inconsistency. Classification is a machine learning (ML) task that plays a significant role in predicting outcomes, and one of the outstanding supervised classification methods in data mining is naïve Bayes classification (NBC). NBC is good at predicting outcomes and often outperforms other classification techniques. One of the reasons behind this strong performance is the assumption of conditional independence among the initial parameters and the predictors. However, this assumption does not always hold and can cause a loss of accuracy. Hoeffding trees assume that a small sample is sufficient to select the optimal splitting attribute. This study proposes a new method for improving classification accuracy on breast cancer datasets. The method uses Hoeffding trees for normal classification and naïve Bayes for reducing data dimensionality.</span></p>
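<p><span>The small-sample assumption behind Hoeffding trees can be illustrated with the standard Hoeffding bound. The abstract does not state the split criterion used in this study, so the following is only a minimal sketch of the usual textbook rule: a node splits once the observed gap between the two best attributes exceeds the bound. The function names, the confidence parameter <code>delta</code>, and the example gain values are hypothetical and chosen for illustration.</span></p>

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within epsilon of the mean observed
    over n independent samples.

    epsilon = sqrt(R^2 * ln(1 / delta) / (2 * n))
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split when the gap between the best and second-best
    attribute scores exceeds the Hoeffding bound for the n samples seen so far."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed setting: two classes, so information gain has range log2(2) = 1.
    for n in (100, 1000, 10000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        print(n, round(eps, 4), can_split(0.30, 0.15, 1.0, 1e-7, n))
```

<p><span>In this sketch the bound shrinks as more samples arrive, so a fixed gap between two candidate attributes that is inconclusive after a few samples becomes sufficient for a split later on.</span></p>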