Efficient method for breast cancer classification based on ensemble hoffeding tree and naïve Bayes

Royida A. Ibrahem Alhayali; Munef Abdullah Ahmed; Yasmin Makki Mohialden; Ahmed H. Ali

doi:10.11591/ijeecs.v18.i2.pp1074-1080

Efficient method for breast cancer classification based on ensemble hoffeding tree and naïve Bayes

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i2.pp1074-1080 ◽

2020 ◽

Vol 18 (2) ◽

pp. 1074

Author(s):

Royida A. Ibrahem Alhayali ◽

Munef Abdullah Ahmed ◽

Yasmin Makki Mohialden ◽

Ahmed H. Ali

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Class Imbalance ◽

Naïve Bayes ◽

Small Sample ◽

Cancer Breast ◽

Improving Accuracy ◽

Distribution Class ◽

Supervised Classification Methods

The most dangerous type of cancer suffered by women above 35 years of age is breast cancer. Breast cancer datasets are typically characterized by missing data, high dimensionality, non-normal distributions, class imbalance, noise, and inconsistency. Classification is a machine learning (ML) process that plays a significant role in predicting outcomes, and one of the outstanding supervised classification methods in data mining is naïve Bayes classification (NBC). NBC is good at predicting outcomes and often outperforms other classification techniques. One of the reasons behind this strong performance is the assumption of conditional independence among the initial parameters and the predictors. However, this assumption does not always hold, and violating it can cause a loss of accuracy. Hoeffding trees assume that a small sample is sufficient to select the optimal splitting attribute. This study proposes a new method for improving the classification accuracy of breast cancer datasets. The method uses Hoeffding trees for normal classification and naïve Bayes for reducing data dimensionality.
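
The small-sample idea behind Hoeffding trees can be illustrated with the standard Hoeffding bound. The sketch below is not taken from the paper: the function names, the confidence parameter `delta`, and the example gain values are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the mean observed
    over n independent samples lies within epsilon of the true mean of a
    variable whose values span value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Decide whether the best attribute beats the runner-up by more than
    the Hoeffding bound, so the split can be made on the n samples seen."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical gains measured after 1000 samples (illustrative values only).
print(can_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000))
print(can_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000))
```

The bound shrinks as more samples arrive, so a split that cannot be justified early may become justified later without storing the whole dataset.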


Analisis Klasifikasi Kanker Payudara Menggunakan Algoritma Naive Bayes

INFORMAL: Informatics Journal ◽

10.19184/isj.v4i3.14170 ◽

2020 ◽

Vol 4 (3) ◽

pp. 117

Author(s):

Hardian Oktavianto ◽

Rahman Puji Handri

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Naïve Bayes ◽

World Health ◽

Average Percentage ◽

Average Value ◽

Treatment Measures ◽

Bayes Algorithm ◽

Health Organization

Breast cancer is one of the leading causes of death among women, ranking second after lung cancer. According to the World Health Organization, 1 million women are diagnosed with breast cancer every year and half of them die, generally because of late detection and slow treatment, so that cancers are discovered only after they have reached the final stage. In health and medicine, machine learning-based classification has been used to help doctors and health professionals classify types of cancer and determine which treatment measures should be taken. In this study, breast cancer classification is carried out using the Naive Bayes algorithm to group the types of cancer. The dataset used is the Wisconsin breast cancer database. The results show that the Naive Bayes algorithm classifies breast cancer well: on average, 96.9% of the data are classified correctly and only 3.1% incorrectly. The effectiveness of the classification is also high, with average precision and recall of about 0.96. The highest precision and recall, 0.974 and 0.973 respectively, are obtained when the test data use a percentage split of 40%.
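
As a rough illustration of how a Naive Bayes classifier assigns a class, the sketch below fits a Gaussian variant on toy data. The abstract does not say which Naive Bayes variant was used, so the Gaussian likelihood, the function names, and the made-up measurements are all assumptions; none of this is the Wisconsin data.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and a per-feature (mean, variance) pair for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # The small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior, treating the
    features as conditionally independent given the class."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up measurements with two features per sample.
rows = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [8.0, 9.0], [9.0, 8.0], [8.5, 8.5]]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
model = fit_gaussian_nb(rows, labels)
print(predict(model, [1.2, 1.8]), predict(model, [9.0, 9.0]))
```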


K-means-SMOTE for handling class imbalance in the classification of diabetes with C4.5, SVM, and naive Bayes

Jurnal Teknologi dan Sistem Komputer ◽

10.14710/jtsiskom.8.2.2020.89-93 ◽

2020 ◽

Vol 8 (2) ◽

pp. 89-93 ◽

Cited By ~ 3

Author(s):

Hairani Hairani ◽

Khurniawan Eko Saputro ◽

Sofiansyah Fadli

Keyword(s):

Naive Bayes ◽

Class Imbalance ◽

Naïve Bayes ◽

Data Sampling ◽

Minority Class ◽

Class A ◽

Positive Class ◽

Negative Class ◽

Imbalanced Class

An imbalanced class distribution in a dataset biases classification results toward the class with the largest amount of data (the majority class). A sampling method is needed to balance the minority class (positive class) so that the class distribution becomes balanced, leading to better classification results. This study was conducted to overcome the class imbalance problem in the Pima Indians diabetes dataset using k-means-SMOTE. The dataset has 268 instances of the positive class (minority class) and 500 instances of the negative class (majority class). The classification compared C4.5, SVM, and naïve Bayes while applying k-means-SMOTE for data sampling. With k-means-SMOTE, the SVM classifier achieves the highest accuracy and sensitivity, 82% and 77% respectively, while naïve Bayes produces the highest specificity, 89%.
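
The oversampling idea can be sketched with plain SMOTE-style interpolation between minority-class neighbours. This sketch deliberately omits the k-means clustering step that distinguishes k-means-SMOTE, and the function name, the neighbour count `k`, and the toy points are assumptions for illustration only.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding the point itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy minority points; 500 - 268 new samples would equalize the class counts
# reported in the abstract.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, 500 - 268)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region the minority class already occupies.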


Selecting Features Subsets Based on Support Vector Machine-Recursive Features Elimination and One Dimensional-Naïve Bayes Classifier using Support Vector Machines for Classification of Prostate and Breast Cancer

Procedia Computer Science ◽

10.1016/j.procs.2019.08.238 ◽

2019 ◽

Vol 157 ◽

pp. 450-458 ◽

Cited By ~ 1

Author(s):

Alhadi Bustamam ◽

Anas Bachtiar ◽

Devvi Sarwinda

Keyword(s):

Breast Cancer ◽

Support Vector Machine ◽

Support Vector Machines ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Bayes Classifier ◽

One Dimensional ◽

Vector Machines


Classification of breast cancer using Wrapper and Naïve Bayes algorithms

Journal of Physics Conference Series ◽

10.1088/1742-6596/1040/1/012017 ◽

2018 ◽

Vol 1040 ◽

pp. 012017 ◽

Cited By ~ 1

Author(s):

I M D Maysanjaya ◽

I M A Pradnyana ◽

I M Putrama

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Naïve Bayes


Optimization Naive Bayes Algorithm Using Particle Swarm Optimization in the Classification of Breast Cancer

Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019) ◽

10.2991/aisr.k.200424.055 ◽

2020 ◽

Author(s):

Vira MELINDA ◽

Rifkie PRIMARTHA ◽

Adi WIJAYA ◽

Muhammad Ihsan JAMBAK

Keyword(s):

Breast Cancer ◽

Particle Swarm Optimization ◽

Naive Bayes ◽

Particle Swarm ◽

Naïve Bayes ◽

Swarm Optimization ◽

Bayes Algorithm


KERNEL-BASED NAIVE BAYES CLASSIFIER FOR BREAST CANCER PREDICTION

Journal of Biological Systems ◽

10.1142/s0218339007002076 ◽

2007 ◽

Vol 15 (01) ◽

pp. 17-25 ◽

Cited By ~ 18

Author(s):

JESMIN NAHAR ◽

YI-PING PHOEBE CHEN ◽

SHAWKAT ALI

Keyword(s):

Breast Cancer ◽

Cancer Diagnosis ◽

Naive Bayes ◽

Naïve Bayes ◽

Breast Cancer Patients ◽

Cancer Prediction ◽

Early Cancer Diagnosis ◽

Cancer Tumor ◽

Bayes Algorithm

The classification of breast cancer patients is of great importance in cancer diagnosis. Most classical cancer classification methods are clinical-based and have limited diagnostic ability. Recent advances in machine learning techniques have had a great impact on cancer diagnosis. In this research, we develop a new algorithm, Kernel-Based Naive Bayes (KBNB), to classify breast cancer tumors based on mammography data. The performance of the proposed algorithm is compared with that of the classical naive Bayes algorithm and the kernel-based decision tree algorithm C4.5. The proposed algorithm is found to outperform both. We suggest that the proposed algorithm could be used as a tool to classify breast cancer patients for early cancer diagnosis.
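
One common way to combine kernels with naive Bayes is to replace the parametric class-conditional likelihood with a kernel density estimate. The sketch below shows that generic idea only; it is not the authors' KBNB algorithm, and the function names, the `bandwidth` value, and the toy data are assumptions.

```python
import math


def gaussian_kde(samples, x, bandwidth):
    """Kernel density estimate at x: the average of Gaussian bumps
    centred on each training sample."""
    norm = 1.0 / (bandwidth * math.sqrt(2 * math.pi))
    total = sum(norm * math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return total / len(samples)


def kde_nb_predict(train, row, bandwidth=1.0):
    """train maps each class label to its list of feature rows.
    Each feature likelihood is a per-class kernel density estimate."""
    total = sum(len(rows) for rows in train.values())
    scores = {}
    for label, rows in train.items():
        score = math.log(len(rows) / total)
        for j, x in enumerate(row):
            column = [r[j] for r in rows]
            # The tiny constant avoids log(0) far away from every sample.
            score += math.log(gaussian_kde(column, x, bandwidth) + 1e-300)
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data: the "benign" feature is bimodal, which a single Gaussian fits poorly.
train = {"benign": [[1.0], [2.0], [9.0]], "malignant": [[5.0], [5.5], [6.0]]}
print(kde_nb_predict(train, [1.5]), kde_nb_predict(train, [5.4]), kde_nb_predict(train, [9.2]))
```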


Efficient Jamming Identification in Wireless Communication: Using Small Sample Data Driven Naive Bayes Classifier

IEEE Wireless Communications Letters ◽

10.1109/lwc.2021.3064843 ◽

2021 ◽

pp. 1-1

Author(s):

Yuxin Shi ◽

Xinjin Lu ◽

Yingtao Niu ◽

Yusheng Li

Keyword(s):

Wireless Communication ◽

Naive Bayes ◽

Naïve Bayes ◽

Small Sample ◽

Data Driven ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Sample Data


Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view; it must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done using particle swarm optimization (PSO) and a genetic algorithm (GA) through feature selection. This study compared PSO and GA optimization of Naïve Bayes for the classification of Airline Passenger Satisfaction data taken from www.kaggle.com. After testing, the best performance was obtained by the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.
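
For reference, accuracy, precision, and recall of the kind reported above follow directly from confusion-matrix counts. The counts in this sketch are hypothetical and are not the study's results; the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts
    (true positives, false positives, false negatives, true negatives)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return accuracy, precision, recall


# Hypothetical counts for illustration only.
accuracy, precision, recall = classification_metrics(tp=80, fp=10, fn=20, tn=90)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```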


Analysis and Classification of Danger Level in Android Applications Using Naive Bayes Algorithm

2018 6th International Conference on Information and Communication Technology (ICoICT) ◽

10.1109/icoict.2018.8528733 ◽

2018 ◽

Author(s):

Ridho Alif Utama ◽

Parman Sukarno ◽

Erwid Musthofa Jadied

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Android Applications ◽

Bayes Algorithm ◽

Danger Level


Prediction of benign and malignant breast cancer using data mining techniques

Journal of Algorithms & Computational Technology ◽

10.1177/1748301818756225 ◽

2018 ◽

Vol 12 (2) ◽

pp. 119-126 ◽

Cited By ~ 43

Author(s):

Vikas Chaurasia ◽

Saurabh Pal ◽

BB Tiwari

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Low Income ◽

Prediction Models ◽

Naive Bayes ◽

Naïve Bayes ◽

Low Income Countries ◽

Breast Cancer Dataset ◽

Cancer Dataset ◽

Rbf Network

Breast cancer is the second leading cancer among women. Around 1.1 million cases were recorded in 2004. Observed rates of this cancer increase with industrialization and urbanization, and also with facilities for early detection. It remains much more common in high-income countries but is now increasing rapidly in middle- and low-income countries, including within Africa, much of Asia, and Latin America. Breast cancer is fatal in under half of all cases and is the leading cause of death from cancer in women, accounting for 16% of all cancer deaths worldwide. The objective of this research paper is to present a report on breast cancer in which we took advantage of available technological advancements to develop prediction models for breast cancer survivability. We used three popular data mining algorithms (Naïve Bayes, RBF Network, J48) to develop the prediction models using a large dataset (683 breast cancer cases). We also used 10-fold cross-validation to obtain an unbiased estimate of the three prediction models for performance comparison. The results (based on average accuracy on the breast cancer dataset) indicated that Naïve Bayes is the best predictor with 97.36% accuracy on the holdout sample (this prediction accuracy is better than any reported in the literature), RBF Network came second with 96.77% accuracy, and J48 came third with 93.41% accuracy.
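
The 10-fold cross-validation procedure mentioned above can be sketched as an index split. This is a minimal illustration rather than the tooling used in the study: the function name, the shuffling seed, and the unstratified split are assumptions, and only the sample count (683) comes from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs so that every sample is
    used for testing exactly once across the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(683, k=10))
print(len(splits), sorted({len(test) for _, test in splits}))
```

A model would be trained on each `train` list, scored on the matching `test` list, and the ten scores averaged to give the reported estimate.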
