Ensemble classification technique for heart disease prediction with meta-heuristic-enabled training system

2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Parvathaneni Rajendra Kumar ◽  
Suban Ravichandran ◽  
Satyala Narayana

Abstract
Objectives: This research work aims to develop a novel heart disease prediction framework comprising three major phases: proposed feature extraction, dimensionality reduction, and proposed ensemble-based classification.
Methods: As the novelty, the NN is trained by a new enhanced optimization algorithm, referred to as Sea Lion with Canberra Distance (S-CDF), which tunes the optimal weights. The improved S-CDF algorithm is an extended version of the existing “Sea Lion Optimization (SLnO)”. Initially, statistical and higher-order statistical features are extracted, including central tendency, degree of dispersion, and qualitative variation. In this scenario, however, the “curse of dimensionality” is the greatest issue, so the extracted features require dimensionality reduction. Hence, a principal component analysis (PCA)-based feature reduction approach is deployed. Finally, the dimensionally reduced features are fed as input to the proposed ensemble technique, which combines “Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN)” with an optimized Neural Network (NN) as the final classifier.
Results: An elaborate analysis and discussion are provided with respect to parameters such as evaluation metrics, year of publication, accuracy, implementation tool, and the datasets used by various techniques.
Conclusions: The experimental outcomes show that the accuracy of the proposed work with the proposed feature set is 5, 42.85, and 10% superior to the performance with other feature sets, namely central tendency + dispersion, central tendency + qualitative variation, and dispersion + qualitative variation, respectively. Finally, the comparative evaluation shows that the presented work is appropriate for heart disease prediction, as it has higher accuracy than the traditional works.
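
Two building blocks named in the abstract can be sketched in a few lines: statistical feature extraction (central tendency and dispersion) and the Canberra distance. This is only an illustrative sketch. The function names and the sample values are assumptions, and the abstract does not state how the Canberra distance enters the sea lion update, so that step is not shown.

```python
import statistics

def statistical_features(values):
    """Central-tendency and dispersion features for one numeric record."""
    return {
        "mean": statistics.fmean(values),      # central tendency
        "median": statistics.median(values),   # central tendency
        "std": statistics.pstdev(values),      # dispersion (population std)
        "range": max(values) - min(values),    # dispersion
    }

def canberra_distance(p, q):
    """Canberra distance: sum of |p_i - q_i| / (|p_i| + |q_i|)."""
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom:  # a term with a == b == 0 is skipped to avoid 0/0
            total += abs(a - b) / denom
    return total

# Hypothetical sample values, for illustration only.
features = statistical_features([2.0, 4.0, 4.0, 6.0])
distance = canberra_distance([1.0, 2.0, 0.0], [3.0, 2.0, 0.0])
```

Here `features["mean"]` is 4.0 and `distance` is 0.5. In the framework described above, such features would then pass through PCA before reaching the ensemble.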

This work derives methodologies to detect heart problems at an early stage and to inform patients so that they can improve their health. To address this problem, we use machine learning techniques to predict the incidence of heart disease at an early stage. For prediction, we use parameters such as age, sex, height, weight, case history, smoking, and alcohol consumption, together with tests such as blood pressure, cholesterol, diabetes, ECG, and ECHO. Machine learning offers many algorithms that can be applied to this problem, including K-Nearest Neighbour, support vector classifier, decision tree classifier, logistic regression, and Random Forest classifier. Using these parameters and algorithms, we predict whether or not the patient has heart disease and recommend ways for the patient to improve his/her health.


2020 ◽  
Vol 7 (2) ◽  
pp. 631-647
Author(s):  
Emrana Kabir Hashi ◽  
Md. Shahid Uz Zaman

Machine learning techniques are widely used in the healthcare sector to predict fatal diseases. The objective of this research was to develop a proposed system and compare its performance with a traditional system for predicting heart disease, using Logistic Regression, K-Nearest Neighbor, Support Vector Machine, Decision Tree, and Random Forest classification models. The proposed system tunes the hyperparameters of these five classification algorithms using the grid search approach. The performance of a heart disease prediction system is the major research issue, and hyperparameter tuning can be used to enhance the performance of the prediction models. The traditional and proposed systems were evaluated and compared in terms of accuracy, precision, recall, and F1 score. Whereas the traditional system achieved accuracies between 81.97% and 90.16%, the proposed hyperparameter tuning model achieved accuracies between 85.25% and 91.80%. These evaluations demonstrate that the proposed prediction approach achieves more accurate results than the traditional approach in predicting heart disease.
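
The grid search and the four reported metrics can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the helper names, the single-feature threshold "model", and the toy data are assumptions, and a real study would score each combination by cross-validation rather than on one fixed set.

```python
from itertools import product

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def grid_search(param_grid, score_fn):
    """Score every parameter combination and keep the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical toy data: one feature (age) and a binary label.
ages = [45, 50, 54, 58, 61, 66]
labels = [0, 0, 0, 1, 1, 1]

def threshold_accuracy(threshold):
    """Accuracy of the rule 'age >= threshold -> disease'."""
    preds = [int(age >= threshold) for age in ages]
    return classification_metrics(labels, preds)["accuracy"]

best_params, best_score = grid_search({"threshold": [50, 56, 62]},
                                      threshold_accuracy)
```

On this toy data the search selects a threshold of 56 with an accuracy of 1.0. The same `grid_search` helper accepts several parameters at once, since it enumerates the Cartesian product of all listed values.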


2021 ◽  
Vol 16 (3) ◽  
Author(s):  
Khushbu Verma ◽  
Ankit Singh Bartwal ◽  
Mathura Prasad Thapliyal

People nowadays suffer from a variety of heart ailments as a result of the environment and their lifestyle choices. As a result, detecting the disease at an early stage becomes a critical responsibility. Data mining uses disease data to uncover important knowledge. In this research paper, we employ a hybrid combination of Genetic Algorithm-based feature selection and an Ensemble Deep Neural Network model for heart disease prediction. The proposed model was trained with the Adam optimizer and a learning rate of 0.04. The proposed algorithm reached 98% accuracy in heart disease prediction, which is higher than that of past approaches. Other existing models, such as random forest, logistic regression, support vector machine, and decision tree algorithms, took more time and gave lower accuracy compared to the proposed hybrid deep learning-based approach.
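
Genetic-algorithm feature selection can be sketched as a search over bit masks, one bit per feature. The abstract gives no details of the encoding, operators, or fitness function, so everything below is an assumption: the function names, the single-point crossover, the elitist selection, and above all `toy_fitness`, which stands in for the validation accuracy of a trained model.

```python
import random

def genetic_feature_selection(n_features, fitness, generations=20,
                              pop_size=12, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 mask over features; 1 means the feature is kept."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]   # fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)          # single-point crossover
            child = mother[:cut] + father[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]                  # bit-flip mutation
            children.append(child)
        population = elite + children
    return max(population, key=fitness)

# Hypothetical fitness: feature indices 0, 2 and 5 are treated as useful.
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    """Stand-in for model accuracy: reward useful features, penalise extras."""
    useful = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - useful
    return useful - 0.5 * extras

best_mask = genetic_feature_selection(n_features=8, fitness=toy_fitness)
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. In a real pipeline, the fitness call would train and validate a classifier on the selected columns, which dominates the running time.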


2019 ◽  
Vol 9 (18) ◽  
pp. 3723
Author(s):  
Sharif ◽  
Mumtaz ◽  
Shafiq ◽  
Riaz ◽  
Ali ◽  
...  

The rise of social media has led to an increasing online cyber-war waged through hateful and violent comments or speeches, and even slick videos, that promote extremism and radicalization. Detecting cyber-extreme content on microblogging sites, specifically Twitter, is a challenging and evolving research area, owing to the short, noisy, context-dependent, and dynamic nature of the content. The related tweets were crawled using query words and then carefully labelled into two classes: Extreme (with two sub-classes: pro-Afghanistan government and pro-Taliban) and Neutral. An Exploratory Data Analysis (EDA) using Principal Component Analysis (PCA) was performed on the tweet data (represented by Term Frequency-Inverse Document Frequency (TF-IDF) features) to reduce the high-dimensional data space to a low-dimensional (usually 2-D or 3-D) space. The PCA-based visualization showed better cluster separation between the two classes (extreme and neutral) than between the sub-classes of the extreme class, where the separation was not clear. The paper also discusses the pros and cons of applying PCA as an EDA tool to textual data, which is usually represented by a high-dimensional feature set. Furthermore, classification algorithms such as naïve Bayes, K Nearest Neighbors (KNN), random forest, Support Vector Machine (SVM), and ensemble classification methods (with bagging and boosting) were applied both with PCA-based reduced features and with the complete set of features (TF-IDF features extracted from n-gram terms in the tweets). The analysis has shown that an SVM demonstrated an average accuracy of 84% compared with other classification models. It is pertinent to mention that this is a novel reported research work on Twitter content analysis using machine learning methods in the context of the Afghanistan war zone.
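
The TF-IDF representation that feeds the PCA step can be sketched as follows. Several TF-IDF variants exist and the abstract does not say which one was used; this sketch assumes relative term frequency and an unsmoothed logarithmic inverse document frequency. The function name and the toy token lists are illustrative only.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already tokenised documents.
docs = [
    ["weather", "report", "city"],
    ["traffic", "city"],
    ["city", "news"],
]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "city" here, receives a weight of 0.0, while a term unique to one document receives a positive weight. With n-gram terms the vocabulary grows quickly, which is why a reduction step such as PCA is applied afterwards.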


Technologies ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 52
Author(s):  
Md Manjurul Ahsan ◽  
M. A. Parvez Mahmud ◽  
Pritom Kumar Saha ◽  
Kishor Datta Gupta ◽  
Zahed Siddique

Heart disease, one of the main reasons behind the high mortality rate around the world, requires a sophisticated and expensive diagnosis process. In the recent past, much literature has demonstrated machine learning approaches as an opportunity to efficiently diagnose heart disease patients. However, challenges associated with datasets, such as missing data, inconsistent data, and mixed data (containing inconsistent or missing values in both numerical and categorical form), are often obstacles in medical diagnosis. This inconsistency leads to a higher probability of misprediction and misleading results. Data preprocessing steps such as feature reduction, data conversion, and data scaling are employed to form a standard dataset, and such measures play a crucial role in reducing inaccuracy in the final prediction. This paper aims to evaluate eleven machine learning (ML) algorithms, namely Logistic Regression (LR), Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), Naive Bayes (NB), Support Vector Machine (SVM), XGBoost (XGB), Random Forest Classifier (RF), Gradient Boost (GB), AdaBoost (AB), and Extra Tree Classifier (ET), and six different data scaling methods, namely Normalization (NR), Standscale (SS), MinMax (MM), MaxAbs (MA), Robust Scaler (RS), and Quantile Transformer (QT), on a dataset comprising information on patients with heart disease. The results show that CART, along with RS or QT, outperforms all other ML algorithms with 100% accuracy, 100% precision, 99% recall, and 100% F1 score. The study outcomes demonstrate that a model’s performance varies depending on the data scaling method.
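
Three of the scaling methods named above can be written directly from their usual definitions. This is a sketch, not the paper's code: the function names and the sample column are assumptions, and the robust variant here uses the median and the interquartile range computed with the standard library's inclusive quantile method.

```python
import statistics

def min_max_scale(values):
    """MinMax: map the column onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def max_abs_scale(values):
    """MaxAbs: divide by the largest absolute value."""
    biggest = max(abs(v) for v in values)
    return [v / biggest for v in values]

def robust_scale(values):
    """Robust scaling: centre on the median, divide by the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]

# Hypothetical feature column with one extreme outlier.
column = [1, 2, 3, 4, 100]
mm = min_max_scale(column)
ma = max_abs_scale(column)
rs = robust_scale(column)
```

On this column the outlier squeezes the first four MinMax values into roughly the bottom 3% of the range, whereas the robust version spreads them from -1.0 to 0.5 and leaves only the outlier far away at 48.5. Differences of this kind are one plausible reason why a model's performance depends on the scaler.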


Author(s):  
Sowbarnica V. S ◽  
Vismaya V ◽  
Vidhyapoonthalir M ◽  
S. Bhuvana

The heart acts as the operating system of the human body. If it does not function properly, it affects other parts of the body as well. Heart disease describes a range of conditions that affect the heart. The existing system uses a Support Vector Machine (SVM); this work proposes a new system for heart disease prediction. The method will help doctors explore their data and predict heart disease accurately. Hospitals do not provide the same quality of service even though they provide the same type of service. The proposed system includes the following phases: pre-processing of the input data with a Min-Max scaler and normalization, feature extraction by the PSO algorithm, and classification of the data by K-Nearest Neighbour. In comparison with the existing approach, the proposed approach significantly improves the accuracy from 51% to 76.66%.
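
The final classification phase, K-Nearest Neighbour, can be sketched with the standard library. The function name, the value of `k`, and the toy points are assumptions; the points are taken to be already Min-Max scaled, and the PSO feature step is not shown because the abstract gives no details of it.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical two-feature records, already scaled to [0, 1].
train_X = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8), (0.85, 0.7)]
train_y = [0, 0, 1, 1, 1]

high_risk = knn_predict(train_X, train_y, (0.75, 0.8))
low_risk = knn_predict(train_X, train_y, (0.15, 0.15))
```

Here `high_risk` is 1 and `low_risk` is 0. Scaling before KNN matters because Euclidean distance would otherwise be dominated by whichever feature has the largest numeric range.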


Cardiovascular disease (CVD) is a major cause of death worldwide. Earlier diagnosis of the disease will reduce the mortality rate. Machine learning (ML) algorithms are giving promising results in disease diagnosis and are now widely accepted by medical experts as part of their clinical decision support systems. In this work, the most popular ML models are investigated and compared with one another for heart disease prediction based on various metrics. Base classifiers such as Support Vector Machine (SVM), Logistic Regression, Naïve Bayes, Decision Tree, and K Nearest Neighbour are used for predicting heart disease. Bagging and boosting techniques are then applied over these individual classifiers to improve the performance of the system. With the Cleveland and Statlog datasets, Naïve Bayes as the individual classifier gives the maximum accuracy of 85.13% and 84.81%, respectively. Bagging improves the accuracy of the decision tree, which is identified as a weak classifier, by 7%, a significant improvement in identifying CVD.
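
Bagging over a weak tree-like learner can be sketched with decision stumps, that is, one-level trees with a single threshold. This is an illustrative assumption rather than the setup used in the work above: the function names, the number of estimators, and the toy data are all hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold t that maximises accuracy of 'x >= t -> 1'."""
    candidates = sorted(set(xs)) + [float("inf")]  # inf means "always predict 0"

    def correct(t):
        return sum((x >= t) == bool(y) for x, y in zip(xs, ys))

    return max(candidates, key=correct)

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx],
                                    [ys[i] for i in idx]))
    return thresholds

def predict_bagging(thresholds, x):
    """Majority vote over all stumps."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one feature and a binary label.
xs = [40, 45, 50, 55, 60, 65, 70, 75]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_bagging(xs, ys)
young = predict_bagging(ensemble, 30)
old = predict_bagging(ensemble, 80)
```

Each stump sees a slightly different resample, so individual thresholds vary, and the vote averages out that variance. Boosting differs in that later learners are trained to concentrate on the examples earlier learners got wrong, which this sketch does not cover.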

