Matthews correlation coefficient
Recently Published Documents


TOTAL DOCUMENTS

40
(FIVE YEARS 34)

H-INDEX

7
(FIVE YEARS 4)

2021 ◽  
Vol 20 ◽  
pp. 24-34
Author(s):  
Arif Hussain ◽  
Hassaan Malik ◽  
Muhammad Umar Chaudhry

Detecting cardiovascular disease (CVD) at an early stage is difficult but crucial. This study tests the capability of machine learning (ML) methods to accurately diagnose CVD outcomes. Four well-known ML classifiers, namely support vector machine (SVM), logistic regression (LR), naive Bayes (NB), and decision tree (J48), are evaluated in terms of precision, sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC), correctly and incorrectly classified instances, and model building time. The classifiers are applied to a publicly available CVD dataset. According to the measured results, J48 performs better than the competing classifiers and can provide significant assistance to cardiologists.


2021 ◽  
Author(s):  
Seyma Toy ◽  
Yusuf Secgin ◽  
Zülal Oner ◽  
Muhammed Kamil Turan ◽  
Serkan Oner ◽  
...  

The aim of this study is to test whether sex can be predicted by machine learning (ML) algorithms using parameters taken from computerized tomography (CT) images of the cranium and mandible skeleton, which are known to be dimorphic. CT images of the cranium skeletons of 150 men and 150 women were included in the study. The 25 parameters determined were tested with different ML algorithms. Accuracy (Acc), Specificity (Spe), Sensitivity (Sen), F1 score (F1), and Matthews correlation coefficient (Mcc) values were used as performance criteria, and the Minitab 17 package was used for descriptive statistical analyses. A value of p ≤ 0.05 was considered statistically significant. Among the ML algorithms, the highest prediction performance was obtained with the LR algorithm: 0.90 Acc, 0.80 Mcc, 0.90 Spe, 0.90 Sen, and 0.90 F1. According to the confusion matrix, 27 of 30 males and 27 of 30 females were predicted correctly. The Acc values of the other ML algorithms ranged between 0.81 and 0.88. It was concluded that the LR algorithm, applied to the parameters obtained from CT images of the cranium skeleton, predicts sex with high accuracy.
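The reported confusion matrix (27 of 30 correct in each class) is enough to recompute the accuracy and MCC figures quoted above. The following is a minimal sketch, not the authors' code: the function name `mcc_from_counts` is hypothetical, treating one sex as the "positive" class is an arbitrary choice (the matrix is symmetric here), and returning 0 for a zero denominator is a common convention assumed for illustration.

```python
import math


def mcc_from_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: MCC is taken as 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Counts taken from the abstract: 27 of 30 correct in each class.
tp, fn = 27, 3   # class treated as positive (arbitrary choice)
tn, fp = 27, 3   # class treated as negative

accuracy = (tp + tn) / (tp + tn + fp + fn)
mcc = mcc_from_counts(tp, fp, tn, fn)

print(round(accuracy, 2), round(mcc, 2))  # prints: 0.9 0.8
```

Both values agree with the 0.90 Acc and 0.80 Mcc reported for the LR algorithm.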


2021 ◽  
Vol 7 ◽  
pp. e654
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Valentina Emilia Balas

In recent years in medical imaging technology, the advancement for medical diagnosis, the initial assessment of the ailment, and the abnormality have become challenging for radiologists. Magnetic resonance imaging is one such predominant technology used extensively for the initial evaluation of ailments. The primary goal is to mechanize an approach that can accurately assess the damaged region of the human brain through an automated segmentation process that requires minimal training and can learn by itself from previous experimental outcomes. It is computationally more efficient than other supervised learning strategies such as CNN deep learning models. As a result, the investigation and statistical analysis of the abnormality would be made much more comfortable and convenient. The proposed approach appears to perform much better than its counterparts, with an accuracy of 77% with minimal training of the model. Furthermore, the performance of the proposed training model is evaluated through various performance evaluation metrics such as sensitivity, specificity, the Jaccard Similarity Index, and the Matthews correlation coefficient, where the proposed model is productive with minimal training.


Author(s):  
Narongsak Chayangkoon ◽  
Anongnart Srivihok

Methamphetamine addiction is a prominent problem in Southeast Asia. Drug addicts often discuss illegal activities on popular social networking services. These individuals spread messages on social media as a means of both buying and selling drugs online. This paper proposes a model, the “text classification model of methamphetamine tweets in Southeast Asia” (TMTA), to identify whether a tweet from Southeast Asia is related to methamphetamine abuse. The research addresses the weakness of bag of words (BoW) by introducing BoW and Word2Vec feature selection (BWF) techniques. A domain-based feature selection method was performed using the BoW dataset and Word2Vec. The BWF dataset provided a smaller number of features than the BoW and TF–IDF datasets. We experimented with three candidate classifiers: support vector machine (SVM), decision tree (J48), and naive Bayes (NB). We found that the J48 classifier with the BWF dataset provided the best performance for the TMTA in terms of accuracy (0.815), F-measure (0.818), Kappa (0.528), Matthews correlation coefficient (0.529), and area under the ROC curve (0.763). Moreover, TMTA provided the lowest runtime (3.480 seconds) using J48 with the BWF dataset.


Mathematics ◽  
2021 ◽  
Vol 9 (14) ◽  
pp. 1644
Author(s):  
Anh-Hien Dao ◽  
Cheng-Zen Yang

The severity of software bug reports plays an important role in maintaining software quality. Many approaches have been proposed to predict the severity of bug reports using textual information. In this research, we propose a deep learning framework called MASP that uses convolutional neural networks (CNN) and the content-aspect, sentiment-aspect, quality-aspect, and reporter-aspect features of bug reports to improve prediction performance. We have performed experiments on datasets collected from Eclipse and Mozilla. The results show that the MASP model outperforms the state-of-the-art CNN model in terms of average Accuracy, Precision, Recall, F1-measure, and the Matthews Correlation Coefficient (MCC) by 1.83%, 0.46%, 3.23%, 1.72%, and 6.61%, respectively.


Antibiotics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 815
Author(s):  
Atul Tyagi ◽  
Sudeep Roy ◽  
Sanjay Singh ◽  
Manoj Semwal ◽  
Ajit K. Shasany ◽  
...  

Emerging infectious diseases (EID) are serious problems caused by fungi in humans and plant species. They are a severe threat to food security worldwide. In our current work, we have developed a support vector machine (SVM)-based model that attempts to design and predict therapeutic plant-derived antifungal peptides (PhytoAFP). The residue composition analysis shows a preference for the amino acids C, G, K, R, and S. Position preference analysis shows that residues G, K, R, and A dominate the N-terminal. Similarly, residues N, S, C, and G prefer the C-terminal. Motif analysis reveals the presence of motifs such as NYVF, NYVFP, YVFP, NYVFPA, and VFPA. We have developed two models using various input functions such as mono-, di-, and tripeptide composition, as well as binary, hybrid, and physicochemical properties, based on methods that are applied to the main data set. The TPC-based monopeptide composition model achieved the highest accuracy, 94.4%, with a Matthews correlation coefficient (MCC) of 0.89. The second-best model, based on dipeptides, achieved an accuracy of 94.28% with an MCC of 0.89 on the training dataset.


Author(s):  
Andrea González-Ramírez ◽  
Josué Lopez ◽  
Deni Torres ◽  
Israel Yañez-Vargas

Remote sensing imaging datasets for classification generally present high levels of imbalance between classes of interest. This work presents a study of a set of performance evaluation metrics for an imbalanced dataset. A support vector machine (SVM) was used to classify seven classes of interest in a popular dataset called Salinas-A. The classifier was evaluated using two types of metrics: 1) metrics for multi-class classification, and 2) metrics based on the binary confusion matrix. The results compare the scores of each metric, some of which are more optimistic than others because of the bias they exhibit under imbalance. In addition, our case study helps to conclude that the Matthews correlation coefficient (MCC) presents the lowest bias in imbalanced cases and is regarded as a robust metric. These results can be extended to any imbalanced dataset by taking into account the equations developed by Luque.
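The optimism of some metrics under class imbalance can be illustrated with a small synthetic example. The sketch below is not taken from the study and does not use the Salinas-A data: the labels are made up (95 samples of a majority class, 5 of a minority class), the helper name `confusion_counts` is hypothetical, and the majority class is deliberately treated as the positive class to expose the bias.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, fp, tn, fn


# Synthetic, imbalanced labels (assumption for illustration only):
# the classifier gets every majority sample right but only 1 of 5 minority samples.
y_true = [1] * 95 + [0] * 5
y_pred = [1] * 95 + [1] * 4 + [0] * 1

tp, fp, tn, fn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / len(y_true)
f1 = 2 * tp / (2 * tp + fp + fn)
denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc = (tp * tn - fp * fn) / denom if denom else 0.0

print(f"accuracy={accuracy:.3f}  F1={f1:.3f}  MCC={mcc:.3f}")
```

In this toy case accuracy and F1 are both above 0.95 although four of the five minority samples are misclassified, while MCC stays near 0.44, which is the kind of gap the abstract describes.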


2021 ◽  
Vol 11 (11) ◽  
pp. 4942
Author(s):  
Jorge E. Preciado-Velasco ◽  
Joan D. Gonzalez-Franco ◽  
Caridad E. Anias-Calderon ◽  
Juan I. Nieto-Hipolito ◽  
Raul Rivera-Rodriguez

The classification of services in 5G/B5G (Beyond 5G) networks has become important for telecommunications service providers, who face the challenge of simultaneously offering a better Quality of Service (QoS) in their networks and a better Quality of Experience (QoE) to users. Service classification allows 5G service providers to accurately select the network slices for each service, thereby improving the QoS of the network and the QoE perceived by users, and ensuring compliance with the Service Level Agreement (SLA). Some projects have developed systems for classifying these services based on the Key Performance Indicators (KPIs) that characterize the different services. However, Key Quality Indicators (KQIs) are also significant in 5G networks, although these are generally not considered. We propose a service classifier that uses a Machine Learning (ML) approach based on Supervised Learning (SL) to improve classification and to support a better distribution of resources and traffic over 5G/B5G-based networks. We carry out simulations of our proposed scheme using different SL algorithms, first with KPIs alone and then incorporating KQIs, and show that the latter achieves better prediction, with an accuracy of 97% and a Matthews correlation coefficient of 96.6% with a Random Forest classifier.


Author(s):  
Bassam Sulaiman Arkok ◽  
Akram Mohammed Zeki

Imbalanced classification techniques have been applied widely in the field of data mining. They are used to classify imbalanced classes, that is, classes with unequal numbers of samples. The problem with imbalanced classes is that classification performance is biased toward the class with more samples, while the class with few samples is classified poorly. This problem can occur in Qur’anic classification because topics differ in their number of verses. Many studies have classified Qur’anic verses using traditional classification. However, no study has classified Qur’anic topics using imbalanced classification techniques. Therefore, this paper applies imbalanced classification methods, namely the synthetic minority over-sampling technique (SMOTE), random over-sampling (ROS), and random under-sampling (RUS), to classify imbalanced Qur’anic topics. Several metrics were used to evaluate the experimental results: sensitivity/recall, specificity, overall accuracy, F-measure, G-mean, and Matthews correlation coefficient (MCC). The results showed that Qur’anic classification performance improved when imbalanced classification techniques were applied.
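The two random resampling strategies named above (ROS and RUS) can be sketched in a few lines. This is a generic, minimal illustration using only the Python standard library, not the implementation used in the paper; the function names, the toy data, and the class labels "A" and "B" are hypothetical. SMOTE is not shown because it synthesizes new minority samples by interpolation rather than by simple duplication.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """ROS: duplicate randomly chosen samples until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        extra = [rng.choice(pool) for _ in range(target - n)]
        X_out.extend(extra)
        y_out.extend([label] * len(extra))
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        X_out.extend(rng.sample(pool, target))
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (assumption): 9 samples of class "A", 3 of class "B".
X = [[i] for i in range(12)]
y = ["A"] * 9 + ["B"] * 3

X_ros, y_ros = random_over_sample(X, y)
X_rus, y_rus = random_under_sample(X, y)

print(Counter(y_ros))  # both classes now have 9 samples
print(Counter(y_rus))  # both classes now have 3 samples
```

Resampling of this kind is normally applied to the training split only, so that the evaluation metrics are still computed on the original class distribution.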

