An efficient multi-classifier method for differential diagnosis

2020 ◽  
Vol 14 (3) ◽  
pp. 337-347
Author(s):  
Mohammad Mahdi Ershadi ◽  
Abbas Seifi

Many data mining methods are useful for diagnosing diseases and cancers, and in some cases early diagnosis can significantly improve a patient's chance of survival. The objective of this study is to develop a method that supports accurate diagnosis of different diseases based on various classification methods. Collecting knowledge from domain experts is challenging and time-consuming, and such knowledge is often inaccessible, so we design a data-driven multi-classifier that uses a dynamic classifier and clustering selection approach to take advantage of these methods. We combine the forward-backward method and principal component analysis (PCA) for feature reduction. The multi-classifier evaluates three clustering methods and identifies the best classification method in each cluster based on training data. We use ten datasets taken from the Machine Learning Repository of the University of California at Irvine (UCI). The proposed multi-classifier improves both computation time and accuracy compared with the other classification methods considered. It achieves the maximum accuracy with the minimum standard deviation over the sampled datasets.
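The cluster-wise selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the fixed centroids, the two threshold rules (`rule_a`, `rule_b`), and the toy data are hypothetical stand-ins for the clustering methods and classifiers evaluated in the paper.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def select_per_cluster(X, y, centroids, classifiers):
    """For each cluster, keep the classifier with the most correct training predictions."""
    hits = defaultdict(lambda: defaultdict(int))
    for xi, yi in zip(X, y):
        k = nearest(xi, centroids)
        for name, clf in classifiers.items():
            hits[k][name] += int(clf(xi) == yi)
    return {k: max(scores, key=scores.get) for k, scores in hits.items()}

def predict(x, centroids, classifiers, chosen):
    """Route x to its cluster and apply the classifier selected for that cluster."""
    return classifiers[chosen[nearest(x, centroids)]](x)

# Hypothetical classifiers: simple threshold rules, assumed for illustration only.
classifiers = {
    "rule_a": lambda x: int(x[0] > 0.5),
    "rule_b": lambda x: int(x[1] > 5.0),
}
# Hypothetical, pre-computed cluster centroids and toy training data.
centroids = [(0.0, 0.0), (5.0, 5.0)]
X = [(0.2, 0.9), (0.8, 0.1), (0.9, 0.3), (0.1, 0.2),
     (5.2, 5.8), (5.9, 4.2), (4.8, 5.5), (5.5, 4.6)]
y = [0, 1, 1, 0, 1, 0, 1, 0]

chosen = select_per_cluster(X, y, centroids, classifiers)
print(chosen)
print(predict((0.7, 0.2), centroids, classifiers, chosen))
```

In a real setting the centroids would come from a fitted clustering model and the selection would be scored on held-out data rather than on the training points themselves.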

2020 ◽  
Vol 13 (1) ◽  
pp. 103-126 ◽  
Author(s):  
Mohammad Mahdi Ershadi ◽  
Abbas Seifi

Purpose: This study aims at the differential diagnosis of some diseases using classification methods to support effective medical treatment. For this purpose, classification methods based on data, on experts' knowledge, and in some cases on both are considered. In addition, feature reduction and some clustering methods are used to improve their performance. Design/methodology/approach: First, the performance of the classification methods is evaluated for differential diagnosis of different diseases. Then, experts' knowledge is used to modify the structures of the Bayesian networks. Analysis of the results shows that using experts' knowledge is more effective than other algorithms for increasing the accuracy of Bayesian network classification. Datasets for ten different diseases, taken from the Machine Learning Repository of the University of California at Irvine (UCI), are used for testing. Findings: The proposed method improves both the computation time and the accuracy of the classification methods used in this paper. Among all classification methods, Bayesian networks based on experts' knowledge achieve the maximum average accuracy of 87 percent and the minimum average standard deviation of 0.04 over the sample datasets. Practical implications: The proposed methodology can be applied to differential diagnosis of diseases. Originality/value: This study shows the usefulness of experts' knowledge in diagnosis while proposing an adopted improvement method for classification. In addition, the Bayesian network based on experts' knowledge is useful for different diseases neglected by previous papers.


2021 ◽  
Vol 22 (1) ◽  
pp. 53-66
Author(s):  
D. Anand Joseph Daniel ◽  
M. Janaki Meena

Sentiment analysis of online product reviews has become a mainstream way for businesses on e-commerce platforms to promote their products and improve user satisfaction. Hence, it is necessary to construct a sentiment analyser that automatically identifies the sentiment polarity of online product reviews. Traditional lexicon-based approaches to sentiment analysis suffer from several accuracy issues, while machine learning techniques require labelled training data. This paper introduces a hybrid sentiment analysis framework to bridge the gap between machine learning and lexicon-based approaches. A novel tunicate swarm algorithm (TSA) based feature reduction is integrated with the proposed hybrid method to solve the scalability issue that arises from a large feature set. It reduces the feature set size to 43% without changing the accuracy (93%). In addition, it improves scalability, reduces computation time and enhances the overall performance of the proposed framework. Experimental analysis shows that TSA outperforms existing feature selection techniques such as particle swarm optimization and the genetic algorithm. Moreover, the proposed approach is analysed with performance metrics such as recall, precision, F1-score, feature size and computation time.


2021 ◽  
Vol 11 (1) ◽  
pp. 29-49
Author(s):  
Amit Kumar ◽  
Bikash Kanti Sarkar

Research in disease diagnosis is challenging because medical data sets are often inconsistent, class-imbalanced, conflicting, and high-dimensional. The most informative features of each data set play an important role in improving the performance of classifiers, which may follow either iterative or non-iterative approaches. In the present study, a comparative study is carried out to show the performance of iterative and non-iterative classifiers in combination with a genetic algorithm (GA)-based feature selection approach over some widely used medical data sets. The experiment helps identify the clinical data sets for which feature reduction is necessary to improve classifier performance. For iterative approaches, two popular classifiers, namely C4.5 and RIPPER, are chosen, whereas k-NN and naïve Bayes are taken as non-iterative learners. Fourteen real-world medical data sets are selected from the University of California, Irvine (UCI) repository for the experiments.


Fuzzy Systems ◽  
2017 ◽  
pp. 1183-1202
Author(s):  
Soroush Mohammadzadeh ◽  
Yeesock Kim

This book chapter discusses a system identification method with significantly low computation time for modeling the nonlinear behavior of smart buildings. Principal component analysis (PCA) is used to reduce the size of the training data for the adaptive neuro-fuzzy inference system (ANFIS), yielding a PCA-based adaptive neuro-fuzzy inference system (PANFIS). The PANFIS model is evaluated on a seismically excited three-story building equipped with a magnetorheological (MR) damper. The model is trained using an artificial earthquake that contains a variety of earthquake characteristics, and the trained model is tested using four different earthquakes. The results demonstrate that the proposed PANFIS model is effective in modeling the nonlinear behavior of a smart building with a significant reduction in computational load.


Author(s):  
Dilip Kumar Choubey ◽  
Sudhakar Tripathi ◽  
Prabhat Kumar ◽  
Vaibhav Shukla ◽  
Vinay Kumar Dhandhania

Background: Classification methods are needed to reduce possible errors and assist doctors. These methods are used in many areas of our lives to make suitable decisions. It is well known that classification is an efficient, effective and widely used strategy in several applications, such as medical disease diagnosis. The prime objective of this research paper is to achieve an efficient and effective classification method for diabetes. Discussion: The proposed methodology comprises two phases. The first phase describes the Pima Indian Diabetes Dataset and the Localized Diabetes Dataset, whereas in the second phase the datasets are processed through two different approaches. The first approach entails classification with Polynomial Kernel, RBF Kernel, Sigmoid Function Kernel and Linear Kernel SVM on the Pima Indian Diabetes Dataset and the Localized Diabetes Dataset. In the second approach, particle swarm optimization (PSO) is used as a feature reduction method, followed by the same set of classification methods used in the first approach. PSO_Linear Kernel SVM provides the highest accuracy and ROC for both datasets. Conclusion: In this research paper, a comparative analysis of performance has been carried out with and without PSO for the same set of classification methods. It is concluded that PSO selects the relevant features, reducing expense and computation time while improving ROC and accuracy. The methodology may similarly be applied to other diseases.
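For readers unfamiliar with the four SVM kernels named in this abstract, their standard textbook forms can be sketched as below. The parameter names `gamma`, `coef0` and `degree` and their default values are assumptions chosen for illustration; the abstract does not state which hyperparameters the authors used.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)

x, y = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, y), polynomial_kernel(x, y, degree=2), rbf_kernel(x, x))
```

A feature reduction step such as PSO would change which components of `x` and `y` enter these functions, not the kernel definitions themselves.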


2021 ◽  
Vol 13 (11) ◽  
pp. 2125
Author(s):  
Bardia Yousefi ◽  
Clemente Ibarra-Castanedo ◽  
Martin Chamberland ◽  
Xavier P. V. Maldague ◽  
Georges Beaudoin

Clustering methods unequivocally show considerable influence on many recent algorithms and play an important role in hyperspectral data analysis. Here, we challenge clustering for mineral identification using two different strategies in hyperspectral long wave infrared (LWIR, 7.7–11.8 μm). To that end, we compare two algorithms for mineral identification on a unique dataset. The first algorithm applies spectral comparison techniques to all the pixel-spectra and creates RGB false color composites (FCC). A color-based clustering is then used to group the regions (called FCC-clustering). The second algorithm clusters all the pixel-spectra to group the spectra directly. Then, the first rank of non-negative matrix factorization (NMF) extracts the representative of each cluster, which is compared with the JPL/NASA spectral library. These techniques give comparison values as features, which are converted into RGB-FCC as the results (called clustering-rank1-NMF). We applied K-means as the clustering approach, which can be replaced by any similar clustering approach. The results of the clustering-rank1-NMF algorithm indicate significant computational efficiency (more than 20 times faster than the previous approach) and promising performance for mineral identification, with average accuracies of up to 75.8% and 84.8% for the FCC-clustering and clustering-rank1-NMF algorithms (using spectral angle mapper (SAM)), respectively. Furthermore, several spectral comparison techniques are also used for both algorithms, such as adaptive matched subspace detector (AMSD), orthogonal subspace projection (OSP) algorithm, principal component analysis (PCA), local matched filter (PLMF), SAM, and normalized cross correlation (NCC), and most of them show a similar range of accuracy. However, SAM and NCC are preferred due to their computational simplicity. Our algorithms strive to identify eleven different mineral grains (biotite, diopside, epidote, goethite, kyanite, scheelite, smithsonite, tourmaline, pyrope, olivine, and quartz).
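The spectral angle mapper (SAM) mentioned in this abstract compares two spectra by the angle between them, which makes the comparison insensitive to overall brightness. A minimal sketch follows; the library entries `mineral_a` and `mineral_b` and their values are made up for illustration and are not taken from the JPL/NASA spectral library.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)

def best_match(spectrum, library):
    """Name of the library spectrum with the smallest spectral angle."""
    return min(library, key=lambda name: spectral_angle(spectrum, library[name]))

# Hypothetical three-band reference spectra (illustrative values only).
library = {
    "mineral_a": [0.1, 0.5, 0.9],
    "mineral_b": [0.9, 0.5, 0.1],
}
print(best_match([0.2, 1.0, 1.7], library))
```

Because only the angle matters, a spectrum and a scaled copy of it have an angle of (numerically almost) zero, which is one reason SAM is cheap and robust to illumination changes.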


2021 ◽  
Vol 13 (3) ◽  
pp. 526
Author(s):  
Shengliang Pu ◽  
Yuanfeng Wu ◽  
Xu Sun ◽  
Xiaotong Sun

Nascent graph representation learning has shown superiority in handling graph data. Compared to conventional convolutional neural networks, graph-based deep learning has the advantages of illustrating class boundaries and modeling feature relationships. For hyperspectral image (HSI) classification, the priority problem might be how to convert hyperspectral data from regular grids into irregular domains. In this regard, we present a novel method that performs localized graph convolutional filtering on HSIs based on spectral graph theory. First, we conducted principal component analysis (PCA) preprocessing to create localized hyperspectral data cubes with unsupervised feature reduction. These feature cubes, combined with localized adjacency matrices, were fed into the popular graph convolution network in a standard supervised learning paradigm. Finally, we succeeded in analyzing diversified land covers by considering local graph structure with graph convolutional filtering. Experiments on real hyperspectral datasets demonstrated that the presented method offers promising classification performance compared with other popular competitors.


2017 ◽  
Vol 3 (2) ◽  
pp. 811-814 ◽  
Author(s):  
Erik Rodner ◽  
Marcel Simon ◽  
Joachim Denzler

We present an automated approach for rating HER2 over-expression in whole-slide images of breast cancer histology slides. The slides have a very high resolution, and only a small part of each slide is relevant for the rating. Our approach is based on Convolutional Neural Networks (CNN), which directly model the whole computer vision pipeline, from feature extraction to classification, with a single parameterized model. CNN models have led to a significant breakthrough in many vision applications and have shown promising results for medical tasks. However, the required size of the training data is still an issue. Our CNN models are pre-trained on a large set of datasets of non-medical images, which prevents over-fitting to the small annotated dataset available in our case. We assume that the probe is selected in the data with a single mouse click defining a point of interest. This is reasonable especially for slices acquired together with another sample. We sample image patches around the point of interest and obtain bilinear features by passing them through a CNN and encoding the output of the last convolutional layer with its second-order statistics. Our approach ranked second in the HER2 contest held by the University of Warwick, achieving 345 points compared to 348 points for the winning team. In addition to pure classification, our approach would also allow for localization of the parts of the slice relevant for visual detection of HER2 over-expression.
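One common way to encode a convolutional feature map with second-order statistics is to average the outer products of the local descriptors, often called bilinear pooling. The sketch below assumes that form; the abstract does not spell out the exact encoding, and implementations often add normalization steps that are omitted here. The input is a plain list of local descriptors rather than real CNN output.

```python
def bilinear_pool(features):
    """Average outer product of local descriptors, flattened row-major.

    features: list of N descriptors (one per spatial location),
              each a list of C floats. Returns a list of C * C floats.
    """
    n = len(features)
    c = len(features[0])
    pooled = [[0.0] * c for _ in range(c)]
    for f in features:
        for i in range(c):
            for j in range(c):
                pooled[i][j] += f[i] * f[j] / n
    return [value for row in pooled for value in row]

# Two hypothetical 2-channel descriptors from two spatial locations.
print(bilinear_pool([[1.0, 2.0], [3.0, 4.0]]))
```

The resulting vector has length C squared and is symmetric as a matrix, so it captures pairwise channel interactions regardless of where in the patch they occur.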


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1573
Author(s):  
Loris Nanni ◽  
Giovanni Minchio ◽  
Sheryl Brahnam ◽  
Gianluca Maguolo ◽  
Alessandra Lumini

Traditionally, classifiers are trained to predict patterns within a feature space. The image classification system presented here trains classifiers to predict patterns within a vector space by combining the dissimilarity spaces generated by a large set of Siamese Neural Networks (SNNs). A set of centroids from the patterns in the training data sets is calculated with supervised k-means clustering. The centroids are used to generate the dissimilarity space via the Siamese networks. The vector space descriptors are extracted by projecting patterns onto the similarity spaces, and SVMs classify an image by its dissimilarity vector. The versatility of the proposed approach in image classification is demonstrated by evaluating the system on different types of images across two domains: two medical data sets and two animal audio data sets with vocalizations represented as images (spectrograms). Results show that the proposed system competes well with the best-performing methods in the literature, obtaining state-of-the-art performance on one of the medical data sets, and does so without ad-hoc optimization of the clustering methods on the tested data sets.
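The idea of a dissimilarity space can be illustrated in a few lines: each pattern is described by its dissimilarity to every centroid, and that vector is what the classifier sees. In the sketch below a plain Euclidean distance is a stand-in assumption for the dissimilarity that the paper learns with Siamese networks, and the centroids are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance, used here in place of a learned dissimilarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def to_dissimilarity_space(pattern, centroids, dissimilarity=euclidean):
    """Describe a pattern by its dissimilarity to each centroid."""
    return [dissimilarity(pattern, c) for c in centroids]

# Hypothetical centroids, e.g. as produced by k-means on training patterns.
centroids = [(0.0, 0.0), (3.0, 4.0)]
print(to_dissimilarity_space((3.0, 4.0), centroids))
```

The descriptor length equals the number of centroids, so the choice of how many centroids to compute directly sets the dimensionality of the space in which the SVM operates.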

