Statistical Clustering and Classification

The Fuzzy C-Means (FCM) algorithm has been widely used in clustering and classification but has difficulty with noisy data and outliers. Other algorithms related to possibilistic theory have given good results, such as possibilistic C-Means (PCM), fuzzy possibilistic C-Means (FPCM) and possibilistic fuzzy C-Means (PFCM). The last of these works effectively in some environments but shows further shortcomings on noisy databases. To address this problem, we propose in this manuscript a new algorithm, Improved Possibilistic Fuzzy C-Means (ImPFCM), which combines the PFCM algorithm with a powerful statistical method. The properties of ImPFCM show that it applies not only to spherical clusters but also to clusters of different sizes and densities. A comparative study with very recent algorithms indicates that the proposed approach performs better: it groups datasets in high-dimensional spaces easily and can use not only the Euclidean distance but also more sophisticated norms capable of dealing with much more complicated problems. We also demonstrate that ImPFCM detects cluster centers with high accuracy and performs satisfactorily in multiple environments with noisy data and outliers.
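For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of standard FCM only, not of ImPFCM or PFCM, whose details the abstract does not give. The function name `fuzzy_c_means`, its parameters, and the sample data are assumptions made for illustration. Because each point's memberships are normalised to sum to 1, a point far from every cluster still receives membership somewhere, which is commonly cited as the reason FCM is sensitive to noise.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard FCM on a list of equal-length tuples.

    Returns (centers, memberships). m > 1 is the fuzzifier.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))

        # Membership update from squared Euclidean distances:
        # u_ik is proportional to d2_ik ** (-1 / (m - 1)).
        for i in range(n):
            d2 = [
                max(sum((points[i][d] - centers[k][d]) ** 2
                        for d in range(dim)), 1e-12)  # avoid division by zero
                for k in range(c)
            ]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d2]
            total = sum(inv)
            u[i] = [v / total for v in inv]

    return centers, u


# Two compact groups plus one distant point (illustrative data only).
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
        (5.5, 40.0)]
centers, memberships = fuzzy_c_means(data, c=2)
print("centers:", centers)
print("memberships of the distant point:", memberships[-1])
```

Possibilistic variants such as PCM relax the sum-to-1 constraint, so that distant points can receive low typicality in every cluster; the sketch above deliberately leaves that out.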

Model-Based Clustering and Classification for Data Science

10.1017/9781108644181 ◽

2019 ◽

Cited By ~ 17

Author(s):

Charles Bouveyron ◽

Gilles Celeux ◽

T. Brendan Murphy ◽

Adrian E. Raftery

Keyword(s):

Data Science ◽

Model Based Clustering ◽

Model Based ◽

Clustering And Classification

Prediction Analysis Technique based on Clustering and Classification

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i6.688692 ◽

2018 ◽

Vol 6 (6) ◽

pp. 688-692

Author(s):

Bhupendra Kumar Jain ◽

Manish Tiwari

Keyword(s):

Prediction Analysis ◽

Analysis Technique ◽

Clustering And Classification

STATISTICAL CLUSTERING ANALYSIS OF ARTIODACTYL POSTCRANIAL MORPHOLOGY

10.1130/abs/2020am-359831 ◽

2020 ◽

Author(s):

Nicholas Weldon ◽

Win McLaughlin

Keyword(s):

Clustering Analysis ◽

Postcranial Morphology ◽

Statistical Clustering

A set theory based similarity measure for text clustering and classification

Journal Of Big Data ◽

10.1186/s40537-020-00344-3 ◽

2020 ◽

Vol 7 (1) ◽

Cited By ~ 1

Author(s):

Ali A. Amer ◽

Hassan I. Abdalla

Keyword(s):

Set Theory ◽

Similarity Measure ◽

Similarity Measures ◽

Text Clustering ◽

Plagiarism Detection ◽

K Nearest Neighbor ◽

Single Measure ◽

Highly Effective ◽

Clustering And Classification ◽

Effectiveness And Efficiency

Abstract. Similarity measures have long been used in information retrieval and machine learning for multiple purposes, including text retrieval, text clustering, text summarization, plagiarism detection, and several other text-processing applications. However, until recently no single measure has been recorded as both highly effective and efficient. The quest for an efficient and effective similarity measure therefore remains an open challenge. This study introduces a new, highly effective and time-efficient similarity measure for text clustering and classification. It also aims to provide a comprehensive examination of seven of the most widely used similarity measures, mainly with respect to their effectiveness and efficiency. All similarity measures are examined in detail using the K-nearest neighbor algorithm (KNN) for classification, the K-means algorithm for clustering, and the bag-of-words (BoW) model for feature selection. The experimental evaluation was carried out on two of the most popular datasets, namely Reuters-21 and Web-KB. The results confirm that the proposed set theory-based similarity measure (STB-SM) significantly outperforms all state-of-the-art measures with regard to both effectiveness and efficiency.

Clustering and Classification of Manufacturing Enterprises Regarding Their Industry 4.0 Reshoring Incentives

Procedia Computer Science ◽

10.1016/j.procs.2021.01.292 ◽

2021 ◽

Vol 180 ◽

pp. 696-705

Author(s):

Petra Unterberger ◽

Julian M. Müller

Keyword(s):

Industry 4.0 ◽

Manufacturing Enterprises ◽

Clustering And Classification

Incorporating statistical clustering methods into mortality models to improve forecasting performances

Insurance Mathematics and Economics ◽

10.1016/j.insmatheco.2021.03.005 ◽

2021 ◽

Author(s):

Cary Chi-Liang Tsai ◽

Echo Sihan Cheng

Keyword(s):

Clustering Methods ◽

Mortality Models ◽

Statistical Clustering

Performance evaluation of ImmunoCAP® ISAC 112: a multi-site study

Clinical Chemistry and Laboratory Medicine (CCLM) ◽

10.1515/cclm-2016-0586 ◽

2017 ◽

Vol 55 (4) ◽

Cited By ~ 9

Author(s):

Marianne van Hage ◽

Peter Schmid-Grendelmeier ◽

Chrysanthi Skevaki ◽

Mario Plebani ◽

Walter Canonica ◽

...

Keyword(s):

South African ◽

Low Frequency ◽

Specific Ige ◽

High Concentration ◽

Serum Samples ◽

Site Analysis ◽

Calibration Sample ◽

Clustering And Classification ◽

Study Sites ◽

Positive Results

Abstract Background: After the re-introduction of ImmunoCAP Methods: The study was carried out at 22 European sites and one South African site. Microarrays from different batches, eight specific IgE (sIgE) positive serum samples, three sIgE negative serum samples and a calibration sample were sent to participating laboratories, where assays were performed according to the manufacturer’s instructions. Results: For both the negative and positive samples, results were consistent between sites, with a very low frequency of false positive results (0.014%). A similar pattern of results for each of the samples was observed across the 23 sites. Homogeneity analysis showed that all measurements for each sample were well clustered, indicating good reproducibility; unsupervised hierarchical clustering and classification via random forests showed clustering of identical samples independent of the assay site. Analysis of raw continuous data confirmed the good accuracy across the study sites; averaged, standardized, site-specific ISU-E values fell close to the center of the distribution of measurements from all sites. After outlier filtering, variability across the whole study was estimated at 25.5%, with values of 22%, 27.1% and 22.4% for the ‘Low’, ‘Moderate to High’ and ‘Very High’ concentration categories, respectively. Conclusions: The study shows a robust performance of the ImmunoCAP

Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

Atmospheric Measurement Techniques ◽

10.5194/amt-9-4425-2016 ◽

2016 ◽

Vol 9 (9) ◽

pp. 4425-4445 ◽

Cited By ~ 31

Author(s):

Nikola Besic ◽

Jordi Figueras i Ventura ◽

Jacopo Grazioli ◽

Marco Gabella ◽

Urs Germann ◽

...

Keyword(s):

Polarimetric Radar ◽

Statistical Framework ◽

X Band ◽

Different Types ◽

Unsupervised Approach ◽

Radar Measurements ◽

Supervised Classification Methods ◽

Operational Potential ◽

Statistical Clustering

Abstract. Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behaviour of different hydrometeor types: the results of the classification rely largely upon the quality of scattering simulations. The unsupervised approach, in turn, lacks constraints related to hydrometeor microphysics. The proposed method aims to compensate for these drawbacks by combining the two approaches so that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. Beyond this comparison, a routine alters the content of clusters by encouraging further statistical clustering in case of non-identification. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids, which are then employed in operational labelling of different hydrometeors. The method has been applied to three C-band datasets, each acquired by a different operational radar from the MeteoSwiss Rad4Alp network, as well as to two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.

Application of statistical clustering to mathematical description and control of continuous processes with discrete event output

Conference Record of the 1992 IEEE Industry Applications Society Annual Meeting ◽

10.1109/ias.1992.244232 ◽

2003 ◽

Author(s):

V.A. Skormin

Keyword(s):

Mathematical Description ◽

Discrete Event ◽

Continuous Processes ◽

And Control ◽

Statistical Clustering

A Generalization of Possibilistic Fuzzy C-Means Method for Statistical Clustering of Data
