attribute clustering Latest Research Papers

An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples

PeerJ Computer Science ◽ 10.7717/peerj-cs.671 ◽ 2021 ◽ Vol 7 ◽ pp. e671

Author(s): Shilpi Bose ◽ Chandra Das ◽ Abhik Banerjee ◽ Kuntal Ghosh ◽ Matangini Chattopadhyay ◽ ...

Keyword(s): Machine Learning ◽ Clustering Algorithm ◽ Class Imbalance ◽ Classification Model ◽ Machine Learning Techniques ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Ensemble Machine Learning ◽ Learning Techniques ◽ Attribute Clustering

Background: Machine learning is a machine intelligence technique that learns from data and detects inherent patterns in large, complex datasets. Because of this capability, machine learning techniques are widely used in medical applications, especially those involving large-scale genomic and proteomic data. Cancer classification based on bio-molecular profiling data is an important topic for medical applications, since it improves the diagnostic accuracy of cancer and supports successful cancer treatment. Hence, machine learning techniques are widely used in cancer detection and prognosis.

Methods: In this article, a new ensemble machine learning classification model, the Multiple Filtering and Supervised Attribute Clustering algorithm based Ensemble Classification model (MFSAC-EC), is proposed. It can handle both the class imbalance problem and the high dimensionality of microarray datasets. The model first generates a number of bootstrapped datasets from the original training data, applying an oversampling procedure to handle the class imbalance problem. The proposed MFSAC method is then applied to each of these bootstrapped datasets to generate sub-datasets, each of which contains a subset of the most relevant/informative attributes of the original dataset. The MFSAC method is a feature selection technique that combines multiple filters with a new supervised attribute clustering algorithm. A base classifier is then constructed separately for every sub-dataset, and the predictions of these base classifiers are combined by majority voting to form the MFSAC-based ensemble classifier. In addition, a number of the most informative attributes are selected as important features based on their frequency of occurrence in these sub-datasets.

Results: To assess the performance of the proposed MFSAC-EC model, it is applied to different high-dimensional microarray gene expression datasets for cancer sample classification. The proposed model is compared with well-known existing models to establish its effectiveness. The experimental results show that the generalization performance (testing accuracy) of the proposed classifier is significantly better than that of the other well-known existing models. The proposed model can also identify many important attributes/biomarker genes.
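The pipeline described in the abstract above (resampling, per-sample feature selection, one base classifier per sub-dataset, majority voting, feature frequency counting) can be sketched roughly as follows. This is a minimal illustration and not the authors' MFSAC-EC implementation: the class-mean-spread filter is a stand-in for the MFSAC feature selection step, the nearest-centroid rule is a stand-in base classifier, the stratified resampling merges the bootstrap and oversampling steps into one, and every function name and the toy data are hypothetical.

```python
import random
from collections import Counter

def stratified_bootstrap(X, y, rng):
    """Resample each class with replacement up to the majority-class size
    (assumed simplification of bootstrap + oversampling)."""
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)
    target = max(len(members) for members in by_class.values())
    idx = [rng.choice(members) for members in by_class.values() for _ in range(target)]
    return [X[i] for i in idx], [y[i] for i in idx]

def class_means(X, y, features):
    """Per-class mean of each listed feature."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def select_features(X, y, k):
    """Placeholder filter (NOT MFSAC): keep the k features whose class means differ most."""
    feats = list(range(len(X[0])))
    means = class_means(X, y, feats)
    spread = [max(m[f] for m in means.values()) - min(m[f] for m in means.values())
              for f in feats]
    return sorted(feats, key=lambda f: spread[f], reverse=True)[:k]

def train_ensemble(X, y, n_models=11, k=2, seed=0):
    """One (selected features, class centroids) pair per resampled dataset."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        Xb, yb = stratified_bootstrap(X, y, rng)
        feats = select_features(Xb, yb, k)
        models.append((feats, class_means(Xb, yb, feats)))
    return models

def predict(models, row):
    """Majority vote over nearest-centroid base classifiers."""
    votes = []
    for feats, centroids in models:
        votes.append(min(
            centroids,
            key=lambda c: sum((row[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        ))
    return Counter(votes).most_common(1)[0][0]

def feature_frequency(models):
    """How often each feature was selected across the sub-datasets."""
    return Counter(f for feats, _ in models for f in feats)

# Toy imbalanced data: features 0 and 1 separate the classes, 2 and 3 are noise.
X = [[0.0, 0.1, 5.0, 5.0], [0.2, 0.0, 5.1, 4.9], [0.1, 0.2, 4.9, 5.1],
     [0.0, 0.0, 5.0, 5.0], [0.1, 0.1, 5.0, 5.1], [0.2, 0.2, 5.1, 5.0],
     [9.9, 10.0, 5.0, 5.0], [10.1, 9.8, 5.1, 4.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
models = train_ensemble(X, y)
```

Calling `predict(models, row)` returns the voted class for a new sample, and `feature_frequency(models)` gives the selection counts from which frequently chosen attributes could be read off as candidate important features.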

The Electromagnetic Signal Track Correlation Algorithm based on Attribute Clustering and Space-time Constraints

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1961/1/012020 ◽ 2021 ◽ Vol 1961 (1) ◽ pp. 012020

Author(s): Bo Wu ◽ Jianan Wang ◽ Zhaojun Wang ◽ Jiecheng Yu ◽ Yue Gao ◽ ...

Keyword(s): Space Time ◽ Time Constraints ◽ Electromagnetic Signal ◽ Correlation Algorithm ◽ Track Correlation ◽ Attribute Clustering

Attribute Clustering

Encyclopedic Dictionary of Archaeology ◽ 10.1007/978-3-030-58292-0_10987 ◽ 2021 ◽ pp. 107-107

Keyword(s): Attribute Clustering

A study on Two-Stage Mixed Attribute Data Clustering Based on Density Peaks

The International Arab Journal of Information Technology ◽ 10.34028/iajit/18/5/2 ◽ 2021 ◽ Vol 18 (5)

Author(s): Shihua Liu ◽ Hao Zhang ◽ Xianghua Liu

Keyword(s): Data Clustering ◽ Clustering Algorithm ◽ Two Stage ◽ One Dimensional ◽ Attribute Data ◽ Numerical Attributes ◽ Density Peaks ◽ Density Peaks Clustering ◽ Categorical Attribute ◽ Attribute Clustering

A two-stage clustering framework and a clustering algorithm for mixed attribute data, based on density peaks and the Goodall distance, are proposed. First, the subset of numerical attributes of the dataset is clustered, and the result is mapped into a one-dimensional categorical attribute and added to the subset of categorical attributes. Then, the new dataset is clustered by the density peaks clustering algorithm to obtain the final result. Experiments on three commonly used UCI datasets show that this algorithm can effectively realize mixed attribute clustering and produces better clustering results than the traditional K-prototypes algorithm does. The clustering accuracy on the Acute, Heart and Credit datasets is on average 17%, 24%, and 21% higher than that of K-prototypes, respectively.
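The two-stage idea in the abstract above can be sketched as follows, under several assumptions that are not taken from the paper: stage one also uses density peaks (the abstract does not say which algorithm clusters the numerical attributes), a plain mismatch count (Hamming distance) stands in for the Goodall distance, density ties are broken by record index, and all names, cutoffs and the toy records are hypothetical.

```python
import math

def density_peaks(points, dist, k, dc):
    """Simplified density peaks clustering with cutoff dc and k centers."""
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Local density: number of other points within the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] <= dc) for i in range(n)]
    order = sorted(range(n), key=lambda i: (-rho[i], i))  # densest first, ties by index
    delta, parent = [0.0] * n, [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(d[i])  # densest point: distance to the farthest point
        else:
            # Nearest point among those ranked as denser.
            parent[i] = min(order[:pos], key=lambda j: d[i][j])
            delta[i] = d[i][parent[i]]
    # Centers: largest rho * delta; the stable sort keeps the densest point first.
    centers = sorted(order, key=lambda i: rho[i] * delta[i], reverse=True)[:k]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # every parent is labeled before its children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def euclidean(a, b):
    return math.dist(a, b)

def hamming(a, b):
    """Mismatch count; a stand-in for the Goodall distance, which is not implemented here."""
    return sum(x != y for x, y in zip(a, b))

def two_stage(numeric, categorical, k, dc_numeric, dc_mixed):
    # Stage 1: cluster the numerical attributes only.
    stage1 = density_peaks(numeric, euclidean, k, dc_numeric)
    # Map the stage-1 result to one extra categorical attribute and append it.
    combined = [tuple(cat) + (f"n{label}",) for cat, label in zip(categorical, stage1)]
    # Stage 2: cluster the all-categorical dataset.
    return density_peaks(combined, hamming, k, dc_mixed)

numeric = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
categorical = [("red", "s"), ("red", "s"), ("red", "m"),
               ("blue", "l"), ("blue", "l"), ("blue", "m")]
labels = two_stage(numeric, categorical, k=2, dc_numeric=1.0, dc_mixed=1)
```

On this toy input the first three records end up in one cluster and the last three in the other. Swapping `hamming` for a frequency-weighted categorical distance would bring the sketch closer to the approach the abstract describes.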

Feature Detection of Nanofibers Based on Nonlinear Mapping Pattern Recognition

Nanoscience and Nanotechnology Letters ◽ 10.1166/nnl.2020.3145 ◽ 2020 ◽ Vol 12 (4) ◽ pp. 506-511

Author(s): Min Sun ◽ Jiang Duan

Keyword(s): Pattern Recognition ◽ Association Rule ◽ Feature Detection ◽ Recognition Performance ◽ Spatial Spectrum ◽ Distribution Model ◽ Nonlinear Mapping ◽ Reconstruction Method ◽ Extraction Capacity ◽ Attribute Clustering

To enhance the feature extraction capacity for nanofibers, a feature detection method based on nonlinear mapping pattern recognition is proposed. A characteristic distribution model of nanofibers is constructed, and a spectral characteristic decomposition method is used to recognize the nonlinear mapping pattern of nanofibers at current density. Spatial spectrum beamforming of the nanofiber features is carried out using a cluster–cluster hybrid molecular reconstruction method, and association rule feature decomposition is carried out by recursive graph analysis, which realizes nonlinear mapping pattern recognition of nanofiber features. The nanofiber features are then classified and recognized with a correlation attribute clustering method, which optimizes feature detection. The proposed method has higher accuracy than other methods: its nonlinear mapping pattern recognition performs well, and it recognizes the crystal structure characteristics of nanofibers more accurately.

Mixed Attribute Clustering Algorithm Based on Filtering Mechanism

2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) ◽ 10.1109/cyberc.2019.00040 ◽ 2019

Author(s): Wenxin Wang ◽ Ruilin Zhang

Keyword(s): Clustering Algorithm ◽ Attribute Clustering

Clustering Approach toward Large Truck Crash Analysis

Transportation Research Record Journal of the Transportation Research Board ◽ 10.1177/0361198119839347 ◽ 2019 ◽ Vol 2673 (8) ◽ pp. 73-85 ◽ Cited By ~ 7

Author(s): Alireza Rahimi ◽ Ghazaleh Azimi ◽ Hamidreza Asgari ◽ Xia Jin

Keyword(s): Goodness Of Fit ◽ Crash Analysis ◽ Clustering Methods ◽ Crash Data ◽ Pseudo Likelihood ◽ Single Vehicle ◽ Clustering Approach ◽ Block Clustering ◽ Attribute Clustering ◽ High Dimensional Clustering

Heterogeneity of crash data masks the underlying crash patterns and complicates crash analysis. This paper explores an advanced high-dimensional clustering approach for investigating heterogeneity in large datasets. Detailed records of crashes involving large trucks in the state of Florida between 2007 and 2016 were examined to identify truck crash patterns and the significant conditions contributing to those patterns. The block clustering method was applied to more than 220,000 crash records with nearly 200 attributes. The analysis showed promising results in segmenting a large heterogeneous dataset into meaningful subgroups (with a 95.72% average degree of homogeneity for selected blocks). The goodness of fit of the clustering methods was evaluated, and both the integrated completed likelihood (ICL) and pseudo-likelihood values improved significantly (by 20.8% and 21.1%, respectively). Attribute clustering showed distinct characteristics for each cluster. Crash clustering revealed significant differences among the clusters and suggested that this crash dataset could be partitioned into same-direction, opposing-direction, and single-vehicle crashes. Individual blocks defined by both row and column clustering were further investigated to better understand the set of contributing conditions that lead to large truck crashes. Major features of each of the three major crash types were analyzed, which may provide additional insights for developing countermeasures and strategies that target specific segments. The clustering approach could be used as a pre-analysis method to identify homogeneous subgroups for further analysis, which will help enhance the effectiveness of safety programs.
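To make the notion of a block concrete: once rows (crashes) and columns (attributes) each carry a cluster label, every pair of row cluster and column cluster defines one block of the data matrix. The sketch below computes a per-block homogeneity score, defined here as the share of the most common value in the block. That definition is an assumption for illustration only; the abstract does not state how its degree of homogeneity is measured, and the matrix, labels and function name are hypothetical.

```python
from collections import Counter

def block_homogeneity(matrix, row_labels, col_labels):
    """Share of the most common value inside each (row cluster, column cluster) block."""
    blocks = {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            blocks.setdefault(key, Counter())[value] += 1
    return {key: max(c.values()) / sum(c.values()) for key, c in blocks.items()}

# Toy binary crash-by-attribute matrix with given row and column cluster labels.
crashes = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
row_labels = [0, 0, 0, 1]
col_labels = [0, 0, 1, 1]
scores = block_homogeneity(crashes, row_labels, col_labels)
```

Here `scores[(0, 0)]` covers the first three rows and first two columns, where five of six cells share the same value, while the other three blocks are uniform.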

A Novel Software Reliability Prediction Algorithm Using Fuzzy Attribute Clustering and Naïve Bayesian Classification

International Journal of Computer Sciences and Engineering ◽ 10.26438/ijcse/v7i2.7382 ◽ 2019 ◽ Vol 7 (2) ◽ pp. 73-82

Author(s): Neeta Rastogi ◽ Shishir Rastogi ◽ Manuj Darbari

Keyword(s): Software Reliability ◽ Bayesian Classification ◽ Reliability Prediction ◽ Prediction Algorithm ◽ Software Reliability Prediction ◽ Attribute Clustering

A component overlapping attribute clustering (COAC) algorithm for single-cell RNA sequencing data analysis and potential pathobiological implications

PLoS Computational Biology ◽ 10.1371/journal.pcbi.1006772 ◽ 2019 ◽ Vol 15 (2) ◽ pp. e1006772 ◽ Cited By ~ 5

Author(s): He Peng ◽ Xiangxiang Zeng ◽ Yadi Zhou ◽ Defu Zhang ◽ Ruth Nussinov ◽ ...

Keyword(s): Data Analysis ◽ Single Cell ◽ RNA Sequencing ◽ Sequencing Data ◽ Single Cell RNA Sequencing ◽ Attribute Clustering ◽ Sequencing Data Analysis

Entropy-based Attribute Clustering

2019 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON) ◽ 10.1109/ecti-ncon.2019.8692247 ◽ 2019

Author(s): Adison Khomprasert ◽ Thanawin Rakthamanon ◽ Kitsana Waiyamai

Keyword(s): Attribute Clustering
