Balanced Data Clustering Algorithm for Both Hard and Soft Clustering

For clustering two-dimensional spatial data, Adaptive Resonance Theory (ART) not only suffers from pattern drift and loss of vector modulus information, but also adapts poorly to irregularly distributed spatial data. To address this, a Tree-ART2 (TART2) network model is proposed. It retains the memory of the old model, which maintains the spatial distance constraint, by learning and adjusting the LTM pattern and the amplitude information of the vector. Introducing a tree structure into the model also reduces the subjective dependence on the vigilance parameter and decreases the occurrence of pattern mixing. Comparative experiments show that the TART2 network has higher plasticity and adaptability.


Improved Fuzzy C-Means Clustering for Transformer Fault Diagnosis Using Dissolved Gas Analysis Data

Energies ◽

10.3390/en11092344 ◽

2018 ◽

Vol 11 (9) ◽

pp. 2344 ◽

Cited By ~ 6

Author(s):

Enwen Li ◽

Linong Wang ◽

Bin Song ◽

Siliang Jian

Keyword(s):

Fault Diagnosis ◽

Membership Function ◽

Data Clustering ◽

Clustering Algorithm ◽

Gas Analysis ◽

Dissolved Gas ◽

Fuzzy C Means ◽

Dissolved Gas Analysis ◽

Fcm Clustering ◽

Transformer Fault

Dissolved gas analysis (DGA) of transformer oil enables transformer fault diagnosis and status monitoring. Fuzzy c-means (FCM) clustering is an effective pattern recognition method, but it exhibits poor clustering accuracy on dissolved gas data and usually fails to classify transformer faults correctly. The existing feasible approach combines the FCM clustering algorithm with other intelligent algorithms, such as neural networks and support vector machines. This enables good classification, but greatly increases algorithm complexity. In this paper, the FCM clustering algorithm itself is improved and applied to the clustering analysis of DGA data. First, the non-monotonicity of the traditional clustering membership function with respect to sample distance, and its several local extrema, are discussed; these largely explain the poor classification accuracy of DGA data clustering. Then, an exponential form of the membership function is proposed that is monotonic with respect to distance, thereby improving dissolved gas data clustering. A similarity function to determine the degree of membership is also derived. Test results on large datasets show that the improved clustering algorithm can be successfully applied to DGA-based transformer fault detection.
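
The non-monotonicity mentioned in this abstract can be seen in a small sketch of the standard FCM membership function, u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)). The function name `fcm_memberships`, the 1-D centers, and the sample positions are assumptions made for illustration only. The paper's exponential membership function is not given in the abstract and is not reproduced here.

```python
def fcm_memberships(x, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of a scalar sample x in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with d_i = |x - centers[i]|.
    Distances are clamped at eps so a sample lying on a center
    does not cause a division by zero.
    """
    dists = [max(abs(x - c), eps) for c in centers]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


centers = [0.0, 1.0]  # toy 1-D cluster centers (hypothetical, not DGA data)
for x in (0.0, 0.5, 1.0, 10.0):
    u0 = fcm_memberships(x, centers)[0]
    dist0 = abs(x - centers[0])
    print(f"x={x:5.1f}  distance to center 0 = {dist0:5.1f}  u_0 = {u0:.4f}")
```

With these toy centers, the membership in cluster 0 falls to about 0 at x = 1 and rises again to about 0.45 at x = 10, although x = 10 is much farther from center 0. Membership is therefore not a monotonic function of distance.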


Genetic Algorithm Based Parallel K-Means Data Clustering Algorithm Using MapReduce Programming Paradigm on Hadoop Environment (GAPKCA)

Advances in Intelligent Systems and Computing - Recent Advances on Soft Computing and Data Mining ◽

10.1007/978-3-030-36056-6_10 ◽

2019 ◽

pp. 98-108 ◽

Cited By ~ 1

Author(s):

Sayer Alshammari ◽

Maslina Binti Zolkepli ◽

Rusli Bin Abdullah

Keyword(s):

Genetic Algorithm ◽

Data Clustering ◽

Clustering Algorithm ◽

Programming Paradigm


K-Means Clustering Algorithm for Classifying Data on Crime-Prone Areas (Mentawai Islands Police) [Algoritma K-Means Clustering dalam Mengklasifikasi Data Daerah Rawan Tindak Kriminalitas (Polres Kepulauan Mentawai)]

Jurnal Sistim Informasi dan Teknologi ◽

10.37034/jsisfotek.v3i4.179 ◽

2021 ◽

pp. 243-248

Author(s):

Yoni Aswan ◽

Sarjon Defit ◽

Gunadi Widi Nurcahyo

Keyword(s):

Data Clustering ◽

Clustering Algorithm ◽

Motor Vehicle ◽

Motor Vehicles ◽

Hierarchical Data ◽

Motor Vehicle Theft ◽

Vehicle Theft ◽

Mentawai Islands ◽

Or Groups ◽

Different Characteristics

Crime comprises all acts that are economically and psychologically harmful and that violate the laws in force in Indonesia as well as social and religious norms. Criminal acts affect the security of the community and threaten its physical and psychological well-being. The research site is the Mentawai Islands Police, the agency that provides security and protection for the community, particularly in the Mentawai Islands Regency. The problem is that it is difficult for the Mentawai Islands Police to classify areas as most vulnerable, moderately vulnerable, or not vulnerable to crime. The Mentawai Islands comprise four large islands divided into 10 sub-districts, where crime, such as motor vehicle theft, is increasing every year. Against this background, the researchers set out to build a system to predict the crime rate in the Mentawai Islands Regency in order to anticipate future surges in crime. The method used is the K-Means clustering algorithm, a non-hierarchical data clustering method that partitions existing data into one or more clusters or groups, so that data with the same characteristics are grouped into the same cluster and data with different characteristics are grouped into other clusters. Clustering is a data mining technique used to find groups of objects with common characteristics in sufficiently large data. The data used are motor vehicle theft cases for the 5 years from 2016 to 2020. The test results show that South Sipora District is an area prone to motor vehicle theft.
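
The partitioning step described in this abstract can be sketched as plain Lloyd-style K-Means. This is a minimal illustration, not the authors' system. The function name `kmeans`, the 2-D toy points, and the fixed initial centroids are all hypothetical; the toy points are not the Mentawai crime data.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on 2-D points; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    (
                        sum(p[0] for p in members) / len(members),
                        sum(p[1] for p in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Made-up points forming three visible groups (illustration only).
toy_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9), (15, 1), (16, 1), (15, 2)]
toy_labels, toy_centroids = kmeans(toy_points, [(1, 1), (8, 8), (15, 1)])
print(toy_labels)
```

Three initial centroids are used here to mirror the three vulnerability categories named in the abstract. How the study chose its initial centroids and features is not stated there.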


Development of a Data Clustering Algorithm for Predicting Heart

International Journal of Computer Applications ◽

10.5120/7358-0095 ◽

2012 ◽

Vol 48 (7) ◽

pp. 8-13 ◽

Cited By ~ 5

Author(s):

Bala Sundar V ◽

T Devi ◽

N Saravanan

Keyword(s):

Data Clustering ◽

Clustering Algorithm


New Data Clustering Algorithm Combined of Ant Colony Algorithm and Improved Fuzzy C-Means Algorithm

Proceedings of the 2016 International Conference on Communications, Information Management and Network Security ◽

10.2991/cimns-16.2016.56 ◽

2016 ◽

Author(s):

Zhiming Zhang ◽

Guobin Wu ◽

Jie Luo

Keyword(s):

Data Clustering ◽

Ant Colony Algorithm ◽

Clustering Algorithm ◽

Ant Colony ◽

Fuzzy C Means ◽

Fuzzy C Means Algorithm


Improved Bidirectional CABOSFV Based on Multi-Adjustment Clustering and Simulated Annealing

Cybernetics and Information Technologies ◽

10.1515/cait-2016-0075 ◽

2016 ◽

Vol 16 (6) ◽

pp. 27-42 ◽

Cited By ~ 1

Author(s):

Minghan Yang ◽

Xuedong Gao ◽

Ling Li

Keyword(s):

Simulated Annealing ◽

Data Clustering ◽

Time Complexity ◽

Clustering Algorithm ◽

Feature Vector ◽

Parameter Determination ◽

Data Sets ◽

Parameter Vector ◽

Clustering Validity

Although the Clustering Algorithm Based on Sparse Feature Vector (CABOSFV) and its related algorithms are efficient for clustering high-dimensional sparse data, they have several imperfections. Subjective parameter designation and the order sensitivity of the clustering process eventually worsen the time complexity and degrade the quality of the algorithm. This paper proposes a parameter adjustment method for Bidirectional CABOSFV for optimization purposes. By optimizing the Parameter Vector (PV) and Parameter Selection Vector (PSV) with clustering validity as the objective function, an improved Bidirectional CABOSFV algorithm using simulated annealing is proposed, which removes the need for initial parameter determination. Experiments on UCI data sets show that the proposed algorithm, which can perform multi-adjustment clustering, achieves higher accuracy than single-adjustment clustering, along with decreased time complexity over iterations.
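
The idea of tuning a parameter by simulated annealing against an objective can be sketched generically. This is not the paper's Bidirectional CABOSFV procedure. The function `simulated_annealing`, the stand-in objective `toy_objective` (which would be a negated clustering validity score in a setting like the paper's), the neighbor rule, and the cooling schedule are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(objective, start, neighbor, t0=1.0, cooling=0.95,
                        steps=500, seed=0):
    """Minimize objective by simulated annealing; returns (best_state, best_value)."""
    rng = random.Random(seed)
    current, current_val = start, objective(start)
    best, best_val = current, current_val
    temp = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        cand_val = objective(cand)
        delta = cand_val - current_val
        # Always accept improvements; accept a worse candidate
        # with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_val = cand, cand_val
            if current_val < best_val:
                best, best_val = current, current_val
        # Geometric cooling, floored to avoid dividing by zero.
        temp = max(temp * cooling, 1e-9)
    return best, best_val


def toy_objective(p):
    # Hypothetical stand-in for a negated clustering validity score.
    return (p - 0.3) ** 2


def toy_neighbor(p, rng):
    # Small random step, clipped to the interval [0, 1].
    return min(1.0, max(0.0, p + rng.uniform(-0.1, 0.1)))


best_p, best_score = simulated_annealing(toy_objective, 0.9, toy_neighbor)
print(best_p, best_score)
```

Because the search starts from an arbitrary value and keeps the best state seen, no carefully chosen initial parameter is needed, which is the property the abstract attributes to the annealing-based variant.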


A Research Roadmap of Big Data Clustering Algorithms for Future Internet of Things

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2019040102 ◽

2019 ◽

Vol 9 (2) ◽

pp. 16-30 ◽

Cited By ~ 1

Author(s):

Hind Bangui ◽

Mouzhi Ge ◽

Barbora Buhnova

Keyword(s):

Big Data ◽

Internet Of Things ◽

Mobile Networks ◽

Data Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Future Internet ◽

Research Challenges ◽

Initial Stage ◽

Big Data Technologies

Due to the massive increase in data in Internet of Things (IoT) domains such as healthcare IoT and Smart City IoT, Big Data technologies have emerged as critical tools for analyzing IoT data. Among these technologies, data clustering is one of the essential approaches to processing IoT data. However, how to select a suitable clustering algorithm for IoT data is still unclear. Furthermore, since Big Data technologies are still at an initial stage in different IoT domains, it is valuable to propose and structure the research challenges between Big Data and IoT. This article therefore starts by reviewing and comparing the data clustering algorithms that can be applied to IoT datasets, and then extends the discussion to a broader IoT context such as IoT dynamics and IoT mobile networks. Finally, the article identifies a set of research challenges that together form a research roadmap for Big Data research in IoT domains. The proposed roadmap aims to bridge the research gaps between Big Data and various IoT contexts.


EKEGWO: Enhanced Kernel-Based Exponential Grey Wolf Optimizer for Bi-Objective Data Clustering

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488519500296 ◽

2019 ◽

Vol 27 (04) ◽

pp. 669-688 ◽

Cited By ~ 1

Author(s):

Amolkumar Narayan Jadhav ◽

Gomathi N.

Keyword(s):

Data Clustering ◽

Clustering Algorithm ◽

Fitness Function ◽

Multidimensional Data ◽

Grey Wolf Optimizer ◽

Grey Wolf ◽

Widespread Application ◽

Clustering Techniques ◽

Cluster Distance ◽

F Measure

The widespread application of clustering in various fields has led to the development of different clustering techniques for partitioning multidimensional data into separable clusters. Although various clustering approaches are used in the literature, optimized clustering techniques that consider multiple objectives are rare. This paper proposes a novel data clustering algorithm, Enhanced Kernel-based Exponential Grey Wolf Optimization (EKEGWO), which handles two objectives. EKEGWO, an extension of KEGWO, adopts weighted exponential functions to improve the search process of clustering. Moreover, the fitness function of the algorithm includes the intra-cluster distance and the inter-cluster distance as objectives to provide an optimal selection of cluster centroids. The performance of the proposed technique is evaluated by comparison with the existing approaches PSC, mPSC, GWO, and EGWO on two datasets: banknote authentication and iris. Four metrics, Mean Square Error (MSE), F-measure, Rand coefficient, and Jaccard coefficient, estimate the clustering efficiency of the algorithm. The proposed EKEGWO algorithm attains an MSE of 837, an F-measure of 0.9657, a Rand coefficient of 0.8472, and a Jaccard coefficient of 0.7812 on the banknote dataset.
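
Two of the metrics named in this abstract, the Rand and Jaccard coefficients, are commonly computed by counting pairs of samples. The sketch below uses those common pair-counting definitions. The abstract does not state the paper's exact formulas, so treating them as identical is an assumption, and the function names and toy labels are hypothetical.

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Count sample pairs by whether they share a cluster in each labeling."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1  # together in both labelings
        elif same_pred:
            fp += 1  # together only in the predicted clustering
        elif same_true:
            fn += 1  # together only in the reference labeling
        else:
            tn += 1  # apart in both labelings
    return tp, fp, fn, tn


def rand_coefficient(true_labels, pred_labels):
    """Fraction of pairs on which the two labelings agree (needs >= 2 samples)."""
    tp, fp, fn, tn = pair_counts(true_labels, pred_labels)
    return (tp + tn) / (tp + fp + fn + tn)


def jaccard_coefficient(true_labels, pred_labels):
    """Agreeing 'together' pairs over all pairs placed together by either labeling."""
    tp, fp, fn, _ = pair_counts(true_labels, pred_labels)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # no pair is ever together: full agreement


# Toy labels, invented for illustration.
reference = [0, 0, 1, 1]
predicted = [0, 1, 1, 1]
print(rand_coefficient(reference, predicted), jaccard_coefficient(reference, predicted))
```

For these toy labels the six pairs split into 1 agreeing "together" pair, 2 agreeing "apart" pairs, and 3 disagreements, giving a Rand coefficient of 0.5 and a Jaccard coefficient of 0.25.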
