Balanced Data Clustering Algorithm for Both Hard and Soft Clustering

2018 ◽  
Vol 6 (2) ◽  
pp. 176-183
Author(s):  
Purnendu Das ◽  
Bishwa Ranjan Roy ◽  
Saptarshi Paul ◽  
...  
2014 ◽  
Vol 543-547 ◽  
pp. 1934-1938
Author(s):  
Ming Xiao

For clustering two-dimensional spatial data, Adaptive Resonance Theory (ART) not only suffers from pattern drift and the loss of vector magnitude information, but also adapts poorly to irregularly distributed spatial data. To address this, a Tree-ART2 (TART2) network model is proposed. It retains the memory of old patterns while maintaining the spatial distance constraint by learning and adjusting both the LTM pattern and the amplitude information of the vector. Introducing a tree structure into the model also reduces the dependence on a subjectively chosen vigilance parameter and decreases the occurrence of pattern mixing. Comparative experiments show that the TART2 network has higher plasticity and adaptability.


Energies ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 2344 ◽  
Author(s):  
Enwen Li ◽  
Linong Wang ◽  
Bin Song ◽  
Siliang Jian

Dissolved gas analysis (DGA) of transformer oil enables fault diagnosis and status monitoring. Fuzzy c-means (FCM) clustering is an effective pattern recognition method, but it clusters dissolved gas data with poor accuracy and usually fails to classify transformer faults correctly afterwards. The existing feasible approach combines the FCM clustering algorithm with other intelligent algorithms, such as neural networks and support vector machines. This enables good classification, but greatly increases algorithm complexity. In this paper, the FCM clustering algorithm itself is improved and clustering analysis of DGA data is realized. First, the non-monotonicity of the traditional clustering membership function with respect to the sample distance, and its several local extrema, are discussed; these largely explain the poor classification accuracy of DGA data clustering. Then, an exponential form of the membership function is proposed that is monotonic with respect to distance, thereby improving the clustering of dissolved gas data. Likewise, a similarity function to determine the degree of membership is derived. Test results for large datasets show that the improved clustering algorithm can be successfully applied to DGA-data-based transformer fault detection.
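
As a rough illustration of the membership functions discussed in this abstract, the sketch below computes the standard FCM membership of one sample from its distances to the cluster centres. It also computes one possible exponential-form membership for comparison. The function names, the parameter `beta`, and the exponential form itself (a softmax over negative squared distance) are assumptions made for illustration; the abstract does not give the paper's actual formula.

```python
import math

def fcm_memberships(distances, m=2.0):
    """Standard FCM membership of one sample, given its distance to each centre.

    m > 1 is the usual fuzzifier. The returned memberships sum to 1.
    """
    # A sample lying exactly on one or more centres belongs fully to them.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances)
            for d_i in distances]

def exp_memberships(distances, beta=1.0):
    """Hypothetical exponential-form membership (NOT the paper's formula).

    Each weight exp(-d^2 / beta) decreases monotonically with distance d;
    the weights are then normalised so that they sum to 1.
    """
    weights = [math.exp(-(d * d) / beta) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

# Toy distances from one sample to two cluster centres (made-up values).
standard = fcm_memberships([1.0, 3.0])
exponential = exp_memberships([1.0, 3.0], beta=2.0)
```

In both variants the closer centre receives the larger membership; how the paper's own exponential form is parameterised would have to be taken from the full text.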


Author(s):  
Yoni Aswan ◽  
Sarjon Defit ◽  
Gunadi Widi Nurcahyo

Crime covers all actions that cause economic or psychological harm and that violate the laws in force in Indonesia as well as social and religious norms. Criminal acts affect the security of the community and threaten people's physical and mental peace. The research site is the Mentawai Islands Police, the agency that provides security and protection for the community, especially in Mentawai Islands Regency. The problem is that the Mentawai Islands Police find it difficult to classify areas into the categories most vulnerable, moderately vulnerable and not vulnerable to crime. This is made harder by the geography of the Mentawai, four large islands comprising 10 sub-districts, where crime such as motor vehicle theft increases every year. Against this background, the researchers set out to build a system that predicts the crime rate in Mentawai Islands Regency so that future surges in crime can be anticipated. The method used is the K-Means clustering algorithm, a non-hierarchical clustering method that partitions the data into one or more clusters, so that data with the same characteristics are grouped into the same cluster and data with different characteristics are grouped into other clusters. Clustering is a data mining technique used to find groups of objects with common characteristics in sufficiently large data. The data used are cases of motor vehicle theft over the five years from 2016 to 2020. The test results show that South Sipora District is an area prone to motor vehicle theft.
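
The partitioning step described in this abstract can be sketched as a minimal K-Means (Lloyd's algorithm) on one-dimensional data. The function name `kmeans_1d`, the deterministic initialisation, and the case counts are all assumptions made for illustration; they are made-up numbers, not the study's data, and the abstract does not say which features or initialisation the authors used.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar values (requires k >= 2).

    Returns (centroids, labels), where labels[i] is the cluster of values[i].
    """
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centroids
    # evenly across the sorted values (an illustrative choice).
    centroids = [float(ordered[(i * (len(ordered) - 1)) // (k - 1)])
                 for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

# Hypothetical theft-case counts for eight unnamed areas (made-up values).
case_counts = [2, 3, 4, 10, 11, 12, 25, 27]
centroids, labels = kmeans_1d(case_counts, k=3)
```

With three clusters, the cluster with the largest centroid would correspond to the "most vulnerable" category, the middle one to "moderately vulnerable", and the smallest to "not vulnerable".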


2012 ◽  
Vol 48 (7) ◽  
pp. 8-13 ◽  
Author(s):  
Bala Sundar V ◽  
T Devi ◽  
N Saravanan

2016 ◽  
Vol 16 (6) ◽  
pp. 27-42 ◽  
Author(s):  
Minghan Yang ◽  
Xuedong Gao ◽  
Ling Li

Although the Clustering Algorithm Based on Sparse Feature Vector (CABOSFV) and its related algorithms are efficient for clustering high-dimensional sparse data, they have several imperfections. Subjective parameter designation and sensitivity to the order of the clustering process degrade both the time complexity and the quality of the algorithm. This paper proposes a parameter adjustment method for Bidirectional CABOSFV for optimization purposes. By optimizing the Parameter Vector (PV) and the Parameter Selection Vector (PSV) with clustering validity as the objective function, an improved Bidirectional CABOSFV algorithm using simulated annealing is proposed, which removes the need to determine initial parameters. Experiments on UCI data sets show that the proposed algorithm, which can perform multi-adjustment clustering, achieves higher accuracy than single-adjustment clustering, along with a time complexity that decreases through iterations.


Author(s):  
Hind Bangui ◽  
Mouzhi Ge ◽  
Barbora Buhnova

Due to the massive increase in data across Internet of Things (IoT) domains such as healthcare IoT and Smart City IoT, Big Data technologies have emerged as critical tools for analyzing IoT data. Among the Big Data technologies, data clustering is one of the essential approaches to processing IoT data. However, how to select a suitable clustering algorithm for IoT data is still unclear. Furthermore, since Big Data technologies are still at an early stage in many IoT domains, it is valuable to identify and structure the research challenges between Big Data and IoT. Therefore, this article starts by reviewing and comparing the data clustering algorithms that can be applied to IoT datasets, and then extends the discussion to broader IoT contexts such as IoT dynamics and IoT mobile networks. Finally, this article identifies a set of research challenges that form a research roadmap for Big Data research in IoT domains. The proposed roadmap aims to bridge the research gaps between Big Data and various IoT contexts.


Author(s):  
Amolkumar Narayan Jadhav ◽  
Gomathi N.

The widespread application of clustering in various fields has led to the development of different clustering techniques for partitioning multidimensional data into separable clusters. Although various clustering approaches are used in the literature, optimized clustering techniques with multi-objective consideration are rare. This paper proposes a novel data clustering algorithm, Enhanced Kernel-based Exponential Grey Wolf Optimization (EKEGWO), which handles two objectives. EKEGWO, an extension of KEGWO, adopts weighted exponential functions to improve the search process of clustering. Moreover, the fitness function of the algorithm includes the intra-cluster distance and the inter-cluster distance as objectives to provide an optimum selection of cluster centroids. The performance of the proposed technique is evaluated by comparing it with the existing approaches PSC, mPSC, GWO, and EGWO on two datasets: banknote authentication and iris. Four metrics, Mean Square Error (MSE), F-measure, Rand coefficient and Jaccard coefficient, estimate the clustering efficiency of the algorithm. The proposed EKEGWO algorithm attains an MSE of 837, an F-measure of 0.9657, a Rand coefficient of 0.8472, and a Jaccard coefficient of 0.7812 on the banknote dataset.
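
Two of the evaluation metrics named in this abstract, the Rand and Jaccard coefficients, are commonly computed by counting pairs of samples. The sketch below uses the usual pair-counting definitions; the function names and the toy labels are assumptions for illustration, and the abstract does not state which exact variant the authors computed.

```python
from itertools import combinations

def pair_counts(truth, pred):
    """Classify every pair of samples by agreement in truth and prediction.

    Returns (ss, sd, ds, dd):
      ss: same class in truth, same cluster in pred
      sd: same class in truth, different clusters in pred
      ds: different classes in truth, same cluster in pred
      dd: different classes in truth, different clusters in pred
    """
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            ss += 1
        elif same_truth:
            sd += 1
        elif same_pred:
            ds += 1
        else:
            dd += 1
    return ss, sd, ds, dd

def rand_index(truth, pred):
    """Fraction of pairs on which truth and prediction agree."""
    ss, sd, ds, dd = pair_counts(truth, pred)
    return (ss + dd) / (ss + sd + ds + dd)

def jaccard_coefficient(truth, pred):
    """Like the Rand index, but ignoring pairs separated in both labelings.

    Undefined (division by zero) if no pair shares a class or a cluster.
    """
    ss, sd, ds, _ = pair_counts(truth, pred)
    return ss / (ss + sd + ds)

# Toy labels (made-up): the predicted clustering splits the second class.
truth = [0, 0, 1, 1]
pred = [0, 0, 1, 2]
ri = rand_index(truth, pred)
jc = jaccard_coefficient(truth, pred)
```

For these toy labels one of the six pairs is clustered incorrectly, so the Rand coefficient is 5/6 and the Jaccard coefficient is 1/2. Both reach 1 only when the clustering matches the reference labels exactly, up to renaming of clusters.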

