An Improved K-Means Algorithm Based on Evidence Distance

Ailin Zhu; Zexi Hua; Yu Shi; Yongchuan Tang; Lingwei Miao

doi:10.3390/e23111550

Entropy ◽

10.3390/e23111550 ◽

2021 ◽

Vol 23 (11) ◽

pp. 1550

Author(s):

Ailin Zhu ◽

Zexi Hua ◽

Yu Shi ◽

Yongchuan Tang ◽

Lingwei Miao

Keyword(s):

Euclidean Distance ◽

Gaussian Mixture ◽

Optimal Solutions ◽

Basic Probability ◽

Clustering Center ◽

Clustering Effect ◽

Sample Points ◽

Distance Parameter ◽

Selection Of ◽

Experimental Comparisons

The clustering quality of the k-means algorithm depends mainly on the selection of the initial clustering centers and on the distance measure between sample points. The traditional k-means algorithm measures the distance between sample points with the Euclidean distance, so it differentiates poorly between the attributes of sample points and is prone to local optimal solutions. To address this problem, this paper proposes an improved k-means algorithm based on evidence distance. First, the attribute values of sample points are modelled as the basic probability assignment (BPA) of the sample points. Then, the traditional Euclidean distance is replaced by the evidence distance for measuring the distance between sample points. Finally, k-means clustering is carried out on UCI data. Experimental comparisons are made with the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model. The experimental results show that the proposed algorithm achieves a better clustering effect and better convergence.
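The abstract does not say which evidence distance is used. As a minimal sketch, assuming the widely used Jousselme distance between two BPAs, d(m1, m2) = sqrt(0.5 (m1 - m2)^T D (m1 - m2)) with D(A, B) = |A ∩ B| / |A ∪ B|, the distance computation could look as follows. The function names and the dict-of-frozensets representation of a BPA are illustrative choices, not taken from the paper.

```python
from itertools import product
from math import sqrt


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two focal elements."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def evidence_distance(m1, m2):
    """Jousselme distance between two BPAs (assumed form of the evidence distance).

    Each BPA is a dict mapping a frozenset (focal element) to its mass.
    """
    focal = set(m1) | set(m2)
    # Mass difference on every focal element that appears in either BPA.
    diff = {s: m1.get(s, 0.0) - m2.get(s, 0.0) for s in focal}
    # Quadratic form (m1 - m2)^T D (m1 - m2), with D given by the Jaccard index.
    quad = sum(diff[a] * diff[b] * jaccard(a, b)
               for a, b in product(focal, repeat=2))
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(0.5 * quad, 0.0))


m_a = {frozenset({"a"}): 1.0}
m_b = {frozenset({"b"}): 1.0}
m_mixed = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.5}
print(evidence_distance(m_a, m_b))      # fully conflicting singletons
print(evidence_distance(m_a, m_mixed))  # partially overlapping evidence
```

In a k-means loop, a function of this kind would take the place of the Euclidean distance in the assignment step; how the paper builds BPAs from attribute values and updates cluster centers is not described in the abstract and is not sketched here.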

An Improved Initial Clustering Center Selection Method for K-Means Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1022.337 ◽

2014 ◽

Vol 1022 ◽

pp. 337-340

Author(s):

Hong Bo Zhou ◽

Jun Tao Gao

Keyword(s):

Time Complexity ◽

Euclidean Distance ◽

Selection Method ◽

Maximum Distance ◽

Original Algorithm ◽

Clustering Center ◽

Clustering Effect ◽

Data Objects ◽

Improved Algorithm

Because the clustering result of the K-means algorithm is easily influenced by the initial clustering centers, an improved algorithm for selecting the initial clustering centers is presented. The algorithm first finds the maximum Euclidean distance within a cluster, then splits the cluster by using the two data objects with the maximum distance as new clustering centers, and repeats these steps until the specified number of clustering centers is obtained. Compared to the original algorithm, the improved algorithm solves the instability of the clustering effect caused by random initialization, and its time complexity is also lower.
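The splitting procedure described above can be sketched roughly as follows. This is one possible reading of the abstract, not the authors' implementation: the rule for choosing which cluster to split (the one with the largest internal distance), the nearest-center reassignment, and the names `farthest_pair` and `split_init_centers` are all assumptions. The brute-force pair search is quadratic and does not reflect the time-complexity claim.

```python
from itertools import combinations
from math import dist


def farthest_pair(points):
    """Return the two points with the largest Euclidean distance (brute force)."""
    return max(combinations(points, 2), key=lambda pq: dist(*pq))


def split_init_centers(points, k):
    """Pick up to k initial centers by repeatedly splitting the widest cluster."""
    points = [tuple(p) for p in points]
    # Each cluster is (center, members); the first center is only a placeholder.
    clusters = [(points[0], points)]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c[1]) >= 2]
        if not splittable:
            break  # fewer usable points than requested centers
        # Assumed rule: split the cluster whose farthest pair is farthest apart.
        target = max(splittable, key=lambda c: dist(*farthest_pair(c[1])))
        p, q = farthest_pair(target[1])
        # Reassign the members of the split cluster to the nearer new center.
        near_p = [x for x in target[1] if dist(x, p) <= dist(x, q)]
        near_q = [x for x in target[1] if dist(x, p) > dist(x, q)]
        clusters.remove(target)
        clusters += [(p, near_p), (q, near_q)]
    return [center for center, _ in clusters]


sample = [(0,), (1,), (2,), (10,), (11,), (30,)]
centers = split_init_centers(sample, 3)
print(centers)
```

Because no random draw is involved, repeated runs on the same data give the same centers, which matches the stability argument made in the abstract.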

THE STATIC OPTIMIZATION TASK OF OPTIMAL DESIGN OF NONLINEAR ELECTRONIC SCHEME

10.46813/2019-121-109 ◽

2019 ◽

pp. 109-115

Author(s):

Didmanidze Ibraim ◽

Donadze Mikheil

Keyword(s):

Optimal Design ◽

Optimality Criteria ◽

Pareto Optimal Solutions ◽

Optimal Solutions ◽

Design Task ◽

Optimization Task ◽

Minimum Capacity ◽

Optimal Values ◽

The Given ◽

Selection Of

The article deals with the important problem of selecting the elements of an electronic scheme of a given configuration such that certain requirements of the technical task are satisfied and, at the same time, the selected optimality criteria reach their extreme values. The given task has been solved by a method of one-criterion optimization, in particular the center of gravity method. To formalize the given scheme, we have compiled a mathematical model of optimization that considers the requirements of the technical task. The optimal design task of the presented electronic scheme was reduced to a task of multi-criteria optimization. The computational experiments yielded Pareto-optimal solutions, from which a compromise one was selected that corresponds to the minimum capacity required by the scheme. Using the optimal values of the resistors, we conducted a computerized analysis of the transient process of the given electronic scheme with the computer program Electronics Workbench.

Using a Genetic Algorithm for Selection of Starting Conditions for the EM Algorithm for Gaussian Mixture Models

Advances in Intelligent Systems and Computing - Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015 ◽

10.1007/978-3-319-26227-7_12 ◽

2016 ◽

pp. 125-134

Author(s):

Wojciech Kwedlo

Keyword(s):

Genetic Algorithm ◽

Em Algorithm ◽

Mixture Models ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

The Em Algorithm ◽

Starting Conditions ◽

Selection Of

Automatic selection of ROIs in functional imaging using Gaussian mixture models

Neuroscience Letters ◽

10.1016/j.neulet.2009.05.039 ◽

2009 ◽

Vol 460 (2) ◽

pp. 108-111 ◽

Cited By ~ 34

Author(s):

J.M. Górriz ◽

A. Lassl ◽

J. Ramírez ◽

D. Salas-Gonzalez ◽

C.G. Puntonet ◽

...

Keyword(s):

Mixture Models ◽

Functional Imaging ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Automatic Selection ◽

Selection Of

K-Means Clustering Algorithm Based on Prim Improvement

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.644-650.2063 ◽

2014 ◽

Vol 644-650 ◽

pp. 2063-2066

Author(s):

He Wei Zhang ◽

Lei Sun ◽

Hong Zhang

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Contrastive Analysis ◽

Greedy Strategy ◽

Classical Algorithm ◽

Sample Data ◽

Clustering Center ◽

Clustering Effect ◽

Implementation Steps

The K-means algorithm is the classical algorithm for solving clustering problems in data mining; when the sample data meet certain conditions, the clustering results are good. However, the algorithm is sensitive to the initial clustering centers, and the clustering results change with the choice and number of initial clustering centers. To address this shortcoming, this paper proposes a new algorithm that uses Prim's algorithm to select the initial clustering centers, details the basic idea of the algorithm, refines the specific methods and implementation steps, and finally uses a test for contrastive analysis. The results show that the improved K-means clustering algorithm does not need the initial clustering centers to be specified in advance and is not sensitive to abnormal values, while the greedy strategy yields a better clustering effect than the usual algorithms.

Query by Example of Audio Signals using Euclidean Distance Between Gaussian Mixture Models

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 ◽

10.1109/icassp.2007.366657 ◽

2007 ◽

Cited By ~ 24

Author(s):

Marko Helen ◽

Tuomas Virtanen

Keyword(s):

Mixture Models ◽

Euclidean Distance ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Audio Signals ◽

Query By Example

Characterization of Stylosanthes introductions by using seed protein patterns

Australian Journal of Agricultural Research ◽

10.1071/ar9750467 ◽

1975 ◽

Vol 26 (3) ◽

pp. 467 ◽

Cited By ~ 21

Author(s):

PJ Robinson ◽

RG Megarrity

Keyword(s):

Euclidean Distance ◽

Seed Protein ◽

Pattern Analysis ◽

Polyacrylamide Gel Electrophoresis ◽

Species Level ◽

Protein Patterns ◽

Data Set ◽

Distance Coefficient ◽

Selection Of

Seed protein patterns of 182 Stylosanthes accessions, representing 16 species and two hybrids, were obtained by polyacrylamide gel electrophoresis of crude extracts. All species could be recognized by examination of photographs and densitometer traces of the gels. Within the species capitata, guyanensis, hamata and viscosa considerable variation occurred, whilst the variation in humilis, scabra and fruticosa was not as great. Data from the densitometer traces were analysed by various methods of pattern analysis and the resulting classifications compared. A variance-standardized Euclidean distance coefficient was found to be the similarity measure of choice, whilst selection of fusion strategy was not as critical. Species relationships obtained by using the chemical data were not in agreement with the accepted taxonomic division of the genus into the sections Styposanthes and Stylosanthes. A classification based on the complete data set was compared with a working classification based on morphological and agronomic data, which is used in the agronomic assessment of the genus. Only within S. scabra did the two classifications conform. Morphological–agronomic (M–A) types within the species hamata and subsericea could be distinguished by the examination of the fine structure of the densitometer traces, whilst groups based on protein data in the species humilis, guyanensis, fruticosa and viscosa did not correspond with M–A groups. The application of seed protein patterns as a rapid and inexpensive means of identifying introductions of the genus at the species level, as well as characterizing types within certain species, is proposed.
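For readers unfamiliar with the similarity measure named above, a variance-standardized Euclidean distance divides each squared attribute difference by that attribute's variance over the data set before summing. The sketch below is a generic illustration under that definition; the use of the sample variance, the skipping of constant attributes, and the function name are assumptions, and the numbers are made up rather than taken from the study.

```python
from math import sqrt
from statistics import variance


def standardized_euclidean(x, y, data):
    """Euclidean distance with each attribute scaled by its sample variance."""
    # Variance of every attribute (column) over the whole data set.
    variances = [variance(column) for column in zip(*data)]
    total = 0.0
    for xi, yi, var in zip(x, y, variances):
        if var > 0:  # assumption: attributes with zero variance are ignored
            total += (xi - yi) ** 2 / var
    return sqrt(total)


# Made-up traces: the second attribute has a much larger spread than the first.
traces = [(0, 0), (2, 10), (4, 20)]
print(standardized_euclidean(traces[0], traces[2], traces))
```

Scaling by the variance keeps an attribute with a large numeric range from dominating the distance, which is the usual motivation for preferring this coefficient over the plain Euclidean distance.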

Multi-Objective Design and Selection of One Single Optimal Solution

Design Engineering ◽

10.1115/imece2004-60902 ◽

2004 ◽

Cited By ~ 2

Author(s):

F. Levi ◽

M. Gobbi ◽

M. Farina ◽

G. Mastinu

Keyword(s):

Optimal Solution ◽

Engineering Problem ◽

Design Solution ◽

Pareto Optimal ◽

Large Set ◽

Pareto Optimal Solutions ◽

Optimal Solutions ◽

Pareto Optimal Set ◽

Final Design ◽

Selection Of

In the paper, the problem of choosing a single final design solution among a large set of Pareto-optimal solutions is addressed. Two methods, the k-optimality approach and the more general k-ε-optimality method, are introduced. These two methods theoretically justify and mathematically define the designer’s tendency to choose solutions which are “in the middle” of the Pareto-optimal set. These two methods have been applied to the solution of a relatively simple engineering problem, i.e. the selection of the stiffness and damping of a passively suspended vehicle in order to get the best compromise between discomfort, road holding and working space. The final design solution, found by means of the k-ε-optimality approach, seems consistent with the solution selected by skilled suspension specialists. Finally, the k-optimality method has proved to be very effective also when applied to complex engineering problems. The optimization of the tyre/suspension system of a sports car has been formulated as a design problem with 18 objective functions. A large set of Pareto-optimal solutions has been computed. Again, the k-optimality approach has proved to be a useful tool for the selection of a fully satisfactory final design solution.

Mining Negative Comment Data of Microblog Based on Merge-AP

Mathematical Problems in Engineering ◽

10.1155/2020/9723780 ◽

2020 ◽

Vol 2020 ◽

pp. 1-7

Author(s):

Zhijun Chen ◽

Weijian Jin ◽

Shibiao Mu

Keyword(s):

Euclidean Distance ◽

Similarity Matrix ◽

Algorithm Evaluation ◽

Negative Comment ◽

Data Points ◽

Clustering Effect ◽

Merge Process

A new depiction method based on the merge-AP algorithm is proposed to effectively improve the mining accuracy of negative comment data on microblogs. In this method, we first employ the AP algorithm to analyze negative comment data on microblogs and calculate the similarity values and the similarity matrix of the data points by Euclidean distance. Then, we introduce a distance-based merge process to solve the problem of the poor clustering effect of the AP algorithm on datasets with a complex clustering structure. Finally, we compare and analyze the performance of the K-means, AP, and merge-AP algorithms on actual microblog data collected for algorithm evaluation. The results show that the merge-AP algorithm has good adaptability.
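The similarity-matrix step can be illustrated with a small sketch. It assumes the common affinity propagation (AP) convention of using the negative squared Euclidean distance as similarity and the median off-diagonal similarity as the diagonal "preference"; the abstract only says that Euclidean distance is used, so both choices, and the function name, are assumptions. The merge process itself is not sketched because the abstract does not describe it.

```python
from statistics import median


def similarity_matrix(points):
    """AP-style similarity matrix: s(i, j) = -||x_i - x_j||^2 (assumed convention)."""
    n = len(points)
    s = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
          for j in range(n)]
         for i in range(n)]
    # Assumed preference: median of all off-diagonal similarities.
    off_diagonal = [s[i][j] for i in range(n) for j in range(n) if i != j]
    preference = median(off_diagonal)
    for i in range(n):
        s[i][i] = preference
    return s


toy_points = [(0, 0), (3, 4), (6, 8)]
sim = similarity_matrix(toy_points)
for row in sim:
    print(row)
```

The diagonal preference influences how many exemplars AP tends to produce, so a different choice there changes the number of clusters found before any merge step.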

Hypercube-Based Crowding Differential Evolution with Neighborhood Mutation for Multimodal Optimization

International Journal of Swarm Intelligence Research ◽

10.4018/ijsir.2018040102 ◽

2018 ◽

Vol 9 (2) ◽

pp. 15-27

Author(s):

Haihuang Huang ◽

Liwei Jiang ◽

Xue Yu ◽

Dongqing Xie

Keyword(s):

Differential Evolution ◽

Euclidean Distance ◽

Optimization Problems ◽

Random Search ◽

Adaptive Method ◽

Radius Vector ◽

Multimodal Optimization ◽

Optimal Solutions ◽

Reasonable Range ◽

Neighborhood Mutation

In practice, multiple optimal solutions are often needed to provide alternative options for different occasions. Multimodal optimization, which seeks multiple optimal solutions of a given objective function simultaneously, is therefore both important and challenging. For solving multimodal optimization problems, various differential evolution (DE) algorithms with niching and neighborhood strategies have been developed. In this article, a hypercube-based crowding DE with neighborhood mutation is proposed for such problems as well. It is characterized by the use of hypercube-based neighborhoods instead of Euclidean-distance-based neighborhoods or other simpler neighborhoods. Moreover, a self-adaptive method is adopted to control the radius vector of a hypercube so as to keep the neighborhood size within a reasonable range. In this way, the algorithm performs a more accurate search in sub-regions with dense individuals, but a random search in sub-regions with only sparse individuals. Experiments are conducted in comparison with an outstanding DE with neighborhood mutation, namely NCDE. The results show that the proposed algorithm is promising and computationally inexpensive.
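To make the contrast with Euclidean-distance-based neighborhoods concrete, a hypercube neighborhood can be read as an axis-aligned box around an individual: another individual is a neighbor when it lies within a per-dimension radius on every coordinate. The sketch below shows only this membership test under that reading; the function name and the fixed radius vector are assumptions, and the self-adaptive radius control from the abstract is not reproduced.

```python
def hypercube_neighbors(population, index, radius):
    """Indices of individuals inside the axis-aligned box around population[index].

    radius holds one half-width per dimension (the assumed "radius vector").
    """
    center = population[index]
    return [
        i
        for i, individual in enumerate(population)
        if i != index
        and all(abs(x - c) <= r for x, c, r in zip(individual, center, radius))
    ]


population = [(0.0, 0.0), (0.5, 0.2), (2.0, 0.1), (0.3, 3.0)]
print(hypercube_neighbors(population, 0, (1.0, 1.0)))
```

The test needs only coordinate-wise comparisons, with no squares or square roots, which is consistent with the abstract's remark that the approach is computationally inexpensive.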
