Unsupervised Learning Through Generalized Mixture Model

Author(s):  
Samyajoy Pal ◽  
Christian Heumann

A generalized way of building mixture models using different distributions is explored in this article. The EM algorithm is used with some modifications to accommodate different distributions within the same model. The model uses any point estimate available for the respective distributions to estimate the mixture components and model parameters. The study focuses on the application of mixture models to unsupervised learning problems, especially cluster analysis. The convenience of building mixture models using the generalized approach is further emphasised with appropriate examples, exploiting the well-known maximum likelihood and Bayesian estimates of the parameters of the parent distributions.
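
As an illustration of the recipe this abstract describes, the sketch below runs EM on a two-component mixture whose components come from different families (normal and exponential), plugging each family's weighted closed-form point estimate into the M-step. The choice of families, the synthetic data, and all variable names are illustrative assumptions, not the authors' implementation.

```python
# EM for a mixture of two different families (normal + exponential).
# Everything here (families, data, names) is an illustrative assumption.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# synthetic data: one normal cluster plus an exponential cluster
x = np.concatenate([rng.normal(5.0, 1.0, 500), rng.exponential(1.0, 500)])

w = np.array([0.5, 0.5])                 # mixture weights
mu, sigma = x.mean(), x.std()            # normal component parameters
lam = 1.0 / x.mean()                     # exponential rate

for _ in range(200):
    # E-step: responsibilities under the current parameters
    dens = np.column_stack([
        stats.norm.pdf(x, mu, sigma),
        stats.expon.pdf(x, scale=1.0 / lam),
    ])
    r = w * dens
    r /= r.sum(axis=1, keepdims=True)

    # M-step: plug in each family's weighted point estimate
    w = r.mean(axis=0)
    mu = np.average(x, weights=r[:, 0])
    sigma = np.sqrt(np.average((x - mu) ** 2, weights=r[:, 0]))
    lam = r[:, 1].sum() / (r[:, 1] * x).sum()   # weighted MLE of the rate

print(w, mu, sigma, 1.0 / lam)
```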

2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Yupeng Li ◽  
Jianhua Zhang ◽  
Ruisi He ◽  
Lei Tian ◽  
Hewen Wei

In this paper, the Gaussian mixture model (GMM) is introduced for channel multipath clustering. For GMMs, the expectation-maximization (EM) algorithm is usually utilized to estimate the model parameters. However, the EM algorithm frequently converges to a local optimum. To address this issue, a hybrid differential evolution (DE) and EM (DE-EM) algorithm is proposed in this paper. To be specific, the DE is employed to initialize the GMM parameters. Then, the parameters are estimated with the EM algorithm. Thanks to the global searching ability of DE, the proposed hybrid DE-EM algorithm is more likely to reach the global optimum. Simulations demonstrate that our proposed DE-EM clustering algorithm can significantly improve the clustering performance.
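
A minimal sketch of the proposed pipeline on a one-dimensional two-component GMM: scipy's differential_evolution searches the likelihood surface globally to produce the initialization, and plain EM then refines it locally. The parameterization, bounds, and data are illustrative assumptions.

```python
# DE-then-EM: global search for a starting point, local EM refinement.
import numpy as np
from scipy import stats
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 0.5, 400), rng.normal(3, 1.0, 600)])

def neg_loglik(theta):
    w, m1, m2, s1, s2 = theta
    pdf = w * stats.norm.pdf(x, m1, s1) + (1 - w) * stats.norm.pdf(x, m2, s2)
    return -np.log(pdf + 1e-300).sum()

# DE provides the EM initialization (bounds are illustrative)
bounds = [(0.01, 0.99), (x.min(), x.max()), (x.min(), x.max()), (0.05, 5), (0.05, 5)]
w, m1, m2, s1, s2 = differential_evolution(neg_loglik, bounds, seed=1).x

for _ in range(100):                      # local EM refinement
    d1 = w * stats.norm.pdf(x, m1, s1)
    d2 = (1 - w) * stats.norm.pdf(x, m2, s2)
    r = d1 / (d1 + d2)
    w = r.mean()
    m1 = np.average(x, weights=r)
    m2 = np.average(x, weights=1 - r)
    s1 = np.sqrt(np.average((x - m1) ** 2, weights=r))
    s2 = np.sqrt(np.average((x - m2) ** 2, weights=1 - r))

print(w, m1, s1, m2, s2)
```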


2021 ◽  
Author(s):  
Masahiro Kuroda

Mixture models have become increasingly popular due to their modeling flexibility and are applied to the clustering and classification of heterogeneous data. The EM algorithm is widely used for the maximum likelihood estimation of mixture models because it is stable in convergence and simple to implement. Despite these advantages, its main drawbacks are convergence to local optima and a slow convergence rate. To avoid local convergence, multiple runs from several different initial values are usually used. The algorithm may then take a large number of iterations and a long computation time to find the maximum likelihood estimates. Speeding up the computation of the EM algorithm addresses these problems. We present algorithms that accelerate the convergence of the EM algorithm and apply them to mixture model estimation. Numerical experiments examine the performance of the acceleration algorithms in terms of the number of iterations and computation time.
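
To make the idea of acceleration concrete, the sketch below applies one well-known scheme (a SQUAREM-style extrapolation step) to the EM map of a two-component Gaussian mixture. The chapter discusses acceleration generically, so the specific scheme and all settings here are assumptions for illustration.

```python
# SQUAREM-style acceleration of the EM map for a 1-D two-component GMM.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(4, 1, 500)])

def em_step(theta):
    """One plain EM update; theta = (w, m1, m2, s1, s2)."""
    w, m1, m2, s1, s2 = theta
    d1 = w * stats.norm.pdf(x, m1, s1)
    d2 = (1 - w) * stats.norm.pdf(x, m2, s2)
    r = d1 / (d1 + d2)
    w = r.mean()
    m1 = np.average(x, weights=r)
    m2 = np.average(x, weights=1 - r)
    s1 = np.sqrt(np.average((x - m1) ** 2, weights=r))
    s2 = np.sqrt(np.average((x - m2) ** 2, weights=1 - r))
    return np.array([w, m1, m2, s1, s2])

theta = np.array([0.5, x.min(), x.max(), x.std(), x.std()])
for _ in range(50):
    t1 = em_step(theta)
    t2 = em_step(t1)
    r_ = t1 - theta                       # first EM increment
    v = (t2 - t1) - r_                    # change in the increment
    alpha = -np.linalg.norm(r_) / max(np.linalg.norm(v), 1e-12)
    th = theta - 2 * alpha * r_ + alpha ** 2 * v   # extrapolated point
    th[0] = np.clip(th[0], 0.01, 0.99)    # keep the weight valid
    th[3:] = np.maximum(th[3:], 1e-3)     # keep the scales positive
    theta = em_step(th)                   # stabilizing EM step

print(theta)
```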


Forests ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 1196
Author(s):  
Eric K. Zenner ◽  
Mahdi Teimouri

The creation and maintenance of complex forest structures have become an important forestry objective. Complex forest structures, often expressed in multimodal shapes of tree size/diameter (DBH) distributions, are challenging to model. Mixture probability density functions with two or three gamma, log-normal, or Weibull components offer a solution and can additionally provide insights into forest dynamics. Model parameters can be efficiently estimated with the maximum likelihood (ML) approach using iterative methods such as the Newton-Raphson (NR) algorithm. However, the NR algorithm is sensitive to the choice of initial values and does not always converge. As an alternative, we explored the use of the iterative expectation-maximization (EM) algorithm for estimating the parameters of the aforementioned mixture models because it always converges to ML estimators. Since forestry data frequently occur both in grouped (classified) and ungrouped (raw) forms, the EM algorithm was applied to explore the goodness-of-fit of the gamma, log-normal, and Weibull mixture distributions in three sample plots that exhibited irregular, multimodal, highly skewed, and heavy-tailed DBH distributions where some size classes were empty. The EM-based goodness-of-fit was further compared against a nonparametric kernel-based density estimation (NK) model and the recently popularized gamma-shaped mixture (GSM) models using the ungrouped data. In this example application, the EM algorithm provided well-fitting two- or three-component mixture models for all three model families. The number of components of the best-fitting models differed among the three sample plots (but not among model families), and the mixture models of the log-normal and gamma families provided a better fit than the Weibull distribution for grouped and ungrouped data. For ungrouped data, both log-normal and gamma mixture distributions outperformed the GSM model and, with the exception of the multimodal diameter distribution, also the NK model. The EM algorithm appears to be a promising tool for modeling complex forest structures.
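
Of the three families, the log-normal mixture has a particularly convenient EM, since on the log scale it reduces to a Gaussian mixture. The sketch below fits a two-component log-normal mixture to simulated diameters; the data and settings are hypothetical, not the plot data used in the study.

```python
# Two-component log-normal mixture for DBH data, fitted by EM on log(DBH).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
dbh = np.concatenate([rng.lognormal(2.2, 0.3, 300),    # small-tree mode
                      rng.lognormal(3.4, 0.2, 200)])   # large-tree mode
y = np.log(dbh)                                        # Gaussian on log scale

w, m1, m2, s1, s2 = 0.5, y.min(), y.max(), y.std(), y.std()
for _ in range(300):
    d1 = w * stats.norm.pdf(y, m1, s1)
    d2 = (1 - w) * stats.norm.pdf(y, m2, s2)
    r = d1 / (d1 + d2)                                 # E-step
    w = r.mean()                                       # M-step
    m1 = np.average(y, weights=r)
    m2 = np.average(y, weights=1 - r)
    s1 = np.sqrt(np.average((y - m1) ** 2, weights=r))
    s2 = np.sqrt(np.average((y - m2) ** 2, weights=1 - r))

# mixture log-likelihood back on the DBH scale, e.g. for model comparison
loglik = np.log(w * stats.lognorm.pdf(dbh, s1, scale=np.exp(m1))
                + (1 - w) * stats.lognorm.pdf(dbh, s2, scale=np.exp(m2))).sum()
print(w, np.exp(m1), np.exp(m2), loglik)
```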


2016 ◽  
Vol 46 (3) ◽  
pp. 779-799 ◽  
Author(s):  
Cuihong Yin ◽  
X. Sheldon Lin

The Erlang mixture model has been widely used in modeling insurance losses due to its desirable distributional properties. In this paper, we consider the problem of efficient estimation of the Erlang mixture model. We present a new thresholding penalty function and a corresponding EM algorithm to estimate model parameters and to determine the order of the mixture. Using simulation studies and a real data application, we demonstrate the efficiency of the EM algorithm.
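
A hedged sketch of the overall idea: EM for an Erlang mixture with a common scale, combined with a simple weight-thresholding rule that prunes negligible components and thereby selects the order. The paper's actual penalty function is more refined; the shapes, threshold, and data below are illustrative assumptions.

```python
# Erlang mixture EM (common scale) with naive weight-threshold pruning.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
losses = np.concatenate([rng.gamma(2, 1.0, 600), rng.gamma(9, 1.0, 400)])

shapes = np.arange(1, 16)                 # candidate Erlang shapes 1..15
w = np.full(len(shapes), 1.0 / len(shapes))
theta = losses.mean() / np.average(shapes, weights=w)

for _ in range(200):
    dens = np.column_stack([stats.gamma.pdf(losses, k, scale=theta)
                            for k in shapes])
    r = w * dens
    r /= r.sum(axis=1, keepdims=True)     # E-step responsibilities
    w = r.mean(axis=0)                    # M-step: weights
    theta = losses.sum() / (r @ shapes).sum()   # M-step: common scale

    keep = w > 0.01                       # prune negligible components
    shapes, w = shapes[keep], w[keep] / w[keep].sum()

print(dict(zip(shapes.tolist(), np.round(w, 3))), theta)
```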


2002 ◽  
Vol 14 (6) ◽  
pp. 1261-1266 ◽  
Author(s):  
Akihiro Minagawa ◽  
Norio Tagawa ◽  
Toshiyuki Tanaka

The expectation-maximization (EM) algorithm with split-and-merge operations (SMEM algorithm) proposed by Ueda, Nakano, Ghahramani, and Hinton (2000) is a nonlocal searching method, applicable to mixture models, for relaxing the local optimum property of the EM algorithm. In this article, we point out that the SMEM algorithm uses the acceptance-rejection evaluation method, which may pick up a distribution with smaller likelihood, and demonstrate that an increase in likelihood can then be guaranteed only by comparing log likelihoods.
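
The acceptance rule the article argues for can be stated in a few lines: after running partial EM on a split or merge candidate, keep the candidate only if its log likelihood strictly exceeds the current model's. The GMM log-likelihood helper and model representation below are illustrative assumptions.

```python
# Accept a split/merge candidate only on a strict log-likelihood gain.
import numpy as np
from scipy import stats

def gmm_loglik(x, w, mu, sigma):
    dens = np.column_stack([wk * stats.norm.pdf(x, mk, sk)
                            for wk, mk, sk in zip(w, mu, sigma)])
    return np.log(dens.sum(axis=1)).sum()

def accept_candidate(x, current, candidate):
    """Keep the candidate model only if its log likelihood improves."""
    return gmm_loglik(x, *candidate) > gmm_loglik(x, *current)

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(6, 1, 300)])
one_comp = ([1.0], [x.mean()], [x.std()])            # current model
split = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])         # split candidate
print(accept_candidate(x, one_comp, split))          # True for this data
```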


2012 ◽  
Vol 2012 ◽  
pp. 1-5
Author(s):  
Qihong Duan ◽  
Ying Wei ◽  
Xiang Chen

A parameter estimation problem for a backup system under condition-based maintenance is considered. We model a backup system by a hidden, three-state continuous-time Markov process. Data are obtained through condition monitoring at discrete time points. Maximum likelihood estimates of the model parameters are obtained using the EM algorithm. We establish conditions under which any sequence generated by the EM algorithm has at most one limit point in the parameter space.
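
A hedged sketch of the estimation setting: a hidden three-state chain observed through noisy condition monitoring at discrete time points, fitted by EM (here the Baum-Welch recursions). The paper works with a continuous-time process; this discrete-time analogue and all matrices are illustrative assumptions.

```python
# Baum-Welch EM for a three-state hidden Markov chain observed with noise.
import numpy as np

rng = np.random.default_rng(6)
A_true = np.array([[0.97, 0.02, 0.01],   # healthy -> degraded -> failed
                   [0.00, 0.95, 0.05],
                   [0.00, 0.00, 1.00]])
B_true = np.array([[0.9, 0.1, 0.0],      # P(monitor reading | true state)
                   [0.1, 0.8, 0.1],
                   [0.0, 0.1, 0.9]])

T, s = 300, 0                            # simulate one monitored trajectory
obs = np.empty(T, dtype=int)
for t in range(T):
    obs[t] = rng.choice(3, p=B_true[s])
    s = rng.choice(3, p=A_true[s])

pi = rng.dirichlet(np.ones(3))           # random starts break symmetry
A = rng.dirichlet(np.ones(3), size=3)
B = rng.dirichlet(np.ones(3), size=3)
for _ in range(100):
    # E-step: scaled forward-backward recursions
    alpha = np.empty((T, 3)); beta = np.empty((T, 3)); c = np.empty(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = alpha[:-1, :, None] * A[None] * (B[:, obs[1:]].T * beta[1:])[:, None, :]
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    # M-step: expected-count updates
    pi = gamma[0]
    A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    for k in range(3):
        B[:, k] = gamma[obs == k].sum(axis=0)
    B /= B.sum(axis=1, keepdims=True)

print(np.round(A, 2))
```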


2021 ◽  
Vol 8 (9) ◽  
pp. 275-277
Author(s):  
Ahsene Lanani

This paper deals with maximum likelihood estimation using the EM algorithm. This algorithm is widely used to solve nonlinear equations with missing data. We estimated the parameters of the linear mixed model and those of the variance-covariance matrix. The structure considered for this matrix is not necessarily linear. Keywords: EM algorithm; Maximum likelihood; Linear mixed model.
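
A minimal sketch of EM for a simple linear mixed model y = Xβ + Zb + e with a random intercept, iterating GLS for the fixed effects and closed-form updates for the two variance components. The paper considers a more general variance-covariance structure; the model and data below are illustrative assumptions.

```python
# EM for a random-intercept linear mixed model:
# y = X beta + Z b + e, b ~ N(0, sig_b2 I), e ~ N(0, sig_e2 I).
import numpy as np

rng = np.random.default_rng(7)
groups, per = 30, 10
n, q = groups * per, groups
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed effects
Z = np.kron(np.eye(groups), np.ones((per, 1)))          # random intercepts
y = X @ [1.0, 0.5] + Z @ rng.normal(0, 2.0, q) + rng.normal(0, 1.0, n)

sig_b2, sig_e2 = 1.0, 1.0
for _ in range(200):
    V = sig_b2 * (Z @ Z.T) + sig_e2 * np.eye(n)
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # GLS step
    resid = y - X @ beta
    b_hat = sig_b2 * Z.T @ Vinv @ resid                      # E[b | y]
    C = sig_b2 * np.eye(q) - sig_b2**2 * (Z.T @ Vinv @ Z)    # Var(b | y)
    e_hat = resid - Z @ b_hat
    sig_b2 = (b_hat @ b_hat + np.trace(C)) / q               # M-step
    sig_e2 = (e_hat @ e_hat + np.trace(Z @ C @ Z.T)) / n

print(beta, sig_b2, sig_e2)
```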


2012 ◽  
Vol 532-533 ◽  
pp. 1445-1449
Author(s):  
Ting Ting Tong ◽  
Zhen Hua Wu

The EM algorithm is a common method for estimating mixture model parameters in the statistical classification of remote sensing images. An EM algorithm based on fuzzification is presented in this paper, in which each training sample is represented by a fuzzy set. Through weighted degrees of membership, different samples exert different influence during the iterations, which decreases the impact of noise on parameter learning and increases the convergence rate of the algorithm. This improves the classification accuracy that can be achieved on the image data.
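
One plausible reading of the fuzzified E-step, sketched below: each training sample carries a membership weight in (0, 1], and the M-step weights its responsibility by that membership so that noisy samples pull the parameter estimates less. The membership rule and data are illustrative assumptions, not the authors' scheme.

```python
# Membership-weighted EM for a 1-D two-component Gaussian mixture.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = np.concatenate([rng.normal(0, 1, 400), rng.normal(5, 1, 400),
                    rng.uniform(-10, 15, 80)])           # 80 noise pixels
# memberships: downweight points far from the bulk (illustrative rule)
u = 1.0 / (1.0 + np.abs(x - np.median(x)) / (3 * np.std(x)))

w = 0.5
m1, m2 = np.percentile(x, 25), np.percentile(x, 75)
s1, s2 = x.std(), x.std()
for _ in range(200):
    d1 = w * stats.norm.pdf(x, m1, s1)
    d2 = (1 - w) * stats.norm.pdf(x, m2, s2)
    r = d1 / (d1 + d2)
    g1, g2 = u * r, u * (1 - r)          # membership-weighted responsibilities
    w = g1.sum() / (g1.sum() + g2.sum())
    m1 = np.average(x, weights=g1)
    m2 = np.average(x, weights=g2)
    s1 = np.sqrt(np.average((x - m1) ** 2, weights=g1))
    s2 = np.sqrt(np.average((x - m2) ** 2, weights=g2))

print(w, m1, s1, m2, s2)
```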

