A Comparative Study of Clustering Methods for Relevant Gene Selection in Microarray Data

Author(s): Manju Sardana, R. K. Agrawal
2019, Vol 9 (6), pp. 1294-1300
Author(s): A. Sampathkumar, P. Vivekanandan

Bioinformatics research has generated a large volume of genetic data, driven in part by the availability of higher-throughput devices at lower cost. Handling data at this scale makes selecting the relevant disease-causing genes extremely challenging. Microarray technology improves the prospects for cancer diagnosis by measuring the expression levels of many genes simultaneously. Selecting relevant genes with classifiers for the analysis of gene expression data is a complicated process, and correctly identifying genes from gene expression datasets plays a vital role in improving classification accuracy. This article discusses in detail the identification of highly relevant genes from gene expression data for cancer treatment. A modified meta-heuristic approach, known as 'parallel lion optimization' (PLOA), is used to select genes from microarray data that can classify various cancer sub-types more accurately. The experimental results show that PLOA outperforms LOA and other well-known approaches on five benchmark cancer gene expression datasets. It returns 99% classification accuracy for the Prostate, Lung, Leukemia and Central Nervous System (CNS) datasets using the top 200 genes. For the Prostate and Lymphoma datasets, the reported PLOA figures are 99.19% and 99.93%, respectively. Compared with the other algorithms, the proposed algorithm achieves higher accuracy in gene selection.


Author(s): Miguel Reboiro-Jato, Daniel Glez-Peña, Juan Francisco Gálvez, Rosalía Laza Fidalgo, Fernando Díaz, ...

2017, Vol 12 (3), pp. 202-212
Author(s): Tham W. Shi, Wong S. Kah, Mohd S. Mohamad, Kohbalan Moorthy, Safaai Deris, ...

2008, Vol 06 (02), pp. 261-282
Author(s): AO YUAN, WENQING HE

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data-generating mechanism, the parametric methods perform well, but not when there is a nonnegligible deviation between them. The nonparametric methods, on the other hand, usually make no distributional assumptions; they are robust but pay a price in lost efficiency. To use the known mixture form to increase efficiency while leaving the unknown subdistributions free of assumptions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach has the form of a parametric mixture, with no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
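
The EM, classification and BIC steps mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' semiparametric method. It assumes Gaussian subdistributions (a fully parametric mixture), one-dimensional data, a fixed number of EM iterations, and the standard BIC rather than the modified criterion the abstract mentions. The function names (`fit_gmm_1d`, `bic`, `classify`) are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posteriors(x, weights, means, sigmas):
    """Posterior probability of each mixture component given x."""
    p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
    total = sum(p)
    if total == 0.0:  # x is far from every component: fall back to uniform
        return [1.0 / len(p)] * len(p)
    return [pj / total for pj in p]


def mixture_pdf(x, weights, means, sigmas):
    """Density of the fitted mixture at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas))


def fit_gmm_1d(data, k, iters=200):
    """EM for a 1-D Gaussian mixture (assumption: Gaussian components).

    Returns (weights, means, sigmas, log_likelihood).
    """
    n = len(data)
    srt = sorted(data)
    # Initialise means at evenly spaced quantiles, sigmas at the overall spread.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = math.sqrt(sum((x - centre) ** 2 for x in data) / n)
    sigmas = [max(spread, 1e-3)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):  # fixed iteration count, no convergence test
        # E-step: posterior membership of every point.
        resp = [posteriors(x, weights, means, sigmas) for x in data]
        # M-step: re-estimate weight, mean and sigma of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse
    loglik = sum(
        math.log(max(mixture_pdf(x, weights, means, sigmas), 1e-300))
        for x in data
    )
    return weights, means, sigmas, loglik


def bic(loglik, k, n):
    """Standard BIC (not the paper's modified version); lower is better."""
    n_params = 3 * k - 1  # k means, k sigmas, k - 1 free weights
    return -2.0 * loglik + n_params * math.log(n)


def classify(data, weights, means, sigmas):
    """Classification step: assign each point to its most probable component."""
    labels = []
    for x in data:
        post = posteriors(x, weights, means, sigmas)
        labels.append(max(range(len(post)), key=post.__getitem__))
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for expression values: two well-separated groups.
    values = [rng.gauss(0.0, 1.0) for _ in range(100)]
    values += [rng.gauss(8.0, 1.0) for _ in range(100)]
    for k in (1, 2, 3):
        w, m, s, ll = fit_gmm_1d(values, k)
        print(k, round(bic(ll, k, len(values)), 1))
```

Running the script prints the BIC for one, two and three components on synthetic data; the number of clusters would be taken as the `k` with the lowest value, after which `classify` gives the hard partition.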


2011, Vol 7 (3), pp. 142-146
Author(s): Kohbalan Moorthy, Mohd Saberi Mohamad
