Markov Clustering: Latest Research Papers

Climatic clustering and longitudinal analysis with impacts on food, bioenergy, and pandemics

Predicted growth in world population will put unparalleled stress on the need for sustainable energy and global food production, as well as increase the likelihood of future pandemics. In this work, we identify high-resolution environmental zones in the context of a changing climate and predict longitudinal processes relevant to these challenges. We do this using exhaustive vector comparison methods that measure the climatic similarity between all locations on Earth at high geospatial resolution. The results are captured as networks, in which an edge between two geolocations is defined if their historical climates exceed a similarity threshold. We then apply Markov clustering and our novel Correlation of Correlations method to the resulting climatic networks, which provides unprecedented agglomerative and longitudinal views of climatic relationships across the globe. The computations performed here were the fastest (9.37 × 10^18 operations/sec) and among the largest (168.7 × 10^21 operations) scientific computations ever performed, with more than 100 quadrillion edges considered for a single climatic network. Correlation and network analysis methods of this kind are widely applicable across computational and predictive biology domains, including systems biology, ecology, carbon cycles, biogeochemistry, and zoonosis research.
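
The thresholded similarity network described in this abstract can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so Pearson correlation is an assumption here, and the function names (`pearson`, `similarity_network`) and the toy vectors are hypothetical. The sketch compares every pair of locations and keeps an edge only when the similarity exceeds a threshold.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def similarity_network(climate_vectors, threshold):
    """Return edges (a, b, score) whose similarity exceeds the threshold.

    climate_vectors maps a location name to its climate vector.
    Every pair is compared, so the cost grows quadratically with
    the number of locations.
    """
    edges = []
    for a, b in combinations(sorted(climate_vectors), 2):
        score = pearson(climate_vectors[a], climate_vectors[b])
        if score > threshold:
            edges.append((a, b, score))
    return edges


# Toy vectors, made up for illustration only.
toy = {
    "site_a": [1.0, 2.0, 3.0, 4.0],
    "site_b": [2.0, 4.0, 6.0, 8.0],  # rises together with site_a
    "site_c": [4.0, 3.0, 2.0, 1.0],  # falls while site_a rises
}
edges = similarity_network(toy, threshold=0.9)
```

With these toy vectors only the `site_a`/`site_b` pair survives the threshold. The all-pairs loop is what makes the comparison exhaustive, and it is also why the edge count grows so quickly at high geospatial resolution.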

Markov Clustering Algorithms and Their Application in Analysis of PPI Network of Malaria Genes

10.1109/idaacs53288.2021.9661009 ◽

2021 ◽

Author(s):

Mamata Das ◽

PJA Alphonse ◽

Selvakumar Kamalanathan

Keyword(s):

Clustering Algorithms ◽

Ppi Network ◽

Markov Clustering

Selecting Clustering Algorithms for IBD Mapping

10.1101/2021.08.11.456036 ◽

2021 ◽

Author(s):

Ruhollah Shemirani ◽

Gillian M Belbin ◽

Keith Burghardt ◽

Kristina Lerman ◽

Christy L Avery ◽

...

Keyword(s):

Statistical Power ◽

Large Scale ◽

Clustering Algorithms ◽

False Negative ◽

Chromosome 1 ◽

Detection Methods ◽

Scalable Clustering ◽

Markov Clustering ◽

Cluster Properties ◽

Greedy Methods

Background: Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks via a process called IBD mapping. Clustering algorithms play an important role in finding these groups. We set out to analyze the fitness of commonly used, fast and scalable clustering algorithms for IBD mapping applications. We designed a realistic benchmark for local IBD graphs and used it to compare clustering algorithms in terms of statistical power. We also investigated the effectiveness of common clustering metrics as replacements for statistical power. Results: We simulated 3.4 million clusters across 850 experiments with varying cluster counts, false-positive rates, and false-negative rates. The Infomap and Markov Clustering (MCL) community detection methods have high statistical power in most of the graphs, compared to greedy methods such as Louvain and Leiden. We demonstrate that standard clustering metrics, such as modularity, cannot predict the statistical power of algorithms in IBD mapping applications, though they can help with simulating realistic benchmarks. We extend our findings to real datasets by analyzing 3 populations in the Population Architecture using Genomics and Epidemiology (PAGE) Study with ~51,000 members and 2 million shared segments on Chromosome 1, resulting in the extraction of ~39 million local IBD clusters across the three populations. We used cluster properties derived in PAGE to increase the accuracy of our simulations and comparison. Conclusions: Markov Clustering produces a 30% increase in statistical power compared to the current state-of-the-art approach while reducing runtime by 3 orders of magnitude, making it computationally tractable in modern large-scale genetic datasets. We provide an efficient implementation to enable clustering at scale for IBD mapping and population-based linkage for various populations and scenarios.

Orthology clusters from gene trees with Possvm

10.1101/2021.05.03.442399 ◽

2021 ◽

Author(s):

Xavier Grau-Bové ◽

Arnau Sebé-Pedrós

Keyword(s):

Gene Family ◽

Phylogenetic Trees ◽

Clustering Algorithm ◽

Gene Tree ◽

Gene Family Evolution ◽

Gene Trees ◽

Orthologous Genes ◽

Gene Annotations ◽

Species Overlap ◽

Markov Clustering

Possvm (Phylogenetic Ortholog Sorting with Species oVerlap and MCL) is a tool that automates the process of classifying clusters of orthologous genes from precomputed phylogenetic trees. It identifies orthology relationships between genes using the species overlap algorithm to infer taxonomic information from the gene tree topology, and then uses the Markov Clustering Algorithm (MCL) to identify orthology clusters and provide annotated gene family classifications. Our benchmarking shows that this approach, when provided with accurate phylogenies, is able to identify manually curated orthogroups with high precision and recall. Overall, Possvm automates the routine process of gene tree inspection and annotation in a highly interpretable manner, and provides reusable outputs that can be used to obtain phylogeny-informed gene annotations and inform comparative genomics and gene family evolution analyses.
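
The species overlap step mentioned in this abstract can be sketched as follows. This is not Possvm's code: it assumes a binary gene tree written as nested tuples and a hypothetical leaf naming convention of `<species>_<gene id>`. At each internal node, if the two child subtrees share no species, the node is treated as a speciation and genes on opposite sides are called orthologs; if the species sets overlap, the node is treated as a duplication. The later MCL step that groups the orthology graph into clusters is omitted.

```python
from itertools import product


def leaves(tree):
    """Collect the leaf labels of a nested-tuple binary tree."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def species_of(gene):
    # Assumed naming convention: "<species>_<gene id>".
    return gene.split("_", 1)[0]


def orthologous_pairs(tree):
    """Return the gene pairs that are separated by a speciation node."""
    if isinstance(tree, str):
        return set()
    left, right = tree
    pairs = orthologous_pairs(left) | orthologous_pairs(right)
    left_genes, right_genes = leaves(left), leaves(right)
    left_species = {species_of(g) for g in left_genes}
    right_species = {species_of(g) for g in right_genes}
    if left_species.isdisjoint(right_species):
        # No shared species: speciation, so cross pairs are orthologs.
        pairs |= {tuple(sorted(p)) for p in product(left_genes, right_genes)}
    # Shared species: duplication, so cross pairs are paralogs (skipped).
    return pairs


# Toy tree: a duplication in the human/mouse ancestor, fly as outgroup.
toy_tree = ((("human_A1", "mouse_A1"), ("human_A2", "mouse_A2")), "fly_A")
pairs = orthologous_pairs(toy_tree)
```

In the toy tree, `human_A1` and `mouse_A1` are reported as orthologs, while `human_A1` and `mouse_A2` are not, because the node that separates them has human and mouse genes on both sides.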

Tracing Topic Transitions with Temporal Graph Clusters

The International FLAIRS Conference Proceedings ◽

10.32473/flairs.v34i1.128547 ◽

2021 ◽

Vol 34 (1) ◽

Author(s):

Xiaonan Jing ◽

Qingyuan Hu ◽

Yi Zhang ◽

Julia Taylor Rayz

Keyword(s):

Language Processing ◽

Data Stream ◽

Clustering Algorithm ◽

Twitter Data ◽

Markov Clustering ◽

Data Source ◽

Temporal Graph ◽

Node Removal ◽

Temporal Graphs ◽

Optimal Graph

Twitter serves as a data source for many Natural Language Processing (NLP) tasks. Identifying topics on Twitter can be challenging because of its continuously updating data stream. In this paper, we present an unsupervised graph-based framework to identify the evolution of sub-topics within two weeks of real-world Twitter data. We first employ a Markov Clustering Algorithm (MCL) with a node removal method to identify optimal graph clusters from temporal Graph-of-Words (GoW). Subsequently, we model the clustering transitions between the temporal graphs to identify the topic evolution. Finally, the transition flows generated from both the computational approach and human annotations are compared to ensure the validity of our framework.
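
One simple way to picture transitions between clusters of consecutive temporal graphs is to link clusters whose word sets overlap. The abstract does not say how the authors model transitions, so the Jaccard overlap, the `min_overlap` cutoff, and the function names below are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_transitions(clusters_t, clusters_next, min_overlap=0.3):
    """List (i, j, score) where cluster i at time t overlaps cluster j at t+1.

    min_overlap is a hypothetical cutoff, not a value from the paper.
    """
    transitions = []
    for i, old in enumerate(clusters_t):
        for j, new in enumerate(clusters_next):
            score = jaccard(old, new)
            if score >= min_overlap:
                transitions.append((i, j, score))
    return transitions


# Toy word clusters for two consecutive days (made up for illustration).
day1 = [{"vaccine", "dose", "trial"}, {"election", "vote", "poll"}]
day2 = [{"vaccine", "dose", "rollout"}, {"vote", "poll", "ballot", "count"}]
flows = cluster_transitions(day1, day2)
```

Here each day-1 cluster flows into the day-2 cluster that shares most of its words. A cluster with no link above the cutoff would correspond to a topic that ends or a new one that starts.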

Geometrical inspired pre-weighting enhances Markov clustering community detection in complex networks

Applied Network Science ◽

10.1007/s41109-021-00370-x ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Claudio Durán ◽

Alessandro Muscoloni ◽

Carlo Vittorio Cannistraci

Keyword(s):

Complex Networks ◽

Community Detection ◽

State Of The Art ◽

Similarity Measures ◽

Feature Space ◽

Ground Truth ◽

Recognition Algorithm ◽

Design Of Algorithms ◽

Art Methods ◽

Markov Clustering

Markov clustering is an effective unsupervised pattern recognition algorithm for data clustering in high-dimensional feature space. However, its community detection performance in complex networks has fallen well short of state-of-the-art methods such as Infomap and Louvain. The crucial issue is to convert the unweighted network topology into a ‘smart-enough’ pre-weighted connectivity that adequately steers the stochastic flow procedure behind Markov clustering. Here we introduce a conceptual innovation and discuss how to leverage notions of network latent geometry to design similarity measures for pre-weighting the adjacency matrix used in Markov clustering community detection. Our results demonstrate that the proposed strategy improves Markov clustering significantly, to the extent that it is often close to the performance of current state-of-the-art methods for community detection. These findings hold for both synthetic ‘realistic’ networks (with known ground-truth communities) and real networks (with community metadata), even when the real network connectivity is corrupted by noise artificially induced by missing or spurious links. Our study enhances the general understanding of how network geometry plays a fundamental role in the design of algorithms based on network navigability.
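
The idea of pre-weighting an adjacency matrix before running Markov clustering can be sketched in a few lines. The MCL loop below is the textbook expansion and inflation scheme. The `preweight` function, which adds the number of common neighbours to each edge, is only a stand-in assumption and is not one of the geometry-inspired measures proposed in the paper. The self-loop weight of 1.0 and the default parameters are likewise illustrative choices.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def preweight(adj):
    """Stand-in pre-weighting: 1 + number of common neighbours per edge."""
    n = len(adj)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                common = sum(1 for k in range(n) if adj[i][k] and adj[j][k])
                w[i][j] = 1.0 + common
    return w


def mcl(weights, inflation=2.0, iterations=50, tol=1e-9):
    """Markov clustering on a weighted adjacency matrix."""
    n = len(weights)
    # Add self-loops, then turn weights into transition probabilities.
    m = normalize_columns([[weights[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(  # inflation: strong flow is boosted
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each surviving (attractor) row lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edge_list = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edge_list:
    adj[a][b] = adj[b][a] = 1
clusters = mcl(preweight(adj))
```

On the toy graph the bridge edge 2-3 has no common neighbours, so it receives a lower weight than the triangle edges and carries less flow, and the two triangles come out as separate clusters.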

Application of soft regularized Markov clustering for analyzing protein-protein interaction in SARS-CoV-2 and other related coronavirus

Journal of Physics Conference Series ◽

10.1088/1742-6596/1722/1/012012 ◽

2021 ◽

Vol 1722 ◽

pp. 012012

Author(s):

S A Pratiwi ◽

A Bustamam ◽

D Sarwinda

Keyword(s):

Protein Interaction ◽

Protein Protein Interaction ◽

Markov Clustering

Applications of cuckoo search and ant lion optimization for analyzing protein-protein interaction through regularized Markov clustering on coronavirus

Journal of Physics Conference Series ◽

10.1088/1742-6596/1722/1/012008 ◽

2021 ◽

Vol 1722 ◽

pp. 012008

Author(s):

A Rizki ◽

A Bustamam ◽

D Sarwinda

Keyword(s):

Protein Interaction ◽

Cuckoo Search ◽

Protein Protein Interaction ◽

Ant Lion Optimization ◽

Markov Clustering ◽

Ant Lion

Markov Clustering: Recently Published Documents

A heterogeneous parallel implementation of the Markov clustering algorithm for large-scale biological networks on distributed CPU–GPU clusters

Binary Bat Algorithm for text feature selection in news events detection model using Markov clustering

Climatic clustering and longitudinal analysis with impacts on food, bioenergy, and pandemics

Markov Clustering Algorithms and Their Application in Analysis of PPI Network of Malaria Genes

Selecting Clustering Algorithms for IBD Mapping

Orthology clusters from gene trees with Possvm

Tracing Topic Transitions with Temporal Graph Clusters

Geometrical inspired pre-weighting enhances Markov clustering community detection in complex networks

Application of soft regularized Markov clustering for analyzing protein-protein interaction in SARS-CoV-2 and other related coronavirus

Applications of cuckoo search and ant lion optimization for analyzing protein-protein interaction through regularized Markov clustering on coronavirus
