scalable algorithms Latest Research Papers

Abstract: Online social networks provide a forum where people make new connections, learn more about the world, get exposed to different points of view, and access information that was previously inaccessible. It is natural to assume that content-delivery algorithms in social networks should not only aim to maximize user engagement but also offer opportunities for increasing connectivity and enabling social networks to achieve their full potential. Our aim is to develop methods that foster the creation of new connections and subsequently improve the flow of information in the network. To achieve this goal, we propose to leverage the strong triadic closure principle, and consider violations of this principle as opportunities for creating more social links. We formalize this idea as an algorithmic problem related to the densest k-subgraph problem. For this new problem, we establish hardness results and propose approximation algorithms. We identify two special cases of the problem that admit a constant-factor approximation. Finally, we experimentally evaluate our proposed algorithm on real-world social networks, and we additionally evaluate some simpler but more scalable algorithms.
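As a rough illustration of what a "violation" of triadic closure looks like, the sketch below lists open wedges: pairs of users who share a neighbour but are not yet linked. It is a minimal sketch under a simplifying assumption that every edge is a strong tie; the function name `open_wedges` and the toy graph are hypothetical, and the abstract's actual optimization problem (related to densest k-subgraph) is not reproduced here.

```python
from itertools import combinations


def open_wedges(edges):
    """Return sorted pairs (u, w) that share a neighbour but have no edge.

    Assumption (not from the abstract): every edge counts as a strong tie,
    so each open wedge is treated as a candidate for a new link.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    candidates = set()
    for neighbours in adj.values():
        # Every unordered pair of neighbours of the same node forms a wedge.
        for u, w in combinations(sorted(neighbours), 2):
            if w not in adj[u]:
                candidates.add((u, w))
    return sorted(candidates)


# Hypothetical toy network: b and c both know a and d, but not each other.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")]
print(open_wedges(edges))
```

Ranking or selecting among such candidates is where the hardness results mentioned in the abstract would come in; this sketch only enumerates them.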

Scalable Algorithms Using Sparse Storage for Parallel Spectral Clustering on GPU

Lecture Notes in Computer Science - Network and Parallel Computing ◽ 10.1007/978-3-030-93571-9_4 ◽ 2022 ◽ pp. 40-52

Author(s): Guanlin He ◽ Stephane Vialle ◽ Nicolas Sylvestre ◽ Marc Baboulin

Keyword(s): Spectral Clustering ◽ Scalable Algorithms

AOI-shapes: An Efficient Footprint Algorithm to Support Visualization of User-defined Urban Areas of Interest

ACM Transactions on Interactive Intelligent Systems ◽ 10.1145/3431817 ◽ 2021 ◽ Vol 11 (3-4) ◽ pp. 1-32

Author(s): Mingzhao Li ◽ Zhifeng Bao ◽ Farhana Choudhury ◽ Hanan Samet ◽ Matt Duckham ◽ ...

Keyword(s): Real Estate ◽ Real World ◽ Urban Areas ◽ Real Life ◽ Scalable Algorithms ◽ Interactive Query ◽ Boundary Information ◽ Real World Datasets ◽ Effective Visualization ◽ Areas Of Interest

Understanding urban areas of interest (AOIs) is essential in many real-life scenarios, and such AOIs can be computed based on the geographic points that satisfy user queries. In this article, we study the problem of efficient and effective visualization of user-defined urban AOIs in an interactive manner. In particular, we first define the problem of user-defined AOI visualization based on a real estate data visualization scenario, and we illustrate why a novel footprint method is needed to support the visualization. After extensively reviewing existing “footprint” methods, we propose a parameter-free footprint method, named AOI-shapes, to capture the boundary information of a user-defined urban AOI. Next, to allow interactive query refinements by the user, we propose two efficient and scalable algorithms to incrementally generate urban AOIs by reusing existing visualization results. Finally, we conduct extensive experiments with both synthetic and real-world datasets to demonstrate the quality and efficiency of the proposed methods.

Functional observability and target state estimation in large-scale networks

Proceedings of the National Academy of Sciences ◽ 10.1073/pnas.2113750119 ◽ 2021 ◽ Vol 119 (1) ◽ pp. e2113750119

Author(s): Arthur N. Montanari ◽ Chao Duan ◽ Luis A. Aguirre ◽ Adilson E. Motter

Keyword(s): Large Scale ◽ Phase Measurement ◽ Scale Up ◽ Measurement Data ◽ State Observer ◽ Sensor Nodes ◽ Scalable Algorithms ◽ Functional Observer ◽ Full State ◽ Large Scale Networks

The quantitative understanding and precise control of complex dynamical systems can only be achieved by observing their internal states via measurement and/or estimation. In large-scale dynamical networks, it is often difficult or physically impossible to have enough sensor nodes to make the system fully observable. Even if the system is in principle observable, high dimensionality poses fundamental limits on the computational tractability and performance of a full-state observer. To overcome the curse of dimensionality, we instead require the system to be functionally observable, meaning that a targeted subset of state variables can be reconstructed from the available measurements. Here, we develop a graph-based theory of functional observability, which leads to highly scalable algorithms to 1) determine the minimal set of required sensors and 2) design the corresponding state observer of minimum order. Compared with the full-state observer, the proposed functional observer achieves the same estimation quality with substantially less sensing and fewer computational resources, making it suitable for large-scale networks. We apply the proposed methods to the detection of cyberattacks in power grids from limited phase measurement data and the inference of the prevalence rate of infection during an epidemic under limited testing conditions. The applications demonstrate that the functional observer can significantly scale up our ability to explore otherwise inaccessible dynamical processes on complex networks.
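To make the notion of functional observability concrete, the sketch below uses the standard algebraic rank test for a linear time-invariant system x' = Ax, y = Cx with target z = Fx: the target is reconstructible when the rows of F lie in the row space of the observability matrix O, that is, rank([O; F]) = rank(O). This is a small dense check offered for illustration only, not the graph-based scalable algorithm the abstract describes; the matrices and all function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(rows):
    """Exact rank via Gaussian elimination over rational numbers."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1) row-wise."""
    rows, block = [], C
    for _ in range(len(A)):
        rows.extend(block)
        block = matmul(block, A)
    return rows


def functionally_observable(A, C, F):
    """True when the rows of F lie in the row space of the observability matrix."""
    O = observability_matrix(A, C)
    return rank(O + F) == rank(O)


# Hypothetical 3-state system: x2 is driven by x1, x3 is isolated, sensor on x2.
A = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
C = [[0, 1, 0]]
print(functionally_observable(A, C, [[1, 0, 0]]))  # target: x1
print(functionally_observable(A, C, [[0, 0, 1]]))  # target: x3
```

In this toy system the full state is not observable (x3 never influences the sensor), yet x1 can still be targeted, which is the kind of saving in sensing the abstract refers to.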

Attraction Basins in Metaheuristics: A Systematic Mapping Study

Mathematics ◽ 10.3390/math9233036 ◽ 2021 ◽ Vol 9 (23) ◽ pp. 3036

Author(s): Mihael Baketarić ◽ Marjan Mernik ◽ Tomaž Kosar

Keyword(s): Full Text ◽ Data Extraction ◽ Systematic Mapping Study ◽ Scalable Algorithms ◽ Mapping Study ◽ Systematic Mapping ◽ Continuous Domains ◽ Attraction Basins ◽ Discrete Domains ◽ Full Text Screening

Context: In this study, we report on a Systematic Mapping Study (SMS) for attraction basins in the domain of metaheuristics. Objective: To identify research trends, potential issues, and proposed solutions on attraction basins in the field of metaheuristics. Research goals were inspired by the previous paper, published in 2021, where attraction basins were used to measure exploration and exploitation. Method: We conducted the SMS in the following steps: Defining research questions, conducting the search in the ISI Web of Science and Scopus databases, full-text screening, iterative forward and backward snowballing (with ongoing full-text screening), classifying, and data extraction. Results: Attraction basins within discrete domains are understood far better than those within continuous domains. Attraction basins on dynamic problems have hardly been investigated. Multi-objective problems are investigated poorly in both domains, although slightly more often within a continuous domain. There is a lack of parallel and scalable algorithms to compute attraction basins and a general framework that would unite all different definitions/implementations used for attraction basins. Conclusions: Findings regarding attraction basins in the field of metaheuristics reveal that the concept alone is poorly exploited, as well as identify open issues where researchers may improve their research.

ODGI: understanding pangenome graphs

10.1101/2021.11.10.467921 ◽ 2021

Author(s): Andrea Guarracino ◽ Simon Heumos ◽ Sven Nahnsen ◽ Pjotr Prins ◽ Erik Garrison

Keyword(s): Open Source ◽ Source Code ◽ Parallel Execution ◽ Exploratory Analysis ◽ Genomic Diversity ◽ Free Software ◽ Memory Representation ◽ Scalable Algorithms ◽ Dna Variation ◽ Complete Representation

Motivation: Pangenome graphs provide a complete representation of the mutual alignment of collections of genomes. These models offer the opportunity to study the entire genomic diversity of a population, including structurally complex regions. Nevertheless, analyzing hundreds of gigabase-scale genomes using pangenome graphs is difficult as it is not well-supported by existing tools. Hence, fast and versatile software is required to ask advanced questions of such data in an efficient way. Results: We wrote ODGI, a novel suite of tools that implements scalable algorithms and has an efficient in-memory representation of DNA variation graphs. ODGI includes tools for detecting complex regions, extracting loci, removing artifacts, exploratory analysis, manipulation, validation, and visualization. Its fast parallel execution facilitates routine pangenomic tasks, as well as pipelines that can quickly answer complex biological questions of gigabase-scale pangenome graphs. Availability: ODGI is published as free software under the MIT open source license. Source code can be downloaded from https://github.com/pangenome/odgi and documentation is available at https://odgi.readthedocs.io. ODGI can be installed via Bioconda (https://bioconda.github.io/recipes/odgi/README.html) or GNU Guix (https://github.com/ekg/guix-genomics/blob/master/odgi.scm).

Proceedings of ScalA 2021: 12th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems [Title page]

10.1109/scala54577.2021.00001 ◽ 2021

Keyword(s): Large Scale ◽ Title Page ◽ Scalable Algorithms ◽ Large Scale Systems

Scalable Algorithms for Identifying Stealthy Attackers in a Game‐Theoretic Framework Using Deception

10.1002/9781119723950.ch3 ◽ 2021 ◽ pp. 47-61

Author(s): Anjon Basak ◽ Charles A. Kamhoua ◽ Sridhar Venkatesan ◽ Marcus Gutierrez ◽ Ahmed H. Anwar ◽ ...

Keyword(s): Scalable Algorithms ◽ Game Theoretic

A Consensus Algorithm for Linear Support Vector Machines

Management Science ◽ 10.1287/mnsc.2021.4042 ◽ 2021

Author(s): Haimonti Dutta

Keyword(s): Big Data ◽ Large Scale ◽ Descent Method ◽ Consensus Algorithm ◽ Stochastic Gradient Descent ◽ Support Vector ◽ Gradient Descent Method ◽ Scalable Algorithms ◽ Data Set ◽ Svm Algorithm

In the era of big data, an important weapon in a machine learning researcher’s arsenal is a scalable support vector machine (SVM) algorithm. Traditional algorithms for learning SVMs scale superlinearly with the training set size, which quickly becomes infeasible for large data sets. In recent years, scalable algorithms have been designed that study the primal or dual formulations of the problem. These often suggest a way to decompose the problem and facilitate development of distributed algorithms. In this paper, we present a distributed algorithm for learning linear SVMs in the primal form for binary classification called the gossip-based subgradient (GADGET) SVM. The algorithm is designed such that it can be executed locally on sites of a distributed system. Each site processes its local homogeneously partitioned data and learns a primal SVM model; it then gossips with random neighbors about the classifier learnt and uses this information to update the model. To learn the model, the SVM optimization problem is solved using several techniques, including a gradient estimation procedure, stochastic gradient descent method, and several variants including minibatches of varying sizes. Our theoretical results indicate that the rate at which the GADGET SVM algorithm converges to the global optimum at each site is dominated by an [Formula: see text] term, where λ measures the degree of convexity of the function at the site. Empirical results suggest that this anytime algorithm, in which the quality of results improves gradually as computation time increases, has performance comparable to its centralized and pseudodistributed counterparts and to other state-of-the-art gossip-based SVM solvers. It is at least 1.5 times (often several orders of magnitude) faster than other gossip-based SVM solvers known in the literature and has a message complexity of O(d) per iteration, where d represents the number of features of the data set. Finally, a large-scale case study is presented wherein the consensus-based SVM algorithm is used to predict failures of advanced mechanical components in a chocolate manufacturing process using more than one million data points. This paper was accepted by J. George Shanthikumar, big data analytics.
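The local-update-then-gossip loop described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the GADGET SVM implementation: it assumes a Pegasos-style step size of 1/(λt), a full local batch per step, and plain pairwise averaging with a uniformly random peer as the gossip rule. All function names, the toy data, and the parameter defaults are hypothetical.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def local_subgradient_step(w, data, lam, t):
    """One subgradient step on the L2-regularised hinge loss over local data."""
    eta = 1.0 / (lam * t)  # assumed Pegasos-style step size
    new_w = [(1.0 - eta * lam) * wi for wi in w]
    for x, y in data:
        if y * dot(w, x) < 1.0:  # margin violated: hinge term is active
            for k, xk in enumerate(x):
                new_w[k] += eta * y * xk / len(data)
    return new_w


def gossip_svm(sites, lam=0.1, rounds=50, seed=0):
    """Each site updates on its own data, then averages with a random peer."""
    rng = random.Random(seed)
    dim = len(sites[0][0][0])
    models = [[0.0] * dim for _ in sites]
    for t in range(1, rounds + 1):
        models = [local_subgradient_step(w, data, lam, t)
                  for w, data in zip(models, sites)]
        for i in range(len(sites)):
            j = rng.choice([s for s in range(len(sites)) if s != i])
            avg = [(a + b) / 2.0 for a, b in zip(models[i], models[j])]
            models[i], models[j] = avg, list(avg)  # message size is O(d)
    return models


# Hypothetical toy data: three sites, two features, label follows the sign of x[0].
SITES = [
    [((2.0, 0.5), 1), ((-1.5, 0.2), -1)],
    [((1.0, -0.5), 1), ((-2.0, -0.4), -1)],
    [((3.0, 0.1), 1), ((-1.0, 0.5), -1)],
]
for w in gossip_svm(SITES):
    print(w)
```

Each exchange sends one weight vector of length d, which matches the O(d) per-iteration message complexity stated in the abstract; the convergence rate claimed there is not something this sketch demonstrates.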

scalable algorithms
Recently Published Documents

Scalable algorithms for semiparametric accelerated failure time models in high dimensions

Strengthening ties towards a highly-connected world

Scalable Algorithms Using Sparse Storage for Parallel Spectral Clustering on GPU

AOI-shapes: An Efficient Footprint Algorithm to Support Visualization of User-defined Urban Areas of Interest

Functional observability and target state estimation in large-scale networks

Attraction Basins in Metaheuristics: A Systematic Mapping Study

ODGI: understanding pangenome graphs

Proceedings of ScalA 2021: 12th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems [Title page]

Scalable Algorithms for Identifying Stealthy Attackers in a Game‐Theoretic Framework Using Deception

A Consensus Algorithm for Linear Support Vector Machines
