Distributed Computing Systems
Recently Published Documents


TOTAL DOCUMENTS: 668 (FIVE YEARS: 75)

H-INDEX: 31 (FIVE YEARS: 2)

2022 ◽  
Vol 22 (1) ◽  
pp. 1-21
Author(s):  
Iram Bibi ◽  
Adnan Akhunzada ◽  
Jahanzaib Malik ◽  
Muhammad Khurram Khan ◽  
Muhammad Dawood

Volunteer computing, by providing seamless connectivity that enables convenient and rapid deployment of greener and cheaper computing infrastructure, is extremely promising as a complement to next-generation distributed computing systems. Undoubtedly, without the tactile Internet and secure VC ecosystems, harnessing its full potential and making it a viable and reliable alternative computing infrastructure is next to impossible. Android-enabled smart devices, applications, and services are integral to volunteer computing. Conversely, the progressive development of sophisticated Android malware may curb its exponential growth. Moreover, Android malware is considered the most potent and persistent cyber threat to mobile VC systems. To secure Android-based mobile volunteer computing, the authors propose MulDroid, an efficient, self-learning, autonomous hybrid (Long Short-Term Memory, Convolutional Neural Network, Deep Neural Network) multi-vector Android malware threat detection framework. The proposed mechanism is highly scalable, with well-coordinated infrastructure and self-optimizing capabilities to proficiently tackle fast-growing dynamic variants of sophisticated malware threats and attacks, achieving 99.01% detection accuracy. For a comprehensive evaluation, the authors employed current state-of-the-art malware datasets (Android Malware Dataset, Androzoo) with standard performance evaluation metrics. Moreover, MulDroid is compared with contemporary hybrid DL-driven architectures constructed by the authors and with benchmark algorithms. The proposed mechanism outperforms them in detection accuracy with only a trivial tradeoff in speed efficiency. Additionally, 10-fold cross-validation is performed to report unbiased results.
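The abstract does not disclose MulDroid's implementation details. The following is a minimal sketch, assuming tf.keras and a generic fixed-length feature-sequence input, of how a hybrid LSTM/CNN/DNN multi-branch binary malware classifier of the kind described could be wired together; the input shape, layer sizes, and training setup are illustrative assumptions, not MulDroid's actual configuration.

# Minimal sketch of a hybrid LSTM + CNN + DNN malware classifier.
# Input shape, layer sizes, and training setup are illustrative assumptions,
# not the actual MulDroid configuration.
import tensorflow as tf
from tensorflow.keras import layers, Model

SEQ_LEN, N_FEATURES = 200, 64          # assumed per-sample feature sequence

inputs = layers.Input(shape=(SEQ_LEN, N_FEATURES))

# CNN branch: local patterns over the feature sequence
cnn = layers.Conv1D(64, kernel_size=3, activation="relu")(inputs)
cnn = layers.GlobalMaxPooling1D()(cnn)

# LSTM branch: longer-range sequential dependencies
lstm = layers.LSTM(64)(inputs)

# DNN branch: dense transform of the flattened input
dnn = layers.Flatten()(inputs)
dnn = layers.Dense(128, activation="relu")(dnn)

# Fuse the three branches and classify (malware vs. benign)
merged = layers.concatenate([cnn, lstm, dnn])
merged = layers.Dense(64, activation="relu")(merged)
outputs = layers.Dense(1, activation="sigmoid")(merged)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()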


2022 ◽  
Vol 4 ◽  
Author(s):  
Alessandro Di Girolamo ◽  
Federica Legger ◽  
Panos Paparrigopoulos ◽  
Jaroslava Schovancová ◽  
Thomas Beermann ◽  
...  

As a joint effort of various communities involved in the Worldwide LHC Computing Grid, the Operational Intelligence project aims at increasing the level of automation in computing operations and reducing human interventions. The distributed computing systems currently deployed by the LHC experiments have proven to be mature and capable of meeting the experimental goals by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters, and operational teams is needed to efficiently manage such heterogeneous infrastructures. Within the scope of the Operational Intelligence project, experts from several areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. In this community study contribution, we report on the development of a suite of operational intelligence services covering several use cases: workload management, data management, and site operations.
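The abstract names anomaly detection only at a high level. Purely as an illustration of that ingredient, the sketch below, an assumption of ours rather than the project's actual tooling, flags anomalous samples in a synthetic per-interval error-rate series with scikit-learn's IsolationForest.

# Illustrative anomaly detection on a per-interval error-rate series.
# The data, features, and model choice are assumptions; the Operational
# Intelligence project does not prescribe this specific tooling.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
error_rate = rng.normal(0.02, 0.005, size=500)       # normal operation
error_rate[480:] += 0.15                              # simulated incident

X = error_rate.reshape(-1, 1)
detector = IsolationForest(contamination=0.05, random_state=0).fit(X)
labels = detector.predict(X)                          # -1 = anomaly, 1 = normal

anomalous_intervals = np.where(labels == -1)[0]
print("anomalous intervals:", anomalous_intervals)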


2021 ◽  
Vol 12 (6) ◽  
pp. 1-45
Author(s):  
Paolo Notaro ◽  
Jorge Cardoso ◽  
Michael Gerndt

Modern society is increasingly moving toward complex and distributed computing systems. The increase in scale and complexity of these systems challenges the O&M teams that perform daily monitoring and repair operations, while the demand for reliability and scalability of modern applications keeps increasing. For this reason, the study of automated and intelligent monitoring systems has recently sparked much interest across the applied IT industry and academia. Artificial Intelligence for IT Operations (AIOps) has been proposed to tackle modern IT administration challenges by leveraging Machine Learning, AI, and Big Data. However, AIOps as a research topic is still largely unstructured and unexplored, owing to the lack of conventions for categorizing contributions by their data requirements, target goals, and components. In this work, we focus on AIOps for Failure Management (FM), characterizing and describing 5 categories and 14 subcategories of contributions, based on their time intervention window and the target problem being solved. We review 100 FM solutions, focusing on their applicability requirements and the quantitative results achieved, to facilitate an effective application of AIOps solutions. Finally, we discuss current development problems in the areas covered by AIOps and delineate possible future trends for AI-based failure management.


2021 ◽  
Vol 11 (22) ◽  
pp. 10807
Author(s):  
Fatma Mbarek ◽  
Volodymyr Mosorov

Many computational problems that arise from real-world circumstances are NP-hard and, in the worst case, are generally assumed to be intractable. Existing distributed computing systems are commonly used for a range of large-scale complex problems, bringing advantages to many areas of research. Dynamic load balancing is essential in distributed computing systems, since it is key to maintaining the stability of heterogeneous distributed computing systems (HDCS). Load balancing is an optimization problem whose solution space grows exponentially, and the difficulty of dynamic load balancing grows with the scale of the HDCS, making it hard to tackle effectively; solutions to this intractable problem are therefore explored under a particular algorithmic paradigm. A new codification strategy, hybrid nearest-neighbor ant colony optimization (ACO-NN), based on metaheuristic ant colony optimization (ACO) and an approximate nearest-neighbor (NN) approach, has been developed to establish a dynamic load balancing algorithm for distributed systems. Several experiments have been conducted to explore the efficiency of this stochastic iterative load balancing algorithm; it is tested with respect to task and node accessibility and proves effective across diverse performance metrics.
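The abstract does not reproduce ACO-NN's pseudocode. The sketch below is a simplified assumption of ours, not the authors' algorithm: it shows the core ant colony optimization loop for assigning tasks to nodes so as to minimize the maximum node load, a common load-balancing objective, with the heuristic term favoring the currently least-loaded node.

# Simplified ACO for assigning tasks to nodes (minimize the maximum node load).
# Problem sizes, parameters, and the heuristic are illustrative assumptions.
import random

tasks = [4, 2, 7, 3, 5, 6, 1, 8]       # task costs
n_nodes, n_ants, n_iters = 3, 10, 50
alpha, beta, rho = 1.0, 2.0, 0.1       # pheromone weight, heuristic weight, evaporation

pheromone = [[1.0] * n_nodes for _ in tasks]
best_assign, best_makespan = None, float("inf")

for _ in range(n_iters):
    for _ in range(n_ants):
        loads = [0.0] * n_nodes
        assign = []
        for t, cost in enumerate(tasks):
            # heuristic: prefer the node whose load stays smallest after adding this task
            weights = [
                (pheromone[t][n] ** alpha) * (1.0 / (1.0 + loads[n] + cost)) ** beta
                for n in range(n_nodes)
            ]
            node = random.choices(range(n_nodes), weights=weights)[0]
            assign.append(node)
            loads[node] += cost
        makespan = max(loads)
        if makespan < best_makespan:
            best_assign, best_makespan = assign, makespan
    # evaporate pheromone and reinforce the best assignment found so far
    for t in range(len(tasks)):
        for n in range(n_nodes):
            pheromone[t][n] *= (1.0 - rho)
        pheromone[t][best_assign[t]] += 1.0 / best_makespan

print("best assignment:", best_assign, "max node load:", best_makespan)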


2021 ◽  
Vol 10 (11) ◽  
pp. 763
Author(s):  
Panagiotis Moutafis ◽  
George Mavrommatis ◽  
Michael Vassilakopoulos ◽  
Antonio Corral

Aiming at the problem of spatial query processing in distributed computing systems, the design and implementation of new distributed spatial query algorithms is a current challenge. Apache Spark is a memory-based framework suitable for real-time and batch processing. Spark-based systems allow users to work on distributed in-memory data without worrying about the data distribution mechanism or fault tolerance. Given two datasets of points (called Query and Training), the group K nearest-neighbor (GKNN) query retrieves the K points of Training with the smallest sum of distances to every point of Query. This spatial query has been actively studied in centralized environments, where several performance-improving techniques and pruning heuristics have been proposed, while a distributed algorithm in Apache Hadoop was recently proposed by our team. Since Apache Hadoop generally exhibits lower performance than Spark, in this paper we present the first distributed GKNN query algorithm in Apache Spark and compare it against the one in Apache Hadoop. This algorithm incorporates programming features and facilities that are specific to Apache Spark. Moreover, techniques that improve performance and are applicable in Apache Spark are also incorporated. The results of an extensive set of experiments with real-world spatial datasets are presented, demonstrating that our Apache Spark GKNN solution, with its improvements, is efficient and a clear winner compared to processing this query in Apache Hadoop.
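As a concrete reading of the GKNN definition above, the following minimal PySpark sketch is a naive baseline of our own assumption, not the paper's optimized algorithm with its pruning heuristics: it broadcasts the small Query set, computes each Training point's sum of distances to all Query points, and keeps the K smallest.

# Baseline GKNN in PySpark: broadcast Query, sum distances per Training point,
# keep the K smallest. Datasets here are synthetic placeholders.
import math
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gknn-sketch").getOrCreate()
sc = spark.sparkContext

query = [(0.0, 0.0), (1.0, 1.0)]                      # small Query dataset
training = sc.parallelize([(float(x), float(y)) for x in range(100) for y in range(100)])
K = 5

q_bc = sc.broadcast(query)

def sum_dist(p):
    # sum of Euclidean distances from one Training point to every Query point
    return sum(math.hypot(p[0] - q[0], p[1] - q[1]) for q in q_bc.value)

result = training.map(lambda p: (p, sum_dist(p))).takeOrdered(K, key=lambda kv: kv[1])
print(result)
spark.stop()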


Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2720
Author(s):  
Yongseok Choi ◽  
Eunji Lim ◽  
Jaekwon Shin ◽  
Cheol-Hoon Lee

Large-scale computational problems that need to be addressed by modern computers, such as deep learning or big data analysis, cannot be solved on a single computer but can be solved with distributed computing systems. Since most distributed computing systems, consisting of a large number of networked computers, must propagate their computational results to each other, they can suffer from increasing communication overhead, resulting in lower computational efficiency. To solve this problem, we propose an architecture for a distributed system that uses a shared memory simultaneously accessible by multiple computers. Our architecture is intended to be implemented in an FPGA or ASIC. Using an FPGA board that implements our architecture, we configured an actual distributed system and showed its feasibility. We compared the results of a deep learning application test using our architecture with those using Google TensorFlow's parameter server mechanism. We showed improvements of our architecture over Google TensorFlow's parameter server mechanism and determined the future direction of research by deriving the expected problems.
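The paper's shared memory is a hardware design (FPGA/ASIC). Purely as a software analogy, assumed by us and not part of the paper, the sketch below uses Python's multiprocessing.shared_memory so that workers apply updates in place to one shared parameter buffer instead of exchanging copies of their results, which is the propagation overhead the architecture targets.

# Software analogy only: workers accumulate updates into one shared parameter
# buffer instead of exchanging copies. The paper's design is an FPGA/ASIC
# shared memory, not this Python mechanism.
import numpy as np
from multiprocessing import Process, Lock, shared_memory

N_PARAMS, N_WORKERS = 1024, 4

def worker(shm_name, lock, worker_id):
    shm = shared_memory.SharedMemory(name=shm_name)
    params = np.ndarray((N_PARAMS,), dtype=np.float64, buffer=shm.buf)
    update = np.full(N_PARAMS, 0.01 * (worker_id + 1))   # stand-in "gradient"
    with lock:                                           # serialize the in-place update
        params += update
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=N_PARAMS * 8)
    params = np.ndarray((N_PARAMS,), dtype=np.float64, buffer=shm.buf)
    params[:] = 0.0
    lock = Lock()
    procs = [Process(target=worker, args=(shm.name, lock, i)) for i in range(N_WORKERS)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("first parameter after all updates:", params[0])   # 0.01 + 0.02 + 0.03 + 0.04
    shm.close()
    shm.unlink()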


Vestnik NSUEM ◽  
2021 ◽  
pp. 19-30
Author(s):  
E. V. Agapov ◽  
L. K. Bobrov ◽  
K. A. Zaykov

This work considers the main widely used algorithms for scheduling tasks in geographically distributed computing systems. The specific features of the algorithms are characterized, and a comparative analysis is presented in accordance with the selected criteria. The main factors that should be taken into account when constructing job control algorithms in geographically distributed computing systems are determined.


2021 ◽  
Vol 7 (3) ◽  
pp. 73-78
Author(s):  
D. Shchemelinin

Monitoring events and predicting the behavior of a dynamic information system are becoming increasingly important due to the globalization of cloud services and a sharp increase in the volume of processed data. Well-known monitoring systems are used for the timely detection and prompt correction of anomalies, but they require new, more effective and proactive forecasting tools. At the CMG-2013 conference, a method for predicting memory leaks in Java applications was presented, which allows IT teams to automatically release resources by safely restarting services when a certain critical threshold value is reached. That solution implements a simple linear mathematical model describing the historical trend function. In practice, however, the degradation of memory and other computational resources may occur not gradually but very quickly, depending on the workload; therefore, solving the forecasting problem using linear methods is not effective enough.
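As a worked illustration of the linear approach the article critiques, the sketch below (with assumed data, sampling interval, and threshold) fits a linear trend to historical memory usage with numpy.polyfit and estimates when the trend crosses a critical threshold; fast, workload-dependent degradation is exactly what such a model misses.

# Linear trend extrapolation of memory usage to a critical threshold.
# Data, sampling interval, and threshold are illustrative assumptions.
import numpy as np

hours = np.arange(0, 48)                                             # 48 hourly samples
usage = 40.0 + 0.5 * hours + np.random.normal(0, 1.0, hours.size)    # % of heap used
CRITICAL = 90.0                                                      # restart threshold, %

slope, intercept = np.polyfit(hours, usage, deg=1)
if slope > 0:
    t_cross = (CRITICAL - intercept) / slope    # hour at which the trend hits the threshold
    print(f"trend: {slope:.2f} %/h, predicted threshold crossing at t = {t_cross:.1f} h")
else:
    print("no upward trend detected; no restart scheduled")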


Author(s):  
A. I. Kalyaev

This article describes an approach to solving the problem of searching for, identifying, and tracking UAVs (Unmanned Aerial Vehicles) using a distributed computing system that processes images from multiple surveillance cameras. Today, the problem of finding UAVs is becoming especially relevant due to their widespread availability and low cost, which gives wide scope for illegal use: carrying out terrorist attacks in crowded places and on critical infrastructure, as well as unauthorized surveillance of specially protected areas. At the same time, modern radars have low efficiency in searching for UAVs, so visual detection tools are used today, whose effective operation requires complex computations. This article proposes the use of distributed computing systems to solve these complex problems of processing a video stream for the purpose of searching for, identifying, and tracking objects (UAVs). For this, the author, proceeding from the potential areas of application of such systems, decided to apply a multiagent approach, which makes it possible to create fault-tolerant and scalable systems. In the course of this work, software for a distributed computing system for image processing aimed at searching for unmanned aerial vehicles was created, and a hardware stand was assembled to test it. The tests led to the conclusion that the proposed method can be applied to process high-resolution, high-frame-rate video in a distributed computing system.
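The article's multiagent software is not published in the abstract. As a minimal sketch of the underlying idea only, assumed by us, the code below fans video frames out to a pool of worker processes, each running a placeholder detection function, and collects per-frame results; the article's actual detector and agent coordination are not represented.

# Minimal sketch: distribute frames across worker processes for detection.
# detect_uav() is a placeholder; the article's actual detector and multiagent
# coordination are not specified here.
import numpy as np
from multiprocessing import Pool

def detect_uav(frame):
    # placeholder "detector": report whether any bright region exists
    return bool((frame > 200).any())

def main():
    frames = [np.random.randint(0, 256, (720, 1280), dtype=np.uint8) for _ in range(32)]
    with Pool(processes=4) as pool:
        detections = pool.map(detect_uav, frames)
    print("frames with candidate detections:", sum(detections))

if __name__ == "__main__":
    main()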


Author(s):  
Vadym Shchur ◽  
Yuriy Kulakov

The article discusses the topical issue of load balancing in distributed computing systems. An analysis of existing solutions is carried out, and the tasks, problems, and practical significance are determined. An improved balancing method using the checkpoint method and an additional confidence factor is proposed, which makes it possible to ensure a uniform load on the controllers while maintaining an acceptable level of efficiency. The performance of the proposed method is assessed and compared with existing methods, and directions for further research are indicated.

