Performance Analysis of Effective Symbolic Methods for Solving Band Matrix SLAEs

2019 ◽  
Vol 214 ◽  
pp. 05004
Author(s):  
Milena Veneva ◽  
Alexander Ayriyan

This paper presents an experimental performance study of implementations of three symbolic algorithms for solving band-matrix systems of linear algebraic equations with heptadiagonal, pentadiagonal, and tridiagonal coefficient matrices. The only assumption on the coefficient matrix required for the algorithms to be stable is nonsingularity. The algorithms are implemented using the C++ library GiNaC and the Python library SymPy, considering five different data-storage classes. Performance analysis of the implementations is carried out on the high-performance computing (HPC) platforms “HybriLIT” and “Avitohol”. The experimental setup and the results of the computations on the individual computer systems are presented and discussed, and the three algorithms are analysed.
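As a rough illustration of the exact-arithmetic setting in which such symbolic algorithms operate (this is not the paper's GiNaC or SymPy code, and the three-diagonal storage below is just one plausible layout), a tridiagonal system can be solved exactly in pure Python with `fractions.Fraction`, where nonzero pivots of a nonsingular matrix are the only requirement:

```python
from fractions import Fraction as F

def solve_tridiagonal_exact(sub, diag, sup, rhs):
    """Exact forward-elimination/back-substitution on a tridiagonal
    system stored as three diagonals. All arithmetic is rational, so
    there is no rounding error; we only assume the pivots stay nonzero."""
    n = len(diag)
    d = [F(x) for x in diag]
    r = [F(x) for x in rhs]
    # Forward elimination: eliminate the subdiagonal.
    for i in range(1, n):
        m = F(sub[i - 1]) / d[i - 1]
        d[i] -= m * F(sup[i - 1])
        r[i] -= m * r[i - 1]
    # Back substitution.
    x = [F(0)] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - F(sup[i]) * x[i + 1]) / d[i]
    return x

# 3x3 example: [[2,1,0],[1,2,1],[0,1,2]] x = [3,4,3] has solution x = [1,1,1].
x = solve_tridiagonal_exact([1, 1], [2, 2, 2], [1, 1], [3, 4, 3])
```

Because every intermediate value is an exact rational, no pivoting for numerical stability is needed, which is the sense in which nonsingularity alone suffices.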

2019 ◽  
pp. 112-115
Author(s):  
M. Z. Benenson

The article discusses the use of graphics processing units (GPUs) for solving large systems of linear algebraic equations (SLAEs). A heterogeneous multiprocessor computing platform produced by NIIVK, whose architecture allows general-purpose microprocessor modules to be combined with graphics-processor modules, was used as the hardware for solving the SLAEs. A description is given of the SLAE solution program, developed on the basis of the cuBLAS library of the CUDA software interface. A method is proposed for increasing the accuracy of linear-system calculations based on a modified Gauss method. It is established that the modified Gauss method adds practically no running time to the program while significantly increasing the accuracy of the calculations. It is concluded that using graphics processors to solve SLAEs makes it possible to process larger matrices than general-purpose microprocessors allow.
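The article's specific modification of the Gauss method is not spelled out here; one standard accuracy-improving modification is iterative refinement, sketched below in plain Python (no GPU or cuBLAS involved) purely to show the idea of correcting a computed solution with its residual:

```python
def gauss_solve(A, b):
    """Gaussian elimination with partial pivoting on dense lists of floats."""
    n = len(A)
    M = [row[:] + [bv] for row, bv in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def refine(A, b, iters=3):
    """Iterative refinement: repeatedly solve for the residual and correct."""
    x = gauss_solve(A, b)
    for _ in range(iters):
        r = [bi - sum(aij * xj for aij, xj in zip(row, x))
             for row, bi in zip(A, b)]
        d = gauss_solve(A, r)
        x = [xi + di for xi, di in zip(x, d)]
    return x

x = refine([[2.0, 1.0], [1.0, 3.0]], [3.0, 4.0])  # exact solution [1, 1]
```

Each refinement pass is only a residual computation plus a re-solve with the already factored system, which is consistent with the observation that the extra accuracy costs almost no additional running time.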


2004 ◽  
Vol 14 (08) ◽  
pp. 2991-2997 ◽  
Author(s):  
PETER C. CHU ◽  
LEONID M. IVANOV ◽  
TATYANA M. MARGOLINA

Reconstruction of processes and fields from noisy data reduces to solving a set of linear algebraic equations. Three factors affect the accuracy of the reconstruction: (a) a large condition number of the coefficient matrix, (b) a high noise-to-signal ratio in the source term, and (c) the absence of a priori knowledge of the noise statistics. To improve reconstruction accuracy, the set of linear algebraic equations is transformed into a new set with minimal condition number and noise-to-signal ratio using a rotation matrix. The procedure does not require any knowledge of the low-order statistics of the noise. Several examples, including a highly distorted Lorenz attractor, illustrate the benefit of this procedure.
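Factor (a) can be illustrated with a hypothetical 2x2 example (unrelated to the paper's rotation procedure): when the coefficient matrix is nearly singular, a tiny perturbation of the source term produces a large change in the reconstructed solution.

```python
def solve2x2(A, b):
    """Solve a 2x2 system via the explicit inverse (Cramer's rule)."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(a22 * b[0] - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]

A = [[1.0, 0.99], [0.99, 0.98]]   # nearly singular: det = -1e-4
b = [1.99, 1.97]                  # exact solution is [1, 1]
x_clean = solve2x2(A, b)
b_noisy = [1.99 + 1e-4, 1.97]     # tiny noise in the source term
x_noisy = solve2x2(A, b_noisy)
# The O(1e-4) perturbation produces an O(1) change in the solution,
# because the condition number of A is of order 1e4.
```

Reducing the condition number of the transformed system is precisely what damps this amplification of source-term noise.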


2016 ◽  
Author(s):  
Jens Krüger ◽  
Oliver Kohlbacher

Practical experiences with implementing a workflow for the prediction of mass spectra are reported. QCEIMS is used to simulate fragmentation trajectories, leading to predicted mass spectra for small molecules such as metabolites. The individual calculations are embedded into UNICORE workflow nodes, with the applications themselves packaged in Docker containers. Challenges, caveats, and advantages are discussed, providing guidance for the deployment of a scientific protocol on high-performance computing resources.


2017 ◽  
Author(s):  
Yang-Min Kim ◽  
Jean-Baptiste Poline ◽  
Guillaume Dumas

Abstract
Reproducibility has been shown to be limited in many scientific fields. It is a fundamental tenet of scientific activity, but the related issue of the reusability of scientific data is poorly documented. Here we present a case study of our attempt to reproduce a promising bioinformatics method [1] and illustrate the challenges of using a published method for which both code and data were available. First, we tried to re-run the analysis with the code and data provided by the authors. Second, we reimplemented the method in Python to avoid depending on a MATLAB licence and to ease execution of the code on an HPCC (High-Performance Computing Cluster). Third, we assessed the reusability of our reimplementation and the quality of our documentation. We then experimented with our own software and tested how easy it would be to start from our implementation to reproduce the results, thereby attempting to estimate the robustness of the reproducibility. Finally, we propose solutions drawn from this case study and other observations to improve reproducibility and research efficiency at both the individual and the collective level.
Availability
The latest version of StratiPy (Python), with two reproducibility examples, is available at GitHub [2].
Contact
[email protected]


The article deals with high-performance computing (HPC) technology for stress-strain analysis at all stages of the life cycle of buildings and structures: construction, operation, and reconstruction. Results are presented of the numerical simulation of high-rise buildings using, as the processor component of the software, a new hybrid algorithm for solving systems of linear algebraic equations [1] with symmetric positive-definite matrices that combines computation on multi-core processors and graphics accelerators. It has been found that hybrid systems combining multi-core CPUs with accelerator coprocessors, including GPUs, are promising for accelerating the calculations [5]. To test the effectiveness of the proposed parallel algorithm for solving systems of linear algebraic equations [1], numerical experiments were carried out for the most dangerous load cases of a 27-storey building. Results of numerical studies are presented that use the LIRA-SAPR software complex for preprocessing (input of initial data) and postprocessing (output of calculation results) [2, 4, 6]. The numerical studies of the behaviour of high-rise building structures showed a multiple reduction in the time needed to solve systems of linear algebraic equations with symmetric matrices on multiprocessor (multi-core) computers with graphics accelerators using the proposed hybrid algorithms [1]. High-performance technologies based on parallel computation yield even greater benefit for more complex tasks: modelling the life cycle of high-rise buildings, bridges, and especially complex structures such as NPPs under static and dynamic loads, including emergencies, in both normal and difficult geological conditions; the latter cover 70% of Ukraine's territory.
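The hybrid algorithm of [1] itself is not reproduced here; the serial core that such solvers parallelise, factor-and-substitute for a symmetric positive-definite system, can be sketched with a plain Cholesky decomposition (a generic illustration, not the code of [1]):

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive-definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def spd_solve(A, b):
    """Solve A x = b via forward substitution (L y = b), then back
    substitution (L^T x = y)."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

x = spd_solve([[4.0, 2.0], [2.0, 3.0]], [8.0, 7.0])  # solution [1.25, 1.5]
```

The factorisation dominates the cost and is the part that hybrid CPU/GPU schemes distribute across devices; the triangular substitutions are comparatively cheap.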


2015 ◽  
Vol 24 (05) ◽  
pp. 1550074 ◽  
Author(s):  
Ali A. El-Moursy ◽  
Wael S. Afifi ◽  
Fadi N. Sibai ◽  
Salwa M. Nassar

STRIKE is an algorithm that predicts protein–protein interactions (PPIs); it determines that two proteins interact if they contain similar substrings of amino acids. Unlike other methods for PPI prediction, STRIKE is able to achieve a reasonable improvement over the existing prediction methods. Despite its high accuracy as a PPI prediction method, STRIKE has a large execution time and is therefore considered a compute-intensive application. In this paper, we develop and implement a parallel STRIKE algorithm for high-performance computing (HPC) systems. Using a large-scale cluster, the execution time of the parallel implementation of this bioinformatics algorithm was reduced from about a week on a serial uniprocessor machine to about 16.5 h on 16 computing nodes, and down to about 2 h on 128 parallel nodes. Communication overheads between nodes are studied thoroughly.
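Taking "about a week" as roughly 168 hours (an approximation), the reported times imply the following speedups and parallel efficiencies:

```python
serial_h = 168.0              # "about a week" on one processor (approximate)
runs = {16: 16.5, 128: 2.0}   # nodes -> reported wall time in hours

for nodes, hours in runs.items():
    speedup = serial_h / hours       # how many times faster than serial
    efficiency = speedup / nodes     # fraction of ideal linear scaling
    print(f"{nodes:3d} nodes: speedup ~{speedup:.1f}x, "
          f"efficiency ~{efficiency:.0%}")
```

Both configurations retain roughly two-thirds parallel efficiency, so the scaling from 16 to 128 nodes is close to linear despite the inter-node communication overheads the paper studies.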


2013 ◽  
Vol 756-759 ◽  
pp. 3070-3073 ◽  
Author(s):  
Er Yan Zhang ◽  
Xiao Feng Zhu

Toeplitz matrices arise in a remarkable variety of applications such as signal processing, time-series analysis, and image processing. The Yule-Walker equations of generalized stationary prediction are linear algebraic equations whose coefficient matrix is a Toeplitz matrix. Making use of the structure of the Toeplitz matrix, we present a recursive algorithm for linear algebraic equations with a Toeplitz coefficient matrix, and we also prove the recursive formula. The algorithm, by exploiting the structure of Toeplitz matrices, effectively reduces the computational cost. For an n-order Toeplitz coefficient matrix, the computational complexity of the usual Gaussian elimination is about O(n³), while that of this algorithm is about O(n²), a reduction of one order of magnitude.
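The paper's own recursive formula is not reproduced here; the classic Levinson-Durbin recursion, which solves the Yule-Walker equations in O(n²) operations by exploiting the same Toeplitz structure, conveys the idea:

```python
def levinson_durbin(r):
    """Solve the Yule-Walker equations for an AR(p) model given the
    autocorrelations r[0..p], in O(p^2) operations instead of the O(p^3)
    of general Gaussian elimination. Returns the predictor coefficients
    a (with a[0] = 1) and the final prediction-error power."""
    p = len(r) - 1
    a = [1.0] + [0.0] * p
    err = r[0]
    for k in range(1, p + 1):
        # Reflection coefficient from the current residual correlation.
        acc = r[k] + sum(a[j] * r[k - j] for j in range(1, k))
        lam = -acc / err
        # Order-update of the coefficient vector (Levinson recursion).
        a = [a[j] + lam * a[k - j] if 0 < j < k else a[j]
             for j in range(p + 1)]
        a[k] = lam
        err *= 1.0 - lam * lam
    return a, err

# Autocorrelations of an AR(1) process with coefficient 0.5:
a, err = levinson_durbin([1.0, 0.5, 0.25])
```

Each order-update touches only O(p) entries, which is where the order-of-magnitude saving over elimination comes from.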


Author(s):  
Alexander Khimich ◽  
Victor Polyanko ◽  
Tamara Chistyakova

Introduction. At present, new computational problems with large volumes of data constantly arise in science and technology, and solving them requires powerful supercomputers. Most of these problems come down to solving systems of linear algebraic equations (SLAEs). The main challenge of solving problems on a computer is obtaining reliable solutions with minimal computing resources. However, a problem solved on a computer always contains data that are only approximations of the original task (due to errors in the initial data, errors when entering numerical data into the computer, etc.). Thus, the mathematical properties of the computer problem can differ significantly from those of the original problem. Problems must therefore be solved taking the approximate data into account, and the computed results must be analysed. Despite the significant results of research in linear algebra, work on overcoming the existing difficulties of computer solution of problems with approximate data has not lost its significance and requires further development, all the more so with the use of contemporary supercomputers. Today, the highest-performance supercomputers are parallel machines with graphics processors. The architectural and technological features of these computers make it possible to significantly increase the efficiency of solving large problems at relatively low energy cost. The purpose of the article is to develop new parallel algorithms for solving systems of linear algebraic equations with approximate data on supercomputers with graphics processors; the algorithms automatically adjust themselves to the available computer architecture and to the mathematical properties of the problem as identified in the computer, and they provide estimates of the reliability of the results.
Results. A methodology is described for creating parallel algorithms for supercomputers with graphics processors that investigate the mathematical properties of linear systems with approximate data, together with algorithms that analyse the reliability of the results. The results of computational experiments on the SKIT-4 supercomputer are presented.
Conclusions. Parallel algorithms have been created for investigating and solving linear systems with approximate data on supercomputers with graphics processors. Numerical experiments with the new algorithms showed a significant acceleration of the calculations with a guarantee of the reliability of the results.
Keywords: systems of linear algebraic equations, hybrid algorithm, approximate data, reliability of the results, GPU computers.
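One simple form of a-posteriori reliability estimate (a generic textbook bound, not the authors' specific machinery): since A(x̂ − x) = −r with residual r = b − Ax̂, the error obeys ‖x̂ − x‖∞ ≤ ‖A⁻¹‖∞ ‖r‖∞. A 2x2 sketch:

```python
def residual_error_bound(A, b, x_hat):
    """A-posteriori error bound for a 2x2 system: from A(x_hat - x) = -r
    it follows that ||x_hat - x||_inf <= ||A^{-1}||_inf * ||r||_inf."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]
    inv_norm = max(abs(inv[0][0]) + abs(inv[0][1]),   # max row sum
                   abs(inv[1][0]) + abs(inv[1][1]))
    r = [b[i] - sum(A[i][j] * x_hat[j] for j in range(2)) for i in range(2)]
    r_norm = max(abs(r[0]), abs(r[1]))
    return inv_norm * r_norm

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]                            # exact solution is [1, 1]
bound = residual_error_bound(A, b, [1.001, 0.999])
# The true error is 1e-3 per component; the bound is guaranteed to cover it.
```

A bound like this certifies a computed solution after the fact, which is the flavour of reliability guarantee the abstract refers to; for large systems ‖A⁻¹‖ would of course be estimated rather than formed explicitly.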


2019 ◽  
Vol 2019 ◽  
pp. 1-13
Author(s):  
Hamish J. Macintosh ◽  
Jasmine E. Banks ◽  
Neil A. Kelson

Solving diagonally dominant tridiagonal linear systems is a common problem in scientific high-performance computing (HPC), and it is becoming more commonplace for HPC platforms to utilise a heterogeneous combination of computing devices. Whilst it is desirable to design faster implementations of parallel linear system solvers, power consumption concerns are increasing in priority. This work presents the oclspkt routine, a heterogeneous OpenCL implementation of the truncated SPIKE algorithm that can use FPGAs, GPUs, and CPUs concurrently to accelerate the solving of diagonally dominant tridiagonal linear systems. The routine is designed to solve tridiagonal systems of any size and can dynamically allocate optimised workloads to each accelerator in a heterogeneous environment depending on the accelerator's compute performance. The truncated SPIKE FPGA solver is developed first, optimising OpenCL device kernel performance, global memory bandwidth, and interleaved host-to-device memory transactions. The FPGA OpenCL kernel code is then refactored and optimised to best exploit the underlying architecture of the CPU and GPU. An optimised TDMA OpenCL kernel is also developed to act as a serial baseline performance comparison for the parallel truncated SPIKE kernel, since no FPGA tridiagonal solver capable of solving large tridiagonal systems was available at the time of development. The individual GPU, CPU, and FPGA solvers of the oclspkt routine are 110%, 150%, and 170% faster, respectively, than comparable device-optimised third-party solvers and applicable baselines. Assessing heterogeneous combinations of compute devices, the GPU + FPGA combination is found to have the best compute performance, and the FPGA-only configuration is found to have the best overall estimated energy efficiency.
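The TDMA baseline mentioned above is the classic Thomas algorithm; a serial Python sketch follows (the paper's version is an OpenCL kernel) for the diagonally dominant case, where no pivoting is needed:

```python
def tdma(sub, diag, sup, rhs):
    """Thomas algorithm (TDMA): O(n) solution of a tridiagonal system
    given the sub-, main, and super-diagonals. Safe without pivoting
    when the matrix is diagonally dominant."""
    n = len(diag)
    c = [0.0] * n   # modified superdiagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = sup[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the subdiagonal.
    for i in range(1, n):
        m = diag[i] - sub[i - 1] * c[i - 1]
        c[i] = (sup[i] / m) if i < n - 1 else 0.0
        d[i] = (rhs[i] - sub[i - 1] * d[i - 1]) / m
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Diagonally dominant example: [[2,1,0],[1,2,1],[0,1,2]] x = [3,4,3].
x = tdma([1.0, 1.0], [2.0, 2.0, 2.0], [1.0, 1.0], [3.0, 4.0, 3.0])
```

The forward sweep's strict left-to-right data dependence is what makes TDMA inherently serial, and why partitioned schemes such as SPIKE are needed to extract parallelism across devices.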

