Performance improvements for iterative electron tomography reconstruction using graphics processing units (GPUs)

W.J. Palenstijn; K.J. Batenburg; J. Sijbers

doi:10.1016/j.jsb.2011.07.017

High-performance iterative electron tomography reconstruction with long-object compensation using graphics processing units (GPUs)

Journal of Structural Biology ◽

10.1016/j.jsb.2010.03.018 ◽

2010 ◽

Vol 171 (2) ◽

pp. 142-153 ◽

Cited By ~ 53

Author(s):

Wei Xu ◽

Fang Xu ◽

Mel Jones ◽

Bettina Keszthelyi ◽

John Sedat ◽

...

Keyword(s):

Graphics Processing Units ◽

High Performance ◽

Electron Tomography ◽

Tomography Reconstruction ◽

Graphics Processing

SU-DD-A4-03: Cone Beam Computed Tomography Reconstruction on Graphics Processing Units: Flat3D Texture Techniques

Medical Physics ◽

10.1118/1.3468001 ◽

2010 ◽

Vol 37 (6Part6) ◽

pp. 3092-3092

Author(s):

J Kim ◽

L Ren ◽

J Jin ◽

H Zhong ◽

I J Chetty

Keyword(s):

Computed Tomography ◽

Cone Beam Computed Tomography ◽

Graphics Processing Units ◽

Cone Beam ◽

Tomography Reconstruction ◽

Computed Tomography Reconstruction ◽

Graphics Processing

Efficient Parallel Implementations of LWE-Based Post-Quantum Cryptosystems on Graphics Processing Units

Mathematics ◽

10.3390/math8101781 ◽

2020 ◽

Vol 8 (10) ◽

pp. 1781

Author(s):

SangWoo An ◽

Seog Chung Seo

Keyword(s):

Cloud Computing ◽

Parallel Processing ◽

Graphics Processing Units ◽

Optimization Techniques ◽

Parallel Optimization ◽

Processing Unit ◽

Processing Technologies ◽

Performance Improvements ◽

Cloud Computing Service ◽

Graphics Processing

With the development of the Internet of Things (IoT) and cloud computing technology, various cryptographic systems have been proposed to protect the growing volume of personal information. Recently, Post-Quantum Cryptography (PQC) algorithms have been proposed to counter quantum algorithms that threaten public-key cryptography. Using PQC efficiently in a server environment that handles large amounts of data requires optimization studies. In this paper, we present optimization methods on the Graphics Processing Unit (GPU) platform for FrodoKEM and NewHope, both of which are round 2 candidates in the NIST PQC standardization process. For each algorithm, we identify the major operations with a large computational load that can be processed in parallel using the characteristics of the GPU. For FrodoKEM, we introduce parallel optimization techniques for matrix generation and for matrix arithmetic operations such as addition and multiplication. For NewHope, we present a parallel processing technique for polynomial-based operations. In the encryption process of FrodoKEM, the GPU implementation is up to 5.2, 5.75, and 6.47 times faster than the CPU implementation for FrodoKEM-640, FrodoKEM-976, and FrodoKEM-1344, respectively. In the encryption process of NewHope, it is up to 3.33 and 4.04 times faster than the CPU implementation for NewHope-512 and NewHope-1024, respectively. The results of this study can be used in servers for IoT devices or cloud computing services. In addition, they can be applied to image processing technologies such as facial recognition.
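The matrix arithmetic mentioned in this abstract lends itself to parallel execution because each output entry can be computed independently. The following is a minimal sketch of that idea, not the authors' GPU kernels: it computes an LWE-style product-plus-error (A*S + E) mod q, with one function call standing in for the work of one GPU thread. The names `lwe_entry` and `lwe_matrix`, the toy matrix sizes, and the modulus in the usage note are all illustrative assumptions.

```python
def lwe_entry(A, S, E, i, j, q):
    """Work of one hypothetical GPU thread: entry (i, j) of (A*S + E) mod q."""
    acc = E[i][j]
    for k in range(len(S)):
        acc += A[i][k] * S[k][j]
    return acc % q


def lwe_matrix(A, S, E, q):
    """Serial stand-in for a kernel launch over all (i, j) index pairs.

    No entry depends on any other entry, which is what makes this
    operation a natural fit for one-thread-per-entry parallelism.
    """
    rows, cols = len(A), len(S[0])
    return [[lwe_entry(A, S, E, i, j, q) for j in range(cols)]
            for i in range(rows)]
```

For example, with the toy inputs `A = [[1, 2], [3, 4]]`, `S = [[1], [1]]`, `E = [[5], [6]]` and `q = 8`, `lwe_matrix(A, S, E, q)` returns `[[0], [5]]`. Real parameter sets use far larger matrices, which is where the per-entry independence pays off.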

High performance computing on graphics processing units

Pollack Periodica ◽

10.1556/pollack.3.2008.2.3 ◽

2008 ◽

Vol 3 (2) ◽

pp. 27-34 ◽

Cited By ~ 2

Author(s):

Balázs Tukora ◽

Tibor Szalay

Keyword(s):

High Performance Computing ◽

Graphics Processing Units ◽

High Performance ◽

Graphics Processing ◽

Performance Computing

Embedded Gold Markers for Improved TEM/STEM Tomography Reconstruction

ISTFA 2008: Conference Proceedings from the 34th International Symposium for Testing and Failure Analysis ◽

10.31399/asm.cp.istfa2008p0172 ◽

2008 ◽

Author(s):

Jian-Shing Luo ◽

Chia-Chi Huang ◽

Jeremy D. Russell

Keyword(s):

3D Reconstruction ◽

Feature Tracking ◽

Semiconductor Device ◽

Electron Tomography ◽

Tracking Process ◽

Carbon Coated ◽

Novel Method ◽

Tomography Reconstruction ◽

Tilt Series ◽

20 Nm

Electron tomography includes four main steps: tomography data acquisition, image processing, 3D reconstruction, and visualization. After acquisition, the tilt series is aligned. Two methods are used to align the tilt series: cross-correlation and feature tracking. For feature tracking, fiducial markers such as gold beads, about 10-20 nm in size, are normally deposited onto one side of 100-mesh carbon-coated grids. This paper presents a novel method for preparing electron tomography samples with gold beads embedded inside, which improves the feature-tracking process and the quality of the 3D reconstruction. Results show that the novel sample preparation method improves image alignment, which is essential for successful tomography of many contemporary semiconductor device structures.

Parallel Option Pricing with Fourier Space Time-Stepping Method on Graphics Processing Units

SSRN Electronic Journal ◽

10.2139/ssrn.1020207 ◽

2007 ◽

Cited By ~ 1

Author(s):

Vladimir Surkov

Keyword(s):

Option Pricing ◽

Graphics Processing Units ◽

Space Time ◽

Fourier Space ◽

Time Stepping ◽

Graphics Processing

Improving the Efficiency and the Accuracy of 2D Gel Electrophoresis Spot Detection Using Graphics Processing Units

Current Bioinformatics ◽

10.2174/1574893612666170725141905 ◽

2018 ◽

Vol 13 (2) ◽

pp. 193-206 ◽

Cited By ~ 1

Author(s):

Marwa K. Elteir ◽

Shaheera A. Rashwan ◽

Ashraf A. Khalil

Keyword(s):

Gel Electrophoresis ◽

Graphics Processing Units ◽

2D Gel Electrophoresis ◽

Spot Detection ◽

2D Gel ◽

Graphics Processing

Using graphics processing units on the cloud to accelerate and reduce processing cost of parameters estimation of seismic processing algorithm

10.22564/16cisbgf2019.221 ◽

2019 ◽

Author(s):

Nicholas Okita ◽

Tiago Coimbra ◽

José Ribeiro ◽

Martin Tygel

Keyword(s):

Graphics Processing Units ◽

Parameters Estimation ◽

Processing Algorithm ◽

Seismic Processing ◽

Processing Cost ◽

Graphics Processing

Review of smoothed particle hydrodynamics: towards converged Lagrangian flow modelling

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2019.0801 ◽

2020 ◽

Vol 476 (2241) ◽

pp. 20190801

Author(s):

Steven J. Lind ◽

Benedict D. Rogers ◽

Peter K. Stansby

Keyword(s):

Smoothed Particle Hydrodynamics ◽

Graphics Processing Units ◽

Wave Structure ◽

Free Form ◽

Mesh Free ◽

Weakly Compressible ◽

Particle Hydrodynamics ◽

Massively Parallel Computing ◽

Smoothed Particle ◽

Graphics Processing

This paper presents a review of the progress of smoothed particle hydrodynamics (SPH) towards high-order converged simulations. As a mesh-free Lagrangian method suitable for complex flows with interfaces and multiple phases, SPH has developed considerably in the past decade. While original applications were in astrophysics, early engineering applications showed the versatility and robustness of the method without emphasis on accuracy and convergence. The early method was of weakly compressible form, resulting in noisy pressures due to spurious pressure waves. This noise was effectively removed in the incompressible (divergence-free) form which followed; since then the weakly compressible form has been advanced, reducing pressure noise. Numerical convergence studies are now standard. While the method is computationally demanding on conventional processors, it is well suited to parallel processing on massively parallel computers and graphics processing units. Applications are diverse and encompass wave–structure interaction, geophysical flows due to landslides, nuclear sludge flows, welding, gearbox flows and many others. In the state of the art, convergence is typically between the first- and second-order theoretical limits. Recent advances, which are also outlined, are improving convergence to fourth order (and higher). Such higher-order convergence can be necessary to resolve multi-scale aspects of turbulent flow.

Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software

ACM Transactions on Mathematical Software ◽

10.1145/3441850 ◽

2021 ◽

Vol 47 (2) ◽

pp. 1-28

Author(s):

Goran Flegar ◽

Hartwig Anzt ◽

Terry Cojean ◽

Enrique S. Quintana-Ortí

Keyword(s):

Linear Algebra ◽

Graphics Processing Units ◽

High Performance ◽

Numerical Algorithms ◽

Mixed Precision ◽

Before And After ◽

Memory Accesses ◽

Specialized Hardware ◽

The Individual ◽

Graphics Processing

The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts aiming at carefully reducing the working precision in order to speed up the computations. For algorithms whose performance is bound by the memory bandwidth, the idea of compressing their data before (and after) memory accesses has received considerable attention. One idea is to store an approximate operator, such as a preconditioner, in lower than working precision, ideally without impacting the algorithm output. We realize the first high-performance implementation of an adaptive precision block-Jacobi preconditioner which selects the precision format used to store the preconditioner data on the fly, taking into account the numerical properties of the individual preconditioner blocks. We implement the adaptive block-Jacobi preconditioner as production-ready functionality in the Ginkgo linear algebra library, considering not only the precision formats that are part of the IEEE standard, but also customized formats which adapt the length of the exponent and significand to the characteristics of the preconditioner blocks. Experiments run on a state-of-the-art GPU accelerator show that our implementation offers attractive runtime savings.
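To make the idea of per-block precision selection concrete, here is a minimal sketch in plain Python. It is not Ginkgo's implementation or API. It assumes 2x2 diagonal blocks, uses the 1-norm condition number of each block as the "numerical property", and picks the cheapest IEEE format whose unit roundoff, multiplied by that condition number, stays below a tolerance. The selection rule, the tolerance `tol`, and the names `invert_2x2`, `norm1`, `round_to` and `store_block` are all assumptions made for illustration; the customized formats mentioned in the abstract are not modelled.

```python
import struct

# Unit roundoffs of IEEE binary16 ("e"), binary32 ("f") and binary64 ("d"),
# keyed by the corresponding struct format characters.
UNIT_ROUNDOFF = {"e": 2.0 ** -11, "f": 2.0 ** -24, "d": 2.0 ** -53}


def invert_2x2(block):
    """Explicit inverse of a 2x2 block (kept tiny for illustration)."""
    (a, b), (c, d) = block
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("singular block")
    return [[d / det, -b / det], [-c / det, a / det]]


def norm1(m):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in m) for j in range(len(m[0])))


def round_to(fmt, value):
    """Round a Python float to the given storage format and back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def store_block(block, tol=1e-2):
    """Return (format, inverted block rounded to that format).

    Assumed rule: use the cheapest format with cond * unit_roundoff <= tol,
    falling back to double precision when no cheaper format qualifies.
    """
    inv = invert_2x2(block)
    cond = norm1(block) * norm1(inv)
    for fmt in ("e", "f", "d"):
        if fmt != "d" and cond * UNIT_ROUNDOFF[fmt] > tol:
            continue  # block too ill-conditioned for this format
        try:
            return fmt, [[round_to(fmt, x) for x in row] for row in inv]
        except OverflowError:
            continue  # an entry is out of range for this format
```

With the default tolerance, the well-conditioned block `[[2.0, 0.0], [0.0, 2.0]]` is stored in half precision (`"e"`), the block `[[1.0, 0.0], [0.0, 1e-3]]` with condition number 1000 is stored in single precision (`"f"`), and the nearly singular block `[[1.0, 1.0], [1.0, 1.0 + 1e-9]]` stays in double precision (`"d"`). The memory traffic saved on the first two kinds of block is the source of the speedup for bandwidth-bound solvers.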
