Optimal absolute error starting values for Newton-Raphson calculation of square root

Computing ◽  
1991 ◽  
Vol 46 (1) ◽  
pp. 67-86 ◽  
Author(s):  
P. Montuschi ◽  
M. Mezzalama
2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Gayathri R. Prabhu ◽  
Bibin Johnson ◽  
J. Sheeba Rani

A Givens rotation based scalable QRD core which utilizes an efficient pipelined and unfolded 2D multiply and accumulate (MAC) based systolic array architecture with dynamic partial reconfiguration (DPR) capability is proposed. The square root and inverse square root operations in the Givens rotation algorithm are handled using a modified look-up table (LUT) based Newton-Raphson method, thereby reducing the area by 71% and latency by 50% while operating at a frequency 49% higher than the existing boundary cell architectures. The proposed architecture is implemented on Xilinx Virtex-6 FPGA for any real matrices of size m×n, where 4≤n≤8 and m≥n, by dynamically inserting or removing the partial modules. The evaluation results demonstrate a significant reduction in latency, area, and power as compared to other existing architectures. The functionality of the proposed core is evaluated for a variable length adaptive equalizer.
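
As a rough illustration of a LUT-seeded Newton-Raphson inverse square root, the Python sketch below looks up a coarse seed and refines it with the standard iteration. It is not the paper's modified method: the table size (`LUT_BITS`), the midpoint sampling, the reduction range [1, 4), and the function name `rsqrt_lut_nr` are all assumptions made for this example.

```python
import math

LUT_BITS = 4  # assumed table size (16 entries); not taken from the paper

# Seeds for 1/sqrt(m) with m in [1, 4), sampled at the midpoint of each interval.
LUT = [
    1.0 / math.sqrt(1.0 + 3.0 * (i + 0.5) / (1 << LUT_BITS))
    for i in range(1 << LUT_BITS)
]


def rsqrt_lut_nr(x, iterations=2):
    """Approximate 1/sqrt(x) for x > 0 using a LUT seed plus Newton-Raphson."""
    if x <= 0.0:
        raise ValueError("x must be positive")

    # Range reduction: write x = m * 2**k with m in [1, 4) and k even.
    frac, exp = math.frexp(x)      # x = frac * 2**exp, frac in [0.5, 1)
    m, k = frac * 2.0, exp - 1     # m in [1, 2)
    if k % 2:                      # make the exponent even
        m, k = m * 2.0, k - 1      # m in [2, 4)

    # Table lookup for the seed.
    index = min(int((m - 1.0) / 3.0 * (1 << LUT_BITS)), (1 << LUT_BITS) - 1)
    y = LUT[index]

    # Standard Newton-Raphson step for f(y) = 1/y**2 - m.
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * m * y * y)

    # Undo the range reduction: 1/sqrt(x) = (1/sqrt(m)) * 2**(-k/2).
    return math.ldexp(y, -k // 2)
```

The square root follows as `x * rsqrt_lut_nr(x)`, so one table and one iteration datapath can serve both operations. A larger table gives a better seed and therefore needs fewer iterations.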


2006 ◽  
Vol 351 (1) ◽  
pp. 101-110 ◽  
Author(s):  
Peter Kornerup ◽  
Jean-Michel Muller

Electronics ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 1622
Author(s):  
Feibao Xiao ◽  
Feng Liang ◽  
Bin Wu ◽  
Junzhe Liang ◽  
Shuting Cheng ◽  
...  

As a substitute for the IEEE 754-2008 floating-point standard, Posit, a new kind of number system for floating-point numbers, was put forward recently. Hitherto, some studies have proven that Posit is a better floating-point style than IEEE 754-2008 in some fields. However, most of these studies presented the advantages of Posit from the arithmetical aspect, but none of them suggested it had a better hardware implementation than that of IEEE 754-2008. In this paper, we propose several hardware implementations that contain the Posit adder/subtractor, multiplier, divider, and square root. Our goal is to achieve an arbitrary Posit format and exploit the minimum circuit area, which is required in embedded devices. To implement the minimum circuit area for the divider and square root, the alternating addition and subtraction method is used rather than the Newton–Raphson method. Compared with other works, the area of our divider is about 0.2×–0.7× (FPGA). Furthermore, this paper provides the synthesis results for each critical module with the Xilinx Virtex-7 FPGA VC709 platform.
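
The abstract does not spell out the "alternating addition and subtraction method". If it is read as the classic non-restoring scheme (an assumption here), the idea can be sketched for unsigned integer division as below. The function name `nonrestoring_divide` and the `width` parameter are hypothetical, and Posit decoding, rounding, and the square-root variant are all left out.

```python
def nonrestoring_divide(dividend, divisor, width=8):
    """Unsigned non-restoring division; returns (quotient, remainder).

    Each step either subtracts or adds the divisor depending on the sign
    of the partial remainder, so no restoring step is needed in the loop.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")

    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        bit = (dividend >> i) & 1
        if remainder >= 0:
            # Previous partial remainder non-negative: shift in a bit, subtract.
            remainder = (remainder << 1) + bit - divisor
        else:
            # Previous partial remainder negative: shift in a bit, add back.
            remainder = (remainder << 1) + bit + divisor
        # The quotient bit is 1 exactly when the new partial remainder is >= 0.
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)

    # Single final correction if the last partial remainder is negative.
    if remainder < 0:
        remainder += divisor
    return quotient, remainder
```

Each iteration needs only one adder/subtractor and no multiplier, which is why digit-recurrence schemes of this kind tend to suit area-constrained designs better than Newton-Raphson.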


Computation ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. 41 ◽  
Author(s):  
Cezary J. Walczyk ◽  
Leonid V. Moroz ◽  
Jan L. Cieśliński

We present a new algorithm for the approximate evaluation of the inverse square root for single-precision floating-point numbers. This is a modification of the famous fast inverse square root code. We use the same “magic constant” to compute the seed solution, but then, we apply Newton–Raphson corrections with modified coefficients. As compared to the original fast inverse square root code, the new algorithm is two-times more accurate in the case of one Newton–Raphson correction and almost seven-times more accurate in the case of two corrections. We discuss relative errors within our analytical approach and perform numerical tests of our algorithm for all numbers of the type float.
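
For reference, the original fast inverse square root that this abstract builds on can be sketched in Python as follows. The sketch uses the widely known magic constant `0x5F3759DF` and the standard Newton-Raphson coefficients (1.5 and 0.5). The paper's modified coefficients are not given in the abstract and are not reproduced here. The name `fast_rsqrt` is an assumption, and the correction steps run in Python's double precision rather than single precision.

```python
import struct

MAGIC = 0x5F3759DF  # constant from the well-known original routine


def fast_rsqrt(x, corrections=1):
    """Approximate 1/sqrt(x) for a positive, normal single-precision x."""
    if x <= 0.0:
        raise ValueError("x must be positive")

    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # Seed: the shift and subtraction approximate -0.5 * log2(x) on the bits.
    bits = MAGIC - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]

    # Standard (unmodified) Newton-Raphson corrections.
    for _ in range(corrections):
        y = y * (1.5 - 0.5 * x * y * y)
    return y
```

With these standard coefficients, one correction leaves a relative error on the order of 10^-3 and two corrections on the order of 10^-6. That is the baseline against which the abstract's accuracy claims are stated.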


2010 ◽  
Vol 5 (1) ◽  
pp. 42-52 ◽  
Author(s):  
Daniel M. Muñoz ◽  
Diego F. Sanchez ◽  
Carlos H. Llanos ◽  
Mauricio Ayala-Rincón

Many scientific and engineering applications require a large number of arithmetic operations to be computed efficiently with high precision and a large dynamic range. Commonly, these applications are implemented on personal computers, taking advantage of floating-point arithmetic and high operating frequencies. However, most common software architectures execute instructions sequentially, following the von Neumann model, and consequently several delays are introduced in the data transfer between the program memory and the Arithmetic Logic Unit (ALU). Several mobile applications require high performance in terms of computational accuracy and execution time, together with low power consumption. Modern Field Programmable Gate Arrays (FPGAs) are a suitable solution for high-performance embedded applications given the flexibility of their architectures and their parallel capabilities, which allow the implementation of complex algorithms and performance improvements. This paper describes a parameterizable floating-point library for arithmetic operators based on FPGAs. A general architecture was implemented for addition/subtraction and multiplication, and two different architectures based on the Goldschmidt and Newton-Raphson algorithms were implemented for division and square root. Additionally, a tradeoff analysis of the hardware implementation was performed, which enables the designer to choose, for general-purpose applications, a suitable bit-width representation and its associated error, as well as the area cost, elapsed time, and power consumption for each arithmetic operator. Synthesis results have demonstrated the effectiveness of the implemented cores on commercial FPGAs and showed that the most critical parameter is the consumption of dedicated Digital Signal Processing (DSP) slices.
Simulation results were used to compute the mean square error (MSE) and maximum absolute error, demonstrating the correctness of the implemented floating-point library and providing an experimental error analysis. The Newton-Raphson algorithm achieves MSE results similar to those of the Goldschmidt algorithm while operating at similar frequencies; however, the former uses less logic area and fewer dedicated DSP blocks.
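
To make the comparison concrete, here is a minimal Python sketch of the two textbook division schemes named above. It is not the library's FPGA architecture. The function names, the linear seed `48/17 - 32/17 * m`, and the default iteration count are assumptions for illustration, and only positive divisors are handled.

```python
import math


def _seed(m):
    # Common linear first guess for 1/m with m in [0.5, 1).
    return 48.0 / 17.0 - 32.0 / 17.0 * m


def divide_newton(n, d, iterations=4):
    """n / d by refining a reciprocal estimate y -> y * (2 - m * y)."""
    if d <= 0.0:
        raise ValueError("sketch handles positive divisors only")
    m, e = math.frexp(d)            # d = m * 2**e, m in [0.5, 1)
    y = _seed(m)
    for _ in range(iterations):
        y = y * (2.0 - m * y)       # each step depends on the previous one
    return math.ldexp(n * y, -e)


def divide_goldschmidt(n, d, iterations=4):
    """n / d by scaling numerator and denominator until the latter is ~1."""
    if d <= 0.0:
        raise ValueError("sketch handles positive divisors only")
    m, e = math.frexp(d)
    f = _seed(m)
    q, r = n * f, m * f             # r starts close to 1
    for _ in range(iterations):
        f = 2.0 - r
        q, r = q * f, r * f         # the two products are independent
    return math.ldexp(q, -e)
```

Both schemes converge quadratically. The structural difference is that the two Goldschmidt multiplications in each step can run in parallel, whereas the two Newton-Raphson multiplications are sequential, which fits the area versus multiplier-count tradeoff the abstract reports.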


2021 ◽  
Vol 1 (2) ◽  
pp. 1-30
Author(s):  
William B. Langdon ◽  
Oliver Krauss

We use continuous optimisation and manual code changes to evolve up to 1024 Newton-Raphson numerical values embedded in an open source GNU C library glibc square root sqrt to implement a double precision cube root routine cbrt, binary logarithm log2 and reciprocal square root function for C in seconds. The GI inverted square root x^(-1/2) is far more accurate than Quake’s InvSqrt, Quare root. GI shows potential for automatically creating mobile or low resource mote smart dust bespoke custom mathematical libraries with new functionality.

