Methods for implementing the Kuznyechik algorithm on FPGAs

2018 ◽  
Vol 28 (3) ◽  
pp. 64-70
Author(s):  
I. I. Kalistru ◽  
M. A. Borodin ◽  
A. S. Rybkin ◽  
R. A. Gladko

Growing volumes and speeds of data transmission over computer networks, together with the need to protect the transmitted data, call for a corresponding increase in the speed of cryptographic data processing. One way to achieve high performance is to implement cryptographic equipment on FPGAs. To cut equipment cost, encryption modules should also consume as few hardware resources as possible. This work aims to find the most compact high-speed FPGA implementation of the Kuznyechik block cipher. Several methods for implementing the linear transformation used in Kuznyechik in hardware are reviewed, and aspects of their implementation are investigated with the architecture of the target FPGAs taken into account. Aspects of the FPGA implementation of the cipher's nonlinear transformation are also considered. The resource consumption of the implemented variants of the linear transformation has been estimated. A relatively compact high-speed implementation of the Kuznyechik block cipher has been obtained and tested on real equipment. The speeds achieved by the iterative and fully pipelined implementations of the algorithm are presented.
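As a software reference for the linear transformation discussed in this abstract, the following minimal sketch computes it in its shift-register form: one step R computes a feedback byte over GF(2^8) and shifts the state, and L applies R sixteen times. The coefficients and the reduction polynomial are taken from the published cipher specification (GOST R 34.12-2015), not from the abstract, and the function names are illustrative. This is only one of the possible formulations and is not necessarily any of the hardware methods the authors compare.

```python
# Coefficients of the Kuznyechik linear function, ordered from a15 down to a0
# (values from the published specification, not from the abstract).
L_COEFFS = (148, 32, 133, 16, 194, 192, 1, 251, 1, 192, 194, 16, 133, 32, 148, 1)


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^7 + x^6 + x + 1 (0x1C3)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x1C3  # reduce: clears bit 8 and folds it back in
        b >>= 1
    return result


def r_step(state: bytes) -> bytes:
    """One shift-register step; state[0] is a15 and state[15] is a0."""
    feedback = 0
    for coeff, byte in zip(L_COEFFS, state):
        feedback ^= gf_mul(coeff, byte)
    # The feedback byte enters at the top, the lowest byte a0 drops out.
    return bytes([feedback]) + state[:15]


def l_transform(state: bytes) -> bytes:
    """The full linear transformation: sixteen applications of r_step."""
    for _ in range(16):
        state = r_step(state)
    return state


if __name__ == "__main__":
    block = bytes.fromhex("00000000000000000000000000000100")
    print(r_step(block).hex())  # 94000000000000000000000000000001
```

In hardware, the sixteen sequential steps trade latency for area; unrolling them into a single matrix multiplication is the usual alternative, which is the kind of trade-off the abstract refers to.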

2012 ◽  
Vol 479-481 ◽  
pp. 65-70
Author(s):  
Xiao Hui Zhang ◽  
Liu Qing ◽  
Mu Li

Building on target detection with an alignment template, the paper designs a lane alignment template using a correlation matching method and combines it with a genetic algorithm for stochastic template matching and optimization to realize lane detection. To solve the real-time problem of genetic-algorithm-based lane detection, the paper uses the high-performance multi-core DSP chip TMS320C6474 as the core and combines it with RapidIO high-speed data transmission technology to process the lane detection algorithm in parallel in hardware. Over the RapidIO bus, the data transmission speed between DSPs can reach 3.125 Gbps, which essentially achieves transmission without delay and thereby solves the problem of high-speed transfer of large data volumes between processors. The experimental results show that both the calculated lane lines and the running time on the parallel C6474 platform are better than those obtained on a single DSP and on a PC. In addition, the road detection is accurate, reliable, and robust.


2016 ◽  
Vol 133 (8) ◽  
pp. 17-20
Author(s):  
V.A. Suryawanshi ◽  
G.C. Manna ◽  
S.S. Dorale

2016 ◽  
Vol 120 (1233) ◽  
pp. 1726-1745 ◽  
Author(s):  
A. Travis Krebs ◽  
B. Dr. Götz Bramesfeld

A multi-objective optimisation process is used to design winglets for a high-performance sailplane. The primary optimisation objective is to maximise the average cross-country speed over a range of thermal strengths. Additional contributions to the cost functions are the limitation of the total drag during high-speed cruise and the additional root bending moment due to the winglet. Rather than being a pure design study, the purpose of the study presented here is to demonstrate that a multi-objective optimisation approach is a suitable and efficient alternative to the more traditional, experience-based design approach. The flight performance of the winglet designs is evaluated using a higher-order potential flow method. Results of the optimisation are hand-selected for further analysis. They are compared to a traditionally designed winglet for the same aircraft, designed with similar objectives in mind. The chosen final designs provide an increase in average cross-country speed of 1.5% at lower thermal strengths and 0.4% at higher thermal strengths when compared to the traditional design. When approximating the effects of trim drag due to wing loading and static margin, these performance gains fall to 0.6% and 0.1% respectively, more closely matching the performance of the traditionally designed winglet. The final designs, along with the traditional design, provide performance benefits across all airspeeds of the flight envelope of the base aircraft without winglets.


Electronics ◽  
2021 ◽  
Vol 10 (20) ◽  
pp. 2546
Author(s):  
Alessandro Gabrielli ◽  
Fabrizio Alfonsi ◽  
Alberto Annovi ◽  
Alessandra Camplani ◽  
Alessandro Cerri

In recent years, the technological node used to implement FPGA devices has led to very high performance in terms of computational capacity, and in some applications these devices can be much more efficient than CPUs or other programmable devices. The clock managers and the enormous versatility of communication technology through digital transceivers place FPGAs in a prime position for many applications. The benefits of using frontier FPGA capabilities are even more relevant where computation time can be crucial, for example in applications ranging from real-time medical image analysis to particle trajectory recognition in high-energy physics. This paper shows an example of an FPGA hardware implementation, via a firmware design, of a complex analytical algorithm: the Hough transform. This is a mathematical spatial transformation used here to facilitate on-the-fly recognition of the trajectories of ionising particles as they pass through the so-called tracker apparatus within high-energy physics detectors. This is a general study to demonstrate that this technique is not only implementable via software-based systems, but can also be exploited using consumer hardware devices. In this context the latter are known as hardware accelerators. In this article in particular, the Xilinx UltraScale+ FPGA is investigated, as it belongs to one of the frontier device families on the market. These FPGAs make it possible to reach high clock frequencies at an acceptable cost in energy consumption thanks to the 14 nm technological node used by the vendor. These devices feature a huge number of gates, high-bandwidth memories, transceivers and other high-performance electronics in a single chip, enabling the design of large, complex and scalable architectures. In particular, the Xilinx Alveo U250 has been investigated. A target frequency of 250 MHz and a total latency of 30 clock periods have been achieved using only 17 to 53% of the LUTs, 8 to 12% of the DSPs, 1 to 3% of the Block RAMs and 9 to 28% of the flip-flops.
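The voting idea behind the Hough transform can be sketched in a few lines of software. The example below is the textbook straight-line variant, with the parametrisation rho = x·cos(theta) + y·sin(theta); it is not the track parametrisation or the firmware architecture of the paper, and the function names, bin sizes and sample hits are assumptions chosen for illustration.

```python
import math
from collections import Counter


def hough_votes(points, theta_steps=180, rho_step=1.0):
    """Accumulate votes in (theta index, rho bin) space for 2D points."""
    votes = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Every point votes for each line (theta, rho) passing through it.
            votes[(t, round(rho / rho_step))] += 1
    return votes


def strongest_line(points, theta_steps=180, rho_step=1.0):
    """Return (theta index, rho bin, vote count) of the most voted cell."""
    votes = hough_votes(points, theta_steps, rho_step)
    (t, rho_bin), count = votes.most_common(1)[0]
    return t, rho_bin, count


if __name__ == "__main__":
    # Four hypothetical hits lying on the horizontal line y = 3.
    hits = [(0, 3), (10, 3), (20, 3), (30, 3)]
    print(strongest_line(hits))  # (90, 3, 4): theta = 90 degrees, rho = 3
```

Each point's votes are independent of every other point's, which is why the accumulator filling maps naturally onto parallel hardware.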


2021 ◽  
Author(s):  
David Moss

We propose and experimentally demonstrate a microwave photonic intensity differentiator based on a Kerr optical comb generated by a compact integrated micro-ring resonator (MRR). The on-chip Kerr optical comb, containing a large number of comb lines, serves as a high-performance multi-wavelength source for the transversal filter, which greatly reduces the cost, size, and complexity of the system. Moreover, owing to the compactness of the integrated MRR, a frequency spacing of up to 200 GHz can be achieved for the Kerr optical comb, enabling a potential operation bandwidth of over 100 GHz. By programming and shaping individual comb lines according to the calculated tap weights, a reconfigurable intensity differentiator with variable differentiation orders can be realized. The operation principle is theoretically analyzed, and experimental demonstrations of first-order, second-order, and third-order differentiation functions based on the principle are presented. The radio frequency (RF) amplitude and phase responses of multi-order intensity differentiations are characterized, and system demonstrations of real-time differentiation of a Gaussian input signal are also performed. The experimental results show good agreement with theory, confirming the effectiveness of our approach.
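To illustrate how tap weights shape the response of a transversal filter, the sketch below evaluates H(f) = Σ w_n·exp(−j·2πf·n·T) for a given set of taps. The two-tap set [1, −1] is the simplest first-difference approximation of a first-order differentiator; it is a hypothetical choice for illustration, as are the tap delay and the function name. The weights actually calculated in the paper are not given in the abstract.

```python
import cmath
import math


def transversal_response(taps, delay, freq):
    """Frequency response of a tapped-delay-line filter at frequency freq (Hz)."""
    omega = 2 * math.pi * freq
    return sum(w * cmath.exp(-1j * omega * n * delay) for n, w in enumerate(taps))


if __name__ == "__main__":
    taps = [1.0, -1.0]   # hypothetical first-difference taps
    delay = 10e-12       # hypothetical tap delay of 10 ps
    for freq in (1e9, 2e9, 4e9):
        h = transversal_response(taps, delay, freq)
        # For small omega*T the magnitude grows linearly with frequency and
        # the phase stays close to 90 degrees, as for an ideal differentiator.
        print(freq, abs(h), math.degrees(cmath.phase(h)))
```

Higher differentiation orders need more taps with appropriately chosen weights, which is where a comb source with many lines becomes useful.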


2013 ◽  
Vol 596 ◽  
pp. 199-203 ◽  
Author(s):  
Yosuke Iijima ◽  
Yasushi Yuminaka

High-speed interfaces play an important role in achieving high-performance VLSI systems. This paper demonstrates a high-speed data transmission technique using Tomlinson-Harashima Precoding (THP). THP can compensate for the low-pass effect of an interconnection at the transmitter, and it can also limit the peak and average power of the transmitted signal. In this paper, a 200 Mbps 4-PAM (pulse-amplitude modulation) transmitter is designed and simulated to demonstrate the THP performance. An experimental implementation using an FPGA demonstrates high-speed transmission over a long 3D2V coaxial cable.
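The precoding principle can be sketched as follows: the transmitter subtracts the inter-symbol interference that the channel will add, then folds the result back into a bounded range with a modulo operation; the receiver applies the same modulo to recover the symbol. The channel taps, symbol sequence and function names below are assumptions for illustration (a two-tap channel with a leading tap of 1), not the transmitter designed in the paper.

```python
def wrap(value, m):
    """Fold value into [-m, m) by adding a multiple of 2*m."""
    return (value + m) % (2 * m) - m


def thp_precode(symbols, channel, m):
    """Tomlinson-Harashima precoding; channel[0] is assumed to equal 1."""
    sent = []
    for k, a in enumerate(symbols):
        # Interference the channel will add from previously sent values.
        isi = sum(channel[i] * sent[k - i]
                  for i in range(1, len(channel)) if k - i >= 0)
        sent.append(wrap(a - isi, m))
    return sent


def channel_output(sent, channel):
    """Noise-free FIR channel: y[k] = sum_i channel[i] * sent[k - i]."""
    return [sum(channel[i] * sent[k - i]
                for i in range(len(channel)) if k - i >= 0)
            for k in range(len(sent))]


def thp_receive(received, m):
    """The receiver only needs the same modulo folding."""
    return [wrap(y, m) for y in received]


if __name__ == "__main__":
    symbols = [3, -1, 1, -3, 3, 3, -1, 1]   # 4-PAM alphabet {-3, -1, 1, 3}
    channel = [1.0, 0.5]                    # hypothetical low-pass channel
    sent = thp_precode(symbols, channel, m=4)
    decoded = thp_receive(channel_output(sent, channel), m=4)
    print(sent)
    print(decoded)
```

Because every transmitted value lies in [-4, 4), the peak amplitude stays bounded, unlike a plain inverse filter, whose output can grow without limit.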


2014 ◽  
Vol 971-973 ◽  
pp. 1581-1585 ◽  
Author(s):  
Jun Liu ◽  
Yan Tian ◽  
Wei Hao ◽  
Lei Qu

To meet the demand for high-speed data exchange in embedded systems, this paper details the high-speed SRIO (Serial RapidIO) interface protocol and the SRIO access timing between local and remote endpoint devices. We also implement the design of a new high-performance RapidIO interconnection between a DSP and an FPGA. Performance testing of the SRIO data transmission system shows that the design can transfer data between processors stably and at high speed.


2011 ◽  
Vol 2011 ◽  
pp. 1-10 ◽  
Author(s):  
Indranil Hatai ◽  
Indrajit Chakrabarti

This paper deals with an FPGA implementation of a high-performance FM modulator and demodulator for software-defined radio (SDR) systems. The individual components of the proposed FM modulator and demodulator have been optimized so that the overall design is high-speed, area-optimized, and low-power. The modulator and demodulator contain an optimized direct digital frequency synthesizer (DDFS) based on the quarter-wave symmetry technique, which generates the carrier frequency with a spurious-free dynamic range (SFDR) of more than 64 dB. The FM modulator uses a pipelined version of the DDFS to support up-conversion in the digital domain. The proposed FM modulator and demodulator have been implemented and tested using an XC2VP30-7ff896 FPGA as the target device; they can operate at maximum frequencies of 334.5 MHz and 131 MHz and involve around 1.93 K and 6.4 K equivalent gates, respectively. Power was calculated using XPower after applying a 10 kHz triangular wave input and setting the system clock frequency to 100 MHz. The FM modulator consumes 107.67 mW, while the FM demodulator consumes 108.67 mW for the same input at the same data rate.
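The quarter-wave symmetry technique mentioned above stores only the first quadrant of the sine wave and reconstructs the other three by mirroring the table index and negating the output. The sketch below shows the generic idea with a phase accumulator in front; the bit widths, the half-step table offset and all names are assumptions for illustration, not the optimized design of the paper.

```python
import math

PHASE_BITS = 10                    # hypothetical width of the table phase word
QUARTER = 1 << (PHASE_BITS - 2)    # number of entries stored for one quadrant

# First quadrant only, sampled with a half-step offset so that mirroring
# the index reproduces the second quadrant exactly.
QUARTER_TABLE = [math.sin(2 * math.pi * (i + 0.5) / (1 << PHASE_BITS))
                 for i in range(QUARTER)]


def sine_from_quarter_table(phase: int) -> float:
    """Sine for a PHASE_BITS-wide phase word using the quarter table."""
    quadrant = (phase >> (PHASE_BITS - 2)) & 3
    index = phase & (QUARTER - 1)
    if quadrant & 1:               # quadrants 1 and 3: read the table backwards
        index = QUARTER - 1 - index
    value = QUARTER_TABLE[index]
    return -value if quadrant & 2 else value   # quadrants 2 and 3: negate


def ddfs(tuning_word: int, acc_bits: int, count: int):
    """Generate count samples; f_out = tuning_word * f_clk / 2**acc_bits."""
    acc, samples = 0, []
    for _ in range(count):
        samples.append(sine_from_quarter_table(acc >> (acc_bits - PHASE_BITS)))
        acc = (acc + tuning_word) & ((1 << acc_bits) - 1)
    return samples


if __name__ == "__main__":
    # With a 24-bit accumulator, a tuning word of 2**22 gives f_clk / 4.
    print(ddfs(1 << 22, 24, 8))
```

Storing one quadrant cuts the table to a quarter of its full size at the cost of an index mirror and a conditional negation, which is the area saving the technique is used for.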


2010 ◽  
Vol 439-440 ◽  
pp. 41-45
Author(s):  
Xiao Chen

To meet the high-speed data acquisition and real-time transmission and processing requirements of ultrasonic measurement systems, a wireless USB-based ultrasonic data transmission method is presented that combines wireless communication technology with the advantages of USB interface technology. The system consists of an ultrasonic signal acquisition module, a data transmission module, a data acquisition module, and the computer. It uses Cypress Semiconductor's PSoC CYRF69213 chip for wireless data transmission, with the chip's built-in microcontroller as the main control unit. The chip sends the recorded data to the computer through the USB interface, where it is displayed. The system is a single-chip USB interface design with very few external components. It offers the advantages of a USB interface, such as hot-swap and plug-and-play support, and transmits data wirelessly without the need to lay communication cables. The transmission system is stable, small, low-power, and high-performance, and has good application prospects.


Author(s):  
Charanjit Singh ◽  
Balwinder Singh

In this paper, a new high-speed control circuit is proposed that acts as the critical path for data travelling from input to output, in order to improve the performance of wave-pipelined circuits. Wave pipelining is a method of high-performance circuit design that implements pipelining in logic without the use of intermediate registers. It has been widely used in the past few years and offers significant advantages in speed, efficiency, and economy. Wave pipelining is used in a wide range of applications such as digital filters, network routers, multipliers, fast convolvers, modems, image processing, control systems, radars, and many others. According to previous work, the operating speed of a wave-pipelined circuit can be increased through three tasks: adjusting the clock period, adjusting the clock skew, and equalizing the path delays. Path-delay equalization can be done in theory, but the real challenge is to accomplish it in the presence of many different delays. The main objective of this paper is therefore to solve the path-delay equalization problem by inserting the control circuit into a wave-pipelined circuit, where it acts as the critical path for the data moving from input to output. The proposed technique is evaluated for DSP applications by designing a 4-tap FIR filter using the distributed arithmetic algorithm (DAA). This design is then compared with 4-tap FIR filter designs using conventional pipelining and no pipelining. Synthesis and simulation results based on Xilinx ISE Navigator 12.3 show that the wave-pipelined DAA-based filter is faster than the non-pipelined one by a factor of 1.43, while the conventionally pipelined filter is faster than the non-pipelined one by a factor of 1.61, but at the cost of a 200% increase in logic utilization. The wave-pipelined DA filters designed with the proposed control circuit can thus operate at a higher frequency than the non-pipelined design but a lower frequency than the pipelined one. The speed gain of the pipelined design over the wave-pipelined one comes at the cost of increased area and higher power dissipation. In terms of latency, the wave-pipelined filters with the proposed scheme have the lowest latency of the three schemes.
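For readers unfamiliar with distributed arithmetic, the sketch below shows how it replaces the multipliers of a 4-tap FIR filter with a lookup table: the table holds every possible sum of coefficients, one bit from each input sample forms the table address, and the looked-up values are shifted and accumulated bit by bit. The coefficients, word length and function names are assumptions for illustration; the code models only the arithmetic, not the wave-pipelined timing or the proposed control circuit.

```python
def build_da_lut(coeffs):
    """LUT entry at address A is the sum of coeffs[k] for each set bit k of A."""
    return [sum(c for k, c in enumerate(coeffs) if (address >> k) & 1)
            for address in range(1 << len(coeffs))]


def da_dot(samples, lut, bits):
    """Dot product of coeffs and two's-complement samples, one bit plane at a time."""
    acc = 0
    for b in range(bits):
        address = 0
        for k, s in enumerate(samples):
            address |= ((s >> b) & 1) << k   # bit b of every sample forms the address
        term = lut[address] << b
        # The most significant bit of a two's-complement word has negative weight.
        acc += -term if b == bits - 1 else term
    return acc


def da_fir(signal, coeffs, bits=8):
    """Filter a list of signed integers that fit in the given number of bits."""
    lut = build_da_lut(coeffs)
    window = [0] * len(coeffs)               # newest sample first
    output = []
    for x in signal:
        window = [x] + window[:-1]
        output.append(da_dot(window, lut, bits))
    return output


if __name__ == "__main__":
    coeffs = [3, -1, 4, 2]                   # hypothetical 4-tap coefficients
    print(da_fir([5, -7, 0, 127, -128, 12], coeffs))
```

For four taps the table has only 16 entries, which fits the LUT-based fabric of an FPGA well; the table size doubles with each additional tap.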

