A dynamic reliability management framework for heterogeneous multicore systems

Author(s):  
Alessandro Baldassari ◽  
Cristiana Bolchini ◽  
Antonio Miele
Author(s):  
Pete Cooper ◽  
Uwe Dolinsky ◽  
Alastair F. Donaldson ◽  
Andrew Richards ◽  
Colin Riley ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-19
Author(s):  
Jahanzeb Anwer ◽  
Sebastian Meisner ◽  
Marco Platzner

Radiation tolerance in FPGAs is an important field of research, particularly for reliable computation in electronics used in aerospace and satellite missions. The motivation behind this research is the degradation of reliability in FPGA hardware due to single-event effects caused by radiation particles. Redundancy is a commonly used technique to enhance the fault-tolerance capability of radiation-sensitive applications. However, redundancy comes with an overhead in terms of area consumption, latency, and power dissipation. Moreover, redundant circuit implementations vary in structure and resource usage with the redundancy insertion algorithm as well as the number of redundant stages used. The radiation environment varies over the operational time span of the mission, depending on the orbit and space weather conditions. Therefore, the overheads due to redundancy should also be optimized at run-time with respect to the current radiation level. In this paper, we propose a technique called Dynamic Reliability Management (DRM) that utilizes the radiation data, interprets it, selects a suitable redundancy level, and performs the run-time reconfiguration, thus varying the reliability levels of the target computation modules. DRM is composed of two parts. The design-time tool flow of DRM generates a library of redundant implementations of the circuit that differ in the magnitude of their performance factors. The run-time tool flow, using the radiation/error-rate data, selects the required redundancy level and reconfigures the computation module with the corresponding redundant implementation. Both parts of DRM have been verified by experimentation on various benchmarks. The most significant finding from this experimentation is that performance can be scaled multiple times by using the partial reconfiguration feature of DRM; e.g., 7.7 and 3.7 times better performance was obtained for our data sorter and matrix multiplier case studies compared with static reliability management techniques. DRM therefore allows a suitable trade-off between computation reliability and performance overhead to be maintained during the run-time of an application.
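
The run-time selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' tool flow: the `Implementation` record, the `select_implementation` function, the `RuntimeManager` class, the variant names, and every threshold and latency figure below are hypothetical and chosen only for illustration. The sketch assumes that each library entry carries the highest error rate it is meant to tolerate, and that the manager picks the lowest-overhead entry that still covers the currently observed rate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Implementation:
    """One entry of a (hypothetical) design-time library of redundant variants."""
    name: str
    max_error_rate: float    # assumed highest error rate this variant tolerates
    relative_latency: float  # assumed overhead relative to the unprotected circuit


def select_implementation(library: List[Implementation],
                          error_rate: float) -> Implementation:
    """Pick the lowest-overhead variant that covers the observed error rate.

    If no variant covers it, fall back to the most robust one available.
    """
    adequate = [impl for impl in library if impl.max_error_rate >= error_rate]
    if adequate:
        return min(adequate, key=lambda impl: impl.relative_latency)
    return max(library, key=lambda impl: impl.max_error_rate)


class RuntimeManager:
    """Tracks the active variant and counts how often it is swapped."""

    def __init__(self, library: List[Implementation]) -> None:
        self.library = library
        self.active: Optional[Implementation] = None
        self.reconfigurations = 0

    def update(self, error_rate: float) -> Implementation:
        choice = select_implementation(self.library, error_rate)
        if choice != self.active:
            # A real system would trigger partial reconfiguration here;
            # this sketch only records that a swap happened.
            self.active = choice
            self.reconfigurations += 1
        return choice


# Illustrative values only; none of these numbers come from the paper.
LIBRARY = [
    Implementation("no_redundancy", max_error_rate=1e-6, relative_latency=1.0),
    Implementation("dmr", max_error_rate=1e-4, relative_latency=1.5),
    Implementation("tmr", max_error_rate=1e-2, relative_latency=2.0),
]

if __name__ == "__main__":
    manager = RuntimeManager(LIBRARY)
    for rate in (1e-7, 5e-5, 5e-5, 1.0):
        print(rate, manager.update(rate).name)
```

Keeping the selection rule separate from the manager means the policy can be changed (for example, to minimize power instead of latency) without touching the bookkeeping around reconfiguration.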


1995 ◽  
Vol 7 (5) ◽  
pp. 7-15 ◽  
Author(s):  
Charles Tennant

Rover Group, the UK’s largest automotive manufacturer, employs 35,000 people and designs, develops and manufactures vehicles in the small, medium, executive and specialist four‐wheel‐drive sectors. Describes the processes deployed at Rover to ensure that quality and reliability are designed into the product through the new product introduction process, in order to achieve the company’s quality strategy milestones. The quality and reliability processes have been developed as a project management framework, known internally as “common business environment”. Describes the product programme milestone philosophy and supporting processes such as design methodology, reliability management, cost management and programme timing synthetics. The processes are deployed to all project teams at Rover Group through a learning methodology called focused learning. Implementation of the common business environment is measured at project Q&R reviews, which are based on the European Foundation for Quality Management self‐assessment criteria.


2011 ◽  
Vol 71 (1) ◽  
pp. 114-131 ◽  
Author(s):  
Juan Carlos Saez ◽  
Daniel Shelepov ◽  
Alexandra Fedorova ◽  
Manuel Prieto
