To satisfy the enduring demand for more computational power, processor manufacturers try to raise the performance per watt of a chip, which can be achieved by shrinking the structure sizes of electronic circuits. However, the technological limits are about to be reached. Further reduction of supply voltages and rising frequencies will lead to increased error rates due to transient faults, which result from the miniaturization of transistors and from the growing number of transistors on a chip.
To mitigate such errors, techniques from dependable server systems and safety-critical embedded systems become attractive in commodity systems as well. However, the typically used cycle-by-cycle lockstep execution of a redundant processor is hardly feasible on a complex out-of-order CPU.
Approaches that enable loose coupling have been proposed, both integrated in hardware and as software-only solutions. Software mechanisms make it possible to run specific applications redundantly to detect errors on a COTS (commercial off-the-shelf) processor without hardware modifications.
In recent years, transactional memory has gained interest in research on fault-tolerant systems.
Its property of isolation and the integrated checkpointing mechanism that guarantees atomicity have spawned multiple approaches that utilize transactional memory for fault tolerance. With the availability of the first hardware implementations, for example TSX in the more expensive processors of the Intel x86 Core family, software mechanisms that rely on hardware transactional memory for checkpointing became feasible.
This thesis investigates fail-operational execution with transactional memory on a COTS Intel CPU, based on loosely coupled redundant execution. An instrumentation mechanism, developed as an optimization pass for the LLVM compilation toolchain, and a support library for POSIX-compatible systems provide the functionality for error detection and recovery. The feasibility and effectiveness of the approach were evaluated with benchmarks from the SPEC2017 benchmark suite.
Results show that a fault-tolerant redundant execution can be achieved on an x86 CPU, and that specific enhancements to the hardware could further improve the overall performance.
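The idea of redundant execution with checkpoint-based recovery described above can be illustrated with a minimal sketch. It is not the thesis's LLVM pass or support library: the names `run_redundant`, `step`, and `fault_hook` are hypothetical, and a deep copy of a Python dictionary stands in for the checkpoint that hardware transactional memory would provide.

```python
import copy

def run_redundant(state, step, max_retries=3, fault_hook=None):
    """Run `step` on two private copies of `state`; commit only if both agree.

    The untouched `state` acts as the checkpoint. If the copies diverge,
    both results are discarded and the step is re-executed from it.
    """
    for attempt in range(max_retries + 1):
        leading = copy.deepcopy(state)
        trailing = copy.deepcopy(state)
        step(leading)
        step(trailing)
        if fault_hook is not None:
            # Hypothetical fault injection, used only to exercise recovery.
            fault_hook(attempt, leading)
        if leading == trailing:
            return leading, attempt  # commit: the results match
    raise RuntimeError("copies still diverge after all retries")

def increment(s):
    s["x"] += 1

def flip_bit_once(attempt, s):
    # Simulate a transient fault in the first attempt only.
    if attempt == 0:
        s["x"] ^= 4

result, retries = run_redundant({"x": 1}, increment, fault_hook=flip_bit_once)
```

In this sketch the comparison happens only at the end of a step, which mirrors the loose coupling mentioned above: the two executions are not compared cycle by cycle.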
Multi-threaded applications require further consideration for redundant execution, since indeterminism can occur between redundant pairs of threads due to diverging synchronization, for example at mutual exclusion. An interface to the Pthread synchronization functions is described, together with an error recovery mechanism, to enable the redundant and fail-operational execution of multi-threaded applications. This was evaluated with benchmarks from the PARSEC suite to show that redundant multi-threading is feasible on a COTS CPU. The impact of the additional layer on performance and speedup is shown to be minimal.
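One way to picture how diverging synchronization between redundant threads can be avoided is to record the order in which the leading threads acquire a mutex and to enforce the same order in the trailing threads. The sketch below assumes that approach for illustration only. `ReplayLock` and `run_workers` are hypothetical names, and the abstract does not state that the described Pthread interface works this way.

```python
import threading

class ReplayLock:
    """Mutex that records its acquisition order, or enforces a given one."""

    def __init__(self, order=None):
        self._cond = threading.Condition()
        self._held = False
        # None means record mode; a list means replay mode.
        self._order = list(order) if order is not None else None
        self._next = 0
        self.log = []

    def _my_turn(self, tid):
        if self._order is None:
            return True
        return self._next < len(self._order) and self._order[self._next] == tid

    def acquire(self, tid):
        with self._cond:
            # Wait until the lock is free and, in replay mode, it is our turn.
            while self._held or not self._my_turn(tid):
                self._cond.wait()
            self._held = True
            self.log.append(tid)

    def release(self):
        with self._cond:
            self._held = False
            self._next += 1
            self._cond.notify_all()

def run_workers(lock, n_threads=3, rounds=2):
    """Each worker appends its id to a shared list inside the critical section."""
    shared = []

    def worker(tid):
        for _ in range(rounds):
            lock.acquire(tid)
            shared.append(tid)
            lock.release()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared

leading_lock = ReplayLock()                      # record mode
leading_result = run_workers(leading_lock)
trailing_lock = ReplayLock(order=leading_lock.log)  # replay mode
trailing_result = run_workers(trailing_lock)
```

With the recorded order enforced, both runs observe the same sequence of critical sections, so their results can be compared for error detection without false positives from scheduling differences.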
Modern safety-critical embedded applications like autonomous driving need to be fail-operational. At the same time, high performance and low power consumption are demanded. A common way to achieve this is the use of heterogeneous multi-cores. When applied to such systems, prevalent fault tolerance mechanisms suffer from some disadvantages: Some (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and power consumption. Others (e.g. lockstep) require supplementary checkpointing mechanisms to recover from errors. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism introduced by multithreaded execution.
This paper presents a novel approach for fail-operational systems using hardware transactional memory, which can also be used for embedded systems running heterogeneous multi-cores. Each thread is automatically split into transactions, which then execute redundantly. The hardware transactional memory is extended to support multiple versions, which allows the reproduction of atomic operations and recovery in case of an error. In our FPGA-based evaluation, we executed the PARSEC benchmark suite with fault tolerance on 12 cores.
Multiversioning hardware transactional memory for fail-operational multithreaded applications
(2022)
Modern safety-critical embedded applications like autonomous driving need to be fail-operational, while high performance and low power consumption are demanded simultaneously. The prevalent fault tolerance mechanisms suffer from disadvantages: Some (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and power consumption. Others, like lockstep, require supplementary checkpointing mechanisms to recover from errors. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism caused by multithreaded execution. This paper presents a novel approach for fail-operational systems using hardware transactional memory for embedded systems. The hardware transactional memory is extended to support multiple versions, enabling redundant atomic operations and recovery in case of an error. In our FPGA-based evaluation, we executed the PARSEC benchmark suite with fault tolerance on 12 cores. The evaluation shows that multiversioning can successfully recover from all transient errors with an overhead comparable to fault tolerance mechanisms without recovery.
Hydrogen (H2) produced from renewables will have a growing impact on the global energy dynamics towards sustainable and carbon-neutral standards. The share of green H2 is still too low to meet the net-zero target, while the demand for high-quality hydrogen continues to rise. These factors amplify the need for economically viable H2 generation technologies. The present article aims at evaluating the existing technologies for high-quality H2 production based on solar energy. Technologies such as water electrolysis, photoelectrochemical and solar thermochemical water splitting, liquid metal reactors and plasma conversion utilize solar power directly or indirectly (as carbon-neutral electrons) and are reviewed from the perspective of their current development level, technical limitations and future potential.