Hardware multiversioning for fail-operational multithreaded applications

  • Modern safety-critical embedded applications such as autonomous driving need to be fail-operational. At the same time, they demand high performance and low power consumption. A common way to achieve this is the use of heterogeneous multi-cores. When applied to such systems, prevalent fault tolerance mechanisms suffer from several disadvantages: some (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and power consumption. Others (e.g. lockstep) require supplementary checkpointing mechanisms to recover from errors. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism introduced by multithreaded execution. This paper presents a novel approach for fail-operational systems using hardware transactional memory, which can also be used for embedded systems running heterogeneous multi-cores. Each thread is automatically split into transactions, which then execute redundantly. The hardware transactional memory is extended to support multiple versions, which allows the reproduction of atomic operations and recovery in case of an error. In our FPGA-based evaluation, we executed the PARSEC benchmark suite with fault tolerance on 12 cores.
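
The idea of redundantly executed transactions with recovery can be illustrated by a small software analogy. The sketch below is not the paper's hardware mechanism: it assumes a plain dictionary as memory state, and the names `run_redundant`, `increment`, and `flip_bit` are hypothetical. Each transaction runs twice on private versions of the state, the result is committed only if both versions agree, and on a mismatch both versions are discarded and the transaction is retried from the last committed state.

```python
import copy
from typing import Callable, Dict, Optional

State = Dict[str, int]


def run_redundant(state: State,
                  transaction: Callable[[State], None],
                  fault: Optional[Callable[[State], None]] = None,
                  max_retries: int = 3) -> State:
    """Run one transaction twice on private copies; commit only if both agree."""
    for attempt in range(max_retries):
        leading = copy.deepcopy(state)    # version written by the first run
        trailing = copy.deepcopy(state)   # version written by the redundant run
        transaction(leading)
        transaction(trailing)
        # Hypothetical fault injection: corrupt the first run on attempt 0 only.
        if fault is not None and attempt == 0:
            fault(leading)
        if leading == trailing:
            return leading                # results agree: commit this version
        # Mismatch: drop both versions and retry from the unchanged 'state'.
    raise RuntimeError("transaction could not be validated")


def increment(s: State) -> None:
    """Example transaction: a single read-modify-write on a counter."""
    s["counter"] += 1


def flip_bit(s: State) -> None:
    """Simulated transient fault: flip one bit of the counter."""
    s["counter"] ^= 0x4


committed = run_redundant({"counter": 41}, increment, fault=flip_bit)
print(committed)
```

Because the committed state is never modified until both runs agree, the retry needs no separate checkpoint; in this analogy the untouched input state plays the role of the older version.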

  • Postprint. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Metadata
Author: Rico Amslinger, Christian Piatka, Florian Haas, Sebastian Weis, Theo Ungerer, Sebastian Altmeyer
URN: urn:nbn:de:bvb:384-opus4-811751
Frontdoor URL: https://opus.bibliothek.uni-augsburg.de/opus4/81175
ISBN: 9781728199245
Parent Title (English): 2020 IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 9-11 Sept. 2020, Porto, Portugal
Publisher: IEEE
Place of publication: Piscataway, NJ
Editor: Miguel Areias, Jorge Barbosa, Inês Dutra
Type: Part of a Book
Language: English
Year of first Publication: 2020
Publishing Institution: Universität Augsburg
Release Date: 2020/11/11
First Page: 20
Last Page: 27
DOI: https://doi.org/10.1109/sbac-pad49847.2020.00014
Institutes: Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Systems
Dewey Decimal Classification: 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing and computer science
Licence (German): Deutsches Urheberrecht (German copyright law)