We recently presented a parallelization approach that is based on parallel design patterns and leads to structured parallelism. The approach is applicable to the parallelization of sequential code parts of embedded hard real-time software. Reducing the work effort requires tool support. In this context, we present software for the model-based, multi-objective optimization of a software model with a high degree of parallelism. In addition, we introduce timing analyzable algorithmic skeletons (TAS) for the fast implementation of the optimized software model. To support static WCET analysis with the OTAWA toolset, we developed a compact XML format to describe software with TAS instances. Such a model can then easily be translated into the OTAWA XML format representing parallel flow facts. All software described in this technical report is available under an open source license.
Extended pattern-based parallelization approach for hard real-time systems and its tool support
(2015)
The requirements for today's embedded hard real-time systems are high: they should deliver high performance, be energy-efficient, and always react in time. This leads to the use of processors with several cores. However, when the cores are connected via a shared memory, static timing analysis suffers from high pessimism. We see distributed-memory many-core processors, in which cores communicate via messages, as a solution. One of them is the Reduced Complexity Many-Core (RC/MC) architecture, which was developed with the goal of high timing predictability.
In this thesis, we present an approach to estimate the Worst-Case Execution Time (WCET) of programs running on this platform. Furthermore, we extend the RC/MC to improve its timing predictability and its worst-case performance. Our first step is the introduction of ready synchronization, which avoids buffer overflows. Second, we design hardware support for broadcasts and multicasts. Third, we extend the RC/MC with hardware-supported barriers.
Each of these techniques is evaluated for its impact on timing predictability and worst-case performance. We carry out timing analyses of the hardware operations for broadcasts/multicasts and barriers and compare them with their variants without hardware support. Finally, we present three case studies in which we analyze benchmarks taken from the NAS parallel benchmark suite to evaluate the worst-case performance of our extensions in the context of real use cases.