Abstract

Real-time object recognition on low-power embedded devices is in demand across many applications. However, achieving the desired performance goals remains challenging. For example, for advanced driver assistance systems (ADAS) or autonomously driven [...]

Abstract

This work is devoted to the acceleration and upgrade of the CFD code NOISEtte for scale-resolving simulations of compressible turbulent flows using edge-based high-accuracy methods on unstructured hybrid meshes. Attempts to extend the baseline multilevel MPI+OpenMP parallelization towards [...]

Abstract

The sparse matrix-vector product is a widespread operation in the scientific computing community. It represents the dominant computational cost in many large-scale simulations relying on iterative methods, and its performance is sensitive to the sparsity pattern, the storage format [...]

Abstract

Continuous improvements in hardware technology enable scientific computing to advance steadily and pursue more ambitious goals. Since the start of the global race for exascale high-performance computing, massively-parallel devices of various architectures have been incorporated into [...]