This work is devoted to the acceleration and upgrade of the CFD code NOISEtte for scale-resolving simulations of compressible turbulent flows using edge-based high-accuracy methods on unstructured hybrid meshes. Attempts to extend the baseline multilevel MPI+OpenMP parallelization towards GPU-based hybrid systems have run into a problem: the code is too complex. It is an in-house research code with plenty of numerical methods, schemes and models, most of which are experimental and are not used in practical simulations. This chaotic zoo leads to excessive conditional branches, switches and redundant function calls that slow down computations. Although the parallel algorithm is fully adapted to the stream processing paradigm, such an immense amount of code is too difficult to port efficiently to OpenCL or CUDA and to keep consistent with the CPU version. An approach to survive the process of adaptation to hybrid systems has been elaborated. It consists of various components, such as creating simplified configurations, combining different stages of the algorithm in order to reduce memory traffic, collapsing multiple functions into one function without branches and switches, mixing single and double precision, etc. As a result, the upgraded code is about twice as fast on CPUs and can use GPUs from different manufacturers (AMD, NVIDIA, Intel) through the OpenCL standard.
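
Three of the components listed above (a simplified configuration, collapsing a switch into one branch-free function, and mixing single and double precision) can be illustrated with a minimal C++ sketch. It is not taken from NOISEtte: the names `Scheme`, `edge_flux_runtime`, `edge_flux` and `accumulate_residual`, the scalar model flux and the sign convention are all assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scheme selector; not NOISEtte's actual interface.
enum class Scheme { Central, Upwind };

// Baseline style: the scheme is selected by a runtime switch on every edge.
inline double edge_flux_runtime(Scheme s, double ul, double ur, double a) {
    switch (s) {
        case Scheme::Central: return 0.5 * a * (ul + ur);
        case Scheme::Upwind:  return 0.5 * a * (ul + ur) - 0.5 * std::fabs(a) * (ur - ul);
    }
    return 0.0;
}

// Simplified configuration: the scheme is a compile-time parameter,
// so each instantiation is a single function body with no switch.
template <Scheme S>
double edge_flux(double ul, double ur, double a);

template <>
inline double edge_flux<Scheme::Central>(double ul, double ur, double a) {
    return 0.5 * a * (ul + ur);
}

template <>
inline double edge_flux<Scheme::Upwind>(double ul, double ur, double a) {
    return 0.5 * a * (ul + ur) - 0.5 * std::fabs(a) * (ur - ul);
}

// One pass over the edges computes the flux and scatters it to both nodes.
// Mixed precision: the state is stored in float, sums are accumulated in double.
template <Scheme S>
std::vector<double> accumulate_residual(const std::vector<float>& u,
                                        const std::vector<std::size_t>& left,
                                        const std::vector<std::size_t>& right,
                                        double a) {
    std::vector<double> res(u.size(), 0.0);
    for (std::size_t e = 0; e < left.size(); ++e) {
        const double f = edge_flux<S>(u[left[e]], u[right[e]], a);
        res[left[e]]  -= f;  // flux leaves the left node
        res[right[e]] += f;  // and enters the right node
    }
    return res;
}
```

Calling `accumulate_residual<Scheme::Upwind>(u, left, right, a)` fixes the scheme when the code is compiled, which is one possible way to keep an edge loop free of switches before porting it to an OpenCL or CUDA kernel.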

Document information

Published on 11/03/21
Submitted on 11/03/21

Volume 1400 - Software, High Performance Computing, 2021
DOI: 10.23967/wccm-eccomas.2020.048
Licence: CC BY-NC-SA license
