This work is devoted to accelerating and upgrading the CFD code NOISEtte for scale-resolving simulations of compressible turbulent flows using edge-based high-accuracy methods on unstructured hybrid meshes. Attempts to extend the baseline multilevel MPI+OpenMP parallelization to GPU-based hybrid systems ran into a problem: the code is too complex. It is an in-house research code with many numerical methods, schemes, and models, most of which are experimental and are not used in practical simulations. This chaotic zoo leads to excessive conditional branches, switches, and redundant function calls that slow down computations. Although the parallel algorithm is fully adapted to the stream processing paradigm, such an immense amount of code is too difficult to port efficiently to OpenCL or CUDA and to keep consistent with the CPU version. An approach to surviving the adaptation to hybrid systems has been developed. It consists of several components, such as creating simplified configurations, combining different stages of the algorithm to reduce memory traffic, collapsing multiple functions into one function without branches and switches, and mixing single and double precision. As a result, the upgraded code is about twice as fast on CPUs and can use GPUs from different manufacturers (AMD, NVIDIA, Intel) through the OpenCL standard.
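Two of the components listed above, collapsing a switch-driven function into a branch-free one and mixing single and double precision, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Scheme`, `flux_generic`, `flux_upwind`, `Edge` and `residual_upwind`, the scalar advection flux, and the choice to store the field in single precision while accumulating in double are hypothetical and are not taken from NOISEtte.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical scheme selector, standing in for a research code's many options.
enum class Scheme { Central, Upwind };

// Generic flux: a runtime switch (and a branch) is evaluated on every edge.
inline double flux_generic(Scheme s, double ul, double ur, double a) {
    switch (s) {
        case Scheme::Central: return 0.5 * a * (ul + ur);
        case Scheme::Upwind:  return a > 0.0 ? a * ul : a * ur;
    }
    return 0.0;
}

// Simplified configuration: only the upwind scheme, written without branches.
// For a > 0 this evaluates to a * ul, for a < 0 to a * ur.
inline double flux_upwind(double ul, double ur, double a) {
    return 0.5 * a * (ul + ur) - 0.5 * std::fabs(a) * (ur - ul);
}

using Edge = std::pair<std::size_t, std::size_t>;

// Edge-based residual accumulation. Assumption: the field u is stored in
// single precision to reduce memory traffic, while fluxes and sums use double.
std::vector<double> residual_upwind(const std::vector<float>& u,
                                    const std::vector<Edge>& edges,
                                    double a) {
    std::vector<double> r(u.size(), 0.0);
    for (const Edge& e : edges) {
        assert(e.first < u.size() && e.second < u.size());
        const double f = flux_upwind(u[e.first], u[e.second], a);
        r[e.first]  -= f;  // flux leaves the first node of the edge
        r[e.second] += f;  // and enters the second one
    }
    return r;
}
```

In this sketch the specialized loop body contains no switch and no data-dependent branch, which is the property that makes such a kernel easier to express in a stream-processing form; whether the actual code achieves this in the same way is not stated in the abstract.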