Semi-automatic porting of a general Fortran CFD code to GPUs: The difficult modules

Abstract

Over the last year, a considerable portion of FEFLO, a general-purpose legacy CFD code operating on unstructured grids, was ported to run on GPUs. FEFLO is an adaptive, edge-based finite element code for the solution of compressible and incompressible flow. Like many other legacy codes, it was written primarily in Fortran 77 and has previously been ported to vector, shared-memory parallel, and distributed-memory parallel machines. Because FEFLO is large and manual porting is prone to human error, and because development should continue within a single codebase, a specialized Python script based on FParser (Peterson, 2009) was written to translate the OpenMP-parallelized loops automatically into GPU kernels implemented in CUDA and to handle GPU memory management, while integrating with the existing MPI framework for distributed-memory parallelism. The present paper describes extensions of the script, together with algorithmic techniques, that allow the modules that are not straightforward to port to run efficiently on GPUs. In particular, we consider LU-SGS algorithms, linelet preconditioning, and particle-mesh algorithms.
